
Advanced Review

Response surface methodology


Andre I. Khuri¹ and Siuli Mukhopadhyay²

The purpose of this article is to provide a survey of the various stages in the
development of response surface methodology (RSM). The coverage of these
stages is organized in three parts that describe the evolution of RSM since its
introduction in the early 1950s. Part I covers the period 1951–1975, during which the
so-called classical RSM was developed. This includes a review of basic experimental
designs for fitting linear response surface models, in addition to a description of
methods for the determination of optimum operating conditions. Part II, which
covers the period 1976–1999, discusses more recent modeling techniques in RSM,
in addition to a coverage of Taguchi's robust parameter design and its response
surface alternative approach. Part III provides a coverage of further extensions
and research directions in modern RSM. This includes discussions concerning
response surface models with random effects, generalized linear models, and
graphical techniques for comparing response surface designs. © 2010 John Wiley &
Sons, Inc. WIREs Comp Stat 2010, 2:128–149

Correspondence to: ufakhuri@stat.ufl.edu
¹ Department of Statistics, University of Florida, Gainesville, FL 32611, USA
² Department of Mathematics, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
DOI: 10.1002/wics.73

PART I. THE FOUNDATIONAL YEARS: 1951–1975

An Introduction and Some Preliminaries

Response surface methodology (RSM) consists of a group of mathematical and statistical techniques used in the development of an adequate functional relationship between a response of interest, $y$, and a number of associated control (or input) variables denoted by $x_1, x_2, \ldots, x_k$. In general, such a relationship is unknown but can be approximated by a low-degree polynomial model of the form

$$y = \mathbf{f}'(\mathbf{x})\boldsymbol{\beta} + \epsilon \tag{1}$$

where $\mathbf{x} = (x_1, x_2, \ldots, x_k)'$, $\mathbf{f}(\mathbf{x})$ is a vector function of $p$ elements that consists of powers and cross-products of powers of $x_1, x_2, \ldots, x_k$ up to a certain degree denoted by $d$ ($\geq 1$), $\boldsymbol{\beta}$ is a vector of $p$ unknown constant coefficients referred to as parameters, and $\epsilon$ is a random experimental error assumed to have a zero mean. This is conditioned on the belief that model (1) provides an adequate representation of the response. In this case, the quantity $\mathbf{f}'(\mathbf{x})\boldsymbol{\beta}$ represents the mean response, that is, the expected value of $y$, and is denoted by $\mu(\mathbf{x})$.

Two important models are commonly used in RSM. These are special cases of model (1) and include the first-degree model ($d = 1$),

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \epsilon \tag{2}$$

and the second-degree model ($d = 2$)

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum\sum_{i<j} \beta_{ij} x_i x_j + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \epsilon. \tag{3}$$

The purpose of considering a model such as (1) is threefold:

1. To establish a relationship, albeit approximate, between $y$ and $x_1, x_2, \ldots, x_k$ that can be used to predict response values for given settings of the control variables.
2. To determine, through hypothesis testing, significance of the factors whose levels are represented by $x_1, x_2, \ldots, x_k$.
3. To determine the optimum settings of $x_1, x_2, \ldots, x_k$ that result in the maximum (or minimum) response over a certain region of interest.
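As an illustration of how the second-degree model in Eq. (3) is a special case of model (1), the following minimal sketch expands a point x into the vector f(x) and evaluates the mean response f'(x)β. The function names, the ordering of the terms (intercept, linear, cross-product, pure quadratic), and the coefficient values are assumptions made only for this example.

```python
from itertools import combinations

import numpy as np


def second_degree_terms(x):
    """Return f(x) for model (3): 1, x_i, x_i*x_j (i<j), x_i^2."""
    x = np.asarray(x, dtype=float)
    k = x.size
    cross = [x[i] * x[j] for i, j in combinations(range(k), 2)]
    return np.concatenate(([1.0], x, cross, x ** 2))


def mean_response(x, beta):
    """Evaluate mu(x) = f'(x) beta for a given coefficient vector."""
    return float(second_degree_terms(x) @ np.asarray(beta, dtype=float))


# Hypothetical coefficients for k = 2, ordered as
# (beta_0, beta_1, beta_2, beta_12, beta_11, beta_22).
beta = [10.0, 2.0, -1.0, 0.5, -3.0, -2.0]
f_x = second_degree_terms([1.0, 2.0])
mu = mean_response([1.0, 2.0], beta)
```

For k control variables the vector returned by `second_degree_terms` has 1 + 2k + k(k − 1)/2 elements, which is the number of parameters in the second-degree model.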

© 2010 John Wiley & Sons, Inc. Volume 2, March/April 2010


WIREs Computational Statistics Response surface methodology

In order to achieve the above three objectives, a series of $n$ experiments should first be carried out, in each of which the response $y$ is measured (or observed) for specified settings of the control variables. The totality of these settings constitutes the so-called response surface design, or just design, which can be represented by a matrix, denoted by $\mathbf{D}$, of order $n \times k$ called the design matrix,

$$\mathbf{D} = \begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1k} \\
x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nk}
\end{bmatrix} \tag{4}$$

where $x_{ui}$ denotes the $u$th design setting of $x_i$ ($i = 1, 2, \ldots, k$; $u = 1, 2, \ldots, n$). Each row of $\mathbf{D}$ represents a point, referred to as a design point, in a $k$-dimensional Euclidean space. Let $y_u$ denote the response value obtained as a result of applying the $u$th setting of $\mathbf{x}$, namely $\mathbf{x}_u = (x_{u1}, x_{u2}, \ldots, x_{uk})'$ ($u = 1, 2, \ldots, n$). From Eq. (1), we then have

$$y_u = \mathbf{f}'(\mathbf{x}_u)\boldsymbol{\beta} + \epsilon_u, \quad u = 1, 2, \ldots, n \tag{5}$$

where $\epsilon_u$ denotes the error term at the $u$th experimental run. Model (5) can be expressed in matrix form as

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon} \tag{6}$$

where $\mathbf{y} = (y_1, y_2, \ldots, y_n)'$, $\mathbf{X}$ is a matrix of order $n \times p$ whose $u$th row is $\mathbf{f}'(\mathbf{x}_u)$, and $\boldsymbol{\epsilon} = (\epsilon_1, \epsilon_2, \ldots, \epsilon_n)'$. Note that the first column of $\mathbf{X}$ is the column of ones $\mathbf{1}_n$.

Assuming that $\boldsymbol{\epsilon}$ has a zero mean and a variance–covariance matrix given by $\sigma^2\mathbf{I}_n$, the so-called ordinary least-squares estimator of $\boldsymbol{\beta}$ is (see e.g., Ref 1, Section 2.3)

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}. \tag{7}$$

The variance–covariance matrix of $\hat{\boldsymbol{\beta}}$ is then of the form

$$\mathrm{Var}(\hat{\boldsymbol{\beta}}) = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'(\sigma^2\mathbf{I}_n)\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1} = \sigma^2(\mathbf{X}'\mathbf{X})^{-1}. \tag{8}$$

Using $\hat{\boldsymbol{\beta}}$, an estimate, $\hat{\mu}(\mathbf{x}_u)$, of the mean response at $\mathbf{x}_u$ is obtained by replacing $\boldsymbol{\beta}$ by $\hat{\boldsymbol{\beta}}$, that is,

$$\hat{\mu}(\mathbf{x}_u) = \mathbf{f}'(\mathbf{x}_u)\hat{\boldsymbol{\beta}}, \quad u = 1, 2, \ldots, n. \tag{9}$$

The quantity $\mathbf{f}'(\mathbf{x}_u)\hat{\boldsymbol{\beta}}$ also gives the so-called predicted response, $\hat{y}(\mathbf{x}_u)$, at the $u$th design point ($u = 1, 2, \ldots, n$). In general, at any point, $\mathbf{x}$, in an experimental region, denoted by $R$, the predicted response $\hat{y}(\mathbf{x})$ is

$$\hat{y}(\mathbf{x}) = \mathbf{f}'(\mathbf{x})\hat{\boldsymbol{\beta}}, \quad \mathbf{x} \in R. \tag{10}$$

Since $\hat{\boldsymbol{\beta}}$ is an unbiased estimator of $\boldsymbol{\beta}$, $\hat{y}(\mathbf{x})$ is an unbiased estimator of $\mathbf{f}'(\mathbf{x})\boldsymbol{\beta}$, which is the mean response at $\mathbf{x} \in R$. Using Eq. (8), the variance of $\hat{y}(\mathbf{x})$ is of the form

$$\mathrm{Var}[\hat{y}(\mathbf{x})] = \sigma^2\,\mathbf{f}'(\mathbf{x})(\mathbf{X}'\mathbf{X})^{-1}\mathbf{f}(\mathbf{x}). \tag{11}$$

The proper choice of design is very important in any response surface investigation. This is true because the quality of prediction, as measured by the size of the prediction variance, depends on the design matrix $\mathbf{D}$ as can be seen from formula (11). Furthermore, the determination of the optimum response amounts to finding the optimal value of $\hat{y}(\mathbf{x})$ over the region $R$. It is therefore imperative that the prediction variance in Eq. (11) be as small as possible provided that the postulated model in Eq. (1) does not suffer from lack of fit (for a study of lack of fit of a fitted response model, see e.g., Ref 1, Section 2.6).

Some Common Design Properties

The choice of design depends on the properties it is required, or desired, to have. Some of the design properties considered in the early development of RSM include the following:

Orthogonality
A design $\mathbf{D}$ is said to be orthogonal if the matrix $\mathbf{X}'\mathbf{X}$ is diagonal, where $\mathbf{X}$ is the model matrix in Eq. (6). The advantage of this property is that the elements of $\hat{\boldsymbol{\beta}}$ will be uncorrelated because the off-diagonal elements of $\mathrm{Var}(\hat{\boldsymbol{\beta}})$ in Eq. (8) will be zero. If the error vector $\boldsymbol{\epsilon}$ in Eq. (6) is assumed to be normally distributed as $N(\mathbf{0}, \sigma^2\mathbf{I}_n)$, then these elements will also be stochastically independent. This makes it easier to test the significance of the unknown parameters in the model.

Rotatability
A design $\mathbf{D}$ is said to be rotatable if the prediction variance in Eq. (11) is constant at all points that are equidistant from the design center, which, by a proper coding of the control variables, can be chosen to be the point at the origin of the $k$-dimensional coordinates



Advanced Review www.wiley.com/wires/compstats

system. It follows that $\mathrm{Var}[\hat{y}(\mathbf{x})]$ is constant at all points that fall on the surface of a hypersphere centered at the origin, if the design is rotatable. The advantage of this property is that the prediction variance remains unchanged under any rotation of the coordinate axes. In addition, if optimization of $\hat{y}(\mathbf{x})$ is desired on concentric hyperspheres, as in the application of ridge analysis, which will be discussed later, then it would be desirable for the design to be rotatable. This makes it easier to compare the values of $\hat{y}(\mathbf{x})$ on a given hypersphere as all such values have the same variance.

The necessary and sufficient condition for a design to be rotatable was given by Box and Hunter (Ref 2). More recently, Khuri (Ref 3) introduced a measure of rotatability as a function of the so-called moments of the design under consideration (see e.g., Appendix 2B in Ref 1). The function is expressible as a percentage taking large values for a high degree of rotatability. The value 100 is attained when the design is rotatable. The advantages of this measure are:

1. The ability to compare designs on the basis of rotatability.
2. The assessment of the extent of departure from rotatability of a design whose rotatability may be sacrificed to satisfy another desirable design property.
3. The ability to improve rotatability by a proper augmentation of a nonrotatable design.

Uniform Precision
A rotatable design is said to have the additional uniform precision property if $\mathrm{Var}[\hat{y}(\mathbf{x})]$ at the origin is equal to its value at a distance of one from the origin. This property, which was also introduced by Box and Hunter (Ref 2), provides for an approximate uniform distribution of the prediction variance inside a hypersphere of radius one. This helps in producing some stability in the prediction variance in the vicinity of the design center.

Design Robustness
Box and Draper (Ref 4) listed several additional design properties that pertain to detection of lack of fit, generation of satisfactory distribution of information throughout the experimental region, estimation of the error variance, insensitivity to outliers and to errors made in the actual implementation of the settings of the control variables. These properties provide guidelines for the choice of a design (i.e., a 'wish list'). It is not, however, expected that a single design will satisfy all of these properties. A design is said to be robust if its properties are not severely impacted by failures to satisfy the assumptions made about the model and the error distribution.

Design Optimality
Optimal designs are those that are constructed on the basis of a certain optimality criterion that pertains to the closeness of the predicted response, $\hat{y}(\mathbf{x})$, to the mean response, $\mu(\mathbf{x})$, over a certain region of interest denoted by $R$. The design criteria that address the minimization of the variance associated with the estimation of model (1)'s unknown parameters are called variance-related criteria. The most prominent of such criteria is the D-optimality criterion that maximizes the determinant of the matrix $\mathbf{X}'\mathbf{X}$. This amounts to the minimization of the size of the confidence region on the vector $\boldsymbol{\beta}$ in model (6). Actually, this criterion results in the so-called discrete D-optimal design as compared with the continuous D-optimal design, which was introduced by Jack Kiefer. The latter design is based on the notion that a design represents a probability measure defined on the region $R$. A discrete design is then treated as a special case consisting of a collection of $n$ points in $R$ that are not necessarily distinct.

Another variance-related criterion that is closely related to D-optimality is the G-optimality criterion which requires the minimization of the maximum over $R$ of the prediction variance in Eq. (11). Kiefer developed a continuous counterpart of this criterion and showed that it is equivalent to the continuous D-optimality criterion. This result is based on the equivalence theorem proved by Kiefer and Wolfowitz (Ref 5). Other less-known variance-related criteria include A-optimality and E-optimality. See Ref 6, Chapter 4, for a description of these criteria.

These variance-related criteria are often referred to as alphabetic optimality. They are meaningful when the fitted model in Eq. (1) represents the true relationship connecting $y$ to its control variables.

Designs for First- and Second-Degree Models

As was pointed out earlier in the Section on 'Introduction and Some Preliminaries', the first-degree model in Eq. (2) and second-degree model in Eq. (3) are the most-frequently used approximating polynomial models in classical RSM. Designs for fitting first-degree models are called first-order designs and those for fitting second-degree models are referred to as second-order designs.
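The design properties discussed above can be checked numerically for a first-order design. The sketch below is only an illustration under stated assumptions: it uses a first-degree model in two coded variables, a $2^2$ factorial, and a hypothetical 'skewed' four-run design chosen for contrast. It evaluates the scaled prediction variance $\mathbf{f}'(\mathbf{x})(\mathbf{X}'\mathbf{X})^{-1}\mathbf{f}(\mathbf{x})$ of Eq. (11) at points equidistant from the origin and the determinant of $\mathbf{X}'\mathbf{X}$ used by the D-optimality criterion. The function names are not taken from any library.

```python
import numpy as np


def model_matrix(D):
    """First-degree model matrix X: a column of ones followed by D."""
    D = np.asarray(D, dtype=float)
    return np.column_stack([np.ones(len(D)), D])


def scaled_prediction_variance(D, x):
    """Var[y_hat(x)] / sigma^2 = f'(x) (X'X)^{-1} f(x), as in Eq. (11)."""
    X = model_matrix(D)
    f = np.concatenate(([1.0], np.asarray(x, dtype=float)))
    return float(f @ np.linalg.solve(X.T @ X, f))


def d_criterion(D):
    """Determinant of X'X, the quantity maximized under D-optimality."""
    X = model_matrix(D)
    return float(np.linalg.det(X.T @ X))


# 2^2 factorial in coded units versus a hypothetical design with the
# same number of runs but with its points pulled toward one corner.
factorial = [[-1, -1], [1, -1], [-1, 1], [1, 1]]
skewed = [[0, 0], [1, 0], [0, 1], [1, 1]]

# Points at a common distance r = 1 from the origin.
angles = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
circle = np.column_stack([np.cos(angles), np.sin(angles)])

spv_factorial = [scaled_prediction_variance(factorial, x) for x in circle]
spv_skewed = [scaled_prediction_variance(skewed, x) for x in circle]
```

For the factorial, $\mathbf{X}'\mathbf{X} = 4\mathbf{I}_3$, so the scaled variance equals $(1 + r^2)/4$ and is the same at every point on the circle, which is the rotatability property for this model. The skewed design gives values that change around the circle and a smaller determinant.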


First-Order Designs

The most common first-order designs are $2^k$ factorial ($k$ is the number of control variables), Plackett–Burman, and simplex designs.

The $2^k$ Factorial Design
In a $2^k$ factorial design, each control variable is measured at two levels, which can be coded to take the values, $-1$, $1$, that correspond to the so-called low and high levels, respectively, of each variable. This design consists of all possible combinations of such levels of the $k$ factors. Thus, each row of the design matrix $\mathbf{D}$ in Eq. (4) consists of all $1$s, all $-1$s, or a combination of $1$s and $-1$s and represents a particular treatment combination. In this case, the number, $n$, of experimental runs is equal to $2^k$ provided that no design point is replicated. For example, in a chemical experiment, the control variables are $x_1$ = temperature of a reaction measured at 250, 300 (°C), $x_2$ = pressure set at 10, 16 (psi), and $x_3$ = time of the reaction taken at 4, 8 (minutes). The coded settings, $\pm 1$, for $x_1, x_2, x_3$ are attained through the linear transformation,

$$x_1 = \frac{\text{temperature} - 275}{25} \tag{12}$$

$$x_2 = \frac{\text{pressure} - 13}{3} \tag{13}$$

$$x_3 = \frac{\text{time} - 6}{2}. \tag{14}$$

The corresponding $2^3$ design matrix is of order $8 \times 3$ of the form

$$\mathbf{D} = \begin{bmatrix}
-1 & -1 & -1 \\
 1 & -1 & -1 \\
-1 &  1 & -1 \\
 1 &  1 & -1 \\
-1 & -1 &  1 \\
 1 & -1 &  1 \\
-1 &  1 &  1 \\
 1 &  1 &  1
\end{bmatrix}. \tag{15}$$

If $k$ is large ($k \geq 5$), the $2^k$ design requires a large number of design points. Since the number of unknown parameters in Eq. (2) is only $k + 1$, fractions of $2^k$ can be considered to fit such a model. For example, we can consider a one-half fraction design that consists of one-half the number of points of a $2^k$ design, or a one-fourth fraction design that consists of one-fourth the number of points of a $2^k$ design. In general, a $2^{-m}$th fraction of a $2^k$ design consists of $2^{k-m}$ points from a full $2^k$ design. Here, $m$ is a positive integer such that $2^{k-m} \geq k + 1$ so that all the $k + 1$ parameters in model (2) can be estimated. The construction of fractions of a $2^k$ design is carried out in a particular manner, a description of which can be found in several experimental design textbooks, such as Refs 7–9. See also Chapter 3 in Ref 1.

The Plackett–Burman Design
The Plackett–Burman design allows two levels for each of the $k$ control variables, just like a $2^k$ design, but requires a much smaller number of experimental runs, especially if $k$ is large. It is therefore more economical than the $2^k$ design. Its number, $n$, of design points is equal to $k + 1$, which is the same as the number of parameters in model (2). In this respect, the design is said to be saturated because its number of design points is equal to the number of parameters to be estimated in the model. Furthermore, this design is available only when $n$ is a multiple of 4. Therefore, it can be used when the number, $k$, of control variables is equal to 3, 7, 11, 15, ....

To construct a Plackett–Burman design in $k$ variables, a row is first selected whose elements are equal to $-1$ or $1$ such that the number of $1$s is $\frac{k+1}{2}$ and the number of $-1$s is $\frac{k-1}{2}$. The next $k - 1$ rows are generated from the first row by shifting it cyclically one place to the right $k - 1$ times. Then, a row of negative ones is added at the bottom of the design. For example, for $k = 7$, the design matrix, $\mathbf{D}$, has eight points whose coordinates are $x_1, x_2, \ldots, x_7$ and is of the form

$$\mathbf{D} = \begin{bmatrix}
 1 &  1 &  1 & -1 &  1 & -1 & -1 \\
-1 &  1 &  1 &  1 & -1 &  1 & -1 \\
-1 & -1 &  1 &  1 &  1 & -1 &  1 \\
 1 & -1 & -1 &  1 &  1 &  1 & -1 \\
-1 &  1 & -1 & -1 &  1 &  1 &  1 \\
 1 & -1 &  1 & -1 & -1 &  1 &  1 \\
 1 &  1 & -1 &  1 & -1 & -1 &  1 \\
-1 & -1 & -1 & -1 & -1 & -1 & -1
\end{bmatrix}. \tag{16}$$

Design arrangements for $k = 3, 7, 11, \ldots, 99$ factors can be found in Ref 10.

The Simplex Design
The simplex design is also a saturated design with $n = k + 1$ points. Its design points are located at the vertices of a $k$-dimensional regular-sided figure, or a simplex, characterized by the fact that the angle, $\theta$, which any two points make with the design center (located at the origin of the coordinates system) is such that $\cos\theta = -\frac{1}{k}$. For example, for $k = 2$, the simplex design consists of the vertices of an equilateral triangle whose center is $(0, 0)$, and for $k = 3$, the design points are the vertices of a tetrahedron centered at $(0, 0, 0)$.
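The constructions described above for the $2^k$ factorial and Plackett–Burman designs can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names are hypothetical, and the generator row used for $k = 7$ is an assumed choice that satisfies the stated counts of $1$s and $-1$s.

```python
from itertools import product

import numpy as np


def full_factorial(k):
    """All 2^k combinations of the coded levels -1 and +1."""
    return np.array(list(product([-1, 1], repeat=k)))


def code(value, center, half_range):
    """Linear coding of Eqs. (12)-(14): (value - center) / half_range."""
    return (value - center) / half_range


def plackett_burman(first_row):
    """Cyclic construction: k right-shifted rows plus a row of -1s."""
    first_row = np.asarray(first_row)
    k = first_row.size
    shifted = [np.roll(first_row, s) for s in range(k)]
    return np.vstack(shifted + [-np.ones(k, dtype=first_row.dtype)])


D_factorial = full_factorial(3)                 # 8 runs, 3 factors
coded_temperature = [code(t, 275, 25) for t in (250, 300)]

# Assumed generator row for k = 7 (four +1s and three -1s).
D_pb = plackett_burman([1, 1, 1, -1, 1, -1, -1])
gram = D_pb.T @ D_pb                            # 8 * I_7 if orthogonal
```

With this generator row, `gram` is diagonal and every column of `D_pb` sums to zero, so the columns are also orthogonal to the column of ones in the model matrix for model (2).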


Box (Ref 11) presented a procedure for constructing a simplex design using a particular pattern of a one-factor-at-a-time design. This procedure is also explained in Ref 1, Section 3.3.5. The simplex design is a less frequently used design than $2^k$ or Plackett–Burman designs. This is because the actual settings in a simplex design are, in general, difficult to attain exactly in a real experimental situation.

All the above designs ($2^k$ or fractions of, Plackett–Burman, simplex) share the same property of being orthogonal. For a first-order design, orthogonality can be achieved if the design matrix $\mathbf{D}$ is such that $\mathbf{D}'\mathbf{D}$ is diagonal. We may recall that an orthogonal design causes the variance–covariance matrix of $\hat{\boldsymbol{\beta}}$, the least-squares estimator of the vector $\boldsymbol{\beta}$ of unknown parameters in the model, to be diagonal (if the error vector $\boldsymbol{\epsilon}$ is assumed to have a zero mean and a variance–covariance matrix $\sigma^2\mathbf{I}_n$). This means that the elements of $\hat{\boldsymbol{\beta}}$ are uncorrelated, hence independent under the normality assumption on $\boldsymbol{\epsilon}$. Furthermore, it can be shown that under an orthogonal design, the variances of the elements of $\hat{\boldsymbol{\beta}}$ have minimum values (see Section 3.3 in Ref 1). This means that an orthogonal first-order design provides maximum precision for estimating the unknown parameters in model (2).

Second-Order Designs

The number of parameters in the second-degree model in Eq. (3) is $p = 1 + 2k + \frac{1}{2}k(k - 1)$. Hence, the number of distinct design points of a second-order design must be at least equal to $p$. The design settings are usually coded so that $\frac{1}{n}\sum_{u=1}^{n} x_{ui} = 0$ and $\frac{1}{n}\sum_{u=1}^{n} x_{ui}^2 = 1$, $i = 1, 2, \ldots, k$, where $n$ is the number of experimental runs and $x_{ui}$ is the $u$th setting of the $i$th control variable ($u = 1, 2, \ldots, n$).

The most frequently used second-order designs are the $3^k$ factorial, central composite, and the Box–Behnken designs.

The $3^k$ Factorial Design
The $3^k$ factorial design consists of all the combinations of the levels of the $k$ control variables which have three levels each. If the levels are equally spaced, then they can be coded so that they correspond to $-1$, $0$, $1$. The number of experimental runs for this design is $3^k$, which can be very large for a large $k$. Fractions of a $3^k$ design can be considered to reduce the cost of running such an experiment. A general procedure for constructing fractions of $3^k$ is described in Montgomery (Ref 9, Chapter 9). See also Ref 12, Appendix 2.

The Central Composite Design (CCD)
This is perhaps the most popular of all second-order designs. It was first introduced in Ref 13. This design consists of the following three portions:

1. A complete (or a fraction of) $2^k$ factorial design whose factors' levels are coded as $-1$, $1$. This is called the factorial portion.
2. An axial portion consisting of $2k$ points arranged so that two points are chosen on the axis of each control variable at a distance of $\alpha$ from the design center (chosen as the point at the origin of the coordinates system).
3. $n_0$ center points.

Thus, the total number of design points in a CCD is $n = 2^k + 2k + n_0$. For example, a CCD for $k = 2$, $\alpha = \sqrt{2}$, $n_0 = 2$ has the form

$$\mathbf{D} = \begin{bmatrix}
-1 & -1 \\
 1 & -1 \\
-1 &  1 \\
 1 &  1 \\
-\sqrt{2} & 0 \\
 \sqrt{2} & 0 \\
 0 & -\sqrt{2} \\
 0 &  \sqrt{2} \\
 0 & 0 \\
 0 & 0
\end{bmatrix}. \tag{17}$$

We note that the CCD is obtained by augmenting a first-order design, namely, the $2^k$ factorial with additional experimental runs, namely, the $2k$ axial points and the $n_0$ center-point replications. Thus, this design is developed in a manner consistent with the sequential nature of a response surface investigation in starting with a first-order design, to fit a first-degree model, followed by the addition of design points to fit the larger second-degree model. The first-order design serves in a preliminary phase to get initial information about the response system and to assess the importance of the factors in a given experiment. The additional experimental runs are chosen for the purpose of getting more information that can lead to the determination of optimum operating conditions on the control variables using the second-degree model.

The values of $\alpha$ (or the axial parameter) and $n_0$, the number of center-point replications, are chosen so that the CCD can acquire certain desirable properties. For example, choosing $\alpha = F^{1/4}$, where $F$ denotes the number of points in the factorial portion, causes the CCD to be rotatable. The value of $n_0$ can


then be chosen so that the CCD can achieve either the orthogonality property or the uniform precision property. Note that orthogonality of a second-order design is attainable only after expressing model (3) in terms of orthogonal polynomials as explained in Box and Hunter (Ref 2, pp. 200–201). See also Khuri and Cornell (Ref 1, Section 4.3). In particular, Table 4.3 in Khuri and Cornell's book can be used to determine the value of $n_0$ for a rotatable CCD to have either the additional orthogonality property or the uniform precision property.

The Box–Behnken Design
This design was developed by Box and Behnken (Ref 14). It provides three levels for each factor and consists of a particular subset of the factorial combinations from the $3^k$ factorial design. The actual construction of such a design is described in the three RSM books Box and Draper (Ref 15, Section 15.4), Khuri and Cornell (Ref 1, Section 4.5.2), and Myers and Montgomery (Ref 16, Section 7.4.7).

The use of the Box–Behnken design is popular in industrial research because it is an economical design and requires only three levels for each factor where the settings are $-1$, $0$, $1$. Some Box–Behnken designs are rotatable, but, in general, this design is not always rotatable. Box and Behnken (Ref 14) list a number of design arrangements for $k = 3, 4, 5, 6, 7, 9, 10, 11, 12$, and $16$ factors.

Other second-order designs are available but are not as frequently used as the ones we have already mentioned. Some of these designs include Hoke (Ref 17) designs, Box–Draper saturated designs (see Ref 18), uniform shell designs by Doehlert (Ref 19), and hybrid designs by Roquemore (Ref 20).

Determination of Optimum Conditions

One of the main objectives of RSM is the determination of the optimum settings of the control variables that result in a maximum (or a minimum) response over a certain region of interest, $R$. This requires having a 'good' fitting model that provides an adequate representation of the mean response because such a model is to be utilized to determine the value of the optimum. Optimization techniques used in RSM depend on the nature of the fitted model. For first-degree models, the method of steepest ascent (or descent) is a viable technique for sequentially moving toward the optimum response. This method is explained in detail in Myers and Montgomery (Ref 16), Khuri and Cornell (Ref 1, Chapter 5), and Box and Draper (Ref 15, Chapter 6). Myers and Khuri (Ref 21) developed certain improvements regarding the stopping rule used in the execution of this method.

Since the first-degree model is usually used at the preliminary stage of a response surface investigation, we shall only mention here optimization techniques that are applicable to second-degree models. Such models are used after a series of experiments have been sequentially carried out leading up to a region that is believed to contain the location of the optimum response.

Optimization of a Second-Degree Model
Let us consider the second-degree model in Eq. (3), which can be written as

$$y = \beta_0 + \mathbf{x}'\boldsymbol{\beta}^* + \mathbf{x}'\mathbf{B}\mathbf{x} + \epsilon \tag{18}$$

where $\mathbf{x} = (x_1, x_2, \ldots, x_k)'$, $\boldsymbol{\beta}^* = (\beta_1, \beta_2, \ldots, \beta_k)'$, and $\mathbf{B}$ is a symmetric matrix of order $k \times k$ whose $i$th diagonal element is $\beta_{ii}$ ($i = 1, 2, \ldots, k$), and its $(i, j)$th off-diagonal element is $\frac{1}{2}\beta_{ij}$ ($i, j = 1, 2, \ldots, k$; $i \neq j$). If $n$ observations are obtained on $y$ using a design matrix $\mathbf{D}$ as in Eq. (4), then Eq. (18) can be written in vector form as in Eq. (6), where the parameter vector $\boldsymbol{\beta}$ consists of $\beta_0$ and the elements of $\boldsymbol{\beta}^*$ and $\mathbf{B}$. Assuming that $E(\boldsymbol{\epsilon}) = \mathbf{0}$ and $\mathrm{Var}(\boldsymbol{\epsilon}) = \sigma^2\mathbf{I}_n$, the least-squares estimate of $\boldsymbol{\beta}$ is $\hat{\boldsymbol{\beta}}$ as given in Eq. (7). The predicted response at a point $\mathbf{x}$ in the region $R$ is then of the form

$$\hat{y}(\mathbf{x}) = \hat{\beta}_0 + \mathbf{x}'\hat{\boldsymbol{\beta}}^* + \mathbf{x}'\hat{\mathbf{B}}\mathbf{x} \tag{19}$$

where $\hat{\beta}_0$ and the elements of $\hat{\boldsymbol{\beta}}^*$ and $\hat{\mathbf{B}}$ are the least-squares estimates of $\beta_0$ and the corresponding elements of $\boldsymbol{\beta}^*$ and $\mathbf{B}$, respectively.

The Method of Ridge Analysis
This is a useful procedure for optimizing the predicted response based on the fitted second-degree model in Eq. (19). It was introduced by Hoerl (Ref 22) and formalized by Draper (Ref 23). This method optimizes $\hat{y}(\mathbf{x})$ in Eq. (19) subject to $\mathbf{x}$ being on the surface of a hypersphere of radius $r$ and centered at the origin, namely,

$$\sum_{i=1}^{k} x_i^2 = r^2. \tag{20}$$

This constrained optimization is conducted using several values of $r$ corresponding to hyperspheres contained within the region $R$. The rationale for doing this is to get information about the optimum at various distances from the origin within $R$.

Since this optimization is subject to the equality constraint given by Eq. (20), the method of Lagrange


multipliers can be used to search for the optimum. Let us therefore consider the function

$$H = \hat{\beta}_0 + \mathbf{x}'\hat{\boldsymbol{\beta}}^* + \mathbf{x}'\hat{\mathbf{B}}\mathbf{x} - \lambda(\mathbf{x}'\mathbf{x} - r^2) \tag{21}$$

where $\lambda$ is a Lagrange multiplier. Differentiating $H$ with respect to $\mathbf{x}$ and equating the derivative to zero, we get

$$\hat{\boldsymbol{\beta}}^* + 2(\hat{\mathbf{B}}\mathbf{x} - \lambda\mathbf{x}) = \mathbf{0}. \tag{22}$$

Solving for $\mathbf{x}$, we obtain

$$\mathbf{x} = -\frac{1}{2}(\hat{\mathbf{B}} - \lambda\mathbf{I}_k)^{-1}\hat{\boldsymbol{\beta}}^*. \tag{23}$$

The solution in Eq. (23) represents just a stationary point of $\hat{y}(\mathbf{x})$. A maximum (minimum) is achieved at this point if the Hessian matrix, that is, the matrix $\frac{\partial}{\partial\mathbf{x}}\left[\frac{\partial H}{\partial\mathbf{x}'}\right]$ of second-order partial derivatives of $H$ with respect to $\mathbf{x}$ is negative definite (positive definite). From Eq. (22), this matrix is given by

$$\frac{\partial}{\partial\mathbf{x}}\left[\frac{\partial H}{\partial\mathbf{x}'}\right] = 2(\hat{\mathbf{B}} - \lambda\mathbf{I}_k). \tag{24}$$

Therefore, to achieve a maximum, Draper (Ref 23) suggested that $\lambda$ be chosen larger than the largest eigenvalue of $\hat{\mathbf{B}}$. Such a choice causes $\hat{\mathbf{B}} - \lambda\mathbf{I}_k$ to be negative definite. Choosing $\lambda$ smaller than the smallest eigenvalue of $\hat{\mathbf{B}}$ causes $\hat{\mathbf{B}} - \lambda\mathbf{I}_k$ to be positive definite, which results in a minimum. Thus, by choosing several values of $\lambda$ in this fashion, we can, for each $\lambda$, find the location of the optimum (maximum or minimum) by using formula (23) and hence obtain the value of $\mathbf{x}'\mathbf{x} = r^2$. The solution from Eq. (23) is feasible provided that $r$ corresponds to a hypersphere that falls entirely within the region $R$. The optimal value of $\hat{y}(\mathbf{x})$ is computed by substituting $\mathbf{x}$ from Eq. (23) into the right-hand side of Eq. (19). This process generates plots of $\hat{y}$ and $x_i$ against $r$ ($i = 1, 2, \ldots, k$). These plots are useful in determining, at any given distance $r$ from the origin, the value of the optimum as well as its location. More details concerning this method can be found in Myers and Montgomery (Ref 16), Khuri and Cornell (Ref 1, Section 5.7), and Box and Draper (Ref 15, Chapter 19).

Since the method of ridge analysis optimizes $\hat{y}(\mathbf{x})$ on concentric hyperspheres within the region $R$, its application is meaningful provided that the prediction variance in formula (11) is constant on the surface of any given hypersphere. This calls for the use of a rotatable design to fit model (18). If, however, the design is not rotatable, then the prediction variance can vary appreciably on the surface of a hypersphere, which may lead to poor estimates of the optimum response. For this reason, Khuri and Myers (Ref 24) proposed a modification of the method of ridge analysis whereby the optimization of $\hat{y}(\mathbf{x})$ is carried out under an added constraint on the size of the prediction variance. This modification can produce better optimization results when the design used to fit model (18) is not rotatable. More recently, Paul and Khuri (Ref 25) extended the use of Khuri and Myers' modification to linear models where the error variances are heterogeneous and also to generalized linear models.

PART II. FURTHER DEVELOPMENTS AND THE TAGUCHI ERA: 1976–1999

Multiresponse Experiments

In a multiresponse experiment, measurements on several responses are obtained for each setting of a group of control variables. Examples of multiresponse experiments are numerous; for example, a chemical engineer may be interested in maximizing the yield while minimizing the cost of a certain chemical process. Refs 26–28 cited several papers in which multiresponse experiments were studied. While analyzing the data from a multiresponse experiment, special attention should be given to the correlated nature of the data within experimental runs. Usually, it is assumed that the responses are correlated within runs but independent otherwise.

Suppose that $n$ is the number of experimental runs and $q$ is the number of responses. Then the $i$th response may be modeled as (see Ref 1, pp. 252–254)

$$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta}_i + \boldsymbol{\epsilon}_i, \quad i = 1, \ldots, q \tag{25}$$

where $\mathbf{y}_i$ is an $n \times 1$ vector of observations on the $i$th response, $\mathbf{X}_i$ is an $n \times p_i$ known matrix of rank $p_i$, $\boldsymbol{\beta}_i$ is a vector of $p_i$ unknown regression parameters, and $\boldsymbol{\epsilon}_i$ is a vector of random errors associated with the $i$th response. Using matrix notation, the above model can be expressed as

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon} \tag{26}$$

where $\mathbf{Y} = [\mathbf{y}_1', \ldots, \mathbf{y}_q']'$, $\mathbf{X}$ is the block-diagonal matrix, $\mathrm{diag}(\mathbf{X}_1, \ldots, \mathbf{X}_q)$, $\boldsymbol{\beta} = [\boldsymbol{\beta}_1', \ldots, \boldsymbol{\beta}_q']'$, and $\boldsymbol{\epsilon} = [\boldsymbol{\epsilon}_1', \ldots, \boldsymbol{\epsilon}_q']'$. It is assumed that $E(\boldsymbol{\epsilon}) = \mathbf{0}$ and the variance–covariance matrix $\mathrm{Var}(\boldsymbol{\epsilon}) = \boldsymbol{\Sigma} \otimes \mathbf{I}$, where $\mathbf{I}$ is the $n \times n$ identity matrix. The best linear unbiased estimate of $\boldsymbol{\beta}$ is given by

$$\hat{\boldsymbol{\beta}} = [\mathbf{X}'(\boldsymbol{\Sigma}^{-1} \otimes \mathbf{I})\mathbf{X}]^{-1}\mathbf{X}'(\boldsymbol{\Sigma}^{-1} \otimes \mathbf{I})\mathbf{Y}. \tag{27}$$
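The estimate in Eq. (27) can be computed directly from its definition when $\boldsymbol{\Sigma}$ is treated as known. The sketch below is a minimal illustration under that assumption; the helper names, the two-response design, and the numerical values of the parameters and of $\boldsymbol{\Sigma}$ are all hypothetical and chosen only to make the example self-contained.

```python
import numpy as np


def block_diag(blocks):
    """Block-diagonal matrix diag(X_1, ..., X_q) built with NumPy only."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out


def multiresponse_blue(X_list, y_list, Sigma):
    """Eq. (27): [X'(S^-1 (x) I)X]^-1 X'(S^-1 (x) I)Y for known Sigma."""
    n = X_list[0].shape[0]
    X = block_diag(X_list)
    Y = np.concatenate(y_list)
    W = np.kron(np.linalg.inv(Sigma), np.eye(n))   # Sigma^{-1} (x) I_n
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ Y)


# Hypothetical two-response example on a 2^2 factorial with one center run.
D = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1], [0, 0]], dtype=float)
X1 = np.column_stack([np.ones(5), D])            # response 1: both factors
X2 = np.column_stack([np.ones(5), D[:, 0]])      # response 2: x1 only
beta1 = np.array([5.0, 1.0, -2.0])
beta2 = np.array([3.0, 0.5])
Sigma = np.array([[1.0, 0.6], [0.6, 2.0]])       # assumed known

# Noise-free responses, so the estimate should reproduce the true values.
beta_hat = multiresponse_blue([X1, X2], [X1 @ beta1, X2 @ beta2], Sigma)
```

Forming the full Kronecker product is convenient for small examples such as this one but scales poorly with $n$ and $q$; it is used here only because it mirrors the formula term by term.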


An estimate of $\boldsymbol{\Sigma}$ is used to find $\hat{\boldsymbol{\beta}}$ when $\boldsymbol{\Sigma}$ is unknown. A commonly used estimate of $\boldsymbol{\Sigma}$ is the one proposed by Zellner (Ref 29).

In selecting a design optimality criterion for multiresponse experiments, one needs to consider all the responses simultaneously. Draper and Hunter (Ref 30) proposed a criterion for the estimation of the unknown parameters in a multiresponse situation. Their criterion was used for selecting additional experimental runs after a certain number of runs have already been chosen. The approach used was Bayesian, with the unknown parameter vector having a uniform prior distribution. The variance–covariance matrix of the responses was assumed to be known. Box and Draper (Ref 31) later extended Draper and Hunter's (Ref 30) criterion by considering response surface designs with blocks. The most common criterion for multiresponse designs is the D-optimality criterion. One such criterion was developed by Fedorov (Ref 32) for linear multiresponse designs. Fedorov's procedure was sequential in nature and was used for constructing D-optimal designs. However, it required knowledge of the variance–covariance matrix associated with the several response variables. Wijesinha and Khuri (Ref 33) later modified Fedorov's procedure by using an estimate of $\boldsymbol{\Sigma}$ at each step of the sequential process. Some recent works on D-optimal designs for linear multiresponse models include those of Krafft and Schaefer (Ref 34), Bischoff (Ref 35), Chang (Refs 36, 37), Imhof (Ref 38), and Atashgah and Seifi (Ref 39). Locally D-optimal designs for describing the behavior of a biological system were constructed by Hatzis and Larntz (Ref 40) for nonlinear multiresponse models. Other design criteria for linear multiresponse models include the power criterion of Wijesinha and Khuri (Ref 41) and the robustness criterion of Wijesinha and Khuri (Ref 42) and Yue (Ref 43).

The model given in Eq. (26) is said to suffer from lack of fit if it does not represent the $q$ true mean responses adequately. Due to the correlated nature of the responses in a multiresponse situation, lack of fit of one response may affect the fit of the other responses. Khuri (Ref 44) proposed a multivariate test of lack of fit that considers all the $q$ responses simultaneously. His test was based on Roy's union intersection principle (Ref 45, Chapters 4 and 5) and required that replicated observations on all responses be taken at

the responses within a region of interest. In multiresponse experiments, the meaning of 'optimum' is sometimes unclear as there is no unique way to order the multiresponse data. Conditions that are optimal for one response may be far from optimal or even physically impractical for the other responses from the experimental point of view. For example, in a dose–response experiment, where both efficacy and toxicity responses are measured at each dose, the experimenter may wish to find the dose level of the drug(s) which simultaneously maximizes efficacy while minimizing toxicity. Common knowledge is that as the dose of a drug increases, so do its efficacy and toxic side effects. This implies that the efficacy response is optimized at a higher dose level, whereas the toxicity response is minimized at a lower dose level. Thus, it is difficult to identify dose levels which are optimal for both responses. The problem of simultaneous optimization for linear multiresponse models was addressed in Refs 47–50.

Lind et al. (Ref 47) developed a graphical approach in which contours of all the responses were superimposed on each other and the region where operating conditions were 'near' optimal for all the responses was identified. As the number of responses and control factors increases, finding the optimum graphically becomes infeasible. Myers and Carter (Ref 51) proposed the dual response system consisting of a primary response and a secondary response. The procedure involved determining the operating conditions for which the primary response was optimized while the secondary response was constrained to be equal to some prespecified value. Refs 52–55 provided various extensions to the dual response approach.

Harrington (Ref 48) developed the desirability approach to multiresponse optimization. In his algorithm, exponential type transformations were used to transform each of the responses into desirability functions. Derringer and Suich (Ref 49) later generalized the transformations and developed more flexible desirability functions. The individual desirability functions were then incorporated into a single function, which gave the desirability for the whole set of responses. Both Refs 48 and 49 used the geometric mean of the individual desirability functions to
some points of the experimental region. Levy and construct a single overall desirability function, which
Neill46 considered additional multivariate lack of fit was maximized to determine the operating conditions.
tests and used simulations to compare the power Del Castillo et al.56 modified the desirability approach
functions of these tests. of Ref 49 such that both the desirability function and
One important objective of multiresponse exper- its first derivative were continuous. However, their
imentation is to determine the optimum operating approach ignored variations and correlations existing
conditions on the control variables that lead to the among the responses. Wu57 presented an approach
simultaneous optimization of the predicted values of based on the modified double-exponential desirability

function taking into account correlations among the responses.

Multiresponse optimization using the so-called generalized distance approach was introduced by Khuri and Conlon.50 The main characteristic of this approach was that it took into account the heterogeneity of the variances of the responses and also the correlated nature of the responses. If the individual optima were not attained at the same settings of the control factors, then compromise conditions on the input variables that are favorable to all the mean responses were determined. The deviation from the ideal optimum was measured by a distance function expressed in terms of the estimated mean responses along with their variance–covariance matrix. By minimizing such a distance function, Khuri and Conlon arrived at a set of conditions for a compromise optimum. Vining58 proposed a mean squared error method to determine the compromise optimum for a multiresponse experiment. Pignatiello59 and Ames et al.60 also proposed approaches based on the squared error loss. A comprehensive survey of the various methods of multiresponse optimization was presented in Ref 61.

Taguchi's Robust Parameter Design
Robust parameter design is a well-established engineering technique to increase the quality of a product by making it robust/insensitive to the uncontrollable variations present in the production process. Since the introduction of parameter design in the United States by Genichi Taguchi during the 1980s, a multitude of papers by Kackar,62 Taguchi and Wu,63 Taguchi,64 Nair and Shoemaker65 and books authored by Khuri and Cornell (Chapter 11),1 Taguchi,66 Phadke,67 Wu and Hamada,68 and Myers and Montgomery,69 and several other authors, have been written on the topic. Review articles by Myers et al.70 and Robinson et al.71 cite several papers based on robust parameter designs. Taguchi proposed that the input variables in an experiment were of two types, (1) control factors: easy to control and (2) noise factors: difficult to control. These difficult-to-control noise factors are the cause of variations in a production process. The main aim of parameter design is to determine the settings of the control factors for which the process response is robust to the variability in the system caused by the noise factors. To achieve this goal, Taguchi advocated the use of crossed arrays by crossing an orthogonal array of control variables (inner array) with an orthogonal array of noise variables (outer array). Taguchi identified that there were three specific goals in an experiment:

1. The smaller, the better: minimizing the response.
2. The larger, the better: maximizing the response.
3. Target is best: achieving a given target value.

For each of the different goals, Taguchi defined performance criteria known as signal-to-noise (S/N) ratios that took into account both the process mean and the variance. Each set of settings of the control variables contained n runs in the noise variables from the outer array. For each of the three different goals, he defined the S/N ratios as follows:

1. The smaller, the better: $-10\log\left[\frac{1}{n}\sum_{i=1}^{n} y_i^2\right]$.
2. The larger, the better: $-10\log\left[\frac{1}{n}\sum_{i=1}^{n} 1/y_i^2\right]$.
3. Target is best: $10\log(\bar{y}^2/s^2)$, where $\bar{y}$ is the sample mean and $s^2$ is the sample variance.

All the above S/N ratios are to be maximized. Although the Taguchi method was a significant step toward quality improvement, it received a number of criticisms. It was pointed out (see Ref 69) that in the Taguchi methodology (1) interactions among the control factors were not estimated, (2) large numbers of experimental runs were required, and (3) S/N ratios were unable to distinguish between inputs affecting the process mean and those affecting the variance. Some of the authors who discussed the Taguchi methodology in detail and offered criticisms were Myers and Montgomery,69 Box,72,73 Easterling,74 Pignatiello and Ramberg,75 Nair and Pregibon,76 Welch et al.,77 and Nair.78

Response Surface Approach to Robust Parameter Design
Two response surface approaches to parameter design were introduced during the 1990s. The approaches were (1) the dual response approach and (2) the single model approach. In the dual response approach, separate models were fitted to the process mean and the process variance. In the single model approach, as the name suggests, a single model containing both the noise and the control variables was fitted to the process response.

Vining and Myers79 were the first to propose that Taguchi's aim of keeping the mean on target while minimizing the process variance could also be achieved in a response surface framework. They fitted separate second-degree models to the process
mean ($\mu$) and the process variance ($\sigma^2$),

$$\hat{\mu} = b_0 + x'b + x'Bx, \tag{28}$$
$$\hat{\sigma}^2 = c_0 + x'c + x'Cx \tag{29}$$

where $b_0$, $b$, $B$, $c_0$, $c$, $C$ were the estimates of the coefficients. Using the dual response optimization of Ref 51, $\hat{\sigma}^2$ was minimized while keeping the process mean at target.

Del Castillo and Montgomery80 suggested the use of nonlinear programming to solve an optimization problem similar to the one proposed by Vining and Myers,79 but with the equality constraints replaced by inequalities. Criticizing the use of Lagrangian multipliers and equality constraints, Lin and Tu81 proposed a procedure based on the mean squared error (MSE) criterion. Using the same examples as discussed by Vining and Myers79 and Del Castillo and Montgomery,80 they showed that more reduction in the variance was possible by introducing a little bias. One criticism of using the MSE was that no restriction was placed on the distance of the mean from the target. Addressing this problem, Copeland and Nelson82 minimized the variance while keeping the distance between the mean and the target less than some specified quantity, $\Delta$. For processes where it was important to keep the mean near the target, $\Delta$ was chosen to be small. Various other extensions to the dual response approach were suggested by Fan,55 Kim and Lin,83 Del Castillo et al.,84 Koksoy and Doganaksoy,85 Kim and Cho,86 and Tang and Xu.87

To overcome the shortcomings (requirement of too many runs and being unable to fit interaction terms) of Taguchi's crossed array, Welch et al.77 proposed a combined array, which was a single experimental design for both the control and the noise variables. The combined array was shown to be more economical than the crossed arrays of Taguchi (see Refs 77, 88, 89). Myers et al.90 used the combined array of Welch et al.77 to fit a single model containing both the control and noise variables to the response variable,

$$y(x, z) = \beta_0 + g'(x)\beta + z'\delta + g'(x)\Delta z + \epsilon \tag{30}$$

where $x$ and $z$ are the control and noise variables, respectively. In the above model, $g'(x)$ is a row of the design matrix containing polynomial and interaction terms in the control variables, $\beta$ and $\delta$ are the vectors of regression coefficients for the control and noise variables, respectively, and $\Delta$ contains the interaction coefficients. They showed that although a single model in the noise and control variables was fitted to the response, there were still two response surfaces for the process mean and the variance,

$$E[y(x, z)] = \beta_0 + g'(x)\beta \tag{31}$$

and

$$\mathrm{Var}[y(x, z)] = [\delta' + g'(x)\Delta]\,\mathrm{Var}(z)\,[\delta' + g'(x)\Delta]' + \sigma_e^2. \tag{32}$$

To solve the parameter design problem, they chose the estimated squared error loss as the performance criterion,

$$E[y(x, z) - T]^2 \tag{33}$$

where T was the prespecified target value, and minimized it with respect to $x$. Myers et al.90 proposed a linear mixed effects approach, in which the elements of $\delta$ and $\Delta$ in Eq. (30) were treated as random. Aggarwal and Bansal91 and Aggarwal et al.92 considered robust parameter designs involving both quantitative and qualitative factors. Brenneman and Myers93 considered the single model in the control and noise variables to model the response. In their model, the noise variables were considered to be categorical in nature.

PART III. EXTENSIONS AND NEW DIRECTIONS: 2000 ONWARDS

Response Surface Models with Random Effects
The response surface models we have considered so far include only fixed polynomial effects. These models are suitable whenever the levels of the factors considered in a given experiment are of particular interest to the experimenter, for example, the temperature and concentrations of various chemicals in a certain chemical reaction. There are, however, other experimental situations where, in addition to the main control factors, the response may be subject to variations due to the presence of some random effects. For example, the raw material used in a production process may be obtained in batches selected at random from the warehouse supply. Because batches may differ in quality, the response model should include a random effect to account for the batch-to-batch variability. In this section, we consider response surface models which, in addition to the fixed polynomial effects, include random effects.

Let $\mu(x)$ denote the mean of a response variable, y, at a point $x = (x_1, x_2, \ldots, x_k)'$. It is assumed that
$\mu(x)$ is represented by a polynomial model of degree d ($\geq 1$) of the form

$$\mu(x) = f'(x)\beta. \tag{34}$$

Suppose that the experimental runs used to estimate the model parameters in Eq. (34) are heterogeneous due to an extraneous source of variation referred to as a block effect. The experimental runs are therefore divided into b blocks of sizes $n_1, n_2, \ldots, n_b$. Let $n = \sum_{i=1}^{b} n_i$ be the total number of observations. The block effect is considered random, and the actual response value at the uth run is represented by the model

$$y_u = f'(x_u)\beta + z_u'\gamma + g'(x_u)\Lambda z_u + \epsilon_u, \quad u = 1, 2, \ldots, n \tag{35}$$

where $g(x)$ is such that $f'(x) = [1, g'(x)]$, $x_u$ is the value of $x$ at the uth run, $z_u = (z_{u1}, z_{u2}, \ldots, z_{ub})'$, where $z_{ui}$ is an indicator variable taking the value 1 if the uth trial is in the ith block and the value 0 otherwise, $\gamma = (\gamma_1, \gamma_2, \ldots, \gamma_b)'$, where $\gamma_i$ denotes the effect of the ith block, and $\epsilon_u$ is a random experimental error (i = 1, 2, ..., b; u = 1, 2, ..., n). The matrix $\Lambda$ contains interaction coefficients between the blocks and the fixed polynomial terms in the model. Because the polynomial portion in Eq. (35) is fixed and the elements of $\gamma$ and $\Lambda$ are random, model (35) is considered to be a mixed-effects model. It can be expressed in vector form as

$$y = X\beta + Z\gamma + \sum_{j=2}^{p} U_j\delta_j + \epsilon \tag{36}$$

where the matrix $X$ is of order $n \times p$ and is the same as in model (6), $Z$ is a block-diagonal matrix of order $n \times b$ of the form

$$Z = \mathrm{diag}(1_{n_1}, 1_{n_2}, \ldots, 1_{n_b}). \tag{37}$$

$U_j$ is a matrix of order $n \times b$ whose ith column is obtained by multiplying the elements of the jth column of $X$ with the corresponding elements of the ith column of $Z$ (i = 1, 2, ..., b; j = 2, 3, ..., p), and $\delta_j$ is a vector of interaction coefficients between the blocks and the jth polynomial term (j = 2, 3, ..., p) in model (35). Note that $\delta_j$ is the same as the transpose of the jth row of $\Lambda$ in Eq. (35).

We assume that $\gamma, \delta_2, \delta_3, \ldots, \delta_p$ are normally and independently distributed with zero means and variance–covariance matrices $\sigma_\gamma^2 I_b, \sigma_2^2 I_b, \ldots, \sigma_p^2 I_b$, respectively. The random error vector, $\epsilon$, is also assumed to be independent of all the other random effects and is distributed as $N(0, \sigma_\epsilon^2 I_n)$. Consequently, the mean of $y$ and its variance–covariance matrix are given by

$$E(y) = X\beta, \tag{38}$$
$$\mathrm{Var}(y) = \sigma_\gamma^2 ZZ' + \sum_{j=2}^{p} \sigma_j^2 U_j U_j' + \sigma_\epsilon^2 I_n. \tag{39}$$

On the basis of Eqs. (38) and (39), the best linear unbiased estimator of $\beta$ is the generalized least-squares estimator (GLSE), $\hat{\beta}$,

$$\hat{\beta} = (X'\Delta^{-1}X)^{-1}X'\Delta^{-1}y \tag{40}$$

where

$$\Delta = \frac{1}{\sigma_\epsilon^2}\mathrm{Var}(y) = \frac{\sigma_\gamma^2}{\sigma_\epsilon^2} ZZ' + \sum_{j=2}^{p} \frac{\sigma_j^2}{\sigma_\epsilon^2} U_j U_j' + I_n. \tag{41}$$

The variance–covariance matrix of $\hat{\beta}$ is

$$\mathrm{Var}(\hat{\beta}) = (X'\Delta^{-1}X)^{-1}\sigma_\epsilon^2. \tag{42}$$

The GLSE of $\beta$ requires knowledge of the ratios of variance components, $\sigma_\gamma^2/\sigma_\epsilon^2, \sigma_2^2/\sigma_\epsilon^2, \ldots, \sigma_p^2/\sigma_\epsilon^2$. Because the variance components, $\sigma_\gamma^2, \sigma_2^2, \ldots, \sigma_p^2, \sigma_\epsilon^2$, are unknown, they must first be estimated. Let $\hat{\sigma}_\gamma^2, \hat{\sigma}_2^2, \ldots, \hat{\sigma}_p^2, \hat{\sigma}_\epsilon^2$ denote the corresponding estimates of the variance components. Substituting these estimates in Eq. (41), we get

$$\hat{\Delta} = \frac{\hat{\sigma}_\gamma^2}{\hat{\sigma}_\epsilon^2} ZZ' + \sum_{j=2}^{p} \frac{\hat{\sigma}_j^2}{\hat{\sigma}_\epsilon^2} U_j U_j' + I_n. \tag{43}$$

Using $\hat{\Delta}$ in place of $\Delta$ in Eq. (40), we get the so-called estimated generalized least-squares estimator (EGLSE) of $\beta$, denoted by $\hat{\beta}^*$,

$$\hat{\beta}^* = (X'\hat{\Delta}^{-1}X)^{-1}X'\hat{\Delta}^{-1}y. \tag{44}$$

The corresponding estimated variance–covariance matrix of $\hat{\beta}^*$ is approximately given by

$$\mathrm{Var}(\hat{\beta}^*) \approx (X'\hat{\Delta}^{-1}X)^{-1}\hat{\sigma}_\epsilon^2. \tag{45}$$
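As a rough illustration of Eqs. (43)–(45), the following Python sketch computes the EGLSE and its approximate variance–covariance matrix for the random block model. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `egls`, its arguments, and the small linear-algebra helpers are hypothetical, the runs are assumed to be ordered block by block, the first column of X is assumed to be the intercept, and the variance-component estimates are assumed to be supplied by the caller (for example, from an ML or REML fit).

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


def egls(X, y, block_sizes, var_gamma, var_delta, var_eps):
    """EGLSE of the fixed effects and its approximate covariance matrix.

    X           : n x p model matrix (list of rows), intercept in column 1
    y           : list of n responses, ordered block by block (assumption)
    block_sizes : [n_1, ..., n_b]
    var_gamma   : estimate of the block variance component
    var_delta   : estimates of the p - 1 interaction variance components
    var_eps     : estimate of the error variance
    """
    n, p = len(X), len(X[0])
    # Block label of each run.
    block = [i for i, size in enumerate(block_sizes) for _ in range(size)]
    # Scaled covariance matrix of Eq. (43). Both ZZ' and U_j U_j' vanish
    # for two runs in different blocks, so only same-block pairs contribute.
    scaled = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if block[u] == block[v]:
                s = var_gamma
                for j in range(1, p):
                    s += var_delta[j - 1] * X[u][j] * X[v][j]
                scaled[u][v] = s / var_eps
        scaled[u][u] += 1.0
    Xt = transpose(X)
    inv_X = solve(scaled, X)                    # scaled^{-1} X
    inv_y = solve(scaled, [[v] for v in y])     # scaled^{-1} y
    info = matmul(Xt, inv_X)                    # X' scaled^{-1} X
    beta = [row[0] for row in solve(info, matmul(Xt, inv_y))]           # Eq. (44)
    identity = [[float(i == j) for j in range(p)] for i in range(p)]
    cov = [[var_eps * v for v in row] for row in solve(info, identity)]  # Eq. (45)
    return beta, cov


# Hypothetical data: two blocks of three runs, first-degree model in one factor.
X = [[1.0, -1.0], [1.0, 0.0], [1.0, 1.0], [1.0, -1.0], [1.0, 0.0], [1.0, 1.0]]
y = [-0.8, 1.1, 3.2, -1.3, 0.9, 2.7]
beta_star, cov_star = egls(X, y, [3, 3], var_gamma=0.5, var_delta=[0.2], var_eps=1.0)
print(beta_star, cov_star)
```

When all the random-effect variance components are set to zero, the scaled matrix reduces to the identity and the function returns the ordinary least-squares estimate together with the usual fixed-effects covariance matrix.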

Furthermore, the predicted response at a point $x$ in a region R is

$$\hat{y}(x) = f'(x)\hat{\beta}^*. \tag{46}$$

This also gives an estimate of the mean response $\mu(x)$ in Eq. (34). The corresponding prediction variance is approximately given by

$$\mathrm{Var}[\hat{y}(x)] \approx \hat{\sigma}_\epsilon^2 f'(x)(X'\hat{\Delta}^{-1}X)^{-1}f(x). \tag{47}$$

Estimates of the variance components can be obtained by using either the method of maximum likelihood (ML) or the method of restricted maximum likelihood (REML). These estimates can be easily obtained by using PROC MIXED in SAS.94

Tests concerning the fixed effects (i.e., the elements of $\beta$) in the mixed model (35) can be carried out by using the EGLSE of $\beta$ and its estimated variance–covariance matrix in Eq. (45). More specifically, to test, for example, the hypothesis,

$$H_0 : a'\beta = c \tag{48}$$

where $a$ is a given vector of constants and c is a given constant, the corresponding test statistic is

$$t = \frac{a'\hat{\beta}^* - c}{[a'(X'\hat{\Delta}^{-1}X)^{-1}a\,\hat{\sigma}_\epsilon^2]^{1/2}} \tag{49}$$

which, under $H_0$, has approximately the t-distribution with $\nu$ degrees of freedom. Several methods are available in PROC MIXED in SAS for estimating $\nu$. The preferred method is the one based on Kenward and Roger's95 procedure.

Of secondary importance is testing the significance of the random effects in the mixed model (35). The test statistic for testing the hypothesis,

$$H_0 : \sigma_\gamma^2 = 0 \tag{50}$$

is given by the F-ratio,

$$F = \frac{R(\gamma \mid \beta, \delta_2, \delta_3, \ldots, \delta_p)}{(b - 1)\,\mathrm{MSE}} \tag{51}$$

where MSE is the error (residual) mean square for model (35), and $R(\gamma \mid \beta, \delta_2, \delta_3, \ldots, \delta_p)$ is the type III sum of squares for the $\gamma$-effect (block effect). Under $H_0$, F has the F-distribution with $b - 1$ and m degrees of freedom, where $m = n - b - (p - 1)b$. Similarly, to test the hypothesis

$$H_{0j} : \sigma_j^2 = 0, \quad j = 2, 3, \ldots, p \tag{52}$$

we can use the F-ratio,

$$F_j = \frac{\text{Type III S.S. for } \delta_j}{(b - 1)\,\mathrm{MSE}}, \quad j = 2, 3, \ldots, p \tag{53}$$

which, under $H_{0j}$, has the F-distribution with $b - 1$ and m degrees of freedom. Type III sum of squares for $\delta_j$ is the sum of squares obtained by adjusting $\delta_j$ for all the remaining effects in model (35). More details concerning these tests can be found in Ref 96.

It should be noted that in the case of the mixed model in Eq. (35), the process variance (i.e., the variance–covariance matrix of $y$) in Eq. (39) is no longer of the form $\sigma_\epsilon^2 I_n$, as is the case with classical response surface models with only fixed effects where the response variance is assumed constant throughout the region R. Furthermore, the process variance depends on the settings of the control variables, and hence on the chosen design, as can be seen from formula (39). This lack of homogeneity in the values of the response variance should be taken into account when searching for the optimum response over R on the basis of the predicted response expression given in Eq. (46). In particular, the size of the prediction variance in Eq. (47) plays an important role in determining the operating conditions on the control variables that lead to optimum values of $\hat{y}(x)$ over R. This calls for an application of modified ridge analysis by Khuri and Myers,24 which was mentioned earlier. More details concerning such response optimization can be found in Ref 96 and also in Khuri and Cornell1 (Section 8.3.3).

More recently, the analysis of the mixed model in Eq. (35) under the added assumption that the experimental error variance is different for the different blocks was outlined in Ref 97. This provides an extension of the methodology presented in this section to experimental situations where the error variance can change from one block to another due to some extraneous sources of variation such as machine malfunction in a production process over a period of time.

Introduction to Generalized Linear Models
Generalized linear models (GLMs) were first introduced by Nelder and Wedderburn98 as an extension of the class of linear models. They are used to fit discrete as well as continuous data having a variety of parent distributions. The traditional assumptions of normality and homogeneous variances of the response data, usually made in an analysis of variance (or regression) situation, are no longer needed. A classic book on GLMs is the one by McCullagh and Nelder.99 In addition, the more recent books by Lindsey,100 Dobson,101
McCulloch and Searle,102 and Myers et al.103 provide added insight into the application and usefulness of GLMs.

In a GLM situation, the response variable, y, is assumed to follow a distribution from the exponential family. This includes the normal as well as the binomial, Poisson, and gamma distributions. The mean response is modeled as a function of the form $\mu(x) = h[f'(x)\beta]$, where $x = (x_1, \ldots, x_k)'$, $f(x)$ is a known vector function of order $p \times 1$, and $\beta$ is a vector of p unknown parameters. The function $f'(x)\beta$ is called the linear predictor and is usually denoted by $\eta(x)$. It is assumed that $h(\cdot)$ is a strictly monotone function. Using the inverse of the function $h(\cdot)$, one can express $\eta(x)$ as $g[\mu(x)]$. The function $g(\cdot)$ is called the link function.

Estimation of $\beta$ is based on the method of maximum likelihood using an iterative weighted least-squares procedure (see Ref 99, pp. 40–43). The estimate of $\beta$ is then used to estimate the mean response $\mu(x)$.

A good design is one which has a low prediction variance or low mean-squared error of prediction (MSEP) (expressions for the prediction variance and MSEP are given in Ref 104). However, the prediction variance and the MSEP for GLMs depend on the unknown parameters of the fitted model. Thus, to minimize either criterion, we require some prior knowledge about $\beta$. This leads to the design dependence problem of GLMs. Some common approaches to this problem are listed below. A detailed review of design issues for GLMs can be found in Ref 105.

• Locally optimal designs: designs for GLMs depend on the unknown parameters of the fitted model. Due to this dependence, the construction of a design requires some prior knowledge of the parameters. If initial values of the parameters are assumed, then a design obtained on the basis of an optimality criterion, such as D-optimality or A-optimality, is called locally optimal. The adequacy of such a design depends on how close the initial values are to the true values of the parameters. A key reference in this area is the one by Mathew and Sinha106 concerning designs for a logistic regression model. Other related works include those of Abdelbasit and Plackett,107 Minkin,108 Khan and Yazdi,109 Wu,110 and Sitter and Wu.111

• Sequential designs: in this approach, experimentation is not stopped at the initial stage. Instead, using the information obtained, initial estimates of the parameters are updated and used to find additional design points in the subsequent stages. This process is carried out until convergence is achieved with respect to some optimality criterion, for example, D-optimality. Sequential designs were proposed by Wu,112 Sitter and Forbes,113 and Sitter and Wu,114 among others.

• Bayesian designs: in the Bayesian approach, a prior distribution is assumed on the parameter vector, $\beta$ (as in the linear predictor), which is then incorporated into an appropriate design criterion by integrating it over the prior distribution. For example, one criterion maximizes the average over the prior distribution of the logarithm of the determinant of Fisher's information matrix. This criterion is equivalent to D-optimality in linear models. Bayesian versions of other alphabetic optimality criteria, such as A-optimality, can also be used. One of the early papers on the Bayesian D-optimality criterion is the one by Zacks.115 Later, the Bayesian approach was discussed by several authors including Chaloner,116 Chaloner and Larntz,117,118 and Chaloner and Verdinelli.119 Designs for a family of exponential models were presented by Dette and Sperlich120 and Mukhopadhyay and Haines121 (see Ref 122). Atkinson et al.123 developed D- and $D_s$- (optimal for a subset of parameters) optimal Bayesian designs for a compartmental model.

• Quantile dispersion graphs (QDGs) approach: this approach was recently introduced by Robinson and Khuri124 in a logistic regression situation. In this graphical technique, designs were compared on the basis of their quantile dispersion profiles. Since in small samples the parameter estimates are often biased, Robinson and Khuri124 considered the mean-squared error of prediction (MSEP) as a criterion for comparing designs. Khuri and Mukhopadhyay104 later applied the QDG approach to compare designs for log-linear models representing Poisson-distributed data.

• Robust design approach: in this approach, a min–max procedure is used to obtain designs robust to poor initial parameter estimates. Sitter125 applied the min–max procedure to binary data and used the D-optimality and the Fieller optimality criteria to select designs. It was shown by Sitter125 that his D-optimal designs were more robust to poor initial parameters than locally D-optimal designs for binary data. An extension of Sitter's work was given by King and Wong.126
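The design dependence problem described above can be seen numerically in a small case. The sketch below assumes a simple logistic model with linear predictor η = β0 + β1 x (an illustrative choice, not a model taken from this article) and evaluates the D-criterion, that is, the determinant of the normalized Fisher information matrix, for a design given by its support points and weights. The function names, the two candidate designs, and the parameter guesses are all hypothetical.

```python
import math


def logistic_information(points, weights, beta0, beta1):
    """Entries (m00, m01, m11) of the 2 x 2 normalized Fisher information
    matrix for a logistic model with linear predictor beta0 + beta1 * x."""
    m00 = m01 = m11 = 0.0
    for x, w in zip(points, weights):
        p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
        v = w * p * (1.0 - p)      # design weight times binomial variance function
        m00 += v
        m01 += v * x
        m11 += v * x * x
    return m00, m01, m11


def d_criterion(points, weights, beta0, beta1):
    """Determinant of the information matrix; larger is better under D-optimality."""
    m00, m01, m11 = logistic_information(points, weights, beta0, beta1)
    return m00 * m11 - m01 * m01


# Two candidate two-point designs with equal weights (assumed for illustration).
wide = ([-1.5434, 1.5434], [0.5, 0.5])
narrow = ([-1.0, 1.0], [0.5, 0.5])

# Evaluate both designs under two different guesses of the slope parameter.
for beta1_guess in (1.0, 3.0):
    d_wide = d_criterion(*wide, 0.0, beta1_guess)
    d_narrow = d_criterion(*narrow, 0.0, beta1_guess)
    print(beta1_guess, d_wide, d_narrow)
```

Under the guess β1 = 1 the wider design has the larger determinant, whereas under β1 = 3 the ranking reverses. A design chosen as locally D-optimal for one parameter guess can therefore perform poorly when the guess is far from the true value, which is the motivation for the sequential, Bayesian, and robust approaches listed above.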

Some recent works on designs for GLMs include those by Dror and Steinberg,127 Woods et al.,128 and Russell et al.129 All of these papers are focused on GLMs with several independent variables. Dror and Steinberg127 and Russell et al.129 used clustering techniques for constructing optimal designs.

All the above references discuss design issues for GLMs with a single response. Very little work has been done on multiresponse or multivariate GLMs, particularly in the design area. Such models are considered whenever several response variables can be measured for each setting of a group of control variables, and the response variables are adequately represented by GLMs. Books by Fahrmeir and Tutz130 and McCullagh and Nelder99 discuss the analysis of multivariate GLMs.

In multivariate generalized linear models (GLMs), the q-dimensional vector of responses, $y$, is assumed to follow a distribution from the exponential family. The mean response $\mu(x) = [\mu_1(x), \ldots, \mu_q(x)]'$ at a given point $x$ in the region of interest, R, is related to the linear predictor $\eta(x) = [\eta_1(x), \ldots, \eta_q(x)]'$ by the link function $g : R^q \to R^q$,

$$\eta(x) = Z'(x)\beta = g[\mu(x)] \tag{54}$$

where $x = (x_1, \ldots, x_k)'$, $Z(x) = \oplus_{i=1}^{q} f_i(x)$, $f_i(x)$ is a known vector function of $x$, and $\beta$ is a vector of unknown parameters. If the inverse of g, denoted by h, exists, where $h : R^q \to R^q$, then

$$\mu(x) = h[\eta(x)] = h[Z'(x)\beta]. \tag{55}$$

Estimation of $\beta$ is based on the method of maximum likelihood using an iterative weighted least-squares procedure (see Ref 130, p. 106). The variance–covariance matrix, $\mathrm{Var}(\hat{\beta})$, is dependent on the unknown parameter vector $\beta$. This causes the design dependence problem in multivariate GLMs. Some of the key references for optimal designs in multivariate GLMs are Refs 131 and 132. Heise and Myers131 studied optimal designs for bivariate logistic regression, whereas Zocchi and Atkinson's132 work was based on optimal designs for multinomial logistic models. A recent work by Mukhopadhyay and Khuri133 compares designs for multivariate GLMs using the technique of quantile dispersion graphs.

The problem of optimization in a GLM environment is not as well developed as in the case of linear models. In single-response GLMs, Paul and Khuri25 used modified ridge analysis to carry out optimization of the response. Instead of optimizing the mean response directly, Paul and Khuri25 optimized the linear predictor. Mukhopadhyay and Khuri134 used the generalized distance approach, initially developed for the simultaneous optimization of several linear response surface models, for optimization in a multivariate GLM situation.

Application of GLMs to the robust parameter design problem has been discussed by several authors including Nelder and Lee,135 Engel and Huele,136 Brinkley et al.,137 Hamada and Nelder,138 Nelder and Lee,139 Lee and Nelder,140 and Myers et al.141 Nelder and Lee135 modeled the mean and the variance separately using GLMs. Both mean and variance were functions of the control factors. Nelder and Lee139 modeled the mean as a function of both the control and the noise variables, whereas the variance was a function of the control variables only. Engel and Huele136 adopted the single response model of Myers et al.90 and assumed nonconstant error variances. In their paper, the process variance depended on the noise variables as well as the residual variance. They modeled the residual variance using an exponential model which guaranteed positive variance estimates. This exponential model was previously used by Box and Meyer,142 Grego,143 and Chan and Mak.144 A recent paper by Robinson et al.145 proposed the use of generalized mixed models in a situation where the response was nonnormal and the noise variable was a random effect.

Graphical Procedures for Assessing the Prediction Capability of a Response Surface Design
Standard design optimality criteria usually base their evaluations on a single number, like D-efficiency, but do not consider the quality of prediction throughout the experimental region. However, the prediction capability of any response surface design does not remain constant throughout the experimental region. Thus, rather than relying on a single-number design criterion, a study of the prediction capability of the design throughout the design region should give more information about the design's performance. Three graphical methods have been proposed to study the performance of a design throughout the experimental region.

1. Variance dispersion graphs: Giovannitti-Jensen and Myers146 and Myers et al.147 proposed the graphical technique of variance dispersion graphs (VDGs). VDGs are two-dimensional plots displaying the maximum, minimum, and average of the prediction variance on concentric spheres, chosen within the experimental region,
against their radii. The prediction variance is the variance of the predicted response,

$$\hat{y}(x) = f'(x)\hat{\beta} \tag{56}$$

as given in formula (10), where $f(x)$ is a known vector function of $x = (x_1, \ldots, x_k)'$ and $\hat{\beta}$ is the least-squares estimate of $\beta$ in Eq. (7). As in Eq. (11), the prediction variance is

$$\mathrm{Var}[\hat{y}(x)] = f'(x)(X'X)^{-1}f(x)\,\sigma^2. \tag{57}$$

A scaled version of the prediction variance (SPV) is given by

$$\frac{N\,\mathrm{Var}[\hat{y}(x)]}{\sigma^2} = N f'(x)(X'X)^{-1}f(x) \tag{58}$$

where N is the number of experimental runs. The average prediction variance, $V_r$, over the surface, $U_r$, of a sphere of radius r is

$$V_r = \frac{N\Psi}{\sigma^2}\int_{U_r} \mathrm{Var}[\hat{y}(x)]\,dx \tag{59}$$

and the maximum and minimum prediction variances, respectively, are given by

$$\max_{x \in U_r} \frac{N\,\mathrm{Var}[\hat{y}(x)]}{\sigma^2}, \qquad \min_{x \in U_r} \frac{N\,\mathrm{Var}[\hat{y}(x)]}{\sigma^2} \tag{60}$$

where $U_r = \{x : \sum_{i=1}^{k} x_i^2 = r^2\}$ and $\Psi^{-1} = \int_{U_r} dx$ is the surface area of $U_r$. A design is considered to be good if it has low and stable values of $V_r$ throughout the experimental region. The maximum and minimum prediction variances reflect the extent of variability in the prediction variance values over $U_r$. A big gap between the maximum and minimum values implies that the variance function is not stable over the region. The software for constructing VDGs was discussed by Vining.148 Borkowski149 determined the maximum, minimum, and average prediction variances for central composite and Box–Behnken designs analytically and showed that they were functions of the radius and the design parameters only. Trinca and Gilmour150 further extended the VDG approach and applied it to blocked response surface designs. Borror et al.151 used the VDG approach to compare robust parameter designs. Similar to the VDG approach, Vining and Myers152 proposed a graphical approach to evaluate and compare response surface designs on the basis of the mean-squared error of prediction. Vining et al.153 made use of VDGs in mixture experiments.

2. Fraction of design space plots: Zahran et al.154 proposed fraction of design space (FDS) plots where the prediction variance is plotted against the fraction of the design space that has prediction variance at or below the given value. They argued that VDGs provide some but not all information on the SPV of a design, because two designs may have the same VDG pattern but different SPV distributions. This happens because VDGs fail to weight the information in each sphere by the proportion of design space it represents. Zahran et al.154 defined the FDS criterion as

$$\mathrm{FDS} = \frac{1}{\Psi}\int_{A} dx \tag{61}$$

where $A = \{x : V(x) < Q\}$, $V(x) = N\,\mathrm{Var}[\hat{y}(x)]/\sigma^2$, $\Psi$ is the total volume of the experimental region, and Q is some specified quantity.

3. Quantile plots: Khuri et al.155 proposed the quantile plots (QPs) approach for evaluating and comparing several response surface designs based on linear models. In this graphical technique, the distribution of SPV on a given sphere was studied in terms of its quantiles. Khuri et al.155 showed through examples that two designs can have the same VDG pattern but may display very different distributions of SPV. They stated that this occurred because VDGs provide information only on the extreme values and average value of the SPV but not on the distribution of SPV on a given sphere. To obtain the QPs on the surface, $U_r$, of a sphere of radius r, several points were first generated randomly on $U_r$. Spherical coordinates were used for the random generation of points. The values of the SPV function, $V(x)$, at points $x$ on $U_r$ were computed. All these values formed a sample T(r). The quantiles of T(r) were obtained and the pth quantile of T(r) was denoted by $Q_r(p)$. Plots of $Q_r(p)$ against p gave the QPs on $U_r$. Khuri et al.156 used the QPs to compare designs for mixture models on constrained regions.

Graphical Methods for Comparing Response Surface Designs
Quantile dispersion graphs (QDGs) were proposed by Khuri157 for comparing designs for estimating variance components in an analysis of variance (ANOVA) situation. The exact distribution of the

variance component estimator was determined in terms of its quantiles. These quantiles were dependent on the unknown variance components of the ANOVA model. Plots of the maxima and the minima of the quantiles over a subset of the parameter space produced the QDGs. These graphs assessed the quality of the ANOVA estimators and allowed comparison of designs with respect to their estimation capabilities. Lee and Khuri158,159 extended the use of QDGs to unbalanced random one-way and two-way models, respectively. They used QDGs to compare designs based on ANOVA and maximum likelihood estimation procedures. QDGs were used by Khuri and Lee160 to evaluate and compare the prediction capabilities of nonlinear designs with one control variable throughout the region of interest $R$. More recently, Saha and Khuri161 used QDGs to compare designs for response surface models with random block effects.

Robinson and Khuri124 generalized the work of Khuri and Lee160 by addressing nonnormality and nonconstant error variance. They considered the problem of discriminating among designs for logistic regression models using QDGs based on the MSEP. Khuri and Mukhopadhyay104 later applied the QDG approach to compare the prediction capabilities of designs for Poisson regression models, also using the MSEP criterion. The MSEP incorporates both the prediction variance and the prediction bias, which results from using maximum likelihood estimates (MLEs) of the parameters of the fitted model. As in any design criterion for GLMs, the MSEP and its quantiles depend on the unknown parameters of the model. For a given design, quantiles of the MSEP were obtained within a region of interest. To compare designs using QDGs in a region $R$, several points were generated on concentric surfaces, denoted by $R_\nu$, which were obtained by shrinking the boundary of $R$ by a factor $\nu$. The value of the MSEP was computed for each $\mathbf{x}$ on $R_\nu$ and $\boldsymbol{\beta}$ in a parameter space, $C$. An initial data set on the response was used to construct the parameter space $C$. Quantiles of MSEP for a given design $D$ were computed, and the $p$th quantile was denoted by $Q_D(p, \boldsymbol{\beta}, \nu)$ for $0 \le p \le 1$. These quantiles provided a description of the distribution of MSEP for values of $\mathbf{x}$ on $R_\nu$. The dependence of the quantiles on $\boldsymbol{\beta}$ was investigated by computing $Q_D(p, \boldsymbol{\beta}, \nu)$ for several values of $\boldsymbol{\beta}$ that formed a grid, $\tilde{C}$, inside $C$. Subsequently, the minimum and maximum values of $Q_D(p, \boldsymbol{\beta}, \nu)$ over the values of $\boldsymbol{\beta}$ in $\tilde{C}$ were obtained

$$Q_D^{\min}(p, \nu) = \min_{\boldsymbol{\beta}\in\tilde{C}} \{Q_D(p, \boldsymbol{\beta}, \nu)\} \tag{62}$$

$$Q_D^{\max}(p, \nu) = \max_{\boldsymbol{\beta}\in\tilde{C}} \{Q_D(p, \boldsymbol{\beta}, \nu)\}. \tag{63}$$

Plotting these values against $p$ resulted in the QDGs of the MSEP over $R_\nu$. Using several values of $\nu$, the entire region $R$ was covered. Given several designs to compare, a design that displays close and small values of $Q_D^{\max}$ and $Q_D^{\min}$ over the range of $p$ is considered desirable because it has good prediction capability and is robust to the changes in $\boldsymbol{\beta}$. Khuri and Mukhopadhyay104 also studied the effect of the choice of the link function and/or the nature of the response distribution on the shape of the QDGs for a given design.

Mukhopadhyay and Khuri133 extended the application of QDGs to compare response surface designs for multivariate GLMs. Since the MSEP is a matrix in the multivariate situation, they considered a scalar-valued function of the MSEP, namely the largest eigenvalue of the MSEP matrix (EMSEP), as their comparison criterion. Similar to the MSEP, EMSEP also depends on $\mathbf{x}$ and $\boldsymbol{\beta}$. As in the univariate case, quantiles of EMSEP were computed on concentric regions $R_\nu$, for a given design $D$. Mukhopadhyay and Khuri133 chose the parameter space $C$ to be the $(1-\alpha)100\%$ confidence region of $\boldsymbol{\beta}$. Subsequently, the minimum and maximum quantiles were computed over the values of $\boldsymbol{\beta}$ in a grid of points from $C$ and plotted against $p$ to obtain the QDGs. The authors illustrated their proposed methodology using a data set from a combination drug therapy study on male mice taken from Gennings et al.,162 pp. 429-451 (see Section 6 in Mukhopadhyay and Khuri133).

A numerical example. In a drug therapy experiment on male mice, the pain relieving effects of two drugs, morphine sulfate ($x_1$) and $\Delta^9$-tetrahydrocannabinol ($x_2$), on two binary responses, pain relief ($y_1$) and side effect ($y_2$), were studied. The response $y_1$ takes the value one if relief occurs, otherwise it takes the value zero. The response $y_2$ is equal to one if a harmful side effect develops, otherwise it is equal to zero. A 5 × 7 factorial design with six mice in each run was considered. The experimental region $R$ was rectangular in shape with $R : \{0 \le x_1 \le 8,\ 0 \le x_2 \le 15\}$. The following first-degree models were used to fit the data

$$\begin{aligned}
\eta_1(\mathbf{x}) &= \beta_1 + \beta_2 x_1 + \beta_3 x_2, \\
\eta_2(\mathbf{x}) &= \beta_4 + \beta_5 x_1 + \beta_6 x_2, \\
\eta_3(\mathbf{x}) &= \beta_7 + \beta_8 x_1 + \beta_9 x_2.
\end{aligned} \tag{64}$$

Here, $\boldsymbol{\eta}(\mathbf{x}) = [\eta_1(\mathbf{x}), \eta_2(\mathbf{x}), \eta_3(\mathbf{x})]'$ is the three-dimensional linear predictor as explained in Eq. (54).
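The QDG construction of Eqs. (62) and (63) can be sketched numerically on concentric rectangles of a region such as $R$. The following is a minimal illustration under stated assumptions, not the authors' implementation: the rectangle is shrunk about its centre, the parameter grid is drawn arbitrarily, and `toy_criterion` is a hypothetical stand-in for the MSEP (it is simply the Bernoulli variance under a logistic model). The function names are illustrative only.

```python
import numpy as np

def rectangle_boundary(lo, hi, nu, n_per_side=25):
    """Points on the boundary of the rectangle [lo, hi] shrunk about its centre by nu.
    Shrinking about the centre is an assumption made for this sketch."""
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    centre, half = (lo + hi) / 2.0, nu * (hi - lo) / 2.0
    a, b = centre - half, centre + half
    t = np.linspace(0.0, 1.0, n_per_side, endpoint=False)
    bottom = np.column_stack([a[0] + t * (b[0] - a[0]), np.full_like(t, a[1])])
    right = np.column_stack([np.full_like(t, b[0]), a[1] + t * (b[1] - a[1])])
    top = np.column_stack([b[0] - t * (b[0] - a[0]), np.full_like(t, b[1])])
    left = np.column_stack([np.full_like(t, a[0]), b[1] - t * (b[1] - a[1])])
    return np.vstack([bottom, right, top, left])

def toy_criterion(x, beta):
    """Hypothetical stand-in for the MSEP: mu(1 - mu) under a logistic model."""
    eta = beta[0] + beta[1] * x[:, 0] + beta[2] * x[:, 1]
    mu = 1.0 / (1.0 + np.exp(-eta))
    return mu * (1.0 - mu)

def qdg(criterion, points, beta_grid, p):
    """Min and max over the parameter grid of the p-quantiles of the criterion."""
    q = np.array([np.quantile(criterion(points, beta), p) for beta in beta_grid])
    return q.min(axis=0), q.max(axis=0)

rng = np.random.default_rng(0)
p = np.linspace(0.0, 1.0, 21)  # p = 0(0.05)1
# Hypothetical parameter grid; in the paper it comes from a confidence region.
beta_grid = rng.normal([-1.0, 0.2, 0.1], 0.05, size=(50, 3))
curves = {
    nu: qdg(toy_criterion, rectangle_boundary([0, 0], [8, 15], nu), beta_grid, p)
    for nu in (1.0, 0.9, 0.8, 0.6)
}
```

Plotting each pair of curves in `curves[nu]` against `p` would give one QDG panel per value of ν; curves that are low and close together correspond to the desirable behaviour described above.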


FIGURE 1 | Comparison of the QDGs for designs $D_1$ (5 × 7 factorial) and $D_2$ ($3^2$ factorial), for $p = 0(0.05)1$ and $\nu = 1, 0.9, 0.8, 0.6$. [Four panels, one for each value of $\nu$; horizontal axis: probabilities ($p$); vertical axis: quantiles of EMSEP, from 0.00 to 0.03; curves shown for $D_1$ and $D_2$.]

The corresponding link function used was

$$\hat{\pi}_i(\mathbf{x}) = \frac{\exp[\hat{\eta}_i(\mathbf{x})]}{1 + \sum_{l=1}^{3}\exp[\hat{\eta}_l(\mathbf{x})]}, \qquad i = 1, 2, 3 \tag{65}$$

where $\hat{\eta}_i(\mathbf{x})$ is the estimate of $\eta_i(\mathbf{x})$ for $i = 1, 2, 3$. Note that $\pi_i(\mathbf{x})$ is the $i$th element of $\boldsymbol{\pi}(\mathbf{x}) = [\pi_1(\mathbf{x}), \pi_2(\mathbf{x}), \pi_3(\mathbf{x})]'$, where for a given $\mathbf{x}$, $\pi_1(\mathbf{x})$ is the probability that $y_1 = 1$ and $y_2 = 1$, $\pi_2(\mathbf{x})$ is the probability that $y_1 = 1$ and $y_2 = 0$, and $\pi_3(\mathbf{x})$ is the probability that $y_1 = 0$ and $y_2 = 1$. The original 5 × 7 factorial design ($D_1$) was compared with $D_2$, a $3^2$ factorial design with the center point (4, 7.5) replicated three times and all other points replicated four times. The EMSEP values were computed for $\mathbf{x}$ on concentric rectangles $R_\nu$ and $\boldsymbol{\beta} \in \tilde{C}$. The 95% confidence region of $\boldsymbol{\beta}$ was chosen to be $C$ and $\tilde{C}$ was a set of 500 randomly chosen points from $C$. The QDGs comparing designs $D_1$ and $D_2$ on the basis of the quantiles of EMSEP values are shown in Figure 1. From Figure 1, it can be noted that both designs are robust to the changes in the parameter values, and overall, $D_2$ has better prediction capability than $D_1$ for almost all values of $\nu$ and $p$. More details concerning this example can be found in Sections 5 and 6 in Ref 133.
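Returning to the prediction-variance criteria of Eqs. (57) to (61), a minimal numerical sketch may help fix ideas. It assumes a second-degree model in $k = 2$ variables and a rotatable central composite design with three centre runs, both chosen here for illustration. Points on $U_r$ are drawn by normalising Gaussian vectors rather than by the spherical coordinates mentioned in the text, the FDS region is assumed to be the square $[-1, 1]^2$, and all function names are hypothetical.

```python
import numpy as np

def quad_terms(x):
    """Second-degree model vector f(x) for k = 2: [1, x1, x2, x1*x2, x1^2, x2^2]."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([np.ones(len(x)), x1, x2, x1 * x2, x1**2, x2**2])

def spv(design, x):
    """Scaled prediction variance N f'(x)(X'X)^{-1} f(x), as in Eq. (58)."""
    X, F = quad_terms(design), quad_terms(x)
    xtx_inv = np.linalg.inv(X.T @ X)
    return len(design) * np.einsum("ij,jk,ik->i", F, xtx_inv, F)

def sphere_points(r, n, rng):
    """n points uniform on the circle of radius r (normalised Gaussian draws)."""
    z = rng.normal(size=(n, 2))
    return r * z / np.linalg.norm(z, axis=1, keepdims=True)

def vdg_summary(design, r, rng, n=2000):
    """Monte Carlo estimates of the min, average and max SPV on U_r."""
    v = spv(design, sphere_points(r, n, rng))
    return v.min(), v.mean(), v.max()

def quantile_plot(design, r, p, rng, n=2000):
    """Quantiles Q_r(p) of the SPV sample T(r) generated on U_r."""
    return np.quantile(spv(design, sphere_points(r, n, rng)), p)

def fds(design, Q, rng, n=20000):
    """Monte Carlo estimate of Eq. (61) on the assumed region [-1, 1]^2."""
    x = rng.uniform(-1.0, 1.0, size=(n, 2))
    return np.mean(spv(design, x) < Q)

a = np.sqrt(2.0)  # axial distance giving rotatability for k = 2
ccd = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1],
                [-a, 0], [a, 0], [0, -a], [0, a],
                [0, 0], [0, 0], [0, 0]], dtype=float)
rng = np.random.default_rng(1)
lo, avg, hi = vdg_summary(ccd, 1.0, rng)
```

Because the design chosen here is rotatable, `lo`, `avg`, and `hi` coincide up to rounding on every sphere, so its VDG collapses to a single curve; a non-rotatable design would show a gap between the minimum and maximum.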

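The mapping from the three linear predictors of Eq. (64) to the outcome probabilities through the link in Eq. (65) can be sketched as follows. This is an illustrative sketch only: the parameter values are hypothetical placeholders (all zeros), not estimates from the drug therapy study, and the function names are assumptions.

```python
import numpy as np

def linear_predictor(x, beta):
    """Eq. (64): three first-degree predictors in (x1, x2); beta has length 9."""
    B = np.asarray(beta, dtype=float).reshape(3, 3)  # row i holds the coefficients of eta_i
    return B @ np.array([1.0, x[0], x[1]])

def category_probabilities(eta):
    """Eq. (65): exp(eta_i) / (1 + sum_l exp(eta_l)) for i = 1, 2, 3."""
    e = np.exp(np.asarray(eta, dtype=float))
    return e / (1.0 + e.sum())

beta = np.zeros(9)  # hypothetical values, for illustration only
probs = category_probabilities(linear_predictor((4.0, 7.5), beta))
remaining = 1.0 - probs.sum()  # probability that y1 = 0 and y2 = 0
```

With all coefficients set to zero, each of the four outcomes has probability 0.25, which gives a quick sanity check on the normalisation.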
REFERENCES
1. Khuri AI, Cornell JA. Response Surfaces. 2nd ed. New York: Dekker; 1996.
2. Box GEP, Hunter JS. Multifactor experimental designs for exploring response surfaces. Ann Math Stat 1957, 28:195-241.
3. Khuri AI. A measure of rotatability for response surface designs. Technometrics 1988, 30:95-104.
4. Box GEP, Draper NR. Robust designs. Biometrika 1975, 62:347-352.
5. Kiefer J, Wolfowitz J. The equivalence of two extremum problems. Can J Math 1960, 12:363-366.
6. Pazman A. Foundations of Optimum Experimental Design. Dordrecht, Holland: Reidel; 1986.


7. Box GEP, Hunter WG, Hunter JS. Statistics for Experimenters. New York: John Wiley & Sons; 1978.
8. Ratkoe BL, Hedayat A, Federer WT. Factorial Designs. New York: John Wiley & Sons; 1981.
9. Montgomery DC. Design and Analysis of Experiments. 6th ed. New York: John Wiley & Sons; 2005.
10. Plackett RL, Burman JP. The design of optimum multifactorial experiments. Biometrika 1946, 33:305-325.
11. Box GEP. Multi-factor designs of first order. Biometrika 1952, 39:49-57.
12. McLean RA, Anderson VL. Applied Factorial and Fractional Designs. New York: Dekker; 1984.
13. Box GEP, Wilson KB. On the experimental attainment of optimum conditions. J R Stat Soc., Ser B 1951, 13:1-45 (with discussion).
14. Box GEP, Behnken DW. Some new three-level designs for the study of quantitative variables. Technometrics 1960, 2:455-475.
15. Box GEP, Draper NR. Response Surfaces, Mixtures, and Ridge Analyses. 2nd ed. Hoboken, New Jersey: John Wiley & Sons; 2007.
16. Myers RH, Montgomery DC. Response Surface Methodology. New York: John Wiley & Sons; 1995.
17. Hoke AT. Economical second-order designs based on irregular fractions of the 3^n factorial. Technometrics 1974, 16:375-384.
18. Box MJ, Draper NR. Factorial designs, the |X'X| criterion and some related matters. Technometrics 1971, 13:731-742.
19. Doehlert DH. Uniform shell designs. J R Stat Soc., Ser C 1970, 19:231-239.
20. Roquemore KG. Hybrid designs for quadratic response surfaces. Technometrics 1976, 18:419-423.
21. Myers RH, Khuri AI. A new procedure for steepest ascent. Commun Stat: Theory Methods 1979, 8:1359-1376.
22. Hoerl AE. Optimum solutions of many variable equations. Chem Eng Prog 1959, 55:69-78.
23. Draper NR. Ridge analysis of response surfaces. Technometrics 1963, 5:469-479.
24. Khuri AI, Myers RH. Modified ridge analysis. Technometrics 1979, 21:467-473.
25. Paul S, Khuri AI. Modified ridge analysis under nonstandard conditions. Commun Stat: Theory Methods 2000, 29:2181-2200.
26. Hill WJ, Hunter WG. A review of response surface methodology: a literature review. Technometrics 1966, 8:571-590.
27. Myers RH, Khuri AI, Carter WH Jr. Response surface methodology: 1966-1988. Technometrics 1989, 31:137-157.
28. Khuri AI. Multiresponse surface methodology. In: Ghosh S, Rao CR, eds. Handbook of Statistics, vol. 13. Amsterdam: Elsevier Science B.V.; 1996, 377-406.
29. Zellner A. An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. J Am Stat Assoc 1962, 57:348-368.
30. Draper NR, Hunter WG. Design of experiments for parameter estimation in multiresponse situations. Biometrika 1966, 53:525-533.
31. Box MJ, Draper NR. Estimation and design criteria for multiresponse non-linear models with non-homogeneous variance. J R Stat Soc., Ser C 1972, 21:13-24.
32. Fedorov V. Theory of Optimal Experiments. New York: Academic Press; 1972.
33. Wijesinha M, Khuri AI. The sequential generation of multiresponse D-optimal designs when the variance-covariance matrix is not known. Commun Stat: Simulation Comput 1987, 16:239-259.
34. Krafft O, Schaefer M. D-optimal designs for a multivariate regression model. J Multivariate Anal 1992, 42:130-140.
35. Bischoff W. On D-optimal designs for linear models under correlated observations with an application to a linear model with multiple response. J Stat Plan Infer 1993, 37:69-80.
36. Chang SI. Some properties of multiresponse D-optimal designs. J Math Anal Appl 1994, 184:256-262.
37. Chang S. An algorithm to generate near D-optimal designs for multiple response surface models. IIE Trans 1997, 29:1073-1081.
38. Imhof L. Optimum designs for a multiresponse regression model. J Multivariate Anal 2000, 72:120-131.
39. Atashgah AB, Seifi A. Application of semi-definite programming to the design of multi-response experiments. IIE Trans 2007, 39:763-769.
40. Hatzis C, Larntz K. Optimal design in nonlinear multiresponse estimation: Poisson model for filter feeding. Biometrics 1992, 48:1235-1248.
41. Wijesinha MC, Khuri AI. Construction of optimal designs to increase the power of the multiresponse lack of fit test. J Stat Plan Infer 1987, 16:179-192.
42. Wijesinha MC, Khuri AI. Robust designs for first-order multiple design multivariate models. Commun Stat Theory Methods 1991, 41:2987-2999.
43. Yue R. Model-robust designs in multiresponse situations. Stat Probab Lett 2002, 58:369-379.
44. Khuri AI. A test for lack of fit of a linear multiresponse model. Technometrics 1985, 27:213-218.
45. Morrison DF. Multivariate Statistical Methods. 2nd ed. New York: McGraw-Hill; 1976.
46. Levy MS, Neill JW. Testing for lack of fit in linear multiresponse models based on exact or near


replicates. Commun Stat: Theory Methods 1990, 19:1987-2002.
47. Lind EE, Goldin J, Hickman JB. Fitting yield and cost response surfaces. Chem Eng Prog 1960, 56:62-68.
48. Harrington ECJ. The desirability functions. Ind Quality Control 1965, 12:494-498.
49. Derringer G, Suich R. Simultaneous optimization of several response variables. J Quality Technol 1980, 12:214-219.
50. Khuri AI, Conlon M. Simultaneous optimization for multiple responses represented by polynomial regression functions. Technometrics 1981, 23:363-375.
51. Myers RH, Carter WH. Response surface techniques for dual response systems. Technometrics 1973, 15:301-317.
52. Biles WE. A response surface method for experimental optimization of multiple response processes. Ind Eng Chem Process Design Dev 1975, 14:152-158.
53. Del Castillo E. Multiresponse process optimization via constrained confidence regions. J Quality Technol 1996, 28:61-70.
54. Del Castillo E, Fan SK, Semple J. Optimization of dual response systems: a comprehensive procedure for degenerate and nondegenerate problems. Eur J Oper Res 1999, 112:174-186.
55. Fan SK. A generalized global optimization algorithm for dual response systems. J Quality Technol 2000, 32:444-456.
56. Del Castillo E, Montgomery DC, McCarville DR. Modified desirability functions for multiple response optimization. J Quality Technol 1996, 28:337-345.
57. Wu FC. Optimization of correlated multiple quality characteristics using desirability function. Quality Eng 2005, 17:119-126.
58. Vining G. A compromise approach to multiresponse optimization. J Quality Technol 1998, 30:309-313.
59. Pignatiello JJ. Strategies for robust multiresponse quality engineering. IIE Trans 1993, 25:5-15.
60. Ames A, Mattucci N, McDonald S, Szonyi G, Hawkins D. Quality loss function for optimization across multiple response surfaces. J Quality Technol 1997, 29:339-346.
61. Khuri AI, Valeroso ES. Optimization methods in multiresponse surface methodology. In: Park SH, Vining GG, eds. Statistical Process Monitoring and Optimization. New York: Dekker; 2000, 411-433.
62. Kackar RN. Off-line quality control, parameter design, and the Taguchi method. J Quality Technol 1985, 17:176-209.
63. Taguchi G, Wu Y. Introduction to Off-Line Quality Control. Nagoya, Japan: Central Japan Quality Control Association; 1985.
64. Taguchi G. System of Experimental Design: Engineering Methods to Optimize Quality and Minimize Cost. White Plains: UNIPUB/Kraus International; 1987.
65. Nair VN, Shoemaker AC. The role of experimentation in quality engineering: a review of Taguchi's contributions. In: Ghosh S, ed. Statistical Design and Analysis of Industrial Experiments. New York: Marcel Dekker; 1990, 247-277.
66. Taguchi G. Introduction to Quality Engineering. White Plains: UNIPUB/Kraus International; 1986.
67. Phadke MS. Quality Engineering Using Robust Design. Englewood Cliffs: Prentice-Hall; 1989.
68. Wu CFJ, Hamada M. Experiments: Planning, Analysis, And Parameter Design Optimization. New York: Wiley-Interscience; 2000.
69. Myers RH, Montgomery DC. Response Surface Methodology: Process and Product Optimization Using Designed Experiments. 2nd ed. New York: John Wiley & Sons; 2002.
70. Myers RH, Montgomery DC, Vining GG, Borror CM, Kowalski SM. Response surface methodology: a retrospective and literature survey. J Quality Technol 2004, 36:53-77.
71. Robinson TJ, Borror CM, Myers RH. Robust parameter design: a review. Quality Reliab Eng Int 2004, 20:81-101.
72. Box GEP. Discussion of off-line quality control, parameter design, and the Taguchi methods. J Quality Technol 1985, 17:198-206.
73. Box GEP. Signal-to-noise ratios, performance criteria, and transformations. Technometrics 1988, 30:1-17.
74. Easterling RB. Discussion of off-line quality control, parameter design, and the Taguchi methods. J Quality Technol 1985, 17:198-206.
75. Pignatiello JJ, Ramberg JS. Discussion of off-line quality control, parameter design, and the Taguchi methods. J Quality Technol 1985, 17:198-206.
76. Nair VN, Pregibon D. Analyzing dispersion effects from replicated factorial experiments. Technometrics 1988, 30:247-258.
77. Welch WJ, Yu TK, Kang SM, Sacks J. Computer experiments for quality control by parameter design. J Quality Technol 1990, 22:15-22.
78. Nair VN. Taguchi's parameter design: a panel discussion. Technometrics 1992, 34:127-161.
79. Vining GG, Myers RH. Combining Taguchi and response surface philosophies: a dual response approach. J Quality Technol 1990, 22:38-45.
80. Del Castillo E, Montgomery DC. A nonlinear programming solution to the dual response problem. J Quality Technol 1993, 25:199-204.
81. Lin DKJ, Tu W. Dual response surface optimization. J Quality Technol 1995, 27:34-39.


82. Copeland KAF, Nelson PR. Dual response optimization via direct function minimization. J Quality Technol 1996, 28:331-336.
83. Kim K, Lin DKJ. Dual response surface optimization: a fuzzy modeling approach. J Quality Technol 1998, 30:1-10.
84. Del Castillo E, Fan SK, Semple J. Computation of global optima in dual response systems. J Quality Technol 1997, 29:347-353.
85. Koksoy O, Doganaksoy N. Joint optimization of mean and standard deviation using response surface methods. J Quality Technol 2003, 35:239-252.
86. Kim YJ, Cho BR. Development of priority-based robust design. Quality Eng 2002, 14:355-363.
87. Tang LC, Xu K. A unified approach for dual response optimization. J Quality Technol 2002, 34:437-447.
88. Lucas JM. Achieving a robust process using response surface methodology. Paper presented at the Joint Statistical Meetings, Washington, DC, August 6-10, 1989.
89. Shoemaker AC, Tsui KL, Wu CFJ. Economical experimentation methods for robust parameter design. Paper presented at the Fall Technical Conference, American Society of Quality Control, Houston, TX, 1989.
90. Myers RH, Khuri AI, Vining GG. Response surface alternatives to the Taguchi robust parameter design approach. Am Stat 1992, 46:131-139.
91. Aggarwal ML, Bansal A. Robust response surface designs for quantitative and qualitative factors. Commun Stat Theory Methods 1998, 27:89-106.
92. Aggarwal ML, Gupta BC, Bansal A. Small robust response surface designs for quantitative and qualitative factors. Am J Math Manage Sci 2000, 39:103-130.
93. Brenneman WA, Myers WR. Robust parameter design with categorical noise variables. J Quality Technol 2003, 35:335-341.
94. SAS Institute Inc. Version 9.1.3. Cary, NC: SAS; 2008. [OnlineDoc]. http://support.sas.com/onlinedoc/913/docMainpage.jsp. Accessed 2008.
95. Kenward MG, Roger JH. Small-sample inference for fixed effects from restricted maximum likelihood. Biometrics 1997, 53:983-997.
96. Khuri AI. Response surface models with mixed effects. J Quality Technol 1996, 28:177-186.
97. Khuri AI. Mixed response surface models with heterogeneous within-block error variances. Technometrics 2006, 48:206-218.
98. Nelder JA, Wedderburn RWM. Generalized linear models. J R Stat Soc., Ser A 1972, 135:370-384.
99. McCullagh P, Nelder JA. Generalized Linear Models. 2nd ed. London: Chapman and Hall; 1989.
100. Lindsey JK. Applying Generalized Linear Models. New York: Springer; 1997.
101. Dobson AJ. An Introduction to Generalized Linear Models. 2nd ed. Boca Raton: Chapman and Hall; 2001.
102. McCulloch CE, Searle SR. Generalized, Linear, and Mixed Models. New York: John Wiley & Sons; 2001.
103. Myers RH, Montgomery DC, Vining GG. Generalized Linear Models with Applications in Engineering and the Sciences. New York: John Wiley & Sons; 2002.
104. Khuri AI, Mukhopadhyay S. GLM designs: the dependence on unknown parameters dilemma. In: Khuri AI, ed. Response Surface Methodology and Related Topics. Singapore: World Scientific; 2006, 203-223.
105. Khuri AI, Mukherjee B, Sinha BK, Ghosh M. Design issues for generalized linear models: a review. Stat Sci 2006, 21:376-399.
106. Mathew T, Sinha BK. Optimal designs for binary data under logistic regression. J Stat Plan Infer 2001, 93:295-307.
107. Abdelbasit KM, Plackett RL. Experimental design for binary data. J Am Stat Assoc 1983, 78:90-98.
108. Minkin S. Optimal design for binary data. J Am Stat Assoc 1987, 82:1098-1103.
109. Khan MK, Yazdi AA. On D-optimal designs for binary data. J Stat Plan Infer 1988, 18:83-91.
110. Wu CFJ. Optimal design for percentile estimation of a quantal response curve. In: Dodge Y, Fedorov VV, Wynn HP, eds. Optimal Design and Analysis of Experiments. Amsterdam: Elsevier Science; 1988, 213-223.
111. Sitter RR, Wu CFJ. Optimal designs for binary response experiments: Fieller, D, and A criteria. Scand J Stat 1993, 20:329-341.
112. Wu CFJ. Efficient sequential designs with binary data. J Am Stat Assoc 1985, 80:974-984.
113. Sitter RR, Forbes B. Optimal two-stage designs for binary response experiments. Stat Sin 1997, 7:941-956.
114. Sitter RR, Wu CFJ. Two-stage design of quantal response studies. Biometrics 1999, 55:396-402.
115. Zacks S. Problems and approaches in design of experiments for estimation and testing in non-linear models. In: Krishnaiah PR, ed. Multivariate Analysis, vol. 4. Amsterdam: North-Holland; 1977, 209-223.
116. Chaloner K. An approach to design for generalized linear models. Proceedings of the Workshop on Model-Oriented Data Analysis, Wartburg. Lecture Notes in Economics and Mathematical Systems. Berlin: Springer; 1987, 3-12.
117. Chaloner K, Larntz K. Optimal Bayesian designs applied to logistic regression experiments. J Stat Plan Infer 1989, 21:191-208.


118. Chaloner K, Larntz K. Optimal Bayesian design for accelerated life testing experiments. J Stat Plan Infer 1991, 33:245-259.
119. Chaloner K, Verdinelli I. Bayesian experimental design: a review. Stat Sci 1995, 10:273-304.
120. Dette H, Sperlich S. A note on Bayesian D-optimal designs for generalization of the simple exponential growth model. South African Stat J 1994, 28:103-117.
121. Mukhopadhyay S, Haines L. Bayesian D-optimal designs for the exponential growth model. J Stat Plan Infer 1995, 44:385-397.
122. Atkinson AC, Haines LM. Designs for nonlinear and generalized linear models. In: Ghosh S, Rao CR, eds. Handbook of Statistics, vol. 13. Amsterdam: Elsevier Science; 1996, 437-475.
123. Atkinson AC, Chaloner K, Herzberg AM, Juritz J. Optimum experimental designs for properties of a compartmental model. Biometrics 1993, 49:325-337.
124. Robinson KS, Khuri AI. Quantile dispersion graphs for evaluating and comparing designs for logistic regression models. Comput Stat Data Anal 2003, 43:47-62.
125. Sitter RR. Robust designs for binary data. Biometrics 1992, 48:1145-1155.
126. King J, Wong W. Minimax D-optimal designs for the logistic model. Biometrics 2000, 56:1263-1267.
127. Dror HA, Steinberg DM. Robust experimental design for multivariate generalized linear models. Technometrics 2006, 48:520-529.
128. Woods DC, Lewis SM, Eccleston JA, Russell KG. Designs for generalized linear models with several variables and model uncertainty. Technometrics 2008, 48:284-292.
129. Russell KG, Woods DC, Lewis SM, Eccleston JA. D-optimal designs for Poisson regression models. Stat Sin 2009, 19:721-730.
130. Fahrmeir L, Tutz G. Multivariate Statistical Modelling Based on Generalized Linear Models. 2nd ed. New York: Springer; 2001.
131. Heise MA, Myers RH. Optimal designs for bivariate logistic regression. Biometrics 1996, 52:613-624.
132. Zocchi SS, Atkinson AC. Optimum experimental designs for multinomial logistics models. Biometrics 1999, 55:437-443.
133. Mukhopadhyay S, Khuri AI. Comparison of designs for multivariate generalized linear models. J Stat Plan Infer 2008, 138:169-183.
134. Mukhopadhyay S, Khuri AI. Optimization in a multivariate generalized linear model situation. Comput Stat Data Anal 2008, 52:4625-4634.
135. Nelder JA, Lee Y. Generalized linear models for the analysis of Taguchi-type experiments. Appl Stoch Models Data Anal 1991, 7:107-120.
136. Engel J, Huele AF. A generalized linear modeling approach to robust design. Technometrics 1996, 38:365-373.
137. Brinkley PA, Meyer KP, Lu JC. Combined generalized linear modelling/non-linear programming approach to robust process design: a case-study in circuit board quality improvement. Appl Stat 1996, 45:99-110.
138. Hamada M, Nelder JA. Generalized linear models for quality-improvement experiments. J Quality Technol 1997, 29:292-304.
139. Nelder JA, Lee Y. Letter to the editor. Technometrics 1998, 40:168-171.
140. Lee Y, Nelder JA. Robust design via generalized linear models. J Quality Technol 2003, 35:2-12.
141. Myers WR, Brenneman WA, Myers RH. A dual-response approach to robust parameter design for a generalized linear model. Technometrics 2005, 37:130-138.
142. Box GEP, Meyer RD. Dispersion effects from fractional designs. Technometrics 1986, 28:19-27.
143. Grego JM. Generalized linear models and process variation. J Quality Technol 1993, 25:288-295.
144. Chan LK, Mak TK. A regression approach for discovering small variation around a target. Appl Stat 1995, 44:369-377.
145. Robinson TJ, Wulff SS, Montgomery DC, Khuri AI. Robust parameter design using generalized linear mixed models. J Quality Technol 2006, 38:65-75.
146. Giovannitti-Jensen A, Myers RH. Graphical assessment of the prediction capability of response surface designs. Technometrics 1989, 31:159-171.
147. Myers RH, Vining GG, Giovannitti-Jensen A, Myers SL. Variance dispersion properties of second order response surface designs. J Quality Technol 1992, 24:1-11.
148. Vining GG. A computer program for generating variance dispersion graphs. J Quality Technol 1993, 25:45-58.
149. Borkowski JJ. Spherical prediction-variance properties of central composite and Box-Behnken designs. Technometrics 1995, 37:399-410.
150. Trinca LA, Gilmour SG. Variance dispersion graphs for comparing blocked response surface designs. J Quality Technol 1998, 30:349-364.
151. Borror CM, Montgomery DC, Myers RH. Evaluation of statistical designs for experiments involving noise variables. J Quality Technol 2002, 34:54-70.
152. Vining GG, Myers RH. A graphical approach for evaluating response surface designs in terms of the mean squared error of prediction. Technometrics 1991, 33:315-326.


153. Vining GG, Cornell JA, Myers RH. A graphical approach for evaluating mixture designs. Appl Stat 1993, 42:127-138.
154. Zahran A, Anderson-Cook CM, Myers RH. Fraction of design space to assess prediction capability of response surface designs. J Quality Technol 2003, 35:377-386.
155. Khuri AI, Kim HJ, Um Y. Quantile plots of the prediction variance for response surface designs. Comput Stat Data Anal 1996, 22:395-407.
156. Khuri AI, Harrison JM, Cornell JA. Using quantile plots of the prediction variance for comparing designs for a constrained mixture region: an application involving a fertilizer experiment. Appl Stat 1999, 48:521-532.
157. Khuri AI. Quantile dispersion graphs for analysis of variance estimates of variance components. J Appl Stat 1997, 24:711-722.
158. Lee J, Khuri AI. Graphical technique for comparing designs for random models. J Appl Stat 1999, 26:933-947.
159. Lee J, Khuri AI. Quantile dispersion graphs for the comparison of designs for a random two-way model. J Stat Plan Infer 2000, 91:123-137.
160. Khuri AI, Lee J. A graphical approach for evaluating and comparing designs for nonlinear models. Comput Stat Data Anal 1998, 27:433-443.
161. Saha S, Khuri AI. Comparison of designs for response surface models with random block effects. Quality Tech Quantitative Manage 2009, 6:219-234.
162. Gennings C, Carter WH Jr, Martin BR. Drug interactions between morphine and marijuana. In: Lange N, Ryan L, Billard L, Brillinger D, Conquest L, et al. eds. Case Studies in Biometry. New York: John Wiley & Sons; 1994, 429-451.
