Revised Design of Experiments and Analytical Techniques
1. Introduction to Statistics
• What is statistics? The science of collecting, organizing, and analyzing/interpreting data.
Why statistics?
With inferential statistics, you take data from samples and make generalizations about
a population.
Basic terms in statistics
• Population: a collection or set of individuals, objects, or events whose properties are to be analyzed.
• Sample: a subset of the population.
• Variable: a characteristic of a population or sample.
• Data: the value of the variable associated with one element of a population or sample. This value may be a number, a word, or a symbol.
Nature of variables
• Qualitative/categorical: a variable that can be expressed in terms of words.
• Quantitative/numerical: a variable that can be expressed in terms of numbers.
• A nominal variable is a qualitative variable where no ordering is possible or implied in the levels. For example, the variable gender is nominal because there is no order in the levels female/male.
• A nominal variable can have two or more levels (e.g., do you smoke? Yes/No; what is your gender? Female/Male).
• When the levels do have a natural order, the variable is ordinal; a good example is health, which can take ordered values such as poor, reasonable, good, or excellent.
Sources of Data
2. INTRODUCTION TO DESIGN OF EXPERIMENT
Strategy of experimentation
• Observation of a system or process leads to a theory or hypothesis about what makes the system work.
• An experiment demonstrates whether the theory or hypothesis is correct.
• An experiment is a test or series of runs in which purposeful changes are made to the input variables so that the resulting changes in the output response can be observed and identified.
• The input variables responsible for changes in the response are identified, leading to decisions about (and improvement of) the system.
Strategy of experimentation, cont.…
• A well-designed experiment is essential to obtaining sound results and conclusions.
• Experimentation is one part of the scientific process by which we learn how a system or process works.
Application of Experimental Design Cont.…
*Improvement in product design activities, where new products are developed and existing ones are improved.
Basic Principles
*Statistical design of experiments refers to the process of planning the experiment so that appropriate data will be collected and analysed by statistical methods, resulting in valid and objective conclusions.
*There are two aspects to any experimental problem: the design of the experiment and the statistical analysis of the data. These two subjects are closely related, because the method of analysis depends directly on the design employed.
*The three basic principles of experimental design are randomization, replication, and blocking.
• RANDOMIZATION: by properly randomizing the experiment, we also assist in "averaging out" the effects of extraneous factors that may be present.
• REPLICATION has two important properties:
1) It allows the experimenter to obtain an estimate of the experimental error. This estimate of error becomes a basic unit of measurement for determining whether observed differences in the data are really statistically different.
2) If the sample mean is used to estimate the true mean response for one of the factor levels in the experiment, replication permits the experimenter to obtain a more precise estimate of this parameter.
*Replication reflects sources of variability both between runs and (potentially) within runs.
Basic Principles cont…
BLOCKING: a design technique used to improve the precision with which comparisons among the factors of interest are made. Blocking is often used to reduce or eliminate the variability transmitted from nuisance factors, that is, factors that may influence the experimental response but in which we are not directly interested.
Guidelines for Designing Experiments
Guidelines for Designing Experiments cont…
1. Recognition of and statement of the problem: a clear statement of the problem often contributes substantially to better understanding of the phenomenon being studied and the final solution of the problem.
Guidelines for Designing Experiments cont…
• Some (but by no means all) of the reasons for running experiments include:
a. Factor screening or characterization: to learn which factors have the most influence on the
response(s) of interest.
Screening experiments are extremely important when working with new systems or technologies, so that valuable resources are not wasted on best-guess and OFAT (one-factor-at-a-time) approaches.
b. Optimization: finding the settings or levels of the important factors that result in desirable values of the response.
Guidelines for Designing Experiments cont…
c. Confirmation: the experimenter is usually trying to verify that the system operates or behaves in a manner that is consistent with some theory or past experience.
• For example, confirmation experiments can verify that substituting a new material results in no change in the product characteristics that impact its use.
d. Robustness: these experiments often address questions such as, what conditions would lead to unacceptable variability in the response variables?
A variation of this is determining how to set the factors in the system that we can control so as to minimize the variability transmitted into the response from factors that we cannot control very well.
Guidelines for Designing Experiments cont…
3. Choice of factors, levels, and range: factors can be classified as either potential design factors or nuisance factors.
3.1 The potential design factors are those factors that the experimenter may wish to vary in the experiment. Often there are many potential design factors; these can be further classified as design factors, held-constant factors, and allowed-to-vary factors.
The design factors are the factors actually selected for study in the experiment.
Held-constant factors may exert some effect on the response but are not of interest in the present experiment, so they are held at a fixed level.
As for allowed-to-vary factors, the experimental units or the "materials" to which the design factors are applied are usually nonhomogeneous, yet we often ignore this unit-to-unit variability and rely on randomization to balance out its effects.
We often assume that the effects of held-constant factors and allowed-to-vary factors are relatively small.
3.2 Nuisance factors may have large effects that must be accounted for, yet we may not be interested in them in the context of the present experiment.
Nuisance factors are often classified as controllable, uncontrollable, or noise factors.
A controllable nuisance factor is one whose levels may be set by the experimenter; the blocking principle is often useful in dealing with such factors.
When a factor that varies naturally and uncontrollably in the process can be controlled for the purposes of an experiment, we often call it a noise factor.
*In such situations, our objective is usually to find the settings of the controllable design factors that minimize the variability transmitted from the noise factors.
Guidelines for Designing Experiments cont…
Once the experimenter has selected the design factors, he or she must choose the ranges over which these factors will be varied and the specific levels at which runs will be made.
When the objective of the experiment is factor screening or process characterization, it is usually best to keep the number of factor levels low; generally, two levels work very well.
In factor screening, the range over which the factors are varied should be broad. As we learn more about which variables are important and which levels produce the best results, the region of interest in subsequent experiments usually becomes narrower.
Guidelines for Designing Experiments cont…
4. Choice of experimental design. If pre-experimental planning activities are done correctly, this step
is relatively easy.
Choice of design involves consideration of sample size (number of replicates), selection of a
suitable run order for the experimental trials, and determination of whether or not blocking or
other randomization restrictions are involved.
There are also several interactive statistical software packages that support this phase of
experimental design.
The experimenter can enter information about the number of factors, levels, and ranges, and
these programs will either present a selection of designs for consideration or recommend a
particular design.
In selecting the design, it is important to keep the experimental objectives in mind.
Guidelines for Designing Experiments cont…
5. Performing the experiment: when running the experiment, it is vital to monitor the process carefully to ensure that everything is being done according to plan. Errors in experimental procedure at this stage will usually destroy experimental validity. One of the most common mistakes is that the people conducting the experiment do not follow the experimental plan (for example, the run order or the factor settings) exactly.
Guidelines for Designing Experiments cont…
6. Statistical analysis of the data: statistical methods should be used to analyse the data so that results and conclusions are objective rather than judgmental in nature.
Often we find that simple graphical methods play an important role in data analysis and interpretation.
It is also usually very helpful to present the results of many experiments in terms of an empirical model, that is, an equation derived from the data that expresses the relationship between the response and the important design factors.
Residual analysis and model-adequacy checking are also important analysis techniques.
*Statistical techniques coupled with good engineering or process knowledge and common sense will usually lead to sound conclusions.
Using Statistical Techniques in Experimentation
Much of the research in engineering, science, and industry is empirical and makes extensive use of
experimentation.
Empirical research is a type of research methodology that makes use of verifiable evidence in order to arrive
at research outcomes.
*The proper use of statistical techniques in experimentation requires that the experimenter keep the
following points in mind:
1. Use your nonstatistical knowledge of the problem:
Experimenters are usually highly knowledgeable in their fields, with considerable practical experience and formal academic training in the area.
Using Statistical Techniques in Experimentation cont…
4. Experiments are usually iterative:
Remember that in most situations it is unwise to design too comprehensive an experiment at the start of a study.
Successful design requires knowledge of the important factors, the ranges over which these factors are varied, the appropriate number of levels for each factor, and the proper methods and units of measurement for each factor and response.
Generally, we are not well equipped to answer these questions at the beginning of the experiment, but we learn the answers as we go along.
We usually should not invest more than about 25 percent of the resources of experimentation (runs, budget, time, etc.) in the first experiment.
Often these first efforts are just learning experiences, and some resources must be available to accomplish the final objectives of the experiment.
There are situations where comprehensive experiments are entirely appropriate, but as a general rule most experiments should be iterative.
3. Introduction to Factorial Designs
• Many experiments involve the study of the effects of two or more factors. In general, factorial designs are most efficient for this type of experiment.
• By a factorial design, we mean that in each complete trial or replicate of the experiment all possible combinations of the levels of the factors are investigated. For example, if there are "a" levels of factor A and "b" levels of factor B, each replicate contains all "ab" treatment combinations.
• When factors are arranged in a factorial design, they are often said to be crossed.
• The effect of a factor is defined to be the change in response produced by a change in the level of the factor. This is frequently called a main effect because it refers to the primary factors of interest in the experiment.
Introduction to Factorial Designs cont…
• Consider a two-factor factorial experiment with both design factors at two levels. We call these levels "low" and "high" and denote them "-" and "+", respectively. The main effect of factor A in this two-level design can be thought of as the difference between the average response at the high level of A and the average response at the low level of A.
Fig 4.1 A two-factor factorial experiment, with the response (y) shown at the corners.
Fig 4.2 A two-factor factorial experiment with interaction.
AB = (52 - 30)/2 - (40 - 20)/2 = 11 - 10 = 1
Increasing factor A from the low level to the high level causes an average response
increase of 21 units. Similarly, increasing factor B from the low level to the high level
causes an average response increase of 11 units. Interaction effect value is 1.
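These effect calculations can be sketched in a few lines of code. The corner responses 20, 40, 30, and 52 are read from Fig 4.1; labelling the corners in standard order (1), a, b, ab is an assumption about the figure's layout.

```python
# Corner responses from Fig 4.1 in standard order:
# (1) = A low, B low; a = A high, B low; b = A low, B high; ab = A high, B high
r1, a, b, ab = 20, 40, 30, 52

# Main effect of A: average response at A high minus average at A low
A = (a + ab) / 2 - (r1 + b) / 2
# Main effect of B: average response at B high minus average at B low
B = (b + ab) / 2 - (r1 + a) / 2
# Interaction AB: half the difference between the A effect at B high and at B low
AB = ((ab - b) - (a - r1)) / 2

print(A, B, AB)  # 21.0 11.0 1.0
```

This reproduces the values quoted in the text: A = 21, B = 11, AB = 1.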
• In some experiments, we may find that the difference in response between the levels of one factor is not the same at all levels of the other factors. When this occurs, there is an interaction between the factors. For example, consider the two-factor factorial experiment shown in Fig. 4.2. At the low level of factor B (B-), the A effect is A = 50 - 20 = 30, and at the high level of factor B (B+), the A effect is A = 12 - 40 = -28.
• Because the effect of A depends on the level chosen for factor B, we see that there is interaction between A and B. The magnitude of the interaction effect is the average difference in these two A effects, or AB = (-28 - 30)/2 = -29. Clearly, the interaction is large in this experiment.
• These ideas may be illustrated graphically (see Fig.4.4).
• Note that, the B- and B+ lines are approximately parallel (Fig 4.3), indicating a lack of interaction between
factors A and B
• B- and B+ lines are not parallel (Fig. 4.4). This indicates an interaction between factors A and B. Two-factor
interaction graphs such as these are frequently very useful in interpreting significant interactions and in
reporting results to non statistically trained personnel.
Fig. 4.3 A factorial experiment without interaction.
Fig. 4.4 A factorial experiment with interaction.
• There is another way to illustrate the concept of interaction. Suppose that both of our design factors are quantitative (such as temperature, pressure, time, etc.). Then a regression model representation of the two-factor factorial experiment could be written as
y = β0 + β1x1 + β2x2 + β12x1x2 + ε
where y is the response, the βs are parameters whose values are to be determined, x1 is a variable that represents factor A, x2 is a variable that represents factor B, and ε is a random error term. The variables x1 and x2 are defined on a coded scale from -1 to +1 (the low and high levels of A and B, respectively).
• The parameter estimates in this regression model turn out to be related to the effect estimates. For the experiment shown in Fig. 4.1, we found the main effects of A and B to be A = 21 and B = 11. The estimates of β1 and β2 are one-half the value of the corresponding main effect; therefore, β1 = 21/2 = 10.5 and β2 = 11/2 = 5.5. The interaction effect in Fig. 4.1 is AB = 1, so the value of the interaction coefficient in the regression model is β12 = 1/2 = 0.5. The parameter β0 is estimated by the average of all four responses, or β0 = (20 + 40 + 30 + 52)/4 = 35.5. The fitted model is
Y = 35.5 + 10.5x1 + 5.5x2 + 0.5x1x2
• The interaction coefficient (β12 = 0.5) is small relative to the main-effect coefficients β1 and β2. We will take this to mean that the interaction is small and can be ignored. Therefore, dropping the term 0.5x1x2 gives us the model
Y = 35.5 + 10.5x1 + 5.5x2
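As a quick sanity check, the full model with the interaction term reproduces the four corner responses of Fig 4.1 exactly (with four parameters fitted to four cell averages, there is nothing left over):

```python
def y_hat(x1, x2):
    # Full fitted model from the text: intercept = average of the four
    # responses; each coefficient = one-half the corresponding effect
    return 35.5 + 10.5 * x1 + 5.5 * x2 + 0.5 * x1 * x2

# Corners in coded units: (x1, x2) -> observed response from Fig 4.1
corners = {(-1, -1): 20, (1, -1): 40, (-1, 1): 30, (1, 1): 52}
for (x1, x2), y in corners.items():
    assert y_hat(x1, x2) == y
print("model reproduces all four corners")
```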
Fig. 4.5 presents graphical representations of this model. In Figure 4.5a we have a plot of the plane
of y-values generated by the various combinations of x1 and x2. This three dimensional graph is called
a response surface plot. Figure 4.5b shows the contour lines of constant response y in the x1, x2,
plane. Notice that because the response surface is a plane, the contour plot contains parallel straight
lines.
Fig. 4.5 Response surface and contour plot for the model Y = 35.5 + 10.5x1 + 5.5x2
• Now suppose that the interaction contribution to this experiment was not negligible; that is, the coefficient β12 was not small. Figure 4.6 presents the response surface and contour plot for the model Y = 35.5 + 10.5x1 + 5.5x2 + 8x1x2.
• (We have let the interaction effect be the average of the two main effects.) Notice that the significant interaction effect
“twists” the plane in Fig. 4.6a. This twisting of the response surface results in curved contour lines of constant response in
the x1, x2 plane, as shown in Fig.4.6b. Thus, interaction is a form of curvature in the underlying
response surface model for the experiment. The response surface model for an experiment is extremely
important and useful. Generally, when an interaction is large, the corresponding main effects have little practical
meaning.
Fig. 4.6 Response surface and contour plot for the model Y=35.5+10.5x1+5.5x2+8x1x2
The Advantages of Factorial Designs
• Factorial designs are more efficient than one-factor-at-a-time experiments.
• A factorial design is necessary when interactions may be present, in order to avoid misleading conclusions.
• Factorial designs allow the effects of a factor to be estimated at several levels of the other factors, yielding conclusions that are valid over a range of experimental conditions.
5. The 2k Factorial Design
• Factorial designs are widely used in experiments involving several factors where it is necessary to study the joint effect of the factors on a response. However, several special cases of the general factorial design are important because they are widely used in research work and also because they form the basis of other designs of considerable practical value.
• The most important of these special cases is that of k factors, each at only two levels. These levels may be quantitative, such as two values of temperature, pressure, or time; or they may be qualitative, such as two machines, two operators, the "high" and "low" levels of a factor, or perhaps the presence and absence of a factor.
• The 2k design is particularly useful in the early stages of experimental work when many factors are likely to be investigated.
The 22 Design
• The first design in the 2k series is one with only two factors, say A and B, each run at two levels. This design is called a 22 factorial design. The levels of the factors may be arbitrarily called "low" and "high." As an example, consider an investigation into the effect of the concentration of the reactant and the amount of the catalyst on the conversion (yield) in a chemical process. The objective of the experiment was to determine whether adjustments to either of these two factors would increase the yield. Let the reactant concentration be factor A, with the two levels of interest being 15 and 25 percent. The catalyst is factor B, with the high level denoting the use of 2 pounds of the catalyst and the low level denoting the use of only 1 pound. The experiment is replicated three times, so there are 12 runs. The order in which the runs are made is random, so this is a completely randomized design.
• We see from the figure that the high level of any factor in a treatment combination is denoted by the corresponding lowercase letter, and the low level of a factor in a treatment combination is denoted by the absence of that letter. Thus, "a" represents the treatment combination with A at the high level and B at the low level, "b" represents A at the low level and B at the high level, and "ab" represents both factors at the high level.
• By convention, (1) is used to denote both factors at the low level. This notation is used throughout the 2k series. Also, the symbols (1), a, b, and ab now represent the totals of the response observations over all n replicates taken at the corresponding treatment combination.
*Now the effect of A at the low level of B is [a - (1)]/n, and the effect of A at the high level of B is [ab - b]/n. Averaging these two quantities yields the main effect of A:
A = [ab + a - b - (1)] / (2n)
*The average main effect of B is found from the effect of B at the low level of A (i.e., [b - (1)]/n) and at the high level of A (i.e., [ab - a]/n) as
B = [ab + b - a - (1)] / (2n)
Alternatively, we may define AB as the average difference between the effect of B at the high level of A and the effect of B at the low level of A:
AB = [ab + (1) - a - b] / (2n)
The formulas for the effects of A, B, and AB may be derived by another method. The effect of A can be found as the difference in the average response of the two treatment combinations on the right-hand side of the square in Fig. 5.1 (call this average ȳA+ because it is the average response at the treatment combinations where A is at the high level) and the two treatment combinations on the left-hand side (or ȳA-). That is,
A = ȳA+ - ȳA- = (ab + a)/(2n) - (b + (1))/(2n)
Using the experiment in Fig. 5.1, we may estimate the average effects as
The effect of A (reactant concentration) is positive; this suggests that increasing A from the low level (15%) to
the high level (25%) will increase the yield. The effect of B (catalyst) is negative; this suggests that increasing
the amount of catalyst added to the process will
decrease the yield.
The interaction effect appears to be small relative to the two main effects.
In experiments involving 2k designs, it is always important to examine the magnitude and direction of the factor
effects to determine which variables are likely to be important.
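The effect formulas above can be sketched directly. The treatment totals below are made-up illustrative numbers, not the values from Fig. 5.1 (which are not reproduced here); n = 3 matches the replication in the chemical process example.

```python
n = 3  # replicates per treatment combination
# Hypothetical totals of the n observations at (1), a, b, ab
t1, ta, tb, tab = 60, 90, 54, 88

A = (tab + ta - tb - t1) / (2 * n)    # main effect of A
B = (tab + tb - ta - t1) / (2 * n)    # main effect of B
AB = (tab + t1 - ta - tb) / (2 * n)   # interaction effect

print(round(A, 2), round(B, 2), round(AB, 2))
```

A positive A and small AB here would be read just as in the text: increasing A raises the response, and the interaction is minor relative to the main effects.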
• It is often convenient to write down the treatment combinations in the order (1), a, b, ab. This is referred to as standard order (or Yates's order, for Frank Yates, who was one of Fisher's co-workers and who made many important contributions to designing and analysing experiments). Using this standard order, the contrast coefficients used in estimating the effects are:
Effect   (1)   a    b    ab
A:       -1   +1   -1   +1
B:       -1   -1   +1   +1
AB:      +1   -1   -1   +1
• Note that the contrasts for the effects A, B, and AB are orthogonal. Thus, the 22 (and all 2k designs) is an orthogonal design. The +/-1 coding for the low and high levels of the factors is often called the orthogonal coding or the effects coding.
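The orthogonality claim is easy to verify: in standard order (1), a, b, ab, the three contrast vectors have zero pairwise dot products.

```python
# Contrast coefficients in standard order (1), a, b, ab
contrasts = {
    "A":  [-1,  1, -1,  1],
    "B":  [-1, -1,  1,  1],
    "AB": [ 1, -1, -1,  1],  # elementwise product of the A and B columns
}

# Every pair of contrast vectors is orthogonal (dot product zero)
for name1 in contrasts:
    for name2 in contrasts:
        if name1 < name2:
            dot = sum(u * v for u, v in zip(contrasts[name1], contrasts[name2]))
            assert dot == 0, (name1, name2)
print("all contrasts orthogonal")
```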
The Regression Model
• In a 2k factorial design, it is easy to express the results of the experiment in terms of a regression model. Because the 2k is just a factorial design, we could also use either an effects or a means model, but the regression-model approach is much more natural and intuitive. For the chemical process experiment, the regression model is
Y = β0 + β1x1 + β2x2 + ε
• where x1 is a coded variable that represents the reactant concentration, x2 is a coded variable that represents the amount of catalyst, and the β's are regression coefficients. The relationship between the natural variables, the reactant concentration and the amount of catalyst, and the coded variables is given below.
When the natural variables have only two levels, this coding produces the familiar +/-1 notation for the levels of the coded variables. To illustrate this for our example, note that
x1 = (Conc - (Conc_low + Conc_high)/2) / ((Conc_high - Conc_low)/2) = (Conc - 20)/5
Thus, if the concentration is at the high level (Conc = 25%), then x1 = +1; if the concentration is at the low level (Conc = 15%), then x1 = -1. Furthermore,
x2 = (Catalyst - (Catalyst_low + Catalyst_high)/2) / ((Catalyst_high - Catalyst_low)/2) = (Catalyst - 1.5)/0.5
Thus, if the catalyst is at the high level (Catalyst = 2 pounds), then x2 = +1; if the catalyst is at the low level (Catalyst = 1 pound), then x2 = -1.
The regression coefficient estimates are one-half the corresponding effect estimates: a regression coefficient measures the effect of a one-unit change in x on the mean of y, while the effect estimate is based on a two-unit change (from -1 to +1). This simple relationship connects the effect estimates and the parameter estimates.
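The coding rule is just (natural value - midpoint) / half-range, which can be sketched as a small helper; the 15/25 percent and 1/2 pound levels are the ones given in the example.

```python
def code(natural, low, high):
    # Map a natural factor level to the coded scale: low -> -1, high -> +1
    center = (low + high) / 2
    half_range = (high - low) / 2
    return (natural - center) / half_range

x1 = code(25, 15, 25)  # concentration at the high level -> +1
x2 = code(1, 1, 2)     # catalyst at the low level -> -1
print(x1, x2)  # 1.0 -1.0
```

Any intermediate setting codes to a value between -1 and +1; for example, a concentration of 20 percent codes to 0.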
Residuals and Model Adequacy
The regression model can be used to obtain the predicted or fitted value of y at the four points in the design. The residuals are the differences between the observed and fitted values of y. For example, when the reactant concentration is at the low level (x1 = -1) and the catalyst is at the low level (x2 = -1), the predicted yield is found by substituting x1 = -1 and x2 = -1 into the fitted model. There are three observations at this treatment combination, and the residuals are the differences between these three observed yields and the predicted value. The remaining predicted values and residuals are calculated similarly for the other treatment combinations: the high level of the reactant concentration with the low level of the catalyst, the low level of the reactant concentration with the high level of the catalyst, and both factors at the high level.
Figure 5.2 presents a normal probability plot of these residuals and a plot of the residuals versus the predicted yield. These plots appear satisfactory, so we have no reason to suspect any problems with the validity of our conclusions.
The Response Surface plot
• The regression model can be used to generate response surface plots. If it is desirable to construct these plots in terms of the natural factor levels, we simply substitute the relationships between the natural and coded variables given earlier into the regression model.
ANOVA
• There are special time-saving methods for performing the ANOVA calculations manually. Consider determining the sums of squares for A, B, and AB.
• There are seven degrees of freedom between the eight treatment combinations in the 2 3
design. Three degrees of freedom are associated with the main effects of A, B, and C.
Four degrees of freedom are associated with interactions; one each with AB, AC, and BC
and one with ABC
• The two-factor interaction effects may be computed easily. A measure of the AB
interaction is the difference between the average A effects at the two levels of B. By
convention, one-half of this difference is called the AB interaction.
The ABC interaction is defined as the average difference between the AB interaction at the two different levels of C.
Signs for the main effects are determined by associating a plus with the high level and a minus with the low level. Once
the signs for the main effects have been established, the signs for the remaining columns can be obtained by multiplying
the appropriate preceding columns row by row. For example, the signs in the AB column are the product of the A and B
column signs in each row. The contrast for any effect can be obtained easily from this table.
Sums of squares for the effects are easily computed because each effect has a corresponding single-degree-of-freedom contrast. In the 2^3 design with n replicates, the sum of squares for any effect is
SS = (Contrast)^2 / (8n)
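The contrast-based computation can be sketched for one effect; the treatment totals below are hypothetical illustrative numbers, listed in standard order.

```python
n = 2  # replicates
# Hypothetical treatment totals in standard order: (1), a, b, ab, c, ac, bc, abc
totals = [-4, 1, -1, 5, -1, 3, 2, 11]
# Contrast coefficients for the main effect of A (plus where A is high)
signs_A = [-1, 1, -1, 1, -1, 1, -1, 1]

contrast_A = sum(s * t for s, t in zip(signs_A, totals))
effect_A = contrast_A / (4 * n)     # effect estimate = Contrast / (4n)
SS_A = contrast_A ** 2 / (8 * n)    # single-degree-of-freedom sum of squares
print(contrast_A, effect_A, SS_A)   # 24 3.0 36.0
```

The same pattern, with the sign column for B, AB, C, and so on, gives every other sum of squares in the table.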
2k-p Fractional Factorial Design
*Fundamental Principles Regarding Factorial Effects
• Suppose there are k factors (A,B,...,J,K) in an experiment. All possible factorial effects
include
effects of order 1: A, B, ..., K (main effects)
effects of order 2: AB, AC, ....,JK (2-factor interactions)
………
Hierarchical Ordering principle
Lower order effects are more likely to be important than higher order effects.
Effects of the same order are equally likely to be important
Effect Sparsity Principle (Pareto principle)
The number of relatively important effects in a factorial experiment is small; only a few effects are likely to be significant.
We may not have the resources (time, money, etc.) for a full factorial design.
Consider the 2k design: if k = 7, 128 runs are required.
If the experimenter can reasonably assume that certain high-order interactions are negligible, information on the main effects and low-order interactions may be obtained by running only a fraction of the complete factorial experiment.
• These fractional factorial designs are among the most widely used designs in industrial/business experimentation.
• A major use of fractional factorials is in screening experiments—experiments in which
many factors are considered and the objective is to identify those factors (if any) that
have large effects.
• Screening experiments are usually performed in the early stages of a project when many
of the factors initially considered likely have little or no effect on the response.
• The factors identified as important are then investigated more thoroughly in subsequent
experiments.
Aliasing in the 2^(4-1) Design
• For four factors A, B, C and D, there are 2^4 - 1 = 15 effects: A, B, C, D, AB, AC, AD, BC, BD, CD, ABC, ABD, ACD, BCD, ABCD.
• Contrasts for the main effects are obtained by converting - to -1 and + to +1; contrasts for the other effects are obtained by multiplication.
• In the half fraction, effects become aliased in pairs. The aliasing is caused by the defining relation I = ABCD; that is, I (the intercept) and the 4-factor interaction ABCD are aliased.
Alias Structure for the 2^(4-1) Design with I = ABCD (denoted by d1)
Alias Structure:
I = ABCD
A = A ∗ I = A ∗ ABCD = BCD
B = B ∗ I = B ∗ ABCD = ACD
C = C ∗ I = C ∗ ABCD = ABD
D = D ∗ I = D ∗ ABCD = ABC
AB = AB ∗ I = AB ∗ ABCD = CD
AC = AC ∗ I = AC ∗ ABCD = BD
AD = AD ∗ I = AD ∗ ABCD = BC
All 16 factorial effects for A, B, C and D are thus partitioned into 8 groups, each with 2 aliased effects.
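Multiplying effect "words" is just letter cancellation (A ∗ A = I), which in code is a symmetric difference of letter sets. A small sketch for the defining relation I = ABCD:

```python
def multiply(word1, word2):
    # Multiply two effect words; any letter appearing twice cancels (A*A = I)
    letters = set(word1) ^ set(word2)  # symmetric difference
    return "".join(sorted(letters)) or "I"

defining_word = "ABCD"
effects = ["A", "B", "C", "D", "AB", "AC", "AD",
           "BC", "BD", "CD", "ABC", "ABD", "ACD", "BCD"]

# Each effect's alias is its product with the defining word
for e in effects:
    print(e, "=", multiply(e, defining_word))
# e.g. A = BCD, AB = CD, ABC = D
```

The same helper works for any defining relation, which makes it easy to check alias structures for larger fractions.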
The One-Quarter Fraction of the 2k Design
• For a moderately large number of factors, smaller fractions of the 2 k design are frequently
useful. Consider a one-quarter fraction of the 2 k design. This design contains 2k-2 runs and is
usually called a 2k-2 fractional factorial.
• The 2k-2 design may be constructed by first writing down a basic design consisting of the runs
associated with a full factorial in k - 2 factors and then associating the two additional columns
with appropriately chosen interactions involving the first k - 2 factors.
• Thus, a One-Quarter Fraction of the 2 k has two generators.
• If P and Q represent the generators chosen, then I = P and I = Q are called the generating
relations for the design
• All four fractions associated with the choice of generators +/-P and +/-Q are members of the
same family.
• The fraction for which both P and Q are positive is the principal fraction.
• The complete defining relation for the design consists of all the columns that are equal to the
identity column I.
• These will consist of P, Q, and their generalized interaction PQ; that is, the defining relation is I = P = Q = PQ. We call the elements P, Q, and PQ in the defining relation words. The aliases of any effect are produced by multiplying the column for that effect by each word in the defining relation.
• The experimenter should be careful in choosing the generators so that potentially important effects are not aliased with each other.
• Suppose we choose I = ABCE and I = BCDF as the design generators. The generalized interaction of the generators ABCE and BCDF is ADEF; therefore, the complete defining relation for this design is I = ABCE = BCDF = ADEF.
• Consequently, this is a resolution IV design. To find the aliases of any effect (e.g., A), multiply that effect by each word in the defining relation:
A = BCE = ABCDF = DEF
B = ACE = CDF = ABDEF
C = ABE = BDF = ACDEF
.
.
.
AB = CE = ACDF = BDEF
.
.
.
• It is easy to verify that every main effect is aliased with three- and five-factor interactions, whereas two-factor interactions are aliased with each other and with higher-order interactions.
• To construct the design, first write down the basic design, which consists of the 16 runs for a full 2^(6-2) = 2^4 design in A, B, C, and D. Then the two factors E and F are added by associating their plus and minus levels with the plus and minus signs of the interactions ABC and BCD, respectively.
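That construction can be sketched directly: enumerate the 16-run basic design in A, B, C, D, then compute E = ABC and F = BCD as column products, and check that the defining relation holds on every run.

```python
from itertools import product

runs = []
for a, b, c, d in product([-1, 1], repeat=4):  # basic 2^4 design
    e = a * b * c  # generator E = ABC
    f = b * c * d  # generator F = BCD
    runs.append((a, b, c, d, e, f))

assert len(runs) == 16
# The defining relation I = ABCE = BCDF = ADEF holds on every run:
for a, b, c, d, e, f in runs:
    assert a * b * c * e == 1
    assert b * c * d * f == 1
    assert a * d * e * f == 1
print("16-run 2^(6-2) design constructed")
```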
INTRODUCTION TO TAGUCHI METHOD
• The technique of laying out the conditions of experiments involving multiple factors was first proposed by the Englishman Sir R. A. Fisher. The method is popularly known as the factorial design of experiments.
• A full factorial design will identify all possible combinations for a given set of factors. Since most industrial experiments usually involve a significant number of factors, a full factorial design results in a large number of experiments.
• To reduce the number of experiments to a practical level, only a small set from all the possibilities is selected.
• The method of selecting a limited number of experiments which produces the most information is known as a partial fraction experiment. Although this method is well known, there are no general guidelines for its application or the analysis of the results obtained by performing the experiments. Taguchi constructed a special set of general design guidelines for factorial experiments that cover many applications.
Definition
The Taguchi method is a standardized way of conducting the design of experiments, based on well-defined guidelines. This method uses a special set of arrays called orthogonal arrays. These standard arrays stipulate the way of conducting the minimal number of experiments which could give the full information on all the factors that affect the performance parameter. The root of the orthogonal-arrays method lies in choosing the level combinations of the input design variables for each experiment.
A typical orthogonal array
• While there are many standard orthogonal arrays available, each of the arrays is meant for a specific number of independent design variables and levels.
• For example, if one wants to conduct an experiment to understand the influence of 4 different independent variables, each with 3 set values (level values), then an L9 orthogonal array might be the right choice.
• The L9 orthogonal array is meant for understanding the effect of 4 independent factors, each having 3 factor-level values. This array assumes that there is no interaction between any two factors. While in many cases the no-interaction assumption is valid, there are some cases where there is clear evidence of interaction.
Table 7.1 Layout of L9 orthogonal array.
Table 7.1 shows an L9 orthogonal array. There are nine experiments in total to be conducted, and each experiment is based on the combination of level values shown in the table. For example, the third experiment is conducted by keeping independent design variable 1 at level 1, variable 2 at level 3, variable 3 at level 3, and variable 4 at level 3.
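The array below is the standard textbook L9(3^4), which matches the row just described (experiment 3 = levels 1, 3, 3, 3) but is not necessarily an exact reproduction of Table 7.1. The code checks its two defining properties: level balance in each column and orthogonality of every column pair.

```python
from itertools import combinations

L9 = [
    [1, 1, 1, 1],
    [1, 2, 2, 2],
    [1, 3, 3, 3],
    [2, 1, 2, 3],
    [2, 2, 3, 1],
    [2, 3, 1, 2],
    [3, 1, 3, 2],
    [3, 2, 1, 3],
    [3, 3, 2, 1],
]

# Balance: in every column, each level appears the same number of times
for col in range(4):
    counts = [sum(1 for row in L9 if row[col] == lvl) for lvl in (1, 2, 3)]
    assert counts == [3, 3, 3]

# Orthogonality: every pair of columns contains all 9 level combinations
for i, j in combinations(range(4), 2):
    pairs = {(row[i], row[j]) for row in L9}
    assert len(pairs) == 9
print("L9 array is balanced and orthogonal")
```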
Properties of an orthogonal array
• Orthogonal arrays have the following special properties that reduce the number of experiments to be conducted:
1. The vertical column under each independent variable of the above table has a special combination of level settings. All the level settings appear an equal number of times. For the L9 array, under variable 4, level 1, level 2, and level 3 each appear three times.
2. All the level values of the independent variables are used in conducting the experiments.
3. The sequence of level values for conducting the experiments shall not be changed. This means one cannot
conduct experiment 1 with the variable 1, level 2 setup and experiment 4 with the variable 1, level 1 setup.
The reason for this is that the columns of the array are mutually orthogonal: for any pair of columns, the
inner product of the vectors of level weights is zero. If the above 3 levels are normalized between -1
and 1, the weighting factors for level 1, level 2 and level 3 are -1, 0 and 1 respectively. For columns 1 and 2
of the L9 array, the inner product is
(-1·-1 + -1·0 + -1·1) + (0·-1 + 0·0 + 0·1) + (1·-1 + 1·0 + 1·1) = 0
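The inner-product argument can be checked numerically for every pair of columns. A sketch, assuming the standard L9 layout and the -1/0/+1 weighting described above:

```python
import numpy as np
from itertools import combinations

# Standard L9 orthogonal array (levels 1, 2, 3).
L9 = np.array([
    [1, 1, 1, 1],
    [1, 2, 2, 2],
    [1, 3, 3, 3],
    [2, 1, 2, 3],
    [2, 2, 3, 1],
    [2, 3, 1, 2],
    [3, 1, 3, 2],
    [3, 2, 1, 3],
    [3, 3, 2, 1],
])

# Normalize levels 1, 2, 3 to the weights -1, 0, +1.
W = L9 - 2

# Every pair of factor columns has zero inner product, i.e. the
# columns are mutually orthogonal under this weighting.
for i, j in combinations(range(4), 2):
    print(i + 1, j + 1, int(W[:, i] @ W[:, j]))
```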
84
Assumptions of the Taguchi method
• The additive assumption implies that the individual or main effects of the independent
variables on the performance parameter are separable.
• Under this assumption, the effect of each factor can be linear, quadratic or of higher order,
but the model assumes that there exist no cross-product effects (interactions) among the
individual factors.
• That means the effect of independent variable 1 on the performance parameter does not
depend on the different level settings of any other independent variable, and vice versa. If at
any time this assumption is violated, the additivity of the main effects does not hold and the
analysis based on them becomes unreliable.
Designing an experiment with the Taguchi method involves the following steps:
1. Selection of the independent variables
2. Selection of the number of level settings for each independent variable
3. Selection of the orthogonal array
4. Assigning the independent variables to columns
5. Conducting the experiments
6. Analysis of the data
7. Inference
86
1. Selection of the independent variables
• Before conducting the experiment, the knowledge of the product/process under investigation is of prime
importance for identifying the factors likely to influence the outcome. In order to compile a comprehensive list of
factors, the input to the experiment is generally obtained from all the people involved in the project.
2. Selection of the number of level settings
Once the independent variables are decided, the number of levels for each variable is decided. The selection of the
number of levels depends on how the performance parameter is affected by the different level settings. If the
performance parameter is a linear function of the independent variable, then the number of level settings shall be 2.
However, if the independent variable is not linearly related, then one could go for 3, 4 or higher level settings
depending on whether the relationship is expected to be quadratic or of higher order.
87
• In the absence of exact nature of relationship between the independent variable and the performance
parameter, one could choose 2 level settings. After analyzing the experimental data, one can decide whether
the assumption of level setting is right or not based on the percent contribution and the error calculations.
3. Selection of the orthogonal array
• The minimum number of experiments that must be run to study the factors shall be more than the total
degrees of freedom available.
• In counting the total degrees of freedom, the investigator commits 1 degree of freedom to the overall mean
of the response under study.
88
• The number of degrees of freedom associated with each factor under
study equals one less than the number of levels available for that factor.
For example, with 11 independent variables each having 2 levels, the
total degrees of freedom is 1 + 11 × (2 − 1) = 12. Hence the selected orthogonal array shall
have a minimum of 12 experiments.
4. Assigning the independent variables to columns
• The order in which the independent variables are assigned to the vertical columns is
very essential. In case of mixed-level variables and interaction between variables, the
variables are to be assigned to the right columns as stipulated by the orthogonal array.
• Finally, before conducting the experiment, the actual level values of each design
variable shall be decided. It shall be noted that the significance and the percent
contribution of the independent variables change depending on the level values assigned.
90
5. Conducting the experiment
• Once the orthogonal array is selected, the experiments are conducted as per the level
combinations indicated by the array. It is necessary that all the experiments be conducted.
91
6. Analysis of the data
• Since each experiment is the combination of different factor levels, it is essential to
segregate the individual effect of each independent variable. This can be done by summing
up the performance parameter values for the corresponding level settings.
• For example, in order to find the main effect of the level 1 setting of independent
variable 2 (refer Table 7.1), sum the performance parameter values of the experiments
1, 4 and 7. Similarly, for level 2, sum the experimental results of 2, 5 and 8, and so on.
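The grouping just described can be written directly in code. The response values below are made up purely for illustration; only the L9 layout is standard:

```python
import numpy as np

# Standard L9 orthogonal array (levels 1, 2, 3).
L9 = np.array([
    [1, 1, 1, 1], [1, 2, 2, 2], [1, 3, 3, 3],
    [2, 1, 2, 3], [2, 2, 3, 1], [2, 3, 1, 2],
    [3, 1, 3, 2], [3, 2, 1, 3], [3, 3, 2, 1],
])
# Made-up performance-parameter values for experiments 1..9.
y = np.array([12.0, 15.0, 14.0, 18.0, 16.0, 13.0, 17.0, 19.0, 11.0])

var = 1  # 0-based column index of independent variable 2
totals = {}
for level in (1, 2, 3):
    runs = np.where(L9[:, var] == level)[0]   # experiments using this level
    totals[level] = y[runs].sum()
    # level 1 picks experiments 1, 4, 7; level 2 picks 2, 5, 8; and so on.
    print(level, (runs + 1).tolist(), totals[level], round(y[runs].mean(), 3))
```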
92
Once the mean value of each level of a particular independent variable is
calculated, the sum of squares of the deviations of these mean values from the
grand mean of the response is computed. If this sum of squares is small or
insignificant, one may conclude that the design variable is not influencing the
performance of the system.
93
7. Inference
• From the above experimental analysis, it is clear that the higher the sum of squares of an independent
variable, the more influence it has on the performance parameter. One can also calculate the ratio of the
sum of squares of a particular independent variable to the total sum of squares of all the variables.
This ratio gives the percent contribution of the independent variable to the performance parameter.
• In addition to the above, one could find the near-optimal solution to the problem. This near-optimum value may
not be the global optimal solution; however, it can be used as an initial/starting value for a subsequent,
more rigorous optimization technique.
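The sum-of-squares and percent-contribution calculation can be sketched as follows. The response values are the same made-up numbers used above, and "larger is better" is an assumption made only for this illustration:

```python
import numpy as np

# Standard L9 orthogonal array and made-up response values (illustration only).
L9 = np.array([
    [1, 1, 1, 1], [1, 2, 2, 2], [1, 3, 3, 3],
    [2, 1, 2, 3], [2, 2, 3, 1], [2, 3, 1, 2],
    [3, 1, 3, 2], [3, 2, 1, 3], [3, 3, 2, 1],
])
y = np.array([12.0, 15.0, 14.0, 18.0, 16.0, 13.0, 17.0, 19.0, 11.0])
grand = y.mean()

# Sum of squares of the level-mean deviations from the grand mean,
# weighted by the 3 runs per level, for each of the 4 factors.
ss = []
for col in range(4):
    s = sum(3 * (y[L9[:, col] == lv].mean() - grand) ** 2 for lv in (1, 2, 3))
    ss.append(s)

total = sum(ss)
pct = [100 * s / total for s in ss]   # percent contribution of each factor
print([round(p, 1) for p in pct])

# Near-optimal setting: for each factor, the level with the best mean
# (assuming a larger-is-better response).
best = [max((1, 2, 3), key=lambda lv: y[L9[:, c] == lv].mean()) for c in range(4)]
print(best)
```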
95
Table of Taguchi Designs (Orthogonal Arrays)
96
8. Response Surface Methods and Designs/RSM
• RSM is a collection of mathematical and statistical techniques useful for the modelling and
analysis of problems in which a response of interest is influenced by several variables and the
objective is to optimize this response.
• For example, suppose that an engineer wishes to find the levels of temperature (x1) and pressure
(x2) that maximize the yield (y) of a process. The process yield is a function of the levels of
temperature and pressure, say y = f(x1, x2) + ε, where ε represents the noise or error observed in the
response y.
• If we denote the expected response by E(y) = f(x1, x2) = η, then the surface represented by η = f(x1, x2)
is called a response surface.
98
• In most RSM problems, the form of the relationship between the response and
the independent variables is unknown. Thus, the first step in RSM is to find a
suitable approximation for the true functional relationship between y and the
set of independent variables.
• Usually, a low-order polynomial in some region of the independent variables is
employed. If the response is well modelled by a linear function of the
independent variables, then the approximating function is the first-order model
y = β0 + β1x1 + β2x2 + … + βkxk + ε
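A first-order model like this can be fitted by ordinary least squares. A minimal sketch with two coded factors and made-up yield data (the design and observations are illustrative, not from the text):

```python
import numpy as np

# Made-up coded factor settings (a 2^2 factorial plus two centre points)
# and hypothetical yield observations.
X = np.array([[-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0], [1.0, 1.0],
              [0.0, 0.0], [0.0, 0.0]])
y = np.array([40.0, 47.0, 44.0, 52.0, 45.5, 46.5])

# Design matrix with an intercept column: y = b0 + b1*x1 + b2*x2 + error.
A = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
print(beta.round(4))  # [b0, b1, b2]
```

Because the factorial columns are orthogonal, b1 and b2 here are simply the half-effects of each factor.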
• RSM is a sequential procedure. Often, when we are at a point on the response surface
that is remote from the optimum, such as the current operating conditions in fig. 8.1,
there is little curvature in the system and the first-order model will be appropriate.
• Our objective here is to lead the experimenter rapidly and efficiently along a path of
improvement toward the general vicinity of the optimum. Once the region of the
optimum has been found, a more elaborate model, such as the second-order model
y = β0 + Σ βixi + Σ βiixi² + ΣΣ βijxixj + ε,
may be employed to locate the optimum.
100
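The path of improvement from a fitted first-order model is the path of steepest ascent: the factors are moved in proportion to their coefficients. A small sketch, with coefficient values assumed purely for illustration:

```python
# First-order coefficients in coded units, assumed from a fitted model.
b1, b2 = 3.75, 2.25

# Path of steepest ascent from the design centre: step one coded unit
# at a time in x1 and move x2 in proportion b2/b1.
base = (0.0, 0.0)
path = [(base[0] + step, base[1] + step * b2 / b1) for step in range(1, 5)]
for point in path:
    print(point)
```

In practice, runs are made along this path until the response stops improving; a new first-order (and eventually second-order) design is then placed around that point.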
• The analysis of a response surface can be thought of as “climbing a hill,” where the top of the hill represents the point
of maximum response. If the true optimum is a point of minimum response, then we may think of “descending into a
valley.”
• The eventual objective of RSM is to determine the optimum operating conditions for the system or to determine a
region of the factor space in which operating requirements are satisfied.
101
Fig. 8.1The sequential nature of RSM
8.1. Response Surface Designs
• Response surface design methodology is often used to refine models after you have
determined the important factors using screening or factorial designs.
• The difference between a response surface equation and the equation for a factorial
design is the addition of the squared (or quadratic) terms that let you model curvature in
the response.
• A central composite design is the most commonly used response surface designed experiment.
Central composite designs are a factorial or fractional factorial design with centre points,
augmented with a group of axial points (also called star points) that let you estimate curvature.
The star or axial points are, in general, at some value α and −α on each axis.
• You can use a central composite design to:
• Efficiently estimate first- and second-order terms.
• Model a response variable with curvature by adding centre and axial points to a previously-done factorial design.
103
• Central composite designs are especially useful in sequential experiments because you can often
build on previous factorial experiments by adding axial and centre points.
*If we have k factors, then we have 2^k factorial points, 2k axial points and nc centre points.
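The run count above is easy to tabulate; a small sketch (the centre-point counts are example values):

```python
def ccd_runs(k, nc):
    """Total runs in a central composite design with k factors and
    nc centre points: 2**k factorial points + 2*k axial points + nc."""
    return 2 ** k + 2 * k + nc

print(ccd_runs(2, 5))  # 13 runs: 4 factorial + 4 axial + 5 centre
print(ccd_runs(3, 6))  # 20 runs: 8 factorial + 6 axial + 6 centre
```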
104
• The spherical designs are rotatable in the sense that the points are all equidistant from the centre.
• Rotatable refers to the variance of the response function. A rotatable design exists when there is an
equal prediction variance for all points a fixed distance from the centre, 0 (outstanding feature of CCD).
• If you pick the centre of your design space and run your experiments, all points that are equal distance
from the centre in any direction, have an equal variance of prediction.
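For a rotatable CCD, the axial distance is conventionally taken as the fourth root of the number of factorial points, α = (2^k)^(1/4); with this α the axial and factorial points lie on a common sphere and the prediction variance depends only on distance from the centre. A sketch:

```python
def rotatable_alpha(k):
    # Axial distance that makes a CCD rotatable: the fourth root
    # of the number of factorial points, alpha = (2**k) ** 0.25.
    return (2 ** k) ** 0.25

for k in (2, 3, 4):
    print(k, round(rotatable_alpha(k), 4))
```

For k = 2 this gives α = √2 ≈ 1.414, so the axial points lie on the same circle as the four corner points.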
105
*Why do we take about five or six centre points in the design?
• The reason is also related to the variance of a predicted value. When fitting a response
surface you want to estimate the response function in this design region where we are
trying to find the optimum.
• We want the prediction to be reliable throughout the region, and especially near the centre
since we hope the optimum is in the central region.
• By picking five to six centre points, the variance in the middle is approximately the same as
the variance at the edge (balance the precision at the edge of the design relative to the
middle).
• If you only had one or two centre points, then you would have less precision in the middle
than you would have at the edge.
106
*How do you select the region where you want to run the
experiment?
• If the lower natural unit is really the lowest number that you can test,
because the experiment won't work lower than this, then the star points
would fall outside the feasible region. In that case you can use a
face-centred design (α = 1), or rescale the factorial points so that all
runs stay within the operable region.
Box-Behnken Designs (BBD)
• BBD usually have fewer design points than CCD; thus, they are less expensive to run
for the same number of factors.
• They can efficiently estimate the first- and second-order coefficients; however, they
cannot build on runs from a pre-existing factorial experiment.
• BBD always have 3 levels per factor, unlike CCD, which can have up to 5.
• Also unlike CCD, BBD never include runs where all factors are at their extreme
settings, such as all of the low-level settings at the same time.
• Box-Behnken designs have treatment combinations that are at the midpoints of the
edges of the experimental space and require at least three continuous factors.
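The edge-midpoint structure can be generated directly, which also makes the "no run with every factor at an extreme" property easy to verify. A sketch in coded units (the helper name is ours, not a library function):

```python
from itertools import combinations

def box_behnken_points(k):
    """Edge-midpoint runs of a Box-Behnken design in coded units:
    each pair of factors takes the four (+/-1, +/-1) combinations
    while every remaining factor is held at its mid level 0
    (centre points are added separately)."""
    points = []
    for i, j in combinations(range(k), 2):
        for a in (-1, 1):
            for b in (-1, 1):
                p = [0] * k
                p[i], p[j] = a, b
                points.append(tuple(p))
    return points

pts = box_behnken_points(3)
print(len(pts))                  # 12 edge points for 3 factors
print(all(0 in p for p in pts))  # every run holds at least one factor at 0
```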
108
• Box-Behnken designs can also prove useful if you know the safe operating
zone for your process.
• CCD usually have axial points outside the "cube". These points may not be in
the region of interest, or may be impossible to run because they are beyond
safe operating limits. Box-Behnken designs do not have axial points, so you
can be sure that all design points fall within your safe operating zone.
109