
Design of Experiments and Analytical

Techniques

Ethiopian Institute of Textile and Fashion Technology,


Bahir Dar University

Million Ayele (Assistant Professor)


2020

1. Introduction to Statistics
• What is statistics?
 The science of collecting, organizing, and analyzing/interpreting data.

 A set of procedures and rules…for reducing large masses of data to


manageable proportions and allowing the user to draw meaningful
conclusions from those data.
 Statistical techniques play major roles in the development and analysis of research projects and the publication of research findings.

Introduction to Statistics cont…

• Statistical Data: Numerical statement of facts.


• Statistical Methods: A complete body of the principles and techniques
used in collecting, analyzing and interpreting data.

Why statistics?

a. To get oriented on how to draw logical conclusions about a target population based on a sample.

b. To be familiar with statistical language (concepts and technical terms in statistics).

c. To express ideas and relationships in numerical form.

d. To deal with repeated or repeatable measures.


Areas of statistics
Descriptive Statistics: collection, organization, presentation, and description of sample data (charts, graphs, tables, …).

Inferential Statistics: making decisions and drawing conclusions about populations.

With inferential statistics, you take data from samples and make generalizations about
a population.

• data collectiondata presentation/organizationdata summarizationdata


analyzData interpretation

Basic terms in statistics
Population: A collection, or set of individuals or
objects or events whose properties are to be
analyzed.
Sample: A subset of the population.
Variable: A characteristic about a population or
sample.
Data : The value of the variable associated with
one element of a population or sample. This
value may be a number, a word, or a symbol.
Nature of variables
• Qualitative/categorical: a variable that can be expressed in terms of words.
• Quantitative/numerical: a variable that can be expressed in terms of numbers.

 Discrete variables are variables whose values are countable and have a finite number of possibilities. The values are often (but not always) integers.
 Continuous variables are variables whose values are not countable and have an infinite number of possibilities (any variable having measuring units).

• Nominal variable is a qualitative variable where no ordering is possible or
implied in the levels. For example, the variable gender is nominal because
there is no order in the levels female/male.

• A nominal variable can have two or more levels (e.g., do you smoke? Yes/No; what is your gender? Female/Male).

• Ordinal variable is a qualitative variable with an order implied in the levels.


For instance, if the severity of road accidents has been measured on a scale
such as light, moderate and fatal accidents, this variable is a qualitative
ordinal variable because there is a clear order in the levels.

• Another good example is health, which can take values such as poor,
reasonable, good, or excellent.

Sources of Data

• Primary source: data collected in the field by the investigator in contact with respondents.
• Secondary source: data obtained in published form.

2. INTRODUCTION TO DESIGN OF EXPERIMENT

It is the branch of statistics that deals with designing or planning an experiment according to the objective of the study, conducting the experiment according to the plan, analysing the data using the analysis of variance technique, and finally writing conclusions on the basis of the analysis.

Strategy of experimentation
• Observation of a system or process leads to a theory or hypothesis about what makes the system work.
• An experiment demonstrates whether the theory or hypothesis is correct.
• An experiment is a test or series of runs in which purposeful changes are made to the input variables and the resulting changes in the response are observed.
• The input variables responsible for the change in the response are identified, leading to decisions about the system (improvement).

Strategy of experimentation, cont.…
 A well-designed experiment leads to excellent results and conclusions.

*The OFAT (one-factor-at-a-time) approach consists of selecting a starting point, or baseline set of levels, for each factor, and then successively varying each factor over its range with the other factors held constant at the baseline level.
range with the other factors held constant at the baseline level.

*The major disadvantage of the OFAT strategy is that it fails to


consider any possible interaction between the factors.

OFAT experiments are always less efficient than other methods


based on a statistical approach to design.
Strategy of experimentation, cont.…
*Factorial experiment: deals with several
factors at a time
• This is an experimental strategy in which
factors are varied together, instead of one at a
time.
• If there are k factors, each at two levels, the factorial design would require 2^k runs.
Strategy of experimentation, cont.…

• As the number of factors of interest increases, the number of


runs required increases rapidly
• If there are four or more factors, it is usually unnecessary to
run all possible combinations of factor levels.

*A fractional factorial experiment is a version of the basic


factorial design in which only a subset of the runs is used
Application of Experimental Design

• Experimentation is part of the scientific process by which we learn about how a system or process works.

• We learn through a series of activities in which we make conjectures/inferences about the process, perform experiments to generate data from the process, and use the information from the experiment to establish new conjectures, which lead to new experiments, and so on.

Application of Experimental Design Cont.…

Application in process development

1. Improved process yields

2. Reduced variability and closer conformance to nominal or target requirements

3. Reduced development time

4. Reduced overall costs.

*Improvement in product design activities, where new products are developed and existing ones improved.
Application of Experimental Design Cont…

Some applications in engineering design include:


• Evaluation and comparison of basic design configurations
• Evaluation of material alternatives
• Selection of design parameters
• Determination of key product design parameters that impact product performance
• Formulation of new products.

Basic Principles
*Statistical design of experiments refers to the process of planning the experiment so that

appropriate data will be collected and analysed by statistical methods, resulting in valid and

objective conclusions.

• The statistical approach to experimental design is necessary if we wish to draw meaningful

conclusions from the data.

*There are two aspects to any experimental problem: the design of the experiment and the

statistical analysis of the data. These two subjects are closely related because the method of

analysis depends directly on the design employed.


Basic Principles cont.…
*The three basic principles of experimental design are randomization,

replication, and blocking, which are parts of every experiment.


RANDOMIZATION: Allocation of the experimental material and the order in which the

individual runs of the experiment are to be performed are randomly determined.

• By properly randomizing the experiment, we also assist in “averaging out” the effects of

extraneous factors that may be present.

• Sometimes experimenters encounter situations where randomization of some aspect of

the experiment is difficult.


Basic Principles cont…
REPLICATION: Independent repeat run of each factor combination.

• Each of the observations should be run in random order.

*Replication has two important properties.

1) It allows the experimenter to obtain an estimate of the experimental error. This estimate of error

becomes a basic unit of measurement for determining whether observed differences in the data are

really statistically different.

2) If the sample mean is used to estimate the true mean response for one of the factor levels in the experiment, replication permits the experimenter to obtain a more precise estimate of this parameter.

*There is an important distinction between replication and repeated measurements.

*Replication reflects sources of variability both between runs and (potentially) within runs.
Basic Principles cont…
BLOCKING: is a design technique used to improve the precision with which

comparisons among the factors of interest are made.

• Often blocking is used to reduce or eliminate the variability transmitted

from nuisance factors—that is, factors that may influence the experimental

response but in which we are not directly interested.

• Generally, a block is a set of relatively homogeneous experimental

conditions.

Guidelines for Designing Experiments

• To use the statistical approach in designing and analysing an


experiment, it is necessary to have a clear idea in advance of exactly what is to be studied, how the data are to be collected, and at least a qualitative understanding of how these data are to be analysed.

Guidelines for Designing Experiments cont…

*An outline of the recommended procedure in designing and analysing experiment

1) Recognition of and statement of the problem

It is usually helpful to prepare a list of specific problems or questions that are to be

addressed by the experiment.

A clear statement of the problem often contributes substantially to better understanding

of the phenomenon being studied and the final solution of the problem.

Guidelines for Designing Experiments cont…

• Some (but by no means all) of the reasons for running experiments include:

a. Factor screening or characterization: to learn which factors have the most influence on the
response(s) of interest.

Screening experiments are extremely important when working with new systems or technologies
so that valuable resources will not be wasted using best guess and OFAT approaches.

b. Optimization: finding the settings or levels of the important factors that result in desirable
values of the response

• Finding the levels of factors that maximize the yield

Guidelines for Designing Experiments cont…

c. Confirmation: the experimenter is usually trying to verify that the system operates or behaves
in a manner that is consistent with some theory or past experience
• It is also used, for example, to verify that substituting a new material results in no change in the product characteristics that impact its use.
d. Robustness: These experiments often address questions such as what conditions would lead
to unacceptable variability in the response variables?
A variation of this is determining how we can set the factors in the system that we can control to
minimize the variability transmitted into the response from factors that we cannot control very
well
Guidelines for Designing Experiments cont…

2. Selection of the response variable: In selecting the response variables, the experimenter should be certain that these variables really provide useful information about the process under study.
• It is usually critically important to identify issues related to
defining the responses of interest and how they are to be
measured before conducting the experiment.
Guidelines for Designing Experiments cont…

3. Choice of factors, levels, and range: factors can be classified as either potential design factors or
nuisance factors.

3.1 The potential design factors are those factors that the experimenter may wish to vary in the

experiment. Often we find that there are many potential design factors; these may be classified as design factors, held-constant factors, and allowed-to-vary factors.

 The design factors are factors actually selected for study in the experiment.

 With allowed-to-vary factors, the experimental units or the “materials” to which the design factors are applied are usually nonhomogeneous; yet we often ignore this unit-to-unit variability and rely on randomization to balance out any material or experimental-unit effect.


Guidelines for Designing Experiments cont…

 We often assume that the effects of held-constant factors and allowed-to-vary factors are relatively small.

3.2 Nuisance factors may have large effects that must be accounted for, yet we may not be interested in
them in the context of the present experiment.
 Nuisance factors are often classified as controllable, uncontrollable, or noise factors.

 A controllable nuisance factor is one whose levels may be set by the experimenter

 When a factor that varies naturally and uncontrollably in the process can be controlled for purposes of
an experiment, we often call it a noise factor.

*In such situations, our objective is usually to find the settings of the controllable design factors that
minimize the variability transmitted from the noise factors.
Guidelines for Designing Experiments cont…
 Once the experimenter has selected the design factors, he or she must choose the ranges over which

these factors will be varied and the specific levels at which runs will be made

 When the objective of the experiment is factor screening or process characterization, it is usually best

to keep the number of factor levels low (usually two levels).

 In factor screening, the range over which the factors are varied should be broad. As we learn more

about which variables are important and which levels produce the best results, the region of interest in

subsequent experiments will usually become narrower.

Guidelines for Designing Experiments cont…

4. Choice of experimental design. If pre-experimental planning activities are done correctly, this step
is relatively easy.
 Choice of design involves consideration of sample size (number of replicates), selection of a
suitable run order for the experimental trials, and determination of whether or not blocking or
other randomization restrictions are involved.
 There are also several interactive statistical software packages that support this phase of
experimental design.
 The experimenter can enter information about the number of factors, levels, and ranges, and
these programs will either present a selection of designs for consideration or recommend a
particular design.
 In selecting the design, it is important to keep the experimental objectives in mind.
Guidelines for Designing Experiments cont…

5. Performing the experiment: When running the experiment, it is vital to monitor

the process carefully to ensure that everything is being done according to plan.

 Errors in experimental procedure at this stage will usually destroy experimental

validity.

 One of the most common mistakes is that the people conducting the experiment fail to set the variables to the proper levels on some runs.

Guidelines for Designing Experiments cont…

6. Statistical analysis of the data: Statistical methods should be used to analyse the data so that results and

conclusions are objective rather than judgmental in nature.

 Often we find that simple graphical methods play an important role in data analysis and interpretation

 It is also usually very helpful to present the results of many experiments in terms of an empirical model,

that is, an equation derived from the data that expresses the relationship between the response and the

important design factors.

 Residual analysis and model adequacy checking are also important analysis techniques.

*Statistical techniques coupled with good engineering or process knowledge and common sense will usually

lead to sound conclusions.


Guidelines for Designing Experiments cont…

7. Conclusions and recommendations. Once the data have been analysed,


the experimenter must draw practical conclusions about the results and
recommend a course of action

*A successful experiment requires knowledge of the important factors, the


ranges over which these factors should be varied, the appropriate number of
levels to use, and the proper units of measurement for these variables.

Using Statistical Techniques in Experimentation
 Much of the research in engineering, science, and industry is empirical and makes extensive use of
experimentation.
 Empirical research is a type of research methodology that makes use of verifiable evidence in order to arrive
at research outcomes.

*The proper use of statistical techniques in experimentation requires that the experimenter keep the
following points in mind:
1. Use your nonstatistical knowledge of the problem:
 Experimenters are usually highly knowledgeable in their fields (they typically have considerable practical experience and formal academic training in the area).

 This type of nonstatistical knowledge is invaluable/irreplaceable in choosing factors, determining factor


levels, deciding how many replicates to run, interpreting the results of the analysis, and so forth.
Using Statistical Techniques in Experimentation cont…

2. Keep the design and analysis as simple as possible


 Don’t be obsessive in the use of complex, sophisticated statistical techniques.
Relatively simple design and analysis methods are almost always best.

 In fact, a well-designed experiment will sometimes almost analyse itself!

Using Statistical Techniques in Experimentation cont…

3. Recognize the difference between practical and statistical significance:


 Just because two experimental conditions produce mean responses that are statistically
different, there is no assurance that this difference is large enough to have any practical
value.
 Suppose an experimenter modifies a process in the production of a specific material. He or she may get a statistically significant improvement in quality or productivity with the new process; however, if the production cost of the new process is much higher, the overall difference may be too small to be of practical value.

Using Statistical Techniques in Experimentation cont…
4. Experiments are usually iterative:

 Remember that in most situations it is unwise to design too comprehensive an experiment at the start of a study

 Successful design requires the knowledge of important factors, the ranges over which these factors are varied, the

appropriate number of levels for each factor, and the proper methods and units of measurement for each factor and

response.

 Generally, we are not well equipped to answer these questions at the beginning of the experiment, but we learn the

answers as we go along. This argues in favour of the iterative, or sequential, approach

 We usually should not invest more than about 25 percent of the resources of experimentation (runs, budget, time, etc.)

in the initial experiment.

 Often these first efforts are just learning experiences, and some resources must be available to accomplish the final

objectives of the experiment.

 There are situations where comprehensive experiments are entirely appropriate, but as a general rule most experiments should be iterative.
3. Introduction to Factorial Designs
• Many experiments involve the study of the effects of two or more factors. In general, factorial

designs are most efficient for this type of experiment.

• By a factorial design, we mean that in each complete trial or replicate of the experiment all

possible combinations of the levels of the factors are investigated. For example, if there are “a”

levels of factor A and “b” levels of factor B, each replicate contains all “ab” treatment

combinations.

• When factors are arranged in a factorial design, they are often said to be crossed.

• The effect of a factor is defined to be the change in response produced by a change in the level of

the factor. This is frequently called a main effect because it refers to the primary factors of interest
Introduction to Factorial Designs cont…

• Consider a two-factor factorial experiment with both design factors at two levels.
We call these levels “low” and “high” and denote them “-” and “+”, respectively.
The main effect of factor A in this two-level design can be thought of as the
difference between the average response at the high level of A and the average
response at the low level of A.

Fig 4.1 A two-factor factorial experiment, with the response (y) shown at the corners
Fig 4.2 A two-factor factorial experiment with interaction

AB = (52 − 30)/2 − (40 − 20)/2 = 1

Increasing factor A from the low level to the high level causes an average response
increase of 21 units. Similarly, increasing factor B from the low level to the high level
causes an average response increase of 11 units. Interaction effect value is 1.
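These effect calculations are easy to verify numerically. Below is a minimal Python sketch (an illustration added here, not part of the original slides) using the corner responses of Fig. 4.1:

```python
# Corner responses from Fig. 4.1, keyed by coded levels (A, B)
y = {(-1, -1): 20, (+1, -1): 40, (-1, +1): 30, (+1, +1): 52}

# Main effect = average response at the high level minus at the low level
A  = (y[(+1, -1)] + y[(+1, +1)]) / 2 - (y[(-1, -1)] + y[(-1, +1)]) / 2
B  = (y[(-1, +1)] + y[(+1, +1)]) / 2 - (y[(-1, -1)] + y[(+1, -1)]) / 2
# Interaction = half the difference between the A effects at B+ and B-
AB = (y[(+1, +1)] - y[(-1, +1)]) / 2 - (y[(+1, -1)] - y[(-1, -1)]) / 2

print(A, B, AB)  # 21.0 11.0 1.0
```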
• In some experiments, we may find that the difference in response between the levels of one factor is not the same at all levels of the other factors. When this occurs, there is an interaction between the factors. For example, consider the two-factor factorial experiment shown in Fig. 4.2. At the low level of factor B (or B−), the A effect is A = 50 − 20 = 30, and at the high level of factor B (or B+), the A effect is A = 12 − 40 = −28.
• Because the effect of A depends on the level chosen for factor B, we see that there is interaction between A and B. The magnitude of the interaction effect is the average difference in these two A effects, or AB = (−28 − 30)/2 = −29. Clearly, the interaction is large in this experiment.

• These ideas may be illustrated graphically (see Figs. 4.3 and 4.4).

• Note that the B− and B+ lines are approximately parallel (Fig 4.3), indicating a lack of interaction between factors A and B.
• B- and B+ lines are not parallel (Fig. 4.4). This indicates an interaction between factors A and B. Two-factor
interaction graphs such as these are frequently very useful in interpreting significant interactions and in
reporting results to non statistically trained personnel.

Fig. 4.3 A factorial experiment without interaction
Fig. 4.4 A factorial experiment with interaction
• There is another way to illustrate the concept of interaction. Suppose that both of our design factors are quantitative (such as temperature, pressure, time, etc.). Then a regression model representation of the two-factor factorial experiment could be written as

y = β0 + β1x1 + β2x2 + β12x1x2 + ɛ

where y is the response, the βs are parameters whose values are to be determined, x1 is a variable

that represents factor A, x2 is a variable that represents factor B, and ɛ is a random error term. The

variables x1 and x2 are defined on a coded scale from -1 to +1 (the low and high levels of A and B),

and x1x2 represents the interaction between x1 and x2

• The parameter estimates in this regression model turn out to be related to the effect estimates. For the experiment shown in Fig. 4.1, we found the main effects of A and B to be A = 21 and B = 11. The estimates of β1 and β2 are one-half the value of the corresponding main effect; therefore, β1 = 21/2 = 10.5 and β2 = 11/2 = 5.5. The interaction effect in Fig. 4.1 is AB = 1, so the value of the interaction coefficient in the regression model is β12 = 1/2 = 0.5. The parameter β0 is estimated by the average of all four responses: β0 = (20 + 40 + 30 + 52)/4 = 35.5. Therefore, the fitted regression model is

Y = 35.5 + 10.5x1 + 5.5x2 + 0.5x1x2

• The interaction coefficient (β12 = 0.5) is small relative to the main effect coefficients β1 and β2. We will take this to mean that the interaction is small and can be ignored. Therefore, dropping the term 0.5x1x2 gives us the model

Y = 35.5 + 10.5x1 + 5.5x2

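As a cross-check, the same coefficients fall out of ordinary least squares on the coded design matrix. A minimal sketch, assuming NumPy is available:

```python
import numpy as np

# Coded 2^2 design and the corner responses from Fig. 4.1
x1 = np.array([-1, 1, -1, 1])
x2 = np.array([-1, -1, 1, 1])
y = np.array([20, 40, 30, 52])

# Columns: intercept, x1, x2, x1*x2
X = np.column_stack([np.ones(4), x1, x2, x1 * x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # [35.5 10.5  5.5  0.5]
```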
Fig. 4.5 presents graphical representations of this model. In Fig. 4.5a we have a plot of the plane of y-values generated by the various combinations of x1 and x2. This three-dimensional graph is called a response surface plot. Fig. 4.5b shows the contour lines of constant response y in the x1, x2 plane. Notice that because the response surface is a plane, the contour plot contains parallel straight lines.

Fig. 4.5 Response surface and contour plot for the model Y = 35.5 + 10.5x1 + 5.5x2
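A contour plot like Fig. 4.5b can be generated directly from the fitted model. A sketch assuming NumPy and Matplotlib:

```python
import numpy as np
import matplotlib.pyplot as plt

# Evaluate the first-order model over the coded region [-1, 1]^2
x1, x2 = np.meshgrid(np.linspace(-1, 1, 50), np.linspace(-1, 1, 50))
y = 35.5 + 10.5 * x1 + 5.5 * x2  # no interaction term

cs = plt.contour(x1, x2, y)
plt.clabel(cs)  # label the constant-response lines
plt.xlabel("x1 (factor A, coded)")
plt.ylabel("x2 (factor B, coded)")
plt.title("Planar surface: parallel straight contour lines")
plt.show()
```

Replacing the model line with Y = 35.5 + 10.5x1 + 5.5x2 + 8x1x2 reproduces the curved (“twisted”) contours of Fig. 4.6b.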
• Now suppose that the interaction contribution to this experiment was not negligible; that is, the coefficient β12 was not small. Fig. 4.6 presents the response surface and contour plot for the model Y = 35.5 + 10.5x1 + 5.5x2 + 8x1x2.
• (We have let the interaction effect be the average of the two main effects.) Notice that the significant interaction effect “twists” the plane in Fig. 4.6a. This twisting of the response surface results in curved contour lines of constant response in the x1, x2 plane, as shown in Fig. 4.6b. Thus, interaction is a form of curvature in the underlying response surface model for the experiment. The response surface model for an experiment is extremely important and useful. Generally, when an interaction is large, the corresponding main effects have little practical meaning.
Fig. 4.6 Response surface and contour plot for the model Y=35.5+10.5x1+5.5x2+8x1x2
The Advantages of Factorial designs

*In summary, note that factorial designs have several advantages.

1) They are more efficient than one-factor-at-a-time experiments.

2) A factorial design is necessary when interactions may be present to avoid

misleading conclusions.

3) Factorial designs allow the effects of a factor to be estimated at several levels of

the other factors, yielding conclusions that are valid over a range of experimental

conditions.
5. The 2^k Factorial Design

• Factorial designs are widely used in experiments involving several factors where it is necessary to study

the joint effect of the factors on a response. However, several special cases of the general factorial design

are important because they are widely used in research work and also because they form the basis of

other designs of considerable practical value.

• The most important of these special cases is that of k factors, each at only two levels. These levels may be

quantitative, such as two values of temperature, pressure, or time; or they may be qualitative, such as

two machines, two operators, the “high” and “low” levels of a factor, or perhaps the presence and

absence of a factor. A complete replicate of such a design requires 2 × 2 × ⋯ × 2 = 2^k observations and is

called a 2^k factorial design.

• The 2^k design is particularly useful in the early stages of experimental work when many factors are likely to

be investigated.
The 2^2 Design
• The first design in the 2^k series is one with only two factors, say A and B, each run at two levels.

This design is called a 2^2 factorial design. The levels of the factors may be arbitrarily called “low”

and “high.” As an example, consider an investigation into the effect of the concentration of the

reactant and the amount of the catalyst on the conversion (yield) in a chemical process. The

objective of the experiment was to determine if adjustments to either of these two factors would

increase the yield. Let the reactant concentration be factor A and let the two levels of interest be

15 and 25 percent. The catalyst is factor B, with the high level denoting the use of 2 pounds of the

catalyst and the low level denoting the use of only 1 pound. The experiment is replicated three

times, so there are 12 runs. The order in which the runs are made is random, so this is a completely

randomized experiment. The data obtained are as follows:


The four treatment combinations in this design are shown graphically in
Figure 5.1. By convention, we denote the effect of a factor by a capital
Latin letter. Thus “A” refers to the effect of factor A, “B” refers to the
effect of factor B, and “AB” refers to the AB interaction. In the 2^2 design,
the low and high levels of A and B are denoted by “-” and “+,”
respectively, on the A and B axes. Thus, “-” on the A axis represents the
low level of concentration (15%), whereas “+” represents the high level
(25%), and on the B axis represents the low level of catalyst, “-” (1 pound)
and “+” denotes the high level (2 pounds).
Fig. 5.1 Treatment combinations in the 2^2 design
• The four treatment combinations in the design are also represented by lowercase letters, as shown in Fig. 5.1.

• We see from the figure that the high level of any factor in the treatment combination is denoted by the

corresponding lowercase letter and that the low level of a factor in the treatment combination is denoted by

the absence of the corresponding letter. Thus, “a” represents the treatment combination of A at the high

level and B at the low level, “b” represents A at the low level and B at the high level, and “ab” represents both factors at the high level.

• By convention, (1) is used to denote both factors at the low level. This notation is used throughout the 2^k series. Also, the symbols (1), a, b, and ab now represent the total of the response observations over all n replicates taken at that treatment combination, as illustrated in Fig. 5.1.

*Now the effect of A at the low level of B is [a − (1)]/n, and the effect of A at the high level of B is [ab − b]/n. Averaging these two quantities yields the main effect of A:

A = (1/2n)[ab + a − b − (1)]

The average main effect of B is found from the effect of B at the low level of A (i.e., [b − (1)]/n) and at the high level of A (i.e., [ab − a]/n):

B = (1/2n)[ab + b − a − (1)]

We define the interaction effect AB as the average difference between the effect of A at the high level of B and the effect of A at the low level of B. Thus,

AB = (1/2n)[ab + (1) − a − b]
Alternatively, we may define AB as the average difference between the effect of B at the high level of A and the effect of B at the low level of A. The formulas for the effects of A, B, and AB may be derived by another method. The effect of A can be found as the difference in the average response of the two treatment combinations on the right-hand side of the square in Fig. 5.1 (call this average ỹA+ because it is the average response at the treatment combinations where A is at the high level) and the two treatment combinations on the left-hand side (or ỹA−).

The effect of B is found as the difference between the average of the two treatment combinations on the top of the square (ỹB+) and the average of the two treatment combinations on the bottom (ỹB−).

Finally, the interaction effect AB is the average of the right-to-left diagonal treatment combinations in the square [ab and (1)] minus the average of the left-to-right diagonal treatment combinations (a and b).
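The three effect formulas can be wrapped in a small helper. A sketch; the totals in the example call are hypothetical placeholders, not the experimental data:

```python
def effects_2x2(t1, a, b, ab, n):
    """Effects from treatment totals over n replicates; t1 is the total for (1)."""
    A  = (ab + a - b - t1) / (2 * n)
    B  = (ab + b - a - t1) / (2 * n)
    AB = (ab + t1 - a - b) / (2 * n)
    return A, B, AB

# Hypothetical totals, for illustration only
print(effects_2x2(t1=80, a=100, b=60, ab=90, n=3))
```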
Using the experiment in Fig. 5.1, we may estimate the average effects from these formulas.

The effect of A (reactant concentration) is positive; this suggests that increasing A from the low level (15%) to the high level (25%) will increase the yield. The effect of B (catalyst) is negative; this suggests that increasing the amount of catalyst added to the process will decrease the yield.
The interaction effect appears to be small relative to the two main effects.
In experiments involving 2^k designs, it is always important to examine the magnitude and direction of the factor effects to determine which variables are likely to be important.

• It is often convenient to write down the treatment combinations in the order (1), a, b, ab. This is referred to as standard order (or Yates's order, for Frank Yates, who was one of Fisher's co-workers and who made many important contributions to designing and analysing experiments). Using this standard order, we see that the contrast coefficients used in estimating the effects are:

Effect |  (1) |  a  |  b  |  ab
A      |  -1  |  +1 |  -1 |  +1
B      |  -1  |  -1 |  +1 |  +1
AB     |  +1  |  -1 |  -1 |  +1

• Note that the contrasts for the effects A, B, and AB are orthogonal. Thus, the 2^2 (and all 2^k) designs are orthogonal designs. The ±1 coding for the low and high levels of the factors is often called the orthogonal coding or the effects coding.
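Orthogonality here just means that each pair of contrast vectors has a zero dot product, which is easy to verify. A minimal NumPy sketch:

```python
import numpy as np

# Contrast coefficients in standard order (1), a, b, ab
A = np.array([-1, 1, -1, 1])
B = np.array([-1, -1, 1, 1])
AB = A * B  # [ 1 -1 -1  1]

print(A @ B, A @ AB, B @ AB)  # 0 0 0 -> mutually orthogonal
```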
The Regression Model
• In a 2^k factorial design, it is easy to express the results of the experiment in terms of a regression model. Because the 2^k is just a factorial design, we could also use either an effects or a means model, but the regression model approach is much more natural and intuitive. For the chemical process experiment in Fig. 5.1, the regression model is

Y = β0 + β1x1 + β2x2 + ɛ

• where x1 is a coded variable that represents the reactant concentration, x 2 is a coded variable that represents

the amount of catalyst, and the β ’s are regression coefficients. The relationship between the natural

variables (the reactant concentration and the amount of catalyst) and the coded variables is

x1 = (Conc − (Conc_low + Conc_high)/2) / ((Conc_high − Conc_low)/2)
x2 = (Catalyst − (Catalyst_low + Catalyst_high)/2) / ((Catalyst_high − Catalyst_low)/2)
When the natural variables have only two levels, this coding will produce the familiar ±1 notation for the levels of the coded variables. To illustrate this for our example, note that

x1 = (Conc − 20)/5 and x2 = (Catalyst − 1.5)/0.5

Thus, if the concentration is at the high level (Conc = 25%), then x1 = +1; if the concentration is at the low level (Conc = 15%), then x1 = −1. Furthermore, if the catalyst is at the high level (Catalyst = 2 pounds), then x2 = +1; if the catalyst is at the low level (Catalyst = 1 pound), then x2 = −1.
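The natural-to-coded conversion is the same for any two-level factor. A small sketch using the ranges of this example:

```python
def code(natural, low, high):
    """Map a natural-units value onto the coded -1..+1 scale."""
    centre = (high + low) / 2
    half_range = (high - low) / 2
    return (natural - centre) / half_range

print(code(25, low=15, high=25))  # concentration at 25%  -> +1.0
print(code(15, low=15, high=25))  # concentration at 15%  -> -1.0
print(code(1.5, low=1, high=2))   # catalyst at midpoint  ->  0.0
```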

The fitted regression model is


• where the intercept is the grand average of all 12 observations, and the regression coefficients β1 and β2 are one-half the corresponding factor effect estimates.

• The regression coefficient is one-half the effect estimate because a regression

coefficient measures the effect of a one-unit change in x on the mean of y, and

the effect estimate is based on a two-unit change (from -1 to 1). This simple

method of estimating the regression coefficients results in least squares

parameter estimates.
Residuals and Model Adequacy
The regression model can be used to obtain the predicted or fitted value of y at the four points in the design. The
residuals are the differences between the observed and fitted values of y. For example, when the reactant concentration
is at the low level (x1 = -1) and the catalyst is at the low level (x2 = -1), the predicted yield

There are three observations at this treatment combination, and the residuals are

The remaining predicted values and residuals are calculated similarly. For the high level of the reactant concentration
and the low level of the catalyst, and

For the low level of the reactant concentration
and the high level of the catalyst,

Finally, for the high level of both factors

Figure 5.2 presents a normal probability plot of these residuals and a plot of the residuals versus the predicted
yield. These plots appear satisfactory, so we have no reason to suspect that there are any problems with the
validity of our conclusions.
The Response Surface plot
• The regression model
can be used to generate response surface plots. If it is desirable to construct these plots in terms of the
natural factor levels, then we simply substitute the relationships between the natural and coded variables
that we gave earlier into the regression model, yielding

ANOVA
• There are also special time-saving methods for performing the calculations manually.
Consider determining the sums of squares for A, B, and AB.

The F-value for the model is computed by the following formula:

F0 = MS_Model / MS_E

where SS_Model = SS_A + SS_B + SS_AB, and MS_Model is SS_Model divided by the model degrees of freedom.
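For a replicated 2^2 design these quantities can be computed by hand from the contrasts, each carrying one degree of freedom. A sketch (the error sum of squares is assumed to have been computed from the replicates beforehand; the values in the call are hypothetical):

```python
def anova_2x2(t1, a, b, ab, n, ss_error):
    """Effect sums of squares (1 df each) and F-ratios for a 2^2 design."""
    contrasts = {"A": ab + a - b - t1,
                 "B": ab + b - a - t1,
                 "AB": ab + t1 - a - b}
    ms_error = ss_error / (4 * (n - 1))  # error degrees of freedom = 4(n - 1)
    for name, c in contrasts.items():
        ss = c ** 2 / (4 * n)            # SS_effect = contrast^2 / (4n)
        print(f"{name}: SS = {ss:.2f}, F0 = {ss / ms_error:.2f}")

anova_2x2(t1=80, a=100, b=60, ab=90, n=3, ss_error=31.3)  # hypothetical
```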


The 2^3 Design
• Suppose that three factors, A, B, and C, each at two levels, are of interest. The design is called a 2^3 factorial design, and the eight treatment combinations can now be displayed geometrically as a cube, as shown in Figure 5.2a. Using the “+ and −” orthogonal coding to represent the low and high levels of the factors, we may list the eight runs in the 2^3 design as in Figure 5.2b.

Fig 5.2 The 2^3 factorial design


• We write the treatment combinations in standard order as (1), a, b, ab, c, ac, bc, and abc. Remember that these symbols also represent the total of all n observations taken at that particular treatment combination. Three different notations are widely used for the runs in the 2^k design. The first is the + and − notation, often called the geometric coding (or the orthogonal coding or the effects coding). The second is the use of lowercase letter labels to identify the treatment combinations. The final notation uses 1 and 0 to denote high and low factor levels, respectively, instead of + and −. These different notations are illustrated below for the 2^3 design.

Orthogonality: Two vectors of the same length are orthogonal if


the sum of the products of their corresponding elements is
0. Note: An experimental design is orthogonal if the effects of
any factor balance out (sum to zero) across the effects of the
other factors.

• There are seven degrees of freedom between the eight treatment combinations in the 2^3 design. Three degrees of freedom are associated with the main effects of A, B, and C. Four degrees of freedom are associated with interactions: one each with AB, AC, and BC, and one with ABC.

The effect of A is the difference in averages between the four treatment combinations on one face of the cube (A high) and the four on the opposite face (A low):

A = (1/4n)[a + ab + ac + abc − (1) − b − c − bc]

In a similar manner, the effect of B is the difference in averages between the four treatment combinations in the front face of the cube and the four in the back. This yields

B = (1/4n)[b + ab + bc + abc − (1) − a − c − ac]

The effect of C is the difference in averages between the four treatment combinations in the top face of the cube and the four in the bottom, that is,

C = (1/4n)[c + ac + bc + abc − (1) − a − b − ab]
• The two-factor interaction effects may be computed easily. A measure of the AB interaction is the difference between the average A effects at the two levels of B. By convention, one-half of this difference is called the AB interaction:

AB = (1/4n)[(1) − a − b + ab + c − ac − bc + abc]

In this form, the AB interaction is easily seen to be the difference in averages between runs on two diagonal planes in the cube in Figure 5.2b.

The ABC interaction is defined as the average difference between the AB interaction at the two different levels of C. Thus,

ABC = (1/4n)[abc − bc − ac + c − ab + b + a − (1)]
Signs for the main effects are determined by associating a plus with the high level and a minus with the low level. Once
the signs for the main effects have been established, the signs for the remaining columns can be obtained by multiplying
the appropriate preceding columns row by row. For example, the signs in the AB column are the product of the A and B
column signs in each row. The contrast for any effect can be obtained easily from this table.

Sums of squares for the effects are easily computed because each effect has a corresponding single-degree-of-freedom contrast. In the 2^3 design with n replicates, the sum of squares for any effect is

SS = (Contrast)^2 / (8n)
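The sign table and the contrast arithmetic are mechanical, so they generalize directly. A NumPy sketch for the 2^3 case:

```python
import itertools
import numpy as np

# A, B, C columns in Yates's standard order (1), a, b, ab, c, ac, bc, abc
signs = np.array(list(itertools.product([-1, 1], repeat=3)))[:, ::-1]
A, B, C = signs.T
columns = {"A": A, "B": B, "C": C, "AB": A * B, "AC": A * C,
           "BC": B * C, "ABC": A * B * C}

def effect_and_ss(totals, name, n):
    """Effect estimate and sum of squares from the 8 treatment totals."""
    contrast = float(columns[name] @ np.asarray(totals))
    return contrast / (4 * n), contrast ** 2 / (8 * n)
```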
2^(k-p) Fractional Factorial Design
*Fundamental Principles Regarding Factorial Effects
• Suppose there are k factors (A,B,...,J,K) in an experiment. All possible factorial effects
include
 effects of order 1: A, B, ..., K (main effects)
 effects of order 2: AB, AC, ....,JK (2-factor interactions)
………
Hierarchical Ordering principle
 Lower order effects are more likely to be important than higher order effects.
 Effects of the same order are equally likely to be important

Effect Sparsity Principle (Pareto principle)
 The number of relatively important effects in a factorial experiment is small

Effect Heredity Principle


 In order for an interaction to be significant, at least one of its parent factors should be

significant.

 We may not have the resources (time, money, etc.) for a full factorial design

 The number of runs required for a full factorial grows quickly

Consider the 2^k design:
 If k = 7 → 128 runs required

 Can estimate 127 effects

 Only 7 df for main effects, 21 for 2-factor interactions

 The remaining 99 df are for interactions of order ≥ 3


 Often only lower order effects are important
 Full factorial design may not be necessary according to

 Hierarchical ordering principle

 Effect Sparsity Principle

• If the experimenter can reasonably assume that certain high-order interactions

are negligible, information on the main effects and low-order interactions may

be obtained by running only a fraction of the complete factorial experiment.

• These fractional factorial designs are among the most widely used types of

designs for product and process design, process improvement, and

industrial/business experimentation.
• A major use of fractional factorials is in screening experiments—experiments in which
many factors are considered and the objective is to identify those factors (if any) that
have large effects.

• Screening experiments are usually performed in the early stages of a project when many
of the factors initially considered likely have little or no effect on the response.

• The factors identified as important are then investigated more thoroughly in subsequent
experiments.

Aliasing in the 2^(4-1) Design
• For four factors A, B, C, and D, there are 2^4 − 1 = 15 effects: A, B, C, D, AB, AC, AD, BC, BD, CD, ABC, ABD, ACD, BCD, ABCD

Contrasts for the main effects are obtained by converting − to −1 and + to +1; contrasts for the other effects are obtained by multiplication.
There are five other such pairs. They are caused by the defining relation I = ABCD; that is, I (the intercept) and the four-factor interaction ABCD are aliased.

Alias Structure for the 2^(4-1) with I = ABCD (denoted by d1)

Alias Structure:
I = ABCD
A = A ∗ I = A ∗ ABCD = BCD
B = B ∗ I = B ∗ ABCD = ACD
C = C ∗ I = C ∗ ABCD = ABD
D = D ∗ I = D ∗ ABCD = ABC
AB = AB ∗ I = AB ∗ ABCD = CD
AC = AC ∗ I = AC ∗ ABCD = BD
AD = AD ∗ I = AD ∗ ABCD = BC
All 16 factorial effects for A, B, C, and D are partitioned into 8 groups, each with 2 aliased effects.
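Alias relations follow mechanically from the defining relation: multiplying two effect labels corresponds to the symmetric difference of their letter sets, since any squared letter drops out. A small sketch:

```python
def alias(effect, word="ABCD"):
    """Alias of an effect under a single defining word (here I = ABCD)."""
    letters = set(effect) ^ set(word)  # symmetric difference: X * X = I
    return "".join(sorted(letters)) or "I"

for e in ["A", "B", "C", "D", "AB", "AC", "AD"]:
    print(f"{e} = {alias(e)}")
# A = BCD, B = ACD, C = ABD, D = ABC, AB = CD, AC = BD, AD = BC
```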
The One-Quarter Fraction of the 2^k Design
• For a moderately large number of factors, smaller fractions of the 2^k design are frequently useful. Consider a one-quarter fraction of the 2^k design. This design contains 2^(k-2) runs and is usually called a 2^(k-2) fractional factorial.
• The 2^(k-2) design may be constructed by first writing down a basic design consisting of the runs associated with a full factorial in k − 2 factors and then associating the two additional columns with appropriately chosen interactions involving the first k − 2 factors.
• Thus, a one-quarter fraction of the 2^k has two generators.

• If P and Q represent the generators chosen, then I = P and I = Q are called the generating
relations for the design

• All four fractions associated with the choice of generators ±P and ±Q are members of the

same family.

• The fraction for which both P and Q are positive is the principal fraction.

• The complete defining relation for the design consists of all the columns that are equal to the

identity column I.

• These will consist of P, Q, and their generalized interaction PQ; that is, the defining relation is I = P

= Q = PQ. We call the elements P, Q, and PQ in the defining relation words. The aliases of any

effect are produced by the multiplication of the column for that effect by each word in the

defining relation. Clearly, each effect has three aliases.

• The experimenter should be careful in choosing the generators so that potentially important

effects are not aliased with each other.


Example: consider the 2^(6-2) design.

• Suppose we choose I = ABCE and I = BCDF as the design generators. Now the generalized interaction of the generators ABCE and BCDF is ADEF; therefore, the complete defining relation for this design is I = ABCE = BCDF = ADEF.
• Consequently, this is a resolution IV design. To find the aliases of any effect (e.g., A), multiply that effect by each word in the defining relation. For A, this produces

A = BCE = ABCDF = DEF
B = ACE = CDF = ABDEF
C = ABE = BDF = ACDEF
⋮
AB = CE = ACDF = BDEF
⋮
• It is easy to verify that every main effect is aliased with three- and five-factor interactions, whereas two-factor interactions are aliased with each other and with higher order interactions.
• To construct the design, first write down the basic design, which consists of the 16 runs of a full 2^(6-2) = 2^4 design in A, B, C, and D. Then the two factors E and F are added by associating their plus and minus levels with the plus and minus signs of the interactions ABC and BCD, respectively.

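The construction itself is mechanical. A NumPy sketch that builds the 16-run 2^(6-2) design from the generators E = ABC and F = BCD:

```python
import itertools
import numpy as np

# Full 2^4 basic design in A, B, C, D (16 runs, coded -1/+1)
basic = np.array(list(itertools.product([-1, 1], repeat=4)))
A, B, C, D = basic.T

E = A * B * C  # from the generator I = ABCE
F = B * C * D  # from the generator I = BCDF

design = np.column_stack([A, B, C, D, E, F])
print(design.shape)  # (16, 6): a quarter of the 64-run full factorial
```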
INTRODUCTION TO TAGUCHI METHOD
• The technique of laying out the conditions of experiments involving multiple factors was first proposed by the

Englishman, Sir R.A. Fisher. The method is popularly known as the factorial design of experiments.

• A full factorial design will identify all possible combinations for a given set of factors. Since most industrial

experiments usually involve a significant number of factors, a full factorial design results in a large number of

experiments.

• To reduce the number of experiments to a practical level, only a small set from all the possibilities is selected.

• The method of selecting a limited number of experiments which produces the most information is known as a

partial fraction experiment. Although this method is well known, there are no general guidelines for its application

or the analysis of the results obtained by performing the experiments. *Taguchi constructed a special set of

general design guidelines for factorial experiments that cover many applications.
Definition

• Taguchi envisaged a new method of conducting the design of experiments based on well-defined guidelines. This method uses a special set of arrays called orthogonal arrays. These standard arrays stipulate the way of conducting the minimal number of experiments that can give full information on all the factors that affect the performance parameter. The root of the orthogonal arrays method lies in choosing the level combinations of the input design variables for each experiment.
A typical orthogonal array
• While there are many standard orthogonal arrays available, each of the arrays is meant for a specific number of

independent design variables and levels .

• For example, if one wants to conduct an experiment to understand the influence of 4 different independent

variables with each variable having 3 set values ( level values), then an L9 orthogonal array might be the right

choice.

• The L9 orthogonal array is meant for understanding the effect of 4 independent factors each having 3 factor

level values. This array assumes that there is no interaction between any two factors. While in many cases the no-interaction assumption is valid, there are some cases where there is clear evidence of interaction.

Table 7.1 Layout of the L9 orthogonal array (standard layout; levels 1–3):

Expt. | Var 1 | Var 2 | Var 3 | Var 4
  1   |   1   |   1   |   1   |   1
  2   |   1   |   2   |   2   |   2
  3   |   1   |   3   |   3   |   3
  4   |   2   |   1   |   2   |   3
  5   |   2   |   2   |   3   |   1
  6   |   2   |   3   |   1   |   2
  7   |   3   |   1   |   3   |   2
  8   |   3   |   2   |   1   |   3
  9   |   3   |   3   |   2   |   1

Table 7.1 shows an L9 orthogonal array. There are in total 9 experiments to be conducted, and each experiment is based on the combination of level values shown in the table. For example, the third experiment is conducted by keeping independent design variable 1 at level 1, variable 2 at level 3, variable 3 at level 3, and variable 4 at level 3.
Properties of an orthogonal array

• Orthogonal arrays have the following special properties that reduce the number of experiments to be conducted.

1. The vertical column under each independent variable in the above table has a special combination of level settings. All the level settings appear an equal number of times. For the L9 array, under variable 4, level 1, level 2, and level 3 each appear three times. This is called the balancing property of orthogonal arrays.

2. All the level values of the independent variables are used in conducting the experiments.
3. The sequence of level values for conducting the experiments shall not be changed. This means one cannot conduct experiment 1 with the variable 1, level 2 setup and experiment 4 with the variable 1, level 1 setup.

The reason for this is that each factor column is mutually orthogonal to every other column of level values: the inner product of the vectors of corresponding weights is zero. If the three levels are normalized between −1 and 1, then the weighting factors for level 1, level 2, and level 3 are −1, 0, and 1, respectively. Hence the inner product of the weighting factors of independent variable 1 and independent variable 3 is

(−1 × −1 + −1 × 0 + −1 × 1) + (0 × 0 + 0 × 1 + 0 × −1) + (1 × 0 + 1 × 1 + 1 × −1) = 0

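Both properties are easy to check numerically for the L9 array of Table 7.1. A NumPy sketch:

```python
import numpy as np

L9 = np.array([[1, 1, 1, 1], [1, 2, 2, 2], [1, 3, 3, 3],
               [2, 1, 2, 3], [2, 2, 3, 1], [2, 3, 1, 2],
               [3, 1, 3, 2], [3, 2, 1, 3], [3, 3, 2, 1]])

# Balancing property: every level appears three times in each column
for j in range(4):
    print(np.bincount(L9[:, j])[1:])  # [3 3 3]

# Orthogonality: weighted columns (levels mapped to -1, 0, +1)
W = L9 - 2
print(W.T @ W)  # all off-diagonal entries are 0
```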
Assumptions of the Taguchi method
• The additive assumption implies that the individual or main effects of the independent

variables on performance parameter/response are separable/independent.

• Under this assumption, the effect of each factor can be linear, quadratic or of higher order,

but the model assumes that there exists no cross product effects (interactions) among the

individual factors.

• That means the effect of independent variable 1 on performance parameter does not

depend on the different level settings of any other independent variables and vice versa. If at

any time this assumption is violated, then the additivity of the main effects does not hold, and the variables interact.


Designing an experiment

• The design of an experiment involves the following steps

1. Selection of independent variables

2. Selection of number of level settings for each independent variable

3. Selection of orthogonal array

4. Assigning the independent variables to each column

5. Conducting the experiments

6. Analyzing the data

7. Inference
1. Selection of the independent variables
• Before conducting the experiment, the knowledge of the product/process under investigation is of prime

importance for identifying the factors likely to influence the outcome. In order to compile a comprehensive list of

factors, the input to the experiment is generally obtained from all the people involved in the project.

2. Deciding the number of levels

Once the independent variables are decided, the number of levels for each variable is decided. The selection of

number of levels depends on how the performance parameter is affected due to different level settings. If the

performance parameter is a linear function of the independent variable, then the number of level settings shall be 2.

However, if the independent variable is not linearly related, then one could go for 3, 4 or higher levels depending on

whether the relationship is quadratic, cubic or higher order.

• In the absence of knowledge of the exact nature of the relationship between the independent variable and the performance parameter, one could choose 2 level settings.

the assumption of level setting is right or not based on the percent contribution and the error calculations.

3. Selection of an orthogonal array


• Before selecting the orthogonal array, the minimum number of experiments to be conducted shall be fixed

based on the total number of degrees of freedom present in the study.

• The minimum number of experiments that must be run to study the factors shall be more than the total

degrees of freedom available.

• In counting the total degrees of freedom the investigator commits 1 degree of freedom to the overall mean

of the response under study.

• The number of degrees of freedom associated with each factor under

study equals one less than the number of levels available for that factor.

For example, in case of 11 independent variables, each having 2 levels, the

total degrees of freedom is 12. Hence the selected orthogonal array shall

have at least 12 experiments. An L12 orthogonal array satisfies this requirement.

• Once the minimum number of experiments is decided, the further

selection of orthogonal array is based on the number of independent

variables and number of factor levels for each independent variable.


4. Assigning the independent variables to columns
• The order in which the independent variables are assigned to the vertical column is

very essential. In the case of mixed-level variables and interactions between variables, the variables are to be assigned to the right columns as stipulated by the orthogonal array.

• Finally, before conducting the experiment, the actual level values of each design

variable shall be decided. It shall be noted that the significance and the percent

contribution of the independent variables changes depending on the level values

assigned. It is the designer's responsibility to set proper level values.

5. Conducting the experiment

Once the orthogonal array is selected, the experiments are


conducted as per the level combinations. It is necessary that all the
experiments be conducted.

The performance parameter/response under study is noted down


for each experiment to conduct the sensitivity/SN ratio analysis.

6. Analysis of the data

• Since each experiment is the combination of different factor levels, it is essential to

segregate the individual effect of independent variables. This can be done by summing

up the performance parameter values for the corresponding level settings.

• For example, in order to find out the main effect of level 1 setting of the independent

variable 2 (refer Table 7.1), sum the performance parameter values of the experiments

1, 4 and 7. Similarly for level 2, sum the experimental results of 2, 5 and 8 and so on.

Once the mean value of each level of a particular independent variable is

calculated, the sum of square of deviation of each of the mean value from the

grand mean value is calculated. This sum of square deviation of a particular

variable indicates whether the performance parameter is sensitive to the

change in level setting. If the sum of square deviation is close to zero or

insignificant, one may conclude that the design variable is not influencing the

performance of the process.

In other words, by conducting the sensitivity analysis, and performing analysis

of variance (ANOVA), one can decide which independent factor dominates

over other and the percentage contribution of that particular independent

variable.
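A sketch of this level-mean and sum-of-squares calculation for an L9 experiment; the nine response values below are hypothetical placeholders, not real data:

```python
import numpy as np

L9 = np.array([[1, 1, 1, 1], [1, 2, 2, 2], [1, 3, 3, 3],
               [2, 1, 2, 3], [2, 2, 3, 1], [2, 3, 1, 2],
               [3, 1, 3, 2], [3, 2, 1, 3], [3, 3, 2, 1]])
y = np.array([12.0, 15.1, 14.2, 16.5, 13.8, 15.0, 17.2, 14.9, 16.1])  # hypothetical

grand = y.mean()
for j in range(4):
    means = [y[L9[:, j] == lvl].mean() for lvl in (1, 2, 3)]
    ss = 3 * sum((m - grand) ** 2 for m in means)  # 3 runs per level
    print(f"variable {j + 1}: level means {np.round(means, 2)}, SS = {ss:.3f}")
```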
7. Inference

• From the above experimental analysis, it is clear that the higher the value of the sum of squares of an independent variable, the more influence it has on the performance parameter. One can also calculate the ratio of

individual sum of square of a particular independent variable to the total sum of squares of all the variables.

This ratio gives the percent contribution of the independent variable on the performance parameter.

• In addition to above, one could find the near optimal solution to the problem. This near optimum value may

not be the global optimal solution. However, the solution can be used as an initial / starting value for the

standard optimization technique.

Signal - to - Noise ratio


• In the Taguchi method, the “signal” and “noise” of an output characteristic represent the desirable value and the undesirable variability, respectively; the signal-to-noise (S/N) ratio compares the two.
• The equation of S/N ratio depends on the criteria used for optimization of the quality characteristics.
The standard forms are: larger-the-better, S/N = −10 log10[(1/n) Σ (1/yi^2)]; smaller-the-better, S/N = −10 log10[(1/n) Σ yi^2]; and nominal-the-best, S/N = 10 log10(μ^2/σ^2), where yi denotes the i-th observation of the response variable, μ^2 the square of the mean, and σ^2 the variance of the replicated response values.
• To determine the effect each variable has on the output, the signal-to-noise ratio, or SN number, needs to be calculated for each experiment conducted.
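The three standard criteria can be sketched in Python as follows (NumPy assumed; the replicate values in the example call are illustrative):

```python
import numpy as np

def sn_larger_is_better(y):
    return -10 * np.log10(np.mean(1.0 / np.asarray(y, float) ** 2))

def sn_smaller_is_better(y):
    return -10 * np.log10(np.mean(np.asarray(y, float) ** 2))

def sn_nominal_is_best(y):
    y = np.asarray(y, float)
    return 10 * np.log10(y.mean() ** 2 / y.var(ddof=1))

print(sn_nominal_is_best([9.8, 10.1, 10.0]))
```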

Table of Taguchi Designs (Orthogonal Arrays)

8. Response Surface Methods and Designs/RSM
• RSM is a collection of mathematical and statistical techniques useful for the modelling and

analysis of problems in which a response of interest is influenced by several variables and the

objective is to optimize this response.

• For example, suppose that an engineer wishes to find the levels of temperature (x1) and pressure

(x2) that maximize the yield (y) of a process. The process yield is a function of the levels of

temperature and pressure, say Y= f (x1, x2) +ɛ, where ɛ represents the noise or error observed in the

response y.

• If we denote the expected response by E(y) = f (x1, x2) , then the surface represented by ɳ = f (x1, x2), is

called a response surface.


Figures: a three-dimensional response surface showing the expected yield (ɳ) as a function of temperature (x1) and pressure (x2), and a contour plot of the response surface.

• In most RSM problems, the form of the relationship between the response and
the independent variables is unknown. Thus, the first step in RSM is to find a
suitable approximation for the true functional relationship between y and the
set of independent variables.
• Usually, a low-order polynomial in some region of the independent variables is
employed. If the response is well modelled by a linear function of the
independent variables, then the approximating function is the first-order model
y = β0 + β1x1 + β2x2 + … + βkxk + ɛ

• If there is curvature in the system, then a polynomial of higher degree must be used, such as the second-order model

y = β0 + Σ βi xi + Σ βii xi^2 + ΣΣ βij xi xj + ɛ


• Almost all RSM problems use one or both of these models.

• RSM is a sequential procedure. Often, when we are at a point on the response surface

that is remote from the optimum, such as the current operating conditions in fig. 8.1,

there is little curvature in the system and the first-order model will be appropriate.

• Our objective here is to lead the experimenter rapidly and efficiently along a path of

improvement toward the general vicinity of the optimum. Once the region of the

optimum has been found, a more elaborate model, such as the second-order model,

may be employed, and an analysis may be performed to locate the optimum

• The analysis of a response surface can be thought of as “climbing a hill,” where the top of the hill represents the point
of maximum response. If the true optimum is a point of minimum response, then we may think of “descending into a
valley.”
• The eventual objective of RSM is to determine the optimum operating conditions for the system or to determine a
region of the factor space in which operating requirements are satisfied.

Fig. 8.1 The sequential nature of RSM
8.1. Response Surface Designs

• A response surface design is a set of advanced design of experiments (DOE) techniques

that help you better understand and optimize your response.

• Response surface design methodology is often used to refine models after you have

determined important factors using screening designs or factorial designs.

• The difference between a response surface equation and the equation for a factorial

design is the addition of the squared (or quadratic) terms that lets you model curvature in

the response, making them useful for:


• Understanding or mapping a region of a response surface. Response surface equations

model how changes in variables affect a response of interest.

• Finding the levels of variables that optimize a response.

• Selecting the operating conditions to meet specifications.


8.1.1 Central Composite designs (CCD)

• A central composite design is the most commonly used response surface designed experiment.
Central composite designs are a factorial or fractional factorial design with centre points,
augmented with a group of axial points (also called star points) that let you estimate curvature.
The star or axial points are, in general, at some value α and −α on each axis.
• You can use a central composite design to:
• Efficiently estimate first- and second-order terms.
• Model a response variable with curvature by adding centre and axial points to a previously-done factorial design.

• Central composite designs are especially useful in sequential experiments because you can often
build on previous factorial experiments by adding axial and centre points.

*If we have k factors, then we have 2^k factorial points, 2k axial points, and nc centre points.

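A sketch generating the design points of a CCD for k = 2 with the rotatable choice α = √2 and five centre points:

```python
import itertools
import numpy as np

k, alpha, nc = 2, np.sqrt(2), 5

factorial = np.array(list(itertools.product([-1, 1], repeat=k)))  # 2^k points
axial = np.vstack([sign * alpha * np.eye(k)[i]                    # 2k points
                   for i in range(k) for sign in (-1, 1)])
centre = np.zeros((nc, k))                                        # nc points

ccd = np.vstack([factorial, axial, centre])
print(ccd.shape)  # (13, 2): 2^k + 2k + nc = 4 + 4 + 5 runs
```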
• The spherical designs are rotatable in the sense that the points are all equidistant from the centre.
• Rotatable refers to the variance of the response function. A rotatable design exists when there is an
equal prediction variance for all points a fixed distance from the centre, 0 (outstanding feature of CCD).
• If you pick the centre of your design space and run your experiments, all points that are equal distance
from the centre in any direction, have an equal variance of prediction.

*Why do we take about five or six centre points in the design?
• The reason is also related to the variance of a predicted value. When fitting a response
surface you want to estimate the response function in this design region where we are
trying to find the optimum.
• We want the prediction to be reliable throughout the region, and especially near the centre
since we hope the optimum is in the central region.
• By picking five to six centre points, the variance in the middle is approximately the same as
the variance at the edge (balance the precision at the edge of the design relative to the
middle).
• If you only had one or two centre points, then you would have less precision in the middle
than you would have at the edge.

*How do you select the region where you want to run the

experiment?

• Remember, for each factor X we said we need to choose the lower

level and the upper level for the region of experimentation. We

usually picked the -1 and 1 as the boundary.

• If the lower natural unit is really the lowest number that you can test,

because the experiment won't work lower than this, then, the star

point is not possible because it is outside the range of

experimentation. In this case a BBD might be useful.


8.1.2. Box-Behnken designs (BBD)

• BBD usually have fewer design points than CCD, thus, they are less expensive to run

with the same number of factors.

• They can efficiently estimate the first- and second-order coefficients; however, they

can't include runs from a factorial experiment.

• BBD always have 3 levels per factor, unlike CCD which can have up to 5.

• Also unlike CCD, BBD never include runs where all factors are at their extreme settings, such as all factors at their low (or all at their high) level settings at the same time.

• Box-Behnken designs have treatment combinations that are at the midpoints of the

edges of the experimental space and require at least three continuous factors.
• Box-Behnken designs can also prove useful if you know the

safe operating zone for your process.

• CCD usually have axial points outside the "cube." These points

may not be in the region of interest, or may be impossible to

conduct because they are beyond safe operating limits.

• Box-Behnken designs do not have axial points, thus, you can be

sure that all design points fall within your safe operating zone.

