
An Overview of Design of Experiments

Dr. Keerti Jain


INDEX
• EXPERIMENTS
• A QUICK HISTORY OF DESIGN OF EXPERIMENTS
• WHY WE USE EXPERIMENTAL DESIGNS
• WHAT IS DESIGN OF EXPERIMENT
• HOW DESIGN OF EXPERIMENT CONTRIBUTES
• TERMINOLOGY
• ANALYSIS OF VARIANCE (ANOVA)
• BASIC PRINCIPLES OF DESIGN OF EXPERIMENTS
• SOME EXPERIMENTAL DESIGNS
5/20/2019 2
EXPERIMENT
Experiments involve manipulation of one or
more independent variables, and observing
the effect on some outcome (dependent
variable). Experiments can be done in the
field or in a laboratory.

A QUICK HISTORY OF
DESIGN OF EXPERIMENTS
• The agricultural origins, 1918 – 1940s
  o R. A. Fisher & his co-workers
  o Profound impact on agricultural science
  o Factorial designs, ANOVA
• The first industrial era, 1951 – late 1970s
  o Box & Wilson, response surfaces
  o Applications in the chemical & process industries

CONTD…
• The second industrial era, late 1970s – 1990
  o Quality improvement initiatives in many companies
  o Total quality management (TQM) ideas were important and became management goals
  o Taguchi and robust parameter design, process robustness
• The modern era, beginning 1990
  o Six Sigma, Lean Six Sigma
  o Clinical trials, mathematical biology
  o Algorithm design and analysis
  o Networking, group testing, and cryptography
Why we use
Experimental Designs
"All experiments are designed experiments, it is
just that some are poorly designed and some are
well-designed."
Experimental designs are used so that the
treatments may be assigned in an organized manner
to allow valid statistical analysis to be carried out
on the resulting data.

What is Design of
Experiments
Design of experiments is the logical planning (or construction) of an
experiment: a complete sequence of steps, decided ahead of time, which
ensures that appropriate data are obtained in a way that permits an
objective analysis of a particular problem and leads to valid and
precise inferences in the most economical and useful form.

Subject Matter of Design
of Experiments
It includes:
• Planning of the experiment
• Obtaining data from it
• Performing statistical analysis of the data obtained.

HOW DESIGN OF EXPERIMENT
CONTRIBUTES

• Reduce time to design/develop new products & processes
• Improve performance of existing processes
• Improve reliability and performance of products
• Achieve product & process robustness
• Evaluate materials and design alternatives, set component & system
tolerances, etc.
TERMINOLOGY

Key terms: control group, treatment group, factors, level, blocks,
randomness, replication, experimental error, ANOVA

TERMINOLOGY

• Control Group: A group assigned to the experiment, but not for the
purpose of being exposed to the treatment. Performance of this group
serves as a baseline.
• Treatment Group: The group in an experiment which receives the
specified treatment.
• Factor: When an experiment involves more than one variable, these
variables are referred to as factors.
• Level: The degree or intensity of a factor.
• Randomness: The property of completely chance events that are not
predictable.
• Replication: The repetition of the treatment under consideration.
• Blocks: The categories of subjects within a treatment group.
EXPERIMENTAL
ERROR

Experimental error is the variation in the responses among experimental units
which are assigned the same treatment and are observed under the same
experimental conditions. It is measured by SSE (or MSE).
Ideally, we would like experimental error to be zero.
This is impossible for one or more of the following reasons:
• There are inherent differences in the experimental units before they receive
treatments.
• There is variation in the devices that record the measurements.
• There is variation in applying or setting the treatments.
• There are extraneous factors other than the treatments which affect the
response.

ANALYSIS OF VARIANCE
(ANOVA)
• This statistical technique was first developed by R. A. Fisher and was
used extensively in agricultural experiments.

• It is mainly employed to compare the means of 3 or more samples, taking
into account the variation within each sample.

• ANOVA is a method for estimating the contribution made by each factor
to the total variation.

ANOVA TABLE FORMAT

Source of        Sum of         Degrees of         Mean Squares        F
Variation (SV)   Squares (SS)   Freedom (df)       (MS)

Treatment        SSt            dft = nt - 1       MSTR = SSt / dft    MSTR / MSE
Error            SSr            dfe = dfT - dft    MSE = SSr / dfe
Total            SST            dfT = nT - 1

Here nt is the number of treatments and nT is the total number of observations.

The Steps in Designing an
Experiment

• Step 1: Identify the problem or claim to be studied.
The statement of the problem needs to be as specific
as possible. As your text says, it must "identify the
response variable and the population to be studied".
• Step 2: Determine the factors affecting the response variable.
This is best done by an expert in the field, but we'll
be able to do this for most examples we'll be looking
at.

The Steps in Designing an
Experiment (Contd…)
• Step 3: Determine the number of experimental units.
In general, more experimental units are better.
Unfortunately, time and money will always be
limiting factors, so we have to decide on an appropriate
number.

The Steps in Designing an
Experiment (Contd…)
• Step 4: Determine the level(s) of each factor.
We split factors up into three categories:
o Control: If possible, we try to fix the level of factors that we're
not interested in.
o Manipulate: This is the treatment - we manipulate the levels of
the variable that we think will affect the response variable.
o Randomize: Often, there are factors we just can't control. To
mitigate their effect on the data, we randomize the groups. By
randomly assigning experimental units, these factors should be
equally spread among all groups.

The Steps in Designing an
Experiment (Contd…)
• Step 5: Conduct the experiment.

• Step 6: Test the claim.

• Step 7: Interpret the results

BASIC PRINCIPLES OF DESIGN
OF EXPERIMENTS

• Randomization
• Replication
• Local Control (Blocking)

Complete and Incomplete Block
Designs

SOME EXPERIMENTAL
DESIGNS

• Completely Randomized Design (CRD)
• Randomized Block Design (RBD)
• Latin Square Design (LSD)
• Factorial Designs
• Balanced Incomplete Block Design (BIBD)
• Nested Balanced Incomplete Block Designs (NBIBD)
• Balanced Incomplete Block Design with Nested Rows and Columns

Complete Designs: CRD, RBD, LSD

COMPLETELY RANDOMIZED DESIGN
(CRD)

• The completely randomized design is the simplest design: the
treatments are assigned to the experimental units completely at
random. This gives every experimental unit an equal probability
of receiving any given treatment.

• In a CRD, any difference among experimental units receiving
the same treatment is considered experimental error.

Characteristics of the CRD

• CRD is the simplest design to use.
• CRD is appropriate only for experiments with homogeneous
experimental units, such as laboratory experiments, where
environmental effects are relatively easy to control.
• The CRD is best suited for experiments with a small number of
treatments.
• For field experiments, where there is generally large variation among
experimental plots in such environmental factors as soil, the CRD is
rarely used.
• Every experimental unit has the same probability of receiving any
treatment.
• Treatments are assigned to experimental units completely at random
using a random number table, computer program, etc.
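
The random assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assign_crd`, the unit labels, and the choice of equal replication are hypothetical and not part of the original slides.

```python
import random


def assign_crd(units, treatments, seed=None):
    """Assign treatments to units completely at random (equal replication assumed)."""
    if len(units) % len(treatments) != 0:
        raise ValueError("this sketch assumes each treatment gets the same number of units")
    reps = len(units) // len(treatments)
    # Build one label per unit, then shuffle the labels across all units.
    labels = [t for t in treatments for _ in range(reps)]
    rng = random.Random(seed)  # a seed makes the plan reproducible
    rng.shuffle(labels)
    return dict(zip(units, labels))


# Hypothetical example: 15 units, 3 treatments, 5 replicates each.
units = [f"unit{i}" for i in range(1, 16)]
plan = assign_crd(units, ["A", "B", "C"], seed=1)
for unit, treatment in plan.items():
    print(unit, treatment)
```

Because the labels are shuffled over all units at once, no grouping of units influences which treatment a unit receives, which is the defining property of a CRD.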
EXAMPLE OF CRD
• In order to determine whether there is a significant difference in the
durability of 3 makes of computers, samples of size 5 are selected
from each make and the frequency of repair during the first year is
observed. The results are as follows:

        Makes
 A      B      C
 5      8      7
 6     10      3
 8     11      5
 9     12      4
 7      4      1
VARIOUS STEPS TO BE FOLLOWED
Step 1. Write the hypotheses to be tested.
Step 2. Calculate the correction factor.
Step 3. Calculate the total SS.
Step 4. Calculate the treatment SS.
Step 5. Calculate the error SS.
Step 6. Complete the ANOVA table.
Step 7. Look up the tabulated F-values.
Step 8. Draw conclusions.
HYPOTHESIS

H0: The three makes of computers do not differ significantly in
durability.

H1: At least one of the makes of computers differs significantly in
durability.

TABLE FOR CALCULATION

Make    $X_{ij}$             $T_i$   $n_i$   $T_i^2$   $T_i^2/n_i$   $\sum_j X_{ij}^2$

A       5   6   8   9   7     35      5      1225      245           255
B       8  10  11  12   4     45      5      2025      405           445
C       7   3   5   4   1     20      5       400       80           100

TOTAL                        100     15      3650      730           800

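Null Hypothesis:
H0: The 3 makes of computers do not differ in durability.

• Correction factor (N = 15 is the total number of observations):
$CF = \left(\sum_i T_i\right)^2 / N = (100)^2 / 15 = 666.67$

• Total sum of squares:
$SST = \sum_i \sum_j X_{ij}^2 - CF = 800 - 666.67 = 133.33$

• Sum of squares between makes:
$SSM = \sum_i T_i^2 / n_i - CF = 730 - 666.67 = 63.33$

• Error sum of squares:
$SSE = SST - SSM = 133.33 - 63.33 = 70$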
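
The hand calculation above can be checked with a short Python sketch. It follows the same correction-factor formulas; the function name `one_way_anova` and the dictionary layout are illustrative choices, not something taken from the slides.

```python
# Repair counts for the three makes, as given in the example.
data = {
    "A": [5, 6, 8, 9, 7],
    "B": [8, 10, 11, 12, 4],
    "C": [7, 3, 5, 4, 1],
}


def one_way_anova(groups):
    """One-way (CRD) ANOVA using the correction-factor method."""
    n_total = sum(len(x) for x in groups.values())
    grand_total = sum(sum(x) for x in groups.values())
    cf = grand_total ** 2 / n_total                       # correction factor
    ss_total = sum(v * v for x in groups.values() for v in x) - cf
    ss_treat = sum(sum(x) ** 2 / len(x) for x in groups.values()) - cf
    ss_error = ss_total - ss_treat
    df_treat = len(groups) - 1
    df_error = n_total - len(groups)
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error
    return {
        "CF": cf, "SST": ss_total, "SSM": ss_treat, "SSE": ss_error,
        "df": (df_treat, df_error), "F": ms_treat / ms_error,
    }


result = one_way_anova(data)
for name, value in result.items():
    print(name, value)
```

The computed F ratio still has to be compared with a tabulated critical value for the corresponding degrees of freedom, exactly as in the manual procedure.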
ANOVA TABLE
Source of        Sum of      Degrees of   Mean      F0
Variation        Squares     Freedom      Square

Between Makes     63.33       2           31.67     31.67 / 5.83 = 5.43
Within Makes      70.00      12            5.83
Total            133.33      14

From F-tables, F5%(v1 = 2, v2 = 12) = 3.89.

Since F0 > F5%, the null hypothesis is rejected.
There is a significant difference between the makes of computers.
ADVANTAGES
• Very flexible design (i.e. number of treatments and
replicates is only limited by the available number of
experimental units).
• Statistical analysis is simple compared to other designs.
• Loss of information due to missing data is small
compared to other designs due to the larger number of
degrees of freedom for the error source of variation.
• Provides the maximum number of degrees of freedom for error.

DISADVANTAGES
• If experimental units are not homogeneous and you fail
to minimize this variation using blocking, there may be a
loss of precision.
• Usually the least efficient design unless experimental
units are homogeneous.
• Not suited for a large number of treatments.

RANDOMIZED BLOCK DESIGN

RANDOMIZED BLOCK DESIGN (RBD)

• Any experimental design in which the randomization of treatments
is restricted to groups of experimental units within a predefined
block of units assumed to be internally homogeneous is called a
randomized block design.
• The experimental units are divided into n homogeneous groups of
equal or unequal sizes.
• These homogeneous groups are called blocks.
• The treatments are then randomly assigned to the experimental
units in each block, one treatment to a unit in each block.
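
The restricted randomization described above (a separate random order of treatments inside every block) can be sketched as follows. The function name `assign_rbd` and the block labels are hypothetical, and the sketch assumes each block has exactly one unit per treatment.

```python
import random


def assign_rbd(blocks, treatments, seed=None):
    """Randomize treatments independently within each block (one unit per treatment)."""
    rng = random.Random(seed)
    plan = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)      # fresh random order for this block only
        plan[block] = order     # unit i of the block receives order[i]
    return plan


# Hypothetical example: 4 blocks, 4 treatments.
plan = assign_rbd(["block1", "block2", "block3", "block4"], [1, 2, 3, 4], seed=7)
for block, order in plan.items():
    print(block, order)
```

Unlike a completely randomized design, every treatment is guaranteed to appear once in every block, so block-to-block differences do not inflate the comparison between treatments.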
CHARACTERISTICS OF
RBD

• A randomized block experiment is treated as a two-factor
experiment; the factors are blocks and treatments.
• The experimental units within each block are uniform.
• There is one observation per cell. It is assumed that there is no
interaction between blocks and treatments.
• The degrees of freedom for the interaction are used to estimate
error.
• Treatments are randomly assigned to the experimental units within
each block.

ANOVA TABLE FORMAT

Source of        Sum of         Degrees of               Mean Squares         F
Variation (SV)   Squares (SS)   Freedom (df)             (MS)

Blocks           SSb            dfb = nb - 1             MSB = SSb / dfb      MSB / MSErr
Treatment        SSt            dft = nt - 1             MSTR = SSt / dft     MSTR / MSErr
Error            SSe            dfe = dfT - dfb - dft    MSErr = SSe / dfe
Total            SST            dfT = nT - 1
Example
Four doctors each test 4 treatments for a certain disease and
observe the number of days each patient takes to recover.
The results are:

               Treatments
Doctor     1     2     3     4
A         10    14    19    20
B         11    15    17    21
C          9    12    16    19
D          8    13    17    20

Hypothesis
TWO-WAY ANALYSIS
H0A: There is no significant difference between the doctors.
H1A: At least one of the doctors is significantly different.

H0B: There is no significant difference between the treatments.
H1B: At least one of the treatments is significantly different.

Table for calculations
Doctor                   1      2      3         4       $T_i$   k    $T_i^2/k$   $\sum_j X_{ij}^2$
A                       10     14     19        20        63     4     992.25     1057
B                       11     15     17        21        64     4    1024.00     1076
C                        9     12     16        19        56     4     784.00      842
D                        8     13     17        20        58     4     841.00      922
$T_j$                   38     54     69        80       241    16    3641.25     3897
$T_j^2/h$              361    729   1190.25   1600       (sum = 3880.25)
$\sum_i X_{ij}^2$      366    734   1195      1602       (sum = 3897)

Here k = 4 is the number of treatments per doctor and h = 4 is the number of doctors per treatment.



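• Correction factor (N = 16 is the total number of observations):
$CF = \left(\sum_i T_i\right)^2 / N = (241)^2 / 16 = 3630.06$

• Total sum of squares:
$SS_{Total} = \sum_i \sum_j X_{ij}^2 - CF = 3897 - 3630.06 = 266.94$

• Sum of squares between doctors (row totals, k = 4 observations each):
$SS_D = \sum_i T_i^2 / k - CF = 3641.25 - 3630.06 = 11.19$

• Sum of squares between treatments (column totals, h = 4 observations each):
$SS_t = \sum_j T_j^2 / h - CF = 3880.25 - 3630.06 = 250.19$

• Error sum of squares:
$SS_e = SS_{Total} - SS_D - SS_t = 266.94 - 11.19 - 250.19 = 5.56$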
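
As a cross-check of the sums of squares above, here is a minimal Python sketch of the same two-way calculation with one observation per cell. The function name `rbd_anova` is an illustrative choice. Because it works with unrounded intermediate values, its F ratios can differ in the first or second decimal from ratios formed with rounded mean squares.

```python
# Recovery times: one row per doctor (block), one column per treatment.
table = {
    "A": [10, 14, 19, 20],
    "B": [11, 15, 17, 21],
    "C": [9, 12, 16, 19],
    "D": [8, 13, 17, 20],
}


def rbd_anova(rows):
    """Randomized block ANOVA with one observation per cell."""
    values = list(rows.values())
    h = len(values)        # number of blocks (doctors)
    k = len(values[0])     # number of treatments
    n_total = h * k
    cf = sum(sum(r) for r in values) ** 2 / n_total
    ss_total = sum(v * v for r in values for v in r) - cf
    ss_block = sum(sum(r) ** 2 for r in values) / k - cf
    col_totals = [sum(col) for col in zip(*values)]
    ss_treat = sum(t * t for t in col_totals) / h - cf
    ss_error = ss_total - ss_block - ss_treat
    df_block, df_treat = h - 1, k - 1
    df_error = df_block * df_treat
    ms_error = ss_error / df_error
    return {
        "SS_total": ss_total, "SS_block": ss_block,
        "SS_treat": ss_treat, "SS_error": ss_error,
        "F_block": (ss_block / df_block) / ms_error,
        "F_treat": (ss_treat / df_treat) / ms_error,
    }


result = rbd_anova(table)
for name, value in result.items():
    print(name, round(value, 4))
```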


ANOVA TABLE
Source of      Sum of      Degrees of   Mean      F0
Variation      Squares     Freedom      Square

Doctors         11.19       3            3.73     3.73 / 0.62 = 6.02
Treatments     250.19       3           83.40     83.40 / 0.62 = 134.52
Error            5.56       9            0.62     -
Total          266.94      15

From F-tables, F5%(v1 = 3, v2 = 9) = 3.86.

F0 > F5% for both doctors and treatments.

The difference between the doctors is significant and that between the
treatments is highly significant.
ADVANTAGES
• Complete flexibility: the design can have any number of treatments
and blocks.
• Provides more accurate results than the completely
randomized design due to grouping.
• Relatively easy statistical analysis even with missing
data.
• Some treatments may be replicated more times than
others.
• Whole treatments or entire replicates may be deleted
from the analysis.
DISADVANTAGES
• Not suitable for large numbers of treatments because
blocks become too large, and there is a possibility of
heterogeneity among the experimental units within the
blocks.
• Interactions between block and treatment effects
increase error.
• Serious problem with the analysis if a block factor by
treatment interaction effect actually exists and no
replication within blocks has been included. (solution:
use replication within blocks when possible).

LATIN SQUARE DESIGN (LSD)
• A Latin square is a square array of objects (letters A, B, C, …) such that each
object appears once and only once in each row and each column.
• Example - 4 x 4 Latin Square.
ABCD
BCDA
CDAB
DABC
• The Latin Square Design is for a situation in which there are two extraneous
sources of variation. If the rows and columns of a square are thought of as levels
of the two extraneous variables, then in a Latin square each treatment appears
exactly once in each row and column.
• With the Latin Square design we are able to control variation in two directions.
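
A cyclic construction reproduces the 4 x 4 square shown above, and a small checker confirms the defining property. This is a sketch only: the function names are hypothetical, and in practice the rows, columns, and treatment labels of such a standard square would usually be randomized before use.

```python
def cyclic_latin_square(symbols):
    """Build a t x t Latin square by shifting the symbol list one place per row."""
    t = len(symbols)
    return [[symbols[(r + c) % t] for c in range(t)] for r in range(t)]


def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    t = len(square)
    symbols = set(square[0])
    if len(symbols) != t:
        return False
    rows_ok = all(len(row) == t and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok


square = cyclic_latin_square("ABCD")
for row in square:
    print("".join(row))
print(is_latin_square(square))
```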
CHARACTERISTICS
OF LSD
• In LSD we have three factors:
Treatments, Rows and Columns
• The number of treatments = the number of rows = the number of
columns = t (say).
• The row-column combinations are represented by cells in a t x t array.
• The treatments are assigned to row-column combinations using a
Latin-square arrangement, that is, each row contains every treatment
and each column contains every treatment.
• Every treatment occurs exactly once in each row and each column.
ANOVA TABLE FORMAT
Source of        Sum of         Degrees of                     Mean Squares          F
Variation (SV)   Squares (SS)   Freedom (df)                   (MS)

Treatment        SSt            dft = nt - 1                   MSTR = SSt / dft      MSTR / MSErr
Rows             SSr            dfr = nr - 1                   MSRow = SSr / dfr     MSRow / MSErr
Columns          SSc            dfc = nc - 1                   MSCol = SSc / dfc     MSCol / MSErr
Error            SSe            dfe = dfT - dft - dfr - dfc    MSErr = SSe / dfe
Total            SST            dfT = nT - 1
Example

• The following data resulted from an experiment to compare
three burners B1, B2 and B3. An LSD was used, as the tests were
made on 3 engines and were spread over 3 days.

         Engine 1    Engine 2    Engine 3
Day 1    B1 - 16     B2 - 17     B3 - 20
Day 2    B2 - 16     B3 - 21     B1 - 15
Day 3    B3 - 15     B1 - 12     B2 - 13

HYPOTHESIS

H0A: There is no significant difference between the burners.
H1A: At least one of the burners is significantly different.
H0B: There is no significant difference between the days.
H1B: At least one of the days is significantly different.
H0C: There is no significant difference between the engines.
H1C: At least one of the engines is significantly different.

                     E1        E2        E3        $T_i$   $T_i^2/n$   $\sum_j X_{ij}^2$
Day 1                16(B1)    17(B2)    20(B3)     53      936.33      945
Day 2                16(B2)    21(B3)    15(B1)     52      901.33      922
Day 3                15(B3)    12(B1)    13(B2)     40      533.33      538
$T_j$                47        50        48        145      (sum = 2370.99)   (sum = 2405)
$T_j^2/n$            736.33    833.33    768       (sum = 2337.66)
$\sum_i X_{ij}^2$    737       874       794       (sum = 2405)

Rearranging the data values according to the burners:

Burner    $X_k$           $T_k$   $T_k^2/n$
B1        16   15   12     43      616.33
B2        17   16   13     46      705.33
B3        20   21   15     56     1045.33
                                  (sum = 2366.99)
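• Correction factor (N = 9 is the total number of observations):
$CF = \left(\sum_i T_i\right)^2 / N = (145)^2 / 9 = 2336.11$

• Total sum of squares:
$SS_{Total} = \sum_i \sum_j X_{ij}^2 - CF = 2405 - 2336.11 = 68.89$

• Sum of squares between days (n = 3 observations per day):
$SS_{D1} = \sum_i T_i^2 / n - CF = 2370.99 - 2336.11 = 34.88$

• Sum of squares between engines:
$SS_{D2} = \sum_j T_j^2 / n - CF = 2337.66 - 2336.11 = 1.55$

• Sum of squares between burners:
$SS_{D3} = \sum_k T_k^2 / n - CF = 2366.99 - 2336.11 = 30.88$

• Error sum of squares:
$SS_E = SS_{Total} - SS_{D1} - SS_{D2} - SS_{D3} = 1.55$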
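
The same decomposition can be sketched in Python as a check. The tuple layout and the function name `latin_square_anova` are illustrative assumptions. The code keeps full precision throughout, so its sums of squares and F ratios can differ in the second decimal from values obtained with rounded intermediate totals.

```python
# Observations from the burner example: (day, engine, burner, response).
observations = [
    (1, 1, "B1", 16), (1, 2, "B2", 17), (1, 3, "B3", 20),
    (2, 1, "B2", 16), (2, 2, "B3", 21), (2, 3, "B1", 15),
    (3, 1, "B3", 15), (3, 2, "B1", 12), (3, 3, "B2", 13),
]


def latin_square_anova(obs):
    """Latin square ANOVA: rows (days), columns (engines), treatments (burners)."""
    n_total = len(obs)
    t = round(n_total ** 0.5)            # side of the square, here 3
    cf = sum(o[3] for o in obs) ** 2 / n_total
    ss_total = sum(o[3] ** 2 for o in obs) - cf

    def ss_for(index):
        # Sum the responses per level; each level has t observations.
        totals = {}
        for o in obs:
            totals[o[index]] = totals.get(o[index], 0) + o[3]
        return sum(v * v for v in totals.values()) / t - cf

    ss = {"days": ss_for(0), "engines": ss_for(1), "burners": ss_for(2)}
    ss_error = ss_total - sum(ss.values())
    df_effect = t - 1
    df_error = (t - 1) * (t - 2)
    ms_error = ss_error / df_error
    f_ratios = {name: (value / df_effect) / ms_error for name, value in ss.items()}
    return ss, ss_error, f_ratios


ss, ss_error, f_ratios = latin_square_anova(observations)
print(ss, ss_error, f_ratios)
```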
ANOVA TABLE

S.V        S.S      d.f    M.S      F0

Days       34.88    2      17.44    17.44 / 0.775 = 22.50
Engines     1.55    2      0.775    0.775 / 0.775 = 1
Burners    30.88    2      15.44    15.44 / 0.775 = 19.92
Error       1.55    2      0.775
Total      68.89    8

CONCLUSION

• From F-tables, F5%(v1 = 2, v2 = 2) = 19.00.

• F0 (19.92) > F5%: there is a significant difference between the
burners.

• F0 (22.50) > F5%: the difference between the days is significant.

• F0 (1) < F5%: the difference between the engines is not significant.
ADVANTAGES
• We can control variation in two directions. As a result, LSD is
generally more efficient than CRD and RBD when both sources of
variation are present.
• Being an incomplete 3-way design, it is more economical than
the corresponding complete 3-way design. Instead of
$r^3$ experimental units, only $r^2$ experimental units are
sufficient (r = number of treatments = number of rows = number
of columns).
• The analysis remains relatively simple even with
missing data.
DISADVANTAGES
• The number of treatments is limited to the number of replicates,
which seldom exceeds 10.
• If we have fewer than 5 treatments, the df for controlling random
variation is relatively large and the df for error is small.
• The number of treatments must equal the number of replicates.
• The experimental error is likely to increase with the size of the
square.
• Interactions between rows and columns, rows and treatments, and
columns and treatments cannot be evaluated separately.

FACTORIAL
EXPERIMENT
Factorial designs include two or more factors, each
having more than one level or treatment. Participants
typically are randomized to a combination that includes
one treatment or level from each factor.
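
To make the idea of "a combination that includes one level from each factor" concrete, the sketch below enumerates all treatment combinations of a small factorial layout. The factor names and levels are hypothetical and chosen purely for illustration.

```python
from itertools import product

# Hypothetical factors and levels (not from the slides): a 2 x 2 x 3 factorial.
factors = {
    "temperature": ["low", "high"],
    "pressure": ["low", "high"],
    "catalyst": ["X", "Y", "Z"],
}

# Each treatment combination takes exactly one level from every factor.
treatment_combinations = [
    dict(zip(factors, levels)) for levels in product(*factors.values())
]

print(len(treatment_combinations))
for combo in treatment_combinations:
    print(combo)
```

The number of combinations is the product of the numbers of levels, which is why full factorial designs grow quickly as factors are added.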

BALANCED INCOMPLETE
BLOCK DESIGNS (BIBD)
• Used when the number of treatments exceeds the number of
units per block (or logistics do not allow all treatments to be
assigned to all blocks)
• Number of treatments: v
• Number of blocks: b
• Replicates per treatment: r < b
• Block size: k < v
• Total number of units: N = kb = rv
• Every pair of treatments appears together in $\lambda = r(k-1)/(v-1)$
blocks, where $\lambda$ must be an integer
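
The parameter relations above can be verified on a concrete design. The sketch below checks a well-known small example, the seven-block design with v = 7 and k = 3 (the Fano plane), which is not part of the original slides; the function name `bibd_parameters` is likewise hypothetical.

```python
from itertools import combinations


def bibd_parameters(blocks):
    """Return (v, b, r, k, lam) if the blocks form a BIBD, otherwise raise ValueError."""
    treatments = sorted({t for blk in blocks for t in blk})
    v, b = len(treatments), len(blocks)
    sizes = {len(set(blk)) for blk in blocks}
    reps = {sum(t in blk for blk in blocks) for t in treatments}
    pair_counts = {
        sum(p in blk and q in blk for blk in blocks)
        for p, q in combinations(treatments, 2)
    }
    # Balanced: constant block size, constant replication, constant pair count.
    if len(sizes) != 1 or len(reps) != 1 or len(pair_counts) != 1:
        raise ValueError("blocks do not form a balanced incomplete block design")
    k, r, lam = sizes.pop(), reps.pop(), pair_counts.pop()
    if b * k != r * v or lam * (v - 1) != r * (k - 1):
        raise ValueError("parameter identities are violated")
    return v, b, r, k, lam


# Illustrative design: 7 treatments in 7 blocks of size 3.
blocks = [
    (1, 2, 3), (1, 4, 5), (1, 6, 7),
    (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6),
]
print(bibd_parameters(blocks))
```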

NESTED DESIGNS

• In certain multifactor experiments, the levels of one factor are
similar but not identical for different levels of another factor;
that is, each level of the first factor is unique to a particular
level of the second. Such an arrangement is called a hierarchical
or nested design.
http://jrss.in/data/5I12.pdf

Thank
you
