You are on page 1of 10

Probability and Statistics 1

Unit - 4 Design of Experiments


Introduction
Design of experiments may be defined as the logical construction of the experiment
in which the degree uncertainty with which the inference is drawn may be well defined.
Basic Principles of Experimental Design
The three basic principle of experimental design (i) replication (ii) randomization
(iii) local control.
Replication
Replication mean the repetition of an experiment. It means the repetition of the
treatment under investigation. It helps us in estimating the expermental error. Thus, the rep-
etition of treatments results in more reliable estimate than is possible with single observation.
Randomization
Randomization is a technique by which two or more treatments are applied to differ-
ent group of the sample, the treatments to be applied to any group being decided by random
sampling technique.
Local Control
The process of reducing the experimental error by dividing the relatively the hetero-
geneous experimental area into homogeneous is known local control.
Analysis of variance
Analysis of variance (ANOVA) is a technique, which is used to test the equality of
three or more population means by comparing the sample variances using F-distribution. This
technique split up the variance into its various components, usually into two parts.

1. Variance between the sample.

2. Variance within the sample.

ANOVA is used to determining

• Which of various training methods produces the faster learning record.

• Whether the effects of some fertilizers on the yields are significantly different.

• Whether the mean qualities of outputs of various machines differ significantly. etc.,

In general analysis of variance studies mainly homogeneity of populations by separating


the total variance into its various components. According to R.A. Fisher, analyze of variance
(ANOVA) is the separation of variance ascribable to one group of causes from the variance
ascribable to other groups.
Assumptions in Analysis of Variance

1. Each of the samples is drawn from a normal population.

2. The variances for the population from which samples have been drawn are equal.

3. The variation of each value around its own grand mean should be independent for
each value.
2 Design of Experiments

Analysis of Variance for one-way classification (Completely Randomized Design)


In this case observations (or data) are classified according to one factor (or) criterion.
Step-I:Calculation of total variation
1. N is total number of observations.
P P P
2. T = x1 + x2 + · · · + xk .
T2
3. Correction Factor = CF = .
N
x21 +
x22 + · · · + x2k − CF .
P P P
4. Total Sum of Squares = SST =

( x1 ) 2 ( x2 ) 2 ( xk ) 2
P P P
5. Sum of Squares between the sample = SSB = + +···+ − CF .
n1 n2 nk
6. Sum of Squares with in the sample = SSW = SST-SSB.
Step-II:ANOVA table

Source of Sum of Degree of Mean sum of


Test Statistic
variation Squares freedom squares
SSB M SB
Between the sample SSB k−1 M SB = F =
k−1 M SW
SSW
Within the sample SSW N −k M SW =
N −k
Total SST N −1
Example:1 The following data give the yields of 14 plots of land in three samples of 5, 5, 4
plots under three varieties of seeds A, B and C.
A 35 40 33 36 31
B 30 25 34 28 33
C 28 24 30 26 -
To construct the ANOVA table and use it to test whether difference in the average yields under
the three varieties of seeds is significant or not.
Solution
Null Hypothesis H0 : There is no significant difference in average yields under the three varieties
of seeds.
Alternative Hypothesis H1 : The difference is significant.
Step-I:
Subtracting 30 from each of the given value, to simplify calculations.
x1 x2 x3 x21 x22 x23
5 0 -2 25 0 4
10 -5 -6 100 25 36
3 4 0 9 16 0
6 -2 -4 36 4 16
1 3 - 1 9 -
25 0 -12 sum 171 54 56
Probability and Statistics 3

Step-II:

N = n1 + n2 + n3 = 5 + 5 + 4 = 14.
X X X
T = x1 + x2 + x3 = 25 + 0 − 12 = 13.
T2 (13)2 169
Correction f actor = = = = 12.07.
X X XN 14 14
SST = x21 + x22 + x23 − CF = 171 + 54 + 56 − 12.07 = 268.93.
( x1 ) 2 ( x2 ) 2 ( x3 ) 2 (25)2 (0)2 (−12)2
P P P
SSB = + + − CF = + + − 12.07 = 148.93.
n1 n2 n3 5 5 4
SSW = SST − SSB = 268.93 − 148.93 = 120.

Step-III:ANOVA table

Source of Sum of Degree of Mean sum of


Test Statistic
variation Squares freedom squares
SSB M SB
Source of M SB = F =
SSB = 148.93 k−1=3−1=2 k−1 M SW
variation 148.93 74.47
= = 74.47 = = 6.83
2 10.9
SSW
Within M SW =
SSW = 120 N − k = 14 − 3 = 11 N −k
the sample 120
= = 10.9
11
Total SST = 268.93 N − 1 = 14 − 1 = 13

Step-IV:
Table value of F at 5% level of significance for (2, 11) degrees of freedom is 3.98
Here the calculated value of F = 6.83 is greater then the table value of F = 3.98
∴ H0 is rejected.
i.e., There is a significant difference in the average yields under three varieties of seeds.
Example:2 A manager want to test whether its three salesman A, B and C tend to make sales
of the same size or differ in their selling abilities. During a week there have been 14 sale calls.
A made 5 calls, B made 4 calls and C made 5 calls. Following are the sales data for the week
of the three salesman.
A 500 400 700 800 600
B 300 700 400 600 -
C 500 300 500 400 300
Perform the analysis of variance and draw your conclusion. Solution
Null Hypothesis H0 : The three salesmen tend to make sales of the same size.
Alternative Hypothesis H1 : They differ in their selling abilities.
Step-I:
As the sales data have a common factor 100, we divide the given values by 100 and
write the data and their squares in the following tables.
4 Design of Experiments

x1 x2 x3 x21 x22 x23


5 3 5 25 9 25
4 7 3 16 49 9
7 4 5 49 16 25
8 6 4 64 36 16
6 - 3 36 - 9
30 20 20 sum 190 110 84

Step-II:

N = n1 + n2 + n3 = 5 + 4 + 5 = 14.
X X X
T = x1 + x2 + x3 = 30 + 20 + 20 = 70.
T2 (70)2 4900
Correction f actor = = = = 350.
X X XN 14 14
SST = x21 + x22 + x23 − CF = 190 + 110 + 84 − 350 = 34.
( x1 ) 2 ( x2 ) 2 ( x3 ) 2 (30)2 (20)2 (20)2
P P P
SSB = + + − CF = + + − 350 = 10.
n1 n2 n3 5 4 5
SSW = SST − SSB = 34 − 10 = 24.

Step-III:ANOVA table

Source of Sum of Degree of Mean sum of


Test Statistic
variation Squares freedom squares
SSB M SB
Between M SB = F =
SSB = 10 k−1=3−1=2 k−1 M SW
the sample 10 5
= =5 = = 2.29
2 2.18
SSW
Within M SW =
SSW = 24 N − k = 14 − 3 = 11 N −k
the sample 24
= = 2.18
11
Total SST = 34 N − 1 = 14 − 1 = 13

Step-IV:
Table value of F at 5% level of significance for (2, 11) degrees of freedom is 3.98
Here the calculated value of F = 2.29 is greater then the table value of F = 3.98
∴ H0 is accepted.
i.e., There is no difference in the sales ability of the three salesmen.

Example:3 A company wish to test whether its three salesmen A, B and C tend to make
sales of the same size or whether their selling ability as measured by the average size of their
sales. During the previous week there have been 15 sale calls - A made 5 calls, B made 5 calls
and C also made 5calls. The salesmans sales record is given in the following table.
Probability and Statistics 5

A B C
5000 4000 6000
3000 3000 7000
4000 3000 3000
0 6000 4000
3000 0 5000

Perform the analysis of variance and test the hypothesis whether there is any significant differ-
ence in the sales size of the salesmen at 5% level of significance.
Analysis of Variance for two-way classification ( Randomized Block Design)
In this case observations (or data) are classified according to two different factor (or) cri-
terion.
Step-I:Calculation of total variation

1. N is total number of observations.


P P P
2. T = x1 + x2 + · · · + xk .

T2
3. Correction Factor = CF = .
N
x21 +
x22 + · · · + x2c − CF .
P P P
4. Total Sum of Squares = SST =

( x1 ) 2 ( x2 ) 2 ( x k )2
P P P
5. Sum of Squares between the columns = SSC = + +···+ − CF .
n1 n2 nc

( x 1 ) 2 ( x 2 )2 ( xr ) 2
P P P
6. Sum of Squares between the rows = SSR = + + ··· + − CF .
m1 m2 mr
7. Sum of Squares due to errors=SST-SSC-SSR.

Step-II:ANOVA table

Source of Sum of Degree of Mean sum of


Test Statistic
variation Squares freedom squares
Between the SSC M SC
SSC c−1 M SC = F =
Columns c−1 M SE
Between the SSR M SR
SSR r−1 M SR = F =
Rows r−1 M SE
SSE
Residual error SSE (c − 1)(r − 1) M SE =
(c − 1)(r − 1)
Total SST N −1

Example:1 The following data represents the number of units of production per day turned
out by 5 different workers using 4 different types of machines.
6 Design of Experiments

Machine Type
Workers
A B C D
1 44 38 47 36
2 46 40 52 43
3 34 36 44 32
4 43 38 46 33
5 38 42 49 39

(i) Test whether the mean productivity is the same for the different machine types.

(ii) Test whether the 5 men differ respect to mean productivity.

Solution
Null Hypothesis H0 : The mean productivity is the same for the different machines different
mean.
Alternative Hypothesis H1 : The mean productivity will differ.
Step-I:
Subtracting 40 from all the data, then the given table becomes.

Machine Type
Workers
A B C D Row Total
1 4 -2 7 -4 5
2 6 0 12 3 21
3 -6 -4 4 -8 -14
4 3 -2 6 -7 0
5 -2 2 9 -1 8
Column Total 5 -6 38 -17 20

Step-II:

1. N is total number of observations=20.


P P P
2. T = x1 + x2 + · · · + xk = 20.

T2 (20)2
3. Correction Factor = CF = = = 20.
N 20
P 2 P 2
x1 + x2 + · · · + x2c − CF .
P
4. Total Sum of Squares = SST =

= 42 + (−2)2 + 72 + (−4)2 + 62 + 02 + 122 + 32 + (−6)2 + (−4)2 + 42 + (−8)2




+32 + (−2)2 + 62 + (−7)2 + (−2)2 + 22 + 92 + (−1)2 − 20 = 594 − 20 = 574.




x1 ) 2 ( x2 ) 2 ( x k )2
P P P
(
5. Sum of Squares between the columns = SSC = + +···+ − CF .
hn 1
2
n2
2 2
i nc
SSC (Machines) = 55 + (−6) + (38) + (−17)
2
5 5 5
− 20
= 358.8 − 20 = 338.8
Probability and Statistics 7

x 1 ) 2 ( x 2 )2 ( xr ) 2
P P P
(
6. Sum of Squares between the rows = SSR = + + ··· + − CF .
h 2m1 2
m2
2 2
mr
i
SSR (workers) = 54 + (21) + (−14) + (0)4 + 84 − 20
2
4 5
= 181.5-20=161.5

7. Sum of Squares due to errors=SST-SSC-SSR=574-338.8-161.5=73.7.


Step-II:ANOVA table

Source of Sum of Degree of Mean sum of


Test Statistic
variation Squares freedom squares
SSC M SC
Between the SSC c−1 M SC = F =
Columns 338.8 4−1=3 c−1 M SE
= 112.93 = 18.39
SSR M SR
Between the SSR r−1 M SR = F =
Rows 161.5 4−1=4 r−1 M SE
= 40.38 = 6.57
SSE
SSE (c − 1)(r − 1) M SE =
Residual error (c − 1)(r − 1)
73.7 3 × 4 = 12
= 6.14
SST N −1
Total
594 20 − 1 = 19

Step-IV:
Conclusion:
(i) Table value of F at 5% level of significance for (3, 12) degrees of freedom is 3.49
Here the calculated value of F = 18.39 is greater then the table value of F = 3.49
(ii) Table value of F at 5% level of significance for (4, 12) degrees of freedom is 3.26
Here the calculated value of F = 6.57 is greater then the table value of F = 3.26

∴ H0 is rejected.
i.e., The mean productivity is not the same for the four different types of machines and five
different workers.

Analysis of Variance for three-way classification ( Latin Square Design)


We consider an agricultural experiment, in which n2 plots are taken and arranged in the form
of an n × n square, such that the plots in each row will be homogeneous as for as possible with
respect to one factor of classification.
The n treatment are given to thesse plots such that each treatment occurs only once in each row
and only once in each column. The various possible arrangements obtained in this manner are
known as Latin squares of order n. This design of experiment is called the Latin Square Design.

Step-I:Calculation of total variation


8 Design of Experiments

1. N is total number of observations.


P P P
2. T = x1 + x2 + · · · + xk .

T2
3. Correction Factor = CF = .
N
x21 + x22 + · · · + x2c − CF .
P P P
4. Total Sum of Squares = SST =

1P
5. Sum of Squares between the columns = SSC = ( C1 )2 + ( C2 )2 + · · · + ( Cn )2 −
P P 
n
CF .
1P
( R1 )2 + ( R2 )2 + · · · + ( Rn )2 −CF .
P P 
6. Sum of Squares between the rows = SSR =
n
1P
7. Sum of Squares between the Letters = SSL = ( L1 )2 + ( L2 )2 + · · · + ( Ln )2 −
P P 
n
CF .

8. Sum of Squares due to errors=SST-SSC-SSR-SSL.

Step-II:ANOVA table

Source of Sum of Degree of Mean sum of


Test Statistic
variation Squares freedom squares
Between the SSC M SC
SSC n−1 M SC = F =
Columns n−1 M SE
Between the SSR M SR
SSR n−1 M SR = F =
Rows n−1 M SE
Between the SSL M SL
SSL n−1 M SR = F =
Letters n−1 M SE
SSE
Residual error SSE (n − 1)(n − 2) M SE =
(n − 1)(n − 2)
Total SST n2 − 1

Example:1 Analyse the variance in the following latin square design of yields (in kg) of paddy
where A, B, C, D denote the different methods of cultivation.

D 122 A 121 C 123 B 122


B 124 C 123 A 122 D 125
A120 B 119 D 120 C 121
C 122 D 123 B 121 A 122
Probability and Statistics 9

Examine whether the different methods of cultivation have given significantly different yields.
Solution
Null Hypothesis H0 : There is no significant difference between the methods of cultivation.
Alternative Hypothesis H1 : There is a significant difference.
Step-I:
Subtracting 120 from all the data, then the given table becomes.

1 2 3 4 Row total
1 D2 A1 C3 B2 8
2 B4 C3 A2 D5 14
3 A0 B -1 D0 C1 0
4 C2 D3 B1 A2 8
Column Total 8 6 6 10 30

Letter Total
A 1 2 0 2 5
B 2 4 -1 1 6
C 3 3 1 2 9
D 2 5 0 3 10

Step-II:

1. N is total number of observations=16.


P P P
2. T = x1 + x2 + · · · + xk = 30.
T2 (30)2
3. Correction Factor = CF = = = 56.25.
N 16
P 2 P 2
x1 + x2 + · · · + x2c − CF .
P
4. SST =
=[22 + 12 + 32 + 22 + 42 + 32 + 22 + 52 + 02 + (−1)2 + 02 + 12 + 22 + 32 + 12 + 22 ]−56.25
=35.75
1P
SSC = ( C1 )2 + ( C2 )2 + · · · + ( Cn )2 − CF
P P 
5.
n
1
= [82 + 62 + 62 + 102 ] − 56.25 = 2.75.
4
1P
SSR = ( R1 )2 + ( R2 )2 + · · · + ( Rn )2 − CF
P P 
6.
n
1 2
= [8 + 142 + 02 + 82 ] − 56.25 = 24.75.
4
1P
SSL = ( L1 )2 + ( L2 )2 + · · · + ( Ln )2 − CF
P P 
7.
n
1
= [52 + 62 + 92 + 102 ] − 56.25 = 4.25.
4
8. Sum of Squares due to errors=SST-SSC-SSR=35.75-2.75-24.75-4.24=4.

Step-II:ANOVA table
10 Design of Experiments

Source of Sum of Degree of Mean sum of


Test Statistic
variation Squares freedom squares
SSC M SC
Between the SSC n−1 M SC = F =
Columns = 2.75 4−1=3 n−1 M SE
= 0.92 = 1.37
SSR M SR
Between the SSR n−1 M SR = F =
Rows = 24.75 4−1=3 n−1 M SE
= 8.25 = 12.31
SSL M SL
Between the SSL n−1 M SL = F =
Letters = 4.25 4−1=3 n−1 M SE
= 1.42 = 2.12
SSE
SSE (n − 1)(n − 2) M SE =
Residual error (n − 1)(n − 2)
=4 =3×2=6
= 0.67
SST n2 − 1
Total
= 35.75 = 16 − 1 = 15

Step-IV:
Conclusion: Table value of F at 5% level of significance for (3, 6) degrees of freedom is 4.76
∴ H0 is accepted.
i.e., There is no significant difference between the methods of cultivation.

You might also like