
ANALYSIS OF VARIANCE

Learning Objective

At the end of this lesson, participants should be able to:


• discuss the concept of analysis of variance;
• enumerate the basic assumptions of the analysis of variance technique; and
• outline the analysis of variance table associated with the various experimental
designs commonly used in rice research.

Introduction

Researchers are frequently confronted with a situation where samples are obtained from
several populations, for example yields of three different varieties of rice, and it is of
interest to test the null hypothesis that the population means are all the same. The
technique used to make this test is called the Analysis of Variance (ANOVA). If the
treatments, rice varieties in this case, all had exactly the same population means and
variances, then one would expect plots receiving different varieties to show the same
variability as plots receiving the same variety. This variability is due to numerous
non-specific causes such as variation in land, seed and cultivation practices. On the other
hand, if the varieties had different population means but the same population variances,
then one would expect the variability between plots with different varieties to be greater
than that between plots with the same variety. Hence the idea of ANOVA is to compare the
variability of plots receiving different treatments with that of plots receiving the same
treatment.

The basic assumptions of the Analysis of Variance technique are:

• the samples from each population are statistically independent;

• the populations all have the same variance (the populations are homoscedastic) but
possibly different means;

• the populations follow normal distributions; and

• treatment effects are additive, i.e., means differ by additive amounts and not by
multiplicative factors.



Failure to satisfy one or more of these assumptions affects both the level of significance
and the sensitivity of the F-test in the Analysis of Variance.

Completely Randomized Design

A completely randomized design (CRD) is one where the t treatments, each replicated r
times, are assigned completely at random so that each of the tr experimental units has the
same chance of receiving any one treatment. For the CRD, any difference among the
experimental units receiving the same treatment is considered experimental error. In a CRD
the total variability in the data, as measured by the total sum of squares (TSS), is
partitioned into:

• sum of squares due to treatment (TrSS) - a measure of differences between treatment
means.
• sum of squares due to experimental error (ESS) - a measure of differences among
observations within treatments, which is due to experimental error.

Hence, the relationship may be written symbolically as:

TSS = TrSS + ESS
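
For reference, the partition can be written out explicitly. The notation below is an assumption introduced for illustration and is not in the original text: $y_{ij}$ is the observation on replicate j of treatment i, $T_i$ is the total of treatment i, and CF is the correction factor used in the hand computations of Example 1.

```latex
\begin{aligned}
CF   &= \frac{\left(\sum_{i=1}^{t}\sum_{j=1}^{r} y_{ij}\right)^{2}}{tr} \\
TSS  &= \sum_{i=1}^{t}\sum_{j=1}^{r} y_{ij}^{2} - CF \\
TrSS &= \frac{\sum_{i=1}^{t} T_{i}^{2}}{r} - CF \\
ESS  &= TSS - TrSS
\end{aligned}
```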

The analysis of variance table associated with a completely randomized design follows:

SV          dF         SS     MS                       Fc
Treatment   t - 1      TrSS   TrMS = TrSS / (t - 1)    TrMS / EMS
Error       t(r - 1)   ESS    EMS = ESS / [t(r - 1)]
Total       tr - 1     TSS

where TrMS is the estimate of the variability among treatments; while EMS is the
estimate of the inherent variability within treatments.



If the treatments do not differ in their effects, then TrMS and EMS should be very similar.
If they are not similar, the observed difference must be caused by differences in treatment
means or treatment variances. If the variances are assumed to be constant, then it must be
the means which differ. Thus, a test of the hypothesis of no difference in treatment means
can be performed by comparing TrMS and EMS.

A measure of the precision of treatment means is given by the standard error of a mean
(SEM) or the standard error of the difference between two means (SED):

$SEM = \sqrt{EMS / r}$
$SED = \sqrt{2 \, EMS / r}$

Example 1

Suppose an experiment was set up with four replicates of three varieties, and the 12 plots
were assessed to obtain the following yields:

        Variety A   Variety B   Variety C
        3.8         4.3         4.1
        4.5         5.0         4.3
        3.8         4.5         4.7
        3.9         5.0         4.9
Mean    4.0         4.7         4.5
Total   16.0        18.8        18.0

To construct the ANOVA Table for this data set, the following computations are made:

Total dF = (3 varieties)(4 replicates) - 1 = 11
Varieties dF = (3 varieties) - 1 = 2
Error dF = Total dF - Varieties dF = 9
$CF = (3.8 + 4.5 + \dots + 4.9)^2 / (3 \times 4) = 232.32$
$TSS = (3.8^2 + 4.5^2 + \dots + 4.9^2) - 232.32 = 2.16$
$TrSS = (16.0^2 + 18.8^2 + 18.0^2) / 4 - 232.32 = 1.04$
$ESS = TSS - TrSS = 1.12$
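
The hand computations above can be reproduced with a short script. This is a minimal sketch in plain Python, assuming equal replication; the function name `crd_anova` and the dictionary layout are illustrative choices, not part of the original lesson.

```python
# Yields (kg/plot) from Example 1: four replicates of each variety.
yields = {
    "A": [3.8, 4.5, 3.8, 3.9],
    "B": [4.3, 5.0, 4.5, 5.0],
    "C": [4.1, 4.3, 4.7, 4.9],
}


def crd_anova(groups):
    """One-way ANOVA for a balanced CRD, following the hand computations."""
    t = len(groups)                          # number of treatments
    r = len(next(iter(groups.values())))     # replicates per treatment
    if any(len(obs) != r for obs in groups.values()):
        raise ValueError("this sketch assumes equal replication")
    all_obs = [y for obs in groups.values() for y in obs]
    cf = sum(all_obs) ** 2 / (t * r)         # correction factor
    tss = sum(y * y for y in all_obs) - cf   # total sum of squares
    trss = sum(sum(obs) ** 2 for obs in groups.values()) / r - cf
    ess = tss - trss                         # error SS by subtraction
    trms = trss / (t - 1)
    ems = ess / (t * (r - 1))
    return {"CF": cf, "TSS": tss, "TrSS": trss, "ESS": ess,
            "TrMS": trms, "EMS": ems, "F": trms / ems}


table = crd_anova(yields)
for name, value in table.items():
    print(f"{name:>5} = {value:.3f}")
```

The F-ratio is formed from the unrounded mean squares, which is why it agrees with the 4.18 reported for this example.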



ANOVA table for grain yield (kg/plot) involving 3 varieties and 4 replications.

SV                      dF   SS     MS      F-value
Treatment (Varieties)   2    1.04   0.520   4.18ns
Error                   9    1.12   0.124
Total                   11   2.16
c.v. = 8.0%
ns = not significant; * = significant at the 5% level; ** = significant at the 1% level
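
The precision measures for this example follow directly from the error mean square. The sketch below assumes the usual definition of the coefficient of variation, 100 x sqrt(EMS) / grand mean, which the text does not state explicitly; the variable names are illustrative.

```python
import math

ems = 1.12 / 9                           # error MS = ESS / error dF
r = 4                                    # replicates per variety
grand_mean = (16.0 + 18.8 + 18.0) / 12   # variety totals / number of plots

sem = math.sqrt(ems / r)                 # standard error of a variety mean
sed = math.sqrt(2 * ems / r)             # standard error of a difference
cv = 100 * math.sqrt(ems) / grand_mean   # assumed definition of the c.v.

print(f"SEM = {sem:.3f}, SED = {sed:.3f}, c.v. = {cv:.1f}%")
```

Under that assumed definition the c.v. comes out at about 8.0%, in line with the value given under the table.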

The null hypothesis is that there is no difference between the mean yields of varieties A,
B, and C. Note that the treatment MS is greater than the error MS. The question here is:
how likely is it to see a treatment MS as large as or larger than the one observed if the
null hypothesis is true? The answer is obtained by conducting an F-test.
F = TrMS / EMS = 4.18 (from the unrounded mean squares) with 2 and 9 dF. This is not larger
than the 5% critical value of the F-distribution with 2 and 9 dF (4.26), so the null
hypothesis cannot be rejected.
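
When the numerator has exactly 2 dF, as in this example, the upper tail of the F-distribution has a simple closed form: P(F > f) = (1 + 2f/d)^(-d/2), where d is the error dF. The sketch below relies on that special case only, so it is not a general F-table replacement; the function names are hypothetical.

```python
def f_upper_tail_2df(f_value, error_df):
    """P(F > f_value) for an F-distribution with 2 and error_df dF."""
    return (1 + 2 * f_value / error_df) ** (-error_df / 2)


def f_critical_2df(alpha, error_df):
    """Value exceeded with probability alpha, for 2 and error_df dF."""
    return (error_df / 2) * (alpha ** (-2 / error_df) - 1)


p_value = f_upper_tail_2df(4.18, 9)    # observed F from the ANOVA table
critical = f_critical_2df(0.05, 9)     # 5% critical value for 2 and 9 dF

print(f"p-value = {p_value:.3f}, 5% critical value = {critical:.2f}")
```

The p-value lands just above 0.05, which is the same conclusion as comparing the observed F with the tabulated critical value.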

Randomized Complete Block (RCB) Design

The Analysis of Variance is also useful when the populations are classified according to
more than one factor, such as different blocks of land and different varieties in the case of
a Randomized Complete Block (RCB) Design. There are three sources of variation in an
RCB design: treatment, replication (or block), and experimental error. This is one more
than for a CRD because of the addition of replication, which corresponds to the variability
among blocks. Hence, for an experiment laid out in an RCB design with t treatments and r
replications, the outline of the ANOVA table follows:



SV           dF               SS      MS                             Fc
Replicates   r - 1            BlkSS   BlkMS = BlkSS / (r - 1)        BlkMS / EMS ¹
Treatment    t - 1            TrSS    TrMS = TrSS / (t - 1)          TrMS / EMS
Error        (r - 1)(t - 1)   ESS     EMS = ESS / [(r - 1)(t - 1)]
Total        rt - 1           TSS

A measure of precision of treatment means in RCB is given by the standard error of a
mean (SEM) or the standard error of the difference between two means (SED):

$SEM = \sqrt{EMS / r}$
$SED = \sqrt{2 \, EMS / r}$

Example 2

ANOVA table for grain yield (kg/plot) involving 3 varieties and 4 replications.

SV            dF   SS      MS      F-value
Replication   3    0.587   0.196
Variety (V)   2    1.040   0.520   5.84*
Error         6    0.533   0.089
c.v. = 6.8%

¹ Some researchers test the Blocks MS for significance as well as the Treatment MS, but
there is little point in this, as we already expect the Blocks MS to be greater than the
residual MS if we have blocked the plots correctly.



In this analysis, the varieties MS is tested against the error MS. It was found that the F-
statistic is 5.84, which is larger than the 5% critical value of the F-distribution with 2 and
6 dF (5.14), so that it can be concluded that the variation between plots with different
varieties is larger than that between plots with the same variety after removing variation
due to differences between blocks. Notice that the block SS and error SS sum to the
within varieties SS of the previous example. Blocking, therefore, has increased the
sensitivity of the experiment and shown the differences between varieties to be significant
whereas the previous analysis was not sufficiently sensitive to find this.
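
The block and error sums of squares in Example 2 can be recovered from the Example 1 yields. The sketch below assumes that each row of the Example 1 data table is one block (replicate); the text does not say this outright, but the block SS obtained under that assumption matches the table. The function name `rcb_anova` is illustrative.

```python
# Rows are assumed to be blocks; columns are varieties A, B and C.
plots = [
    [3.8, 4.3, 4.1],
    [4.5, 5.0, 4.3],
    [3.8, 4.5, 4.7],
    [3.9, 5.0, 4.9],
]


def rcb_anova(data):
    """ANOVA for an RCB design with one observation per block and treatment."""
    r, t = len(data), len(data[0])                 # blocks, treatments
    cf = sum(sum(row) for row in data) ** 2 / (r * t)
    tss = sum(y * y for row in data for y in row) - cf
    blkss = sum(sum(row) ** 2 for row in data) / t - cf        # block totals
    trss = sum(sum(col) ** 2 for col in zip(*data)) / r - cf   # variety totals
    ess = tss - blkss - trss                       # error SS by subtraction
    ems = ess / ((r - 1) * (t - 1))
    trms = trss / (t - 1)
    return {"BlkSS": blkss, "TrSS": trss, "ESS": ess,
            "EMS": ems, "F": trms / ems}


rcb = rcb_anova(plots)
for name, value in rcb.items():
    print(f"{name:>5} = {value:.3f}")
```

With unrounded mean squares the F-ratio is about 5.85; the table's 5.84 comes from dividing the rounded values 0.520 and 0.089.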

Latin Square Designs

The major feature of the Latin Square (LS) Design is its capacity to simultaneously
handle two known sources of variation among experimental units. It treats the sources as
two independent blocking criteria, instead of only one as in the RCB design. As such, the
Analysis of Variance associated with the LS Design has four sources of variation: two more
than that for the CRD and one more than that for the RCB Design. The sources of
variation are row, column, treatment and experimental error. Thus, for experimental
data resulting from an LS Design with t treatments arranged in t rows and t columns, the
outline of the ANOVA table follows:

SV          dF               SS     MS                             Fc
Row         t - 1            RSS    RMS = RSS / (t - 1)            RMS / EMS ²
Column      t - 1            CSS    CMS = CSS / (t - 1)            CMS / EMS ²
Treatment   t - 1            TrSS   TrMS = TrSS / (t - 1)          TrMS / EMS
Error       (t - 1)(t - 2)   ESS    EMS = ESS / [(t - 1)(t - 2)]
Total       t^2 - 1          TSS

² The significance of the Row and Column factors is often not tested.
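
To make the double-blocking idea concrete, the sketch below builds one possible t x t layout in which every treatment appears exactly once in each row and each column, and lists the dF partition from the outline above. Starting from a cyclic square and shuffling rows and columns is just one simple way to get a valid layout and is an assumption of this example, not a randomization procedure taken from the text; the function names are hypothetical.

```python
import random


def latin_square(treatments, seed=None):
    """Return a t x t layout with each treatment once per row and column."""
    t = len(treatments)
    # Cyclic square: row i is the treatment list shifted left by i places.
    square = [[treatments[(i + j) % t] for j in range(t)] for i in range(t)]
    rng = random.Random(seed)
    rng.shuffle(square)                 # permute the rows
    cols = list(range(t))
    rng.shuffle(cols)                   # permute the columns
    return [[row[c] for c in cols] for row in square]


def latin_square_df(t):
    """Degrees of freedom for each source of variation in a t x t square."""
    return {"Row": t - 1, "Column": t - 1, "Treatment": t - 1,
            "Error": (t - 1) * (t - 2), "Total": t * t - 1}


layout = latin_square(["A", "B", "C", "D"], seed=1)
for row in layout:
    print(" ".join(row))

ls_df = latin_square_df(4)
print(ls_df)
```

Permuting whole rows and whole columns keeps the once-per-row, once-per-column property, which is what lets row and column effects be separated from treatment effects.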



Example 3

ANOVA table for grain yield (t/ha) involving 4 sprayer rates in 4 replications.

SV          dF   SS       MS       Fc
Row         3    1.8930   0.6310   5.06*
Column      3    0.7418   0.2473   1.98ns
Treatment   3    2.5571   0.8524   6.83*
Error       6    0.7433   0.1247
Total       15   5.9352
c.v. = 7.8%

Factorial Treatment Arrangements

Frequently data are classified according to more than two factors, as in the case of
factorial treatment arrangements. For example, an experiment involving two factors,
each at two levels, such as two varieties (V1 and V2) and two nitrogen rates (N1 and N2),
is referred to as a 2x2 factorial experiment. Its treatments consist of the following four
possible combinations of the two levels of the two factors.

                     Treatment Combination
Treatment Number     Variety     Nitrogen Rate (kg/ha)
1                    V1          N1
2                    V1          N2
3                    V2          N1
4                    V2          N2

Note that the term factorial describes a specific way in which the treatments are formed
and does not, in any way, refer to the experimental design used. That is, the treatments
may be arranged in the field using any one of the basic designs, such as the completely
randomized design (CRD), the randomized complete block design (RCBD) or the Latin Square (LS).



In factorial treatment arrangements the intention is to partition the total
variability into components due to the main effects of each factor (V and N, for example),
interactions between factors (VxN), and residuals.

Two factors are said to have an interaction if the effect of one factor varies with the level
of the other factor. In a similar manner, three factors are said to have an interaction if the
interaction between any two varies over the levels of the third. Higher order interactions
are analogously defined. The following tables present two hypothetical sets of 2x2
factorial data: one without, and another with, interaction between the two factors.

A 2x2 Factorial Hypothetical Rice Yield Data with No Interaction Between Variety and
Nitrogen Rates.

Variety   N1 (0 kg/ha)   N2 (60 kg/ha)   Average
V1        1.00           3.00            2.00
V2        2.00           4.00            3.00
Average   1.50           3.50

A 2x2 Factorial Hypothetical Rice Yield Data with Interaction Between Variety and
Nitrogen Rates.

Variety   N1 (0 kg/ha)   N2 (60 kg/ha)   Average
V1        1.00           1.00            1.00
V2        2.00           4.00            3.00
Average   1.50           2.50
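
The definition of interaction can be checked numerically on the two hypothetical data sets. The sketch below computes the nitrogen response of each variety and takes the difference between the two responses; a value of zero means the nitrogen effect is the same for both varieties. The function name `interaction_contrast` is an illustrative choice.

```python
def interaction_contrast(cell_means):
    """Difference in nitrogen response between V2 and V1 for a 2x2 table.

    cell_means is [[V1N1, V1N2], [V2N1, V2N2]].
    """
    (v1n1, v1n2), (v2n1, v2n2) = cell_means
    n_response_v1 = v1n2 - v1n1      # effect of nitrogen on variety V1
    n_response_v2 = v2n2 - v2n1      # effect of nitrogen on variety V2
    return n_response_v2 - n_response_v1


no_interaction = [[1.00, 3.00], [2.00, 4.00]]
with_interaction = [[1.00, 1.00], [2.00, 4.00]]

print(interaction_contrast(no_interaction))     # same response in both varieties
print(interaction_contrast(with_interaction))   # V2 responds, V1 does not
```

In the first table both varieties gain 2.00 from nitrogen, so the contrast is zero; in the second, V1 gains nothing while V2 gains 2.00.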



A set of graphical representations of the nitrogen response of the two varieties is shown
below:

[Figure: four panels, (a) to (d), each plotting yield (kg/ha) on a scale of 1 to 4 against
nitrogen rate (N1, N2), with separate lines for V1 and V2.]

This set of graphs represents different magnitudes of interaction between varieties (V1 and
V2) and nitrogen rates (N1 and N2), with (a) showing no interaction, (b) and (c) showing
intermediate interactions, and (d) showing high interaction.



The ANOVA Table for a Factorial Treatment in CRD with r replications.

SV           dF               SS      MS      F-value
Factor (A)   a - 1            ASS     AMS     AMS / EMS
Factor (B)   b - 1            BSS     BMS     BMS / EMS
AxB          (a - 1)(b - 1)   AxBSS   AxBMS   AxBMS / EMS
Error        ab(r - 1)        ESS     EMS
Total        rab - 1

A measure of precision of factorial treatment means in CRD is:

$SEM = \sqrt{EMS / r}$
$SED = \sqrt{2 \, EMS / r}$

The ANOVA Table for a Factorial Treatment in RCB with r blocks.

SV           dF                SS      MS      F-value
Block        r - 1             RSS     RMS     RMS / EMS
Factor (A)   a - 1             ASS     AMS     AMS / EMS
Factor (B)   b - 1             BSS     BMS     BMS / EMS
AxB          (a - 1)(b - 1)    AxBSS   AxBMS   AxBMS / EMS
Error        (r - 1)(ab - 1)   ESS     EMS
Total        rab - 1

A measure of precision of factorial treatment means in RCB design is:

$SEM = \sqrt{EMS / r}$
$SED = \sqrt{2 \, EMS / r}$



Example 4

ANOVA table for grain yield (t/ha) involving 3 varieties, 8 fertilizer treatments and 4
replications.

SV               dF   SS          MS         F-value
Replication      3    0.60873     0.20291    1.4ns
Variety (V)      2    14.39724    7.19862    50.8**
Fertilizer (F)   7    214.86423   30.69489   216.4**
VxF              14   3.66954     0.26211    1.9*
Error            69   9.78696     0.14184
c.v. = 6.0%
SEM = 0.19
SED = 0.27

The results indicate that the main effects of variety and fertilizer are significant at the
1% level. The results also show a significant interaction between variety and fertilizer
at the 5% level, indicating that the varietal difference was significantly affected by the
fertilizer level applied and that the fertilizer effect differed significantly among the
varieties tested.
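
The dF column and the precision measures of this example can be checked from the outline for a factorial in RCB. This is a small sketch with a hypothetical helper, `factorial_rcb_df`; the SEM and SED use the formulas given above with the error SS and replication of Example 4.

```python
import math


def factorial_rcb_df(a, b, r):
    """Degrees of freedom for an a x b factorial laid out in r blocks."""
    return {"Block": r - 1, "A": a - 1, "B": b - 1,
            "AxB": (a - 1) * (b - 1),
            "Error": (r - 1) * (a * b - 1),
            "Total": r * a * b - 1}


df = factorial_rcb_df(a=3, b=8, r=4)   # 3 varieties, 8 fertilizers, 4 blocks
ems = 9.78696 / df["Error"]            # error SS from the table / error dF
sem = math.sqrt(ems / 4)               # standard error of a treatment mean
sed = math.sqrt(2 * ems / 4)           # standard error of a difference

print(df)
print(f"EMS = {ems:.5f}, SEM = {sem:.2f}, SED = {sed:.2f}")
```

Since SED is SEM multiplied by the square root of 2, SED is always the larger of the two values.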

Split-Plot Design

The split-plot design is specifically suited for a 2-factor experiment that has more
treatments than can be accommodated by a complete block design. In a split-plot design
one of the factors, called the main-plot factor, is assigned to the main plots. Each main
plot is divided into subplots, to which the second factor, called the subplot factor, is
assigned. Thus, each main plot becomes a block for the subplot treatments.

The Analysis of Variance of a split-plot design is divided into main plot analysis and the
sub-plot analysis. By denoting A and B as the main plot and sub-plot factors,
respectively, the outline of the ANOVA table for a split-plot design is as follows:



SV              dF                 SS      MS      F-value
Replication     r - 1              RSS     RMS
Main plot (A)   a - 1              ASS     AMS     AMS / EaMS
Error (a)       (r - 1)(a - 1)     EaSS    EaMS
Subplot (B)     b - 1              BSS     BMS     BMS / EbMS
AxB             (a - 1)(b - 1)     AxBSS   AxBMS   AxBMS / EbMS
Error (b)       a(r - 1)(b - 1)    EbSS    EbMS
Total           rab - 1

where a = no. of main plot treatments
      b = no. of subplot treatments
      r = no. of blocks

Types of Pair Comparison in a Split-Plot Design

Number   Comparison between                                             Standard error of the mean difference (SED)
1        Two main plot means (averaged over all subplot treatments)     $\sqrt{2E_a/(rb)}$
2        Two subplot means (averaged over all main plot treatments)     $\sqrt{2E_b/(ra)}$
3        Two subplot means in the same main plot treatment              $\sqrt{2E_b/r}$
4        Two main plot means at the same subplot treatment              $\sqrt{2[(b-1)E_b + E_a]/(rb)}$

$E_a$ = Error(a) MS, $E_b$ = Error(b) MS



Example 5

ANOVA table for grain yield (kg/plot) involving 4 N-rates (main plot), 6 varieties
(subplot) and 4 replications.

SV            dF   SS         MS        F-value
Block         3    2.3473     0.7824    1.08ns
N-rate (N)    3    7.6385     2.5462    3.50ns
Error (a)     9    6.5390     0.7266
Variety (V)   5    94.2347    18.8469   92.09**
NxV           15   8.6682     0.5779    2.82*
Error (b)     60   12.2800    0.2047
Total         95   131.7077
c.v. (a) = 26.0%, c.v. (b) = 13.8%

Comparison                     SEM    SED
2 V-means at each level of N   0.23   0.32
2 N-means at each level of V   0.27   0.38
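
The four split-plot SED formulas can be evaluated with the error mean squares of this example. The sketch below is an illustration only; the function name `split_plot_sed` and the dictionary keys are assumptions, while Ea, Eb, r, a and b are taken from the ANOVA table above.

```python
import math


def split_plot_sed(ea, eb, r, a, b):
    """SED for the four types of pair comparison in a split-plot design."""
    return {
        "main plot means": math.sqrt(2 * ea / (r * b)),
        "subplot means": math.sqrt(2 * eb / (r * a)),
        "subplot means, same main plot": math.sqrt(2 * eb / r),
        "main plot means, same subplot":
            math.sqrt(2 * ((b - 1) * eb + ea) / (r * b)),
    }


# Error(a) MS, Error(b) MS, 4 blocks, 4 N-rates (main plot), 6 varieties.
sed = split_plot_sed(ea=0.7266, eb=0.2047, r=4, a=4, b=6)
for comparison, value in sed.items():
    print(f"{comparison:<32} {value:.2f}")
```

The third and fourth entries correspond to the two comparisons tabulated for this example (V-means at each level of N, and N-means at each level of V).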



Strip-Plot Design

The strip-plot design is specifically suited for a 2-factor experiment in which the desired
precision for measuring the interaction effect between the two factors is higher than that
for measuring the main effect of either one of the two factors. This is done with the use
of three plot sizes:
1. Vertical-strip plot for the first factor
2. Horizontal-strip plot for the second factor
3. Intersection plot for the interaction between the two factors

The Analysis of Variance of a strip-plot design is divided into three parts: the horizontal-
factor analysis, the vertical-factor analysis, and the interaction analysis. By denoting A
and B as the horizontal and vertical factors, respectively, the outline of the ANOVA table
for a strip-plot design is as follows:

SV                      dF                      SS      MS      F-value
Replication             r - 1                   RSS     RMS
Horizontal Factor (A)   a - 1                   ASS     AMS     AMS / EaMS
Error (a)               (r - 1)(a - 1)          EaSS    EaMS
Vertical Factor (B)     b - 1                   BSS     BMS     BMS / EbMS
Error (b)               (r - 1)(b - 1)          EbSS    EbMS
AxB                     (a - 1)(b - 1)          AxBSS   AxBMS   AxBMS / EcMS
Error (c)               (r - 1)(a - 1)(b - 1)   EcSS    EcMS
Total                   rab - 1

where a = no. of horizontal strip treatments
      b = no. of vertical strip treatments
      r = no. of blocks
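
The dF partition in this outline can be tabulated for any strip-plot layout. The sketch below uses a hypothetical helper, `strip_plot_df`, and evaluates it for a layout with 6 horizontal treatments, 3 vertical treatments and 4 blocks, chosen here as an illustration.

```python
def strip_plot_df(a, b, r):
    """Degrees of freedom for a strip-plot design, in outline order."""
    df = {
        "Replication": r - 1,
        "Horizontal factor (A)": a - 1,
        "Error (a)": (r - 1) * (a - 1),
        "Vertical factor (B)": b - 1,
        "Error (b)": (r - 1) * (b - 1),
        "AxB": (a - 1) * (b - 1),
        "Error (c)": (r - 1) * (a - 1) * (b - 1),
    }
    df["Total"] = r * a * b - 1
    return df


strip_df = strip_plot_df(a=6, b=3, r=4)
for source, value in strip_df.items():
    print(f"{source:<24} {value}")
```

Each factor has its own error term, so the three error lines together account for the dF that a single error line would hold in an RCB analysis of the same 18 treatment combinations.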



Types of Pair Comparison in a Strip-Plot Design

Number   Comparison between                                              Standard error of the mean difference (SED)
1        Two horizontal means (averaged over all vertical treatments)    $\sqrt{2E_a/(rb)}$
2        Two vertical means (averaged over all horizontal treatments)    $\sqrt{2E_b/(ra)}$
3        Two horizontal means in the same vertical treatment             $\sqrt{2[(b-1)E_c + E_a]/(rb)}$
4        Two vertical means at the same horizontal treatment             $\sqrt{2[(a-1)E_c + E_b]/(ra)}$

$E_a$ = Error(a) MS, $E_b$ = Error(b) MS, $E_c$ = Error(c) MS



Example 6

ANOVA table for grain yield (t/ha) involving 6 water regimes (horizontal factor), 3
N-management treatments (vertical factor) and 4 replications.

SV                 dF   SS        MS       F-value
Block              3    3.2176    1.0725   11.61**
Water Regime (W)   5    4.4989    0.8998   9.74**
Error (a)          15   3.3861    0.2257
Nitrogen (N)       2    0.7776    0.3888   <1
Error (b)          6    2.7506    0.4584
WxN                10   0.4043    0.0404   <1
Error (c)          30   5.1469    0.1716
Total              71   18.1819
c.v. (a) = 10.1%, c.v. (b) = 14.38%, c.v. (c) = 8.8%

Comparison                     SEM    SED
2 W means                      0.14   0.19
2 N means                      0.14   0.20
2 W means at each level of N   0.22   0.31
2 N means at each level of W   0.23   0.34

