
Analysis-of-Variance Models for Biostatistics

Ernesto Ponsot Balaguer


PhD in Statistics, MSc in Applied Statistics, Systems Engineering
http://webdelprofesor.ula.ve/economia/ernesto
E-mail: eponsot@yachaytech.edu.ec

University of Experimental Technologies Research Yachay (Yachay Tech)


School of Mathematical Sciences and Information Technology
School of Biological Sciences and Engineering

Imbabura, Ecuador - April 2018


Content

1 Introduction

2 ANOVA model estimation


Introduction ANOVA model estimation

ANOVA Models

In many experimental situations, a researcher applies several
treatments or treatment combinations to randomly selected
experimental units and then wishes to compare the treatment
means for some response y.
In analysis-of-variance (ANOVA), we use linear models to
facilitate a comparison of these means.
The model is often expressed with more parameters than can
be estimated, which results in an X matrix that is not of full
rank.

One-Way Model

Suppose that a researcher has developed two biological
fertilizers to increase the production of a certain variety of
corn. To formulate the model, we might start with the notion
that, without fertilizers, a plant yields an average of µ cobs.
Then, if fertilizer 1 is added, the number of cobs is expected to
increase by τ1, and if fertilizer 2 is added, the number of cobs
is expected to increase by τ2.
Thinking about the experiment, it is clear that we should
preferably use a single piece of land, on which we run (perhaps)
three trials: one without fertilizer, another with fertilizer 1
and another with fertilizer 2.
Of course, we need a clear separation between the three
parcels.


We need a single piece of land to ensure that differences in its
characteristics do not affect our experiment, and the separation
to ensure that fertilizer 2 does not mix with fertilizer 1.
Obviously, we must also ensure that the environmental conditions
are the same.
If we have the minimum possible resources, we only need to
run two trials, one with fertilizer 1 and one with fertilizer 2.
The model will be:

\[
\begin{cases}
y_1 = \mu + \tau_1 + \varepsilon_1 \\
y_2 = \mu + \tau_2 + \varepsilon_2
\end{cases}
\]

In our model, yi is the observed performance (number of cobs)
and εi is an (unobservable) error, for i = 1, 2. We would like to
estimate the parameters and test hypotheses such as
H0 : τ1 = τ2, for example.

As you have probably already noticed, that minimal experiment
does not look very reliable, mainly because we get only one
observation for each condition. We need more observations.
Suppose that we use six parcels, adding fertilizer 1 to three of
them and fertilizer 2 to the other three. The new model is

\[
\begin{cases}
y_{11} = \mu + \tau_1 + \varepsilon_{11} \\
y_{12} = \mu + \tau_1 + \varepsilon_{12} \\
y_{13} = \mu + \tau_1 + \varepsilon_{13} \\
y_{21} = \mu + \tau_2 + \varepsilon_{21} \\
y_{22} = \mu + \tau_2 + \varepsilon_{22} \\
y_{23} = \mu + \tau_2 + \varepsilon_{23}
\end{cases}
\]


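That is,
\[
y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \quad i = 1, 2; \; j = 1, 2, 3
\]
In matrix form:
\[
y = X\beta + \varepsilon
\]
with
\[
y = \begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{21} \\ y_{22} \\ y_{23} \end{bmatrix}, \quad
X = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}, \quad
\beta = \begin{bmatrix} \mu \\ \tau_1 \\ \tau_2 \end{bmatrix}, \quad
\varepsilon = \begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \end{bmatrix}
\]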
Do you notice any problem?
Two-Way Model

Suppose now that our researcher suspects that the
performance is also related to the variety of corn and wants to
add this information to the experiment, for example by including
three different varieties of corn: V1, V2 and V3.
As before, µ is the mean, τ1 is the effect of fertilizer 1 and τ2
is the effect of fertilizer 2. We need more parameters. Let γ1
be the effect of variety V1, γ2 the effect of V2 and γ3
the effect of V3.
We might also need more trials. What do you think? What
would you propose?


Table 0: Number of samples when varieties are added to our experiment

              Variety
Fertilizer    V1    V2    V3    Total
1              2     2     2        6
2              2     2     2        6
Total          4     4     4       12

The new purely additive model is
\[
y_{ijk} = \mu + \tau_i + \gamma_j + \varepsilon_{ijk}, \quad i = 1, 2; \; j = 1, 2, 3; \; k = 1, 2
\]

Can you build the matrix form of the model?
Do you notice the difference between the One-Way and
Two-Way models?
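One possible way to check your answer is to build the overparameterized design matrix numerically. The following is only a sketch in R: it assumes the observations are ordered by i, then j, then k (replicates varying fastest), and the object names `fert`, `variety` and `X` are illustrative, not part of the original material.

```r
# Factor levels for the 12 observations of Table 0
# (2 fertilizers x 3 varieties x 2 replicates; replicate index varies fastest)
fert    <- rep(1:2, each = 6)
variety <- rep(rep(1:3, each = 2), times = 2)

# Overparameterized design matrix: columns mu, tau1, tau2, gamma1, gamma2, gamma3
X <- cbind(mu     = 1,
           tau1   = as.numeric(fert == 1),
           tau2   = as.numeric(fert == 2),
           gamma1 = as.numeric(variety == 1),
           gamma2 = as.numeric(variety == 2),
           gamma3 = as.numeric(variety == 3))

X
dim(X)        # number of rows (observations) and columns (parameters)
qr(X)$rank    # numerical rank of X, to compare with the number of columns
```

Comparing the number of columns with the rank shows how many redundant parameters the two-way model carries, compared with the one-way model.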

ANOVA Models Estimation

As we know, in the ANOVA model the matrix X is not of full rank,
and therefore (X′X)⁻¹ does not exist. This happens because the
model is overparameterized. We have three strategies to work with
this limitation:
1. Reparametrize: Redefine the model using a smaller number
of new parameters that are unique.
2. Restrict: Use the overparameterized model but place
constraints on the parameters so that they become unique.
3. Find linear estimable functions: In the overparameterized
model, work with linear combinations of the parameters that
are unique and can be unambiguously estimated.
We will describe 1 and 2 and leave 3 for later.
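The rank deficiency can be verified numerically for the one-way example with six parcels. This is a minimal sketch in R; the object names `X` and `XtX` are illustrative.

```r
# Design matrix of the one-way example: columns mu, tau1, tau2
X <- cbind(mu   = rep(1, 6),
           tau1 = rep(c(1, 0), each = 3),
           tau2 = rep(c(0, 1), each = 3))

XtX <- crossprod(X)   # X'X, a 3 x 3 matrix
XtX
qr(X)$rank            # 2 < 3 columns: the first column is the sum of the other two
det(XtX)              # zero (up to rounding), so X'X has no inverse
```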



1. Reparametrization

Let's look at the general version of the One-Way model.

\[
y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \quad i = 1, 2, \dots, k; \; j = 1, 2, \dots, n
\]

Here k is the number of levels of the (single) factor, that is,
the number of treatments, and n is the (equal) number of
observations for each treatment. This model, with an equal
number of observations for all factor levels, is called the
Balanced Case.
This design results in an X matrix with k + 1 columns, since we
postulate k + 1 parameters: µ, τ1, τ2, ..., τk, but
r(X) = k < k + 1.


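Suppose now that, on second thought, it is not so
important to distinguish between µ and τi (for all i); it is only
important to distinguish the effects of the treatments. Then
we can write:
\[
y_{ij} = \mu_i + \varepsilon_{ij}, \quad i = 1, 2, \dots, k; \; j = 1, 2, \dots, n
\]
where µi = µ + τi for all i. Then, for our example (k = 2, n = 3)
we have:
\[
y = W\mu + \varepsilon \;\Rightarrow\;
\begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{21} \\ y_{22} \\ y_{23} \end{bmatrix} =
\begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} +
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \end{bmatrix}
\]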

Now our model has a full-rank W matrix, since r(W) = 2, and
we can use all previous theory to estimate the µ parameters
and to test hypotheses.
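As an illustration, the reparametrized model can be fitted by ordinary least squares, assuming that this is what the "previous theory" refers to. The sketch below is in R; the response values in `y` are hypothetical numbers made up for the example, and the names `W` and `mu_hat` are illustrative.

```r
# Hypothetical cob counts, for illustration only (not data from the source)
y <- c(10, 12, 11, 15, 14, 16)

# Full-rank design matrix W of the reparametrized model (columns mu1, mu2)
W <- cbind(mu1 = rep(c(1, 0), each = 3),
           mu2 = rep(c(0, 1), each = 3))

qr(W)$rank                                      # 2, so W'W is invertible
mu_hat <- solve(crossprod(W), crossprod(W, y))  # least squares: (W'W)^{-1} W'y
mu_hat

# The estimates coincide with the two treatment sample means
tapply(y, rep(1:2, each = 3), mean)
```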



2. To restrict

Suppose now that it is important to distinguish between the
original parameters µ and τi (for all i), but that we can impose
reasonable conditions on them. Such constraints are called side
conditions.
Let τ1 + τ2 + ... + τk = 0 be the condition imposed. Now the
model is:
\[
y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \quad \text{subject to } \sum_{i} \tau_i = 0,
\quad i = 1, 2, \dots, k; \; j = 1, 2, \dots, n
\]

Note that we have not added new parameters; we have only
placed a condition on the existing ones.


Then, for our example (k = 2, n = 3) we have:

\[
y = Z\tau + \varepsilon \;\Rightarrow\;
\begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{21} \\ y_{22} \\ y_{23} \end{bmatrix} =
\begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \\ 1 & -1 \\ 1 & -1 \\ 1 & -1 \end{bmatrix}
\begin{bmatrix} \mu \\ \tau_1 \end{bmatrix} +
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \end{bmatrix}
\]

Because τ1 + τ2 = 0, we know that τ2 = −τ1, and we do not
need to estimate τ2 directly.
Again, our new model has a full-rank Z matrix, since r(Z) = 2,
and we can use all previous theory to estimate the τ
parameters and to test hypotheses.
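A minimal sketch of the restricted fit in R follows, again assuming ordinary least squares and using hypothetical response values made up for the example. The names `Z`, `est`, `mu_hat`, `tau1_hat` and `tau2_hat` are illustrative.

```r
# Hypothetical cob counts, for illustration only (not data from the source)
y <- c(10, 12, 11, 15, 14, 16)

# Full-rank design matrix Z under tau1 + tau2 = 0 (columns mu, tau1)
Z <- cbind(mu   = rep(1, 6),
           tau1 = rep(c(1, -1), each = 3))

est <- solve(crossprod(Z), crossprod(Z, y))  # least squares: (Z'Z)^{-1} Z'y
mu_hat   <- est[1]
tau1_hat <- est[2]
tau2_hat <- -tau1_hat                        # recovered from the side condition

c(mu = mu_hat, tau1 = tau1_hat, tau2 = tau2_hat)
```

For a two-level factor, R's built-in `contr.sum(2)` coding yields the same +1/−1 column as the second column of Z.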

Note that, in general, if we want to estimate k + 1 parameters
but r(X) = r < k + 1, we will need k + 1 − r linearly
independent constraints or side conditions.
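The count k + 1 − r can be checked numerically for the balanced one-way model. The sketch below is in R; the helper `one_way_X` is a hypothetical function written for this illustration, not part of any package.

```r
# Overparameterized one-way design matrix for k treatments, n replicates each
# (hypothetical helper, written only for this illustration)
one_way_X <- function(k, n) {
  group <- rep(seq_len(k), each = n)
  # Column of ones for mu, then one indicator column per tau_i
  cbind(1, outer(group, seq_len(k), "==") * 1)
}

# Number of side conditions needed: (number of parameters) - rank
for (k in 2:5) {
  X <- one_way_X(k, n = 3)
  cat("k =", k, " parameters =", ncol(X), " rank =", qr(X)$rank,
      " side conditions =", ncol(X) - qr(X)$rank, "\n")
}
```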


Exercise 1
Three methods of packaging frozen foods were compared by
Daniel (1974, p. 196). The response variable was ascorbic acid
(mg/100g). The data are in the next table:

Table 1: Three methods of packaging frozen foods

Method      A        B        C
          14.29    20.06    20.04
          19.10    20.64    26.23
          19.09    18.00    22.74
          16.25    19.56    24.04
          15.09    19.47    23.37
          16.61    19.07    25.02
          19.63    18.38    23.27
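The R code for this exercise reads the data from a file, ex11.txt, whose contents are not shown. As an alternative, the values of Table 1 can be typed in directly. This is only a sketch: the column names `MP` and `y` are taken from the R code of the exercise, and it is an assumption that the file has this long layout.

```r
# Ascorbic acid (mg/100g) for the three packaging methods, copied from Table 1
A <- c(14.29, 19.10, 19.09, 16.25, 15.09, 16.61, 19.63)
B <- c(20.06, 20.64, 18.00, 19.56, 19.47, 19.07, 18.38)
C <- c(20.04, 26.23, 22.74, 24.04, 23.37, 25.02, 23.27)

# Long format: one row per observation, with the method as a factor
# (column names MP and y are assumed to match those used in ex11.txt)
data <- data.frame(MP = factor(rep(c("A", "B", "C"), each = 7)),
                   y  = c(A, B, C))

str(data)
tapply(data$y, data$MP, mean)   # sample mean of each method
```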

R code 1
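# Exercise 11
library(ggplot2)

# Read the data; MP (presumably the packaging method) must be a factor,
# otherwise contrasts() below fails in R >= 4.0
data <- read.table("ex11.txt", header = TRUE, sep = ";",
                   stringsAsFactors = TRUE)
data

# Scatter plot of the response by level of MP
ggplot(data, aes(MP, y)) + geom_point()

# Boxplot of the response by level of MP
ggplot(data, aes(MP, y)) + geom_boxplot()

# Fit the one-way model, keeping the design matrix (x = TRUE)
mod <- lm(y ~ MP, data = data, x = TRUE)
mod
mod$x               # design matrix used by lm()
summary(mod)        # coefficient estimates and t tests
anova(mod)          # ANOVA table
contrasts(data$MP)  # coding used for the factor MP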
