
One way ANOVA

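This test is applied to find whether there is a significant difference between several means.
Null hypothesis is H0: $\mu_1 = \mu_2 = \dots = \mu_k$
Alternative hypothesis is Ha: at least one group mean differs from the others
Test procedure:
Suppose there are k groups, with $n_i$ observations in the ith group.
Group 1: $x_{11}, x_{12}, \dots, x_{1n_1}$
Group 2: $x_{21}, x_{22}, \dots, x_{2n_2}$, and so on
Group k: $x_{k1}, x_{k2}, \dots, x_{kn_k}$
CF (correction factor) = $G^2/N$, where $G = \sum_i \sum_j x_{ij}$ and $N = \sum_i n_i$, with i = 1, 2, ..., k and j = 1, 2, ..., $n_i$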

Total sum of squares is calculated as follows,
TSS = $\sum_i \sum_j x_{ij}^2$ - CF
Sum of squares due to groups (or treatments) is calculated as follows,
SST = $\sum_i T_i^2 / n_i$ - CF
where $T_i$ = total of all the values in the ith group.
Sum of squares due to error is given by
SSE = TSS - SST
ANOVA table
Source               df    Sum of squares (SS)   Mean sum of squares (MSS = SS/df)   F ratio
Treatments (Groups)  k-1   SST                   MST = SST/(k-1)                     F = MST/MSE, with (k-1) and (N-k) df
Error                N-k   SSE                   MSE = SSE/(N-k)
Total                N-1   TSS

Conclusion:
The critical value of F is obtained from the F table with (k-1) and (N-k) df at the α level of significance, usually 0.05 or 0.01.
If F(calculated) ≤ critical value, then accept H0.
Otherwise reject H0.
OR: if p value < α (for example 0.05), reject H0. Otherwise accept H0.
Example:
The following data show the sales (in 1000 Rs) of five salesmen.
Sales      Salesman A   Salesman B   Salesman C   Salesman D   Salesman E
           100          150          98           100          80
           120          101          100          105          150
           90           85           120          99           125
           85           70           85           150          95
           100          85           70           120          75
           92           90           75           96           105
                        75           90           100
                        100                       90
                                                  85
                                                  110
n_i        6            8            7            10           6
Total T_i  587          756          638          1055         630
H0: There is no significant difference between the sales of the five salesmen
Ha: There is a significant difference between the sales of the five salesmen
G=587+756+638+1055+630=3666
N=6+8+7+10+6=37
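CF = G²/N = 3666²/37 = 363231.24
TSS = (100² + 120² + ... + 105²) - CF
= 378386 - 363231.24 = 15154.76
SST = (587²/6 + 756²/8 + 638²/7 + 1055²/10 + 630²/6) - CF
= 364471.81 - 363231.24 = 1240.57
SSE = TSS - SST = 15154.76 - 1240.57 = 13914.19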
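The hand calculation above can be checked with a few lines of base R. This is only a sketch: the object names (sales, n_i, T_i and so on) are chosen here for illustration and do not come from the original notes; the numbers are the ones in the example table.

```r
# Sales (in 1000 Rs) for each salesman, copied from the example table
sales <- list(
  A = c(100, 120, 90, 85, 100, 92),
  B = c(150, 101, 85, 70, 85, 90, 75, 100),
  C = c(98, 100, 120, 85, 70, 75, 90),
  D = c(100, 105, 99, 150, 120, 96, 100, 90, 85, 110),
  E = c(80, 150, 125, 95, 75, 105)
)

n_i <- sapply(sales, length)   # number of observations in each group
T_i <- sapply(sales, sum)      # total of each group
G   <- sum(T_i)                # grand total
N   <- sum(n_i)                # total number of observations

CF  <- G^2 / N                      # correction factor
TSS <- sum(unlist(sales)^2) - CF    # total sum of squares
SST <- sum(T_i^2 / n_i) - CF        # sum of squares due to salesmen
SSE <- TSS - SST                    # sum of squares due to error

round(c(CF = CF, TSS = TSS, SST = SST, SSE = SSE), 2)
```

The rounded values should agree with the CF, TSS, SST and SSE obtained by hand.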
ANOVA Table
Source     df                 SS         MSS                    F ratio
Salesman   k-1 = 5-1 = 4      1240.57    1240.57/4 = 310.14     F = 310.14/434.82 = 0.7133
Error      N-k = 37-5 = 32    13914.19   13914.19/32 = 434.82
Total      N-1 = 37-1 = 36    15154.76

Critical value ≈ 2.67 with 4 and 32 df at α = 0.05

0.7133 < 2.67, accept H0: there is no significant difference between the sales of the five salesmen.
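Instead of reading the F table, the critical value and the p value can be computed with the base R functions qf() and pf(). The sketch below uses the mean squares from the ANOVA table; the variable names are illustrative only.

```r
F_calc <- 310.14 / 434.82   # F ratio = MST / MSE from the ANOVA table
df1    <- 4                 # k - 1
df2    <- 32                # N - k
alpha  <- 0.05

F_crit  <- qf(1 - alpha, df1, df2)                   # upper-tail critical value
p_value <- pf(F_calc, df1, df2, lower.tail = FALSE)  # P(F > F_calc)

F_calc <= F_crit   # TRUE means H0 is not rejected
p_value < alpha    # FALSE means H0 is not rejected
```

Both decision rules lead to the same conclusion, because the p value is below α exactly when the calculated F exceeds the critical value.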

 
IN R
The R function is aov().
The function summary.aov() is used to summarize the analysis of variance model.
Example:
#Import excel file One way ANOVA
#One way ANOVA
result=aov(Sales~Salesman,data=one_way_ANOVA)
summary(result)
#OR
result=oneway.test(Sales~Salesman,data=one_way_ANOVA,var.equal = TRUE)
result
#OR
anova(lm(Sales~Salesman,data=one_way_ANOVA))
Output
            Df Sum Sq Mean Sq F value Pr(>F)
Salesman     4   1241   310.1   0.713  0.589
Residuals   32  13914   434.8
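The Excel file itself is not part of these notes, so the following sketch types the example data directly into a data frame. It assumes the imported sheet is in long format with one column Sales and one column Salesman, which is what the formula Sales~Salesman suggests; the object tab is a name introduced here for illustration.

```r
# Long format: one row per observation, with the salesman as a factor
one_way_ANOVA <- data.frame(
  Sales = c(100, 120, 90, 85, 100, 92,                     # Salesman A
            150, 101, 85, 70, 85, 90, 75, 100,             # Salesman B
            98, 100, 120, 85, 70, 75, 90,                  # Salesman C
            100, 105, 99, 150, 120, 96, 100, 90, 85, 110,  # Salesman D
            80, 150, 125, 95, 75, 105),                    # Salesman E
  Salesman = factor(rep(c("A", "B", "C", "D", "E"),
                        times = c(6, 8, 7, 10, 6)))
)

result <- aov(Sales ~ Salesman, data = one_way_ANOVA)
tab <- summary(result)[[1]]   # ANOVA table as a data frame
tab
```

The unequal group sizes are handled by rep(..., times = ...), so no padding with missing values is needed.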
Example
# dataset population from package tidyr
library(tidyr)
population
result=aov(population~country,data=population)
summary(result)
Output
              Df    Sum Sq   Mean Sq F value Pr(>F)
country      218 5.953e+19 2.731e+17    4459 <2e-16
Residuals   3841 2.352e+17 6.125e+13
TWO WAY ANOVA
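This test is applied to find whether there is a significant difference between several means when the data are classified by two variables (one in rows, one in columns).
Null hypothesis is a) H0: there is no significant difference between the row means
b) H0: there is no significant difference between the column means
Alternative hypothesis is
a) Ha: at least one row mean differs from the others
b) Ha: at least one column mean differs from the others
Test procedure:
CF = $G^2/N$, where $G = \sum_i \sum_j x_{ij}$, i = 1, 2, ..., r and j = 1, 2, ..., c, N = r × c, r is the number of rows and c is the number of columns.
Total sum of squares is calculated as follows,
TSS = $\sum_i \sum_j x_{ij}^2$ - CF
Sums of squares due to rows and due to columns are calculated as follows,
SSR = $\sum_i R_i^2 / c$ - CF
SSC = $\sum_j C_j^2 / r$ - CF
where $R_i$ = total of all the values in the ith row and
$C_j$ = total of all the values in the jth column.
Sum of squares due to error is given by
SSE = TSS - SSR - SSC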
 
ANOVA Table
Source    df           Sum of squares (SS)   Mean sum of squares (MSS = SS/df)   F ratio
Rows      r-1          SSR                   MSR = SSR/(r-1)                     FR = MSR/MSE
Columns   c-1          SSC                   MSC = SSC/(c-1)                     FC = MSC/MSE
Error     (r-1)(c-1)   SSE                   MSE = SSE/((r-1)(c-1))
Total     N-1          TSS

FR with (r-1) and (r-1)(c-1) df
FC with (c-1) and (r-1)(c-1) df
Conclusion:
The critical value of F is obtained from the F table with the respective df at the α level of significance, usually 0.05 or 0.01.
If F(calculated) ≤ critical value, then accept the corresponding H0.
Otherwise reject it.
Example:
Sales (in 100 Rs) of five salesmen in four districts are given. We want to find whether there is any difference in sales between salesmen and also between districts.
Districts   Salesmen
            A    B    C    D    E
1           50   56   60   52   45
2           40   52   45   40   50
3           60   55   65   45   70
4           45   70   40   45   65
G = 50 + 40 + ... + 70 + 65 = 1050
N = 5 × 4 = 20
Column totals Cj are 195, 233, 210, 182, 230
Row totals Ri are 263, 227, 295, 265
CF = 1050²/20 = 55125
TSS = (50² + 40² + ... + 65²) - CF
= 56944 - 55125 = 1819
SSR = (263² + ... + 265²)/5 - CF
= 55589.6 - 55125 = 464.6
SSC = (195² + ... + 230²)/4 - CF
= 55609.5 - 55125 = 484.5
SSE = 1819 - 464.6 - 484.5 = 869.9
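The same sums of squares can be reproduced in base R by storing the table as a matrix and using rowSums() and colSums(). This is a sketch with illustrative object names (x, Ri, Cj, n_col and so on), not code from the original notes.

```r
# Rows = districts 1 to 4, columns = salesmen A to E (values from the example)
x <- matrix(c(50, 56, 60, 52, 45,
              40, 52, 45, 40, 50,
              60, 55, 65, 45, 70,
              45, 70, 40, 45, 65),
            nrow = 4, byrow = TRUE,
            dimnames = list(District = as.character(1:4),
                            Salesman = c("A", "B", "C", "D", "E")))

r     <- nrow(x)      # number of rows (districts)
n_col <- ncol(x)      # number of columns (salesmen), "c" in the formulas
G     <- sum(x)       # grand total
N     <- r * n_col
CF    <- G^2 / N      # correction factor

Ri <- rowSums(x)      # district totals
Cj <- colSums(x)      # salesman totals

TSS <- sum(x^2) - CF
SSR <- sum(Ri^2) / n_col - CF   # each row total is a sum of n_col values
SSC <- sum(Cj^2) / r - CF       # each column total is a sum of r values
SSE <- TSS - SSR - SSC

c(TSS = TSS, SSR = SSR, SSC = SSC, SSE = SSE)
```

Note the divisors: row totals are divided by the number of columns and column totals by the number of rows.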
ANOVA Table
Source      df           SS      MSS                  F ratio
Districts   4-1 = 3      464.6   464.6/3 = 154.87     FR = 154.87/72.49 = 2.14 with 3 and 12 df
Salesman    5-1 = 4      484.5   484.5/4 = 121.13     FC = 121.13/72.49 = 1.67 with 4 and 12 df
Error       3 × 4 = 12   869.9   869.9/12 = 72.49
Total       20-1 = 19    1819

Critical value of F with 3 and 12 df at α = 0.05 is 3.49
Critical value of F with 4 and 12 df at α = 0.05 is 3.26
2.14 < 3.49, accept H0, which implies there is no significant difference in district-wise sales.
1.67 < 3.26, accept H0, which implies there is no significant difference in salesman-wise sales.
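As a cross-check on the table lookups, the critical values and p values for both F ratios can be obtained from qf() and pf() in base R. The sketch below starts from the sums of squares of the example; the variable names are illustrative.

```r
MSE <- 869.9 / 12          # error mean square, (r-1)(c-1) = 12 df
FR  <- (464.6 / 3) / MSE   # districts (rows), 3 df
FC  <- (484.5 / 4) / MSE   # salesmen (columns), 4 df

crit_R <- qf(0.95, 3, 12)  # critical value for districts at alpha = 0.05
crit_C <- qf(0.95, 4, 12)  # critical value for salesmen at alpha = 0.05

p_R <- pf(FR, 3, 12, lower.tail = FALSE)   # p value for districts
p_C <- pf(FC, 4, 12, lower.tail = FALSE)   # p value for salesmen

round(c(FR = FR, crit_R = crit_R, p_R = p_R,
        FC = FC, crit_C = crit_C, p_C = p_C), 3)
```

Both F ratios fall below their critical values, so neither null hypothesis is rejected at the 5% level.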
IN R

# Import excel file Two way ANOVA


result=aov(Sales~Salesman+District,data = Two_way_ANOVA)
summary(result)
#OR
result=lm(Sales~Salesman+District,data = Two_way_ANOVA)
anova(result)
Output
            Df Sum Sq Mean Sq F value Pr(>F)
Salesman     4  484.5  121.12   1.671  0.221
District     3  464.6  154.87   2.136  0.149
Residuals   12  869.9   72.49
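If the Excel file is not at hand, the example can be entered directly. The sketch below assumes the sheet has the three columns Sales, Salesman and District used in the formula. District is stored as a factor here: if it were left as the numbers 1 to 4, R would fit it as a single numeric covariate with 1 df instead of a factor with 3 df.

```r
Two_way_ANOVA <- data.frame(
  Sales = c(50, 56, 60, 52, 45,    # district 1, salesmen A to E
            40, 52, 45, 40, 50,    # district 2
            60, 55, 65, 45, 70,    # district 3
            45, 70, 40, 45, 65),   # district 4
  District = factor(rep(1:4, each = 5)),
  Salesman = factor(rep(c("A", "B", "C", "D", "E"), times = 4))
)

result <- aov(Sales ~ Salesman + District, data = Two_way_ANOVA)
tab <- summary(result)[[1]]   # ANOVA table as a data frame
tab
```

Because every salesman appears exactly once in every district, the order of the two terms in the formula does not change the sums of squares.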
Two way ANOVA with interaction effect
• The two-way ANOVA can evaluate not only the main effect of each independent variable but also the potential interaction between them.
• For example, the effect of fertilizer, the quality of seed, and the interaction effect of both on the yield of a crop.
Types of effects and hypotheses
In two-way ANOVA with two factors (independent variables) A and B, we can test two main effects and one interaction effect:
• Main effect of A: are the population means the same for different levels (groups) of A?
– The null hypothesis: the population means are the same for different levels of A.
• Main effect of B: are the population means the same for different levels (groups) of B?
– The null hypothesis: the population means are the same for different levels of B.
• The interaction effect of A and B: does the difference in the population means between levels of A depend on the level of B, or vice versa?
– The null hypothesis: the effect of A on the outcome does not depend on the level of B (and vice versa).
Two-way ANOVA F tests
As in one-way ANOVA, the F test is used in hypothesis testing. The test is constructed from the decomposition of the sum of squares of the outcome variable.
Specifically, we have
SST = SSA + SSB + SSA×B + SSW
• SST or SStotal is the total sum of squares for the outcome variable.
• SSA is the sum of squares attributed to factor A.
• SSB is the sum of squares attributed to factor B.
• SSA×B is the sum of squares attributed to the joint effect of A and B.
• SSW or SSwithin is the residual (within-cell) sum of squares that is not attributed to A, B or their interaction.
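The decomposition can be verified numerically. The sketch below uses a small made-up balanced data set (2 levels of A, 2 levels of B, 2 observations per cell); the numbers and object names are purely illustrative and are not taken from the notes. Here SST denotes the total sum of squares, as in the decomposition above.

```r
# Made-up balanced data: 2 levels of A x 2 levels of B x 2 replicates
d <- data.frame(
  A = factor(rep(c("a1", "a2"), each = 4)),
  B = factor(rep(rep(c("b1", "b2"), each = 2), times = 2)),
  y = c(10, 12, 15, 17, 20, 22, 18, 16)
)

grand   <- mean(d$y)            # grand mean
mean_A  <- ave(d$y, d$A)        # mean of the A level each row belongs to
mean_B  <- ave(d$y, d$B)        # mean of the B level each row belongs to
mean_AB <- ave(d$y, d$A, d$B)   # mean of the cell each row belongs to

SST  <- sum((d$y - grand)^2)                        # total
SSA  <- sum((mean_A - grand)^2)                     # factor A
SSB  <- sum((mean_B - grand)^2)                     # factor B
SSAB <- sum((mean_AB - mean_A - mean_B + grand)^2)  # interaction
SSW  <- sum((d$y - mean_AB)^2)                      # within cells

all.equal(SST, SSA + SSB + SSAB + SSW)   # the decomposition holds
```

These simple mean-based formulas rely on the design being balanced (equal numbers of observations in every cell); with unequal cell sizes the pieces do not add up in this way.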
ANOVA table
Let N be the sample size, J the number of levels in factor A, and K the number of levels in factor B. With this information, the ANOVA table is:

Source              df               Sum of squares   Mean sum of squares                 F statistic
Factor A            J−1              SSA              MSA = SSA/(J−1)                     FA = MSA/MSW
Factor B            K−1              SSB              MSB = SSB/(K−1)                     FB = MSB/MSW
Interaction         (J−1)×(K−1)      SSA×B            MSA×B = SSA×B/((J−1)×(K−1))         FA×B = MSA×B/MSW
Error or Residual   N−(J×K)          SSW              MSW = SSW/(N−(J×K))
Total               N−1              SST


Conclusion
• For testing the effect of Factor A: compare the statistic FA to an F distribution with degrees of freedom J − 1 and N − (J × K).
• For testing the effect of Factor B: compare the statistic FB to an F distribution with degrees of freedom K − 1 and N − (J × K).
• For testing the interaction effect of Factor A and Factor B: compare the statistic FA×B to an F distribution with degrees of freedom (J − 1) × (K − 1) and N − (J × K).
IN R
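#Import excel file Two way anova interaction
#Two way ANOVA with interaction
#fertizier and rain should be factors (categorical columns)
#fertizier*rain expands to fertizier + rain + fertizier:rain
attach(Two_way_anova_interaction)
model <- lm(Yield~fertizier*rain)
anova(model)
#OR
result=aov(Yield~fertizier+rain+fertizier*rain)
summary(result)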
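The sketch below shows the same model fitted with the data argument instead of attach(), and checks the degrees of freedom against J − 1, K − 1, (J − 1) × (K − 1) and N − (J × K). The yields are made up for illustration only (2 fertilizers, 3 rain levels, 2 plots per cell); only the column names Yield, fertizier and rain follow the code above.

```r
# Made-up yields: 2 fertilizers x 3 rain levels x 2 plots per cell
Two_way_anova_interaction <- data.frame(
  fertizier = factor(rep(c("F1", "F2"), each = 6)),
  rain      = factor(rep(rep(c("low", "medium", "high"), each = 2), times = 2)),
  Yield     = c(20, 22, 25, 27, 30, 29, 24, 23, 31, 33, 32, 35)
)

# data = ... avoids attach() and keeps the column lookup explicit
model <- lm(Yield ~ fertizier * rain, data = Two_way_anova_interaction)
tab <- anova(model)
tab

J <- nlevels(Two_way_anova_interaction$fertizier)   # levels of factor A
K <- nlevels(Two_way_anova_interaction$rain)        # levels of factor B
N <- nrow(Two_way_anova_interaction)                # sample size

c(J - 1, K - 1, (J - 1) * (K - 1), N - J * K)       # should match tab$Df
```

The rows of the table appear in the order fertizier, rain, fertizier:rain, Residuals, which corresponds to Factor A, Factor B, Interaction and Error in the ANOVA table.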
