
# http://stattrek.com/tables/f.aspx

General Statistics
Analysis of Variance - ANOVA
Comparing more than 2 population means
Analysis of Variance (ANOVA) is a statistical test used to determine if more than two
population means are equal.
The test uses the F-distribution (a probability distribution) and information
about the variance within each population
and the variance between the populations to help decide whether the variability between
the populations and the variability within each population are significantly different.
So the ANOVA method tests the hypotheses that:

H0: $\mu_1 = \mu_2 = \dots = \mu_k$ or
Ha: Not all the means are equal
1. Know the purpose of the analysis of variance test.
The analysis of variance (ANOVA) test statistic is used to test whether more than 2
population means are equal.
2. Know the difference between the within-sample estimate of the variance and
the between-sample estimate of the variance
and how to calculate them.
When comparing two or more populations there are several ways to estimate the
variance.
The within-sample (within-treatment) variance or variation is the average of all the
variances for each population and is an estimate of $\sigma^2$
whether the null hypothesis, H0, is true or not.
$s_w^2 = \frac{1}{k}\sum_{j=1}^{k} s_j^2$, for j = 1 to k, where k is the number of samples or populations.
Example: Given 5 populations with sample variances 2.0, 2.2, 2.35, 2.41 and 2.45, the
within-sample variance is:
$s_w^2 = \frac{2.0 + 2.2 + 2.35 + 2.41 + 2.45}{5} = \frac{11.41}{5} = 2.282$
The within-sample variance is often called the unexplained variation.
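The averaging step can be checked with a few lines of Python. This is only a minimal sketch: it assumes every sample has the same size (so a plain average of the variances is appropriate), and the variable names are illustrative, not part of the original notes.

```python
from statistics import mean

# Sample variances from the example (one per population).
sample_variances = [2.0, 2.2, 2.35, 2.41, 2.45]

# With equal sample sizes, the within-sample estimate of the variance
# is the plain average of the individual sample variances.
within_variance = mean(sample_variances)

print(round(within_variance, 3))  # 2.282
```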
The between-sample variance is the average of the squared deviations of each
sample mean from the mean
of all the data (the Grand Mean, $\bar{\bar{x}}$) and, once multiplied by the common sample size, is an
estimate of $\sigma^2$ only if the null hypothesis, H0, is true.
When the null hypothesis is false this variance is relatively large, and by comparing it
with the within-sample variance
we can tell statistically whether H0 is true or not.
The between-sample variance is associated with the explained variation of our
experiment.
Example: Given the means of 5 samples: 210, 212, 213, 214 and 214,
another estimate of the variance, the sample variance of the means, is:
$s_{\bar{x}}^2 = \frac{\sum_{j=1}^{k} (\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$, for j = 1 to k, where k is the number of samples or populations.

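| Sample | Sample mean (1), $\bar{x}_j$ | Mean - Grand Mean (2), $\bar{x}_j - \bar{\bar{x}}$ | Sum of Square (3), $(\bar{x}_j - \bar{\bar{x}})^2$ |
|---|---|---|---|
| 1 | 210 | -2.6 | 6.76 |
| 2 | 212 | -0.6 | 0.36 |
| 3 | 213 | 0.4 | 0.16 |
| 4 | 214 | 1.4 | 1.96 |
| 5 | 214 | 1.4 | 1.96 |
| Average (1) | 212.6 | | |
| Total (3) | | | 11.2 |

Grand Mean, $\bar{\bar{x}}$ = 212.6, k = 5, and $s_{\bar{x}}^2$ (between) = $\sum_j (\bar{x}_j - \bar{\bar{x}})^2 / (k - 1)$ = 11.2 / 4 = 2.8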
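The same arithmetic can be reproduced in Python as a sanity check. This is a sketch only: it takes the grand mean to be the average of the sample means, which assumes equal sample sizes, and the variable names are illustrative.

```python
from statistics import mean, variance

# Sample means from the example.
sample_means = [210, 212, 213, 214, 214]
k = len(sample_means)

# Grand mean (average of the sample means, assuming equal sample sizes).
grand_mean = mean(sample_means)

# Squared deviation of each sample mean from the grand mean.
squared_devs = [(m - grand_mean) ** 2 for m in sample_means]

# Variance of the means: total squared deviation divided by k - 1.
between_variance = sum(squared_devs) / (k - 1)

# statistics.variance uses the same k - 1 divisor, so it should agree.
library_value = variance(sample_means)

print(round(grand_mean, 1), round(sum(squared_devs), 2), round(between_variance, 2))
```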
When the null hypothesis, H0, is true the within-sample variance and the
between-sample variance will be about the same;
however, if the between-sample variance is much larger than the within-sample variance, we would
reject H0.
If the data from both examples above are from the same 5 samples or populations, then
the ratio of the two estimates of the variance
would give the following:
$F = \frac{\text{between-sample estimate of } \sigma^2}{\text{within-sample estimate of } \sigma^2}$
This ratio has an F-distribution.
3. Know the properties of an F Distribution.
There is an infinite number of F-Distributions, one for each combination of
the degrees of freedom (df1) of the between-sample variance and the degrees of freedom
(df2) of the within-sample variance; the critical value read from an F-table also depends
on the alpha significance level.
The F-Distribution is the distribution of the ratio of the between-sample estimate of $\sigma^2$ and the
within-sample estimate:
$F = \frac{s_{between}^2}{s_{within}^2}$
If there are k populations and n data values in all the samples combined,
then the degrees of freedom of the between-sample variance are df1 = k - 1 and
the degrees of freedom of the within-sample variance are df2 = n - k.
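As a small illustration of the two degrees-of-freedom formulas, the sketch below counts k and n from a list of samples. The function name `f_degrees_of_freedom` and the placeholder observations are hypothetical; only the formulas df1 = k - 1 and df2 = n - k come from the text.

```python
def f_degrees_of_freedom(samples):
    """Return (df1, df2) = (k - 1, n - k) for a list of samples."""
    k = len(samples)                     # number of samples / populations
    n = sum(len(s) for s in samples)     # total number of observations
    return k - 1, n - k


# Placeholder data: 3 samples of 5 observations each (values are irrelevant here).
placeholder_samples = [[0.0] * 5 for _ in range(3)]
df1, df2 = f_degrees_of_freedom(placeholder_samples)
print(df1, df2)  # 2 12
```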
The graph of an F probability distribution starts at 0 and extends indefinitely to the
right.
It is skewed to the right, similar to the graph shown below.

[Figure: F-Distribution Graphs]
4. Know how sums of squares relate to Analysis of Variance.
Remember back in Chapter 3 (Regression) we introduced the concept that the total
sum of squares
is equal to the sum of the explained and unexplained variation; this section is an
extension of that discussion.

Total Variation = Explained Variation + Unexplained Variation.
Sum of Squares
The sum of squares for the between-sample variation is given either by the
symbol SSB (sum of squares between)
or SSTR (sum of squares for treatments) and is the explained variation.
To calculate SSB or SSTR, we square the deviation of each sample (treatment)
mean from the grand mean,
multiply it by the number of observations in that sample, and sum the results.
The sum of squares for the within-sample variation is given either by the
symbol SSW (sum of squares within)
or SSE (sum of squares for error).
To calculate SSE we first obtain the sum of squares for each sample and then
sum them.
The Total Sum of Squares, SSTO = SSB + SSW
The between-sample variance, $s_B^2 = \frac{SSB}{k - 1}$, where k is the number of samples or
treatments, is often
called the Mean Square Between, $MSB = \frac{SSB}{k - 1}$.
The within-sample variance, $s_W^2 = \frac{SSW}{n - k}$, where n is the total number of observations
in all the samples, is often
called the Mean Square Within or Mean Square Error, $MSW = MSE = \frac{SSE}{n - k}$.
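The definitions above translate almost directly into code. The sketch below is one possible implementation, not the notes' own program: the function name `sums_of_squares` is hypothetical, and the small `toy` data set is made up purely to show that SSTO computed directly equals SSB + SSW.

```python
from statistics import mean


def sums_of_squares(samples):
    """Return (SSB, SSW, SSTO) for a list of samples (lists of numbers)."""
    all_values = [x for s in samples for x in s]
    grand_mean = mean(all_values)
    # Between: squared deviation of each sample mean from the grand mean,
    # weighted by the number of observations in that sample.
    ssb = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Within: squared deviations of each observation from its own sample mean.
    ssw = sum((x - mean(s)) ** 2 for s in samples for x in s)
    # Total: squared deviations of every observation from the grand mean.
    ssto = sum((x - grand_mean) ** 2 for x in all_values)
    return ssb, ssw, ssto


# Made-up illustration data (not from the notes).
toy = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ssb, ssw, ssto = sums_of_squares(toy)
print(round(ssb, 6), round(ssw, 6), round(ssto, 6))  # 26.0 6.0 32.0
```

Computing SSTO independently, rather than as SSB + SSW, makes the identity a useful check on the other two values.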
Example: Compute the SSB, SSE and SSTO for the following samples:

Treatment of samples j (= 1 to k), k = 3

| Rows, i | Sample 1 | Sample 2 | Sample 3 |
|---|---|---|---|
| 1 | 77 | 83 | 80 |
| 2 | 79 | 91 | 82 |
| 3 | 87 | 94 | 86 |
| 4 | 85 | 88 | 85 |
| 5 | 78 | 85 | 80 |
| Mean | 81.2 | 88.2 | 82.6 |

Grand Mean, $\bar{\bar{x}}$ = 84
Sum of Square Worksheet

Treatment of samples j (= 1 to k), k = 3

| Rows, i | SS (Column 1) | SS (Column 2) | SS (Column 3) |
|---|---|---|---|
| 1 | $(77 - 81.2)^2 = 17.64$ | $(83 - 88.2)^2 = 27.04$ | $(80 - 82.6)^2 = 6.76$ |
| 2 | $(79 - 81.2)^2 = 4.84$ | $(91 - 88.2)^2 = 7.84$ | $(82 - 82.6)^2 = 0.36$ |
| 3 | $(87 - 81.2)^2 = 33.64$ | $(94 - 88.2)^2 = 33.64$ | $(86 - 82.6)^2 = 11.56$ |
| 4 | $(85 - 81.2)^2 = 14.44$ | $(88 - 88.2)^2 = 0.04$ | $(85 - 82.6)^2 = 5.76$ |
| 5 | $(78 - 81.2)^2 = 10.24$ | $(85 - 88.2)^2 = 10.24$ | $(80 - 82.6)^2 = 6.76$ |
| SS (Col) | 80.8 | 78.8 | 31.2 |
| Sample size, $n_j$ | 5 | 5 | 5 |
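SSW = SSE = $\sum_j SS_j$ = 80.8 + 78.8 + 31.2 = 190.8
SSB = SSTR = $\sum_j n_j (\bar{x}_j - \bar{\bar{x}})^2$ = $5[(81.2 - 84)^2 + (88.2 - 84)^2 + (82.6 - 84)^2]$ = 137.2
SSTO = SSB + SSW = 137.2 + 190.8 = 328.0
MSB = SSB/(k - 1) = 137.2/2 = 68.6 and MSW = MSE = SSE/(n - k) = 190.8/12 = 15.9
F = MSB/MSE = 68.6/15.9 ≈ 4.31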
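The worked example can be verified numerically. The sketch below recomputes the sums of squares, the mean squares and the F ratio from the three samples; the variable names are illustrative and the code is a check on the arithmetic rather than the notes' own ANOVA program.

```python
from statistics import mean

# The three samples (treatments) from the worked example.
samples = [
    [77, 79, 87, 85, 78],   # Sample 1, mean 81.2
    [83, 91, 94, 88, 85],   # Sample 2, mean 88.2
    [80, 82, 86, 85, 80],   # Sample 3, mean 82.6
]

k = len(samples)                                   # number of samples
n = sum(len(s) for s in samples)                   # total observations
grand_mean = mean([x for s in samples for x in s])

# Within-sample (error) sum of squares: one column total per sample, then summed.
column_ss = [sum((x - mean(s)) ** 2 for x in s) for s in samples]
ssw = sum(column_ss)

# Between-sample (treatment) sum of squares.
ssb = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)

msb = ssb / (k - 1)      # Mean Square Between
msw = ssw / (n - k)      # Mean Square Within (Error)
f_stat = msb / msw

print([round(c, 1) for c in column_ss])   # [80.8, 78.8, 31.2]
print(round(ssb, 1), round(ssw, 1), round(msb, 1), round(msw, 1), round(f_stat, 2))
# 137.2 190.8 68.6 15.9 4.31
```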
5. Know how to construct an ANOVA Table.
The various statistics computed from the analysis of variance above can be
summarized in an ANOVA Table
as shown below. These summaries are then used to draw inferences about the various
samples or treatments that we are studying.
ANOVA Table - Basic layout:

| Source | Sum of Squares (SS) | Degrees of Freedom (df) | Mean Square | F-Statistic | P-value |
|---|---|---|---|---|---|
| Between Samples (Explained) | SSB | k-1 | MSB = SSB/(k-1) | F = MSB/MSE | Value from Table |
| Within Samples (Unexplained) | SSE | n-k | MSE = SSE/(n-k) | | |
| Total | SSTO | n-1 | | | |
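Once SSB, SSE, k and n are known, the layout above can be filled in mechanically. The sketch below does this for the sums of squares of the worked example (SSB = 137.2, SSE = 190.8, k = 3, n = 15); the helper names `anova_table` and `cell` are hypothetical, and the P-value column is left out because it needs an F-table or a statistics library.

```python
def anova_table(ssb, sse, k, n):
    """Build the rows of a one-way ANOVA table from the sums of squares."""
    msb = ssb / (k - 1)      # Mean Square Between
    mse = sse / (n - k)      # Mean Square Error (Within)
    return [
        ("Between Samples", ssb, k - 1, msb, msb / mse),
        ("Within Samples", sse, n - k, mse, None),
        ("Total", ssb + sse, n - 1, None, None),
    ]


def cell(value, width):
    """Right-align a number to 2 decimals; leave missing entries blank."""
    text = "" if value is None else f"{value:.2f}"
    return text.rjust(width)


rows = anova_table(ssb=137.2, sse=190.8, k=3, n=15)
print(f"{'Source':<16}{'SS':>8}{'df':>4}{'MS':>8}{'F':>7}")
for source, ss, df, ms, f_stat in rows:
    print(f"{source:<16}{cell(ss, 8)}{df:>4}{cell(ms, 8)}{cell(f_stat, 7)}")
```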
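The ANOVA table is easily constructed from the ANOVA program by entering each
observation for each sample in the
appropriate columns and deleting any division-by-0 entries in selected regions of the program.
For the data above the ANOVA table is:

| Source | Sum of Squares (SS) | Degrees of Freedom (df) | Mean Square | F-Statistic |
|---|---|---|---|---|
| Between Samples (Explained) | 137.2 | 2 | 68.6 | 4.31 |
| Within Samples (Unexplained) | 190.8 | 12 | 15.9 | |
| Total | 328.0 | 14 | | |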

6. Know how to interpret the data in the ANOVA table against the null
hypothesis.
The ANOVA table program computes the necessary statistics for evaluating the null
hypothesis that
the means are equal: H0: $\mu_1 = \mu_2 = \dots = \mu_k$.
Use the degrees of freedom and an alpha significance level to obtain the expected
F-Distribution statistic
from the lookup table or from the ANOVA program.
Acceptance Criteria for the Null Hypothesis:
If the F-statistic computed in the ANOVA table is less than the F-table statistic, or
the P-value is greater than the alpha level of significance, then there is no reason to
reject the null hypothesis
that all the means are the same:

That is, accept H0 if: F-Statistic < F-table or P-value > alpha.
For the example above we would reject the null hypothesis at the 5% significance
level and conclude
that the means are not equal since F-Stat > F-Table.
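For the example, with k = 3 samples, the comparison can be checked without a lookup table. When the numerator degrees of freedom equal 2, the upper-tail probability of the F distribution has the closed form $P(F > f) = (1 + 2f/df_2)^{-df_2/2}$. The sketch below uses that special case only; it is not a general F-table replacement, and the function names are hypothetical.

```python
def f_upper_tail_df1_is_2(f_stat, df2):
    """P(F > f_stat) for an F distribution with 2 and df2 degrees of freedom.

    Valid only when the numerator degrees of freedom equal 2 (k = 3 samples).
    """
    return (1 + 2 * f_stat / df2) ** (-df2 / 2)


def f_critical_df1_is_2(alpha, df2):
    """Critical value c with P(F > c) = alpha, again only for 2 numerator df."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


alpha = 0.05
f_stat = 68.6 / 15.9          # MSB / MSE from the worked example
df2 = 12                      # n - k = 15 - 3

p_value = f_upper_tail_df1_is_2(f_stat, df2)
f_table = f_critical_df1_is_2(alpha, df2)

# Both forms of the criterion lead to the same decision.
reject_by_f = f_stat > f_table
reject_by_p = p_value < alpha
print(round(f_table, 2), round(p_value, 3), reject_by_f, reject_by_p)
```

For other values of k, a statistics library or a printed F-table is needed to obtain the critical value or P-value.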
7. Know the procedure for testing the null hypothesis that the means for more
than two populations are equal.
Step 1. Formulate Hypotheses:
H0: $\mu_1 = \mu_2 = \dots = \mu_k$ and
Ha: Not all the means are equal
Step 2. Select the F-Statistic Test for equality of more than two means
Step 3. Obtain or decide on a significance level for alpha, say $\alpha = 0.05$
Step 4. Compute the test statistic from the ANOVA table.
Step 5. Identify the critical region: the region of rejection of H0 is obtained from
the F-table with alpha and degrees of freedom (k-1, n-k).
Step 6. Make a decision:
That is, accept H0 if: F-Statistic < F-table or P-value > alpha.
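Steps 4 to 6 can be sketched as a single function. This is an illustration under stated assumptions, not the notes' ANOVA program: the function name `one_way_anova` is hypothetical, and the critical value is passed in by the caller after looking it up in an F-table (3.89 is the tabulated value for alpha = 0.05 with degrees of freedom (2, 12)).

```python
from statistics import mean


def one_way_anova(samples, f_critical):
    """Compute the F statistic and compare it with a looked-up critical value."""
    k = len(samples)
    n = sum(len(s) for s in samples)
    grand_mean = mean([x for s in samples for x in s])
    # Step 4: test statistic from the sums of squares and mean squares.
    ssb = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    sse = sum((x - mean(s)) ** 2 for s in samples for x in s)
    f_stat = (ssb / (k - 1)) / (sse / (n - k))
    # Steps 5 and 6: reject H0 when F falls in the critical region.
    return {
        "df": (k - 1, n - k),
        "F": f_stat,
        "reject_h0": f_stat > f_critical,
    }


samples = [
    [77, 79, 87, 85, 78],
    [83, 91, 94, 88, 85],
    [80, 82, 86, 85, 80],
]
result = one_way_anova(samples, f_critical=3.89)
print(result["df"], round(result["F"], 2), result["reject_h0"])  # (2, 12) 4.31 True
```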
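Worksheet for Computing Analysis of Variance Statistics:

Treatment of samples j (= 1 to k)

| Rows, i | Sum of Square j=1 | Sum of Square j=2 | Sum of Square j=3 | Sum of Square j=4 | Sum of Square j=5 |
|---|---|---|---|---|---|
| 1 | | | | | |
| 2 | | | | | |
| 3 | | | | | |
| 4 | | | | | |
| 5 | | | | | |
| 6 | | | | | |
| 7 | | | | | |
| 8 | | | | | |
| 9 | | | | | |
| 10 | | | | | |
| 11 | | | | | |
| 12 | | | | | |
| 13 | | | | | |
| 14 | | | | | |
| SS (Column Total) | | | | | |
| Sample size, $n_j$ | | | | | |

SSW = SSE = $\sum_j SS_j$ =
SSB = SSTR = $\sum_j n_j (\bar{x}_j - \bar{\bar{x}})^2$ =
ANOVA Table

| Source | Sum of Squares (SS) | Degrees of Freedom (df) | Mean Square | F-Statistic | P-value & F-Table |
|---|---|---|---|---|---|
| Between Samples (Explained) | SSB = | k-1 = | MSB = | F = | |
| Within Samples (Unexplained) | SSE = | n-k = | MSE = | | |
| Total | SSTO = | n-1 = | | | |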