
F-Test

Dr. Gehan Jayasuriya


Compare the standard deviations of two samples.
The mean was used to check whether two samples belong to the same population or not.
There are two unique parameters for a population: the mean and the standard deviation (SD).

Populations can have equal means but different SDs.

Therefore, it is important to check whether the standard deviations of the samples are equal or not, in order to determine whether the samples belong to the same population or not.
The ratio of two sample variances drawn from the same (normal) population follows an F distribution.

Therefore, the F-test is used to compare the SDs of two samples.
Two parameters determine an F distribution: the degrees of freedom of the numerator and the degrees of freedom of the denominator.
F-Test
It can be a two-tailed test or a one-tailed test.
The two-tailed version tests against the alternative that the standard deviations are not equal.
The one-tailed version tests in only one direction, that is, whether the standard deviation of the first population is either greater than or less than (but not both) the standard deviation of the second population.
e.g.: To test the effect of NPK fertilizer brand on soybean yield, two samples of 16 soybean plants each were used. One sample was treated with a new brand whereas the other was treated with the traditional brand. Sample 1, treated with the new brand, had a mean yield of 1.3 kg per plant with an SD of 0.4, whereas the other sample had a mean yield of 0.9 kg with an SD of 0.6. Check whether these two samples have different yields. First, do an F-test to check whether the two samples have equal SDs.
Hypothesis:
H0 : σ1 = σ2
Ha : σ1 ≠ σ2

Rejection criteria:
If calculated F > critical F, i.e. F(α, N1−1, N2−1), H0 can be rejected.
Otherwise we cannot reject H0.

Test statistic:
F = s1² / s2²
Always put the larger variance in the numerator.
s1² = (0.6)², s2² = (0.4)²
F = (0.6)² / (0.4)² = 2.25
Critical F value
To get the critical value we have to know three parameters: the numerator degrees of freedom, the denominator degrees of freedom, and α.
Numerator degrees of freedom = 16 − 1 = 15
Denominator degrees of freedom = 16 − 1 = 15
α = 0.05
F(α, N1−1, N2−1) = F(0.05, 15, 15) = 2.40

Calculated F < critical F

Therefore there is no evidence to reject H0.
That is, σ1 = σ2.
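A minimal sketch of this F-test in Python, assuming scipy is available (the variable names are illustrative, not from the slides):

from scipy import stats

s1, n1 = 0.6, 16   # sample with the larger SD goes in the numerator
s2, n2 = 0.4, 16

F = s1**2 / s2**2                          # test statistic: ratio of sample variances
crit = stats.f.ppf(0.95, n1 - 1, n2 - 1)   # critical value F(0.05, 15, 15), about 2.40

print(f"F = {F:.2f}, critical F = {crit:.2f}")
print("Reject H0" if F > crit else "Fail to reject H0")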
Analysis of variance (ANOVA)

When there are more than two treatments (samples) to compare,
several t-tests could be performed to check whether these treatments are different from each other or not.
For 3 treatments: T1, T2, T3
3 t-tests to compare,
T1 vs T2
T1 vs. T3
T2 vs T3
For 4 treatments 6 t-tests have to be done.
Each time a t-test is conducted at α = 0.05, there is a 5% chance of a Type I error.
When 6 t-tests are conducted, the probability of making at least one error is 1 − (0.95)^6 ≈ 0.26.
Therefore, when there are more than two samples (treatments), analysis of variance is performed instead of t- or z-tests.
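A quick check of this familywise error rate in Python (a sketch; the numbers match the four-treatment case above):

from math import comb

k, alpha = 4, 0.05
m = comb(k, 2)                      # number of pairwise t-tests: 6
familywise = 1 - (1 - alpha) ** m   # probability of at least one Type I error
print(m, round(familywise, 2))      # 6 0.26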

Analysis of Variance (ANOVA)


As the name implies, it tests for differences by comparing variances.

e.g. 1. To test the effect of a new fertilizer, two doses of fertilizer and a no-fertilizer control were used. Three samples of 12 plants each were selected. One sample was treated with 10 g of fertilizer per plant, another sample was treated with 5 g of fertilizer per plant, and the third sample was not treated.
As mentioned, the variance (variation) among these treatments is what is compared.

What are the factors that can cause variation in yield between plants?

• Random error
• Effect of treatment (fertilizer dose)

Random error causes the variation within groups.
The effect of treatment plus random error causes the variation between groups.
We compare the variance between groups with the variance within groups.
If the variance between groups is higher than the variance within groups, there is an effect of treatment on yield.
If the variance between groups is not higher than the variance within groups, there is no evidence of an effect of treatment on yield.
To compare these variances, an F-test has to be conducted.
Following are the results obtained for the experiment explained in e.g. 1:

Fertilizer dosage per plant (g)
        0     5    10
       12    14    16
       10    13    17
       13    14    14
       10    15    13
       14    13    17
        9    16    15
       12    14    13
       10    17    14
        9    14    12
       13    13    17
       12    16    15
       11    14    16
Total 135   173   179

Variance within groups: due to random error.
Variance between groups: due to random error + variance due to treatment.


How can we calculate the variance within groups and the variance between groups?

Deviation of each score from the grand mean
  = deviation of the score from its own group mean
  + deviation of the group mean from the grand mean.

(Group means are estimates of the population mean. If the second term is large, the groups are very different.)

As negative numbers are awkward to deal with, we square the deviations:

Total sum of squared deviations
  = within-group sum of squared deviations (A)
  + between-group sum of squared deviations (B)

From the sum of squared deviations about a mean comes the variance.

Total variance = within-group variance + between-group variance

Between-group variance = B / (k − 1)
Within-group variance = A / [k(n − 1)]

k = number of groups
n = number of observations in each group

F ratio:
  between-group variance / within-group variance

If the calculated F value is higher than the tabulated (critical) F, we reject H0.
ANOVA TABLE

Source of Variation   Degrees of Freedom   Sums of Squares   Mean Squares    F
Treatments            k − 1                SStr              SStr/(k − 1)    MStr/MSe
Error                 nk − k               SSe               SSe/(nk − k)
Total                 nk − 1               SSTotal

Where,
k: number of treatments
n: number of replicates in each treatment
SStr: treatment sum of squares
SSe: error sum of squares
SSTotal: total sum of squares
MS: mean square
SStr = n × Σ (yi. − y..)²,  summed over treatments i = 1, …, k
SStot = Σ Σ (yij − y..)²,  summed over i = 1, …, k and j = 1, …, n

where yi. is the mean of treatment i and y.. is the grand mean.

Total = Treatment + Error, so
SSe = SStot − SStr

Mean square = sum of squares / degrees of freedom

F = MStr / MSe
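The formulas above translate directly into code. The following is a minimal sketch with numpy, assuming equal group sizes (the function name and layout are illustrative, not part of the slides):

import numpy as np

def one_way_anova(groups):
    """groups: list of equal-sized sequences of observations, one per treatment."""
    y = np.array(groups, dtype=float)      # shape (k, n)
    k, n = y.shape
    grand_mean = y.mean()

    ss_tr = n * np.sum((y.mean(axis=1) - grand_mean) ** 2)   # SStr
    ss_tot = np.sum((y - grand_mean) ** 2)                   # SStot
    ss_e = ss_tot - ss_tr                                    # SSe

    ms_tr = ss_tr / (k - 1)            # treatment mean square
    ms_e = ss_e / (n * k - k)          # error mean square
    return ss_tr, ss_e, ss_tot, ms_tr / ms_e   # last value is F = MStr / MSe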
Let’s consider e.g. 1.

Hypothesis:
H0 : µ1 = µ2 = µ3
Ha : at least one µi is different

Rejection criteria:
If calculated F > critical F, H0 can be rejected;
otherwise H0 cannot be rejected.
Test statistic:
ANOVA TABLE

Source of Variation   Degrees of Freedom   Sums of Squares   Mean Squares   F
Treatments            k − 1
Error                 nk − k
Total                 nk − 1

k = 3, n = 12

Source of Variation   Degrees of Freedom   Sums of Squares   Mean Squares   F
Treatments            3 − 1 = 2
Error                 (12 × 3) − 3 = 33
Total                 (12 × 3) − 1 = 35
Fertilizer dosage per plant (g)
        0     5    10
       12    14    16
       10    13    17
       13    14    14
       10    15    13
       14    13    17
        9    16    15
       12    14    13
       10    17    14
        9    14    12
       13    13    17
       12    16    15
       11    14    16
Total 135   173   179    Grand total 487
SStr = n × Σ (yi. − y..)²,  summed over i = 1, …, k

y1. = 135/12 = 11.25,  y2. = 173/12 = 14.42,  y3. = 179/12 = 14.91,  y.. = 487/36 = 13.52

SStr = 12 × [(11.25 − 13.52)² + (14.42 − 13.52)² + (14.91 − 13.52)²]
     = 12 × 7.907
     = 94.88
SStot = Σ Σ (yij − y..)²,  summed over all observations

SStot = [(12 − 13.52)² + (10 − 13.52)² + …… + (14 − 13.52)² + (13 − 13.52)² + …… + (16 − 13.52)² + (17 − 13.52)² + …… + (12 − 13.52)²]
      = 176.97

SSe = SStot − SStr
    = 176.97 − 94.88
    = 82.08
Source of Variation   Degrees of Freedom   Sums of Squares   Mean Squares   F
Treatments            3 − 1 = 2            94.88
Error                 (12 × 3) − 3 = 33    82.08
Total                 (12 × 3) − 1 = 35    176.97

Mean square = sum of squares / degrees of freedom

MStr = SStr / DFtr,  MSe = SSe / DFe
MStr = 94.88 / 2 = 47.44
MSe = 82.08 / 33 = 2.48
Source of Variation   Degrees of Freedom   Sums of Squares   Mean Squares   F
Treatments            3 − 1 = 2            94.88             47.44
Error                 (12 × 3) − 3 = 33    82.08             2.48
Total                 (12 × 3) − 1 = 35    176.97

F = MStr / MSe
  = 47.44 / 2.48
  = 19.07

Complete ANOVA Table

Source of Variation   Degrees of Freedom   Sums of Squares   Mean Squares   F
Treatments            3 − 1 = 2            94.88             47.44          19.07
Error                 (12 × 3) − 3 = 33    82.08             2.48
Total                 (12 × 3) − 1 = 35    176.97
Critical F value:
We have to know the numerator degrees of freedom and the denominator degrees of freedom.
The numerator of the F ratio is MStr and the denominator is MSe.
Therefore,
numerator degrees of freedom = treatment df = 2
denominator degrees of freedom = error df = 33

Therefore,
F(α, k − 1, nk − k) = F(0.05, 2, 33) ≈ 3.28

Calculated F (19.07) > critical F (3.28)

Therefore H0 can be rejected.
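As a cross-check, scipy can recompute the same F ratio directly from the raw data (a sketch, assuming scipy is available):

from scipy import stats

g0  = [12, 10, 13, 10, 14, 9, 12, 10, 9, 13, 12, 11]     # 0 g per plant
g5  = [14, 13, 14, 15, 13, 16, 14, 17, 14, 13, 16, 14]   # 5 g per plant
g10 = [16, 17, 14, 13, 17, 15, 13, 14, 12, 17, 15, 16]   # 10 g per plant

F, p = stats.f_oneway(g0, g5, g10)       # one-way ANOVA F statistic and p-value
crit = stats.f.ppf(0.95, 2, 33)          # critical F(0.05, 2, 33)
print(f"F = {F:.2f}, critical F = {crit:.2f}, p = {p:.4f}")   # F is about 19.07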
Experimental designs and one-way vs. two-way ANOVA

Experimental designs
The design of the experiment dictates how treatments are randomized among experimental subjects or units.
In the last experiment we had three treatments with 12 replicates in each treatment, and essentially the same conditions were applied to all experimental units except for the treatment.
This is the simplest way of designing an experiment. This experimental design is known as the completely randomized design (CRD).

Here, experimental units are arranged in a completely random way in the experimental area.
There are no known sources of variation except for the treatment effect and random error.
There are three main experimental designs.

1. Completely Randomized Design (CRD)


CRD is the simplest of all designs.
It is equivalent to a t-test when only two
treatments are examined.
Replications of treatments are assigned
completely at random to independent
experimental subjects.
Adjacent subjects could potentially have the
same treatment.
[Figure: example field layout in which different colors represent different treatments.]
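A minimal sketch of how a CRD assigns treatments completely at random (the treatment labels and replicate count are illustrative):

import numpy as np

rng = np.random.default_rng(1)
layout = np.repeat(["A", "B", "C"], 12)   # 3 treatments x 12 replicates = 36 units
rng.shuffle(layout)                       # any unit can receive any treatment
print(layout.reshape(6, 6))               # e.g. view the field as a 6 x 6 grid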
2. Randomized complete block design (RCBD)
In this design there is a known but uncontrolled source of variability among experimental units. Due to this variability the experimental units are not independent.
RCBD is equivalent to a paired t-test when only two treatments are examined.
To understand this we can take an example.

e.g. RCBD is especially relevant to agricultural field experiments.
The performance of 3 fertilizer brands on the yield of a soybean variety needs to be compared. However, the agricultural field has a slope in one direction. This slope can cause variability in yield among experimental units. Therefore we divide the land into blocks perpendicular to the direction of the variability.

[Figure: field divided into blocks laid out perpendicular to the direction of the known variability (the slope).]

Then treatments are allocated randomly within each block.
Each block has a single experimental unit for each treatment; therefore the number of replicates equals the number of blocks.
Since treatment allocation within a block is completely randomized, this design is known as the RCBD.
Treatment (fertilizer) layout across blocks:

Block 1:  F1B1  F3B1  F2B1
Block 2:  F2B2  F3B2  F1B2
Block 3:  F2B3  F1B3  F3B3
Block 4:  F3B4  F2B4  F1B4
Block 5:  F1B5  F2B5  F3B5
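A minimal sketch of RCBD randomization, where each block receives every treatment once in an independently shuffled order (labels are illustrative):

import numpy as np

rng = np.random.default_rng(1)
treatments = ["F1", "F2", "F3"]
for block in range(1, 6):                    # 5 blocks
    order = rng.permutation(treatments)      # shuffle within this block only
    print(f"Block {block}:", " ".join(order))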


e.g. 1. The results obtained for the above experiment are given below. Yields are in bushels per hectare.

            Fertilizer
Block       1      2      3
  1       110    134    124
  2       125    142    128
  3       124    147    131
  4       118    145    127
  5       119    136    125
Hypothesis:
H0 : µ1 = µ2 = µ3
Ha : at least one µi is different
Rejection criteria:
If calculated F > critical F, H0 can be rejected;
otherwise H0 cannot be rejected.
Test statistic:
ANOVA TABLE

Source of Variation   Degrees of Freedom               Sums of Squares   Mean Squares   F
Treatments            k − 1
Blocks                b − 1
Error                 (bk − 1) − [(k − 1) + (b − 1)]
Total                 bk − 1

SStr = b × Σ (yi. − y..)²,   summed over treatments i = 1, …, k
SSblk = k × Σ (y.j − y..)²,  summed over blocks j = 1, …, b
SStot = Σ Σ (yij − y..)²,    summed over all observations
SSe = SStot − SStr − SSblk
                 Fertilizer
Block        1        2        3      Block mean   Deviation   Squared deviation
  1        110      134      124        122.66       −6.33          40.10
  2        125      142      128        131.66        2.66           7.11
  3        124      147      131        134.00        5.00          25.00
  4        118      145      127        130.00        1.00           1.00
  5        119      136      125        126.66       −2.33           5.44
Mean     119.2    140.8    127.0      (grand mean 129)
Deviation −9.8     11.8     −2.0
Squared  96.04   139.24     4.00

SStr = 5 × (96.04 + 139.24 + 4.00) = 5 × 239.28 = 1196.4
SSblk = 3 × (40.10 + 7.11 + 25.00 + 1.00 + 5.44) = 3 × 78.65 ≈ 236.0
SStot = 1496
SSe = 1496 − (1196.4 + 236.0) = 63.6
Source of Variation   Degrees of Freedom   Sums of Squares   Mean Squares   F
Treatments            2                    1196.4            598.2          75.2
Blocks                4                    236.0             59.0
Error                 8                    63.6              7.95
Total                 14                   1496

Critical F value:
F(α, 2, 8) = F(0.05, 2, 8) = 4.46
Calculated F (75.2) > critical F (4.46)
Therefore H0 can be rejected.

That means fertilizers have different effects.
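A minimal sketch of these RCBD sums of squares with numpy, following the formulas above (rows are blocks, columns are fertilizers; variable names are illustrative):

import numpy as np

y = np.array([[110, 134, 124],
              [125, 142, 128],
              [124, 147, 131],
              [118, 145, 127],
              [119, 136, 125]], dtype=float)
b, k = y.shape                                   # 5 blocks, 3 treatments
gm = y.mean()                                    # grand mean = 129

ss_tr  = b * np.sum((y.mean(axis=0) - gm) ** 2)  # treatment sum of squares
ss_blk = k * np.sum((y.mean(axis=1) - gm) ** 2)  # block sum of squares
ss_tot = np.sum((y - gm) ** 2)
ss_e   = ss_tot - ss_tr - ss_blk

F = (ss_tr / (k - 1)) / (ss_e / ((b - 1) * (k - 1)))
print(round(ss_tr, 1), round(ss_blk, 1), round(ss_e, 1), round(F, 1))   # 1196.4 236.0 63.6 75.2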


3. Factorial design
The designs discussed above have only one factor, e.g. variety or fertilizer.
In a factorial design we compare different levels (treatments) of two factors.
At the same time we can estimate the interaction effect of the two factors.

As an example, if we want to compare the effect of fertilizer and the effect of hormone on the yield of soybean plants, we can do a single experiment with treatments consisting of combinations of fertilizer levels and hormone levels.
Three fertilizer levels (0, 5 and 10 g per plant) and three hormone levels (0, 100 and 200 ppm) were used.
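A minimal sketch of the resulting 3 × 3 factorial treatment structure (the levels come from the example; the code itself is illustrative):

from itertools import product

fertilizer = [0, 5, 10]     # g per plant
hormone = [0, 100, 200]     # ppm

combinations = list(product(fertilizer, hormone))
print(len(combinations))    # 9 treatment combinations
for f, h in combinations:
    print(f"fertilizer {f} g + hormone {h} ppm")

Each of the 9 combinations would then be replicated, and a two-way ANOVA (factor effects plus their interaction) fitted to the yield data.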
