
ANALYSIS OF VARIANCE

(ANOVA)
Learning Objectives:
At the end of this lesson, the students should be able to:
1. Define the one-way Analysis of Variance (ANOVA).
2. Recognize the characteristics of the F distribution.
3. Use the one-way ANOVA technique to determine if there is a
significant difference among three or more means.
ANALYSIS OF VARIANCE (ANOVA)
- refers to a technique in which the F test is used to test a
hypothesis concerning the means of three or more populations

- uses two estimates of the population variance: the between-group
variance (based on the variance of the sample means) and the
within-group variance (based on the variability of the values
inside each group, pooled over all groups).

* The one-way analysis of variance test is used to test the equality
of three or more means using sample variances.
Statistical analysis with software applications by McGraw Hill Education
Characteristics of the F distribution

1. The values of F cannot be negative, because variances are
always positive or zero.
2. The distribution is positively skewed.
3. The mean value of F is approximately equal to 1.
4. The F distribution is a family of curves based on the degrees
of freedom of the variance of the numerator and the degrees of
freedom of the variance of the denominator.
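
The characteristics above can be checked numerically. The sketch below uses only Python's standard library and simulates F values as the ratio of two scaled chi-square variates. The degrees of freedom (2 and 12), the seed, and the number of draws are illustrative choices, not values taken from the lesson. For denominator degrees of freedom greater than 2, the exact mean of F is d.f.D. / (d.f.D. - 2), which approaches 1 as d.f.D. grows.

```python
import random
import statistics


def sample_f(dfn, dfd, rng):
    """Draw one F value as a ratio of two scaled chi-square variates."""
    # A chi-square variate with k degrees of freedom is Gamma(shape=k/2, scale=2).
    chi2_num = rng.gammavariate(dfn / 2, 2)
    chi2_den = rng.gammavariate(dfd / 2, 2)
    return (chi2_num / dfn) / (chi2_den / dfd)


rng = random.Random(0)   # fixed seed so the run is repeatable
dfn, dfd = 2, 12         # illustrative degrees of freedom (assumption)
draws = [sample_f(dfn, dfd, rng) for _ in range(100_000)]

sample_mean = statistics.fmean(draws)
sample_median = statistics.median(draws)
theoretical_mean = dfd / (dfd - 2)   # valid only when dfd > 2

print(f"smallest simulated F : {min(draws):.4f}")      # never negative
print(f"sample median        : {sample_median:.3f}")   # below the mean: right skew
print(f"sample mean          : {sample_mean:.3f}")
print(f"theoretical mean     : {theoretical_mean:.3f}")
```

A median that sits below the mean is one simple sign of the positive skew described in item 2.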


The analysis of variance is a powerful procedure for testing the
homogeneity of a set of means. However, if we reject the null
hypothesis and accept the alternative hypothesis (that not all the
means are equal), we still do not know which of the population
means are equal and which are different. The F test relies on the
following assumptions.
Assumptions for the F test for comparing Three or
More Means
1. The populations from which the samples were
obtained must be normally or approximately normally
distributed.
2. The samples must be independent of one another.
3. The variance of the populations must be equal.
4. The samples must be simple random samples, one
from each of the populations.

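One-way ANOVA notation for a sample of N observations and J treatments.
The data are laid out with one column per treatment/group (1, 2, 3, 4, ..., j)
and one row per observation (1, 2, 3, ..., i). The cell in row i and column j
holds the value $X_{ij}$, and the bottom row holds the column means
$\bar{X}_1, \bar{X}_2, \bar{X}_3, \bar{X}_4, \ldots, \bar{X}_j$.
where:
J = total number of treatments/groups (columns)
$n_j$ = number of observations in the jth column
N = total number of observations in the sample
$X_{ij}$ = value in the ith row and jth column
$\bar{X}_j$ = mean of the values in the jth column
$\bar{\bar{X}}$ = grand mean, the mean of all N observations (equal to the mean
of the column means when every column has the same number of observations)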
Formula to determine the value of the F ratio:
$$F = \frac{MSC}{MSE}$$
where $MSC = \dfrac{SSC}{J - 1}$ and $MSE = \dfrac{SSE}{N - J}$

• Mean Square Column (MSC) also called the Mean Square Between,
measures the amount of variability between the columns or the
explained variability.
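Sum of Squares Column (SSC) yields the sum of squares between
treatments.
$$SSC = \sum_{j} n_j \left(\bar{X}_j - \bar{\bar{X}}\right)^2$$
where $n_j$ is the number of observations in column j and $\bar{\bar{X}}$ is the grand mean.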
• Mean Square Error (MSE) also called the Mean Square Within,
measures the amount of variability within the columns or the
unexplained variability.

Sum of Squares Error (SSE) yields the variation within columns.

$$SSE = \sum_{j} \sum_{i} \left(X_{ij} - \bar{X}_j\right)^2$$

Note: Sum of Squares Total ---> SST = SSC + SSE



The degrees of freedom for this F test are d.f.N. = J – 1, where J is
the number of groups (columns), and d.f.D. = N – J, where N is the
sum of the sample sizes of the groups, N = n1 + n2 + … + nJ. The
sample sizes need not be equal. The F test to compare means is
always right-tailed.

Procedures in finding the value of F-ratio

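1. To get the value of SSC:

a. Determine the mean for each column.
b. Find the grand mean by adding all the observations and dividing
by the total number of observations N. (When every column has the
same number of observations, this equals the sum of the column
means divided by the number of columns.)
c. Subtract the grand mean from each of the column means.
d. Square each difference and multiply it by the number of
observations n in that column.
e. Find the sum. The sum is the value of SSC.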

2. To get the value of SSE:


a. Subtract the mean of each column from the values in that column.
b. Square each of the differences.
c. Add all of them. The sum is the value of SSE.
3. To get the value of MSC, divide SSC by the number of columns
less 1. The quotient is the value of MSC.
4. To get the value of MSE, divide SSE by the total number of
observations less the number of columns. The quotient is the value of MSE.
5. To find the value of F, divide MSC by MSE. The quotient is the
computed value of F.
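
The procedure above can be sketched as a short Python function. This is a minimal illustration using only the standard library, not a replacement for Excel or JASP. The function name `one_way_anova` and the dictionary keys are hypothetical, and the sample call uses the CD sales figures from Example 1 of this lesson.

```python
from statistics import fmean


def one_way_anova(groups):
    """Return SSC, SSE, MSC, MSE and F for a list of samples (one list per column)."""
    J = len(groups)                      # number of columns (treatments)
    N = sum(len(g) for g in groups)      # total number of observations
    column_means = [fmean(g) for g in groups]
    # Grand mean over all N observations; with equal column sizes this
    # is the same as the mean of the column means.
    grand_mean = fmean(x for g in groups for x in g)

    # Step 1: between-column sum of squares.
    ssc = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, column_means))
    # Step 2: within-column sum of squares.
    sse = sum((x - m) ** 2 for g, m in zip(groups, column_means) for x in g)
    # Steps 3 to 5: mean squares and the F ratio.
    msc = ssc / (J - 1)
    mse = sse / (N - J)
    return {"SSC": ssc, "SSE": sse, "MSC": msc, "MSE": mse, "F": msc / mse}


# CD sales from Example 1 (rap, jazz, rock).
rap = [29, 27, 30, 27, 28]
jazz = [32, 33, 31, 34, 30]
rock = [25, 24, 24, 25, 26]

result = one_way_anova([rap, jazz, rock])
for name, value in result.items():
    print(f"{name} = {value:.2f}")
```

Run on the Example 1 data, the sketch gives SSC of about 129.73, SSE of 19.6, and F of about 39.71. The hand computation in Example 1 reports 39.80 because it divides the rounded mean squares 64.87 and 1.63; the decision is the same either way.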
The one-way analysis of variance follows the regular five-step hypothesis
testing procedure.

Step 1 State the hypotheses.


Step 2 Find the critical values.
Step 3 Compute the test value.
Step 4 Make the decision.
Step 5 Summarize the results.

Note: In the succeeding examples, assume that all variables are normally distributed, that the samples are
independent, that the population variances are equal, and that the samples are simple random
samples, one from each of the populations.

EXAMPLE 1
Noble Records' highest-selling music CD categories over five days
are rap, jazz, and rock. The table shows the number of CDs sold
each day.

RAP   JAZZ   ROCK
29    32     25
27    33     24
30    31     24
27    34     25
28    30     26

Use the F test to determine whether there is a significant
difference among the means at 𝛼 = .05.
Step 1 State the hypothesis and identify the claim.
H0: µrap = µjazz = µrock
H1: Not all µ’s are equal/ at least one mean is
different from the others (claim).

Step 2 Find the critical value. Since J = 3, N = 15, and 𝛼 = .05,
d.f.N. = J – 1 = 3 – 1 = 2 and d.f.D. = N – J = 15 – 3 = 12.
The critical value is 3.89 (see F distribution, slide 35).
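
When the numerator has exactly 2 degrees of freedom, as in this example, the right-tail area of the F distribution has a simple closed form, P(F > x) = (1 + 2x / d.f.D.)^(-d.f.D. / 2), so the table value can be cross-checked without special software. The sketch below relies on that special case only. The function names are hypothetical, and the formula does not apply to other numerator degrees of freedom, where the F table or statistical software is still needed.

```python
def f_upper_tail_dfn2(x, dfd):
    """P(F > x) for an F distribution with 2 numerator d.f. and dfd denominator d.f."""
    return (1 + 2 * x / dfd) ** (-dfd / 2)


def f_critical_dfn2(alpha, dfd):
    """Value c such that P(F > c) = alpha, valid only for 2 numerator d.f."""
    return (dfd / 2) * (alpha ** (-2 / dfd) - 1)


alpha, J, N = 0.05, 3, 15        # values from Example 1
dfn, dfd = J - 1, N - J          # 2 and 12

critical = f_critical_dfn2(alpha, dfd)
print(f"d.f.N. = {dfn}, d.f.D. = {dfd}")
print(f"critical value = {critical:.2f}")
# Plugging the critical value back in should recover alpha.
print(f"tail area at critical value = {f_upper_tail_dfn2(critical, dfd):.4f}")
```

The computed value rounds to the 3.89 read from the F table.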
Step 3 Compute the test value.
a. Find the mean for each sample and the grand mean.
$\bar{X}_1$ (rap) = 28.2, $\bar{X}_2$ (jazz) = 32, $\bar{X}_3$ (rock) = 24.8, and
$\bar{\bar{X}}$ (grand mean) = 28.33
b. Find the SSC, SSE and SST.
$$\begin{aligned}
SSC &= \sum_{j} n_j \left(\bar{X}_j - \bar{\bar{X}}\right)^2 \\
    &= 5(28.2 - 28.33)^2 + 5(32 - 28.33)^2 + 5(24.8 - 28.33)^2 \\
    &= 129.73
\end{aligned}$$

$$\begin{aligned}
SSE &= \sum_{j} \sum_{i} \left(X_{ij} - \bar{X}_j\right)^2 \\
    &= (29 - 28.2)^2 + (27 - 28.2)^2 + (30 - 28.2)^2 + (27 - 28.2)^2 + (28 - 28.2)^2 \\
    &\quad + (32 - 32)^2 + (33 - 32)^2 + (31 - 32)^2 + (34 - 32)^2 + (30 - 32)^2 \\
    &\quad + (25 - 24.8)^2 + (24 - 24.8)^2 + (24 - 24.8)^2 + (25 - 24.8)^2 + (26 - 24.8)^2 \\
    &= 19.6
\end{aligned}$$

SST = SSC + SSE = 129.73 + 19.6 = 149.33


c. Find MSC and MSE.
$$MSC = \frac{SSC}{J - 1} = \frac{129.73}{2} = 64.87$$

$$MSE = \frac{SSE}{N - J} = \frac{19.60}{12} = 1.63$$

d. Find the F test value.
$$F = \frac{MSC}{MSE} = \frac{64.87}{1.63} = 39.80$$
Step 4 Make the decision.
The test value 39.80 > 3.89, so the decision is to
reject the null hypothesis.

Step 5 Summarize the results.


There is enough evidence to conclude that at least
one mean is different from the others. Not all means
are equal.
EXAMPLE 2
An experiment was conducted to determine whether any significant differences exist in the
strength of parachutes woven from synthetic fibers from different suppliers. Five
parachutes were woven for each group: Supplier 1, Supplier 2, Supplier 3, and Supplier
4. The strength of the parachutes is measured by placing them in a testing device that
pulls on both ends of a parachute until it tears apart. The amount of force required to tear
the parachute is measured on a tensile-strength scale, where the larger the value, the
stronger the parachute. The table contains the results of this experiment (in terms of
tensile strength). At 𝛼 = .05, determine whether these sample results are sufficiently
different to conclude that the population means are not all equal.

Supplier 1   Supplier 2   Supplier 3   Supplier 4
18.5         26.3         20.6         25.4
24.0         25.3         25.2         19.9
17.2         24.0         20.8         22.6
19.9         21.2         24.7         17.5
18.0         24.5         22.9         20.4
Step 1 State the hypothesis and identify the claim.
H0: µ1 = µ2 = µ3 = µ4
H1: Not all the means are equal (claim). At least one of
the suppliers differs with respect to the mean tensile
strength.

Step 2 Find the critical value. Since J = 4, N = 20, and 𝛼 = .05,
d.f.N. = J – 1 = 4 – 1 = 3 and d.f.D. = N – J = 20 – 4 = 16.
The critical value is 3.24 (see F distribution, slide 35).
Step 3 Compute the test value.
a. Find the mean for each sample and the grand mean.

$\bar{X}_1$ = 19.52, $\bar{X}_2$ = 24.26, $\bar{X}_3$ = 22.84, $\bar{X}_4$ = 21.16, and
$\bar{\bar{X}}$ (grand mean) = 21.945

b. Find the SSC, SSE and SST.

$$SSC = \sum_{j} n_j \left(\bar{X}_j - \bar{\bar{X}}\right)^2 = 63.2855$$

$$SSE = \sum_{j} \sum_{i} \left(X_{ij} - \bar{X}_j\right)^2 = 97.5040$$

SST = SSC + SSE = 63.2855 + 97.5040 = 160.7895
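
The identity SST = SSC + SSE can be checked by computing the total sum of squares directly from the grand mean. The sketch below does this for the parachute data of Example 2 with Python's standard library; the variable names are illustrative.

```python
from statistics import fmean

# Tensile-strength data from Example 2.
suppliers = {
    "Supplier 1": [18.5, 24.0, 17.2, 19.9, 18.0],
    "Supplier 2": [26.3, 25.3, 24.0, 21.2, 24.5],
    "Supplier 3": [20.6, 25.2, 20.8, 24.7, 22.9],
    "Supplier 4": [25.4, 19.9, 22.6, 17.5, 20.4],
}

all_values = [x for xs in suppliers.values() for x in xs]
grand_mean = fmean(all_values)

# Between-column and within-column sums of squares.
ssc = sum(len(xs) * (fmean(xs) - grand_mean) ** 2 for xs in suppliers.values())
sse = sum((x - fmean(xs)) ** 2 for xs in suppliers.values() for x in xs)

# Total sum of squares computed directly, without using SSC or SSE.
sst_direct = sum((x - grand_mean) ** 2 for x in all_values)

print(f"grand mean  = {grand_mean:.3f}")
print(f"SSC         = {ssc:.4f}")
print(f"SSE         = {sse:.4f}")
print(f"SSC + SSE   = {ssc + sse:.4f}")
print(f"SST direct  = {sst_direct:.4f}")
```

Both routes give the same total, up to floating-point rounding, which matches the 160.7895 obtained by hand.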


c. Find MSC and MSE.
$$MSC = \frac{SSC}{J - 1} = \frac{63.2855}{3} = 21.0952$$

$$MSE = \frac{SSE}{N - J} = \frac{97.5040}{16} = 6.0940$$

d. Find the F test value.
$$F = \frac{MSC}{MSE} = \frac{21.0952}{6.0940} = 3.4616$$
Step 4 Make the decision.
The test value 3.4616 is greater than the critical value
3.24, so the decision is to reject the null hypothesis.

Step 5 Summarize the results.


There is enough evidence to conclude that there is a
significant difference in the mean tensile strength
among the four suppliers. Not all the means are equal.
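
The arithmetic from the sums of squares to the decision can be traced in a few lines. The sketch below starts from the SSC and SSE values computed by hand for Example 2 and takes the critical value 3.24 from the F table as given; it does not compute the critical value itself, and the variable names are illustrative.

```python
# Values taken from the worked solution of Example 2.
ssc, sse = 63.2855, 97.5040
J, N = 4, 20
critical_value = 3.24            # F table, alpha = .05, d.f. 3 and 16

msc = ssc / (J - 1)              # mean square between columns
mse = sse / (N - J)              # mean square within columns
f_value = msc / mse

# The F test for means is right-tailed: reject H0 when F exceeds the critical value.
decision = "reject H0" if f_value > critical_value else "do not reject H0"

print(f"MSC = {msc:.4f}")
print(f"MSE = {mse:.4f}")
print(f"F   = {f_value:.4f}")
print(f"Decision: {decision}")
```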
Excel Solution (Example 2)

1. Enter the data below in columns A, B, C and D of an
Excel worksheet.
2. From the tool bar, select Data,
then Data Analysis.
3. Select Anova : Single Factor.
4. Type in A2:D6 in the Input Range
box.
5. Check Grouped By:Columns.
6. Type 0.05 for the Alpha level.
7. Under the Output options, check
the Output Range and type G1.
8. Click [OK].
Excel Solution (Example 2): The results of the ANOVA are
shown below.
JASP data entry and output (Example 2)
Group Work

Answer the following exercises on page 234 in the prescribed
textbook:
a.) #8 and #9 (using five steps)
b.) #10 and #11 (show Excel solution)
c.) #12
#8. Sodium Contents of Foods. The amount of sodium (in milligrams)
in one serving for a random sample of three different kinds of food
is listed. At the 0.05 level of significance, is there sufficient
evidence to conclude that a difference in mean sodium amounts
exists among condiments, cereals, and desserts?
Condiments Cereals Desserts
270 260 100
130 220 180
230 290 250
180 290 250
80 200 300
70 320 360
200 140 300
160
Source: The Doctor’s Pocket Calorie, Fat and Carbohydrate Counter.



#9. Hybrid Vehicles. A study was done before the recent surge in
gasoline prices to compare the cost to drive 25 miles for
different types of hybrid vehicles. The cost of a gallon of gas at
the time of the study was approximately $2.50. Based on the
information given for different models of hybrid cars, trucks, and
SUVs, is there sufficient evidence to conclude a difference in
the mean cost to drive 25 miles? Use 𝛼 = .05.
Hybrid cars Hybrid SUVs Hybrid trucks
2.10 2.10 3.62
2.70 2.42 3.43
1.67 2.25
1.67 2.10
1.30 2.25
Source: www.fueleconomy.com



#10. Healthy Eating. Americans appear to be eating healthier.
Between 1970 and 2013 the per capita consumption of
broccoli increased 1200.5% from 0.5 to 6.4 pounds. A
nutritionist followed a group of people randomly assigned to
one of three groups and noted their monthly broccoli intake
(in pounds). At 𝛼 = .05, is there a difference in means?
Group A   Group B   Group C
2.0       2.0       3.7
1.5       1.5       2.5
0.75      4.0       4.0
1.0       3.0       5.1
1.3       2.5       3.8
3.0       2.0       2.9
Source: World Almanac



#11. Movie Theater Attendance. The data shown are the weekly
admissions, in millions, of people attending movie theaters
over three different time periods. At 𝛼 = .05, is there a
difference in the means for the weekly attendance for these
time periods?

1950-1974   1975-1990   1991-2000
58.0        17.1        23.3
39.9        19.9        26.6
25.1        19.6        27.7
19.8        20.3        26.5
17.7        22.9        25.8
Source: Motion Picture Association



#12. Weight Gain of Athletes. A researcher wishes to see whether there is
any difference in the weight gains of athletes following one of three
special diets. Athletes are randomly assigned to three groups and
placed on the diet for 6 weeks. The weight gains (in pounds) are shown
here. At 𝛼 = .05, can the researcher conclude that there is a difference
in the diets?
Diet A Diet B Diet C
3 10 8
6 12 3
7 11 2
4 14 5
8
6
A computer printout for this problem is shown. Use the P-value
method and the information in this printout to test the claim.
Computer Printout for Exercise 12.



F Distribution
References:
Statistical analysis with software applications by McGraw Hill Education

Bluman, A. G. (2013). Elementary statistics: A step by step approach: A
brief version (No. 519.5 B585E.). McGraw-Hill.

David, M. (2017). Statistics for managers, using Microsoft excel. Pearson
Education India.

Manalo, R. A. & Noble, N. M. (2011). Business Statistics. A.L.O.H.A.
Alternative Learning Systems, Inc. Manila.

Research By Design. (2020, May 12). How to do an one-way ANOVA in
JASP (12-8). [Video]. Youtube. https://youtu.be/UTOviKjMqBU
