
Analysis of Variance

ANOVA
Team Details -
Harshit Kapoor
Hetvi Gandhi
Kanchan Choudhury (Team Leader)
Kalpesh Patil
Kanishka Bojewar
Kaustubh Shrivastava
Contents
● What is ANOVA?
● Why ANOVA?
● Analysis of Variance (ANOVA)
● Assumptions of ANOVA
● Hypothesis of one-way ANOVA
- Null Hypothesis
- Alternative Hypothesis
● Partitioning the variation
- Squared Sum Total
- Squared Sum among group
- Squared Sum within group
● Obtaining the mean squares
● One Way ANOVA (F-Test)
● One Way ANOVA Table
● Examples
● ANOVA made easy
● Conclusion
● References

What is ANOVA?
ANOVA is a procedure for testing whether the means of several groups of
data differ, that is, whether the groups are homogeneous.

The essence of ANOVA is that the total amount of variation in a set of data is
broken down into two types -
● The amount which can be attributed to chance, and
● The amount which can be attributed to specified causes.

Why ANOVA?

Do all schools, with 1,000 children each, have the same mean IQ?
Is that even realistic?

Analysis of variance

● One Way ANOVA
● Two Way ANOVA
● Randomized Block Design

ANOVA Assumptions

● The observations come from a random sample and are independent
of each other
● The observations are normally distributed within each group
● The variances are approximately equal across groups
● The groups do not need to have equal sample sizes
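
The equal-variance assumption can be screened informally by comparing the sample variances of the groups. The sketch below does this in Python for the marks used in the worked example of this deck. The variable names are illustrative, and the largest-to-smallest ratio is only a rough heuristic, not a formal test (Levene's test is one formal alternative).

```python
from statistics import variance

# Marks from the worked example in this deck (schools A, B, C).
groups = {
    "A": [9, 11, 13, 9, 8],
    "B": [13, 12, 10, 15, 5],
    "C": [14, 13, 17, 7, 9],
}

# Sample variance (divisor n_j - 1) of each group.
variances = {name: variance(marks) for name, marks in groups.items()}

# Ratio of the largest to the smallest variance: an informal screen only.
ratio = max(variances.values()) / min(variances.values())

for name, var in variances.items():
    print(f"group {name}: n = {len(groups[name])}, sample variance = {var}")
print(f"largest / smallest variance = {ratio}")
```

A ratio close to 1 suggests similar spread in every group; how large a ratio is tolerable is a judgment call that depends on the sample sizes.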

Hypothesis of one-way ANOVA

● Null Hypothesis - H0 : µ1 = µ2 = µ3 = · · · = µk*

No variation in mean among groups

● Alternative Hypothesis - HA : not all of µ1, µ2, ..., µk are equal
(at least one pair has µi ≠ µj)

Not all of the population means are equal

*k represents the number of independent comparison groups

Null Hypothesis

● Scenario: the Null Hypothesis is true

● It proposes that there is no difference in a certain
characteristic across the populations being compared

● Since the means of the three populations are equal, the
sample means are expected to be close together

● When the variability among the sample means is small, the data are consistent with H0

*Population mean = µ
Sample mean = x̄

Alternative Hypothesis

● Scenario: the Null Hypothesis is NOT true

● It captures all possible situations other than the equality of all
means specified in the null hypothesis.

● Since the population means are unequal, the sample means are
expected to be far apart.

● When the variability among the sample means is large, the data favor HA

Partitioning the Variation

SST = SSA + SSW

SST = Total Sum of Squares (Total variation)
SSA = Sum of Squares Among Groups (Among-group variation)
SSW = Sum of Squares Within Groups (Within-group variation)

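Total Sum of Squares

$$\mathrm{SST} = (X_{11} - \bar{\bar{X}})^2 + (X_{12} - \bar{\bar{X}})^2 + \dots + (X_{c n_c} - \bar{\bar{X}})^2 = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ji} - \bar{\bar{X}})^2$$

where
$X_{ji}$ = observation $i$ in group $j$
$c$ = number of groups or levels
$n_j$ = number of observations in group $j$
$\bar{\bar{X}}$ = grand mean (mean of all data values)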
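
As a minimal sketch of the total sum of squares, the Python below pools every observation, takes the grand mean, and sums the squared deviations. The data are the marks from the worked example in this deck; the function name `total_sum_of_squares` is illustrative.

```python
# Marks from the worked example in this deck, one list per group.
groups = [
    [9, 11, 13, 9, 8],     # school A
    [13, 12, 10, 15, 5],   # school B
    [14, 13, 17, 7, 9],    # school C
]

def total_sum_of_squares(groups):
    """SST: squared deviation of every observation from the grand mean."""
    values = [x for group in groups for x in group]
    grand_mean = sum(values) / len(values)
    return sum((x - grand_mean) ** 2 for x in values)

sst = total_sum_of_squares(groups)
print(f"SST = {sst}")
```

For these marks the result agrees with the total of 148 reported in the example's ANOVA table.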

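Among - group Variation

$$\mathrm{SSA} = n_1(\bar{X}_1 - \bar{\bar{X}})^2 + n_2(\bar{X}_2 - \bar{\bar{X}})^2 + \dots + n_c(\bar{X}_c - \bar{\bar{X}})^2 = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$$

where
$c$ = number of groups
$n_j$ = sample size from group $j$
$\bar{X}_j$ = sample mean from group $j$
$\bar{\bar{X}}$ = grand mean (mean of all data values)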
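
The among-group sum of squares weights each group's squared distance from the grand mean by the group size. A short Python sketch under that definition, again using the marks from the worked example (the function name is illustrative):

```python
# Marks from the worked example in this deck, one list per group.
groups = [
    [9, 11, 13, 9, 8],     # school A
    [13, 12, 10, 15, 5],   # school B
    [14, 13, 17, 7, 9],    # school C
]

def among_group_sum_of_squares(groups):
    """SSA: n_j * (group mean - grand mean)^2, summed over the groups."""
    values = [x for group in groups for x in group]
    grand_mean = sum(values) / len(values)
    return sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )

ssa = among_group_sum_of_squares(groups)
print(f"SSA = {ssa}")
```

With group means 10, 11 and 12 and a grand mean of 11, this reproduces the value SSA = 10 computed by hand in the example.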


Variation due to differences among groups (populations with means μi and μj), expressed as a mean square:

$$\mathrm{MSA} = \frac{\mathrm{SSA}}{c - 1} = \frac{\mathrm{SSA}}{\text{degrees of freedom}}$$

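Within - group Variation

$$\mathrm{SSW} = (X_{11} - \bar{X}_1)^2 + (X_{12} - \bar{X}_1)^2 + \dots + (X_{c n_c} - \bar{X}_c)^2 = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ji} - \bar{X}_j)^2$$

where
$X_{ji}$ = observation $i$ in group $j$
$c$ = number of groups
$n_j$ = sample size from group $j$
$\bar{X}_j$ = sample mean from group $j$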
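
The within-group sum of squares measures each observation against its own group mean rather than the grand mean. A minimal Python sketch, using the marks from the worked example in this deck (the function name is illustrative):

```python
# Marks from the worked example in this deck, one list per group.
groups = [
    [9, 11, 13, 9, 8],     # school A
    [13, 12, 10, 15, 5],   # school B
    [14, 13, 17, 7, 9],    # school C
]

def within_group_sum_of_squares(groups):
    """SSW: squared deviation of each observation from its own group mean."""
    total = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        total += sum((x - group_mean) ** 2 for x in group)
    return total

ssw = within_group_sum_of_squares(groups)
print(f"SSW = {ssw}")
```

The three per-group sums are 16, 58 and 64, so the total matches the SSW = 138 obtained in the example.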

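Summing the variation within each group and then adding over all groups gives SSW; dividing by its degrees of freedom gives the mean square, where n is the total number of observations:

$$\mathrm{MSW} = \frac{\mathrm{SSW}}{n - c} = \frac{\mathrm{SSW}}{\text{degrees of freedom}}$$
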
Obtaining the Mean Squares
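
Each mean square is a sum of squares divided by its degrees of freedom, matching the entries of the one-way ANOVA table in this deck. Written out in LaTeX, with c groups and n observations in total:

```latex
\[
\mathrm{MSA} = \frac{\mathrm{SSA}}{c - 1},
\qquad
\mathrm{MSW} = \frac{\mathrm{SSW}}{n - c}
\]
```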

One Way ANOVA F Test
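
The test statistic is the ratio of the two mean squares. In LaTeX notation, and assuming the usual convention that it is compared against an F distribution with c - 1 and n - c degrees of freedom:

```latex
\[
F_{\mathrm{STAT}} = \frac{\mathrm{MSA}}{\mathrm{MSW}}
\]
```

A large ratio means the group means are spread out more than the within-group noise would explain.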

One Way ANOVA Table

| Source of Variation | Degrees of Freedom | Sum of Squares | Mean Square (Variance) | F Test |
|---|---|---|---|---|
| Among Groups | c - 1 | SSA | MSA = SSA / (c - 1) | FSTAT = MSA / MSW |
| Within Groups | n - c | SSW | MSW = SSW / (n - c) | |
| Total | n - 1 | SST | | |

Interpreting One-way ANOVA
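Decision Rule

Reject H0 if FSTAT > F(tabulated), where F(tabulated) is the critical value at the chosen significance level with c - 1 and n - c degrees of freedom

Otherwise, do not reject H0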
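
The decision rule is a single comparison, sketched below in Python. The function name `anova_decision` is illustrative, and the critical value is passed in by the caller because it has to be looked up in an F table. The numbers in the usage line are the ones from this deck's example: MSA = 5, MSW = 11.5, and a tabulated value of 3.89.

```python
def anova_decision(f_stat, f_critical):
    """Reject H0 only when the F statistic exceeds the critical value."""
    if f_stat > f_critical:
        return "reject H0"
    return "do not reject H0"

# Values from the worked example in this deck:
# MSA = 5, MSW = 11.5, tabulated F at significance level 0.05 is 3.89.
decision = anova_decision(f_stat=5 / 11.5, f_critical=3.89)
print(decision)
```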

Example
To analyse performance in a math test across the schools in a city, a common test was
given to students selected at random from the fifth class of three schools in the city. The marks of the
students are given in the table.

(significance level = 0.05)

| A | B | C |
|---|---|---|
| 9 | 13 | 14 |
| 11 | 12 | 13 |
| 13 | 10 | 17 |
| 9 | 15 | 7 |
| 8 | 5 | 9 |

H0: μ1 = μ2 = μ3
H1: μ1, μ2, μ3 are not all equal

X̄A = (9 + 11 + 13 + 9 + 8) / 5 = 10

X̄B = (13 + 12 + 10 + 15 + 5) / 5 = 11

X̄C = (14 + 13 + 17 + 7 + 9) / 5 = 12

x = (X̄A + X̄B + X̄C) / 3 = 11 (This is the grand mean)
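
Averaging the three group means gives the grand mean here only because every school contributes the same number of students. The Python sketch below illustrates the difference. The `unbalanced` data set is hypothetical (school C cut down to two students) and the function names are illustrative.

```python
def grand_mean(groups):
    """Mean of all observations pooled together."""
    values = [x for group in groups for x in group]
    return sum(values) / len(values)

def mean_of_group_means(groups):
    """Unweighted average of the group means."""
    return sum(sum(group) / len(group) for group in groups) / len(groups)

# Marks from the worked example: three groups of five.
balanced = [[9, 11, 13, 9, 8], [13, 12, 10, 15, 5], [14, 13, 17, 7, 9]]

# Hypothetical unbalanced variant: school C has only two students.
unbalanced = [[9, 11, 13, 9, 8], [13, 12, 10, 15, 5], [14, 13]]

print(grand_mean(balanced), mean_of_group_means(balanced))
print(grand_mean(unbalanced), mean_of_group_means(unbalanced))
```

With unequal group sizes the two quantities differ, so the grand mean should be taken over the pooled observations.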

X̄A = 10, n(A) = 5
X̄B = 11, n(B) = 5
X̄C = 12, n(C) = 5

n = 15 (n(A) + n(B) + n(C))
c = 3 (no. of columns)
x = 11

| (A - X̄A)² | (B - X̄B)² | (C - X̄C)² |
|---|---|---|
| 1 | 4 | 4 |
| 1 | 1 | 1 |
| 9 | 1 | 25 |
| 1 | 16 | 25 |
| 4 | 36 | 9 |
| Sum = 16 | Sum = 58 | Sum = 64 |

SSA = 5(10 - 11)² + 5(11 - 11)² + 5(12 - 11)² = 5 + 0 + 5 = 10

SSW = 16 + 58 + 64 = 138

Degree of Freedom 1 (DF1) = c - 1 = 2
Degree of Freedom 2 (DF2) = n - c = 12

MSA = SSA / DF1 = 10 / 2 = 5
MSW = SSW / DF2 = 138 / 12 = 11.5

F = MSA / MSW = 5 / 11.5 = 0.435, so the calculated value is 0.435

Tabulated value of F at significance level 0.05 with (2, 12) degrees of freedom is 3.89 (critical value)

Therefore, do not reject H0, as F(calculated) < F(tabulated)

The data give no significant evidence against μ1 = μ2 = μ3
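
The whole calculation can be recomputed in a few lines of Python as a cross-check. This is a minimal sketch using only the formulas from this deck. The function name `one_way_anova` and the dictionary keys are illustrative, and the critical value still has to come from an F table.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA table entries for a list of groups."""
    n = sum(len(group) for group in groups)   # total number of observations
    c = len(groups)                           # number of groups
    grand_mean = sum(x for group in groups for x in group) / n
    means = [sum(group) / len(group) for group in groups]

    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return {
        "SSA": ssa, "SSW": ssw, "SST": ssa + ssw,
        "df_among": c - 1, "df_within": n - c,
        "MSA": msa, "MSW": msw, "F": msa / msw,
    }

# Marks of schools A, B and C from the example.
marks = [[9, 11, 13, 9, 8], [13, 12, 10, 15, 5], [14, 13, 17, 7, 9]]
table = one_way_anova(marks)
for key, value in table.items():
    print(f"{key}: {value}")
```

Since MSA = 5 and MSW = 11.5, the ratio 5 / 11.5 evaluates to about 0.435, well below the tabulated 3.89.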

ANOVA Table for the Example

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square (Variance) | F Test |
|---|---|---|---|---|
| Among Groups | SSA = 10 | c - 1 = 2 | MSA = SSA / (c - 1) = 5 | FSTAT = MSA / MSW = 5 / 11.5 = 0.435 |
| Within Groups | SSW = 138 | n - c = 12 | MSW = SSW / (n - c) = 11.5 | |
| Total | SST = 148 | n - 1 = 14 | | |

ANOVA Made Easy

Real Life Example #1


A large scale farm is interested in understanding which of three different fertilizers leads to the
highest crop yield. They sprinkle each fertilizer on ten different fields and measure the total yield at
the end of the growing season.

Real Life Example #2


Medical researchers want to know if four different medications lead to different mean blood
pressure reductions in patients. They randomly assign 20 patients to use each medication for one
month, then measure the blood pressure both before and after the patient started using the
medication to find the mean blood pressure reduction for each medication.

Real Life Example #3
A grocery chain wants to know if three different types of advertisements affect mean sales
differently. They use each type of advertisement at 10 different stores for one month and measure
total sales for each store at the end of the month.

Real Life Example #4


Biologists want to know how different levels of sunlight exposure (no sunlight, low sunlight,
medium sunlight, high sunlight) and watering frequency (daily, weekly) impact the growth of a
certain plant. In this case, two factors are involved (level of sunlight exposure and water
frequency), so they will conduct a two-way ANOVA to see if either factor significantly impacts
plant growth and whether or not the two factors are related to each other.

Conclusion
ANOVA is used in a wide variety of real-life situations, but the most common include:

● Retail: Stores are often interested in understanding whether different types of promotions, store
layouts, advertisement tactics, etc. lead to different sales. This is the exact type of analysis that
ANOVA is built for.

● Medical: Researchers are often interested in whether or not different medications affect patients
differently, which is why they often use one-way or two-way ANOVA in these situations.

● Environmental Sciences: Researchers are often interested in understanding how different levels
of factors affect plants and wildlife. Because of the nature of these types of analyses, ANOVA is
often used.

References
- Statistics for Business & Economics by Anderson Sweeney
- https://www.spss-tutorials.com/anova-what-is-it/
- https://statisticsbyjim.com/glossary/null-hypothesis/
- https://www.statisticshowto.com/probability-and-statistics/null-hypothesis/
- https://libguides.library.kent.edu/spss/onewayanova
- https://statistics.laerd.com/statistical-guides/one-way-anova-statistical-guide.php

Thank-you!
Presented to - Professor D.M.Marathe
