Experimental Design: Analysis of Variance

Experimentation

• An experiment deliberately imposes a treatment on a group of objects or subjects in the interest of observing the response.
• This differs from an observational study, which involves collecting and analyzing data without changing existing conditions.
• Because the validity of an experiment is directly affected by its construction and execution, attention to experimental design is extremely important.

Treatment

• A treatment is something that researchers administer to experimental units.

Examples:
• A corn field is divided into four parts, and each part is 'treated' with a different fertiliser to see which produces the most corn;
• A teacher practices different teaching methods on different groups in his/her class to see which yields the best results;
• A doctor treats a patient with a skin condition with different creams to see which is most effective.


Level

• Treatments are administered to experimental units by 'level', where level implies amount or magnitude.

Example:
• If the experimental units were given 5 mg, 10 mg, or 15 mg of a medication, those amounts would be three levels of the treatment.

Factor

• A factor of an experiment is a controlled independent variable; a variable whose levels are set by the experimenter.
• A factor is a general type or category of treatments. Different treatments constitute different levels of a factor.

Examples:
• Three different groups of runners are subjected to different training methods.
• The runners are the experimental units; the training methods, the treatments; and the three types of training methods constitute three levels of the factor 'type of training'.

Experimental Design

• We are concerned with the analysis of data generated from an experiment.
• It is wise to take time and effort to organize the experiment properly to ensure that the right type of data, and enough of it, is available to answer the questions of interest as clearly and efficiently as possible.
• This process is called experimental design.

Randomization

• Since it is generally extremely difficult for experimenters to eliminate bias using only their expert judgment, the use of randomization in experiments is a common practice.
• In a randomized experimental design, objects or individuals are randomly assigned (by chance) to an experimental group.
• Using randomization is the most reliable method of creating homogeneous treatment groups, without involving any potential biases or judgments.


Completely Randomized Design

• In a completely randomized design, objects or subjects are assigned to groups completely at random.
• One standard method for assigning subjects to treatment groups is to label each subject, then use a table of random numbers to select from the labelled subjects (a small random-assignment sketch follows below).

Why ANOVA?

• We could compare the means, one pair at a time, using t-tests for the difference of two means.
• Problem: each test carries its own Type I error risk.
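A minimal sketch of such a random assignment in Python (standard library only, not part of the original slides); the subject labels and the choice of three groups are invented for illustration, and random.shuffle stands in for the printed random-number table.

```python
import random

# Hypothetical labels for 15 experimental units.
subjects = [f"S{i:02d}" for i in range(1, 16)]
n_groups = 3  # e.g., three fertiliser treatments

random.seed(42)           # fixed seed so the assignment can be reproduced
random.shuffle(subjects)  # random permutation replaces the random-number table

# Deal the shuffled subjects into equally sized treatment groups.
groups = {g + 1: subjects[g::n_groups] for g in range(n_groups)}
for g, members in sorted(groups.items()):
    print(f"Treatment {g}: {members}")
```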

Why ANOVA?

• The probability of a Type I error increases when we make several pairwise comparisons.
• Every time we do a statistical test where the null hypothesis holds, the risk of a Type I error is our chosen value of α. If α is 0.05, then the probability of not making a Type I error is (1 - α) = 0.95.
• If we have three treatment means and therefore make three pairwise comparisons (1 versus 2, 2 versus 3, and 1 versus 3), the probability of no Type I errors is (0.95)^3 = 0.86. The probability of at least one Type I error is 0.14 or 14%.
• For four treatment means, there are six possible comparisons, hence the probability of no Type I errors is (0.95)^6 = 0.74. The probability of at least one Type I error is 0.26 or 26%.
• For five treatment means, there are ten possible comparisons, hence the probability of no Type I errors is (0.95)^10 = 0.60. The probability of at least one Type I error is 0.40 or 40%. That is, 40% of the time we would reject the null hypothesis of equal means in favor of the alternative!
• The overall (familywise) Type I error is 1 - (1 - α)^k, where k is the number of pairwise comparisons.
• These risks are unacceptably high. We need a test that compares more than two treatment means with a Type I error equal to α. A short calculation reproducing these figures follows below.
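A minimal check of these familywise error rates in Python, assuming α = 0.05 and k = C(c, 2) pairwise comparisons for c treatment means:

```python
from math import comb

alpha = 0.05
for c in (3, 4, 5):                      # number of treatment means
    k = comb(c, 2)                       # number of pairwise comparisons
    familywise = 1 - (1 - alpha) ** k    # P(at least one Type I error)
    print(f"{c} means -> {k} comparisons -> familywise error = {familywise:.2f}")
# Prints approximately 0.14, 0.26, and 0.40, matching the slide.
```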


General ANOVA Setting

• Investigator controls one or more factors of interest
  - Each factor contains two or more levels
  - Levels can be numerical or categorical
  - Different levels produce different groups
  - Think of the groups as populations
• Observe effects on the dependent variable
  - Are the groups the same?
• Experimental design: the plan used to collect the data

Analysis of Variance (ANOVA)

• One-Way ANOVA: F-test, Tukey-Kramer test
• Two-Way ANOVA: Interaction Effects

Completely Randomized Design

• Experimental units (subjects) are assigned randomly to the different levels (groups)
  - Subjects are assumed homogeneous
• Only one factor or independent variable
  - With two or more levels (groups)
• Analyzed by one-factor analysis of variance (one-way ANOVA)

One-Way Analysis of Variance

• Evaluate the difference among the means of three or more groups
  - Examples: accident rates for 1st, 2nd, and 3rd shift; expected mileage for five brands of tires
• Assumptions (a small sketch for checking them follows below)
  - Populations are normally distributed
  - Populations have equal variances
  - Samples are randomly and independently drawn
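As a hedged illustration (not part of the original slides), these assumptions are often screened with the Shapiro-Wilk and Levene tests; a sketch assuming SciPy is installed, run on the km-rating samples used in the worked example later in the deck:

```python
from scipy import stats

# Samples, one per group (from the fuel-consumption example later in these slides).
g1 = [18.2, 19.4, 19.6, 19.0, 18.8]
g2 = [19.8, 21.0, 20.0, 20.8, 20.4]
g3 = [21.2, 21.8, 22.4, 22.0, 21.6]

# Normality within each group (Shapiro-Wilk).
for name, g in (("group 1", g1), ("group 2", g2), ("group 3", g3)):
    stat, p = stats.shapiro(g)
    print(f"{name}: Shapiro-Wilk p = {p:.3f}")

# Equality of variances across groups (Levene's test).
stat, p = stats.levene(g1, g2, g3)
print(f"Levene's test p = {p:.3f}")   # large p: no evidence against equal variances
```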


Hypotheses: One-Way ANOVA

• H0: μ1 = μ2 = μ3 = … = μc
  - All population means are equal
  - i.e., no treatment effect (no variation in means among groups)
• H1: Not all of the population means are the same
  - At least one population mean is different, i.e., there is a treatment (group) effect
  - Does not mean that all population means are different (at least one of the means is different from the others)

• When all means are the same, the null hypothesis is true (no group effect):
  μ1 = μ2 = μ3

Hypotheses: One-Way ANOVA

• H0: μ1 = μ2 = μ3 = … = μc
• H1: Not all μj are the same
• When at least one mean is different, the null hypothesis is NOT true (a treatment effect is present), for example:
  μ1 = μ2 ≠ μ3   or   μ1 ≠ μ2 ≠ μ3

• The question is: Do the observed differences in the samples provide evidence against the null hypothesis H0: μ1 = μ2 = μ3?
• Or can the observed differences be attributed to chance?
• i.e., we want to determine if the differences between samples are large enough to convince us that the population means differ.


• The sample variances measure the amount of variation present in each sample.
• If the amount of variation present within each sample is small, then sizable differences between the sample means provide evidence of differences in the population means.
• If the sample variances are large, then observed differences in the sample means do not necessarily provide evidence of differences in the population means.

• Thus, testing for the equality of means involves finding the amount of variation that exists between the samples and comparing this to the variation within samples.
• Hence the name, Analysis of Variance.

Partitioning the Variation

SST = SSA + SSW

• Total Variation (SST) = the aggregate dispersion of the individual data values around the overall (grand) mean of all factor levels
• Among-Group Variation (SSA) = the dispersion between the factor sample means
• Within-Group Variation (SSW) = the dispersion that exists among the data values within the particular factor levels


The Total Sum of Squares

SST = SSA + SSW

SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2

Where:
• SST = total sum of squares
• c = number of groups
• n_j = number of values in group j
• X_ij = the ith value from group j
• X̄ = grand mean (mean of all data values)

[Figure: Response X plotted for Group 1, Group 2, and Group 3, illustrating the variation around the grand mean (SSA).]

Among-Group Variation

SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{X})^2
    = n_1 (\bar{X}_1 - \bar{X})^2 + n_2 (\bar{X}_2 - \bar{X})^2 + ... + n_c (\bar{X}_c - \bar{X})^2

[Figure: Response X for Group 1, Group 2, and Group 3 with group means μ1, μ2, ..., μc, illustrating the within-group variation (SSW).]


Within-Group Variation

SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2
    = (X_{11} - \bar{X}_1)^2 + (X_{21} - \bar{X}_1)^2 + ... + (X_{n_c c} - \bar{X}_c)^2

Obtaining the Mean Squares

MSA = SSA / (c - 1)   (mean squares among)
MSW = SSW / (n - c)   (mean squares within)
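To show how these pieces fit together, here is a minimal Python sketch (not from the slides) that computes SST, SSA, SSW, MSA, MSW, and F for grouped data; it is run on the km-rating samples used in the worked example later in the deck.

```python
def one_way_anova(groups):
    """Return (SST, SSA, SSW, MSA, MSW, F) for a list of samples, one list per group."""
    all_values = [x for g in groups for x in g]
    n, c = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    group_means = [sum(g) / len(g) for g in groups]

    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    sst = ssa + ssw   # identical to summing (x - grand_mean)**2 over all data values

    msa, msw = ssa / (c - 1), ssw / (n - c)
    return sst, ssa, ssw, msa, msw, msa / msw

cars = [[18.2, 19.4, 19.6, 19.0, 18.8],
        [19.8, 21.0, 20.0, 20.8, 20.4],
        [21.2, 21.8, 22.4, 22.0, 21.6]]
print(one_way_anova(cars))   # SSA = 19.6, SSW = 3.04, F is roughly 38.7
```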

One-Way ANOVA Table

Source of Variation   df      SS                MS (Variance)        F-Ratio
Among Groups          c - 1   SSA               MSA = SSA / (c - 1)  F = MSA / MSW
Within Groups         n - c   SSW               MSW = SSW / (n - c)
Total                 n - 1   SST = SSA + SSW

where c = number of groups, n = sum of the sample sizes from all groups, and df = degrees of freedom.

One-Way ANOVA Test Statistic

H0: μ1 = μ2 = … = μc
H1: At least two population means are different

• Test statistic: F = MSA / MSW
  - MSA is the mean square among groups
  - MSW is the mean square within groups
• Degrees of freedom
  - df1 = c - 1 (c = number of groups)
  - df2 = n - c (n = sum of all sample sizes)


One-Way ANOVA Test Statistic

• The F statistic is the ratio of the among-group variance to the within-group variance
  - df1 = c - 1 will typically be small
  - df2 = n - c will typically be large
• Decision rule (α = 0.05): reject H0 if F > FU, the upper critical value of the F distribution; otherwise, do not reject H0

Example: Comparing Fuel Consumption of Three Makes of Automobile

km ratings of 3 makes of cars (a cross-check with SciPy follows below):

                   Car 1    Car 2    Car 3
                   18.2     19.8     21.2
                   19.4     21.0     21.8
                   19.6     20.0     22.4
                   19.0     20.8     22.0
                   18.8     20.4     21.6
Sample Mean        19.0     20.4     21.8
Sample Variance    0.300    0.260    0.200
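Assuming SciPy is available, the same data can be passed to scipy.stats.f_oneway as a quick cross-check of the hand calculation (this code is not part of the original slides):

```python
from scipy import stats

car1 = [18.2, 19.4, 19.6, 19.0, 18.8]
car2 = [19.8, 21.0, 20.0, 20.8, 20.4]
car3 = [21.2, 21.8, 22.4, 22.0, 21.6]

f_stat, p_value = stats.f_oneway(car1, car2, car3)
print(f"F = {f_stat:.2f}, p = {p_value:.6f}")   # F is about 38.7, p is far below 0.05
```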

Problem 1

           Car 1    Car 2    Car 3
           18.2     19.8     21.2
           19.4     21.0     21.8
           19.6     20.0     22.4
           19.0     20.8     22.0
           18.8     20.4     21.6
Mean       19.0     20.4     21.8
Stddev     0.5477   0.5099   0.4472
Variance   0.3      0.26     0.2

Problem 2

           Car 1    Car 2    Car 3
           17.0     24.2     26.0
           20.4     22.0     19.8
           24.0     17.8     24.4
           15.8     16.2     16.0
           17.8     21.8     22.8
Mean       19.0     20.4     21.8
Stddev     3.2649   3.2924   3.9698
Variance   10.66    10.84    15.76


ANOVA Table for Example 1

Sources of Variation           Sum of Squares   Degrees of Freedom   Mean Square   F
Among groups (Make of car)     SSA              c - 1                MSA           MSA / MSW
Within groups (Random error)   SSW              n - c                MSW
Total                          SST              n - 1

Sources of Variation           Sum of Squares   Degrees of Freedom   Mean Square   F
Among groups (Make of car)     19.60            2                    9.800         38.735
Within groups (Random error)   3.04             12                   0.253
Total                          22.64            14

Decision for Example 1

• The critical value of the test statistic F is obtained from the F distribution with numerator degrees of freedom = 2 and denominator degrees of freedom = 12 (see the sketch below).
• F(0.05, 2, 12) = 3.89
• Since the observed statistic far exceeds the critical value, we reject H0.
• There is a significant difference in the km ratings offered by the three makes of car.
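A small sketch of this decision step, assuming SciPy is installed, with scipy.stats.f.ppf supplying the critical value (the observed F is taken from the ANOVA table above):

```python
from scipy.stats import f

alpha, df1, df2 = 0.05, 2, 12
f_critical = f.ppf(1 - alpha, df1, df2)   # upper 5% point of F(2, 12), about 3.89
f_observed = 38.7                         # from the ANOVA table above

print(f"critical value = {f_critical:.2f}, observed F = {f_observed:.1f}")
if f_observed > f_critical:
    print("Reject H0: the mean km ratings differ across the three makes of car.")
else:
    print("Do not reject H0.")
```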


Typical Data for a Single-Factor Experiment

Treatment   Observations               Totals   Averages
1           x11  x12  ...  x1n         x1.      x̄1.
2           x21  x22  ...  x2n         x2.      x̄2.
...         ...                        ...      ...
a           xa1  xa2  ...  xan         xa.      x̄a.
                                       x..      x̄..

ANOVA Table

Sources of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F
Among groups
Within groups
Total

Two-Way ANOVA

• Examines the effect of (see the sketch below):
  - Two factors of interest on the dependent variable (e.g., percent carbonation and line speed on a soft drink bottling process)
  - Interaction between the different levels of these two factors (e.g., does the effect of one particular carbonation level depend on which level the line speed is set to?)
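For completeness, a minimal two-way ANOVA sketch assuming pandas and statsmodels are installed; the column names carbonation, speed, and fill, and all the numbers, are invented placeholders for the bottling example.

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical bottling data: two replicates per combination of factor levels.
df = pd.DataFrame({
    "carbonation": [1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3],
    "speed": ["low"] * 6 + ["high"] * 6,
    "fill": [4.1, 4.3, 4.6, 4.5, 4.9, 5.0, 4.0, 4.2, 4.4, 4.6, 5.1, 5.3],
})

# Main effects of both factors plus their interaction.
model = smf.ols("fill ~ C(carbonation) * C(speed)", data=df).fit()
print(anova_lm(model, typ=2))   # ANOVA table with carbonation, speed, and interaction rows
```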
