
COMPARISON AMONG MEANS: MORE THAN TWO POPULATIONS

• MID-SIZE VEHICLES
• SPORTS UTILITY VEHICLES (SUVs)
• PICKUP TRUCKS

[Figures not reproduced; labels: MID-SIZE, SUV, PICKUP]

• ANOVA
An Introduction to Experimental Design and Analysis of Variance

• An Introduction to Experimental Design and Analysis of Variance
• Analysis of Variance and the Completely Randomized Design
• Multiple Comparison Procedures

In this section three types of experimental designs are introduced.
• a completely randomized design (One-way ANOVA)
• a randomized block design (Two-way ANOVA)
• a factorial experiment (Two-way ANOVA)

An Introduction to Experimental Design and Analysis of Variance

• A factor is a variable that the experimenter has selected for investigation.
• A treatment is a level of a factor.
• Experimental units are the objects of interest in the experiment.
• The response variable (or dependent variable) is the outcome measured on each experimental unit, recorded in measurable units.
• A completely randomized design (One-way ANOVA) is an experimental design in which the treatments are randomly assigned to the experimental units.

Testing for the Equality of k Population Means: A Completely Randomized Design

• Example: AutoShine, Inc.
AutoShine, Inc. is considering marketing a long-lasting car wax. Three different waxes (Type 1, Type 2, and Type 3) have been developed.
In order to test the durability of these waxes, 5 new cars were waxed with Type 1, 5 with Type 2, and 5 with Type 3. Each car was then repeatedly run through an automatic carwash until the wax coating showed signs of deterioration.
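The random assignment that defines a completely randomized design can be sketched in a few lines. This is only an illustration, assuming 15 cars split evenly across the three waxes as in the AutoShine example; the car labels, the function name `assign_treatments`, and the fixed seed are hypothetical and not part of the slides.

```python
import random

def assign_treatments(units, treatments, seed=None):
    """Randomly assign treatments to units in equal-sized groups."""
    if len(units) % len(treatments) != 0:
        raise ValueError("units must divide evenly among treatments")
    rng = random.Random(seed)          # seed only makes the sketch repeatable
    shuffled = list(units)
    rng.shuffle(shuffled)              # random order = random assignment
    per_group = len(shuffled) // len(treatments)
    return {
        t: shuffled[i * per_group:(i + 1) * per_group]
        for i, t in enumerate(treatments)
    }

# Hypothetical labels for the 15 experimental units (cars).
cars = [f"car_{i}" for i in range(1, 16)]
waxes = ["Type 1", "Type 2", "Type 3"]

assignment = assign_treatments(cars, waxes, seed=0)
for wax, group in assignment.items():
    print(wax, group)
```

Each car ends up in exactly one treatment group, and which group it lands in depends only on the shuffle, not on any property of the car.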

Testing for the Equality of k Population Means: A Completely Randomized Design

• Example: AutoShine, Inc.
The number of times each car went through the carwash before its wax deteriorated is shown in the table below. AutoShine, Inc. must decide which wax to market. Are the three waxes equally effective?

Observation        Wax Type 1   Wax Type 2   Wax Type 3
1                  27           33           29
2                  30           28           28
3                  29           31           30
4                  28           30           32
5                  31           30           31

Sample Mean        29.0         30.4         30.0
Sample Variance    2.5          3.3          2.5

Identify Factors, Treatments and Response Variables

One-way ANOVA: Identifying Factor, Treatments, Experimental Units, and Response Variables

(AutoShine, Inc. example)
Factor . . . Car wax
Treatments . . . Type 1, Type 2, Type 3
Experimental units . . . Cars
Response variable . . . Number of washes

Sampling Scheme for Independent Samples
[Figure not reproduced]

Analysis of Variance: A Conceptual Overview

Analysis of Variance (ANOVA) can be used to test for the equality of three or more population means.

Data obtained from observational or experimental studies can be used for the analysis.

We want to use the sample results to test the following hypotheses:
H0: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
Ha: Not all population means are equal

If H0 is rejected, we cannot conclude that all population means are different.

Rejecting H0 means that at least two population means have different values.

Then we need to conduct a POST-HOC test.

Completely Randomized Design: One-way ANOVA

• Between-Treatments Estimate of Population Variance
• Within-Treatments Estimate of Population Variance
• Comparing the Variance Estimates: The F Test
• ANOVA Table

ANOVA Table for a Completely Randomized Design

Source of      Sum of     Degrees of    Mean
Variation      Squares    Freedom       Square                  F           p-Value
Treatments     SSTR       k - 1         MSTR = SSTR/(k - 1)     MSTR/MSE
Error          SSE        nT - k        MSE = SSE/(nT - k)
Total          SST        nT - 1

SST is partitioned into SSTR and SSE.
SST's degrees of freedom (d.f.) are partitioned into SSTR's d.f. and SSE's d.f.

ANOVA Computation for a Completely Randomized Design

SST divided by its degrees of freedom $n_T - 1$ is the overall sample variance that would be obtained if we treated the entire set of observations as one data set.

With the entire data set as one sample, the formula for computing the total sum of squares, SST, is:

$$\text{SST} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2 = \text{SSTR} + \text{SSE}$$

where $\bar{\bar{x}}$ is the overall sample mean.

EXPRESSIONS FOR SSTR & SSE
[Slide content not reproduced]

Between-Treatments Estimate of Population Variance $\sigma^2$

• The estimate of $\sigma^2$ based on the variation of the sample means is called the mean square due to treatments and is denoted by MSTR.

$$\text{MSTR} = \frac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$$

Numerator is called the sum of squares due to treatments (SSTR).
Denominator is the degrees of freedom associated with SSTR.

Within-Treatments Estimate of Population Variance $\sigma^2$

• The estimate of $\sigma^2$ based on the variation of the sample observations within each sample is called the mean square error and is denoted by MSE.

$$\text{MSE} = \frac{\sum_{j=1}^{k} (n_j - 1) s_j^2}{n_T - k}$$

Numerator is called the sum of squares due to error (SSE).
Denominator is the degrees of freedom associated with SSE.
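The two estimates above, and the partition SST = SSTR + SSE, can be checked numerically. The following is a minimal sketch using the AutoShine wash counts from the earlier data table; the function name `sums_of_squares` and the variable names are illustrative choices, not notation from the slides.

```python
def sums_of_squares(groups):
    """Return (SSTR, SSE, SST) for a list of samples, one list per treatment."""
    all_obs = [x for g in groups for x in g]
    grand_mean = sum(all_obs) / len(all_obs)       # overall sample mean
    sstr = 0.0
    sse = 0.0
    for g in groups:
        mean_j = sum(g) / len(g)
        sstr += len(g) * (mean_j - grand_mean) ** 2    # n_j * (xbar_j - grand mean)^2
        sse += sum((x - mean_j) ** 2 for x in g)       # equals (n_j - 1) * s_j^2
    # SST is computed directly from the double sum, independently of SSTR and SSE.
    sst = sum((x - grand_mean) ** 2 for x in all_obs)
    return sstr, sse, sst

# AutoShine data: washes survived by each car, one list per wax type.
wax = [
    [27, 30, 29, 28, 31],   # Type 1
    [33, 28, 31, 30, 30],   # Type 2
    [29, 28, 30, 32, 31],   # Type 3
]

sstr, sse, sst = sums_of_squares(wax)
k = len(wax)
n_t = sum(len(g) for g in wax)
mstr = sstr / (k - 1)
mse = sse / (n_t - k)
print(round(sstr, 2), round(sse, 2), round(sst, 2))
print(round(mstr, 2), round(mse, 2))
```

Because SST is computed separately from the raw observations, comparing it with SSTR + SSE gives a quick check of the partition.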

Comparing the Variance Estimates: The F Test

• If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSTR/MSE is an F distribution with MSTR d.f. equal to k - 1 and MSE d.f. equal to nT - k.
• If the means of the k populations are not equal, the value of MSTR/MSE will be inflated because MSTR overestimates $\sigma^2$.
• Hence, we will reject H0 if the resulting value of MSTR/MSE appears to be too large to have been selected at random from the appropriate F distribution.

• Sampling Distribution of MSTR/MSE
[Figure: sampling distribution of MSTR/MSE. Values below the critical value $F_\alpha$ fall in the "Do Not Reject H0" region; values above it fall in the "Reject H0" region, whose area is $\alpha$.]

ANOVA Table for a Completely Randomized Design

ANOVA can be viewed as the process of partitioning the total sum of squares and the degrees of freedom into their corresponding sources: treatments and error.

Dividing the sum of squares by the appropriate degrees of freedom provides the variance estimates and the F value used to test the hypothesis of equal population means.

Test for the Equality of k Population Means

• Hypotheses
H0: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
Ha: Not all population means are equal

• Test Statistic
F = MSTR/MSE

Test for the Equality of k Population Means

• Rejection Rule
p-value Approach: Reject H0 if p-value < $\alpha$
Critical Value Approach: Reject H0 if $F > F_\alpha$
where the value of $F_\alpha$ is based on an F distribution with k - 1 numerator d.f. and nT - k denominator d.f.

Testing for the Equality of k Population Means: A Completely Randomized Design

• Example: AutoShine, Inc. (continued)
Recall that 5 new cars were waxed with each of the three waxes (Type 1, Type 2, and Type 3), and each car was repeatedly run through an automatic carwash until the wax coating showed signs of deterioration. AutoShine, Inc. must decide which wax to market. Are the three waxes equally effective? Use $\alpha = 0.05$.

Factor . . . Car wax
Treatments . . . Type 1, Type 2, Type 3
Experimental units . . . Cars
Response variable . . . Number of washes

Observation        Wax Type 1   Wax Type 2   Wax Type 3
1                  27           33           29
2                  30           28           28
3                  29           31           30
4                  28           30           32
5                  31           30           31

Sample Mean        29.0         30.4         30.0
Sample Variance    2.5          3.3          2.5

Testing for the Equality of k Population Means: A Completely Randomized Design

• Hypotheses
H0: $\mu_1 = \mu_2 = \mu_3$
Ha: Not all the means are equal
where:
$\mu_1$ = mean number of washes using Type 1 wax
$\mu_2$ = mean number of washes using Type 2 wax
$\mu_3$ = mean number of washes using Type 3 wax

• Mean Square Between Treatments
Because the sample sizes are all equal:
$\bar{\bar{x}} = (\bar{x}_1 + \bar{x}_2 + \bar{x}_3)/3 = (29 + 30.4 + 30)/3 = 29.8$
$\text{SSTR} = 5(29 - 29.8)^2 + 5(30.4 - 29.8)^2 + 5(30 - 29.8)^2 = 5.2$
MSTR = 5.2/(3 - 1) = 2.6

• Mean Square Error
SSE = 4(2.5) + 4(3.3) + 4(2.5) = 33.2
MSE = 33.2/(15 - 3) = 2.77

Testing for the Equality of k Population Means: A Completely Randomized Design

• Rejection Rule
p-Value Approach: Reject H0 if p-value < .05
Critical Value Approach: Reject H0 if F > 3.89
where $F_{.05} = 3.89$ is based on an F distribution with 2 numerator degrees of freedom and 12 denominator degrees of freedom

• Test Statistic
F = MSTR/MSE = 2.60/2.77 = .939

• Conclusion
Because F = .939 is smaller than $F_{.10} = 2.81$, the p-value is greater than .10. (Excel provides a p-value of .42.)
Therefore, we cannot reject H0.
There is insufficient evidence to conclude that the mean number of washes differs among the three wax types.
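The critical value approach for this example can be reproduced from the summary statistics alone. The sketch below assumes equal group sizes, as in the AutoShine data, and takes the critical value 3.89 from the slide rather than computing it; the function name `f_statistic` is an illustrative choice.

```python
def f_statistic(means, variances, n):
    """One-way ANOVA F statistic from group summaries (equal group size n)."""
    k = len(means)
    grand_mean = sum(means) / k        # valid only because group sizes are equal
    sstr = sum(n * (m - grand_mean) ** 2 for m in means)
    sse = sum((n - 1) * v for v in variances)
    mstr = sstr / (k - 1)              # between-treatments estimate
    mse = sse / (k * n - k)            # within-treatments estimate
    return mstr / mse

# Critical value F.05 with 2 and 12 d.f., copied from the slide (not computed here).
F_CRIT_05 = 3.89

f_wax = f_statistic([29.0, 30.4, 30.0], [2.5, 3.3, 2.5], n=5)
reject = f_wax > F_CRIT_05
print(round(f_wax, 3), "reject H0" if reject else "do not reject H0")
```

The unrounded ratio is about .940; the slide's .939 comes from dividing the rounded values 2.60 and 2.77. Either way F is well below 3.89, so H0 is not rejected.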

Testing for the Equality of k Population Means: A Completely Randomized Design

• ANOVA Table

Source of      Sum of     Degrees of    Mean
Variation      Squares    Freedom       Square    F       p-Value
Treatments     5.2        2             2.60      .939    .42
Error          33.2       12            2.77
Total          38.4       14

Analysis of Variance: A Conceptual Overview

• Assumptions for Analysis of Variance
For each population, the response (dependent) variable is normally distributed.
The variance of the response variable, denoted $\sigma^2$, is the same for all of the populations. This is also known as the assumption of homogeneity of variance.
The observations must be independent.

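Analysis of Variance: A Conceptual Overview

• Sampling Distribution of $\bar{x}$ Given H0 is True
Sample means are close together because there is only one sampling distribution when H0 is true.

$$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$$

[Figure: a single sampling distribution centered at $\mu$, with $\bar{x}_1$, $\bar{x}_2$, $\bar{x}_3$ close together]

• Sampling Distribution of $\bar{x}$ Given H0 is False
Sample means come from different sampling distributions and are not as close together when H0 is false.

[Figure: three sampling distributions centered at $\mu_1$, $\mu_2$, $\mu_3$, with $\bar{x}_1$, $\bar{x}_2$, $\bar{x}_3$ spread apart]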

Testing for the Equality of k Population Means:

• Example: Reed Manufacturing
Janet Reed would like to know if there is any significant difference in the mean number of hours worked per week for the department managers at her three manufacturing plants (in Buffalo, Pittsburgh, and Detroit).
An F test will be conducted using $\alpha$ = .05.
A simple random sample of five managers from each of the three plants was taken and the number of hours worked by each manager in the previous week is shown in the table that follows.

Identify Factors, Treatments, Experimental units and Response Variables

Testing for the Equality of k Population Means:

• Example: Reed Manufacturing

                   Plant 1    Plant 2       Plant 3
Observation        Buffalo    Pittsburgh    Detroit
1                  48         73            51
2                  54         63            63
3                  57         66            61
4                  54         64            54
5                  62         74            56
Sample Mean        55         68            57
Sample Variance    26.0       26.5          24.5

Factor . . . Manufacturing plant
Treatments . . . Buffalo, Pittsburgh, Detroit
Experimental units . . . Managers
Response variable . . . Number of hours worked

Testing for the Equality of k Population Means:

• p-Value and Critical Value Approaches

1. Develop the hypotheses.
H0: $\mu_1 = \mu_2 = \mu_3$
Ha: Not all the means are equal
where:
$\mu_1$ = mean number of hours worked per week by the managers at Plant 1
$\mu_2$ = mean number of hours worked per week by the managers at Plant 2
$\mu_3$ = mean number of hours worked per week by the managers at Plant 3

2. Specify the level of significance. $\alpha$ = .05

3. Compute the value of the test statistic.
Mean Square Due to Treatments (Sample sizes are all equal.)
$\bar{\bar{x}} = (55 + 68 + 57)/3 = 60$
$\text{SSTR} = 5(55 - 60)^2 + 5(68 - 60)^2 + 5(57 - 60)^2 = 490$
MSTR = 490/(3 - 1) = 245

Testing for the Equality of k Population Means:

• p-Value and Critical Value Approaches

3. Compute the value of the test statistic. (cont.)
Mean Square Due to Error
SSE = 4(26.0) + 4(26.5) + 4(24.5) = 308
MSE = 308/(15 - 3) = 25.667
F = MSTR/MSE = 245/25.667 = 9.55

• ANOVA Table

Source of      Sum of     Degrees of    Mean
Variation      Squares    Freedom       Square    F       p-Value
Treatment      490        2             245       9.55    .0033
Error          308        12            25.667
Total          798        14

Testing for the Equality of k Population Means:

• p-Value Approach

4. Compute the p-value.
With 2 numerator d.f. and 12 denominator d.f., the p-value is .01 for F = 6.93. Therefore, the p-value is less than .01 for F = 9.55.

5. Determine whether to reject H0.
The p-value < .05, so we reject H0.
We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all three plants.

• Critical Value Approach

4. Determine the critical value and rejection rule.
Based on an F distribution with 2 numerator d.f. and 12 denominator d.f., $F_{.05}$ = 3.89.
Reject H0 if F > 3.89

5. Determine whether to reject H0.
Because F = 9.55 > 3.89, we reject H0.
We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all three plants.
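The p-value of .0033 reported in the ANOVA table can be checked by hand. This relies on a property of the F distribution that is not stated on the slides: when the numerator has exactly 2 degrees of freedom, as it does with three treatments, the upper-tail area has a closed form. With $\nu$ denoting the denominator degrees of freedom and $f$ the observed F value:

$$p = P(F_{2,\nu} > f) = \left(1 + \frac{2f}{\nu}\right)^{-\nu/2}$$

For the Reed Manufacturing data, $f = 9.55$ and $\nu = 12$:

$$p = \left(1 + \frac{2(9.55)}{12}\right)^{-6} = (2.5917)^{-6} \approx .0033$$

This shortcut applies only to the 2-numerator-d.f. case; with more treatments the p-value has to come from tables or software.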

EXAMPLE 3 DATA

What we will cover in Experimental Design

• Statistical studies can be classified as being either experimental or observational.
• In an experimental study, one or more factors are controlled so that data can be obtained about how the factors influence the variables of interest.
• In an observational study, no attempt is made to control the factors.
• Cause-and-effect relationships are easier to establish in experimental studies than in observational studies.
• Analysis of variance (ANOVA) can be used to analyze the data obtained from experimental or observational studies.

Post-Hoc Tests


REJECT THE NULL HYPOTHESIS

• Fisher's least significant difference (LSD) procedure can be used to determine where the differences occur.

Note: In line with the ANOVA assumptions, we always assume that the population variances are equal.

Fisher's LSD Procedure Based on the Test Statistic $\bar{x}_i - \bar{x}_j$

• Hypotheses
H0: $\mu_i = \mu_j$
Ha: $\mu_i \neq \mu_j$

• Test Statistic
$\bar{x}_i - \bar{x}_j$

• Rejection Rule
Reject H0 if $|\bar{x}_i - \bar{x}_j| > \text{LSD}$
where

$$\text{LSD} = t_{\alpha/2} \sqrt{\text{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$

Fisher's LSD Procedure

• Example: Reed Manufacturing
Recall that Janet Reed wants to know if there is any significant difference in the mean number of hours worked per week for the department managers at her three manufacturing plants.
Analysis of variance has provided statistical evidence to reject the null hypothesis of equal population means. Fisher's least significant difference (LSD) procedure can be used to determine where the differences occur.

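• LSD FOR PLANTS 1 AND 2

$$\text{LSD} = t_{\alpha/2} \sqrt{\text{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$

$$\text{LSD} = 2.179 \sqrt{25.667\left(\frac{1}{5} + \frac{1}{5}\right)} = 6.98$$

(The MSE value was computed earlier; $t_{.025} = 2.179$ is based on 12 degrees of freedom.)

• Hypotheses (A)
H0: $\mu_1 = \mu_2$
Ha: $\mu_1 \neq \mu_2$
• Rejection Rule
Reject H0 if $|\bar{x}_1 - \bar{x}_2| > 6.98$
• Test Statistic
$|\bar{x}_1 - \bar{x}_2| = |55 - 68| = 13$
• Conclusion
The mean number of hours worked at Plant 1 is not equal to the mean number worked at Plant 2.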

Fisher's LSD Procedure Based on the Test Statistic $\bar{x}_i - \bar{x}_j$

• LSD for Plants 1 and 3
• Hypotheses (B)
H0: $\mu_1 = \mu_3$
Ha: $\mu_1 \neq \mu_3$
• Rejection Rule
Reject H0 if $|\bar{x}_1 - \bar{x}_3| > 6.98$
• Test Statistic
$|\bar{x}_1 - \bar{x}_3| = |55 - 57| = 2$
• Conclusion
There is no significant difference between the mean number of hours worked at Plant 1 and the mean number of hours worked at Plant 3.

• LSD for Plants 2 and 3
• Hypotheses (C)
H0: $\mu_2 = \mu_3$
Ha: $\mu_2 \neq \mu_3$
• Rejection Rule
Reject H0 if $|\bar{x}_2 - \bar{x}_3| > 6.98$
• Test Statistic
$|\bar{x}_2 - \bar{x}_3| = |68 - 57| = 11$
• Conclusion
The mean number of hours worked at Plant 2 is not equal to the mean number worked at Plant 3.
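The three pairwise LSD comparisons above follow the same recipe, so they can be run in one loop. This is a minimal sketch using the Reed Manufacturing sample means and MSE; the t value 2.179 is copied from the slide rather than computed, and the names `lsd`, `T_CRIT`, and `results` are illustrative.

```python
from itertools import combinations
from math import sqrt

def lsd(t_crit, mse, n_i, n_j):
    """Fisher's least significant difference for two groups of sizes n_i and n_j."""
    return t_crit * sqrt(mse * (1 / n_i + 1 / n_j))

T_CRIT = 2.179      # t value used on the slide (alpha/2 = .025, 12 d.f.)
MSE = 308 / 12      # SSE / (nT - k) from the Reed Manufacturing ANOVA table
n = 5               # managers sampled per plant
means = {"Plant 1": 55, "Plant 2": 68, "Plant 3": 57}

threshold = lsd(T_CRIT, MSE, n, n)
results = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    results[(a, b)] = diff > threshold      # True means reject H0 for this pair
    verdict = "reject H0" if diff > threshold else "do not reject H0"
    print(f"{a} vs {b}: |difference| = {diff}, LSD = {threshold:.2f}, {verdict}")
```

With equal sample sizes a single LSD threshold serves all pairs; with unequal sizes, `lsd` would need to be called separately for each pair.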

EXAMPLE 3 DATA

ARE THE POLITICAL VIEWS OF PEOPLE IN TAMIL NADU AFFECTED BY THEIR INCOMES, OR PERHAPS VICE VERSA? IF SO, WE WOULD EXPECT THAT INCOMES WOULD DIFFER BETWEEN GROUPS WHO DEFINE THEMSELVES SOMEWHERE ON THE FOLLOWING SCALE.

• The comparison-wise Type I error rate $\alpha$ indicates the level of significance associated with a single pairwise comparison.
• The experiment-wise Type I error rate $\alpha_{EW}$ is the probability of making a Type I error on at least one of the $k(k-1)/2$ pairwise comparisons.

If the comparisons are treated as independent:

$$\alpha_{EW} = 1 - (1 - \alpha)^{k(k-1)/2}$$
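To see how quickly the experiment-wise rate grows, it helps to plug in numbers. The sketch below evaluates the rate for the three pairwise comparisons made in the Reed Manufacturing example (Plants 1-2, 1-3, and 2-3) at a comparison-wise rate of .05. It assumes the comparisons are independent, which is a simplification, and the function name `experimentwise_error` is an illustrative choice.

```python
def experimentwise_error(alpha, comparisons):
    """P(at least one Type I error) across independent comparisons at level alpha."""
    return 1 - (1 - alpha) ** comparisons

# Three pairwise comparisons among three plants, each tested at alpha = .05.
alpha_ew = experimentwise_error(0.05, 3)
print(round(alpha_ew, 4))
```

Under this assumption the chance of at least one false rejection is roughly 14%, nearly three times the .05 used for each individual comparison.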

Tukey's Method
If the absolute difference between two sample means is greater than $\omega$, reject the null hypothesis.

Critical value $\omega$ in Tukey's MCM
