Chi-Square Tests and F-Distribution
Larson/Farber 4th ed 1
Chapter Outline
Section 10.1
Goodness of Fit
Section 10.1 Objectives
Multinomial Experiments
Multinomial experiment
• A probability experiment consisting of a fixed
number of trials in which there are more than two
possible outcomes for each independent trial.
• A binomial experiment had only two possible
outcomes.
• The probability for each outcome is fixed and each
outcome is classified into categories.
Multinomial Experiments
Example:
• A radio station claims that the distribution of music
preferences for listeners in the broadcast region is as
shown below.
Distribution of music preferences
Classical    4%
Country     36%
Gospel      11%
Oldies       2%
Pop         18%
Rock        29%
Each outcome is classified into categories.
The probability for each possible outcome is fixed.
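A minimal sketch of what a multinomial experiment looks like in practice, assuming the claimed distribution above is the true one. It simulates a fixed number of independent trials, each falling into one of six categories with fixed probabilities. The function name `simulate_listeners` and the seed are hypothetical choices made for illustration only.

```python
import random
from collections import Counter

# Claimed distribution from the slide (categories and fixed probabilities).
CATEGORIES = ["Classical", "Country", "Gospel", "Oldies", "Pop", "Rock"]
PROBABILITIES = [0.04, 0.36, 0.11, 0.02, 0.18, 0.29]

def simulate_listeners(n_trials, seed=0):
    """Simulate n_trials independent trials, each with six possible outcomes.

    The seed is a hypothetical choice that only makes the run repeatable.
    """
    rng = random.Random(seed)
    outcomes = rng.choices(CATEGORIES, weights=PROBABILITIES, k=n_trials)
    return Counter(outcomes)

counts = simulate_listeners(500)
for category in CATEGORIES:
    print(f"{category:<10}{counts[category]:>5}")
```

The simulated counts vary from run to run when the seed changes, which is why a formal test is needed to judge whether observed counts are consistent with a claimed distribution.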
Chi-Square Goodness-of-Fit Test
Chi-Square Goodness-of-Fit Test
Example:
• To test the radio station’s claim, the executive can
perform a chi-square goodness-of-fit test using the
following hypotheses.
H0: The distribution of music preferences in the
broadcast region is 4% classical, 36% country,
11% gospel, 2% oldies, 18% pop, and 29% rock.
(claim)
Ha: The distribution of music preferences differs from
the claimed or expected distribution.
Chi-Square Goodness-of-Fit Test
Chi-Square Goodness-of-Fit Test
Example: Finding Observed and Expected Frequencies
A marketing executive randomly selects 500 radio music listeners from the broadcast region and asks each whether he or she prefers classical, country, gospel, oldies, pop, or rock music. The results are shown below. Find the observed frequencies and the expected frequencies for each type of music.
Survey results (n = 500)
Classical     8
Country     210
Gospel       72
Oldies       10
Pop          75
Rock        125
Solution: Finding Observed and
Expected Frequencies
Observed frequency: The number of radio music
listeners naming a particular type of music
Solution: Finding Observed and
Expected Frequencies
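Expected frequency: $E_i = np_i$, where n is the number of trials (the sample size) and $p_i$ is the assumed probability of the ith category.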
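As a rough illustration, the expected frequencies for the radio station example can be computed directly from the claimed percentages and n = 500. The helper name `expected_frequencies` is hypothetical.

```python
# Claimed distribution from the radio station example (proportions p_i).
claimed = {
    "Classical": 0.04, "Country": 0.36, "Gospel": 0.11,
    "Oldies": 0.02, "Pop": 0.18, "Rock": 0.29,
}
n = 500  # sample size from the survey

def expected_frequencies(n, proportions):
    """Return E_i = n * p_i for every category."""
    return {category: n * p for category, p in proportions.items()}

expected = expected_frequencies(n, claimed)
for category, e in expected.items():
    print(f"{category:<10}{e:6.1f}")
```

The values agree with the expected frequencies used later in the worked solution (20, 180, 55, 10, 90, 145).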
Chi-Square Goodness-of-Fit Test
Chi-Square Goodness-of-Fit Test
Chi-Square Goodness-of-Fit Test
In Words (In Symbols)
1. Identify the claim. State the null and alternative hypotheses. (State H0 and Ha.)
2. Specify the level of significance. (Identify α.)
3. Identify the degrees of freedom. (d.f. = k – 1)
4. Determine the critical value. (Use Table 6 in Appendix B.)
Chi-Square Goodness-of-Fit Test
In Words (In Symbols)
5. Determine the rejection region.
6. Calculate the test statistic. ($\chi^2 = \sum \frac{(O - E)^2}{E}$)
7. Make a decision to reject or fail to reject the null hypothesis. (If χ² is in the rejection region, reject H0. Otherwise, fail to reject H0.)
8. Interpret the decision in the context of the original claim.
Example: Performing a Goodness of Fit
Test
Use the music preference data to perform a chi-square
goodness-of-fit test to test whether the distributions are
different. Use α = 0.01.
Type of music   Distribution of music preferences   Survey results (n = 500)
Classical        4%                                    8
Country         36%                                  210
Gospel          11%                                   72
Oldies           2%                                   10
Pop             18%                                   75
Rock            29%                                  125
Solution: Performing a Goodness of Fit
Test
• H0: music preference is 4% classical, 36% country,
11% gospel, 2% oldies, 18% pop, and 29% rock
• Ha: music preference differs from the claimed or
expected distribution
• α = 0.01
• d.f. = 6 – 1 = 5
• Rejection Region: χ² > 15.086 (area 0.01 in the right tail)
• Test Statistic:
• Decision:
• Conclusion:
Solution: Performing a Goodness of Fit
Test
Type of music   Observed frequency   Expected frequency
Classical         8                   20
Country         210                  180
Gospel           72                   55
Oldies           10                   10
Pop              75                   90
Rock            125                  145
$\chi^2 = \sum \frac{(O - E)^2}{E}$
$= \frac{(8 - 20)^2}{20} + \frac{(210 - 180)^2}{180} + \frac{(72 - 55)^2}{55} + \frac{(10 - 10)^2}{10} + \frac{(75 - 90)^2}{90} + \frac{(125 - 145)^2}{145}$
$\approx 22.713$
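The sum above can be checked with a few lines of code. This is only a sketch of the arithmetic, using the observed and expected frequencies from the table. The function name `chi_square_statistic` is a hypothetical helper.

```python
observed = [8, 210, 72, 10, 75, 125]    # survey results
expected = [20, 180, 55, 10, 90, 145]   # E_i = n * p_i with n = 500

def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

chi2 = chi_square_statistic(observed, expected)
print(round(chi2, 3))  # 22.713
```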
Solution: Performing a Goodness of Fit
Test
• H0: music preference is 4% classical, 36% country,
11% gospel, 2% oldies, 18% pop, and 29% rock
• Ha: music preference differs from the claimed or
expected distribution
• α = 0.01
• d.f. = 6 – 1 = 5
• Rejection Region: χ² > 15.086 (area 0.01 in the right tail)
• Test Statistic: χ² = 22.713
• Decision: Reject H0, because 22.713 > 15.086
There is enough evidence to conclude that the distribution of music preferences differs from the claimed distribution.
Example: Performing a Goodness of Fit
Test
The manufacturer of M&M’s candies claims that the
number of different-colored candies in bags of dark
chocolate M&M’s is uniformly distributed. To test this
claim, you randomly select a bag that contains 500 dark
chocolate M&M’s. The results are shown in the table on
the next slide. Using α = 0.10, perform a chi-square
goodness-of-fit test to test the claimed or expected
distribution. What can you conclude? (Adapted from
Mars Incorporated)
Example: Performing a Goodness of Fit
Test
Solution: Performing a Goodness of Fit
Test
• H0: Distribution of different-colored candies in bags
of dark chocolate M&Ms is uniform
• Ha: Distribution of different-colored candies in bags
of dark chocolate M&Ms is not uniform
• α = 0.10
• d.f. = 6 – 1 = 5
• Rejection Region: χ² > 9.236 (area 0.10 in the right tail)
• Test Statistic:
• Decision:
• Conclusion:
Solution: Performing a Goodness of Fit
Test
Color    Observed frequency   Expected frequency
Brown    80                   83.3
Yellow   95                   83.3
Red      88                   83.3
Blue     83                   83.3
Orange   76                   83.3
Green    78                   83.3
$\chi^2 = \sum \frac{(O - E)^2}{E}$
$= \frac{(80 - 83.3)^2}{83.3} + \frac{(95 - 83.3)^2}{83.3} + \frac{(88 - 83.3)^2}{83.3} + \frac{(83 - 83.3)^2}{83.3} + \frac{(76 - 83.3)^2}{83.3} + \frac{(78 - 83.3)^2}{83.3}$
$\approx 3.016$
Solution: Performing a Goodness of Fit
Test
• H0: Distribution of different-colored candies in bags
of dark chocolate M&Ms is uniform
• Ha: Distribution of different-colored candies in bags
of dark chocolate M&Ms is not uniform
• α = 0.10
• d.f. = 6 – 1 = 5
• Rejection Region: χ² > 9.236 (area 0.10 in the right tail)
• Test Statistic: χ² = 3.016
• Decision: Fail to Reject H0, because 3.016 < 9.236
There is not enough evidence to dispute the claim that the distribution is uniform.
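The whole M&M's test can be sketched end to end as follows. The critical value 9.236 is taken from the slide (α = 0.10, d.f. = 5) rather than computed, and `goodness_of_fit` is a hypothetical helper name. The expected frequency here is the unrounded 500/6, so the code does not rely on the rounded 83.3.

```python
def goodness_of_fit(observed, proportions, critical_value):
    """Return (chi-square statistic, degrees of freedom, reject H0?)."""
    n = sum(observed)
    expected = [n * p for p in proportions]      # E_i = n * p_i
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1                       # d.f. = k - 1
    return chi2, df, chi2 > critical_value       # right-tailed test

# Brown, Yellow, Red, Blue, Orange, Green
observed = [80, 95, 88, 83, 76, 78]
uniform = [1 / 6] * 6                            # claimed uniform distribution

chi2, df, reject = goodness_of_fit(observed, uniform, critical_value=9.236)
print(f"chi-square = {chi2:.3f}, d.f. = {df}, reject H0: {reject}")
```

Because the statistic falls below the critical value, the sketch reaches the same decision as the slide: fail to reject H0.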
Section 10.1 Summary
Section 10.2
Independence
Section 10.2 Objectives
Contingency Tables
r × c contingency table
• Shows the observed frequencies for two variables.
• The observed frequencies are arranged in r rows and
c columns.
• The intersection of a row and a column is called a
cell.
Contingency Tables
Example:
• The contingency table shows the results of a random sample of 550 company CEOs classified by age and size of company. (Adapted from Grant Thornton LLP, The Segal Company)
Age
Company size     39 and under   40 - 49   50 - 59   60 - 69   70 and over
Small/Midsize    42             69        108       60        21
Large            5              18        85        120       22
Finding the Expected Frequency
Example: Finding Expected Frequencies
Find the expected frequency for each cell in the
contingency table. Assume that the variables, age and
company size, are independent.
Age
Company size     39 and under   40 - 49   50 - 59   60 - 69   70 and over   Total
Small/Midsize    42             69        108       60        21            300
Large            5              18        85        120       22            250
Total            47             87        193       180       43            550
The Total row and Total column contain the marginal totals.
Solution: Finding Expected Frequencies
$E_{r,c} = \frac{(\text{Sum of row } r) \times (\text{Sum of column } c)}{\text{Sample size}}$
For the first cell (Small/Midsize, 39 and under), the row total is 300, the column total is 47, and the sample size is 550:
$E_{1,1} = \frac{300 \cdot 47}{550} \approx 25.64$
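The same formula applies to every cell. A short sketch, assuming the observed table is stored as a list of rows (`expected_table` is a hypothetical helper name):

```python
# Observed frequencies: rows are company sizes, columns are age groups.
observed = [
    [42, 69, 108, 60, 21],   # Small/Midsize
    [5, 18, 85, 120, 22],    # Large
]

def expected_table(observed):
    """E[r][c] = (sum of row r) * (sum of column c) / sample size."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_table(observed)
for row in expected:
    print([round(e, 2) for e in row])
```

Rounded to two decimals, the results match the expected frequencies shown in parentheses in the independence test example (25.64, 47.45, 105.27, 98.18, 23.45 and 21.36, 39.55, 87.73, 81.82, 19.55).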
Chi-Square Independence Test
Chi-Square Independence Test
Chi-Square Independence Test
• If these conditions are satisfied, then the sampling
distribution for the chi-square independence test is
approximated by a chi-square distribution with
(r – 1)(c – 1) degrees of freedom, where r and c are the
number of rows and columns, respectively, of a
contingency table.
• The test statistic for the chi-square independence test is
$\chi^2 = \sum \frac{(O - E)^2}{E}$
where O represents the observed frequencies and E represents the expected frequencies.
• The test is always a right-tailed test.
Chi-Square Independence Test
In Words (In Symbols)
1. Identify the claim. State the null and alternative hypotheses. (State H0 and Ha.)
2. Specify the level of significance. (Identify α.)
3. Identify the degrees of freedom. (d.f. = (r – 1)(c – 1))
4. Determine the critical value. (Use Table 6 in Appendix B.)
Chi-Square Independence Test
In Words (In Symbols)
5. Determine the rejection region.
6. Calculate the test statistic. ($\chi^2 = \sum \frac{(O - E)^2}{E}$)
7. Make a decision to reject or fail to reject the null hypothesis. (If χ² is in the rejection region, reject H0. Otherwise, fail to reject H0.)
8. Interpret the decision in the context of the original claim.
Example: Performing a χ2 Independence
Test
Using the age/company size contingency table, can you conclude that the CEOs’ ages are related to company size? Use α = 0.01. Expected frequencies are shown in parentheses.
Age
Company size     39 and under   40 - 49      50 - 59        60 - 69       70 and over   Total
Small/Midsize    42 (25.64)     69 (47.45)   108 (105.27)   60 (98.18)    21 (23.45)    300
Large            5 (21.36)      18 (39.55)   85 (87.73)     120 (81.82)   22 (19.55)    250
Total            47             87           193            180           43            550
Solution: Performing a χ2 Independence Test
• H0: CEOs’ ages are independent of company size
• Ha: CEOs’ ages are dependent on company size
• α = 0.01
• d.f. = (2 – 1)(5 – 1) = 4
• Rejection Region: χ² > 13.277 (area 0.01 in the right tail)
• Test Statistic:
• Decision:
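Solution: Performing a χ2 Independence Test
$\chi^2 = \sum \frac{(O - E)^2}{E}$
$= \frac{(42 - 25.64)^2}{25.64} + \frac{(69 - 47.45)^2}{47.45} + \frac{(108 - 105.27)^2}{105.27} + \frac{(60 - 98.18)^2}{98.18} + \frac{(21 - 23.45)^2}{23.45}$
$+ \frac{(5 - 21.36)^2}{21.36} + \frac{(18 - 39.55)^2}{39.55} + \frac{(85 - 87.73)^2}{87.73} + \frac{(120 - 81.82)^2}{81.82} + \frac{(22 - 19.55)^2}{19.55}$
$\approx 77.9$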
Solution: Performing a χ2 Independence Test
• H0: CEOs’ ages are independent of company size
• Ha: CEOs’ ages are dependent on company size
• α = 0.01
• d.f. = (2 – 1)(5 – 1) = 4
• Rejection Region: χ² > 13.277 (area 0.01 in the right tail)
• Test Statistic: χ² = 77.9
• Decision: Reject H0, because 77.9 > 13.277
There is enough evidence to conclude CEOs’ ages are dependent on company size.
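The independence test for the CEO data can be sketched in one function. The critical value 13.277 is copied from the slide (α = 0.01, d.f. = 4), not computed, and `chi_square_independence` is a hypothetical helper name. Expected frequencies are kept unrounded, so the statistic may differ slightly from a hand calculation with two-decimal values.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for r, row in enumerate(observed):
        for c, o in enumerate(row):
            e = row_totals[r] * col_totals[c] / n   # expected frequency E[r][c]
            chi2 += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

observed = [
    [42, 69, 108, 60, 21],   # Small/Midsize
    [5, 18, 85, 120, 22],    # Large
]
chi2, df = chi_square_independence(observed)
critical_value = 13.277      # from the slide
reject = chi2 > critical_value
print(f"chi-square = {chi2:.1f}, d.f. = {df}, reject H0: {reject}")
```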
Section 10.2 Summary
Section 10.3
Section 10.3 Objectives
F-Distribution
Properties of the F-Distribution
Properties of the F-Distribution
[Figure: F-distribution curve, with F on the horizontal axis.]
Critical Values for the F-Distribution
Solution:
• When performing a two-tailed hypothesis test using the F-distribution, you need only to find the right-tailed critical value.
• You must remember to use the ½α table: ½α = ½(0.05) = 0.025
Solution: Finding Critical F-Values
Two-Sample F-Test for Variances
Two-Sample F-Test for Variances
• Test Statistic
$F = \frac{s_1^2}{s_2^2}$
where $s_1^2$ and $s_2^2$ represent the sample variances with $s_1^2 \ge s_2^2$.
• The degrees of freedom for the numerator is d.f.N = n1 – 1, where n1 is the size of the sample having variance $s_1^2$.
• The degrees of freedom for the denominator is d.f.D = n2 – 1, where n2 is the size of the sample having variance $s_2^2$.
Two-Sample F-Test for Variances
In Words (In Symbols)
1. Identify the claim. State the null and alternative hypotheses. (State H0 and Ha.)
2. Specify the level of significance. (Identify α.)
3. Identify the degrees of freedom. (d.f.N = n1 – 1, d.f.D = n2 – 1)
4. Determine the critical value. (Use Table 7 in Appendix B.)
Two-Sample F-Test for Variances
In Words (In Symbols)
5. Determine the rejection region.
6. Calculate the test statistic. ($F = \frac{s_1^2}{s_2^2}$)
7. Make a decision to reject or fail to reject the null hypothesis. (If F is in the rejection region, reject H0. Otherwise, fail to reject H0.)
8. Interpret the decision in the context of the original claim.
Example: Performing a Two-Sample F-
Test
A restaurant manager is designing a system that is
intended to decrease the variance of the time customers
wait before their meals are served. Under the old
system, a random sample of 10 customers had a
variance of 400. Under the new system, a random
sample of 21 customers had a variance of 256. At
α = 0.10, is there enough evidence to convince the
manager to switch to the new system? Assume both
populations are normally distributed.
Solution: Performing a Two-Sample F-
Test
Because 400 > 256, $s_1^2 = 400$ and $s_2^2 = 256$.
• H0: σ1² ≤ σ2²
• Ha: σ1² > σ2²
• α = 0.10
• d.f.N = 9, d.f.D = 20
• Rejection Region: F > 1.96 (area 0.10 in the right tail)
• Test Statistic: $F = \frac{s_1^2}{s_2^2} = \frac{400}{256} \approx 1.56$
• Decision: Fail to Reject H0, because 1.56 < 1.96
There is not enough evidence to convince the manager to switch to the new system.
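The restaurant example reduces to a ratio and a comparison. A minimal sketch, with the critical value 1.96 copied from the slide (α = 0.10, d.f.N = 9, d.f.D = 20) and `f_test_right_tailed` as a hypothetical helper name:

```python
def f_test_right_tailed(var_1, n_1, var_2, n_2, critical_value):
    """Two-sample F-test with the larger sample variance passed as var_1."""
    if var_1 < var_2:
        raise ValueError("var_1 must be the larger sample variance")
    f = var_1 / var_2
    df_n = n_1 - 1    # numerator degrees of freedom
    df_d = n_2 - 1    # denominator degrees of freedom
    return f, df_n, df_d, f > critical_value

# Old system: n = 10, variance 400. New system: n = 21, variance 256.
f, df_n, df_d, reject = f_test_right_tailed(400, 10, 256, 21, critical_value=1.96)
print(f"F = {f:.2f}, d.f.N = {df_n}, d.f.D = {df_d}, reject H0: {reject}")
```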
Example: Performing a Two-Sample F-
Test
You want to purchase stock in a company and are
deciding between two different stocks. Because a
stock’s risk can be associated with the standard
deviation of its daily closing prices, you randomly
select samples of the daily closing prices for each stock
to obtain the results. At α = 0.05, can you conclude that
one of the two stocks is a riskier investment? Assume
the stock closing prices are normally distributed.
                     Stock A     Stock B
Sample size          n2 = 30     n1 = 31
Standard deviation   s2 = 3.5    s1 = 5.7
Solution: Performing a Two-Sample F-
Test
Because 5.7² > 3.5², $s_1^2 = 5.7^2$ and $s_2^2 = 3.5^2$.
• H0: σ1² = σ2²
• Ha: σ1² ≠ σ2²
• ½α = 0.025
• d.f.N = 30, d.f.D = 29
• Rejection Region: F > 2.09 (area 0.025 in the right tail)
• Test Statistic: $F = \frac{s_1^2}{s_2^2} = \frac{5.7^2}{3.5^2} \approx 2.65$
• Decision: Reject H0, because 2.65 > 2.09
There is enough evidence to support the claim that one of the two stocks is a riskier investment.
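For the two-tailed stock example, the labelling step (larger variance becomes sample 1) and the ½α adjustment can be sketched as below. The critical value 2.09 is taken from the slide, and `order_by_variance` is a hypothetical helper name.

```python
def order_by_variance(sample_a, sample_b):
    """Each sample is (label, n, standard deviation).

    Returns the two samples ordered so the larger variance comes first.
    """
    return tuple(sorted((sample_a, sample_b), key=lambda s: s[2] ** 2, reverse=True))

stock_a = ("Stock A", 30, 3.5)
stock_b = ("Stock B", 31, 5.7)
(label_1, n_1, s_1), (label_2, n_2, s_2) = order_by_variance(stock_a, stock_b)

alpha = 0.05
half_alpha = alpha / 2            # two-tailed test: look up the 1/2 alpha table
f = s_1 ** 2 / s_2 ** 2           # larger variance in the numerator
df_n, df_d = n_1 - 1, n_2 - 1
critical_value = 2.09             # from the slide, for 1/2 alpha = 0.025
reject = f > critical_value

print(f"sample 1 = {label_1}, F = {f:.2f}, d.f.N = {df_n}, d.f.D = {df_d}")
print(f"1/2 alpha = {half_alpha}, reject H0: {reject}")
```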
Section 10.3 Summary
Section 10.4
Analysis of Variance
Section 10.4 Objectives
One-Way ANOVA
One-Way ANOVA
Test statistic = (Variance between samples) / (Variance within samples)
1. The variance between samples MSB measures the differences related to the treatment given to each
sample and is sometimes called the mean square between.
2. The variance within samples MSW measures the differences related to entries within the same sample.
This variance, sometimes called the mean square within, is usually due to sampling error.
One-Way Analysis of Variance Test
Test Statistic for a One-Way ANOVA
In Words (In Symbols)
5. Find the variance between the samples. ($MS_B = \frac{SS_B}{k - 1} = \frac{SS_B}{\text{d.f.}_N}$)
7. Find the test statistic. ($F = \frac{MS_B}{MS_W}$)
Performing a One-Way ANOVA Test
In Words (In Symbols)
1. Identify the claim. State the null and alternative hypotheses. (State H0 and Ha.)
2. Specify the level of significance. (Identify α.)
3. Identify the degrees of freedom. (d.f.N = k – 1, d.f.D = N – k)
4. Determine the critical value. (Use Table 7 in Appendix B.)
Performing a One-Way ANOVA Test
In Words (In Symbols)
5. Determine the rejection region.
6. Calculate the test statistic. ($F = \frac{MS_B}{MS_W}$)
Between: $SS_B$, d.f.N, $MS_B = \frac{SS_B}{\text{d.f.}_N}$, $F = \frac{MS_B}{MS_W}$
Within: $SS_W$, d.f.D, $MS_W = \frac{SS_W}{\text{d.f.}_D}$
Example: Performing a One-Way ANOVA
A medical researcher wants to determine whether there is
a difference in the mean length of time it takes three
types of pain relievers to provide relief from headache
pain. Several headache sufferers are randomly selected
and given one of the three medications. Each headache
sufferer records the time (in minutes) it takes the
medication to begin working. The results are shown on
the next slide. At α = 0.01, can you conclude that the
mean times are different? Assume that each population of
relief times is normally distributed and that the
population variances are equal.
Example: Performing a One-Way ANOVA
Medication 1   Medication 2   Medication 3
12             16             14
15             14             17
17             21             20
12             15             15
               19
$\bar{x}_1 = \frac{56}{4} = 14$, $\bar{x}_2 = \frac{85}{5} = 17$, $\bar{x}_3 = \frac{66}{4} = 16.5$
$s_1^2 = 6$, $s_2^2 = 8.5$, $s_3^2 = 7$
Solution:
k = 3 (3 samples)
N = n1 + n2 + n3 = 4 + 5 + 4 = 13 (sum of sample sizes)
Solution: Performing a One-Way ANOVA
• H0: μ1 = μ2 = μ3
• Ha: At least one mean is different
• α = 0.01
• d.f.N = 3 – 1 = 2
• d.f.D = 13 – 3 = 10
• Rejection Region: F > 7.56 (area 0.01 in the right tail)
• Test Statistic:
• Decision:
Solution: Performing a One-Way ANOVA
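$\bar{\bar{x}} = \frac{\sum x}{N} = \frac{56 + 85 + 66}{13} \approx 15.92$
$MS_B = \frac{SS_B}{\text{d.f.}_N} = \frac{\sum n_i(\bar{x}_i - \bar{\bar{x}})^2}{k - 1}$
$= \frac{4(14 - 15.92)^2 + 5(17 - 15.92)^2 + 4(16.5 - 15.92)^2}{3 - 1}$
$\approx \frac{21.92}{2} = 10.96$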
Solution: Performing a One-Way ANOVA
$F = \frac{MS_B}{MS_W} \approx \frac{10.96}{7.3} \approx 1.50$
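The slides give MSW = 7.3 without showing the arithmetic. One way to reproduce it, sketched here as an assumption rather than the slides' own derivation, is the standard pooled formula SSW = Σ(n_i − 1)s_i², which gives SSW = 3(6) + 4(8.5) + 3(7) = 73 and MSW = 73/10 = 7.3. The helper `anova_from_summary` is a hypothetical name, and it works from the sample sizes, means, and variances listed for the three medications.

```python
def anova_from_summary(samples):
    """One-way ANOVA from summary statistics.

    samples: list of (n_i, mean_i, variance_i) tuples.
    Returns (MS_B, MS_W, F).
    """
    k = len(samples)
    big_n = sum(n for n, _, _ in samples)
    grand_mean = sum(n * m for n, m, _ in samples) / big_n
    ss_b = sum(n * (m - grand_mean) ** 2 for n, m, _ in samples)
    ss_w = sum((n - 1) * v for n, _, v in samples)   # assumed pooled formula
    ms_b = ss_b / (k - 1)        # d.f.N = k - 1
    ms_w = ss_w / (big_n - k)    # d.f.D = N - k
    return ms_b, ms_w, ms_b / ms_w

medications = [(4, 14.0, 6.0), (5, 17.0, 8.5), (4, 16.5, 7.0)]
ms_b, ms_w, f = anova_from_summary(medications)
print(f"MS_B = {ms_b:.2f}, MS_W = {ms_w:.2f}, F = {f:.2f}")
```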
Solution: Performing a One-Way ANOVA
• H0: μ1 = μ2 = μ3
• Ha: At least one mean is different
• α = 0.01
• d.f.N = 3 – 1 = 2
• d.f.D = 13 – 3 = 10
• Rejection Region: F > 7.56 (area 0.01 in the right tail)
• Test Statistic: $F = \frac{MS_B}{MS_W} \approx 1.50$
• Decision: Fail to Reject H0, because 1.50 < 7.56
There is not enough evidence at the 1% level of significance to conclude that there is a difference in the mean length of time it takes the three pain relievers to provide relief from headache pain.
Example: Using the TI-83/84 to Perform a
One-Way ANOVA
Three airline companies offer flights between Corydon
and Lincolnville. Several randomly selected flight times
(in minutes) between the towns for each airline are
shown on the next slide. Assume that the populations of
flight times are normally distributed, the samples are
independent, and the population variances are equal. At
α = 0.01, can you conclude that there is a difference in
the means of the flight times? Use a TI-83/84.
Example: Using the TI-83/84 to Perform a
One-Way ANOVA
Airline 1 Airline 2 Airline 3
122 119 120
135 133 158
126 143 155
131 149 126
125 114 147
116 124 164
120 126 134
108 131 151
142 140 131
113 136 141
Solution: Using the TI-83/84 to Perform a
One-Way ANOVA
• H0: μ1 = μ2 = μ3
• Ha: At least one mean is different
• Store data into lists L1, L2, and L3
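For readers without a TI-83/84, the same F statistic can be computed from the three lists in plain Python. This is a sketch of the one-way ANOVA arithmetic, not a reproduction of the calculator's output screen. The names `L1`, `L2`, `L3` mirror the calculator lists, and `one_way_anova` is a hypothetical helper.

```python
def one_way_anova(*groups):
    """Return (F, d.f.N, d.f.D) for a one-way ANOVA on raw data."""
    k = len(groups)
    big_n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / big_n
    means = [sum(g) / len(g) for g in groups]
    # Variation between samples and within samples.
    ss_b = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_w = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_n, df_d = k - 1, big_n - k
    f = (ss_b / df_n) / (ss_w / df_d)
    return f, df_n, df_d

# Flight times (in minutes) from the table.
L1 = [122, 135, 126, 131, 125, 116, 120, 108, 142, 113]   # Airline 1
L2 = [119, 133, 143, 149, 114, 124, 126, 131, 140, 136]   # Airline 2
L3 = [120, 158, 155, 126, 147, 164, 134, 151, 131, 141]   # Airline 3

f, df_n, df_d = one_way_anova(L1, L2, L3)
print(f"F = {f:.2f}, d.f.N = {df_n}, d.f.D = {df_d}")
```

The resulting F would then be compared with the critical value for these degrees of freedom at α = 0.01, following the steps listed earlier.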