Section 11.1
Chi-Square Tests for Goodness of Fit
LEARNING TARGETS
By the end of this section, you should be able to:
STATE appropriate hypotheses and COMPUTE the expected
counts and chi-square test statistic for a chi-square test for
goodness of fit.
STATE and CHECK the Random, 10%, and Large Counts conditions
for performing a chi-square test for goodness of fit.
CALCULATE the degrees of freedom and P-value for a chi-square
test for goodness of fit.
PERFORM a chi-square test for goodness of fit.
CONDUCT a follow-up analysis when the results of a chi-square
test are statistically significant.
Brown: 12.5%
Red: 12.5%
Yellow: 12.5%
Green: 12.5%
Orange: 25%
Blue: 25%
H0: p_brown = 0.125, p_red = 0.125, p_yellow = 0.125, p_green = 0.125, p_orange = 0.25, p_blue = 0.25
Ha: At least two of the p_i’s are incorrect
where p_color = the true proportion of M&M’S Milk Chocolate Candies in the
large bag of that color
Starnes/Tabor, The Practice of Statistics
Stating Hypotheses
H0: The distribution of color in the large bag
of M&M’S Milk Chocolate Candies is the
same as the claimed distribution.
Ha: The distribution of color in the large bag of
M&M’S Milk Chocolate Candies is not the
same as the claimed distribution.
How likely is it that differences this large or larger would occur just by
chance in random samples of size 60 from the population distribution
claimed by Mars, Inc.?
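One way to explore this question is by simulation. The sketch below repeatedly draws random samples of size 60 from the claimed color distribution and summarizes each sample by the sum of (observed − expected)²/expected over the six colors. The function names, the seed, the number of repetitions, and the cutoff 11.07 (the Table C critical value for 5 degrees of freedom and tail area 0.05) are illustrative assumptions, not part of the original slides.

```python
import random

# Color distribution claimed by Mars, Inc. (taken from the text above).
CLAIMED = {
    "brown": 0.125, "red": 0.125, "yellow": 0.125,
    "green": 0.125, "orange": 0.25, "blue": 0.25,
}


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def simulate_null_statistics(claimed, n, reps, seed=0):
    """Draw `reps` random samples of size n from the claimed distribution
    and return the statistic computed from each sample."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    colors = list(claimed)
    weights = [claimed[c] for c in colors]
    expected = [n * w for w in weights]
    stats = []
    for _ in range(reps):
        sample = rng.choices(colors, weights=weights, k=n)
        observed = [sample.count(c) for c in colors]
        stats.append(chi_square_statistic(observed, expected))
    return stats


def tail_proportion(stats, threshold):
    """Proportion of simulated statistics at or above the threshold."""
    return sum(s >= threshold for s in stats) / len(stats)


stats = simulate_null_statistics(CLAIMED, n=60, reps=2000)
# Estimated chance of a statistic of 11.07 or more when the claim is true.
print(tail_proportion(stats, 11.07))
```

Replacing 11.07 with the statistic from an actual sample gives a simulation-based estimate of how unusual that sample would be if the claimed distribution were correct.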
$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$
where the sum is over all possible values of the categorical variable.
For Jerome’s data, we add six terms, one for each color category.
(a)
H0: The sides of Carrie’s die are equally likely to show up.
Ha: The sides of Carrie’s die are not equally likely to show up.
(b) If H0 is true, each of the 6 sides should show up 1/6 of the time. The
expected count is 90(1/6) = 15 for each side.
(c)
$$\chi^2 = \frac{(12-15)^2}{15} + \frac{(28-15)^2}{15} + \frac{(12-15)^2}{15} + \frac{(13-15)^2}{15} + \frac{(10-15)^2}{15} + \frac{(15-15)^2}{15}$$
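The arithmetic in parts (b) and (c) can be sketched in a few lines of Python. The observed counts are read off the calculation above; the variable names are illustrative choices.

```python
# Observed counts for the six sides, copied from the calculation above.
observed = [12, 28, 12, 13, 10, 15]

n = sum(observed)                              # 90 rolls in total
expected = [n / len(observed)] * len(observed)  # 15 per side if H0 is true

# Chi-square statistic: add (observed - expected)^2 / expected for each side.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(expected)
print(round(chi_square, 2))
```

Most of the total comes from the side that was observed 28 times, which sits far above its expected count of 15.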
The P-value for a test based on Jerome’s data is between 0.05 and 0.10.
df = 6 – 1 = 5
Using Table C:
the P-value is between 0.01 and 0.02
Using technology:
χ²cdf(lower:14.41, upper:10000, df:5) = 0.0132
where the sum is over the k different categories. The P-value is the area to the
right of χ² under the chi-square density curve with k – 1 degrees of freedom.
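For readers without a calculator at hand, the right-tail area can be sketched with Python's standard library. The function name `chi_square_p_value` is hypothetical, and the closed-form expressions used here apply only when the degrees of freedom are a positive whole number. A statistics library routine (for example `scipy.stats.chi2.sf`) would normally be used instead.

```python
import math


def chi_square_p_value(statistic, df):
    """Area to the right of `statistic` under the chi-square density
    curve with `df` degrees of freedom (positive integer df only)."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if statistic < 0:
        raise ValueError("statistic must be non-negative")
    df = int(df)
    y = statistic / 2
    if df % 2 == 0:
        # Even df: e^(-y) * (1 + y + y^2/2! + ... + y^(df/2 - 1)/(df/2 - 1)!)
        term, total = 1.0, 0.0
        for k in range(df // 2):
            total += term
            term *= y / (k + 1)
        return math.exp(-y) * total
    # Odd df: start from the df = 1 tail, erfc(sqrt(y)), then add one
    # correction term for each additional 2 degrees of freedom.
    total = math.erfc(math.sqrt(y))
    for j in range((df - 1) // 2):
        total += y ** (j + 0.5) * math.exp(-y) / math.gamma(j + 1.5)
    return total


# Same inputs as the technology step above: statistic 14.41, df = 6 - 1 = 5.
print(round(chi_square_p_value(14.41, 5), 4))
```

The result should agree with the calculator value of about 0.0132 quoted above, and it falls inside the Table C interval of 0.01 to 0.02.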
Do these data provide convincing evidence that the birthdays of NHL players are
not uniformly distributed across the four quarters of the year?
STATE
H0: The birthdays of all NHL players are uniformly distributed across
the four quarters of the year.
Ha: The birthdays of all NHL players are not uniformly distributed
across the four quarters of the year.
We’ll use α = 0.05.
PLAN
Chi-square test for goodness of fit
• Random: The data came from a random sample of NHL players ✓
◦ 10%: We must assume that 80 is less than 10% of all NHL players ✓
• Large Counts: All expected counts = 80(1/4) = 20 ≥ 5 ✓
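The two numeric conditions can be checked mechanically, as in the sketch below. The helper names are hypothetical, and the population size passed to `ten_percent_ok` is a placeholder for illustration only, since the slides do not state how many NHL players there are. The Random condition depends on how the data were collected and cannot be verified from the counts alone.

```python
def expected_counts(n, proportions):
    """Expected count for each category if H0 is true: n * p."""
    return [n * p for p in proportions]


def large_counts_ok(expected, minimum=5):
    """Large Counts condition: every expected count is at least 5."""
    return all(e >= minimum for e in expected)


def ten_percent_ok(n, population_size):
    """10% condition: the sample is less than 10% of the population."""
    return n * 10 < population_size


# NHL example: 80 players, four equally likely quarters under H0.
expected = expected_counts(80, [1 / 4] * 4)
print(expected, large_counts_ok(expected))

# Placeholder population size (an assumption, not a figure from the slides).
print(ten_percent_ok(80, population_size=1000))
```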
DO
$$\chi^2 = \frac{(32-20)^2}{20} + \frac{(20-20)^2}{20} + \frac{(16-20)^2}{20} + \frac{(12-20)^2}{20} = 7.2 + 0 + 0.8 + 3.2 = 11.2$$
df = 4 – 1 = 3
Using Table C: The P-value is between 0.01 and 0.02
Using technology: χ²cdf(lower:11.2, upper:10000, df:3) = 0.011
CONCLUDE
Because the P-value of 0.011 < α = 0.05, we reject H0. We have
convincing evidence that the birthdays of NHL players are not
uniformly distributed across the four quarters of the year.
Note: When you run the chi-square test for goodness of fit on the TI-84
calculator, the individual components (Observed − Expected)²/Expected are
stored in a list called CNTRB (for contribution).
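A follow-up analysis looks at which categories contribute most to the statistic. The sketch below reproduces the per-category contributions for the NHL example by hand. It is not the calculator's own routine, and the variable names are illustrative. Categories are numbered by position because the calculation above lists the counts without labeling the quarters.

```python
# Observed counts for the four quarters, in the order used in the DO step.
observed = [32, 20, 16, 12]
expected = [sum(observed) / len(observed)] * len(observed)  # 20 each

# One contribution per category: (observed - expected)^2 / expected.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

for i, (o, e, c) in enumerate(zip(observed, expected, contributions), start=1):
    direction = "above" if o > e else "below" if o < e else "equal to"
    print(f"category {i}: observed {o} is {direction} expected {e:g}, "
          f"contribution {c:.1f}")

# The contributions add up to the chi-square statistic.
print(round(sum(contributions), 1))
```

The first category contributes the most (7.2 of the total 11.2) because its observed count is well above the expected count, while the last category contributes 3.2 because its observed count is well below.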
Carrying Out a Test
LEARNING TARGETS
After this section, you should be able to:
STATE appropriate hypotheses and COMPUTE the expected
counts and chi-square test statistic for a chi-square test for
goodness of fit.
STATE and CHECK the Random, 10%, and Large Counts conditions
for performing a chi-square test for goodness of fit.
CALCULATE the degrees of freedom and P-value for a chi-square
test for goodness of fit.
PERFORM a chi-square test for goodness of fit.
CONDUCT a follow-up analysis when the results of a chi-square
test are statistically significant.