AP Stats Ch25

Chapter 25
Comparing Counts
Objectives
• Chi-Square Model
• Chi-Square Statistic
• Knowing when and how to use the Chi-Square Tests:
  ◦ Goodness of Fit
  ◦ Test of Independence
  ◦ Test of Homogeneity
• Standardized Residual
Categorical Data
• Chi-square tests are used when we have counts for the categories of a categorical variable:
  ◦ Goodness-of-Fit Test
    Tests whether a hypothesized population distribution seems valid. This is a one-variable, one-sample test.
  ◦ Test of Independence
    Cross-categorizes one group on two variables to see if there is an association between the variables. This is a two-variable, one-sample test.
  ◦ Test of Homogeneity
    Compares the observed distributions for several groups to see if there is a difference among the populations. This is a one-variable, many-samples test.
Chi-Square Model
• Just like the Student's t-models, the chi-square models form a family indexed by degrees of freedom.
• Unlike the Student's t-models, a chi-square distribution is not symmetric. It is skewed right.
• A chi-square test is always a one-sided, right-tailed test.
The Chi-Square (χ²) Distribution - Properties
• It is a continuous distribution.
• It is not symmetric.
• It is skewed to the right.
• The distribution depends on the degrees of freedom.
• The value of a χ² random variable is always nonnegative.
• There are infinitely many χ² distributions, since each is uniquely defined by its degrees of freedom.
• For small degrees of freedom, the χ² distribution is very skewed to the right.
• As the degrees of freedom increase, the χ² distribution becomes more and more symmetrical.
• Since we will be using the χ² distribution for the tests in this chapter, we will need to be able to find critical values associated with the distribution.
Critical Value
• Critical or rejection region: a range of test statistic values for which the null hypothesis will be rejected.
• Values in this range indicate a significant (large enough) difference between the postulated parameter value and the corresponding point estimate for the parameter.
Critical Value
• Non-critical or non-rejection region: a range of test statistic values for which the null hypothesis will not be rejected.
• Values in this range indicate that the difference between the postulated parameter value and the corresponding point estimate for the parameter is not significant (not large enough).
Critical Value
[Figure: χ² density curve with the critical (rejection) region and the non-critical (non-rejection) region marked]
• Notation: $\chi^2_{\alpha,\,df}$
• Explanation of the notation: $\chi^2_{\alpha,\,df}$ is the χ² value with df degrees of freedom such that an area of α (the significance level) lies to the right of it.
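The notation can be restated in symbols. This is only a restatement of the definition; the letter X is a label introduced here (it does not appear in the slides) for a χ² random variable with df degrees of freedom:

$$P\left(X > \chi^2_{\alpha,\,df}\right) = \alpha$$

For example, with α = 0.05 the critical value cuts off the upper 5% of the distribution.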
[Figure: diagram explaining the notation $\chi^2_{\alpha,\,df}$]
The Chi-Square (χ²) Distribution - Table
• Values of the χ² random variable with the appropriate degrees of freedom can be obtained from the tables in the formula booklet.
• Example: What is the value of $\chi^2_{0.05,\,10}$?
  [Table excerpt: column α = .05, row df = 10, entry is the χ² critical value]
• Solution: From the table in the formula booklet, $\chi^2_{0.05,\,10}$ = 18.307.
• Your Turn: What is the value of $\chi^2_{0.10,\,20}$?
• $\chi^2_{0.10,\,20}$ = 28.41
CHI-SQUARE (χ²) TEST
FOR GOODNESS OF FIT
Goodness-of-Fit
• A test of whether the distribution of counts in one categorical variable matches the distribution predicted by a model is called a goodness-of-fit test.
• As usual, there are assumptions and conditions to consider…
Assumptions and Conditions
• Counted Data Condition: Check that the data are counts for the categories of a categorical variable.
• Independence Assumption: The counts in the cells should be independent of each other.
  ◦ Randomization Condition: The individuals who have been counted and whose counts are available for analysis should be a random sample from some population.
• Sample Size Assumption: We must have enough data for the methods to work.
  ◦ Expected Cell Frequency Condition: We should expect to see at least 5 individuals in each cell.
  ◦ This is similar to the condition that np and nq be at least 10 when we tested proportions.
Calculations
• Since we want to examine how well the observed data reflect what would be expected, it is natural to look at the differences between the observed and expected counts (Obs – Exp).
Calculations (cont.)
• The test statistic, called the chi-square (or chi-squared) statistic, is found by squaring each deviation between the observed and expected counts, dividing by the expected count, and summing over all cells:
$$\chi^2 = \sum_{\text{all cells}} \frac{(Obs - Exp)^2}{Exp}$$
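The formula translates directly into a few lines of code. This is a minimal sketch: the function name `chi_square_statistic` and the die-roll counts are hypothetical and are not taken from the slides.

```python
def chi_square_statistic(observed, expected):
    """Sum of (Obs - Exp)^2 / Exp over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data (not from the slides): 60 rolls of a die, fair-die model.
rolls = [8, 12, 9, 11, 14, 6]
fair = [10.0] * 6
print(chi_square_statistic(rolls, fair))
```

Each cell contributes its squared deviation scaled by the expected count, so a cell that is off by 4 when 10 were expected adds 1.6 to the total.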
One-Sided or Two-Sided?
• The chi-square statistic is used only for testing hypotheses, not for constructing confidence intervals.
• If the observed counts don’t match the expected, the statistic will be large; it can’t be “too small.”
• So the chi-square test is always one-sided.
  ◦ If the calculated statistic value is large enough, we’ll reject the null hypothesis.
One-Sided or Two-Sided?
• The mechanics may work like a one-sided test, but the interpretation of a chi-square test is in some ways many-sided.
• There are many ways the null hypothesis could be wrong.
• There’s no direction to the rejection of the null model: all we know is that it doesn’t fit.
Procedure
Procedure (cont.)
Expected Frequencies
If the expected frequencies are not all equal:
E = np
Each expected frequency is found by multiplying the sum of all observed frequencies (n) by the probability for the category (p).
Expected Frequencies
• The chi-square goodness-of-fit test is always a right-tailed test.
• For the chi-square goodness-of-fit test, the expected frequencies should be at least 5.
• When the expected frequency of a class or category is less than 5, this class or category can be combined with another class or category so that the expected frequency is at least 5.
Goodness-of-fit Test
Test Statistic
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Critical Values
1. Found in the table using k – 1 degrees of freedom, where k = number of categories
2. Goodness-of-fit hypothesis tests are always right-tailed.
EXAMPLE
• There are 4 TV sets located in the student center of a large university. At a particular time each day, four different soap operas (1, 2, 3, and 4) are viewed on these TV sets. The percentages of the audience captured by these shows during one semester were 25 percent, 30 percent, 25 percent, and 20 percent, respectively. During the first week of the following semester, 300 students are surveyed.
EXAMPLE (Continued)
• (a) If the viewing pattern has not changed, what number of students is expected to watch each soap opera?
• Solution: Based on the information, the expected values will be: 0.25 × 300 = 75, 0.30 × 300 = 90, 0.25 × 300 = 75, and 0.20 × 300 = 60.
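The rule E = np used in part (a) can be sketched as a small helper. The name `expected_counts` is an assumption made for this illustration; the audience shares and sample size are the ones given in the example.

```python
def expected_counts(n, proportions):
    """Expected count for each category: E = n * p."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions should sum to 1")
    return [n * p for p in proportions]

shares = [0.25, 0.30, 0.25, 0.20]    # last semester's audience shares
print(expected_counts(300, shares))  # approximately 75, 90, 75, 60
```

The check that the proportions sum to 1 guards against a mistyped model, since expected counts must add up to the sample size.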
EXAMPLE (Continued)
• (b) Suppose that the actual observed numbers of students viewing the soap operas are given in the following table. Test whether these numbers indicate a change at the 1 percent level of significance.
EXAMPLE (Continued)
• Solution: Given α = 0.01 and k = 4 categories, df = 4 – 1 = 3 and $\chi^2_{0.01,\,3}$ = 11.345. The observed and expected frequencies are given below.
EXAMPLE (Continued)
• Solution (continued): The χ² test statistic is computed below.
EXAMPLE (Continued)
• Solution (continued): P-value = .6828, P > α, so we fail to reject the null hypothesis.
EXAMPLE (Continued)
• Solution (continued):
[Figure: diagram showing the rejection region]
The Chi-Square Test for Goodness of Fit
Your Turn
• The Advanced Placement (AP) Statistics examination was first administered in May 1997. Students’ papers are graded on a scale of 1–5, with 5 being the highest score. Over 7,600 students took the exam in the first year, and the distribution of scores was as follows (not including exams that were scored late).

  Score      5     4     3     2     1
  Percent  15.3  22.0  24.8  19.8  18.1

• A distance learning class that took AP Statistics via satellite television had the following distribution of grades:

  Score       5    4    3    2    1
  Frequency   7   13    7    6    2

  Score    Observed   Expected %   Expected          (O − E)²/E
           Counts     (p_i)        Counts (n·p_i)
  5           7        15.3          5.355            0.50533
  4          13        22.0          7.700            3.6481
  3           7        24.8          8.680            0.32516
  2           6        19.8          6.930            0.12481
  1           2        18.1          6.335            2.9664
  Totals     35       100%          35                7.56976

Carry out an appropriate test to determine if the distribution of scores for students enrolled in the distance learning program is significantly different from the distribution of scores for all students who took the inaugural exam.
• We must be willing to treat this class of students as an SRS from the population of all distance learning students. We will proceed with caution. All expected counts are 5 or more.
• H0: The distribution of AP Statistics exam scores for distance learning students is the same as the distribution of scores for all students who took the May 1997 exam.
• Ha: The distribution of AP Statistics exam scores for distance learning students is different from the distribution of scores for all students who took the May 1997 exam.
• We will use a significance level of 0.05. There are 5 categories, meaning there are 5 – 1 = 4 degrees of freedom.
• χ² = 7.56976 with 4 degrees of freedom
• P-value = P($\chi^2_4$ > 7.56976) = .1087
• We do not have enough evidence to reject H0 since the P-value is greater than α. We do not have enough evidence to suggest that the distribution of scores of traditional students is different from the distribution of scores of the distance learning students.
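The whole calculation for this Your Turn can be reproduced in a short script. This is a sketch under one stated assumption: it uses the closed form for the right-tail area that holds only when df = 4, so it is not a general P-value routine. The variable names are illustrative.

```python
import math

observed = [7, 13, 7, 6, 2]                  # scores 5, 4, 3, 2, 1
percents = [15.3, 22.0, 24.8, 19.8, 18.1]    # first-year national distribution
n = sum(observed)
expected = [n * p / 100 for p in percents]
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# For df = 4 the right-tail area has the closed form exp(-x/2) * (1 + x/2).
p_value = math.exp(-statistic / 2) * (1 + statistic / 2)
print(round(statistic, 3), df, round(p_value, 4))
```

The printed statistic and P-value can be checked against the table total and the P-value reported in the solution.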
χ² TEST OF INDEPENDENCE
Independence
• Contingency tables categorize counts on two (or more) variables so that we can see whether the distribution of counts on one variable is contingent on the other.
• A test of whether the two categorical variables are independent examines the distribution of counts for one group of individuals classified according to both variables in a contingency table.
Definition
• Test of Independence
  This method tests the null hypothesis that the row variable and column variable in a contingency table are not related. (The null hypothesis is the statement that the row and column variables are independent.)
• The assumptions and conditions are the same as for the chi-square goodness-of-fit test:
  ◦ Counted Data Condition: The data must be counts.
  ◦ Randomization Condition and 10% Condition: As long as we don’t want to generalize, we don’t have to check these conditions.
  ◦ Expected Cell Frequency Condition: The expected count in each cell must be at least 5.
Test of Independence
Test Statistic
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Critical Values
1. Found in the table using degrees of freedom = (r – 1)(c – 1), where r is the number of rows and c is the number of columns
2. Tests of independence are always right-tailed.
Tests of Independence
H0: The row variable is independent of the column variable
H1: The row variable is dependent on (related to) the column variable
This procedure cannot be used to establish a direct cause-and-effect link between the variables in question.
Dependence means only that there is a relationship between the two variables.
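Expected Frequency for Contingency Tables
$$E = (\text{table total}) \cdot \frac{\text{row total}}{\text{table total}} \cdot \frac{\text{column total}}{\text{table total}} = n \cdot p$$
where n is the table total and p is the probability of a cell. This simplifies to
$$E = \frac{(\text{row total})(\text{column total})}{\text{table total}}$$
The table total is the total number of all observed frequencies in the table.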
Observed and Expected Frequencies

             Men    Women   Boys   Girls   Total
  Survived    332     318     29      27     706
  Died       1360     104     35      18    1517
  Total      1692     422     64      45    2223

We will use the mortality table from the Titanic to find expected frequencies. For the upper left-hand cell, we find:
$$E = \frac{(706)(1692)}{2223} = 537.360$$
Find the expected frequency for the lower left-hand cell, assuming independence between the row variable and the column variable.
$$E = \frac{(1517)(1692)}{2223} = 1154.640$$
Observed counts, with expected counts in parentheses:

             Men          Women       Boys       Girls      Total
  Survived    332          318         29         27         706
             (537.360)    (134.022)   (20.326)   (14.291)
  Died       1360          104         35         18        1517
             (1154.640)   (287.978)   (43.674)   (30.709)
  Total      1692          422         64         45        2223

To interpret this result for the lower left-hand cell, we can say that although 1360 men actually died, we would have expected 1154.64 men to die if survival is independent of whether the person is a man, woman, boy, or girl.
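The expected counts in the table above all come from the same rule, (row total)(column total)/(table total). A minimal sketch follows; the function name `expected_table` is hypothetical, and the counts are the Titanic figures from the slides.

```python
def expected_table(observed):
    """Expected counts under independence: (row total)(column total) / (table total)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in col_totals] for r in row_totals]

titanic = [
    [332, 318, 29, 27],    # survived: men, women, boys, girls
    [1360, 104, 35, 18],   # died
]
for row in expected_table(titanic):
    print([round(e, 3) for e in row])
```

Each row of expected counts sums to its observed row total, which is a quick sanity check on the arithmetic.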
Example: Using a 0.05 significance level, test the claim that when the Titanic sank, whether someone survived or died is independent of whether that person is a man, woman, boy, or girl.
H0: Whether a person survived is independent of whether the person is a man, woman, boy, or girl.
H1: Surviving the Titanic and being a man, woman, boy, or girl are dependent.
$$\chi^2 = \frac{(332-537.36)^2}{537.36} + \frac{(318-134.022)^2}{134.022} + \frac{(29-20.326)^2}{20.326} + \frac{(27-14.291)^2}{14.291} + \frac{(1360-1154.64)^2}{1154.64} + \frac{(104-287.978)^2}{287.978} + \frac{(35-43.674)^2}{43.674} + \frac{(18-30.709)^2}{30.709}$$
$$\chi^2 = 78.481 + 252.555 + 3.702 + 11.302 + 36.525 + 117.536 + 1.723 + 5.260 = 507.084$$
The number of degrees of freedom is (r – 1)(c – 1) = (2 – 1)(4 – 1) = 3.
Critical value: $\chi^2_{0.05,\,3}$ = 7.815. Since 507.084 > 7.815, we reject the null hypothesis.
P-value: P = P(χ² > 507.084) ≈ 0, so P < α and we reject the null hypothesis.
Survival and whether the person is a man, woman, boy, or girl are dependent.

Test Statistic: χ² = 507.084 with α = 0.05 and (r – 1)(c – 1) = (2 – 1)(4 – 1) = 3 degrees of freedom
Critical Value: χ² = 7.815 (from the table)
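The classical approach for this example can be sketched in code as follows. The function name `chi_square_two_way` is made up for this illustration, and the critical value is simply the table entry quoted above rather than something the code computes.

```python
def chi_square_two_way(observed):
    """Return (statistic, df) for a two-way table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / total
            statistic += (obs - exp) ** 2 / exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

titanic = [[332, 318, 29, 27], [1360, 104, 35, 18]]
statistic, df = chi_square_two_way(titanic)
critical_value = 7.815   # table value for alpha = 0.05, df = 3
print(round(statistic, 1), df, statistic > critical_value)
```

Because the code keeps unrounded expected counts, its statistic may differ from the hand calculation in the last decimal places.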

Procedure
Procedure (cont.)
EXAMPLE
• A survey was done by a car manufacturer concerning a particular make and model. A group of 500 potential customers was asked whether they purchased their current car because of its appearance, its performance rating, or its fixed price (no negotiating). The results, broken down by gender, are given on the next slide.
EXAMPLE (Continued)
Question: Do females feel differently than males about the three different criteria used in choosing a car, or do they feel basically the same?
Solution
• χ² test for independence.
• Thus the null hypothesis will be that the criterion used is independent of gender, while the alternative hypothesis will be that the criterion used is dependent on gender.
Solution (continued)
• The degrees of freedom are given by (number of rows – 1)(number of columns – 1).
• df = (2 – 1)(3 – 1) = 2.
• Calculate the row and column totals. These row and column totals are called marginal totals.
• Computation of the expected values
• The expected value for a cell is the row total times the column total divided by the table total.
• Let us use α = 0.01. So df = (2 – 1)(3 – 1) = 2 and $\chi^2_{0.01,\,2}$ = 9.210.
• The χ² test statistic is computed in the same manner as was done for the goodness-of-fit test.
• [Figure: diagram showing the rejection region]

Test of Homogeneity
Comparing Observed Distributions
• A test comparing the distribution of counts for two or more groups on the same categorical variable is called a chi-square test of homogeneity.
• A test of homogeneity is actually the generalization of the two-proportion z-test.
Comparing Observed Distributions (cont.)
• The statistic that we calculate for this test is identical to the chi-square statistic for independence.
• In this test, however, we ask whether choices are the same among different groups (i.e., there is no model).
• The expected counts are found directly from the data, and the degrees of freedom differ from those of the goodness-of-fit test: (r – 1)(c – 1) rather than k – 1.
• The assumptions and conditions are the same as for the chi-square goodness-of-fit test:
  ◦ Counted Data Condition: The data must be counts.
  ◦ Randomization Condition and 10% Condition: As long as we don’t want to generalize, we don’t have to check these conditions.
  ◦ Expected Cell Frequency Condition: The expected count in each cell must be at least 5.
Test for Homogeneity
• In a chi-square test for homogeneity of proportions, we test whether different populations have the same proportion of individuals with some characteristic.
• The procedures for performing a test of homogeneity are identical to those for a test of independence.
Example:
• The following question was asked of a random sample of individuals in 1992, 2002, and 2008: “Would you tell me if you feel being a teacher is an occupation of very great prestige?” The results of the survey are presented below:

         1992   2002   2008
  Yes     418    479    525
  No      602    541    485

• Test the claim that the proportion of individuals who feel being a teacher is an occupation of very great prestige is the same for each year at the α = 0.01 level of significance.
Solution
Step 1: The null hypothesis is a statement of “no difference,” so the proportions for each year who feel that being a teacher is an occupation of very great prestige are equal. We state the hypotheses as follows:
H0: $p_{1992} = p_{2002} = p_{2008}$
H1: At least one of the proportions is different from the others.
Step 2: The level of significance is α = 0.01.
Solution
Step 3:
(a) The expected frequencies are found by multiplying the appropriate row and column totals and then dividing by the total sample size. They are given in parentheses in the table below, along with the observed frequencies.

         1992        2002        2008
  Yes    418         479         525
         (475.554)   (475.554)   (470.892)
  No     602         541         485
         (544.446)   (544.446)   (539.108)

Solution
Step 3:
(b) Since none of the expected frequencies are less than 5, the requirements are satisfied.
(c) The test statistic is
$$\chi_0^2 = \frac{(418 - 475.554)^2}{475.554} + \frac{(479 - 475.554)^2}{475.554} + \cdots + \frac{(485 - 539.108)^2}{539.108} \approx 24.74$$
Solution: Classical Approach
Step 4: There are r = 2 rows and c = 3 columns, so we find the critical value using (2 – 1)(3 – 1) = 2 degrees of freedom.
The critical value is $\chi^2_{0.01} = 9.210$.
Solution: Classical Approach
Step 5: Since the test statistic, $\chi_0^2 = 24.74$, is greater than the critical value $\chi^2_{0.01} = 9.210$, we reject the null hypothesis.
Solution: P-Value Approach
Step 4: There are r = 2 rows and c = 3 columns, so we find the P-value using (2 – 1)(3 – 1) = 2 degrees of freedom. The P-value is the area under the chi-square distribution with 2 degrees of freedom to the right of $\chi_0^2 = 24.74$, which is approximately 0.
Solution: P-Value Approach
Step 5: Since the P-value is less than the level of significance α = 0.01, we reject the null hypothesis.
Solution
Step 6: There is sufficient evidence to reject the null hypothesis at the α = 0.01 level of significance. We conclude that the proportion of individuals who believe that teaching is a very prestigious career is different for at least one of the three years.
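One way to see why this is a test of equal proportions is to build the expected counts from a single pooled proportion, which gives the same numbers as the row-total-times-column-total rule. The sketch below takes that route; the variable names are illustrative, and the P-value line relies on the closed form that holds only for 2 degrees of freedom.

```python
import math

yes = [418, 479, 525]   # 1992, 2002, 2008
no = [602, 541, 485]
sizes = [y + n for y, n in zip(yes, no)]   # sample size for each year

# Under H0 every year shares one proportion, estimated by pooling the samples.
pooled = sum(yes) / sum(sizes)

statistic = 0.0
for y, n, size in zip(yes, no, sizes):
    exp_yes = size * pooled
    exp_no = size * (1 - pooled)
    statistic += (y - exp_yes) ** 2 / exp_yes + (n - exp_no) ** 2 / exp_no

p_value = math.exp(-statistic / 2)   # exact right-tail area when df = 2
print(round(pooled, 4), round(statistic, 2), p_value < 0.01)
```

The pooled proportion is the "Yes" row total divided by the table total, so multiplying it by a year's sample size reproduces the expected counts shown in Step 3(a).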
Example: Should Dentists Advertise?
• It may seem hard to believe, but until the 1970s most professional organizations prohibited their members from advertising. In 1977, the U.S. Supreme Court ruled that prohibiting doctors and lawyers from advertising violated their free speech rights.
Should Dentists Advertise?
• The paper “Should Dentists Advertise?” (J. of Advertising Research (June 1982): 33–38) compared the attitudes of consumers and dentists toward the advertising of dental services. Separate samples of 101 consumers and 124 dentists were asked to respond to the following statement: “I favor the use of advertising by dentists to attract new patients.”
• Possible responses were: strongly agree, agree, neutral, disagree, strongly disagree.
• The authors were interested in determining whether the two groups (dentists and consumers) differed in their attitudes toward advertising.
• This is done by a chi-squared test of homogeneity; that is, we are testing the claim that different populations have the same proportions across the categories of a second variable.
• So how should we state the null and alternative hypotheses for this test?
• H0: The true category proportions for all responses are the same for both populations of consumers and dentists.
• Ha: The true category proportions for all responses are not the same for both populations of consumers and dentists.
Observed Data

                              Response
  Group       Strongly   Agree   Neutral   Disagree   Strongly   Total
              Agree                                   Disagree
  Consumers      34        49       9         4          5        101
  Dentists        9        18      23        28         46        124
  Total          43        67      32        32         51        225

• How do we determine the expected cell count under the assumption of homogeneity?
• That’s right, the expected cell counts are estimated from the sample data (assuming that H0 is true) by using …
$$\text{expected cell count} = \frac{(\text{row marginal total})(\text{column marginal total})}{\text{total sample size}}$$
Expected Values
• So the calculation for the first cell is …
$$\text{1st expected cell count} = \frac{(101)(43)}{225} = 19.302$$
Expected Values
Observed counts, with expected counts in parentheses:

  Consumers     34        49        9         4         5        101
               (19.30)   (30.08)   (14.36)   (14.36)   (22.89)
  Dentists       9        18       23        28        46        124
               (23.70)   (36.92)   (17.64)   (17.64)   (28.11)
  Total         43        67       32        32        51        225
Test Statistic
• Now we can calculate the χ² test statistic:
$$\chi^2 = \sum \frac{(\text{Observed Count} - \text{Expected Count})^2}{\text{Expected Count}}$$
$$= \frac{(34 - 19.30)^2}{19.30} + \frac{(49 - 30.08)^2}{30.08} + \cdots + \frac{(46 - 28.11)^2}{28.11}$$
$$= 11.20 + 11.90 + 2.00 + \cdots + 11.39 = 84.47$$

Sampling Distribution
• The two-way table for this situation has 2 rows and 5 columns, so the appropriate degrees of freedom is (2 – 1)(5 – 1) = 4.
• Chi-squared critical value: χ²* = 9.49 (the table value for α = 0.05 with 4 degrees of freedom).
• Since χ² (84.47) > χ²* (9.49), we reject the null hypothesis.
P-value
• P-value: P = P(χ² > 84.47) ≈ 0. Reject the null hypothesis.
• Conclusion: With a P-value ≈ 0, we reject the null hypothesis. The true category proportions for all responses are not the same for both populations of consumers and dentists.
Homogeneity of Proportions
• An advertising firm has decided to ask 92 customers at each of three local shopping malls if they are willing to take part in a market research survey. According to previous studies, 38% of Americans refuse to take part in such surveys. At α = 0.01, test the claim that the proportions are equal.

                          Mall A   Mall B   Mall C   Total
  Will participate          52       45       36      133
  Will not participate      40       47       56      143
  Total                     92       92       92      276

• Step 1
  ◦ H0: p1 = p2 = p3
  ◦ Ha: At least one is different
• Step 2
  ◦ α = 0.01
• Step 3
  ◦ χ² test of homogeneity with df = (2 – 1)(3 – 1) = 2
• Step 4
  ◦ Put into your calculator
  ◦ Observed in matrix A
  ◦ Expected in matrix B
  ◦ Test statistic = 5.602
  ◦ P-value = 0.06
• Step 5
  ◦ Do not reject H0
• Step 6
  ◦ There is not sufficient evidence to suggest that at least one proportion is different.
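The calculator steps can be mirrored in code to check the reported statistic and P-value. This is only a sketch: `matrix_a` and `matrix_b` are names chosen to echo the calculator matrices, and the P-value uses the closed form that is valid for 2 degrees of freedom only.

```python
import math

matrix_a = [[52, 45, 36],    # will participate: malls A, B, C
            [40, 47, 56]]    # will not participate
row_totals = [sum(r) for r in matrix_a]
col_totals = [sum(c) for c in zip(*matrix_a)]
n = sum(row_totals)

# Expected counts, the role played by matrix B on the calculator.
matrix_b = [[r * c / n for c in col_totals] for r in row_totals]

statistic = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(matrix_a, matrix_b)
    for o, e in zip(obs_row, exp_row)
)
p_value = math.exp(-statistic / 2)   # exact right-tail area when df = 2
print(round(statistic, 3), round(p_value, 2), p_value < 0.01)
```

Since the P-value is larger than α = 0.01, the comparison in the last printed field is false, which matches the decision in Step 5.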
Chi-Square and Causation
• Chi-square tests are common, and tests for independence are especially widespread.
• We need to remember that a small P-value is not proof of causation.
• Since the chi-square test for independence treats the two variables symmetrically, we cannot differentiate the direction of any possible causation even if it existed.
• And, there’s never any way to eliminate the possibility that a lurking variable is responsible for the lack of independence.
Chi-Square and Causation (cont.)
• In some ways, a failure of independence between two categorical variables is less impressive than a strong, consistent, linear association between quantitative variables.
• Two categorical variables can fail the test of independence in many ways.
• Examining the standardized residuals can help you think about the underlying patterns.
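The slides do not spell out the formula for a standardized residual. The sketch below assumes the common textbook definition, (Obs − Exp)/√Exp for each cell, and applies it to the dentist advertising table from earlier in the chapter. The function name `standardized_residuals` is hypothetical.

```python
import math

def standardized_residuals(observed):
    """(Obs - Exp) / sqrt(Exp) for every cell of a two-way table (assumed definition)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    residuals = []
    for i, row in enumerate(observed):
        out = []
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / total
            out.append((obs - exp) / math.sqrt(exp))
        residuals.append(out)
    return residuals

# Dentist advertising data: strongly agree ... strongly disagree.
survey = [[34, 49, 9, 4, 5],      # consumers
          [9, 18, 23, 28, 46]]    # dentists
for group, row in zip(["Consumers", "Dentists"], standardized_residuals(survey)):
    print(group, [round(r, 2) for r in row])
```

Under this definition the squared residuals add up to the χ² statistic, and the signs show the direction of each departure: positive where a cell has more observations than expected, negative where it has fewer.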
CHI-SQUARE INFERENCE
• TEST FOR GOODNESS OF FIT
  • Used to determine if a particular population distribution fits a specified form
  HYPOTHESES:
  H0: Actual population percents are equal to hypothesized percentages
  Ha: Actual population percents are different from hypothesized percentages
• TEST FOR INDEPENDENCE
  • Used to determine if two variables within a single population are independent
  HYPOTHESES:
  H0: There is no relationship between the two variables in the population
  Ha: There is a dependent relationship between the two variables in the population
• TEST FOR HOMOGENEITY
  • Used to determine if two or more separate populations are similar with respect to a single variable
  HYPOTHESES:
  H0: There are no differences among proportions of success in the populations
  Ha: There are differences among proportions of success in the populations
