Differences Among k Proportions

Chi-Square: Tests of Independence and of Homogeneity, Goodness-of-Fit Test
There are also many problems in which we must decide whether observed
differences among more than two sample proportions can be attributed to
chance, or whether they are indicative of the fact that corresponding population
proportions are not all equal.
Example: If 17 of 200 brand S tires, 34 of 200 brand G tires, and 21 of 200
brand Z tires failed to last 30,000 miles, we may want to decide whether the
differences among these proportions are significant or whether they can be
attributed to chance.
Example: A student is doing a research project involving pet preferences
among students at his university. He took random samples of 300 female and 250
male students. Each sample member responded to the survey question, “If you
could own only one pet, what kind would you choose?” The possible responses
were: dog, cat, other pet, no pet. The results of the study follow:
Gender    Dog    Cat    Other Pets    No Pet
Female    120    132    18            30
Male      135     70    20            25

Does the same proportion of males as females prefer each type of pet?
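
To see what "the same proportion" means for this table, the following minimal Python sketch converts each gender's counts into row proportions. The names `observed` and `row_proportions` are illustrative assumptions, not part of the original handout, and the sketch only describes the sample; it does not carry out the test itself.

```python
# Observed counts from the pet-preference table (Dog, Cat, Other Pets, No Pet).
pets = ["Dog", "Cat", "Other Pets", "No Pet"]
observed = {
    "Female": [120, 132, 18, 30],
    "Male": [135, 70, 20, 25],
}


def row_proportions(counts):
    """Return each count divided by the total of its row."""
    total = sum(counts)
    return [c / total for c in counts]


# Sample proportions within each gender (hypothetical helper names).
proportions = {gender: row_proportions(c) for gender, c in observed.items()}

for gender, props in proportions.items():
    line = ", ".join(f"{pet}: {p:.3f}" for pet, p in zip(pets, props))
    print(f"{gender} (n = {sum(observed[gender])}): {line}")
```

If the population proportions were the same for both genders, the two printed rows would differ only by sampling variation; the chi-square test quantifies whether the observed gap is larger than chance would explain.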
Example: Last year, the labor union bargaining agents listed five categories
and asked each employee to mark the one most important to her or him. The
categories and corresponding percentages of favorable responses are shown
below. The bargaining agents need to determine if the current distribution of
responses fits last year’s distribution or if it is different.

Category                          Percentage of Favorable Responses
Vacation Time                     4%
Salary                            65%
Safety Regulations                13%
Health and Retirement Benefits    12%
Overtime Policy and Pay           6%

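The Chi-Square Statistic:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O is the observed frequency in a cell and E is the expected frequency of that cell:

$$E = \frac{(\text{row total})(\text{column total})}{\text{sample size}}$$

With degrees of freedom: $df = (R - 1)(C - 1)$, where R is the number of rows and C is the number of columns.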
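
As a rough illustration of the statistic, here is a minimal Python sketch that computes the expected frequencies, the chi-square value, and the degrees of freedom for any table of counts. The function name `chi_square_independence` is an assumption made for this sketch. It is applied to the tire example, with the "lasted" row derived as 200 minus the number of failures for each brand.

```python
def chi_square_independence(table):
    """Return (chi2, df, expected) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # E = (row total)(column total) / (sample size) for every cell.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Sum of (O - E)^2 / E over all cells.
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Tire example: brands S, G, Z, each with a sample of 200 tires.
failed = [17, 34, 21]
lasted = [200 - f for f in failed]  # derived, not listed in the text

chi2, df, expected = chi_square_independence([failed, lasted])
print(f"chi-square = {chi2:.4f}, df = {df}")
```

The computed value would then be compared with the critical value of chi-square for the resulting degrees of freedom at whatever level of significance is chosen.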


Example: Innovative Machines Incorporated has developed two new letter
arrangements for computer keyboards. The company wishes to see if there is any
relationship between the arrangement of letters on the keyboard and the number
of hours it takes a new typing student to learn to type at 20 words per minute.
Or, from another point of view, is the time it takes a student to learn to type
independent of the arrangement of the letters on a keyboard?
Keyboard        21–40 h    41–60 h    61–80 h    Row Total
A               25 (24)    30 (40)    25 (16)     80
B               30 (36)    71 (60)    19 (24)    120
Standard        35 (30)    49 (50)    16 (20)    100
Column Total    90         150        60         300 (sample size)
(Expected frequencies are shown in parentheses.)

Step 1:
$H_0$: Keyboard arrangement and learning times are independent.
$H_1$: Keyboard arrangement and learning times are not independent.
Step 2:
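Step 3: $\chi^2$ statistic, with $df = (3 - 1)(3 - 1) = 4$
Step 4: Reject $H_0$ if the computed $\chi^2$ is greater than the critical value of $\chi^2$ for $df = 4$ at the chosen level of significance.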
Step 5:
Keyboard        21–40 h    41–60 h    61–80 h    Row Total
A               25         30         25          80
B               30         71         19         120
Standard        35         49         16         100
Column Total    90         150        60         300 (sample size)

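Table of observed frequencies, with expected frequencies E = (row total)(column total)/(sample size) in parentheses

Keyboard        21–40 h    41–60 h    61–80 h    Row Total
A               25 (24)    30 (40)    25 (16)     80
B               30 (36)    71 (60)    19 (24)    120
Standard        35 (30)    49 (50)    16 (20)    100
Column Total    90         150        60         300 (sample size)

For example, for keyboard A and 21–40 h, $E = \frac{(80)(90)}{300} = 24$.

$$
\begin{aligned}
\chi^2 = \sum \frac{(O - E)^2}{E}
&= \frac{(25-24)^2}{24} + \frac{(30-40)^2}{40} + \frac{(25-16)^2}{16} \\
&\quad + \frac{(30-36)^2}{36} + \frac{(71-60)^2}{60} + \frac{(19-24)^2}{24} \\
&\quad + \frac{(35-30)^2}{30} + \frac{(49-50)^2}{50} + \frac{(16-20)^2}{20} \approx 13.3158
\end{aligned}
$$
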

Step 6: Is the computed $\chi^2$ (13.3158) greater than the critical value? Yes; reject $H_0$ and accept $H_1$.
Step 7: Keyboard arrangement and learning times are not independent.
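
The Step 5 arithmetic for the keyboard example can be checked with a short Python sketch. It lists each cell's expected frequency and its contribution (O - E)^2 / E, which also shows which cells drive the result. The variable names (`observed`, `contributions`) are assumptions made for this illustration.

```python
rows = ["A", "B", "Standard"]
cols = ["21-40 h", "41-60 h", "61-80 h"]
observed = [
    [25, 30, 25],
    [30, 71, 19],
    [35, 49, 16],
]

row_totals = [sum(r) for r in observed]        # 80, 120, 100
col_totals = [sum(c) for c in zip(*observed)]  # 90, 150, 60
n = sum(row_totals)                            # 300

chi2 = 0.0
contributions = {}
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n  # expected frequency
        term = (o - e) ** 2 / e                # this cell's contribution
        contributions[(rows[i], cols[j])] = term
        chi2 += term
        print(f"{rows[i]:<9}{cols[j]:<9} O = {o:>3}  E = {e:>5.1f}  term = {term:.4f}")

df = (len(rows) - 1) * (len(cols) - 1)
print(f"chi-square = {chi2:.4f}, df = {df}")
```

In this sketch the largest single contribution comes from keyboard A in the 61-80 h column, where 25 students were observed against an expected 16.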

Example: Following are three random samples of salespersons in a large
telemarketing company, classified by age and sales performance. All
sample sizes are fixed at $n = 60$. Test, at level of significance $\alpha$, whether the sales performance
of these persons is independent of their ages.
                                        Youth    Middle Age    Senior    Total
Highest quarter of sales performance    13       17            18        48
Middle half of sales performance        31       30            28        89
Lowest quarter of sales performance     16       13            14        43
Total                                   60       60            60        180


Step 1:
$H_0$: Age and sales performance are independent (unrelated).
$H_1$: Age and sales performance are not independent (related).
Step 2:
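Step 3: $\chi^2$ statistic
Step 4: Reject $H_0$ if the computed $\chi^2$ is greater than the critical value of $\chi^2$ for $df = (3 - 1)(3 - 1) = 4$ at the chosen level of significance.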


Step 5:

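Table of observed frequencies, with expected frequencies E = (row total)(column total)/(sample size) in parentheses

                                        Youth         Middle Age    Senior        Total
Highest quarter of sales performance    13 (16)       17 (16)       18 (16)       48
Middle half of sales performance        31 (29.67)    30 (29.67)    28 (29.67)    89
Lowest quarter of sales performance     16 (14.33)    13 (14.33)    14 (14.33)    43
Total                                   60            60            60            180

$$
\begin{aligned}
\chi^2 = \sum \frac{(O - E)^2}{E}
&= \frac{(13-16)^2}{16} + \frac{(17-16)^2}{16} + \frac{(18-16)^2}{16} \\
&\quad + \frac{(31-29.67)^2}{29.67} + \frac{(30-29.67)^2}{29.67} + \frac{(28-29.67)^2}{29.67} \\
&\quad + \frac{(16-14.33)^2}{14.33} + \frac{(13-14.33)^2}{14.33} + \frac{(14-14.33)^2}{14.33} \approx 1.358
\end{aligned}
$$
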

Step 6: Is the computed $\chi^2$ (1.358) greater than the critical value? No; do not reject $H_0$.

Step 7: Age and sales performance are not related (or are independent).
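
The decision in Step 6 can be sketched in Python by comparing the computed statistic with critical values from a standard chi-square table. The level of significance for this example is not shown in the text, so the three alpha values below are assumptions chosen only because they are the most commonly tabulated ones for df = 4.

```python
observed = [
    [13, 17, 18],  # highest quarter of sales performance
    [31, 30, 28],  # middle half of sales performance
    [16, 13, 14],  # lowest quarter of sales performance
]

row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
n = sum(row_totals)

# Sum of (O - E)^2 / E with E = (row total)(column total) / n.
chi2 = sum(
    (o - r * c / n) ** 2 / (r * c / n)
    for row, r in zip(observed, row_totals)
    for o, c in zip(row, col_totals)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Standard table values for df = 4; the alpha levels are assumptions.
critical_values_df4 = {0.10: 7.779, 0.05: 9.488, 0.01: 13.277}
decisions = {alpha: chi2 > crit for alpha, crit in critical_values_df4.items()}

print(f"chi-square = {chi2:.3f}, df = {df}")
for alpha, reject in decisions.items():
    verdict = "reject H0" if reject else "do not reject H0"
    print(f"alpha = {alpha}: critical value {critical_values_df4[alpha]} -> {verdict}")
```

Because the computed value is far below all three critical values, the conclusion of Step 7 does not depend on which of these levels is used.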


Chi-Square: Goodness-of-Fit Test

The Chi-Square Statistic:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

With degrees of freedom: $df = k - 1$, where k is the number of categories.


Example: Last year, the labor union bargaining agents listed five categories and
asked each employee to mark the one most important to her or him. The
categories and corresponding percentages of favorable responses are shown
below. The bargaining agents need to determine if the current distribution of
responses fits last year’s distribution or if it is different.

Category                          Percentage of Favorable Responses
Vacation Time                     4%
Salary                            65%
Safety Regulations                13%
Health and Retirement Benefits    12%
Overtime Policy and Pay           6%

In the case of a goodness-of-fit test, we use the null hypothesis to compute E for the categories.

Category                          Observed (O)    Expected (E)        (O − E)^2    (O − E)^2 / E
Vacation Time                     30              4% of 500 = 20      100          5.00
Salary                            290             65% of 500 = 325    1225         3.77
Safety Regulations                70              13% of 500 = 65     25           0.38
Health and Retirement Benefits    70              12% of 500 = 60     100          1.67
Overtime Policy and Pay           40              6% of 500 = 30      100          3.33

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 5.00 + 3.77 + 0.38 + 1.67 + 3.33 = 14.15$$

Step 1:
$H_0$: The present distribution of responses is the same as last year’s.
$H_1$: The present distribution of responses is different.
Step 2:
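Step 3: $\chi^2$ statistic, with $df = 5 - 1 = 4$
Step 4: Reject $H_0$ if the computed $\chi^2$ is greater than the critical value of $\chi^2$ for $df = 4$ at the chosen level of significance.
Step 5: $\chi^2 = \sum \frac{(O - E)^2}{E} = 14.15$
Step 6: Is the computed $\chi^2$ (14.15) greater than the critical value? Yes; reject $H_0$ and accept $H_1$.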
Step 7: The present distribution of responses is different.
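
The goodness-of-fit computation for the labor union example can be sketched in Python as follows. Expected counts come from applying last year's percentages to the current sample of 500 responses. The variable names are assumptions made for this illustration.

```python
categories = [
    "Vacation Time",
    "Salary",
    "Safety Regulations",
    "Health and Retirement Benefits",
    "Overtime Policy and Pay",
]
last_year_pct = [4, 65, 13, 12, 6]   # last year's distribution (null hypothesis)
observed = [30, 290, 70, 70, 40]     # current responses

n = sum(observed)                    # 500
# Expected count = last year's percentage applied to the current sample size.
expected = [p * n / 100 for p in last_year_pct]

terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = sum(terms)
df = len(categories) - 1

for name, o, e, t in zip(categories, observed, expected, terms):
    print(f"{name:<32} O = {o:>3}  E = {e:>5.1f}  (O-E)^2/E = {t:.2f}")
print(f"chi-square = {chi2:.2f}, df = {df}")
```

Summing the unrounded terms gives about 14.15, matching the table total obtained from the rounded entries.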

Homework for Saturday, August 17 on 1 bond paper:

1. The accuracy of a survey report from Pulse Asia in Metro Manila was
questioned by some government officials. A random sample of 1215
people living in the city was used to check the report, and the results are
shown here:

Candidate            Census Percent    Sample Result
Vam Aquino           10%               127
Cheese Escudero      3%                40
Piaya Hontiveros     38%               480
Kookie Pimentel      41%               502
Ed,…” Hug me don”    6%                56
Hanep Billiard       2%                10
Using a 1% level of significance, test the claim that the census distribution
and the sample distribution agree.

2. Miss Yu, a sociologist, is doing a study to see if there is a relationship
between the age of a young adult (18 to 35 years old) and the type of
movie preferred. A random sample of 93 adults revealed the following
data. Test whether age and type of movie preferred are independent at
the 0.05 level of significance.

Movie              18–23 years old    24–29 years old    30–35 years old
Drama              8                  15                 11
Science Fiction    12                 10                 8
Comedy             9                  8                  12