Notes12 2c
The $\chi^2$ test for independence is the third and final type of $\chi^2$ test we will learn. It tests for an association
between 2 categorical variables in 1 population.
Among Monte Vista high school seniors, is there an association between gender and having a driver’s
license? Suppose that a random sample of 100 seniors was taken and the subjects were asked their gender
and whether or not they had a driver’s license. Does the data suggest that an association exists?
Observed counts:
              Male   Female   Total
License         36       32      68
No License      11       21      32
Total           47       53     100
If there were no association between the two variables, what would the expected counts be?
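If gender and having a license are independent, then P(M and L) = P(M) · P(L).
Thus, since there are 100 people in our sample, the expected number of males with licenses should be:
$$100 \cdot P(M \text{ and } L) = 100 \left(\frac{47}{100}\right)\left(\frac{68}{100}\right) = \frac{47 \cdot 68}{100} = \frac{\text{row total} \times \text{column total}}{\text{grand total}} = 31.96$$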
Expected counts:
              Male   Female   Total
License      31.96    36.04      68
No License   15.04    16.96      32
Total           47       53     100
Note: Even though this is not a HOP test, the method for calculating expected cell counts is the same.
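The expected-count rule (row total × column total / grand total) can be checked with a short script. This is only a sketch: the function name `expected_counts` and the list-of-rows layout are assumptions made for illustration, not part of the notes.

```python
# Observed counts from the gender / license example.
# Rows: License, No License. Columns: Male, Female.
observed = [
    [36, 32],
    [11, 21],
]

def expected_counts(table):
    """Expected counts under no association: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
for label, row in zip(["License", "No License"], expected):
    print(label, [round(x, 2) for x in row])

# Large-sample condition: every expected count should be at least 5.
large_enough = all(x >= 5 for row in expected for x in row)
print("All expected counts >= 5:", large_enough)
```

The rounded values should agree with the expected counts in the table above, and the last line checks the "sample size is large" condition used in the test.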
5 steps:
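1. At first glance, it appears that there is an association between gender and having a license since the
observed counts are different from the expected counts. However, it is possible that the variables have no
association and the differences we see are due to sampling variability. To decide, we will conduct a
Chi-square test for Independence ($\alpha = .05$).
2. H0: There is no association between gender and having a driver’s license among Monte Vista seniors.
Ha: There is an association between gender and having a driver’s license among Monte Vista seniors.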
3. Conditions:
a. The data comes from a random sample of Monte Vista seniors? Given.
b. The sample size is large? Yes, all expected cell counts are ≥ 5 (see table above).
c. Sample < 10% of population? This requires assuming > 1000 Monte Vista seniors, which is not reasonable.
Proceed with caution.
4. $\chi^2 = \frac{(36 - 31.96)^2}{31.96} + \cdots = 3.01$, df = (2 - 1)(2 - 1) = 1, $P(\chi^2 > 3.01) = .0827$
5. Since the P-value > $\alpha$, we fail to reject the null hypothesis and cannot conclude that there is an association
between gender and having a license.
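As a check on step 4, here is a minimal sketch that sums (observed - expected)² / expected over all four cells and finds the P-value. The function names are hypothetical. The P-value uses the fact that, for 1 degree of freedom only, P($\chi^2$ > x) = erfc(√(x/2)), so only the Python standard library is needed.

```python
import math

# Observed counts. Rows: License, No License. Columns: Male, Female.
observed = [[36, 32], [11, 21]]

def chi_square_statistic(table):
    """Return (statistic, df) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - exp) ** 2 / exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

def p_value_df1(stat):
    """Upper-tail area for chi-square with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

stat, df = chi_square_statistic(observed)
p_value = p_value_df1(stat)
alpha = 0.05
print(round(stat, 2), df, round(p_value, 4))
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

The shortcut in `p_value_df1` is valid only when df = 1. For larger tables a chi-square table or a calculator's $\chi^2$ cdf function is needed instead.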
Innovative Machines Incorporated has developed two new letter arrangements for computer keyboards. The
company randomly selected 300 beginning typing students and randomly assigned them to keyboard types. The
company wishes to see if there is any relationship between the arrangement of letters on the keyboard and
the number of hours it takes a new typing student to learn to type at 20 words per minute. Or, from another
point of view, is the time it takes a student to learn to type INDEPENDENT of the arrangement of the letters
on a keyboard? Perform a hypothesis test based on the data in the chart below:
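1. At first glance, it appears that there is an association between keyboard type and the number of hours it takes to
master the keyboard since the observed counts are different from the expected counts. However, it is possible
that the variables have no association and the differences we see are due to sampling variability. To decide,
we will conduct a Chi-square test for Independence ($\alpha = .05$).
2. H0: There is no association between keyboard type and the number of hours it takes to master the keyboard.
Ha: There is an association between keyboard type and the number of hours it takes to master the keyboard.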
3. Conditions:
a. The data comes from a random sample of new typists? Given.
b. The sample size is large? Yes, all expected cell counts are ≥ 5 (see table above).
4. $\chi^2 = \frac{(25 - 24)^2}{24} + \cdots = 13.32$, df = (3 - 1)(3 - 1) = 4, $P(\chi^2 > 13.32) = .0098$
5. Since the P-value < $\alpha$, we reject the null hypothesis and conclude that there is an association between
keyboard type and the number of hours required to master the keyboard. The time it takes a student to learn to type is not
independent of the keyboard type.
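The P-value in step 4 can be checked from the reported statistic alone. The sketch below assumes only the values stated in the notes ($\chi^2$ = 13.32, a 3 × 3 table) and uses a closed form that holds when df is even: P($\chi^2$ > x) = e^(-x/2) · Σ (x/2)^k / k! for k = 0, ..., df/2 - 1. The function name is hypothetical.

```python
import math

def chi_square_upper_tail_even_df(stat, df):
    """P(chi-square > stat) when df is a positive even integer."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive even df")
    half = stat / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

# Values reported in the keyboard example: 3 rows, 3 columns, chi-square = 13.32.
df = (3 - 1) * (3 - 1)
p_value = chi_square_upper_tail_even_df(13.32, df)
alpha = 0.05
print(df, round(p_value, 4))
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

For odd df this shortcut does not apply, and a chi-square table or calculator is the usual route.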
HW #94: 12.16, 12.30, 12.40, 12.43