CHI-SQUARE

St. Paul University Philippines
Tuguegarao City
CHI-SQUARE DISTRIBUTION
• Unlike the normal and Student's t
distributions, the Chi-Square distribution is
not symmetric.
• The values of the Chi-Square distribution
can be 0 or positive, but they cannot be
negative.
• The Chi-Square distribution is different for
each number of degrees of freedom.
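A quick simulation can make these three properties concrete. This sketch relies on the standard fact that a chi-square variable with k degrees of freedom is the sum of k squared independent standard normal values; the helper name `sample_chi_square`, the seed, and the sample size are choices made here for illustration only.

```python
import random
import statistics


def sample_chi_square(df, rng):
    """Draw one chi-square value as the sum of df squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
for df in (2, 4, 10):
    draws = [sample_chi_square(df, rng) for _ in range(20000)]
    print(
        f"df={df:2d}  min={min(draws):.3f}  "
        f"mean={statistics.mean(draws):.2f}  "
        f"median={statistics.median(draws):.2f}"
    )
```

In a run of this sketch the minimum never drops below 0, the mean sits close to df, and the median falls below the mean, which reflects the right skew. The numbers also shift with df, so each number of degrees of freedom gives a different distribution.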
CHI-SQUARE DISTRIBUTION
• The Goodness-of-fit test is used to test the
hypothesis that an observed frequency fits
(or conforms) to some claimed theoretical
distribution.
• The Test Statistic for Goodness-of-Fit Tests
is χ² = Σ (O - E)² / E
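The statistic can be sketched in a few lines of Python. The function name `chi_square_statistic` and the three-category counts below are illustrative assumptions, not part of the slides; O and E are passed as lists of equal length.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up counts for illustration: 3 categories, 20 expected in each.
example_value = chi_square_statistic([18, 22, 20], [20, 20, 20])
print(example_value)
```

Each category contributes its own (O - E)² / E term, so a large gap between observed and expected counts in even one category can push the statistic up.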
EXAMPLE 1
• A study was made of 147 industrial accidents
that required medical attention. The sample
data are summarized in the table below. Test
the claim that accidents occur with equal
proportions on the 5 workdays.
Day                  Mon   Tues   Wed   Thurs   Fri
Observed Accidents    31     42    18      25    31
SOLUTION
Chi-square problems are solved by following
the six steps of hypothesis testing in
sequence, as with the z-test, t-test and
F-test. The tabular (critical) value of
chi-square can be found in the chi-square
table in the appendix of most statistics
books.
Solution
1. Ho: The accidents occur with equal
proportions on the 5 workdays.
P₁ = P₂ = P₃ = P₄ = P₅
Ha: At least one of the proportions is
different from the others.
2. α = .05 since the significance level is not
specified
3. Chi-square (χ²) test
4. To find the tabular value
Degrees of freedom = number of categories (c) - 1
c = 5, since there are 5 workdays
df = c - 1 = 5 - 1 = 4
tabular value = 9.488 with α = .05 (found in
the chi-square table in the appendix)
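The tabular value can be sanity-checked without a table when df is even, because the right-tail area of the chi-square distribution then has a closed form: exp(-x/2) times the sum of (x/2)^k / k! for k from 0 to df/2 - 1. The sketch below uses that identity; the function name is hypothetical, and the approach does not cover odd df.

```python
import math


def chi_square_right_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of df.

    Uses exp(-x/2) * sum_{k < df/2} (x/2)^k / k!, valid only for even df.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    series = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * series


# Area to the right of the tabular value 9.488 with df = 4
right_tail = chi_square_right_tail_even_df(9.488, 4)
print(f"right-tail area = {right_tail:.4f}")
```

The area comes out very close to .05, which agrees with reading 9.488 from the α = .05 column at df = 4.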
5. Computation
A. Compute for the expected frequency (fe)
Formula
fe = total frequency(n) x proportion(p)
fe = np
n = 31 + 42 + 18 + 25 + 31 = 147
p = 1/5 or 20%, since under Ho the probability
of an accident on any given workday is 1/5
fe = (147)(1/5) = 29.4 for each workday
Value of fe

Day                  Mon    Tues    Wed   Thurs    Fri
Observed Accidents    31      42     18      25     31
Expected Accidents  29.4    29.4   29.4    29.4   29.4

B. Compute for the χ² value
Formula:
χ² = Σ (observed frequency - expected frequency)² / expected frequency
χ² = Σ (O - E)² / E
χ² = (31 - 29.4)²/29.4 + (42 - 29.4)²/29.4 + (18 - 29.4)²/29.4
+ (25 - 29.4)²/29.4 + (31 - 29.4)²/29.4
χ² = 10.65
6. Decision
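Since the computed value 10.65 is
greater than the tabular value 9.488,
reject Ho. There is sufficient evidence to
reject the claim that accidents occur with
equal proportions on the 5 workdays.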
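The arithmetic of Example 1 can be reproduced with a short script. This is a minimal sketch that uses only the numbers given in the example; the variable names are arbitrary, and the critical value is typed in from the table rather than computed.

```python
observed = [31, 42, 18, 25, 31]  # Mon, Tues, Wed, Thurs, Fri
n = sum(observed)  # total frequency, 147

# Under Ho every workday has proportion 1/5, so fe = n * p is the same each day.
expected = [n / len(observed)] * len(observed)

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
critical_value = 9.488  # tabular value for df = 4, alpha = .05

print(f"n = {n}, fe = {expected[0]:.1f}")
print(f"chi-square = {chi_square:.2f}")
print("reject Ho" if chi_square > critical_value else "do not reject Ho")
```

The script follows steps 5 and 6 directly: expected frequencies first, then the sum of the five (O - E)² / E terms, then the comparison with the tabular value.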
EXAMPLE 2
• Mars, Inc. claims that its M & M candies are distributed
with the color percentages of 30% brown, 20% yellow,
20% red, 10% orange, 10% green and 10% tan. A
classroom exercise resulted in the observed frequencies
listed in the table below. At the 0.05 significance level, test
the claim that the color distribution is as claimed by
Mars, Inc.
Color                Brown   Yellow   Red   Orange   Green   Tan
Observed Frequency      84       79    75       49      36    47
Solution
1. Ho: The color distribution is as claimed by
Mars, Inc.: P(brown) = .30, P(yellow) = .20,
P(red) = .20, P(orange) = .10, P(green) = .10,
P(tan) = .10
Ha: At least one of the color proportions is
different from its claimed value.
2. α = .05, as specified in the problem
3. Chi-square (χ²) test
4. To find the tabular value
Degrees of freedom = number of categories (c) - 1
c = 6, since there are 6 different colors
df = c - 1 = 6 - 1 = 5
tabular value = 11.071 with α = .05 (found
in the chi-square table in the appendix)
5. Computation
A. Compute for the expected frequency (fe)
Formula
fe = total frequency(n) x proportion(p)
fe = np
n = 84+ 79+ 75 + 49 + 36 + 47 = 370
fe = (370)(.30) = 111 for color brown
fe = (370)(.20) = 74 for color yellow
fe = (370)(.20) = 74 for color red
fe = (370)(.10) = 37 for color orange
fe = (370)(.10) = 37 for color green
fe = (370)(.10) = 37 for color tan
Observed frequency with the corresponding expected frequency

Color                Brown   Yellow   Red   Orange   Green   Tan
Observed frequency      84       79    75       49      36    47
Expected frequency     111       74    74       37      37    37

B. Compute for the χ² value
Formula:
χ² = Σ (observed frequency - expected frequency)² / expected frequency
χ² = Σ (O - E)² / E
χ² = (84 - 111)²/111 + (79 - 74)²/74 + (75 - 74)²/74
+ (49 - 37)²/37 + (36 - 37)²/37 + (47 - 37)²/37
χ² = 13.541
6. Decision
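Since the computed value 13.541 is
greater than the tabular value 11.071,
reject Ho. There is sufficient evidence to
reject the claim that the color distribution
is as stated by Mars, Inc.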
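Example 2 differs from Example 1 only in that the claimed proportions are unequal, so each color gets its own expected frequency. The sketch below repeats the calculation with the figures from the example; the variable names are arbitrary and the critical value is copied from the table.

```python
colors = ["brown", "yellow", "red", "orange", "green", "tan"]
observed = [84, 79, 75, 49, 36, 47]
claimed = [0.30, 0.20, 0.20, 0.10, 0.10, 0.10]  # proportions claimed by Mars, Inc.

n = sum(observed)  # total frequency, 370
expected = [n * p for p in claimed]  # fe = n * p for each color
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(terms)
critical_value = 11.071  # tabular value for df = 5, alpha = .05

for color, o, e, term in zip(colors, observed, expected, terms):
    print(f"{color:7s} O = {o:3d}  E = {e:6.1f}  (O - E)^2 / E = {term:.4f}")
print(f"chi-square = {chi_square:.3f}")
print("reject Ho" if chi_square > critical_value else "do not reject Ho")
```

Printing the individual terms shows which colors drive the result: brown and orange contribute most of the total.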
Contingency Tables
• A contingency table (or two-way table) is a
table in which frequencies correspond to
two variables. (One variable is used to
categorize rows, and a second variable is
used to categorize columns). A test of
independence tests the null hypothesis that
the row variable and column variable in a
contingency table are not related.
ASSUMPTIONS
• When working with data in the form of a
contingency table, we test the null
hypothesis that the row variable and the
column variable are independent and the
following assumptions apply:
– The sample data are randomly selected.
– For every cell in the contingency table, the
expected frequency E is at least 5.
Test Statistic for a Test of Independence
• χ² = Σ (O - E)² / E
• Critical Values
– Tests of independence with contingency tables
involve only right-tailed critical regions.
– In a contingency table with r rows and c
columns, the number of degrees of freedom is
given by df = (r - 1) (c - 1)
Expected Frequency for a Contingency Table
Expected Freq. E = (row total)(column total) / grand total
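The expected-frequency rule applies cell by cell, which is easy to sketch for a table stored as a list of rows. The function name `expected_frequencies` and the 2 x 2 counts are illustrative assumptions, not data from the slides.

```python
def expected_frequencies(table):
    """Return E = (row total)(column total) / grand total for every cell."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    return [
        [r * c / grand_total for c in column_totals]
        for r in row_totals
    ]


# Made-up 2 x 2 table for illustration
demo_table = [[10, 20], [30, 40]]
demo_expected = expected_frequencies(demo_table)
for row in demo_expected:
    print(row)
```

The expected table keeps the same row and column totals as the observed table; only the way the counts are spread across the cells changes.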
Example
Test the hypothesis that there is no significant relationship between
the sex of employees and their job satisfaction level if in a certain
company in Makati, the following results were obtained.
              Job Satisfaction Level
Sex        Low   Medium   High   Total
Male        45       60     55     160
Female       9       10     10      29
Total       54       70     65     189
Solution
1. Ho: There is no significant relationship between
sex and job satisfaction level of 189 employees.
Ha: There is a significant relationship between sex
and job satisfaction level of 189 employees.
2. α = .05 since the significance level is not specified
3. Chi-square (χ²) test
4. To find the tabular value
Degrees of freedom = (number of rows - 1)(number of columns - 1)
df = (r - 1)(c - 1)
r = 2, since sex is classified as male or
female
c = 3, since job satisfaction level is
classified as low, medium and high
df = (2 - 1)(3 - 1) = 2
tabular value = 5.99 with α = .05 (found in
the appendix of chi square table)
5. Computation
A. Compute for the expected frequency (fe)
Formula
Expected Freq. E = (row total)(column total) / grand total
fe (45) = (160)(54)/189 = 45.71
fe (9) = (29)(54)/189 = 8.29
fe (60) = (160)(70)/189 = 59.26
fe (10) = (29)(70)/189 = 10.74
fe (55) = (160)(65)/189 = 55.03
fe (10) = (29)(65)/189 = 9.97
Observed frequency with the corresponding expected frequency

Cell                 Male/Low   Female/Low   Male/Medium   Female/Medium   Male/High   Female/High
Observed frequency         45            9            60              10          55            10
Expected frequency      45.71         8.29         59.26           10.74       55.03          9.97

B. Compute for the χ² value
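Formula:
χ² = Σ (observed frequency - expected frequency)² / expected frequency
χ² = Σ (O - E)² / E
χ² = (45 - 45.71)²/45.71 + (9 - 8.29)²/8.29 + (60 - 59.26)²/59.26
+ (10 - 10.74)²/10.74 + (55 - 55.03)²/55.03 + (10 - 9.97)²/9.97
χ² = 0.13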
6. Decision
Since the computed value 0.13 is less
than the tabular value 5.99, do not reject
(accept) Ho. This means that there is no
significant relationship between sex and job
satisfaction level of the 189 employees.
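The whole test of independence for this example can be checked with a short script. This sketch recomputes the row, column and grand totals from the six cell counts (the female row sums to 29, which together with the male total of 160 gives the grand total of 189); the variable names are arbitrary and the critical value is copied from the table.

```python
table = [
    [45, 60, 55],  # male: low, medium, high
    [9, 10, 10],   # female: low, medium, high
]

row_totals = [sum(row) for row in table]
column_totals = [sum(column) for column in zip(*table)]
grand_total = sum(row_totals)

# E = (row total)(column total) / grand total for every cell
expected = [[r * c / grand_total for c in column_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for observed_row, expected_row in zip(table, expected)
    for o, e in zip(observed_row, expected_row)
)
df = (len(table) - 1) * (len(table[0]) - 1)
critical_value = 5.99  # tabular value for df = 2, alpha = .05

print(f"row totals = {row_totals}, grand total = {grand_total}")
print(f"df = {df}, chi-square = {chi_square:.2f}")
print("reject Ho" if chi_square > critical_value else "do not reject Ho")
```

The computed statistic is far below the tabular value, so the data give no evidence of a relationship between sex and job satisfaction level.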
ASSIGNMENT # 4
Solve the following problems by
following the steps sequentially.
1. It is a common belief that more fatal car crashes occur
on certain days of the week, such as Friday or
Saturday. A sample of motor vehicle deaths in
Metro Manila is randomly selected for a recent
year. The numbers of fatalities for the different days
of the week are listed below. At the .01 significance
level, test the claim that accidents occur with equal
proportions on the different days.
Days                  Sun   Mon   Tue   Wed   Thu   Fri   Sat
Number of accidents    31    20    20    22    22    29    36
2. Test the hypothesis that
hypertension is related to the
drinking habits of 200 male
respondents in a certain locality,
as shown below.
                       Non-Drinkers   Moderate Drinkers   Heavy Drinkers   Total
With hypertension                22                  42               54     118
Without hypertension             45                  22               15      82
Total                            67                  64               69     200