
Chapter 9:

Chi-Square Applications
Statistics and Probability
Learning Objectives

When you have completed this chapter, you will be able to:

1. List and understand the characteristics of the chi-square
distribution.
2. Conduct a test of hypothesis comparing an observed set of
frequencies to an expected distribution.
3. Conduct a test of hypothesis to determine whether two
classification criteria are related.
LO1. Characteristics of the Chi-Square Distribution

1. Chi-square values are never negative.
2. The chi-square distribution is positively skewed.
3. There is a family of chi-square distributions.
• Each time the degrees of freedom change, a new distribution is formed.
• As the degrees of freedom increase, the distribution approaches a normal
distribution.
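The three characteristics can be checked numerically. The sketch below is not part of the original slides; it assumes the standard fact that a chi-square variable with k degrees of freedom is the sum of k squared independent standard normal variables, and that its theoretical skewness is sqrt(8/k). The function names are illustrative only.

```python
import math
import random


def simulate_chi_square(df, n, seed=0):
    """Draw n chi-square values as sums of df squared standard normals."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df)) for _ in range(n)]


def sample_skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


for df in (1, 5, 30):
    draws = simulate_chi_square(df, 5000)
    print(
        f"df={df:>2}  never negative: {min(draws) >= 0}  "
        f"sample skewness: {sample_skewness(draws):.2f}  "
        f"theoretical skewness: {math.sqrt(8 / df):.2f}"
    )
```

The skewness stays positive but shrinks as the degrees of freedom grow, which is the sense in which the distribution approaches a normal distribution.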
LO2. GOODNESS-OF-FIT-TEST

A goodness-of-fit test shows whether an observed set of frequencies could have
come from a hypothesized population distribution.
A. The degrees of freedom are k - 1, where k is the number of categories.
B. The formula for computing the value of chi-square is
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
where $f_o$ is the observed frequency and $f_e$ is the expected frequency in each category.
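As a minimal sketch of the formula, the function below sums (fo - fe)^2 / fe over the categories. The function name and the three-category counts are hypothetical and are not taken from the chapter.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (f_o - f_e)^2 / f_e over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Hypothetical counts for illustration only: 30 observations in 3 categories.
observed = [8, 12, 10]
expected = [10, 10, 10]

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # k - 1

print(statistic, degrees_of_freedom)
```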
LO2. CRITICAL VALUES OF CHI-SQUARE
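Critical values are normally read from a chi-square table. As a rough sketch of where such table entries come from, the code below computes the upper-tail probability for integer degrees of freedom using only the standard library, then finds the critical value by bisection. The recurrence for the upper incomplete gamma function and the function names are choices made for this example, not something given in the slides.

```python
import math


def chi2_upper_tail(x, df):
    """P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-half)               # exact upper tail for df = 2
    else:
        a = 0.5
        q = math.erfc(math.sqrt(half))    # exact upper tail for df = 1
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi2_critical_value(alpha, df):
    """Value with upper-tail area alpha, located by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_upper_tail(hi, df) > alpha:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


for alpha, df in [(0.01, 5), (0.05, 4), (0.01, 3), (0.05, 2)]:
    print(f"alpha={alpha}, df={df}: {chi2_critical_value(alpha, df):.3f}")
```

The four (alpha, df) pairs are the ones used in the decision rules of this chapter, so the printed values can be compared with the tabled critical values quoted there.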
LO2. GOODNESS-OF-FIT TEST:
Equal Expected Frequencies
• Illustration:
The human resources director at Georgetown Paper, Inc., is concerned about
absenteeism among hourly workers. She decides to sample the company records to
determine whether absenteeism is distributed evenly throughout the six-day
workweek. The hypotheses are:
H0: Absenteeism is evenly distributed throughout the workweek.
H1: Absenteeism is not evenly distributed throughout the workweek.
The sample results are shown below. Use the 0.01 significance level.

Day          Number absent
Monday       12
Tuesday       9
Wednesday    11
Thursday     10
Friday        9
Saturday      9
LO2. GOODNESS-OF-FIT TEST:
EQUAL EXPECTED FREQUENCIES
Step 1. State the null and alternate hypotheses.
H0: Absenteeism is evenly distributed throughout the workweek.
H1: Absenteeism is not evenly distributed throughout the workweek.

Step 2. Select the level of significance.
We selected the 0.01 significance level. The probability is 0.01 that a true null
hypothesis will be rejected.

Step 3. Select the test statistic.
The test statistic follows the chi-square distribution, designated as
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
LO2. GOODNESS-OF-FIT TEST:
EQUAL EXPECTED FREQUENCIES
Step 4. Formulate the decision rule.
$$df = k - 1 = 6 - 1 = 5$$
Critical value = 15.086 (5 degrees of freedom, 0.01 significance level)
Decision Rule: Reject the null hypothesis if the computed value of chi-square is
greater than 15.086. Otherwise, do not reject.
LO2. GOODNESS-OF-FIT TEST:
EQUAL EXPECTED FREQUENCIES
Step 5. Compute the value of chi-square and make a decision.

With equal expected frequencies, each fe = 60/6 = 10.

Day          fo    fe    fo - fe   (fo - fe)^2   (fo - fe)^2 / fe
Monday       12    10      2          4          4/10 = 0.4
Tuesday       9    10     -1          1          1/10 = 0.1
Wednesday    11    10      1          1          1/10 = 0.1
Thursday     10    10      0          0          0/10 = 0
Friday        9    10     -1          1          1/10 = 0.1
Saturday      9    10     -1          1          1/10 = 0.1
Total        60    60                                   0.80

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 0.80$$

Because 0.80 is less than the critical value of 15.086, do not reject the null
hypothesis. The data give no evidence that absenteeism is unevenly distributed
throughout the week; the observed differences can be attributed to sampling
variation.
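The Step 5 arithmetic can be reproduced with a short script. This is a sketch using the counts and critical value from the illustration; the variable names are illustrative.

```python
# Observed absences per day, from the Georgetown Paper illustration.
absences = {
    "Monday": 12, "Tuesday": 9, "Wednesday": 11,
    "Thursday": 10, "Friday": 9, "Saturday": 9,
}

total = sum(absences.values())      # 60 absences in the sample
expected = total / len(absences)    # equal expected frequency per day: 10.0

chi_square = 0.0
for day, fo in absences.items():
    term = (fo - expected) ** 2 / expected
    chi_square += term
    print(f"{day:<10} fo={fo:>3} fe={expected:>5.1f} term={term:>5.2f}")

critical_value = 15.086             # df = 5, 0.01 significance level
reject_h0 = chi_square > critical_value

print(f"chi-square = {chi_square:.2f}, reject H0: {reject_h0}")
```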
LO2. GOODNESS-OF-FIT TEST:
Equal Expected Frequencies
• Exercise:
Classic Golf, Inc., manages five courses in the Jacksonville, Florida,
area. The director of golf wishes to study the number of rounds of golf
played per weekday at the five courses. He gathered the sample
information shown below. At the 0.05 significance level, test
the null hypothesis that there is no significant difference in the number
of rounds played by day of the week.

Day          Rounds
Monday       124
Tuesday       74
Wednesday    104
Thursday      98
Friday       120
LO2. GOODNESS-OF-FIT TEST:
EQUAL EXPECTED FREQUENCIES
Step 1. State the null and alternate hypotheses.
H0: There is no significant difference in the number of rounds played by day of
the week.
H1: There is a significant difference in the number of rounds played by day of
the week.

Step 2. Select the level of significance.
We selected the 0.05 significance level. The probability is 0.05 that a true null
hypothesis will be rejected.

Step 3. Select the test statistic.
The test statistic follows the chi-square distribution, designated as
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
LO2. GOODNESS-OF-FIT TEST:
EQUAL EXPECTED FREQUENCIES
Step 4. Formulate the decision rule.
$$df = k - 1 = 5 - 1 = 4$$
Critical value = 9.488 (4 degrees of freedom, 0.05 significance level)
Decision Rule: Reject the null hypothesis if the computed value of chi-square is
greater than 9.488. Otherwise, do not reject.
LO2. GOODNESS-OF-FIT TEST:
EQUAL EXPECTED FREQUENCIES
Step 5. Compute the value of chi-square and make a decision.

With equal expected frequencies, each fe = 520/5 = 104.

Day          fo     fe    fo - fe   (fo - fe)^2   (fo - fe)^2 / fe
Monday       124    104     20        400         400/104 = 3.85
Tuesday       74    104    -30        900         900/104 = 8.65
Wednesday    104    104      0          0           0/104 = 0
Thursday      98    104     -6         36          36/104 = 0.35
Friday       120    104     16        256         256/104 = 2.46
Total        520    520                                     15.31

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 15.31$$

Because 15.31 is greater than the critical value of 9.488, reject the null
hypothesis. There is a significant difference in the number of rounds played by
day of the week.
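Beyond the overall decision, it can help to see which day drives the result. The sketch below recomputes the statistic from the exercise data and reports each day's contribution; the variable names are illustrative.

```python
# Rounds played per weekday, from the Classic Golf exercise.
rounds = {"Monday": 124, "Tuesday": 74, "Wednesday": 104,
          "Thursday": 98, "Friday": 120}

expected = sum(rounds.values()) / len(rounds)   # 520 / 5 = 104.0

# Contribution of each day to the chi-square statistic.
contributions = {
    day: (fo - expected) ** 2 / expected for day, fo in rounds.items()
}
chi_square = sum(contributions.values())

largest_day = max(contributions, key=contributions.get)
share = contributions[largest_day] / chi_square

critical_value = 9.488                          # df = 4, 0.05 level
reject_h0 = chi_square > critical_value

for day, term in contributions.items():
    print(f"{day:<10} {term:6.2f}")
print(f"chi-square = {chi_square:.2f}, reject H0: {reject_h0}")
print(f"largest contribution: {largest_day} ({share:.0%} of the total)")
```

Tuesday, with 30 fewer rounds than expected, accounts for more than half of the statistic.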
LO2. GOODNESS-OF-FIT TEST:
UNEQUAL EXPECTED FREQUENCIES
The owner of a mail-order catalog would like to compare her sales with the
geographic distribution of the population. According to the United States Bureau of
the Census, 21 percent of the population lives in the Northeast, 24 percent in the
Midwest, 35 percent in the South, and 20 percent in the West. Listed below is the
breakdown of a sample of 400 orders randomly selected from those shipped last
month. At the 0.01 significance level, does the distribution of the orders reflect the
population?
Region Frequency
Northeast 68
Midwest 104
South 155
West 73
Total 400
LO2. GOODNESS-OF-FIT TEST:
UNEQUAL EXPECTED FREQUENCIES
Step 1. State the null and alternate hypotheses.
H0: $\pi_n = 0.21,\ \pi_m = 0.24,\ \pi_s = 0.35,\ \pi_w = 0.20$, where the
subscripts denote the Northeast, Midwest, South, and West.
H1: The distribution is not as given.

Step 2. Select the level of significance.
We selected the 0.01 significance level. The probability is 0.01 that a true null
hypothesis will be rejected.

Step 3. Select the test statistic.
The test statistic follows the chi-square distribution, designated as
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
LO2. GOODNESS-OF-FIT TEST:
UNEQUAL EXPECTED FREQUENCIES
Step 4. Formulate the decision rule.
$$df = k - 1 = 4 - 1 = 3$$
Critical value = 11.345 (3 degrees of freedom, 0.01 significance level)
Decision Rule: Reject the null hypothesis if the computed value of chi-square is
greater than 11.345. Otherwise, do not reject.
LO2. GOODNESS-OF-FIT TEST:
UNEQUAL EXPECTED FREQUENCIES
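Step 5. Compute the value of chi-square and make a decision.

Each expected frequency is the population proportion times the sample size, for
example fe = 0.21 × 400 = 84 for the Northeast.

Region       fo     fe    fo - fe   (fo - fe)^2   (fo - fe)^2 / fe
Northeast     68     84    -16        256         256/84  = 3.048
Midwest      104     96      8         64          64/96  = 0.667
South        155    140     15        225         225/140 = 1.607
West          73     80     -7         49          49/80  = 0.6125
Total        400    400                                     5.934

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \approx 5.934$$

Because 5.934 is less than the critical value of 11.345, do not reject the null
hypothesis. The data give no evidence that the distribution of order destinations
differs from the population distribution.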
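When the expected frequencies are unequal, each one is the hypothesized proportion times the sample size. The sketch below applies this to the mail-order data; the dictionary layout and variable names are choices made for this example.

```python
# Census proportions (hypothesized distribution) and observed orders by region.
proportions = {"Northeast": 0.21, "Midwest": 0.24, "South": 0.35, "West": 0.20}
orders = {"Northeast": 68, "Midwest": 104, "South": 155, "West": 73}

n = sum(orders.values())  # 400 sampled orders

# Expected frequency for each region: proportion times sample size.
expected = {region: p * n for region, p in proportions.items()}

chi_square = sum(
    (orders[region] - expected[region]) ** 2 / expected[region]
    for region in orders
)

critical_value = 11.345   # df = 3, 0.01 significance level
reject_h0 = chi_square > critical_value

for region in orders:
    print(f"{region:<10} fo={orders[region]:>3} fe={expected[region]:>6.1f}")
print(f"chi-square = {chi_square:.3f}, reject H0: {reject_h0}")
```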
LO3. CONTINGENCY TABLE ANALYSIS

A contingency table is used to test whether two traits or characteristics
are related.
A. Each observation is classified according to two traits.
B. The expected frequency is determined as follows:
$$f_e = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$$
C. The degrees of freedom are found by:
$$df = (\text{rows} - 1)(\text{columns} - 1)$$
D. The usual hypothesis testing procedure is used.
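Items B and C can be sketched as two small functions. The function names and the 2 × 2 table are hypothetical and serve only to illustrate the formulas.

```python
def expected_frequencies(table):
    """Expected count per cell: (row total)(column total) / grand total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    return [
        [row_total * column_total / grand_total for column_total in column_totals]
        for row_total in row_totals
    ]


def degrees_of_freedom(table):
    """(rows - 1)(columns - 1) for a rectangular table of counts."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical 2 x 2 table of counts, not taken from the chapter.
table = [
    [30, 20],
    [20, 30],
]

print(expected_frequencies(table))
print(degrees_of_freedom(table))
```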
LO3. CONTINGENCY TABLE ANALYSIS

ILLUSTRATION A social scientist sampled 140 people and classified them
according to income level and whether or not they played a state
lottery in the last month. The sample information is reported below. Is
it reasonable to conclude that playing the lottery is related to income
level? Use the 0.05 significance level.

               Low Income   Middle Income   High Income   Total
Played             46            28              21          95
Did not play       14            12              19          45
Total              60            40              40         140
LO3. CONTINGENCY TABLE ANALYSIS

Step 1. State the null and alternate hypotheses.
H0: There is no relationship between income and whether the person played the
lottery.
H1: There is a relationship between income and whether the person played the
lottery.

Step 2. Select the level of significance.
We selected the 0.05 significance level. The probability is 0.05 that a true null
hypothesis will be rejected.

Step 3. Select the test statistic.
The test statistic follows the chi-square distribution, designated as
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
LO3. CONTINGENCY TABLE ANALYSIS

Step 4. Formulate the decision rule.
$$df = (\text{rows} - 1)(\text{columns} - 1) = (2 - 1)(3 - 1) = 2$$
Critical value = 5.991 (2 degrees of freedom, 0.05 significance level)
Decision Rule: Reject the null hypothesis if the computed value of chi-square is
greater than 5.991. Otherwise, do not reject.
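The slides stop at the decision rule for this illustration. As a sketch of how the remaining computation could be carried out, the code below builds the expected frequencies from the row and column totals of the lottery table and compares the resulting statistic with the critical value. The variable names are illustrative, and the result is simply what the arithmetic yields from the sample data.

```python
# Observed counts from the lottery illustration.
# Columns: low income, middle income, high income.
observed = [
    [46, 28, 21],   # played
    [14, 12, 19],   # did not play
]

row_totals = [sum(row) for row in observed]                  # [95, 45]
column_totals = [sum(column) for column in zip(*observed)]   # [60, 40, 40]
grand_total = sum(row_totals)                                # 140

chi_square = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * column_totals[j] / grand_total
        chi_square += (fo - fe) ** 2 / fe

df = (len(observed) - 1) * (len(observed[0]) - 1)            # (2 - 1)(3 - 1) = 2
critical_value = 5.991                                       # df = 2, 0.05 level
reject_h0 = chi_square > critical_value

print(f"chi-square = {chi_square:.3f}, df = {df}, reject H0: {reject_h0}")
```

By this calculation the statistic is about 6.54, which exceeds 5.991, so under the stated decision rule the null hypothesis of no relationship would be rejected.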
LO3. CONTINGENCY TABLE ANALYSIS

EXERCISE A study regarding the relationship between age and the amount of pressure
sales personnel feel in relation to their jobs revealed the following sample
information. At the 0.01 significance level, is there a relationship between job
pressure and age?

                   Degree of Job Pressure
Age              Low   Medium   High   Total
Less than 25      20     18      22      60
25 up to 40       50     46      44     140
40 up to 60       58     63      59     180
60 and older      34     43      43     120
Total            162    170     168     500
