Remarks:
1. In a 2 x 2 contingency table, there is only 𝑣 = (2 − 1)(2 − 1) = 1 degree of freedom, so a correction factor, called
Yates’ correction for continuity, is applied. The corrected formula is as follows:
\[
\chi^2_{\text{corrected}} = \sum_{j=1}^{c} \sum_{i=1}^{r} \frac{\left( \left| o_{ij} - e_{ij} \right| - 0.5 \right)^2}{e_{ij}}
\]
2. In general, for an r x c contingency table, it is required that no more than 20% of the cells have an expected
frequency of less than five (5), and that no cell has an expected frequency of less than one (1). Rows or columns are
combined or pooled (if meaningful) to meet these requirements.
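The requirement in Remark 2 can be checked mechanically before running the test. The following is a minimal sketch, not a prescribed procedure: the helper names `expected_frequencies` and `meets_expected_count_rule` are hypothetical, and the 2 x 3 table is invented purely to illustrate pooling.

```python
def expected_frequencies(observed):
    """Return e_ij = (R_i * C_j) / GT for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def meets_expected_count_rule(observed, max_share_below_five=0.20):
    """Hypothetical helper: True if no expected count is below 1 and
    at most 20% of the cells have an expected count below 5."""
    cells = [e for row in expected_frequencies(observed) for e in row]
    share_below_five = sum(e < 5 for e in cells) / len(cells)
    return min(cells) >= 1 and share_below_five <= max_share_below_five


# Invented 2 x 3 table with a sparse last column (illustration only).
sparse = [[20, 15, 1], [25, 18, 1]]
print(meets_expected_count_rule(sparse))   # False: one expected count is 0.9

# Pooling the last two columns (only if that is meaningful) gives a 2 x 2 table.
pooled = [[row[0], row[1] + row[2]] for row in sparse]
print(meets_expected_count_rule(pooled))   # True
```

Whether two categories may be pooled is a subject-matter decision; the check only reports whether the resulting table satisfies the rule.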
The summation extends over all rc cells in the r x c contingency table given below.
An 𝒓 x 𝒄 Contingency Table
Row                          Column Variable 𝒀
Variable 𝑿     𝑦1     𝑦2     ⋯     𝑦𝑗     ⋯     𝑦𝑐     Row Total
𝑥1             𝑜11    𝑜12    ⋯     𝑜1𝑗    ⋯     𝑜1𝑐    𝑅1
𝑥2             𝑜21    𝑜22    ⋯     𝑜2𝑗    ⋯     𝑜2𝑐    𝑅2
⋮              ⋮      ⋮            ⋮            ⋮      ⋮
𝑥𝑖             𝑜𝑖1    𝑜𝑖2    ⋯     𝑜𝑖𝑗    ⋯     𝑜𝑖𝑐    𝑅𝑖
⋮              ⋮      ⋮            ⋮            ⋮      ⋮
𝑥𝑟             𝑜𝑟1    𝑜𝑟2    ⋯     𝑜𝑟𝑗    ⋯     𝑜𝑟𝑐    𝑅𝑟
Column Total   𝐶1     𝐶2     ⋯     𝐶𝑗     ⋯     𝐶𝑐     GT
where
𝑥𝑖 is the ith category of variable X;
𝑦𝑗 is the jth category of variable Y;
𝑜𝑖𝑗 is the observed frequency for cell ij;
𝑅𝑖 is the ith row total;
𝐶𝑗 is the jth column total;
𝐺𝑇 is the Grand Total;
𝑒𝑖𝑗 is the expected frequency for cell ij;
𝑣 is the degrees of freedom of the Chi-Square statistic;
𝑟 is the number of rows;
𝑐 is the number of columns.
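For reference, the symbols defined above combine as follows. These are the standard relations for a chi-square test of independence, written here in the notation of this handout; they agree with the worked computations in the examples.

```latex
e_{ij} = \frac{R_i \, C_j}{GT}, \qquad
v = (r - 1)(c - 1), \qquad
\chi^2 = \sum_{j=1}^{c} \sum_{i=1}^{r} \frac{(o_{ij} - e_{ij})^2}{e_{ij}}
```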
Test the hypothesis that the number of hours spent per day in TV viewing is independent of the sex of the TV
viewer. Use 𝛼 = 0.05.
Given: 𝛼 = 0.05
                    Number of Hours per Day of TV Viewing
Sex            < 2 hours    2 to 4 hours    > 4 hours    Row Total
Male           18           40              20           78 (𝑅1)
Female         12           50              40           102 (𝑅2)
Column Total   30 (𝐶1)      90 (𝐶2)         60 (𝐶3)      180 (𝐺𝑇)
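The expected frequencies are computed as follows:
\[
e_{11} = \frac{(R_1)(C_1)}{GT} = \frac{(78)(30)}{180} = 13, \qquad
e_{12} = \frac{(R_1)(C_2)}{GT} = \frac{(78)(90)}{180} = 39, \qquad
e_{13} = \frac{(R_1)(C_3)}{GT} = \frac{(78)(60)}{180} = 26
\]
The second-row values follow in the same way with 𝑅2 = 102: 𝑒21 = 17, 𝑒22 = 51, and 𝑒23 = 34.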
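The same arithmetic can be carried out for every cell at once. The sketch below uses only the observed counts from the table above; the variable names are illustrative and not part of the source.

```python
observed = [
    [18, 40, 20],   # Male
    [12, 50, 40],   # Female
]

row_totals = [sum(row) for row in observed]          # R_1, R_2
col_totals = [sum(col) for col in zip(*observed)]    # C_1, C_2, C_3
grand_total = sum(row_totals)                        # GT

# e_ij = (R_i)(C_j) / GT for every cell
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for label, row in zip(("Male", "Female"), expected):
    print(label, row)
# Male [13.0, 39.0, 26.0]
# Female [17.0, 51.0, 34.0]
```

Every expected frequency is well above five, so the table satisfies the requirement stated in Remark 2.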
4. Critical Region: With 𝑣 = (2 − 1)(3 − 1) = 2 degrees of freedom, the critical region is given by 𝜒² > 𝜒²(𝛼,𝑣), where 𝜒²(𝛼,𝑣) = 𝜒²(0.05,2) = 5.99.
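5. Computation: Using the formula in step 3, the actual value of the test statistic is:
\[
\begin{aligned}
\chi^2 &= \sum_{j=1}^{c} \sum_{i=1}^{r} \frac{(o_{ij} - e_{ij})^2}{e_{ij}} \\
&= \frac{(o_{11} - e_{11})^2}{e_{11}} + \frac{(o_{21} - e_{21})^2}{e_{21}} + \frac{(o_{12} - e_{12})^2}{e_{12}} + \frac{(o_{22} - e_{22})^2}{e_{22}} + \frac{(o_{13} - e_{13})^2}{e_{13}} + \frac{(o_{23} - e_{23})^2}{e_{23}} \\
&= \frac{(18 - 13)^2}{13} + \frac{(12 - 17)^2}{17} + \frac{(40 - 39)^2}{39} + \frac{(50 - 51)^2}{51} + \frac{(20 - 26)^2}{26} + \frac{(40 - 34)^2}{34} = 5.88
\end{aligned}
\]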
6. Statistical Decision: Since 𝜒² = 5.88 is NOT greater than 5.99 (that is, it is NOT in the critical
region), the null hypothesis 𝐻0 is NOT rejected.
7. Conclusion: Therefore, at 𝛼 = 0.05, there is not sufficient evidence to conclude that the number of hours spent
on TV viewing depends on the sex of the viewer; the data are consistent with independence.
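Steps 4 to 6 can be reproduced with a short script. This is a sketch under one stated assumption: the closed-form critical value −2 ln(𝛼) is valid only for 𝑣 = 2, where the upper-tail probability of the chi-square distribution is exp(−x/2). For other degrees of freedom a chi-square table or a statistics library is needed.

```python
import math

observed = [[18, 40, 20], [12, 50, 40]]   # rows: Male, Female
alpha = 0.05

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Test statistic: sum of (o_ij - e_ij)^2 / e_ij over all cells.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        chi_square += (o - e) ** 2 / e

# Degrees of freedom: v = (r - 1)(c - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumption: closed form below holds for v = 2 only.
assert df == 2
critical_value = -2 * math.log(alpha)

print(round(chi_square, 2), round(critical_value, 2))   # 5.88 5.99
print("Reject H0" if chi_square > critical_value else "Do not reject H0")
# Do not reject H0
```

The statistic falls just short of the critical value, so the decision is sensitive to the choice of 𝛼.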
Example 2. A serum is claimed to cure a certain disease. To test this claim, 200 people infected with the disease
were divided into two groups, A and B, with an equal number of people in each. The serum was given to group A but
not to group B, which is called the control group. The table below shows that 75 people in Group A and 65 people
in Group B recovered from the disease.
                             Number of People Recovered by the Serum
Group                        Recovered    Did Not Recover
Group A (with serum)         75           25
Group B (without serum)      65           35
Test the hypothesis that recovery is independent of the use of the serum using 𝛼 = 0.05.
Given: 𝛼 = 0.05
                             Number of People Recovered by the Serum
Group                        Recovered    Did Not Recover    Row Total
Group A (with serum)         75           25                 100 (𝑅1)
Group B (without serum)      65           35                 100 (𝑅2)
Column Total                 140 (𝐶1)     60 (𝐶2)            200 (𝐺𝑇)
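The expected frequencies are computed as follows:
\[
e_{11} = \frac{(R_1)(C_1)}{GT} = \frac{(100)(140)}{200} = 70, \qquad
e_{12} = \frac{(R_1)(C_2)}{GT} = \frac{(100)(60)}{200} = 30
\]
Since 𝑅2 = 100 as well, the second-row values are the same: 𝑒21 = 70 and 𝑒22 = 30.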
4. Critical Region: With 𝑣 = (2 − 1)(2 − 1) = 1 degree of freedom, the critical region is given by 𝜒² > 𝜒²(𝛼,𝑣), where 𝜒²(𝛼,𝑣) = 𝜒²(0.05,1) = 3.84.
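The remaining steps for this example are not shown in this handout. As a hedged illustration only, the sketch below carries the computation through for the 2 x 2 table, applying Yates’ correction from Remark 1. The function name `chi_square_stat` is hypothetical. It assumes the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable, so the critical value is the square of the two-sided normal quantile.

```python
from statistics import NormalDist

observed = [[75, 25], [65, 35]]   # rows: Group A (serum), Group B (control)
alpha = 0.05


def chi_square_stat(table, correction=0.0):
    """Sum of (|o_ij - e_ij| - correction)^2 / e_ij over all cells.

    correction=0.5 gives Yates' corrected statistic for a 2 x 2 table;
    correction=0.0 gives the uncorrected statistic.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            total += (abs(o - e) - correction) ** 2 / e
    return total


corrected = chi_square_stat(observed, correction=0.5)
uncorrected = chi_square_stat(observed)

# Assumption: valid for v = 1 only (chi-square with 1 df = squared standard normal).
critical_value = NormalDist().inv_cdf(1 - alpha / 2) ** 2

print(round(corrected, 2), round(uncorrected, 2), round(critical_value, 2))
# 1.93 2.38 3.84
print("Reject H0" if corrected > critical_value else "Do not reject H0")
# Do not reject H0
```

Under these assumptions, both the corrected and the uncorrected statistics fall below 3.84, so the correction does not change the decision for this table.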
Reference: Supe, A., et al. (2013). Elementary Statistics. Central Book Supply Inc.
Prepared by:
JOBELLE S. SIMBLANTE
Stat 26 Instructor