QM 2 CHAP12

BUSINESS STATISTICS
Chapter 12
Tests of Goodness of Fit and Independence
Chap 12-1
Chapter Goals
After completing this chapter, you should be able to:
• Use the chi-square goodness-of-fit test to determine whether data fit a specified distribution
• Perform tests for normality
• Set up a contingency analysis table and perform a chi-square test of association
Chap 12-2
Chi-Square Goodness-of-Fit Test
• Do sample data conform to a hypothesized distribution?
• Examples:
  • Do sample results conform to specified expected probabilities?
  • Are technical support calls equal across all days of the week? (i.e., do calls follow a uniform distribution?)
  • Do measurements from a production process follow a normal distribution?
Chap 12-3
Chi-Square Goodness-of-Fit Test
(continued)
• Are technical support calls equal across all days of the week? (i.e., do calls follow a uniform distribution?)
• Sample data for 10 days per day of week:

Day          Sum of calls for this day
Monday       290
Tuesday      250
Wednesday    238
Thursday     257
Friday       265
Saturday     230
Sunday       192
Total        Σ = 1722
Chap 12-4
Logic of Goodness-of-Fit Test
• If calls are uniformly distributed, the 1722 calls would be expected to be equally divided across the 7 days:
$$\frac{1722}{7} = 246 \text{ expected calls per day if uniform}$$
• Chi-Square Goodness-of-Fit Test: test to see if the sample results are consistent with the expected results
Chap 12-5
Observed vs. Expected Frequencies

             Observed   Expected
             O_i        E_i
Monday       290        246
Tuesday      250        246
Wednesday    238        246
Thursday     257        246
Friday       265        246
Saturday     230        246
Sunday       192        246
TOTAL        1722       1722
Chap 12-6
Chi-Square Test Statistic
H0: The distribution of calls is uniform over days of the week
H1: The distribution of calls is not uniform
• The test statistic is
$$\chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i} \qquad (\text{where d.f.} = K - 1)$$
where:
K = number of categories
O_i = observed frequency for category i
E_i = expected frequency for category i
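The statistic defined above can be sketched in a few lines of Python. This is only an illustration using the weekday call totals from the earlier slides; the function name `chi_square_gof` is an assumption made for this sketch, not something from the text.

```python
# Observed call totals per weekday, taken from the slides above.
observed = {
    "Monday": 290, "Tuesday": 250, "Wednesday": 238, "Thursday": 257,
    "Friday": 265, "Saturday": 230, "Sunday": 192,
}


def chi_square_gof(observed_counts, expected_counts):
    """Return the sum of (O_i - E_i)^2 / E_i over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected_counts))


total = sum(observed.values())                       # 1722 calls in all
expected = [total / len(observed)] * len(observed)   # 246 per day under H0
statistic = chi_square_gof(observed.values(), expected)
degrees_of_freedom = len(observed) - 1               # K - 1

print(f"chi-square = {statistic:.2f}, d.f. = {degrees_of_freedom}")
```

Keeping the expected counts as an explicit list means the same helper also works when the hypothesized probabilities are not all equal.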
Chap 12-7
The Rejection Region
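$$\chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i}$$
• Reject H0 if $\chi^2 > \chi^2_{\alpha}$ (with K - 1 degrees of freedom)
[Figure: chi-square density; do not reject H0 to the left of $\chi^2_{\alpha}$, reject H0 in the upper tail of area α to its right]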
Chap 12-8
Chi-Square Test Statistic
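$$\chi^2 = \frac{(290 - 246)^2}{246} + \frac{(250 - 246)^2}{246} + \dots + \frac{(192 - 246)^2}{246} = 23.05$$
K - 1 = 6 (7 days of the week), so use 6 degrees of freedom:
$\chi^2_{.05} = 12.5916$

Conclusion:
$\chi^2 = 23.05 > \chi^2_{.05} = 12.5916$, so reject H0 and conclude that the distribution is not uniform
[Figure: chi-square density with 6 degrees of freedom; the rejection region is the upper tail of area α = .05 to the right of 12.5916]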
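As a cross-check on the table value, the upper-tail probability of a chi-square variable has a simple closed form when the degrees of freedom are even. The sketch below uses that identity with only the standard library; the function name is an assumption for illustration, and the approach does not apply to odd degrees of freedom.

```python
import math


def chi_square_upper_tail_even_df(x, df):
    """P(chi-square with df d.f. > x), valid only for positive even df.

    For df = 2m the tail equals exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even d.f.")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


# Tail area beyond the computed statistic (an approximate p-value).
p_value = chi_square_upper_tail_even_df(23.05, 6)
# Tail area beyond the tabulated critical value; should be close to .05.
tail_at_critical = chi_square_upper_tail_even_df(12.5916, 6)

print(f"p-value = {p_value:.5f}")
print(f"upper tail beyond 12.5916 = {tail_at_critical:.4f}")
```

A p-value far below .05 agrees with the decision to reject H0.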
2.05 = 12.5916
Chap 12-9
Chi-Square Test for Goodness-of-Fit
Multinomial GOF Test
• A multinomial distribution is defined by any k probabilities π_1, π_2, …, π_k that sum to unity.
• For example, consider the following “official” proportions of M&M colors.
[Table: official M&M color proportions]
Chap 12-10

Chi-Square Test for Goodness-of-Fit
Multinomial GOF Test
• The hypotheses are
H0: π_1 = .30, π_2 = .20, π_3 = .10, π_4 = .10, π_5 = .10, π_6 = .20
H1: At least one of the π_j differs from the hypothesized value
• No parameters are estimated (m = 0) and there are k = 6 classes, so the degrees of freedom are
d.f. = k - m - 1 = 6 - 0 - 1 = 5
Chap 12-11
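To make the multinomial test concrete, the sketch below applies the goodness-of-fit statistic to the six hypothesized proportions. The slides give only the proportions, so the observed counts here are hypothetical values invented purely for illustration, and the classes are numbered rather than named.

```python
# Hypothesized class probabilities from H0 on the slide.
hypothesized = [0.30, 0.20, 0.10, 0.10, 0.10, 0.20]

# HYPOTHETICAL observed counts for a sample of 200 items (not from the text).
observed = [56, 44, 18, 24, 17, 41]

n = sum(observed)
expected = [p * n for p in hypothesized]   # E_i = n * pi_i

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
num_estimated_params = 0                    # m = 0: nothing estimated from data
degrees_of_freedom = len(hypothesized) - num_estimated_params - 1

for i, (o, e) in enumerate(zip(observed, expected), start=1):
    print(f"class {i}: observed = {o}, expected = {e:.1f}")
print(f"chi-square = {statistic:.4f}, d.f. = {degrees_of_freedom}")
```

The only change from the uniform case is that each expected count is the sample size times its own hypothesized probability.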

Goodness-of-Fit Tests,
Population Parameters Unknown
Idea:
• Test whether data follow a specified distribution (such as binomial, Poisson, or normal) . . .
• . . . without assuming the parameters of the distribution are known
• Use sample data to estimate the unknown population parameters
Chap 12-12
Goodness-of-Fit Tests,
Population Parameters Unknown
(continued)
• Suppose that a null hypothesis specifies category probabilities that depend on the estimation (from the data) of m unknown population parameters
• The appropriate goodness-of-fit test is the same as in the previous section . . .
$$\chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i}$$
• . . . except that the number of degrees of freedom for the chi-square random variable is
$$\text{Degrees of Freedom} = K - m - 1$$
• where K is the number of categories
Chap 12-13
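The degrees-of-freedom rule from the previous slide is easy to get wrong by one, so a tiny helper may be useful. This is a sketch only; the function name is an assumption, and the third call uses a hypothetical number of classes.

```python
def gof_degrees_of_freedom(num_categories, num_estimated_params=0):
    """Return K - m - 1 for a chi-square goodness-of-fit test."""
    df = num_categories - num_estimated_params - 1
    if df < 1:
        raise ValueError("need more categories than estimated parameters + 1")
    return df


# Weekday calls: K = 7 categories, no parameters estimated.
print(gof_degrees_of_freedom(7))
# Multinomial example: 6 classes, no parameters estimated.
print(gof_degrees_of_freedom(6))
# HYPOTHETICAL normal fit: 8 classes, mean and variance estimated (m = 2).
print(gof_degrees_of_freedom(8, 2))
```

Each parameter estimated from the data costs one further degree of freedom beyond the one lost because the counts must sum to n.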
Test of Normality
• The assumption that data follow a normal distribution is common in statistics
• Normality was assessed in prior chapters
  • Normal probability plots (Chapter 6)
  • Normal quantile plots (Chapter 8)
• Here, a chi-square test is developed
Chap 12-14
Test of Normality
(continued)
• Two population parameters can be estimated using sample data:
$$\text{Skewness} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{n s^3}$$
$$\text{Kurtosis} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n s^4}$$
• For a normal distribution,
Skewness = 0
Kurtosis = 3
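The two estimators above can be sketched directly from their formulas. The slide does not say which divisor defines s, so this sketch assumes the usual sample standard deviation with an n - 1 divisor; the data set is hypothetical and chosen to be symmetric about its mean.

```python
import math


def sample_skewness_kurtosis(data):
    """Return (skewness, kurtosis) using the formulas on the slide."""
    n = len(data)
    mean = sum(data) / n
    # ASSUMPTION: s is the sample standard deviation with an n - 1 divisor.
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    skewness = sum((x - mean) ** 3 for x in data) / (n * s ** 3)
    kurtosis = sum((x - mean) ** 4 for x in data) / (n * s ** 4)
    return skewness, kurtosis


# HYPOTHETICAL data, symmetric about 5, so the skewness should be zero.
values = [2.0, 4.0, 4.0, 5.0, 5.0, 5.0, 6.0, 6.0, 8.0]
skewness, kurtosis = sample_skewness_kurtosis(values)
print(f"skewness = {skewness:.4f}, kurtosis = {kurtosis:.4f}")
```

Symmetric data give zero skewness because the positive and negative cubed deviations cancel.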
Chap 12-15
Bowman-Shelton
Test for Normality
• Consider the null hypothesis that the population distribution is normal
• The Bowman-Shelton Test for Normality is based on the closeness of the sample skewness to 0 and the sample kurtosis to 3
• The test statistic is
$$B = n\left[\frac{(\text{Skewness})^2}{6} + \frac{(\text{Kurtosis} - 3)^2}{24}\right]$$
• As the number of sample observations becomes very large, this statistic has a chi-square distribution with 2 degrees of freedom
• The null hypothesis is rejected for large values of the test statistic
Chap 12-16
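The statistic B and its large-sample tail probability can be sketched as follows. The function names are assumptions for illustration. The tail formula relies on the fact that a chi-square variable with 2 degrees of freedom has upper-tail probability exp(-b/2), and it is only appropriate when n is very large.

```python
import math


def bowman_shelton_statistic(n, skewness, kurtosis):
    """B = n * [skewness^2 / 6 + (kurtosis - 3)^2 / 24]."""
    return n * (skewness ** 2 / 6.0 + (kurtosis - 3.0) ** 2 / 24.0)


def asymptotic_p_value(b):
    """Upper tail of chi-square with 2 d.f., which equals exp(-b / 2)."""
    return math.exp(-b / 2.0)


# A sample that matches the normal shape exactly gives B = 0.
b_normal_shape = bowman_shelton_statistic(1000, 0.0, 3.0)
# 5.99 is the familiar 5% point of chi-square with 2 d.f.
tail_at_5_99 = asymptotic_p_value(5.99)

print(f"B for skewness 0 and kurtosis 3: {b_normal_shape}")
print(f"upper tail beyond 5.99: {tail_at_5_99:.4f}")
```

Both departures from normality enter squared, so B grows whether the skewness is positive or negative and whether the tails are heavier or lighter than normal.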
Bowman-Shelton
Test for Normality
(continued)
• The chi-square approximation is close only for very large sample sizes
• If the sample size is not very large, the Bowman-Shelton test statistic is compared to significance points from text Table 16.7

Sample size N   10% point   5% point   Sample size N   10% point   5% point
20              2.13        3.26       200             3.48        4.43
30              2.49        3.71       250             3.54        4.61
40              2.70        3.99       300             3.68        4.60
50              2.90        4.26       400             3.76        4.74
75              3.09        4.27       500             3.91        4.82
100             3.14        4.29       800             4.32        5.46
125             3.31        4.34       ∞               4.61        5.99
150             3.43        4.39
Chap 12-17
Example: Bowman-Shelton
Test for Normality
• The average daily temperature has been recorded for 200 randomly selected days, with sample skewness 0.232 and kurtosis 3.319
• Test the null hypothesis that the true distribution is normal
$$B = n\left[\frac{(\text{Skewness})^2}{6} + \frac{(\text{Kurtosis} - 3)^2}{24}\right] = 200\left[\frac{(0.232)^2}{6} + \frac{(3.319 - 3)^2}{24}\right] = 2.642$$
• From Table 16.7 the 10% critical value for n = 200 is 3.48, so there is not sufficient evidence to reject that the population is normal
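The temperature example can be reproduced with a short script. This sketch copies the finite-sample significance points from the table on the preceding slide into a dictionary; the variable and function names are assumptions for illustration, and sample sizes not listed in the table are deliberately not interpolated.

```python
# Significance points copied from the table above: n -> (10% point, 5% point).
SIGNIFICANCE_POINTS = {
    20: (2.13, 3.26), 30: (2.49, 3.71), 40: (2.70, 3.99), 50: (2.90, 4.26),
    75: (3.09, 4.27), 100: (3.14, 4.29), 125: (3.31, 4.34), 150: (3.43, 4.39),
    200: (3.48, 4.43), 250: (3.54, 4.61), 300: (3.68, 4.60), 400: (3.76, 4.74),
    500: (3.91, 4.82), 800: (4.32, 5.46),
}


def bowman_shelton_statistic(n, skewness, kurtosis):
    """B = n * [skewness^2 / 6 + (kurtosis - 3)^2 / 24]."""
    return n * (skewness ** 2 / 6.0 + (kurtosis - 3.0) ** 2 / 24.0)


# Values from the temperature example on the slide.
n, skewness, kurtosis = 200, 0.232, 3.319
b = bowman_shelton_statistic(n, skewness, kurtosis)

ten_pct_point, five_pct_point = SIGNIFICANCE_POINTS[n]
reject_at_10_pct = b > ten_pct_point

print(f"B = {b:.3f}, 10% point = {ten_pct_point}, reject H0: {reject_at_10_pct}")
```

Since B falls below the 10% point it also falls below the 5% point, so the conclusion is the same at either level.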
Chap 12-18
Contingency Tables
• Used to classify sample observations according to a pair of attributes
• Also called a cross-classification or cross-tabulation table
• Assume r categories for attribute A and c categories for attribute B
  • Then there are (r x c) possible cross-classifications
Chap 12-19
r x c Contingency Table

                 Attribute B
Attribute A   1       2       ...   c       Totals
1             O_11    O_12    …     O_1c    R_1
2             O_21    O_22    …     O_2c    R_2
.             .       .       …     .       .
.             .       .       …     .       .
.             .       .       …     .       .
r             O_r1    O_r2    …     O_rc    R_r
Totals        C_1     C_2     …     C_c     n
Chap 12-20
Test for Independence
• Consider n observations tabulated in an r x c contingency table
• Denote by O_ij the number of observations in the cell that is in the ith row and the jth column
• The null hypothesis is
H0: No association exists between the two attributes in the population
• The appropriate test is a chi-square test with (r-1)(c-1) degrees of freedom
Chap 12-21
Test for Independence
(continued)
• Let R_i and C_j be the row and column totals
• The expected number of observations in the cell in row i and column j, given that H0 is true, is
$$E_{ij} = \frac{R_i C_j}{n}$$
• A test of independence at a significance level α is based on the chi-square distribution and the following decision rule
$$\text{Reject } H_0 \text{ if } \chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} > \chi^2_{(r-1)(c-1),\alpha}$$
Chap 12-22
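The decision rule on the previous slide needs the statistic and its degrees of freedom for an arbitrary r x c table. A minimal sketch follows; the function name is an assumption, and the demonstration table is hypothetical, built so that its rows are exactly proportional and the statistic must therefore be zero.

```python
def chi_square_independence(table):
    """Return (statistic, d.f.) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # E_ij = R_i * C_j / n
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)            # (r - 1)(c - 1)
    return statistic, df


# HYPOTHETICAL 2 x 3 table: the second row is exactly twice the first,
# so observed and expected counts coincide in every cell.
proportional = [[10, 20, 30], [20, 40, 60]]
statistic, df = chi_square_independence(proportional)
print(f"chi-square = {statistic:.4f}, d.f. = {df}")
```

The resulting statistic would then be compared with the tabulated chi-square critical value for the chosen α and these degrees of freedom.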
Contingency Table Example
Left-Handed vs. Gender
• Dominant Hand: Left vs. Right
• Gender: Male vs. Female
H0: Hand preference is independent of gender
H1: Hand preference is not independent of gender
Chap 12-23
Contingency Table Example
(continued)
Sample results organized in a contingency table:
sample size = n = 300:
120 Females, 12 were left handed
180 Males, 24 were left handed

            Hand Preference
Gender    Left    Right    Total
Female    12      108      120
Male      24      156      180
Total     36      264      300
Chap 12-24
Logic of the Test
H0: Hand preference is independent of gender
H1: Hand preference is not independent of gender
• If H0 is true, then the proportion of left-handed females should be the same as the proportion of left-handed males
• The two proportions above should be the same as the proportion of left-handed people overall
Chap 12-25
Finding Expected Frequencies
120 Females, 12 were left handed
180 Males, 24 were left handed
Overall: P(Left Handed) = 36/300 = .12
If no association, then
P(Left Handed | Female) = P(Left Handed | Male) = .12
So we would expect 12% of the 120 females and 12% of the 180 males to be left handed…
i.e., we would expect
(120)(.12) = 14.4 females to be left handed
(180)(.12) = 21.6 males to be left handed
Chap 12-26
Expected Cell Frequencies
(continued)
• Expected cell frequencies:
$$E_{ij} = \frac{R_i C_j}{n} = \frac{(i\text{th row total})(j\text{th column total})}{\text{total sample size}}$$
Example:
$$E_{11} = \frac{(120)(36)}{300} = 14.4$$
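The same formula gives all four expected counts for the handedness example at once. This is a small sketch using the observed table from the earlier slide; the variable names are assumptions for illustration.

```python
# Observed counts: rows are Female, Male; columns are Left, Right.
observed = [[12, 108],
            [24, 156]]

row_totals = [sum(row) for row in observed]          # R_i: 120, 180
col_totals = [sum(col) for col in zip(*observed)]    # C_j: 36, 264
n = sum(row_totals)                                  # 300

# E_ij = R_i * C_j / n for every cell.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for label, row in zip(("Female", "Male"), expected):
    print(label, [round(e, 1) for e in row])
```

The expected counts in each row and column add up to the same totals as the observed counts, which is a quick way to check the arithmetic.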
Chap 12-27
Frequencies
Observed frequencies vs. expected frequencies:

            Hand Preference
Gender    Left               Right               Total
Female    Observed = 12      Observed = 108      120
          Expected = 14.4    Expected = 105.6
Male      Observed = 24      Observed = 156      180
          Expected = 21.6    Expected = 158.4
Total     36                 264                 300
Chap 12-28
The Chi-Square Test Statistic
The Chi-square test statistic is:
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \qquad \text{with d.f.} = (r - 1)(c - 1)$$
• where:
O_ij = observed frequency in cell (i, j)
E_ij = expected frequency in cell (i, j)
r = number of rows
c = number of columns
Chap 12-29
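Frequencies

            Hand Preference
Gender    Left               Right               Total
Female    Observed = 12      Observed = 108      120
          Expected = 14.4    Expected = 105.6
Male      Observed = 24      Observed = 156      180
          Expected = 21.6    Expected = 158.4
Total     36                 264                 300

$$\chi^2 = \frac{(12 - 14.4)^2}{14.4} + \frac{(108 - 105.6)^2}{105.6} + \frac{(24 - 21.6)^2}{21.6} + \frac{(156 - 158.4)^2}{158.4} = 0.7576$$
Chap 12-30
Contingency Analysis
$\chi^2 = 0.7576$ with d.f. = (r - 1)(c - 1) = (1)(1) = 1

Decision Rule:
If $\chi^2 > 3.841$, reject H0, otherwise, do not reject H0

Here, $\chi^2 = 0.7576 < \chi^2_{.05} = 3.841$, so we do not reject H0 and conclude that there is no evidence of an association between gender and hand preference
[Figure: chi-square density with 1 degree of freedom; do not reject H0 to the left of 3.841, reject H0 in the upper tail of area α = 0.05 to its right]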
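The whole test for the handedness example can be sketched end to end. The script recomputes the statistic from the four cells and adds an approximate p-value. For 1 degree of freedom the chi-square upper tail equals erfc(sqrt(x/2)), because the variable is the square of a standard normal; the function name is an assumption for illustration.

```python
import math

# Observed and expected counts: rows Female, Male; columns Left, Right.
observed = [[12, 108], [24, 156]]
expected = [[14.4, 105.6], [21.6, 158.4]]

statistic = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)


def chi_square_upper_tail_1df(x):
    """P(chi-square with 1 d.f. > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


critical_value = 3.841                      # 5% point for 1 d.f., from the slide
p_value = chi_square_upper_tail_1df(statistic)
tail_at_critical = chi_square_upper_tail_1df(critical_value)
reject = statistic > critical_value

print(f"chi-square = {statistic:.4f}, p-value = {p_value:.3f}, reject H0: {reject}")
```

A p-value well above .05 matches the decision not to reject H0.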
Chap 12-31
Chapter Summary
• Used the chi-square goodness-of-fit test to determine whether sample data match specified probabilities
• Conducted goodness-of-fit tests when a population parameter was unknown
• Tested for normality using the Bowman-Shelton test
• Used contingency tables to perform a chi-square test for association
• Compared observed cell frequencies to expected cell frequencies
Chap 12-32
