
Contingency Tables


Learning Objectives

• Use a Contingency Table for a 3 or more Proportions Test
• Contingency Table Testing method
• Test of association

Seagate Confidential 2 Supplier Six Sigma Modular Training



Contingency Table
You have learned ANOVA for comparing 3 or more population means.
What if you have 3 or more population proportions to compare?
You can no longer use the one- or two-proportion test.
Here, we can use a contingency table with the χ² test statistic.


Multiple Proportion Comparison

P1 vs P2 vs P3

Practical Question (example): Are all populations' proportions statistically different?

Statistical Question:
Ho: P1 = P2 = P3
Ha: at least one Pi is different


Multiple Proportion Test - Procedure

A) State the problem statement
B) Null hypothesis: Ho: P1 = P2 = ... = Pk
C) Alternative hypothesis: Ha: not all Pi are equal
D) Determine significance level
E) Create contingency table of “successes” and “failures” from binomial samples

F) Determine the test statistic using the following formula or with Minitab (Stat > Tables > Chi-square Test):

$$\chi^2_{(calc)} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad df = (r-1)(c-1)$$

where: O = the observed value (from experimental data)
       E = the expected value = (Xr * Yc) / Ftotal
       r = number of rows
       c = number of columns
       Xr = total frequency for that row
       Yc = total frequency for that column
       Ftotal = total frequency for that table

Recall:

$$\chi^2 = \frac{(n-1)\,s^2}{\sigma_0^2}, \qquad s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$

G) Determine critical value of χ² test statistic

Reject null hypothesis if χ²(observed) > χ²(critical)

H) Translate statistical conclusion into practical solution
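
Steps E through G can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not what Minitab does internally: the function name `chi_square_statistic` and the 2 x 2 counts are invented for the example and do not come from the training material.

```python
def chi_square_statistic(observed):
    """Return (chi2, df, expected) for an r x c table of counts."""
    row_totals = [sum(row) for row in observed]          # Xr
    col_totals = [sum(col) for col in zip(*observed)]    # Yc
    grand_total = sum(row_totals)                        # Ftotal

    # Expected count for each cell: E = (Xr * Yc) / Ftotal
    expected = [[x * y / grand_total for y in col_totals] for x in row_totals]

    # Sum (O - E)^2 / E over every cell of the table
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Hypothetical counts (illustration only):
# rows = successes / failures, columns = two samples.
table = [[10, 20],
         [20, 10]]
chi2, df, expected = chi_square_statistic(table)
print(f"chi2 = {chi2:.3f}, df = {df}")
```

The statistic returned here is the value compared against the critical value in step G.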


How to construct a Contingency Table

• Data is arranged into rows and columns, with each possible occurrence of interest being a particular row x column combination (a cell). The occurrences for each cell are counted and the count is placed in that cell.
• Each row and each column has a total count associated with it.
• Based on these row and column totals, each cell (row/column combination) has an expected count associated with it.
• The actual counts are compared to the expected counts for all cells. The likelihood of the observed variation is evaluated using a chi-square (χ²) statistical test, referred to here as the Goodness-of-Fit Test.


Contingency Table

          P1       P2       P3       P4       Totals
Bad       2        4        3        1        X1 = 10
Good      0        3        2        5        X2 = 10
Totals    Y1 = 2   Y2 = 7   Y3 = 5   Y4 = 6   Ftotal = 20

For discrete data:
> The fraction of a random element falling in Row i, say Ui = Xi / Ftotal
> The fraction of a random element falling in Column j, say Vj = Yj / Ftotal
> Expected value of each cell Eij = Ftotal x Ui x Vj = (Xi * Yj) / Ftotal

From the observed Oij values and the calculated Eij, we can derive the Chi-Square (χ²) statistic:

$$\chi^2_{(calc)} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
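
As an illustration, the expected counts for the Bad/Good table above can be computed directly from the row and column totals. This is a small sketch; the variable names are chosen for the example and are not part of the original material.

```python
# Observed counts from the Bad/Good table (columns P1..P4)
observed = {
    "Bad":  [2, 4, 3, 1],
    "Good": [0, 3, 2, 5],
}

row_totals = {name: sum(counts) for name, counts in observed.items()}  # Xi
col_totals = [sum(col) for col in zip(*observed.values())]             # Yj
grand_total = sum(row_totals.values())                                 # Ftotal

# Eij = (Xi * Yj) / Ftotal for every cell
expected = {
    name: [row_totals[name] * y / grand_total for y in col_totals]
    for name in observed
}

for name, row in expected.items():
    print(name, row)
```

Because both rows have the same total (10), each row receives the same expected counts in this particular table.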

r x c Contingency Table

                        Column Factor (Yj)
                 1      2      3      ...    c      Totals
Row       1      O11    O12    O13    ...    O1c    X1
Factor    2      O21    O22    O23    ...    O2c    X2
(Xi)      3      O31    O32    O33    ...    O3c    X3
          ...
          r      Or1    Or2    Or3    ...    Orc    Xr
          Totals Y1     Y2     Y3     ...    Yc     Ftotal

For discrete data:
> The fraction of a random element falling in Row i, say Ui = Xi / Ftotal
> The fraction of a random element falling in Column j, say Vj = Yj / Ftotal
> Expected value of each cell Eij = Ftotal x Ui x Vj = (Xi * Yj) / Ftotal

From the observed Oij values and the calculated Eij, we can derive the Chi-Square (χ²) statistic:

$$\chi^2_{(calc)} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

Industrial Process Example: Defective Sub-Assembly

Jack is working on a project to address the root cause of defective sub-assemblies. It has been brought to Jack’s attention that a higher percentage of defectives is produced in the third shift.
Jack collected the following data to check the validity of this claim:

TYPE             First Shift   Second Shift   Third Shift
Defectives       43            56             68
Non-Defectives   965           878            781

Test at the 5% significance level to determine whether the proportion of defectives is the same for all three shifts.

Practical problem
• Are some shifts producing more defectives than others?

Statistical problem
• Is the proportion of defectives the same for all three shifts?
• Null hypothesis: the proportion of defectives is equal for all three shifts
• Alternate hypothesis: the proportions of defectives are not all equal across the three shifts

State the hypotheses and significance level
Ho: P1 = P2 = P3
Ha: at least one Pi is different (α = 0.05)

What hypothesis test is appropriate?
• These hypotheses deal with several proportions
• Use the Chi-Square Test for Association

Analysis Using Minitab
Tool Bar Menu > Stat > Tables > Chi-Square Test

1. Enter the data in a Worksheet
2. Stat > Tables > Chi-Square Test
3. Fill in the dialog [dialog screenshot not reproduced]
4. Click OK

Chi-Square Test: First, Second, Third
Expected counts are printed below observed counts

           First    Second     Third     Total
    1         43        56        68       167
           60.31     55.89     50.80

    2        965       878       781      2624
          947.69    878.11    798.20

Total       1008       934       849      2791

Chi-Sq = 4.970 + 0.000 + 5.824 + 0.316 + 0.000 + 0.371 = 11.481
DF = 2, P-Value = 0.003

Legend: upper value in each cell = Oij (observed count); lower value = Eij = (Xi * Yj) / Ftotal (expected count); Xi = row total; Yj = column total; Ftotal = grand total (2791).

$$\chi^2_{(calc)} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
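
The Minitab figures above can be cross-checked with a short script. This is a sketch only: it uses the fact that, for df = 2 specifically, the upper-tail probability of the chi-square distribution is exp(-χ²/2); that shortcut is an assumption of this example and does not apply to other degrees of freedom.

```python
import math

observed = [[43, 56, 68],      # defectives per shift
            [965, 878, 781]]   # non-defectives per shift

row_totals = [sum(row) for row in observed]        # Xi
col_totals = [sum(col) for col in zip(*observed)]  # Yj
total = sum(row_totals)                            # Ftotal

# Per-cell contribution (O - E)^2 / E, with E = Xi * Yj / Ftotal
contributions = [
    [(o - x * y / total) ** 2 / (x * y / total) for o, y in zip(row, col_totals)]
    for row, x in zip(observed, row_totals)
]
chi2 = sum(sum(row) for row in contributions)
df = (len(observed) - 1) * (len(col_totals) - 1)

# Valid for df = 2 only: P(chi-square > x) = exp(-x / 2)
p_value = math.exp(-chi2 / 2)

print([[round(c, 3) for c in row] for row in contributions])
print(f"Chi-Sq = {chi2:.3f}, DF = {df}, P-Value = {p_value:.3f}")
```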

Interpretation:
• p-value = 0.003
• p-value < α-risk (0.05): reject Ho
• Infer Ha: there is sufficient evidence that the proportions of defectives are not all equal across the three shifts


Multiple Proportion Exercise #1

• Three locations are being analyzed to understand the impact of distance (in miles) on product shipping damage
• Location A is 200 miles from the customer, Location B is 500 miles from the customer, and Location C is 50 miles from the customer
• 100 shipments of delivered product from each location were sampled:
    • Location A = 5% breakage
    • Location B = 8% breakage
    • Location C = 10% breakage
• Determine if the amount of breakage is related to the traveling distance

Learning Objectives

• Use a Contingency Table for a 3 or more Proportions Test
• Contingency Table Testing method
• Test of association



Hypothesis Testing Roadmap

For all tests: p > 0.05: Fail to Reject Ho (null); p < 0.05: Reject Ho

Attribute Data
  Proportions Testing
    Ho: P1 = P2
    Ha: P1 ≠ P2
    Minitab: Stat - Basic Stat - 1 or 2 Proportions
  Contingency Table (2 factors only)
    Ho: Two factors are independent
    Ha: Two factors are dependent
    Minitab: Stat - Tables - Chi-square Test

Continuous Data
  Normality Test (one factor only)
    Ho: Data is Normal
    Ha: Data is NOT Normal
    Minitab: Stat - Basic Stat - Normality Test (use Anderson-Darling)

  Non Normal
    Variances, Two or More Samples: Levene’s Test
      Ho: σ1 = σ2 = σ3 = ...
      Ha: at least one is different
      Minitab: Stat - Anova - Test for Equal Variances
      For only two σ’s this is similar to an F-Test: F = (S1)² / (S2)². If F calc > F table, then reject null.
      (Use Chi-Squared for one sample)
    Medians, One Sample
      Ho: M1 = M target
      Ha: M1 ≠ M target
      Minitab: Stat - Nonparametric - 1 Sample-Sign (OR) Stat - Nonparametric - 1 Sample-Wilcoxon
      (This is also used for paired comparisons: Ho: M1 = 0)
      M1 = Median of sample 1; M target = Target Median
    Medians, 2 or More Samples
      Ho: M1 = M2 = M3 = ...
      Ha: at least one is different
      Minitab: Stat - Nonparametric - Mann-Whitney (OR) Kruskal-Wallis (OR) Mood’s Median (OR) Friedmans
      M1 = Median of sample 1, etc.

  Normal
    Variances, One Sample: Chi-Squared
      Ho: σ = target
      Ha: σ ≠ target
      Minitab: Stat - Basic Stat > Display Desc Stat > Graphical Summary
      If target falls within the CI, then fail to reject Ho.
    Variances, Two or More Samples: Bartlett’s Test
      Ho: σ1 = σ2 = σ3 = ...
      Ha: at least one is different
      Minitab: Stat - Anova - Test for Equal Variances
      (For only two σ’s this is the same as an F-Test: F = (S1)² / (S2)². If F calc > F table, then reject null.)
    Means, One Sample: 1 Sample T Test
      Ho: μ = target
      Ha: μ ≠ target
      Minitab: Stat - Basic Stats - 1 Sample-T
      (This is also used for paired comparisons: Ho: μ = 0)
    Means, Two Samples: 2 Sample T Test (Variances Equal)
      Ho: μ1 = μ2
      Ha: μ1 ≠ μ2
      Minitab: Stat - Basic Stats - 2-Sample T
      (Compares means using pooled Std Dev; check box to assume equal variances)
    Means, Two Samples: 2 Sample T Test (Variances Not Equal)
      Ho: μ1 = μ2
      Ha: μ1 ≠ μ2
      Minitab: Stat - Basic Stats - 2-Sample T
      (Compares means using unpooled Std Dev; leave the equal-variances box unchecked)
    Means, Two or More Samples: One Way Anova
      Ho: μ1 = μ2 = μ3 = ...
      Ha: at least one is different
      Minitab: Stat - Anova - One-way
      Assumes equal variances (be careful if Bartlett’s p < 0.05)

AlliedSignal BBs & GenCorp BBs were major contributors to this chart.

Test of Association
• A contingency table is used to analyze data via a two-way classification (involving two factors). The data are usually attribute in nature (frequency counts), although they need not be.
• This tool is used to test the relationship between two sources of variation. The relationship can be statistically described as follows:
    Ho: “{factor A} is independent of {factor B}”
    Ha: “{factor A} is NOT independent of {factor B}”
• Both manufacturing processes and transactional processes are well suited to this tool.


Calculate the Expected Value

Exercise: Calculate the expected values for each cell.

            Good Chillers   Bad Chillers   Totals
Machine 1   O11 = 20        O12 = 50       70
Machine 2   O21 = 40        O22 = 70       110
Totals      60              120            180

Expected = (Xr * Yc) / Ftotal

            Good Chillers                      Bad Chillers
Machine 1   E11 = (70 * 60) / 180 = _____      E12 = (70 * 120) / 180 = _____
Machine 2   E21 = (110 * 60) / 180 = _____     E22 = (110 * 120) / 180 = _____


Calculate the Chi-Squared Value

Exercise: Calculate χ²(calc) for this data.

            Good Chillers   Bad Chillers
Machine 1   O11 = 20        O12 = 50
            E11 = 23.3      E12 = 46.7
Machine 2   O21 = 40        O22 = 70
            E21 = 36.7      E22 = 73.3

$$\chi^2_{(calc)} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

Cell contributions:
Machine 1, Good Chillers: (O11 - E11)² / E11 = (20 - 23.3)² / 23.3 = _____
Machine 1, Bad Chillers:  (O12 - E12)² / E12 = (50 - 46.7)² / 46.7 = _____
Machine 2, Good Chillers: (O21 - E21)² / E21 = (40 - 36.7)² / 36.7 = _____
Machine 2, Bad Chillers:  (O22 - E22)² / E22 = (70 - 73.3)² / 73.3 = _____

χ²(calc) = ( ) + ( ) + ( ) + ( ) = _____
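
One way to sanity-check the hand arithmetic for this exercise: in any 2 x 2 table the deviation |O - E| is the same in all four cells, because the row and column totals are fixed. The sketch below (variable names are illustrative, not from the course material) prints the deviations and the resulting statistic without rounding the expected values first.

```python
observed = [[20, 50],   # Machine 1: good, bad chillers
            [40, 70]]   # Machine 2: good, bad chillers

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Unrounded expected counts, E = (row total * column total) / grand total
expected = [[x * y / total for y in col_totals] for x in row_totals]

# O - E for every cell; in a 2 x 2 table these share one magnitude
deviations = [
    [o - e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

chi2 = sum(
    d ** 2 / e
    for dev_row, exp_row in zip(deviations, expected)
    for d, e in zip(dev_row, exp_row)
)

print("deviations:", deviations)
print("chi-square (calc):", chi2)
```

If the four hand-computed deviations do not share one magnitude, an expected value has probably been miscalculated or rounded too early.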


Contingency Table - Example

Example: To illustrate the use and analysis of contingency tables, let’s consider the GAGE evaluation for a group of parts using 3 different METHODS for reviewing visual attributes. Set α = 0.05.

TYPE      Method A   Method B   Method C
GAGE 1    37         41         44
GAGE 2    35         72         71

Solution:
A. Practical Problem: Does a particular gage create more or fewer defects depending on the test method?
B. Ho: Test method is independent of the gage.
C. Ha: Test method is NOT independent of the gage.
D. Determine the test statistic χ²(calc). (We’ll use Minitab.)

• Interpret output
    What is χ²(calc)?
    What is the P-Value?
    What does it mean?

Chi-Square Test
Expected counts are printed below observed counts

         Method A   Method B   Method C   Total
    1       37         41         44       122
          29.28      45.95      46.77
    2       35         72         71       178
          42.72      67.05      68.23
Total       72        113        115       300

ChiSq = 2.035 + 0.534 + 0.164 + 1.395 + 0.366 + 0.112 = 4.606
df = 2, p = 0.100

 Solution:
E. Determine the critical value of the test (in Excel, CHIINV).
   • What are the degrees of freedom? (Answer: 2)
   • What is the value of χ²(critical)? (Answer: 5.99)

F. If χ²(calc) > χ²(critical), then REJECT Ho.

   χ²(calc) = 4.606, χ²(critical) = 5.99: since 4.606 < 5.99, we FAIL to reject Ho.
   Alpha = 0.05, P-Value = 0.10: since 0.10 > 0.05, we FAIL to reject Ho.

G. Translate the statistical conclusion into process terms.
   • There is no significant evidence that the differences we see in defects for each test method depend on the gage used; test method and gage are treated as INDEPENDENT.
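
Steps E and F can also be reproduced without a lookup table. The sketch below relies on a shortcut that holds for df = 2 only: the upper-tail probability is exp(-x/2), so the critical value is -2 ln(α). For other degrees of freedom a table or a function such as Excel's CHIINV is still needed. Variable names are illustrative.

```python
import math

alpha = 0.05
observed = [[37, 41, 44],   # GAGE 1: Method A, B, C
            [35, 72, 71]]   # GAGE 2: Method A, B, C

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi2 = sum(
    (o - x * y / total) ** 2 / (x * y / total)
    for row, x in zip(observed, row_totals)
    for o, y in zip(row, col_totals)
)
df = (len(observed) - 1) * (len(col_totals) - 1)
assert df == 2, "the closed forms below are only valid for df = 2"

chi2_critical = -2 * math.log(alpha)   # solves exp(-x / 2) = alpha
p_value = math.exp(-chi2 / 2)          # upper-tail probability for df = 2
reject_h0 = chi2 > chi2_critical

print(f"chi2(calc) = {chi2:.3f}, chi2(critical) = {chi2_critical:.2f}")
print(f"p-value = {p_value:.3f}, reject Ho: {reject_h0}")
```

Both decision rules (statistic versus critical value, p-value versus α) give the same answer, as on the slide.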


Apply the Method

Exercise: Use the worksheet in Bhh146.mtw to decide if the outcome of a surgical procedure depends on the hospital used.

Problem:
Ho: Results of the surgical procedure are not hospital dependent
Ha: Results of the surgical procedure are hospital dependent

E = (Xr * Yc) / Ftotal

$$\chi^2_{(calc)} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

                  Hosp A   Hosp B   Hosp C   Hosp D   Hosp E
NI (Observed)     13       5        8        21       43
   (Expected)
   (chi-sq)
SI (Observed)     18       10       36       56       29
   (Expected)
   (chi-sq)
GI (Observed)     16       16       35       51       10
   (Expected)
   (chi-sq)

χ²(calc) = 56.705
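
For a table of this size, filling in the (Expected) and (chi-sq) rows by hand is tedious, so a short script may help check the worksheet. The sketch below is an assumption-laden illustration: the helper `upper_tail_even_df` is a hypothetical name, and it uses the finite-sum identity for the chi-square upper tail, which is valid only when df is even (here df = 8).

```python
import math

hospitals = ["Hosp A", "Hosp B", "Hosp C", "Hosp D", "Hosp E"]
observed = {
    "NI": [13, 5, 8, 21, 43],
    "SI": [18, 10, 36, 56, 29],
    "GI": [16, 16, 35, 51, 10],
}

row_totals = {k: sum(v) for k, v in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
total = sum(row_totals.values())

chi2 = 0.0
for outcome, counts in observed.items():
    expected = [row_totals[outcome] * y / total for y in col_totals]
    cells = [(o - e) ** 2 / e for o, e in zip(counts, expected)]
    chi2 += sum(cells)
    print(outcome, "expected:", [round(e, 2) for e in expected])
    print(outcome, "chi-sq:  ", [round(c, 3) for c in cells])

df = (len(observed) - 1) * (len(hospitals) - 1)


def upper_tail_even_df(x, df):
    """P(chi-square > x) for even df, as exp(-x/2) * sum((x/2)^k / k!)."""
    if df % 2 != 0:
        raise ValueError("this identity only holds for even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


p_value = upper_tail_even_df(chi2, df)
print(f"chi2(calc) = {chi2:.3f}, df = {df}, p-value = {p_value:.2e}")
```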


Exercise for Trainees

• Use the seven-step contingency table analysis to find out which KPIVs are causing the defects to occur. (File: ContingT.mtw)

OPERATORS          WILLIE   BILLIE   TILLY
           GOOD    69       75       81
           BAD     31       25       19

MATERIALS          LOT1     LOT2     LOT3     LOT4
           GOOD    45       67       49       64
           BAD     21       11       23       12

KPIV               TEMP 1   TEMP 2   TEMP 3
           GOOD    64       79       54
           BAD     11       22       11


End of Topic
What questions do you have?


The χ² Distribution
• Use discrete, nominal or category data (no ranking, variable or ratio scale data)
• Observations must be independent. No repeat measurements on the same part.
• (R-1)(C-1) = df
• χ² generally works best with 5 or more observations in each cell.

[Figure: Chi-square distribution for various degrees of freedom (ν = 2, 4, 6, 10). Vertical axis: value of the χ² distribution, 0 to 0.5. Horizontal axis: χ², 0 to about 20.]
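
The curves in the figure can be regenerated from the standard chi-square density formula. The sketch below is illustrative only: the function name `chi_square_pdf` and the sample x values are chosen for the example, and the function is intended for x > 0.

```python
import math


def chi_square_pdf(x, df):
    """Chi-square density for x > 0 with df degrees of freedom.

    f(x) = x^(df/2 - 1) * exp(-x/2) / (2^(df/2) * Gamma(df/2))
    """
    half_df = df / 2
    return x ** (half_df - 1) * math.exp(-x / 2) / (2 ** half_df * math.gamma(half_df))


# Tabulate the density at a few points for the degrees of freedom in the figure
for df in (2, 4, 6, 10):
    values = [round(chi_square_pdf(x, df), 3) for x in (1, 5, 10, 15)]
    print(f"df = {df:2d}: {values}")
```

As the degrees of freedom increase, the peak of the curve moves to the right and flattens, which is why the critical value grows with df.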
