Contingency Tables
Learning Objectives
Use Contingency Table for 3 or more Proportions Test
Contingency Table Testing method
Test of association
Seagate Confidential 2 Supplier Six Sigma Modular Training
Contingency Table
You have learned to use ANOVA to compare 3 or more
population means.
What if you have 3 or more population proportions to
compare?
The one- and two-proportion tests no longer apply.
Here, we can use a contingency table with the
χ² (chi-square) test statistic.
Multiple Proportion Comparison
P1 vs P2 vs P3
Practical Question (example): Are the populations’ proportions
statistically different?
Statistical Question
Ho: P1 = P2 = P3
Ha: at least one Pi is different
Multiple Proportion Test - Procedure
A) State the problem statement
B) Null hypothesis: Ho: P1 = P2 = …. = Pk
C) Alternative hypothesis: Ha: not all Pi are equal
D) Determine significance level
E) Create contingency table of “successes” and “failures” from
binomial samples
Multiple Proportion Test - Procedure
F) Determine the test statistic using the following formula or with Minitab (Stat >
Tables > Chi-square Test):

$$\chi^2_{(calc)} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad df = (r-1)(c-1)$$

where: O = the observed value (from experimental data)
E = the expected value = (Xr * Yc) / Ftotal
r = number of rows
c = number of columns
Xr = total frequency for that row
Yc = total frequency for that column
Ftotal = total frequency for that table

Recall:

$$\chi^2 = \frac{(n-1)\,s^2}{\sigma_0^2}, \qquad s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$

G) Determine the critical value of the χ² test statistic
Reject the null hypothesis if χ²(observed) > χ²(critical)
H) Translate the statistical conclusion into a practical solution
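Steps E through G can be sketched in a few lines of Python. This is a minimal illustration and not part of the original training material: the function name `chi_square_test`, the sample counts, and the small lookup table of critical values at α = 0.05 (taken from standard χ² tables) are all assumptions made for the example.

```python
# Chi-square critical values at alpha = 0.05 for df = 1..8 (standard table values).
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488,
                    5: 11.070, 6: 12.592, 7: 14.067, 8: 15.507}


def chi_square_test(observed):
    """Return (chi2, df, expected) for an r x c table of counts.

    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in observed]          # Xr
    col_totals = [sum(col) for col in zip(*observed)]    # Yc
    grand_total = sum(row_totals)                        # Ftotal
    # Expected count for each cell: (Xr * Yc) / Ftotal
    expected = [[x * y / grand_total for y in col_totals] for x in row_totals]
    # Sum of (O - E)^2 / E over all cells
    chi2 = sum((o - e) ** 2 / e
               for o_row, e_row in zip(observed, expected)
               for o, e in zip(o_row, e_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Hypothetical data: "successes" (row 1) and "failures" (row 2) for 3 populations.
table = [[12, 20, 31],
         [88, 80, 69]]

chi2, df, expected = chi_square_test(table)
reject = chi2 > CHI2_CRITICAL_05[df]
print(f"chi2 = {chi2:.3f}, df = {df}, reject Ho: {reject}")
```

The lookup table only covers α = 0.05 and up to 8 degrees of freedom; for other cases the critical value would come from Minitab, Excel, or a printed χ² table.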
How to construct a Contingency Table
Data is arranged into rows and columns, with each possible
occurrence of interest being a particular row x column combination (a
cell). The occurrences belonging to each cell are counted and the count is
placed in that cell.
Each row and each column has a total count associated with it.
Based on these row and column totals, each cell (row/column
combination) has an expected count associated with it.
The actual counts are compared to the expected counts for all
cells. The likelihood of the observed variation is evaluated using a
statistical test called the Chi-Square (Goodness-of-Fit) Test.
Contingency Table
         P1       P2       P3       P4       Totals
Bad      2        4        3        1        X1 = 10
Good     0        3        2        5        X2 = 10
Totals   Y1 = 2   Y2 = 7   Y3 = 5   Y4 = 6   Ftotal = 20
For discrete data:
> The fraction of a random element falling in Row i, say Ui = Xi / Ftotal
> The fraction of a random element falling in Column j, say Vj = Yj / Ftotal
> Expected value of each cell Eij = Ftotal x Ui x Vj = (Xi * Yj) / Ftotal
From the observed Oij values and the calculated Eij, we can derive the Chi-Square (χ²) statistic:

$$\chi^2_{(calc)} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
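The row fractions, column fractions, and expected counts for the Bad/Good table on this slide can be worked through numerically. The following is a small sketch only; the variable names (`u`, `v`, `expected_direct`, and so on) are chosen for illustration and do not come from the slides.

```python
observed = [[2, 4, 3, 1],   # Bad
            [0, 3, 2, 5]]   # Good

row_totals = [sum(row) for row in observed]          # X1, X2
col_totals = [sum(col) for col in zip(*observed)]    # Y1..Y4
f_total = sum(row_totals)                            # Ftotal

u = [x / f_total for x in row_totals]                # Ui = Xi / Ftotal
v = [y / f_total for y in col_totals]                # Vj = Yj / Ftotal

# Expected counts two ways: Ftotal * Ui * Vj and (Xi * Yj) / Ftotal.
expected = [[f_total * ui * vj for vj in v] for ui in u]
expected_direct = [[x * y / f_total for y in col_totals] for x in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over all cells.
chi_sq = sum((o - e) ** 2 / e
             for o_row, e_row in zip(observed, expected_direct)
             for o, e in zip(o_row, e_row))

print("Row totals:", row_totals, "Column totals:", col_totals, "Ftotal:", f_total)
print("Expected counts:", [[round(e, 2) for e in row] for row in expected])
print("Chi-square (calc):", round(chi_sq, 3))
```

Both forms of the expected-count formula give the same values, which is the point of the last line of the derivation.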
r x c Contingency Table
                      Column Factor (Yj)
                 1      2      3      ...    c      Totals
Row         1    O11    O12    O13    ...    O1c    X1
Factor      2    O21    O22    O23    ...    O2c    X2
(Xi)        3    O31    O32    O33    ...    O3c    X3
            ...
            r    Or1    Or2    Or3    ...    Orc    Xr
Totals           Y1     Y2     Y3     ...    Yc     Ftotal
Industrial Process Example:
Defective Sub-Assembly
Jack is working on a project to address the root cause of
defective sub-assemblies. It has been brought to Jack’s
attention that a higher percentage of defectives are produced in
the third shift.
Jack collected the following data to check the validity of the
above claim:
TYPE First Shift Second Shift Third Shift
Defectives 43 56 68
Non-Defectives 965 878 781
Test at 5% significance level to determine if the proportion of
defectives is the same for all three shifts.
Industrial Process Example:
Defective Sub-Assembly
Practical problem
Are some shifts producing more defectives than
others?
Statistical problem
Is the proportion of defectives the same for all
three shifts?
Null hypothesis: the proportion of defectives is equal
for all three shifts
Alternate hypothesis: the proportion of defectives is not
equal for all three shifts
Industrial Process Example:
Defective Sub-Assembly
State the hypotheses and significance level
Ho: P1 = P2 = P3
Ha: at least one Pi is different (α = 0.05)
What hypothesis test is appropriate?
These hypotheses deal with several proportions
Use Chi-Square Test for Association
Industrial Process Example:
Defective Sub-Assembly
Analysis Using Minitab
1. Enter the data in a Worksheet
2. Tool Bar Menu > Stat > Tables > Chi-Square Test
3. Fill in the dialog with the data columns (First, Second, Third)
4. Click OK
Industrial Process Example:
Defective Sub-Assembly
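Chi-Square Test: First, Second, Third
Expected counts are printed below observed counts

           First   Second    Third    Total
    1         43       56       68      167
           60.31    55.89    50.80
    2        965      878      781     2624
          947.69   878.11   798.20
Total       1008      934      849     2791

Chi-Sq = 4.970 + 0.000 + 5.824 + 0.316 + 0.000 + 0.371 = 11.481
DF = 2, P-Value = 0.003

Reading the output: in each cell the upper value is the observed count Oij and the
lower value is the expected count Eij = (Xi * Yj) / Ftotal. The row totals are Xi,
the column totals are Yj, and the grand total (2791) is Ftotal.

$$\chi^2_{(calc)} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$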
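The Minitab figures for the three shifts can be cross-checked by hand. The sketch below recomputes the expected counts, the six cell contributions, and the p-value in plain Python; it is offered as an illustration rather than as the course's method. The p-value line relies on the fact that, for 2 degrees of freedom only, the upper-tail χ² probability equals exp(-χ²/2).

```python
import math

shifts = ["First", "Second", "Third"]
observed = [[43, 56, 68],        # Defectives
            [965, 878, 781]]     # Non-Defectives

row_totals = [sum(row) for row in observed]          # Xi
col_totals = [sum(col) for col in zip(*observed)]    # Yj
f_total = sum(row_totals)                            # Ftotal

expected = [[x * y / f_total for y in col_totals] for x in row_totals]
contributions = [[(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
                 for o_row, e_row in zip(observed, expected)]

chi_sq = sum(sum(row) for row in contributions)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Closed form for the upper-tail probability; valid only because df == 2 here.
p_value = math.exp(-chi_sq / 2)

for name, e_def, e_ok in zip(shifts, expected[0], expected[1]):
    print(f"{name:>6}: expected defectives {e_def:.2f}, non-defectives {e_ok:.2f}")
print(f"Chi-Sq = {chi_sq:.3f}, DF = {df}, P-Value = {p_value:.3f}")
```

The largest contributions come from the defective counts of the first and third shifts, which is where the observed counts sit furthest from the expected ones.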
Industrial Process Example:
Defective Sub-Assembly
Interpretation:
p-value = 0.003
p-value < α-risk (0.05): reject Ho
Infer Ha: there is sufficient evidence that the proportion of
defectives is not equal for all three shifts
Multiple Proportion Exercise #1
Three locations are being analyzed to understand the impact of
distance (in miles) on product shipping damage
Location A is 200 miles from the customer, Location B is 500
miles from the customer, and Location C is 50 miles from the
customer
100 shipments of delivered product from each location were
sampled:
Location A = 5% breakage
Location B = 8% breakage
Location C = 10% breakage
Determine if the amount of breakage is related to the traveling
distance
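Because the exercise reports percentages, the first step is to turn them back into counts of broken and intact shipments. The snippet below is only a setup sketch for that step, with illustrative variable names; it stops at the contingency table and leaves the test itself to the reader.

```python
sample_size = 100                                   # shipments sampled per location
breakage_pct = {"A": 5, "B": 8, "C": 10}            # percent broken, from the exercise
distance_miles = {"A": 200, "B": 500, "C": 50}      # distance to the customer

locations = ["A", "B", "C"]
broken = [sample_size * breakage_pct[loc] // 100 for loc in locations]
intact = [sample_size - b for b in broken]

# Contingency table: row 1 = broken, row 2 = intact, one column per location.
table = [broken, intact]

for loc, b, ok in zip(locations, broken, intact):
    print(f"Location {loc} ({distance_miles[loc]} miles): {b} broken, {ok} intact")
```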
Hypothesis Testing Roadmap

For all tests:
p > 0.05 Fail to Reject Ho (null)
p < 0.05 Reject Ho

Attribute Data
  Proportions Testing
    Ho: P1 = P2
    Ha: P1 ≠ P2
    Minitab: Stat - Basic Stat - 1 or 2 Proportions
  Contingency Table (2 factors only)
    Ho: Two factors are independent
    Ha: Two factors are dependent
    Minitab: Stat - Tables - Chi-square Test

Continuous Data (one factor only)
  Normality Test (use Anderson-Darling)
    Ho: Data is Normal
    Ha: Data is NOT Normal
    Minitab: Stat - Basic Stat - Normality Test

  Non Normal
    Levene’s Test (Two or More Samples)
      Ho: σ1 = σ2 = σ3 = ...
      Ha: at least one is different
      Minitab: Stat - Anova - Test for Equal Variances
      (For only two σ’s this is similar to an F-Test: F = (S1)² / (S2)².
      If F calc > F table, then reject null.)
    1 Sample-Sign or 1 Sample-Wilcoxon (One Sample)
      Ho: M1 = M target
      Ha: M1 ≠ M target
      Minitab: Stat - Nonparametric - 1 Sample-Sign (OR)
               Stat - Nonparametric - 1 Sample-Wilcoxon
      (This is also used for paired comparisons: Ho: M1 = 0)
      M1 = Median of sample 1
      M target = Target Median
    Mann-Whitney, Kruskal-Wallis, Mood’s Median or Friedmans (2 or More Samples)
      Ho: M1 = M2 = M3 = ...
      Ha: at least one is different
      Minitab: Stat - Nonparametric - Mann-Whitney (OR)
               Stat - Nonparametric - Kruskal-Wallis (OR)
               Stat - Nonparametric - Mood’s Median (OR)
               Stat - Nonparametric - Friedmans
      M1 = Median of sample 1, etc...

  Normal
    Chi-Squared (One Sample)
      Ho: σ1 = σ target
      Ha: σ1 ≠ σ target
      (Use Chi-Squared for one sample)
      Minitab: Stat - Basic Stat > Display Desc Stat > Graphical Summary
      If target falls between CI, then fail to reject Ho.
    Bartlett’s Test (Two or More Samples)
      Ho: σ1 = σ2 = σ3 = ...
      Ha: at least one is different
      Minitab: Stat - Anova - Test for Equal Variances
      (For only two σ’s this is the same as an F-Test: F = (S1)² / (S2)².
      If F calc > F table, then reject null.)
    1 Sample T Test (One Sample)
      Ho: μ1 = μ target
      Ha: μ1 ≠ μ target
      Minitab: Stat - Basic Stats - 1 Sample-T
      (This is also used for paired comparisons: Ho: μ1 = 0)
    2 Sample T Test, Variances Not Equal (Two Samples)
      Ho: μ1 = μ2
      Ha: μ1 ≠ μ2
      Minitab: Stat - Basic Stats - 2-Sample T
      (Compares Means using unpooled Std Dev)
      Check box to assume unequal variances
    2 Sample T Test, Variances Equal (Two Samples)
      Ho: μ1 = μ2
      Ha: μ1 ≠ μ2
      Minitab: Stat - Basic Stats - 2-Sample T
      (Compares Means using pooled Std Dev)
      Check box to assume equal variances
    One Way Anova (Two or More Samples)
      Ho: μ1 = μ2 = μ3 = ...
      Ha: at least one is different
      Minitab: Stat - Anova - One-way
      Assumes Equal Variances (Be careful if Bartlett’s p < 0.05)

AlliedSignal BBs & GenCorp BBs were major contributors to this chart.
Test of Association
A contingency table is used to analyze data via a two way
classification (involving two factors). The data are usually attribute
in nature (frequency counts), although they need not be.
This tool is used to test the relationship between two sources of
variation. The relationship can be statistically described as follows:
Ho: “{factor A} is independent of {factor B}”
Ha: “{factor A} is NOT independent of {factor B}”
Both manufacturing and transactional processes are well suited to
this tool.
Calculate the Expected Value
Exercise: Calculate the expected values for each cell.
             Good Chillers   Bad Chillers   Totals
Machine 1    O11 = 20        O12 = 50       70
Machine 2    O21 = 40        O22 = 70       110
Totals       60              120            180

Expected = (Xr * Yc) / Ftotal

             Good Chillers                 Bad Chillers
Machine 1    E11 = (70 * 60) / 180 =       E12 = (70 * 120) / 180 =
Machine 2    E21 = (110 * 60) / 180 =      E22 = (110 * 120) / 180 =
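One way to check the hand calculation is to let a short script apply the same Expected = (Xr * Yc) / Ftotal rule to every cell. This is a sketch with illustrative variable names, not a prescribed solution.

```python
machines = ["Machine 1", "Machine 2"]
observed = [[20, 50],    # Machine 1: good chillers, bad chillers
            [40, 70]]    # Machine 2: good chillers, bad chillers

row_totals = [sum(row) for row in observed]          # Xr: 70, 110
col_totals = [sum(col) for col in zip(*observed)]    # Yc: 60, 120
f_total = sum(row_totals)                            # Ftotal: 180

# Expected count for each cell: (Xr * Yc) / Ftotal
expected = [[x * y / f_total for y in col_totals] for x in row_totals]

for name, (e_good, e_bad) in zip(machines, expected):
    print(f"{name}: expected good = {e_good:.1f}, expected bad = {e_bad:.1f}")
```

Each row of expected counts adds up to the observed row total, and each column to the observed column total, which is a quick sanity check on the arithmetic.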
Calculate the Chi-Squared Value
Exercise: Calculate χ²(calc) for this data.

             Good Chillers   Bad Chillers
Machine 1    O11 = 20        O12 = 50
             E11 = 23.3      E12 = 46.7
Machine 2    O21 = 40        O22 = 70
             E21 = 36.7      E22 = 73.3

$$\chi^2_{(calc)} = \sum \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

Cell contributions:
             Good Chillers                        Bad Chillers
Machine 1    (O11 - E11)² / E11                   (O12 - E12)² / E12
             = (20 - 23.3)² / 23.3 =              = (50 - 46.7)² / 46.7 =
Machine 2    (O21 - E21)² / E21                   (O22 - E22)² / E22
             = (40 - 36.7)² / 36.7 =              = (70 - 73.3)² / 73.3 =

χ²(calc) = ( ) + ( )
+ ( ) + ( ) = _____
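The four cell contributions in this exercise can be checked with a few lines of Python. The sketch below uses unrounded expected counts, so its result may differ slightly in the last digit from a hand calculation based on one-decimal values; the variable names are illustrative.

```python
observed = [[20, 50],    # Machine 1: good, bad
            [40, 70]]    # Machine 2: good, bad

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
f_total = sum(row_totals)

expected = [[x * y / f_total for y in col_totals] for x in row_totals]

# One (O - E)^2 / E term per cell, in the same layout as the table.
contributions = [[(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
                 for o_row, e_row in zip(observed, expected)]
chi_sq = sum(sum(row) for row in contributions)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

for label, row in zip(["Machine 1", "Machine 2"], contributions):
    print(label, [round(c, 3) for c in row])
print("Chi-square (calc):", round(chi_sq, 3), "with df =", df)
```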
Contingency Table- Example
Example: To illustrate the use and analysis of contingency tables, let’s consider the
GAGE evaluation for a group of parts using 3 different METHODS for reviewing visual
attributes. Set α = 0.05.
TYPE      Method A   Method B   Method C
GAGE 1    37         41         44
GAGE 2    35         72         71
Solution:
A. Practical Problem: Does a particular gage create more or fewer defects depending
on the test method?
B. Ho: Test method is independent of the gage.
C. Ha: Test method is NOT independent of the gage.
D. Determine the test statistic χ²(calc). (We’ll use Minitab.)
Contingency Table- Example
Interpret output
What is χ²(calc)?
What is the P-Value?
What does it mean?
Chi-Square Test
Expected counts are printed below observed counts
Method A Method B Method C Total
1 37 41 44 122
29.28 45.95 46.77
2 35 72 71 178
42.72 67.05 68.23
Total 72 113 115 300
ChiSq = 2.035 + 0.534 + 0.164 +
1.395 + 0.366 + 0.112 = 4.606
df = 2, p = 0.100
Contingency Table- Example
Solution:
E. Determine the critical value of the test (in Excel, CHIINV).
What are the degrees of freedom? (Answer: 2)
What is the value of χ²(critical)? (Answer: 5.99)
F. If χ²(calc) > χ²(critical), then REJECT Ho.
χ²(calc) = 4.606, χ²(critical) = 5.99. Since 4.606 < 5.99, we FAIL to reject Ho.
Alpha = 0.05, P-Value = 0.10. Since 0.10 > 0.05, we FAIL to reject Ho.
G. Translate the statistical conclusion into process terms.
We conclude that the differences we see in defects for each test method are INDEPENDENT of
the gage used.
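The critical value and p-value quoted in this solution can be reproduced without a table. The sketch below is an illustration only and leans on a special case: for df = 2 the upper-tail χ² probability is exp(-χ²/2), so the critical value is -2·ln(α). For other degrees of freedom a table, Excel, or Minitab is still needed.

```python
import math

alpha = 0.05
df = 2                      # (rows - 1) * (columns - 1) = (2 - 1) * (3 - 1)
chi_sq_calc = 4.606         # test statistic from the Minitab output

# Special case df == 2: P(chi-square > x) = exp(-x / 2).
chi_sq_critical = -2 * math.log(alpha)
p_value = math.exp(-chi_sq_calc / 2)

reject_by_statistic = chi_sq_calc > chi_sq_critical
reject_by_p_value = p_value < alpha

print(f"Critical value: {chi_sq_critical:.2f}")
print(f"P-value: {p_value:.3f}")
print("Reject Ho" if reject_by_statistic else "Fail to reject Ho")
```

The two decision rules always agree, because comparing the statistic with the critical value and comparing the p-value with α are the same test expressed on different scales.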
Apply the Method
Exercise: Use the worksheet in Bhh146.mtw to decide if the outcome
of a surgical procedure depends on the hospital used.
Problem:
Ho: Results of surgical procedure are not hospital dependent
Ha: Results of surgical procedure are hospital dependent

E = (Xr * Yc) / Ftotal

$$\chi^2_{(calc)} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

                 Hosp A   Hosp B   Hosp C   Hosp D   Hosp E
NI (Observed)    13       5        8        21       43
   (Expected)
   (chi-sq)
SI (Observed)    18       10       36       56       29
   (Expected)
   (chi-sq)
GI (Observed)    16       16       35       51       10
   (Expected)
   (chi-sq)

χ²(calc) = 56.705
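The blank (Expected) and (chi-sq) rows of this worksheet can be filled in programmatically and the total compared with the stated χ²(calc) of 56.705. The following is a sketch with the observed counts typed in from the slide rather than read from Bhh146.mtw; the row labels NI, SI and GI are used exactly as given.

```python
hospitals = ["Hosp A", "Hosp B", "Hosp C", "Hosp D", "Hosp E"]
outcomes = ["NI", "SI", "GI"]
observed = [[13, 5, 8, 21, 43],      # NI
            [18, 10, 36, 56, 29],    # SI
            [16, 16, 35, 51, 10]]    # GI

row_totals = [sum(row) for row in observed]          # Xr
col_totals = [sum(col) for col in zip(*observed)]    # Yc
f_total = sum(row_totals)                            # Ftotal

expected = [[x * y / f_total for y in col_totals] for x in row_totals]
cell_chi_sq = [[(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
               for o_row, e_row in zip(observed, expected)]

chi_sq = sum(sum(row) for row in cell_chi_sq)
df = (len(outcomes) - 1) * (len(hospitals) - 1)

for name, e_row, c_row in zip(outcomes, expected, cell_chi_sq):
    print(name, "(Expected)", [round(e, 2) for e in e_row])
    print(name, "(chi-sq)  ", [round(c, 3) for c in c_row])
print(f"Chi-square (calc) = {chi_sq:.3f}, df = {df}")
```

With 8 degrees of freedom the tabulated critical value at α = 0.05 is 15.507, so the printed statistic can be compared against it directly.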
Exercise for Trainees
Use the seven-step contingency table analysis to find out
which KPIVs are causing the defects to occur. (File:
ContingT.mtw)
OPERATORS    WILLIE   BILLIE   TILLY
  GOOD       69       75       81
  BAD        31       25       19

MATERIALS    LOT1     LOT2     LOT3     LOT4
  GOOD       45       67       49       64
  BAD        21       11       23       12

KPIV         TEMP 1   TEMP 2   TEMP 3
  GOOD       64       79       54
  BAD        11       22       11
End of Topic
What questions do you have?
The χ² Distribution
Use discrete, nominal or category data (no ranking, variable or ratio
scale data).
Observations must be independent. No repeat measurements on the
same part.
(R-1)(C-1) = df
χ² generally works best with 5 or more observations in each
cell.
[Figure: Chi-square distribution for various degrees of freedom (df = 2, 4, 6, 10). Horizontal axis: χ² from 0 to about 20. Vertical axis: value of the χ² distribution, from 0 to 0.5.]
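The curves in the figure can be regenerated from the standard χ² density, f(x; k) = x^(k/2 - 1) · e^(-x/2) / (2^(k/2) · Γ(k/2)). The sketch below evaluates it at a few points for the four degrees of freedom shown; the function name `chi2_pdf` and the chosen x values are illustrative assumptions.

```python
import math


def chi2_pdf(x, df):
    """Chi-square density at x > 0 for df degrees of freedom."""
    half = df / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))


x_values = [0.1, 2.0, 5.0, 10.0, 19.9]

for df in (2, 4, 6, 10):
    row = ", ".join(f"f({x}) = {chi2_pdf(x, df):.4f}" for x in x_values)
    print(f"df = {df:>2}: {row}")
```

For df = 2 the density starts near 0.5 at x close to 0 and decays from there, which matches the top of the vertical axis in the figure; higher degrees of freedom shift the peak to the right and flatten the curve.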