
ADVANCED

STATISTICAL
METHODS
FOR ENGINEERS

Chapter Zero

Welcome to Advanced Statistical Methods for Engineers!

Ground Rules, please

Use name tents
Cell phones:
Turn off or use vibrate
Take phone calls outside
Keep side conversations to a minimum
Be prompt in returning from breaks
Don't do other work during class
Let the instructor know if you need to leave for more than 30 minutes
Listen with an open and active mind
If you have a question at any time, ask!
Other ground rules wanted by students?
Does the class agree to these ground rules?

Agenda

Day 1
8:00  Ch 0: Welcome
9:00  Ch 1: ANOVA and Equivalence Testing
12:00 Lunch on your own
1:00  Ch 2: Measurement Systems Analysis
End of Day Review

Day 2
Ch 3: Distribution Analysis
Lunch on your own
Ch 4: Process Capability and Tolerance Intervals
End of Day Review

Day 3
Ch 5: Regression and GLM
Lunch on your own
Ch 5: Regression and GLM continued
End of Day Review

Day 4
Ch 6: Logistic Regression
Ch 7: Statistical Resources
End of Day Review
Online Evaluations
Lunch on your own

Breaks as Needed

Logistics
Starting Time: 8:00
Ending Time: Not later than 5:00
Lunch 12:00-1:00
Breaks every 90-120 minutes
Power Outlets
Rest Room Location
Food and drink locations (snacks, cafeteria, etc.)

You Need ...

Laptop with MINITAB and a working wireless Internet connection
Writing instruments
Access to data files

Icebreaker (5 Minutes)
In my journey through the world of statistics . . .
One thing that has worked well for me is . . .
One thing that has been a challenge for me is . . .

(Extra Credit)
My favorite statistician, living or dead, is . . .
My favorite statistics joke is . . .

Expectations
Tools, tools, tools

Course may overlap with material from DRM or Lean Sigma
Tools may be familiar, but the intent is to present the tools with a focus on statistical thinking and decision-making.
Topics may be explored in greater mathematical depth than is offered in other curricula.

Benefits

A deep mathematical dive can actually help you better see the surface.

Awareness of mathematical assumptions is a critical first step for growing in your statistical knowledge, but advanced practitioners need to know:

Which assumptions are most critical?

When is it appropriate to break the rules?

What are the consequences of breaking the rules?

Statistical sophistication allows for flexibility and creativity in problem solving.

Expectations
Experience Chart

Mark an X in the column that best describes your experience with each topic.

Topic: None / A Little / Comfortable / Proficient / I could teach it
Equivalence Testing
Tolerance Intervals
ANOVA Signal Interpretation
Measurement Systems Analysis
Distribution Analysis
Process Capability
General Linear Models

Your Expectations

Create a list at your table
Each table will report
Spokesperson: skip items already mentioned

Time: 10 Minutes

Your Feedback is Critical

September 17-20 represents the first wave of Advanced SME at MDT.
Given that many of you already are leaders in the statistical or DRM worlds, your suggestions for course improvements are extremely important!
At the end of each day, we will engage in a brief feedback session.
At the end of the week, there will be an online survey for you to formally evaluate the course.
If you wish to provide more detailed feedback, please send an email to the instructor team: Leroy Mattson, Karen Hulting, Jeremy Strief, Tom Keenan, Grant Short, Dayna Cruz

9 | MDT Confidential

What questions do you have?


Chapter 1:
ANOVA and Equivalence Testing

Topics
Quality Trainer Review
ANOVA

Assumptions
Using Minitab Assistant vs Stat Menu
Calculation Deep Dive
Sample Size
ANOVA Signals

Equivalence Testing

2 | MDT Confidential

Quality Trainer Review

3 | MDT Confidential

Comparing Grouped Data: Variables Data Response

4 | MDT Confidential

ANOVA: ASSUMPTIONS

5 | MDT Confidential

One-way ANOVA:
Testing for the significance of one factor
The null hypothesis:
H0: μ1 = μ2 = ... = μk
Meaning that the population (response) means are equal at each of the k levels of this factor, i.e. the factor is NOT significant.

The alternative hypothesis:
HA: at least two population means are unequal
Meaning that the factor IS significant

Perform the One-way ANOVA and reject the null hypothesis if the p-value is < alpha
Usually alpha = 0.05 (or 0.10 or 0.01)
A way to remember: "If p is low, the null must go."

6 | MDT Confidential
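The course runs this test in Minitab; as a cross-check, the same one-way ANOVA can be sketched in Python with scipy. The three "lots" below are made-up illustration data, not the course dataset.

```python
# One-way ANOVA sketch with scipy, mirroring the hypothesis test above.
# The three "lots" are made-up illustration data, not the course dataset.
from scipy import stats

lot1 = [10.2, 9.8, 10.5, 10.1, 9.9]
lot2 = [10.4, 10.6, 10.3, 10.8, 10.5]
lot3 = [10.0, 10.1, 9.7, 10.2, 9.9]

# H0: all lot means are equal; HA: at least two means differ
f_stat, p_value = stats.f_oneway(lot1, lot2, lot3)

alpha = 0.05
factor_significant = p_value < alpha   # "if p is low, the null must go"
print(f"F = {f_stat:.2f}, p = {p_value:.4f}, significant: {factor_significant}")
```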

ANOVA: General Process Steps

Select a model
Plan sample size using relevant data or guesses
(Optional) Simulate the data and try the analysis
Collect real data
Fit the model (perform ANOVA and get p value)
Examine the residuals
Transform the response or update the model, if
necessary
State conclusion
7 | MDT Confidential

Typical Assumptions for ANOVA Factors


Factors (or Inputs)
Each factor can be set to two or more distinct
levels
Factor levels can be measured adequately
Factor levels are fixed rather than random
For multiple factors, all combinations of all levels
are represented (levels are completely crossed)

8 | MDT Confidential

Typical Assumptions for ANOVA Responses


Response data is complete, not censored
Some software requires balanced data: the same sample size for each level of the input factor
Assumptions on Residuals
Residual = Response − Fitted Value
Normally distributed
Equal variance (assumption relaxed in Minitab Assistant)
Independent (e.g. no time trend)

9 | MDT Confidential

ANOVA CALCULATIONS DEEP DIVE: STAT MENU & MINITAB ASSISTANT

10 | MDT Confidential

ANOVA Calculations
See www.khanacademy.org
ANOVA 1: Calculating SST (7:39)
ANOVA 2: Calculating SSW and SSB (13:20)
ANOVA 3: Hypothesis Test and F Statistic (10:14)

11 | MDT Confidential
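The sums of squares covered in those videos can be reproduced by hand; a sketch on a small illustrative three-group dataset, showing the fundamental identity SST = SSB + SSW:

```python
# Hand calculation of SST, SSW, SSB and the F statistic, following the
# breakdown in the videos referenced above (small illustrative data).
import numpy as np
from scipy.stats import f as f_dist

groups = [np.array([3.0, 2.0, 1.0]),
          np.array([5.0, 3.0, 4.0]),
          np.array([5.0, 6.0, 7.0])]

all_data = np.concatenate(groups)
grand_mean = all_data.mean()

sst = ((all_data - grand_mean) ** 2).sum()                        # total
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)            # within
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # between

k, n = len(groups), len(all_data)
f_stat = (ssb / (k - 1)) / (ssw / (n - k))    # F = MSB / MSW
p_value = f_dist.sf(f_stat, k - 1, n - k)
# For this data: SST = 30, SSW = 6, SSB = 24, F = 12, and SST = SSB + SSW
```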

Minitab Analysis of Khan Dataset

Can arrange either Stacked or Unstacked

12 | MDT Confidential

Consider a PQ Dataset
Three runs of n=10 units produced and tensile
tested
See Ch1DataFile.mtw
Columns TipTensile1, TipTensile2, TipTensile3

13 | MDT Confidential

Minitab Options

Could use:
Stat -> ANOVA -> One way
Stat -> ANOVA -> One way (Unstacked)
Stat -> ANOVA -> General Linear Model
Stat -> Regression -> General Regression
Minitab Assistant

Data arrangement:
Stacked (one column for X, one column for Y)
Unstacked (Y values in columns for each X)

14 | MDT Confidential

ANOVA using Minitab Statistics Menu

15 | MDT Confidential

Stat Menu Outputs

S, R² and adjusted R² are measures of how well the model fits the data.
16 | MDT Confidential

Judging model fit

S is measured in the units of the response variable and represents the standard distance data values fall from the fitted values
For a given study, the better the model predicts the response, the lower S is

R² (R-Sq) describes the amount of variation in the observed response values that is explained by the predictor(s)
R² always increases with additional predictors.
R² is most useful when comparing models of the same size

Adjusted R² is a modified R² that has been adjusted for the number of terms in the model
R² can be artificially high with unnecessary terms, while adjusted R² may get smaller when terms are added to the model
Use adjusted R² to compare models with different numbers of predictors

17 | MDT Confidential
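All three fit statistics can be computed directly from the residuals; a sketch using small illustrative data, where the ANOVA fitted values are simply the group means:

```python
# Computing S, R-squared, and adjusted R-squared per the definitions above
# (illustrative data; for one-way ANOVA the fitted values are group means).
import numpy as np

groups = [np.array([3.0, 2.0, 1.0]),
          np.array([5.0, 3.0, 4.0]),
          np.array([5.0, 6.0, 7.0])]
y = np.concatenate(groups)
fitted = np.concatenate([np.full(len(g), g.mean()) for g in groups])

n = len(y)
p = len(groups)                     # one fitted mean per factor level
sse = ((y - fitted) ** 2).sum()     # residual sum of squares
sst = ((y - y.mean()) ** 2).sum()   # total sum of squares

s = np.sqrt(sse / (n - p))                        # residual standard deviation
r_sq = 1 - sse / sst                              # variation explained
adj_r_sq = 1 - (sse / (n - p)) / (sst / (n - 1))  # penalized for model size
```

Note that adjusted R² is always at most R², illustrating the penalty for added terms described above.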

Comparisons Output

18 | MDT Confidential

ANOVA: Examining Residuals

1) Test for Normality
Normal Probability Plot is a straight line

2) Test for Equal Variances
Residual vs. Fitted Values is evenly distributed around the 0 line

Using the Stacked arrangement, there would also be a 4th residual plot: Time Order.
This is a test for Independence, looking for a pattern over time.

Residuals are strongly non-normal . . .

Possible Causes:
Failure of Equal Variance Assumption
Outliers
Missing Important Factors in the Model
Data is from a Non-Normal Population

What to do?
Check for Outliers
Check if Equal Variance is satisfied
Perform Normality Test
If data is from a Non-Normal Population, consider using Non-Parametric Tests or Transform the Response variable


If Residuals differ Group to Group

Possible Causes:
Non-Constant Variance
Outliers
Missing Important Factors in the Model

What to do?
Test for the equal variance assumption using Stat > ANOVA > Test for Equal Variances
If the test indicates unequal variances, then consider transforming the response variable
Verify if the outlier is a data entry error
Add the factor into the model

If there is a time pattern in the data . . .

What to do?
Prevent by Randomizing
A time effect may be present
Consider time series procedure


Common Transformations

Transformation: √y
Comments: Appropriate for Poisson distributed data

Transformation: Log(y)
Comments: If the response is exponentially increasing, then this transformation is appropriate

Transformation: 1/y
Comments: Appropriate when responses are close to zero

Transformation: sin⁻¹(√y)
Comments: Called the Arcsine Square Root function. Appropriate when the response is a proportion between zero and one.

Another useful tool is the Box-Cox Transformation.

Minitab Box-Cox Procedure:
Y' = Y^λ, when λ ≠ 0
Y' = log_e(Y), when λ = 0
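Minitab estimates λ for you; the same estimation can be sketched in Python with scipy.stats.boxcox. The data below are made-up: lognormal, so the "true" λ is 0 (the log transform).

```python
# Box-Cox lambda estimation sketch with scipy, analogous to Minitab's
# Stat > Control Charts > Box-Cox Transformation. Made-up skewed data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
y = rng.lognormal(mean=1.0, sigma=0.5, size=100)   # positive, right-skewed

y_transformed, lam = stats.boxcox(y)   # lam maximizes the log-likelihood
# lam near 0 suggests log(y); lam near 1 suggests no transform is needed
print(f"estimated lambda = {lam:.2f}")
```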

Minitab Screenshots

Box-Cox Transformation in Minitab


Minitab > Stat > Control Charts > Box-Cox Transformation

Box-Cox Plot of Data 1: estimated lambda = 0.03, 95% CI (−0.30, 0.38), rounded value 0.00 (lambda = 0 corresponds to the log transform).

ANOVA using Minitab Assistant

http://www.minitab.com/support/documentation/Answers/Assistant%20White%20Papers/OneWayANOVA_MtbAsstMenuWhitePaper.pdf
25 | MDT Confidential

Report Card

26 | MDT Confidential


Diagnostic Report

27 | MDT Confidential

Power Report

28 | MDT Confidential


Summary Report

29 | MDT Confidential

ANOVA - Exercise
Use Ch1DataFile.mtw
Test for differences between the group means
using both Stat menu ANOVA and Minitab
Assistant ANOVA . . . for these 3-lot PQ studies:
For TubeTensile1, TubeTensile2, TubeTensile3
For Diameter1, Diameter2, Diameter3

What are your conclusions?

30 | MDT Confidential


ANOVA Alternate Exercise


Analyze this data two ways: 1) Assistant and 2) Stat>ANOVA
Note: Stat>ANOVA assumes equal variances (and so may need transformations), but Minitab Assistant ANOVA does not assume equal variances.

An article in the IEEE Transactions on Components, Hybrids, and Manufacturing Technology (Vol. 15, No. 2, 1992, pp. 146-153) described an experiment in which the contact resistance of a brake-only relay was studied for three different materials (all were silver-based alloys).

Alloy-Contact Resistance.MPJ

Test at the alpha = 0.01 level
Does the type of alloy affect mean contact resistance?

Applied Statistics and Probability for Engineers, 4th Edition, Douglas C. Montgomery and George C. Runger

General Regression can be used for ANOVA

Use for multiple regression (more than one X)

General regression can handle: 1) all continuous input(s), 2) all categorical input(s), 3) a mixture of continuous and categorical inputs, and 4) a non-normal response (it allows for the Box-Cox transformation of the response).
The response must be continuous or considered as continuous.


General Regression: Example of ANOVA

Note: A blocked One-way ANOVA is a two-way ANOVA where one factor's effect is to be blocked out. The randomization is done within each block.
Background: The forces exerted by three different stylets in a lead are compared at 4 different position/advancement conditions (blocks). The data is given below:
Perform an ANOVA analysis using Stat>Regression>General Regression and determine if:
(1) there are significant differences between the different stylets, and if
(2) the blocking factor employed was effective.

Condition is the Block

Force in Grams
Condition  Stylet 1  Stylet 2  Stylet 3
1          18.1      14.5      14.0
2          20.0      16.1      16.3
3          30.2      27.5      26.8
4          42.5      39.4      38.7
Mean       27.70     24.38     23.95

Stylet.MTW

Blocked One-way ANOVA


Blocked One-way ANOVA

(1) Are there significant differences between the different stylets?
(2) Is the blocking factor employed effective?
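For reference, the randomized-block ANOVA table for the stylet data can be computed by hand with numpy/scipy; Minitab's General Regression fit should agree with this standard two-way decomposition.

```python
# Blocked one-way ANOVA for the stylet data, computed by hand.
import numpy as np
from scipy.stats import f as f_dist

# rows = conditions (blocks), columns = stylets 1..3
force = np.array([[18.1, 14.5, 14.0],
                  [20.0, 16.1, 16.3],
                  [30.2, 27.5, 26.8],
                  [42.5, 39.4, 38.7]])

b, t = force.shape                                        # 4 blocks, 3 treatments
grand = force.mean()
ss_treat = b * ((force.mean(axis=0) - grand) ** 2).sum()  # stylet effect
ss_block = t * ((force.mean(axis=1) - grand) ** 2).sum()  # condition effect
ss_error = ((force - grand) ** 2).sum() - ss_treat - ss_block

df_error = (b - 1) * (t - 1)
f_treat = (ss_treat / (t - 1)) / (ss_error / df_error)
f_block = (ss_block / (b - 1)) / (ss_error / df_error)
p_treat = f_dist.sf(f_treat, t - 1, df_error)   # (1) stylet differences?
p_block = f_dist.sf(f_block, b - 1, df_error)   # (2) blocking effective?
```

Both p-values are tiny: the stylets differ significantly, and blocking on condition removes a very large chunk of variability.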

SAMPLE SIZE FOR ANOVA

36 | MDT Confidential


Planning Sample Size in ANOVA

Sample Size for One-Way ANOVA Example

Fill in the number of levels for the factor
Always fill in Standard Deviation (use a conservative estimate)
Then fill in two of the three long boxes
Can specify several values, separated by spaces


Sample Size for One-Way ANOVA

RESPONDING TO ANOVA SIGNALS

40 | MDT Confidential


Statistical vs. Practical Significance


Key idea in any hypothesis testing effort
If the test detects a difference (a signal), then what?
Don't assume the signal is automatically bad news (if you're hoping for consistency) or good news (if you're hoping for a change)
For example, ANOVA Failure in PQ

Examine the size of the signal in the appropriate context . . . determine the practical significance of the difference
The appropriate response depends on an assessment of both statistical and practical significance

41 | MDT Confidential

ANOVA Signal in PQ
There was a realization that a significant p-value in the comparison of lot means should not necessarily mean the PQ fails.
Analyses were sometimes included to assess the power of the ANOVA and the practical significance of the difference in the means.
Eventually, Corporate Policy on Manufacturing
Process Validation added the ANOVA Failure
Flow Chart

42 | MDT Confidential


2008 Version of Corporate Guideline for Manufacturing Process Validation

43 | MDT Confidential

2012 Version of CRDM ANOVA Signal Flow Chart

44 | MDT Confidential


Pros and Cons


Pro
Provides a consistent way to address the question of practical significance
Relatively simple
Effective: expect the approach to stand up to regulatory scrutiny

Con
Can be very prescriptive
Standards for Ppk are quite high: 95% confidence bound on Ppk > 1.33
Disincentive for larger sample size
45 | MDT Confidential

Current approaches
Corporate Guideline phased out
CV procedure still has essentially the same
ANOVA Signal Flowchart
CRDM originally had a more prescriptive version
CRDM currently has a simplified version
Would also work to include a discussion of the sample size of the ANOVA and the practical significance of the difference
Discussion: other businesses?

46 | MDT Confidential


Example of ANOVA Signal Flow Chart


Recall the ANOVA exercise on Ch1DataFile.mtw
for TubeTensile1, TubeTensile2, TubeTensile3

47 | MDT Confidential

ANOVA Signal Flow Chart Ppk Analysis

First Stack the 3 lots using Data -> Stack -> Columns
Then run Stat -> Quality Tools -> Capability Analysis -> Normal

Add the confidence interval for Ppk using the Options button

48 | MDT Confidential


Next steps
Total sample size is 90, so use confidence bound
Lower 95% confidence bound on Ppk is 0.92
Must make 3 more runs
TubeTensile4, TubeTensile5, TubeTensile6
These must pass tolerance interval analysis (like
the first three runs did)
All six runs pass tolerance interval analysis

49 | MDT Confidential

Conclusion

Note: Ppk analysis of all six lots is not required. Included here FYI.

50 | MDT Confidential


Exercise: ANOVA Signal

In Ch1DataFile.mtw, run ANOVA and assess practical significance for
WireTensile1, WireTensile2, WireTensile3
Specification is 3 lb minimum

Use one of the ANOVA Signal Flowcharts
Then use another approach to determine the practical significance of the difference between the means
Conclusion?

51 | MDT Confidential

ANOVA: Summary And Recap


Review Quality Trainer
Calculations Deep Dive into ANOVA
Analytically, ANOVA is a special case of Regression
Sample Size
ANOVA Signal Flow Chart: some Medtronic divisions use one to standardize the response to an ANOVA Signal in PQ

52 | MDT Confidential


EQUIVALENCE TESTING

53 | MDT Confidential

Statistical Logic for Equivalence


The basic statistical logic is designed to disprove equality.
Null hypothesis: Two population parameters are equal, e.g. μ1 = μ2.
Alternative hypothesis: Two population parameters are not equal, e.g. μ1 ≠ μ2.

We need a different form of logic to affirmatively prove equivalence.
Null hypothesis: Two population parameters differ by Δ or more, e.g. |μ1 − μ2| ≥ Δ.
Alternative hypothesis: Two population parameters differ by less than Δ, e.g. |μ1 − μ2| < Δ.
54 | MDT Confidential


Equality vs. Equivalence


Part of the confusion around the issue of
equivalence is that the concepts of equality and
equivalence may not be distinguished.
Equality: Two values/processes are
mathematically identical.
Equivalence: The difference between two
values/processes is sufficiently small that it can be
deemed practically insignificant.

55 | MDT Confidential

Approach 1: Confidence Intervals

The idea is to demonstrate that the confidence interval for the difference of interest is fully contained within the range of practical significance [−Δ, Δ].

56 | MDT Confidential

Jones, BMJ 1996


Approach 1: Confidence Intervals

Step 1: Define Practical Significance
Before collecting data, use scientific/engineering principles to decide what difference, Δ, is practically negligible.

Step 2: Estimate Sample Size for Experiment
Based on characterization data or other assumptions, estimate the sample size needed to produce a confidence interval fully contained within [−Δ, Δ]. (Stat > Power and Sample Size > Sample Size for Estimation)

Step 3: Collect Data and compute the confidence interval.
If the confidence interval is a strict mathematical subset of [−Δ, Δ], equivalence may be declared. If not, equivalence is either uncertain or untrue.
57 | MDT Confidential

Example of Approach 1

Two processes will be declared equivalent if the difference in their mean outputs is less than 3 micrometers. So Δ = 3.
Based on characterization data,
The old process can be modeled as Normal with a mean of 30 and a standard deviation of 2.
The new process can be modeled as Normal with a mean of 31 and a standard deviation of 1.
By mathematical theory, the distribution of (new − old) must also be Normal with a mean of 1 and a standard deviation of sqrt(2² + 1²) = sqrt(5) = 2.24.
To be conservative in sample size estimation, the standard deviation is rounded up to 3.
With an expected mean difference of 1, we need the confidence interval to have a half-width (margin of error) of 2 or less.
58 | MDT Confidential
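The n = 12 reported on the sample-size slide can be reproduced by iterating the t-based half-width formula until the margin-of-error requirement is met; this is one common way to do the calculation, and Minitab's Sample Size for Estimation gives the same answer here.

```python
# Iterate the t-based half-width formula until the 95% CI margin of error
# is at most E. With sigma = 3 and E = 2 this reproduces n = 12.
from math import sqrt
from scipy.stats import t

sigma = 3.0       # conservative standard deviation estimate
E = 2.0           # required margin of error (CI half-width)
conf = 0.95

n = 2
while t.ppf(1 - (1 - conf) / 2, n - 1) * sigma / sqrt(n) > E:
    n += 1
print(n)   # → 12
```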


Example of Approach 1

Method
Parameter: Mean
Distribution: Normal
Standard deviation: 3 (estimate)
Confidence level: 95%
Confidence interval: Two-sided

Results
Margin of Error: 2
Sample Size: 12

We need n=12 from BOTH processes.

59 | MDT Confidential

Example Output

Two-sample T for New vs Old

        N   Mean    StDev  SE Mean
New    12   30.927  0.858  0.25
Old    12   29.19   1.52   0.44

Difference = mu (New) - mu (Old)
Estimate for difference: 1.735
95% CI for difference: (0.671, 2.798)
T-Test of difference = 0 (vs not =): T-Value = 3.44  P-Value = 0.003  DF = 17

Conclusions:
The processes are statistically different (p=0.003), which is a statement about non-equality.
Despite being unequal, the processes are still equivalent. The 95% confidence interval for the difference in means is (0.671, 2.798), which is a strict subset of [-3, 3].
60 | MDT Confidential


Approach 1: Summary
The confidence interval approach is the gold
standard for clinical trials and other high scrutiny
experiments requiring FDA approval.
It is mathematically equivalent to a p-value-driven
approach called TOST (Two One-Sided T-tests).
The confidence interval approach is easier to
understand than the original form of TOST.

61 | MDT Confidential
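The TOST calculation is easy to sketch from summary statistics. The version below uses a pooled variance (the Minitab output earlier used Welch degrees of freedom, so its CI differs slightly), with the New/Old summary statistics and Δ = 3 from the worked example.

```python
# TOST (Two One-Sided T-tests) from summary statistics, pooled-variance sketch.
import numpy as np
from scipy import stats

def tost_pooled(m1, s1, n1, m2, s2, n2, delta):
    """p-value for H0: |mu1 - mu2| >= delta vs HA: |mu1 - mu2| < delta."""
    diff = m1 - m2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = np.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
    df = n1 + n2 - 2
    p_low = stats.t.sf((diff + delta) / se, df)    # one-sided test vs the -delta edge
    p_high = stats.t.cdf((diff - delta) / se, df)  # one-sided test vs the +delta edge
    return max(p_low, p_high)   # equivalence is shown when this is < alpha

# New/Old summary statistics from the worked example, delta = 3
p_equiv = tost_pooled(30.927, 0.858, 12, 29.19, 1.52, 12, delta=3.0)
# p < 0.05 here, matching the CI conclusion that the processes are equivalent
```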

Post-hoc Problems
Rigorous application of Approach 1 requires that the Δ value be established before collecting data.
What should we do when data have already been collected without defining the difference of interest or planning sample size?

62 | MDT Confidential


Approach 2: Retrospective Power Analysis

When data have already been collected without planning for rigorous equivalence testing, equivalence may be assessed by displaying an entire power curve.
Even if this approach does not set a-priori standards for equivalence,
it provides additional context for an insignificant p-value
it can help engineering experts to make decisions

Subjective judgment will be required to determine if the experiment was suitably powered to demonstrate equivalence.
A power curve is a useful supplement to a traditional analysis, but it does not match the rigor of Approach 1.
63 | MDT Confidential

Approach 2 Method
After collecting the means and standard deviation
of the observed data, create a power curve
through the Power and Sample Size platform in
Minitab.
Display and interpret the Power Curve in your
data analysis report.
You may honestly believe that your experiment
was sufficiently powered (>80%) to detect
meaningful differences, but the post-hoc nature
of the analysis makes your argument weaker.
64 | MDT Confidential


Example

Consider again our old and new processes, which have distributions of N(30, 2²) and N(31, 1²), respectively.
Suppose we forgot to take Approach 1 and instead just collected 5 data points from each process.
We found a statistical difference when we collected 12 data points, but the p-value goes above 0.05 when collecting only 5:

Two-sample T for New_5 vs Old_5

          N   Mean    StDev  SE Mean
New_5     5   30.744  0.933  0.42
Old_5     5   29.42   3.02   1.4

Difference = mu (New_5) - mu (Old_5)
Estimate for difference: 1.32
95% CI for difference: (-2.61, 5.25)
T-Test of difference = 0 (vs not =): T-Value = 0.93  P-Value = 0.403  DF = 4

65 | MDT Confidential

Power Curve Inputs

The observed sample size is n=5
Desired power levels are in the range of 0.8-0.95
The pooled standard deviation is 2.24.

66 | MDT Confidential
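The detectable differences on the power curve can be approximated from the noncentral t distribution; a sketch (scipy) that solves for the smallest detectable difference at a given power, using n = 5 per group and the pooled standard deviation 2.24:

```python
# Retrospective power sketch: smallest detectable mean difference at a given
# power for a two-sided two-sample t-test (noncentral t, bisection).
import numpy as np
from scipy.stats import t as t_dist, nct

def power_two_sample(diff, sd, n, alpha=0.05):
    """Power of a two-sided two-sample t-test with n observations per group."""
    df = 2 * n - 2
    ncp = diff / (sd * np.sqrt(2.0 / n))       # noncentrality parameter
    t_crit = t_dist.ppf(1 - alpha / 2, df)
    return nct.sf(t_crit, df, ncp) + nct.cdf(-t_crit, df, ncp)

def detectable_difference(power, sd, n, alpha=0.05):
    """Smallest difference detectable with the requested power (bisection)."""
    lo, hi = 0.0, 20.0 * sd
    for _ in range(60):
        mid = (lo + hi) / 2
        if power_two_sample(mid, sd, n, alpha) < power:
            lo = mid
        else:
            hi = mid
    return hi

d80 = detectable_difference(0.80, sd=2.24, n=5)   # ~4.5, as on the next slide
d95 = detectable_difference(0.95, sd=2.24, n=5)   # ~6, as on the next slide
```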


Power Curve Output

With 80% power, this experiment could have detected a difference of about 4.5.
With 95% power, this experiment could have detected a difference of about 6.
It is a subjective engineering judgment as to whether such values provide sufficient reassurance about the experimental results.

67 | MDT Confidential

Extensions and Challenges

Confidence intervals and power curves can be calculated for almost any type of statistical scenario:
Comparing 2 means
Comparing >2 means
Comparing standard deviations
Comparing reliability curves
However, the required sample size for proving equivalence of standard deviations is often much larger than the sample size for means.
Equivalence for means can reasonably be quantified in terms of arithmetic differences (e.g. |μ1 − μ2| < 5), but equivalence for standard deviations will be quantified in terms of multiplicative differences (e.g. 1/2 < σ1/σ2 < 2).
68 | MDT Confidential


Exercise: Lesion Depth

Consider the key requirement for a new ablation catheter: equivalent (or greater) maximum lesion depth, compared to the current design, where the difference of interest is 0.5 mm.
Previous data shows
A Normal distribution model is adequate for Max Lesion Depth
The Current Design has an average max lesion depth of 2.3 mm
The New Design has an average max lesion depth of 2.2 mm
The largest pooled standard deviation of max lesion depth is 0.356.

Follow Approach 1 to plan sample size for the equivalence test
Assume test data as follows to complete the equivalence analysis
New: n=15, mean = 2.733, stdev = 0.342
Current: n=15, mean = 2.723, stdev = 0.386

State your conclusion

69 | MDT Confidential

Alternate Exercise: Equivalence Testing


Within your team, identify an example of
equivalence testing in your own work.
Apply Approach 1, using actual or made-up
characterization data for the planning step.
Use Minitab to simulate data collection.
Hint: Use Calc -> Random Data -> Normal . . .

Use Minitab to complete the Approach 1 data


analysis.
State your conclusion from the data.

70 | MDT Confidential


EQUIVALENCE Take Away Messages

An insignificant p-value is not a rigorous method of proving equivalence.
Ideally, practical significance and sample size should be considered before the experiment begins.
Rigorously proving equivalence first demands carefully defining the threshold (Δ) of practical significance.
The most rigorous way to prove equivalence is to demonstrate that a confidence interval is fully contained within [−Δ, Δ].
An alternative, but less formal, approach is to retrospectively perform a power analysis.
Don't feel like you need to remember all the Minitab steps; we hope you remember the concepts and call your neighborhood statistician for further support.

71 | MDT Confidential

Summary and Review


Quality Trainer Review
ANOVA

Assumptions
Using Minitab Assistant vs Stat Menu
Calculation Deep Dive
Sample Size
ANOVA Signals

Equivalence Testing

72 | MDT Confidential


Chapter 2:
Measurement Systems Analysis

Topics
Quality Trainer Review
Topics with Variables Data
Gage R&R Sample Size
Probability of Misclassification (Variables Data)
Helpful Hints

MSA for Destructive Tests


MSA for Attribute Tests

2 | MDT Confidential

Quality Trainer Review

3 | MDT Confidential

Value of Measurement Systems Analysis

If your goal is Process Improvement, then MSA helps by reducing variability in Xs and Ys so that the key Xs may be discovered.
If your goal is Capability Demonstration or Estimation, then MSA helps with more accurate measurements of process performance.
If your goal is Sorting Out Bad Product, then MSA helps by reducing the Probability of Misclassification.
If your goal is Innovation, then MSA helps because reduced noise allows discovery of more subtle signals.
4 | MDT Confidential

Recall . . . MSA Concepts

Bias: Mean (delta difference from reference)
Linearity: Mean (Bias vs Part or Operating Value)
Stability: Mean (Bias vs Time)
Repeatability: Standard Deviation
Reproducibility: Standard Deviation
Gage R&R: Standard Deviation

Linearity and stability should be plotted, while bias, repeatability, and reproducibility are just single numbers.
5 | MDT Confidential

Gage Bias and Linearity

Bias is the difference between the average of repeated measurements and the true value
MSA tends to focus on Gage R&R (variability), but accuracy (= lack of bias) is equally important
Assumption that procedures for Calibration are in place - need to confirm
Assumption that procedures for Calibration are adequate - need to confirm

Linearity is a study of bias across the range of measured values
In Minitab, use Stat -> Quality Tools -> Gage Study -> Gage Linearity and Bias Study
6 | MDT Confidential

Gage Stability

MINITAB: Snap Gauge.mtw
Stat > Control Charts > Variables Charts for Subgroups > Xbar-R

The measurement system is stable over time, as evidenced by:

Xbar-R Chart of Rep1, ..., Rep3 (subgroups sampled 8-Sep through 12-Sep):
Xbar Chart - in control (UCL = 0.253458, Xbar = 0.2497, LCL = 0.245942)
R Chart - in control (UCL = 0.00946, Rbar = 0.00367, LCL = 0)
7 | MDT Confidential

GAGE R&R SAMPLE SIZE

8 | MDT Confidential

Gage R&R Sample Size


General recommendation:
5 to 10 Parts (P)
2 to 3 Operators (O)
2 to 3 Repeats (R)

More rigorous methods


Specify minimum Degrees of Freedom for
estimating Repeatability and Reproducibility
standard deviations
Use confidence intervals for standard deviation
estimates (option provided in Minitab 16)
9 | MDT Confidential

Degrees of Freedom Approach

Estimating Reproducibility Std Dev: O-1
Include as many operators as feasible

Estimating Repeatability Std Dev: P*O*(R-1)
With 30 df, the 90% confidence bound on the ratio of the estimate to the true value is (0.79, 1.21). Ref: on www.minitab.com search for ID 2613 to access the Minitab Assistant White Papers.
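The (0.79, 1.21) interval can be reproduced from the chi-square distribution: with df degrees of freedom, the ratio of the estimated to the true standard deviation falls between sqrt(chi2(0.05, df)/df) and sqrt(chi2(0.95, df)/df) with 90% confidence.

```python
# Reproducing the (0.79, 1.21) bound for a standard deviation estimate
# with 30 degrees of freedom, using the chi-square distribution.
from math import sqrt
from scipy.stats import chi2

df = 30
lower = sqrt(chi2.ppf(0.05, df) / df)   # 5th percentile of s/sigma
upper = sqrt(chi2.ppf(0.95, df) / df)   # 95th percentile of s/sigma
print(f"({lower:.2f}, {upper:.2f})")    # → (0.79, 1.21)
```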

CVG Test
Method
Validation

10 | MDT Confidential

PROBABILITY OF
MISCLASSIFICATION

11 | MDT Confidential

Misclassification

Two Misclassification Probabilities (shown relative to LSL and USL on the part distribution):
Probability of Misclassifying a Good Unit as Bad
Probability of Misclassifying a Bad Unit as Good

12 | MDT Confidential

MINITAB Simulated Estimation of Misclassification

Following a Gage R&R study:
Part mean = 30, Part Std Dev = 10, Part Upper Spec = 40
No measurement system bias
Gage R&R Std Dev = 2.6

1) Calc/Random Data/Normal (simulate true part measurements)
2) Calc/Random Data/Normal (simulate gage variability)

13 | MDT Confidential

MINITAB Simulated Estimation of Misclassification (cont)

3) Calc/Calculator: use the + to add 1) + 2), simulating observed measurements
4) Calc/Calculator: assign a 1 for in spec for 1)
Ex: (TrueMeasure <= 40)
14 | MDT Confidential

MINITAB Simulated Estimation of Misclassification (cont)

5) Calc/Calculator: assign a 1 for in spec for 2)
Ex: (ObsMeasure <= 40)

6) Stat/Table/Crosstabs to cross-tabulate 4) and 5).

15 | MDT Confidential

MINITAB Simulated Estimation of Misclassification (cont)

The estimated % of Truly Out of Spec called In Spec is 2.1%.
The simulation sample size was 10000; a larger sample size would be better.

16 | MDT Confidential
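The six Minitab steps above amount to a Monte Carlo estimate; a compact Python sketch of the same simulation (true parts N(30, 10²), gage error N(0, 2.6²), USL = 40):

```python
# Monte Carlo sketch of the misclassification simulation described above.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000                                   # larger than the slide's 10000
true = rng.normal(30, 10, n)                    # step 1: true part values
observed = true + rng.normal(0, 2.6, n)         # steps 2-3: add gage error

true_in = true <= 40                            # step 4: truly in spec
obs_in = observed <= 40                         # step 5: observed in spec

p_bad_called_good = np.mean(~true_in & obs_in)  # out of spec, but passed
p_good_called_bad = np.mean(true_in & ~obs_in)  # in spec, but rejected
print(f"bad called good: {p_bad_called_good:.3f}")  # close to the 2.1% on the slide
```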

MINITAB Misclassification

17 | MDT Confidential

MINITAB Misclassification

Two problems:
1) Only three decimals for probabilities (i.e. 0.000)
2) Can't enter historical: 1) process mean, 2) part std. dev, 3) gage std. dev
(Note: (2) can now be done with the CSR Work Aid 13)
18 | MDT Confidential

Misclassification Using Minitab and Work Aid 13

CSRworkaid13 POM.mtw
MINITAB

Load into the worksheet: the Part mean (30), the Part Sigma (10), and the Gage Sigma (2.6)

19 | MDT Confidential

MINITAB Misclassification

20 | MDT Confidential


MINITAB Misclassification
Enlarging the label on the sample mean chart, we see the mean is 30.

21 | MDT Confidential

MINITAB Misclassification
Examining the output, we see: USL 40, the Part Sigma (10), and the Gage Sigma (2.6).
Prob. of a truly bad part called good is .021

22 | MDT Confidential


Probability of Misclassification (POM) Tool

Originally written in R by Tarek Haddad to recreate functionality lost when Medstat was retired.
Jim Dawson collaborated with Tarek to continue development and turn it into an Excel tool.
A substantial Software Validation effort was undertaken by Nick Finstrom and Barry Christy, with the support of Pete Patel and the CVG Test Method Council. Validation work to be completed in early 2014.
23 | MDT Confidential

POM Tool

Replicates Medstat functionality


More resolution in results than Minitab
Graphics
Guardbanding
Normal, Lognormal and Weibull distributions of parts

24 | MDT Confidential


POM with Guardband

25 | MDT Confidential

Exercise
Run POM analysis
Using Minitab
Simulation
Using Work Aid 13 and
Minitab GRR
Using POM Tool

26 | MDT Confidential


HELPFUL HINTS

27 | MDT Confidential

Gage R&R Helpful Hints - Normality

Normality testing is not needed for Gage R&R analysis
The distribution of the raw data will depend strongly on the parts used in the study; there is no expectation or assumption that the raw data will follow any specific distribution
Repeated measurements on the same part by the same operator will likely follow a normal distribution
Like any ANOVA model, the residuals are assumed to follow a normal distribution, but the analysis is relatively robust to non-normality of the residuals

Probability of Misclassification does depend on the part or process distribution (each part measured once)

28 | MDT Confidential


Gage R&R Helpful Hints - One-Sided Specification

In the case of a one-sided specification, the Percent Tolerance metric depends on the part average
Minitab uses the overall average in the Gage R&R study as the estimate of the part average
If the parts used in the study are not representative of the expected part distribution . . .
The overall average will be a poor estimate of the process average
The percent tolerance result will be misleading
Best practice would be to calculate Percent Tolerance separately using a better estimate of the process average
Being not representative can be a good practice; for example, including parts that don't meet the specification

Corrective Actions for Failed Gage R&R

Repeatability problem
Could be due to part positional variation
Standardize by measuring the same position on each part
Or make multiple measurements at random or systematic positions and use the average

If the gage itself is too variable, it may need to be improved or replaced
In the meantime, Repeatability variability can be filtered out by taking repeated, independent measurements and using the average. Note that this approach does not correct for Reproducibility issues.


Corrective Actions for Failed Gage R&R


Reproducibility Problem
Look for assignable causes that explain the
operator-to-operator differences
Understand any Operator*Part interactions; these
may provide clues to differences in technique.
Possibly improve the measurement procedure
and/or re-train the operators
Improve any visual aids or samples used in the
measurement procedure


Approaches to Robust Gage R&R

Standard Gage R&R methods assume that other factors that affect measurements have been studied and controlled in the development of the test method.

If these sources of variability still affect the measurements, then . . .

The Expanded Gage R&R allows you to add additional factors.
Besides operator & part, you could add fixture number, gage number, or other factors. The Expanded GRR can also handle missing data.

Reference: "Make Your Destructive, Dynamic, and Attribute Measurement System Work for You" by William Mawby.
This book includes the Analysis of Covariance method that allows one to load varying environmental factors like temperature & humidity (covariates) into a GRR.

The General Linear Model in Minitab (under the ANOVA branch) can be used to model covariates (it also handles missing data).


MSA FOR DESTRUCTIVE MEASUREMENTS


Two Types of Destructive Measurements

1. Truly destructive: Measurement destroys the unit being measured
Pull test
Peel test
Tensile test

2. Non-replicable: Measurement process can change the unit, or you are measuring a transient phenomenon
Catapult distance
Motor speed
Heart rate
Dimension of silicon part (can compress)
Dimensions of heart tissue (can compress)

In neither case is it possible to take repeated measures, so gage R&R is not possible.

Ref: "Make Your Destructive, Dynamic, and Attribute Measurement System Work for You" by W. D. Mawby


Approaches to Destructive MSA

Approach: Develop a non-destructive measurement
Pro: Ideal solution
Con: Often difficult or impossible

Approach: Attempt to use identical parts as repeat measurements and apply usual requirements for GRR %Tolerance
Pro: Easy to apply usual Minitab calculations
Con: Rarely works because parts aren't actually identical

Approach: Use a coupon test so that parts are more identical
Pro: Results better than above
Con: Coupons may not be representative; easier to measure than real parts

Approach: Focus on improving the measurement process using DMAIC
Pro: Proven methodology
Con: Cannot conclude whether measurement system is adequate

Approach: Focus on Reproducibility
Pro: Not affected by part-to-part variability
Con: Might miss a Repeatability issue

What about using Nested Gage R&R?

The nested Gage R&R analysis applies when one operator measures different parts than another operator.
For example, John measures parts 1, 2, 3, 4, 5 repeatedly and Jane measures parts 6, 7, 8, 9, 10 repeatedly.
A common application would be Inter-laboratory Testing, where operators at each location measure different parts repeatedly.
Can work for Destructive MSA if each homogeneous sample may be sub-sampled. Then operators can measure different samples repeatedly.

Analysis
The nested analysis does not include a term for the Part * Operator interaction.
Note that Minitab Assistant doesn't offer the Nested analysis.

Unless sub-sampling of homogeneous material is possible, Nested does not solve the key problem of Destructive MSA: it's impossible to repeat the measurement.

Destructive Gage R&R Example

MINITAB: TestingSupplierCoils.mtw

Tensile testing of tubing
8 pieces of tubing
Each tubing cut into 2 sub-samples
Assume variation between sub-samples is due to measurement error
Assume an upper specification of 850 g

Destructive Gage R&R using sub-samples


Destructive Gage R&R using sub-samples


Destructive Gage R&R using sub-samples

Nearly all measurement system variation is due to repeatability rather than operator (reproducibility) . . . or maybe sub-sample differences?
Large result for % Tolerance
The measurement system does not distinguish one part from another within the range of parts used in the study

Destructive Gage R&R using sub-samples

Destructive Gage R&R using sub-samples gave poor results
Since repeatability accounts for most of the apparent measurement variation, it is likely that the parts were not very similar
In this project they used the DMAIC Process Knowledge method to improve the system without obtaining a formal measurement assessment

Focus on Reproducibility
With destructive measurements, the
Repeatability Standard Deviation always includes
the part-to-part or subsample-to-subsample
variation. In general, repeatability standard
deviation cannot be accurately estimated.
If one population of parts is randomly assigned to
multiple operators, then the Reproducibility
Standard Deviation is not affected by part-to-part
variation.
Reproducibility standard deviation can be
estimated accurately even for destructive tests.

Reproducibility
Stop
Trying to force (Repeatability + Part) Standard
Deviation to be small enough to meet a requirement.
Trying to obtain or create identical parts.

Start
Estimate Reproducibility standard deviation and ensure
that it is small enough. This standard deviation
depends only on the differences between operator
means.
Compare operator standard deviations. Identify cases
where operators show substantially different variation
across equivalent sets of parts.

Example: CVG Test Method Validation for Destructive Tests

Obtain a population of 40 parts
Do not need to get identical or nearly identical parts

Randomly assign 10 parts to each of 4 operators

Calculate %Tolerance for Reproducibility
Compare to requirement of 25%

Calculate Std Dev Ratio
Compare to simulation-based critical values (for a typical study, the critical value is 3.10)

Example Calculations
Data based on actual TMV studies
But altered to disguise
Detection Time A, Detection Time P


Detection Time A


Run One-Way ANOVA

Reproducibility StDev = sqrt((MS Operator - MS Error)/n) = sqrt((0.778 - 0.627)/10) = 0.123


Calculate Results

% Tolerance (Reproducibility)
= 100 * (6*0.123) / (2*(30 - 11.740))
= 100 * (0.738 / 36.52)
= 2.02%
Std Dev Ratio = 0.986 / 0.546 = 1.81
Result: Pass
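The slide's arithmetic can be checked with a few lines of Python. This is a minimal sketch rather than a validated tool: the function name and argument layout are illustrative, and it assumes a balanced one-way ANOVA (operators as the random factor, n parts per operator) with a one-sided upper specification, matching the Detection Time A numbers above.

```python
from math import sqrt

def reproducibility_metrics(ms_operator, ms_error, n_per_operator,
                            spec_limit, grand_mean, sd_max, sd_min):
    """Reproduce the slide's calculations: reproducibility SD from the
    one-way ANOVA mean squares, one-sided %Tolerance, and Std Dev Ratio."""
    # Reproducibility SD: difference of mean squares divided by parts per operator
    sd_reprod = sqrt((ms_operator - ms_error) / n_per_operator)
    # One-sided %Tolerance: 6*SD relative to twice the distance from the
    # grand mean to the specification limit
    pct_tol = 100 * (6 * sd_reprod) / (2 * (spec_limit - grand_mean))
    # Ratio of the largest to smallest operator standard deviation
    sd_ratio = sd_max / sd_min
    return sd_reprod, pct_tol, sd_ratio

# Detection Time A values from the slide
sd_reprod, pct_tol, sd_ratio = reproducibility_metrics(
    0.778, 0.627, 10, spec_limit=30, grand_mean=11.740,
    sd_max=0.986, sd_min=0.546)
print(round(sd_reprod, 3), round(pct_tol, 2), round(sd_ratio, 2))
# prints: 0.123 2.02 1.81
```

The same function reproduces the Detection Time P results a few slides later.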

Detection Time P


Calculations for Detection Time P

Reproducibility = sqrt((11.225 - 0.976)/10) = 1.01
% Tolerance (Reproducibility)
= 100 * (6*1.01) / (2*(30 - 14.798))
= 100 * (6.06 / 30.40)
= 19.9%
Std Dev Ratio = 1.113 / 0.846 = 1.32
Result: Pass


Exercises
Open Destructive Exercises.mtw
For Bond Strength results:
Assume specification is Minimum 5 lb
Analysis
Individual Value Plot
% Tolerance for Reproducibility
Std Dev Ratio
Is this destructive measurement system adequate?

Repeat for Buckle Force results
Assume specification is Maximum 340 grams

MSA FOR ATTRIBUTE MEASUREMENTS


ATTRIBUTE GAGE R&R

Attribute data are usually the result of human judgment
Which category does this item belong in?
When categorizing items, you need a high degree of agreement on which way an item should be categorized
The best way to assess human judgment is to have all operators repeatedly categorize several known test units (Attribute Gage R&R)

Look for agreement
Each person categorizes the same unit consistently
There is agreement between the operators on each unit

Use disagreements as opportunities to determine and eliminate problems


SETTING UP AN ATTRIBUTE GAGE STUDY

The most important aspect of an attribute Gage Study is selecting parts (representative defects)
The most challenging aspect is choosing parts for the study. Typically use . . .
50% acceptable parts
50% defective parts

Have operators repeatedly classify parts in random order, without knowledge of which part they are classifying (blind study)


Analysis of Attribute Gage R&R

Stat > Quality Tools > Attribute Agreement Analysis
Percent Agreement based on number of Parts
Kappa Statistics (range -1 to 1)

Minitab Assistant > Measurement Systems Analysis
More graphical output
Accuracy statistics based on number of Appraisals
No Kappa statistics

Use Minitab Assistant -> Measurement Systems Analysis (MSA)


Create Attribute Agreement worksheet

Create Attribute Agreement worksheet


Create Result Data

Choose Number of Appraisers = 3
Choose Number of Trials = 2
Choose Number of Test Items = 10
Items 1-5 are Good; Items 6-10 are Bad
Click OK
Copy column Standards and paste into Results
Fix the column name back to Results
Find the first trial of Item 1 and Item 2
Change the result from Good to Bad to inject two errors into the simulated study

Save onto Desktop as Attribute GRR

Attribute Agreement Analysis


Summary Report
Attribute Agreement Analysis for Results - Summary Report

Is the overall % accuracy acceptable? 96.7%. The appraisals of the test items correctly matched the standard 96.7% of the time.

Misclassification Rates:
Overall error rate: 3.3%
Good rated Bad: 6.7%
Bad rated Good: 0.0%
Mixed ratings (same item rated both ways): 6.7%

% Accuracy by Appraiser: 100.0, 100.0, and 90.0 for the three appraisers.

Comments: Consider the following when assessing how the measurement system can be improved:
-- Low accuracy rates: Low rates for some appraisers may indicate a need for additional training for those appraisers. Low rates for all appraisers may indicate more systematic problems, such as poor operating definitions, poor training, or incorrect standards.
-- High misclassification rates: May indicate that either too many Good items are being rejected, or too many Bad items are being passed on to the consumer (or both).
-- High percentage of mixed ratings: May indicate items in the study were borderline cases between Good and Bad, thus very difficult to assess.

This is an "Attribute c=0" result: no bad parts were misclassified as good. Overall, 96.7% of presentations were classified correctly.
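The Summary Report percentages are simple proportions, so they can be sanity-checked by hand. A minimal sketch, assuming the simulated study described earlier (3 appraisers x 2 trials x 10 items, with two "Good rated Bad" errors injected); the variable names are illustrative.

```python
# Study layout from the "Create Result Data" slide
n_appraisers, n_trials, n_items = 3, 2, 10
n_appraisals = n_appraisers * n_trials * n_items  # 60 total appraisals
n_errors = 2                                      # two injected Good-rated-Bad errors

# Overall accuracy and error rate across all appraisals
overall_accuracy = 100 * (n_appraisals - n_errors) / n_appraisals
overall_error = 100 * n_errors / n_appraisals

# "Good rated Bad" is computed against appraisals of Good items only
good_appraisals = n_appraisers * n_trials * 5     # items 1-5 are Good
good_rated_bad = 100 * n_errors / good_appraisals

print(round(overall_accuracy, 1), round(overall_error, 1), round(good_rated_bad, 1))
# prints: 96.7 3.3 6.7
```

These match the 96.7% accuracy, 3.3% overall error, and 6.7% Good-rated-Bad figures in the report.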

Accuracy Report
Attribute Agreement Analysis for Results - Accuracy Report
All graphs show 95% confidence intervals for accuracy rates. Intervals that do not overlap are likely to be different.
Panels: % by Appraiser; % by Standard (Good, Bad); % by Appraiser and Standard; % by Trial. Illustrates the 95% / 90% result.


Kappa

Kappa is a measure of raters' agreement.

Minitab:
Reports two Kappa statistics: Fleiss' and Cohen's
Defaults to Fleiss' Kappa
Minitab will only calculate Cohen's Kappa if you choose the option for Cohen's Kappa, and if one of these two conditions is true:
A) Two appraisers perform a single trial on each sample
B) One appraiser performs two trials on each sample

Kappa is meant for attribute data.
Kappa ranges from -1 to 1.
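For intuition, Cohen's kappa can be computed directly: it compares the observed agreement with the agreement expected by chance from each rater's marginal rates. A minimal sketch for condition A above (two appraisers, one trial each); the Good/Bad data are made up for illustration.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two raters, one trial each."""
    n = len(rater_a)
    # Observed agreement: fraction of items where the two ratings match
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category rates
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    p_e = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical Good/Bad calls from two appraisers on six items
a = ["G", "G", "G", "B", "B", "B"]
b = ["G", "G", "B", "B", "B", "B"]
print(round(cohens_kappa(a, b), 3))  # prints: 0.667
```

Here the raters agree on 5 of 6 items (83%), but kappa is only 0.667 because half of that agreement would be expected by chance.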


Kappa (Landis and Koch)

According to AIAG (auto industry), a general rule of thumb is:
A Kappa value greater than 0.75 indicates good to excellent agreement
Kappa values less than 0.40 indicate poor agreement

This general rule of thumb may not apply for most Medtronic applications. Any disagreement on rejectable units would be of concern.

Kappa calculations


Kappa results


Summary and Recap


Quality Trainer Review
Topics with Variables Data
Gage R&R Sample Size
Probability of Misclassification (Variables Data)
Helpful Hints

MSA for Destructive Tests


MSA for Attribute Tests


BACKUP SLIDES


Destructive Gage R&R - 2 Stage Nested Design Approach

Samples are parts that can be subdivided into homogeneous sub-samples.
Stage 1: 1 operator measures sub-samples (2-5) from each of 5-10 parts; sub-sample locations are nested within parts.
Stage 2: 3 operators each measure the same location per part (5-10 parts each), 1 sub-sample per part.
(The original slide shows tree diagrams of the two nested designs: Parts > Locations for Stage 1, Operators > Parts for Stage 2.)

Destructive Gage R&R - 2 Stage Die Bond Example (cont.)

Project: Destructive 2 stage nested.mpj (MINITAB)

Pull testing of die bond.
Parts are die. Sub-samples are 5 wire locations on the die. Spec = 7.5 grams minimum.
Stage 1: 1 operator pull tests all 5 wire locations on each of 10 die.
Stage 2: Each of 3 operators pull tests 10 die at wire location 1.

Destructive Gage R&R - 2 Stage Die Bond Example (cont.)

Stage 1: Stat > ANOVA > Fully Nested ANOVA
From worksheet: stage1

Nested ANOVA: Pull Strength versus Die
Variance Components
Source   Var Comp.   % of Total   StDev
Die      0.088       15.50        0.296
Error    0.479       84.50        0.692
Total    0.567                    0.753

(The Die component estimates the part-to-part variance.)

Destructive Gage R&R - 2 Stage Die Bond Example (cont.)

Stage 2: Stat > ANOVA > Fully Nested ANOVA
From worksheet: stage2

Nested ANOVA: Pull Strength (Wire 1) versus Operator
Variance Components
Source     Var Comp.   % of Total   StDev
Operator   0.053       11.08        0.231
Error      0.428       88.92        0.654
Total      0.481                    0.694

(The Operator component estimates the operator variance; the Error component estimates the combined part + repeat variance.)

Destructive Gage R&R - 2 Stage Die Bond Example (cont.)

Manual calculation of Gage Repeatability and Reproducibility:

Var(repeat) = Var(part/repeat) - Var(part) = 0.428 - 0.088 = 0.340
Var(R&R) = Var(repeat) + Var(operator) = 0.340 + 0.053 = 0.393

Compare the Gage R&R variance to the part variance if parts are chosen to be representative of the production process.
Since this is a one-sided spec (7.5 grams minimum), use Misclassification to determine gage acceptance.
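The manual variance-component arithmetic above is easy to script. A minimal sketch using the two nested ANOVA outputs from this example:

```python
from math import sqrt

# Variance components from the two nested ANOVA stages (slide values)
var_part = 0.088         # Die component, stage 1 (part-to-part)
var_part_repeat = 0.428  # Error component, stage 2 (part + repeat)
var_operator = 0.053     # Operator component, stage 2

# Repeatability: remove the part-to-part contribution from the stage-2 error
var_repeat = var_part_repeat - var_part
# Gage R&R variance: repeatability plus reproducibility (operator)
var_rr = var_repeat + var_operator
sd_rr = sqrt(var_rr)

print(round(var_repeat, 3), round(var_rr, 3), round(sd_rr, 3))
# prints: 0.34 0.393 0.627
```

The R&R standard deviation (about 0.63 grams) is what would feed a misclassification analysis against the 7.5-gram minimum spec.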

Kappa Call Center Example

Call Center workers were asked to categorize types of calls they received:
Callcat.mtw (MINITAB)

Kappa Attribute Analysis: Option Setting


Kappa : Within Appraiser Agreement


Kappa: Each Appraiser vs Standard


Kappa for Appraisers

What do we conclude from this analysis for the raters' performance?

What would you do next?

Can this method be applied to the banana data?


Distribution Analysis

The Art of Finding Useful Models


Jeremy Strief, Ph.D.
MECC Principal Statistician

Objectives
Explain why distributional analysis is statistically
complicated (and sometimes emotionally frustrating!)
Emphasize the importance of engineering theory and
historical precedent.
Encourage the use of multiple graphical methods in
addition to numerical tests.
Review common causes of Non-Normality.
Discuss Transformations and how they compare to
fitting non-Normal distributions.

Medtronic Confidential

Recap from Quality Trainer

Normal Distribution Basics


Capability Analysis (Normal)
Capability Analysis (Non-Normal)
Graphical tools
Boxplots
Histograms
Individual Value Plots


Distribution Analysis
Motivation and Philosophy

Why Assess Distribution

Statistical tools vary in sensitivity to, and in the effect of, distributional assumptions
Some MDT procedures require distributional assessment for those statistical methods which are highly sensitive to distributional assumptions

Statistical Tool                     Distributional Sensitivity   Effect of Poor Distributional Fit
Capability Analysis                  High                         Incorrect PPM/Ppk
Tolerance Intervals                  High                         Incorrect bounds
Variables Lot Acceptance Sampling    High                         Altered rejection and acceptance rates
Individuals Chart for SPC            High                         Incorrect control limits
GLM/Regression/ANOVA                 Med                          Approximate p-value
Xbar chart for SPC                   Med/Low                      Approximate p-value
Two-sample t-test                    Low                          Approximate p-value
Nonparametric methods                Low                          Approximate p-value


Not All Data Are Normal: Example

Histogram of Time: Lead Time data usually have a long-tail, skewed distribution.
Probability Plot of Time (Normal): Mean 12.31, StDev 9.656, N 100, AD 5.738, P-Value < 0.005.

Not All Data are Normal: Considerations


Observed data need not follow any tractable
mathematical model.
Some mathematical models may be useful, if
imperfect, representations of the data.


Frustrations with Distributional Analysis

Larger sample sizes (n > 100) cause the statistical tests to detect small departures from a theoretical model. Such departures may not be practically significant.
Smaller sample sizes (n < 15) often yield multiple distributions with p-values greater than 0.05. Graphs may look sparse and thus may not narrow one's choice of distribution.
Note: in both cases the data need to come from a process in control.

The Underlying Statistical Hypotheses

The statistical hypothesis testing is "backward," in that the null hypothesis assumes that the particular distribution is a good fit.
H0: The specified distribution has a good fit
H1: The specified distribution has lack-of-fit

Low p-values will disprove the fit of a distribution, so certain distributions can be ruled out as reasonable models.
Using the standard goodness-of-fit metrics, it is technically not possible to prove that a particular distribution is the true model for the data.
Instead of providing statistical proof, distribution analysis is geared toward assessing which statistical distributions are plausible models for the data at hand.

Philosophy of Distribution Analysis

"All models are approximations. Essentially, all models are wrong, but some are useful. However, the approximate nature of the model must always be borne in mind."
--G.E.P. Box

N=15 Probability Plots


N=500 Examples

Only 12 out of 500 values were affected by the truncation or censoring.

How to Determine Distribution

Priority order:
1. Scientific/Engineering Knowledge
2. Historical distribution analysis
3. Distribution analysis

Why is distribution analysis last?
Sample size: roughly 50 to 100 observations are needed
Regardless of n, key Xs and shift and drift can mask the true distribution
Distribution applies to short-term data only

Importance of Engineering Theory


The choice of distribution should be both statistically
plausible and scientifically justified.
Engineering theory and historical precedents often
suggest whether a distribution should be Normal,
Lognormal, or Weibull.
If scientific theory does not lead to one single
statistical model, at least consider
Whether the distribution should be skewed or symmetric
Which distributions can be ruled out


Data Analysis Philosophy

Information shouldn't be destroyed. Examples of information destruction are:
Converting variables data to attribute data.
Heavy rounding with a bad measurement system.
A drifting measurement system.

Check the quality and structure of the raw data.
Are there physically impossible values, wild outliers, missing values, too many ties?
Are the data paired or unpaired?
Was randomization employed?
How were the data generated?

Data Analysis Philosophy

Plot the data AND do analytics.
PLOT histograms, run charts, scatter plots, etc. See what is going on. Do a probability plot for process data.
Use ANALYTICS to get quantitative about what you have seen. Examine the residual plots from analytical model fits.

Analyses are performed on yesterday's data today to predict tomorrow's performance.
Data from an unstable process that is analyzed (ignoring the instability) may lead to a conclusion that will not hold up tomorrow.

Distribution Analysis

Review of Engineering Distributions

Most Common Statistical Models for


Engineering Applications

Weibull
Exponential (special case of Weibull)
Lognormal
Normal


Weibull

A flexible model which can assume many different shapes, depending on the choice of parameters:
Scale parameter (often denoted α or η)
Shape parameter (β)
Arises from "weakest link" failures, or situations when the underlying process focuses on the minimum or maximum value of independent, positive random variables.
Models stress-strength failures.

Exponential

Special case of the Weibull when the shape parameter β = 1
Constant hazard rate, meaning that the probability of failure is not a function of the age of the device/material.
May occur when multiple failure modes are operating simultaneously.
May be useful in modeling software failures resulting from external sources (e.g. cosmic radiation causes bit-flips at an extremely low, constant rate).

Lognormal

Models time-to-failure caused by several forces which combine multiplicatively.
Describes time to fracture from fatigue crack growth in metals.
Right-skewed distribution, useful when data values span multiple orders of magnitude (e.g. 1.4, 14, 140).
Two parameters (μ, σ), each of which is traditionally expressed on the log scale.
So if X ~ Lognormal(μ, σ), then ln(X) ~ Normal(μ, σ).
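The defining relationship (ln of a Lognormal is Normal) is easy to demonstrate by simulation. A minimal sketch assuming NumPy is available; the parameter values are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# If X ~ Lognormal(mu, sigma), then ln(X) ~ Normal(mu, sigma)
mu, sigma = 2.0, 0.5
x = rng.lognormal(mean=mu, sigma=sigma, size=2000)

# Taking logs should recover approximately Normal(mu, sigma)
logs = np.log(x)
print(round(logs.mean(), 2), round(logs.std(), 2))  # close to (2.0, 0.5)
```

This is also why a common workflow is to log-transform lead-time-like data and then apply Normal-based tools.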


Normal

Models time-to-failure caused by additive, independent forces.
Commonly describes gage error, dimensional measurements from a supplier, and other symmetric, bell-shaped phenomena.

Additional Models to Consider


Logistic
Smallest Extreme Value (SEV)
Largest Extreme Value (LEV)


Some Relationships

SEV distribution = ln(Weibull distribution).
LEV distribution = ln(1/Weibull distribution).
Normal distribution = ln(Lognormal distribution).
All Weibull distributions can be rescaled and re-powered to get another Weibull.
The Weibull(100, 4) is very close to a Normal(mean = 90.64, s.d. = 25.43). This Normal is thicker in the tails than the Weibull(100, 4). Ref: 02SR013, "Algorithm for Computing Weibull Sample Size for Complete Data."
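The Weibull(100, 4) moments quoted above follow from the standard gamma-function formulas for the Weibull mean and variance. A quick check in Python:

```python
from math import gamma, sqrt

def weibull_mean_sd(scale, shape):
    """Mean and standard deviation of a Weibull(scale, shape) distribution."""
    mean = scale * gamma(1 + 1 / shape)
    var = scale**2 * (gamma(1 + 2 / shape) - gamma(1 + 1 / shape)**2)
    return mean, sqrt(var)

m, s = weibull_mean_sd(100, 4)
print(round(m, 2), round(s, 2))  # prints: 90.64 25.43
```

These match the Normal(90.64, 25.43) approximation cited from 02SR013.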


Review: Common Engineering Distributions

Weibull: wearout; time to stress/strength-related failure; infant mortality
Normal: the default; measurement error; dimensions
Lognormal: lead time; time to fatigue-related failure

Distribution Analysis
Statistical Overview


Statistical Approach to Distribution Analysis

Both graphical and numerical approaches are needed
The p-value is not definitive, given the "backward" nature of the hypothesis testing
Visual assessment of the probability plot is crucial
Reasonably large sample sizes (~50) are needed. Consult your local procedures (e.g. DOC000550 within CRDM) for specific rules.

Distribution Analysis
Graphical Methods


Good Distribution Analysis Should Always Begin With Plots!
Probability plots
Histograms
Time plots

Probability Plot
A probability plot is a 2-dimensional plot with specialized (often
logarithmic) axes, to facilitate comparison between observed
data and a hypothesized distribution.
More specifically, a probability plot is a comparison between the
observed and theoretical quantiles (i.e. percentiles) for a
hypothesized distribution.


Probability Plot Interpretation

If the distribution is a good fit to the data, the plotted points should fall approximately in a straight line.
When interpreting the probability plot, examine both the p-value and the visual fit.
At the tails of the distribution, look at whether the points fall on the conservative side of the fitted line.
Look for major deviations in the pattern of points from a straight line: kinks, ties, curves, jumps, etc. Do not worry if a few points fall outside the confidence bounds.
Fat Pencil Test: Can the observed data values be covered up by a "fat pencil"?


Probability Plot in Minitab
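Outside Minitab, the same kind of plot can be built with SciPy's `probplot`, which returns the ordered data against theoretical Normal quantiles along with a least-squares line. A minimal sketch on simulated data (assuming NumPy and SciPy are installed); here we just check the correlation `r` of the points with the fitted line rather than rendering the graph.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
data = rng.normal(loc=12, scale=3, size=100)  # simulated process data

# Ordered data vs theoretical Normal quantiles, plus a fitted straight line;
# pass plot=ax (a matplotlib Axes) to draw the actual probability plot
(osm, osr), (slope, intercept, r) = stats.probplot(data, dist="norm")

# r near 1 means the points fall close to a straight line (good Normal fit)
print(round(r, 3))
```

For a formal assessment you would still pair this with the numerical tests discussed later in this chapter.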


Probability Plot Examples

Right skew and curvature:
Large N makes for obvious curvature:

Probability Plot Examples

Subtle patterns can be caused by randomness.
Both datasets were sampled directly from a Normal distribution.

Probability Plot Examples

The distribution does not pass the Anderson-Darling test, but the lower tail of the distribution falls on the conservative side of the fitted line.
The distribution appears to have a lower limit of zero.
It would be conservative to use the Normal model to estimate the lower-tail behavior.

Histograms in Minitab
The graph menu offers a histogram platform, but the graphical
summary platform offers more information with fewer clicks.


Histograms
More intuitive than probability plots, since the x-y axes are not transformed.
Not informative with small sample sizes (<30).
Can theoretically be misleading if the bin width is calculated inappropriately, but in practice the histogram is a useful tool for moderate-to-large sample sizes.
(Example panels: one with apparent right skew, one approximately bell-shaped.)

Time Plots
Fitting a single distribution to your data implies that the
underlying process is stable.
Without a stable process, distributional fit is irrelevant.
Time plots and control charts help evaluate the stability of your
process.


Why is Stability needed to Assess Distribution?

MINITAB: Distribution Analysis Shift and Drift.mtw

Distribution Assessment Risks
Shift and drift, and variation in key Xs, mask the distribution
Initial capability data always contain shift and drift
At final capability, the process is stable and variation in key Xs is removed

Example data: 100 samples from Week 1, 25 samples from Week 2, 100 samples from Week 3

Distribution applies to short-term data only
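The shift-and-drift effect in the next few slides can be reproduced by simulation. A minimal sketch (assuming NumPy and SciPy are installed) of a hypothetical three-week process whose mean shifts each week, mimicking the 100 + 25 + 100 layout of the Shift and Drift dataset; the specific means and SDs are made up.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Each week's short-term data are sampled from a Normal distribution,
# but the process mean shifts from week to week
week1 = rng.normal(10, 2, 100)
week2 = rng.normal(20, 2, 25)
week3 = rng.normal(30, 2, 100)
combined = np.concatenate([week1, week2, week3])

# The combined (long-term) data fail a normality test badly,
# even though every individual week came from a Normal distribution
_, p_combined = stats.shapiro(combined)
print(p_combined < 0.05)  # prints: True
```

This is exactly the pattern on the following slides: the combined data are rejected as Normal while each week, taken alone, passes.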


Initial Process Data often have Shift and Drift

I Chart of Initial Capability Data: X-bar = 19.93, UCL = 26.30, LCL = 13.55, with a large number of points flagged by Test 1 (beyond the control limits) across the 225 observations.

Long Term Data May not be Normal

Probability Plot of Initial Capability Data (Normal - 95% CI): the combined data are not normal.
Mean 19.93, StDev 9.679, N 225, AD 13.617, P-Value < 0.005.

But Short Term Data Could be Normal

Probability Plot of Initial Capability Data by Week (Normal - 95% CI): each week is normal.
Week 1: Mean 9.871, StDev 2.155, N 100, AD 0.476, P 0.233
Week 2: Mean 20.39, StDev 2.203, N 25, AD 0.280, P 0.616
Week 3: Mean 29.87, StDev 2.011, N 100, AD 0.236, P 0.785

Distribution Analysis
Numerical Methods

Numerical Methods
For all numerical methods:
A large (≥0.05) p-value implies there is no evidence against the hypothesized distribution.
A small (<0.05) p-value implies there is statistically significant lack-of-fit.

It is commonly stated that a distributional test "passes" when p ≥ 0.05.
A passing test does NOT mean that the hypothesized distribution is correct or "best." There may be multiple models which fit the data, and you should choose whichever model best matches science and historical precedent.
historical precedent.


Most Common Normality Tests

Anderson-Darling (AD) test
Ryan-Joiner test
Note: The Ryan-Joiner test is essentially equivalent to the Shapiro-Wilk test.

Anderson-Darling
The default approach in Minitab.
May be used to assess the fit of Normal and non-Normal distributions.
Gives unreliable results when data are discretized/grouped, which is fairly common when measurement system resolution is poor.
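Outside Minitab, both tests have close SciPy equivalents (assuming SciPy is installed): `stats.anderson` for Anderson-Darling, and `stats.shapiro` for Shapiro-Wilk, which is essentially the Ryan-Joiner test. A minimal sketch on simulated Normal(10, 1.5) data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.normal(loc=10, scale=1.5, size=100)  # simulated measurements

# Anderson-Darling test for normality; reject at 5% if the statistic
# exceeds the critical value tabulated for the 5% significance level
ad = stats.anderson(data, dist="norm")
idx_5pct = list(ad.significance_level).index(5.0)
reject_normality = ad.statistic > ad.critical_values[idx_5pct]

# Shapiro-Wilk (essentially equivalent to Minitab's Ryan-Joiner test)
w_stat, p_value = stats.shapiro(data)
print(bool(reject_normality), round(p_value, 3))
```

Note that `stats.anderson` reports critical values rather than a p-value, so the decision is made by comparing the statistic against the tabulated 5% cutoff.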


Anderson-Darling in Minitab
For assessing Normality:


Anderson-Darling in Minitab
For any/all distributions:


Anderson-Darling Results
Normal(10,1.5)

Normal(10,1.5)--Rounded


Ryan-Joiner
Useful for discretized, rounded, or clumpy data.
Will not declare significant lack-of-fit simply due to poor measurement resolution.
Recommended minimum of 5 groups to have a meaningful p-value. Fewer groups may yield an overly optimistic (high) p-value.
(Comparison panels: Anderson-Darling vs Ryan-Joiner on the same rounded data.)

Ryan-Joiner in Minitab


Truncation
The Normal distribution may be used to model tail
behavior if it provides a conservative estimate of
those tails.
This situation arises when data are truncated, which
is quantitatively captured as negative kurtosis.


Truncation
In principle, truncated data may be evaluated graphically or through a Skewness-Kurtosis (SK) test.
The SK test checks whether the tails of the Normal distribution are longer or shorter than the tails of your data.
MECC has created and validated an Excel spreadsheet (R134997) which executes the SK test.
In practice, consult your local procedures to ensure your analysis of truncated data is compliant.
(Embedded object: Microsoft Excel worksheet)

Avoiding Parametric Distributions Altogether

Chebyshev's inequality captures the tail behavior of any statistical distribution with a finite variance.
For any random variable X with mean μ and standard deviation σ, and any constant k > 1:
P( |X - μ| ≥ kσ ) ≤ 1/k²

This inequality may be useful for skipping the issue of distributional fit altogether, especially if distributional fit is being assessed in order to compute a tolerance interval.
Chebyshev's will only be helpful if the process capability is extremely high.
Consult your own procedures for details, but CRDM procedures invoke the following version of Chebyshev:
If the nearest specification is at least 10 standard deviations away from the mean, it may be inferred by Chebyshev that at least 99% of the distribution will fall within specification.
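The 10-sigma rule follows directly from the inequality: with k = 10, at most 1/10² = 1% of any finite-variance distribution can lie more than 10 standard deviations from the mean. A one-function sketch:

```python
def chebyshev_within_spec(k):
    """Chebyshev lower bound on the fraction of any finite-variance
    distribution lying within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's inequality is informative only for k > 1")
    # P(|X - mu| >= k*sigma) <= 1/k**2, so at least 1 - 1/k**2 is within
    return 1 - 1 / k**2

# The CRDM rule on the slide: spec at least 10 sigma from the mean
print(chebyshev_within_spec(10))  # prints: 0.99
```

Because the bound is distribution-free it is very conservative; at the same k, a Normal distribution would put essentially 100% within spec.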


Why Normality Tests Fail

1. A shift occurred in the middle of the data
2. Multiple sources or multiple failure modes with different distributions
3. Outliers
4. Piled-up data
5. Truncated data (sorted before you get it)
6. The underlying distribution is not normal (skewed)
7. Poor measurement resolution
8. Too much data (overpowered to detect non-normality)
9. Random chance: you expect the test to fail 5% of the time (i.e. 95% confidence) even if the data were truly from a normal distribution

Resolving Non-Normality

Cause - Possible remedies:
1 Data shift - Sublot; Skewness/kurtosis test; Attribute sampling
2 Multiple data sources - Sublot; Skewness/kurtosis test; Attribute sampling
3 Outliers - Attribute sampling; Outlier removal (may remove outliers only if they constitute typos or data collection errors)
4/5 Censored/Truncated data (tails lost) - Skewness/kurtosis test; Conservative fitting; Attribute sampling
6 Distribution not normal - Non-normal analysis; Transformation; Attribute sampling
7 Poor measurement resolution - Ryan-Joiner; Skewness/kurtosis test; Graphical evidence
8 Too much data - Random subsampling
9 Random chance - Historical assessment

When Multiple Distributions Fit


Prior engineering knowledge is
particularly useful when multiple
distributions yield p-values above 0.05:
Picking the distribution solely based on the best p-value or
best R² is rational only when there is absolutely no history or
scientific theory.
A better approach is to assemble a list of plausible
(p>0.05) distributions and then make a final choice based
upon history and science.
P-values will sometimes be below 0.05 simply as a result
of chance (Type I error). It is not recommended to
immediately change years of analysis based on one
significant p-value. Investigate and monitor before
changing distributions.

57 | MDT Confidential

Avoid the daily special


Do NOT take the "distribution du jour" approach, in
which multiple distributions are chosen for a single
process. This reflects either:
An out-of-control process, which can't be
captured by a single distribution anyway.
The bad statistical practice of just defaulting to
the distribution with the highest p-value.

58 | MDT Confidential


Example: Capability for Non-Normal Data, using Tribal Knowledge for the Distribution

MINITAB: LoanApplicationTime.MTW

Problem Statement: Time (in days) to process
(reject/accept) loan applications is too long, causing loss of
customer applications.
Project Goal: Decrease potential customer loss from
15% to 5%. Customer expectation is 20 days.
Project Strategy: Path Y = Time
Task: Determine capability for Y = Time

Assume lead time has a LogNormal Distribution

59 | MDT Confidential

Verify Lognormal Distribution

[Probability Plot of Time, Lognormal, 95% CI]
Loc = 2.269, Scale = 0.6845, N = 100, AD = 0.432, P-Value = 0.299

Check if LogNormal provides a good fit.
60 | MDT Confidential


Capability for Non-Normal Data using LogNormal

[Process Capability of Time: calculations based on Lognormal distribution model]
Process Data: LSL = *, Target = *, USL = 20, Sample Mean = 12.31, Sample N = 100, Location = 2.26918, Scale = 0.684493
Overall Capability: Z.Bench = 1.06, Z.LSL = *, Z.USL = 0.47, Ppk = 0.16
Observed Performance: PPM > USL = 160000
Exp. Overall Performance: PPM > USL = 144242
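The expected overall PPM and Z.Bench in this output can be reproduced outside Minitab from the fitted lognormal parameters, since P(X > USL) = P(Z > (ln USL − Loc)/Scale). A stdlib-Python sketch (variable names are ours):

```python
from math import log
from statistics import NormalDist

nd = NormalDist()  # standard normal

# Fitted lognormal parameters and spec from the output above
loc, scale, usl = 2.26918, 0.684493, 20.0

# P(X > USL) for lognormal X equals P(Z > (ln(USL) - loc) / scale)
p_above = 1.0 - nd.cdf((log(usl) - loc) / scale)
ppm = p_above * 1e6                  # expected overall PPM > USL, ~144,242
z_bench = nd.inv_cdf(1.0 - p_above)  # equivalent normal-tail Z.Bench, ~1.06
print(round(ppm), round(z_bench, 2))
```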

61 | MDT Confidential

Distribution Analysis
Transformations


Two Options
When a dataset is non-Normal, it is acceptable either to
Mathematically transform the data to achieve Normality
Fit a non-Normal distribution

Transformation carries the practical advantage that many


statistical methods are based upon Normality, so there will
be more analytical tools available for the transformed
dataset.
Transformation carries the disadvantages of creating
unnatural units (e.g. log-meters instead of meters) and
altering potentially relevant structures of the data.
Note: Please do NOT try transformations of data from
an unstable process, or bimodal data (two bumps).

63 | MDT Confidential

Transformation Advice
If a transformation is chosen, it should be as
simple as possible, and it should ideally have a
physical interpretation.
A log transformation is particularly desirable,
since it
Is monotonic
Is straightforward to interpret (it turns multiplicative
effects into additive effects)
Is equivalent to fitting a LogNormal distribution
Is common in the literature

64 | MDT Confidential


Transformation Advice
The Johnson transformation is a last resort, as it
Rarely has any scientific/engineering meaning
Involves a complicated mathematical structure
Is not universally considered an acceptable
transformation
Any Box-Cox transformation with a lambda value
between [-2,2] is typically acceptable, although the
chosen lambda should ideally have a physical
meaning.

65 | MDT Confidential

Transformation Advice
There is no transformation which will eliminate outliers!
By definition, an outlier is so far away from the rest of the data
values that it is unlikely to belong to the same distribution.
An attribute approach is typically needed when outliers are
present.
Investigate the outlier and determine if there were any typos or
other unusual circumstances which would warrant deletion.
Outliers should NOT be deleted unless there is a strong
argument as to why the outlier is not representative of the
process.
An apparent outlier could possibly be a typical datapoint from
a highly skewed distribution, like LogNormal or LEV.
Use engineering thinking as well as statistical thinking to decide
the best course of action for outlier mitigation.

Stay consistent in your choice of transformation.


Inconsistency implies an unstable process/distribution.

66 | MDT Confidential


Box-Cox Transformations
(when there is no theoretical distribution)

Assumptions for Y
Y > 0; Y is skewed (right or left)
Y is unimodal (single peak)
Box-Cox determines transform to make Y
normal
Y(λ) = (Y^λ − 1) / λ for λ ≠ 0
Y(λ) = ln(Y) for λ = 0
Use Box-Cox when there is no theoretical distribution
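The two-branch definition can be sketched directly; `box_cox` is a hypothetical helper, not a Minitab function:

```python
from math import log

def box_cox(y, lam):
    """Box-Cox transform for y > 0; lam = 0 falls back to the natural log."""
    if y <= 0:
        raise ValueError("Box-Cox requires Y > 0")
    if lam == 0:
        return log(y)
    return (y**lam - 1.0) / lam

print(box_cox(4.0, 0.5))  # (sqrt(4) - 1) / 0.5 = 2.0
```

The (Y^λ − 1)/λ form (rather than plain Y^λ) makes the family continuous in λ: as λ → 0 it converges to ln(Y).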

67 | MDT Confidential

Box-Cox Transformations
(when there is no theoretical distribution)

Typical Box-Cox transformations


λ = 2 → Y² transformation
λ = 0.5 → √Y transformation
λ = 0 → ln(Y) transformation
λ = −0.5 → 1/√Y transformation
λ = −1 → 1/Y transformation
Use Box-Cox when there is no theoretical distribution

68 | MDT Confidential


Example: Capability for Non-Normal Data using Box-Cox

MINITAB: Error Resolution Time.MTW

Problem Statement: Time (in days) to resolve errors in case report forms for
a pre-market clinical evaluation is too long causing delay in the product
release
Project Goal: Decrease error resolution time. Expectation is 7 days.
Project Strategy: Path Y = Resolution Time
Task: Determine capability for Y = Resolution Time

69 | MDT Confidential

Example: Verify LogNormal


[Probability Plot of Resolution Time, Lognormal, 95% CI]
Loc = 1.760, Scale = 1.303, N = 200, AD = 3.623, P-Value < 0.005

Fails the 3-second rule, the fat-pencil test, and the p-value criterion.
Not LogNormal!
70 | MDT Confidential


Example: Apply Box-Cox Transformation

71 | MDT Confidential

Example: Determine Optimal Lambda


[Box-Cox Plot of Resolution Time, using 95.0% confidence]
Lambda: Estimate = 0.26, Lower CL = 0.15, Upper CL = 0.38, Rounded Value = 0.26

Minitab performs the Box-Cox transformation of Y; the default lambda is the rounded value.

72 | MDT Confidential


Example: Calculate Capability for Non-Normal Data Using Box-Cox

73 | MDT Confidential

Example: Capability for Box-Cox Transformed Y


[Process Capability of Resolution Time, using Box-Cox transformation with Lambda = 0.26]
Process Data: LSL = *, Target = *, USL = 7, Sample Mean = 10.2928, Sample N = 200, StDev (Within) = 9.25009, StDev (Overall) = 9.5492
After Transformation: LSL* = *, Target* = *, USL* = 1.65972, Sample Mean* = 1.66391, StDev (Within)* = 0.503383, StDev (Overall)* = 0.485671
Potential (Within) Capability (transformed data): Z.Bench = -0.01, Z.LSL = *, Z.USL = -0.01, Cpk = -0.00, CCpk = -0.00
Overall Capability (transformed data): Z.Bench = -0.01, Z.LSL = *, Z.USL = -0.01, Ppk = -0.00, Cpm = *
Observed Performance: PPM > USL = 520000.00
Exp. Within Performance: PPM > USL* = 503324.68
Exp. Overall Performance: PPM > USL* = 503445.93

Capability = Z.Bench (Potential)


74 | MDT Confidential


A Desirable Problem
If your data could be handled either through a transformation or
a non-Normal distribution, either path is acceptable.
All else being equal, a recommended prioritization is as follows:
1. Log Transformation (= LogNormal model)
2. Weibull/Exponential model
3. Box-Cox with lambda ≠ 0 but lambda within [-2, 2]
4. Other engineering distribution (SEV/LEV, logistic, etc.)

Any prioritization scheme should be interpreted as a heuristic,


not as the one true path.
The most important thing is to plot your data and arrive at a
mathematical solution which makes sense within the
engineering/scientific context at hand.
As much as possible, remain consistent in your choice of
statistical method. Avoid the distribution du jour or
transformation du jour.

75 | MDT Confidential

Distribution Analysis
Flowchart


Normality Testing Flowchart: CRDM


CRDM: Meant as a teaching aid, not an official quality doc.

Medtronic Confidential



Distribution Analysis

Summary and Challenge Problem

Objectives Recap
Explain why distributional analysis is statistically
complicated (and sometimes emotionally
frustrating!)
Emphasize the importance of engineering theory
and historical precedent.
Encourage the use of multiple graphical methods
in addition to numerical tests.
Review common causes of Non-Normality
Discuss Transformations and how they compare
to fitting non-Normal distributions

Medtronic Confidential


Distribution Analysis Commentary


Distribution fitting is NOT about finding the true
distribution for your data; statistical theory
CANNOT prove that a particular distribution is the
true model for the data.
A model is true if it still fits as the sample size approaches
infinity.
With engineering data, it is often the case that distributions are
approximately Normal when N=50, but taking N=200 or N=500
will show small, but statistically significant, departures from
Normality.
In such a situation, the Normal distribution is often still a useful
model even if it is not a true model.

Instead of providing scientific truth, distribution
analysis is geared toward assessing which
statistical distributions are plausible models for the
data at hand.

Distribution Analysis Commentary


Good distribution fitting should combine statistical analysis with
engineering/scientific thinking.
Even before any data are collected, engineering theory and
historical precedents often suggest a distributional form:
Does the process involve any sort of maximization or
minimization of physical forces? If so, then Weibull might be
a good model.
Does the process involve the averaging of multiple small
forces? If so, the Normal might be a good model.
Are there historical precedents which suggest which model
is best?
Ideally, the chosen distribution should have an insignificant p-value AND it should intuitively match engineering principles.


Don't Forget Business Context


Usually distribution analysis is just one step in a
larger analytical problem.
Keep the larger business/engineering problem in
mind, as it may suggest
Whether only one tail of the dataset needs to be
modeled.
Whether a single low p-value might be a statistical
false alarm.
Whether the model needs to produce highly
precise numbers or just be in the ballpark.
85 | MDT Confidential

Challenge Problem
MECC Supplier Dataset: mecc_supplier.mtw
Business goal is to qualify the supplier as having
high capability, and possibly to create a variables
or attribute acceptance sampling plan.
LSL: 0.058
USL: 0.064
Analyze the data and offer your opinion of what
distribution is best for the situation at hand.
What questions would you ask the Supplier
Quality Engineer to help refine your decision?
86 | MDT Confidential


Process Capability Analysis

Objectives

QT Review
Process Capability

2 | MDT Confidential

Recap from Quality Trainer

Introduction
Process Capability for Normal Data
Capability Indices
Process Capability for Non-Normal Data
Summary

3 | MDT Confidential

A5 Process Capability
Measuring Process Capability

Sigma Scale, Z scores, DPM=PPM

Process Capability Indices

(Cp, Cpk, Pp, Ppk)


Impact of Normality & Process Stability
Attribute Data
Non-normal Data
Minitab Assistant
Impact of Sample Size (Confidence Limits)
Comparison to Tolerance Intervals
Impact of Measurement Error
4 | MDT Confidential

SIX SIGMA QUALITY LEVEL


Customer Requirement (USL) = 130

[Histogram of Process Output: Mean = 100, Std Dev = 5, Z = 6.0]

Process Capability: a comparison between what the process produces and what is required.

Defect Rate: 1 part per billion
(NOTE: 2 parts per billion for two-sided specs)
5 | MDT Confidential


SIX SIGMA QUALITY LEVEL


[Histogram of Process Output: Customer Requirement = 130, Mean = 107.5, Std Dev = 5, Z = 4.5]

To estimate long-term performance, apply a 1.5σ shift in the mean.

Defect Rate: 3.4 parts per million
6 | MDT Confidential


SIGMA SCALE

Short-Term       Long-Term    Standard Normal Tail Area Probability
Process Sigma    z            P(Z > z)     DPMO       % Conforming
6.0              4.5          0.0000034    3.4        99.99966
5.5              4.0          0.0000317    32         99.99683
5.0              3.5          0.0002327    233        99.9767
4.5              3.0          0.0013500    1,350      99.865
4.0              2.5          0.0062097    6,210      99.379
3.5              2.0          0.0227501    22,750     97.72
3.0              1.5          0.0668072    66,807     93.32
2.5              1.0          0.1586553    158,655    84.13
2.0              0.5          0.3085375    308,538    69.15
1.5              0.0          0.5000000    500,000    50.00
1.0              -0.5         0.6914625    691,462    30.85
0.5              -1.0         0.8413447    841,345    15.87
0.0              -1.5         0.9331928    933,193    6.68

(Short-term Process Sigma = long-term z + 1.5)

7 | MDT Confidential
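Each DPMO entry in the table is just the upper tail of the standard normal at z = Process Sigma − 1.5, scaled to one million. A stdlib-Python sketch (the `dpmo` helper is ours):

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal

def dpmo(process_sigma, shift=1.5):
    """Long-term DPMO for a short-term sigma level, with the 1.5-sigma shift."""
    z_long_term = process_sigma - shift
    return (1.0 - nd.cdf(z_long_term)) * 1e6

print(round(dpmo(6.0), 1), round(dpmo(3.0)))  # 3.4 66807
```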

Process Capability Indices

Cp, Pp (Process Capability Ratio): variation only, ignores the mean
    Cp = (USL − LSL) / 6σ
Cpk, Ppk: include the mean to account for centering
    Cpk = min[ (USL − μ) / 3σ, (μ − LSL) / 3σ ]
(Pp and Ppk use the same formulas with the overall σ estimate.)

Cp, Cpk use the within-subgroup variation estimate of σ (Short-Term, Potential)
Pp, Ppk use the overall sample standard deviation estimate (Long-Term, Actual)
8 | MDT Confidential
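As a sketch of these indices (assuming the standard formulas Pp = (USL − LSL)/6s and Ppk = min[(USL − x̄)/3s, (x̄ − LSL)/3s]; the helper name is ours):

```python
from statistics import mean, stdev

def pp_ppk(data, lsl, usl):
    """Overall (long-term) capability from individual values; Cp/Cpk use a
    within-subgroup sigma estimate in place of the overall sample stdev."""
    mu, s = mean(data), stdev(data)
    pp = (usl - lsl) / (6 * s)
    ppk = min((usl - mu) / (3 * s), (mu - lsl) / (3 * s))
    return pp, ppk

# A centered sample: Pp and Ppk agree; shifting the mean lowers only Ppk
pp, ppk = pp_ppk([98, 99, 100, 100, 101, 102], lsl=90, usl=110)
print(round(pp, 2), round(ppk, 2))
```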

Process Capability Ratio


Spec Width (USL − LSL) defines the allowable variation;
6σ defines the actual process variation.

[Figure: Process fallout and the process capability ratio (PCR).]

9 | MDT Confidential

Within Subgroup Unbiasing Constants

10 | MDT Confidential

Simulating Within Subgroup Variation


EXERCISE
1. Randomly sample from a known population: Normal (μ = 100, σ = 5)
   Using Calc > Random Data > Normal
   Simulate 10,000 subgroups of size n = 5 by placing data into C1-C5
2. Compute the Mean (X-bar), Range (R), and StDev (S) for each subgroup
   Using Calc > Row Statistics, place into C6, C7, C8
3. Compute the Variance (S²), R/d2, and S/c4 (for n = 5: d2 = 2.326, c4 = 0.939986)
   Using Calc > Calculator, place into C9, C10, C11
4. Evaluate the performance of the 6 statistics in columns C6-C11 in estimating
   the population parameters
   Using Stat > Basic Statistics > Display Descriptive Statistics
   or Stat > Basic Statistics > Graphical Summary
5. Which statistics are biased and which ones are unbiased?
11 | MDT Confidential

NOTE:
Average (R/d2) = (R-bar/d2)
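The exercise can be mimicked outside Minitab. A seeded stdlib-Python sketch (20,000 subgroups instead of 10,000, for a slightly tighter average; d2 and c4 are the n = 5 constants quoted above):

```python
import random
from statistics import mean, stdev

# Standard SPC constants for subgroup size n = 5 (quoted in the exercise)
D2, C4 = 2.326, 0.939986
random.seed(1)

N_SUBGROUPS, N = 20_000, 5
r_over_d2, s_over_c4, s_raw = [], [], []
for _ in range(N_SUBGROUPS):
    sub = [random.gauss(100, 5) for _ in range(N)]
    s = stdev(sub)
    r_over_d2.append((max(sub) - min(sub)) / D2)
    s_over_c4.append(s / C4)
    s_raw.append(s)

# R/d2 and S/c4 average near the true sigma = 5; raw S is biased low (~5 * c4)
print(round(mean(r_over_d2), 3), round(mean(s_over_c4), 3), round(mean(s_raw), 3))
```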

Estimating Within Subgroup StDev

Data Source:
20 subgroup samples of 5
parts taken from a
component manufacturing
process. Data are coded
x 0.0001 in. + 0.50 in.
Applied Statistics &
Probability for Engineers,
6th Edition (Montgomery &
Runger, Wiley 2013)
LSL = 25
USL = 45
Target = 35

12 | MDT Confidential

Xbar-S Chart of Component Data (n = 5 per subgroup)

[Xbar chart: UCL = 36.82, X-double-bar = 33.32, LCL = 29.82; out-of-control points flagged]
[S chart: UCL = 5.116, S-bar = 2.449, LCL = 0]

NOTE: The center line on the S chart is calculated backward from the unbiased
pooled StDev: 2.605 x 0.939986 = 2.449 for subgroup size n = 5.

13 | MDT Confidential

Stat > Quality Tools > Capability Analysis > Normal . . .

[Process Capability of Component Data, n = 5 per subgroup]
Process Data: LSL = 25, Target = 35, USL = 45, Sample Mean = 33.32, Sample N = 100, StDev (Within) = 2.60524, StDev (Overall) = 3.29946
Potential (Within) Capability: Cp = 1.28, CPL = 1.06, CPU = 1.49, Cpk = 1.06
Overall Capability: Pp = 1.01, PPL = 0.84, PPU = 1.18, Ppk = 0.84, Cpm = 0.90
Observed Performance: PPM < LSL = 0.00, PPM > USL = 0.00, PPM Total = 0.00
Exp. Within Performance: PPM < LSL = 702.65, PPM > USL = 3.68, PPM Total = 706.32
Exp. Overall Performance: PPM < LSL = 5840.77, PPM > USL = 200.09, PPM Total = 6040.85

PPM estimates assume a stable process that is Normally distributed.

14 | MDT Confidential

Stat > Basic Statistics > Normality Test . . .

[Probability Plot of Component Data (Subgroup Data Stacked), Normal]
Mean = 33.32, StDev = 3.299, N = 100, AD = 0.620, P-Value = 0.104

15 | MDT Confidential

Stat > Quality Tools > Capability Sixpack > Normal . . .

[Process Capability Sixpack of Component Data]
Xbar chart: UCL = 36.82, X-double-bar = 33.32, LCL = 29.82
R chart: UCL = 12.81, R-bar = 6.06, LCL = 0
Normal probability plot: AD = 0.620, P = 0.104
Specifications: LSL = 25, Target = 35, USL = 45
Within: StDev = 2.605, Cp = 1.28, Cpk = 1.06, PPM = 706.32
Overall: StDev = 3.299, Pp = 1.01, Ppk = 0.84, Cpm = 0.90, PPM = 6040.85

NOTE: The center line on the R chart is calculated backward from the unbiased
pooled StDev: 2.605 x 2.326 = 6.06 for subgroup size n = 5.

16 | MDT Confidential

Stat > Quality Tools > Capability > Between/Within . . .

[Between/Within Capability of Component Data, n = 5 per subgroup]
Process Data: LSL = 25, Target = 35, USL = 45, Sample Mean = 33.32, Sample N = 100, StDev (Between) = 2.37001, StDev (Within) = 2.60524, StDev (B/W) = 3.52197, StDev (Overall) = 3.29946
B/W Capability: Cp = 0.95, CPL = 0.79, CPU = 1.11, Cpk = 0.79
Overall Capability: Pp = 1.01, PPL = 0.84, PPU = 1.18, Ppk = 0.84, Cpm = 0.90
Observed Performance: PPM < LSL = 0.00, PPM > USL = 0.00, PPM Total = 0.00
Exp. B/W Performance: PPM < LSL = 9080.54, PPM > USL = 456.04, PPM Total = 9536.58
Exp. Overall Performance: PPM < LSL = 5840.77, PPM > USL = 200.09, PPM Total = 6040.85

NOTE: Use Between/Within analysis when there is significant variation between subgroups.

17 | MDT Confidential

Estimating Within StDev from Individual Measurements
Data Source:
20 samples of individual
measurements of
concentration taken at
one-hour intervals from a
chemical process.
Applied Statistics &
Probability for Engineers,
6th Edition (Montgomery &
Runger, Wiley 2013)
LSL = 95
USL = 105
Target = 100

18 | MDT Confidential

[I-MR Chart of Concentration]
Individuals chart: UCL = 105.98, X-bar = 99.10, LCL = 92.21
Moving Range chart: UCL = 8.461, MR-bar = 2.589, LCL = 0

19 | MDT Confidential

Stat > Quality Tools > Capability Analysis > Normal . . .

[Process Capability of Concentration]
Process Data: LSL = 95, Target = 100, USL = 105, Sample Mean = 99.095, Sample N = 20, StDev (Within) = 2.29563, StDev (Overall) = 1.97603
Potential (Within) Capability: Cp = 0.73, CPL = 0.59, CPU = 0.86, Cpk = 0.59
Overall Capability: Pp = 0.84, PPL = 0.69, PPU = 1.00, Ppk = 0.69, Cpm = 0.76
Observed Performance: PPM < LSL = 50000.00, PPM > USL = 0.00, PPM Total = 50000.00
Exp. Within Performance: PPM < LSL = 37226.30, PPM > USL = 5051.62, PPM Total = 42277.92
Exp. Overall Performance: PPM < LSL = 19117.21, PPM > USL = 1402.63, PPM Total = 20519.84

PPM estimates assume a stable process that is Normally distributed.

20 | MDT Confidential


Stat > Basic Statistics > Normality Test . . .

[Probability Plot of Concentration, Normal]
Mean = 99.10, StDev = 1.976, N = 20, AD = 0.398, P-Value = 0.333

21 | MDT Confidential

Stat > Quality Tools > Capability Sixpack > Normal . . .

[Process Capability Sixpack of Concentration]
I chart: UCL = 105.98, X-bar = 99.10, LCL = 92.21
Moving Range chart: UCL = 8.461, MR-bar = 2.589, LCL = 0
Normal probability plot: AD = 0.398, P = 0.333
Specifications: LSL = 95, Target = 100, USL = 105
Within: StDev = 2.296, Cp = 0.73, Cpk = 0.59, PPM = 42277.92
Overall: StDev = 1.976, Pp = 0.84, Ppk = 0.69, Cpm = 0.76, PPM = 20519.84

22 | MDT Confidential


Compare Process Capability Indices to Diagnose Improvement Actions

Cp vs. Cpk (or Pp vs. Ppk): disparity indicates a centering issue
Cp vs. Pp (or Cpk vs. Ppk): disparity indicates a stability issue

Cp: potential capability (inherent variation). Ppk: overall performance.

23 | MDT Confidential

FOUR POSSIBILITIES
(Donald J. Wheeler)

Two questions: Is the process in statistical control? (control charts: LCL, UCL)
Is the process capable of meeting requirements? (capability indices Cp, Cpk, Pp, Ppk; requires LSL, USL)

In control: Yes / Capable: Yes  ->  Ideal State (Monitor)
In control: Yes / Capable: No   ->  Threshold State (Alter System)
In control: No  / Capable: Yes  ->  Brink of Chaos (Remove Special Causes)
In control: No  / Capable: No   ->  State of Chaos

24 | MDT Confidential


Centered, Stable, Capable

[Time Series Plot of A: mean near 100, limits 90 and 110]

[Process Capability of A]
Process Data: LSL = 90, Target = *, USL = 110, Sample Mean = 99.8045, Sample N = 100, StDev (Within) = 1.48539, StDev (Overall) = 1.45512
Potential (Within) Capability: Cp = 2.24, CPL = 2.20, CPU = 2.29, Cpk = 2.20
Overall Capability: Pp = 2.29, PPL = 2.25, PPU = 2.34, Ppk = 2.25, Cpm = *
Observed, Exp. Within, and Exp. Overall Performance: PPM Total = 0.00 in all cases

25 | MDT Confidential

Not Centered, Stable, Potentially Capable

[Time Series Plot of B: mean near 107, close to the upper limit of 110]

[Process Capability of B]
Process Data: LSL = 90, Target = *, USL = 110, Sample Mean = 106.814, Sample N = 100, StDev (Within) = 1.36089, StDev (Overall) = 1.4326
Potential (Within) Capability: Cp = 2.45, CPL = 4.12, CPU = 0.78, Cpk = 0.78
Overall Capability: Pp = 2.33, PPL = 3.91, PPU = 0.74, Ppk = 0.74, Cpm = *
Observed Performance: PPM > USL = 20000.00
Exp. Within Performance: PPM > USL = 9616.16
Exp. Overall Performance: PPM > USL = 13080.51

26 | MDT Confidential


Centered, Stable, Not Capable

[Time Series Plot of C: mean near 100 with large spread, approaching both limits]

[Process Capability of C]
Process Data: LSL = 90, Target = *, USL = 110, Sample Mean = 100.309, Sample N = 100, StDev (Within) = 4.73733, StDev (Overall) = 4.66247
Potential (Within) Capability: Cp = 0.70, CPL = 0.73, CPU = 0.68, Cpk = 0.68
Overall Capability: Pp = 0.71, PPL = 0.74, PPU = 0.69, Ppk = 0.69
Observed Performance: PPM < LSL = 20000.00, PPM > USL = 30000.00, PPM Total = 50000.00
Exp. Within Performance: PPM < LSL = 14772.82, PPM > USL = 20394.82, PPM Total = 35167.63
Exp. Overall Performance: PPM < LSL = 13515.67, PPM > USL = 18831.51, PPM Total = 32347.18

27 | MDT Confidential

Centered, Unstable, Potentially Capable

[Time Series Plot of D: the mean drifts over time between limits of 90 and 110]

[Process Capability of D]
Process Data: LSL = 90, Target = *, USL = 110, Sample Mean = 100.106, Sample N = 100, StDev (Within) = 1.23073, StDev (Overall) = 3.78355
Potential (Within) Capability: Cp = 2.71, CPL = 2.74, CPU = 2.68, Cpk = 2.68
Overall Capability: Pp = 0.88, PPL = 0.89, PPU = 0.87, Ppk = 0.87, Cpm = *
Observed Performance: PPM Total = 0.00
Exp. Within Performance: PPM Total = 0.00
Exp. Overall Performance: PPM < LSL = 3782.21, PPM > USL = 4459.79, PPM Total = 8242.00

28 | MDT Confidential


Critical Thinking of Data & Analysis is Required for Valid Inferences

DATA / CONDITIONS: How was it collected? At a single point in time, or over multiple time points? Were all sources of variation acting during the data collection timeframe?
ANALYSIS / STATISTICS: Control charts: is variation stable over time? Process capability indices: Cp, Cpk (within, short-term); Pp, Ppk (overall, long-term).
INFERENCE / PREDICTION: Future performance.

QUALIFICATION STUDY: Data collected at a single time point over limited conditions. Therefore, control charts and Ppk may not reflect long-term performance, since the analysis was computed from a short-term data set. Recommend using the study sample size as the subgroup size for capability analysis.
29 | MDT Confidential

How to Evaluate Process Capability


Stability: Compare Cpk to Ppk or Cp to Pp
Centering: Compare Cp to Cpk or Pp to Ppk
Variation: Compare Cp, Pp to 1.0
World Class Performance: Cpk > 2, Ppk > 1.5

How to Improve Process Capability


(1) Make Process Stable
(2) Center Process Mean
(3) Reduce Process Variation
(4)* Widen Specification Limits
* What is required for option #4? What quality system requirement exists
to assure that option #4 is done with scientifically sound rationale?
30 | MDT Confidential


Quality Improvement Process

31 | MDT Confidential

Attribute Data
2.15 Capability: WHAT IS Z?

Z tells how capable Y is relative to specs.

Z      DPMO
6      3.4
5      233
4      6,210
3      66,807
2      308,537
1      691,462

DPMO = defects per million opportunities
Opportunities = Number of Units x Opportunities per Unit (to have a defect)
Defects = number of observed defects in the Number of Units

Attribute Capability Measures:
1) A defect rate or a defective rate (they are the same if there is only one
opportunity per unit for a defect; in this case a defective unit has only one defect)
2) DPMO
3) Z
32 | MDT Confidential
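The three measures can be computed together, assuming the standard conversion Z = Φ⁻¹(1 − DPMO/10⁶) and the 1.5 shift used elsewhere in this chapter (the helper name is ours):

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal

def attribute_capability(defects, units, opps_per_unit=1):
    """DPMO, long-term Z, and Z.ST (long-term Z plus the 1.5 shift)."""
    dpmo = defects / (units * opps_per_unit) * 1e6
    z_lt = nd.inv_cdf(1.0 - dpmo / 1e6)
    return dpmo, z_lt, z_lt + 1.5

# 44 defectives in 50 units (the Freestyle baseline used later in the chapter)
dpmo, z_lt, z_st = attribute_capability(defects=44, units=50)
print(round(dpmo), round(z_lt, 3), round(z_st, 2))
```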


Attribute Data
ROADMAP FOR CAPABILITY (2.15 Capability)

Capability Roadmap: what type of data do you have?
Variables Data: MINITAB: Stat > Quality Tools > Capability Analysis > Normal; Z.st = Z Bench, Potential (Within)
Attribute Data: Minitab: Stat > Quality Tools > Capability Analysis > Binomial
(for defective units, or one defect opportunity per unit)
Warning: you have to manually add 1.5 to the Z from Minitab to get the Z.st
from the Six Sigma Project Guide.
33 | MDT Confidential

Attribute Data
ATTRIBUTE PROCESS CAPABILITY
Opps per unit is the number of opportunities per unit to have this particular defect.
A unit may have more than one opportunity to have a specific defect.
(It is conservative to assume only 1 opportunity of a defect per unit)
The Six Sigma Project Guide is used to carry out the capability calculations.
The icon for this Guide looks like this:

Open up this Project Guide and click on initial capability icon

Note: Z.ST stands for Z short term which is a common measure to use in Six Sigma.

34 | MDT Confidential


Attribute Data (2.15 Capability)

Example: Capability for Attribute Data

Project Goal: Improve Freestyle first-pass yield from 10% to 20%.
50 Freestyles inspected; 44 defective. What is initial capability?

Initial Capability: Defects = 44, Opps per Unit = 1, Units = 50
Z.ST = 0.33, Z.ST 95% Upper = 0.80, Z.ST 95% Lower = -0.19

Based on n = 50, we are 95% confident: Z.ST < 0.80, Z.ST > -0.19

35 | MDT Confidential

Attribute Data (2.15 Capability)

Example: Capability for Attribute Data

Graph of Initial vs. Final Capability (a good way to present capability!)

                     Defects  Opps/Unit  Units  Total Opps  DPMO    Z.ST  Z.ST 95% Upper  Z.ST 95% Lower  Project Goal Z.ST
Initial Capability   44       1          50     50          880000  0.33  0.80            -0.19           0.658
Final Capability     80       1          100    100         800000  0.66  0.95            0.36            0.658

Final Capability: the number of units is arbitrary (since we don't have any final data yet);
# Defects = # Units x Project Goal % = 100 x 80% = 80 (assumes the goal is met)

36 | MDT Confidential


Attribute Data

Example: Capability for Attribute Data (cont.)

Attribute capability can be expressed as:
- a proportion defective with a confidence interval
- a Z with a confidence interval

[Chart: Initial vs. Final Capability, % Defective with 95% Confidence Bounds, against the Project Goal]
% Defective is a good way to explain capability.

[Chart: Initial vs. Final Capability, Z.ST with 95% Confidence Bounds, against the Project Goal]
But in Lean Sigma they like Z.

37 | MDT Confidential

Attribute Data
For baseline capability: 44/50 defective units (one opportunity per unit)
Inputs:
Minitab: Stat>Quality Tools>Capability Analysis>Binomial

38 | MDT Confidential


Attribute Data

For baseline capability: 44/50 defective units (one opportunity per unit)
Outputs: Binomial Process Capability Analysis of Defectives

[P Chart: P-bar = 0.88, UCL = 1, LCL = 0.7421]
Summary Stats (95.0% confidence):
% Defective: 88.00, Lower CI: 75.69, Upper CI: 95.47, Target: 0.00
PPM Def: 880000, Lower CI: 756899, Upper CI: 954665
Process Z: -1.1750, Lower CI: -1.6919, Upper CI: -0.6964

Note: Add 1.5 to the Minitab Z outputs to get Z.st & the CI on Z.st for the baseline 44/50.
39 | MDT Confidential
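The exact confidence intervals in this output can be reproduced with a stdlib-only Clopper-Pearson calculation, solved by bisection on the binomial CDF (a sketch; `clopper_pearson` is our helper, not a Minitab or library call):

```python
from math import comb
from statistics import NormalDist

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p), exact via math.comb."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided CI for a binomial proportion (bisection, stdlib only)."""
    def solve(cond):
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if cond(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    lower = 0.0 if x == 0 else solve(lambda p: 1 - binom_cdf(x - 1, n, p) < alpha / 2)
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p) >= alpha / 2)
    return lower, upper

lo, hi = clopper_pearson(44, 50)        # ~ (0.7569, 0.9547), as in the output above
z = NormalDist().inv_cdf(1 - 44 / 50)   # ~ -1.175, the "Process Z"
print(round(lo, 4), round(hi, 4), round(z, 4))
```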

Attribute Data (2.15 Capability)

Exercise: Capability for Attribute Data

Problem Statement: Expense reporting first-pass yield is too low.
Project Goal: Improve first-pass yield from 70% to 85%.

Submitted Reports: 200
Defects: 61
Opportunities per Report: 1

Task: Determine submitted-reports capability.
Approach: Work alone or in small groups.

40 | MDT Confidential


Attribute Data: Manufacturing Yield

First-Pass Yield (%) by operational step:
FPY = (Good / Attempts) x 100
[Diagram: Attempts enter OP 10; output splits into Good, PRB, Scrap, and Rework]

Rolled-Throughput Yield (%) by product:
RTY = Π(FPYi) = FPY1 x FPY2 x FPY3 x . . .

41 | MDT Confidential
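The RTY product can be sketched in one line (the step yields below are illustrative, not from the course data):

```python
from math import prod

def rolled_throughput_yield(step_fpys):
    """RTY = product of the step first-pass yields (as fractions, not percent)."""
    return prod(step_fpys)

# Three hypothetical steps at 95%, 90%, and 98% FPY
print(rolled_throughput_yield([0.95, 0.90, 0.98]))  # ~0.838
```

Because RTY multiplies yields, even several individually high-yield steps can compound into a noticeably lower product-level yield.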

Individuals Control Chart

[Individuals Control Chart of Daily FPY: X-bar = 96.55, LCL = 90.70]

Purpose: (1) from baseline data, determine threshold limits for prospective monitoring.

42 | MDT Confidential


Individuals Control Chart

[Individuals Control Chart of Daily FPY: X-bar = 96.55, LCL = 90.70]

NOTE: A statistically stable process is in control, displaying a consistent
pattern of variation over time. The variation exhibited by a stable process is
considered to be due to chance or common causes that are inherent to the design
of the system (product and process). Therefore, a stable process is operating
to its full potential by design. If we desire better performance (increased
mean FPY, or reduced variation), then a change to the system is required. What
types of changes may be effective? Who is responsible for executing changes to
the system?
43 | MDT Confidential

Individuals Control Chart


Purpose: (2) quantify process stability by
comparing two estimates of variation:
Long-Term: Sample Standard Deviation
Short-Term: Average Moving Range / 1.128
Stability Index = Long-Term / Short-Term
Process is Unstable When Stability Index >> 1.0
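The stability index described above can be sketched in a few lines, assuming individual values in time order (1.128 is the d2 constant for moving ranges of span 2):

```python
from statistics import stdev

def stability_index(x):
    """Long-term variation (sample stdev) over short-term variation
    (average moving range / 1.128, the d2 constant for spans of 2)."""
    long_term = stdev(x)
    mr_bar = sum(abs(b - a) for a, b in zip(x, x[1:])) / (len(x) - 1)
    return long_term / (mr_bar / 1.128)

# Alternating values exaggerate short-term variation, so the index is below 1;
# shifts or drifts inflate the long-term estimate and push it well above 1.
print(round(stability_index([96, 97, 96, 97, 96, 97]), 2))  # 0.62
```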

44 | MDT Confidential


Example A
I-MR Chart of Daily FPY
[I chart of Daily FPY (%) vs. Day: Xbar = 96.51, LCL = 90.58; Moving Range chart: MR-bar = 2.231, UCL = 7.288, LCL = 0]

I Chart (Long-Term): S = 2.013
MR Chart (Short-Term): S = 2.231 / 1.128 = 1.978
Stability Index = 2.013 / 1.978 = 1.02
45 | MDT Confidential

Example B
I Chart of Daily FPY
[I chart of Daily FPY (%) vs. Day: Xbar = 97.0, LCL = 82.6; out-of-control points flagged]

I Chart (Long-Term): S = 13.45
MR Chart (Short-Term): S = 5.4 / 1.128 = 4.79
Stability Index = 13.45 / 4.79 = 2.81
46 | MDT Confidential


Example B
I Chart of Daily FPY
[I chart of Daily FPY (%) vs. Day: Xbar = 97.0, LCL = 82.6; out-of-control points flagged]

NOTE: Variation that exceeds statistical control limits should be treated as due to the presence of a special cause; local action should be taken to investigate, determine root cause, and prevent recurrences.

I Chart (Long-Term): S = 13.45
MR Chart (Short-Term): S = 5.4 / 1.128 = 4.79
Stability Index = 13.45 / 4.79 = 2.81
47 | MDT Confidential

Example C
I Chart of Daily FPY
[I chart of Daily FPY (%) vs. Day: Xbar = 99.32, LCL = 97.40]

I Chart (Long-Term): S = 3.627
MR Chart (Short-Term): S = 0.72 / 1.128 = 0.638
Stability Index = 3.627 / 0.638 = 5.68
48 | MDT Confidential


Macro View of FPY by Op


[Scatter plot: AVERAGE FPY (96 to 100) vs. STABILITY INDEX (1 to 4) for each operational step; point A sits near FPY 96, stability index 1]
49 | MDT Confidential


Improvement Strategy
[Annotated AVERAGE FPY vs. STABILITY INDEX plot:
Capable but periodically unstable: identify & remove special causes (Daily: MTM/Supervisors)
Stable but chronically less capable: change system via projects (Monthly, Quarterly: Ops Mgmt, Engr)
Point A falls in the stable-but-less-capable region]
50 | MDT Confidential



Non-normal Data
Dataset: DISTSKEW.MTW
Variables: Pos Skew (column B)
Objective: Determine Cpk with Specs: 5-50
Pathway: Stat/Basic Statistics/Graphical Summary
Inputs: select variable Pos Skew to analyze
Is this data normally distributed?
Pathway: Graph/Probability Plot (Test for Normality, default option)
Inputs: select variable Pos Skew to analyze
Two-plot layout: right-click the folder icon on the toolbar (to the left of the "i" toolbar symbol), hold the Control key and left-click the two graph names, then right-click the graph names to open the layout tool and click Finish.
Layout tool results:

Can we compute Cpk?


51 | MDT Confidential

Non-normal Data
CPK FOR NON-NORMAL DISTRIBUTION
Dataset: DISTSKEW.MTW
Variables: Pos Skew (column B)
Box-Cox transformation: Pathway: Stat/Control Charts/Box-Cox
Inputs: all obs in one column/ select variable Pos Skew /Subgroup Size 1
Johnson Transformation: Pathway: Stat/Quality Tools/Johnson Transformation
Inputs: select variable Pos Skew to analyze
Merged layout:

λ = 0.0

52 | MDT Confidential


Non-normal Data
BOX-COX TRANSFORMATION
Box-Cox Table of Transformations
______________________________________________________________________
Lambda     Transformation
______________________________________________________________________
1          No transformation
1/2        Square root
0          Log
-1/2       Reciprocal square root
-1         Reciprocal

Example of Minitab Box-Cox Input Screen with Lambda=0

53 | MDT Confidential

Non-normal Data
CPK WITH TRANSFORMED DATA
What is the Cpk for DistSkew Data Set?
Pathway: Stat/Quality tools/Capability Analysis/Normal
Inputs: select variable Pos skew, subgroup size 1, LSL=5,USL=50
AND click on Box-Cox button and select Use optimal lambda

Recall λ = 0 is the log transformation of the data.


Cpk= __________.
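As a sketch of the same analysis outside Minitab, capability for positively skewed data can be computed on the log scale (lambda = 0) with log-transformed spec limits. The data below are simulated stand-ins for the Pos Skew column, so the resulting Cpk is illustrative only:

```python
import math
import random
from statistics import mean, stdev

random.seed(7)
# Simulated positively skewed (lognormal) stand-in for the Pos Skew column
data = [math.exp(random.gauss(2.7, 0.4)) for _ in range(100)]

lsl, usl = 5.0, 50.0
logged = [math.log(x) for x in data]  # Box-Cox with lambda = 0
m, s = mean(logged), stdev(logged)
# Cpk on the transformed scale uses the log of the spec limits
cpk = min(math.log(usl) - m, m - math.log(lsl)) / (3 * s)
print(f"Cpk (log scale) = {cpk:.2f}")
```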

54 | MDT Confidential


Non-normal Data
CAPABILITY WITH TRANSFORMED DATA
Capability SixPack
Pathway: Stat/Quality tools/Capability Sixpack/Normal
Inputs: select variable Pos skew, subgroup size 1, LSL=5,USL=50
AND click on Box-Cox button and select Use optimal lambda

55 | MDT Confidential

Non-normal Data
CAPABILITY WITH RAW DATA
What is the Capability for DistSkew Data Set?
Pathway: Stat/Quality tools/Capability Analysis/Nonnormal
Inputs: select variable Pos skew, subgroup size 1, LSL=5,USL=50
AND select the distribution radio button, choosing lognormal from the pull-down
Output:

56 | MDT Confidential


Non-normal Data
Capability Normal Branch with Box-Cox vs Log-Normal
1) The ppm Observed stays the same whether you fit the log-normal using the
normal or the non-normal capability branch. In fact, the ppm Observed stays
the same no matter what distribution you fit to the data.
2) The ppm Expected Overall stays the same whether you fit the log-normal
using the normal or the non-normal capability branch.
3) The Ppks can be very different between the normal capability branch (with
the Box-Cox transform) and the non-normal capability branch (using the
lognormal fit), because the non-normal branch uses the ISO definition of Ppk.
4) The non-normal branch reports no Cpk, just Ppk, and it provides no
confidence interval for Ppk either.
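The difference in point 3 can be made concrete. For a lognormal population with assumed (hypothetical) log-scale parameters, the transform-based Ppk and the ISO percentile-based Ppk disagree even with no sampling error:

```python
import math

# Hypothetical lognormal population: log-mean MU, log-sd SIGMA
MU, SIGMA = 2.7, 0.4
LSL, USL = 5.0, 50.0

# (a) Normal branch after a lambda=0 (log) transform: Ppk on the log scale
ppk_transform = min(math.log(USL) - MU, MU - math.log(LSL)) / (3 * SIGMA)

# (b) Non-normal branch, ISO definition: based on the 0.135th, 50th, and
#     99.865th percentiles of the fitted lognormal distribution
x_lo, x_med, x_hi = (math.exp(MU + z * SIGMA) for z in (-3, 0, 3))
ppk_iso = min((USL - x_med) / (x_hi - x_med), (x_med - LSL) / (x_med - x_lo))

print(f"transform-based Ppk = {ppk_transform:.2f}, ISO Ppk = {ppk_iso:.2f}")
```

Both describe the same population; they differ only in how "capability" is defined for a skewed distribution.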

57 | MDT Confidential

Minitab Assistant vs Method Chooser


Minitab Method Chooser Flowchart

58 | MDT Confidential


Minitab Assistant Flow Chart

59 | MDT Confidential

Minitab Assistant: Continuous Data


Minitab Assistant wants 100 or more data points.
Minitab tests (AD) for normality at the .05 level.
Minitab Assistant uses THREE rules to check for
stability of the process:

Test 1: Point outside control limits
Test 2: Nine points in a row on the same side of the centerline
Test 7 (Modified): 12-15 points within one sigma of the centerline

Minitab Assistant info:
http://www.minitab.com/enCN/support/answers/answer.aspx?ID=2613&langType=1033

60 | MDT Confidential


Minitab Assistant

61 | MDT Confidential

Minitab Assistant Normal Distribution


Use CABLE.MTW
Dataset has 100 measurements of the diameter of
a cable wire: 20 hourly samples of n=5
The engineering specification for this diameter is
0.55 +/- 0.05 cm.
Our task is to conduct a capability analysis of this
process.

62 | MDT Confidential


Minitab Assistant Normal Distribution

63 | MDT Confidential

Minitab Assistant Normal Distribution


Capability Analysis for Diameter: Report Card
Stability: The process mean and variation are stable. No points are out of control.
Number of Subgroups: You only have 20 subgroups. For a capability analysis, it is generally recommended that you collect at least 25 subgroups over a long enough period of time to capture the different sources of process variation.
Normality: Your data passed the normality test. As long as you have enough data, the capability estimates should be reasonably accurate.
Amount of Data: The total number of observations is 100 or more. The capability estimates should be reasonably precise.

Capability Analysis for Diameter: Process Performance Report
Process Characterization: Total N 100; Subgroup size 5; Mean 0.54646; StDev (overall) 0.019341; StDev (within) 0.018548
Capability Statistics
Actual (overall): Pp 0.86; Ppk 0.80; Z.Bench 2.29; % Out of spec (observed) 2.00; % Out of spec (expected) 1.10; PPM (DPMO) (observed) 20000; PPM (DPMO) (expected) 10969
Potential (within): Cp 0.90; Cpk 0.83; Z.Bench 2.41; % Out of spec (expected) 0.81; PPM (DPMO) (expected) 8072
Actual (overall) capability is what the customer experiences. Potential (within) capability is what could be achieved if process shifts and drifts were eliminated.

Capability Analysis for Diameter: Diagnostic Report
[Xbar-R chart to confirm that the process is stable; normality plot (the points should be close to the line)]
Normality Test (Anderson-Darling): Pass, P-value 0.794

Capability Analysis for Diameter: Summary Report
Customer Requirements: Lower Spec 0.5; Target *; Upper Spec 0.6
Process Characterization: Mean 0.54646; Standard deviation 0.019341
Actual (overall) capability: Pp 0.86; Ppk 0.80; Z.Bench 2.29; % Out of spec 1.10; PPM (DPMO) 10969
[Capability histogram over 0.50 to 0.60: are the data inside the limits?]
Conclusions: The defect rate is 1.10%, which estimates the percentage of parts from the process that are outside the spec limits.
64 | MDT Confidential


Minitab Assistant: Non-Normal Data

Use TILES.MTW
Choose Minitab Assistant
Capability Analysis
Detects non-normality
and offers the option of
transformation (Box-Cox)

65 | MDT Confidential

Minitab Assistant: Non-Normal Data

Capability Analysis for Warping


Report Card
Stability: The process mean and variation are stable. No points are out of control.
Number of Subgroups: You only have 10 subgroups. For a capability analysis, it is generally recommended that you collect at least 25 subgroups over a long enough period of time to capture the different sources of process variation.
Normality: The transformed data passed the normality test. As long as you have enough data, the capability estimates should be reasonably accurate.
Amount of Data: The total number of observations is 100 or more. The capability estimates should be reasonably precise.

66 | MDT Confidential


Minitab Assistant: Non-Normal Data


Capability Analysis for Warping: Diagnostic Report
[Xbar-S chart to confirm that the process is stable; normality plots for original and transformed data (lambda = 0.50); the points should be close to the line]
Normality Test (Anderson-Darling): Original: Fail, P-value 0.010; Transformed: Pass, P-value 0.574

Capability Analysis for Warping: Summary Report
Customer Requirements: Upper Spec 8; Target *; Lower Spec *
Process Characterization: Mean 2.9231; Standard deviation 1.7860
Actual (overall) capability: Pp *; Ppk 0.75; Z.Bench 2.24; % Out of spec 1.26; PPM (DPMO) 12569
Conclusions: The defect rate is 1.26%, which estimates the percentage of parts from the process that are outside the spec limits.

Capability Analysis for Warping: Process Performance Report
[Capability histogram of the transformed data: are the data below the limit?]
Process Characterization: Total N 100; Subgroup size 10
Capability Statistics
Actual (overall): Pp *; Ppk 0.75; Z.Bench 2.24; % Out of spec (observed) 2.00; % Out of spec (expected) 1.26; PPM (DPMO) (observed) 20000; PPM (DPMO) (expected) 12569
Potential (within): Cp *; Cpk 0.76; Z.Bench 2.28; % Out of spec (expected) 1.12; PPM (DPMO) (expected) 11249
Actual (overall) capability is what the customer experiences. Potential (within) capability is what could be achieved if process shifts and drifts were eliminated.

67 | MDT Confidential

Confidence Limits: NOT IN ASSISTANT


Stat -> Quality Tools -> Capability Analysis ->
Normal
Follow this path if you need to calculate the Lower
Confidence Bound on Cpk or Ppk
Note: The Normal branch has Box-Cox transformations (for
non-normal data) that allows you to get Cpk and Ppk and
confidence intervals for Cpk & Ppk on the transformed scale.
Note: There is no Cpk or a confidence interval for Ppk if you
use the Non-Normal branch.
Note: The Minitab Assistant DOES NOT give confidence limits
for capability indices. It allows you to use the Box-Cox
transform when it detects non-normal data.

68 | MDT Confidential


Confidence Limits: Normal Case

Select one-sided lower limit


69 | MDT Confidential

Confidence Limits: Normal Case


Process Capability of Diameter (using 95.0% confidence)

Process Data: LSL 0.5; Target *; USL 0.6; Sample Mean 0.54646; Sample N 100; StDev (Within) 0.0185477; StDev (Overall) 0.0193414

Potential (Within) Capability: Cp 0.90 (Lower CL 0.78); CPL 0.83; CPU 0.96; Cpk 0.83 (Lower CL 0.71)
Overall Capability: Pp 0.86 (Lower CL 0.76); PPL 0.80; PPU 0.92; Ppk 0.80 (Lower CL 0.69); Cpm * (Lower CL *)

Observed Performance: PPM < LSL 10000.00; PPM > USL 10000.00; PPM Total 20000.00
Exp. Within Performance: PPM < LSL 6124.50; PPM > USL 1947.11; PPM Total 8071.61
Exp. Overall Performance: PPM < LSL 8150.57; PPM > USL 2818.71; PPM Total 10969.28

[Histogram with LSL and USL marked and within/overall fitted curves. Callouts: Cpk with its 95% lower confidence limit, and Ppk with its 95% lower confidence limit.]
70 | MDT Confidential


Confidence Limits: Normal Case


LOWER 95% CONFIDENCE FOR OBSERVED CPK
 Obs                          Sample Size (n)
 Cpk     10     20     30     40     50     75    100    150    200
 0.5   0.24   0.32   0.35   0.37   0.39   0.41   0.42   0.43   0.44
 0.6   0.31   0.40   0.44   0.46   0.47   0.50   0.51   0.53   0.54
 0.7   0.38   0.48   0.52   0.54   0.56   0.59   0.60   0.62   0.63
 0.8   0.44   0.55   0.60   0.63   0.65   0.67   0.69   0.71   0.72
 0.9   0.51   0.63   0.68   0.71   0.73   0.76   0.78   0.80   0.82
 1.0   0.58   0.71   0.76   0.79   0.82   0.85   0.87   0.89   0.91
 1.1   0.64   0.78   0.84   0.88   0.90   0.94   0.96   0.99   1.00
 1.2   0.70   0.86   0.92   0.96   0.99   1.03   1.05   1.08   1.09
 1.3   0.77   0.93   1.00   1.04   1.07   1.11   1.14   1.17   1.19
 1.4   0.83   1.01   1.08   1.13   1.15   1.20   1.23   1.26   1.28
 1.5   0.89   1.08   1.16   1.21   1.24   1.29   1.32   1.35   1.37
 1.6   0.96   1.16   1.24   1.29   1.32   1.37   1.41   1.44   1.46
 1.7   1.02   1.23   1.32   1.37   1.41   1.46   1.49   1.53   1.55
 1.8   1.08   1.30   1.40   1.45   1.49   1.55   1.58   1.62   1.65
 1.9   1.14   1.38   1.48   1.54   1.57   1.64   1.67   1.71   1.74
 2.0   1.21   1.45   1.56   1.62   1.66   1.72   1.76   1.80   1.83
 2.1   1.27   1.53   1.64   1.70   1.74   1.81   1.85   1.89   1.92
 2.2   1.33   1.60   1.71   1.78   1.83   1.90   1.94   1.99   2.01
 2.3   1.39   1.67   1.79   1.86   1.91   1.98   2.03   2.08   2.11
 2.4   1.45   1.75   1.87   1.94   1.99   2.07   2.11   2.17   2.20
 2.5   1.52   1.82   1.95   2.03   2.08   2.16   2.20   2.26   2.29
 2.6   1.58   1.90   2.03   2.11   2.16   2.24   2.29   2.35   2.38
 2.7   1.64   1.97   2.11   2.19   2.24   2.33   2.38   2.44   2.47
 2.8   1.70   2.04   2.19   2.27   2.33   2.42   2.47   2.53   2.57
 2.9   1.76   2.12   2.27   2.35   2.41   2.50   2.56   2.62   2.66
 3.0   1.82   2.19   2.34   2.43   2.50   2.59   2.65   2.71   2.75
 3.1   1.89   2.26   2.42   2.52   2.58   2.68   2.73   2.80   2.84
 3.2   1.95   2.34   2.50   2.60   2.66   2.76   2.82   2.89   2.93
 3.3   2.01   2.41   2.58   2.68   2.75   2.85   2.91   2.98   3.03
 3.4   2.07   2.48   2.66   2.76   2.83   2.94   3.00   3.07   3.12
 3.5   2.13   2.56   2.74   2.84   2.91   3.02   3.09   3.16   3.21
 3.6   2.19   2.63   2.82   2.92   3.00   3.11   3.18   3.25   3.30
 3.7   2.25   2.71   2.89   3.01   3.08   3.20   3.26   3.34   3.39
 3.8   2.32   2.78   2.97   3.09   3.16   3.28   3.35   3.44   3.48
 3.9   2.38   2.85   3.05   3.17   3.25   3.37   3.44   3.53   3.58
 4.0   2.44   2.93   3.13   3.25   3.33   3.46   3.53   3.62   3.67

71 | MDT Confidential
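The tabulated lower bounds above can be approximated with the widely used normal-approximation formula Cpk_L = Cpk - z * sqrt(1/(9n) + Cpk^2/(2(n-1))); this is a sketch, and values may differ from the table in the last digit:

```python
import math
from statistics import NormalDist

def cpk_lower_bound(cpk, n, conf=0.95):
    """Approximate one-sided lower confidence bound for an observed Cpk."""
    z = NormalDist().inv_cdf(conf)
    return cpk - z * math.sqrt(1 / (9 * n) + cpk ** 2 / (2 * (n - 1)))

# Spot-check against the table for an observed Cpk of 1.0
for n in (10, 30, 100):
    print(n, round(cpk_lower_bound(1.0, n), 2))  # 0.58, 0.76, 0.87
```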

EXERCISE: Confidence Bound for Cpk


Simulation Study of Cpk:
1) Simulate 10,000 rows with 5 columns of a normal distribution with
mean = 10 and std. dev. = 1.
2) Compute the mean and standard deviation for each row.
3) Use LSL = 7, USL = 13, and compute Cpl & Cpu for each row.
4) Take the min of Cpl & Cpu to get Cpk for each row.
5) Make a histogram of the simulated Cpks.
Does the distribution of simulated Cpks look normal?
What should the theoretical Cpk be from the mean, standard deviation,
and specs?
What is the distribution of Cpk lower bounds? How often does the
lower confidence bound contain the true value for Cpk?
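One way to carry out this simulation outside Minitab is sketched below (standard library only; the seed is arbitrary):

```python
import random
from statistics import mean, stdev

random.seed(1)
LSL, USL = 7.0, 13.0
cpks = []
for _ in range(10_000):
    row = [random.gauss(10, 1) for _ in range(5)]
    m, s = mean(row), stdev(row)
    cpks.append(min(USL - m, m - LSL) / (3 * s))  # Cpk = min(Cpu, Cpl)

# True Cpk is 1.0; with n=5 the estimates are right-skewed and biased high
print(f"mean = {mean(cpks):.3f}, stdev = {stdev(cpks):.3f}")
```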

72 | MDT Confidential


Simulation Results for n=5


Summary for Sample Cpk Estimated from n=5
Mu=10, Sigma=1, LSL=7, USL=13, Population Cpk = 1.0

Anderson-Darling Normality Test: A-Squared = 436.22; P-Value < 0.005
Mean 1.1036; StDev 0.5573; Variance 0.3106; Skewness 2.7486; Kurtosis 13.4533; N 10000
Minimum 0.3191; 1st Quartile 0.7542; Median 0.9676; 3rd Quartile 1.2816; Maximum 7.8123
95% Confidence Interval for Mean: (1.0926, 1.1145)
95% Confidence Interval for Median: (0.9589, 0.9761)
95% Confidence Interval for StDev: (0.5497, 0.5651)

[Histogram of the 10,000 simulated Cpk values, with 95% confidence interval plots for the mean and median]

In theory, Cpk = min( (13-10)/(3*1), (10-7)/(3*1) ) = 1.000


73 | MDT Confidential

Simulation Results for n=5, 10, 20, 30, 50, 100


[Histograms (with normal fits) of 10,000 simulated Cpk values for each of n = 5, 10, 20, 30, 50, 100. As n grows, the mean of the simulated Cpk values approaches the true value of 1.0 (1.104 at n=5, 1.007 at n=10, roughly 0.98 for n >= 30) and the spread shrinks (StDev 0.5573 at n=5 down to 0.07376 at n=100).]
74 | MDT Confidential


Simulation Results for n=5, 10, 20, 30, 50, 100


Boxplot of Sample Cpk Estimates by Sample Size
[Boxplots of the sample Cpk estimates for n = 5, 10, 15, 20, 25, 30, 50, 100; the skew and spread shrink as n increases]

Assumptions: Normal; Mu = 10; Sigma = 1; LSL = 7; USL = 13; True Cpk = 1.0
75 | MDT Confidential

Simulation Results for n=5, 10, 20, 30, 50, 100


Boxplot of Cpk Lower Bounds by Sample Size
[Boxplots of the 95% lower confidence bounds for Cpk for n = 5, 10, 15, 20, 25, 30, 50, 100]

Assumptions: Normal; Mu = 10; Sigma = 1; LSL = 7; USL = 13; True Cpk = 1.0
76 | MDT Confidential


Performance of Cpk Lower Confidence Bound


95% Lower Confidence Bound for Cpk: Miss Rate vs. Population Mean
(LSL = 7, USL = 13, target mean = 10; population sigma varies to make Cpk = 1.0)

[Plot: miss rate (0.00% to 6.00%) vs. true population mean (7.0 to 10.0), one curve per N = 5, 10, 15, 20, 25, 30, 50, 100, with the nominal 5.00% rate marked]
77 | MDT Confidential

Confidence Limits: Normal Case


LOWER 95% CONFIDENCE FOR OBSERVED CPK

Simulation Results
Formula is conservative when process
mean is on target (better than 95%
coverage of true Cpk value).
As process mean deviates from target,
formula provides approximately the
stated reliability in performance (95%),
regardless of sample size.
78 | MDT Confidential


Relationship Between Cpk & Tolerance


Intervals (Confidence/Reliability Levels)

79 | MDT Confidential

Attribute Sample Sizes Using c=0 Plans


NUMBER OF TESTS WITHOUT FAILURE VS RELIABILITY AND CONFIDENCE
                                 Confidence Level (%)
Reliability      50      60      70      75      80      85      90      95    97.5      99    99.5    99.9
0.999999     693147  916291 1203973 1386294 1609438 1897120 2302584 2995731 3688878 4605168 5298315 6907752
0.99999       69315   91629  120397  138629  160943  189712  230258  299572  368887  460515  529830  690773
0.9999         6932    9163   12040   13863   16094   18971   23025   29956   36887   46050   52981   69075
0.999           693     916    1204    1386    1609    1897    2302    2995    3688    4603    5296    6905
0.998           347     458     602     693     804     948    1151    1497    1843    2301    2647    3451
0.997           231     305     401     462     536     632     767     998    1228    1533    1764    2300
0.996           173     229     301     346     402     474     575     748     921    1149    1322    1724
0.995           139     183     241     277     322     379     460     598     736     919    1058    1379
0.994           116     153     201     231     268     316     383     498     613     766     881    1148
0.993            99     131     172     198     230     271     328     427     526     656     755     984
0.992            87     115     150     173     201     237     287     373     460     574     660     861
0.991            77     102     134     154     179     210     255     332     409     510     587     765
0.99             69      92     120     138     161     189     230     299     368     459     528     688
0.98             35      46      60      69      80      94     114     149     183     228     263     342
0.97             23      31      40      46      53      63      76      99     122     152     174     227
0.96             17      23      30      34      40      47      57      74      91     113     130     170
0.95             14      18      24      28      32      37      45      59      72      90     104     135
0.94             12      15      20      23      27      31      38      49      60      75      86     112
0.93             10      13      17      20      23      27      32      42      51      64      74      96
0.92              9      11      15      17      20      23      28      36      45      56      64      83
0.91              8      10      13      15      18      21      25      32      40      49      57      74
0.9               7       9      12      14      16      19      22      29      36      44      51      66
0.8               4       5       6       7       8       9      11      14      17      21      24      31
0.7               2       3       4       4       5       6       7       9      11      13      15      20
0.6               2       2       3       3       4       4       5       6       8      10      11      14
0.5               1       2       2       3       3       3       4       5       6       7       8      10
80 | MDT Confidential
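The table entries follow from the c=0 relationship confidence = 1 - reliability^n; a minimal sketch:

```python
import math

def c0_sample_size(confidence, reliability):
    """Smallest n with 1 - reliability**n >= confidence, i.e. the number of
    tests without failure needed to demonstrate `reliability` at `confidence`."""
    return math.ceil(math.log(1 - confidence) / math.log(reliability))

print(c0_sample_size(0.95, 0.95))  # 59
print(c0_sample_size(0.95, 0.99))  # 299
print(c0_sample_size(0.90, 0.99))  # 230
```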


Producers Risk Of Using c=0 Plans


Sample Size Required

Reliability Level Required to Pass

0 Failures Allowed
Reliability
0.999
0.997
0.99
0.95
0.90
0.80

Confidence
90
95
2302
767
230
45
22
11

2995
998
299
59
29
14

50% Chance 95% Chance


Confidence
90
95

Reliability
0.999
0.997
0.99
0.95
0.90
0.80

0.99970
0.99910
0.9970
0.9847
0.9690
0.9389

Confidence
90
95

0.99977
0.99931
0.9977
0.9883
0.9764
0.9517

0.99998
0.99993
0.9997
0.9989
0.9977
0.9950

0.99998
0.99995
0.9998
0.9991
0.9982
0.9963

NOTE: The term "reliability" in a compliance-testing context refers to
conformance to design requirements, not to the actual device field
performance level. The difference is due to unaccounted design margin
between spec limits and the variation required to degrade field performance.

c=0 Plans Maximize The Chances Of A Good Process Failing The Study
81 | MDT Confidential

The Big Picture of Compliance Testing


BEFORE: Characterization Studies
Inject sources of variation to stress the system; Experimentation (DOE); Simulation modeling; Measure design margin; Limited conditions; Optimization

DURING: Qualification Studies
n delivers conf%/rel%; Representative sample?; One time point

AFTER: Process Stability
All sources of variation will be acting over the long term in the future; Need to detect significant changes
82 | MDT Confidential


How To Move Away From A


Compliance Testing Culture Toward
A Capability Culture
Identify Critical Requirements
1. Perform thorough characterization studies; inject
sources of variation, test to failure
2. Demand variables data, system performance modeling,
measure design margin, robust design, optimization:
then you can skip compliance testing!
3. If variables data is unavailable, challenge that!
4. For attribute data: select risk-based confidence/reliability
levels, perform compliance testing or else cite the work
done during characterization!
5. Control processes to ensure that our system robustness
does not deteriorate over time and that we are alerted to
assignable causes of variation if they occur
83 | MDT Confidential

Impact of Measurement Error


(Imprecision)

σ²(observed) = σ²(product) + σ²(measurement error)


Process capability study variation is inflated by measurement
error (gage repeatability & reproducibility).
Therefore, if an independent gage R&R study has been
completed, then subtract the measurement error from the
observed process capability variation to estimate true product
variation:

σ²(product) = σ²(observed) - σ²(measurement error)
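The variance subtraction can be sketched in one function; the sigma values below are hypothetical:

```python
import math

def product_sigma(s_observed, s_measurement):
    """Back out true product variation: variances (not sigmas) subtract."""
    return math.sqrt(s_observed ** 2 - s_measurement ** 2)

# Hypothetical values: observed capability-study sigma and gage R&R sigma
print(product_sigma(1.00, 0.60))  # 0.8
```

Note that measurement error always inflates the observed sigma, so the true product capability is at least as good as the observed capability.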


84 | MDT Confidential


Capability Analysis: Summary


PROCESS CAPABILITY: THE NATURAL VARIABILITY IN A PROCESS.
VARIABLES DATA
Cp, Pp:
Measure of process potential (for a centered process)
Cpk, Ppk: Measure of actual process capability
PPM estimates from PROCESS CAPABILITY INDICES ASSUME THAT
THE PROCESS IS STABLE AND FOLLOWS A NORMAL (BELL-SHAPED)
DISTRIBUTION.
Make sure there are no shifts or trends
If the data are not normal try other parametric distributions (Weibull or
lognormal) or Box-Cox transformation
If those fail, consider the data as attribute
Consider the impact of sample size and how the data was collected
(short-term vs. long-term) when making inferences use confidence
bounds to incorporate uncertainty in estimates
ATTRIBUTE DATA
The proportion (p bar) from the P chart is the process capability.
85 | MDT Confidential

Summary Quiz
True or False
___________ You can ignore plotting the data and just compute Ppk.
___________ The Total Exp. Overall ppm are the same for log-normal data in Minitab for both of these approaches: 1) the Normal Capability branch (with lambda=0) and 2) the non-normal Capability branch, selecting Log-normal.
___________ The smaller the sample used to compute Ppk, the better. It is less work to collect the data.

86 | MDT Confidential


Summary And Recap


Measuring Process Capability

Sigma Scale, Z scores, DPM=PPM

Process Capability Indices

(Cp, Cpk, Pp, Ppk)


Impact of Normality & Process Stability
Attribute Data
Non-normal Data
Minitab Assistant
Impact of Sample Size (Confidence Limits)
Comparison to Tolerance Intervals
Impact of Measurement Error

87 | MDT Confidential


Chapter 4B:
Tolerance Intervals

Topics
Tolerance Intervals
Calculations
Sample Size

2 | MDT Confidential

Statistical Tolerance Intervals


From the Medtronic Handbook of Statistics:
For variables data, a statistical tolerance
interval places limits on the variation expected
in individual items from a population.
A tolerance interval is described by two
parameters: confidence level and population
fraction (sometimes called reliability, for
fraction meeting spec(s) )

3 | MDT Confidential

Tolerance Intervals New in Minitab 16


A new feature in Minitab 16 is the calculation of
tolerance intervals using a normal-distribution
assumption.
The normal distribution assumption is critical.
Unlike confidence intervals, which are somewhat
unaffected by lack of normality, tolerance intervals
are completely dependent upon it.

4 | MDT Confidential

Normal distribution Tolerance Interval

5 | MDT Confidential

Statistical Tolerance Intervals


From the Medtronic Handbook of Statistics, contd:
If the data is not normal, transformations should be
tried to obtain normality. For example, if the data
were lognormal then tolerance intervals could be
constructed on the log of the data.
If the underlying population distribution is known but
is not normal then reliability/distribution analysis
techniques can be used.
Tolerance Intervals generally should be
Two-sided if the specification is two-sided
One-sided if the specification is one-sided

6 | MDT Confidential

Tolerance Interval Calculation


First determine appropriate data distribution or
transformation
For Normal distribution or transformation to
normal distribution
Use Stat -> Quality Tools -> Tolerance Interval

For other distribution (e.g. Weibull)


Use Stat -> Reliability/Survival -> Parametric
Distribution Analysis

7 | MDT Confidential

Example: Tolerance Intervals


Use Ch1DataFile.mtw
Variables TubeTensile1,
TubeTensile2, TubeTensile3

8 | MDT Confidential

Step 1: Identify Distribution


Stat -> Basic Statistics
-> Normality Test

9 | MDT Confidential

Step 2: Calculate tolerance bound


Lower Tolerance Bound for TubeTensile1

Tolerance Interval Output TubeTensile1

11 | MDT Confidential

A Very Confusing Output in Minitab 16


Tolerance Interval Plot for Tensile Bond 1_1
95% Lower Bound; At Least 95% of Population Covered
Statistics: N 30; Mean 41.063; StDev 10.127
Normal lower bound: 18.583
Nonparametric lower bound: 7.560
Normality Test (Anderson-Darling): AD 0.772; P-Value 0.040
[Tolerance interval plot with both bounds marked, plus a normal probability plot]

Tell everyone you know who uses Minitab:
The 95%/95% statement on the display ONLY applies to the Normal-distribution interval, not the Nonparametric interval.
Must look in the Session Window.
12 | MDT Confidential

Try this using Summarized Data

13 | MDT Confidential

Try using other Sample Sizes

95/95 Nonparametric One-sided requires n=59


95/95 Nonparametric Two-sided requires n=93
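These nonparametric sample sizes can be reproduced from standard order-statistic coverage results (a sketch, not Minitab's code):

```python
def n_one_sided(conf=0.95, rel=0.95):
    """Smallest n so the sample minimum is a conf/rel lower tolerance bound:
    P(coverage >= rel) = 1 - rel**n."""
    n = 1
    while 1 - rel ** n < conf:
        n += 1
    return n

def n_two_sided(conf=0.95, rel=0.95):
    """Smallest n so (min, max) is a conf/rel tolerance interval; the coverage
    of (x(1), x(n)) is Beta(n-1, 2), giving
    P(coverage >= rel) = 1 - n*rel**(n-1) + (n-1)*rel**n."""
    n = 2
    while 1 - n * rel ** (n - 1) + (n - 1) * rel ** n < conf:
        n += 1
    return n

print(n_one_sided(), n_two_sided())  # 59 93
```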
14 | MDT Confidential

Exercise
Compute 95/95 lower tolerance bounds for
TubeTensile2, TubeTensile3

Compute 95/90 lower tolerance bounds for


TubeTensile2, TubeTensile3

Compute 95/95 two-sided tolerance intervals for


TubeTensile2, TubeTensile3

15 | MDT Confidential

Using Summarized Data option to Evaluate


Sample Size for Tolerance Intervals
Imagine having the following historical data on
the pull strength of an electrode to plan a
future study using Tolerance Intervals

A normal distribution assumption is appropriate


The historical mean is 4.92 lbs
The historical standard deviation is 0.87 lbs
The lower specification limit for pull strength is
2.0 lbs

16 | MDT Confidential

Sample Size Evaluation for Normal-Distribution Tolerance Intervals


Ask:
How likely are these results to predict the
results of the future study?
Will the future study run under the same
conditions? Worst case?
Would that affect the mean or standard
deviation we expect?

Sample Size for Normal Distribution Tolerance


Intervals
For example, we might decide to use a larger
standard deviation value, say 1.10
(approximately 25% larger), as the planning
value
Need to know the confidence and reliability to
demonstrate. For example, let's use 95%
confidence and 95% reliability.
Start with n=30 and see if that sample size
would be large enough . . .

18 | MDT Confidential

Sample Size for Normal Distribution


Tolerance Intervals

Since the one-sided tolerance bound is above the specification value of 2.0, n=30 is large enough

Now try smaller sample sizes . . .

n=14 is the smallest sample size that produces an interval above 2.0
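The n=30 check can be sketched with an approximate one-sided tolerance factor (a Natrella-style approximation; Minitab uses the exact noncentral-t factor, which is slightly larger, so the smallest passing n can differ by a step):

```python
import math
from statistics import NormalDist

def k_one_sided(n, conf=0.95, rel=0.95):
    """Approximate one-sided normal tolerance factor (Natrella-style);
    exact noncentral-t factors are slightly larger."""
    z_rel = NormalDist().inv_cdf(rel)
    z_conf = NormalDist().inv_cdf(conf)
    a = 1 - z_conf ** 2 / (2 * (n - 1))
    b = z_rel ** 2 - z_conf ** 2 / n
    return (z_rel + math.sqrt(z_rel ** 2 - a * b)) / a

# Planning values from the example: mean 4.92, inflated sigma 1.10, LSL 2.0
mean_, sigma, lsl = 4.92, 1.10, 2.0
lower_bound = mean_ - k_one_sided(30) * sigma
print(f"n=30: 95/95 lower tolerance bound = {lower_bound:.2f} (spec is {lsl})")
```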


Exercises
Choose a sample size for
Normal distribution tolerance interval
One-sided specification: Min 3 lbf
Planning data: TubeTensile3

Choose a sample size for


Normal distribution tolerance interval
Two-sided specification: 3.5 to 4.0
Planning data: Spacing4

21 | MDT Confidential

Tolerance Interval for Non-Normal


Distributions
If a normal-distribution model is not appropriate
for the data, then either
Transform the data to Normal
Use the (normal distribution) Tolerance Interval module

Or identify a non-normal distribution (e.g. Weibull)


Use Stat -> Reliability/Survival -> Parametric Distribution
Analysis
Use confidence intervals on percentiles to determine the
Tolerance Interval Limits

22 | MDT Confidential


Tolerance Intervals via Reliability/Survival Menu


One-sided
Lower 95% / 95%:
Calculate one-sided lower
95% confidence bound on
the 5th percentile

Upper 95% / 95%:


Calculate one-sided upper
95% confidence bound on
the 95th percentile

Two-sided
Two-sided 95% / 95%:
Calculate two-sided
confidence intervals for
2.5th and 97.5th percentiles.
Lower bound is the lower
95% bound on the 2.5th
percentile
Upper bound is the upper
95% bound on the 97.5th
percentile

23 | MDT Confidential

Weibull Tolerance Interval


Use Stat -> Quality
Tools -> Individual
Distribution
Identification
Data was randomly
generated from
Weibull with shape 2
and scale 25.
All except Normal fit
well
Imagine that due to
subject-matter
knowledge, Weibull
is believed to be the
best model

24 | MDT Confidential


Tolerance Intervals via Reliability/Survival

25 | MDT Confidential

Weibull 95/95 Lower Bound

26 | MDT Confidential


Weibull 95/95 Two-sided


Tolerance Interval

Interval is
2.03 to 61.51
27 | MDT Confidential

Sample size for Weibull Tolerance Interval


See Medtronic Corporate Statistical Resources
Work Aid #2

28 | MDT Confidential


Summary and Review


Tolerance Intervals
Calculations
Sample Size

29 | MDT Confidential


General Linear Models (GLM)


I feel like Im regressing

LeRoy Mattson
Jeremy Strief

Objectives
Understand how GLM is a generalization of
ANOVA and regression
Understand three primary concepts within GLM
models
Fixed vs. Random effects
Nesting vs. Crossing
Covariate (Continuous) vs. Factor (Attribute)

Fit GLM in Minitab

2 | MDT Confidential

Recap from Quality Trainer


One-Way ANOVA
Two-Way ANOVA
Correlation & Regression

3 | MDT Confidential

Statistical Tools for Analyzing Key Xs

                            X: Variables (continuous)     X: Attribute (discrete)
Y: Variables (continuous)   Regression, Multiple          t-test (1 X, 2 levels),
                            Regression, GLM               One-way ANOVA, GLM
Y: Attribute (discrete)     Logistic Regression           Chi Square, Logistic Regression

General Linear Models

GLM: Concepts
GLM: Variable Y One Attribute X
GLM: Variable Y Two Attribute Xs
GLM: Variable Y Mixture of Attribute & Variable Xs

GLM Introduction
GLM stands for General Linear Model
A flexible, unified approach to regression and
ANOVA.
Needed when building a Y=f(X) transfer function whose
input variables don't match a standard
regression or ANOVA approach:
Regression assumes continuous Xs
ANOVA treats Xs as attributes, and it often requires a
balanced experimental design in Minitab
What if your dataset does not fit into the ANOVA or
Regression mold?
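One way to see how GLM unifies the two approaches: every X, continuous or attribute, becomes one or more columns of a single model matrix. A minimal sketch using the pin-pull variables from the next slides; the 0/1 dummy coding here is purely illustrative (Minitab's GLM uses -1/0/+1 effect coding by default):

```python
def design_row(hole_diameter, fillet_style, solder_size):
    """One model-matrix row: intercept, covariate entered as-is, and 0/1
    indicator columns for the attribute factors. (Illustrative coding only;
    Minitab's GLM defaults to -1/0/+1 effect coding.)"""
    return [
        1.0,                                           # intercept
        hole_diameter,                                 # continuous X enters directly
        1.0 if fillet_style == "two-sided" else 0.0,   # fillet indicator
        1.0 if solder_size == "large" else 0.0,        # solder indicator
    ]

row = design_row(18.5, "two-sided", "small")
print(row)  # [1.0, 18.5, 1.0, 0.0]
```

Once everything is a column in one matrix, least squares fits the whole model at once, which is why GLM has no trouble with unbalanced cell counts.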
6 | MDT Confidential

Motivating Example
Pin Pulls.mtw

MECC began collecting data around pull strength


for a particular component.
Due to the nature of the investigation and due to
resource constraints, it was not possible to
execute a formal DOE.
Data were collected over a series of months, and
sample sizes were not equally distributed across
all the engineering conditions of interest. (So the
dataset is unbalanced, in DOE language.)

7 | MDT Confidential

Motivating Example
Response variable (Y): Pull Strength
Predictor Variables (Xs):
Hole diameter: 17.5, 18.5, or 19.5
Fillet Style: one-sided or two-sided
Solder size: small or large

Fillet style and Solder size are attribute metrics


Hole diameter is a variables/continuous metric

8 | MDT Confidential

Data are unbalanced


Tabulated statistics: hole diameter, 1 or 2 sided fillet, solder size
Rows: hole diameter   Columns: 1 or 2 sided fillet

Results for solder size = 1

            1     2   All
17.5        4     3     7
18.5        0     0     0
19.5        4     2     6
All         8     5    13

Results for solder size = 2

            1     2   All
17.5        9     4    13
18.5        4     0     4
19.5       16    12    28
All        29    16    45

9 | MDT Confidential

How to Analyze in Minitab?


With multiple Xs of various types, GLM is the
only method which can be used to analyze the
data in Minitab
JMP also offers flexible modeling platforms
through Custom Design and Fit Model

10 | MDT Confidential

Three Main Concepts in GLM


Predictor variables (Xs) can be characterized in
three ways:
Fixed vs. Random effects
Nesting vs. Crossing
Covariate (Continuous) vs. Factor (Attribute)

11 | MDT Confidential

An Unfortunate Naming Convention


In statistical literature, there are two types of models
whose names are confusingly similar.
The General Linear Model is the main topic of today's talk.
Y is continuous
X can be continuous or categorical

The Generalized Linear Model is a further abstraction of


the General Linear Model.
Y can be continuous or categorical
X can be continuous or categorical
Subcategories of Generalized Linear Models are
Logistic regression for a binary Y
Poisson regression for a count-based Y
General linear model for a continuous Y

The Advanced SME class will focus on the General Linear


Model in Ch 5 and on Logistic Regression in Ch 6.
12 | MDT Confidential

General Linear Models

GLM: Concepts
GLM: Variable Y One Attribute X
GLM: Variable Y Two Attribute Xs
GLM: Variable Y Mixture of Attribute & Variable Xs

Topics to be covered
GLM: Variable Y One Attribute X
One-way ANOVA (review)
GLM approach
Random effect vs. Fixed effect model

14

One Attribute X Example


Project Goal : Reduce late
deliveries (>36 hrs.) from suppliers
MINITAB SupplierLT.mtw

15

One attribute X: Example

Is there a practical difference among several suppliers?

% of Lead Time variance explained by variation in Supplier means

16

GLM approach to one attribute X


Model: yij = μ + ai + eij

where i represents factor level for A

17

Minitab Output

Expected lead time for Blitz: Y = 35.092 - 7.323(1) = 27.769

Expected time for Hare: Y = ?
Expected time for Wild: Y = 35.092 - 7.323(-1) + 5.023(-1) - 3.134(-1) + 7.716(-1) + 1.686(-1) = 31.125
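The arithmetic above follows from Minitab's effect coding: each listed supplier has +1 on its own effect column, and the omitted baseline supplier (Wild) has -1 on every column. A small sketch that reproduces the two worked predictions; the assignment of the last four effects to specific suppliers is an assumption (only Blitz's coefficient is identified on the slide), though it does not affect the Wild calculation, and Hare is left as the quiz question:

```python
# Effect-coded one-factor GLM predictions using the slide's coefficients.
constant = 35.092
effects = [-7.323, 5.023, -3.134, 7.716, 1.686]  # Blitz is first; order of the rest assumed

blitz = constant + effects[0] * 1   # Blitz's own column = +1, all others = 0
wild = constant - sum(effects)      # baseline level: -1 on every effect column
print(round(blitz, 3), round(wild, 3))  # 27.769 and ~31.124 (slide's 31.125 up to rounding)
```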
18

GLM: multiple comparisons

19

Multiple comparison

20


Capabilities of ANOVA vs GLM


Capability                                                  ANOVA   GLM
Can fit unbalanced data                                     no*     yes
Can specify factors as random and obtain
  expected mean squares                                     yes     yes
Fits covariates                                             no      yes
Performs multiple comparisons                               no*     yes

* Except for one-way ANOVA

21

GLM has some limits


Just like the one-way ANOVA:
residuals should be distributed normally
residuals should not have a pattern when plotted against the predicted Y
residuals should not have a pattern when plotted in run order

Just like regression:
one should check that factors aren't highly correlated
one should simplify the model.


22


What are Random Effects?


Random X
X is random factor when levels of X are randomly chosen from a
population of possible levels.
Inferences are made on the overall population of Xs, rather than
on the specific levels chosen for the experiment.
Random effect models focus on estimating variance components.
How much variation in Y is due to X? There is less concern with
estimating the mean for any particular level of X.
Example: Selecting a random sample of 3 operators and a random
sample of 5 parts for Gage R&R Study in MSA

23

What are Fixed Effects?


Fixed X
The specific levels used in the experiment will be
controlled and replicated in a real manufacturing
situation.
There are only a few discrete levels of X which are of scientific interest, or there are only a few discrete levels of X which can actually be produced in the real world.
We are specifically interested in estimating the
mean value of Y for a given value of X.

24 | MDT Confidential


Fixed vs. Random Quiz


1. MECC wishes to understand the impact of two
different material suppliers upon weld penetration.
Based on the specific performance of each supplier,
MECC intends to establish a long-term contract with
one or both suppliers.

Supplier is a _____ effect for the response of weld


penetration.

2. In a Gage R&R study, we select three operators


from a pool of 30. We are not interested in the
specific performance of the 3 operators in the
experiment; we wish to understand the variability
due to operator.

Operator is a _____ effect.

25 | MDT Confidential

Common Examples in Manufacturing


Fixed Effects:

Designs
Suppliers
Material types
Controllable process settings (e.g. laser power,
position, etc.)

Random Effects:

Lots
Operators
Subsampling from a finite population of levels
Noise variables (uncontrollable aspects of a process)

26 | MDT Confidential


Random Effect vs. Fixed Effect

Example: Fiber Strength Data

Model: yij = μ + ai + eij

Objective of Random Effect Model:

MINITAB

Loom.mtw

Var(y) = σa² + σ²
Random Effect Model
Estimate σa² & σ²
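For a balanced one-way random-effects model, the classical (method-of-moments) estimates come directly from the ANOVA mean squares: E[MS_A] = σ² + n·σa² and E[MS_error] = σ², where n is the number of observations per level. A minimal sketch; the mean squares below are hypothetical illustration values, not the Loom.mtw results:

```python
def variance_components(ms_a, ms_error, n):
    """Method-of-moments variance component estimates for a balanced
    one-way random-effects ANOVA:
    E[MS_A] = sigma^2 + n*sigma_a^2,  E[MS_error] = sigma^2."""
    sigma2 = ms_error
    sigma_a2 = max((ms_a - ms_error) / n, 0.0)  # truncate negative estimates at 0
    return sigma_a2, sigma2

# Hypothetical mean squares with n = 4 observations per level:
sigma_a2, sigma2 = variance_components(ms_a=29.7, ms_error=1.9, n=4)
print(round(sigma_a2, 2), sigma2)  # 6.95 1.9
```

Minitab's GLM with the factor declared random reports the same expected-mean-squares decomposition, which is what the "Compare to the manual results" slide shows.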

27

One-way ANOVA for fiber strength data

nσa²

28

14

GLM for Random Effect Model


MINITAB

Loom.mtw

29

Compare to the manual results

30


General Linear Models

GLM: Concepts
GLM: Variable Y One Attribute X
GLM: Variable Y Two Attribute Xs
GLM: Variable Y Mixture of Attribute & Variable Xs

Topics to be covered
GLM: Variable Y Two Attribute Xs
Two-way ANOVA
GLM approach
Crossed vs. Nested design

32


Example: Two Attribute Xs


Problem Statement: Customer service call center staffing often too
high (waste) or too low (low customer satisfaction).
Project Goal: Improve call center forecast accuracy. Accurate
forecast is within 20 calls of actual.
Path Y: Calls Received
Xs
Day (Monday to Friday)
Shift
1 (21:00-3:00)
2 (3:00-9:00)
3 (9:00-15:00)
4 (15:00-21:00)

MINITAB

Call Center Attribute XR.mtw

Monday begins
at 21:00 on
Sunday, etc.
33

ANOVA approach
[ANOVA dialog: Y as response, Xs as factors; the interaction between Xs is included by default]

34


ANOVA Output
yijk = μ + ai + bj + abij + eijk

35

General Linear Model Approach


Y

Xs

36


General Linear Model, cont.

p < 0.05 (Day and Shift are Key Xs)

p < 0.05 (Day*Shift interaction is significant)

37

GLM: Main Effect Plot

38


GLM: Main Effect Plot, cont


[Main Effects Plot (fitted means) for Calls Received: Day panel (Mon-Fri) and Shift panel (1-4), mean calls roughly 20 to 90]

Since interaction is significant, these plots do not tell the whole story!

Each point is the mean number of calls received for that day (Day panel) or for that shift (Shift panel)

# of Calls received decreases by day of week

# of calls received is lower for 1st shift

39

GLM: Interaction Plot

40


GLM: Interaction Plot, continued

Interaction =
Lines NOT
Parallel

Each line is a
different day

Each line is a
different shift

Shift 1 appears to have more calls on Monday than other days


Since p < 0.05 for Day*Shift, this observed interaction is
significant
Effect of Shift depends on Day. Effect of Day depends on Shift.
41

Example: Statistical Impact of X on Y: η²

η²Shift = 29278.4/39776.0 = 73.6% of the variation in calls received

η²Day = 1473.8/39776.0 = 3.7% of the variation in calls received

η²Day*Shift = 5412.4/39776.0 = 13.6% of the variation in calls received


42


Exercise: All Xs Attributes Y Variables


MINITAB

Days Overdue.mtw

Project Goal: Improve On Time Delivery to Customer


Project Strategy: Path Y = Days Overdue
Xs:
X1 = Product (1 or 2)
X2 = Priority (1 to 4), 1 = Highest Priority, 4 = No Priority
Task:

Perform ALL steps of Analyze using the data

Approach:

Work alone or in small groups.


15 Minutes

43

Exercise Debrief
Solution:
What are the key Xs?
What is the relationship between the key Xs and Y
What is the impact of the key Xs on Y?

What was difficult?

44


Residuals Verify Assumptions


MINITAB

Days Overdue.mtw

45

Residuals Verify Assumptions


[Residual Plots for Missed Days: normal probability plot, histogram, residuals vs. fitted values, residuals vs. observation order]

Normal probability plot: verify normality assumption (want fit to line)
Residuals vs. fitted values: verify equal-variance assumption (want no patterns)
Residuals vs. order: verify independence assumption (want no patterns)

46


Another two attribute Xs example: Gage R&R


MINITAB

Micrometer.mtw

Design : Crossed design


Model : Random Effect model

47

Nesting
Factor B is nested in factor A if the levels of B
have different meanings for each level of A.
Stated differently, factor B is nested in factor A if
there is a completely different set of levels of B
for every level of A.
Minitab notation: B(A) means B is nested within
A.

48 | MDT Confidential


Nesting Example
Example: An experiment is run with three suppliers,
each of which produces three batches of material.
There clearly are three levels of supplier, but how
many levels of batch are there?
Batch 1 from supplier 1 has nothing to do with batch 1
from supplier 2. Batch level 1 has no consistent
meaning across suppliers. So Batch is nested in
supplier.
Instead of labeling the batch levels as 1-3, it would be
appropriate to label them 1-9.

You know that B is nested in A if it is reasonable to label each level of B differently, depending on the level of A.
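The "label each level of B differently" rule can be made concrete by assigning every (supplier, batch) pair a globally unique label, which is exactly how the 1-9 relabeling on this slide works. A minimal sketch:

```python
# Relabel nested batch levels so each (supplier, batch) pair is unique:
# 3 suppliers x 3 local batch labels -> 9 distinct batches overall.
suppliers = [1, 2, 3]
local_batches = [1, 2, 3]

global_label = {
    (s, b): (s - 1) * len(local_batches) + b
    for s in suppliers
    for b in local_batches
}
print(global_label[(1, 1)], global_label[(2, 1)], global_label[(3, 3)])  # 1 4 9
```

If the relabeled factor has a·b distinct levels (here 9) rather than b levels with shared meaning, B is nested in A.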
49 | MDT Confidential

Crossing
Factor B is crossed with Factor A if the levels of B
have the same meaning for each level of A.
This is the standard factorial structure of a DOE
Example: An experiment is run with three suppliers, each of which utilizes two types of material: 100% gold or 100% nickel.
Gold and Nickel have the same meaning and
same interpretation, regardless of supplier.
Supplier is therefore crossed with material.

50 | MDT Confidential


Example of Nested Design

Levels of Batches nested within levels of supplier


Is this a factorial design?
Can we estimate Supplier X batch interaction?

yijk = μ + ai + bj(i) + ek(ij)


51

Nested Design - continued


Company buys raw material in batches from 3 different suppliers. The purity of this material varies considerably, which causes problems in manufacturing the finished product. We wish to determine whether the variability in purity is attributable to differences among the suppliers. Four batches of raw material are selected at random from each supplier, and 3 determinations were made on each batch.
MINITAB

Purity.mtw

52


ANOVA for the Purity data


yijk = μ + ai + bj(i) + ek(ij)

A = Fixed or Random ?,

B = Fixed or Random ?

Is Supplier a key X?

σ²batch = ?

σ̂ = 1.62

Are there differences among suppliers?


53

Incorrect GLM Analysis


Supplier and batch
fixed effects
Two-way ANOVA: purity versus supplier, batch

Source        DF       SS       MS     F      P
supplier       2   15.056  7.52778  2.85  0.077
batch          3   25.639  8.54630  3.24  0.040
Interaction    6   44.278  7.37963  2.80  0.033
Error         24   63.333  2.63889
Total         35  148.306

S = 1.624   R-Sq = 57.30%   R-Sq(adj) = 37.72%

54


GLM Exercise:
MINITAB

(Purity.mtw)

Is supplier a key X?

Assume that suppliers were randomly chosen (i.e., random effect), and estimate σ²supplier using GLM.
55

Summary: Different Types of Xs


1 X at a time: F/R

2 or more Xs at a time: F/R, C/N

F = Fixed      C = Crossed
R = Random     N = Nested

56


Specifying the Model Terms in Minitab

Examples:

Example                  Statistical model                                        Terms in model
Factor A, B crossed      Yijk = μ + ai + bj + abij + eijk                         A, B, A*B
Crossed and nested       Yijkl = μ + ai + bj(i) + ck + acik + bcjk(i) + el(ijk)   A, B(A), C, A*C, B*C
(B nested within A,
both crossed with C)
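Degrees of freedom follow directly from the term structure, which is a quick sanity check on any nested specification. A sketch for the purely nested model yijk = μ + ai + bj(i) + ek(ij), illustrated with the purity-style layout (3 suppliers, 4 batches nested in each, 3 determinations per batch):

```python
def nested_model_df(a, b, n):
    """Degrees of freedom for y = mu + a_i + b_j(i) + e_k(ij), with a levels
    of A, b levels of B nested within each A level, and n replicates per batch."""
    df_a = a - 1                    # A main effect
    df_b_in_a = a * (b - 1)         # B(A): (b-1) within each of the a levels
    df_total = a * b * n - 1
    df_error = df_total - df_a - df_b_in_a
    return df_a, df_b_in_a, df_error, df_total

# 3 suppliers, 4 batches each, 3 determinations per batch:
print(nested_model_df(3, 4, 3))  # (2, 9, 24, 35)
```

Note the nested B(A) term absorbs 9 DF here, versus only 3 DF for an (incorrect) crossed batch factor, which is one visible symptom of a mis-specified nesting structure.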

57

Exercise
MINITAB

Time.MTW

Problem Statement: Assembly time is too long for a manufacturing process.


Type of layout and type of fixture are suspect Xs for assembly lead time.
Two (2) different layouts and three (3) different fixtures are to be tested.
Two (2) groups of 4 operators each are randomly selected to test each layout with the 3 fixtures, two times. All factorial combinations of layout and fixture are completely randomized in the experiment.
Task: Are type of fixture and layout key Xs for assembly time?
Experimental Design:
        L1                L2
        O1 O2 O3 O4       O5 O6 O7 O8
F1
F2
F3
Time: 20 minutes
58


General Linear Models

GLM: Concepts
GLM: Variable Y One Attribute X
GLM: Variable Y Two Attribute Xs
GLM: Variable Y Mixture of Attribute & Variable Xs

Topics to be covered
GLM: Variable Y Mixture of Attribute and Variable Xs
GLM with Covariates
Strategic GLM

60


When Can I Treat an X as Variables?


When relationship between X and Y can be described with a line or curve
Number of levels does not determine Variables X vs Attribute X

Variables (Covariate) X: Coffee Taste vs. Brew To Serve Time. A quadratic curve Y=F(X) describes the actual data; X = Brew To Serve Time has 3 levels (1, 30, 60).

Attribute (Factor) X: Lead Time (Days) vs. Supplier. No line or curve Y=F(X) describes the actual data; X = Supplier has 10 levels (1 to 10).


61

GLM for Mixture of Attribute and Variable Xs


Specify variable Xs as
covariates

Example:
MINITAB

Catapult Multiple X.mtw


Are there significant
main effects?
interactions?
curvature?

62


Analyze Centering Xs - Main Effects

The Covariates field tells MINITAB which Xs are variables


63

Analyze Xs - Main Effects


Source      DF   Seq SS   Adj SS   Adj MS      F      P
Rub Band     3   3784.8   5991.3   1997.1  14.91  0.000
Shot         1      8.3     61.3     61.3   0.46  0.503
Operator     1     55.7     40.2     40.2   0.30  0.587
Ball         2    684.3    572.2    286.1   2.14  0.133
Time PB      1    336.8      1.4      1.4   0.01  0.919
PB Angle     1  10108.2  10108.2  10108.2  75.45  0.000
Error       37   4957.1   4957.1    134.0
Total       46  19935.2

Rub Band, PB Angle are significant Main Effects


Ball is close (include it for now)
Remember: p-values will change when terms are added or deleted from model

64


Reduce Terms
Edit Last Dialog

Tells MINITAB to give coefficients for Attribute as well as Variables Xs

65

Reduce Terms
Source      DF   Seq SS   Adj SS   Adj MS      F      P
Rub Band     3   3784.8   5988.0   1996.0  15.63  0.000
Ball         2    691.8    681.2    340.6   2.67  0.082
PB Angle     1  10348.8  10348.8  10348.8  81.01  0.000
Error       40   5109.8   5109.8    127.7
Total       46  19935.2

Ball p-value smaller with Shot, Operator,


Time PB removed from model
Should we keep Ball?

66


What If We Treat PB Angle as Attribute?

Source      DF   Seq SS   Adj SS   Adj MS      F      P
Rub Band     3   3784.8   5212.1   1737.4  19.13  0.000
Ball         2    691.8    900.0    450.0   4.95  0.012
PB Angle     4  12097.9  12097.9   3024.5  33.30  0.000
Error       37   3360.7   3360.7     90.8
Total       46  19935.2

PB Angle as:   DF   Adj SS      F      p   Error DF
Variable        1  10348.8  81.01  0.000         40
Attribute       4  12097.9  33.30  0.000         37

67

What If We Treat PB Angle as Attribute?


Term         Coef    SE Coef       T      P
Constant   99.896      1.531   65.24  0.000
Rub Band
  1       -14.661      2.725   -5.38  0.000
  2        19.518      2.942    6.63  0.000
  3         3.354      2.597    1.29  0.204
Ball
  Golf      7.570      2.664    2.84  0.007
  Wiffle   -5.102      2.031   -2.51  0.016
PB Angle
  130     -32.106      3.035  -10.58  0.000
  140      -4.533      2.920   -1.55  0.129
  150       7.497      3.129    2.40  0.022
  160       9.455      3.669    2.58  0.014

Model Prediction (Rub Band = 1, Ball = Wiffle, PB Angle = 150)

Distance = 99.896 - 14.661 - 5.102 + 7.497 = 87.63

Model Prediction (Rub Band = 4, Ball = Golf, PB Angle = 180)

Impossible: can only get PB Angle predictions for 130, 140, 150, 160, 170
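With all-attribute terms, a prediction is just the constant plus the coefficient for each factor's chosen level, as the worked example shows. A sketch using the coefficient values from the table above (the dict layout is illustrative, not Minitab's storage format):

```python
# Prediction from the fitted all-attribute GLM: sum the constant plus
# the coefficient of the chosen level of each factor.
coef = {
    "constant": 99.896,
    ("Rub Band", 1): -14.661,
    ("Ball", "Wiffle"): -5.102,
    ("PB Angle", 150): 7.497,
}

distance = (coef["constant"] + coef[("Rub Band", 1)]
            + coef[("Ball", "Wiffle")] + coef[("PB Angle", 150)])
print(round(distance, 2))  # 87.63
```

This is also why PB Angle = 180 is impossible as an attribute: there is simply no coefficient for an unobserved level, whereas a covariate model could (cautiously) extrapolate.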
68


Interactions

If sample size is small, try interactions one at a time

69

Interactions
Source         DF   Seq SS   Adj SS   Adj MS      F      P
Rub Band        3   3784.8   4423.8   1474.6  10.19  0.000
Ball            2    691.8    502.4    251.2   1.74  0.192
PB Angle        1  10348.8   8766.5   8766.5  60.55  0.000
Rub Band*Ball   6    187.5    187.5     31.2   0.22  0.969
Error          34   4922.3   4922.3    144.8
Total          46  19935.2

Rub Band * Ball Interaction not significant


Note:
6 DF (degrees of freedom) for Rub Band*Ball = 3 * 2
34 DF left for Error = 46 - 3 - 2 - 1 - 6
If DF for Error decreases, then p-values increase
If DF for Error reaches 0, then no p-values are possible (MINITAB will complain!)
Conclusion: Be careful when adding interactions (DF for Error may reach 0)
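The DF bookkeeping in the note is easy to script as a pre-check before adding an interaction. A minimal sketch using the catapult model's term DFs (47 observations, so 46 total DF):

```python
def error_df(n_obs, term_dfs):
    """Residual DF remaining after the model is fit:
    (n - 1) minus the DF consumed by every model term."""
    return (n_obs - 1) - sum(term_dfs.values())

# Catapult model including the Rub Band*Ball interaction (47 observations):
terms = {"Rub Band": 3, "Ball": 2, "PB Angle": 1, "Rub Band*Ball": 3 * 2}
print(error_df(47, terms))  # 34
```

If this number hits 0 the model is saturated and no p-values can be computed, which is the failure mode the slide warns about.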
70


Interactions
p-values for Interactions
            Rub Band    Ball
Ball           0.969
PB Angle       0.566   0.211

Conclusion: No significant interactions


Should we test interactions with Shot? Operator? Time PB?
71

Review: Linear vs Curvature


[Main Effects Plot (data means) for Taste vs. Brew to Serve Time (1.0, 30.5, 60.0), corner and center points: the center point falls off the linear model's line, so a curvature model fits the data better]

Curvature only applies to variables Xs!

Quadratic Model: Y = aX² + bX + c


72


Curvature - Detecting with Residuals

Plot Residuals vs PB Angle to graphically check for curvature
73

Curvature - Detecting with Residuals


[Residuals Versus PB Angle (response is Distance): standardized residuals plotted at PB Angle = 130 to 170 show a curved pattern]

Looks like curvature

Now let's prove it!

74


Curvature - Add X² Term to Model

(PB Angle)²

75

Curvature - Add X² Term to Model


Note: Ball main effect is now significant
Source             DF   Seq SS   Adj SS   Adj MS      F      P
Rub Band            3   3784.8   6036.3   2012.1  20.93  0.000
Ball                2    691.8   1060.9    530.5   5.52  0.008
PB Angle            1  10348.8   1685.1   1685.1  17.53  0.000
PB Angle*PB Angle   1   1360.4   1360.4   1360.4  14.15  0.001
Error              39   3749.3   3749.3     96.1
Total              46  19935.2

(PB Angle)² is significant

76


Final Model - Check Residuals


We now have all terms for our model. We need to check the residuals to understand how good the model is.

77

Final Model - Check Residuals


[Residual Plots for Distance: normal probability plot, histogram, residuals vs. fitted values (fitted values roughly 60 to 140), residuals vs. observation order; one large standardized residual stands out]

Brush over to find which point is causing trouble!
What do we conclude?
78


Exercise

Twelve steel brackets were randomly divided into three groups


and sent to three vendors to be zinc plated. The chief concern
in this process is whether or not there is any difference in zinc
thickness among vendors. The following table lists the plating
thickness (Y), as well as the thickness of the bracket (X), in
hundred-thousandths of an inch.

MINITAB

Zinc plating.mtw

79

Exercise : Questions
1) One-way ANOVA : X = vendor
Are there significant differences among vendors?
2) GLM: X1= vendor, X2 = Bracket Thickness
How does this change the conclusion?
3) Bonus Questions:
If you are to do this testing again, what would you do differently?
Use a graphical tool to support your rationale (Suggestion: try
Interaction Plot under ANOVA)

80


Problems with Designs: Correlated Xs

81

Return to MECC Example


Pin Pulls.mtw
Response Variable (Y): Pull Strength
Predictor Variables (Xs):
Hole diameter: 17.5, 18.5, or 19.5
Fillet Style: one-sided or two-sided
Solder size: small or large

Exercise:
Fit a GLM to create a model for pull strength
Can Hole diameter be reasonably treated as a
covariate? (Engineering theory suggests that it can.)
Determine if variables are fixed vs. random, crossed
vs. nested
Which Xs are statistically significant?
82 | MDT Confidential


Summary And Recap


Understand how GLM is a generalization of
ANOVA and regression
Understand three primary concepts within GLM
models
Fixed vs. Random effects
Nesting vs. Crossing
Covariate (Continuous) vs. Factor (Attribute)

Fit GLM in Minitab

83 | MDT Confidential


Logistic Regression
I still feel like I'm regressing
LeRoy Mattson

Objectives
Understand how logistic regression creates a
predictive model for an attribute Y
Fit logistic regression models in Minitab

2 | MDT Confidential

Logistic Regression

Logistic Regression Attribute Y, One X


Logistic Regression Attribute Y, Multiple Xs

Attribute Y Data Types

Individual unit categorized into a classification


Finite number of possible values
Cannot be subdivided meaningfully
4 Attribute data types:
Binary (pass/fail, good/bad)
Nominal (complaint codes, problem type)
Ordinal (low/medium/high, mild/moderate/severe)
Discrete(# errors)

Is Smoking (X) a key X for Lung cancer (Y)?


Y: Lung Cancer; Yes, No
X: Smoking; Yes, No

X\Y          Lung Cancer   No Lung Cancer   Total
Smoker            2               3            5
Non-smoker        1               8            9

Analysis tools: Relative Risk or Odds Ratio

The Relative Risk of lung cancer for


smoker vs non-smoker = (2/5)/(1/9) = 3.6

Concept: Odds Ratio (OR) as a Measure of X Impact for Attribute Y


OR = odds of Y outcome in one group relative to another group
   = odds of cancer for smokers / odds of cancer for non-smokers

OR = (2/3) / (1/8) = 0.67 / 0.125 = 5.33

Interpretation of Odds Ratio: 5.33


Odds of cancer for smokers is 5.33 times the odds for non-smokers
Odds of getting cancer are increased 433% with smoking
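Both measures fall straight out of the 2x2 table. A minimal sketch reproducing the slide's numbers for the smoking data (2 and 3 cases among 5 smokers, 1 and 8 among 9 non-smokers):

```python
# Relative risk and odds ratio from the slide's 2x2 smoking table:
#                cancer   no cancer
#   smoker          2         3
#   non-smoker      1         8
a, b = 2, 3   # smokers: cancer, no cancer
c, d = 1, 8   # non-smokers: cancer, no cancer

relative_risk = (a / (a + b)) / (c / (c + d))   # (2/5) / (1/9)
odds_ratio = (a / b) / (c / d)                  # (2/3) / (1/8)
print(round(relative_risk, 1), round(odds_ratio, 2))  # 3.6 5.33
```

Note the two measures differ (3.6 vs. 5.33): relative risk compares probabilities while the odds ratio compares odds, and they only agree when the outcome is rare.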

Statistical Tools for Analyzing key Xs


                           X: Attribute              X: Variables
                           (discrete data)           (continuous data)

Y: Variables (continuous)  t-test (1 X, 2 levels)    Regression
                           One-way ANOVA             Multiple Regression
                           GLM                       GLM

Y: Attribute (discrete)    Chi Square                Logistic Regression
                           Logistic Regression

Example Binary Logistic Regression Attribute X


Problem Statement: Is smoking associated with disease in previous
example? Y? X?
MINITAB
Smoking.MTW
Task:

In this module
1) What tool(s) for Hypothesis Test?
2) What tool(s) for Graphical Analysis?

Approach:

Work individually.

Y = Cancer or Cancer-Free
X = Exposure
(smoking/nonsmoking)

Logistic Regression Analysis

Logistic Regression Analysis cont.


Wald Test to Verify Key X:
If p-value < 0.05, X is Key
(Smoking is not Key X)

OR for Attribute X Impact:


OR of disease for exposed relative to unexposed = 5.33
(433% increase in odds of disease for smoking relative to
nonsmoking)
10

Exercise Binary Logistic Regression Attribute X


Problem Statement: A new data set on smoking has been collected. Analyze this data set and determine whether smoking has an effect on cancer.
MINITAB

Smoking2.MTW

11

Example Binary Logistic Regression Variable X


Problem Statement: Toy company is interested in whether a toy missile will
hit flying targets of varying speeds. Y? X?
MINITAB
Speed.MTW
Task:

In this module
1) What tool(s) for Hypothesis Test?
2) What tool(s) for Graphical Analysis?

Approach:

Work individually.

Y = Hit or Miss? (1/0)


X = Target speed

12

Incorrect Analysis: Variables Y


Fitted Line extends beyond 0 and 1

[Fitted Line Plot: hit or miss = 1.562 - 0.003005 target speed (cms/sec); S = 0.397278, R-Sq = 41.8%, R-Sq(adj) = 39.3%]

Heteroscedastic (Unequal) Variances

[Residual Plots for hit or miss: normal probability plot, histogram, residuals vs. fitted values, residuals vs. observation order; residuals fall in two bands, violating the normality and equal-variance assumptions]

13

Correct Analysis: Use Binary Logistic Regression


What if we analyze the proportion of hits?

[Two plots: proportion p vs X shows an S-shaped decreasing curve (b1 < 0); Logit(p) vs X is a straight line, logit(p) = b0 + b1X]

p(x) = proportion of Y-attributes at each X value
Logit transformation straightens the S-shape to a straight line

Logit(p) = loge[p/(1-p)]
Logistic f(x): loge[p(x)/(1-p(x))] = b0 + b1X

Origins: Verhulst (mathematician) named the logistic function (1838-1847: 3 papers). Pearl and Reed (1920, Johns Hopkins, Biometry and Vital Statistics) rediscovered the logistic to model population growth in the US.
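The logit and its inverse (the logistic function) are simple to compute directly, which makes the "straightening" transformation concrete. A minimal sketch:

```python
import math

def logit(p):
    """Log-odds: log(p / (1 - p)) maps probabilities in (0, 1)
    onto the whole real line."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Logistic function, the inverse of logit: maps the real line back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

print(logit(0.5))                       # 0.0 (even odds)
print(round(inv_logit(logit(0.9)), 6))  # 0.9 (round trip)
```

Because logit(p) is linear in X, the fitted model can be inverted through inv_logit to get a predicted probability at any X, which is what Minitab stores as EPRO1.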
14

Binary Logistic Regression


Does target speed affect hit or miss?

Raw data (0,1)

Declare Attribute Xs

Fitted probabilities stored as EPRO1


Default: Event is 1
15

Identifying Key Xs
Link Function: Logit

Wald Test to Verify Key X:

Response Information
Variable      Value   Count
hit or miss   1          13  (Event)
              0          12
              Total      25

If p-value < 0.05, X is Key (Target speed is Key X)

Logistic Regression Table
                                                              Odds    95% CI
Predictor                     Coef    SE Coef      Z      P  Ratio  Lower  Upper
Constant                   5.56028    2.04130   2.72  0.006
target speed (cms/sec)  -0.0156619  0.0055920  -2.80  0.005   0.98   0.97   1.00

Log-Likelihood = -11.411
Test that all slopes are zero: G = 11.796, DF = 1, P-Value = 0.001

Compare with GLM: Wald test in Logistic is similar to t-test in GLM


16

Measuring X Impact: Odds Ratio for X (OR)

Link Function: Logit

OR for Variables X Impact:
OR for a c-unit increase in X = e^(c*b1) = (Odds Ratio for a 1-unit X increase)^c
(Need to determine a meaningful c)

Response Information
Variable      Value   Count
hit or miss   1          13  (Event)
              0          12
              Total      25

Logistic Regression Table
                                                              Odds    95% CI
Predictor                     Coef    SE Coef      Z      P  Ratio  Lower  Upper
Constant                   5.56028    2.04130   2.72  0.006
target speed (cms/sec)  -0.0156619  0.0055920  -2.80  0.005   0.98   0.97   1.00

Log-Likelihood = -11.411
Test that all slopes are zero: G = 11.796, DF = 1, P-Value = 0.001

For a 50-unit increase in target speed, the odds of hitting the target are multiplied by (0.98)^50 ≈ 0.46 (i.e., a 54% reduction in the odds)
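Computing the c-unit odds ratio from the unrounded slope avoids rounding error in the (0.98)^50 shortcut. A minimal sketch using the fitted coefficient from the table above:

```python
import math

# OR for a c-unit increase in X is exp(c * b1); b1 is the fitted slope
# for target speed from the logistic regression table.
b1 = -0.0156619
or_50 = math.exp(50 * b1)
print(round(or_50, 2))  # 0.46 -> odds of a hit drop ~54% per 50 cm/s
```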
17

Graphical Analysis Plot logistic curve

18

Graphical Analysis Plot logistic Curve


Check against raw data
[Scatterplot of hit or miss and EPRO1 vs target speed (cms/sec), 200-500: raw 0/1 data with the fitted logistic curve (EPRO1) overlaid]
Target speed = 350, 50% chance of hitting target


19

Exercise: Binary Logistic Regression


Problem Statement: Chemotherapy induced remission rate takes too long
to measure and is inaccurate causing delays and errors in cancer research.
Project Strategy: Determine if labeling index is a variables path Y for
remission rate. Labeling index measures the proliferative activity of cells
after a patient receives an injection of thymidine as part of chemotherapy. It
represents the percentage of cells that are labeled (Lee, 1974).
MINITAB

Cancer Remission.MTW

(from Lee*)

Task:

Verify if labeling index is a suitable Path Y for remission rate.

Approach:

Work individually or in pairs

Time:

10 minutes

* Lee (1974): A computer program for linear logistic regression analysis. Computer Prog. Biomed. 4: 80-92.

20


Exercise Debrief
Solution:
1. Is labeling index a Key X?
2. What hypothesis test did you use to verify the key X?
3. Compare the results from the fitted line plot.
4. Is the impact of labeling index large enough to use as a Path Y for remission rate?

What did you learn?

21

Logistic Regression

Logistic Regression Attribute Y, One X


Logistic Regression Attribute Y, Multiple Xs


Example: Multiple Binary Logistic Regression


A cancer study showed the number of cases of esophageal cancer,
classified by age group and alcohol consumption (0=none, 1=some).
Y?
Xs?
Data type?
MINITAB

EsophagealCancer.MTW

Task: Verify if age group and alcohol


consumption are key Xs for incidence of
esophageal disease.
Alcohol is
Attribute X
23

Study Effect of alcohol consumption/age on cancer cases

[Fitted Line Plot: cancer% = 14.40 + 0.1286 age; S = 8.45436, R-Sq = 9.2%, R-Sq(adj) = 0.0%; cancer% vs age (20-80) shows only a weak upward trend]
24


Are Xs correlated (potential confounding of Xs)


[Scatterplot of alc% vs age: correlated Xs (corr = 51%)]

Tabulated statistics - Rows: Alcohol, Columns: Age Group (count, % of column):

Alcohol      25         35         45         55         65         75        All
0          6  18.18   3  14.29  30  83.33   9  30.00  27  81.82  18  66.67   93  51.67
1         27  81.82  18  85.71   6  16.67  21  70.00   6  18.18   9  33.33   87  48.33
All       33 100.00  21 100.00  36 100.00  30 100.00  33 100.00  27 100.00  180 100.00

% alcohol use depends on age-group.

25

Example: Multiple Binary Logistic Regression


A cancer study showed # cases of esophageal cancer,
classified by age group and alcohol consumption (0=none,
1=some).
Y? Xs? Data type?

Alcohol is Attribute X

MINITAB

EsophagealCancer.MTW

Task: Verify if age group and alcohol


consumption are key Xs for incidence of
esophageal disease.

26


Example Hypothesis Test


OR for Attribute X Impact = Odds Ratio
Logistic Regression Table
                                                     Odds    95% CI
Predictor       Coef    SE Coef      Z      P       Ratio  Lower  Upper
Constant    -2.72159   0.753215  -3.61  0.000
Alcohol
  1         0.733554   0.406715   1.80  0.071        2.08   0.94   4.62
Age Group  0.0187340  0.0119209   1.57  0.116        1.02   1.00   1.04

Log-Likelihood = -87.915
Test that all slopes are zero: G = 4.314, DF = 2, P-Value = 0.116

Odds of getting cancer increase by 108% with alcohol use

Alcohol is Key X

27

Graphical Analysis: Raw Data and Logistic Regression Estimates

[Scatterplot of EPRO1 vs Age Group (20-80), separate fitted logistic model curves for Alcohol = 0 and Alcohol = 1: estimated probabilities rise from about 0.10 to about 0.35 with age, and are higher for Alcohol = 1]
28


Exercise: Logistic Regression


Problem Statement: A sample of ingots is treated with four levels of heat time and five levels of soak time. The response is the number of ingots ready to be rolled (out of those tested) for each combination of times.
Project Goal: maximize the ingots ready to be rolled.
MINITAB

Ingots.MTW

Task:

Verify whether heat time and soak time are Key Xs.

Approach:

Work individually or in pairs

Time:

10 minutes

29

Summary Quiz

True or False

___________ Use Binary logistic regression when Y is Variables

___________ Odds ratio is the odds of the Y outcome in one group relative to another group

___________ Use GLM analysis when Y is attribute at 2 levels

30


Statistical Resources
Avoiding wheel re-invention
LeRoy Mattson

Objectives
Ensure you are aware of statistical resources
both internal and external to Medtronic:
Medtronic Statistical Resources Web Site
External Web Sites

This chapter can serve as a reference document


after the class is complete.

2 | MDT Confidential

Medtronic Statistical Resource Web Site

http://mitintra.corp.medtronic.com/corporate-statistics/

3 | MDT Confidential

Software Validation Plans & Reports

For links to Validation Plans & Reports: click the Search button on the Web Site
For Medstat Plans/Reports: enter "Medstat validation"
For Minitab Plans/Reports: enter "Minitab validation"
For Crystal Ball Validation Plans/Reports: enter "Crystal Ball validation"
Note: These links are to pdf documents stored in Documentum.

4 | MDT Confidential

About Corporate Stats

5 | MDT Confidential

About Corporate Stats Cont.

6 | MDT Confidential

Get Trained

7 | MDT Confidential

Recap from Quality Trainer


If you complete all of the Quality Trainer, Minitab will send you a Certificate. It takes 20-40 hours to complete all of the QT.

8 | MDT Confidential

Get Trained Cont.

9 | MDT Confidential

Tools/Resources: Software

10 | MDT Confidential

Tools/Resources: Minitab16 Validation

11 | MDT Confidential

Tools/Resources: Work Aids

12 | MDT Confidential

Tools/Resources: MHOS
Medtronic Handbook of Statistics, Rev G (PDF format only)

13 | MDT Confidential

Tools/Resources: Business SOPs


Business Unit procedures for:
Test Method Validation (MSA included)
Normality Testing
Lot Acceptance (or sampling plans for incoming)
SPC

14 | MDT Confidential

Tools/Resources: Other
Software

Miscellaneous

15 | MDT Confidential

Tools/Resources: JMP Software


Contact Kevin Gaffney at MECC if you are interested in
obtaining JMP.
The software has been officially validated and may be used
within the quality system.
JMP tends to be more interactive than Minitab and is more
powerful for certain applications (e.g. advanced DOE).
JMP is point-and-click like Minitab, but it is more object-oriented than menu-oriented.

16 | MDT Confidential

Get Connected

17 | MDT Confidential

Get Connected
Industrial Statistics Questions?
1) Contact your division's Industrial Statistics Council member

2) Otherwise, contact Medtronic Statistical Resources

18 | MDT Confidential

External Web Sites


Statistical Standards:
ISO Standards for Statistical Methods can be purchased as part of a CD-ROM
collection available at http://www.iso.org/iso/pressrelease.htm?refid=Ref1134.

ASTM Standards on Precision and Bias, 6th edition


http://www.astm.org/BOOKSTORE/COMPS/BIAS08.htm

ASTM SPC Standard


http://www.astm.org/Standards/E2587.htm

19 | MDT Confidential

External Web Sites


Statistical Standards:
ASQ has ANSI/ASQ standards:
http://asq.org/quality-press/display-item/index.html?item=T004
GHTF has standards (link to GHTF process validation below)
http://www.ghtf.org/sg3/sg3-final.html
AIAG has Guidance for MSA & SPC
Publications Catalog - Automotive Industry Action Group
Large list of acceptance sampling standards:
http://variation.com/techlib/standard.html
Lists MIL-STD, ANSI, and ISO acceptance sampling standards; bulk sampling and reliability standards are also included.

20 | MDT Confidential


External Web Sites


Statistical Committees:
ISO Statistics Technical Committee: TC 69 - with six subcommittees
http://www.iso.org/iso/home/standards_development/list_of_iso_technical_committees/iso_technical_committee.htm?commid=49742
The Six ISO TC69 Subcommittees

ASTM Technical Committee E11 (Statistics)


http://www.astm.org/COMMIT/COMMITTEE/E11.htm
USP Expert Statistics Committee
http://www.usp.org/council-experts-expert-committees-overview/expert-committees/statistics

21 | MDT Confidential

External Web Sites


Handbooks:
NIST E-Statistics Handbook (has hyperlinks)
http://www.itl.nist.gov/div898/handbook/

22 | MDT Confidential


External Web Sites


Well-known Consultants in Industrial Statistics:
Dr. Wayne Taylor:
http://www.variation.com/
Dr. Douglas C. Montgomery (Arizona State University):
http://www.amazon.com/Douglas-C.-Montgomery/e/B001IGNOBC
Dr. Donald Wheeler:
http://www.spcpress.com/
Newsletters:
American Society for Quality (ASQ) Statistics Section
http://asq.org/statistics/
American Statistical Association (ASA): Quality & Productivity Section
http://www.amstat-online.org/sections/qp/newsletter.html

23 | MDT Confidential

Summary And Recap

Medtronic Statistical Resource Web Site


External Web Sites

24 | MDT Confidential
