
6 Sigma Reference Materials

SUBJECT MATTER LINK


Alpha Risk Alpha_Risk
Alternative Hypothesis - Ha Alt_Hypothesis
ANOVA – One Way ANOVA_One_Way
ANOVA – Two Way ANOVA_Two_Way
ANOVA – N Way ANOVA_N_Way
Basic Probability Theory Probability Basic_Probability_Th
Basic Probability Theory Conditional Probability Conditional_Prob
Basic Probability Theory Continuous Probability Continuous_Prob
Beta Risk β Beta_Risk
Box Plots Box_Plot
Breakthrough Strategy Breakthrough_Strategy
Chi-Square – Goodness Of Fit Test Chi_Square_Goodness_of_Fit
Chi-Square – Test Of Homogeneity Chi_Square_Homogenity
Chi-Square – Yates Correction Chi_Square_Yates_Correct
Confidence Interval – Mean Confidence_Interval_Mean
Confidence Interval – Proportion Confidence_Interval_Proportion
Confidence Interval Standard Deviation Confidence_Interval_Std_Dev
Control Chart – Moving Range (MR) Control_Chart_MR
Control Chart – np Control_Chart_np
Control Chart – Proportion Control_Chart_p
Control Charts – R (Range) Control_Chart_r
Control Charts – Standard Deviation Control_Chart_s
Control Chart – u Control_Chart_u
Control Chart – Xbar Control_Chart_Xbar
Control Limits Control_Limits
Correlation Coefficient Corr_Coeff
Crosstabulation and Contingency Table Crosstab_and_Cont_Table
Data Transformations Data_Transformation
Defects Per Opportunity Defect_per_Opportunity
Defects per Unit Defect_per_Unit
Degrees of Freedom Degrees_Freedom
Distribution – Binomial Dist_Binomial
Distribution - Chi Square Dist_Chi_Square
Distribution – F Dist_F
Distribution – Lognormal Dist_Lognormal
Distribution – Normal Dist_Normal
Distribution – Poisson Dist_Poisson
Distribution – Probability Plot Dist_Probability_Plot
Distribution – t Dist_t
Distribution – Weibull Dist_Weibull
DOE Factorial Experiment DOE_Experiment
DOE Factorial Experiment Blocking DOE_Blocking
DOE Factorial Experiment Center Point Center_Point
DOE Factorial Experiment Contrast Contrast
DOE Factors and Levels Factors_and_Levels
DOE Factorial Experiment – Confounding Fact_Exp_Confounding
DOE Factorial Experiment – Fold-Over Design DOE_FoldOver_Design
DOE Factorial Experiment – Fractional Fact_Exp_Fractional
DOE Factorial Experiment – Full Fact_Exp_Full

DOE Factorial Experiment – Inner Array DOE_Inner_Array
DOE Factorial Experiment – Randomization DOE_Randomization
DOE Factorial Experiment – Replication DOE_Replication
DOE Factorial Experiment – Resolution Fact_Exp_Resolution
DOE Factorial Experiment – Runs Fact_Exp_Runs
DOE Notation – Two-Level Factorial Experiment Notation_Two_Level_Exp
F Test 2 Variances F_Test
Fishbone Diagram (Cause & Effect Diagram) Fishbone
Fitted Line Plot Fitted_Line_Plot
FMEA FMEA
Gage R&R – Attribute Data Gage_Attribute
Gage R&R – Variable Data Gage_Variable
Histogram Histogram
Heterogeneity of Variance Heterogeneity_of_Variance
Hidden Factory Hidden_Factory
Homogeneity of Variance Homogeneity_of_Variance
Independence of the Mean and Variance Indep_of_Mean_and_Variance
Interaction Interaction
Interaction Plot Interaction_Plot
Is/Is Not Technique Is_Is_Not
Linear Regression Linear_Reg
Main Effects Plot Main_Effects_Plot
Measurement Scale – Continuous Meas_Scale_Cont
Measurement Scale - Discrete (Attribute) Meas_Scale_Discrete
Measurement Scale – Likert Likert_Scale
Measurement Scale – Log Log_Scale
Measurement Scale – Nominal Meas_Scale_Nominal
Measurement Scale – Ordinal Meas_Scale_Ordinal
Measurement Linearity Measurement_Linearity
Measurement Reliability Measurement_Reliability
Measurement Repeatability Measurement_Repeatability
Measurement Reproducibility Measurement_Reproducibility
Measures of Location – Mean, Median, Mode Measure_of_Location
Measures of Variation – Range, Variance Measure_of_Variation
Mistake Proofing Principles Mistake_Proofing
Nonparametric Tests Nonparametric_Test
Null Hypothesis - H0 Null_Hypothesis
Out of Control Out_of_Control
Process Capability Cp, Cpk, Pp, Ppk Indices_of_Capability
Process Mapping Mapping
Random Sampling Random_Sampling
Rational Subgrouping Rational_Subgrouping
Residual Plots Residual_Plots
Sample and Population Sample_Population
Sample Size 1 Sample Mean, Alpha & Beta Sample_Size_1_Sample_Mean
Sample Size 2 Sample Mean Sample_Size_2_Sample_Mean
Sample Size Attribute Data Sample_Size_Attribute_Data
Sample Size Mean, Error Margins Sample_Size_Mean
Sample Size Standard Deviation Sample_Size_Standard_Deviation
Scatter Plot Scatter_Plot
Sequential Sampling Sequential_Sampling

Sets and Events Sets_Events
Sets, Theorems Sets_Theorems
Shift and Drift – 1.5 Sigma Shift_Drift
Standard Deviation Standard_Deviation
Standard Deviation Long Term Standard_Deviation_Long_Term
Standard Deviation Short Term Standard_Deviation_Short_Term
Standard Normal Deviate (Z) Standard_Normal_Deviate
Stratified Sampling Stratified_Sampling
Sums of Squares Sums_of_Squares
T Test – 1 Sample T_Test_1_Sample
T Test – 2 Sample T_Test_2_Sample
Test Sensitivity Tests_Sensitivity
Total Defects per Unit Total_Defects_per_Unit
Truth Table – 1.5 Shift Truth_Table
Western Electric SPC Rules Western_Elec_Rules
Yates Standard Order Yates_Standard_Order
Yield Final Yield_Final
Yield First Time Yield_First_Time
Yield Normalized Yield_Normalized
Yield Rolled Throughput Yield_Rolled_Throughput
Yield Throughput Yield_Throughput
Z Value Long Term Z_Value_Long_Term
Z Value Short Term Z_Value_Short_Term
Z Test 1 Sample Z_Test_1_Sample

ANOVA – One Way

Purpose
To compare the means of two or more populations on a continuous CT characteristic. Since we don't
know the population means, an analysis of data samples is required. ANOVA (Analysis of Variance) is
usually used to determine if there is a statistically significant change in the mean of a CT characteristic
under two or more conditions introduced by one factor (see concept Factors and Levels).

Anatomy

ANOVA - One Way - Part 1 of 2

A. Hypotheses:
   H0: µ1 = µ2 = … = µg
   Ha: µi ≠ µj for at least one pair (i, j)

B. Model:
   Yij = µ + τi + εij

C. Minitab session window output, containing the ANOVA table (F, see Part 2), the descriptive
   statistics (D), and the individual 95% CIs for the mean, based on pooled StDev (E):

   One-Way Analysis of Variance

   Analysis of Variance
   Source   DF       SS      MS       F       P
   Factor    2   1355.7   677.9   64.87   0.000
   Error    42    438.9    10.4
   Total    44   1794.6

   Level    N     Mean   StDev
   1       15   66.038   2.760    (--*--)
   2       15   79.300   3.217                    (---*--)
   3       15   74.584   3.658              (--*---)

   Pooled StDev = 3.233        65.0     70.0     75.0     80.0

Reference: Juran's Quality Control Handbook - Ch. 26, P. 10-16, Design and Analysis of Experiments by D. C. Montgomery – Ch. 3, P. 67-79

Terminology
A. Null (H0) and alternative (Ha) hypotheses, where the means (µ1, µ2, …, µg) of the g levels of the factor
are compared. There is only one alternative hypothesis: at least two of the level means are
significantly different. (Note: these hypotheses correspond to a fixed effects model; for more
information see reference above.)
B. Model where yij is the (ij)th observation of the CT characteristic (i = 1, 2, …, g and j = 1, 2, …, n, for g
levels of size n), µ is the overall mean, τi is the ith level effect and εij is an error component.
C. Minitab session window output.
D. Descriptive Statistics – Sample sizes, means, standard deviations (StDev), and the Pooled Standard
Deviation, which is a combination of the standard deviations of the g groups.
E. Confidence Intervals around the mean for each individual level.
F. ANOVA Table (see part 2).
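The Pooled Standard Deviation in D can be cross-checked by hand; a minimal Python sketch, using the level summaries from the session window output above (the df-weighted average of the level variances):

```python
import math

# Level summaries taken from the session window output above (D)
n = [15, 15, 15]
stdev = [2.760, 3.217, 3.658]

# Pooled variance: average of the level variances, weighted by degrees of freedom
df = [ni - 1 for ni in n]
pooled_var = sum(d * s ** 2 for d, s in zip(df, stdev)) / sum(df)
pooled_sd = math.sqrt(pooled_var)
print(round(pooled_sd, 3))  # → 3.233, matching "Pooled StDev" in the output
```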

Major Considerations

The assumptions for using this tool are that the data come from independent random samples taken from
normally distributed populations with the same variance. ANOVA fits a model whose adequacy has to be
verified using residual analysis (see tool Residual Plots).

Application Cookbook
1. Define problem and state the objective of the study.
2. State Null and Alternative Hypothesis.
3. Establish sample size (see tool Sample Size – Continuous Data – One Way ANOVA).
4. Select random samples.
5. Measure the CT characteristic.
6. Analyze data with Minitab (part 1 of 2):
• In order to fully use the Minitab functions associated with ANOVA it is recommended that the
data be stacked into one column, with a second column containing the group codes, using the
function under Manip > Stack/Unstack > Stack Columns.

• See Part 2 for continuation of the application cookbook.
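As a sketch of what Minitab computes in step 6, the one-way ANOVA decomposition can be written in plain Python on the same stacked layout (one response column, one group-code column). The data values below are illustrative, not the ones from the Anatomy output:

```python
# Stacked data: response column plus group-code column, as recommended above
response = [66.0, 64.2, 67.5, 79.1, 80.3, 78.2, 74.0, 75.2, 73.8]
group    = [1, 1, 1, 2, 2, 2, 3, 3, 3]

levels = sorted(set(group))
grand_mean = sum(response) / len(response)

ss_factor = 0.0   # between-level variation ("Factor" SS)
ss_error = 0.0    # within-level variation ("Error" SS)
for lev in levels:
    ys = [y for y, g in zip(response, group) if g == lev]
    level_mean = sum(ys) / len(ys)
    ss_factor += len(ys) * (level_mean - grand_mean) ** 2
    ss_error += sum((y - level_mean) ** 2 for y in ys)

df_factor = len(levels) - 1
df_error = len(response) - len(levels)
f_ratio = (ss_factor / df_factor) / (ss_error / df_error)
print(ss_factor, ss_error, f_ratio)
```

A large F ratio (compared against the F distribution with df_factor and df_error degrees of freedom) leads to rejecting H0, exactly as in the session window output.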

ANOVA – One Way (ANOVA Table)


Purpose
To summarize the results of an analysis of variance calculation in a table.

Anatomy

ANOVA - One Way - Part 2 of 2

One-Way Analysis of Variance

Analysis of Variance
Source   DF       SS      MS       F       P
Factor    2   1355.7   677.9   64.87   0.000
Error    42    438.9    10.4
Total    44   1794.6

(Columns A through F in the Terminology below: Source, DF, SS, MS, F, P.)

Reference: Juran's Quality Control Handbook Ch. 26, P. 10-16, Design and Analysis of Experiments by D. C. Montgomery – Ch. 3, P. 67-79

Terminology
A. Source – Indicates the different variation sources decomposed in the ANOVA table. "Factor"
represents the variation introduced between the factor levels. The "Error" is the variation within each
of the factor levels. The "Total" is the total variation in the CT characteristic.
B. DF – The number of degrees of freedom related to each sum of square (SS). They are the
denominators of the estimate of variance.

C. SS – The sums of squares measure the variability associated with each source. They are the
numerators of the variance estimates. The "Factor" SS is due to the change in factor levels: the larger
the difference between the means of the factor levels, the larger the factor sum of squares will be. The
"Error" SS is due to the variation within each factor level. The "Total" SS is the sum of the Factor and
Error sums of squares (see tool Sum of Squares).
D. MS – Mean Square is the estimate of the variance for the factor and error sources. Computed by MS
= SS/DF.
E. F – The ratio of the mean square for the "Factor" and the mean square for the "Error".
F. P-Value – This value has to be compared with the alpha level (α) and the following decision rule is
used: if P < α, reject H0 and accept Ha with (1-P)100% confidence; if P ≥ α, don't reject H0.
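The MS and F entries (D, E) follow directly from the SS and DF columns; a quick Python sketch rebuilding them from the table above:

```python
# SS and DF values from the ANOVA table above
ss_factor, df_factor = 1355.7, 2
ss_error, df_error = 438.9, 42

ms_factor = ss_factor / df_factor   # MS = SS/DF -> 677.85, printed as 677.9
ms_error = ss_error / df_error      # -> 10.45, printed as 10.4
f_ratio = ms_factor / ms_error      # F = MS_factor / MS_error
print(round(f_ratio, 2))            # → 64.87, matching the table
```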

Major Considerations

The assumptions for using this tool are that the data come from independent random samples taken from
normally distributed populations with the same variance. ANOVA fits a model whose adequacy has to be
verified using residual analysis (see tool Residual Plots).

Application Cookbook

1. Analyze the data with Minitab (part 2 of 2):


• Verify the assumption of equality of variance for all the levels with the function under Stat >
ANOVA > Homogeneity of Variance (to interpret this analysis, see tool Homogeneity of
Variance Tests).
• Use the function under Stat > ANOVA > One Way.
• Input the name of the column, which contains the measurement of the CT characteristic into
the 'Response' field, and the name of the column that contains the level codes into the
'Factor' field.
• In order to verify the assumption of the model, select the options 'Store Residuals' and 'Store
Fits'. Select the graphs option and highlight all the available residual Plots (to interpret these
plots see tool Residual Plots).
• In the event of non-compliance with either of these assumptions, the results of the analysis of
variance may be distorted. In this case, graphical tools such as a Box Plot can be used to depict
the location of the means and the variation associated with each factor level. Any outliers
should be investigated.
2. Make a Statistical decision from the session window output of Minitab. Either accept or reject H0.
When H0 is rejected we can conclude that there is a significant difference between the means of the
levels.
3. Translate statistical conclusion into practical decision about the CT characteristic.

ANOVA – Two Way – Random Factors - Part 1 of 2 (Model and Hypotheses)
Purpose
To analyze the effect of two random factors on a CT characteristic (see concept Factors and Levels). A
factor is said to be random when levels are randomly chosen from a population of possible levels and we
wish to draw conclusions about the entire population of levels, not just those used in the study. For
example, this type of analysis is generally used in Gage R&R studies.

Anatomy

ANOVA - Two Way - Random Factors - Part 1 of 2

A. Model:
   Yijk = µ + τi + βj + (τβ)ij + εijk

B. Variance of any observation:
   V(Yijk) = στ² + σβ² + στβ² + σ²

C. Hypotheses on the variance components:
   H0: στ² = 0     Ha: στ² > 0
   H0: σβ² = 0     Ha: σβ² > 0
   H0: στβ² = 0    Ha: στβ² > 0

Reference: Juran's Quality Control Handbook-Ch. 26 P. 17-22, Design and Analysis of Experiments by D.C. Montgomery–Ch. 11 P. 470-475

Terminology

A. The model, where Yijk is the (ijk)th observation of the CT characteristic (i = 1, 2, …, a; j = 1, 2, …, b;
k = 1, 2, …, n) for a levels of factor A, b levels of factor B, and n observations in each combination of
the factor levels. µ represents the overall mean, τi (tau i) the effect of factor A, βj (beta j) the effect of
factor B, (τβ)ij the interaction effect between A and B, and εijk (epsilon ijk) the error component. For
example, in a Gage R&R study, factor A can be the Operators, factor B the Parts, and the interaction
the interaction between the Operators and the Parts. In this case, 'a' will be the number of Operators,
'b' the number of Parts, and 'n' the number of repeated measurements (see tool
Measurement-Variable Gage R&R Study).
B. The variance (V) of any observation (yijk) where στ2, σβ2, στβ2 and σ2 are called variance
components.
C. Null (H0) and alternative (Ha) hypotheses, where the variance (σ²) of each effect is compared with
0. For each effect, if H0 is true the levels are similar, but if Ha is true variability exists between the
levels.

Major Considerations
For a random effect model, we assume that the levels of the two factors are chosen at random from a
population of levels. The data should come from independent random samples taken from normally
distributed populations. The adequacy of this model has to be verified using residual analysis (see tool
Residual Plots).

Application Cookbook
1. Define problem and state the objective of the study.
2. Identify the two factors for study, and the levels associated with these factors. E.g. in a Gage R&R
study, the factors are Parts and Operators and generally the number of levels for the parts are 10
and the number of levels for the operators are 3 (see tool – Measurement-Variable Gage R&R
Study).
3. Establish sample size.
4. Measure the CT characteristic.
5. Analyze data with Minitab (part 1 of 2):
• In order to use the Minitab functions, the data has to be formatted in a certain way. Three
columns are required. Two columns contain the identification of the levels for each factor. The
third column contains the measured data from the CT characteristic. To help create the level
codes for the two factors, the function under Calc > Make Patterned Data > Simple Set of
Numbers can be used.
• E.g. if we have one factor 'A' with 2 levels and a second factor 'B' with 4 levels, this function
would be used in Minitab to create the two columns identifying the factor levels.
• See part 2 for continuation of the application cookbook.
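The same "simple set of numbers" pattern can be sketched in a few lines of Python. The loop below builds the two level-code columns for the 2 x 4 example above; the number of repeats per combination (n_reps = 3) is an assumption for illustration:

```python
# Build patterned level-code columns, analogous to
# Calc > Make Patterned Data > Simple Set of Numbers
a_levels, b_levels, n_reps = 2, 4, 3   # n_reps is an illustrative assumption

factor_a, factor_b = [], []
for i in range(1, a_levels + 1):       # level codes for factor A
    for j in range(1, b_levels + 1):   # level codes for factor B
        for _ in range(n_reps):        # one row per repeated observation
            factor_a.append(i)
            factor_b.append(j)
print(len(factor_a))  # → 24 rows, one per observation in the third (data) column
```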

ANOVA – Two Way – Random Factors - Part 2 of 2 (ANOVA Table)


Purpose
To summarize the results of an analysis of variance calculation in a table.

Anatomy

ANOVA - Two Way - Random Factors - Part 2 of 2

Analysis of Variance
Source         DF        SS         MS          F         P
Factor 1        9   2.05871   0.228745    39.7178   0.00000
Factor 2        2   0.04800   0.024000     4.1672   0.03256
Fac 1*Fac 2    18   0.10367   0.005759     4.4588   0.00016
Error          30   0.03875   0.001292
Total          59   2.24912

Source         Variance Component
Factor 1       0.037164
Factor 2       0.000912
Fac 1*Fac 2    0.002234
Error          0.00129

(A: Source; B: DF; C: SS; D: MS; E: F; F: P; G: variance components.)

Reference: Design and Analysis of Experiments by D. C. Montgomery – Ch. 11 P. 470-475

Terminology
A. Source – Indicates the different sources of variation that are decomposed in the ANOVA table. These
sources are: factor 1, factor 2, the interaction between factors 1 & 2, the error of the repeated
observations within each combination of the factor levels, and the total variation.
B. DF – The number of degrees of freedom related to each sum of square (SS). They are the
denominators of the estimate of variance.
C. SS – The sums of squares measure the variability associated with each source. They are the
numerators of the estimate of variance.
D. MS – Mean Square is the estimate of the variance for the factors and error sources. Computed by
MS = SS/DF.
E. F – F statistics that permit hypothesis testing for factors 1, 2 and the interaction effect.
F. P-Value – This value has to be compared with the alpha level (α) and the following decision rule is
used: if P < α, reject H0 and accept Ha with (1-P)100% confidence; if P ≥ α, don't reject H0.
G. The estimators of the variance components (σ̂², σ̂τβ², σ̂β², σ̂τ²).

Major Considerations
A drawback of the analysis of variance method for estimating the variance components (σ̂τ², σ̂τβ², σ̂β², σ̂²)
is that it may provide negative estimates. This indicates that the model being fit is inappropriate for the
data. One should assume that a negative estimate means that the variance component is really zero
and fit a reduced model that does not include the corresponding effect.

Application Cookbook
1. Analyze the data with Minitab (part 2 of 2):
• Use the function under Stat > ANOVA > Balanced ANOVA.
• Input the name of the column which contains the measurement of the CT characteristic into
the 'Response' field, and the names of the columns that contain the factors into the 'Model'
field. The symbol (|) should be introduced between the two factors, e.g. Parts | Operators, to
obtain the complete model.
• In order to obtain a random effect model, you have to input the name of the two factors into
the option 'Random Factors'. To obtain the estimators of the variance component go to
'Options' and select the 'Display expected mean squares'.
• In order to verify the assumption of the model, select the 'Graphs' option and highlight all the
available residual Plots (to interpret these plots see tool 'Residual Plots').
• If the residual plots show any abnormalities, the results of the analysis of variance may be
distorted. An investigation should be conducted to determine the causes for such
abnormalities.
• For a gage R&R study, use the function under Stat > Quality Tools > Gage R&R Study and
ensure that the method of analysis is ANOVA.
2. Make a statistical decision from the session window output of Minitab. Either accept or reject H0 for
each effect. When H0 is rejected we can conclude that an effect is statistically significant. In other
words, the effect has an influence on the CT characteristic.
3. Translate statistical conclusion into practical decision about the CT characteristic.
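The variance components in the Part 2 table follow from the standard ANOVA-method estimators for the balanced two-way random model (Montgomery, Ch. 11). A Python sketch, with the factor sizes read off the DF column of that table (a = 10 levels of factor 1, b = 3 levels of factor 2, n = 2 repeats):

```python
# ANOVA-method variance component estimates, balanced two-way random model
a, b, n = 10, 3, 2                     # levels of factor 1, factor 2, repeats
ms_f1, ms_f2 = 0.228745, 0.024000      # mean squares from the Part 2 table
ms_int, ms_err = 0.005759, 0.001292    # interaction and error mean squares

var_error = ms_err                          # sigma^2
var_interaction = (ms_int - ms_err) / n     # sigma^2_tau_beta
var_factor2 = (ms_f2 - ms_int) / (a * n)    # sigma^2_beta
var_factor1 = (ms_f1 - ms_int) / (b * n)    # sigma^2_tau
print(round(var_factor1, 6))                # → 0.037164, matching the table
```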

ANOVA N-Way Two-Level Part 1 of 2
Purpose
To analyze the effect of N fixed factors on a CT characteristic (see concept Factors and Levels). A fixed
effects model is one in which the conclusions apply to the factor levels considered in the analysis, and
cannot be extended to those not explicitly considered.

Anatomy

ANOVA N-Way Two-Level - Fixed Factors - Part 1 of 2

A. Model:
   Yijkl = µ + τi + βj + γk + (τβ)ij + (τγ)ik + (βγ)jk + (τβγ)ijk + εijkl

B. Hypotheses:
   H0: τ1 = τ2 = … = τa = 0      Ha: At least one τi ≠ 0
   H0: β1 = β2 = … = βb = 0      Ha: At least one βj ≠ 0
   H0: γ1 = γ2 = … = γc = 0      Ha: At least one γk ≠ 0
   H0: (τβ)ij = 0                Ha: At least one (τβ)ij ≠ 0
   H0: (τγ)ik = 0                Ha: At least one (τγ)ik ≠ 0
   H0: (βγ)jk = 0                Ha: At least one (βγ)jk ≠ 0
   H0: (τβγ)ijk = 0              Ha: At least one (τβγ)ijk ≠ 0

Reference: Design and Analysis of Experiments, Ch. 6

Terminology
A. The model, where Yijkl is the (ijkl)th observation of the CT characteristic (i = 1, 2, …, a; j = 1, 2, …, b;
k = 1, 2, …, c; l = 1, 2, …, n) for "a" levels of factor A, "b" levels of factor B, "c" levels of factor C,
and n observations in each combination of the factor levels. µ represents the overall mean, τi (tau i)
the effect of factor A, βj (beta j) the effect of factor B, and γk (gamma k) the effect of factor C.
(τβ)ij, (τγ)ik and (βγ)jk represent the two-factor interaction effects, (τβγ)ijk represents the full
interaction between all the factors, and εijkl (epsilon ijkl) is the error component. Note that this is an
example of a three-way two-level ANOVA.
B. Null (H0) and alternative (Ha) hypotheses, where each effect is compared with 0. For each effect, if
H0 is true the levels are similar, but if Ha is true at least one level has a significant effect on the CT
characteristic.

Major Considerations

The example presented is a three-factor, two-level ANOVA. The complexity of the linear model and the
statistical hypotheses increases rapidly as the number of factors and interactions increases.

Application Cookbook
1. Define problem and state the objective of the study.
2. Identify the factors for study, and the levels associated with these factors. Establish sample size.
3. Measure the CT characteristic.
4. Analyze data with Minitab (part 1 of 2):
   • In order to use the Minitab functions, the data has to be formatted in a certain way. For N-Way
     ANOVA, N+1 columns are required: N columns contain the identification of the levels for each
     factor, and the (N+1)th column contains the measured data from the CT characteristic. To help
     create the level codes for the factors, the function under Stat > DOE > Create Factorial
     Design can be used.
   • See part 2 for continuation of the application cookbook.

ANOVA N-Way Two-Level Part 2 of 2


Purpose
To summarize, in tabular form, the results of an Analysis of Variance calculation for a multi-factor,
two-level Factorial Experiment.

Anatomy

ANOVA N-Way Two-Level - Part 2 of 2 (ANOVA Table)

Analysis of Variance

Source    DF   Seq SS   Adj SS   Adj MS        F       P
A          1   6864.8   6864.8   6864.8   262.01   0.000
B          1     69.7     69.7     69.7     2.66   0.122
C          1   1547.2   1547.2   1547.2    59.05   0.000
A*B        1     84.0     84.0     84.0     3.21   0.092
A*C        1      4.8      4.8      4.8     0.18   0.675
B*C        1    115.7    115.7    115.7     4.42   0.052
A*B*C      1      4.8      4.8      4.8     0.18   0.675
Error     16    419.2    419.2     26.2
Total     23   9110.2

(Columns A through G in the Terminology below: Source, DF, Seq SS, Adj SS, Adj MS, F, P.)

Reference: Design and Analysis of Experiments, Ch. 6

Terminology
A. Source – Indicates the different variation sources decomposed in the ANOVA table, the Factors, or
Main Effects, and the Interactions. The "Error" is the variation within each treatment combination. The
"Total" is the total variation in the CT characteristic (experimental response).
B. DF – The number of degrees of freedom related to each sum-of-squares (SS). They are the
denominators of the estimate of variance.
C. Seq SS – The Sequential sums-of-squares measure the variability associated with each source. They
are the variance estimate's numerators. The SS for the Main Effects and Interactions is due to the
change in factor levels. The formula for a sum of squares is SS = (Contrast)²/(nT), where Contrast =
(Σ+Y)-(Σ-Y), n = number of replications, and T = number of runs of the basic design matrix, before
replication.
D. Adj SS – The adjusted Sum of Squares. When the experimental model is orthogonal, the sequential
and adjusted Sum of Squares will be the same. Calculations of the MS, F and P values are all based
upon the adjusted sum of squares.
E. MS – Mean Square is the estimate of the variance for the factors, interactions, and error sources.
Computed by MS = Adj SS/DF.
F. F – The ratio of the mean square for the "Factor" or interaction and the mean square for the "Error".
When the error is zero, the F statistic cannot be calculated.
G. P-Value – This value has to be compared with the alpha level (α) and the following decision rule is
used: if P < α, reject H0 and accept Ha with (1-P) 100% confidence; if P ≥ α, don't reject H0.
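The Seq SS formula in C can be sketched for one effect of a 2³ design. The sign column follows Yates standard order for factor A; the run totals below are illustrative, not taken from the table above:

```python
# Seq SS for one effect: SS = Contrast^2 / (n*T), with
# Contrast = (sum of +1-level run totals) - (sum of -1-level run totals)
signs_a = [-1, +1, -1, +1, -1, +1, -1, +1]   # factor A column, Yates standard order
run_totals = [60.0, 72.0, 54.0, 68.0, 52.0, 83.0, 45.0, 80.0]  # totals over n replicates
n, T = 3, 8                                  # replicates, runs before replication

contrast_a = sum(s * y for s, y in zip(signs_a, run_totals))
ss_a = contrast_a ** 2 / (n * T)
print(contrast_a, ss_a)
```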

Major Considerations

Application Cookbook
1. Analyze the data using Minitab (part 2 of 2):
   • Use the function under Stat > ANOVA > General Linear Model.
   • Input the name of the column that contains the measurement of the CT characteristic into the
     "Response" field, and the names of the columns containing the factors into the "Model" field.
     The "pipe" symbol, |, should be introduced between all factors, e.g. A|B|C, to obtain the
     complete model.
   • In order to verify the assumptions of the model, select the "Graphs" option and highlight all the
     available Residual Plots (to interpret these plots, refer to the tool "Residual Plots").
   • If the residual plots show any abnormalities, the results of the analysis of variance may be
     distorted. An investigation should be conducted to determine the causes for such abnormalities.
2. Make a statistical decision from the Session window output from Minitab. Either accept or reject H0
for each effect. When H0 is rejected, we can conclude that an effect is statistically significant. In other
words, the effect has an influence on the CT characteristic.
3. Translate the statistical conclusion into a practical decision about the CT characteristic.

Box Plot

Purpose

To graphically compare the data from a process at different levels, or states, of the process. Box plots
display the main features of a batch of data, and permit simple comparisons of several batches. They
can display, in an easy to understand graphical form, the Median, the First and Third Quartiles of the
data sample, the Range, and any outliers.

Anatomy

Box Plot

[Figure: box plots of a process Output (y-axis, approx. 200 to 350) for Factors 1 through 5
(x-axis), with the features described in the Terminology below labeled A through G and
outliers marked with asterisks (*).]

Reference: Basic Statistics: Tools for Continuous Improvement P. 2.71 – 2.73

Terminology
A. Process variable being studied.
B. Process number, or other form of indicator which differentiates one sub-group of data from another.
C. First Quartile (Q1) – The lower edge of the Box represents the First Quartile, the observation (data
   point) which is located at position (n+1)/4 in the sorted data.
D. Third Quartile (Q3) – The upper edge of the Box represents the Third Quartile, the observation (data
   point) which is located at position 3(n+1)/4 in the sorted data.
E. Whiskers, whose upper and lower extremes are marked by the limits:
   Lower Limit = Q1 - 1.5(Q3 - Q1)    Upper Limit = Q3 + 1.5(Q3 - Q1)
F. Median of the individual sub-group.
G. Outliers.

Major Considerations

The best tool for drawing Box plots is Minitab, using the Graph>Boxplot function.
Box Plots are also known as Box Whisker Plots.
Some references to Box Plots state that the total height of the Box Plot is simply the Range of the data.
The calculation of the Q1-1.5(Q3-Q1) limit, with the presentation of outliers, is specific to Minitab.

Application Cookbook
1. Gather data and divide into sub-groups, if applicable.
2. For each sub-group, determine number of data points.
3. For each sub-group, calculate the First and Third Quartile values.
4. For each sub-group, calculate the Median value.
5. For each sub-group, calculate the upper and lower values of the whiskers.
6. Draw the box-plots for each sub-group.
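Steps 2 through 5 can be sketched in Python for one sub-group. The data values are illustrative, and Minitab's interpolation details may differ slightly; the quartile positions are the (n+1)/4 and 3(n+1)/4 rules given above:

```python
# One sub-group of illustrative data, sorted ascending
data = sorted([245, 250, 252, 255, 258, 260, 262, 265, 270, 310, 330])
n = len(data)

def at_position(pos):
    """Value at a 1-based, possibly fractional, position in the sorted data."""
    lo = int(pos) - 1
    frac = pos - int(pos)
    if lo + 1 >= n:
        return data[-1]
    return data[lo] + frac * (data[lo + 1] - data[lo])

q1 = at_position((n + 1) / 4)        # First Quartile, position (n+1)/4
q3 = at_position(3 * (n + 1) / 4)    # Third Quartile, position 3(n+1)/4
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr         # whisker limits; points beyond are outliers
upper_limit = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_limit or x > upper_limit]
print(q1, q3, outliers)              # → 252 270 [310, 330]
```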

Breakthrough Strategy
Purpose
The Six Sigma Breakthrough Strategy is the rigorous and data-driven four-stage methodology for
achieving near-perfect process capability, yielding tremendous increases in product and process quality
and customer satisfaction.

Anatomy

The Breakthrough Strategy

A. Breakthrough Strategy
   B. Characterization
      D. Phase 1: Measurement
      E. Phase 2: Analysis
   C. Optimization
      F. Phase 3: Improvement
      G. Phase 4: Control

Reference:

Terminology
A. The Breakthrough Strategy - The four-step procedure to "break through" the so-called 5 Sigma Wall
and achieve near-perfect process capability.
B. Process Characterization - The first half of the Breakthrough Strategy, where the process is defined
and analyzed to determine its current state and the nature of any problems.
C. Process Optimization - The second half of the Breakthrough Strategy, where the process is optimized
and then put in a state of statistical control.
D. Phase 1 of Breakthrough – Measurement.
E. Phase 2 of Breakthrough – Analysis.
F. Phase 3 of Breakthrough – Improvement.
G. Phase 4 of Breakthrough – Control.

Major Considerations

Application Cookbook
Phase 1 (Measurement)
This phase is concerned with selecting one or more product characteristics; i.e., dependent variables,
mapping the respective process, making the necessary measurements, recording the results on process
"control cards," and estimating the short- and long-term process capability.
Phase 2 (Analysis)
This phase entails benchmarking the key product performance metrics. Following this, a gap analysis is
often undertaken to identify the common factors of successful performance; i.e., what factors explain
best-in-class performance. In some cases, it is necessary to redesign the product and/or process.
Phase 3 (Improvement)
This phase is usually initiated by selecting those product performance characteristics which must be
improved to achieve the goal. Once this is done, the characteristics are diagnosed to reveal the major
sources of variation. Next, the key process variables are identified by way of statistically designed
experiments. For each process variable that proves to be leverage in nature, performance
specifications are established.
Phase 4 (Control)
This phase is related to ensuring that the new process conditions are documented and monitored via
statistical process control methods. After a "settling in" period, the process capability would be
reassessed. Depending upon the outcomes of such a follow-on analysis, it may be necessary to revisit
one or more of the preceding phases.

Chi-Square – Goodness Of Fit Test
Purpose
To determine whether the observed frequencies in a sample could occur by chance when sampling from
a population with an assumed distribution.

Anatomy

Chi-Square - Goodness of Fit Test

                  Frequency
   Response    Observed    Expected
   1              f1          F1
   2              f2          F2
   3              f3          F3
   …              …           …
   i              fi          Fi
   …              …           …
   r              fr          Fr

   χ² = Σ (i = 1 to r) [ (fi - Fi)² / Fi ]

   χ²crit = χ²(α, r-1)

Reference: Black Book, 25.2

Terminology
A. i – the ith category of the response.
B. fi – observed frequency of response for category i.
C. Fi – expected frequency of response for category i.
D. χ² – The Chi-Square statistic.
E. χ²crit – the critical Chi-Square value, to be compared with χ².
F. α – The level of statistical error associated with a decision based on the statistic.
G. (r - 1) – The degrees of freedom = (# of categories - 1).
H. r – The number of response categories.

Major Considerations
The sample size must be large enough to have each cell populated by 5 or more observations.
The population must be sampled at random.

Application Cookbook
1. Sample from the population to obtain the observed frequencies, e.g. roll a die n times and count the
   1s, 2s, 3s, etc.
2. Enter the expected frequencies (e.g. for a die, Fi = (1/6) × n).
3. Calculate the χ² statistic.
4. Determine the critical χ² and compare it to the calculated value.
5. If χ²crit > χ²calc, then the observed distribution of frequencies could have occurred purely by chance
   when sampling from the assumed distribution. (The die is not biased.)
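A worked sketch of the first three cookbook steps for the die example; the observed counts below are illustrative:

```python
# Goodness-of-fit test for a die rolled 60 times (illustrative counts)
observed = [8, 12, 11, 9, 10, 10]      # counts of 1s through 6s
n = sum(observed)                      # 60 rolls
expected = [n / 6] * 6                 # Fi = (1/6) x n for a fair die

chi_sq = sum((f - F) ** 2 / F for f, F in zip(observed, expected))
df = len(observed) - 1                 # r - 1 = 5 degrees of freedom
print(chi_sq, df)                      # compare chi_sq with the critical value at alpha, df
```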

Chi-Square – Test Of Homogeneity
Purpose
To compare observed and expected frequencies of occurrence in a contingency table to test for
independence of the variables.

Anatomy

Chi-Square - Test of Homogeneity

             A1     A2     ...    Ak    Total
   B1       f11    f12                   L1
            F11    F12
   B2       f21    f22                   L2
            F21    F22
   ...
   Bi              fij ... fik           Li
                   Fij     Fik
   ...
   Br                      frk           Lr
                           Frk
   Total     C1     C2     Cj     Ck      n

   H0: the variables are independent
   Ha: the variables are not independent

   χ² = Σ (i = 1 to r) Σ (j = 1 to k) [ (fij - Fij)² / Fij ]
   d.f. = (r - 1)(k - 1)

Reference: Black Book, 25-16

Terminology
A. Variable A
B. Variable B
C. k – Number of columns (equal to number of levels of Variable A)
D. r – Number of rows (equal to number of levels of Variable B)
E. fij – observed frequency of joint occurrence of variables
F. Fij – expected frequency of joint occurrence of variables
G. Li – Total of all occurrences of Bi
H. Cj – Total of all occurrences of Aj
I. n – Total number of observations
J. d.f. – Degrees of Freedom

Major Considerations
The population must be sampled at random.
The sample size must be large enough for the expected frequency of each cell to be 5 or greater.
The variables are assumed to be independent.

Application Cookbook
1. Cross tabulate variables and create the contingency table per the Crosstabulation and Contingency
Table tool.
2. Calculate the χ2 statistic.
3. Determine the degrees of freedom.
4. Choose α.
5. Determine χ²crit from tables or Excel using CHIINV and the appropriate α and degrees of freedom.
6. If χ²calc < χ²crit, accept Ho, i.e. the variables are independent with a confidence level equal to (1
- α). Otherwise, reject Ho; the variables cannot be considered independent.
7. An alternative to the manual calculation is to enter the table in Minitab and use
STAT → TABLES → Chi-Square Test
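Steps 1 to 3 and 6 above can be sketched in Python for a hypothetical 2×3 table (counts invented so that every expected cell is 5 or greater; χ²crit = 5.991 for α = 0.05 and d.f. = 2 comes from standard tables):

```python
# Hypothetical 2x3 contingency table of observed frequencies.
observed = [[20, 30, 25],
            [30, 20, 25]]

row_totals = [sum(row) for row in observed]         # Li
col_totals = [sum(col) for col in zip(*observed)]   # Cj
n = sum(row_totals)

# Expected frequency of each cell: Fij = Li * Cj / n
expected = [[L * C / n for C in col_totals] for L in row_totals]

# Chi-square statistic summed over all cells
chi_sq = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
             for i in range(len(observed))
             for j in range(len(observed[0])))

# d.f. = (r - 1)(k - 1) = 2; compare to the critical value for alpha = 0.05
independent = chi_sq < 5.991
```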

Chi-Square – Yates Correction
Purpose
To correct for error in the calculated Chi-Square in a situation where the actual frequencies are discrete,
the degrees of freedom = 1, and the sample size is relatively small. In such a case, the difference between
the discrete observed values and the continuous values of the theoretical distribution can have a
significant effect on the result.

Anatomy

Chi-Square - Yates Correction

χ² = Σ (|fo − Fe| − 0.5)² / Fe ,  summed over i = 1 to r

Six Sigma - Tools & Concepts ChiYatCo_001

Terminology
A. χ² – The Chi-Square Statistic.
B. r – Number of response categories.
C. | fo – Fe | - The absolute value of observed (discrete) minus expected (continuous) frequencies.
D. 0.5 – Yates correction.

Major Considerations
The Yates correction applies in cases of df = 1 and a relatively small sample size.
Minitab does not provide for Yates correction.
The value of χ² calculated with the Yates Correction is used as χ²calc when comparing against χ²crit.

Application Cookbook
1. Subtract expected value from observed value for each cell.
2. Subtract 0.5 from the absolute value of the difference between observed and expected frequencies in
each cell.
3. Square the result.
4. Divide by the expected frequency.
5. Sum over all cells.
This value will be the Chi-Square Statistic – χ²calc.
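The five steps can be sketched in Python for a hypothetical two-category case with df = 1 (e.g. 60 coin flips, 36 heads and 24 tails observed against 30 of each expected):

```python
# Observed and expected frequencies for a hypothetical df = 1 case.
observed = [36, 24]
expected = [30, 30]

# Steps 1-5: |fo - Fe|, minus 0.5, squared, divided by Fe, summed.
chi_sq_yates = sum((abs(fo - Fe) - 0.5) ** 2 / Fe
                   for fo, Fe in zip(observed, expected))

# For comparison, the uncorrected statistic is larger:
chi_sq_plain = sum((fo - Fe) ** 2 / Fe for fo, Fe in zip(observed, expected))
```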

Confidence Interval - Mean
Purpose

To estimate the true mean of a continuous CT characteristic within a range of values based on
observations from a sample. A pre-assigned probability called “confidence level” is used to determine
this range of values. For a confidence level of 95%, we can say that there is a 95% chance that
the confidence interval contains the true mean of the CT characteristic.

Anatomy

Confidence Interval - Mean

x̄ − t(α/2; n−1) · s/√n  ≤  µ  ≤  x̄ + t(α/2; n−1) · s/√n

LCL        µ        UCL

Six Sigma - Tools & Concepts ConIntMn_001

Reference: Juran’s Quality Control Handbook - Ch. 23 P. 45-50, Business Statistics by Downing and Clark - Ch. 11 P. 211-218

Terminology
A. True or Population Mean.
B. Sample Mean.
C. Sample Standard Deviation.
D. Sample Size (n).
E. Tabulated Student (t) distribution value with risk α/2 and n−1 degrees of freedom.
F. Lower Confidence Limit (LCL).
G. Upper Confidence Limit (UCL).
H. A representation of a confidence interval. For a confidence level of 95%, we can say that there is a
95% chance that the true mean is somewhere in the interval between the Lower Confidence Limit
(LCL) and the Upper Confidence Limit (UCL).

22
Major Considerations

The assumption for using this interval is that the data come from a normal distribution. The use of the
Student (t) distribution is for sample sizes of less than 30. When the sample size is greater than or equal
to 30, the standard normal distribution (Z) is generally used. However, as the sample size increases, the t
distribution approaches the Z distribution (see Tool Distribution – t).

Application Cookbook
1. Collect a sample of data from the process.
2. Analyze data with Minitab:
– Use the function under Stat>Basic Statistics>1 Sample t.
– Select the Confidence interval option and enter a confidence level; usually this level is 95%
(default setting). The results will appear in the Minitab Session Window output.

Minitab Session Window output :

Confidence Intervals

Variable N Mean StDev SE Mean 95.0 % CI


Data 25 1.929 0.932 0.186 (1.544, 2.313)

Note 1: If the sample size is ≥ 30, the function under Stat>Basic Statistics>1 Sample Z can be used.
Note 2: There is another function that can be used to obtain the confidence interval for the mean under
Stat>Basic Statistics>Descriptive Statistics>Graphs>Graphical Summary. This function presents
not only this interval but also descriptive statistics and graphical representations.
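The interval behind the Minitab output above can be reproduced by hand from the same sample statistics (n = 25, mean = 1.929, s = 0.932); t(0.025; 24) ≈ 2.064 is taken from a Student t table:

```python
import math

# Sample statistics matching the Minitab session output above.
n, xbar, s = 25, 1.929, 0.932
t = 2.064                      # t(alpha/2; n-1) from tables

se = s / math.sqrt(n)          # standard error of the mean
lcl = xbar - t * se
ucl = xbar + t * se
# (lcl, ucl) reproduces Minitab's 95% CI of roughly (1.544, 2.313)
```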

Confidence Interval - Proportion
Purpose
To calculate the confidence interval for a proportion which contains the true value of p with a (1 - α) level
of confidence.

Anatomy

Confidence Interval - Proportion

p̂ − Z(α/2)·√( p̂(1 − p̂)/n )  ≤  p  ≤  p̂ + Z(α/2)·√( p̂(1 − p̂)/n )

Six Sigma - Tools & Concepts ConfInPr_001

Reference: Business Statistics – P. 231

Terminology
A. p̂ is the observed proportion in the sample and is an estimator of the population proportion.
B. p is the population proportion.
C. n is sample size.
D. Zα/2 is the standard normal deviate appropriate to the level of confidence.
E. α is the chosen level of α risk.
F. Standard deviation for p̂ .

Major Considerations
The interval is based on an approximation of the Binomial distribution by a Normal Distribution.
The following inequalities must apply:
• n p̂ ≥ 5;
• n(1 – p̂ ) ≥ 5;
• n should not be less than 30.
The population must be much greater than n.

24
Application Cookbook
1. Determine p̂ from sample data.
2. Decide on α, e.g. .05.
3. Determine Z α/2 from tables or Excel, e.g. Z α/2 = 1.96 for α = .05.
4. Apply the formula.
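The four steps can be sketched in Python with hypothetical data (40 defective units in a sample of 400, α = 0.05 so Z(0.025) = 1.96):

```python
import math

# Step 1: observed proportion from hypothetical sample data.
n = 400
p_hat = 40 / n

# Steps 2-3: alpha = 0.05 gives Z(alpha/2) = 1.96 from tables.
z = 1.96

# Step 4: apply the formula.
se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard deviation of p-hat
lcl = p_hat - z * se
ucl = p_hat + z * se

# Check the normal-approximation conditions from Major Considerations.
assert n * p_hat >= 5 and n * (1 - p_hat) >= 5 and n >= 30
```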

Confidence Interval – Standard Deviation
Purpose
To estimate the true standard deviation of a continuous CT characteristic within a range of values based
on observations from a sample. A pre-assigned probability called "confidence level" is used to determine
this range of values. For a confidence level of 95%, we can say that there is a 95% chance that
the confidence interval contains the true standard deviation of the CT characteristic.

Anatomy

Confidence Interval - Standard Deviation

s·√( (n−1) / χ²(α/2; n−1) )  ≤  σ  ≤  s·√( (n−1) / χ²(1−α/2; n−1) )

LCL      s      σ      UCL

Six Sigma - Tools & Concepts ConIStDv_001

Reference: The Vision of Six Sigma: Tools and Methods for Breakthrough by M. Harry – Ch. 15 P. 2-6, Statistical Quality Control Methods by I. W. Burr Ch. 2 P. 13-17

Terminology
A. True or Population Standard Deviation.
B. Sample Standard Deviation.
C. Sample Size (n).
D. Tabulated value of a Chi-square (χ²) distribution with risk α/2 and n−1 degrees of freedom.
E. Tabulated value of a Chi-square (χ²) distribution with risk 1−α/2 and n−1 degrees of freedom.
F. Lower Confidence Limit (LCL).
G. Upper Confidence Limit (UCL).
H. A representation of a confidence interval. For a confidence level of 95%, we can say that there is a
95% chance that the true standard deviation is somewhere in the interval between the Lower
Confidence Limit (LCL) and the Upper Confidence Limit (UCL).

Major Considerations
The assumption for using this interval is that the data comes from a normal distribution.

26
Application Cookbook
1. Collect a sample of data from the process.
2. Use the following Minitab function. The function available presents not only this interval but also
descriptive statistics and graphical representations:
• Use the function under Stat>Basic Statistics>Descriptive Statistics>Graphs>Graphical
Summary.
• Select the Graphs and Graphical Summary options. Enter a confidence level, usually this
level is 95%.

Descriptive Statistics
Variable: Data

Anderson-Darling Normality Test
A-Squared: 0.230
P-Value: 0.785

Mean          1.92854
StDev         0.93240
Variance      0.869375
Skewness      -5.9E-02
Kurtosis      -1.01358
N             25

Minimum       0.06105
1st Quartile  1.24938
Median        1.83322
3rd Quartile  2.70713
Maximum       3.48078

95% Confidence Interval for Mu:     1.54366  2.31342
95% Confidence Interval for Sigma:  0.72805  1.29711
95% Confidence Interval for Median: 1.44667  2.51378
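The interval for Sigma in the output above can be reproduced by hand from the same sample statistics (n = 25, s = 0.93240); the chi-square table values for 24 degrees of freedom are χ²(0.025; 24) = 39.364 and χ²(0.975; 24) = 12.401:

```python
import math

# Sample statistics matching the Minitab summary above.
n, s = 25, 0.93240
chi2_lower_tail = 39.364   # chi-square with risk alpha/2
chi2_upper_tail = 12.401   # chi-square with risk 1 - alpha/2

lcl = s * math.sqrt((n - 1) / chi2_lower_tail)
ucl = s * math.sqrt((n - 1) / chi2_upper_tail)
# (lcl, ucl) reproduces Minitab's 95% CI for Sigma of roughly (0.728, 1.297)
```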

Control Chart – Moving Range (MR) Chart
Purpose
To observe and evaluate the behavior of a process over time, and against control limits, and take
corrective action if necessary. The MR Chart plots the moving range between individual process output
values, generally the difference between two successive readings. The MR Chart is used to visualize the
process dispersion, and is usually plotted in conjunction with the X (Individuals) Chart, which is used to
visualize the process location.

Anatomy

Control Chart - Moving Range (MR)

Moving Range Chart (Moving Range vs Observation Number)

UCL(MR) = D4 · MRbar      (here 3.0SL = 5.636)
MRbar = 1.725
LCL(MR) = D3 · MRbar      (here −3.0SL = 0.000)

Six Sigma - Tools & Concepts Ctrl_MR_001

Reference: Statistical Process Control – Ford/GM/Chrysler pp75 - 78

Terminology
A. Moving Range – The Moving Range values as calculated from sequential, or chronological, readings
from the process
B. Observation Number – The chronological index number for the individual moving range value being
referenced
C. Lower Control Limit (LCL) – Line and numerical value representing the lower limit of the variation that
could be expected if the process were in a state of statistical control, equal to the average Moving
Range over the period, multiplied by a conversion factor.
D. Average of the Moving Range– Average value of the individual Moving Range values, over the period
of inspection being referenced
E. Upper Control Limit (UCL) – Line and numerical value representing the upper limit of the variation
that could be expected if the process were in a state of statistical control. It is equal to the average
Moving Range over the period, multiplied by a second conversion factor, different from the one used
to calculate the LCL.
F. Plot of the Moving Range values vs observation number. Any excursion in this plot above the UCL or
below the LCL represents an out-of-control condition and should be investigated

Major Considerations
Moving Range values are correlated, because each successive point has a point in common with the
preceding one. Care must be exercised in interpretation.
There will always be one less Moving Range value than there are Individuals values.

Application Cookbook
1. Determine purpose of the chart
2. Select data collection point
3. Set up forms for recording and charting data and write specific instructions on use of the chart
4. Collect and record data. A minimum of 20 individual readings should be taken. Note that even though
the measurements are sampled individually, the number of readings grouped to form the moving
range determines the nominal sample size "n"
5. Compute the Moving Range MRi between n consecutive values
6. Compute the Average Moving Range MR
7. Compute Upper Control Limit UCLMR
8. Compute Lower Control Limit LCLMR
9. Plot data points
10. Interpret chart together with other pertinent sources of information on the process and take corrective
action if necessary
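Steps 5 to 8 can be sketched in Python for a moving range of n = 2 consecutive readings. The individuals data are hypothetical; D4 = 3.267 and D3 = 0 are the standard conversion factors for n = 2:

```python
# Hypothetical individual process readings, in chronological order.
data = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 10.3]

# Step 5: moving ranges between successive readings
# (always one less value than there are individuals).
mr = [abs(b - a) for a, b in zip(data, data[1:])]

# Step 6: average moving range
mr_bar = sum(mr) / len(mr)

# Steps 7-8: control limits using the n = 2 conversion factors
ucl_mr = 3.267 * mr_bar   # D4 = 3.267
lcl_mr = 0.0              # D3 = 0
```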

Control Chart – np Chart
Purpose
To observe and evaluate the behavior of a process over time, and take corrective action if necessary.
The np Chart plots numbers of defective units and is applicable to binomially distributed discrete defect
data collected from subgroups of equal size. np Charts differ from p Charts in that they plot the actual
number of defective units, rather than the proportion of defective units.

Anatomy

Control Chart - np Chart

NP Chart for Defect_np (Sample Count vs Sample Number)

UCL(np) = np̄ + 3·√( np̄·(1 − p̄) )      (here 3.0SL = 96.87)
np̄ = 73.16
LCL(np) = np̄ − 3·√( np̄·(1 − p̄) )      (here −3.0SL = 49.45)

Six Sigma - Tools & Concepts Ctrl_NP_001

Reference: Statistical Process Control – Ford/GM/Chrysler pp111 - 112

Terminology
A. Sample Count – Numbers of defective units observed
B. Sample Number – The chronological index number for the sample, or subgroup, whose numbers of
defective units is being referenced
C. Lower Control Limit (LCL) –Represents the lower limit of the variation that could be expected if the
process were in a state of statistical control, by convention equal to the Mean minus three Standard
Deviations
D. Process Average Number of Units Nonconforming – (np) Average value of the number of defective
units, over the period of inspection being referenced
E. Upper Control Limit (UCL) – Represents the upper limit of the variation that could be expected if the
process were in a state of statistical control, by convention equal to the Mean plus three Standard
Deviations
F. Plot of number of units nonconforming vs sample number. Any point in this plot above the UCL or
below the LCL represents an out-of-control condition to be investigated
G. np Chart – The title "np" Chart refers to the number of units nonconforming in a subgroup, where n is
the subgroup size and p is the probability of a defective unit
H. Out of Control Point – By definition, any point that exceeds either the UCL or the LCL is out of control.
Minitab has a number of tests available for out of control conditions, and labels each point with a
number corresponding to the test which the point fails

Major Considerations

The np Chart plots the number of units defective, and not the number of defects
The use of an np Chart is preferred over the p Chart if using the actual number defective is more
meaningful than the defectives rate, and the subgroup, or sample, size remains constant from period to
period
Large subgroup sizes should always be selected (n>50 is considered normal), and the np value should
always be greater than 5

Application Cookbook
1. Determine purpose of the chart
2. Select data collection point
3. Establish basis for subgrouping
4. Establish sampling interval and determine sample size
5. Set up forms for recording and charting data and write specific instructions on use of the chart
6. Collect and record data. It is recommended that at least 20 samples be used to calculate the Control
Limits
7. Count each "npi", the number of nonconforming units for each of the i subgroups
8. Compute the Process Average Number of Units Nonconforming np
9. Compute Upper Control Limit UCLnp
10. Compute Lower Control Limit LCLnp
11. Plot data points
12. Interpret chart together with other pertinent sources of information on the process and take corrective
action if necessary

UCL(np) = np̄ + 3·√( np̄·(1 − p̄) )

np̄ = ( Σ npi ) / k ,  summed over i = 1 to k

LCL(np) = np̄ − 3·√( np̄·(1 − p̄) )
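Steps 7 to 10 can be sketched in Python with hypothetical counts: npi defective units found in each of k = 8 subgroups of constant size n = 100 (note np̄ > 5, per the Major Considerations):

```python
import math

# Hypothetical data: number of defective units in each subgroup.
n = 100
np_counts = [7, 9, 6, 8, 10, 7, 9, 8]
k = len(np_counts)

np_bar = sum(np_counts) / k            # step 8: average number defective
p_bar = np_bar / n

sigma = math.sqrt(np_bar * (1 - p_bar))
ucl_np = np_bar + 3 * sigma            # step 9
lcl_np = max(0.0, np_bar - 3 * sigma)  # step 10 (floored at zero, since
                                       # the formula can go negative)
```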

Control Chart –p Chart
Purpose
To observe and evaluate the behavior of a process over time, and against control limits, and take
corrective action if necessary. The p Chart plots the proportion of nonconforming units collected from
subgroups of equal or unequal size. p Charts differ from np Charts in that they plot the proportion of
defective units, rather than the number of defective units

Anatomy

Control Chart - p Chart

P Chart for Defects (Proportion vs Sample Number)

UCL(p) = p̄ + 3·√( p̄(1 − p̄) / n* )      (here 3.0SL = 0.01139)
p̄ = 0.008496
LCL(p) = p̄ − 3·√( p̄(1 − p̄) / n* )      (here −3.0SL = 0.005601)

Six Sigma - Tools & Concepts Ctrl_P_001

Reference: Statistical Process Control – Ford/GM/Chrysler p91 - 110

Terminology
A. Proportion – Proportion of defective units observed, obtained by dividing the number of defective
units observed in the sample, by the number of units sampled
B. Sample Number – The chronological index number for the sample, or subgroup, whose proportion of
defective units is being referenced
C. Lower Control Limit (LCL) –Represents the lower limit of the variation that could be expected if the
process were in a state of statistical control, by convention equal to the Mean minus three Standard
Deviations. Since the sample size varies, the Lower Control Limit is recalculated each time, resulting
in a "staircase" effect
D. Process Average Proportion Nonconforming – (p) Average value of the proportion of defective units
in each subgroup, over the period of inspection being referenced
E. Upper Control Limit (UCL) – Represents the upper limit of the variation that could be expected if the
process were in a state of statistical control, by convention equal to the Mean plus three Standard
Deviations. Since the sample size varies, the Upper Control Limit is recalculated each time, resulting
in a "staircase" effect
F. Plot of proportion nonconforming vs sample number. Any point in this plot above the UCL or below
the LCL represents an out-of-control condition to be investigated
G. p Chart – The title "p" Chart refers to the proportion of defective units in a subgroup
H. Out of Control Point – By definition, any point that exceeds either the UCL or the LCL is out of
control. Minitab has a number of tests available for out of control conditions, and normally labels each
point with a number corresponding to the test which the point fails. If the sample size is not constant,
however, the tests are not applied.

Major Considerations

The p Chart plots the proportion of units defective, and not the proportion of defects
The use of a p Chart is preferred over the np Chart if using the rate of defective units is more meaningful
than using the actual number of defective units, and the subgroup, or sample, size varies from period to
period.
Large subgroup sizes should always be selected (n>50 is considered normal), and the np value should
always be greater than 5

Application Cookbook
1. Determine purpose of the chart
2. Select data collection point
3. Establish basis for subgrouping
4. Establish sampling interval and determine sample size
5. Set up forms for recording and charting data and write specific instructions on use of the chart
6. Collect and record data. It is recommended that at least 20 samples be used to calculate the Control
Limits
7. Compute pi, the proportion nonconforming, for each of the i subgroups
8. Compute the Process Average Proportion Nonconforming p
9. Compute Upper Control Limit UCLp
10. Compute Lower Control Limit LCLp
11. Plot data points
12. Interpret chart together with other pertinent sources of information on the process and take
corrective action if necessary

UCL(p) = p̄ + 3·√( p̄(1 − p̄) / n* )

p̄ = ( Σ npi ) / ( Σ ni ) ,  summed over i = 1 to k

LCL(p) = p̄ − 3·√( p̄(1 − p̄) / n* )

*Note: as the subgroup size n changes, the UCL and LCL must be recalculated for each subgroup
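Steps 7 to 10 can be sketched in Python with hypothetical data and varying subgroup sizes, which produces the "staircase" limits described above:

```python
import math

# Hypothetical data: ni units inspected per subgroup, npi defective.
n_sizes = [500, 550, 480, 520]
np_counts = [4, 6, 3, 5]

# Step 8: process average proportion nonconforming
p_bar = sum(np_counts) / sum(n_sizes)

# Steps 9-10: limits recalculated for each subgroup size (staircase);
# the lower limit is floored at zero where the formula goes negative.
ucl = [p_bar + 3 * math.sqrt(p_bar * (1 - p_bar) / n) for n in n_sizes]
lcl = [max(0.0, p_bar - 3 * math.sqrt(p_bar * (1 - p_bar) / n))
       for n in n_sizes]
```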

Control Charts – R (Range) Chart
Purpose
To observe and evaluate the variation of a process over time, and against control limits, and take
corrective action if necessary. The R Chart plots the range values, or the difference between the highest
and lowest values, for a series of subgroups. The R Chart is usually plotted in conjunction with the Xbar
Chart

Anatomy

Control Chart - R Chart

R Chart (Sample Range vs Sample Number)

UCL(R) = D4 · Rbar      (here 3.0SL = 6.445)
Rbar = 3.048
LCL(R) = D3 · Rbar      (here −3.0SL = 0.000)

Six Sigma - Tools & Concepts Ctrl_R_001

Reference: Statistical Process Control – Ford/GM/Chrysler pp29-64

Terminology
A. Sample Range – The Range values as calculated from sequential, or chronological, subgroups from
the process
B. Sample Number – The chronological index number for the individual sample range value being
referenced
C. Lower Control Limit (LCL) – Line and numerical value representing the lower limit of the variation that
could be expected if the process were in a state of statistical control, equal to the average Range
over the period, multiplied by a conversion factor.
D. Process Average Range – Average value of the individual sample ranges, over the period of
inspection being plotted
E. Upper Control Limit (UCL) – Line and numerical value representing the upper limit of the variation
that could be expected if the process were in a state of statistical control. It is equal to the average
Range over the period, multiplied by a second conversion factor, different from the one used to
calculate the LCL.
F. Plot of the Range values vs sample number.

Major Considerations
The Xbar Chart, together with the R Chart, is a sensitive control chart for identifying assignable causes of
product and process variation, and gives great insight into short-term variations.

Application Cookbook
1. Determine purpose of the chart
2. Select data collection point
3. Establish basis for sub grouping
4. Establish sampling interval and determine sample size n
5. Set up forms for recording and charting data and write specific instructions on use of the chart
6. Collect and record data.
7. Compute the Average Range R
8. Compute Upper Control Limit UCLR
9. Compute Lower Control Limit LCLR
10. Plot data points
11. Interpret chart together with other pertinent sources of information on the process and take corrective
action if necessary
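Steps 6 to 9 can be sketched in Python for hypothetical subgroups of size n = 5; D4 = 2.114 and D3 = 0 are the standard conversion factors for n = 5:

```python
# Hypothetical subgroups of n = 5 readings each.
subgroups = [
    [10.1, 10.4, 9.9, 10.2, 10.0],
    [10.3, 9.8, 10.1, 10.2, 10.4],
    [9.9, 10.0, 10.2, 10.1, 9.8],
]

# Range of each subgroup: highest value minus lowest value.
ranges = [max(g) - min(g) for g in subgroups]

r_bar = sum(ranges) / len(ranges)   # step 7: average range

ucl_r = 2.114 * r_bar               # step 8: D4 = 2.114 for n = 5
lcl_r = 0.0                         # step 9: D3 = 0 for n = 5
```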

Control Charts – Standard Deviation (s) Chart
Purpose
To observe and evaluate the variation of a process over time, and against control limits, and take
corrective action if necessary. The S Chart plots the standard deviation of each of a number of sampled
subgroups. The s Chart is usually plotted in conjunction with the Xbar Chart

Anatomy

Control Chart - Standard Deviation (s) Chart

S Chart (Sample StDev vs Sample Number)

UCL(s) = B4 · sbar      (here 3.0SL = 2.675)
sbar = 1.281
LCL(s) = B3 · sbar      (here −3.0SL = 0.000)

Six Sigma - Tools & Concepts CtrlStDv_001

Reference: Statistical Process Control – Ford/GM/Chrysler pp65 - 68

Terminology
A. Sample StDev – The standard deviations of the process subgroups as collected in sequential, or
chronological, order from the process
B. Sample Number – The chronological index number for the sample, or subgroup, whose standard
deviation is being referenced
C. Lower Control Limit (LCL) – Line and numerical value representing the lower limit of the variation that
could be expected if the process were in a state of statistical control, equal to the average Standard
Deviation over the period, multiplied by a conversion factor.
D. Process Average – Overall average value of the subgroup standard deviations, over the period of
inspection being referenced
E. Upper Control Limit (UCL) – Line and numerical value representing the upper limit of the variation
that could be expected if the process were in a state of statistical control. It is equal to the average
Standard Deviation over the period, multiplied by a second conversion factor, different from the one
used to calculate the LCL.
F. Plot of the sample Standard Deviation values vs sample number.

Major Considerations

The s Chart is a more accurate indicator of process variation than the R Chart, and is recommended for
use with larger sample sizes (generally ≥ 10).
It is less sensitive than the R Chart in detecting special causes of variation that cause only a single value
in a subgroup to be unusual

Application Cookbook
1. Determine purpose of the chart
2. Select data collection point
3. Establish basis for sub grouping
4. Establish sampling interval and determine sample size n
5. Set up forms for recording and charting data and write specific instructions on use of the chart
6. Collect and record data.
7. Compute each subgroup's Mean Xi
8. Compute each subgroup's Standard Deviation si
9. Compute the overall process Standard Deviation s
10. Compute Upper Control Limit UCLs
11. Compute Lower Control Limit LCLs
12. Plot data points
13. Interpret chart together with other pertinent sources of information on the process and take corrective
action if necessary

n is the subgroup size, and k is the number of subgroups


LCL = 0 for n<7
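Steps 8 to 11 can be sketched in Python for k = 3 hypothetical subgroups of size n = 5; B4 = 2.089 and B3 = 0 are the standard conversion factors for n = 5 (so LCL = 0, since n < 7):

```python
import statistics

# Hypothetical subgroups of n = 5 readings each.
subgroups = [
    [10.1, 10.4, 9.9, 10.2, 10.0],
    [10.3, 9.8, 10.1, 10.2, 10.4],
    [9.9, 10.0, 10.2, 10.1, 9.8],
]

# Step 8: each subgroup's sample standard deviation (n - 1 denominator).
s_values = [statistics.stdev(g) for g in subgroups]

s_bar = sum(s_values) / len(s_values)   # step 9: average of the si

ucl_s = 2.089 * s_bar                   # step 10: B4 = 2.089 for n = 5
lcl_s = 0.0                             # step 11: B3 = 0 for n = 5
```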

Control Chart – u Chart
Purpose
To observe and evaluate a process behavior over time and take corrective action if necessary. The u Chart plots
defects per unit data collected from subgroups of equal or unequal size. u Charts differ from c Charts in that they
plot the proportion of defects, rather than the number of defects

Anatomy

Control Chart - u Chart

U Chart for Defects (Sample Count vs Sample Number)

UCL(u) = ū + 3·√( ū / n* )      (here 3.0SL = 1.548)
ū = 1.229
LCL(u) = ū − 3·√( ū / n* )      (here −3.0SL = 0.9109)

Six Sigma - Tools & Concepts Ctrl_U_001

Reference: Statistical Process Control – Ford/GM/Chrysler pp 115 - 118

Terminology
A. Sample Count – Numbers of defects per unit observed
B. Sample Number – The chronological index number for the sample, or subgroup, whose numbers of
defects per unit is being referenced
C. Lower Control Limit (LCL) – Represents the lower limit of the variation that could be expected if the
process were in a state of statistical control, by convention equal to the Mean minus three Standard
Deviations
D. Process Average Number of Defects per Unit – Average value of the number of defects per unit, over
the period of inspection being referenced
E. Upper Control Limit (UCL) – Represents the upper limit of the variation that could be expected if the
process were in a state of statistical control, by convention equal to the Mean plus three Standard
Deviations
F. Plot of number of defects per unit vs sample number. Any excursion in this plot above the UCL or
below the LCL represents an out-of-control condition and should be investigated
G. u Chart – The title "u" Chart refers to the number of defects per unit in a subgroup
H. Out of Control Point – By definition, any point that exceeds either the UCL or the LCL is out of
control. Minitab has a number of tests available for out of control conditions, and labels each point
with a number corresponding to the test which the point fails

Major Considerations
The u Chart may be used in situations where the sample includes more than one unit, and must always be
used when the sample size varies from one period to the next

Application Cookbook
1. Determine purpose of the chart
2. Select data collection point
3. Establish basis for sub grouping
4. Establish sampling interval and determine sample size
5. Set up forms for recording and charting data and write specific instructions on use of the chart
6. Collect and record data.
7. Count each "ci", the number of nonconformities for each of the i subgroups
8. Compute the Process Average Number of Nonconformities per Unit u
9. Compute Upper Control Limit UCLu
10. Compute Lower Control Limit LCLu
11. Plot the data points
12. Interpret chart together with other pertinent sources of information on the process and take corrective
action if necessary
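Steps 7 to 10 can be sketched in Python with hypothetical data: ci defects counted in subgroups of ni inspected units, where the subgroup sizes vary:

```python
import math

# Hypothetical data: defects counted (ci) and units inspected (ni).
c_counts = [12, 15, 9, 14]
n_sizes = [10, 12, 8, 11]

# Step 8: process average number of defects per unit
u_bar = sum(c_counts) / sum(n_sizes)

# Steps 9-10: limits recalculated for each subgroup size; the lower
# limit is floored at zero where the formula goes negative.
ucl = [u_bar + 3 * math.sqrt(u_bar / n) for n in n_sizes]
lcl = [max(0.0, u_bar - 3 * math.sqrt(u_bar / n)) for n in n_sizes]
```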

Control Chart – Xbar Chart
Purpose
To observe and evaluate the behavior of a process over time and take corrective action if necessary. The
Xbar Chart plots the average values of each of a number of small sampled subgroups. The Xbar Chart is
usually plotted in conjunction with the R (Range) Chart or the s (Standard Deviation) Chart

Anatomy

Control Chart - Xbar Chart

X-bar Chart (Sample Mean vs Sample Number)

UCL(Xbar) = Xbarbar + A2 · Rbar      (here 3.0SL = 101.8)
Xbarbar (process average) = 99.94
LCL(Xbar) = Xbarbar − A2 · Rbar      (here −3.0SL = 98.11)

Six Sigma - Tools & Concepts Ctrl_XBr_001

Reference: Statistical Process Control – Ford/GM/Chrysler pp 29 - 68

Terminology
A. Sample Mean – The means of the process subgroups as collected in sequential, or chronological,
order from the process
B. Sample Number – The chronological index number for the sample, or subgroup, whose average
value is being referenced
C. Lower Control Limit (LCL) – Line and numerical value representing the lower limit of the variation that
could be expected if the process were in a state of statistical control, equal to the overall Mean minus
the average Range (or average Standard Deviation) multiplied by a conversion factor.
D. Process Average – Overall average value of the individual process readings, over the period of
inspection being referenced
E. Upper Control Limit (UCL) – Line and numerical value representing the upper limit of the variation
that could be expected if the process were in a state of statistical control, equal to the overall Mean
plus the average Range (or average Standard Deviation) multiplied by a conversion factor.
F. Plot of the individual sample Means vs sample number. Any excursion in this plot above the UCL or
below the LCL represents an out-of-control condition and should be investigated
G. Xbar Chart – The title "Xbar" Chart refers to the sample average value (X) being plotted
H. Out of Control Point – By definition, any point that exceeds either the UCL or the LCL is out of control.
Minitab has a number of tests available for out of control conditions, and labels each point with a
number corresponding to the test which the point fails

Major Considerations
The Xbar Chart, together with the R Chart, is a sensitive control chart for identifying assignable causes of
product and process variation, and gives great insight into short-term variations
The Control Limits for the Xbar Chart are different, depending on whether it is being plotted for use with
the R Chart or the s Chart

Application Cookbook
1. Determine purpose of the chart
2. Select data collection point
3. Establish basis for subgrouping
4. Establish sampling interval and determine sample size n
5. Set up forms for recording and charting data and write specific instructions on use of the chart
6. Collect and record data. A minimum of 25 subgroups or samples of size n should be measured
7. Compute the Process Average X
8. If using the R Chart - Compute the Average Range R (Ref. Tool "Control Chart – R Chart")
9. If using the s Chart - Compute the Average Standard Deviation s (Ref. Tool "Control Chart – s Chart")
10. Compute required Upper Control Limit UCLXbar
11. Compute required Lower Control Limit LCLXbar
12. Plot data points
13. Interpret chart together with other pertinent sources of information on the process and take corrective
action if necessary.
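Steps 7, 8, 10 and 11 can be sketched in Python for use with the R Chart. The subgroup data are hypothetical; A2 = 0.577 is the standard conversion factor for subgroups of size n = 5:

```python
# Hypothetical subgroups of n = 5 readings each.
subgroups = [
    [10.1, 10.4, 9.9, 10.2, 10.0],
    [10.3, 9.8, 10.1, 10.2, 10.4],
    [9.9, 10.0, 10.2, 10.1, 9.8],
]

means = [sum(g) / len(g) for g in subgroups]
x_bar_bar = sum(means) / len(means)            # step 7: process average

# Step 8: average range (see the R Chart tool)
r_bar = sum(max(g) - min(g) for g in subgroups) / len(subgroups)

ucl_xbar = x_bar_bar + 0.577 * r_bar           # step 10: A2 = 0.577
lcl_xbar = x_bar_bar - 0.577 * r_bar           # step 11
```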

Correlation Coefficient
Purpose
To estimate the intensity of the relation between two random variables.

Anatomy

Correlation Coefficient

r = Σ (xi − x̄)(yi − ȳ) / √( Σ (xi − x̄)² · Σ (yi − ȳ)² )

Six Sigma - Tools & Concepts CorrCoef_001

Reference: Juran's Quality Control Handbook Ch. 23 Page 102 & 103

Terminology
A. Coefficient of Linear Correlation – It is an estimate of the intensity of the linear relation between
Variable A and Variable B.
When "r" is close to +1, there is a strong positive correlation between the Variables, i.e. when
Variable A increases, Variable B increases.
When "r" is close to 0, there is very little or no correlation between the Variables.
When "r" is close to -1, there is a strong negative correlation between the Variables, i.e. when
Variable A increases, Variable B decreases and vice versa.
B. The individual values of the independent variable.
C. The Mean value of the independent variable.
D. The individual values of the dependent variable.
E. The Mean value of the dependent variable.

Major Considerations
A large Coefficient of Linear Correlation (i.e. approaches +1 or –1) is an indication of strong relation
between two variables; however, it does not necessarily imply that there is a cause and effect relation
between them. It should also be noted that data has to be entered in ordered pairs.

Application Cookbook
1. Collect data samples
2. Enter the data corresponding to the variables into two separate columns in Minitab
3. Obtain correlation matrix using STAT>BASIC STATISTICS>CORRELATION
4. In the Variables field, select columns corresponding to the two variables. Order of the variables does
not affect the result.
5. Do not select the box for Store Matrix, as this feature is not activated for Minitab Release 11.
6. Interpret the value of Coefficient of Linear Correlation. A large value, i.e. close to +1 or –1 is
indicative of strong positive and negative relations respectively.
7. There is no hard and fast rule on how large the coefficient should be. In many cases, a value
beyond +/- 0.8 is considered reasonable; it truly depends on the application and the consequence
of the interpretation.
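The formula above can also be computed directly; a minimal sketch with hypothetical paired data (in practice, Minitab's correlation command from the cookbook does this):

```python
# Pearson correlation coefficient, computed from the definition.
# Hypothetical paired data; y roughly doubles x, so r should be near +1.
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

xbar = sum(x) / len(x)
ybar = sum(y) / len(y)

num = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
den = math.sqrt(sum((xi - xbar) ** 2 for xi in x)
                * sum((yi - ybar) ** 2 for yi in y))
r = num / den   # close to +1: strong positive correlation
print(round(r, 4))
```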

Crosstabulation and Contingency Table
Purpose
To provide a method of listing the observed and expected frequency of occurrence of classification
variables in a matrix and place them in a contingency table.

Anatomy

Crosstabulation and Contingency Table (Figure CrtabCon_001)

                A1    A2    ...   Ak   | Total
        B1     f11   f12    ...   f1k  |  L1
               F11   F12    ...   F1k  |
        B2     f21   f22    ...   f2k  |  L2
               F21   F22    ...   F2k  |
        ...
        Br     fr1   fr2    ...   frk  |  Lr
               Fr1   Fr2    ...   Frk  |
        ---------------------------------------
        Total   C1    C2    ...   Ck   |  n

Reference: Black Book, 25-16

Terminology
A. Variable A.
B. Variable B.
C. k – Number of columns (equal to number of levels of Variable A).
D. r – Number of rows (equal to number of levels of Variable B).
E. fij – observed frequency of joint occurrence of variables.
F. Fij – expected frequency of joint occurrence of variables.
G. Li – Total of all occurrences of Bi.
H. Cj – Total of all occurrences of Aj.
I. n – Total number of observations.

Major Considerations
• The population must be sampled at random.
• The sample size must be large enough for the expected frequency of each cell to be 5 or greater.
• The variables are assumed to be independent.

Application Cookbook
1. Cross tabulate the variables and insert the observed frequencies into the contingency table.
2. From the observed values, calculate the expected frequencies and insert into the table.
i.e. Fij = (Cj × Li) / n
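Step 2 can be sketched as follows, for a hypothetical 2×2 contingency table:

```python
# Expected frequencies Fij = (Cj * Li) / n for a contingency table.
# Hypothetical observed counts for a 2x2 table.
observed = [
    [30, 10],   # row B1
    [20, 40],   # row B2
]

row_totals = [sum(row) for row in observed]        # Li
col_totals = [sum(col) for col in zip(*observed)]  # Cj
n = sum(row_totals)                                # grand total

expected = [[col_totals[j] * row_totals[i] / n
             for j in range(len(col_totals))]
            for i in range(len(row_totals))]
print(expected)
```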

Degrees of Freedom

Purpose
To determine the number of measurements that are necessary to make an unbiased estimate of
a statistic.

Anatomy

Degrees of Freedom (Figure DgrsFree_001)

    ν = n − m

    σ̂ = s = √[ Σ(i=1..n) (xi − x̄)² / (n − 1) ]

Reference: Juran’s Quality Control Handbook - Chapter 23, page 73

Terminology
A. ν (nu) = df: Represents the degrees of freedom.
B. n: Represents the number of observations taken.
C. m: Represents the number of parameters to be estimated. A parameter is a constant defining some
property of the density function of a variable, such as the mean or standard deviation.
D. This example is using the equation of the sample standard deviation to illustrate the use of degrees
of freedom.
E. The denominator (n-1) represents the degrees of freedom. In this equation, it was necessary to
establish only one constant (the sample mean) in order to compute the standard deviation.
Therefore, the degrees of freedom is equal to n-1. The number of degrees of freedom will vary
according to the statistical test and parameters that are used, but it will always be related to the
number of observations and constants. Usually, degrees of freedom are represented by n-1 or n-2.
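The example in the Anatomy can be sketched numerically — hypothetical data, where the mean is the single estimated constant, so m = 1 and ν = n − 1:

```python
# Sample standard deviation with n - 1 degrees of freedom.
# Hypothetical data; one constant (the mean) is estimated, so m = 1.
import math

data = [4.0, 6.0, 8.0, 10.0]
n = len(data)
mean = sum(data) / n                       # the one estimated parameter
ss = sum((x - mean) ** 2 for x in data)    # sum of squared deviations

s = math.sqrt(ss / (n - 1))                # divide by nu = n - m = n - 1
print(round(s, 4))
```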

Factor and Levels
Purpose
To describe the independent variables (X1, X2, …, XN) of a process when conducting a study of
their effect on a dependent variable (Y) with tools such as ANOVA and Design of Experiments
(DOE).
Anatomy

Factor and Levels (Figure FactLvls_001)

    Process inputs (X1, X2, ..., XN)  →  PROCESS  →  Output (Y)

    Category       Factor               Levels
    People         Operator             Day shift / Night shift
    Material       Material type        A / B
    Equipment      Pressure             Low / High
    Policies       Policy               Old / New
    Procedures     Procedure            1 / 2
    Methods        Assembly sequence    Old / New
    Environment    Temperature          Low / High

    Output (Y): the CT characteristic (CTQ, CTC, CTD) of the service,
    product, or task.

Reference: Juran’s Quality Control Handbook - Chapter 26 Page 3, Basic Statistics by Kiemele, Schmidt and Berdine - Chapter 8 Page 1-10

Terminology
A. Process inputs (independent variables – see concept Variables – Dependent and Independent).
B. Factor – Input or independent variable name such as Operator, Material type, Pressure, Policy, etc.
The objective of the Breakthrough Strategy is to identify and contain the factors that have the
greatest effect on the CT characteristic.
C. Levels – Set values for a factor in a study. In most cases the factors are studied when set at two
levels. It is possible to increase the number of levels (3, 4, etc.), but this increases the complexity of
the study.
D. A factor is said to be random when the levels are randomly selected from a larger set of levels. In this
case, the resultant analysis accounts for random effects of the factor, and conclusions can be drawn
about those levels not considered in the analysis.
E. A factor is said to be fixed when the levels are specifically assigned. In this case, subsequent
analysis of the response would report on the fixed effects of the levels, and the conclusions
constrained to only those levels present in the analysis.
F. Process output (dependent variable – see concept Variables – Dependent and Independent).

Factorial Experiment - Confounding
Purpose
Main and/or interaction effects are confounded if only their combined effects, and not their
individual effects, can be determined from the experimental design. In other words, the unique
effect of one contrast cannot be separated from another. Confounding is also known as Alias
Structure.

Anatomy

Factorial Experiment - Confounding (Figure FctExCnf_001)

Half fraction of a full 2^3 design, generator ABC = +1:

    RunOrder      A   B   C   AB  AC  BC  ABC
    2   a         1  -1  -1  -1  -1   1   1
    3   b        -1   1  -1  -1   1  -1   1
    5   c        -1  -1   1   1  -1  -1   1
    8   abc       1   1   1   1   1   1   1

    Alias structure:  A + BC,  B + AC,  C + AB

The A and BC columns carry the same contrasts (identical signs at every
experimental run), so the main effect A is confounded with the
interaction BC.

Reference: The Vision of Six Sigma: A Roadmap for Breakthrough Ch. 14

Terminology
A. Main effect contrast Factor A.
B. Interaction of contrast Factors B and C, confounded with the main effect.
Confounded contrasts can be seen in an experimental design when both columns have the same signs
at each experimental run.
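The confounding above can be verified numerically; a minimal sketch of the half fraction defined by ABC = +1:

```python
# In the half fraction with generator ABC = +1, the contrast column for A
# is identical to the column for BC, so their effects are confounded.
runs = [  # (A, B, C) settings for runs a, b, c, abc
    ( 1, -1, -1),
    (-1,  1, -1),
    (-1, -1,  1),
    ( 1,  1,  1),
]

col_A = [a for a, b, c in runs]            # main effect contrast for A
col_BC = [b * c for a, b, c in runs]       # interaction contrast for BC
print(col_A == col_BC)                     # True: A is aliased with BC
```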

Factorial Experiment - Fractional
Purpose
A Fractional Factorial Experiment is a subset of a Full Factorial Experiment, in which only a
selected fraction of all the possible combinations of design factor levels is run. It is typically
performed when it is impractical or too expensive to run a Full Factorial Experiment, such as
when there is a relatively large number of Factors. While not as precise as a Full Factorial
Experiment, the impact of the lost information is usually minimal: the trade-off is that
information regarding third-order and higher interactions is lost, and these interactions are
usually considered negligible.

Anatomy

Factorial Experiment - Fractional (Figure FctExFra_001)

Full 2^3 design (8 runs):

    RunOrder      A   B   C   AB  AC  BC  ABC
    1   (1)      -1  -1  -1   1   1   1  -1
    2   a         1  -1  -1  -1  -1   1   1
    3   b        -1   1  -1  -1   1  -1   1
    4   ab        1   1  -1   1  -1  -1  -1
    5   c        -1  -1   1   1  -1  -1   1
    6   ac        1  -1   1  -1   1  -1  -1
    7   bc       -1   1   1  -1  -1   1  -1
    8   abc       1   1   1   1   1   1   1

Half fraction (4 runs), generator ABC = +1:

    RunOrder      A   B   C   AB  AC  BC  ABC
    2   a         1  -1  -1  -1  -1   1   1
    3   b        -1   1  -1  -1   1  -1   1
    5   c        -1  -1   1   1  -1  -1   1
    8   abc       1   1   1   1   1   1   1

    Alias structure:  A + BC,  B + AC,  C + AB

Reference: Statistics for Experimenters - Ch. 12

Terminology
A. Full Factorial Experiment – 3 Factors at 2 Levels, requiring 2^3 = 8 runs.
B. Fractional Factorial Experiment, using the third order interaction ABC at its highest level as the
generator, requiring 2^(3-1) = 4 runs.
C. Second order interaction effects which are confounded with the main effects, and whose effects
cannot be separated.
D. Main Factor effects.
E. Design generator, chosen to define the Fractional Factorial Experiment. In this example, the
generator is ABC at its high (+1) setting. Typically, generators are chosen from the highest-order
interactions, because the action of selecting the generator means its effect will be lost in the analysis.

Factorial Experiment - Full

Purpose
A Full Factorial Experiment is a rigorous experimental design where all possible combinations of
the Factors at each of the chosen levels are tested. If “k” Factors are tested at “m” levels, then
there would be m^k experimental runs made.

Anatomy

Factorial Experiment - Full (Figure FctExFul_001)

    RunOrder      A   B   C   AB  AC  BC  ABC     Y
    1   (1)      -1  -1  -1   1   1   1  -1    45.1
    2   a         1  -1  -1  -1  -1   1   1    72.7
    3   b        -1   1  -1  -1   1  -1   1    41.7
    4   ab        1   1  -1   1  -1  -1  -1    70.4
    5   c        -1  -1   1   1  -1  -1   1    57.4
    6   ac        1  -1   1  -1   1  -1  -1    85.7
    7   bc       -1   1   1  -1  -1   1  -1    50.7
    8   abc       1   1   1   1   1   1   1    87.5

Reference: Juran, pp 26.2 – 26.10

Terminology
A. The experimental run number. For the example shown, 3 Factors (A, B, and C) and 2 levels (-1
and 1) means 2^3, or 8, experimental runs.
B. The Yates designation of the experimental run.
C. The different contrasts, representing the main effects (A, B, and C) and the second order
interactions (AB, BC, and AC), and the full interaction ABC. The example is shown in Yates
Standard Order.
D. The experimental Response, giving the value of the dependent variable at each run of the
experiment.
E. The experimental treatments for each run.
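The main effects can be computed from the table above as the average response at the +1 level minus the average at the -1 level; a minimal sketch using the responses shown:

```python
# Main effects from the full-factorial table: for each factor, the
# average Y at its +1 setting minus the average Y at its -1 setting.
runs = [  # (A, B, C, Y) in Yates standard order
    (-1, -1, -1, 45.1), ( 1, -1, -1, 72.7),
    (-1,  1, -1, 41.7), ( 1,  1, -1, 70.4),
    (-1, -1,  1, 57.4), ( 1, -1,  1, 85.7),
    (-1,  1,  1, 50.7), ( 1,  1,  1, 87.5),
]

def effect(col):
    hi = [y for *x, y in runs if x[col] == 1]
    lo = [y for *x, y in runs if x[col] == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

print([round(effect(i), 2) for i in range(3)])   # effects of A, B, C
```

Factor A dominates here (effect about 30), C is moderate, and B is small — the kind of screening conclusion a full factorial supports.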

Factorial Experiment - Resolution

Purpose
An experimental design of Resolution R is one in which no p-Factor effect is confounded with
any other effect containing fewer than R-p Factors. The Resolution of a design is represented by
a Roman numeral appended as a subscript.

Anatomy

Factorial Experiment - Resolution (Figure FctExRes_001)

Available Factorial Designs (with Resolution), for 2^(k-p) designs:

            Factors
    Runs    2     3     4     5     6     7     8     9    10    11    12    13    14    15
    4      Full  III
    8            Full  IV    III   III   III
    16                 Full  V     IV    IV    IV    III   III   III   III   III   III   III
    32                       Full  VI    IV    IV    IV    IV    IV    IV    IV    IV    IV
    64                             Full  VII   V     IV    IV    IV    IV    IV    IV    IV
    128                                  Full  VIII  VI    V     V     IV    IV    IV    IV

Reference: Statistics for Experimenters pp 385 - 389

Terminology
A. The number of runs in the experimental design, for a 2k-p Factorial Design.
B. The number of Factors under study.
C. The Resolution of the design, where:
A design of Resolution III does not confound main effects with one another, but does confound
main effects with two-Factor interactions.
A design of Resolution IV does not confound main effects and two-Factor interactions, but does
confound two-Factor interactions with other two-Factor interactions.
A design of Resolution V does not confound main effects and two-Factor interactions with each
other, but does confound two-Factor interactions with three-Factor interactions.

Factorial Experiment - Runs

Purpose
A Factorial Experiment run is the determination of the process Response, or output, for a
specific combination of Factor, or process input, levels. An Experiment consists of multiple runs,
whose number will depend upon the number of Factors and the type of experimental design.

Anatomy

Factorial Experiment - Runs (Figure FctExRun_001)

    RunOrder      A   B   C    Y
    1   (1)      -1  -1  -1
    2   a         1  -1  -1
    3   b        -1   1  -1
    4   ab        1   1  -1
    5   c        -1  -1   1
    6   ac        1  -1   1
    7   bc       -1   1   1
    8   abc       1   1   1

Reference: The Vision of Six Sigma: Tools and Methods for Breakthrough Ch. 18

Terminology
A. The experimental run number.
B. The Yates designation for the Run.
C. The settings (high or low) for the main process Factors.
D. The response measured for the experimental run.

Heterogeneity of Variance
Purpose
Heterogeneity of variance is the statistical comparison to confirm that a significant difference
exists between the variances of different subgroups.
Anatomy
Heterogeneity of Variance (Figure HeteVari_001)

    σ1 ≠ σ2 or σ3

[Figure: histograms of Run 1 (σ1), Run 2 (σ2), and Run 3 (σ3) data, plus a
normal probability plot for Runs 1-3, in which the slope of each best-fit
line reflects that run's variance.]

Reference: Juran's Quality Handbook, 4th edition / Forrest W. Breyfogle III, Statistical Methods for Testing,
Development, and Manufacturing, John Wiley and Sons, page 114

Terminology
A. Histogram of run 1 data, with σ1
B. Histogram of run 2 data, with σ2
C. Histogram of run 3 data, with σ3
D. Normal probability plot: the slope of each best-fit line reflects that run's variance.
In these three runs, at least one variance statistically differs from the others, indicating
heterogeneity of variance.
Heterogeneity of variance precludes the use of other statistical tests requiring similar variances,
but may confirm the effect of a process change on output variability.
Heterogeneity is complementary to homogeneity.
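A common way to test for heterogeneity of variance is Levene's test; a minimal sketch with hypothetical run data (assumes scipy is available, whereas this handbook's own cookbooks use Minitab):

```python
# Levene's test for equality of variances across three runs.
# Hypothetical data: runs 1 and 2 have a small spread, run 3 a large one.
from scipy.stats import levene

run1 = [5.0, 5.1, 4.9, 5.0, 5.1, 4.9]
run2 = [5.0, 5.2, 4.8, 5.1, 4.9, 5.0]
run3 = [5.0, 9.0, 1.0, 8.5, 1.5, 5.2]   # much larger spread

stat, p = levene(run1, run2, run3)
if p < 0.05:
    print("variances differ: heterogeneity of variance")
else:
    print("no significant difference detected")
```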

Hidden Factory
Purpose

The term “Hidden Factory” is used to designate all work performed above and beyond entitlement, to
produce a good unit of output, and that is not explicitly shown in any report (cost, accounting,
performance, etc.).
Anatomy

Hidden Factory (Figure HidFact_001)

[Figure: process flow - Input → Operation → Inspection → Operation →
Inspection → ... → Output. This main flow is the Visible Factory (A). At
each inspection, units that are not OK are routed into Rework or Scrap
loops; these loops are the Hidden Factory (B).]

Reference: The Vision of Six Sigma: A Roadmap for Breakthrough Ch. 14

Terminology
A. The Visible Factory;
B. The Hidden Factory. This term is closely related to the concepts of productivity, value and money.
There is no absolute definition of what constitutes a hidden factory; it depends on what is measured
and how we measure it. As a general guideline, if cost, time, etc. is not clearly identified in any report
such as timecards, productivity reports, or accounting reports, it is part of the hidden factory.
Entitlement is the amount of work actually needed to produce a good unit, where work is synonymous
with value-added, money, time, etc.

Homogeneity of Variance

Purpose
Homogeneity of variance is the statistical comparison to confirm that no significant difference
exists between the variances of different subgroups.
Homogeneity of variance is a fundamental concept in statistics.

Anatomy

Homogeneity of Variance of A & B (Figure HomoVari_001)

    σA ≅ σB

[Figure: histograms of subgroup A (with σA) and subgroup B (with σB), each
overlaid with a normal curve of standard deviation σ.]

Reference: Juran's Quality Handbook, 4th edition / Forrest W. Breyfogle III, Statistical Methods for Testing, Development, and Manufacturing, John Wiley and
Sons, page 114

Terminology
A. Histogram of data from subgroup A, with σA
B. Histogram of data from subgroup B, with σB
C. Normal curve
D. Normal curve Standard deviation σ
Homogeneity of variance holds, since the variances of subgroups A and B are similar to σ.
Statistical tests requiring similar variances may be performed, and they confirm that there is no
change in process output variability.
Homogeneity is complementary to heterogeneity.

Independence of the Mean and Variance
Purpose
To understand the importance of the independence between the sample mean and the
variance. Most statistics assume that the mean can be moved without affecting the variance,
therefore the correlation between them must be controlled.

Anatomy

Independence of the Mean and Variance (Figure IndMnVar_001)

[Figure: three scatter plots of Variance vs. Mean - (A) a positive
correlation, (B) a negative correlation, and (C) no correlation.]

Reference: Juran’s Quality Control Handbook - Chapter 23, Page 16

Terminology
A. Represents a positive correlation between the mean and variance. When the mean increases, the
variance increases, so they are not independent.
B. Represents a negative correlation between the mean and variance. When the mean increases, the
variance decreases, so they are not independent.
C. Represents no correlation between the mean and variance. A change in the mean has no effect on
the variance, so they are independent. Because the mean and variance are independent, we can
identify and control independently the variables that affect the mean or the variance.
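A quick sketch of the idea: compute each subgroup's mean and variance and look for a trend between them (hypothetical subgroups whose means grow while their spreads stay stable, as in case C):

```python
# Checking mean-variance independence by pairing subgroup means with
# subgroup variances. Hypothetical subgroups: the mean shifts upward
# while the spread stays roughly constant.
import statistics

subgroups = [
    [10.1,  9.9, 10.0, 10.2,  9.8],
    [20.3, 19.8, 20.1, 19.9, 20.0],
    [40.2, 39.9, 40.0, 40.1, 39.8],
]

means = [statistics.mean(g) for g in subgroups]
variances = [statistics.variance(g) for g in subgroups]

# If plotting (or correlating) variance against mean shows a trend, the
# two are not independent and a data transformation may be needed.
print(means, variances)
```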

Interaction
Purpose
An interaction exists between Factors when the effect of one Factor upon the Response variable
changes depending on the level of the other Factor(s).

Anatomy

Interaction (Figure Interact_001)

    RunOrder      A   B   AB     Y1
    1   (1)      -1  -1   1     1.5
    2   a         1  -1  -1     4.5
    3   b        -1   1  -1     4.5
    4   ab        1   1   1    13.5

    Contrast   12.0  12.0   6.0
    Effect      6.0   6.0   3.0
    Avg +       9.0   9.0   7.5
    Avg -       3.0   3.0   4.5
    Delta       6.0   6.0   3.0

[Figure: interaction plot of Response Y (B) against Factor A from -1 to
+1, with one line for each level of Factor B. The labeled points C-F are
described in the Terminology below; the two lines are not parallel,
indicating an interaction.]

Reference: Understanding Industrial Experimentation p 135

Terminology
A. First experimental Factor.
B. Response variable.
C. Response when first Factor is at lowest setting and second Factor is at highest setting.
D. Response when first Factor is at highest setting and second Factor is at highest setting.
E. Response when first Factor is at lowest setting and second Factor is at lowest setting.
F. Response when first Factor is at highest setting and second Factor is at lowest setting.
When no interaction exists between Factors, the effect of one Factor on the Response is the
same at every level of the other Factor. Significant interaction means that as one Factor
changes from its low to its high level, the Response changes by a different amount depending
on whether the other Factor is at its high or low setting.
Significant interaction can mask the significance of main effects.
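The contrasts and effects in the table above can be reproduced directly from the four responses; a minimal sketch:

```python
# Contrasts and effects for the 2x2 design in the Anatomy, including the
# AB interaction, computed directly from the four responses.
runs = [  # (A, B, Y)
    (-1, -1,  1.5),
    ( 1, -1,  4.5),
    (-1,  1,  4.5),
    ( 1,  1, 13.5),
]

contrast_A  = sum(a * y for a, b, y in runs)
contrast_B  = sum(b * y for a, b, y in runs)
contrast_AB = sum(a * b * y for a, b, y in runs)

# effect = contrast / (number of runs at each level)
print(contrast_A / 2, contrast_B / 2, contrast_AB / 2)
```

The nonzero AB effect (3.0) is the interaction: the effect of A depends on the level of B.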

Measurement Scale - Continuous

Purpose
To provide a means for measuring a continuous quantity which can be subdivided into finer and finer
increments.

Anatomy

Measurement Scale - Continuous (Variable) (Figure MeaScCon_001)

Specification for Dimension "B": 1.240 ± .003

    Sample    X1      X2      X3      X4      X5
      1      1.242   1.239   1.239   1.242   1.240
      2      1.240   1.241   1.240   1.239   1.242
      3      1.239   1.239   1.239   1.239   1.240
      4      1.241   1.240   1.240   1.240   1.241
      5      1.240   1.241   1.240   1.238   1.241
      6      1.241   1.240   1.240   1.240   1.239
      7      1.237   1.240   1.240   1.237   1.238
      8      1.240   1.242   1.240   1.240   1.238
      9      1.240   1.239   1.240   1.239   1.242
     10      1.239   1.239   1.241   1.239   1.240
     11      1.239   1.238   1.242   1.238   1.240
     12      1.239   1.241   1.239   1.239   1.242
     13      1.239   1.242   1.239   1.239   1.240
     14      1.240   1.239   1.240   1.239   1.241
     15      1.241   1.240   1.240   1.240   1.240
     16      1.240   1.239   1.240   1.240   1.240
     17      1.241   1.239   1.238   1.240   1.240
     18      1.239   1.239   1.241   1.241   1.239
     19      1.240   1.239   1.240   1.238   1.242
     20      1.241   1.240   1.241   1.239   1.240

Reference: Juran’s Quality Control Handbook - Chapter 23, Page 32 / Statistical Methods, p. 483, Forrest W. Breyfogle III, John Wiley & Sons Inc.

Terminology
A. Characteristic’s entity to be evaluated
B. Data sheet showing 20 samples (#1 to #20), each with a size of 5 parts
C. Sample number (#9)
D. Single value (part #4 out of sample #9) - continuous or variable data

Measurement Scale - Discrete (Attribute)

Purpose
To provide a means for measuring count data using integer or whole numbers. Usually used to
enumerate as a means to establish density for opportunities for defects.

Anatomy

Measurement Scale - Discrete (Attribute) (Figure MeaScDis_001)

    Line  Characteristic    D     U    OP     TOP     DPU     DPO    DPMO  Shift   Z.B
     1    Type A           81    83    95    7,885   0.976  0.0103  10,273  1.50   3.82
     2    Type B           67   584    20   11,680   0.115  0.0057   5,736  1.50   4.03
     3    Type C           19   225    38    8,550   0.084  0.0022   2,222  1.50   4.35
     4    Type D           33   884    57   50,388   0.037  0.0007     655  1.50   4.71
     5    Type E           67   774    37   28,638   0.087  0.0023   2,340  1.50   4.33
     6    Type F           27   669    91   60,879   0.040  0.0004     444  1.50   4.82

    (D = Defects, U = Units, OP = Opportunities per unit,
    TOP = Total Opportunities)

[Figure: a grid of the discrete outcomes - sums 2 through 12 - from
rolling two dice, illustrating count data arising from an event and its
outcomes.]

Reference: Juran’s Quality Control Handbook - Chapter 25, Page 4 / Statistical Methods, p. 483, Forrest W. Breyfogle III, John Wiley & Sons Inc.

Terminology
A. Data sheet
B. Typical Attribute or Discrete data
C. Event you’re interested in
D. Outcomes that result from that Event - Attribute or Discrete data
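The DPU, DPO, and DPMO columns follow directly from D, U, and OP; a sketch reproducing line 1 (Type A) of the data sheet:

```python
# DPU / DPO / DPMO for line 1 (Type A) of the data sheet:
# 81 defects, 83 units, 95 opportunities per unit.
D, U, OP = 81, 83, 95

TOP = U * OP            # total opportunities = 7,885
DPU = D / U             # defects per unit
DPO = D / TOP           # defects per opportunity
DPMO = DPO * 1_000_000  # defects per million opportunities

print(TOP, round(DPU, 3), round(DPO, 4), round(DPMO))
```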

Measurement Scale - Nominal

Purpose
To classify data into classes, each given a distinguishing symbol or name with no relationship
between the various categories in terms of relative size.

Anatomy

Measurement Scale - Nominal (Figure MeaScNom_001)

[Figure: examples of nominal classifications, labels A-D, described in
the Terminology below.]

Reference: Statistics: Probability, Inference and Decision, Vol. 1, William L. Hays - Robert L. Winkler, Holt,
Rinehart and Winston Inc. - Chapter 5.3, Page 246

Terminology
A. Workers wearing glasses = category of workers
B. Certified inspectors = classification
C. Missing approval signature = type of defect
D. Different type of defects on left wing upper surface = poor quality categories
Whole numbers / integers only
Go / No Go, Pass / Fail, Yes / No

Measurement Scale - Ordinal

Purpose
To rank data as a means to establish relative importance with no information about the distance
between categories.
To yield more knowledge about relative size, importance or relationship of a category of items.

Anatomy

Measurement Scale - Ordinal (Figure MeaScOrd_001)

Example 1: people ranked by relative height.
Example 2: survey answers ranked in order - Excellent / Very Good / Fair /
Poor / I'm Mad.

Reference: Statistics: Probability, Inference and Decision, Vol. 1, William L. Hays - Robert L. Winkler, Holt, Rinehart and Winston Inc. - Chapter 5.3, Page 246

Terminology
A. Ordinal scale
B. Ranked typical answers - in ascending or descending order
C. Relative height order - category = people (not including brooms)
Whole numbers / integers only

Measurement Linearity

Purpose
To explain the reliability of a measuring instrument by determining whether there is linearity
error, i.e. a change in the accuracy of the measuring instrument throughout its expected
operating range.

Anatomy

Linearity (Figure MeasLine_001)

[Figure: observed average value (B) plotted against the true or reference
value (A) across the operating range - a smaller accuracy error at the
lower part of the range (C) and a larger accuracy error at the higher
part (D); the change between them is the linearity (E).]

Reference: Measurement Systems Analysis (Chrysler, Ford, GM) - Chapter 2, Page 18

Terminology
A. True or reference value
B. Observed Average Value
C. Lower part of the range
D. Higher part of the range
E. Linearity
Linearity is the change in accuracy through the expected operating range of a measuring instrument.
A linearity error may result from a calibration error, wear of the measuring instrument, etc., or this
type of error may simply be natural for the instrument.

Measurement Reliability

Purpose
To explain the reliability of measurement instruments. Any error in these instruments influences
the ability to judge conformance. A measuring instrument is said to be reliable when it yields the
same results on repeated trials. In other words, a reliable instrument has negligible
measurement error because it is more accurate and precise.

Anatomy

Measurement Reliability (Figure MeasReli_001)

[Figure: distributions of repeated measurements (trials 1-5) illustrating
accuracy (A) and precision (B).]

Reference: Juran’s Quality Control Handbook - Chapter 18, Page 63-65 / Minitab Reference Manual – Chapter 10, page 2

Terminology
A. Accuracy: After making numerous repeated measurements on a single product characteristic with a
measuring instrument, the extent to which the computed average of these measurements agrees
with the “true” value of the characteristic reflects the “accuracy” of the measurement system. The
difference between the average and the “true” value is called the “error”, also known as “systematic
error” or “bias”. An accuracy problem generally indicates that the measuring instrument needs to be
calibrated.
B. Precision: After making numerous repeated measurements on a single product characteristic with a
measuring instrument, the extent to which the instrument is able to reproduce its own measurements
is called “precision”. Regardless of accuracy of calibration, experience shows that a measuring
instrument will not give identical measurement readings, even when making a series of
measurements on the same product characteristic. Precision generally describes the inherent
variation of a specific measuring instrument and is usually evaluated by the metrology function.

Measurement Repeatability
Purpose
To study the measurement system that is used to collect data. One important source of error
associated with a measurement system is repeatability, which describes the variation obtained
in measurements made by one instrument over repeated trials.
Anatomy

Repeatability (Figure MesRepe_001)

[Figure: the spread of repeated measurements made by one operator with
one instrument on the same part.]

Reference: Measurement Systems Analysis (Chrysler, Ford, GM) - Chapter 2, Page 17

Terminology
A. Repeatability:
Repeatability is the variation in measurements obtained when an operator measures the same
characteristic several times with the same instrument under the following conditions:
– on the same part;
– in the same location on the part;
– under the same condition of use;
– over a short period of time.
Repeatability is considered to be the inherent variation of the measurement system. The repeatability
error comes from the instrument itself and the position of the part in relation to the instrument. Note
that measurements may show good repeatability without being accurate.
In contrast to precision, where the measuring instrument is generally evaluated by the metrology
function, repeatability is evaluated through a gage R&R study of the measuring system.

Measurement Reproducibility
Purpose
To study the measurement system that is used to collect data. One important source of error
associated with a measurement system is reproducibility. This describes the variation
introduced by different operators in the measurement process.

Anatomy

Reproducibility (Figure MeasRepr_001)

[Figure: three overlapping measurement distributions, one each for
Operators A, B, and C, whose averages differ.]

Reference: Measurement Systems Analysis (Chrysler, Ford, GM) - Chapter 2, Page 17

Terminology
A. Reproducibility
Reproducibility is the variation between the average of the measurements obtained by different
operators who measure the same characteristic on the same parts under the following conditions:
– the same measuring instrument;
– the same method;
– in the same location on the parts;
– under the same conditions of use;
– over a short period of time.
Reproducibility represents the incremental bias that can be attributed to each operator. If there is
variability between the operators, then the reproducibility of the measurements represents the
variation between operators. Reproducibility can be evaluated through a gage R&R study.

Mistake Proofing Principles

Purpose
To determine methods that will ensure a process is defect-free all the time. It applies to any
process where repetitive steps occur which could be skipped, performed out of order, or not
conducted correctly. Mistake Proofing ensures that tasks can only be done the right way.

Anatomy

Mistake Proofing Principles (Figure MistProf_001)

    Principle        Objective               How                          Preference
    1) Elimination   Eliminating the         Redesigning the process      Best
                     possibility of error    or product so that the
                                             task is no longer necessary
    2) Replacement   Substituting a more     Using robotics or            Better
                     reliable process        automation
    3) Facilitation  Making the work         Color-coding, combining      Better
                     easier to perform       steps, etc.
    4) Detection     Detecting the error     Developing computer          Better
                     before further          software which notifies a
                     processing              worker when a wrong input
                                             is made
    5) Mitigation    Minimizing the effect   Utilizing fuses for          Good
                     of the error            overloaded circuits

Reference: Quality Planning and Analysis p 347

Terminology
A. Principle – There are 5 basic principles involved in mistake proofing
B. Objective – The main difference between each of the principles
C. How – A generic description of how the principle is implemented
D. Preference – Always use the highest principle possible

Notation – Two-Level Factorial Experiment
Purpose
To provide a standard notation for describing a Two-Level Full or Fractional Factorial
Experiment.

Anatomy

Notation - Two-Level Factorial Experiment (Figure Not2LvFE_001)

[Figure: the notation 2_R^(k-p), where 2 is the number of factor levels
(A), k is the number of experimental Factors (D), p defines the fraction
(C), and the subscript R is the Resolution (B).]

Reference: Statistics for Experimenters – pp 374 - 418

Terminology
A. Number of Factor Levels. This example represents a two-level factorial experiment, but other
factor levels are possible.
B. Resolution of Experiment, generally expressed as a Roman numeral (see Concept – Resolution).
C. Fraction of the Experiment:
a. P = 1 corresponds to a half-fraction.
b. P = 2 corresponds to a quarter-fraction.
c. P = 3 corresponds to an eighth-fraction, etc.
D. Number of experimental Factors.

Null Hypothesis - H0
Purpose
The null hypothesis, designated H0, is the assertion being tested in a hypothesis test. Usually
the null hypothesis is a statement of “no effect” or “no difference”. For example, the null
hypothesis used to validate a process change, when the old and new means are µ0 and µ1
respectively, will state that both process means are the same, i.e. H0: µ0 = µ1.

Anatomy

Null Hypothesis - H0 (Figure NullHypo_001)

    Null hypothesis for means:                 H0: µ0 = µ1
    Null hypothesis for standard deviations:   H0: σ0 = σ1

[Figure: overlaid distribution plots with means µ0 and µ1, and with
standard deviations σ0 and σ1.]

Reference: Basic Statistics by M. J. Kiemele, S. R. Schmidt and R. J. Berdine – Ch. 6, p. 1-3, The Vision of Six Sigma: Tools and Methods for Breakthrough by M.
J. Harry – Ch. 13, P. 4-6

Terminology
A. Null hypothesis used to check whether new process mean µ1 differs from old process mean µ0. The
test is to determine if the change in mean is simply due to random variation, or whether the process
has changed and the new mean is significantly different from the old one. The null hypothesis is
assumed true until sufficient evidence is presented against it.
B. Plots of different means, from data taken from the old process and the new process. The null
hypothesis will state that all the data belongs to the same underlying population, with the new
process essentially equivalent to the old one.
C. Null hypothesis used to check whether new process standard deviation differs from old process
standard deviation.
D. Plots with different standard deviations, from data taken from the old process and the new process.
The null hypothesis will state that all the data belongs to the same underlying population, with the
change in standard deviation being due to random variation only, i.e. the process has in fact not
really changed.
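A test of H0: µ0 = µ1 can be sketched with a two-sample t-test (hypothetical before/after samples; assumes scipy is available, whereas this handbook's cookbooks use Minitab):

```python
# Two-sample t-test of H0: mu0 = mu1 with hypothetical old/new samples.
from scipy.stats import ttest_ind

old = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2]
new = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8]

stat, p = ttest_ind(old, new)
if p < 0.05:
    print("reject H0: the means differ")
else:
    print("fail to reject H0: no evidence the means differ")
```

Here the sample means are nearly identical relative to their spread, so the data provide no evidence against H0.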

Out of Control
Purpose
A process is Out of Control when non-common, or assignable, causes of variation force points
on a Control Chart to exceed either of the Control Limits. It may also be deemed Out of Control
if it fails one or more of a number of quantitative decision tests, such as the Western Electric
Tests.

Anatomy
Out of Control (Figure Out_Ctrl_001)

[Figure: I Chart for Response - individual values plotted against
observation number, with UCL = 105.4 (A), center line X = 100.3 (B), and
LCL = 95.11 (C). One point beyond the UCL is flagged (F), and a run of
points beyond the two-sigma line illustrates a failed Western Electric
test (G).]

Reference: Statistical Process Control – Ford/GM/Chrysler pp1 - 25

Terminology
A. Upper Control Limit (UCL) – The upper limit of process control. The UCL is by convention calculated
as the process Mean plus three Standard Deviations.
B. Center Line – Calculated as the process Mean over the period being investigated.
C. Lower Control Limit (LCL) – The lower limit of process control. The LCL is by convention calculated
as the process Mean minus three Standard Deviations.
D. One Sigma Line – A line calculated as being one Standard Deviation away from the Center Line, or
process Mean, used in applying a number of tests for the process’ state of control
E. Two Sigma Line – A line calculated as being two Standard Deviations away from the Center Line, or
process Mean, used in applying a number of tests for the process’ state of control
F. Out of Control Point – A single process point showing the most obvious sign of an Out-of-Control
situation, i.e. being beyond either the UCL or LCL
G. Out of Control Situation – Another example of Out-of-Control, where the process has failed Western
Electric Test 2, which says that a process is Out of Control when two out of any three successive
points are on the same side of the Central Line and simultaneously more than two Standard
Deviations from the Central Line. There are many more tests for a process’ state of control, but the
Western Electric Rules are an excellent starting point and should be used whenever possible.

68
Indices of Capability – Cp

Purpose
To determine whether a process, given its natural short-term variation, has the potential capability to
meet established customer requirements or specifications. Cp is a ratio of the tolerance width to the
short-term spread of the process. Cp does not consider the center of the process. It estimates the
"instantaneous capability" of the process.

Anatomy

Indices of Capability - Cp

[Figure: process distributions between the Lower Specification Limit and the Upper Specification Limit for Cp < 1, Cp = 1, Cp = 1.5 and Cp = 2.0, showing improvement as the short-term spread narrows. Labels keyed to Terminology below. Six Sigma - Tools & Concepts: IndicCP_001]

Cp = Tolerance Width / Short-Term Process Spread = (USL − LSL) / 6σST

Reference: Juran's Quality Control Handbook – Ch. 16, P. 19-35

Terminology
A. Process capability index: Indicates the short-term level of performance that a process can potentially
achieve.
B. Tolerance width: Upper Specification Limit (USL) minus Lower Specification Limit (LSL).
C. Short-term process spread: Represents six times the short-term standard deviation (±3σST).
D. Cp < 1: The process output exceeds specifications; the process is incapable.
   Cp = 1: The process barely meets specifications. At least 0.3% defects will be produced, and even
   more if the process is not centered.
   Cp > 1: The process output falls within specifications, but defects may be produced if the process
   is not centered on the target value.
   Cp = 2: Represents the short-term objective for process capability.
   Since ZST = 3 x Cp, we achieve 6σ when Cp = 2.

Major Considerations
Cp is used for continuous data and is based on several assumptions. Cp assumes that the process is
statistically stable and that its data is approximately normally distributed. If the distribution of the data is
very skewed, the data should be transformed (see Data Transformations tool). Since Cp does not
consider process centering, it should not be used alone to describe process performance. It should be
used in conjunction with Cpk, which considers process centering.
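The Cp formula can also be evaluated directly, outside Minitab. A minimal sketch (the function name is hypothetical; σST must be supplied as the short-term standard deviation):

```python
def cp(usl, lsl, sigma_st):
    """Potential (short-term) capability: tolerance width over the
    short-term process spread (6 sigma)."""
    return (usl - lsl) / (6 * sigma_st)
```

With the values from the Minitab output in the Application Cookbook (USL 25, LSL 15, s 1.8748), cp() returns approximately 0.89.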

69
Application Cookbook
1. To calculate Cp indices using Minitab, the following information is needed: Subgroup size, lower and
upper specification limits and the method to estimate standard deviation.
2. In Minitab, use Stat > Quality tools > Capability Analysis

[Minitab graphical output: Process Capability Analysis for Sample; histogram with Lower Spec and Upper Spec lines, x-axis 15-27.]

Short-Term Capability:
Cp 0.89   CPU 0.90   CPL 0.88   Cpk 0.88   Cpm *
Targ *   USL 25.0000   LSL 15.0000   Mean 19.9365   Mean+3s 25.5608   Mean-3s 14.3121   k 0.0127   s 1.8748   n 77
%>USL: Exp 0.35, Obs 2.60    PPM>USL: Exp 3458, Obs 25974
%<LSL: Exp 0.42, Obs 0.00    PPM<LSL: Exp 4231, Obs 0

70
Indices of Capability – Cpk

Purpose
To determine whether a process, given its short-term variation, meets established customer
requirements or specifications. Cpk considers process centering. Cpk is a ratio of the distance measured
between the process mean and the closest specification limit to half of the total process spread.

Anatomy

Indices of Capability - Cpk

[Figure: a process distribution of fixed spread (Cp = 2.0) shifting off center between the Lower and Upper specification limits, with Cpk = 2.0, 1, < 1, 0, < 0 and < -1, illustrating the difference between potential capability and real capability and the increase in the number of rejects as the mean moves off target. Labels A-D keyed to Terminology below. Six Sigma - Tools & Concepts: IndicCPK_001]

Cpk = min {Cpl, Cpu}
Cpl = (X̄ − LSL) / 3σST
Cpu = (USL − X̄) / 3σST

Reference: Juran's Quality Control Handbook – Ch. 16, P. 19-35

Terminology
A. Process capability index: Indicates the level of performance that a process can achieve taking into
account the location of the process mean. It is equal to the smaller of either Cpl or Cpu. When the
process is centered, Cpu = Cpl = Cpk = Cp.
B. Process capability index (lower): Represents the distance between the lower tolerance and the mean
divided by 3σST.
C. Process capability index (upper): Represents the distance between the upper tolerance and the
mean divided by 3σST.
D. Cpk = Cp: The process mean is on target.
   Cpk = 0: The process mean falls on one of the specification limits; therefore, 50% of the process
   output falls beyond the specification limits.
   Cpk < -1: The process mean falls completely outside the specification limits; therefore, 100% of the
   process output is out of specification limits.

71
Major Considerations
Cpk is used for continuous data and is based on several assumptions. Cpk assumes that the process is
statistically stable and that its data is approximately normally distributed. If the distribution of the data is
very skewed, the data should be transformed (see Data Transformations tool). Cpk considers process
centering and the short-term variation in the process. However, it should not be used alone to describe
process capability. It should be used in conjunction with Cp, which is the short-term process capability
index.
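A direct computation of Cpk from the same inputs can be sketched as follows (the function name is hypothetical; x_bar is the process mean and sigma_st the short-term standard deviation):

```python
def cpk(usl, lsl, x_bar, sigma_st):
    """Real short-term capability: the smaller of Cpl and Cpu."""
    cpl = (x_bar - lsl) / (3 * sigma_st)
    cpu = (usl - x_bar) / (3 * sigma_st)
    return min(cpl, cpu)
```

For a perfectly centered process cpk() returns the same value as Cp, matching terminology item A above.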

Application Cookbook
1. To calculate Cpk indices using Minitab, the following information is needed: Subgroup size, lower and
upper specification limits and the method to estimate standard deviation.
2. In Minitab, use Stat > Quality tools > Capability Analysis.

[Minitab graphical output: Process Capability Analysis for TOTAL_DE; histogram with Lower Spec and Upper Spec lines, x-axis 15-27.]

Short-Term Capability:
Cp 0.89   CPU 0.90   CPL 0.88   Cpk 0.88   Cpm *
Targ *   USL 25.0000   LSL 15.0000   Mean 19.9365   Mean+3s 25.5608   Mean-3s 14.3121   k 0.0127   s 1.8748   n 77
%>USL: Exp 0.35, Obs 2.60    PPM>USL: Exp 3458, Obs 25974
%<LSL: Exp 0.42, Obs 0.00    PPM<LSL: Exp 4231, Obs 0

72
Indices of Capability - Pp

Purpose
To determine whether a process, given its long-term variation, has the capability to meet established
customer requirements or specifications. Pp is a ratio of the tolerance width to the long-term spread of
the process. Pp does not consider the center of the process.

Anatomy
Indices of Capability - Pp

[Figure: process distributions between the Lower Specification Limit and the Upper Specification Limit for Pp < 1, Pp = 1 and Pp = 1.5, showing improvement as the long-term spread narrows. Labels keyed to Terminology below. Six Sigma - Tools & Concepts: IndicPP_001]

Pp = Tolerance Width / Long-Term Process Spread = (USL − LSL) / 6σLT

Reference: Chrysler, Ford, General Motors, ASQC, and AIAG (1995) – Statistical Process Control P. 79-86

Terminology
A. Process performance index: Indicates the long-term level of performance that a process can
potentially achieve.
B. Tolerance width: Upper Specification Limit (USL) minus Lower Specification Limit (LSL)
C. Long-term process spread: Represents six times the long-term standard deviation (±3σLT).
D. Pp < 1: The process variation exceeds specifications. Defects are being produced.
   Pp = 1: The process barely meets specifications. At least 0.3% defects will be produced, and even
   more if the process is not centered.
   Pp = 1.5: Represents the long-term objective for process capability.

Major Considerations
Pp is used for continuous data and is based on several assumptions. It assumes that the process is
statistically stable and that its data is approximately normally distributed. If the distribution of the data is
very skewed, the data should be transformed (see Data Transformations tool). Since Pp does not
consider process centering, it should not be used alone to describe process performance. It should be
used in conjunction with Ppk, which considers process centering.

73
Application Cookbook
1. To calculate Pp indices using Minitab, the following information is needed: Subgroup size and lower
and upper specification limits.
2. Stat > Quality tools > Capability Analysis.

[Minitab graphical output: Process Capability Analysis for TOTAL_DE; histogram with Lower Spec and Upper Spec lines, x-axis 15-27.]

Long-Term Capability:
Pp 0.86   PPU 0.87   PPL 0.85   Ppk 0.85   Cpm *
Targ *   USL 25.0000   LSL 15.0000   Mean 19.9365   Mean+3s 25.7600   Mean-3s 14.1129   k 0.0127   s 1.9412   n 77
%>USL: Exp 0.45, Obs 2.60    PPM>USL: Exp 4547, Obs 25974
%<LSL: Exp 0.55, Obs 0.00    PPM<LSL: Exp 5495, Obs 0

74
Indices of Capability – Ppk

Purpose
To determine whether a process, given its long-term variation, meets established customer requirements
or specifications. Ppk considers the centering of the process. Ppk is a ratio of the measured distance
between the overall mean of the process and the closest specification limit to half of the total process
spread.

Anatomy

Indices of Capability - Ppk

[Figure: a process distribution of fixed spread (Pp = 1.5) shifting off center between the Lower and Upper specification limits, with Ppk = 1.5, 1, < 1, 0, < 0 and < -1, illustrating the difference between potential capability and real capability and the increase in the number of rejects as the mean moves off target. Labels A-D keyed to Terminology below. Six Sigma - Tools & Concepts: IndicPPK_001]

Ppk = min {Ppl, Ppu}
Ppl = (X̄ − LSL) / 3σLT
Ppu = (USL − X̄) / 3σLT

Reference: Chrysler, Ford, General Motors, ASQC, and AIAG (1995) – Statistical Process Control P. 79-86

Terminology
A. Process performance index: Indicates the level of long-term performance that a process can achieve
taking into account the location of the process mean. It is equal to the smaller of either Ppl or Ppu.
When the process is centered, Ppu = Ppl = Ppk = Pp.
B. Process performance index (lower): Represents the distance between the lower tolerance and the
long-term mean, divided by 3σLT.
C. Process performance index (upper): Represents the distance between the upper tolerance and the
long-term mean, divided by 3σLT.
D. Ppk = Pp: The process mean is on target. Since ZLT = 3 x Ppk, we achieve 4.5σLT when Ppk = 1.5.
   Ppk = 0: The process mean falls on one of the specification limits; therefore, 50% of the process
   output is out of the specification limits.
   Ppk < -1: The process mean falls completely outside the specification limits; therefore, 100% of the
   process output is out of the specification limits.

75
Major Considerations
Ppk is used for continuous data and is based on several assumptions. It assumes that the process is
statistically stable and that its data is approximately normally distributed. If the distribution of the data is
very skewed, the data should be transformed (see Data Transformations tool). Using Ppk indices without
the information provided by the Cp and Cpk indices may lead to the wrong interpretations of the
capability of the process.
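Both long-term indices can be sketched directly from raw data, taking the overall standard deviation of all observations as σLT (a simplification of the estimation options Minitab offers; the function name is hypothetical):

```python
from statistics import mean, stdev

def pp_ppk(data, usl, lsl):
    """Long-term performance indices Pp and Ppk, using the overall
    (long-term) standard deviation of the data."""
    m, s_lt = mean(data), stdev(data)
    pp = (usl - lsl) / (6 * s_lt)
    ppk = min((m - lsl) / (3 * s_lt), (usl - m) / (3 * s_lt))
    return pp, ppk
```

For perfectly centered data the two values coincide, as noted in terminology item A.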

Application Cookbook
1. To calculate Ppk indices using Minitab, the following information is needed: Subgroup size and lower
and upper specification limits.
2. Stat > Quality tools > Capability Analysis

[Minitab graphical output: Process Capability Analysis for TOTAL_DE; histogram with Lower Spec and Upper Spec lines, x-axis 15-27.]

Long-Term Capability:
Pp 0.86   PPU 0.87   PPL 0.85   Ppk 0.85   Cpm *
Targ *   USL 25.0000   LSL 15.0000   Mean 19.9365   Mean+3s 25.7600   Mean-3s 14.1129   k 0.0127   s 1.9412   n 77
%>USL: Exp 0.45, Obs 2.60    PPM>USL: Exp 4547, Obs 25974
%<LSL: Exp 0.55, Obs 0.00    PPM<LSL: Exp 5495, Obs 0

76
Random Sampling
Purpose
Randomized Sampling is used to ensure that the elements selected for measurement are the
result of pure chance. Selecting samples at random increases the likelihood that the
measurements of the samples are representative of the process population.

Anatomy

Random Sampling

[Figure: A: process; B: process output for a given period; C: individual units (numbers 3, 16, 22, 37, 62, 71 shown); D: list of random numbers; E: unit numbers corresponding to the list of random numbers. Six Sigma - Tools & Concepts: RandSamp_001]

Reference: Juran’s Quality Control Handbook - Page 25.20

Terminology
A. Process
B. Process Output for a given period
C. Individual Units - With Random Sampling, at any one time each of the remaining units of product has
an equal chance of being the next unit selected for the sample.
D. List of Random Numbers - To conduct Random Sampling requires that random numbers be
generated and that the numbers can be applied to the process output. Random numbers can be
generated using calculators, computers, Excel, Minitab, a bowl of numbered chips, random number
dice, or random number tables.
E. Unit Numbers corresponding to the list of random numbers.
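As a small illustration of item D (not part of the original text), random unit numbers can be generated with Python's standard library; the function name is hypothetical, and the seed is fixed only to make the example repeatable:

```python
import random

def random_sample_units(lot_size, n, seed=None):
    """Select n distinct unit numbers from a lot, so that each remaining
    unit has an equal chance of being the next one selected."""
    rng = random.Random(seed)
    return rng.sample(range(1, lot_size + 1), n)
```

random_sample_units(80, 6) might return unit numbers such as 3, 16, 22, 37, 62 and 71, as in the figure.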

77
Basic Probability Theory -Sample and Population

Purpose
To define basic probability theory and concepts helpful to understand Six Sigma tools and their
application.

Anatomy

Sample and Population

[Figure: A: a sample drawn from B: a population. Based on a sample taken from a population we calculate probabilities and make inferences about the whole population. Six Sigma - Tools & Concepts: BaProDf1_001]

Reference: Juran’s Quality Control Handbook - Chapter 23, Page 21-24

Terminology
A. Sample - A limited number of measurements taken from a larger source.
B. Population or Universe - A large source of measurements from which a sample is taken. A population
may physically exist, or it may only be a concept, such as all experiments which might be run.

78
Sequential Sampling

Purpose
Sequential Sampling plans are used for selecting samples of units produced in consecutive
order.

Anatomy

Sequential Sampling

[Figure: A: process with input and output; B: process output in consecutive order (1st, 2nd, 3rd, 4th, ... 9th, 10th, 11th, 12th, ...); C: CTQ measurements taken and recorded in sequence: 1st 1.003, 2nd 1.0025, 3rd 1.004, 4th 1.0046. Six Sigma - Tools & Concepts: SequSamp_001]

Reference:

Terminology
A. Process
B. Process Output for a given period
C. Measurements taken and recorded from units produced in sequence at different periods (hourly,
weekly, monthly, etc.)

79
Basic Probability Theory – Sets and Events
Purpose
To describe operations involving sets. These descriptions are useful in understanding
calculation of probabilities, which are an essential tool in Six Sigma.
Anatomy

Event Operations

[Figure: Venn diagrams within a sample space S illustrating A ∩ B (intersection), A ∪ B (union), A' (negation), A ⊆ B (inclusion) and A ∩ B = ∅ (mutually exclusive events). Labels A-E keyed to Terminology below. Six Sigma - Tools & Concepts: BaProStE_001]

Reference: Juran’s Quality Control Handbook - Chapter 23, Page 21-24

Terminology
Composite event - A combination of various events. Each rectangle represents a set of events.
For example, if we toss three coins and we are interested in the event “More heads than tails”,
the following combinations (out of all eight possible combinations) represent a composite event:
HHH, HHT, THH, HTH. The probability of a composite event is the sum of the probabilities of all
simple events that comprise the composite event. P(more heads than tails) = P(HHH) + P(HHT) +
P(THH) + P(HTH)
A. Intersection A ∩ B - The event (A ∩ B) occurs only if both A and B occur.
B. Union A ∪ B - The event (A ∪ B) occurs if A or B or both A and B occur.
C. Negation or opposite of A (A’) - This event occurs if and only if A does not occur.
D. Inclusion or Implication A ⊆ B - Read “A is included in B”; it is the event whereby every time A
occurs, B occurs as well.
E. Events incompatible or mutually exclusive (A ∩ B = ∅ ) - This event occurs if events A and B cannot
occur at the same time. In other words, if there is no element that is in both A and B. The intersection
of A and B is an empty set (represented by ∅ ).
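These event operations correspond directly to set operations in Python. An illustrative sketch (added for clarity) using the three-coin sample space from the composite-event example above:

```python
# Sample space for tossing three coins
S = {"HHH", "HHT", "HTH", "THH", "HTT", "THT", "TTH", "TTT"}
A = {s for s in S if s.count("H") >= 2}   # event: at least two heads
B = {s for s in S if s[0] == "H"}         # event: first toss is heads

both = A & B          # A ∩ B: both A and B occur
either = A | B        # A ∪ B: A or B or both occur
not_a = S - A         # A': A does not occur
# A and A' are mutually exclusive: their intersection is the empty set
assert A & not_a == set()
```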

80
Basic Probability Theory – Sets, Theorems
Purpose
To describe three basic theorems helpful in understanding probabilities.

Anatomy

Three Basic Theorems

[Figure: dice illustrations accompanying each theorem. Six Sigma - Tools & Concepts: BaProStT_001]

Theorem 1: If there is a 1/6 probability of getting a 6, then there is a 5/6 probability of not getting a 6.
Theorem 2: The probability of getting a 2 or a 3 is (1/6) + (1/6) = 1/3, or 0.333.
Theorem 3: The probability of getting two 2’s is (1/6) x (1/6) = 1/36, or 0.0278.

Reference: Juran’s Quality Control Handbook - Chapter 23, Page 21-24

Terminology
A. Theorem 1 - If P(A) represents the probability that event A will occur, then the probability that event A
will not occur, P(A’) is equivalent to P(A’) = 1 - P(A).
B. Theorem 2 - If A and B are two events that can occur simultaneously, then the probability that event
A or event B will occur is P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
C. Note: If A and B are mutually exclusive, then the probability of
either A or B is
P(A ∪ B) = P(A) + P(B).
D. Theorem 3 - If A and B are two independent events, then the probability that events A and B occur
simultaneously is P(A ∩ B) = P(A) x P(B).
E. If the occurrence of A influences the probability that event B will occur, then the probability that event
A and event B will occur simultaneously is P(A ∩ B) = P(A) x P(B|A).
F. P(B|A) is read “probability of B, knowing that A has occurred”.
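The three theorems can be verified with exact fractions (an illustrative sketch, not part of the original text):

```python
from fractions import Fraction

p_six = Fraction(1, 6)

# Theorem 1: P(A') = 1 - P(A)
p_not_six = 1 - p_six

# Theorem 2, mutually exclusive case: P(A or B) = P(A) + P(B)
p_two_or_three = Fraction(1, 6) + Fraction(1, 6)

# Theorem 3, independent events: P(A and B) = P(A) x P(B)
p_double_two = Fraction(1, 6) * Fraction(1, 6)
```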

81
Shift and Drift – 1.5 Sigma
Purpose
To take account of the natural behavior of most processes to vary with time. To use a
recognized approximate value for this variation.
Anatomy

Shift and Drift - 1.5 Sigma

[Figure: successive sample distributions at times t1 through t6 drifting about the long-term mean, with the short-term standard deviation σST and the ±1.5σST band in which the short-term means are expected to lie. Labels A-J keyed to Terminology below. Six Sigma - Tools & Concepts: ShftDrft_001]

Reference: Mikel J. Harry, The vision of six sigma, White Book, Black book, Chapter 9

Terminology
A. Long term central tendency of the process (i.e. overall mean).
B. Sample distribution (typical).
C. Sample mean (typical).
D. Dynamic shift between consecutive means (typical).
E. Sample number.
F. Equivalent short term distribution.
G. Short term standard deviation σST, result of white noise effect.
H. Short term mean centered (i.e. the best you may expect).
I. Shift and Drift ±1.5σST (±1.5 sigma Short Term) band in which the short term means are expected to
lie, result of dynamic variation in the process mean over time due only to black noise.
J. Long term distribution illustrating the total variation due to white noise plus black noise.

82
Stratified Sampling
Purpose
Stratified Sampling is used when a lot is the result of multiple operators, machines, shifts, etc.;
the product is actually the combination of several smaller lots. Selecting stratified samples
increases the likelihood that the measurements of the samples are representative of the
combined lots’ process population.
Anatomy

Stratified Sampling

[Figure: A: process with sub-processes I, II and III; B: process output lots from each sub-process (unit numbers 37, 16, 8, 66, 32, 56, 22, 71, 2, 55, 23, 10, 62, 3, 24, 70, 33, 74 shown); C: list of random numbers; D: proportionate samples from each individual process. Six Sigma - Tools & Concepts: StraSamp_001]

Reference: Juran’s Quality Control Handbook - Page 25.20

Terminology
A. Process - When Stratified Sampling is used the lot is known to come from different machines, shifts,
operators, etc. The roman numerals represent these differences.
B. Process Output for a given period from different processes.
C. List of Random Numbers - With Stratified Sampling the samples are selected proportionately from
each process. Within each lot random sampling is used; this means at any one time each of the
remaining units of product has an equal chance of being the next unit selected for sample. To
conduct Random Sampling requires that random numbers be generated and that the numbers can
be applied to the process output. Random numbers can be generated using calculators, computers,
Minitab, Excel, a bowl of numbered chips, or random number dice.
D. Proportionate Samples from each individual process.
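Items C and D can be sketched in code: draw proportionately from each sub-lot, using random sampling within each lot (an illustration added for clarity; the names are hypothetical, and the proportional allocation is simply rounded):

```python
import random

def stratified_sample(lots, total_n, seed=None):
    """Draw a proportionate random sample from each sub-lot.
    `lots` maps a lot name to its list of unit numbers."""
    rng = random.Random(seed)
    total_units = sum(len(units) for units in lots.values())
    return {name: rng.sample(units, round(total_n * len(units) / total_units))
            for name, units in lots.items()}
```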

83
Test Sensitivity
Purpose
Test sensitivity is the observable difference that needs to be detected in a hypothesis test of the
mean in order for an effect to be practically significant, when the amount of risk for both α
(alpha) and β (beta) has been fixed. This difference is referred to as “delta sigma” and is
designated as δ/σ.
Anatomy

Test Sensitivity

[Figure: the Control Distribution (null hypothesis) with rejection regions α/2 in each tail and confidence level 1−α, and the Contrast Distribution (alternate hypothesis) offset by δ, with risk β and power 1−β; the separation expressed in standard deviation units is δ/σ. Labels keyed to Terminology below. Six Sigma - Tools & Concepts: TestSens_001]

Reference: The Vision of Six Sigma: Tools and Methods for Breakthrough by M. J. Harry Ch. 13, P. 11-14, Ch. 14, P. 23

Terminology
A. δ (delta) - The difference in means that we want to be able to detect with the test.
B. Control Distribution - Distribution associated with the null (H0) hypothesis.
C. (1-α) - Confidence level: confidence that an observed outcome in the sample is “real”.
D. Contrast Distribution - Distribution associated with the alternate (Ha) hypothesis.
E. (1-β) - Power of the test: chance of detecting a specified change in the population with the sample if
the difference is actually there to detect.
F. δ/σ - The difference in means that we want to be able to detect with the test, expressed in standard
deviation units. It is a direct measure of test sensitivity; i.e., the difference between two means expressed
in standard deviation units. By fixing this value in advance of a test, we are able to align practical
significance with statistical significance. We are able to dial in the degree of change we need the
hypothesis test to detect in order to proclaim that a particular effect is significant in the real world. Should
an effect not reach the prescribed δ/σ value, then we would say that the effect is not influential enough to
be of practical concern. Note that we would be able to make this statement with (1-β)100 percent
confidence.
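Under the normal assumptions above, the power (1-β) of a two-sided z test of the mean for a given δ/σ and sample size n is approximately Φ(δ/σ·√n − z(1−α/2)). A sketch (the small contribution of the far rejection tail is ignored, a common simplification; the function name is hypothetical):

```python
from statistics import NormalDist

def power_of_z_test(delta_over_sigma, n, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided z test of the mean,
    for a shift of delta_over_sigma standard deviations and sample size n."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(delta_over_sigma * n ** 0.5 - z_crit)
```

Fixing α, β and δ/σ in advance and solving this relationship for n is how the required sample size for a prescribed test sensitivity is obtained.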

84
Alpha Risk α
Purpose
Alpha risk (α) is the probability of rejecting the null hypothesis H0 when it is actually true. It is the risk of saying there
is a difference in the sample characteristic of interest (e.g. the mean) when in reality such a difference does not
exist. It is also known as the Significance Level or the risk of making a Type 1 Error.

Anatomy

Alpha Risk α

[Figure: decision table. Columns (A: Truth About Population): Ho is true, Ha is true; rows: Accept Ho, Reject Ho. B: accepting Ho when Ho is true is the correct decision. C: rejecting Ho when Ho is true is a Type 1 Error, with probability α. D: common significance levels α = 0.10, α = 0.05, α = 0.01. Six Sigma - Tools & Concepts: AlfaRisk_001]

Reference: Juran's Quality Control Handbook – Ch. 23, P. 60-63, The Vision of Six Sigma: Tools and Methods for Breakthrough
by M. J. Harry – Ch. 13, P. 9-11

Terminology
A. The null hypothesis will either be true or false for the population under investigation. Alpha risk is the
probability of making the wrong decision when the null hypothesis is in fact true.
B. If the null hypothesis Ho is true, then accepting Ho will be the correct decision.
C. If the null hypothesis Ho is true, then rejecting Ho will be the wrong decision, and in this case a type 1
error will be committed.
D. The risk of committing a type 1 error, i.e. the probability of rejecting Ho when it is really true, can be
set in advance. If the consequences of a type 1 error are very serious, then the probability of such an
error should be kept low. Common levels of type 1 risk include 10, 5 and 1 per cent, with a 1 per cent
risk being selected over a 10 per cent risk if the consequences of a type 1 error are extremely serious.
Minimizing this risk will make it more difficult to accept the alternative hypothesis Ha.

85
Alternative Hypothesis - Ha
Purpose
Any hypothesis that differs from a given null hypothesis is called the alternative hypothesis. It is
designated as Ha, or H1. While the null hypothesis is a statement of “no effect” or “no
difference”, the alternative hypothesis states that a difference or effect exists.
Anatomy

Alternative Hypothesis - Ha

[Figure: A: the null hypothesis Ho: µo = µ1; B: the alternative hypotheses; C: Ha: µo ≠ µ1 (two sided); D: Ha: µo > µ1 (one sided); E: Ha: µo < µ1 (one sided). Six Sigma - Tools & Concepts: AltrHypo_001]

Reference: Basic Statistics by M. J. Kiemele, S. R. Schmidt and R. J. Berdine – Ch. 6, p. 1-3, The Vision of Six Sigma: Tools and Methods for Breakthrough by M.
J. Harry – Ch. 13, P. 4-6

Terminology
A. The Null Hypothesis.
B. Different Alternative Hypotheses, which if accepted state that observed differences between µο and
µ1 are statistically significant, and cannot be explained away as random variation in the samples.
C. Two Sided Alternative Hypothesis; it is accepted if the new process mean µ1 is significantly less than
or greater than the old process mean µo.
D. One Sided Alternative (Directional) Hypothesis; it is accepted only if the new process mean µ1 is
significantly less than the old process mean µo.
E. One Sided Alternative (Directional) Hypothesis; it is accepted only if the new process mean µ1 is
significantly greater than the old process mean µo.

86
Basic Probability Theory - Probability
Purpose
To define basic probability theory and concepts helpful to understand Six Sigma tools and their
application.
Anatomy

Basic Probability Theory

[Figure: What is the probability of tossing 3 coins and getting 2 “heads” (H) and 1 “tails” (T) in order when each toss is independent from the others? A: Random Experiment: tossing 3 coins. B: Sampling space: HHH HHT HTH THH HTT THT TTH TTT. C: Event: HHT. D: Probability: P(HHT) = 1/8 = 0.125. Six Sigma - Tools & Concepts: BaProDf2_001]

Reference: Juran’s Quality Control Handbook - Chapter 23, Page 21-24

Terminology
A. Random Experiment - Any process that involves chance and that is likely to lead to one or more
results. A random experiment exhibits the following characteristics:
a) The result cannot be predicted with certainty;
b) The set of all possible results can be described before the experiment;
c) The experiment can be repeated at will under the same conditions.
B. Sample Space - The set of all possible outcomes from a random experiment.
C. Event - Any possible subset of the sample space.
D. Probability - A number that is calculated from a sample and that indicates the likelihood of an event
occurring in a population. It can be thought of as a number that indicates the proportion of times an
event would occur if an experiment is repeated a very large number of times. This number ranges
from 0.0 (impossibility of occurrence) to 1.0 (certainty of occurrence), and the sum of the probability
of all possible outcomes is equal to 1.0. When all events are equally likely to occur the probability of
an event (A) is defined as:
P(A) = (Number of results favorable to event A) / (Number of possible results)
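The coin example can be enumerated directly (an illustrative sketch added for clarity):

```python
from itertools import product

# Sample space: every outcome of tossing 3 coins
sample_space = ["".join(t) for t in product("HT", repeat=3)]

# All outcomes are equally likely, so P(HHT) = favorable / possible
p_hht = sample_space.count("HHT") / len(sample_space)
```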

87
Beta Risk β
Purpose
Beta Risk (β) is the probability of accepting the null hypothesis H0 when it is actually false. It is
the risk of not discovering a difference in the sample characteristic of interest (e.g. the mean),
when in reality such a difference does exist. It is also known as the risk of making a Type 2
Error.

Anatomy
Beta Risk β

[Figure: decision table. Columns (A: Truth About Population): Ho is true, Ha is true; rows: Accept Ho, Reject Ho. B: accepting Ho when Ha is true is a Type 2 Error, with probability β. C: rejecting Ho when Ha is true is the correct decision. Note: it is not possible to simultaneously commit a type 1 and a type 2 error; either an alpha or a beta error can be made, but not both. Six Sigma - Tools & Concepts: BetaRisk_001]

Reference: Juran’ Quality Control Handbook – Ch. 23, P. 60-63, The Vision of Six Sigma: Tools and Methods for Breakthrough
by M. J. Harry – Ch. 13, P. 9-11

Terminology
A. The null hypothesis will either be true or false for the population under investigation. Beta risk is the
probability of making a wrong decision when the null hypothesis is in fact false.
B. If the null hypothesis Ho is false, then accepting Ho will be the wrong decision, and a type 2 error is
made. The probability of committing a type 2 error is known as the beta risk, which is important
because controlling it insures against saying there is no difference when in actual fact there is.
C. If the null hypothesis Ho is false, then rejecting Ho will be the correct decision.
Note: A type 1 and a type 2 error cannot be committed simultaneously, since the null hypothesis
cannot be true and false at the same time.

88
Center Point
Purpose
Center point runs are sometimes included in a Two-Level Factorial Design to estimate curvature
(the departure from a linear relationship between the Response and the Factors), to obtain
degrees of freedom for an estimate of background noise, and to check on consistency of the
basic process during the course of an experimental study.

Anatomy
Center Point

[Figure: left, response Y plotted against the factor setting from Lo (-) to Hi (+), comparing the True Effect (curved) with the Experimental Effect (linear) estimated from the two levels; right, the design matrix below. Labels A-D keyed to Terminology below. Six Sigma - Tools & Concepts: CntrPnt_001]

RUN        A    B
1   (1)   -1   -1
2   *      0    0
3   a     +1   -1
4   b     -1   +1
5   *      0    0
6   ab    +1   +1
7   *      0    0

Runs 2, 5 and 7 (marked *) are the Center Point runs.

Reference: Understanding Industrial Experimentation p 237

Terminology
A. Run number.
B. Factors, or independent variables.
C. High and low levels of the Factors.
D. Center points for the Factors, values that are mid-point between the high and low settings. Center
points are chosen for each Factor separately, and should be randomly distributed throughout the
runs.

89
Basic Probability Theory – Conditional Probability
Purpose
To calculate the probability of an event in relation to a second event instead of in relation to the
sample space, e.g.: the probability of occurrence of event B knowing that event A has occurred.
Anatomy

Conditional Probability

[Figure: A: What is the probability of drawing 2 kings in succession from a deck of 52 cards, if no marked cards are dealt? Six Sigma - Tools & Concepts: BaProCPr_001]

Reference: Juran’s Quality Control Handbook - Chapter 23, Page 21-24

Terminology
A. Two successive events (drawing two kings from an unmarked deck of cards). The following formulas
allow us to find the probability of occurrence of B knowing that A has occurred, and the probability of
A knowing that B has occurred:

P(B|A) = P(A ∩ B) / P(A), where P(A) ≠ 0     Likewise: P(A|B) = P(A ∩ B) / P(B), where P(B) ≠ 0

The Multiplication Theorem allows us to calculate the probability of occurrence of two successive
events:

P(A ∩ B) = P(A) x P(B|A)   or   P(A ∩ B) = P(B) x P(A|B)

Multiplication Theorem - For two events with non-zero probability, the probability that the two events
A and B will occur simultaneously is equal to the product of the probability of A multiplied by the
probability of B, knowing that event A has occurred.
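Applying the Multiplication Theorem to the two-kings example (an illustrative sketch): P(A) = 4/52 for the first king, and P(B|A) = 3/51 for the second, since one king and one card are gone.

```python
from fractions import Fraction

p_a = Fraction(4, 52)          # first card drawn is a king
p_b_given_a = Fraction(3, 51)  # second is a king, given the first was

# Multiplication Theorem: P(A and B) = P(A) x P(B|A)
p_two_kings = p_a * p_b_given_a
```

p_two_kings reduces to 1/221, or about 0.0045.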

90
Basic Probability Theory – Continuous Probability Distribution
Purpose
A Probability Distribution is a graph, table or formula used to assign probabilities to all of the
possible outcomes or values of a characteristic measured in an experiment.
Anatomy
Probability Distribution For Continuous Data

[Figure: a smooth curve over the characteristic measured (Diameter). If the characteristic measured can take on any value (length, height, time, money, etc.), then the probability distribution is said to be continuous; the curve represents such a distribution. Six Sigma - Tools & Concepts: BaProDs2_001]

Reference: Juran’s Quality Control Handbook - Chapter 23, Page 21-24

Terminology
A. Continuous Probability Distribution - The characteristic measured can take on any value subject only
to the precision of the measuring instrument (length, time, money, distance, etc.).
B. The sum of all probabilities is equal to 100% (i.e. the area under the curve equals one)
Examples of continuous distributions include the Normal, Log normal, Exponential, Weibull, t, F, Chi-
square, etc.

91
Contrast

Purpose
To make a comparison between experimental factor levels, we use contrasts. Contrast defines a
set of coefficients used in a design matrix, and also refers to an intermediate calculation in
statistical analysis of a DOE.

Anatomy
Contrast

[Figure: a 2x2 factorial example showing A: the factor contrast (vectored) columns, B: the response column, C: the individual vectored column responses, and D: the column contrast. Six Sigma - Tools & Concepts: Contrast_001]

Run Order        A    B   AB      Y    A x Y (vectored response)
1    (1)        -1   -1   +1    1.5    -1 x 1.5  = -1.5
2    a          +1   -1   -1    4.5    +1 x 4.5  =  4.5
3    b          -1   +1   -1    4.5    -1 x 4.5  = -4.5
4    ab         +1   +1   +1   13.5    +1 x 13.5 = 13.5
                    Column contrast for A (sum) =  12.0

Reference: Understanding Industrial Experimentation – pp111 - 123

Terminology
A. Factor contrast.
B. Response.
C. Individual vectored column response, obtained by multiplying the contrast coefficient by the
response.
D. Column contrast obtained by summing the individual vectored column responses.
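The column contrast is just a dot product of the contrast coefficients and the responses. A sketch (the function name is hypothetical):

```python
def column_contrast(coefficients, responses):
    """Sum of the vectored column responses: each response multiplied
    by its contrast coefficient, then summed."""
    return sum(c * y for c, y in zip(coefficients, responses))

# Contrast column for factor A in the example table
a_contrast = column_contrast([-1, +1, -1, +1], [1.5, 4.5, 4.5, 13.5])
```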

92
Control Limits

Purpose
Control Limits are calculated values and lines plotted on a Control Chart, used to determine the
state of statistical control of a process. The Upper and Lower Control Limits are generally equal
to the Mean plus or minus three Standard Deviations, respectively. If a process point exceeds
either the UCL or the LCL, the process is considered to be out of control, and action should be
taken.

Anatomy
Control Limits

[Figure: I Chart for Response, Individual Value vs. Observation Number (0-25). A: UCL (3.0SL = 105.6); B: Center Line (X̄ = 100.3); C: LCL (-3.0SL = 94.98); D: plot of the process points, one of which (flagged "1") exceeds the UCL. Six Sigma - Tools & Concepts: CtrlLmts_001]

Reference: Statistical Process Control – Ford/GM/Chrysler pp1 - 25

Terminology
A. Upper Control Limit (UCL) – The upper range of process control. The UCL is by convention
calculated to equal the process Mean plus three Standard Deviations.
B. Center Line – Calculated as the process Mean over the period being investigated.
C. Lower Control Limit (LCL) – The lower range of process control. The LCL is by convention calculated
to equal the process Mean minus three Standard Deviations.
D. Plot of process sample statistic in chronological order vs. sample number. Any excursion in this plot
above the UCL or below the LCL represents an out-of-control condition and should be investigated.
E. Out of Control Point – A single process point showing the most obvious sign of an Out-of-Control
situation, i.e. being beyond either the UCL or LCL.
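A minimal Python sketch of the calculation for the I chart shown above. Note the assumption: SPC references conventionally estimate sigma for an Individuals chart from the average moving range (sigma-hat = MRbar / 1.128), which yields the 2.66 constant used here; a cruder sketch could simply use the sample standard deviation directly.

```python
import statistics

def i_chart_limits(data):
    """Center line and control limits for an Individuals (I) chart.

    Sigma is estimated from the average moving range (MRbar / 1.128),
    so the three-sigma limits become Xbar +/- 2.66 * MRbar.
    """
    xbar = statistics.mean(data)
    mrbar = statistics.mean(abs(b - a) for a, b in zip(data, data[1:]))
    return xbar - 2.66 * mrbar, xbar, xbar + 2.66 * mrbar

# Hypothetical observations:
lcl, center, ucl = i_chart_limits([100, 102, 99, 101, 100])
print(round(lcl, 2), round(center, 2), round(ucl, 2))  # 95.08 100.4 105.72
```

Any observation falling above `ucl` or below `lcl` would be flagged as an out-of-control point (item E in the figure).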

93
Data Transformations
Purpose
To transform data that is not normally distributed into data that follows a normal distribution, thus allowing
us to calculate basic statistics and valid probabilities related to the population (mean, standard deviation,
Z values, probabilities for defects, yield, etc.).

Anatomy

Data Transformation (figure DataTran_001, Six Sigma - Tools & Concepts):

  TRANSFORMATION (A)                        RANGE OF VARIABLE (B)
  √X                                        0 ≤ X ≤ ∞
  loge X or log10 X                         0 ≤ X ≤ ∞
  loge [X/(1−X)] or log10 [X/(1−X)]         0 ≤ X ≤ +1
  ½ loge [(1+X)/(1−X)]                      −1 ≤ X ≤ +1

  For each row the figure also sketches the original distribution of the
  variable (C) and the resulting distribution after applying the
  transformation (D).

Reference: Juran's Quality Control Handbook - Ch. 23, P. 91-94

Terminology
A. Mathematical transformations - Prior to using these transformations consult the Application
Cookbook.
B. Range of variable studied.
C. Original distribution of the variable.
D. Resulting distribution after applying a mathematical transformation.

Major Considerations
Before using a mathematical transformation apply steps 1 and 2 of the Application Cookbook.

94
Application Cookbook
If the data is not normally distributed (e.g. fails the normality test in Minitab) conduct the following steps:
1. Examine the data to see if there is a non-statistical explanation for the unusual distribution pattern.
For example, if data is collected from various sources (similar machines or individuals performing the
same process) and each one has a different mean or standard deviation, then the combined output
of the sources will have an unusual distribution such as a mixture of the individual distributions. In
this case, separate analyses could be made for each source (individual, machine, etc.).
2. Analyze the data in terms of averages instead of individual values. Sample averages closely follow a
normal distribution even if the population of individual values from which the sample averages came
is not normally distributed.
If conclusions on a characteristic can be made based on the average value proceed but remember
these only apply to the average value and not to the individual values in the population.
3. If steps 1 and 2 do not provide reliable estimates, use the Weibull distribution. Consult the
Application Cookbook of the tool Distribution - Weibull. The resulting straight line can provide
estimates of the probabilities for the population.
4. If all the above steps fail to provide reliable estimates, use one of the most common mathematical
transformations, which include:

   X², eˣ, ln X, log X, √X, 1/X, e⁻ˣ,
   ln [X/(1−X)], log [(1+X)/(1−X)], √[X/(1−X)], ½ ln [(1+X)/(1−X)]
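A short sketch in Python (not part of the original cookbook) of the idea behind step 4: right-skewed data often becomes approximately normal after a logarithmic transformation, which a skewness statistic can confirm. The data here is simulated for illustration.

```python
import math
import random
import statistics

def skewness(xs):
    """Sample skewness: near 0 for symmetric (e.g. normal) data."""
    m = statistics.mean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

random.seed(1)
# Right-skewed data (lognormal), a pattern for which ln(x) is a common remedy.
raw = [random.lognormvariate(0, 0.8) for _ in range(5000)]
transformed = [math.log(x) for x in raw]

print(round(skewness(raw), 2))          # strongly positive (right-skewed)
print(round(skewness(transformed), 2))  # near 0 after the ln transform
```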

95
Defects Per Opportunity
Purpose
To compute the number of defects per opportunity in order to calculate the probability of defect free units
produced by the process.

Anatomy

Defects Per Opportunity (figure DefPerOp_001, Six Sigma - Tools & Concepts):

  DPO (A) = D (B) / TOP (C) = D / (OP (D) x U (E))

Reference: The Vision of Six Sigma: A Roadmap for Breakthrough

Terminology
A. Defects per Opportunity;
B. Number of defects affecting the units produced by the process;
C. Total number of opportunities per characteristic;
D. Number of opportunities per unit;
E. Number of units per characteristic.

Major Considerations
A critical characteristic is an opportunity for defect. One opportunity is an opportunity only if it is
measured (i.e. only active opportunities are considered).

Application Cookbook
1. Select critical characteristics;
2. For each characteristic, count the number of active opportunities;
3. Count the number of defects per characteristic;
4. Apply formula.
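The cookbook steps above reduce to a one-line formula; a Python sketch with hypothetical counts (the figures used here are invented for illustration):

```python
def dpo(defects, opportunities_per_unit, units):
    """Defects per opportunity: DPO = D / (OP x U)."""
    return defects / (opportunities_per_unit * units)

# Hypothetical data: 34 defects observed on 500 units, each unit
# carrying 8 active opportunities.
print(dpo(34, 8, 500))  # 0.0085
```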

96
Defects per Unit
Purpose
The Defects-Per-Unit metric is a fundamental Six Sigma concept. It means that for a process producing
U units of output, with D number of defects observed, then on average, each unit of manufactured
product will contain (D/U) such defects.
Anatomy

Defects per Unit (figure DefPerUn_001, Six Sigma - Tools & Concepts):

  DPU = D (A) / U (B)

Reference:

Terminology
A. Defects - The number of times a process output does not meet the specifications laid out for its
performance.
B. Units - The number of units of process output.

Major Considerations
There are two main types of defects. Uniform defects appear within a unit of product, while Random
defects are intermittent and unrelated. DPU calculations are based on an assumption of random defects.

Application Cookbook
1. Once defect data has been collected, the preferred method of calculating parameters such as DPU is
to use a spreadsheet such as MS Excel, or alternatively, Minitab.
2. The formulae for calculating DPU, given numbers of defects and units of production, are presented
below.

DPU = (number of defects)/(number of units)
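A Python sketch of the DPU formula, using invented counts. The second step is an assumption worth flagging: under the random-defect (Poisson) assumption noted in Major Considerations, the probability that a given unit is defect-free is e^(-DPU), a standard Six Sigma result.

```python
import math

def dpu(defects, units):
    """Defects per unit: DPU = D / U."""
    return defects / units

# Hypothetical data: 120 defects found across 400 units.
d = dpu(120, 400)            # 0.3
# Under the random-defect (Poisson) assumption, the probability that a
# given unit contains zero defects is e^(-DPU).
p_defect_free = math.exp(-d)
print(round(p_defect_free, 4))  # 0.7408
```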

97
Distribution - Binomial

Purpose

To calculate the probability of r occurrences in n trials when the probability of occurrence of an event is
constant for each of n independent trials of the event.

Anatomy

Binomial Distribution for Various Probabilities of Occurrence (P)
(figure DistBino_001, Six Sigma - Tools & Concepts): plots of the
probability of r occurrences (A, vertical axis, 0 to 0.45) against the
number of occurrences r (B, horizontal axis, 0 to 10) for n = 10 trials
with p = 0.1, 0.3 and 0.5 (D), together with the probability function
(E):

  Y = [n! / (r! (n - r)!)] p^r q^(n - r)

where n is the number of trials and p the probability of success on
each trial.

Reference: Juran’s Quality Control Handbook - Ch. 23, P. 27-30

Terminology
A. Vertical axis p(r) - Scale to measure the probability of r occurrences of an event.
B. Horizontal axis - Scale to measure the number of occurrences (r).
C. The probability of having exactly two successes (r = 2) in ten trials when the probability of success in
each trial is equal to 0.10 is approximately 0.2. The probability of having up to two successes is given
by the sum of the probabilities from r = zero to r = 2.
D. Curves of the Binomial distribution for various numbers of trials and probabilities of success. The
probabilities under each curve sum to one (1).
E. Probability Function:

   Y = [n! / (r! (n - r)!)] p^r q^(n - r)    where q = 1 - p

98
Major Considerations
• The population size is at least 10 times the sample size.
• Applicable to discrete distributions.
• The probability of success in each trial is constant.

Application Cookbook
A. Use the Excel function to calculate the individual term binomial distribution probability.
Alternatively, use the tables printed at the end of most statistics books to find the probability of r
successes in n trials when each success has a probability p. For example, the probability of having
up to two successes in three trials, when each trial has a 0.2 probability of success, is equal to 0.9920:

p
n r 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50
2 0 0.9025 0.8100 0.7225 0.6400 0.5625 0.4900 0.4225 0.3600 0.3025 0.2500
1 0.9975 0.9900 0.9775 0.9600 0.9375 0.9100 0.8775 0.8400 0.7975 0.7500

3 0 0.8574 0.7290 0.6141 0.5120 0.4219 0.3430 0.2746 0.2160 0.1664 0.1250
1 0.9928 0.9720 0.9393 0.8960 0.8438 0.7840 0.7183 0.6480 0.5748 0.5000
2 0.9999 0.9990 0.9966 0.9920 0.9844 0.9730 0.9571 0.9360 0.9089 0.8750

4 0 0.8145 0.6561 0.5220 0.4096 0.3164 0.2401 0.1785 0.1296 0.0915 0.0625
1 0.9860 0.9477 0.8905 0.8192 0.7383 0.6517 0.5630 0.4752 0.3910 0.3125
2 0.9995 0.9963 0.9880 0.9728 0.9492 0.9163 0.8735 0.8208 0.7585 0.6875
3 1.0000 0.9999 0.9995 0.9984 0.9961 0.9919 0.9850 0.9744 0.9590 0.9375

5 0 0.7738 0.5905 0.4437 0.3277 0.2373 0.1681 0.1160 0.0778 0.0503 0.0313
1 0.9774 0.9185 0.8352 0.7373 0.6328 0.5282 0.4284 0.3370 0.2562 0.1875
2 0.9988 0.9914 0.9734 0.9421 0.8965 0.8369 0.7648 0.6826 0.5931 0.5000
3 1.0000 0.9995 0.9978 0.9933 0.9844 0.9692 0.9460 0.9130 0.8688 0.8125
4 1.0000 1.0000 0.9999 0.9997 0.9990 0.9976 0.9947 0.9898 0.9815 0.9688
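As an alternative to Excel or printed tables, the probability function above can be evaluated directly; a short Python sketch reproducing the table example:

```python
from math import comb

def binom_pmf(r, n, p):
    """P(exactly r successes in n trials): C(n, r) * p^r * q^(n-r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

def binom_cdf(r, n, p):
    """P(r or fewer successes), the cumulative quantity tabulated above."""
    return sum(binom_pmf(k, n, p) for k in range(r + 1))

# The table example: up to two successes in three trials with p = 0.2.
print(round(binom_cdf(2, 3, 0.2), 4))  # 0.992 (the 0.9920 table entry)
```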

99
Distribution - Chi Square
Purpose

Curve representing the function of the chi-square statistic (χ2).


This statistic is used in the chi-square test (see Chi-square homogeneity test and Chi-square goodness
of fit test).

Anatomy

Chi-square Distribution for Various Degrees of Freedom (ν) (figure
DistChiS_001, Six Sigma - Tools & Concepts): curves of the chi-square
distribution (C) for ν = 2, 4, 6 and 10, with the value of the χ2
distribution (0 to 0.5) on the vertical axis (A) and the χ2 statistic,
from 0 to about 20, on the horizontal axis (B).

Reference: Juran’s Quality Control Handbook - Ch. 23, P. 28,32,68,72

Terminology
A. Vertical axis - Scale to measure the probability at different values of chi-square p(χ2).
B. Horizontal axis - Scale of measure of the chi-square statistic χ2.
C. Curve of the chi-square distribution for various degrees of freedom. The number of degrees of
freedom is represented by the Greek letter ν (nu). The probability that a value X is less than a
specified value χ2, is the area under this curve up to the point χ2.
a. When the number of degrees of freedom (ν) is small the density function is severely
asymmetric. As the number of degrees of freedom increases (ν), the line becomes more
symmetric. As the number of observations (n) becomes very large, the curve resembles a
normal distribution.
b. The mean and variance of the χ2 distribution are ν and 2ν respectively
c. The highest value of p ( χ2 ) in the curve occurs at χ2 = ν - 2 for ν > = 2.

100
Major Considerations
Applicable to continuous and discrete variables.
The number of degrees of freedom is given by the formula ν = n – 1 where n = number of observations in
the sample

Application Cookbook
1. Use Excel functions to calculate the probability and the inverse of the chi-square distribution.
2. In Minitab use the following menu to generate a chi-square distribution:
   CALC>PROBABILITY DISTRIBUTIONS>CHI-SQUARE
3. Alternatively, the tables printed at the end of most statistics books can be used.
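The properties listed under Terminology (mode at ν − 2 for ν ≥ 2) can be checked numerically with the standard chi-square density formula; a Python sketch, offered as a supplement to the Excel/Minitab routes:

```python
import math

def chi2_pdf(x, nu):
    """Density of the chi-square distribution with nu degrees of freedom."""
    return (x ** (nu / 2 - 1) * math.exp(-x / 2)
            / (2 ** (nu / 2) * math.gamma(nu / 2)))

# For nu >= 2 the peak sits at chi-square = nu - 2, as noted above.
nu = 10
mode = nu - 2
print(chi2_pdf(mode, nu) > chi2_pdf(mode - 1, nu))  # True
print(chi2_pdf(mode, nu) > chi2_pdf(mode + 1, nu))  # True
```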

101
Distribution - F
Purpose

A continuous random variable distribution used in the F-test (see F test - 2 variances). In variance
analysis, the F-test verifies if the groups tested are from same mean populations. In regression analysis,
it verifies if there is a connection between the independent and the dependent variables.
Anatomy

The F-Distribution (figure Dist_F_001, Six Sigma - Tools & Concepts):
curves of the F distribution (E) for three (df num, df den) pairs:
(2, 12), (8, 12) and (30, 30), with the value of the F distribution on
the vertical axis (D) and the F statistic, from 0 to 5, on the
horizontal axis (F). The figure also shows the two defining ratios:

  F = (χ1²/ν1) / (χ2²/ν2)   (A)        F = S1² / S2²   (B)

Reference: Juran’s Quality Control Handbook - Ch. 23, P. 68

Terminology
A. F statistic - A continuous random variable equal to the ratio of two independent chi-square random
variables, each divided by its respective degrees of freedom (ν1 and ν2).
B. F-test parameter - To test the similarity of the variances of two populations (σ1² and σ2²) based on two
independent random samples with variances S1² and S2², we use the F-test (see tool F test - 2
variances). If we assume that the variances of the two populations are equal (σ1² = σ2²), then the
following ratio follows an F-distribution:

   F = S1² / S2²

D. Vertical axis Y(F) - Scale to measure the value of the F-statistic function.
E. Probability Density function - Curves representing the F distribution for three different sets of values
for the degrees of freedom of the numerator (df num) and the degrees of freedom of the denominator
(df den). The total area under each of the curves is equal to one (1).
F. Horizontal axis - Scale of measure of the F-statistic.

102
Major Considerations
Samples are taken from populations that follow a normal distribution.
The standard deviation of the populations are estimated by the sample standard deviations.

Application Cookbook
1. Use Excel functions to calculate the F probability distribution and the inverse of the F probability
distribution. (Note: Excel calculates the probability from x to +∞, whereas some tables calculate from
the origin to x. To illustrate: 0.95 = 1 - 0.05.)

2. Alternatively, use the tables printed at the end of most statistic books. For example, for df numerator
= 2 and df denominator = 10 and a 95% confidence level we can say that the probability is 0.05 that
the F-value will be 4.1028 or greater.

df num 1 2 3 4 5 6 7 8 9 10 20 60 120
df den
1 161.4462 199.4995 215.7067 224.5833 230.1604 233.9875 236.7669 238.8842 240.5432 241.8819 248.0156 252.1956 253.2543
2 18.5128 19.0000 19.1642 19.2467 19.2963 19.3295 19.3531 19.3709 19.3847 19.3959 19.4457 19.4791 19.4873
3 10.1280 9.5521 9.2766 9.1172 9.0134 8.9407 8.8867 8.8452 8.8123 8.7855 8.6602 8.5720 8.5494
4 7.7086 6.9443 6.5914 6.3882 6.2561 6.1631 6.0942 6.0410 5.9988 5.9644 5.8025 5.6878 5.6581
5 6.6079 5.7861 5.4094 5.1922 5.0503 4.9503 4.8759 4.8183 4.7725 4.7351 4.5581 4.4314 4.3985
6 5.9874 5.1432 4.7571 4.5337 4.3874 4.2839 4.2067 4.1468 4.0990 4.0600 3.8742 3.7398 3.7047
7 5.5915 4.7374 4.3468 4.1203 3.9715 3.8660 3.7871 3.7257 3.6767 3.6365 3.4445 3.3043 3.2674
8 5.3176 4.4590 4.0662 3.8379 3.6875 3.5806 3.5005 3.4381 3.3881 3.3472 3.1503 3.0053 2.9669
9 5.1174 4.2565 3.8625 3.6331 3.4817 3.3738 3.2927 3.2296 3.1789 3.1373 2.9365 2.7872 2.7475
10 4.9646 4.1028 3.7083 3.4780 3.3258 3.2172 3.1355 3.0717 3.0204 2.9782 2.7740 2.6211 2.5801

20 4.3513 3.4928 3.0984 2.8661 2.7109 2.5990 2.5140 2.4471 2.3928 2.3479 2.1242 1.9464 1.8963
30 4.1709 3.3158 2.9223 2.6896 2.5336 2.4205 2.3343 2.2662 2.2107 2.1646 1.9317 1.7396 1.6835
40 4.0847 3.2317 2.8387 2.6060 2.4495 2.3359 2.2490 2.1802 2.1240 2.0773 1.8389 1.6373 1.5766
50 4.0343 3.1826 2.7900 2.5572 2.4004 2.2864 2.1992 2.1299 2.0733 2.0261 1.7841 1.5757 1.5115

120 3.9201 3.0718 2.6802 2.4472 2.2899 2.1750 2.0868 2.0164 1.9588 1.9105 1.6587 1.4290 1.3519
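A Python sketch of how the F-test parameter above is computed from data. The two samples here are invented for illustration; under the null hypothesis of equal population variances, the resulting ratio would be compared against the tabulated F value for the stated degrees of freedom.

```python
import statistics

# Hypothetical samples from two processes; under H0 (equal population
# variances) the ratio of sample variances follows an F distribution
# with (n1 - 1, n2 - 1) degrees of freedom.
sample1 = [9.8, 10.2, 10.1, 9.7, 10.4, 10.0]
sample2 = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0]

f_stat = statistics.variance(sample1) / statistics.variance(sample2)
df_num, df_den = len(sample1) - 1, len(sample2) - 1
print(round(f_stat, 2), df_num, df_den)  # 3.33 5 5
```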

103
Distribution - Lognormal
Purpose

To describe distributions associated with life spans, reaction times, income distributions, economic data,
etc. If Y = A^X, where X has a normal distribution, Y is said to have a Lognormal distribution.
Anatomy

The Lognormal Distribution (figure DistLogN, Six Sigma - Tools &
Concepts): a right-skewed curve representing a lognormal distribution
(C), with the value of the lognormal distribution (0 to about 0.3) on
the vertical axis (A) and the independent variable X, from 1 to about
41, on the horizontal axis (B).

Reference: Juran’s Quality Control Handbook - Ch. 23, P. 41

Terminology
A. Vertical axis - Scale of measure of the lognormal distribution.
B. Horizontal axis - Scale of measure of the independent variable.
C. Curve representing a lognormal distribution. A variable X has a lognormal distribution if log_A(X) is
normally distributed. Conversely, we can say that if Y = A^X, where X has a normal distribution, then Y
is said to have a lognormal distribution.
Major Considerations
Caution shall be exercised in calculating probabilities and making predictions. For example, if Y
represents the life of a component and Y = A^X where X has a normal distribution, then one would want
to make predictions on the average life of the system, not on the mean of the logarithm of Y. To solve
this problem, consult the Data Transformations tool.

104
Application Cookbook
1. Use Excel functions to calculate the cumulative probability and the inverse of the lognormal
cumulative distribution of x, where ln(x) is normally distributed.
2. In Minitab, use the following menu to generate a lognormal distribution: CALC>PROBABILITY
DISTRIBUTIONS>LOGNORMAL
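The defining property (the logarithm of a lognormal variable is normal) can be demonstrated with simulated data; a Python sketch, with parameters chosen purely for illustration:

```python
import math
import random
import statistics

random.seed(2)
# X lognormal with underlying normal mu = 0, sigma = 0.5: ln(X) is normal.
x = [random.lognormvariate(0.0, 0.5) for _ in range(10000)]
logs = [math.log(v) for v in x]

print(round(statistics.mean(logs), 2))   # near 0.0 (the underlying mu)
print(round(statistics.stdev(logs), 2))  # near 0.5 (the underlying sigma)
```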

105
Distribution - Normal
Purpose

To determine the probability of an event regarding a population and to make predictions about such
population based on estimates of its mean and standard deviation. To represent the way many real life
populations are distributed.
Anatomy

Distribution - Normal (figure DistNorm_001, Six Sigma - Tools &
Concepts): the bell-shaped normal curve (C) centered at the mean µ (A),
with standard deviation σ (B), plotted over the horizontal axis (E)
from -∞ to +∞. The area between points X1 and X2 (F) represents a
probability (D), and the probability function (G) is:

  y = (1 / (σ √(2π))) e^(-(X - µ)² / 2σ²)

Reference: Juran’s Quality Control Handbook - Ch. 23, P. 37

Terminology
A. Population’s mean ( µ ) - Central tendency of the distribution. The area under the curve is split half on
either side of this point.
B. Population’s standard deviation ( σ ) - Measure of dispersion of the distribution given by the
horizontal distance between the mean and the point of inflection of the curve.
C. Normal curve - Bell shaped curve representing the normal distribution. This curve is asymptotic to the
horizontal axis.
D. Probability - The area under the curve represents the probability of occurrence of an event. The total
area from -∞ to +∞ is equal to one (1).
E. Horizontal axis - Axis of measure of a variable such as a CT characteristic.
F. The probability that the value of X lies between points X1 and X2 is equal to the area under the curve
from point X1 to point X2.
G. Probability Function of the normal distribution.

106
Major Considerations
A sample that is representative of the population shall be used to calculate an estimate of the
population’s mean and standard deviation.

For a more common and easier way to calculate probabilities using the standard normal distribution, see
the “Standard Normal Deviate (Z)”.

Application Cookbook
1. Calculate an estimate of the population’s mean and standard deviation.
2. Considering the specification limits of the particular case, determine the values of the independent
variable for which you want to calculate the probability of occurrence of an event. Usually, this is the
point beyond which a defect will occur.
3. Use Excel functions to calculate the normal cumulative distribution and the inverse of the normal
cumulative distribution.
4. In Minitab use the following menu to generate a normal distribution. CALC>PROBABILITY
DISTRIBUTIONS>NORMAL
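Cookbook steps 2 and 3 can also be carried out without Excel or Minitab; a Python sketch of item F (probability that X falls between X1 and X2), with hypothetical process parameters:

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative probability P(X <= x) for a normal population."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Probability that X falls between X1 and X2 (the area in item F),
# using hypothetical estimates of the population mean and sigma.
mu, sigma = 100.0, 5.0
x1, x2 = 95.0, 105.0
p = normal_cdf(x2, mu, sigma) - normal_cdf(x1, mu, sigma)
print(round(p, 4))  # 0.6827, the familiar +/- one-sigma area
```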

107
Distribution - Poisson

Purpose

To calculate the probability of occurrence of an event in a population when there are many opportunities,
but the probability of each trial is low (less than 0.10). To describe the behavior of discrete variables
when the above conditions are met. A discrete random variable only takes whole values.

Anatomy

Poisson Distribution for Different Values of λ (figure DistPois_001,
Six Sigma - Tools & Concepts): curves of the Poisson distribution (D)
for λ = 2, 4, 6 and 10 (E), with the probability of occurrence (0 to
0.3) on the vertical axis (A) and the number of occurrences, from 0 to
20, on the horizontal axis (B). The area under each curve (C) totals
one, and the probability function is:

  Y = (np)^r e^(-np) / r!

Reference: Juran’s Quality Control Handbook - Ch. 23, P. 29

Terminology
A. Vertical axis - Scale to measure the probability of occurrence of an event.
B. Horizontal axis - Scale of measure the number of occurrences.
C. Probability - The area under the curve represents the probability of occurrence of an event. The total
area under the curve is equal to one (1).
D. Curve of the Poisson distribution for various levels of lambda (λ) - The Poisson distribution is a
probability distribution for the number of occurrences per unit interval which can be a unit of time or
space. The Poisson distribution is a good approximation of the binomial distribution for the case
where n is large and p is small.
E. Lambda (λ) - Parameter which represents the average number of occurrences per interval. It is
defined as λ = np where n = no. of trials and p = probability of occurrence. Probability Function:

   Y = (np)^r e^(-np) / r!

108
Major Considerations
Applicable when sample size is at least 16, the population size is at least 10 times the sample size and
the probability of occurrence p on each trial is less than 0.1.

Events occur at random and are roughly proportional to the length of time, volume of space or area
under study. Also, there is no overlapping of events (“clumping”).

Application Cookbook
1. Use Excel functions to calculate the probability of an occurrence for a discrete variable that follows
the Poisson distribution. For example, the probability of 12 or fewer occurrences of an event that has
an average number of occurrences of 5.2 is equal to 0.997.
2. In Minitab use the following menu to generate a Poisson distribution. CALC>PROBABILITY
DISTRIBUTIONS>POISSON
3. Alternatively, the tables printed at the end of most statistics books can be used.
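The cookbook example is easy to reproduce directly from the probability function; a Python sketch:

```python
import math

def poisson_pmf(r, lam):
    """P(exactly r occurrences) when the mean rate is lam = np."""
    return lam**r * math.exp(-lam) / math.factorial(r)

def poisson_cdf(r, lam):
    """P(r or fewer occurrences)."""
    return sum(poisson_pmf(k, lam) for k in range(r + 1))

# The cookbook example: 12 or fewer occurrences with an average of 5.2.
print(round(poisson_cdf(12, 5.2), 3))  # 0.997
```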

109
Distribution – Probability Plot

Purpose

To display an estimate of the cumulative distribution function which best fits a given set of data

Anatomy

Distribution - Probability Plot (figure DistPrPl_001, Six Sigma - Tools
& Concepts): a probability plot (Mean: 0.368571, StDev: 0.319178) with
the data scale on the horizontal axis (A), the cumulative percentage,
from 1 to 99, on the vertical axis (B), the best-fit line generated by
linear regression (C), the actual data points (D), and confidence
intervals on either side of the line (E).

Reference: Juran Quality Control Handbook Ch. 23

Terminology
A. Data scale.
B. Cumulative percentage, based upon an assumed probability distribution.
C. Best- fit line generated by Linear Regression.
D. Actual data points plotted against the probability.
E. Confidence intervals.
Major Considerations

Fits of data using Probability Plots can be done by assuming a variety of distributions. While the best
distribution to start with is the Normal, the data can also be tested against the Lognormal, Weibull,
Exponential, etc.
Probability plots are best done using Minitab’s Graph>Probability Plot function.

110
Application Cookbook
1. Gather the data and tabulate it in column form.
2. Select the probability distribution to be used to test the data.
3. Given the observed data, calculate the cumulative relative frequency of the data, in percent.
4. Calculate the expected (i.e. theoretical) cumulative relative frequency in percent, based upon the
chosen frequency distribution.
5. On a cumulative probability graph, plot the actual data points against a straight line plot of the
theoretical distribution.
6. If required, decide the level of statistical confidence required, calculate the confidence limits for the
distribution and plot them as curves on either side of the theoretical distribution.
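Cookbook steps 3 to 5 can be sketched numerically: pair each ordered observation with a theoretical quantile and measure how straight the resulting plot is. This Python sketch assumes a normal fit and median-style plotting positions (i − 0.5)/n; packages differ slightly in the position formula they use, and the data is invented.

```python
import math
import statistics

def pearson_r(a, b):
    """Plain Pearson correlation coefficient."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a)
                    * sum((y - mb) ** 2 for y in b))
    return num / den

def normal_plot_points(data):
    """Pair each ordered observation with its theoretical normal quantile,
    using (i - 0.5) / n plotting positions."""
    xs = sorted(data)
    n = len(xs)
    nd = statistics.NormalDist()          # standard normal
    return [(nd.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]

# Hypothetical data; a correlation near 1 between quantiles and data means
# the points hug the best-fit line of the probability plot (item C).
data = [0.1, 0.25, 0.3, 0.4, 0.45, 0.6, 0.48]
pts = normal_plot_points(data)
r = pearson_r([q for q, _ in pts], [x for _, x in pts])
print(round(r, 3))
```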

111
Distribution - t
Purpose

Symmetric, bell shaped distribution that resembles the standardized normal (Z) distribution, but with
more area in its tails. That is, with more variability than the Z distribution. The t-test is used to test
population means when small samples are involved. (See T-Test - One Sample and T-Test - Two
Sample).
Anatomy

T-Distribution for Various Degrees of Freedom (df) (figure Dist_T_001,
Six Sigma - Tools & Concepts): curves of the t distribution (D) for
df = 2 and df = 20, with the value of the t distribution (0 to 0.4) on
the vertical axis (A) and the t statistic, from -5 to +5, on the
horizontal axis (B). The shaded area under the curve (C) represents a
probability.

Reference: Juran’s Quality Control Handbook - Ch. 23, P. 64, 65 & 66

Terminology
A. Vertical axis - Scale to measure the probability at different values of the t statistic.
B. Horizontal axis - Scale of measure of the t statistic.
C. Area under the curve representing the probability that the t-statistic will take on specific values.
D. Curve of the t Distribution for various degrees of freedom. The total area under each of the curves is
equal to one (1). The t distribution has the following major properties:
a. It is centered around zero and is symmetric about its mean.
b. Its variance is greater than 1, but as the sample size (n) increases, the variance approaches
1.
c. The t distribution has less area in the middle and more in the tails than the Z distribution.
d. The t distribution approaches the Z distribution as the number of degrees of freedom (df)
increases. The number of degrees of freedom equals the sample size (n), minus the number
of population parameters which must be estimated from sample observations. In this case we
must estimate µ, therefore df = n – 1.

112
Major Considerations
The t distribution is used when sample size (n) is small and the population’s standard deviation (σ) is
unknown.
The parent population from which the sample is taken follows a normal distribution with mean µ.

Application Cookbook
1. Use Excel functions to calculate the t distribution and the inverse of the t distribution. For example,
there is a 0.05 probability that a sample with 10 degrees of freedom would have t = 1.812 or greater.
2. Note: When using the function TINV, the probability is multiplied by 2, since Excel returns the value
associated with a two-tail curve.
3. In Minitab use the following menu to generate a t distribution: CALC>PROBABILITY
DISTRIBUTIONS>T
4. Alternatively, the tables printed at the end of most statistics books can be used.
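The heavier tails described in item D can be checked directly from the standard density formula for Student's t; a Python sketch (a supplement to the Excel/Minitab routes above):

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

# Heavier tails than the standard normal: compare densities at t = 3.
z_pdf_at_3 = math.exp(-4.5) / math.sqrt(2 * math.pi)  # standard normal at 3
print(t_pdf(3, 2) > z_pdf_at_3)  # True: more tail area with few df
```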

113
Distribution -Weibull
Purpose

To make predictions regarding a population. The Weibull distribution is applicable in describing a wide
variety of patterns of variation, including departures from the normal and exponential.
This distribution covers many shapes of distributions, thus reducing the problem of deciding which of the
common distributions (e.g. normal or exponential) best fits a set of data.

Anatomy

The Weibull Distribution (figure DistWeib_001, Six Sigma - Tools &
Concepts): curves of the Weibull function (C) for α = 1 with β = 0.5,
1, 2 and 4, where α is the scale parameter and β the shape parameter.
The value of the Weibull function (0 to 2) is on the vertical axis (A)
and the independent variable X, from 0.1 to 2.5, on the horizontal
axis (B).

Reference: Juran’s Quality Control Handbook - Ch. 23, P. 34-37

Terminology
A. Vertical axis - Scale to measure the value of the Weibull function.
B. Horizontal axis - Scale of measure of the independent variable (X).
C. Curves representing the Weibull Distribution for different values of its parameters β, α and γ.
β (Beta) Shape parameter - Reflects the pattern of the curve. When β = 1.0 the Weibull function
reduces to the exponential function, and when β is about 3.5 (and α = 1 and γ = 0), it closely
approximates the normal distribution.
α (Alpha) Scale parameter - As α changes, the curve becomes flatter or more peaked.
γ (gamma) Location parameter - Smallest possible value of X (often assumed to be zero, thereby
simplifying the equation).

114
Major Considerations
The location parameter is usually assumed zero to simplify calculations.

Application Cookbook

1. An analytical approach for the Weibull distribution (even with tables) is cumbersome, and predictions
are usually made with Weibull probability paper.
2. To create a plot in Minitab, enter the values in a column and choose the menu Graph > Probability
Plot specifying the column where data is entered.
3. Observe if the points fall approximately in a straight line, and if so, read the probability predictions
from the graph. For example, based on a sample taken on the life of a component, we want to predict
the percentage failure of the population. The failure data (expressed in hours) is 10,263, 12,187,
16,908, 18,042 and 23,271. Applying steps 1 and 2 we obtain the following graph.
Weibull Probability Plot for DATA (95% confidence interval; Shape:
3.91148, Scale: 17861.7), plotting percent failed, from 1 to 99,
against the data, from 2000 to 40000 hours.

Reading the graph we can see that about 80% of the population will fail in less than 20,000 hours.
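The graphical reading can be cross-checked numerically with the two-parameter Weibull cumulative distribution function, using the shape and scale estimates shown on the plot:

```python
import math

def weibull_cdf(x, shape, scale):
    """P(failure by time x) for a two-parameter Weibull (location = 0)."""
    return 1 - math.exp(-((x / scale) ** shape))

# Parameters estimated from the five failure times in the plot above.
shape, scale = 3.91148, 17861.7
print(round(weibull_cdf(20000, shape, scale), 2))  # 0.79: about 80% fail by 20,000 h
```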

115
Experiment
Purpose

To identify, verify, and optimize the influence of the leverage variables associated with a manufacturing
process.
Anatomy

Experiment (figure Experimt_001, Six Sigma - Tools & Concepts): a
flowchart of the experimentation process, from deciding whether a
statistical experiment is required (A), through establishing the
experimental objectives (B) and selecting the independent variables to
be included in the experiment (C), to choosing among screening (D, E),
characterization (F, G) and optimization (H) experiments, and finally
checking whether practical effects are observed (I) and whether the
objectives are achieved (J) before ending.

Reference: The Vision of Six Sigma: Supplier Breakthrough pp 2.29 to 2.34

Terminology
A. The first step is to determine whether or not a statistical experiment is actually justified or needed.
B. Next, determine the objectives of the experiment.
C. Determine the Factors, or independent variables, to be studied.
D. If the number of Factors is large, it may necessitate a screening experiment.
E. Conduct a screening experiment if the number of Factors is greater than 8.
F. If a high degree of resolution is not required, then a Characterization experiment may be conducted.
G. Conduct a Characterization experiment.
H. If a high degree of resolution is required, conduct an optimization experiment.
I. If no practical effects are observed from the experiment, then the Factors should be re-considered
and the experiment re-run.
J. If the objectives have been met, then the experiment is concluded, otherwise the assumptions should
be re-visited and the experiment re-run.

116
Major Considerations

Time, desired accuracy, and cost are major factors in determining whether to proceed with an
experiment, or whether to conduct a fractional experiment.
Careful selection of the experimental factors is critical to the success or failure of the experiment.

Application Cookbook
1. Determine the nature of the problem to be investigated by the experiment.
2. Establish the goals and objectives of the experiment.
3. Select the response variable(s) for the experiment.
4. Select the independent variable(s), or factors.
5. Choose the factor levels .
6. Select the experimental design.
7. Conduct the experiment and collect the data.
8. Analyze the data.
9. Draw the experimental conclusions.
10. Achieve the objective.

117
Factorial Experiment - Blocking

Purpose

To divide an experimental design into a series of experimental spaces or periods of time, in such a way
that bias effects are negligible within the block. In other words, the variation due to noise is minimized
within the block.

Anatomy

Factorial Experiment - Blocking (figure FctExBlk_001, Six Sigma - Tools
& Concepts): a 16-run, four-factor design listed in Yates Standard
Order (A), with coded columns A-D (-1/+1) and the corresponding actual
settings Reac (1/2), Temp (230/300), Time (30/60) and Cat (Std/New).
The design is divided into two blocks (B): Block 1 runs are conducted
during the day and Block 2 runs at night, and each block is executed in
its own randomized sequence of 8 runs (C, D).

Reference: Statistics for Experimenters pp 102 - 104

Terminology
A. Experimental run number in Yates Standard Order.
B. Block number. In this example, the design is divided into two blocks, runs conducted during the day
and runs conducted at night.
C. The randomized sequence of 8 runs for Block number 1 (runs conducted during the day).
D. The randomized sequence of 8 runs for Block number 2 (runs conducted during the night).

118
Major Considerations

Blocking can increase the precision of the test for factor effects by reducing the size of the error term.
Blocking is also very useful when running an experiment on a process that is out of control.
Blocking should only be used for reducing error from unavoidable sources, and should not be used for
dealing with avoidable sources of error which could be dealt with during the initial design of the
experiment.
Minitab has the capability of generating a blocked design.

Application Cookbook
1. Select the experimental design based upon the number of Factors, levels, and desired resolution.
2. Select number of blocks based upon the experimental situation, by selecting portions of the
experiment and grouping runs where certain Factors are expected to be more homogeneous than
others.
3. Divide the experimental runs into the number of blocks.
4. Randomize the experiment by ensuring that the runs are conducted in random order.

119
Factorial Experiment - Randomization

Purpose

Factorial experiment runs, and the allocation of the experimental resources, should be done in random
order to average out the effects of extraneous sources of variation, placing the effects of noise
throughout the experiment.

Anatomy

Factorial Experiment - Randomization (figure FctExRan_001, Six Sigma -
Tools & Concepts):

Original sequence (A), in Yates Standard Order:

  Run Order        A    B    C   AB   AC   BC  ABC      Y
  1   (1)         -1   -1   -1    1    1    1   -1    45.1
  2   a            1   -1   -1   -1   -1    1    1    72.7
  3   b           -1    1   -1   -1    1   -1    1    41.7
  4   ab           1    1   -1    1   -1   -1   -1    70.4
  5   c           -1   -1    1    1   -1   -1    1    57.4
  6   ac           1   -1    1   -1    1   -1   -1    85.7
  7   bc          -1    1    1   -1   -1    1   -1    50.7
  8   abc          1    1    1    1    1    1    1    87.5

Randomized sequence (B):

  Run  Std Run          A    B    C   AB   AC   BC  ABC      Y
  1    1   (1)         -1   -1   -1    1    1    1   -1    45.1
  2    5   c           -1   -1    1    1   -1   -1    1    57.4
  3    8   abc          1    1    1    1    1    1    1    87.5
  4    4   ab           1    1   -1    1   -1   -1   -1    70.4
  5    2   a            1   -1   -1   -1   -1    1    1    72.7
  6    7   bc          -1    1    1   -1   -1    1   -1    50.7
  7    6   ac           1   -1    1   -1    1   -1   -1    85.7
  8    3   b           -1    1   -1   -1    1   -1    1    41.7

Reference: Statistics for Experimenters pp

Terminology
A. Three factor, two-level Full Factorial Experiment, shown unrandomized, in Yates Standard Order.
B. Same Full Factorial Experiment, randomized.

Major Considerations

The consequences of an erroneous conclusion based upon a non-randomized design often justify the
cost and complexity of performing randomization.
Completely randomizing an experiment, changing Factor settings each time, can add significant cost to
the experiment.
Randomization of run order is best performed using Minitab, which allows variation of the base number
for the random number generation.

Application Cookbook
1. Select desired experimental design, based upon desired results, number of Factors, number of
levels, etc.
2. In Minitab, select Stat>DOE>Create Factorial Design, and enter desired “Type of Design” and
“Number of Factors”.
3. Click on “Designs…” button, and select the specific design, and the number of center points,
replicates and blocks.
4. Back at the “Factorial Designs” window, click on “Options…” button, and ensure that the “Randomize
Runs” button is set to on. This will ensure that when Minitab generates the specific design requested,
it will list the runs in completely random order, as shown by the random Yates order column. Minitab
will still show the “Run Order” in numerical sequence.
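Randomizing the run order, as in step 4, can be sketched as follows; the seed plays the role of Minitab's base number and is an arbitrary choice here:

```python
import random

# The 8 runs of a 2^3 full factorial, labeled in Yates standard order.
yates_order = ["(1)", "a", "b", "ab", "c", "ac", "bc", "abc"]

# Fixing the seed (Minitab's "base number") makes the random run order
# reproducible; sample without replacement gives a random permutation.
random.seed(7)
run_sequence = random.sample(yates_order, k=len(yates_order))
```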

Factorial Experiment - Replication
Purpose

Replication is the systematic duplication of a series of experimental runs, in order to increase precision or
to provide the means for measuring precision by calculating the experimental error. For Robust Design,
replication allows us to analyze the response mean and variance.
Anatomy
Factorial Experiment - Replication

First Replicate (A)

Run  Order   A   B   AB    Y
 1   (1)    -1  -1   1   40.7
 2   a       1  -1  -1   74.2
 3   b      -1   1  -1   44.1
 4   ab      1   1   1   72.6

Second Replicate (B)

Run  Order   A   B   AB    Y
 5   (1)    -1  -1   1   38.6
 6   a       1  -1  -1   68.4
 7   b      -1   1  -1   42.1
 8   ab      1   1   1   86.0

Six Sigma - Tools & Concepts FctExRep_001

Reference: Statistics for Experimenters p 105

Terminology
A. First experimental replicate.
B. Second experimental replicate.
C. Experimental Factors are duplicated exactly from one replicate to the next.
Major Considerations

Replication should be carried out in such a way that variation among replicates can provide an accurate
measure of errors that affect comparisons between runs.
Replication is best performed using Minitab.

Application Cookbook
1. Select desired experimental design, based upon desired results, number of Factors, number of
levels, etc.
2. In Minitab, select Stat>DOE>Create Factorial Design, and enter desired “Type of Design” and
“Number of Factors”.
3. Click on “Designs…” button, and select the specific design, and the number of center points,
replicates and blocks.
4. If the runs are not randomized, then replicating the design n times will cause n identical designs to be
generated in order, “stacked” one after the other. If randomization is selected, then the replicated
runs will be randomized.
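As a rough sketch of what replication buys, the snippet below uses the Y values from the anatomy tables to compute a mean response and an error variance at each factor combination (Python's statistics module is used here; Minitab's ANOVA does this more formally):

```python
from statistics import mean, variance

# The two replicates of the 2^2 design shown in the anatomy.
replicate_1 = {"(1)": 40.7, "a": 74.2, "b": 44.1, "ab": 72.6}
replicate_2 = {"(1)": 38.6, "a": 68.4, "b": 42.1, "ab": 86.0}

# Replication provides a mean response and an estimate of experimental
# error (sample variance) at each factor combination.
summary = {run: (mean([replicate_1[run], replicate_2[run]]),
                 variance([replicate_1[run], replicate_2[run]]))
           for run in replicate_1}
```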

Failure Mode and Effects Analysis (FMEA) Part 1 of 3
Purpose
To assure, in an analytical and systematic manner, that potential process or product failure modes, their
associated causes, and the potential customer effects of the failures, have been considered and
addressed.

Anatomy

Failure Modes and Effects Analysis (FMEA) - Part 1 of 3

[FMEA form header:]
Process or Product Name (A)    Prepared by (C)    Page ____ of ____
Responsible (B)                FMEA Date (Orig) ______ (Rev) ______ (D)

[FMEA form columns:]
Process Step/Part Number (E) | Potential Failure Mode (F) |
Potential Failure Effects (G) | SEV (H) | Potential Causes | OCC |
Current Controls | DET | RPN | Actions Recommended | Resp. |
Actions Taken | SEV | OCC | DET | RPN

Six Sigma - Tools & Concepts

Reference: Potential Failure Mode and Effects Analysis Reference Manual – Chrysler, Ford and General Motors

Terminology
A. Process or Product Name – Description of Process or Product being analyzed.
B. Responsible – Name of Process Owner.
C. Prepared By - Name of Agent coordinating FMEA study.
D. FMEA Date – Dates of Initial and subsequent FMEA Revisions.
E. Process Step/Part Number – Description of individual item being analyzed.
F. Potential Failure Mode – Description of how the process could potentially fail to meet the process
requirements and/or design intent, i.e. a description of a non-conformance at that specific process
step.
G. Potential Failure Effects – Description of the effects of the Failure Mode upon the customer, i.e. what
the next user of the process or product would experience or notice.
H. SEV (Severity) – An assessment of the seriousness of the effect of the potential failure mode upon
the customer.

Major Considerations
A Failure Mode and Effects Analysis should be updated after each change introduced to the process.
This means that this analysis is never over, unless the process is completely withdrawn.
Criteria can be adapted to the specific situation.

Application Cookbook
1. Identify the form by entering the basic information from the analysis.
2. List the process steps (only those chosen from the Cause and Effect Matrix).
3. List all potential failure types for each process step (for example, in a Brainstorming session).
4. List the potential effects for each failure type.
5. Establish the severity rate of each effect.

SEVERITY CRITERIA
10 Hazardous Without Warning
9 Hazardous With Warning
8 Very High
7 High
6 Moderate
5 Low
4 Very Low
3 Minor
2 Very Minor
1 None

Failure Mode and Effects Analysis (FMEA) Part 2 of 3


Purpose
To assure, in an analytical and systematic manner, that potential process or product failure modes, their
associated causes, and the potential customer effects of the failures, have been considered and
addressed.

Anatomy
Failure Modes and Effects Analysis (FMEA) - Part 2 of 3

[Same FMEA form as Part 1. The columns labeled here are:]
Potential Causes (A) | OCC (B) | Current Controls (C) | DET (D) | RPN (E)

FMEA2_3_001
Six Sigma - Tools & Concepts

Reference: Potential Failure Mode and Effects Analysis Reference Manual – Chrysler, Ford and General Motors

Terminology
A. Potential Causes – Description of how the failure could occur, described in terms of something that
can be corrected or controlled.
B. OCC (Occurrence) – Description of how frequently the specific failure cause is expected to occur,
ranked on a scale of 1 to 10 as per the table below.
C. Current Controls – Description of process controls that either prevent, to the extent possible, the
failure mode from occurring, or detect the failure mode should it occur.
D. DET (Detection) – An assessment of the probability that the current controls will detect the potential
cause, or the subsequent failure mode.
E. RPN (Risk Priority Number) – The product of the Severity, Occurrence, and Detection Rankings i.e.
RPN = SEV * OCC * DET.

Major Considerations
A Failure Mode and Effects Analysis should be updated after each change introduced to the
manufacturing or assembly process. This means that this analysis is never over, unless the process is
completely withdrawn from production.

Application Cookbook
1. Determine the possible causes of the failures identified (for example, in a Brainstorming session).
2. Assign a level of occurrence (frequency of the failure).
3. Identify the control methods for each potential cause listed.
4. Establish the level of detection that describes the probability that the failure will be detected by the
existing controls.
5. Calculate the Risk Priority Number: RPN = level of severity x level of occurrence x level of detection.
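Steps 1-5 can be illustrated with a short calculation; the failure modes and ratings below are hypothetical:

```python
# Hypothetical FMEA rows: (failure mode, SEV, OCC, DET).
rows = [
    ("wrong label",    7, 4, 3),
    ("missing screw",  8, 2, 6),
    ("scratched case", 4, 5, 2),
]

# RPN = SEV * OCC * DET; sorting by RPN ranks the failure modes by risk.
ranked = sorted(((mode, sev * occ * det) for mode, sev, occ, det in rows),
                key=lambda item: item[1], reverse=True)
```

The highest-RPN failure modes are the natural candidates for the Actions Recommended column in Part 3.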

OCCURRENCE DETECTION
10 ≥1 in 2 Very High 10 Absolute Uncertainty
9 1 in 3 Very High 9 Very Remote
8 1 in 8 High 8 Remote
7 1 in 20 High 7 Very Low
6 1 in 80 Moderate 6 Low
5 1 in 400 Moderate 5 Moderate
4 1 in 2,000 Moderate 4 Moderately High
3 1 in 15,000 Low 3 High
2 1 in 150,000 Low 2 Very High
1 ≤1 in 1,500,000 Remote 1 Almost Certain

Failure Mode and Effects Analysis (FMEA) Part 3 of 3


Purpose
To assure, in an analytical and systematic manner, that potential process or product failure modes, their
associated causes, and the potential customer effects of the failures, have been considered and
addressed.

Anatomy

Failure Modes and Effects Analysis (FMEA) - Part 3 of 3

[Same FMEA form as Parts 1 and 2. The columns labeled here are:]
Actions Recommended (A) | Resp. (B) | Actions Taken (C) |
revised SEV (D) | revised OCC (E) | revised DET (F) | revised RPN (G)

FMEA3_3_001
Six Sigma - Tools & Concepts

Reference: Potential Failure Mode and Effects Analysis Reference Manual – Chrysler, Ford and General Motors

Terminology
A. Actions Recommended – Actions to reduce any or all of the Occurrence, Severity or Detection
rankings.
B. Responsibility – Person or group responsible for the Recommended Action.
C. Actions Taken – Brief description of actual action and effective date.
D. New SEVERITY Rating after corrective action.
E. New OCCURRENCE Rating after corrective action.
F. New DETECTION Rating after corrective action.
G. Resulting new RPN after corrective action.

Major Considerations
A Failure Modes and Effects Analysis should be updated after each change introduced to the
manufacturing or assembly process. This means that this analysis is never over, unless the process is
completely withdrawn from production.

Application Cookbook
1. Recommend corrective measures.
2. Identify the persons or department responsible for implementing corrective measures and a
performance date for the action plan.
3. Follow up the recommendations by indicating what measures have been taken.
4. Recalculate the Risk Priority Numbers after implementing the corrective measures.

Fishbone Diagram

Purpose

The Fishbone Diagram, also known as the Cause and Effect Diagram or Ishikawa Diagram, is a
graphical construct used to identify and explore on a single chart, in increasing detail, the possible
causes which lead to a given effect. The ultimate aim is to work down through the causes to identify
basic root causes of a problem.

Anatomy
Fishbone Diagram

[Diagram: a horizontal "backbone" arrow points to the Effect box (E) on
the right. Six diagonal bones carry the Major Cause Categories 1-6 (A).
Each bone holds high level causes (B), root causes (C, causes of a
cause), and secondary root causes (D, causes of a root cause).]
Fishbon1_001
Six Sigma - Tools & Concepts

Reference: The Memory Jogger P. 23 – 30 Juran Quality Control Handbook P. 22.37 - 22.38

Terminology
A. Major Cause Categories.
B. High level cause.
C. Root cause, i.e. cause of a cause.
D. Secondary root cause, i.e. cause of a root cause.
E. The effect of the causes, i.e. the problem whose causes are being investigated.

Major Considerations

The Major Cause Categories are not firmly defined, and can easily vary according to the situation, or the
type of problem being studied. For example, six Categories are typically used in Manufacturing
processes: Materials, Machine, Measurement, Methods, Manpower and Milieu (Environment). Similarly,
four are typically used in administrative processes: Personnel, Plant Facilities, Policies and Procedures.
Depending on the situation, other categories are possible.
Fishbone Diagrams are best prepared in a team setting using Brainstorming techniques, but can also be
based on process data if it is available. The same cause should not be used on several exercises.

Application Cookbook
1. Select the Problem Statement, or Effect, summarized in a few key words, and place it in a box on the
right side of the new diagram.
2. Select the Major Cause Categories, according to the specific situation and problem statement, and
connect them with a straight line to the “backbone” of the diagram.
3. Place brainstormed or data-driven causes in the appropriate Category.
4. Place Root Causes, against each of the main causes.
5. Continue driving down, identifying further lower-level Root Causes.

Fitted Line Plot

Purpose

To graphically represent the relationship between a continuous dependent variable (Y) and a
continuous independent variable (X), for a process operating in its natural state.

Anatomy
Fitted Line Plot

[Regression plot of Hardness (vertical axis, -1000 to 6000) versus
Density (horizontal axis, 100 to 400), showing the plotted data, the
fitted regression line (A), the 95% confidence bands (B), and the 95%
prediction bands (C). The legend reads: Regression, 95% CI, 95% PI.]

Regression equation (D): Y = -1521.41 + 15.6152X
Coefficient of determination (E): R-Sq = 0.911
Six Sigma - Tools & Concepts FitLnPlt_001

Reference: Minitab Reference Manual Ch. 2 P. 26-29

Terminology
A. Fitted Line through the data. The regression line is simply the one that best fits the plotted data, and
relates the independent variable (X) to our dependent variable (Y) (see tool Linear Regression).
B. Confidence bands about the fitted regression line at a specified confidence level, (usually 95% level).
In other words, we are 95% confident that the true regression line falls within these two bands.
C. Prediction bands about the fitted regression line at the confidence level specified, (usually 95% level).
In other words, we are 95% confident that a new individual observation will fall within these two bands.
D. Regression Equation - A prediction equation, which allows the values of the inputs to predict a
corresponding output.
E. Coefficient of Determination: r2, is a number that represents the adequacy of the regression model or
the amount of the variation in Y that can be explained by the regression equation.

Major Considerations

This tool should be used after finding a satisfactory model through regression analysis.

Application Cookbook
1. Carry out regression analysis.
2. Analyze data with Minitab:
– Use the Function under Stat > Regression > Fitted Line Plot.
– Input the name of the dependent variable into the ‘Response’ field, and the name of the
independent variable into the ‘Predictor’ field.
– Go to options and select ‘Display confidence bands’ and ‘Display prediction bands’.
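The quantities the Fitted Line Plot reports come from an ordinary least-squares fit, which can be sketched as follows (the density/hardness pairs are made up for illustration; Minitab computes the same slope, intercept and R-Sq):

```python
# Least-squares fit of Y on X for hypothetical (density, hardness) pairs.
xs = [100.0, 200.0, 300.0, 400.0]
ys = [450.0, 1900.0, 3100.0, 4800.0]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
syy = sum((y - y_bar) ** 2 for y in ys)

slope = sxy / sxx                     # regression coefficient
intercept = y_bar - slope * x_bar     # line passes through (x_bar, y_bar)

# r^2: the fraction of the variation in Y explained by the regression.
r_squared = sxy ** 2 / (sxx * syy)
```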

Fold-Over Design
Purpose

To produce a mirror image of a given design, in order to separate confounded interactions. Generally, this
converts designs of Resolution III to designs of Resolution IV.
Anatomy
Fold-Over Design

Full Factorial Matrix (A)

Run  Order   A   B   C   AB  AC  BC  ABC (Generator)
 1   (1)    -1  -1  -1   1   1   1   -1
 2   a       1  -1  -1  -1  -1   1    1
 3   b      -1   1  -1  -1   1  -1    1
 4   ab      1   1  -1   1  -1  -1   -1
 5   c      -1  -1   1   1  -1  -1    1
 6   ac      1  -1   1  -1   1  -1   -1
 7   bc     -1   1   1  -1  -1   1   -1
 8   abc     1   1   1   1   1   1    1

Block 1 (B)

Run  Order   A   B   C   AB  AC  BC  ABC (Generator)
 1   (1)    -1  -1  -1   1   1   1   -1
 4   ab      1   1  -1   1  -1  -1   -1
 6   ac      1  -1   1  -1   1  -1   -1
 7   bc     -1   1   1  -1  -1   1   -1

Block 2 (C)

Run  Order   A   B   C   AB  AC  BC  ABC (Generator)
 2   a       1  -1  -1  -1  -1   1    1
 3   b      -1   1  -1  -1   1  -1    1
 5   c      -1  -1   1   1  -1  -1    1
 8   abc     1   1   1   1   1   1    1

Notice that the “signs” of the factor settings in Block 1 are the
reverse of those given in Block 2. In other words, Block 2 is the
mirror image of Block 1.

Six Sigma - Tools & Concepts FldOvrDe_001

Reference: Statistics for Experimenters page 340

Terminology
A. 2³ full factorial design.
B. First “folded” Resolution III block, consisting of half the runs.
C. Second “folded” Resolution III block, consisting of mirror images of the Factor level settings from
the first block.
Major Considerations

Fold-over designs leave the main effects of the k factors unconfounded with block variables.
All the two-factor interactions are confounded with blocks.

Application Cookbook
1. For a given Resolution III 2^(k-p) fractional factorial experiment, duplicate the first fraction by adding
the mirror image design obtained by changing the high level settings to low level and the low level
settings to high level.
2. Re-run the experiment using this new Resolution IV design.
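The fold-over operation itself is just a sign reversal, as this sketch shows (block 1 here is the ABC = -1 half-fraction from the anatomy):

```python
# Block 1: a Resolution III half-fraction (the ABC = -1 runs of a 2^3 design).
block_1 = [
    {"A": -1, "B": -1, "C": -1},   # (1)
    {"A":  1, "B":  1, "C": -1},   # ab
    {"A":  1, "B": -1, "C":  1},   # ac
    {"A": -1, "B":  1, "C":  1},   # bc
]

# The fold-over block is the mirror image: every -1 becomes +1 and vice versa.
block_2 = [{factor: -level for factor, level in run.items()} for run in block_1]
```

Together the two blocks recover all 8 runs of the full 2³ design.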

F-Test – Two Variances
Purpose

To compare the variances of two populations on a continuous CT characteristic. Since we don’t know the
population variances, an analysis of two samples of data is required. This test is usually used to
determine if there is a statistically significant change in the variance of a CT characteristic under two
conditions.
Anatomy
F-Test - Two Variances

Hypotheses (A):

H0: σ1² = σ2²   vs.   Ha: σ1² ≠ σ2²
                      Ha: σ1² > σ2²
                      Ha: σ1² < σ2²

Excel output (B):

F-Test Two-Sample for Variances

                          Variable 1   Variable 2
Mean (C)                     50.0571      50.0813
Variance (C)                  1.0677      6.63359
Observations (C)                  22           22
df (D)                            21           21
F (E)                        0.16095
P(F<=f) one-tail (F)         4.9E-05
F Critical one-tail (G)       0.4798

Six Sigma - Tools & Concepts FTst2Var_001

Reference: Juran’s Quality Control Handbook - Ch. 23 P. 68, Basic Statistics by Kiemele, Schmidt and Berdine-Ch. 6 P. 29-32

Terminology
A. Null (H0) and alternative (Ha) hypotheses, where the two population variances (σ1², σ2²) are
compared. For the alternative hypothesis, one of the three hypotheses has to be chosen before
collecting the data to avoid being biased by the observations of the samples.
B. Excel output for an F-Test with directional alternative hypothesis (Ha: σ1² > σ2² or Ha: σ1² < σ2²).
C. Descriptive Statistics - sample mean, sample variance, number of observations or sample size.
D. Number of degrees of freedom (df = number of observations – 1, for each sample).
E. Computed or observed F statistic (F = s1²/s2², see tool Distribution – F).
F. P-Value – This value has to be compared with the alpha level (α) and the following decision rule is
used: if P < α, reject H0 and accept Ha with (1-P)100% confidence; if P ≥ α, don’t reject H0.
G. Tabulated Fisher (F) distribution value with risk α (for Ha: σ1² > σ2²) or 1-α (for Ha: σ1² < σ2²) and n1-1;
n2-1 degrees of freedom.

Major Considerations

The assumption for using this test is that the data come from two independent random samples taken
from two normally distributed populations. This test is sensitive to the normality assumption. It is
available in Excel but not in Minitab where the function Homogeneity of Variance under Stat>ANOVA can
be used instead (see tool Homogeneity of Variance Tests).

Application Cookbook
1. Define problem and state the objective of the study.
2. Establish hypothesis - State the null hypothesis (H0) and the alternative hypothesis (Ha).
3. Establish alpha level (α). Usually α is 0.05.
4. Select random samples.
5. Measure the CT characteristic.
6. Analyze data with Excel:
– When the alternative hypothesis is directional (> or <), use the function under Tools>Data
Analysis> F-Test Two-Sample for Variances (If you do not have a Data Analysis option, select the
following under Tools>Add-Ins>Analysis ToolPak).
Put data in two columns. After selecting the function, input variable 1 & 2 ranges. Input alpha (α)
level (default=0.05) and select output range.

– When the alternative hypothesis is non directional (≠), use the statistical function (fx) called
FTEST. The only output from this function is the P-value.
7. Make statistical decision based on the output from Excel. Either accept or reject H0.
8. Translate statistical conclusion to practical decision about the CT characteristic.
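The core of the calculation is the ratio of the two sample variances; the sketch below uses hypothetical data and omits the P-value, which Excel or an F table supplies:

```python
from statistics import variance

# Two hypothetical samples of a CT characteristic under two conditions.
sample_1 = [50.1, 49.8, 50.3, 49.9, 50.2, 50.0, 49.7, 50.4]
sample_2 = [51.2, 48.1, 52.6, 47.9, 50.8, 49.0, 52.1, 48.5]

s1_sq = variance(sample_1)   # sample variance, df = n - 1
s2_sq = variance(sample_2)

# Observed F statistic for testing H0: sigma1^2 = sigma2^2.
f_stat = s1_sq / s2_sq
```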

Measurement - Attribute Gage R&R Study
Purpose
To evaluate an attribute-based measurement system. A "critical to" characteristic is measured in attribute
data when it is compared with a limit in order to accept or reject it. This data is classified into categories:
accepted or rejected. An attribute Gage R&R evaluates the consistency between measurement decisions
to accept or reject.

Anatomy

Attribute Gage R&R Study

[Diagram: three operators (A) each make GO/NO-GO measurement decisions
on the same samples of a characteristic measured with attribute data
(B); the decisions are collected and analyzed for consistency (C), and
disagreements between operators are flagged as errors.]
MeasGRRA_001
Six Sigma - Tools & Concepts

Reference: Measurement Systems Analysis (Chrysler, Ford, GM)- Ch. 2, P. 81

Terminology
A. A sample of operators who usually take the measurements.
B. A characteristic (CTQ, CTC, CTD, CTP) measured with attribute data.
C. Data collection and analysis of the results.

Major Considerations
An attribute-based measurement system, such as a go/no-go gage, cannot indicate how good or how
bad a part is. It can only indicate if the part is accepted or rejected. Although this Gage R&R permits an
evaluation of the measurement system used to obtain data, it is recommended to try and find a
measurement system which provides continuous data.

Application Cookbook
Although the number of operators, trials and parts may vary, the following steps are the most common.
1. Identify two operators who will participate in the study. These operators should be selected from
those who normally take the data.
2. Obtain 20 samples (e.g. parts) that fall within the range of possible outcomes of the process. For
example, when selecting parts for evaluating a go/no-go gage, it is desirable that some of the parts
are slightly below or above the standard.
3. Have the first operator measure all the samples once in random order (blind sampling, where the
operator does not know the identity of the sample, is often done to reduce human bias).
4. Record the values obtained on a data collection sheet or directly in a computer file.
5. Have the second operator measure all the samples once in random order.
6. Repeat steps 3-5 until finishing 2 trials.
7. Analyze the results and determine follow-up action if necessary.
An Excel spreadsheet can be used to analyze the results.
The measurement system is acceptable if the majority of the measurement decisions agree. If the
measurement decisions do not agree, the measurement system must be improved and re-evaluated.
Some improvement methods are: establish a standardized measuring procedure, provide training to
the operators, change the go/no-go gage, or find an alternative measurement system which can
obtain continuous data.
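The agreement check in step 7 can be sketched as below, using hypothetical go/no-go decisions for 10 parts (the cookbook's full study uses 20):

```python
# Hypothetical go/no-go decisions from two operators over two trials on
# the same 10 parts (True = accept, False = reject).
op1_t1 = [True, True, False, True, False, True, True, False, True, True]
op1_t2 = [True, True, False, True, False, True, True, True,  True, True]
op2_t1 = [True, True, False, True, False, True, False, False, True, True]
op2_t2 = [True, True, False, True, False, True, True,  False, True, True]

# A part "agrees" when all four decisions on it match.
agree = sum(a == b == c == d
            for a, b, c, d in zip(op1_t1, op1_t2, op2_t1, op2_t2))
percent_agreement = 100 * agree / len(op1_t1)
```

A low agreement percentage is the signal to improve the measurement system and re-evaluate.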

Measurement - Variable Gage R&R Study
Purpose
To analyze the scope and types of error produced when a variable based measurement system interacts
with its environment. The study focuses on two types of error with the greatest effect on the
measurement system, repeatability and reproducibility, accounting for the variation introduced by one
measuring instrument and different operators.

Anatomy

Measurement - Variable Gage R&R Study

[Diagram: a continuous characteristic such as a diameter (A) is measured
with one instrument (B) by three operators (C); the readings are
collected and analyzed with a statistical software (D).]
MeasVARG_001
Six Sigma - Tools & Concepts

Reference: Measurement Systems Analysis (Chrysler, Ford, GM) - Ch. 2, P. 39,45 / Minitab Reference Manual – Ch. 10, P. 5-14

Terminology
A. A continuous characteristic (CTQ, CTP, etc.) to be measured with an instrument.
B. The measuring instrument that we want to verify.
C. A representative group of personnel who normally operate the instrument.
D. Data collection and analysis of the results with a statistical software.

Major Considerations
A variable Gage R&R is a carefully planned way to study a measurement system. This tool looks at the
variation in measurement data employing one measuring instrument and several operators. In the case
that it is necessary to analyze other factors, such as different equipment, measuring methods etc., a
design of experiment study would be required.

Application Cookbook
Although the number of operators, trials and parts may vary, the following steps are the most common
for a Gage R&R study.
1. The instrument must have an adequate resolution that allows at least one-tenth of the expected
process variation of the characteristic to be read directly. For example, if the characteristic's variation
is 0.001, the equipment should be able to "read" a change of 0.0001.
2. Identify three operators who will participate in the study. These operators must be familiar with the
instrument and the measuring method.
3. Obtain 10 samples (e.g. parts) that represent the actual or expected range of process variation.
Number the parts and mark the area to be measured on each part.
4. Calibrate the gauge, or assure that it has been calibrated.
5. Have the first operator measure all the samples once in random order (another technique is to
reduce human bias by blind sampling, where the operator does not know the identity of the parts
being measured).
6. Record the values obtained on a data collection sheet or directly in a computer file.
7. Have the second and then the third operator measure all the samples once in random order.
8. Repeat steps 5-7 until finishing 3 trials.
9. Check for outliers in the data, if possible on the spot, in order to repeat these readings (wrong
adjustments, handling errors, dirt on the samples or instruments, reading errors, etc.). A good way to
find outliers is out-of-control points on the range chart.
10. Analyze results using Minitab Software (Stat>Quality Tools>Gage R&R Study).
Note: The preferred method of analysis is the ANOVA. To use the Gage R&R function, Minitab needs
three column variables: Part numbers (sample names or numbers for each observed
measurement), Operators (operator names or numbers for each observed measurement) and
Measurement data (observed measurements). To generate part and operator numbers, the
function Simple Set of Numbers can be used (Calc>Make Patterned Data>Simple Set of
Numbers). For information and examples see Minitab Reference Manual p10.5-10.14.
11. Determine follow-up actions if necessary. For example, improve/clarify measuring method, organize
training for the operators, find another instrument, etc.
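The three-column layout described in the Note of step 10 can be built as follows; the readings are hypothetical and fewer than a real study would use (3 operators, 2 parts, 2 trials for brevity):

```python
# Stack gage study readings into the three columns Minitab's Gage R&R
# function needs: part numbers, operators, and measurements.
readings = {
    ("op1", 1): [2.01, 2.02], ("op1", 2): [2.11, 2.10],
    ("op2", 1): [2.00, 2.03], ("op2", 2): [2.12, 2.09],
    ("op3", 1): [2.02, 2.01], ("op3", 2): [2.10, 2.13],
}

parts, operators, measurements = [], [], []
for (operator, part), values in sorted(readings.items()):
    for value in values:        # one stacked row per observed measurement
        parts.append(part)
        operators.append(operator)
        measurements.append(value)
```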

Histogram, Raw
Purpose
To depict the frequencies of numerical or measurement data in a bar chart.
This tool represents one of the means of visualizing the shape, arithmetic mean and dispersion of a
distribution along the scale of measurement.

Anatomy

Histogram, Raw

[Bar chart: FREQUENCY on the vertical axis (A) against the SCALE OF
MEASURE on the horizontal axis (F). The tallest bar is the modal class
(B); each bar's height is the frequency of its interval (C, D), and all
bars share a common interval width (E).]
Six Sigma - Tools & Concepts HistoRaw_001

Reference: Juran's Quality Control Handbook - Ch. 23, P. 12-15

Terminology
A. Vertical axis - Scale to measure the frequency of observations
B. Modal Class - Interval with the highest frequency (i.e. number of observations)
C. Frequency - Number of observations found for each interval. It is represented by the height of each
bar in the graph
D. Interval or Class - Set of real numbers between two values defined by exact limits. The shape of the
histogram is influenced by the number and width of the intervals.
E. Interval width - Defined by the difference between the upper exact limit and the lower exact limit. The
width is the same for all intervals (i.e. bars) in the histogram
F. Horizontal axis - Scale of measure of the variable or CT characteristic

Major Considerations
The measurement system must be validated prior to collecting data (when applicable).
The guidelines to calculate the no. of intervals and their width must be observed.

Application Cookbook
1. Select the variable to measure (e.g. a CT characteristic)
2. Conduct a measurement validation study to ensure good measurements and repeatability (when
applicable)

3. Collect and record data.
4. Define the number of intervals required to construct the graph. As a guideline, the number of
intervals is equal to √n (the square root of n), where n represents the number of observations.
Intervals should:
– Be mutually exclusive
– Be the same width
– Number not less than 6 and not more than 20
– Be continuous throughout the distribution
– Have their limits recorded in limit values
– Have their limits written in exact limits
5. Calculate the interval or class width using exact limits. Exact limits are extensions of plus or minus (±)
one-half the smallest unit that the measuring instrument can read. The smallest unit read is also
known as resolution, so an exact limit is an extension of ± one half of the resolution.
6. Construct a frequency table counting how many observations fall into each interval when the limits
are recorded using exact limits.
7. Create a bar chart using the count per interval

ALTERNATIVE COOKBOOK USING MINITAB


8. Perform steps 1-4 as indicated above
9. Collect and record data
10. Enter the data in columns of the data window
11. Choose the menu GRAPH
12. Choose the sub-menu HISTOGRAM
13. Select the column(s) containing the data in GRAPH VARIABLES and press OK
Note: Minitab calculates the no. of intervals and the interval width
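Steps 4-7 can be sketched numerically; the data and interval scheme below are illustrative (exact limits are omitted for brevity):

```python
import math

# Build a raw frequency table: about sqrt(n) intervals of equal width.
data = [4.2, 4.5, 4.7, 4.9, 5.0, 5.1, 5.1, 5.3, 5.6, 5.8,
        4.4, 4.8, 5.2, 5.4, 5.0, 4.9]

n = len(data)
k = round(math.sqrt(n))          # guideline: number of intervals = sqrt(n)
lo, hi = min(data), max(data)
width = (hi - lo) / k            # equal width for every interval

# Count how many observations fall into each interval; the last interval
# is closed on the right so the maximum value is included.
counts = [0] * k
for x in data:
    idx = min(int((x - lo) / width), k - 1)
    counts[idx] += 1
```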

Histogram, Relative
Purpose
To depict the relative frequencies of numerical or measurement data expressed as percentage in a bar chart.

Anatomy

Histogram, Relative

[Bar chart: the vertical axis shows PERCENT and the horizontal axis the
SCALE OF MEASURE; each bar's height is the percentage of total
observations falling in its interval.]
HistoRel_001
Six Sigma - Tools & Concepts

Reference: Juran's Quality Control Handbook - Ch., P.

Terminology
A. Vertical axis - Scale to measure the frequency of observations expressed as percentage
B. Percentage - Percent of the total observations found for each interval. It is represented by the height
of the bar in the graph
C. Interval - Set of real numbers between two values defined by exact limits
D. Horizontal axis - Scale of measure of the variable or CT characteristic measured

Major Considerations
The measurement system must be validated prior to collecting data (when applicable).
The guidelines to calculate the no. of intervals and their width must be observed.

Application Cookbook
1. Perform steps 1-6 as indicated in the cookbook for the tool "Histogram, Raw"
2. Determine the total number of observations
3. Count how many observations fall into each interval of the frequency table
4. Divide the value found in step 3 by the total from step 2 and express it as a percentage of the total
number of observations. Record this value in the table
5. Create a bar chart using the percentage found for each interval

ALTERNATIVE COOKBOOK USING MINITAB


6. Perform steps 1-3 as indicated in the cookbook for the tool "Histogram, Raw"
7. Enter the data in columns of the data window
8. Choose the menu GRAPH
9. Choose the sub-menu HISTOGRAM
10. Select the column(s) containing the data in GRAPH VARIABLES
11. Select OPTIONS
12. Select Type of Histogram: Percent
13. Press OK in the two menu boxes for MINITAB to create the graph

Histogram, Cumulative Relative
Purpose
To depict the cumulative relative frequencies of numerical or measurement data expressed as percentage in a bar
chart.

Anatomy

Histogram, Cumulative Relative

[Bar chart: CUMULATIVE PERCENT on the vertical axis (A), rising from 0
to 100 across the SCALE OF MEASURE on the horizontal axis (D); each
bar's height (B) is the cumulative percentage through its interval (C).
A second panel shows the cumulative histogram (E) with the ogive (F)
connecting the interval midpoints.]
Six Sigma - Tools & Concepts HistoCum_001

Reference: Juran's Quality Control Handbook - Ch., P.

Terminology
A. Vertical axis - Scale to measure the frequency of observations expressed as cumulative percentage
B. Cumulative Percentage - Cumulative percent of the total observations found for each interval. It is
represented by the height of the bar in the graph
C. Interval - Set of real numbers between two values defined by exact limits
D. Horizontal axis - Scale of measure of the variable or CT characteristic measured
E. Cumulative histogram.
F. Cumulative frequency polygon or ogive constructed by connecting the midpoints of the intervals of
the cumulative histogram (E)

Major Considerations
The measurement system must be validated prior to collecting data (when applicable).
The guidelines to calculate the number of intervals and their width must be observed.

Application Cookbook
1. Perform steps 1-6 as indicated in the cookbook for the tool "Histogram, Raw"
2. Determine the total number of observations
3. Count how many observations fall into each interval of the frequency table
4. Divide the value found in step 3 by the total from step 2 and express it as a percentage of the
total number of observations. Record this value in the table
5. Calculate the cumulative percent frequency for each interval
6. Create a bar chart using the cumulative percentage found for each interval
ALTERNATIVE COOKBOOK USING MINITAB

7. Perform steps 1-3 as indicated in the cookbook for the tool "Histogram, Raw"
8. Collect and record data
9. Enter the data in columns of the data window
10. Choose the menu GRAPH
11. Choose the sub-menu HISTOGRAM
12. Select the column(s) containing the data in GRAPH VARIABLES
13. Select OPTIONS
14. Select Type of Histogram: Cumulative Percent
15. Press OK in the two menu boxes for MINITAB to create the graph
16. To create the cumulative percentage polygon, select CONNECT in the menu DATA DISPLAY in
MINITAB.
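The cumulative step is just a running total of the interval percentages, which must reach 100 at the last interval; this stdlib sketch reuses made-up data and interval limits that are illustrative assumptions.

```python
from itertools import accumulate

# Relative frequencies per interval, then their running (cumulative) total.
data = [2.1, 2.4, 2.6, 3.0, 3.1, 3.4, 3.5, 3.9, 4.2, 4.8]
edges = [2.0, 3.0, 4.0, 5.0]          # exact interval limits (assumed)

counts = [sum(1 for x in data if lo <= x < hi)
          for lo, hi in zip(edges, edges[1:])]
percents = [100 * c / len(data) for c in counts]

# Cumulative percent frequency for each interval (reaches 100 at the end).
cumulative = list(accumulate(percents))
print(cumulative)  # [30.0, 80.0, 100.0]
```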

Inner Array
Purpose
To systematically account for the effects of experimental Factors that are beyond the control of the
experimenter, by introducing a second, ancillary, experimental design consisting solely of uncontrollable
environmental Factors. The Inner Array forms half of an experimental design known as the Taguchi
approach, and is used in conjunction with an Outer Array.

Anatomy

Inner Array

Outer Array (Noise Matrix) (B):

                Run 1   Run 2   Run 3   Run 4
    M             1      -1       1      -1
    N             1       1      -1      -1
    O            -1       1       1      -1

Inner Array (Design Matrix) (A), a 2^(6-3) Resolution III design
(columns D, E and F are generated as D = AB, E = AC, F = BC):

    Run    A    B    C    D    E    F    Responses (E)
     1    -1   -1   -1    1    1    1    Y1   Y9   Y17  Y25
     2     1   -1   -1   -1   -1    1    Y2   Y10  Y18  Y26
     3    -1    1   -1   -1    1   -1    Y3   Y11  Y19  Y27
     4     1    1   -1    1   -1   -1    Y4   Y12  Y20  Y28
     5    -1   -1    1    1   -1   -1    Y5   Y13  Y21  Y29
     6     1   -1    1   -1    1   -1    Y6   Y14  Y22  Y30
     7    -1    1    1   -1   -1    1    Y7   Y15  Y23  Y31
     8     1    1    1    1    1    1    Y8   Y16  Y24  Y32

    Generators: ABD ACE BCF

(C marks an experimental Factor, e.g. column B; D marks an experimental run; each run of the Design Matrix is crossed with the four runs of the Noise Matrix, giving the four Responses per row.)

Reference: Understanding Industrial Experimentation pp 269 - 292

Terminology
A. The Inner Array, also known as the Design Matrix.
B. The Outer Array, also known as the Noise Matrix.
C. An experimental Factor (B).
D. An experimental run.
E. One Response acquired per Design run for each run (setting combination) of the environmental Factors.

Major Considerations
The Inner Array is also known as the Design Matrix.
For each run of the Inner Array, one Response is collected for each setting of the environmental factors.
Therefore, each "set" of Responses for each Design run corresponds to one sub-group of size n, where
n is the number of runs of the Noise Matrix.
Minitab limits Taguchi designs to two-level designs.

Application Cookbook
1. Select the experimental design, based upon the number of Factors, levels, desired precision, etc.
2. Select the environmental (noise) factors over which there is little or no control.
3. Select a 2^k experimental design, where k is the number of noise factors.
4. Determine the high and low factor settings for the k noise factors.
5. For each of the runs in the main experimental design (the inner array), collect one response point for
each combination of noise factors.
6. Continue the experimental runs until 2^k × n Responses have been collected, where "n" is the number
of runs in the inner array.
7. Each set of 2^k Responses for a given run corresponds to one sub-group.
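The crossed inner/outer layout described above can be sketched with the standard library; the factor counts here (three controllable factors, two noise factors) are illustrative assumptions, not taken from the figure.

```python
from itertools import product

def full_factorial(k):
    """All 2^k runs of a two-level design, coded -1/+1."""
    return [list(run) for run in product([-1, 1], repeat=k)]

# Hypothetical example: 3 controllable factors (inner array)
# crossed with k = 2 noise factors (outer array).
inner = full_factorial(3)   # 8 design runs
outer = full_factorial(2)   # 4 noise runs

# Each inner-array run is repeated at every outer-array setting, so each
# design run yields one sub-group of 2^k = 4 responses, 32 in total.
layout = [(design_run, noise_run) for design_run in inner
          for noise_run in outer]

print(len(inner), len(outer), len(layout))  # 8 4 32
```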

Interaction Plot
Purpose
To graphically represent the impact that a change in a process input has on the experimental response.

Anatomy

[Figure: "Interaction Plot" - four panels, each plotting the Mean response (vertical axis) against the levels of Factor B (horizontal axis), with one line per level of Factor A: a panel showing slight interaction (slightly non-parallel lines), one showing no interaction (parallel lines), one showing full interaction with a reversal (crossing lines), and a three-level (Low/Med/High) variant of each.]

Reference: The Vision of Six Sigma: Supplier Breakthrough pp 2.29 to 2.34

Terminology
A. Vertical Scale is the units of the Response variable.
B. Horizontal Axis is the second of the two Factors.
C. Horizontal Axis scale presents the different levels of the second Factor.
D. Line representing the change in the response variable when the second Factor goes from one level
to another, when the first Factor is at its lowest level.
E. Line representing the change in the response variable when the second Factor goes from one level
to another, when the first Factor is at its highest level.
F. Legend displaying the attributes of the symbols and lines for the levels of the first Factor column.

Major Considerations
An Interaction Plot is typically a second step in the ANOVA process. A Main Effects Plot is first used to
demonstrate the Main Effects of a Factor, then an Interaction Plot is used to visualize the presence of
interactions of the Factors.
Evaluating interactions is very important because they can cancel out or magnify factor main effects.

Application Cookbook
1. Collect data and present in the form of a matrix.
• The first column contains the levels of the first Factor.
• The second column contains the levels of the second Factor.
• The third and subsequent columns contain the Response values measured at the settings of
the first and second Factors.
• If multiple raw Responses are presented in separate columns, the Mean Response should be
calculated.
2. Prepare a graph showing the Response scale as the vertical axis, and the setting levels of the
second Factor as the horizontal axis.
3. Plot the Mean Responses for the second Factor at its various levels, with the first Factor at its lowest
setting. If the second Factor only has two levels (i.e. low and high), the graph will be a simple straight
line connecting the two points. If the second Factor has multiple levels, the plot will consist of a series
of points joined with straight lines.
4. Repeat the previous step, with the first Factor at its next lowest setting.
5. Continue the process until all levels of the first Factor have been addressed.
6. Interpret the chart:
• The more parallel the lines at the different Factor settings, the less the interaction.
• Perfectly parallel lines show no interaction.
• Slightly non-parallel lines imply a slight interaction between the Factors.
• Lines which cross indicate clear evidence of an interaction between the Factors.
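Step 6's parallelism check can be made concrete: compute the cell means, then compare the change across Factor B at each level of Factor A. The data in this sketch is made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (Factor A level, Factor B level, mean Response).
rows = [
    ("Low",  "Low",  20), ("Low",  "High", 30),
    ("High", "Low",  40), ("High", "High", 35),
]

# Mean response for each (A, B) cell.
cells = defaultdict(list)
for a, b, y in rows:
    cells[(a, b)].append(y)
means = {k: sum(v) / len(v) for k, v in cells.items()}

# With parallel lines, the change from B=Low to B=High is the same at
# every level of A; a nonzero difference between the slopes implies an
# interaction (opposite signs imply a reversal).
slope_a_low  = means[("Low",  "High")] - means[("Low",  "Low")]   # +10
slope_a_high = means[("High", "High")] - means[("High", "Low")]   # -5
interaction = slope_a_high - slope_a_low
print(interaction)  # -15 -> the lines are far from parallel
```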

"Is/Is Not" Technique
Purpose
To identify improvement solutions that address the root cause of a problem, thus avoiding time and effort wasted on treating symptoms.
Anatomy
“Is/Is Not” Technique

    Problem Definition                 “Is”    “Is Not”    Get Info On
    What    - Object
            - Defect
    Where
    When
    How Big
                                        (E)       (F)         (G)

(A: the problem, stated clearly, in the What rows; B: Where the problem occurs; C: When it occurs; D: How Big the defect is.)

Reference:

Terminology
A. Problem Defined – State the problem clearly
B. Where does the problem occur
C. When does the problem occur
D. How big is the defect
E. This column defines the "Is" state
F. This column defines the "Is Not" state
G. This column contains notes where more information gathering is needed

Major Considerations
The process of focused brainstorming on when the problem does not occur often seems futile, but
questions get asked using this technique, that would never get asked otherwise. Some of these
questions may lead to a narrowing-in on the potential solution.

Application Cookbook
1. Define, in a brainstorming session, WHEN/HOW/WHY/WHERE the problem or defect does NOT
occur
2. Contrast this with when it DOES occur
3. Look for explanations that may help to describe the conditions as described above
4. Investigate further any potential threads of logic that would identify a root cause
5. Where possible, conduct a verification test or experiment to confirm your hypothesis before taking
improvement action

Linear Regression
Purpose
To estimate the parameters of an equation relating a particular variable (dependent variable or "Y") to another
variable (independent variable or "X"), where the resulting equation is called a "regression equation" or "regression
model". Simple linear regression is applied when the dependent variable is linearly proportional to just one
independent variable.

Anatomy

Linear Regression

    Regression Analysis

    The regression equation is
    Strength = 49.1 + 3.47 Temp     (A)

    Predictor       Coef     StDev        T        P
    Constant     49.0523    0.1047   468.51    0.000    (B)
    Temp          3.4707    0.1215    28.56    0.000

    S = 1.015   R-Sq = 89.3%   R-Sq(adj) = 89.2%        (C)

    Analysis of Variance     (D)

    Source            DF       SS       MS        F        P
    Regression         1   840.64   840.64   815.47    0.000    (E)
    Residual Error    98   101.03     1.03
    Total             99   941.67

Reference: Juran's Quality Control Handbook Ch. 23 P. 96-108.

Terminology
A. Regression equation – expressing the predicted value as a function of the "X" and coefficients.
B. A sufficiently small p-value (e.g. p< α) is indicative that the coefficient is statistically significant.
C. R-Sq (r²) - Coefficient of Determination is the ratio SSregression/SStotal and is a measure of the fit of
the regression to the data. It suggests a very good fit when "R-Sq" approaches 100%, and a poor fit
when it is small. R-Sq(adj) is adjusted for the degrees of freedom.
D. Analysis of Variance – standard interpretations apply to the Sum of Squares (SS), Mean Sum of
Squares (MS), the F-value and P-value corresponding to the F-Test.
E. A sufficiently small p-value (e.g. p< α) is indicative that the regression is statistically significant.

Major Considerations
Linear regression is applied with the assumption that the dependent variable is linearly proportional to
the independent variable, and the data has to be paired. Simple regression is applicable to one
independent variable, while multiple regression is used for cases with more than one independent
variable.

Application Cookbook
1. Collect data samples.
2. Enter the data corresponding to the independent variable and dependent variable into two separate
columns in Minitab.
3. Select Stat – Regression – Regression.
4. In the Response field, select the column corresponding to the dependent variable.
5. In the Predictors field, select the column corresponding to the independent variable.
6. For Options, select "Fit Intercept" but leave all other settings unchecked for basic applications.
7. Carry out residual analysis (see tool Residual Plots).
8. Check if the P value is sufficiently small (i.e. P< α ) to conclude that the regression is statistically
significant.
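The quantities in the Minitab session output (coefficients and R-Sq) follow from the closed-form least-squares formulas, which this sketch computes directly; the data is made up for illustration and is not the Strength/Temp data set shown above.

```python
# Simple linear regression via the closed-form least-squares estimates.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]

n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n

# Slope b1 = Sxy / Sxx, intercept b0 = ybar - b1 * xbar.
sxx = sum((x - mx) ** 2 for x in xs)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
b1 = sxy / sxx
b0 = my - b1 * mx

# Coefficient of determination: R-Sq = 1 - SSresidual / SStotal.
ss_total = sum((y - my) ** 2 for y in ys)
ss_resid = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
r_sq = 1 - ss_resid / ss_total

print(round(b1, 3), round(b0, 3), round(r_sq, 3))  # 1.99 0.09 0.999
```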

Main Effects Plot
Purpose
To graphically compare the level of a process output variable at the various settings of process "factors", to
gain an understanding of the main effect of a change in each factor on the output.

Anatomy

[Figure: "Main Effects Plot - Means for Output" - the Mean of the Output (A) on the vertical axis (scale roughly 2.5 to 3.3), plotted at each level (F) of Factor 1 and Factor 2 (B); the points (C, D) are the mean Output at each factor level, joined by lines, and the grand mean of all output values (E) is drawn as a horizontal reference line.]

Reference:

Terminology
A. Output variable being studied.
B. Factors whose effect on the output is being studied.
C. The Mean of the Output variable at different levels (or values) of Factor 1 (more than one data point
may have been collected for each level of the factor, so the Mean is used ).
D. The Mean of the Output variable at different levels (or values) of Factor 2.
E. The grand Mean of all the output values.
F. Levels of the Factors.

Major Considerations
The best tool for drawing Main Effects Plots is Minitab, using the Stat>ANOVA>Main Effects Plot
function.
Main Effects Plots can be drawn to compare many Factors at one time. A relatively flat line indicates that
as the Factor changes value, it has little effect upon the output, while a line that has a lot of up and down
movement indicates that as the Factor changes, it has a greater "main effect".

Application Cookbook
1. Gather data and present in tabular form, with output values being matched with corresponding levels
of the chosen Factors.
2. When more than one output value is presented for each factor, calculate the Mean of the output
value for each Factor.
3. Calculate the Grand Average of all the output values, and plot it as a straight line on all Main Effects
plots.
4. For each Factor, plot the Mean of the output values at each level of the specific Factor.
5. If more than one Factor is being analyzed, the plots for the different Factors should be presented
together for comparison.
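Steps 2 and 3 of the cookbook amount to group means plus a grand mean; this stdlib sketch uses made-up factor levels and output values.

```python
from collections import defaultdict

# Hypothetical rows: (factor level, output value), two replicates per level.
rows = [("Low", 2.6), ("Low", 2.8), ("High", 3.1), ("High", 3.3)]

# Mean output at each level of the Factor (step 2).
by_level = defaultdict(list)
for level, output in rows:
    by_level[level].append(output)
level_means = {lvl: sum(v) / len(v) for lvl, v in by_level.items()}

# Grand average of all output values, plotted as the reference line (step 3).
grand_mean = sum(y for _, y in rows) / len(rows)

print(round(level_means["Low"], 2),
      round(level_means["High"], 2),
      round(grand_mean, 2))  # 2.7 3.2 2.95
```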

Measurement Scale - Likert Scale
Purpose
To evaluate customer satisfaction using an ordinal scale with a range of ratings or degrees of satisfaction
arranged in order.
To determine what you are doing right, as well as wrong.

Anatomy

[Figure: "Measurement Scale - Likert Scale" - a cable-television "Happy Face Report Card" headed "WE LIKE HAPPY PEOPLE!": a five-point ordinal scale of faces (A) rated EXCELLENT / VERY GOOD / FAIR / POOR / I'M MAD; a short set of instructions (C) asking the customer to circle the face that fits and drop the card in the mail; a series of questions to rate (B), such as "Are we friendly and courteous?", "Was your Cable Television installation completed properly?", "How is your picture quality?" and "Our overall grade is:"; and name, address and telephone fields (D) for future follow-up.]

Reference: Customer Satisfaction Assessment Guide, Motorola, 1995

Terminology
A. Ordinal scale arranged in ascending or descending order.
B. A series of questions to be rated by the customer.
C. Simple set of instructions.
D. An attempt to collect names/addresses so that follow up can be conducted in the future.

Major Considerations
Keep the survey very simple.
Make it easy for the customer to tell you he is not happy.
The goal is to gather data to best know how to satisfy the customer.

Application Cookbook
1. Designing the Survey Questionnaire:
Design Considerations:
• Length (not too long);
• Types of questions (statements of fact or measures of performance and importance);
• Open-ended questions/probes (respondents will be able to volunteer issues and provide
explanations);
• Appearance (simple, not busy).
Type of Question Formats:
• Closed-ended (yes/no);
• Rating scales (even or odd number of points? Both can be useful);
• Open ended (free response).
2. Other Considerations:
• Focus on one theme;
• It's usually best to include a midpoint in rating scales (e.g. 3/5/7 categories);
• Try to solicit your customers' feelings with regard to your competitors;
• Identify specific target control groups:
• At least 10% of total customer base;
• Stratify various customer segments;
• Give prior notice before delivering survey;
• Personalize the survey and cover letter;
• Address confidentiality;
• Offer an incentive or token of appreciation for completion;
• Follow up with a collection strategy;
• Develop action plans based on results;
• Communicate results to customers;
• Follow up with repeat surveys to monitor changes with time.

Measurement Scale - Logarithm
Purpose
To display non-linear data in a format that spans several orders of magnitude.
To provide a presentation technique that can be read with a degree of precision for a wide range of
values.

Anatomy

[Figure: "Measurement Scale - Logarithm" - two plots of Parts-Per-Million Goal against Forecast Period (Months from Baseline, B). On a linear vertical scale (A, 0 to 7000 PPM) the improvement curve (C) flattens, and further improvement is difficult to detect (D); on a logarithmic vertical scale (E, 1 to 10,000 PPM) the same function plots as a straight line (F). An accompanying "Results of Service Benchmarking" chart (G) plots PPM (with ±1.5σ shift) against the Sigma scale of measure from 2σ to 7σ, ranging from IRS phone-in tax advice (140,000 PPM), through restaurant bills, doctor prescription writing, payroll processing, order write-up, average-company journal vouchers and wire transfers, airline baggage handling, and purchased-material lot reject rates, down to the world-class domestic airline flight fatality rate (0.43 PPM) near 6σ.]

Reference: Business Statistics, Ch. 15, Douglas Downing and Jeffrey Clark, Barron's Business Review Series, Inc.

Terminology
A. Linear vertical scale «Y».
B. Linear horizontal scale «X».
C. Curve representing the function: Y=f(X).
D. Region where variation in Y is difficult to detect for different values of X.
E. Logarithms vertical scale.
F. New line representation of the function: log Y=b+mX (if linear).
G. PPM vs Sigma (values from less than one, to over 100,000 can be read from this scale).

Major Considerations
If the non-linearity of the data is such that useful information cannot be obtained, then taking the
logarithm is appropriate. Log base 10 is most often used, but any other base can be selected if
desired.

Application Cookbook
1. Enter data in an Excel spreadsheet.
2. Select log 10, log x, ln, as required.
3. Plot data.

    data point    number    log 10    log 6       ln
    1                 10    1         1.285097    2.302585
    2                100    2         2.570194    4.605170
    3               1000    3         3.855292    6.907755
    4              10000    4         5.140389    9.210340

[Figure: the three log series (log 10, log 6, ln) plotted against data point number on a 0-10 vertical scale.]
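Outside Excel, the same transforms come from the math module; this snippet recomputes the table above (log base 10, log base 6, natural log).

```python
import math

# Reproduce the table: log base 10, log base 6, and natural log.
values = [10, 100, 1000, 10000]
table = [(v, math.log10(v), math.log(v, 6), math.log(v)) for v in values]

for v, l10, l6, ln in table:
    print(f"{v:>6}  {l10:8.6f}  {l6:8.6f}  {ln:8.6f}")
```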

Measure of Location - Mean
Purpose
To calculate the arithmetic mean of all data measurements. The mean is the most representative
descriptive measure of the central tendency of the data from a population or a sample. It has the
advantage of lending itself to many other statistical calculations.

Anatomy

Measure of Location - Mean

    X̄ = ( ∑ Xi ) / n ,  with the sum taken from i = 1 to n

    (A: X̄; B: the summation symbol; C: Xi; D: the complete expression)

Reference: Juran's Quality Control Handbook - Ch. 23, P. 16

Terminology
A. X̄ : The symbol for the mean of a sample (pronounced "X BAR").
B. ∑ (from i = 1 to n) : The summation symbol.
C. Xi : Represents the ith value of X.
D. ( ∑ Xi ) / n : Expresses the arithmetic mean of a sample of the population. It is the sum of the data
divided by the number of values in the data set. Expressed algebraically:
    X̄ = ( X1 + X2 + X3 + ... + Xn ) / n

Major Considerations
One must be careful with the mean when there are extreme values in the data set. It best represents the
central tendency when the data is more or less evenly distributed about the middle, as with Normally
distributed data.

Application Cookbook
1. Using Excel to calculate the mean of a sample, use the function AVERAGE with the data or range of
data.
2. Another tool can be used in Minitab as described below: STAT>BASIC STATISTICS>DESCRIPTIVE
STATISTICS

Measure of Location - Median


Purpose
To calculate the half way point of the data measurements. This is one of three principal measurements of
the central tendency of a population or a sample.

Anatomy

Measure of Location - Median

Odd number of observations (n = 7):
    X1  X2  X3  [X4]  X5  X6  X7
    Median = X(n+1)/2 = X4, the middle value     (A)

Even number of observations (n = 8):
    X1  X2  X3  [X4  X5]  X6  X7  X8
    Median = ( X(n/2) + X(n/2)+1 ) / 2 = the mean of X4 and X5     (B)

Reference: Juran's Quality Control Handbook - Ch. 23, P. 16

Terminology
A. X(n+1)/2 : Represents the value at the (n+1)/2 observation of the sample.
B. ( X(n/2) + X(n/2)+1 ) / 2 : Represents the mean of the values at the (n/2) and (n/2 + 1) observations of
the sample.
C. Where n = the number of observations in the sample.

Major Considerations
The median describes the central tendency of the data set and is sometimes a better measure of central
tendency than the mean, especially for data coming from a non-symmetrical distribution. It is less precise
than the mean for data coming from interval scales, for example from a measuring instrument.

Application Cookbook
1. Using Excel to calculate the arithmetic median of a data set, use the statistical function MEDIAN with
the data range.
2. Another tool can be used in Minitab as described below: STAT>BASIC STATISTICS>DESCRIPTIVE
STATISTICS
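The two formulas in the Anatomy can be sketched directly; this minimal stdlib version handles both the odd-n and even-n cases, with made-up data in the usage lines.

```python
# The middle value for an odd-sized sample; the mean of the two middle
# values for an even-sized one (formulas from the Anatomy above).
def median(data):
    xs = sorted(data)
    n = len(xs)
    if n % 2 == 1:
        return xs[(n + 1) // 2 - 1]           # X(n+1)/2, 1-based
    return (xs[n // 2 - 1] + xs[n // 2]) / 2  # mean of X(n/2) and X(n/2)+1

print(median([7, 1, 3, 5, 9]))       # odd n  -> 5
print(median([7, 1, 3, 5, 9, 11]))   # even n -> (5 + 7) / 2 = 6.0
```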

Measure of Location - Mode


Purpose
To find the most frequently observed value in a set of data. This is one of three principal measurements
of the central tendency of a population or a sample when data is coming from a nominal scale of
measure. It is also used for severely skewed distributions, for describing an irregular situation or for
eliminating the effects of extreme values.

Anatomy
Measure of Location - Mode

Set of observed data (A):

    45 46 45 34 50 40 43 42 35 40 39 43 46 35
    37 35 46 37 43 32 50 25 34 22 48 43 29 32

The most frequently occurring value (B) is 43.

[Figure: histogram of the scores (C), as processed by Minitab, with frequency 0 to 4 on the vertical axis and scores 20 to 50 on the horizontal axis; the tallest bar marks the mode.]

Reference: Juran's Quality Control Handbook - Ch. 13, P. 17


Terminology
A. Set of observed data.
B. The most frequently occurring value in the set of data.
C. Histogram, as processed by Minitab.

Major Considerations
It is possible that a set of data will contain no mode. If a mode exists, it may be unique (unimodal) or
multiple (bimodal, trimodal, etc.). Since the mode is not representative of all the data, it is the least
efficient of all the measures of location.

Application Cookbook
1. Using Excel to determine the mode of a data set, use the statistical function MODE and enter the
range of data.
2. Another tool is the Frequency Histogram, from which you can determine the mode, produced by
Minitab: STAT>BASIC STATISTICS>DESCRIPTIVE STATISTICS
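Outside Excel and Minitab, a Counter gives the mode directly and also surfaces bimodal or multimodal sets; the scores below are transcribed from the Anatomy example.

```python
from collections import Counter

# Tally each score; every value tied for the top count is a mode, so
# unimodal, bimodal and trimodal data are all handled.
scores = [45, 46, 45, 34, 50, 40, 43, 42, 35, 40, 39, 43, 46, 35,
          37, 35, 46, 37, 43, 32, 50, 25, 34, 22, 48, 43, 29, 32]

counts = Counter(scores)
top = max(counts.values())
modes = sorted(v for v, c in counts.items() if c == top)
print(modes, top)  # [43] 4
```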

Measure of Variation - Range
Purpose
To measure data dispersion between the highest and the lowest values of a sample. This is one of the
measurements that evaluates the variation between values.
Anatomy
Measure of Variation - Range

    R = Xmax - Xmin
   (A)   (B)    (C)

Reference: Business Statistics - Ch. 2, P. 13

Terminology
A. R: Represents the range. It is an estimator of the population standard deviation σ.
B. The largest value in the data set.
C. The smallest value in the data set

Major Considerations
Since the RANGE only considers extreme values, it may be an imprecise dispersion measure, unless the
sample size is small (e.g. 10 or fewer).

Application Cookbook
1. Use Minitab to calculate the Range:
Use the following function in Calc>Column Statistics and choose Range as statistic.

Measure of Variation - Variance
Purpose
To determine the variability in a set of data. It is important to measure the extent to which data are
scattered around the zone of central tendency, the mean, because this variation causes defects.
Variance is the sum of squares of the difference between each observation and the average divided by
the degrees of freedom.
Anatomy

Variance

    s² = ∑ ( xi - x̄ )² / ( n - 1 ) ,  summed from i = 1 to n
   (A)       (B)  (C)      (D)

Reference: Business Statistics, Ch. 2, P. 13

Terminology
A. s² or σ̂² (sigma hat squared) are used to denote "sample variance". σ² (sigma squared) or an upper
case S² are used to signify "population variance".
B. xi represents the ith value of X.
C. x̄ (x bar) is used to denote the sample mean. µ (mu) represents the population mean.
D. Degrees of Freedom (n - 1), where the lower case "n" is the sample size. The population variance does
not use n - 1 in the denominator. Population size is designated with an upper case N.

Major Considerations
The variance is a good measure of dispersion, but it suffers from one disadvantage: it is difficult to
interpret the numerical value of the variance because the units of measure are squared. It is often more
convenient to calculate the square root of the variance, which is referred to as the standard deviation.

Application Cookbook
To calculate the VARIANCE:
Using Excel to calculate the variance of a data set, use the statistical function VAR with the data range.
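The n-1 versus n distinction in item D of the Terminology can be sketched directly; the data in the usage lines is made up for illustration.

```python
# Sample variance divides by the degrees of freedom (n - 1);
# population variance divides by the population size n.
def sample_variance(data):
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

def population_variance(data):
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_variance(data))      # 4.571428571428571
print(population_variance(data))  # 4.0
```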

Nonparametric Test – Kruskal Wallis Test
Purpose
To compare the medians of two or more populations on a continuous CT characteristic. For data that
comes from a non-normal distribution, this test is comparable to the One way ANOVA. Since we don't
know the population medians, an analysis of samples of data is required. This test is usually used to
determine if there is a statistically significant change in the median of a CT characteristic under two or
more different conditions introduced by one factor (see concept Factor and Levels).
Anatomy
Nonparametric Test - Kruskal-Wallis Test

    H0: M1 = M2 = … = Mg
    Ha: Mi ≠ Mj for at least one pair (i, j)     (A)

Minitab Session Window output (B):

    Kruskal-Wallis Test

    Kruskal-Wallis Test on C1

    C2         N    Median    Ave Rank       Z
    1          5     13.20       7.7     -0.45
    2          5     12.90       4.3     -2.38
    3          6     15.60      12.7      2.71
    Overall   16
                      (C)                  (D)

    H = 8.63  DF = 2  P = 0.013
    H = 8.64  DF = 2  P = 0.013 (adjusted for ties)
       (E)               (F)

Reference: Business Statistics by D. Downing and J. Clark - Ch. 17 P. 383-398, Minitab Reference Manual -Ch. 5 P. 13-16

Terminology
A. Null (H0) and alternative (Ha) hypotheses, where the medians (M1, M2, …, Mg) of the g levels of the
factor are compared. There is only one alternative hypothesis: that the medians of at least two levels
are significantly different.
B. Minitab Session Window Output.
C. Sample medians of the ordered data.
D. The z-value indicates how the mean rank (R̄i) for group i differs from the mean rank (R̄) for all N
observations.
E. Kruskal-Wallis test statistic.
F. P-Value – This value has to be compared with the alpha level (α) and the following decision rule is
used: if P < α, reject H0 and accept Ha with (1-P)100% confidence; if P ≥ α, don't reject H0.

Major Considerations
The Kruskal-Wallis test assumes that the data come from g independent random samples from
continuous distributions, all having the same shape.

Application Cookbook
1. Define problem and state the objective of the study.
2. Establish hypotheses. State the Null hypothesis and the Alternative hypothesis.
3. Establish alpha level (α). Usually alpha is 0.05.
4. Select the random samples.
5. Measure the CT characteristic.
6. Analyze the data with Minitab:
Input values of the CT characteristics into columns in the data window
The data has to be stacked into one column and a second column to contain the group codes. This is
done using the function under Manip > Stack/Unstack > Stack Columns.
Use the function under Stat > Nonparametrics >Kruskal Wallis
Input the name of the column that contains the CT measurements into the 'Response' field, and the
name of the column that contains the level codes into the 'Factor' field.
7. Make statistical decision from the Session Window output of Minitab. Either accept or reject H0. If H0
is rejected, we can conclude that there is a significant difference between the medians of the levels.
8. Translate statistical conclusion to practical decision about the CT characteristic.
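As a cross-check on Minitab's output, the H statistic itself can be computed with a short stdlib-only function. This is a sketch of the unadjusted statistic (the tie-adjusted H that Minitab also prints is omitted), and the two-group data in the usage line is made up.

```python
# Kruskal-Wallis H: pool and rank all observations (tied values share
# the average rank), then compare each group's rank sum with what a
# random allocation would give.
def kruskal_wallis_h(*groups):
    pooled = sorted(x for g in groups for x in g)
    n_total = len(pooled)
    rank = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        i = j
    h = 0.0
    for g in groups:
        r_sum = sum(rank[x] for x in g)
        h += r_sum ** 2 / len(g)
    return 12 / (n_total * (n_total + 1)) * h - 3 * (n_total + 1)

print(round(kruskal_wallis_h([1, 2, 3], [4, 5, 6]), 4))  # 3.8571
```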

Nonparametric Test - Mann-Whitney Test - Two Sample


Purpose
To compare the medians of two populations on a continuous CT characteristic. For data that comes from
a non-normal distribution, this test is comparable to the Two Samples T-Test. Since we don't know the
population medians, an analysis of two samples of data is required. This test is usually used to determine
if there is a statistically significant change in the median of a CT characteristic under two different
conditions.

Anatomy
Nonparametric Test - Mann-Whitney Test - Two Samples

    H0: M1 = M2
    Ha: M1 ≠ M2, or M1 < M2, or M1 > M2     (A)

Minitab Session Window output (B):

    Mann-Whitney Confidence Interval and Test

    C1   N = 8   Median = 69.50     (C)
    C2   N = 9   Median = 78.00
    Point estimate for ETA1-ETA2 is -7.50
    95.1 Percent CI for ETA1-ETA2 is (-18.00, 4.00)     (D)
    Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.2685     (E, F)
    The test is significant at 0.2679 (adjusted for ties)
    Cannot reject at alpha = 0.05

Reference: Business Statistics by D. Downing and J. Clark - Ch. 17 P. 383-398, Minitab Reference Manual - Ch. 5 P. 11-13

Terminology
A. Null (H0) and alternative (Ha) hypotheses, where the two population medians (M1, M2) are compared.
For the alternative hypothesis, one of the three hypotheses has to be chosen before collecting the
data to avoid being biased or influenced by the observations of the sample.
B. Minitab Session Window output.
C. Sample medians of the ordered data.
D. The 95.1% confidence interval for the difference in population medians.
E. Mann-Whitney test statistic.
F. P-Value – This value has to be compared with the alpha level (α) and the following decision rule is
used: if P < α, reject H0 and accept Ha with (1-P)100% confidence; if P ≥ α, don't reject H0.

Major Considerations
The Mann-Whitney test assumes that the data are independent random samples from two populations
that have the same shape (hence the same variance), and a scale that is at least ordinal in nature (see
concept Measurement Scale – Ordinal). For normal populations, this test is less powerful than the Two
Samples T-Test when variances are assumed to be equal (see tool T-Test - Two Samples). However, for
other populations, the Mann-Whitney test is more powerful than the T-Test. If the two populations have
different shapes or standard deviations, then the Two Samples T-Test with variances not assumed
equal is a more appropriate test.

Application Cookbook
1. Define problem and state the objective of the study.
2. Establish hypotheses. State the Null hypothesis and the Alternative hypothesis.
3. Establish alpha level (α). Usually alpha is 0.05.
4. Select the random samples.
5. Measure the CT characteristic.
6. Analyze the data with Minitab:
Input the two data samples into two separate columns.
Use the function under Stat > Nonparametrics > Mann-Whitney.
Input one column name into 'First sample' and the second column into 'Second Sample', input the
Confidence level (default is 95%), and the desired Alternative hypothesis (>, <, or ≠, the last being
the default setting).
7. Make statistical decision from the Session Window output of Minitab. Either accept or reject H0.
8. Translate statistical conclusion to practical decision about the CT characteristic.
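The Mann-Whitney test statistic can be sketched with the standard library as a cross-check; ties are handled by average ranks, and the normal approximation for the p-value is omitted. The data in the usage line is made up.

```python
# Mann-Whitney U: rank the pooled samples, take sample 1's rank sum R1,
# then U1 = n1*n2 + n1(n1+1)/2 - R1; report the smaller of U1 and U2.
def mann_whitney_u(sample1, sample2):
    pooled = sorted(sample1 + sample2)
    n1, n2 = len(sample1), len(sample2)
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2   # ties share the mean rank
        i = j
    r1 = sum(ranks[x] for x in sample1)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 - u1
    return min(u1, u2)

print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # 0.0
```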

Nonparametric Test - Mood's Median Test


Purpose
To compare the medians of two or more populations on a continuous CT characteristic. For data that
comes from a non-normal distribution, this test is comparable to the One way ANOVA. Since we don't
know the population medians, an analysis of data samples is required. This test is usually used to
determine if there is a statistically significant change in the median of a CT characteristic under two or
more different conditions introduced by one factor (see concept Factor and Levels).

Anatomy

Nonparametric Test - Mood's Median Test

    H0: M1 = M2 = … = Mg
    Ha: Mi ≠ Mj for at least one pair (i, j)     (A)

Minitab Session Window output (B):

    Mood Median Test

    Mood median test for Otis

    Chi-Square = 49.08   DF = 2   P = 0.000     (F, G)

                                          Individual 95% CIs (E)
    Level   N<=   N>   Median   Q3-Q1   ----+---------+---------+---------+--
    0        47    9    97.5    17.2    (------+-----)
    1        29   24   106.0    21.5          (-----+-----)
    2        15   55   116.5    16.2                (----+----)
    Overall median = 107.0

    (C: the N<= and N> columns; D: the Q3-Q1 column)

Reference: Business Statistics by D. Downing and J. Clark - Ch. 17 P. 383-398, Minitab Reference Manual -Ch. 5 P. 16-18

Terminology
A. Null (H0) and alternative (Ha) hypotheses, where the medians (M1, M2, …, Mg) of the g levels of the
factor are compared. There is only one alternative hypothesis: that the medians of at least two levels
are significantly different.
B. Minitab session window output.
C. Number of measurements for each sample that falls above or below the overall median.
D. The interquartile range (Q3-Q1) for each of the samples.
E. Individual sample 95% Confidence Intervals, shown as a graphical output.
F. The Chi-squared test value.
G. P-Value – This value has to be compared with the alpha level (α) and the following decision rule is
used: if P < α, reject H0 and accept Ha with (1-P)100% confidence; if P ≥ α, don't reject H0.

Major Considerations
The Mood's Median test assumes that the data come from g independent random samples from
continuous distributions, all having the same shape. Compared to the Kruskal-Wallis test (see
Nonparametric Test - Kruskal-Wallis Test), this test is more robust to the presence of outliers in data, and
is particularly appropriate in the preliminary stages of analysis. However the Mood's Median test is less
efficient for data coming from many distributions, including the normal.

Application Cookbook
1. Define problem and state the objective of the study.
2. Establish hypotheses. State the Null hypothesis and the Alternative hypothesis.
3. Establish alpha level (α). Usually alpha is 0.05.
4. Select the random samples.
5. Measure the CT characteristic.
6. Analyze the data with Minitab:
Input values of the CT characteristics into columns in the data window
The data has to be stacked into one column and a second column to contain the group codes. This is
done using the function under Manip > Stack/Unstack > Stack Columns.
Use the function under Stat > Nonparametrics > Mood's Median Test
Input the name of the column that contains the CT measurements into the 'Response' field, and the
name of the column that contains the level codes into the 'Factor' field.
In order to verify the assumption of the model, select the options 'Store Residuals' and 'Store Fits' (to
interpret these plots see tool Residual Plots).
In the event of non-compliance with either of these assumptions, the results of the Mood's median
test may be distorted. In the case of outliers, these should be investigated.
7. Make a Statistical decision from the session window output of Minitab. Either accept or reject Ho. If
H0 is rejected, we can conclude that there is a significant difference between the medians of the
levels.
8. Translate statistical conclusion into practical decision about the CT characteristic.
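The Minitab-based steps above can also be cross-checked outside Minitab. Below is a minimal sketch using SciPy's scipy.stats.median_test; the three samples are made-up illustrative data, not the Otis data from the session output.

```python
# Mood's median test with SciPy (illustrative data, not the Otis example).
from scipy.stats import median_test

level_0 = [95, 97, 99, 101, 96, 98, 100, 94, 97, 99]
level_1 = [104, 106, 108, 103, 107, 105, 109, 102, 106, 110]
level_2 = [115, 117, 113, 118, 116, 114, 119, 112, 117, 120]

alpha = 0.05
stat, p, grand_median, table = median_test(level_0, level_1, level_2)

print(f"Chi-Square = {stat:.2f}  P = {p:.3f}  Overall median = {grand_median}")
if p < alpha:
    # Reject H0 with (1-P)100% confidence
    print(f"Reject H0: at least two level medians differ ({(1 - p) * 100:.1f}% confidence)")
else:
    print("Do not reject H0")
```

The returned table holds the counts of measurements above and below the overall median for each level (item C in the Terminology).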

Nonparametric Test - Wilcoxon Test - One Sample


Purpose
To compare the population median of a continuous CT characteristic with a value such as the target. For
data that comes from a non-normal distribution, this test is comparable to the One Sample T-Test. Since
we don't know the population median, an analysis of a sample of data is required. This test is usually
used to determine if the median of a CT characteristic is on target.
Anatomy
Nonparametric Test - Wilcoxon Test - One Sample

A  H0: M = M0
   Ha: M ≠ M0, M < M0, or M > M0

B  Wilcoxon Signed Rank Test

   Test of median = 115.0 versus median > 115.0

             N   N for Test   Wilcoxon Statistic       P   Estimated Median
   C1       29           29                315.5   0.018              165.0
                  (C)                  (D)           (E)

Six Sigma - Tools & Concepts WilcoTst_001

Reference: Business Statistics by D. Downing and J. Clark - Ch. 17 P. 383-398, Minitab Reference Manual -Ch. 5 P. 8 - 9

Terminology
A. Null (H0) and alternative hypotheses (Ha), where the population median (M) is compared to a value
(M0) such as a target. For the alternative hypothesis, one of the three hypotheses has to be chosen
before collecting the data to avoid being biased or influenced by the observations of the sample.
B. Minitab Session Window output.
C. Sample size, N for test (Sample size minus measurements equal to the hypothesized median).
D. Wilcoxon test statistic.
E. P-Value – This value has to be compared with the alpha level (α) and the following decision rule is
used: if P < α, reject H0 and accept Ha with (1-P)100% confidence; if P ≥ α, don't reject H0.

Major Considerations
The Wilcoxon test assumes that the data comes from a random sample and the distribution is
symmetrical. For a normal population, this test is less powerful than Student's t test (see tool T-Test -
One Sample). However, for other populations, the Wilcoxon test may be more powerful than Student's t
test.

Application Cookbook
1. Define problem and state the objective of the study.
2. Establish hypotheses. State the Null hypothesis and the Alternative hypothesis.
3. Establish alpha level (α). Usually alpha is 0.05.
4. Select a random sample.
5. Measure the CT characteristic.
6. Analyze the data with Minitab:
Use the function under Stat > Nonparametrics > 1-Sample Wilcoxon.
Select the Test median option, input the target value and the desired Alternative hypothesis (>, <,
or ≠; ≠ is the default setting).
7. Make statistical decision from the Session Window output of Minitab. Either accept or reject H0.
8. Translate statistical conclusion to practical decision about the CT characteristic.
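A comparable calculation can be sketched with SciPy's scipy.stats.wilcoxon. SciPy tests the symmetry of differences rather than a median directly, so the hypothesized median is subtracted first; the sample below is illustrative, not the data from the session output above.

```python
# One-sample Wilcoxon signed rank test vs. a hypothesized median (illustrative data).
from scipy.stats import wilcoxon

target = 115.0        # hypothesized median M0
sample = [132, 151, 98, 170, 144, 125, 160, 139, 181, 117,
          155, 128, 149, 163, 110, 172, 140, 135, 158, 147]

# Ha: M > M0, chosen before collecting the data
diffs = [x - target for x in sample]
stat, p = wilcoxon(diffs, alternative='greater')
print(f"Wilcoxon statistic = {stat}, P = {p:.4f}")
```

As in Minitab, differences exactly equal to zero are dropped before ranking with SciPy's default zero_method.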

Process Mapping/Process Flow Diagram
Purpose
To provide a visual representation of the process steps that a product or service follows. To compare the
"as is" process against the "should be". To assess the complexity level of the process, and to identify
non-value added operations, possible simplification and standardization opportunities, areas where
defects may occur, data collection points, etc.

Anatomy

Process Flow Diagram

[Flowchart diagram with its symbols annotated A through G, as described under Terminology below]

Six Sigma - Tools & Concepts ProFlwDi_001

Reference: Juran's Quality Control Handbook - Ch. 6, P. 7

Terminology
A. Terminal symbol (rounded rectangle) used to designate the beginning or end of a process flow.
Usually identified as "start", "end", "beginning" or "stop";
B. Activity symbol (rectangle) used to designate an activity. A brief description of the activity is also
included within this symbol;
C. Decision symbol (diamond) used to illustrate the point where a decision must be taken. Subsequent
activities are dependent upon the decision taken at this point;
D. Flow line (arrow) indicates the direction of the process and connects its elements;
E. Connector symbol (circle) used to illustrate a break and its continuation elsewhere on the same page
or another page;
F. Document symbol indicates a printed document pertinent to the process;
G. Delay symbol used to indicate that there is a delay or waiting period in the process.

Major Considerations

The Process Flow Diagram should be accurate and reflect the true process, not the "ideal" process.
This is an important tool for continuous process improvement and should not be underutilized. The same
level of detail shall be provided for all the steps of the process map.

Application Cookbook
1. Determine the limits of the process to map.
• Clearly define where the process begins and ends;
• Agree on the level of detail to show in the process map.
2. Determine the steps in the process.
• In a brainstorm session, list major activities, inputs, outputs and decisions.
3. Identify the sequence of the steps.
• Draw the steps in the same order as they are carried out without any arrows;
• Define what "is" and not what "should be" (By comparing these two, probable causes of
problems may be identified and solved).
4. Draw the Process Flow Diagram using the standard symbols presented herein.
• Label each process step using words that are understandable by everyone;
• Add arrows to show the direction in which the process flows;
• Identify the process map with its name, date, and names of the team members.
5. Test the Process Flow Diagram to ensure that it is complete.
• Are the symbols used correctly?
• Are the process steps clearly identified?
• Is every feedback loop closed?
• Does every continuation point have a corresponding point elsewhere in the process map?
• Do all activity boxes have only one arrow coming out of them? (If not, a decision diamond
may be needed instead of an activity box);
• Validate the process map with colleagues that are not part of the team and who carry out the
activities depicted in the process map. Highlight their comments to the team and incorporate
their comments as applicable.
6. Finalize the Process Flow Diagram.
• Is this process being run the way it should be?
• Are people/departments following the process as mapped?
• Are there obvious redundancies or complexities that can be reduced or eliminated?
7. How different is the current process map from an ideal one? Draw an ideal process map. Compare
the two to identify discrepancies and opportunities for improvement.

Rational Subgrouping
Purpose
To collect data in such a manner as to have the best possible representation of an unknown process state.

Anatomy

Rational Subgrouping

[Diagram annotated A through G: process output measurements plotted over time, grouped into subgroups (sub1, sub2, sub3), with a histogram of each subgroup's distribution and of the overall distribution]

Six Sigma - Tools & Concepts RatSubgp_001

Reference: Douglas C. Montgomery, Introduction to Statistical Quality Control, John Wiley & Sons, 2nd edition, P. 113 / Donald J. Wheeler, Advanced Topics in Statistical Quality Control, SPC Press, Ch. 6.3

Terminology
A. True process output measurement of each production unit
B. Individual measurement in a rational subgroup (size n=5) (typical)
C. Histogram showing the distribution of the measurement in a subgroup (typical)
D. Subgroups measured in sequence and this sequence must be preserved in the data (typical)
E. Interval between subgroups (rational) in order to capture black noise effects
F. Subgroup sizes (rational) in order to only capture white noise effects
G. Overall measurement distribution (white noise plus black noise)

It is important to note that improper rational subgrouping can lead to the inclusion of special cause effects in the
within subgroup variation. It can also result in missing the non-random variation in the between subgroups variation.
The consequence might be a wrong perception of the true situation resulting in inappropriate actions.

Major Considerations

Each subgroup (usually consecutive units), must come from a single distinct population. Within subgroup
variation should be representative of white noise only. The interval between subgroups should be
appropriate to capture the between variation due to black noise. Only through a reasonable approach is
it possible to obtain rational subgrouping.

Application Cookbook
1. Define the distinct population to be represented by the subgroup data;
2. Establish the minimum subgroup size in order to reflect the within variation;
3. Establish the frequency of the sampling in order to capture the between variation;
4. Collect data maintaining the sequential information.
Donald J. Wheeler has six guiding principles for subgrouping in a rational manner:
• Never knowingly subgroup unlike things together;
• Minimise the variation within each subgroup;
• Maximise the opportunity for variation between the subgroups;
• Average across noise, not across signals;
• Treat the charts in accordance with the use of the data;
• Establish standard sampling procedures.

Residuals Plots
Purpose
A residual (also known as error) is the difference between a data point and its fitted value. Residual plots
are used to verify the fit of the mathematical model to the data, often in conjunction with ANOVA and
Regression, to suggest alternative models, and to identify the presence of outliers.

Anatomy

Residual Plots

[Four-panel plot annotated A through D: normal plot of the residuals, histogram of the residuals, I chart of the residuals, and residuals versus fits]

Six Sigma - Tools & Concepts ResiPlot_001

Terminology
A. Distribution of the residuals presented as a Normal Plot – It provides a visual check of the
assumption that the residuals are normally distributed. They tend to form a straight line for a normal
distribution.
B. Histogram of Residuals – Similar to the function of Normal Plot, it provides an alternate presentation
to check if the shape of the residual distribution resembles a normal distribution.
C. Individual (I) Chart of residuals – It helps to check if the residuals are time dependent and provides
information to identify outliers. When the residuals form a trend or fall outside the upper or lower
control limits, the data point should be examined closely for special causes.
D. Residuals plotted against the fit of the mathematical model – It provides visual clues to verify if the
variance of the residual is constant. The distance between the residual points and the centerline
represents how closely the mathematical model fits the data point. Patterns other than a horizontal
band of randomly distributed residuals may indicate problems such as a data measuring instrument
problem, or data coming from an asymmetrical distribution.

Major Considerations

Since residual analysis is an essential step to validate the mathematical model, it should be carried out
routinely. For the residual analysis on Regression, the data must be entered in pairs.

Application Cookbook
1. While carrying out analyses such as Regression or ANOVA (ANOVA one way unstacked does not
support residual plot), pick Graph, then select one or more of the Residual Plots. Residual Plots will
be displayed individually with the analysis.
2. Alternatively, pick Storage, then select options for Residuals and Fits when performing analyses such
as Regression, ANOVA, etc. The residuals and fits associated with the analysis will be stored in
columns labeled RESI1 and FITS1 or the equivalent.
3. Carry out residual analysis by checking if the residual is normally distributed, (i.e. straight line in the
Normal plot, and the histogram in the shape of normal distribution).
4. Check if the residuals in the I-Chart form a trend, which is often indicative of a problem in data
collection. Examine points outside the control limits for special causes.
5. Check if the residuals are randomly distributed. Converging or diverging patterns
tend to suggest a measuring problem with an error proportional to the measured value, or data from
an asymmetrical distribution.
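The residuals and fits that Minitab stores can also be computed by hand for a simple linear regression; the x/y data below are illustrative.

```python
# Computing fits and residuals for a least-squares line (illustrative data).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 4.3, 5.9, 8.2, 9.8, 12.1, 14.2, 15.9]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least-squares slope and intercept
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

fits = [b0 + b1 * xi for xi in x]                  # the FITS column
residuals = [yi - fi for yi, fi in zip(y, fits)]   # the RESI column

# Least-squares residuals sum to (numerically) zero; the plots described
# above (normal plot, histogram, I chart, residuals vs fits) are built from them.
print(f"slope = {b1:.4f}, intercept = {b0:.4f}")
```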

Sample Size – Continuous Data – One Sample Mean Test
Purpose
To calculate the sample size required for a one sample Mean Test taking into account the desired α and
β risks.

Anatomy

Sample Size - Continuous Data - One Sample Mean Test

    n = σ² ( Zα/2 + Zβ )² / δ²

    (A: n,  B: σ²,  C: Zα/2,  D: Zβ,  E: δ²)

Six Sigma - Tools & Concepts SaSiT1M_001

Reference: Juran's Quality Control Handbook – Ch. 23 P. 78, Experimental Statistics by M.G. Natrella Ch. T P. 16-17

Terminology
A. n - Sample size symbol.
B. σ² - The known population variance from past experience or the sample variance from a preliminary
sample.
C. Zα/2 - Tabulated standard normal (Z) distribution value with probability α/2. The value α/2 is used
when the alternative hypothesis is two sided (HA: µ ≠ µ0). When the alternative hypothesis is one
sided (HA: µ > µ0 or µ < µ0), then the value α must be used.
D. Zβ - Tabulated standard normal (Z) distribution value with probability β.
E. δ² - The minimum difference that we want to be able to detect from this test.

Major Considerations

The assumption for using this formula is that the data comes from a normal distribution.

Application Cookbook
1. Establish alpha level (α) and beta level (β). Usually α is 0.05 and β is 0.1.
2. Obtain the Zα/2 and Zβ values from a software such as Excel, or a table of area under the normal
curve (ex. For α=0.05, Zα/2 = Z0.025 = 1.96 and for β=0.10, Zβ = 1.28).
3. Obtain σ from past experience or from an estimate from a preliminary sample of size ≥ 30.
4. Establish δ, which represents the minimum difference that we want to be able to detect from the test
expressed in the measurement units of the CT characteristic. In other words, if the population mean
differs by a certain value, we would like to detect this value with a high probability. For example,
suppose we wish to test the true cycle time of a process. We determine that if the mean cycle time
differs by as much as 2.0 hours, we would like to be able to detect it, then δ=2.
5. Use the formula presented. The sample size will be the next integer (e.g. n=35.32, sample size will
be 36).
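The cookbook can be sketched in Python with SciPy supplying the Z values; σ = 4.0 here is an assumed preliminary estimate for the cycle-time example, not a value from the text.

```python
# Sample size for a one sample mean test: n = sigma^2 (Z_alpha/2 + Z_beta)^2 / delta^2
import math
from scipy.stats import norm

alpha, beta = 0.05, 0.10
sigma = 4.0   # assumed estimate from a preliminary sample (illustrative)
delta = 2.0   # minimum difference to detect, in hours

z_alpha = norm.ppf(1 - alpha / 2)  # two-sided Ha: 1.96
z_beta = norm.ppf(1 - beta)        # 1.28

n = sigma ** 2 * (z_alpha + z_beta) ** 2 / delta ** 2
print(f"n = {n:.2f} -> round up to {math.ceil(n)}")
```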

Sample Size – Continuous Data – Two Samples Mean Test
Purpose
To calculate the sample size, taking into account desired α and β risks.

Anatomy

Sample Size - Continuous Data - Two Samples Mean Test

(A: the α and β risks in the header, B: the δ/σ column, C: the body of the table giving the sample size for each of samples 1 and 2, D: the formula below the table)

             α = 20%               α = 10%               α = 5%                α = 1%
 δ/σ   β= 20% 10%   5%   1%    20%  10%   5%   1%    20%  10%   5%   1%    20%  10%   5%   1%
 0.1     902 1314 1713 2603   1237 1713 2165 3154   1570 2102 2599 3674   2336 2976 3563 4806
 0.2     225  328  428  651    309  428  541  789    392  525  650  919    584  744  891 1202
 0.3     100  146  190  289    137  190  241  350    174  234  289  408    260  331  396  534
 0.4      56   82  107  163     77  107  135  197     98  131  162  230    146  186  223  300
 0.5      36   53   69  104     49   69   87  126     63   84  104  147     93  119  143  192
 0.6      25   36   48   72     34   48   60   88     44   58   72  102     65   83   99  134
 0.7      18   27   35   53     25   35   44   64     32   43   53   75     48   61   73   98
 0.8      14   21   27   41     19   27   34   49     25   33   41   57     36   46   56   75
 0.9      11   16   21   32     15   21   27   39     19   26   32   45     29   37   44   59
 1.0       9   13   17   26     12   17   22   32     16   21   26   37     23   30   36   48
 1.1       7   11   14   22     10   14   18   26     13   17   21   30     19   25   29   40
 1.2       6    9   12   18      9   12   15   22     11   15   18   26     16   21   25   33
 1.3       5    8   10   15      7   10   13   19      9   12   15   22     14   18   21   28
 1.4       5    7    9   13      6    9   11   16      8   11   13   19     12   15   18   25
 1.5       4    6    8   12      5    8   10   14      7    9   12   16     10   13   16   21
 1.6       4    5    7   10      5    7    8   12      6    8   10   14      9   12   14   19
 1.7       3    5    6    9      4    6    7   11      5    7    9   13      8   10   12   17
 1.8       3    4    5    8      4    5    7   10      5    6    8   11      7    9   11   15
 1.9       2    4    5    7      3    5    6    9      4    6    7   10      6    8   10   13
 2.0       2    3    4    7      3    4    5    8      4    5    6    9      6    7    9   12
 2.1       2    3    4    6      3    4    5    7      4    5    6    8      5    7    8   11
 2.2       2    3    4    5      3    4    4    7      3    4    5    8      5    6    7   10
 2.3       2    2    3    5      2    3    4    6      3    4    5    7      4    6    7    9
 2.4       2    2    3    5      2    3    4    5      3    4    5    6      4    5    6    8
 2.5       1    2    3    4      2    3    3    5      3    3    4    6      4    5    6    8
 2.6       1    2    3    4      2    3    3    5      2    3    4    5      3    4    5    7
 2.7       1    2    2    4      2    2    3    4      2    3    4    5      3    4    5    7
 2.8       1    2    2    3      2    2    3    4      2    3    3    5      3    4    5    6
 2.9       1    2    2    3      1    2    3    4      2    2    3    4      3    4    4    6
 3.0       1    1    2    3      1    2    2    4      2    2    3    4      3    3    4    5
 3.1       1    1    2    3      1    2    2    3      2    2    3    4      2    3    4    5
 3.2       1    1    2    3      1    2    2    3      2    2    3    4      2    3    3    5
 3.3       1    1    2    2      1    2    2    3      1    2    2    3      2    3    3    4
 3.4       1    1    1    2      1    1    2    3      1    2    2    3      2    3    3    4
 3.5       1    1    1    2      1    1    2    3      1    2    2    3      2    2    3    4
 3.6       1    1    1    2      1    1    2    2      1    2    2    3      2    2    3    4
 3.7       1    1    1    2      1    1    2    2      1    2    2    3      2    2    3    4
 3.8       1    1    1    2      1    1    1    2      1    1    2    3      2    2    2    3
 3.9       1    1    1    2      1    1    1    2      1    1    2    2      2    2    2    3

    n1 = n2 = 2 (σ/δ)² ( Zα/2 + Zβ )²

Six Sigma - Tools & Concepts SaSiT2M_001

Reference: The Vision of Six Sigma: Tools and Methods for Breakthrough – Ch. 14 P. 24-28

Terminology
A. α and β risk.
B. δ/σ - The difference in means that we want to be able to detect with the test, expressed in standard
deviation units.
C. Corresponding sample size for sample 1 and 2.
D. Formula used in the table where: n1 and n2 represent sample sizes of sample 1 and 2, Zα/2 and Zβ are
tabulated standard normal (Z) distribution values with probability α/2 and β, σ is the population
standard deviation and δ is the difference that we want to be able to detect with the test.

Major Considerations

The assumption for using this formula is that the data comes from a normal population.

Application Cookbook
1. Establish alpha level (α) and beta level (β). Usually α is 0.05 and β is 0.10.
2. Establish δ, which represents the minimum difference that we want to be able to detect from the test
expressed in the measurement units of the CT characteristic. In other words, if the two population
means differ by a certain value, we would like to detect this value with a high probability. For
example, suppose we wish to compare the true cycle time of a process before and after a change.
We determine that if the mean cycle time differs by as much as 2.0 hours, we would like to be able to
detect it, so δ=2.
3. Obtain σ from past experience or an estimate from a preliminary sample.
4. Use the formula or the table presented to obtain the sample size.
• If the formula is used, obtain the Zα/2 and Zβ values from a software such as Excel or a table
of area under the normal curve (ex. For α=0.05, Zα/2 = Z0.025 = 1.96 and for β=0.10, Zβ = 1.28).
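The same calculation can be sketched in Python; δ/σ = 0.5 below corresponds to one row of the table (the printed table may differ by a unit depending on how it rounds).

```python
# Two-sample size formula: n1 = n2 = 2 (sigma/delta)^2 (Z_alpha/2 + Z_beta)^2
import math
from scipy.stats import norm

alpha, beta = 0.05, 0.10
delta_over_sigma = 0.5   # difference to detect, in standard deviation units

z_alpha = norm.ppf(1 - alpha / 2)
z_beta = norm.ppf(1 - beta)

n = 2 * (z_alpha + z_beta) ** 2 / delta_over_sigma ** 2
print(f"n1 = n2 = {n:.2f} -> round up to {math.ceil(n)} per sample")
```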

Sample Size – Continuous Data – Mean
Purpose
To calculate the sample size required in estimating the population mean of a continuous CT
characteristic from a sample of data within a certain margin of error.

Anatomy

Sample Size - Continuous Data - Mean

    n = ( Zα/2 × σ / E )²

    (A: n,  B: Zα/2,  C: σ,  D: E)

Six Sigma - Tools & Concepts SaSiMean_001

Reference: Juran's Quality Control Handbook – Ch. 23 P. 50-51, Experimental Statistics by M.G. Natrella Ch. 2 P. 10-11.

Terminology
A. n - Sample size symbol.
B. Zα/2 - Tabulated standard normal (Z) distribution value with probability α/2.
C. σ - The known population standard deviation from past experience or the sample standard deviation
from a preliminary sample.
D. E - The maximum allowable error or the desired precision.

Major Considerations

The assumption for using this formula is that the data comes from a normal distribution. If the standard
deviation of the population is unknown, the sample standard deviation of a preliminary sample of at least
30 can be used.

Application Cookbook
1. Establish alpha level (α). Usually α is 0.05.
2. Obtain the Zα/2 value from a software such as Excel, or a table of area under the normal curve (ex.
For α=0.05, Zα/2 = Z0.025 = 1.96).
3. Establish E, which is the maximum allowable error or the desired precision, expressed in the
measurement unit of the CT characteristic. In other words, let's say that we are willing to take a α risk
that our estimate of the mean of the population will be off by E or more units. As an example,
suppose we wish to estimate the true cycle time of a process. We determine that our estimate must
be within 2.0 hours of the true mean in order to be useful, then E=2.
4. Obtain σ from past experience or an estimate from a preliminary sample of observations ≥ 30. If only
a smaller preliminary sample is available, the following chart can be used.
5. Use the formula presented or the following chart.
• To use the chart, first calculate E/s (s is the sample standard deviation) and select the curve
line corresponding to the α level. The sample size can be read on the Y-axis of the graph.

[Chart: sample size (Y-axis, 1 to 100, log scale) versus E/s (X-axis, 0.1 to 10), with one curve per α level: α = .10, .05, .01]

Note: Taken from the Juran's Quality Control Handbook - Appendix II, page AII.33
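For the formula route, the calculation is short; σ = 4.0 is an assumed preliminary estimate for the cycle-time example, not a value from the text.

```python
# Sample size for estimating a mean: n = (Z_alpha/2 * sigma / E)^2
import math
from scipy.stats import norm

alpha = 0.05
sigma = 4.0   # assumed preliminary estimate (illustrative)
E = 2.0       # maximum allowable error, in hours

z = norm.ppf(1 - alpha / 2)   # 1.96
n = (z * sigma / E) ** 2
print(f"n = {n:.2f} -> round up to {math.ceil(n)}")
```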

Sample Size - Discrete
Purpose
To establish the appropriate sample size for a population of discrete data.

Anatomy

Sample Size - Discrete

    n p̂ ≥ 5
    n (1 − p̂) ≥ 5

    (A: n, sample size for discrete data (proportion);  B: p̂, the proportion of successes)

Six Sigma - Tools & Concepts SampSizD_001

Terminology
A. Sample Size - Discrete Data (Proportion)
B. Proportion - The proportion of successes

Major Considerations

• Applicable to discrete distributions;
• To be used only where p is not too close to zero or one;
• Applies for the normal approximation of the binomial distribution;
• Sample size should be a minimum of 30;
• Sample size should be sufficiently large such that, under practical worst case considerations, no data
frequency (cell) is less than 5.

Application Cookbook
1. Estimate the value of p for the population.
2. Choose a sample size.
3. Verify that both conditions are satisfied by calculating:
   n p̂ ≥ 5
   n (1 − p̂) ≥ 5
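The two conditions are easy to check programmatically; p̂ = 0.08 and n = 100 are illustrative values, not from the text.

```python
# Check the normal-approximation conditions for a proportion sample size.
p_hat = 0.08   # illustrative estimated proportion
n = 100        # candidate sample size

# Both cell-frequency conditions plus the minimum-of-30 rule
adequate = (n * p_hat >= 5) and (n * (1 - p_hat) >= 5) and (n >= 30)
print(f"np = {n * p_hat:.1f}, n(1-p) = {n * (1 - p_hat):.1f}, adequate: {adequate}")
```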

Sample Size – Continuous Data - Standard Deviation
Purpose
To calculate the sample size required in estimating the population standard deviation of a continuous CT
characteristic from a sample within a certain margin of error.

Anatomy

Sample Size - Continuous Data - Standard Deviation

[Chart annotated A, B, C: degrees of freedom ν (Y-axis, 5 to 1,000, log scale) versus P% (X-axis, 5 to 50), with one curve per confidence level γ = .99, .95, .90]

Six Sigma - Tools & Concepts SaSiStdv_001

Reference: Juran's Quality Control Handbook – Ch. 23 P. 50-51 and Appendix II P. 34, Experimental Statistics by M.G. Natrella Ch. 2 P. 12-13

Terminology
A. γ (gamma) - Confidence level (1-α)
B. P% - The maximum allowable error or the desired precision expressed as a percentage
C. Degrees of freedom ν = n - 1 , hence sample size = ν + 1

Major Considerations

The assumption for using this formula is that the data comes from a normal distribution.

Application Cookbook
1. Establish confidence level (1-α). Usually the confidence level is 0.95
2. Establish P%, the allowable percentage deviation of the estimated standard deviation from its true
value.
• Let's say that we take a sample of measurements, and we want to know how many
measurements are required to estimate the standard deviation within P percent of its true
value with a prescribed confidence level. As an example, suppose we want to know how large
a sample would be required to estimate the standard deviation within 20% of its true value,
with a confidence level equal to 0.95; then P=20% (and γ=0.95).
3. Select the line corresponding to the confidence level. At the intersection between this line and the
P% line, the number of degrees of freedom (n-1) are obtained on the Y-axis.
4. Add 1 to obtain the sample size.
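The chart gives exact small-sample values. As a rough cross-check only, a large-sample approximation (an assumption on my part, derived from SE(s) ≈ σ/√(2n), not read from the chart) is n ≈ 1 + ½·(Zα/2 / (P/100))²:

```python
# Approximate sample size for estimating sigma within P percent of its
# true value (large-sample approximation, not the Natrella chart itself).
from scipy.stats import norm

confidence = 0.95
P = 20.0   # allowable percent deviation of s from sigma

z = norm.ppf(1 - (1 - confidence) / 2)
n = 1 + 0.5 * (z / (P / 100)) ** 2
print(f"approximate sample size: {n:.1f}")
```

For γ = 0.95 and P = 20% this comes out at roughly 49; for precise small-sample answers the chart should still be used.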

Scatter Plot
Purpose
To graphically demonstrate the nature of any possible relationship between two sets of variables by
plotting one against the other on a simple x-y plot.

Anatomy

Scatter Plot

[Two example plots, annotated A (X variable), B (Y variable), C (data points):
 Scatter Plot 1 - No Obvious Relationship Between X and Y
 Scatter Plot 2 - Linear Relationship Between X and Y]

Six Sigma - Tools & Concepts ScatPlot_001

Reference: Juran Quality Control Handbook p 23.10

Terminology
A. X variable.
B. Y variable.
C. Data points, plotted with no interconnection between points.

Major Considerations

Scatter plots are a good first step in determining the presence of any type of relationship between any
two variables, which may or may not be dependent.
Data must be presented in the order it was collected.
Scatter plots can best be generated in either MS Excel or in Minitab.

Application Cookbook
1. Gather data in tabular form.
2. Select which variable to use as the X axis variable.
3. Plot data using Minitab's Graph>Plot function or using MS Excel.

Standard Normal Deviate (Z)
Purpose
To determine the probability of defect for a random variable critical characteristic measured as
continuous data. The Z transform is an important tool because it enables this calculation without the
complex mathematical calculations that would otherwise be required.
Anatomy

Standard Normal Deviate (Z)

    Z = ( X − µ ) / σ    ,    Z = ( SL − µ ) / σ

    where "X" may take on the spec limit value (USL or LSL).

    (A: the Z transform,  B: X,  C: µ,  D: σ,  E: the area under the Normal Curve Table,
     F: the area beyond Z, i.e. the probability of a defect)

    Z        P(>Z)         Z       P(>Z)         Z       P(>Z)         Z       P(>Z)
    0.85  .197662672    2.36  .009137469    3.87  .000054545    5.38  .000000041
    0.90  .184060243    2.41  .007976235    3.92  .000044399    5.43  .000000031
    0.95  .171056222    2.46  .006946800    3.97  .000036057    5.48  .000000024
    1.00  .158655319    2.51  .006036485    4.02  .000029215    5.53  .000000018
    1.05  .146859086    2.56  .005233515    4.07  .000023617    5.58  .000000014
    1.10  .135666053    2.61  .004527002    4.12  .000019047    5.63  .000000010
    1.15  .125071891    2.66  .003906912    4.17  .000015327    5.68  .000000008
    1.20  .115069593    2.71  .003364033    4.22  .000012305    5.73  .000000006
    1.25  .105649671    2.76  .002889938    4.27  .000009857    5.78  .000000004
    1.30  .096800364    2.81  .002476947    4.32  .000007878    5.83  .000000003
    1.35  .088507862    2.86  .002118083    4.37  .000006282    5.88  .000000003
    1.40  .080756531    2.91  .001807032    4.42  .000004998    5.93  .000000002
    1.45  .073529141    2.96  .001538097    4.47  .000003968    5.98  .000000001
    1.50  .066807100    3.01  .001306156    4.52  .000003143    6.03  .000000001

Six Sigma - Tools & Concepts StdNormZ_001

Reference: Juran's Quality Control Handbook - Ch. 23, P. 38-39

Terminology
A. The standard transform, Z, transforms a set of data such that the mean is always zero (µ=0) and the
standard deviation is always one (σ=1.0). By virtue of this transformation, the raw units of measure
(e.g. inches, etc.) are eliminated or lost so the Z measurement scale is without units.
B. X is the value that a random variable CT characteristic can take.
C. The mean of the population. When using a sample, an estimate of µ such as Xbar will be used to
substitute µ.
D. The standard deviation of the population. When using a sample, an estimate of σ, such as "s", will be
used to substitute σ.
E. The area under the Normal Curve Table, for normal distributions where µ=0 and σ=1.0, is the
reference used to find the surface that lies beyond the value of X.
F. This area represents the probability of a defect. When X takes the value of a performance limit, for
example a specification limit (SL), the area under the normal curve which lies beyond the Z value is
the probability of producing a defect.

Major Considerations

The use of the Standard Normal Deviate assumes that the underlying distribution is Normal. When
establishing a rate of nonconformance with the Z value, if the actual distribution is markedly skewed (i.e.
non-normal), the likelihood of grossly distorted estimates is quite high. To avoid such distortion, it is often
possible to mathematically transform the raw data.

Application Cookbook
1. To calculate the Z value from sample data, apply the formula and replace µ and σ by Xbar and "s"
respectively.
2. Use the following Excel function (NORMSDIST) to obtain the probability related to a Z value. Note
that Excel gives the probability to be lower than the Z value. In order to obtain the probability of being
greater than a Z value, simply use 1-NORMSDIST.
3. In Minitab use the function "Calc>Probability Distribution>Normal" to obtain the probability to be
lower than a Z value.
Cumulative Distribution Function
Normal: mean = 0 and standard deviation = 1.00000
X       P(X <= x)
2.91    0.9982
1 − 0.9982 = .0018 (.18%), which represents the probability of having a value greater than Z
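The same lookup can be done with SciPy; x̄, s, and the specification limit below are illustrative values chosen to reproduce Z = 2.91 from the example above.

```python
# Z value and probability of defect beyond an upper spec limit.
from scipy.stats import norm

x_bar, s = 100.0, 5.0   # illustrative sample estimates of mu and sigma
usl = 114.55            # illustrative upper specification limit

z = (usl - x_bar) / s   # Z = (SL - mu) / sigma
p_defect = norm.sf(z)   # area beyond Z, i.e. 1 - NORMSDIST(Z)
print(f"Z = {z:.2f}, P(defect) = {p_defect:.6f}")
```

The result matches the .001807032 entry for Z = 2.91 in the table above.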

Standard Deviation - Conventional
Purpose
To provide a measurement yardstick that describes the variation in a set of data. The standard deviation
is the most important measure of variation.
Anatomy
Standard Deviation - Conventional

    σ̂ = s = √[ Σi=1..n ( xi − x̄ )² / (n − 1) ]

    (A: σ̂ = s,  B: n − 1,  C: x̄,  D: xi,  E: Σ, the summation)

Six Sigma - Tools & Concepts StDevCnv_001

Reference: Juran's Quality Control Handbook - Ch. 23, P. 17

Terminology
A. σ̂ (sigma hat): Represents the estimator of the population standard deviation (σ). Also "s" is used to
denote sample standard deviation.
B. n − 1: The sample size minus 1; the number of degrees of freedom associated with the statistic.
C. X̄ (x bar): Used to denote the sample mean.
D. Xi: Represents the ith value of X.
E. Σ: The summation symbol, from i = 1 to n; Σi=1..n ( Xi − X̄ )² is the sum of squares.

Major Considerations

The standard deviation is the square root of the variance. The variance is difficult to interpret because
the unit of measure is squared. When we take the square root of the variance, we obtain a measure of
dispersion (standard deviation) which is in the same unit of measure as the sample.

Application Cookbook
To calculate the standard deviation:
1. Using Excel to calculate the standard deviation of a data set, we have to use the statistical function
STDEV.
2. Another tool is the use of Minitab, using STATS>BASIC STATISTICS>DESCRIPTIVE STATISTICS
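The formula and the software route can be checked against each other; the data set below is illustrative.

```python
# Conventional (n-1) standard deviation: formula vs. the statistics module.
import math
import statistics

data = [7.2, 7.5, 7.4, 7.6, 7.3, 7.5, 7.4]

n = len(data)
x_bar = sum(data) / n
s_manual = math.sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))

# statistics.stdev also uses the n-1 divisor, so the two should agree.
print(f"s = {s_manual:.4f}, stdev agrees: {math.isclose(s_manual, statistics.stdev(data))}")
```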

Standard Deviation - Long-Term
Purpose
The standard deviation long-term estimates total standard deviation of the process, and is a measure of
how widely values are dispersed from the overall average value (the grand mean) due to white noise and
black noise.
Used to calculate process long-term capability (i.e.: Pp and Ppk) and PPM defective

Anatomy

Standard Deviation - Long-Term

    sLT = σ̂LT = √[ Σj=1..g Σi=1..n ( xi,j − x̄ )² / (ng − 1) ]

    (A: sLT,  B: σ̂ (hat: estimate of),  C: xi,j,  D: x̄ the overall average,
     E/F: the sums of squares,  G: g rows,  H: n data per row,  I: ng − 1 degrees of freedom)

Six Sigma - Tools & Concepts StdDevLT_001

Reference: Mikel J. Harry, The Vision of Six Sigma, Ch. 9 / Eugene L. Grant and Richard S. Leavenworth, Statistical Quality Control, 6th edition, McGraw-Hill, Ch. 3

Terminology
A. Calculated Standard Deviation Long-Term.
B. Standard Deviation (hat means: Estimate of).
C. ith data point in the jth row (subgroup).
D. Overall arithmetic average.
E. Sum of squares for the jth row.
F. Sum of squares for all rows.
G. Number of rows (g).
H. Number of data in a row (n).
I. Degree of freedom: (i.e. (gn-1)).

Major Considerations

Major concept in quality management and in statistics.


Metric used to evaluate long-term process variability.
Used by customers to rate suppliers and competitors.

Application Cookbook
1. Compute the overall mean of all data.
2. Calculate the differences between each individual data and the overall mean.
3. Square each difference.
4. Sum all squares.
5. Compute the degrees of freedom: the number of rows (g) times the number of data in a row (n),
minus 1 (i.e. ng − 1).
6. Divide the sum of squares by the degrees of freedom.
7. Square root the ratio.
Alternative: Compute directly the standard deviation of all data at once.
The standard deviation long-term may be computed with Excel or Minitab
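The cookbook steps amount to pooling all the subgroup data and taking the ordinary n−1 standard deviation of the combined set; the three subgroups below are illustrative.

```python
# Long-term standard deviation: n-1 standard deviation of all data at once.
import math

subgroups = [               # g = 3 illustrative subgroups of n = 5
    [7.48, 7.50, 7.49, 7.51, 7.50],
    [7.52, 7.54, 7.53, 7.55, 7.52],
    [7.46, 7.47, 7.45, 7.48, 7.46],
]

all_data = [x for row in subgroups for x in row]
ng = len(all_data)                    # g rows times n data per row
grand_mean = sum(all_data) / ng

# Steps 1-7 of the cookbook in one expression
s_lt = math.sqrt(sum((x - grand_mean) ** 2 for x in all_data) / (ng - 1))
print(f"s_LT = {s_lt:.5f}")
```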

Standard Deviation - Short-Term
Purpose
The Short-term standard deviation estimates standard deviation within, and is a measure of how widely
values are dispersed from the average value (the mean) due only to the effect of white noise.
It is used to calculate Short-term Process Capability (i.e.: Cp and Cpk).
Anatomy
Standard Deviation - Short-Term

    sST = σ̂ = √[ Σj=1..g Σi=1..n ( xij − x̄j )² / ( g(n − 1) ) ]

    (A: sST,  B: σ̂ (hat: estimate of),  C: xij,  D: x̄j the average of row j,
     E/F: the sums of squares,  G: g rows,  H: n data per row,  I: g(n − 1) degrees of freedom)

Six Sigma - Tools & Concepts StdDevST_001

Reference: Juran's Quality Control handbook, 4th edition, Ch 23

Terminology
A. Calculated Standard Deviation Short-Term.
B. Standard Deviation (hat means: Estimate of).
C. ith data point in the jth row (subgroup).
D. Arithmetic average of jth row.
E. Sum of squares for the jth row.
F. Sum of squares for all rows.
G. Number of rows (g).
H. Number of data in a row (n).
I. Degrees of freedom: (i.e. g(n-1)).

Major Considerations
Major concept in quality management and in statistics
Metric used to evaluate short-term process variability (smallest possible for a given process)
Used by some customers to rate their suppliers and competitors
Standard Deviation Short-Term is also called Pooled Standard Deviation

Application Cookbook
1. Compute the variance for each row.
2. Average row variances.
3. Square root the average variance.
The standard deviation Short-term may be computed with Excel or Minitab.
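The three cookbook steps can be sketched in Python (a minimal illustration with hypothetical subgroup data; for equal subgroup sizes, pooling the within-row sums of squares is the same as averaging the row variances):

```python
import math

def stddev_short_term(rows):
    """Pooled (short-term) standard deviation of g rows of n data each:
    sum squared deviations from each row's own mean, divide by g(n - 1)."""
    g, n = len(rows), len(rows[0])
    ss = 0.0
    for row in rows:
        row_mean = sum(row) / n
        ss += sum((x - row_mean) ** 2 for x in row)  # within-row sum of squares
    return math.sqrt(ss / (g * (n - 1)))

rows = [[9.8, 10.1, 10.0, 9.9], [10.4, 10.6, 10.5, 10.3], [9.7, 9.9, 9.8, 10.0]]
print(stddev_short_term(rows))  # ≈ 0.129, smaller than the long-term value
```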

Sums of Squares: Total, Between, and Within
Purpose
To break down the total variation into its components i.e. between and within variations.
Sum of squares is a mathematical technique to compute the combined effect of different sources of
variability.
Anatomy

Sum of Squares: Total, Between and Within

∑(j=1..g) ∑(i=1..n) (Xij − X̄)²  =  n ∑(j=1..g) (X̄j − X̄)²  +  ∑(j=1..g) ∑(i=1..n) (Xij − X̄j)²
        Total (SST)                    Between (SSB)                 Within (SSW)

Total: Sustained Reproducibility (Capability); Between: Accuracy; Within: Instantaneous Reproducibility (Precision).

[Figure SumSquar_001, Six Sigma - Tools & Concepts: the identity annotated with labels A–H, explained under Terminology below.]

Reference: Mikel J. Harry, The Vision of Six Sigma, Ch. 9 / Douglas C. Montgomery, Design and Analysis of Experiments, 4th edition, John Wiley & Sons, P. 69

Terminology
A. Individual value.
B. Summation over all subgroups (j = 1 to g).
C. Summation over all individuals within the subgroups (i = 1 to n).
D. Grand average (overall).
E. Average of subgroup j.
F. Between Sum of Squares (Black noise, special cause effect).
G. Within Sum of Squares (White noise, error, random cause effect).
H. Total Sum of Squares (White noise plus Black noise).

Major Considerations
To analyze the total Sum of Squares, we need to break it into two parts: Within and Between Sum of
squares.
If each subgroup mean is the same as the population mean, then the deviation of any one element, from
the grand mean, arises only by chance (random variation).

Application Cookbook
Total Sum of Squares
1. Calculate the overall variance.
2. Multiply the overall variance by its degrees of freedom.
Within Sum of Squares
3. Calculate the short-term variance.
4. Multiply this variance by its degrees of freedom.
Between Sum of Squares
5. Subtract the Within Sum of Squares from the Total Sum of Squares.
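The partition can be sketched in Python (a minimal illustration; names and data are hypothetical), confirming that SST = SSB + SSW:

```python
def sums_of_squares(rows):
    """Partition total variation for g rows of n data: SS_T = SS_B + SS_W."""
    n = len(rows[0])
    flat = [x for row in rows for x in row]
    grand = sum(flat) / len(flat)                        # grand average
    row_means = [sum(row) / n for row in rows]           # subgroup averages
    ss_t = sum((x - grand) ** 2 for x in flat)           # total: white + black noise
    ss_w = sum((x - m) ** 2                              # within: white noise
               for row, m in zip(rows, row_means) for x in row)
    ss_b = ss_t - ss_w                                   # between, by subtraction
    return ss_t, ss_b, ss_w

rows = [[9.8, 10.1, 10.0, 9.9], [10.4, 10.6, 10.5, 10.3], [9.7, 9.9, 9.8, 10.0]]
ss_t, ss_b, ss_w = sums_of_squares(rows)
```

The between term obtained by subtraction agrees with computing n·∑(X̄j − X̄)² directly.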

Total Defects Per Unit
Purpose
To compute the Total Defects Per Unit (TDPU) for a multi-operation process in order to compute the
process rolled throughput yield.

Anatomy

Total Defects Per Unit

TDPU = ∑(i=1..k) DPUi = ∑(i=1..k) (D/U)i = − ln(YRT)

[Figure TotDefUn_001, Six Sigma - Tools & Concepts: the formula annotated with labels A–G, explained under Terminology below.]

Reference: The Vision of Six Sigma: A Roadmap for Breakthrough

Terminology
A. Total defects per unit;
B. Summation of all the terms from 1 to k (k equals the number of operations in the process);
C. Number of defects per unit for the ith operation;
D. Number of defects produced during the ith operation;
E. Number of units produced during the ith operation;
F. Rolled throughput yield;
G. Natural logarithm.

Major Considerations
The TDPU is the sum of the individual process step DPUs, and is not the ratio of the sum of defects to
the sum of units.

Application Cookbook
1. Compute Defects per Unit (DPU) for each process operation;
2. Sum the DPUs over all k process operations to calculate TDPU;
or
3. Compute the process Rolled throughput Yield;
4. Calculate the negative Natural Log of YRT.
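Both routes can be sketched in Python (a minimal illustration with hypothetical DPU values):

```python
import math

def tdpu_from_dpus(dpus):
    """TDPU is the SUM of the per-operation DPUs (not a pooled defect ratio)."""
    return sum(dpus)

def tdpu_from_yrt(y_rt):
    """Equivalent route: TDPU = -ln(rolled throughput yield)."""
    return -math.log(y_rt)

dpus = [0.05, 0.10, 0.02]        # hypothetical DPUs for a 3-operation process
y_rt = math.exp(-sum(dpus))      # rolled throughput yield implied by those DPUs
print(tdpu_from_dpus(dpus), tdpu_from_yrt(y_rt))  # both ≈ 0.17
```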

Truth Table
Purpose
To determine whether to add or subtract sigma shift in converting between Z short-term and Z long-term.

Anatomy

Truth Table

                        FROM
TO              Short-Term      Long-Term
Short-Term      No Action       + 1.5 σST
Long-Term       − 1.5 σST       No Action

[Figure TruthTbl_001, Six Sigma - Tools & Concepts: the table annotated with labels A–C, explained under Terminology below.]

Reference: Mikel J. Harry, The Vision of Six Sigma, White Book, Ch. 9

Terminology
A. Actual situation measured data
B. Estimated situation
C. Historical shift between long-term and short-term

Major Considerations

Fundamental to the Six Sigma approach, with a tremendous effect on how process performance is considered.
Short-term data are free of special causes and therefore represent the effect of random causes only
(white noise). Long-term data reflect random plus special cause effects.

Application Cookbook
1. Determine whether the raw data are short-term or long-term.
2. Determine what kind of information is required: short-term or long-term.
3. Enter the truth table at the corresponding location.
4. Identify the appropriate action.
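The truth table can be sketched as a small Python helper (an illustration; the function name and term labels are hypothetical):

```python
SHIFT = 1.5  # historical shift between short-term and long-term, in sigma units

def convert_z(z, data_term, wanted_term):
    """Apply the truth table: subtract 1.5 going short->long,
    add 1.5 going long->short, no action on the diagonal."""
    if data_term == wanted_term:
        return z
    if data_term == "short" and wanted_term == "long":
        return z - SHIFT
    if data_term == "long" and wanted_term == "short":
        return z + SHIFT
    raise ValueError("terms must be 'short' or 'long'")

print(convert_z(6.0, "short", "long"))  # 4.5: a 6-sigma short-term process, long-term
```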

T-Test – One Sample
Purpose
To compare the population mean of a continuous CT characteristic with a value such as the target. Since
we don't know the population mean, an analysis of a sample of data is required. This test is usually used
to determine if the mean of a CT characteristic is on target.

Anatomy

T-Test - One Sample

H0: µ = µ0  vs.  Ha: µ ≠ µ0 (or Ha: µ > µ0, or Ha: µ < µ0)

Minitab Session Window output:

T-Test of the Mean

Test of mu = 2.000 vs mu not = 2.000

Variable    N     Mean    StDev   SE Mean    T      P
Data        25    1.929   0.932   0.186     -0.38   0.70

[Figure TTst1Smp_001, Six Sigma - Tools & Concepts: the output annotated with labels A–F, explained under Terminology below.]

Reference: Juran's Quality Control Handbook, Ch. 23, P. 60-81 / Business Statistics by Downing and Clark, Ch. 13, P. 252-256

Terminology
A. Null (H0) and alternative (Ha) hypotheses, where the population mean (µ) is compared to a value (µ0)
such as the target. For the alternative hypothesis, one of the three hypotheses has to be chosen
before collecting the data to avoid being biased or influenced by the observations of the sample.
B. Minitab Session Window output.
C. Hypotheses tested: Η0: µ=2 vs. Ha: µ ≠2.
D. Descriptive Statistics - Name of the column that contains the data, sample size, sample mean,
sample standard deviation (StDev) and standard error of the mean (SE Mean = s/√n).
E. Computed t statistic: t = (x̄ − µ0) / (s/√n).

F. P-Value – This value has to be compared with the alpha level (α) and the following decision rule is
used: if P < α, reject H0 and accept Ha with (1-P)100% confidence; if P ≥ α, don't reject H0.

Major Considerations

The assumption for using this test is that the data comes from a random sample of a normal distribution.
When the sample size is greater than or equal to 30, the standard normal distribution (Z) is generally used
(see tool Z Test – One Sample). However, as the sample size increases, the t distribution approaches
the Z distribution (see tool Distribution – t).

Application Cookbook
1. Define problem and state the objective of the study.
2. Establish hypotheses - State the null hypothesis (H0) and the alternative hypothesis (Ha).
3. Establish alpha level (α). Usually α is 0.05.
4. Establish sample size (see tool Sample Size – Continuous Data – T-Test - One Sample).
5. Select a random sample.
6. Measure the CT characteristic.
7. Analyze data with Minitab:
• Use the function under Stat>Basic Statistics>1-Sample t.
• Select the Test mean option, input the target value and the desired alternative hypothesis (>,
<, ≠: the default setting).
8. Make statistical decision from the session window output of Minitab. Either accept or reject H0.
9. Translate statistical conclusion to practical decision about the CT characteristic.
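The t statistic in the session window above can be reproduced with a few lines of Python (a minimal sketch; with SciPy available, scipy.stats.ttest_1samp also returns the P-value):

```python
import math

def one_sample_t(data, mu0):
    """t statistic for H0: mu = mu0, i.e. (xbar - mu0) / (s / sqrt(n))."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return (mean - mu0) / (s / math.sqrt(n))

# Reproducing the session-window numbers (n = 25, mean 1.929, StDev 0.932):
t = (1.929 - 2.000) / (0.932 / math.sqrt(25))
print(round(t, 2))  # -0.38, matching the Minitab output
```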

T-Test – Two Samples
Purpose
To compare the means of two populations on a continuous CT characteristic. Since we don't know the
population means, an analysis of two data samples is required. This test is usually used to determine if
there is a statistically significant change in the mean of a CT under two different conditions.

Anatomy

T-Test - Two Samples

H0: µ1 = µ2  vs.  Ha: µ1 ≠ µ2 (or Ha: µ1 > µ2, or Ha: µ1 < µ2)

Minitab Session Window output:

Two Sample T-Test and Confidence Interval

Two sample T for Data1 vs Data2

         N    Mean    StDev   SE Mean
Data1    20   8.05    1.06    0.24
Data2    20   9.674   0.738   0.17

95% CI for mu Data1 - mu Data2: ( -2.21, -1.04)
T-Test mu Data1 = mu Data2 (vs <): T= -5.62  P=0.0000  DF= 38
Both use Pooled StDev = 0.916

[Figure TTst2Smp_001, Six Sigma - Tools & Concepts: the output annotated with labels A–H, explained under Terminology below.]

Reference: Juran's Quality Control Handbook, Ch. 23, P. 66-67 / Minitab Reference Manual, Ch. 1, P. 19-21

Terminology
A. Null (H0) and alternative (Ha) hypotheses, where the two population means (µ1, µ2) are compared.
For the alternative hypothesis, one of the three hypotheses has to be chosen before collecting the
data to avoid being biased by the observations of the samples.
B. Minitab Session Window output.
C. Descriptive Statistics - Sample size, sample mean, sample standard deviation (StDev) and standard
error of the mean (SE Mean = s/√n).
D. Hypotheses tested: H0:µ1= µ2 vs. Ha: µ1< µ2, and the computed t statistic.
E. Confidence interval around the difference of the two means (µ1-µ2). When µ1-µ2=0 is included in this
interval, the data supports the hypothesis Η0:µ1= µ2 at the α (alpha) level.
F. P-Value – This value has to be compared with the α (alpha) level and the following decision rule is
used: if P < α, reject H0 and accept Ha with (1-P) 100% confidence; if P ≥ α, don't reject H0.
G. Number of degrees of freedom.
H. Pooled standard deviation – This is used when the variances of the two populations are assumed to
be equal.

Major Considerations
The assumptions for using this test are that the data come from two independent random samples taken
from two normally distributed populations.

Application Cookbook
1. Define problem and state the objective of the study.
2. Establish hypotheses - State the null hypothesis (H0) and the alternative hypothesis (Ha).
3. Establish alpha level (α). Usually α is 0.05.
4. Establish sample size (see tool Sample Size – Continuous Data – T-Test - Two Sample).
5. Select a random sample.
6. Measure the CT characteristic.
7. Analyze data with Minitab:
• Use the function under Stat>Basic Statistics>2-Sample t.
• According to how the data is formatted, choose one of the two options: Samples in one column
(when all data are stored in one column and a second column contains group codes) or Samples
in different columns.
• Input the desired alternative hypothesis (>, <, ≠: the default setting) and the confidence level (usually
95%) for the confidence interval on the difference of two means.
• Select the Assume equal variances option only if you can support this assumption from past historical
records. It gives a slightly more powerful method than the one that does not assume equal
variances, but can be seriously in error if the variances of the populations are not equal. This
option should not be used in most cases.
8. Make statistical decision from the session window output of Minitab. Either accept or reject H0.
9. Translate statistical conclusion to practical decision about the CT characteristic.
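The pooled two-sample t statistic can be sketched in Python (a minimal illustration; the equal-variance pooling mirrors the "Pooled StDev" line in the output above):

```python
import math

def two_sample_t_pooled(x1, x2):
    """Pooled-variance two-sample t statistic, df = n1 + n2 - 2."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    ss1 = sum((x - m1) ** 2 for x in x1)
    ss2 = sum((x - m2) ** 2 for x in x2)
    sp = math.sqrt((ss1 + ss2) / (n1 + n2 - 2))        # pooled StDev
    t = (m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2
```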

Western Electric Rules – Test 1
Purpose
To determine in a rigorous, standardized manner, whether a process is in or out of statistical control,
through the use of quantitative tests applied to points on a Control Chart. Western Electric Rule 1 states
that a process is out of control if any one point falls outside of either the LCL or the UCL.
Anatomy

Western Electric Rules - Test 1

Western Electric Rules Test 1: A lack of control is indicated whenever any one value falls outside the three sigma limit.

[Figure WER_Tst1_001, Six Sigma - Tools & Concepts: two I Charts for Response, one with a point above the UCL (3.0SL=105.6, X̄=100.3, −3.0SL=94.98) and one with a point below the LCL (3.0SL=105.5, X̄=99.72, −3.0SL=93.94); annotation labels A–E are explained under Terminology below.]

Reference: Statistical Quality Control, p. 116

Terminology
A. Lower Control Limit (LCL) – Line and numerical value representing the lower limit of the variation that
could be expected if the process were in a state of statistical control, by convention equal to the
Mean minus three Standard Deviations
B. Central Line – Average value of the process parameter being plotted, over the period of inspection
being referenced.
C. Upper Control Limit (UCL) – Line and numerical value representing the upper limit of the variation
that could be expected if the process were in a state of statistical control, by convention equal to the
Mean plus three Standard Deviations.
D. Out of Control Point – This single point above the UCL is sufficient for this process to fail Test
Number 1.
E. Out of Control Point – This single point below the LCL is sufficient for this process to fail Test
Number 1.

Major Considerations
The use of formal tests such as the Western Electric Rules, eliminates subjectivity as a factor in
determining the state of process control
Western Electric Rule 1 is one of the most common tests for the state of process control, and should
always be applied first
Minitab has the ability to test Control Charts for any of the Western Electric Rules

Application Cookbook
1. Plot the Control Chart, indicating the Central Line, and the Upper and Lower Control Limits
2. Plot the four lines corresponding to the One and Two Sigma lines above and below the Central Line
3. Determine the existence of any points above the UCL or below the LCL.
4. If any points are outside these Limits, the process is to be considered out of control, and action
should be taken.
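Steps 3-4 can be sketched in Python (a minimal illustration; the function name and data are hypothetical):

```python
def we_test_1(values, center, sigma):
    """Western Electric Test 1: indices of points outside the 3-sigma limits."""
    ucl, lcl = center + 3 * sigma, center - 3 * sigma
    return [i for i, v in enumerate(values) if v > ucl or v < lcl]

data = [100.3, 101.1, 99.2, 106.4, 100.0]        # hypothetical individual values
print(we_test_1(data, center=100.3, sigma=1.77)) # [3]: one point above UCL = 105.61
```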

Western Electric Rules – Test 2


Purpose
To determine whether a process is in or out of statistical control through the use of quantitative tests
applied to points on a Control Chart. Western Electric Rule 2 states that a process is out of control if any
two out of three successive points are on the same side of the Central Line AND are more than two
Standard Deviations away from the Central Line

Anatomy

Western Electric Rules - Test 2

Western Electric Rules Test 2: A lack of control is indicated whenever at least two out of three successive values are both on the same side of the Central Line and are also more than two sigma distant from the Central Line.

[Figure WER_Tst2_001, Six Sigma - Tools & Concepts: three I Charts for Response illustrating the out-of-control cases; annotation labels A–H are explained under Terminology below.]

Reference: Statistical Quality Control, p. 116

Terminology
A. Lower Control Limit (LCL) – Line and numerical value representing the lower limit of the variation that
could be expected if the process were in a state of statistical control, by convention equal to the
Mean minus three Standard Deviations
B. Central Line – Average value of the process parameter being plotted, over the period of inspection
being referenced.
C. Upper Control Limit (UCL) – Line and numerical value representing the upper limit of the variation
that could be expected if the process were in a state of statistical control, by convention equal to the
Mean plus three Standard Deviations.
D. One Sigma Line – Line plotted on the Control Chart, for the purposes of the test, corresponding to
one Standard Deviation above or below the Process Mean
E. Two Sigma Line – Line plotted on the Control Chart, for the purposes of the test, corresponding to
two Standard Deviations above or below the Process Mean
F. Out of Control Case – Two successive points above the Two Sigma line constitute failure of Western
Electric Test 2.
G. Out of Control Case – Two out of three successive points above the Two Sigma line constitute
failure of Western Electric Test 2.
H. Out of Control Case – Similar to G above, but also features one point above the UCL, thereby also
failing Western Electric Test 1.

Major Considerations

The use of formal tests such as the Western Electric Rules, eliminates subjectivity as a factor in
determining the state of process control
Western Electric Rule 2 is not used as frequently as Test 1 and 4, and is generally used when increased
sensitivity is required
Minitab has the ability to test Control Charts for any of the Western Electric Rules

Application Cookbook
1. Plot the Control Chart, indicating the Central Line, and the Upper and Lower Control Limits
2. Plot the four lines corresponding to the One and Two Sigma lines above and below the Central Line
3. Determine the existence of any points above or below the Two Sigma lines.
4. If any two consecutive points, or any two out of three successive points, are outside the Two Sigma
lines, and on the same side of the Center Line, the process is to be considered out of control, and
action should be taken.
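The test in step 4 can be sketched in Python (an illustration; the sketch flags the index ending each failing three-point window):

```python
def we_test_2(values, center, sigma):
    """Western Electric Test 2: flag index i when at least 2 of the 3 points
    ending at i are on the same side of the center line and beyond the
    two-sigma line on that side."""
    flags = []
    for i in range(2, len(values)):
        window = values[i - 2:i + 1]
        if sum(v > center + 2 * sigma for v in window) >= 2:
            flags.append(i)
        elif sum(v < center - 2 * sigma for v in window) >= 2:
            flags.append(i)
    return flags
```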

Western Electric Rules – Test 3


Purpose
To determine whether a process is in or out of statistical control through the use of quantitative tests
applied to points on a Control Chart. Western Electric Rule 3 states that a process is out of control if any
four out of five successive points are on the same side of the Central Line AND are more than one
Standard Deviation away from the Central Line

Anatomy

Western Electric Rules - Test 3

Western Electric Rules Test 3: A lack of control is indicated whenever at least four out of five successive values are all on the same side of the Central Line and are also more than one sigma distant from the Central Line.

[Figure WER_Tst3_001, Six Sigma - Tools & Concepts: three I Charts for Response illustrating the out-of-control cases; annotation labels A–H are explained under Terminology below.]

Reference: Statistical Quality Control, p. 116

Terminology
A. Lower Control Limit (LCL) – Line and numerical value representing the lower limit of the variation that
could be expected if the process were in a state of statistical control, by convention equal to the
Mean minus three Standard Deviations
B. Central Line – Average value of the process parameter being plotted, over the period of inspection
being referenced.
C. Upper Control Limit (UCL) – Line and numerical value representing the upper limit of the variation
that could be expected if the process were in a state of statistical control, by convention equal to the
Mean plus three Standard Deviations.
D. One Sigma Line – Line plotted on the Control Chart, for the purposes of the test, corresponding to
one Standard Deviation above or below the Process Mean
E. Two Sigma Line – Line plotted on the Control Chart, for the purposes of the test, corresponding to
two Standard Deviations above or below the Process Mean
F. Out of Control Case – Four consecutive points above the One Sigma line constitute failure of
Western Electric Test 3
G. Out of Control Case – Four out of five successive points above the One Sigma line constitute failure
of Western Electric Test 3
H. Out of Control Case – Four consecutive points below the One Sigma line constitute failure of
Western Electric Test 3

Major Considerations
The use of formal tests such as the Western Electric Rules, eliminates subjectivity as a factor in
determining the state of process control
Western Electric Rule 3 is not used as frequently as Test 1 and 4, and is generally used when increased
sensitivity is required
Minitab has the ability to test Control Charts for any of the Western Electric Rules

Application Cookbook
1. Plot the Control Chart, indicating the Central Line, and the Upper and Lower Control Limits
2. Plot the four lines corresponding to the One and Two Sigma lines above and below the Central Line
3. Determine the existence of any points above or below the One Sigma lines. If any four out of five
successive points are outside the One Sigma lines, and on the same side of the Center Line, the
process is to be considered out of control, and action should be taken.
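Step 3 can be sketched in Python (an illustration; the sketch flags the index ending each failing five-point window):

```python
def we_test_3(values, center, sigma):
    """Western Electric Test 3: flag index i when at least 4 of the 5 points
    ending at i are on the same side of the center line and more than one
    sigma away from it."""
    flags = []
    for i in range(4, len(values)):
        window = values[i - 4:i + 1]
        if sum(v > center + sigma for v in window) >= 4:
            flags.append(i)
        elif sum(v < center - sigma for v in window) >= 4:
            flags.append(i)
    return flags
```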

Western Electric Rules – Test 4


Purpose
To determine in a rigorous, standardized manner, whether a process is in or out of statistical control,
through the use of quantitative tests applied to points on a Control Chart. Western Electric Rule 4 states
that a process is out of control if any eight successive points are on the same side of the Central Line

Anatomy

Western Electric Rules - Test 4

Western Electric Rules Test 4: A lack of control is indicated whenever at least eight successive values are all on the same side of the Central Line.

[Figure WER_Tst4_001, Six Sigma - Tools & Concepts: two I Charts for Response illustrating runs of eight points above and below the Central Line; annotation labels A–E are explained under Terminology below.]

Reference: Statistical Quality Control, p. 116

Terminology
A. Lower Control Limit (LCL) – Line and numerical value representing the lower limit of the variation that
could be expected if the process were in a state of statistical control, by convention equal to the
Mean minus three Standard Deviations
B. Central Line – Average value of the process parameter being plotted, over the period of inspection
being referenced.
C. Upper Control Limit (UCL) – Line and numerical value representing the upper limit of the variation
that could be expected if the process were in a state of statistical control, by convention equal to the
Mean plus three Standard Deviations.
D. Out of Control Case – Eight successive points above the Central Line constitute failure of Western
Electric Test 4
E. Out of Control Case – Eight successive points below the Central Line constitute failure of Western
Electric Test 4

Major Considerations
The use of formal tests such as the Western Electric Rules, eliminates judgement as a factor in
determining the state of process control
Western Electric Rule 4 is a common test and is used frequently in conjunction with Western Electric
Test 1
Minitab has the ability to test Control Charts for any of the Western Electric Rules

Application Cookbook
1. Plot the Control Chart, indicating the Central Line, and the Upper and Lower Control Limits
2. Determine whether any eight successive points fall on the same side of the Central Line.
3. If any such run exists, the process is to be considered out of control, and action should be taken.
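The run check can be sketched in Python (an illustration; the sketch flags the index ending each failing eight-point run):

```python
def we_test_4(values, center):
    """Western Electric Test 4: flag index i when the 8 points ending at i
    all fall on the same side of the center line."""
    flags = []
    for i in range(7, len(values)):
        window = values[i - 7:i + 1]
        if all(v > center for v in window) or all(v < center for v in window):
            flags.append(i)
    return flags
```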

Yates Standard Order
Purpose
To place the experimental factor levels in a standard order prior to running an Experimental Design.
Factor levels are in standard order when the first column consists of alternating low and high settings, the
second column consists of alternating pairs of low and high settings, the third column consists of four low
followed by four high settings, etc.
Anatomy

Yates Standard Order

Run   Yates     A    B    C
 1    (1)      -1   -1   -1
 2    a        +1   -1   -1
 3    b        -1   +1   -1
 4    ab       +1   +1   -1
 5    c        -1   -1   +1
 6    ac       +1   -1   +1
 7    bc       -1   +1   +1
 8    abc      +1   +1   +1

[Figure YatesStO_001, Six Sigma - Tools & Concepts: the table annotated with labels A–E, explained under Terminology below.]

Reference: Statistics for Experimenters, P. 323 / Juran, pp. 26.24-26.26

Terminology
A. Experiment run number.
B. Yates designation.
C. Symbol for high-level setting of Factor.
D. Symbol for low-level setting of Factor.
E. Factor designation.

Major Considerations
When placed in Yates Standard Order, the kth column of a table consists of alternating groups of 2^(k−1)
low level settings and 2^(k−1) high level settings.
Only 2^k Factorial Designs can be placed in Yates Standard Order.
Even though placing a design in standard order results in an orderly presentation, it is critical that the
experiment actually be run in random order.
Minitab has the capability of producing experimental designs in standard order or in random order.

Application Cookbook
1. Arrange the first column of the design matrix as a series of alternating high and low settings.
2. Arrange the second column as an alternating series of two low settings, followed by two high
settings.
3. Arrange the third column as an alternating series of four low settings followed by four high settings.
4. This arrangement continues as a series of alternating groups of high and low settings of size 2^(k−1)
until all columns have been arranged.
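The arrangement can be generated with a short Python sketch (an illustration; `itertools.product` with each tuple reversed makes the first factor alternate fastest, which is exactly standard order):

```python
from itertools import product

def yates_standard_order(k):
    """Rows of a 2^k design in Yates standard order: column j alternates in
    blocks of 2^(j-1) low (-1) then 2^(j-1) high (+1) settings."""
    return [tuple(reversed(levels)) for levels in product((-1, +1), repeat=k)]

for run, row in enumerate(yates_standard_order(3), start=1):
    print(run, row)  # reproduces the 8-run table above
```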

Yield - Final
Purpose
To compute the "First Time Yield" after the last step of a series of k process steps is completed. Final
Yield is a system test, in that not every CT characteristic is tested or verified.

Anatomy

Yield - Final

YFINAL = S / U

[Figure YieldFin_001, Six Sigma - Tools & Concepts: a k-step process flow (Input → Operation/Verify for Process 1 … Process k → Final Inspection → Output), with rework and scrap loops at each step; Final Yield is calculated at the final inspection. Annotation labels A–E are explained under Terminology below.]

Reference: The Vision of Six Sigma: A Roadmap for Breakthrough, Ch. 14

Terminology
A. Process Input.
B. First process step.
C. kth (and last) process step.
D. Process output.
E. Calculation point of Final Yield, defined as the ratio of number of units accepted (S) to the number of
units tested (U).

Major Considerations
Does not consider the "Hidden Factory". YFINAL does not provide insight into true process
performance or into the severity of failures.
The cost structure of each unit of output may be different.

Application Cookbook
1. Count the number of units inspected or tested (U).
2. Count the number of units that pass the inspection/test requirements (S) at the end of the process.
3. Apply the formula YFINAL , expressing the result as a percentage.

Yield – First Time
Purpose
To calculate the ratio of the number of units accepted to the total number of units inspected. This
calculation of process performance represents the classical view of Yield.

Anatomy

Yield - First Time

YFT = S / U

[Figure Yield1st_001, Six Sigma - Tools & Concepts: a single process step (Input → Operation with operator self-check → Inspection → Output), with rework and scrap loops; First Time Yield is calculated at the inspection point. Annotation labels A–E are explained under Terminology below.]

Reference: The Vision of Six Sigma: A Roadmap for Breakthrough, Ch. 14

Terminology
A. Process Operation Step including operator self-check/verification
B. Process Inspection/Test/Verification Step
C. Point at which YFT is calculated, considering the number of units tested (U) and the number of units
that pass inspection (S)
D. Process Rework Step
E. Scrap

Major Considerations
Calculation of Process Yield using First Time Yield formula does not reflect the true process
effectiveness. It often leads to an overvalued calculation of the true Yield, and has no practical meaning
other than the ratio of successes over units tested. First Time Yield should not be used in practice.
YFT does not take into account the "Hidden Factory".

Application Cookbook
1. Count the number of units inspected or tested (U)
2. Count the number of units that pass the inspection/test requirements (S)
3. Apply the formula YFT , expressing the result as a percentage

Yield - Normalized
Purpose
To assign a single yield value to each step in a process when production of the output (product, service,
etc.) involves more than one step. The Normalized Yield represents an equalized yield for each process
step, in that each step is assigned the same yield value.

Anatomy

Yield - Normalized

YNORM = (YRT)^(1/k)   (the kth root of the Rolled Throughput Yield)

[Figure YieldNor_001, Six Sigma - Tools & Concepts: a k-step process in which every step (Step 1 … Step k) is assigned the same yield YNORM. Annotation labels A–D are explained under Terminology below.]

Reference: The Vision of Six Sigma: A Roadmap for Breakthrough, Ch. 14 & 19

Terminology
A. Typical process step
B. Number of process steps (k)
C. Rolled Throughput Yield
D. Normalized Yield assigned to each step

Major Considerations
Normalized Yield is most commonly used to assign a yield value to each step in a process, when the
individual Throughput Yields at each step are not known.
Normalized Yield can also be used when performing Metrics Flow Down calculations, when individual
process step yields are not known, or to assign equalized yield values to each process step, even when
individual YTPs are known.

Application Cookbook
1. Determine the Rolled Throughput Yield of the process YRT, either from the multiplication of the
individual step Throughput Yields, or from the formula YRT = e^(−TDPU).
2. Determine the number of process steps (k).
3. Calculate YNORM as the kth root of YRT.
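The calculation is a one-liner in Python (a minimal illustration with hypothetical yields):

```python
def normalized_yield(y_rt, k):
    """Equalized per-step yield: the k-th root of the rolled throughput yield."""
    return y_rt ** (1 / k)

print(normalized_yield(0.81, 2))  # ≈ 0.9: each of 2 steps is assigned 90%
```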

Yield – Rolled Throughput
Purpose
To calculate the probability that a unit of output will be defect free after a series of k process steps.

Anatomy

Yield - Rolled Throughput

YRT = ∏(i=1..k) YTPi

[Figure YieldRol_001, Six Sigma - Tools & Concepts: a series of process steps (Step 1 … Step k) with Throughput Yields YTP1, YTP2, …, YTPk. Annotation labels A–E are explained under Terminology below.]

Reference: The Vision of Six Sigma: A Roadmap for Breakthrough, Ch. 14

Terminology
A. Process step (typical).
B. Throughput Yield of process step 1.
C. kth process step (last in the series to calculate the Rolled Throughput Yield).
D. Throughput Yield of process step k.
E. Formula to calculate the Rolled Throughput Yield (YRT).

Major Considerations
Since the concept of yield represents the probability of producing zero defects, and process steps are
assumed to be independent, the probability of producing zero defects after k steps is equal to the
product of the yield values for each step (see concept Basic Probability Theory – Sets, Theorems).
YRT is a function of the number of defects in the process step.

Application Cookbook
1. Calculate the Throughput Yield (YTP) for each step or process (steps 1 through k).
2. Multiply the Throughput Yield of each step.
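The two cookbook steps reduce to a product in Python (a minimal illustration with hypothetical step yields):

```python
from math import prod

def rolled_throughput_yield(step_yields):
    """Probability a unit passes all k steps defect-free: the product of the
    individual Throughput Yields (steps assumed independent)."""
    return prod(step_yields)

print(rolled_throughput_yield([0.95, 0.90, 0.98]))  # ≈ 0.8379
```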

Yield - Throughput
Purpose
To calculate the true measure of a process step's effectiveness taking into account the "Hidden Factory".
Since this measure considers the "Hidden Factory", it is said to be the complete and true assessment of
process effectiveness. YTP of any given process step represents the probability of producing a defect-
free unit at that process step.
Anatomy

Yield - Throughput

YTP = e^(−dpu) = (e^(−dpo))^m

[Figure YieldTru_001, Six Sigma - Tools & Concepts: a process step (Input → Operation → operator verification → Inspection → Output) with rework and scrap loops; Throughput Yield is calculated at the output of the operation, prior to inspection or test. Annotation labels A–H are explained under Terminology below.]

Reference: The Vision of Six Sigma: A Roadmap for Breakthrough, Ch. 14

Terminology
A. Number of units going into a process. Each unit (u) contains (m) opportunities for defect.
B. Process operation step.
C. Defects (d) produced during the execution of the process step.
D. Point in process where Throughput Yield is calculated (prior to inspection or test).
E. Operator verification step.
F. Inspection after the process step.
G. Formula to calculate the Throughput Yield, where e ≅ 2.718282 and dpu = no. of defects (d) / no. of units (u).
H. Throughput Yield is also calculated using this formula, where m = number of opportunities per unit
and dpo = no. of defects (d) / no. of opportunities (o).

Major Considerations
YTP is reflective of the true cost structure of the process.
Calculation of YTP shall be done prior to any form of correction or rework of the output.
Calculation based on discrete or continuous data.
The Poisson approximation may be applied when the number of opportunities for nonconformance is large and the
probability of an event is small. If these assumptions prove to be unreasonable, then the binomial model
can be used.

Application Cookbook
1. Count the number of units going into a process (u).
2. Define the number of opportunities for defect contained in each unit (m).
3. Count the number of defects produced (d). This count shall be done prior to any form of rework or
correction to the output.
4. Compute DPU or DPO.
5. Compute Throughput Yield.
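The five steps above can be sketched as follows (Python; the counts u, m and d are hypothetical):

```python
import math

# Steps 1-3: counts taken before any rework or correction (hypothetical values)
u = 1200   # units entering the process
m = 8      # opportunities for defect per unit
d = 60     # defects produced

# Step 4: compute DPU and DPO
dpu = d / u         # defects per unit
dpo = d / (u * m)   # defects per opportunity

# Step 5: compute Throughput Yield with either formula (G or H);
# they agree because (e^-dpo)^m = e^(-m*dpo) = e^-dpu
ytp_from_dpu = math.exp(-dpu)
ytp_from_dpo = math.exp(-dpo) ** m

print(f"DPU = {dpu:.4f}, DPO = {dpo:.5f}")
print(f"Y_TP = {ytp_from_dpu:.4f}")
```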

Z Value - Long-Term
Purpose
To be able to evaluate long-term process performance.
To statistically estimate the PPM (parts per million defective) and the DPMO (defects per million
opportunities).

Anatomy

[Figure "Z Value - Long-Term" (ZValueLT_001): Normal curve with the lower and upper specification limits LSL and USL and the target T marked (D), centered on the central tendency λ (E). Formulas shown:

    A:  Z = (SL - λ) / σ                 (general equation)

    B:  Z_LT (Upper) = (USL - λ) / σ_LT

    C:  Z_LT (Lower) = (λ - LSL) / σ_LT

where λ = µ (mean) or T (target), and σ_LT is the long-term standard deviation (F).]

Reference: Mikel J. Harry, The Vision of Six Sigma, White Book, Ch. 8-9

Terminology
A. General equation of Z
B. Z long-term for Upper specification limit
C. Z long-term for Lower specification limit
D. Specification limits, which are an expression of the CTs
E. Central tendency to be used in the calculation (λ):
λ = µ for long-term naturally centered process
λ = T for long-term artificially centered process
F. Long-term standard deviation

Major Considerations
Z is a metric
Z long-term is always expressed in terms of "how many long-term sigmas"
Z provides a bridge between the process and the Normal probability distribution
The Six Sigma objective is to achieve a ZLT level of 4.5 or higher

Application Cookbook
1. Choose the reference for the central tendency
• for the actual distribution, use µ (i.e. the overall mean)
• for artificially centering on the mid point of the specification, use T (i.e. the target)
2. Estimate the standard deviation long-term
3. Compute both specification limits
4. Apply the proper formula to compute Upper and Lower Z long-term
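The cookbook steps can be sketched as follows (Python; all process figures are hypothetical):

```python
# Hypothetical long-term process data
mu_lt = 10.2          # overall long-term mean
sigma_lt = 0.5        # long-term standard deviation (step 2)
lsl, usl = 8.5, 11.5  # specification limits (step 3)

# Step 1: choose the central tendency; here the process is naturally
# centered, so lambda = mu (use lambda = T to center artificially)
lam = mu_lt

# Step 4: apply the formulas
z_lt_upper = (usl - lam) / sigma_lt
z_lt_lower = (lam - lsl) / sigma_lt

print(f"Z_LT(Upper) = {z_lt_upper:.2f}")   # 2.60
print(f"Z_LT(Lower) = {z_lt_lower:.2f}")   # 3.40
```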

Z Value - Short-Term
Purpose
To be able to evaluate short-term process performance
To rate performance based on benchmarking

Anatomy

[Figure "Z Value - Short-Term" (ZValueST_001): Normal curve with the lower and upper specification limits LSL and USL and the target T marked (D), centered on λ = T (E). Formulas shown:

    A:  Z = (SL - λ) / σ                 (general equation)

    B:  Z_ST (Upper) = (USL - λ) / σ_ST

    C:  Z_ST (Lower) = (λ - LSL) / σ_ST

where λ = T (target) and σ_ST is the short-term standard deviation (F).]

Reference: Mikel J. Harry, The Vision of Six Sigma, White Book, Ch. 8-9

Terminology
A. General equation of Z
B. Z short-term for Upper specification limit
C. Z short-term for Lower specification limit
D. Specification limits, which are an expression of the CTs
E. Central tendency to be used in the calculation; λ = T for short-term artificially centered process
F. Short-term standard deviation

Major Considerations
Z is a metric
Z short-term is always expressed in terms of "how many short-term sigmas"
Z provides a bridge between the process and the Normal probability distribution
The Six Sigma objective is to achieve a ZST level of 6 or higher

Application Cookbook
1. Choose the reference for the central tendency. The target value is used because Z short-term reflects
"process capability" under the assumption of random variation.
2. Estimate the standard deviation short-term
3. Compute both specification limits
4. Apply the proper formula to compute Upper and Lower Z short-term
Note: Z upper = Z lower if the target is the specification mid point.
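A minimal sketch of the calculation (Python, hypothetical figures), illustrating the note: with the target at the specification mid point, the upper and lower Z values coincide:

```python
# Hypothetical short-term data; lambda = T because Z short-term
# assumes the process is artificially centered on the target
target = 10.0
sigma_st = 0.25        # short-term standard deviation (step 2)
lsl, usl = 8.5, 11.5   # target is the specification mid point (step 3)

# Step 4: apply the formulas
z_st_upper = (usl - target) / sigma_st
z_st_lower = (target - lsl) / sigma_st

print(f"Z_ST(Upper) = {z_st_upper:.1f}")   # 6.0
print(f"Z_ST(Lower) = {z_st_lower:.1f}")   # 6.0
```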

Z-Test – One Sample
Purpose
To compare the population mean of a continuous CT characteristic with a value such as the target. Since
we don't know the population mean, an analysis of a data sample is required. This test is usually used to
determine if the mean of a CT characteristic is on target when the sample size is greater than or equal to 30.

Anatomy

[Figure "Z-Test - One Sample" (ZTst1Smp_001). Hypotheses (A):

    H0: µ = µ0  vs.  Ha: µ ≠ µ0,  Ha: µ > µ0,  or  Ha: µ < µ0

Minitab Session Window output (B), testing H0: µ = 10 vs. Ha: µ ≠ 10 (C):

    Z-Test

    Test of mu = 10.000 vs mu not = 10.000
    The assumed sigma = 1.82

    Variable    N    Mean   StDev  SE Mean      Z      P
    Data       35   9.402   1.825    0.308  -1.94  0.053

with the descriptive statistics (D), the computed Z statistic (E) and the P-Value (F).]

Reference: Basic Statistics by Kiemele, Schmidt and Berdine - Ch. 6 P. 3-11

Terminology
A. Null (H0) and alternative (Ha) hypotheses, where the population mean (µ) is compared to a value (µ0)
such as the target. For the alternative hypothesis, one of the three hypotheses has to be chosen
before collecting the data to avoid being biased by the observations of the sample.
B. Minitab Session Window output.
C. Hypotheses tested: H0: µ = 10 vs. Ha: µ ≠ 10.
D. Descriptive Statistics - Name of the column that contains the data, sample size, sample mean,
sample standard deviation (StDev) and standard error of the mean (SE Mean = s/√n).
E. Computed Z statistic: Z = (x̄ - µ0) / (s/√n).

F. P-Value – This value has to be compared with the alpha level (α) and the following decision rule is
used: if P < α, reject H0 and accept Ha with (1-P) 100% confidence; if P ≥ α, don't reject H0.

Major Considerations
The assumption for using this test is that the data come from a random sample with a size greater than
or equal to 30. It can also be used when the standard deviation of the population is known, but this case
is quite rare in practice.

Application Cookbook
1. Define problem and state the objective of the study.
2. Establish hypothesis - State the null hypothesis (H0) and the alternate hypothesis (Ha).
3. Establish alpha level (α). Usually α is 0.05.
4. Establish sample size (see tool Sample Size - Continuous Data – Z Test - One Sample).
5. Select a random sample.
6. Measure the CT characteristic.
7. Analyze data with Minitab:
• Use the function under Stat>Basic Statistics> 1-Sample z.
• Select the Test mean option, input the target value and the desired alternative hypothesis (>,
<, ≠: the default setting).
• Enter a sigma value. This value can be the sample standard deviation (s) or a known value of
the population standard deviation.
8. Make statistical decision from the session window output of Minitab. Either accept or reject H0.
9. Translate statistical conclusion to practical decision about the CT characteristic.
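For illustration, the test statistic can be reproduced from the summary statistics of the Anatomy example (a Python sketch; the slight difference from the session window P-value comes from rounding of the assumed sigma):

```python
import math

# Summary statistics from the Minitab example above
n = 35
xbar = 9.402   # sample mean
sigma = 1.82   # assumed sigma
mu0 = 10.0     # hypothesized mean (the target)

se_mean = sigma / math.sqrt(n)   # standard error of the mean, s/sqrt(n)
z = (xbar - mu0) / se_mean       # Z statistic

# Two-sided P-value from the standard normal distribution: 2 * P(Z > |z|)
p = math.erfc(abs(z) / math.sqrt(2))

print(f"SE Mean = {se_mean:.3f}")   # 0.308
print(f"Z = {z:.2f}")               # -1.94
print(f"P = {p:.3f}")               # 0.052
```

With α = 0.05 the decision rule gives P ≥ α, so H0 is not rejected: the sample does not provide enough evidence that the mean is off target.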
