
Define Phase

What is Six Sigma?

Understanding Six Sigma

Definitions

History

Strategy

Problem Solving

Roles & Responsibilities

Six Sigma Fundamentals

Selecting Projects

Elements of Waste

Wrap Up & Action Items

2
What is Six Sigma…as a Symbol?

Sigma (σ) is a letter of the Greek alphabet.


– Mathematicians use this symbol to signify Standard Deviation, an
important measure of variation.
– Variation designates the distribution or spread about the average of
any process.

Narrow Variation          Wide Variation

The variation in a process refers to how tightly all the various outcomes are clustered around the average.
No process will produce the EXACT same output each time.

3
What is Six Sigma…as a Value?

Sigma is a measure of deviation. The mathematical calculation for the
Standard Deviation of a population is:

σ = √( Σ(xᵢ − μ)² / N )

By definition, the Standard Deviation is the distance between the Mean
and the point of inflection on the normal curve.

– Sigma can be used interchangeably with the statistical term Standard Deviation.
– Standard Deviation is the average distance of data points away from the Mean in a
distribution.
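The population formula above can be sketched in a few lines of Python (an illustrative sketch, not part of the course material):

```python
import math

def population_std_dev(data):
    """Population standard deviation: the square root of the average
    squared distance of each data point from the mean."""
    mean = sum(data) / len(data)
    return math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))

print(population_std_dev([2, 4, 4, 4, 5, 5, 7, 9]))  # → 2.0
```

Here the mean is 5, the squared deviations sum to 32, and 32/8 = 4, so the Standard Deviation is 2.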

4
What is Six Sigma…as a Measure?

The probability of creating a defect can be estimated and translated


into a “Sigma” level.

*LSL – Lower Spec Limit
*USL – Upper Spec Limit

(Normal curve with the scale marked from -6 to +6 sigma)

The higher the sigma level, the better the performance. Six Sigma refers to a process having
6 Standard Deviations between the process average and the closest specification limit or
service level.
5
Measure

“Sigma Level” is:


– A statistic used to describe the performance of a process relative to the
specification limits
– The number of Standard Deviations from the Mean to the closest
specification limit of the process
(Figure: distributions showing 1 through 6 Standard Deviations fitting between the Mean and the USL)

The likelihood of a defect decreases as the number of Standard Deviations that can be fit
between the Mean and the nearest spec limit increases.
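The definition above reduces to simple arithmetic. A minimal sketch (the numbers are hypothetical, chosen only for illustration):

```python
def sigma_level(mean, std_dev, lsl, usl):
    """Number of Standard Deviations that fit between the Mean
    and the nearest specification limit."""
    return min(usl - mean, mean - lsl) / std_dev

# Hypothetical process centered at 100 with std dev 2, specs at 90 and 112:
print(sigma_level(100, 2, 90, 112))  # → 5.0
```

The nearest limit is the LSL (10 units away), so 5 Standard Deviations fit between the Mean and that limit.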
6
What is Six Sigma…as a Metric?

Each of these metrics serves a different purpose and may be used at different levels in the
organization to express the performance of a process in meeting the organization’s (or
customer’s) requirements. We will discuss each in detail as we go through the course.

• Defects
• Defects per unit (DPU)
• Parts per million (PPM)
• Defects per million opportunities (DPMO)
• Rolled Throughput Yield (RTY)
• First Time Yield (FTY)
• Sigma (σ)

These are certain metrics that we use in Six Sigma. You will learn more
about these through the course of your study.
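Two of these metrics, DPU and DPMO, are straightforward ratios. A short sketch with invented example numbers (the invoice counts are hypothetical):

```python
def dpu(defects, units):
    """Defects per unit: average number of defects per unit produced."""
    return defects / units

def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities: normalizes the defect count
    by how many chances each unit has to be defective."""
    return defects * 1_000_000 / (units * opportunities_per_unit)

# Hypothetical: 34 defects found across 1,000 invoices, 10 opportunities each.
print(dpu(34, 1000))       # → 0.034
print(dpmo(34, 1000, 10))  # → 3400.0
```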

7
What is Six Sigma…as a Benchmark?

Yield       DPMO      COPQ (% of sales)   Sigma
99.9997%    3.4       <10%                6      World Class
99.976%     233       10-15%              5
99.4%       6,210     15-20%              4      Industry Average
93%         66,807    20-30%              3
65%         308,537   30-40%              2      Non-Competitive
50%         500,000   >40%                1

(Each sigma level between benchmarks represents roughly a 10% gap in COPQ.)

Source: Journal for Quality and Participation, Strategy and Planning Analysis

What does 20 - 40% of Sales represent to your Organization?
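The sigma column in the benchmark table can be recovered from the DPMO column with the normal distribution and the conventional 1.5-sigma shift (a sketch, assuming the standard shifted-sigma convention; the shift itself is discussed later in the course):

```python
from statistics import NormalDist

def sigma_from_dpmo(dpmo):
    """Convert long-term DPMO to a short-term sigma level,
    assuming the conventional 1.5-sigma shift."""
    return NormalDist().inv_cdf(1 - dpmo / 1_000_000) + 1.5

print(round(sigma_from_dpmo(3.4), 1))     # → 6.0
print(round(sigma_from_dpmo(66_807), 1))  # → 3.0
```

This reproduces the table: 3.4 DPMO corresponds to Six Sigma, 66,807 DPMO to the 3-sigma industry average.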

8
What is Six Sigma…as a Method?

DMAIC provides the method for applying the Six Sigma philosophy
in order to improve processes.

 Define - the business opportunity

 Measure - the process current state

 Analyze - determine Root Cause or Y= f (x)

 Improve - eliminate waste and variation

 Control - evidence of sustained results

9
What is Six Sigma…as a Tool?

Six Sigma contains a broad set of tools, interwoven in a business


problem-solving methodology. Six Sigma tools are used to scope and
choose projects, design new products and processes, improve current
processes, decrease downtime and improve customer response time.

- Six Sigma has not created new tools, it has simply organized a variety
of existing tools to create flow.

Customer Value (Responsiveness, Cost, Quality, Delivery) = EBIT =
f(Management (Enabler), Product Design, Process Yield, Process Speed,
System Uptime, Functional Support)
10
What is Six Sigma…as a Goal?

Sweet Fruit (5+ Sigma): Design for Six Sigma

Bulk of Fruit (3 - 5 Sigma): Process Characterization and Optimization

Low Hanging Fruit (3 Sigma): Basic Tools of Problem Solving

Ground Fruit (1 - 2 Sigma): Simplify and Standardize

11
What is Six Sigma…as a Philosophy?

General Electric: First, what it is not. It is not a secret society, a slogan or a cliché. Six Sigma is a
highly disciplined process that helps us focus on developing and delivering near-perfect products and
services. The central idea behind Six Sigma is that if you can measure how many "defects" you have in
a process, you can systematically figure out how to eliminate them and get as close to "zero defects" as
possible. Six Sigma has changed the DNA of GE — it is now the way we work — in everything we do
and in every product we design.

Honeywell: Six Sigma refers to our overall strategy to improve growth and productivity as well as a
measurement of quality. As a strategy, Six Sigma is a way for us to achieve performance breakthroughs. It
applies to every function in our company, not just those on the factory floor. That means Marketing,
Finance, Product Development, Business Services, Engineering and all the other functions in our
businesses are included.

Lockheed Martin: We’ve just begun to scratch the surface with the cost-saving initiative called Six
Sigma and already we’ve generated $64 million in savings with just the first 40 projects. Six Sigma uses
data gathering and statistical analysis to pinpoint sources of error in the organization or products and
determines precise ways to reduce the error.

12
History of Six Sigma

• 1984 Bob Galvin of Motorola set the first objectives of Six Sigma
– 10x levels of improvement in service and quality by 1989
– 100x improvement by 1991
– Six Sigma capability by 1992
– Bill Smith, an engineer from Motorola, is the person credited as the father of Six Sigma
• 1984 Texas Instruments and ABB work closely with Motorola to further develop
Six Sigma
• 1994 Application experts leave Motorola
• 1995 AlliedSignal begins Six Sigma initiative as directed by Larry Bossidy
– Captured the interest of Wall Street
• 1995 General Electric, led by Jack Welch, began the most widespread
undertaking of Six Sigma ever attempted
• 1997 to present, Six Sigma spans industries worldwide

13
History of Six Sigma

• Simplistically, Six Sigma was a program that was generated around


targeting a process Mean (average) six Standard Deviations away from
the closest specification limit.

• By using the process Standard Deviation to determine the location of the


Mean the results could be predicted at 3.4 defects per million by the use
of statistics.

• There is an allowance for the process Mean to shift 1.5 Standard
Deviations. This number is an academic and somewhat controversial
issue not worth debating here. We will get into a discussion of this
number later in the course.

14
History of Six Sigma

• Six Sigma created a realistic and quantifiable goal in terms of its target of 3.4 defects
per million operations. It was also accompanied by a methodology to attain that
goal.
• That methodology was a problem solving strategy made up of four steps: measure,
analyze, improve and control.
• When GE launched Six Sigma they improved the methodology to include the Define
Phase.

MOTOROLA:         Measure – Analyze – Improve – Control
GENERAL ELECTRIC: Define – Measure – Analyze – Improve – Control

15
DMAIC Phases Roadmap
Champion/Process Owner
– Identify Problem Area
– Determine Appropriate Project Focus

Define
– Estimate COPQ
– Charter Project

Measure
– Assess Stability, Capability, and Measurement Systems
– Identify and Prioritize All X’s

Analyze
– Prove/Disprove Impact X’s Have On Problem

Improve
– Identify, Prioritize, Select Solutions to Control or Eliminate X’s Causing Problems
– Implement Solutions to Control or Eliminate X’s Causing Problems

Control
– Implement Control Plan to Ensure Problem Does Not Return
– Verify Financial Impact

16
Define Phase Deployment

Business Case Selected
↓
Notify Belts and Stakeholders
↓
Create High-Level Process Map
↓
Determine Appropriate Project Focus (Pareto, Project Desirability)
↓
Define & Charter Project (Problem Statement, Objective, Primary Metric, Secondary Metric)
↓
Estimate COPQ
↓
Project Focus Approved? — If No, Recommend Project Focus and revisit; if Yes:
↓
Create Team
↓
Charter Team
↓
Ready for Measure

17
Define Phase Deliverables

Listed below are the type of Define Phase deliverables that will be
reviewed by this course.

By the end of this course, you should understand what would be necessary
to provide these deliverables in a presentation.
– Charter Benefits Analysis
– Team Members (Team Meeting Attendance)
– Process Map – high level
– Primary Metric
– Secondary Metric(s)
– Lean Opportunities
– Stakeholder Analysis
– Project Plan
– Issues and Barriers

18
Six Sigma Strategy

Six Sigma places the emphasis on the Process

– Using a structured, data-driven approach centered on the customer, Six
Sigma can resolve business problems where they are rooted, for example:
• Month-end reports
• Capital expenditure approval
• New hire recruiting

Six Sigma is a Breakthrough Strategy


– Widened the scope of the definition of quality
• includes the value and the utility of the product/service to both the company
and the customer.

Success of Six Sigma depends on the extent of transformation


achieved in each of these levels.
19
Conventional Strategy

Conventional definitions of quality focused on conformance to standards.

LSL (Requirement) — Target — USL (Requirement)

   Bad    |    Good    |    Bad

Conventional strategy was to create a product or service that met certain


specifications.
• Assumed that if products and services were of good quality,
then their performance standards were correct.
• Rework was required to ensure final quality.
• Efforts were overlooked and unquantified (time, money, equipment usage, etc).

20
Problem Solving Strategy

The Problem Solving Methodology focuses on:

• Understanding the relationship between independent variables and the
dependent variable.
• Identifying the vital few independent variables that affect the dependent
variable.
• Optimizing the independent variables so as to control our dependent
variable(s).
• Monitoring the optimized independent variable(s).
There are many examples to describe dependent and independent
relationships.
• We describe this concept in terms of the equation:
• This equation is also commonly referred to as a transfer function

Y=f (Xi)
This simply states that Y is a function of the X’s. In
other words Y is dictated by the X’s.
21
Example

Y=f (Xi)
Which process variables (causes) have critical impact on the output
(effect)?

Crusher Yield = f(Feed, Speed, Material Type, Tool Wear, Lubricant)

Time to Close = f(Trial Balance, Correct Sub Accounts, Credit Memos Applied,
                  Entry Mistakes, Xn, …)

If we are so good at the X’s why are we constantly


testing and inspecting the Y?
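A transfer function like Time to Close can be expressed directly in code. This is a hypothetical sketch: the coefficients below are invented purely for illustration — in practice they would come from analyzing process data (e.g. regression or designed experiments):

```python
# Hypothetical transfer function Y = f(X): "Time to Close" in days
# as a function of three of its X's. Coefficients are invented.
def time_to_close(trial_balance_errors, incorrect_sub_accounts, credit_memos):
    base_days = 2.0  # assumed baseline with no defects
    return (base_days
            + 0.5 * trial_balance_errors
            + 0.3 * incorrect_sub_accounts
            + 0.2 * credit_memos)

print(round(time_to_close(4, 2, 5), 2))  # → 5.6
```

The point of the equation is leverage: once the X's are known and controlled, the Y is dictated by them, so there is no need to constantly inspect the Y.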
22
Y=f(X) Exercise

Exercise: Consider establishing a Y = f(X) equation for a


simple everyday activity such as producing a cup of
espresso. In this case our output, or Y, is espresso.

Espresso =f( X1 , X2 , X3 , X4 , Xn )

23
Six Sigma Strategy

We use a variety of Six Sigma tools to help separate the “vital few”
variables affecting our Y from the “trivial many.”

Some processes contain many, many variables. However, our Y is not
affected equally by all of them.

By focusing on the vital few we instantly gain leverage.

Archimedes said: “Give me a lever big enough and a fulcrum on which
to place it, and I shall move the world.”

24
Breakthrough Strategy

(Control chart over time: performance holds at the old standard between its
control limits, a Six Sigma breakthrough occurs, and performance settles at a
tighter new standard. Source: Juran’s Quality Handbook by Joseph Juran)

By utilizing the DMAIC problem solving methodology to identify and


optimize the vital few variables we will realize sustainable breakthrough
performance as opposed to incremental improvements or, even worse,
temporary and non-sustainable improvement.

25
VOC, VOB, VOE

The foundation of Six Sigma requires focus on the voices of the Customer
(VOC, customer driven), the Business (VOB, profit driven) and the Employee
(VOE, process driven), which provides:

– Awareness of the needs that are critical to the quality (CTQ) of our products and services
– Identification of the gaps between “what is” and “what should be”
– Identification of the process defects that contribute to the “gap”
– Knowledge of which processes are “most broken”
– Enlightenment as to the unacceptable Costs of Poor Quality (COPQ)

26
Six Sigma Roles and Responsibilities

There are many roles and responsibilities for successful


implementation of Six Sigma.

• Executive Leadership
• Champion/Process Owner
• Master Black Belt (MBB)
• Black Belt
• Green Belt
• Yellow Belt

Eventually there should be a big base of support internal


to the organization.
27
Executive Leadership

Not all Six Sigma deployments are driven from the top by Executive
Leadership. The data is clear, however, that those deployments that are
driven by executive management are much more successful than those that
aren’t.
• Makes decision to implement the Six Sigma initiative and develop
accountability method

• Sets meaningful goals and objectives for the corporation

• Sets performance expectations for the corporation

• Ensures continuous improvement in the process

• Eliminates barriers

28
Champion/Process Owner

Champions identify and select the most meaningful projects to work on,
they provide guidance to the Six Sigma belt and open the doors for the
belts to apply the process improvement technologies.

• Own project selection, execution control, implementation and realization of
gains

• Obtain needed project resources and eliminate roadblocks

• Participate in all project reviews

• Ask good questions…

• One to three hours per week commitment

29
Master Black Belt

MBBs should be well versed in all aspects of Six Sigma, from technical
applications to Project Management. MBBs need to have the ability to
influence change and motivate others.

• Provide advice and counsel to Executive Staff

• Provide training and support


– In class training
– On site mentoring

• Develop sustainability for the business

• Facilitate cultural change

30
Black Belt

Black Belts are application experts and work projects within the business.
They should be well versed with The Six Sigma Technologies and have the
ability to drive results.

• Project team leader

• Facilitates DMAIC teams in applying Six Sigma methods to solve


problems

• Works cross-functionally

• Contributes to the accomplishment of organizational goals

• Provides technical support to improvement efforts

31
Green Belt

Green Belts are practitioners of Six Sigma Methodology and typically


work within their functional areas or support larger Black Belt Projects.

• Well versed in the definition & measurement of critical processes


– Creating Process Control Systems

• Typically works projects in their existing functional area

• Involved in identifying improvement opportunities

• Involved in continuous improvement efforts


– Applying basic tools and PDCA

• Team members on DMAIC teams


– Supporting projects with process knowledge & data collection

32
Yellow Belt

• Provide support to Black Belts and Green Belts as needed

• May be team members on DMAIC teams


– Supporting projects with process knowledge and data collection

33
The Life of a Six Sigma Belt

Training as a Six Sigma Belt can be one of the most rewarding


undertakings of your career and one of the most difficult.
• You can expect to:
– Work hard (becoming a Six Sigma Belt is not easy)
– Spend long hours in training
– Be a change agent for your organization
– Work effectively as a team leader
– Prepare and present reports on progress
– Receive mentoring from your Master Black Belt
– Perform mentoring for your team members
– ACHIEVE RESULTS!

You’re going places!


34
Black & Green Belt Certification

To achieve certification, Belts must:

• Complete all course work:


– Be familiar with tools and their application
– Practice using tools in theoretical situations
– Discuss how tools will apply to actual projects

• Demonstrate application of learning to training project:


– Use the tools to effect a financially measurable and significant business impact
through their projects
– Show ability to use tools beyond the training environment

• Must complete two projects within one year from beginning of training

• Achieve results and make a difference

• Submit a final report which documents tool understanding and application as


well as process changes and financial impact for each project

35
Organizational Behaviors

All players in the Six Sigma process must be willing to step up and act
according to the Six Sigma set of behaviors.

– Leadership by example: “walk the talk”

– Encourage and reward individual initiative

– Align incentive systems to support desired behaviors

– Eliminate functional barriers

– Embrace “systems” thinking

– Balance standardization with flexibility

36
Summary

At this point, you should be able to:

• Describe the objectives of Six Sigma

• Describe the relationship between variation and sigma

• Recognize some Six Sigma concepts

• Recognize the Six Sigma implementation model

• Describe the general roles and responsibilities in Six Sigma

37
Six Sigma Fundamentals

Understanding Six Sigma

Six Sigma Fundamentals

Process Maps

Voice of the Customer

Cost of Poor Quality

Process Metrics

Selecting Projects

Elements of Waste

Wrap Up & Action Items

38
What is a Process?

Why have a process focus?


– So we can understand how and why work gets done
– To characterize customer & supplier relationships
– To manage for maximum customer satisfaction while utilizing minimum
resources
– To see the process from start to finish as it is currently being performed
– Blame the process, not the people

proc•ess (pros′es) n. – A repetitive and systematic series of


steps or activities where inputs are modified to achieve a
value-added output

39
Examples of Processes

We go through processes everyday. Below are some examples of those processes.


Can you think of other processes within your daily environment?

• Injection molding • Recruiting staff


• Decanting solutions • Processing invoices
• Filling vial/bottles • Conducting research
• Crushing ore • Opening accounts
• Refining oil • Reconciling accounts
• Turning screws • Filling out a timesheet
• Building custom homes • Distributing mail
• Paving roads • Backing up files
• Changing a tire • Issuing purchase orders

40
Process Maps

• The purpose of Process Maps is to:


– Identify the complexity of the process
– Communicate the focus of problem solving

• Process Maps are living documents and must be changed as the process is
changed
– They represent what is currently happening, not what you think is happening.
– They should be created by the people who are closest to the process

Process Map

Start → Step A → Step B → Step C → Step D → Inspect → Finish

41
Process Map Symbols

Standard symbols for Process Mapping (available in Microsoft Office™,


Visio™, iGrafx™ , SigmaFlow™ and other products):

A RECTANGLE indicates an activity. Statements within the rectangle should
begin with a verb.

A PARALLELOGRAM shows that there are data.

A DIAMOND signifies a decision point. Only two paths emerge from a decision
point: No and Yes.

An ELLIPSE shows the start and end of the process.

An ARROW shows the connection and direction of flow.

A CIRCLE WITH A LETTER OR NUMBER INSIDE symbolizes the continuation of a
flowchart to another page.

42
High Level Process Map

One of the deliverables from the Define Phase is a high level Process Map, which
at a minimum must include:
– Start and stop points
– All process steps
– All decision points
– Directional flow
– Value categories as defined below
• Value Added:
– Physically transforms the “thing” going through the process
– Must be done right the first time
– Meaningful from the customer’s perspective (is the customer willing to pay for it?)
• Value Enabling:
– Satisfies requirements of non-paying external stakeholders (government regulations)
• Non-Value Added
– Everything else

43
Process Map Example

A Process Map for a Call Center –

(Flowchart detail omitted: the map traces logging on to the PC, phone and case
tool applications; checking for scheduled phone time; handling incoming calls
and walk-ins; determining who is inquiring and the nature of the call, and
confirming understanding; creating or updating a case record with case type,
date/time and needed-by information; querying internal HRSC SMEs; providing an
immediate response, routing the case, or putting the caller on hold and
arranging a call back; doing research; and closing the case with a close
date/time.)
44
Cross Functional Process Map

When multiple departments or functional groups are involved in a complex process it is often
useful to use cross functional Process Maps.
– Draw in either vertical or horizontal Swim Lanes and label the functional groups and draw the
Process Map

Sending Fund Transfers (ACH – Automated Clearing House)

(Swim-lane flowchart detail omitted: the requesting Department starts the
process, fills out an ACH enrollment form and attaches the transfer form to the
invoice; the Vendor produces an invoice and receives payment; Financial
Accounting maintains the vendor database, checks for vendor info in FRS, inputs
the information into the bank’s web interface and matches the batch against
daily cash transfers to balance the ACH batch; the Bank accepts the
transactions, transfers the money and provides batch totals; General Accounting
reviews the transfer in FRS, processes the journal entry and performs the bank
reconciliation.)

45
Process Map Exercise

Exercise objective: Using your favorite Process Mapping tool


create a Process Map of your project or functional area.

1. Create a high level Process Map, use enough detail to make


it useful.
• It is helpful to use rectangular post-its for process steps and
square ones turned diagonally for decision points.
2. Color code the value added (green) and non-value added
(red) steps.
3. Be prepared to discuss this with your mentor.

46
Do you know your Customer?

Knowing your customer is more than just a handshake. It is necessary to
clearly understand their needs. In Six Sigma we call this “understanding the
CTQs,” or critical to customer characteristics.

Voice Of the Customer → Critical to Customer Characteristics

47
Voice of the Customer

Voice of the Customer or VOC seems obvious; after all, we all know what the
customer wants. Or do we??

The customer’s perspective has to be foremost in the mind of the Six Sigma Belt
throughout the project cycle.
1. Features
• Does the process provide what the customers expect and need?
• How do you know?
2. Integrity
• Is the relationship with the customer centered on trust?
• How do you know?
3. Delivery
• Does the process meet the customer’s time frame?
• How do you know?
4. Expense
• Does the customer perceive value for cost?
• How do you know?

48
What is a Customer?

There are different types of customers, which dictate how we interact with them in
the process. In order to identify customer and supplier requirements we must first
define who the customers are:

External
– Direct: those who receive the output of your services, they generally are the
source of your revenue
– Indirect: those who do not receive or pay for the output of your services but
have a vested interest in what you do (government agencies)

Internal
- those within your organization
who receive the output of your
work

49
Value Chain

The relationship from one process to the next in an organization creates a “Value Chain” of
suppliers and receivers of process outputs.
Each process has a contribution and accountability to the next to satisfy the external customer.
External customers needs and requirements are best met when all process owners work
cooperatively in the Value Chain.

Careful –
each move
has many
impacts!

50
What is a CTQ?

• Critical to Quality characteristics (CTQs) are measures that we use to capture VOC
properly (also referred to in some literature as CTCs – Critical to Customer).
• CTQs can be vague and difficult to define.
– The customer may identify a requirement that is difficult to measure directly so it will
be necessary to break down what is meant by the customer into identifiable and
measurable terms

Product: Service:
• Performance • Competence
• Features • Reliability
• Conformance • Accuracy
• Timeliness • Timeliness
• Reliability • Responsiveness
• Serviceability • Access
• Durability • Courtesy
• Aesthetics • Communication
• Reputation • Credibility
• Completeness • Security
• Understanding

51
Developing CTQ’s

Step 1: Identify Customers
• Listing
• Segmentation
• Prioritization

Step 2: Capture VOC
• Review existing performance
• Determine gaps in what you need to know
• Select tools that provide data on gaps
• Collect data on the gaps

Step 3: Validate CTQs
• Translate VOC to CTQs
• Prioritize the CTQs
• Set Specified Requirements
• Confirm CTQs with customer
52
Cost of Poor Quality (COPQ)

• COPQ stands for Cost of Poor Quality

• As a Six Sigma Belt, one of your tasks will be to estimate COPQ for your process

• Through your process exploration and project definition work you will develop a
refined estimate of the COPQ in your project

• This project COPQ represents the financial opportunity of your team’s


improvement effort (VOB)

• Calculating COPQ is iterative and will change as you learn more about the
process

No, not that kind of


cop queue!
53
The Essence of COPQ

• COPQ helps us understand the financial impact of problems created by defects.

• COPQ is a symptom, not a defect


– Projects fix defects with the intent of improving symptoms.

• The concepts of traditional Quality Cost are the foundation for COPQ.
– External, Internal, Prevention, Appraisal

• A significant portion of COPQ from any defect comes from effects that are
difficult to quantify and must be estimated.

54
COPQ - Categories

Internal COPQ
• Quality Control Department
• Inspection
• Quarantined Inventory
• Etc.

External COPQ
• Warranty
• Customer Complaint Related Travel
• Customer Charge Back Costs
• Etc.

Prevention
• Error Proofing Devices
• Supplier Certification
• Design for Six Sigma
• Etc.

Detection
• Supplier Audits
• Sorting Incoming Parts
• Repaired Material
• Etc.

55
COPQ - Iceberg

Visible Costs
• Inspection
• Warranty
• Recode
• Rework
• Rejects

Hidden Costs (less obvious)
• Engineering change orders
• Lost sales
• Time value of money
• Late delivery
• Expediting costs
• More set-ups
• Excess inventory
• Working capital allocations
• Long cycle times
• Excessive material orders/planning
• Lost customer loyalty

56
COPQ and Lean

Waste does not add, subtract or otherwise modify the throughput in a way that is
perceived by the customer to add value.

• In some cases, waste may be necessary, but should be recognized and explored:
– Inspection, Correction, Waiting in suspense
– Decision diamonds, by definition, are non-value added

• Often, waste can provide opportunities for additional defects to occur.

• We will discuss Lean in more detail later in the course.

Lean Enterprise – Seven Elements of Waste*
 Correction
 Processing
 Conveyance
 Motion
 Waiting
 Overproduction
 Inventory

*Womack, J. P., & Jones, D. T. (1996). Lean Thinking. New York, NY: Simon & Schuster

57
COPQ – Hard and Soft Savings

While hard savings are always more desirable because they are easier to
quantify, it is also necessary to think about soft savings.

COPQ – Hard Savings COPQ – Soft Savings

• Labor Savings • Gaining Lost Sales


• Cycle Time Improvements • Missed Opportunities
• Scrap Reductions • Customer Loyalty
• Hidden Factory Costs • Strategic Savings
• Inventory Carrying Cost • Preventing Regulatory Fines

58
COPQ Exercise

Exercise objective: Identify current COPQ opportunities


in your direct area.

1. Brainstorm a list of COPQ opportunities.

2. Categorize the top 3 sources of COPQ for the four


classifications:
• Internal
• External
• Prevention
• Detection

59
The Basic Six Sigma Metrics

In any process improvement endeavor, the ultimate objective is to make the


process:

• Better: DPU, DPMO, RTY (there are others, but they derive from these basic three)
• Faster: Cycle Time
• Cheaper: COPQ

If you make the process better by eliminating defects, you will make it faster.
If you choose to make the process faster, you will have to eliminate defects to
be as fast as you can be.
If you make the process better or faster, you will necessarily make it cheaper.

The metrics for all Six Sigma projects fall into one of these three categories.

60
Cycle Time Defined

Think of Cycle Time in terms of your product or transaction in the eyes of


the customer of the process:

– It is the time required for the product or transaction to go through the entire process,
from beginning to end

– It is not simply the “touch time” of the value-added portion of the process

What is the cycle time of the process you mapped?


Is there any variation in the cycle time? Why?

61
Defects Per Unit (DPU)

Six Sigma methods quantify individual defects and not just defectives
– Defects account for all errors on a unit
• A unit may have multiple defects
• An incorrect invoice may have the wrong amount due and the wrong due date
– Defectives simply classify the unit as bad
• Doesn’t matter how many defects there are
• The invoice is wrong, causes are unknown
– A unit:
• Is the measure of volume of output from your area.
• Is observable and countable. It has a discrete start and stop point.
• It is an individual measurement and not an average of measurements.

Two Defects One Defective
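DPU is a plain ratio of defects to units. A minimal sketch (the invoice counts are invented for illustration):

```python
def dpu(total_defects, total_units):
    """Defects per unit: a unit can carry more than one defect,
    so DPU can exceed 1."""
    return total_defects / total_units

# Hypothetical: 150 defects found across 100 invoices
# (e.g. some invoices have both a wrong amount due and a wrong due date).
print(dpu(150, 100))  # → 1.5
```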

62
First Time Yield

FTY is the traditional quality metric for yield


– Unfortunately, it does not account for any necessary rework

FTY = Total Units Passed / Total Units Tested

Process A (Grips):      Units in = 100, Units out = 100, Defects repaired = 40
Process B (Shafts):     Units in = 100, Units out = 100, Defects repaired = 30
Process C (Club Heads): Units in = 100, Units out = 100, Defects repaired = 20
Final Product (Set of Irons): Units Passed = 50, Units Tested = 50

FTY = 100%

*None of the data used herein is associated with the products shown; the pictures are
only illustrations to teach the concept.
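In code the blindness of FTY is easy to see — the rework never appears in the formula (a sketch using the example's numbers):

```python
def first_time_yield(units_passed, units_tested):
    """Traditional yield metric; blind to rework done inside the process."""
    return units_passed / units_tested

# All 50 sets pass the final test, so FTY reports 100% --
# even though 90 defects were repaired along the way.
print(first_time_yield(50, 50))  # → 1.0
```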

63
Rolled Throughput Yield

RTY is a more appropriate metric for problem solving


– It accounts for losses due to rework steps

RTY = X1 × X2 × X3

Process A (Grips):      Units in = 100, Units w/o rework = 60, yield = 0.6, Defects repaired = 40
Process B (Shafts):     Units in = 100, Units w/o rework = 70, yield = 0.7, Defects repaired = 30
Process C (Club Heads): Units in = 100, Units w/o rework = 80, yield = 0.8, Defects repaired = 20
Final Product (Set of Irons): Units Passed = 34, Units Tested = 100

RTY = 0.6 × 0.7 × 0.8 = 33.6%

*None of the data used herein is associated with the products shown; the pictures are
only illustrations to teach the concept.
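RTY multiplies the first-pass yields of every step, which is why it is so much less forgiving than FTY (a sketch using the example's step yields):

```python
from functools import reduce

def rolled_throughput_yield(step_yields):
    """RTY: the product of each step's first-pass (no-rework) yield."""
    return reduce(lambda acc, y: acc * y, step_yields, 1.0)

# Steps A, B, C pass 60, 70 and 80 of 100 units without rework:
print(round(rolled_throughput_yield([0.6, 0.7, 0.8]), 3))  # → 0.336
```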

64
RTY Estimate

• In many organizations the long-term data required to calculate RTY is not
available; we can, however, estimate RTY using a known DPU as long as
certain conditions are met.
• The Poisson distribution generally holds true for the random
distribution of defects in a unit of product and is the basis for the
estimation.
– The best estimate of the proportion of units containing no
defects, or RTY is:

RTY = e-dpu

The mathematical constant e is the base of the natural logarithm.


e ≈ 2.71828 18284 59045 23536 02874 7135
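The estimate is one line of code; a minimal sketch:

```python
import math

def rty_from_dpu(dpu):
    """Poisson estimate of rolled throughput yield from defects per unit."""
    return math.exp(-dpu)

# A process averaging 1 defect per unit is expected to produce
# about 36.8% defect-free units.
print(f"{rty_from_dpu(1.0):.3f}")  # 0.368
```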

65
Deriving RTY from DPU

The Binomial distribution is the true model for defect data; the Poisson is a convenient
approximation that does a good job of predicting when defect rates are low.

[Chart: Poisson vs Binomial (r=0, n=1) — Yield (RTY) plotted against the probability of a defect]

Probability    Yield        Yield       % Over
of a defect    (Binomial)   (Poisson)   Estimated
0.0            100%         100%         0%
0.1             90%          90%         0%
0.2             80%          82%         2%
0.3             70%          74%         4%
0.4             60%          67%         7%
0.5             50%          61%        11%
0.6             40%          55%        15%
0.7             30%          50%        20%
0.8             20%          45%        25%
0.9             10%          41%        31%
1.0              0%          37%        37%

Binomial: P(r) = [n! / (r!(n-r)!)] p^r q^(n-r)
– n = number of units
– r = number of predicted defects
– p = probability of a defect occurrence
– q = 1 - p

Poisson: P(r) = (dpu)^r e^(-dpu) / r!

For low defect rates (p < 0.1), the Poisson approximates the Binomial fairly well.
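The table's rows can be reproduced with a short sketch comparing the two models for the single-opportunity case (n = 1, r = 0):

```python
import math

def yield_binomial(p):
    """Exact P(zero defects) with one opportunity per unit (n=1, r=0): 1 - p."""
    return 1.0 - p

def yield_poisson(p):
    """Poisson approximation with dpu = p: P(r=0) = e^-dpu."""
    return math.exp(-p)

for p in (0.0, 0.1, 0.5, 1.0):
    b, po = yield_binomial(p), yield_poisson(p)
    print(f"p={p:.1f}  binomial={b:5.0%}  poisson={po:5.0%}  overestimate={po - b:4.0%}")
```

At p = 0.5 the Poisson overestimates yield by about 11 points, matching the table.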

66
Deriving RTY from DPU - Modeling

[Figure: a unit with multiple defect opportunities]

Basic Question: What is the likelihood of producing a unit with zero defects?

• For the unit shown above the following data was gathered:
– 60 defects observed
– 60 units processed
• What is the DPU? (60 defects / 60 units = 1)
• What is the probability that any given opportunity will be a defect?
• What is the probability that any given opportunity will NOT be a defect?
• The probability that all 10 opportunities on a single unit will be defect-free is:

Opportunities   P(defect)   P(no defect)   RTY (Prob defect-free unit)
10              0.1         0.9            0.34867844
100             0.01        0.99           0.366032341
1000            0.001       0.999          0.367695425
10000           0.0001      0.9999         0.367861046
100000          0.00001     0.99999        0.367877602
1000000         0.000001    0.999999       0.367879257

[Chart: RTY for DPU = 1 — yield approaches 0.368 as the chances per unit grow from 10 to 1,000,000]

If we extend the concept to an infinite number of opportunities, all at a
DPU of 1.0, we will approach the value of 0.368.
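The convergence in the table can be reproduced with a short sketch; the loop evaluates (1 - dpu/n)^n for a growing number of opportunities n at DPU = 1:

```python
import math

dpu = 1.0  # 60 defects observed over 60 units processed
for n in (10, 100, 1000, 10000, 100000, 1000000):
    p_defect = dpu / n            # chance any single opportunity is a defect
    rty = (1 - p_defect) ** n     # probability all n opportunities are defect-free
    print(f"{n:>7} opportunities: RTY = {rty:.9f}")

print(f"  limit (e^-dpu):   RTY = {math.exp(-dpu):.9f}")
```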
67
RTY Prediction — Poisson Model

• Use the Poisson distribution to approximate the Binomial probability of a
discrete event (good/bad) when sampling from a relatively large population
(n > 16 and p < 0.1).
• When r = 0, we compute the probability of finding zero defects per unit
(called "rolled throughput yield").
• The table below shows the proportion of product which will have:
– 0 defects (r = 0)
– 1 defect (r = 1)
– 2 defects (r = 2), etc.

p[r] = (dpu)^r e^(-dpu) / r!

When DPU = 1:
r    p[r]
0    0.3679
1    0.3679
2    0.1839
3    0.0613
4    0.0153
5    0.0031
6    0.0005
7    0.0001
8    0.0000

• When, on average, we have a process with 1 defect per unit, there is a
36.79% chance of finding a unit with zero defects. There is only a 1.53%
chance of finding a unit with 4 defects.
• When r = 0, this equation simplifies to p[0] = e^(-dpu).
• To predict the % of units with zero defects (i.e., RTY):
– count the number of defects found
– count the number of units produced
– compute the dpu and enter it in the equation: RTY = e^(-dpu)
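The p[r] column can be computed straight from the Poisson formula; a minimal sketch:

```python
import math

def p_defects(r, dpu=1.0):
    """Poisson probability of exactly r defects on a unit: dpu^r * e^-dpu / r!"""
    return dpu ** r * math.exp(-dpu) / math.factorial(r)

# Reproduce the "When DPU = 1" table.
for r in range(5):
    print(f"p[{r}] = {p_defects(r):.4f}")
```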

68
Six Sigma Metrics – Calculating DPU

The DPU for a given operation can be calculated by dividing the number of defects found in
the operation by the number of units entering the operational step.

100 parts built


2 defects identified and corrected
dpu = 0.02
So the RTY for this step would be e^(-0.02) = 0.980199, or 98.02%.

RTY1=0.98 RTY2=0.98 RTY3=0.98 RTY4=0.98 RTY5=0.98 RTYTOT=0.904


dpu = .02 dpu = .02 dpu = .02 dpu = .02 dpu = .02 dpuTOT = .1
If the process had 5 process steps, each with the same yield, the process
RTY would be: 0.98 * 0.98 * 0.98 * 0.98 * 0.98 = 0.903921 or 90.39%. Since our metric of primary
concern is the COPQ of this process, we can say that nearly 10% of the time we will be spending
dollars in excess of the pre-determined standard or value-added amount to which this process is entitled.

Note: RTY’s must be multiplied across a process, DPU’s are


added across a process.
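The note above can be checked numerically; a short sketch with five steps at dpu = 0.02 each, using e^-dpu for each step's yield:

```python
import math

step_dpus = [0.02] * 5          # five steps, each with dpu = 0.02

# RTYs multiply across a process...
rty_total = 1.0
for dpu in step_dpus:
    rty_total *= math.exp(-dpu)

# ...while DPUs add across a process.
dpu_total = sum(step_dpus)

print(f"RTY          = {rty_total:.4f}")           # 0.9048
print(f"e^-dpu_total = {math.exp(-dpu_total):.4f}")  # 0.9048 (identical)
```

The two prints agree because multiplying exponentials is the same as adding their exponents, which is exactly why DPUs add while RTYs multiply.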
69
Focusing our Effort – FTY vs. RTY

Assume we are creating two products in our organization that use


similar processes.

Product A
FTY = 80%
Product B
FTY = 80%

How do you know what to work on?


70
Focusing our Effort – FTY vs. RTY

Let’s look at the DPU of each product assuming equal opportunities and margin…

Product A: dpu = 100 / 100 = 1
Product B: dpu = 200 / 100 = 2

Now, can you tell which to work on?

“the product with the highest DPU?” …think again!

How much more time and/or raw material are required?


How much extra floor space do we need?
How much extra staff or hours are required to perform the rework?
How many extra shipments are we paying for from our suppliers?
How much testing have we built in to capture our defects?


71
Summary

At this point, you should be able to:

• Describe what is meant by “Process Focus”

• Generate a Process Map

• Describe the importance of VOC, VOB and VOE, and CTQ’s

• Explain COPQ

• Describe the Basic Six Sigma metrics

• Explain the difference between FTY and RTY

• Explain how to calculate “Defects per Unit” DPU

72
Project Selection

Understanding Six Sigma

Six Sigma Fundamentals

Selecting Projects

Selecting Projects

Refining & Defining

Financial Evaluation

Elements of Waste

Wrap Up & Action Items

73
Approaches to Project Selection

There are three basic approaches to Project Selection…

"Blatantly Obvious"
Things that clearly occur on a repetitive basis and present problems in delivering our service(s) or product(s).

"Brainstorming Approach"
Identifies projects based on individuals' "experience" and "tribal knowledge" of areas that may be creating problems in delivering our service(s) / product(s), and hopefully ties to bottom-line business impact.

"Structured Approach"
Identifies projects based on organizational data and provides a direct plan to affect core business metrics that have bottom-line impact.

All three ways work…the Structured Approach is the most desirable.


74
Project Selection – Core Components

Business Case – The Business Case is a high-level articulation of the area of
concern. This case answers two primary questions: one, what is the business
motivation for considering the project, and two, what is our general area of
focus for the improvement effort?

Project Charter – The Project Charter is a more detailed version of the Business
Case. This document further focuses the improvement effort. It can be
characterized by two primary sections, one, basic project information and, two,
simple project performance metrics.

Benefits Analysis – The Benefits Analysis is a comprehensive financial evaluation


of the project. This analysis is concerned with the detail of the benefits in
regard to cost & revenue impact that we are expecting to realize as a result of
the project.

75
Project Selection - Governance

                   Responsible Party          Resources                            Frequency of Update
Business Case      Champion (Process Owner)   Business Unit Members                N/A
Project Charter    Champion (Process Owner)   Six Sigma Belt & Master Black Belt   Ongoing
Benefits Analysis  Champion (Process Owner)   Benefits Capture Manager             Ongoing / D,M,A,I,C

76
A Structured Approach – A Starting Point

The Starting Point is defined by the Champion or Process Owner and the Business Case is the
output.
– These are some examples of business metrics or Key Performance Indicators commonly referred
to as KPI’s.
– The tree diagram is used to facilitate the process of breaking down the metric of interest.

Examples of KPI's:
 EBIT
 Cycle time
 Defects
 Cost
 Revenue
 Complaints
 Compliance
 Safety

[Tree diagram: the Level 1 metric of interest branches into several Level 2 metrics.]
77
A Structured Approach - Snapshot

The KPI's need to be broken down into actionable levels.

Business Measures
Actionable Level
Key Performance Indicators (KPIs)

Level 2 Level 3 Activities Processes


Level 1

Level 2 Level 4 Activities Processes

78
Business Case Components – Level 1

Primary Business Measure or Key Performance Indicator (KPI)

Level 2 Level 3 Activities Processes


Level 1
Level 2 Level 4 Activities Processes

– Focus on one primary business measure or KPI.

– The primary business measure should bear a direct line of sight to the organization's
strategic objective.

– As the Champion narrows in on the greatest opportunity for improvement, this provides a
clear focus for how success will be measured.

79
Business Case Components – Business Measures

Post business measures (product/service) of the primary
business measure are lower-level metrics and must focus on the
end product to avoid internal optimization at the expense of total
optimization.

Business Business
Activities Processes
Primary Business Measure Measure

Measure Business Business


Activities Processes
Measure Measure

80
Business Case Components - Activities

Business Business
Activities Processes
Primary Business Measure Measure

Measure Business Business


Activities Processes
Measure Measure

Y = f (x1, x2, x3…xn )


1st Call Resolution = f (Calls, Operators, Resolutions…xn )

Black Box Testing = f (Specifications, Simulation, Engineering…xn)

81
Business Case Components - Processes

Business Business
Activities Processes
Primary Business Measure Measure

Measure Business Business


Activities Processes
Measure Measure

Y = f (x1, x2, x3…xn )


Resolutions = f (New Customers, Existing Customers, Defective Products…xn )

Simulation = f (Design, Data, modeling…xn )

82
What is a Business Case?

The Business Case communicates the need for the project in terms of
meeting business objectives.

The components are:


– Output unit (product/service) for external customer
– Primary business measure of output unit for project
– Baseline performance of primary business measure
– Gap in baseline performance of primary business measure from business objective

Let’s get down


to business!

83
Business Case Example

During FY 2005, the 1st Time Call Resolution
Efficiency for New Customer Hardware Setup was 89%.

This represents a gap of 4% from the industry standard
of 93% and amounts to US $2,000,000 of annualized
cost impact.

84
The Business Case Template

Fill in the Blanks for Your Project:

During ___________________________________ , the ____________________ for


(Period of time for baseline performance) (Primary business measure)

________________________ was _________________ .


(A key business process) (Baseline performance)

This gap of ____________________________


(Business objective target vs. baseline)

from ___________________ represents ____________________ of cost impact.


(Business objective) (Cost impact of gap)

85
Business Case Exercise

Exercise objective: To understand how to create a “strong” Business Case.

1. Complete the Business Case template below to the best of your ability.

During ________________________ , the ____________________ for


(Period of time for baseline performance) (Primary business measure)

_______________________ was ___________________ .


(A key business process) (Baseline performance)

This gap of __________________________


(Business objective target vs. baseline)

from __________________ represents ____________ of cost impact.


(Business objective) (Cost impact of gap)

86
What is a Project Charter?

The Project Charter expands on the Business Case; it clarifies the
project's focus and measures of project performance, and is completed
by the Six Sigma Belt.
Components: • The Problem
• Project Scope
• Project Metrics
• Primary & Secondary
• Graphical Display of Project Metrics
• Primary & Secondary
• Standard project information
• Project, Belt & Process Owner names
• Start date & desired End date
• Division or Business Unit
• Supporting Master Black Belt (Mentor)
• Team Members

87
Project Charter - Definitions

• Problem Statement - Articulates the pain of the defect or error in the process.

• Objective Statement – States how much of an improvement is desired from the project.

• Scope – Articulates the boundaries of the project.

• Primary Metric – The actual measure of the defect or error in the process.

• Secondary Metric(s) – Measures of potential consequences (+ / -) as a result of changes in


the process.

• Charts – Graphical displays of the Primary and Secondary Metrics over a period of time.

88
Project Charter - Problem Statement

Migrate the Business Case into a Problem Statement…

89
Project Charter – Objective & Scope

Consider the following for constructing your Objective & Scope

What represents a significant improvement?


• X amount of an increase in yield
• X amount of defect reduction
• Use Framing Tools to establish the initial scope
90
Pareto Analysis

Pareto Analysis:
• A bar graph used to arrange information in such a way that priorities for process
improvement can be established.
[Pareto Chart of Help Text Errors — bars in descending order with a cumulative-percent line]

Count:    145   142   34    29    8
Percent:  40.5  39.7  9.5   8.1   2.2
Cum %:    40.5  80.2  89.7  97.8  100.0

• The 80-20 theory was first developed in 1906 by the Italian economist Vilfredo Pareto, who
observed an unequal distribution of wealth and power in a relatively small proportion of
the total population. Joseph M. Juran is credited with adapting Pareto's economic
observations to business applications.
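The counts and cumulative percentages behind a Pareto Chart are easy to compute; a sketch using the Help Text Errors counts from this page (the category names are hypothetical, since the chart labels them only as column C2):

```python
# Counts per error category; names are placeholders for the C2 labels.
counts = {"Cat A": 145, "Cat B": 142, "Cat C": 34, "Cat D": 29, "Cat E": 8}

total = sum(counts.values())
cum = 0.0
# Pareto ordering: largest count first, with a running cumulative percent.
for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    pct = 100.0 * n / total
    cum += pct
    print(f"{name:8} {n:4}  {pct:5.1f}%  cum {cum:5.1f}%")
```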

91
The 80:20 Rule Examples

• 20% of the time expended produced 80% of the results

• 80% of your phone calls go to 20% of the names on your list

• 20% of the streets handle 80% of the traffic

• 80% of the meals in a restaurant come from 20% of the menu

• 20% of the paper has 80% of the news

• 80% of the news is in the first 20% of the article

• 20% of the people cause 80% of the problems

• 20% of the features of an application are used 80% of the time

92
Pareto Chart - Tool

Multi-level Pareto Charts are used in a drill-down fashion to get to the Root
Cause of the tallest bar.

Level 1 – Pareto Chart of Scrap:
Category   A        B       C
Scrap      250000   30000   25000
Percent    82.0     9.8     8.2
Cum %      82.0     91.8    100.0

Level 2 – Pareto Chart by Department (drilling into the tallest bar):
Department J       M       F       W       Other
Count      95000   23000   19000   17500   5000
Percent    59.6    14.4    11.9    11.0    3.1
Cum %      59.6    74.0    85.9    96.9    100.0

Level 3 – Pareto Chart by Part Number (drilling into Department J):
Part       Z101    Z876    X492
Percent    78.9    15.8    5.3
Cum %      78.9    94.7    100.0
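The drill-down is just repeated group-and-sort; a sketch over a small set of hypothetical scrap records (the department and part labels echo this example, but the record list itself is invented):

```python
from collections import Counter

# Hypothetical scrap records tagged with (department, part number).
records = [("J", "Z101"), ("J", "Z101"), ("J", "Z876"), ("M", "Z330"), ("J", "Z101")]

# One level: scrap by department; identify the tallest bar.
by_dept = Counter(dept for dept, _ in records)
worst_dept, _ = by_dept.most_common(1)[0]

# Next level: only that department's records, grouped by part number.
by_part = Counter(part for dept, part in records if dept == worst_dept)
print(worst_dept, by_part.most_common())
```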

93
Pareto Chart - Tool

[The same three-level drill-down: Pareto Chart of Scrap (Level 1), by Department (Level 2), and by Part Number (Level 3)]

Interpretation:

• Level 2:
– Department J makes up 60% of the Scrap

• Level 3:
– Part Z101 makes up 80% of Department J's Scrap

94
Pareto Chart - Example

Use the “Call Center.mtw” worksheet to create a Pareto Chart

95
Pareto Chart - Example

What would you do with this Pareto?

[Pareto Chart of FAILURE MODE]

Count:    495   489   478   472   468   455
Percent:  17.3  17.1  16.7  16.5  16.4  15.9
Cum %:    17.3  34.4  51.2  67.7  84.1  100.0

When your Pareto shows up like this, your focus is probably too broad.
96
Pareto Chart - Example

Let’s look at the problem a little differently…

– Using a higher level scope for the first Pareto may help in providing
focus.
– Create another Pareto as shown above.
97
Pareto Chart - Example

This gives a better picture of which product category produces the highest defect
count.

[Pareto Chart of PRODUCT CATAGORIES]

Count:    1238  450   362   201   106
Percent:  52.5  19.1  15.4  8.5   4.5
Cum %:    52.5  71.6  87.0  95.5  100.0

Now we've got something to work with. Notice the 80% area: draw a line from
the 80% mark across to the cumulative percent line (red line) in the graph as
shown here.
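Finding where the cumulative line crosses 80% can be automated; a sketch using the category counts from this chart:

```python
# Defect counts by product category, already sorted descending (Pareto-style).
counts = [1238, 450, 362, 201, 106]

total = sum(counts)
cum, vital_few = 0, 0
for n in counts:
    cum += n
    vital_few += 1
    if cum / total >= 0.80:   # stop at the "vital few" that cover 80%
        break
print(f"{vital_few} of {len(counts)} categories cover {cum / total:.1%}")
```

Here the first three categories cover 87% of the defects, matching the chart's cumulative line.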

98
Pareto Chart - Example

Now that we have more of a focus area, drill down one more level.
– This chart will only use the classifications within the first bar on the previous chart.

– Create another Pareto which will drill down to the categories within the Card type
from the previous Pareto.

99
Pareto Chart - Example

Now what? We've got ourselves another "Flateto"…

[Pareto Chart of TRAVEL]

TRAVEL:   CAR    HOTEL   AIR
Count:    428    420     390
Percent:  34.6   33.9    31.5
Cum %:    34.6   68.5    100.0
Essentially this tells us that there is no clear direction within the
Platinum Business Accounts.
100
Project Charter – Primary Metric

Establishing the Primary Metric:


The Primary Metric is a very
important measure in the Six
Sigma project; this metric is a
quantified measure of the
defect or primary issue of the
project.

We can have only One Primary


Metric. Recall the equation Y =
f (X); well once your defect is
located then Y will be your
defect…your primary metric
– Quantified measure of the defect will measure it.
– Serves as the indicator of project success
– Links to the KPI or Primary Business measure
– Only one Primary Metric per project

101
Project Charter – Secondary Metrics

Establishing Secondary Metric(s):

Secondary Metrics are put in


place to measure potential
changes that may occur as a result
of making changes to our Primary
Metric.
They will measure ancillary
changes in the process, both
positive and negative.

– Measures positive & negative consequences as a result of changes in the process


– Can have multiple Secondary Metrics

102
Project Charter – Metric Charts

Generating Charts:

Primary and Secondary
Metrics should be continually
measured and frequently
updated during the project's
lifecycle.
Use them as your gauge of
project success and status.
This is where your project's
progress will be apparent.

– Displays Primary and Secondary Metrics over time


– Should be updated regularly throughout the life of the project
– One for Primary Metric and one for each of the Secondary Metrics
– Typically utilize Time Series Plots
103
Project Charter Exercise

Exercise objective: To begin planning the Project Charter deliverable.


1. Complete the Project Charter template to the best of your ability.
2. Be prepared to present your Stakeholder Analysis to your mentor.

Project Charter Template.xls

104
What is the Financial Evaluation?

The financial evaluation establishes the value of the project.

The components are:

– Impact
• Sustainable
• One-off
– Allocations
• Cost Codes / Accounting System
– Forecast
• Cash flow
• Realization schedule

OK, let's add it up!

Typically a financial representative is responsible for evaluating the financial impact of


the project. The Belt works in coordination to facilitate the proper information.

105
Benefits Capture - Calculation “Template”

Whatever your organization’s protocol may be these aspects should be accounted for
within any improvement project.

IMPACT – There are two types of Impact: One-Off & Sustainable.
– Sustainable Impact: Reduced Costs, Increased Revenue
– "One-Off" Impact: Implementation Costs, Capital

COST CODES – Cost Codes allocate the impact to the appropriate area in the "Books".

FORECAST – Forecasts allow for proper management of projects and resources.
– Realization Schedule (Cash Flow)
– By Period (i.e. Q1, Q2, Q3, Q4)

106
Benefits Capture – Basic Guidelines

• Benefits should be calculated on the baseline of key business process
performance that relates to a business measure or KPI(s).

• The Project Measure (Primary Metric) has to have a direct link between
the process and its KPI’s.

• Goals have to be defined realistically to avoid under or over setting.

• Benefits should be annualized.

• Benefits should be measured in accordance with Generally Accepted


Accounting Principles (GAAP).

107
Benefits Capture - Categorization

A – Projects directly impact the Income Statement or Cash Flow Statement.

B – Projects impact the Balance Sheet (working capital).

C – Projects avoid expense or investment due to known or expected events in
the future (cost avoidance).

D – Projects are risk management, insurance, Safety, Health, Environment and
Community related projects which prevent or reduce the severity of
unpredictable events.

You don’t want to take this one home!


108
Benefits Calculation Involvement & Responsibility

Project Selection: Financial Representative; Champion & Process Owner
D-M-A-I-C: Financial Representative; Black Belt
Implementation: Financial Representative; Champion & Process Owner
6 Month Audit: Financial Representative; Process Owner

109
Benefits Capture - Summary

• Performance tracking for Six Sigma Projects should use the same
discipline that would be used for tracking any other high-profile
projects.

• The A-B-C-D categories can be used to illustrate the impact of your


project or a “portfolio” of projects.

• Establish the Governance Grid for Responsibility & Involvement.

It’s a wrap!

110
Benefits Calculation Template

The Benefits Calculation Template facilitates and aligns with the aspects
discussed for Project Accounting.

111
Summary

At this point, you should be able to:

• Understand the various approaches to project selection

• Articulate the benefits of a “Structured Approach”

• Refine and Define the business problem into a Project Charter to display
critical aspects of an improvement project

• Make initial financial impact estimate

112
Elements of Waste

Understanding Six Sigma

Six Sigma Fundamentals

Selecting Projects

Elements of Waste

7 Components of Waste

5S

Wrap Up & Action Items

113
Definition of Lean

“Lean Enterprise is based on the premise that anywhere work is


being done, waste is being generated.

The Lean Enterprise seeks to organize its processes to the optimum


level, through the continual focus on the identification and
elimination of waste.”
-- Barbara Wheat

114
Lean – History

Craft Production (1885):
- Machine then harden
- Fit on assembly
- Customization
- Highly skilled workforce
- Low production rates
- High cost

Mass Production (1913):
- Part inter-changeability
- Moving production line
- Production engineering
- "Workers don't like to think"
- Unskilled labor
- High production rates
- Low cost
- Persistent quality problems
- Inflexible models

Toyota Production System (1955 - 1990):
- Worker as problem solver
- Worker as process owner, enabled by:
-- Training
-- Upstream quality
-- Minimal inventory
-- Just-in-time
- Eliminate waste
- Responsive to change
- Low cost
- Improving productivity
- High quality product

Lean Enterprise (1993 - ):
- "Lean" applied to all functions in enterprise value stream
- Optimization of value delivered to all stakeholders and enterprises in value chain
- Low cost
- Improving productivity
- High quality product
- Greater value for stakeholders

115
Lean Six Sigma

Lean Six Sigma combines the strengths of each system:

• Lean (Strength: Efficiency)
– Guiding-principles-based operating system
– Relentless elimination of all waste
– Creation of process flow and demand pull
– Resource optimization
– Simple and visual

• Six Sigma (Strength: Effectiveness)
– Focus on voice of the customer
– Data and fact based decision making
– Variation reduction to near perfection levels
– Analytical and statistical rigor

An Extremely Powerful Combination!


116
Project Requirements for Lean

• Perhaps one of the most criminal employee performance issues in today's
organizations is generated not by a desire to cheat one's employer but rather by a
lack of regard for waste.
• In every work environment there are multiple opportunities for reducing the non-
value added activities that have (over time) become an ingrained part of the
standard operating procedure.
• These non-value added activities have become so ingrained in our process that
they are no longer recognized for what they are, WASTE.
• waste (n.) Anything other than the minimum amount of time, material, people,
space, energy, etc. needed to add value to the product or service you are
providing.
• The Japanese word for waste is muda.

Get that stuff


outta here!
117
Seven Components of Waste

Muda is classified into seven components:


– Overproduction
– Correction (defects)
– Inventory
– Motion
– Overprocessing
– Conveyance
– Waiting

Sometimes additional forms of muda are added:


– Under use of talent
– Lack of safety

Being Lean means eliminating waste.

118
Overproduction

Overproduction is producing more than the next step needs or more than
the customer buys.
– It may be the worst form of waste because it contributes to all the others.

Examples are:

Preparing extra reports

Reports not acted upon or even read

Multiple copies in data storage

Over-ordering materials

Duplication of effort/reports

Waste of Overproduction relates to the excessive


accumulation of work-in-process (WIP) or finished
goods inventory.

119
Correction

Correction of defects is as obvious as it sounds.


Examples are:

Incorrect data entry

Paying the wrong vendor

Misspelled words in
communications

Making bad product

Materials or labor discarded


during production

Eliminate errors!!
Waste of Correction includes the waste of handling and fixing
mistakes. This is common in both manufacturing and
transactional settings.
120
Inventory

Inventory is the liability of materials that are bought, invested in and not
immediately sold or used.
Examples are:

Transactions not processed

Bigger “in box” than “out box”

Over-ordering materials
consumed in-house

Over-ordering raw materials –


just in case

Waste of Inventory is identical to overproduction except that it


refers to the waste of acquiring raw material before the exact
moment that it is needed.
121
Motion

Motion is the unnecessary movement of people and equipment.


– This includes looking for things like documents or parts as well as movement
that is straining.

Examples are:

Extra steps

Extra data entry

Having to look for something

Waste of Motion examines how people move to ensure


that value is added.

122
Overprocessing

Overprocessing is tasks, activities and materials that don’t add value.


– Can be caused by poor product or tool design as well as from not understanding
what the customer wants.

Examples are:

 Sign-offs

 Reports that contain more
information than the customer
wants or needs

 Communications, reports,
emails, contracts, etc. that contain
more than the necessary points
(briefer is better)

 Voice mails that are too long

Waste of Overprocessing relates to over-processing
anything that may not be adding value in the eyes of
the customer.

123
Conveyance

Conveyance is the unnecessary movement of material and goods.


– Steps in a process should be located close to each other so movement is
minimized.

Examples are:

Extra steps in the process

Distance traveled

Moving paper from place to


place

Waste of Conveyance is the movement of material.

124
Waiting

Waiting is nonproductive time due to lack of material, people, or


equipment.
– Can be due to slow or broken machines, material not arriving on time, etc.

Examples are:

Processing once each month


instead of as the work comes in

Showing up on time for a


meeting that starts late

Delayed work due to lack of


communication from another
internal group

Waste of Waiting is the cost of an idle resource.


125
Exercise

Exercise objective: To identify waste that occurs in your


processes.
Write an example of each type of Muda below:

– Overproduction ___________________
– Correction ___________________
– Inventory ___________________
– Motion ___________________
– Overprocessing ___________________
– Conveyance ___________________
– Waiting ___________________

126
5S – The Basics

5S is a process designed to organize the workplace, keep it neat and clean,


maintain standardized conditions and instill the discipline required to
enable each individual to achieve and maintain a world class work
environment.

• Seiri - Put things in order


• Seiton - Proper Arrangement
• Seiso – Clean
• Seiketsu – Purity
• Shitsuke - Commitment

127
English Translation

There have been many attempts to force five English "S" words to maintain the original
intent of 5S from Japanese. Listed below are typical English words used to translate:
1.) Sort (Seiri)
2.) Straighten or Systematically Arrange (Seiton)
3.) Shine or Spic and Span (Seiso)
4.) Standardize (Seiketsu)
5.) Sustain or Self-Discipline (Shitsuke)

Sort: Identify necessary items and remove unnecessary ones; use time management.
Straighten: Place things in such a way that they can be easily reached whenever they are needed.
Shine: Visual sweep of areas; eliminate dirt, dust and scrap. Make the workplace shine.
Standardize: Work to standards, maintain standards, wear safety equipment.
Self-Discipline: Make 5S strong in habit. Make problems appear and solve them.

128
Exercise

Exercise objective: To identify elements of 5S in your workplace.


Write an example for each of the 5S’s below:

• Sort ____________________
• Straighten ____________________
• Shine ____________________
• Standardize ____________________
• Self-Discipline ____________________

129
Summary

At this point, you should be able to:

• Identify and describe the 7 Elements of Waste

• Describe 5S

• Provide examples of how Lean Principles can affect your area

130
Define Phase
Wrap Up and Action Items
Define Phase Overview—The Goal

The goal of the Define Phase is to:

• Identify a process to improve and develop a specific Six Sigma project.


– Six Sigma Belts define critical processes, sub-processes and elaborate the
decision points in those processes.

• Define is the “contract” phase of the project. We are determining exactly


what we intend to work on and estimating the impact to the business.

• At the completion of the Define Phase you should have a description of the
process defect that is creating waste for the business.

132
Define Action Items

At this point you should all understand what is necessary to complete these
action items associated with Define.

– Charter Benefits Analysis


– Team Members
– Process Map – high level
– Primary Metric
– Secondary Metric(s)
– Lean Opportunities
– Stakeholder Analysis
– Project Plan
– Issues and Barriers

Deliver the Goods!
133
Six Sigma Behaviors

• Being tenacious, courageous

• Being rigorous, disciplined

• Making data-based decisions

• Embracing change & continuous learning

• Sharing best practices

Walk the Walk!

Each “player” in the Six Sigma process must be
A ROLE MODEL
for the Six Sigma culture.
134
Define Phase — The Roadblocks

Look for the potential roadblocks and plan to address them before they become
problems:
– No historical data exists to support the project.
– Team members do not have the time to collect data.
– Data presented is the best guess by functional managers.
– Data is communicated from poor systems.
– The project is scoped too broadly.
– The team creates the “ideal” Process Map rather than the “as is” Process
Map.

Clear the road – I’m comin’ through!
135
DMAIC Roadmap

Champion/Process Owner
• Identify Problem Area

Define
• Determine Appropriate Project Focus
• Estimate COPQ
• Establish Team

Measure
• Assess Stability, Capability, and Measurement Systems

Analyze
• Identify and Prioritize All X’s
• Prove/Disprove Impact X’s Have On Problem

Improve
• Identify, Prioritize, Select Solutions to Control or Eliminate X’s Causing Problems
• Implement Solutions to Control or Eliminate X’s Causing Problems

Control
• Implement Control Plan to Ensure Problem Doesn’t Return
• Verify Financial Impact

136
Define Deployment

Business Case Selected
→ Notify Belts and Stakeholders
→ Create High-Level Process Map
→ Determine Appropriate Project Focus (Pareto, Project Desirability)
→ Define & Charter Project (Problem Statement, Objective, Primary Metric, Secondary Metric)
→ Estimate COPQ
→ Project Focus Approved? (No: Recommend Project Focus and loop back; Yes: continue)
→ Create Team
→ Charter Team
→ Ready for Measure

137
Action Items Support List

138
Summary

At this point, you should:

• Have a clear understanding of the specific action items

• Have started to develop a Project Plan to complete the action items

• Have identified ways to deal with potential roadblocks

• Be ready to apply the Six Sigma method within your business

139
Measure Phase
Process Discovery

Welcome to Measure

Process Discovery

Cause and Effect Diagrams

Detailed Process Mapping

FMEA

Six Sigma Statistics

Measurement System Analysis

Process Capability

Wrap Up & Action Items

141
Overview of Brainstorming Techniques

A commonly used tool to solicit ideas by using categories to stimulate cause and
effect relationships within a problem. It uses verbal inputs in a team environment.

Cause and Effect Diagram

[Fishbone diagram: the X’s (causes) on six category branches – People, Machine, Method, Material, Measurement, Environment – feed into the Y, or problem condition]

142
Cause and Effect Diagram

[Fishbone diagram, as above: six category branches feeding the Y, or problem condition]

Categories for the legs of the diagram can use templates for products or transactional symptoms, or you can select the categories by process step or whatever you deem appropriate for the situation.

Product categories: Measurement, People, Method, Materials, Equipment, Environment
Transactional categories: People, Policy, Procedure, Place, Measurement, Environment
Cause and Effect Diagram

The Measurement category groups Root Causes related to the measurement and measuring of a
process activity or output:
Examples of questions to ask:
• Is there a metric issue?
• Is there a valid measurement system? Is the data good enough?
• Is data readily available?

The People category groups Root Causes related to people, staffing and
organizations:
Examples of questions to ask:
• Are people trained, do they have the right skills?
• Is there person-to-person variation?
• Are people over-worked?
144
Cause and Effect Diagram

The Method category groups Root Causes related to how the work is done, the way the process
is actually conducted:
Examples of questions to ask:
• How is this performed?
• Are procedures correct?
• What might be unusual?

The Materials category groups Root Causes related to parts, supplies, forms or information
needed to execute a process:

Examples of questions to ask:
• Are bills of material current?
• Are parts or supplies obsolete?
• Are there defects in the materials?
145
Cause and Effect Diagram

The Equipment category groups Root Causes related to tools used in the process:

Examples of questions to ask:
• Have machines been serviced recently, what is the uptime?
• Have tools been properly maintained?
• Is there variation?

The Environment (a.k.a. Mother Nature) category groups Root Causes related to our work
environment, market conditions and regulatory issues.
Examples of questions to ask:
• Is the workplace safe and comfortable?
• Are outside regulations impacting the business?
• Does the company culture aid the process?
146
Classifying the X’s

The Cause & Effect Diagram is simply a tool to generate opinions about possible
causes for defects.

For each of the X’s identified in the Fishbone diagram classify them as follows:
– Controllable – C (Knowledge)
– Procedural – P (People, Systems)
– Noise – N (External or Uncontrollable)

Think of procedural as a subset of controllable. Unfortunately, many procedures


within a company are not well controlled and can cause the defect level to go up.
The classification methodology is used to separate the X’s so they can be used in
the X-Y Matrix and the FMEA taught later in this module.

WHICH X’s CAUSE DEFECTS?
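The C/P/N tagging described above can be kept as plain data so it flows into the X-Y Matrix and FMEA later. A minimal sketch (the cause names are hypothetical, not from the course material):

```python
# Hypothetical sketch of classifying fishbone X's as Controllable (C),
# Procedural (P) or Noise (N), then grouping them by classification.
causes = {
    "Oven Temp": "C",            # Controllable (knowledge)
    "Adherence to recipe": "P",  # Procedural (people, systems)
    "Room Humidity": "N",        # Noise (external or uncontrollable)
}

def group_by_class(causes):
    """Group the X's by their C/P/N classification."""
    groups = {"C": [], "P": [], "N": []}
    for x, cls in causes.items():
        groups[cls].append(x)
    return groups

print(group_by_class(causes)["N"])  # ['Room Humidity']
```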


147
Chemical Purity Example

[Fishbone diagram – effect: Chemical Purity]
• Measurement: Measurement Method (P); Measurement Capability (C)
• Manpower: Training on method (P); Insufficient staff (C); Skill Level (P); Adherence to procedure (P)
• Materials: Raw Materials (C); Incoming QC (P); Multiple Vendors (C); Specifications (C); Work order variability (N)
• Methods: Startup inspection (P); Handling (P); Purification Method (P); Data collection/feedback (P)
• Mother Nature: Room Humidity (N); RM Supply in Market (N); Shipping Methods (C)
• Equipment: Column Capability (C); Nozzle type (C); Temp controller (C)

148
Cause & Effect Diagram - MINITAB™

Below is a Cause & Effect Diagram for surface flaws. The next few slides will
demonstrate how to create it in MINITAB™.

Fishbone Diagram – effect: Surface Flaws
• Measurements: Micrometers, Microscopes, Inspectors
• Material: Alloys, Lubricants, Suppliers
• Personnel: Shifts, Supervisors, Training, Operators
• Environment: Condensation, Moisture%
• Methods: Brake, Engager, Angle
• Machines: Speed, Lathes, Bits, Sockets
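The branch data behind the surface-flaws example can be captured as a simple mapping before entering it into MINITAB™. A minimal sketch in Python (plain data only, not MINITAB syntax):

```python
# The six branches and entries of the surface-flaws fishbone shown above,
# transcribed as a plain dictionary for reference.
fishbone = {
    "Measurements": ["Micrometers", "Microscopes", "Inspectors"],
    "Material": ["Alloys", "Lubricants", "Suppliers"],
    "Personnel": ["Shifts", "Supervisors", "Training", "Operators"],
    "Environment": ["Condensation", "Moisture%"],
    "Methods": ["Brake", "Engager", "Angle"],
    "Machines": ["Speed", "Lathes", "Bits", "Sockets"],
}

# Total count of candidate X's across all branches
total_xs = sum(len(v) for v in fishbone.values())
print(total_xs)  # 19
```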

149
Cause & Effect Diagram - MINITAB™

Open the MINITAB™ Project “Measure Data Sets.mpj” and select the worksheet “Surfaceflaws.mtw”.

150
Cause & Effect Diagram - MINITAB™

151
Cause & Effect Diagram - MINITAB™

In order to adjust the Fishbone Diagram so the main cause titles are not wrapped, grab the line with your mouse and move the entire bone.

[Fishbone Diagram for Surface Flaws, as shown on the previous slide]

152
Cause & Effect Diagram Exercise

Exercise objective: Create a Fishbone Diagram.

1. Retrieve the high level Process Map for your project and use it to
complete a Fishbone, if possible include your project team.

Don’t let the big one get away!

153
Overview of Process Mapping

In order to correctly manage a process, you must be able to


describe it in a way that can be easily understood.
– The preferred method for describing a process is to identify
it with a generic name, show the workflow with a Process
Map and describe its purpose with an operational
description.
– The first activity of the Measure Phase is to adequately
describe the process under investigation.

Start → Step A → Step B → Step C → Inspect → Step D → Finish
154
Information from Process Mapping

By mapping processes we can identify many important


characteristics and develop information for other analytical tools:

1. Process inputs (X’s)


2. Supplier requirements
3. Process outputs (Y’s)
4. Actual customer needs
5. All value-added and non-value added process tasks and steps
6. Data collection points
•Cycle times
•Defects
•Inventory levels
•Cost of poor quality, etc.
7. Decision points
8. Problems that have immediate fixes
9. Process control needs
155
Process Mapping

There are usually three views of a process:

1 2 3
What you THINK it is.. What it ACTUALLY is.. What it SHOULD be..

156
Standard Process Mapping Symbols

Standard symbols for Process Mapping (available in Microsoft Office™, Visio™,


iGrafx™ , SigmaFlow™ and other products):

A RECTANGLE indicates an activity. Statements within the rectangle should begin with a verb.

A PARALLELOGRAM indicates data (an input to or output from the process).

A DIAMOND signifies a decision point. Only two paths emerge from a decision point: No and Yes.

An ELLIPSE shows the start and end of the process.

An ARROW shows the connection and direction of flow.

A CIRCLE WITH A LETTER OR NUMBER INSIDE symbolizes the continuation of a flowchart to another page.

157
Process Mapping Levels

Level 1 – The Macro Process Map, sometimes called a Management level or


viewpoint.
Customer Hungry → Calls for Order → Take Order → Make Pizza → Cook Pizza → Pizza Correct → Box Pizza → Deliver Pizza → Customer Eats

Level 2 – The Process Map, sometimes called the Worker level or viewpoint. This
example is from the perspective of the pizza chef
Take Order from Cashier → Add Ingredients (Pizza Dough in) → Place in Oven → Observe Frequently → Check if Done? (No: keep observing; Yes: Remove from Oven) → Pizza Correct? (No: Scrap, Start New Pizza; Yes: Tape Order on Box → Put on Delivery Rack)

Level 3 – The Micro Process Map, sometimes called the Improvement level or
viewpoint. Similar to a level 2, it will show more steps and tasks and on it will be
various performance data; yields, cycle time, value and non value added time, defects,
etc.

158
Types of Process Maps

The Linear Flow Process Map


Customer Hungry → Calls for Order → Take Order → Make Pizza → Cook Pizza → Pizza Correct → Box Pizza → Deliver Pizza → Customer Eats

As the name states, this diagram shows the process steps in a sequential flow, generally ordered from an upper left
corner of the map towards the right side.

The Deployment-Flow or Swim Lane Process Map


[Swim Lane Process Map]
• Customer: Customer Hungry → Calls for Order … Customer Eats
• Cashier: Take Order
• Cook: Make Pizza → Cook Pizza → Pizza Correct → Box Pizza
• Deliverer: Deliver Pizza

The value of the Swim Lane map is that it shows you who or which department is responsible for the steps in a
process. This can provide powerful insights in the way a process performs. A timeline can be added to show how long
it takes each group to perform their work. Also each time work moves across a swim lane, there is a “Supplier –
Customer” interaction. This is usually where bottlenecks and queues form.

159
Process Maps – Examples for Different Processes

Linear Process Map for Door Manufacturing

Begin → Prep doors → Inspect (fail: Return for rework) → Pre-cleaning → (A)
(A) → Install into work jig → Light sanding → Inspect finish (fail: Rework) → Mark for door handle drilling → (B)
(B) → Drill holes → De-burr and smooth hole → Apply part number → Move to finishing → (C)
(C) → Scratch repair → Final cleaning → Apply stain and dry → Inspect (fail: Scrap) → End
Swim Lane Process Map for Capital Equipment

• Business Unit: Define Needs → Prepare paperwork (CAAR & installation request) → Review & approve CAAR → Receive & use
• Top Mgt/Finance: Review & approve CAAR → Issue payment
• I.T.: Review & approve standard → Configure & install
• Corporate: Review & approve CAAR
• Procurement: Acquire equipment
• Supplier: Supplier Ships → Supplier Paid

Timeline across the lanes: 21 days, 6 days, 15 days, 5 days, 17 days, 7 days, 71 days, 50 days

160
Types of Process Maps

The SIPOC “Supplier – Input – Process – Output – Customer” Process Map

Suppliers: ATT Phones; Office Depot; TI Calculators; NEC Cash Register
Inputs: Pizza type; Size; Quantity; Extra Toppings; Special orders; Drink types & quantities; Other products; Phone number; Address; Name; Time, day and date; Volume
Process: See below
Outputs: Price; Order confirmation; Bake order; Data on cycle time; Order rate data; Order transaction; Delivery info
Customers: Cook; Accounting
Requirements: Complete call < 3 min; Order to Cook < 1 minute; Complete bake order; Correct bake order; Correct address; Correct Price

Level 1 Process Map for Customer Order Process


Call for an Order → Answer Phone → Write Order → Confirm Order → Sets Price → Address & Phone → Order to Cook

The SIPOC diagram is especially useful after you have been able to construct either a Level 1
or Level 2 Map because it facilitates your gathering of other pertinent data that is affecting the
process in a systematic way.
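One way to keep the SIPOC elements linked while you gather them is as plain structured data. A hypothetical sketch using a few rows from the table above (not a prescribed format):

```python
# A subset of the pizza customer-order SIPOC captured as one record, so that
# supplier, input, output and customer lists stay attached to their process.
sipoc = {
    "process": "Customer Order Process",
    "suppliers": ["ATT Phones", "Office Depot", "TI Calculators", "NEC Cash Register"],
    "inputs": ["Pizza type", "Size", "Quantity", "Extra Toppings"],
    "outputs": ["Price", "Order confirmation", "Bake order"],
    "customers": ["Cook", "Accounting"],
    "requirements": ["Complete call < 3 min", "Order to Cook < 1 minute"],
}

print(sipoc["process"])  # Customer Order Process
```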

161
Types of Process Maps
The Value Stream Map

Process steps: Log → Route → Disposition → Cut Check → Mail Delivery

• Log (computer, 1 person): queue = 4,300; C/T = 15 sec; uptime = 0.90; hours = 8, breaks = 0.5, hours available = 6.75 (24,300 sec); days of work in queue = 2.65; IPY = 0.92; defects = 0.08; cumulative RTY = 0.92; rework = 4.0%; material yield = 0.96; scrap = 0.0%
• Route (department assignments, 1 person): queue = 7,000; C/T = 75 sec; uptime = 0.95; hours = 8, breaks = 0.5, hours available = 7.13 (25,650 sec); days of work in queue = 20.47; IPY = 0.94; defects = 0.06; cumulative RTY = 0.86; rework = 0.0%; material yield = 0.94; scrap = 0.0%
• Disposition (guidelines, 1 person): queue = 1,700; C/T = 255 sec; uptime = 0.95; hours = 8, breaks = 0.5, hours available = 7.13 (25,650 sec); days of work in queue = 16.9; IPY = 0.59; defects = 0.41; cumulative RTY = 0.51; rework = 10%; material yield = 0.69; scrap = 0.0%
• Cut Check (computer, printer, 1 person): queue = 2,450; C/T = 15 sec; uptime = 0.85; hours = 8, breaks = 0.5, hours available = 6.38 (22,950 sec); days of work in queue = 1.60; IPY = 0.96; defects = 0.04; cumulative RTY = 0.49; rework = 0.0%; material yield = 0.96; scrap = 0.0%
• Mail Delivery (envelopes, postage, 1 person): queue = 1,840; C/T = 100 sec; uptime = 0.90; hours = 8, breaks = 0.5, hours available = 6.75 (24,300 sec); days of work in queue = 7.57; IPY = 0.96; defects = 0.04; cumulative RTY = 0.47; rework = 0.0%; material yield = 0.96; scrap = 4.0%

Aggregate Performance Metrics
Cum Material Yield = .96 X .94 X .69 X .96 X .96 = .57
RTY = .92 X .94 X .59 X .96 X .96 = .47

The Value Stream Map is a very powerful technique to understand the velocity of process
transactions, queue levels and value added ratios in both manufacturing and non-
manufacturing processes.
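Two of the calculations embedded in this map can be reproduced directly. A short sketch using the Route step and the IPY row from the map above:

```python
# Days of work in queue = queue size x cycle time / seconds available per day.
# Rolled throughput yield (RTY) = product of the individual step IPY values.
from math import prod

queue, cycle_time_s, avail_s = 7000, 75, 25650  # the "Route" step above
days_in_queue = queue * cycle_time_s / avail_s
print(round(days_in_queue, 2))  # 20.47

ipy = [0.92, 0.94, 0.59, 0.96, 0.96]  # IPY per step, Log through Mail Delivery
rty = prod(ipy)
print(round(rty, 2))  # 0.47
```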
162
Process Mapping Exercise – Going to Work

The purpose of this exercise is to develop a Level 1 Macro, Linear Process Flow Map and
then convert this map to a Swim Lane Map.

Read the following background for the exercise: You have been concerned about your
ability to arrive at work on time and also the amount of time it takes from the time your alarm
goes off until you arrive at work. To help you better understand both the variation in arrival
times and the total time, you decide to create a Level 1 Macro Process Map. For purposes of
this exercise, the start is when your alarm goes off the first time and the end is when you arrive
at your work station.

Task 1 – Mentally think about the various tasks and activities that you routinely do from the
defined start to the end points of the exercise.
Task 2 – Using a pencil and paper create a Linear Process Map at the macro level, but with
enough detail that you can see all the major steps of your process.
Task 3 – From the Linear Process Map, create a swim lane style Process Map. For the lanes
you may use the different phases of your process, such as the wake up phase, getting prepared,
driving, etc.

163
A Process Map of Process Mapping

Select the process → Determine approach to map the process → Complete Level 1 PFM worksheet → Create Level 1 PFM → Define the scope for the Level 2 PFM

Create the Level 2 PFM → Perform SIPOC → Identify all X’s and Y’s → Identify customer requirements → Identify supplier requirements

Create a Level 3 PFM → Add Performance data → Identify VA/NVA steps

164
Process Mapping Approach

Using the Individual Approach
1. Start with the Level 1 Macro Process Map.
2. Meet with process owner(s)/manager(s). Create a Level 1 Map and obtain approval to interview process members.
3. Starting with the beginning of the process, pretend you are the product or service flowing through the process; interview to gather information.
4. As the interviews progress, assemble the data into a Level 2 PFM.
5. Verify the accuracy of the Level 2 PFM with the individuals who provided input.
6. Update the Level 2 PFM as needed.

Using the Team Approach
1. Follow the Team Approach to Process Mapping.

165
Process Mapping Approach

Using the Team Approach
1. Start with the Level 1 Macro Process Map.
2. Meet with process owner(s)/manager(s). Create a Level 1 Map and obtain approval to call a process mapping meeting with process members (see team workshop instructions for details on running the meeting).
3. Bring key members of the process into the process flow workshop. If the process is large in scope, hold individual workshops for each subsection of the total process. Start with the beginning steps. Organize the meeting to use the “post-it note” approach to gather the individual tasks and activities, based on the macro map, that comprise the process.
4. Immediately assemble the information that has been provided into a Process Map.
5. Verify the PFM by discussing it with process owners and by observing the actual process from beginning to end.

166
Process Mapping Approach

The Team Process Mapping Workshop
1. Add to and agree on the Macro Process Map.
2. Using 8.5 X 11 paper for each macro process step, tape the process to the wall in a linear style.
3. Process members then list all known process tasks that they do on a post-it note, one process task per note.
   • Include the actual time spent to perform each activity; do not include any wait time or queue time.
   • List any known performance data that describe the quality of the task.
4. Place the post-it notes on the wall under the appropriate macro step in the order of the work flow.
5. Review the process with the whole group, add additional information and close the meeting.
6. Immediately consolidate the information into a Level 2 Process Map.
7. You will still have to verify the map by walking the process.

167
Steps in Generating a Level 1 PFM

Creating a Level 1 PFM
1. Identify a generic name for the process:
   For instance: “Customer Order Process”
2. Identify the beginning and ending steps of the process:
   Beginning – customer calls in. Ending – baked pizza given to operations.
3. Describe the primary purpose and objective of the process (operational definition):
   The purpose of the process is to obtain telephone orders for pizzas, sell additional products if possible, let the customer know the price and approximate delivery time, provide an accurate cook order, log the time and immediately give it to the pizza cooker.
4. Mentally “walk” through the major steps of the process and write them down:
   Receive the order via phone call from the customer, calculate the price, create a build order and provide the order to operations.
5. Use standard flowcharting symbols to order and to illustrate the flow of the major process steps.

168
Exercise – Generate a Level 1 PFM

The purpose of this exercise is to develop a Level 1 Linear Process Flow Map for the key process you have selected as your project.

Read the following background for the exercise: You will use your selected key process for this exercise (if more than one person in the class is part of the same process you may do it as a small group). You may not have all the pertinent detail to correctly put together the Process Map; that is OK, do the best you can. This will give you a starting template when you go back to do your project. In this exercise you may use the Level 1 PFM worksheet on the next page as an example.

Task 1 – Identify a generic name for the process.
Task 2 – Identify the beginning and ending steps of the process.
Task 3 – Describe the primary purpose and objective of the process (operational definition).
Task 4 – Mentally “walk” through the major steps of the process and write them down.
Task 5 – Use standard flowcharting symbols to order and to illustrate the flow of the major process steps.

169
Exercise – Generate a Level 1 PFM

1. Identify a generic name for the process:

2. Identify the beginning and ending steps of the process:

3. Describe the primary purpose and objective of the process (operational


definition):

4. Mentally “walk” through the major steps of the process and write them down:

5. Use standard flowcharting symbols to order and to illustrate the flow of the
major process steps on a separate sheet of paper.

170
Example Template for Generating a Level 1 PFM

1. Identify a generic name for the process:
   (i.e. customer order process).

2. Identify the beginning and ending steps of the process:
   (beginning – customer calls in; ending – pizza order given to the chef).

3. Describe the primary purpose and objective of the process (operational definition):
   (The purpose of the process is to obtain telephone orders for pizzas, sell additional products if possible, let the customer know the price and approximate delivery time, provide an accurate cook order, log the time and immediately give it to the pizza cooker).

4. Mentally “walk” through the major steps of the process and write them down:
   (Receive the order via phone call from the customer, calculate the price, create a build order and provide the order to the chef).

5. Use standard flowcharting symbols to order and to illustrate the flow of the major process steps on a separate sheet of paper.

171
Defining the Scope of Level 2 PFM

Customer Order Process

[Level 1 and Level 2 pizza Process Maps, as shown previously]

The rules for determining the Level 2 Process Map scope:
• From your Macro Process Map, select the area which represents your problem.
• Map this area at a Level 2.
• Start and end at natural starting and stopping points for a process; in other words, you have the complete associated process.

172
Defining the Scope of Level 2 PFM

Create the Level 2 PFM → Perform SIPOC → Identify all X’s and Y’s → Identify customer requirements → Identify supplier requirements

[Level 2 pizza Process Map, as shown previously]

173
Building a SIPOC

SIPOC diagram for customer-order process:


[SIPOC table for the customer-order process, as shown previously]

Customer Order: Level 1 process flow diagram
Call for an Order → Answer Phone → Write Order → Confirm Order → Sets Price → Address & Phone → Order to Cook

Identify supplier
requirements

174
Identifying Customer Requirements

[Process Output Identification and Analysis worksheet. Columns: Process Output – Name (Y); Customer (Name): Internal/External; Requirements Data: Metric, LSL, Target, USL; Measurement Data: Measurement System (how is it measured), Frequency of Measurement; Value Data: Performance Level Data, VA or NVA; General Data/Information: Comments]
175
Identifying Supplier Requirements

[Process Input Identification and Analysis worksheet. Columns: Process Input – Name (X); Controlled (C)/Noise (N); Supplier (Name): Internal/External; Requirements Data: Metric, LSL, Target, USL; Measurement Data: Measurement System (how is it measured), Frequency of Measurement; Value Data: Performance Level Data, VA or NVA; General Data/Information: Comments]
176
Controllable vs. Noise Inputs

“Make Pizza” Process

• Procedural Inputs: Screens in Place; Oven Clean; Ingredients prepared
• Controllable Key Process Inputs: Oven Temp; Bake Time; Ingredients; Recipe; Pizza Size
• Noise Inputs: Room Temp; Moisture Content; Ingredient Variation; Ingredient Types/Mixes; Volume
• Process Outputs: Correct Ingredients; Properly Cooked; Hot Pizza > 140 deg

Every input can be either:

Controllable (C) – Inputs that can be adjusted or controlled while the process is running (e.g., speed, feed rate, temperature and pressure)
Procedural (P) – Inputs that are controlled by people and systems following procedures (e.g., cleaning, setup, adherence to the recipe)
Noise (N) – Things we don’t think we can control, are unaware of, or that are too expensive or too difficult to control (e.g., ambient temperature, humidity, individual)

177
Exercise – Supplier Requirements

The purpose of this exercise is to identify the requirements for the suppliers to the key process you have selected as your project.

Read the following background for the exercise: You will use your selected key process for this exercise (if more than one person in the class is part of the same process you may do it as a small group). You may not have all the pertinent detail to correctly identify all supplier requirements; that is OK, do the best you can. This will give you a starting template when you go back to do your workplace assignment. Use the process input identification and analysis form for this exercise.

Task 1 – Identify a generic name for the process.
Task 2 – Write an operational description for the process.
Task 3 – Complete the remainder of the form except the Value/Non-value added column.
Task 4 – Report out to the class when called upon.

178
The Level 3 Process Flow Diagram

[Level 2 pizza Process Map, as shown previously, with the Process Step Output Identification and Analysis and Process Step Input Identification and Analysis worksheets attached to each process step]

179
Process Inputs (X’s) and Outputs (Y’s)
Create a Level 3 PFM → Add Performance data → Identify VA/NVA steps

[Process Step Output Identification and Analysis worksheet and Process Step Input Identification and Analysis worksheet, with the same columns described previously]
180
Process Inputs (X’s) and Outputs (Y’s)

Take Order step – inputs (X’s):
• Size of Pizza (N/C): 7”, 12”, 16”
• Toppings (N/C): 12 meats, 2 veggies, 3 cheese
• Name (N): N/A
• Address (N): within 10 miles
• Phone (N): within area code
• Time (N): 11 AM to 1 AM
• Day (N): 5 X 52
• Date (N): MM/DD,YY
Output (Y): Order – all fields complete

Make Pizza step – inputs (X’s):
• Order (C): all fields complete
• Ingredients (C): per spec sheets
• Recipe (S.O.P.): per Rev 7.0
• Amounts (C): as per recipe chart 3-1, in oz.
Output (Y): Raw Pizza – size, weight, ingredients correct

Cook Pizza step – inputs (X’s):
• Order (C): all fields complete
• Raw Pizza (C): ingredients per order
• Oven Temp (C): 350F +/- 5F
• Time Cooked (C): 10 min
• Volume (N): 60 per hour max
Output (Y): Pizza – >140F, ingredients correct, no burns
181
Identifying Waste

[Flowchart: the pizza order-taking process, with each activity marked VA (value-added) or NVA (non-value-added). NVA activities include writing the order on a scratch pad, rewriting the order, asking the cook for a time estimate, and completing the order from the note pad.]

• Each process activity can be tested for its value-add contribution.
• Ask the following two questions to identify non-value-added activity:
  – Is the form, fit or function of the work item changed as a result of this activity?
  – Is the customer willing to pay for this activity?

182
Definition of X-Y Matrix

• The X-Y Matrix is:
  – A tool used to identify/collate potential X's and assess their relative impact on multiple Y's (include all Y's that are customer focused)
  – Based on the team's collective "opinions"
  – Created for every project
  – Never completed
  – Updated whenever a parameter is changed

• To summarize, the X-Y Matrix is a team-based prioritization tool for the potential X's.

• WARNING! This is not real data; this is organized brainstorming!! At the conclusion of the project you may realize that the things you thought were critical are in fact not as important as was believed.

183
The Vital Few

A Six Sigma Belt does not simply guess which X's in a process are important (the vital few) – they are identified systematically:
– The team considers all possible X's that can contribute to or cause the problem observed.
– The team uses 3 primary sources of X identification:
  • Process Mapping
  • Fishbone Analysis
  • Basic Data Analysis – Graphical and Statistical
– A list of X's is established and compiled.
– The team then prioritizes which X's it will explore first, and eliminates the "obvious" low-impact X's from further consideration.

The X-Y Matrix is this Prioritization Tool!

184
The “X-Y Matrix”

185
Using the Classified X’s

• Breakthrough requires dealing primarily with controllable X's impacting the "Y".
• Use the controllable X's from the Fishbone analysis to include in the X-Y Matrix.
• The goal is to isolate the vital few X's from the trivial many X's.
• Procedures and Noise X's will be used in the FMEA at the end of this module. However:
  – All procedures must be in total compliance.
    • This may require some type of effectiveness measure.
    • This could reduce or eliminate some of the defects currently seen in the process (allowing focus on controllable X's).
  – Noise-type inputs increase the risk of defects under the current technology of operation and therefore:
    • Increase the RPN* on the FMEA document for an input.
    • Help identify areas needing investment for a justified ROI.

*RPN – Risk Priority Number

186
X-Y Matrix: Steps

List X’s from Fishbone Diagram in horizontal rows

187
X-Y Matrix: Steps

List Y’s in columns (including Primary and Secondary metrics).


Weight the Y’s on a scale of 1-10 (10 - highest and 1- lowest).

188
X-Y Matrix: Steps

For each X listed, rank its effect on each metric based on a scale of 1, 3 or 9.
– 9 = Highest
– 3 = Marginal
– 1 = None

189
X-Y Matrix: Steps

The "Ranking" for each X is computed by multiplying its rank against each metric by that metric's Weight, then summing the products across all metrics.
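This weighted-sum calculation can be sketched in Python; the weights and ranks below are illustrative values, not project data:

```python
# X-Y Matrix ranking: for each X, sum (rank against each Y) * (that Y's weight).
def xy_ranking(weights, ranks):
    """weights: one weight (1-10) per Y; ranks: one 1/3/9 rank per Y for a single X."""
    return sum(w * r for w, r in zip(weights, ranks))

weights = [10, 9, 8]          # importance of three Y's (hypothetical)
temperature = [9, 3, 1]       # how strongly this X affects each Y (hypothetical)
print(xy_ranking(weights, temperature))  # 10*9 + 9*3 + 8*1 = 125
```

The X's are then explored in descending order of Ranking.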

190
Example

Click the Demo button to see an example.

191
Example

Click the Summary Worksheet

Y-X Diagram Summary
Process: Laminating    Date: 5/2/2006

Output Variables (Y's)           Input Variables (X's)
Description       Weight         Description              Ranking   Rank %
broken            10             temperature              162       14.90%
unbonded area     9              human handling           159       14.63%
smears            8              material properties      130       11.96%
thickness         7              washer                   126       11.59%
foreign material  6              pressure                 120       11.04%
                                 robot handling           120       11.04%
                                 time                     102        9.38%
                                 clean room practices      90        8.28%
                                 clean room cleanliness    78        7.18%

[Pareto chart – Input Matrix Results: each input X ranked by its Rank %, from temperature down to clean room cleanliness.]
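The Rank % column is each input's Ranking divided by the sum of all Rankings; a quick check against the laminating example's numbers:

```python
# Rankings from the laminating example, highest to lowest.
rankings = [162, 159, 130, 126, 120, 120, 102, 90, 78]
total = sum(rankings)
rank_pct = [round(100 * r / total, 2) for r in rankings]
print(rank_pct[0])  # temperature: 14.9 (%)
```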

192
Fishbone Diagram Exercise

Exercise objective: Create an X-Y Matrix using the information from the Fishbone analysis.

1. Using the Fishbone Diagram created earlier, create an X-Y Matrix.

2. Present results to your mentor.

193
Definition of FMEA

Failure Mode and Effects Analysis (FMEA) is a structured approach to:

• Predict failures and prevent their occurrence in manufacturing and other functional areas which generate defects.
• Identify the ways in which a process can fail to meet critical customer requirements (Y).
• Estimate the Severity, Occurrence and Detection (SOD) of defects.
• Evaluate the current Control Plan for preventing these failures from occurring and escaping to the customer.
• Prioritize the actions that should be taken to improve and control the process using a Risk Priority Number (RPN).

Give me an “F”, give me an “M”……

194
History of FMEA

History of FMEA:
• First used in the 1960’s in the Aerospace industry during the Apollo
missions
• In 1974 the Navy developed MIL-STD-1629 regarding the use of FMEA
• In the late 1970's, automotive applications, driven by liability costs, began to incorporate FMEA into the management of their processes
• Automotive Industry Action Group (AIAG) now maintains the FMEA
standard for both Design and Process FMEA’s

195
Types of FMEA’s

• System FMEA: Performed on a product or service at the early concept/design level when various modules all tie together. All the module-level FMEA's tie together to form a system. As you go lower into a system, more failure modes are considered.
  – Example: The electrical system of a car consists of the following modules: battery, wiring harness, lighting control module and alternator/regulator.
  – System FMEA focuses on potential failure modes associated with the modules of a system caused by design.

• Design FMEA (DFMEA): Performed early in the design phase to analyze product failure modes before they are released to production. The purpose is to analyze how failure modes affect the system and minimize them. The severity rating of a failure mode MUST be carried into the Process FMEA.

• Process FMEA (PFMEA): Performed in the early quality planning phase of manufacturing to analyze failure modes in manufacturing and transactional processes that may escape to the customer. The failure modes and the potential sources of defects are rated, and corrective action is taken based on a Pareto analysis ranking.

• Equipment FMEA: Performed to analyze failure modes in the equipment used in a process to detect or make the part.
  – Example: Test equipment failure modes in detecting open and short circuits.

196
Purpose of FMEA

FMEA’s:

• Improve the quality, reliability and safety of products.

• Increase customer satisfaction.

• Reduce product development time and cost.

• Document and track actions taken to reduce risk and improve the
process.

• Focus on continuous problem prevention not problem solving.

197
Who Creates FMEAs and When?

Who
• The focused team working on a breakthrough project.
• ANYONE who had or has a role in defining, executing, or changing the process. This includes:
  • Associates
  • Technical Experts
  • Supervisors
  • Managers
  • Etc.

When
• Process FMEAs should be started:
  • At the conceptual design phase.
• Process FMEAs should be updated:
  • When an existing design or process is being changed.
  • When carry-over designs or processes will be used in new applications and environments.
  • When a problem-solving study is completed and needs to be documented.
• System FMEAs should be created after system functions are defined but before specific hardware is selected.
• Design FMEAs should be created when new systems, products and processes are being designed.

198
Why Create an FMEA?

As a means to manage…

RISK!!!
We want to avoid causing failures in the Process as well as the Primary & Secondary Metrics.

199
The FMEA…

The FMEA form captures, for each numbered process step (rows 1-9):

# | Process Function (Step) | Potential Failure Modes (process defects) | Potential Failure Effects (Y's) | SEV | Class | Potential Causes of Failure (X's) | OCC | Current Process Controls | DET | RPN | Recommended Actions | Responsible Person & Target Date | Taken Actions | SEV | OCC | DET | RPN

200
FMEA Components…#


The first column is the Process Step Number.


1
2
3
4
5
Etc.

201
FMEA Components…Process Step


Enter the Name of the Process Step here. The FMEA should sequentially follow the
steps documented in your Process Map.
Phone
Dial Number
Listen for Ring
Say Hello
Introduce Yourself
Etc.

202
FMEA Components…Potential Failure Modes


This refers to the mode in which the process could potentially fail. These are the defects, caused by a C, P or N factor, that could occur in the process.
This information is obtained from Historical Defect Data.

FYI… A failure mode is a fancy name for a defect.

203
FMEA Components…Potential Failure Effects


This is simply the effect of realizing the potential failure mode on


the overall process. It focuses on the outputs of each step.

This information can be obtained in the Process Map.

204
FMEA Components…Severity (SEV)


This ranking should be developed based on the team's knowledge of the process, in conjunction with the predetermined scale.
The measure of Severity is a financial measure of the impact to the business of
realizing a failure in the output.

205
Ranking Severity

Effect | Criteria: Severity of Effect Defined | Ranking

Hazardous: Without Warning | May endanger the operator. Failure mode affects safe vehicle operation and/or involves non-compliance with government regulation. Failure will occur WITHOUT warning. | 10
Hazardous: With Warning | May endanger the operator. Failure mode affects safe vehicle operation and/or involves non-compliance with government regulation. Failure will occur WITH warning. | 9
Very High | Major disruption to the production line. 100% of the product may have to be scrapped. Vehicle/item inoperable, loss of primary function. Customers will be very dissatisfied. | 8
High | Minor disruption to the production line. The product may have to be sorted and a portion (less than 100%) scrapped. Vehicle operable, but at a reduced level of performance. Customers will be dissatisfied. | 7
Moderate | Minor disruption to the production line. A portion (less than 100%) may have to be scrapped (no sorting). Vehicle/item operable, but some comfort/convenience item(s) inoperable. Customers will experience discomfort. | 6
Low | Minor disruption to the production line. 100% of product may have to be re-worked. Vehicle/item operable, but some comfort/convenience item(s) operable at a reduced level of performance. Customers will experience some dissatisfaction. | 5
Very Low | Minor disruption to the production line. The product may have to be sorted and a portion (less than 100%) re-worked. Fit/finish/squeak/rattle item does not conform. Most customers will notice the defect. | 4
Minor | Minor disruption to the production line. A portion (less than 100%) of the product may have to be re-worked online but out-of-station. Fit/finish/squeak/rattle item does not conform. Average customers will notice the defect. | 3
Very Minor | Minor disruption to the production line. A portion (less than 100%) of the product may have to be re-worked online but in-station. Fit/finish/squeak/rattle item does not conform. Discriminating customers will notice the defect. | 2
None | No effect. | 1

* Potential Failure Mode and Effects Analysis (FMEA), Reference Manual, 2002. Pgs 29-45. Chrysler Corporation, Ford Motor
Company, General Motors Corporation.

206
Applying Severity Ratings to Your Process

• The guidelines presented on the previous slide were developed for the auto
industry.
• This was included only as a guideline....”actual results may vary” for your
project.
• Your severity may be linked to impact on the business or impact on the next
customer, etc.

You will need to define your own criteria…


and be consistent throughout your FMEA

Let’s brainstorm how we might define the following SEVERITY levels in our own
projects:
1, 5, 10

207
Sample Transactional Severities

Effect | Criteria: Impact of Effect Defined | Ranking

Critical Business Unit-wide | May endanger company's ability to do business. Failure mode affects process operation and/or involves noncompliance with government regulation. | 10
Critical Loss – Customer Specific | May endanger relationship with customer. Failure mode affects product delivered and/or customer relationship due to process failure and/or noncompliance with government regulation. | 9
High | Major disruption to process/production-down situation. Results in near 100% rework or an inability to process. Customer very dissatisfied. | 7
Moderate | Moderate disruption to process. Results in some rework or an inability to process. Process is operable, but some workarounds are required. Customers experience dissatisfaction. | 5
Low | Minor disruption to process. Process can be completed with workarounds or rework at the back end. Results in reduced level of performance. Defect is noticed and commented upon by customers. | 3
Minor | Minor disruption to process. Process can be completed with workarounds or rework at the back end. Results in reduced level of performance. Defect noticed internally, but not externally. | 2
None | No effect. | 1

208
FMEA Components…Classification “Class”


Class should categorize each step as a…
 Controllable (C)
 Procedural (P)
 Noise (N)
This information can be obtained in the Process Map.

Controllable – A factor that can be dialed in to a specific setting/value, for example Temperature or Flow.
Procedural – A standardized set of activities leading to readiness of a step, for example Safety Compliance ("Lock-Out/Tag-Out").
Noise – A factor that cannot be dialed in to a specific setting/value, for example rain in a mine.

209
Potential Causes of Failure (X’s)


Potential Causes of the Failure refers to how the failure could occur.
This information should be obtained from the Fishbone Diagram.

210
FMEA Components…Occurrence “OCC”


Occurrence refers to how frequently the specified failure is projected to occur.


This information should be obtained from Capability Studies or Historical Defect
Data - in conjunction with the predetermined scale.

211
Ranking Occurrence

Probability of Failure | Possible Failure Rates | Cpk | Ranking

Very High: Failure is almost inevitable. | ≥ 1 in 2 | < 0.33 | 10
 | 1 in 3 | ≥ 0.33 | 9
High: Generally associated with processes similar to previous processes that have often failed. | 1 in 8 | ≥ 0.51 | 8
 | 1 in 20 | ≥ 0.67 | 7
Moderate: Generally associated with processes similar to previous processes that have experienced occasional failures but not in major proportions. | 1 in 80 | ≥ 0.83 | 6
 | 1 in 400 | ≥ 1.00 | 5
 | 1 in 2,000 | ≥ 1.17 | 4
Low: Isolated failures associated with similar processes. | 1 in 15,000 | ≥ 1.33 | 3
Very Low: Only isolated failures associated with almost identical processes. | 1 in 150,000 | ≥ 1.50 | 2
Remote: Failure is unlikely. No failures ever associated with almost identical processes. | ≤ 1 in 1,500,000 | ≥ 1.67 | 1

Potential Failure Mode and Effects Analysis (FMEA), Reference Manual, 2002. Pg. 35.. Chrysler Corporation, Ford Motor Company,
General Motors Corporation.

212
FMEA Components…Current Process Controls


Current Process Controls refers to the three types of controls that are in place to prevent a failure with the X's. The 3 types of controls are:
• SPC (Statistical Process Control)
• Poka-Yoke (Mistake Proofing)
• Detection after Failure

Ask yourself "how do we control this defect?"

213
FMEA Components…Detection (DET)


Detection is an assessment of the probability that the proposed type of control will
detect a subsequent Failure Mode.

This information should be obtained from your Measurement System Analysis Studies and the Process Map. A rating should be assigned in conjunction with the predetermined scale.

214
Ranking Detection

Detection | Criteria: the likelihood that the existence of a defect will be detected by the test content before the product advances to the next or subsequent process | Ranking
Almost Impossible Test content must detect < 80% of failures 10

Very Remote Test content must detect 80% of failures 9

Remote Test content must detect 82.5% of failures 8

Very Low Test content must detect 85% of failures 7

Low Test content must detect 87.5% of failures 6

Moderate Test content must detect 90% of failures 5

Moderately High Test content must detect 92.5% of failures 4

High Test content must detect 95% of failures 3

Very High Test content must detect 97.5% of failures 2

Almost Certain Test content must detect 99.5% of failures 1

Potential Failure Mode and Effects Analysis (FMEA), AIAG Reference Manual, 2002 Pg. 35.. Chrysler Corporation, Ford Motor
Company, General Motors Corporation.

215
Risk Priority Number “RPN”


The Risk Priority Number is a value that will be used to rank-order the concerns from the process.

The RPN is the product of Severity, Occurrence and Detectability, as represented here:

RPN = (SEV) * (OCC) * (DET)
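A minimal sketch of the calculation and the rank-ordering it drives; the failure modes and ratings below are hypothetical, not from a real FMEA:

```python
# RPN = Severity * Occurrence * Detection; a higher RPN means a higher-priority concern.
def rpn(sev, occ, det):
    return sev * occ * det

# Hypothetical failure modes with (SEV, OCC, DET) ratings on the 1-10 scales.
failure_modes = {
    "wrong address taken": (7, 6, 5),
    "pizza under-cooked":  (8, 3, 2),
    "order lost":          (9, 2, 4),
}
ranked = sorted(failure_modes, key=lambda fm: rpn(*failure_modes[fm]), reverse=True)
print(ranked[0])  # "wrong address taken" (RPN 210) tops the list
```

Note that a moderate rating in all three columns can outrank a single severe rating, which is why the full product, not Severity alone, drives the prioritization.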

216
FMEA Components…Actions


Recommended Actions refers to the activity for the prevention of a defect.

Responsible Person & Date refers to the name of the group or person responsible
for completing the activity and when they will complete it.

Taken Action refers to the action and effective date after it has been completed.

217
FMEA Components…Adjust RPN


Once the Recommended Actions, Responsible Person & Date, and Taken Action have been completed, the Severity, Occurrence and Detection should be adjusted. This will result in a new RPN rating.

218
FMEA Exercise

Exercise objective: Assemble your team in order to create an FMEA using the information generated from the Process Map, Fishbone Diagram and X-Y Matrix.

1. Be prepared to present results to your mentor.

OK Team, let’s
get that FMEA!

219
Summary

At this point, you should be able to:

• Create a high-level Process Map

• Create a Fishbone Diagram

• Create an X-Y Matrix

• Create an FMEA

• Describe the purpose of each tool and when it should be used

220
Six Sigma Statistics

Process Discovery

Six Sigma Statistics

Basic Statistics

Descriptive Statistics

Normal Distribution

Assessing Normality

Special Cause / Common Cause

Graphing Techniques

Measurement System Analysis

Process Capability

Wrap Up & Action Items

221
Purpose of Basic Statistics

The purpose of Basic Statistics is to:


• Provide a numerical summary of the data being analyzed.
– Data (n)
• Factual information organized for analysis.
• Numerical or other information represented in a form suitable for processing by computer.
• Values from scientific experiments.
• Provide the basis for making inferences about the future.
• Provide the foundation for assessing process capability.
• Provide a common language to be used throughout an organization to describe
processes.

Relax….it won’t
be that bad!
222
Statistical Notation – Cheat Sheet

Σ – Summation
x – An individual value, an observation
x1 – A particular (1st) individual value
xi – For each, all, individual values
x̄ (x-bar) – The Mean, average of sample data
x̿ (x double-bar) – The grand Mean, grand average
µ – The Mean of population data
s – The Standard Deviation of sample data
σ – The Standard Deviation of population data
s² – The variance of sample data
σ² – The variance of population data
R – The range of data
R̄ (R-bar) – The average range of data
k – Multi-purpose notation, i.e. # of subgroups, # of classes
| | – The absolute value of some term
>, < – Greater than, less than
≥, ≤ – Greater than or equal to, less than or equal to
p – A proportion of sample data
P – A proportion of population data
n – Sample size
N – Population size

223
Parameters vs. Statistics

Population: All the items that have the “property of interest” under study.

Frame: An identifiable subset of the population.

Sample: A significantly smaller subset of the population used to make an inference.

[Diagram: several Samples drawn from one Population.]

Population Parameters:
– Arithmetic descriptions of a population
– µ, σ, P, σ², N

Sample Statistics:
– Arithmetic descriptions of a sample
– x̄, s, p, s², n

224
Types of Data

Attribute Data (Qualitative)


– Is always binary; there are only two possible values (0, 1)
• Yes, No
• Go, No go
• Pass/Fail
Variable Data (Quantitative)
– Discrete (Count) Data
• Can be categorized in a classification and is based on counts.
– Number of defects
– Number of defective units
– Number of customer returns
– Continuous Data
• Can be measured on a continuum; it has decimal subdivisions that are meaningful
  – Time
  – Money
  – Pressure
  – Conveyor Speed
  – Material feed rate

225
Discrete Variables

Discrete Variable | Possible Values for the Variable

The number of defective needles in boxes of 100 diabetic syringes | 0, 1, 2, …, 100
The number of individuals in groups of 30 with a Type A personality | 0, 1, 2, …, 30
The number of surveys returned out of 300 mailed in a customer satisfaction study | 0, 1, 2, …, 300
The number of employees in 100 having finished high school or obtained a GED | 0, 1, 2, …, 100
The number of times you need to flip a coin before a head appears for the first time | 1, 2, 3, … (note: there is no upper limit because you might need to flip forever before the first head appears)

226
Continuous Variables

Continuous Variable | Possible Values for the Variable

The length of prison time served for individuals convicted of first-degree murder | All the real numbers between a and b, where a is the smallest amount of time served and b is the largest
The household income for households with incomes less than or equal to $30,000 | All the real numbers between a and $30,000, where a is the smallest household income in the population
The blood glucose reading for those individuals having glucose readings equal to or greater than 200 | All real numbers between 200 and b, where b is the largest glucose reading in all such individuals

227
Definitions of Scaled Data

• Understanding the nature of data and how to represent it can affect the types of statistical
tests possible.

• Nominal Scale – data consists of names, labels, or categories. Cannot be arranged in an


ordering scheme. No arithmetic operations are performed for nominal data.

• Ordinal Scale – data is arranged in some order, but differences between data values either
cannot be determined or are meaningless.

• Interval Scale – data can be arranged in some order and for which differences in data
values are meaningful. The data can be arranged in an ordering scheme and differences
can be interpreted.

• Ratio Scale – data that can be ranked and for which all arithmetic operations including
division can be performed. (division by zero is of course excluded) Ratio level data has
an absolute zero and a value of zero indicates a complete absence of the characteristic of
interest.

228
Nominal Scale

Qualitative Variable Possible nominal level data values for the


variable

Blood Types A, B, AB, O

State of Residence Alabama, …, Wyoming

Country of Birth United States, China, other

Time to weigh in!


229
Ordinal Scale

Qualitative Variable Possible Ordinal level data


values

Automobile Sizes Subcompact, compact, intermediate,


full size, luxury

Product rating Poor, good, excellent

Baseball team classification Class A, Class AA, Class AAA, Major


League

230
Interval Scale

Interval Variable Possible Scores

IQ scores of students in Black Belt Training | 100…
(The difference between scores is measurable and has meaning, but a difference of 20 points between 100 and 120 does not indicate that one student is 1.2 times more intelligent.)

231
Ratio Scale

Ratio Variable Possible Scores

Grams of fat consumed per adult in the United States | 0…
(If person A consumes 25 grams of fat and person B consumes 50 grams, we can say that person B consumes twice as much fat as person A. If person C consumes zero grams of fat per day, we can say there is a complete absence of fat consumed on that day. Note that a ratio is interpretable and an absolute zero exists.)

232
Converting Attribute Data to Continuous Data

• Continuous Data is always more desirable

• In many cases Attribute Data can be converted to Continuous

• Which is more useful?


– 15 scratches or Total scratch length of 9.25”
– 22 foreign materials or 2.5 fm/square inch
– 200 defects or 25 defects/hour

233
Descriptive Statistics

Measures of Location (central tendency)


– Mean
– Median
– Mode

Measures of Variation (dispersion)


– Range
– Interquartile Range
– Standard deviation
– Variance

234
Descriptive Statistics

Enter the data as below in an Excel sheet. The data contains Average
Handle time for calls in a call center service BU
Data
5.01
5.01
5.01
5
5.01
5
5.01
4.99
4.99
5
5
5
4.99
5
5
5
5

235
Measures of Location

Mean is:
• Commonly referred to as the average.
• The arithmetic balance point of a distribution of data.
Stat>Basic Statistics>Display Descriptive Statistics…>Graphs…
>Histogram of data, with normal curve

Excel Descriptive Statistics output (Column1):

Mean                 5.001176
Standard Error       0.00169
Median               5
Mode                 5
Standard Deviation   0.006966
Sample Variance      4.85E-05
Kurtosis            -0.67438
Skewness            -0.16095
Range                0.02
Minimum              4.99
Maximum              5.01

Critical descriptive statistics measures are listed below:
1. Mean – the arithmetic average of the data set.
2. Median – the positional average of the data set.
3. Standard Deviation – the square root of variance, with population variance being the average of squared mean differences.

236
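The same summary statistics can be reproduced outside Excel or MINITAB™; a minimal sketch with Python's statistics module, using the AHT data entered above:

```python
import statistics

# Average Handle Time data from the call-center example.
aht = [5.01, 5.01, 5.01, 5, 5.01, 5, 5.01, 4.99, 4.99, 5, 5, 5, 4.99, 5, 5, 5, 5]

print(round(statistics.mean(aht), 6))   # 5.001176
print(statistics.median(aht))           # 5
print(statistics.mode(aht))             # 5
print(round(statistics.stdev(aht), 6))  # 0.006966 (sample Standard Deviation)
```

The values match the Excel output, which is a useful cross-check of the measurement summary.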
Measures of Variation

Range is the:
Difference between the largest observation and the smallest observation in the data
set.
• A small range would indicate a small amount of variability and a large range a large
amount of variability.

Max = 5.01, Min = 4.99, Range = 0.02

Interquartile Range is the:


Difference between the 75th percentile and the 25th percentile.
Q3 = 5.01, Q1 = 5. Thus IQR = 5.01 – 5 = 0.01

Use Range or Interquartile Range when the data distribution is Skewed.
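Both measures are easy to verify in Python; statistics.quantiles with the "inclusive" method reproduces the Q1 and Q3 quoted above for the AHT data:

```python
import statistics

# Average Handle Time data from the call-center example.
aht = [5.01, 5.01, 5.01, 5, 5.01, 5, 5.01, 4.99, 4.99, 5, 5, 5, 4.99, 5, 5, 5, 5]

data_range = max(aht) - min(aht)                               # 5.01 - 4.99 = 0.02
q1, _, q3 = statistics.quantiles(aht, n=4, method="inclusive")
iqr = q3 - q1                                                  # 5.01 - 5 = 0.01
print(round(data_range, 2), round(iqr, 2))
```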


237
Measures of Variation

Standard Deviation is:


Equivalent of the average deviation of values from the Mean for a distribution of data.
A “unit of measure” for distances from the Mean.
Use when data are symmetrical.

[Formulas: sample (s) and population (σ) Standard Deviation.]

Cannot calculate population Standard Deviation because this is sample data.
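In standard notation, the sample and population Standard Deviations are:

```latex
s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}}
\qquad
\sigma = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}}
```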

238
Measures of Variation

Variance is the:
Average squared deviation of each individual data point from the Mean.

[Formulas: sample (s²) and population (σ²) variance.]
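In standard notation, the sample and population variances are:

```latex
s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}
\qquad
\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}
```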

The sample variance (computed with the n − 1 divisor) is an unbiased estimator of the population variance; dividing by n − 1 rather than n corrects the bias introduced by estimating the Mean from the same sample.

239
Normal Distribution

The Normal Distribution is the most recognized distribution in statistics.

What are the characteristics of a Normal Distribution?


– Only random error is present
– Process free of assignable cause
– Process free of drifts and shifts

So what is present when the data is Non-normal?

240
The Normal Curve

The normal curve is a smooth, symmetrical, bell-shaped curve,


generated by the density function.

It is the most useful continuous probability model as many naturally


occurring measurements such as heights, weights, etc. are
approximately Normally Distributed.

241
Normal Distribution

Each combination of Mean and Standard Deviation generates a unique Normal


curve:

“Standard” Normal Distribution

– Has a μ = 0, and σ = 1

– Data from any Normal Distribution can be made to


fit the standard Normal by converting raw scores
to standard scores.

– Z-scores measure how many Standard Deviations from the mean a particular
data-value lies.

242
Normal Distribution

The area under the curve between any 2 points represents the proportion
of the distribution between those points.

The area between the Mean and any other point depends upon the Standard Deviation.

Convert any raw score to a Z-score using the formula: Z = (x – μ) / σ

Refer to a set of Standard Normal Tables to find the proportion between μ and x.
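The conversion and table lookup can be sketched in a few lines of Python, with `statistics.NormalDist` standing in for the Standard Normal Table (the Mean, Standard Deviation and raw score below are invented for illustration):

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of Standard Deviations that x lies from the Mean."""
    return (x - mu) / sigma

# Example: process Mean = 100, Standard Deviation = 10, raw score x = 115
z = z_score(115, 100, 10)            # 1.5
# Proportion of the distribution between the Mean and x:
prop = NormalDist().cdf(z) - 0.5     # matches the Standard Normal Table entry
print(z, round(prop, 4))
```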
243
The Empirical Rule

The Empirical Rule…

-6 -5 -4 -3 -2 -1 +1 +2 +3 +4 +5 +6

68.27 % of the data will fall within +/- 1 Standard Deviation


95.45 % of the data will fall within +/- 2 Standard Deviations
99.73 % of the data will fall within +/- 3 Standard Deviations
99.9937 % of the data will fall within +/- 4 Standard Deviations
99.999943 % of the data will fall within +/- 5 Standard Deviations
99.9999998 % of the data will fall within +/- 6 Standard Deviations
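These coverage figures follow directly from the Standard Normal CDF. A quick check using only Python's standard library reproduces the table:

```python
import math

def coverage(k):
    """Proportion of a Normal Distribution within +/- k Standard Deviations."""
    return math.erf(k / math.sqrt(2))

for k in range(1, 7):
    print(f"+/- {k} sigma: {coverage(k) * 100:.7f}%")
```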

244
The Empirical Rule (cont.)

No matter what the shape of your distribution is, as you travel 3 Standard Deviations
from the Mean, the probability of occurrence beyond that point begins to converge to a
very low number.

245
Why Assess Normality?

While many processes in nature behave according to the Normal Distribution, many
processes in business, particularly in the areas of service and transactions, do not.

There are many types of distributions:

There are many statistical tools that assume Normal Distribution properties in their
calculations.

So understanding just how “Normal” the data are will impact how we look at the
data.

246
Tools for Assessing Normality

The shape of any Normal curve can be calculated based on the Normal
Probability density function.

Tests for Normality basically compare the shape of the calculated curve to
the actual distribution of your data points.

For the purposes of this training, we will focus on 2 ways in MINITAB™ to assess Normality:
– The Anderson-Darling test
– The Normal Probability Plot

Watch that curve!
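To illustrate the idea behind the Anderson-Darling statistic — comparing the empirical distribution of the data against the Normal CDF, with the heaviest weight in the tails — here is a minimal stdlib-only sketch. MINITAB™'s implementation additionally applies small-sample corrections and converts A² to a P-value, which this sketch omits; the example data sets are our own:

```python
import math
from statistics import NormalDist, fmean, stdev

def anderson_darling(data):
    """A-squared statistic for Normality: weighs departures of the
    empirical CDF from the Normal CDF, most heavily in the tails."""
    n = len(data)
    mu, s = fmean(data), stdev(data)           # estimate Mean and sample SD
    z = sorted((x - mu) / s for x in data)     # standardized, sorted scores
    F = NormalDist().cdf
    total = sum((2 * i - 1) * (math.log(F(z[i - 1])) + math.log(1 - F(z[n - i])))
                for i in range(1, n + 1))
    return -n - total / n

# Near-perfect Normal data yield a small A-squared; skewed data a larger one.
ideal = [NormalDist(10, 2).inv_cdf((i - 0.5) / 50) for i in range(1, 51)]
skewed = [math.exp(x / 4) for x in ideal]      # right-skewed transform
print(anderson_darling(ideal), anderson_darling(skewed))
```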

247
Goodness-of-Fit

The Anderson-Darling test uses an empirical cumulative distribution function.

[Plot: Cumulative Percent vs. Raw Data Scale — the Actual Data curve departs from the curve Expected for a Normal Distribution by roughly 20% at either end.]

The Anderson-Darling Goodness-of-Fit test assesses the magnitude of these departures of the actual data from the expected Normal Distribution using an Observed minus Expected formula.

248
The Normal Probability Plot

[Worksheet: the raw data are entered, sorted and ranked; for each value the cumulative probabilities (F1i, 1 – F1i, F2i) and Normal plot positions with a fitted line are computed to construct the Normal Probability Plot.]

Test Hypotheses
H0: Data is normally distributed.
HA: Data is not normally distributed.

Is the data normally distributed or non-normally distributed? The P-value of 0.679287 is well above 0.05, so we fail to reject H0 and can treat the data as normally distributed.

249
If the Data Are Not Normal, Don’t Panic!

• Normal Data are not common in the transactional world.

• There are lots of meaningful statistical tools you can use to analyze your
data (more on that later).

• It just means you may have to think about your data in a slightly
different way.

Don’t touch that button!


250
Normality Exercise

Exercise objective: To demonstrate how to test for Normality.

1. Generate Normal Probability Plots and the graphical summary using the “Descriptive Statistics.MTW” file.
2. Use only the columns Dist A and Dist D.
3. Answer the following quiz questions based on your analysis of this data set.

251
Isolating Special Causes from Common Causes

Special Cause: Variation caused by known factors that result in a non-random distribution of output. Also referred to as “Assignable Cause”.

Common Cause: Variation caused by unknown factors resulting in a steady but random distribution of output around the average of the data. It is the variation left over after Special Cause variation has been removed and typically (though not always) follows a Normal Distribution.

If we know that the basic structure of the data should follow a Normal Distribution, but plots of our data show otherwise, then the data contain Special Causes.

Special Causes = Opportunity


252
Introduction to Graphing

The purpose of Graphing is to:


• Identify potential relationships between variables.
• Identify risk in meeting the critical needs of the Customer, Business and
People.
• Provide insight into the nature of the X’s which may or may not control
Y.
• Show the results of passive data collection.

In this section we will cover…


1. Box Plots
2. Scatter Plots
3. Dot Plots
4. Time Series Plots
5. Histograms

253
Data Sources

Data sources are suggested by many of the tools that have been covered so
far:
– Process Map
– X-Y Matrix
– Fishbone Diagrams
– FMEA

Examples are:
1. Time: Shift, Day of the week, Week of the month, Season of the year
2. Location/position: Facility, Region, Office
3. Operator: Training, Experience, Skill, Adherence to procedures
4. Any other sources?
254
Graphical Concepts

The characteristics of a good graph include:


• Variety of data
• Selection of
– Variables
– Graph
– Range

Information to interpret relationships

Explore quantitative relationships

255
The Histogram

A Histogram is a bar-chart-style graph representing the frequency distribution of your data set.

Steps to make a Histogram:
1. Compute the bin width with the Freedman–Diaconis rule: Bin Width = (2 × IQR) / n^(1/3)
2. Determine the number of bins: Number of Bins = (Max – Min) / Bin Width

Worked example:
1. IQR = 0.01, n = 17
2. Bin Width = 0.007
3. Number of Bins = 3
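The steps above can be sketched as follows. The data range used here is assumed for illustration; with the course example's IQR = 0.01 and n = 17, the exact bin width is ≈ 0.0078 before the slide rounds it to 0.007:

```python
import math

def fd_bin_width(iqr, n):
    """Freedman-Diaconis rule: bin width = (2 * IQR) / n^(1/3)."""
    return 2 * iqr / n ** (1 / 3)

def fd_bin_count(data_min, data_max, width):
    """Number of bins = (Max - Min) / bin width, rounded up."""
    return math.ceil((data_max - data_min) / width)

# Mirroring the slide's example: IQR = 0.01, n = 17
width = fd_bin_width(0.01, 17)
bins = fd_bin_count(5.000, 5.022, width)   # data range assumed for illustration
print(round(width, 4), bins)               # 0.0078 3
```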

256
Histogram Caveat

Joining the top sections of all the bars, you would find that a Normal Distribution takes shape.

Note that the shape of a Histogram is extremely sensitive to sample size — with few data points, the underlying distribution can be hard to recognize.

257
Box Plot

Box Plots summarize data about the shape, dispersion and center of the data and also
help spot outliers.

Box Plots require that one of the variables, X or Y, be categorical or Discrete and the
other be Continuous.

A minimum of 10 observations should be included in generating the Box Plot.

[Figure: a Box Plot showing the maximum value, 75th Percentile, 50th Percentile (Median), Mean and 25th Percentile — the middle 50% of the data falls inside the box — with whiskers extending to min(1.5 × Interquartile Range, minimum value) and outliers plotted beyond them.]
258
Box Plot Anatomy

[Box Plot anatomy, top to bottom:]
Outlier (*)
Upper Limit: Q3 + 1.5(Q3 – Q1)
Upper Whisker
Q3: 75th Percentile
Q2: Median, 50th Percentile (the Box spans Q1 to Q3)
Q1: 25th Percentile
Lower Whisker
Lower Limit: Q1 – 1.5(Q3 – Q1)
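The quartiles, whisker limits and outlier rule can be sketched in a few lines. The quartile convention here is Python's `statistics.quantiles` (MINITAB™ may interpolate slightly differently), and the data set is invented for illustration:

```python
from statistics import quantiles

def box_plot_stats(data):
    """Quartiles, 1.5*IQR whisker limits, and points flagged as outliers."""
    q1, q2, q3 = quantiles(data, n=4)   # 25th, 50th, 75th percentiles
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr              # note the minus sign on the lower limit
    upper = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower or x > upper]
    return q1, q2, q3, lower, upper, outliers

print(box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```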

259
Box Plot Examples

What does this Box Plot tell you?

Which data set seems to have higher variation?

Which data set seems to have a lower measure of central tendency?

Which data set has outliers that need to be eliminated?

Use the sheet Box Plot to draw your own Box Plots and interpret them.

260
Box Plot Example

What is your interpretation on the Box plot above?

261
Time Series Plot

Time Series Plots are excellent tools and a must in all projects for you to understand if there are time-series patterns in the data.

Patterns noted from this Time Series Chart

1. The data increase for 3 months and then experience a fall in the first quarter.

2. The average up-run length (the number of consecutive trending points) is 5.

3. Every half year, performance falls to almost the same levels.

These patterns should be investigated for their causes.

A time series plot template has been made available in the LSSBB Data set.

262
Summary

At this point, you should be able to:

• Explain the various statistics used to express location and spread of data

• Describe characteristics of a Normal Distribution

• Explain Special Cause variation

• Use data to generate various graphs and make interpretations based on their
output

263
Measurement System Analysis

Process Discovery

Six Sigma Statistics

Measurement System Analysis

Basics of MSA

Variables MSA

Attribute MSA

Process Capability

Wrap Up & Action Items

264
Introduction to MSA

So far we have learned that the heart and soul of Six Sigma is that it is a
data-driven methodology.
– How do you know that the data you have used are accurate and precise?
– How do you know if a measurement is repeatable and reproducible?

How good are these?

Measurement System Analysis


or
MSA
265
Measurement System Analysis

MSA is a mathematical procedure to quantify variation introduced to a process


or product by the act of measuring.

[Diagram: inputs to the Measurement Process — Item to be Measured, Reference, Operator, Equipment, Procedure and Environment — combine to produce the Measurement.]

The item to be measured can be a physical part, document or a scenario for customer service.
Operator can refer to a person or can be different instruments measuring the same products.
Reference is a standard that is used to calibrate the equipment.
Procedure is the method used to perform the test.
Equipment is the device used to measure the product.
Environment is the surroundings where the measures are performed.

266
Measurement Purpose

In order to be worth collecting, measurements must provide value - that is, they
must provide us with information and ultimately, knowledge

The question…

What do I need to know?


…must be answered before we begin to consider issues of measurements, metrics, statistics, or
data collection systems

Too often, organizations build complex data collection and information


management systems without truly understanding how the data collected and
metrics calculated actually benefit the organization.

267
Purpose

The purpose of MSA is to assess the error due to measurement systems.


The error can be partitioned into specific sources:
– Precision
• Repeatability - within an operator or piece of equipment
• Reproducibility - operator to operator or attribute gage to attribute gage
– Accuracy
• Stability - accuracy over time
• Linearity- accuracy throughout the measurement range
• Resolution
• Bias – Off-set from true value
– Constant Bias
– Variable Bias – typically seen with electronic equipment, amount of
Bias changes with setting levels

268
Accuracy and Precision

Accurate but not precise – On average, the shots are in the center of the target, but there is a lot of variability.

Precise but not accurate – The average is not on the center, but the variability is small.

269
MSA Uses

MSA can be used to:

Compare internal inspection standards with the standards of your customer.

Highlight areas where calibration training is required.

Provide a method to evaluate inspector training effectiveness, as well as serving as an excellent training tool.

Provide a great way to:


–Compare existing measurement equipment.
–Qualify new inspection equipment.

270
Why MSA?

Measurement System Analysis is important to:


• Study the % of variation in our process that is caused by our measurement
system.
• Compare measurements between operators.
• Compare measurements between two (or more) measurement devices.
• Provide criteria to accept new measurement systems (consider new
equipment).
• Evaluate a suspect gage.
• Evaluate a gage before and after repair.
• Determine true process variation.
• Evaluate effectiveness of training program.

271
Appropriate Measures

Appropriate Measures are:

• Sufficient – available to be measured regularly

• Relevant – help to understand/isolate the problems

• Representative – of the process across shifts and people

• Contextual – collected with other relevant information that might explain process variability

272
Poor Measures

Poor Measures can result from:


• Poor or non-existent operational definitions
• Difficult measures
• Poor sampling
• Lack of understanding of the definitions
• Inaccurate, insufficient or non-calibrated measurement devices

Measurement Error compromises decisions that affect:


– Customers
– Producers
– Suppliers

273
Examples of What to Measure

Examples of what and when to measure:


• Primary and secondary metrics
• Decision points in Process Maps
• Any and all gauges, measurement devices, instruments, etc
• “X’s” in the process
• Prior to Hypothesis Testing
• Prior to modeling
• Prior to planning designed experiments
• Before and after process changes
• To qualify operators

MSA is a Show Stopper!!!


274
Components of Variation

Whenever you measure anything, the variation that you observe can be segmented
into the following components…

Observed Variation
– Unit-to-unit (true) Variation
– Measurement System Error
  • Precision: Repeatability, Reproducibility
  • Accuracy: Stability, Bias, Linearity

All measurement systems have error. If you don’t know how much of the variation you
observe is contributed by your measurement system, you cannot make confident decisions.

If you were one speeding ticket away from losing your license, how fast would you be
willing to drive in a school zone?
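The breakdown above rests on the fact that, for independent sources, variances add: observed variance = unit-to-unit variance + measurement system variance. A small simulation (the Standard Deviations below are invented for illustration) demonstrates this:

```python
import random
from statistics import pvariance

random.seed(42)
PART_SD, MEAS_SD = 3.0, 1.0   # assumed true Standard Deviations

# Each observation = true part value + independent measurement error
true_parts = [random.gauss(100, PART_SD) for _ in range(10_000)]
observed = [p + random.gauss(0, MEAS_SD) for p in true_parts]

# Observed variance is close to 3.0**2 + 1.0**2 = 10
print(pvariance(observed))
```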

275
Precision

A precise metric is one that returns the same value of a given attribute
every time an estimate is made.

Precise data are independent of who estimates them or when the estimate
is made.

Precision can be partitioned into two components:


– Repeatability
– Reproducibility

Repeatability and Reproducibility = Gage R&R

276
Repeatability

Repeatability is the variation in measurements obtained with one measurement instrument used several times by one appraiser while measuring the identical characteristic on the same part.

[Figure: repeated measurements by a single appraiser form a narrow distribution labeled Repeatability.]
For example:
– Manufacturing: One person measures the purity of multiple samples of the same
vial and gets different purity measures.
– Transactional: One person evaluates a contract multiple times (over a period of
time) and makes different determinations of errors.

277
Reproducibility

Reproducibility is the variation in the average of the measurements made by different appraisers using the same measuring instrument when measuring the identical characteristic on the same part.

[Figure: Operator A's and Operator B's measurement distributions (Y) center on different averages, labeled Reproducibility.]

For example:
– Manufacturing: Different people perform purity test on samples from the same
vial and get different results.
– Transactional: Different people evaluate the same contract and make different
determinations.

278
Time Estimate Exercise

Exercise objective: Demonstrate how well you can estimate a 10 second time interval.

1. Pair up with an associate.
2. One person will say start and stop to indicate how long they think the 10 seconds lasts. Do this 6 times.
3. The other person will have a watch with a second hand to actually measure the duration of the estimate. Record the value where your partner can’t see it.
4. Switch tasks with your partner and do it 6 times also.
5. Record all estimates. What do you notice?

279
Accuracy

Accuracy is the difference between the observed average of the measurements and a reference value.
– When a metric or measurement system consistently over- or under-estimates the value of an attribute, it is said to be “inaccurate”.

Accuracy can be assessed in several ways:
– Measurement of a known standard
– Comparison with another known measurement method
– Prediction of a theoretical value

What happens if we don’t have standards, comparisons or theories?

[Figure: the gap between the True Average and the average of the Measurements is the Accuracy (Bias).]

Warning: do not assume your metrology reference is gospel.

280
Accuracy Against a Known Standard

In transactional processes, the measurement system can consist of a database


query.
– For example, you may be interested in measuring product returns where
you will want to analyze the details of the returns over some time period.
– The query will provide you all the transaction details.

However, before you invest a lot of time analyzing the data, you must ensure the
data has integrity.
– The analysis should include a comparison with known reference points.
– For the example of product returns, the transaction details should add up
to the same number that appears on financial reports, such as the income
statement.

281
Accuracy vs. Precision

[Target diagrams: ACCURATE + PRECISE = BOTH; a fourth target shows NEITHER.]

Accuracy relates to how close the average of the shots is to the Master or bull's-eye.

Precision relates to the spread of the shots, or Variance.

282
Bias

Bias is defined as the deviation of the measured value from the actual value.

Calibration procedures can minimize and control Bias within acceptable limits, but because of material wear and tear, Bias can never be eliminated entirely.

283
Stability

Stability of a gauge is defined as error (measured in terms of Standard Deviation) as a function of time. Environmental conditions such as cleanliness, noise, vibration, lighting, chemicals, wear and tear, or other factors usually influence gauge instability. Gauges can be maintained to give a high degree of Stability, but instability can never be entirely eliminated. Gage Stability studies should be the first exercise after calibration procedures.

Control Charts are commonly used to track the Stability of a measurement system over time.

[Figure: a measurement distribution drifting over time.]

Stability is Bias characterized as a function of time!

284
Linearity

Linearity is defined as the difference in Bias values throughout the measurement range in
which the gauge is intended to be used. This tells you how accurate your measurements
are through the expected range of the measurements. It answers the question, "Does my
gage have the same accuracy for all sizes of objects being measured?"

Linearity = |Slope| × Process Variation

% Linearity = |Slope| × 100

[Figure: Bias (y) plotted against Reference Value (x) at Low, Nominal and High settings; the fitted line is y = a + b·x, where y = Bias, x = Reference Value, a = Intercept and b = Slope.]
285
Types of MSA’s

MSA’s fall into two categories:


– Attribute
– Variable

Attribute examples: Pass/Fail, Go/No Go, Document Preparation, Surface imperfections, Customer Service Response

Variable examples: Continuous scale, Discrete scale, Critical dimensions, Pull strength, Warp

Transactional projects typically have Attribute based measurement systems. Manufacturing projects generally use Variable studies more often, but do use Attribute studies to a lesser degree.

286
Variable MSA’s

MINITAB™ calculates a column of variance components (VarComp) which are used to calculate % Gage R&R using the ANOVA Method.

Measured Value = True Value + Measurement Error

Estimates for a Gage R&R study are obtained by calculating the variance components for each term and for error. The Repeatability, Operator and Operator*Part components are summed to obtain the total Variability due to the measuring system.

We use variance components to assess the Variation contributed by each source of measurement error relative to the total Variation.
287
Session Window Cheat Sheet

Contribution of Variation to the total Variation of the study:

% Contribution, based on variance components, is calculated by dividing each value in VarComp by the Total Variation, then multiplying the result by 100.

Use % Study Var when you are interested in comparing the measurement system Variation to the total Variation. % Study Var is calculated by dividing each value in Study Var by the Total Variation Study Var and multiplying by 100.

Study Var is calculated as 5.15 times the Standard Deviation for each source. (5.15 is used because when data are normally distributed, 99% of the data fall within 5.15 Standard Deviations.)
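The arithmetic the session window performs can be sketched from a set of variance components. The VarComp numbers below are invented for illustration, not MINITAB™ output:

```python
import math

def session_window(varcomp):
    """%Contribution, Study Var (5.15 * SD) and %Study Var per source."""
    total_var = varcomp["Total Variation"]
    total_sd = math.sqrt(total_var)
    rows = {}
    for source, var in varcomp.items():
        sd = math.sqrt(var)
        rows[source] = {
            "%Contribution": 100 * var / total_var,   # ratio of variances
            "StudyVar": 5.15 * sd,
            "%StudyVar": 100 * sd / total_sd,         # ratio of SDs (5.15 cancels)
        }
    return rows

varcomp = {"Total Gage R&R": 1.0, "Part-To-Part": 4.0, "Total Variation": 5.0}
rows = session_window(varcomp)
print(rows["Total Gage R&R"])   # 20% contribution, ~44.7% of study variation
```

Note how %Contribution values add to 100 across Gage R&R and Part-To-Part, while %Study Var values do not, because Standard Deviations are not additive.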

288
Session Window Cheat Sheet

Session Window explanations

When the process tolerance is entered in the system, MINITAB™ calculates % Tolerance, which compares measurement system Variation to the customer specification. This allows us to determine the proportion of the process tolerance that is used by the Variation in the measurement system.

Always round down to the nearest whole number.

289
Number of Distinct Categories

The number of distinct categories tells you how many separate groups of parts the system is able to distinguish.

1 Data Category: Unacceptable for estimating process parameters and indices; only indicates whether the process is producing conforming or nonconforming parts.

2–4 Categories: Generally unacceptable for estimating process parameters and indices; only provides coarse estimates.

5 or more Categories: Recommended.
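MINITAB™ computes the number of distinct categories as 1.41 × (part Standard Deviation ÷ gage Standard Deviation), truncated to a whole number (the 1.41 is an approximation of √2). A quick sketch, with invented Standard Deviations:

```python
import math

def distinct_categories(part_sd, gage_sd):
    """Number of distinct categories = 1.41 * (part SD / gage SD),
    rounded down to a whole number."""
    return math.floor(1.41 * part_sd / gage_sd)

print(distinct_categories(4.0, 1.0))  # 5 -> recommended
print(distinct_categories(2.0, 1.0))  # 2 -> only coarse estimates
```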

290
AIAG Standards for Gage Acceptance

Here are the Automotive Industry Action Group’s definitions for Gage
acceptance.

% Tolerance or % Study Variance | % Contribution | System is…
10% or less                     | 1% or less     | Ideal
10% – 20%                       | 1% – 4%        | Acceptable
20% – 30%                       | 5% – 9%        | Marginal
30% or greater                  | 10% or greater | Poor

291
MINITAB™ Graphic Output Cheat Sheet

[Graph: Gage R&R (ANOVA) for Data — Gage name: Sample Study - Caliper; Date of study: 2-10-01; Reported by: B Wheat. Panels: Components of Variation, R Chart by Operator, Xbar Chart by Operator, By Part, By Operator, Operator*Part Interaction.]

MINITAB™ breaks down the Variation in the measurement system into specific sources. In the Components of Variation panel, each cluster of bars represents a source of variation. By default, each cluster will have two bars, corresponding to %Contribution and %StudyVar. If you add a tolerance and/or historical sigma, bars for %Tolerance and/or %Process are added.

In a good measurement system, the largest component of Variation is Part-to-Part variation. If instead you have large amounts of Variation attributed to Gage R&R, then corrective action is needed.
292
MINITAB™ Graphic Output Cheat Sheet

[Graph: the same Gage R&R (ANOVA) panel as the previous slide.]

MINITAB™ provides an R Chart and Xbar Chart by Operator. The R Chart consists of the following:
– The plotted points are the difference between the largest and smallest measurements on each part for each operator. If the measurements are the same, then the range = 0.
– The Center Line is the grand average for the process.
– The Control Limits represent the amount of variation expected for the subgroup ranges. These limits are calculated using the variation within subgroups.

If any of the points on the graph go above the Upper Control Limit (UCL), then that operator is having problems consistently measuring parts. The UCL value takes into account the number of measurements by an operator on a part and the variability between parts. If the operators are measuring consistently, then these ranges should be small relative to the data and the points should stay in control.
293
MINITAB™ Graphic Output Cheat Sheet

[Graph: the same Gage R&R (ANOVA) panel as the previous slides.]

MINITAB™ provides an R Chart and Xbar Chart by Operator. The Xbar Chart compares the part-to-part variation to repeatability, and consists of the following:
– The plotted points are the average measurement on each part for each operator.
– The Center Line is the overall average for all part measurements by all operators.
– The Control Limits (UCL and LCL) are based on the variability between parts and the number of measurements in each average.

Because the parts chosen for a Gage R&R study should represent the entire range of possible parts, this graph should ideally show lack-of-control. Lack-of-control exists when many points are above the Upper Control Limit and/or below the Lower Control Limit.

In this case there are only a few points out of control, which indicates the measurement system is inadequate.
294
MINITAB™ Graphic Output Cheat Sheet

[Graph: the same Gage R&R (ANOVA) panel as the previous slides.]

MINITAB™ provides an interaction chart that shows the average measurements taken by each operator on each part in the study, arranged by part. Each line connects the averages for a single operator.

Ideally, the lines will follow the same pattern and the part averages will vary enough that differences between parts are clear.

Pattern | Means…
Lines are virtually identical | Operators are measuring the parts the same
One line is consistently higher or lower than the others | That operator is measuring parts consistently higher or lower than the others
Lines are not parallel, or they cross | The operator’s ability to measure a part depends on which part is being measured (an interaction between operator and part)
295
MINITAB™ Graphic Output Cheat Sheet

[Graph: the same Gage R&R (ANOVA) panel as the previous slides.]

MINITAB™ generates a “by operator” chart that helps us determine whether the measurements and variability are consistent across operators. The by-operator graph shows all the study measurements arranged by operator. Dots represent the measurements; the circle-cross symbols represent the means. The red line connects the average measurements for each operator.

If the red line is… | Then…
Parallel to the x-axis | The operators are measuring the parts similarly
Not parallel to the x-axis | The operators are measuring the parts differently

You can also assess whether the overall Variability in part measurement is the same using this graph. Is the spread in the measurements similar? Or is one operator more Variable than the others?
296
MINITAB™ Graphic Output Cheat Sheet

[Graph: the same Gage R&R (ANOVA) panel as the previous slides.]

MINITAB™ allows us to analyze all of the measurements taken in the study arranged by part. The measurements are represented by dots; the means by the circle-cross symbol. The red line connects the average measurements for each part.

Ideally:
• Multiple measurements for each individual part have little variation (the dots for one part will be close together).
• Averages will vary enough that differences between parts are clear.

297
Practical Conclusions

For this example, the measuring system contributes a great deal to the overall Variation, as confirmed
by both the Gage R&R table and graphs.
The Variation due to the measurement system, as a percent of study Variation, is causing 92.21% of
the Variation seen in the process.
By AIAG Standards this gage should not be used. By all standards, the
data being produced by this gage is not valid for analysis.

% Tolerance or
% Study Variance     % Contribution      System is…

10% or less          1% or less          Ideal
10% - 20%            1% - 4%             Acceptable
20% - 30%            5% - 9%             Marginal
30% or greater       10% or greater      Poor
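These acceptance bands are easy to encode for quick checks. A minimal sketch (the helper name and thresholds-as-code are ours, not from the course material or any standard library):

```python
# Hypothetical helper: applies the AIAG-style acceptance bands above to a
# gage's % Tolerance (or % Study Variance) figure.
def classify_gage(pct_tol):
    """Return the acceptance rating for a %Tolerance / %Study Variance value."""
    if pct_tol <= 10:
        return "Ideal"
    elif pct_tol <= 20:
        return "Acceptable"
    elif pct_tol <= 30:
        return "Marginal"
    return "Poor"

print(classify_gage(17.05))   # prints "Acceptable"
print(classify_gage(92.21))   # prints "Poor" -- this example's measurement system
```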

298
Repeatability and Reproducibility Problems

Repeatability Problems:
• Calibrate or replace gage.
• If only occurring with one operator, re-train.

Reproducibility Problems:
• Measurement machines
– Similar machines
• Ensure all have been calibrated and that the standard measurement method is being
utilized.
– Dissimilar machines
• One machine is superior.
• Operators
– Training and skill level of the operators must be assessed.
– Operators should be observed to ensure that standard procedures are followed.
• Operator/machine by part interactions
– Understand why the operator/machine had problems measuring some parts and not others.
• Re-measure the problem parts
• Problem could be a result of gage linearity
• Problem could be fixture problem
• Problem could be poor gage design

299
Design Types

Crossed Design
• A Crossed Design is used only in non-destructive testing and assumes that all the parts can be
measured multiple times by either operators or multiple machines.
– Gives the ability to separate part-to-part Variation from measurement system Variation.
– Assesses Repeatability and Reproducibility.
– Assesses the interaction between the operator and the part.

Nested Design
• A Nested Design is used for destructive testing (we will learn about this in MBB training) and also
situations where it is not possible to have all operators or machines measure all the parts multiple
times.
– Destructive testing assumes that all the parts within a single batch are identical enough to claim
they are the same.
– Nested designs are used to test measurement systems where it is not possible (or desirable) to
send operators with parts to different locations.
– Do not include all possible combinations of factors.
– Uses a slightly different mathematical model than the Crossed Design.

300
Gage R & R Study

Gage R&R Study


– Is a set of trials conducted to assess the Repeatability and Reproducibility of the
measurement system.
– Multiple people measure the same characteristic of the same set of multiple units
multiple times (a crossed study).
– Example: 10 units are measured by 3 people. These units are then randomized and a
second measure on each unit is taken.

A Blind Study is extremely desirable.


– Best scenario: operator does not know the measurement is a part of a test
– At minimum: operators should not know which of the test parts they are currently
measuring.

NO, not that kind of R&R!

301
Variable Gage R & R Steps

Step 1: Call a team meeting and introduce the concepts of the Gage R&R
Step 2: Select parts for the study across the range of interest
– If the intent is to evaluate the measurement system throughout the process range, select parts
throughout the range
– If only a small improvement is being made to the process, the range of interest is now the
improvement range
Step 3: Identify the inspectors or equipment you plan to use for the analysis
– In the case of inspectors, explain the purpose of the analysis and that the inspection system is
being evaluated, not the people
Step 4: Calibrate the gage or gages for the study
– Remember Linearity, Stability and Bias
Step 5: Have the first inspector measure all the samples once in random order
Step 6: Have the second inspector measure all the samples in random order
– Continue this process until all the operators have measured all the parts one time
– This completes the first replicate
Step 7: Repeat steps 5 and 6 for the required number of replicates
– Ensure there is always a delay between the first and second inspection
Step 8: Enter the data into MINITABTM and analyze your results
Step 9: Draw conclusions and make changes if necessary

302
Gage R & R Study

Part Allocation From Any Population

10 x 3 x 2 Crossed Design is shown


A minimum of two measurements/part/operator is required
Three is better!

Parts 1 – 10

   Operator 1:  Trial 1, Trial 2
   Operator 2:  Trial 1, Trial 2
   Operator 3:  Trial 1, Trial 2

Each of the 10 parts is measured on both trials by each of the three operators.
303
Data Collection Sheet

Create a data collection sheet for:


– 10 parts
– 3 operators
– 2 trials
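The crossed layout can also be generated programmatically instead of by hand. A minimal sketch in Python (the operator names are placeholders; the course itself uses an Excel/MINITABTM template):

```python
# A sketch (not the course template): builds the long-format data collection
# sheet for a 10 x 3 x 2 crossed Gage R&R study.
from itertools import product

parts = range(1, 11)          # 10 parts
operators = ["A", "B", "C"]   # 3 operators (placeholder names)
trials = [1, 2]               # 2 trials

# One row per measurement: Part, Operator, Trial, blank Response to fill in.
sheet = [{"Part": p, "Operator": o, "Trial": t, "Response": None}
         for o, t, p in product(operators, trials, parts)]

print(len(sheet))  # prints 60: 10 parts x 3 operators x 2 trials
```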

304
The Data Collection Sheet

305
Gage R & R

Open the file “Gageaiag2.MTW” to view the worksheet.

Variables:
– Part
– Operator
– Response

306
Gage R & R

Use 1.0 for the tolerance.

307
Graphical Output

Looking at the “Components of Variation” chart, the Part to Part Variation needs to be larger than Gage
Variation.

If in the “Components of Variation” chart the “Gage R&R” bars are larger than the “Part-to-Part” bars,
then all your measurement Variation is in the measuring tool, i.e., “maybe the gage needs to be
replaced”. The same concept applies to the “Response by Operator” chart. If there is extreme Variation
within operators, then the training of the operators is suspect.

[Graph: Gage R&R (ANOVA) for Response – six panels: Components of Variation
(Part to Part Variation needs to be larger than Gage Variation); Response by Part;
R Chart by Operator (UCL = 0.1089, R-bar = 0.0333, LCL = 0); Response by Operator
(operator error visible); Xbar Chart by Operator (UCL = 0.8774, X-double-bar =
0.8147, LCL = 0.7520); Operator * Part Interaction for operators 1 – 3 across
parts 1 – 10.]

308
Session Window

Two-Way ANOVA Table With Interaction


Source DF SS MS F P
Part 9 1.89586 0.210651 193.752 0.000
Operator 2 0.00706 0.003532 3.248 0.062
Part * Operator 18 0.01957 0.001087 1.431 0.188
Repeatability 30 0.02280 0.000760
Total 59 1.94529
Gage R&R
%Contribution
Source VarComp (of VarComp)
Total Gage R&R 0.0010458 2.91
Repeatability 0.0007600 2.11
Reproducibility 0.0002858 0.79
Operator 0.0001222 0.34
Operator*Part 0.0001636 0.45
Part-To-Part 0.0349273 97.09
Total Variation 0.0359731 100.00
Number of Distinct Categories = 8

I can see clearly now!


309
Session Window

If the Variation due to Gage R & R is high, consider:


• Procedures revision?
• Gage update?
• Operator issue?
• Tolerance validation?

    1 % < % Tol GRR < 10 %   →   Gage Preferable
   10 % < % Tol GRR < 20 %   →   Gage Acceptable
   20 % < % Tol GRR < 30 %   →   Gage Unacceptable

Study Var %Study Var %Tolerance


Source StdDev (SD) (6 * SD) (%SV) (SV/Toler)
Total Gage R&R 0.032339 0.19404 17.05 19.40
Repeatability 0.027568 0.16541 14.54 16.54
Reproducibility 0.016907 0.10144 8.91 10.14
Operator 0.011055 0.06633 5.83 6.63
Operator*Part 0.012791 0.07675 6.74 7.67
Part-To-Part 0.186889 1.12133 98.54 112.13
Total Variation 0.189666 1.13800 100.00 113.80

Number of Distinct Categories = 8

310
Signal Averaging

Signal Averaging can be used to reduce Repeatability error when a better gage is
not available.
– Uses average of repeat measurements.
– Uses Central Limit theorem to estimate how many repeat measures are
necessary.

Signal Averaging is a method to


reduce Repeatability error in a poor
gage when a better gage is not
available or when a better gage is
not possible.

311
Signal Averaging Example

Suppose SV/Tolerance is 35%.

SV/Tolerance must be 15% or less to use gage.

Suppose the Standard Deviation for one part measured by one person many times is
9.5.

Determine what the new reduced Standard Deviation should be.

312
Signal Averaging Example

Determine sample size:

Using the average of 6 repeated measures will reduce the Repeatability component
of measurement error to the desired 15% level.

This method should be considered temporary!
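The sample size of 6 follows from the Central Limit Theorem: averaging n repeat measures shrinks the Repeatability Standard Deviation by the square root of n. A sketch of the arithmetic, using the numbers from this example (the variable names are ours):

```python
import math

sv_tol_now = 35      # current SV/Tolerance, percent
sv_tol_goal = 15     # required SV/Tolerance, percent
sd_now = 9.5         # Standard Deviation of one part, one person, many measures

# The target Standard Deviation scales with the required SV/Tolerance ratio.
sd_goal = sd_now * sv_tol_goal / sv_tol_now          # about 4.07

# Averaging n measures gives sd_now / sqrt(n); solve for n and round up.
n = math.ceil((sd_now / sd_goal) ** 2)

print(round(sd_goal, 2), n)  # prints 4.07 6
```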


313
Paper Cutting Exercise

Exercise objective: Perform and Analyze a variable MSA Study.

1. Cut a piece of paper into 12 different lengths that are all fairly close to
one another but not too uniform. Label the back of the piece of paper to
designate its “part number”
2. Perform a variable gage R&R study as outlined in this module. Use the
following guidelines:
– Number of parts: 12
– Number of inspectors: 3
– Number of trials: 5
3. Create a MINITABTM data sheet and enter the data into the sheet as each
inspector performs a measurement. If possible, assign one person to
data collection.
4. Analyze the results and discuss with your mentor.

314
Attribute MSA

A methodology used to assess Attribute Measurement Systems.

Attribute Gage Error

Repeatability          Reproducibility          Calibration
– They are used in situations where a continuous measure cannot be obtained.
– They require a minimum of 5x as many samples as a continuous study.
– Disagreements should be used to clarify operational definitions for the
categories.
• Attribute data are usually the result of human judgment (which category does this
item belong in).
• When categorizing items (good/bad; type of call; reason for leaving) you need a
high degree of agreement on which way an item should be categorized.

315
Attribute MSA Purpose

The purpose of an Attribute MSA is:


– To determine if all inspectors use the same criteria to determine “pass” from “fail”.
– To assess your inspection standards against your customer’s requirements.
– To determine how well inspectors are conforming to themselves.
– To identify how inspectors are conforming to a “known master,” which includes:
• How often operators ship defective product.
• How often operators dispose of acceptable product.
– Discover areas where:
• Training is required.
• Procedures must be developed.
• Standards are not available.

An Attribute MSA is similar in many ways to the continuous MSA, including the purposes. Do you
have any visual inspections in your processes? In your experience how effective have they been?

316
Visual Inspection Test

Take 60 seconds and count the number of times “F” appears in this paragraph.

The Necessity of Training Farm Hands for First Class Farms in the
Fatherly Handling of Farm Live Stock is Foremost in the Eyes of
Farm Owners. Since the Forefathers of the Farm Owners Trained the
Farm Hands for First Class Farms in the Fatherly Handling of Farm
Live Stock, the Farm Owners Feel they should carry on with the
Family Tradition of Training Farm Hands of First Class Farmers in
the Fatherly Handling of Farm Live Stock Because they Believe it is
the Basis of Good Fundamental Farm Management.

317
How can we Improve Visual Inspection?

Visual Inspection can be improved by:


• Operator Training & Certification
• Develop Visual Aids/Boundary Samples
• Establish Standards
• Establish Set-Up Procedures
• Establish Evaluation Procedures
– Evaluation of the same location on each part.
– Each evaluation performed under the same lighting.
– Ensure all evaluations are made with the same standard.

Look closely now!


318
Attribute Agreement Analysis

Attribute Column: the responses by the operators in the study.


Samples: The I.D. for the individual pieces.
Appraisers: the name or I.D. for each operator in the study.

If there is a known true answer, the


column containing that answer goes
here. (Accuracy assessment)
319
Attribute Agreement Analysis

[Graph: Assessment Agreement – Appraiser vs Standard, with 95.0% CI bars. Percent
agreement is plotted for each appraiser: Duncan, Hayes, Holmes, Montgomery,
Simpson.]

This graph shows how each appraiser compared to the right answer, accuracy. The blue
dot is the actual percentage for each operator. The red line with the X on each end is the
confidence interval. Duncan agreed with the standard 53% of the time. We are 95%
confident, based on this study, that Duncan will agree with the standard between 27% and
79% of the time. To decrease the interval, add more parts to the study.

320
Attribute Agreement Analysis

Attribute Agreement Analysis for Rating

Each Appraiser vs Standard

Assessment Agreement

Appraiser # Inspected # Matched Percent 95 % CI


Duncan 15 8 53.33 (26.59, 78.73)
Hayes 15 13 86.67 (59.54, 98.34)
Holmes 15 15 100.00 (81.90, 100.00)
Montgomery 15 15 100.00 (81.90, 100.00)
Simpson 15 14 93.33 (68.05, 99.83)

# Matched: Appraiser's assessment across trials agrees with the known standard.

321
Attribute Agreement Analysis

Between Appraisers

Assessment Agreement

# Inspected # Matched Percent 95 % CI


15 6 40.00 (16.34, 67.71)

# Matched: All appraisers' assessments agree with each other.

All Appraisers vs Standard

Assessment Agreement

# Inspected # Matched Percent 95 % CI


15 6 40.00 (16.34, 67.71)

# Matched: All appraisers' assessments agree with the known standard.


322
Kappa Statistics

Fleiss' Kappa Statistics

Appraiser Response Kappa SE Kappa Z P(vs > 0)


Duncan -2 0.58333 0.258199 2.25924 0.0119
-1 0.16667 0.258199 0.64550 0.2593
0 0.44099 0.258199 1.70796 0.0438
1 0.44099 0.258199 1.70796 0.0438
2 0.42308 0.258199 1.63857 0.0507
Overall 0.41176 0.130924 3.14508 0.0008
Simpson -2 1.00000 0.258199 3.87298 0.0001
-1 1.00000 0.258199 3.87298 0.0001
0 0.81366 0.258199 3.15131 0.0008
1 0.81366 0.258199 3.15131 0.0008
2 1.00000 0.258199 3.87298 0.0001
Overall 0.91597 0.130924 6.99619 0.0000
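Kappa compares observed agreement to the agreement expected by chance: kappa = (p_observed − p_chance) / (1 − p_chance). A sketch for the two-rater (Cohen) case with invented ratings; the table above uses Fleiss' multi-rater generalization, which follows the same idea:

```python
# Cohen's kappa for one appraiser vs. the known standard (ratings invented
# for illustration -- not the study data from the table above).
from collections import Counter

standard  = ["Pass", "Pass", "Fail", "Pass", "Fail", "Fail", "Pass", "Pass"]
appraiser = ["Pass", "Fail", "Fail", "Pass", "Fail", "Pass", "Pass", "Pass"]

n = len(standard)
p_observed = sum(a == s for a, s in zip(appraiser, standard)) / n   # 6/8 agree

# Chance agreement: sum over categories of the product of marginal proportions.
s_counts, a_counts = Counter(standard), Counter(appraiser)
p_chance = sum((s_counts[c] / n) * (a_counts[c] / n) for c in s_counts)

kappa = (p_observed - p_chance) / (1 - p_chance)
print(round(kappa, 2))  # prints 0.47
```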

323
M&M Exercise

Exercise objective: Perform and Analyze an Attribute MSA Study.

• You will need the following to complete the study:


– A bag of M&Ms containing 50 or more “pieces”
– The attribute value for each piece.
– Three or more inspectors.

• Judge each M&M as pass or fail.
   – The customer has indicated that they want a bright and shiny M&M and that
     they like M’s.

• Pick 50 M&Ms out of a package.

• Enter results into either the Excel template or MINITABTM and draw conclusions.

• The instructor will represent the customer for the Attribute score.

Example data sheet:

   Number   Part   Attribute
   1        M&M    Pass
   2        M&M    Fail
   3        M&M    Pass

324
Summary

At this point, you should be able to:

• Understand Precision & Accuracy

• Understand Bias, Linearity and Stability

• Understand Repeatability & Reproducibility

• Understand the impact of poor gage capability on product quality

• Identify the various components of Variation

• Perform the step by step methodology in Variable and Attribute MSA’s

325
Process Capability

Process Discovery

Six Sigma Statistics

Measurement System Analysis

Process Capability

Continuous Capability

Concept of Stability

Attribute Capability

Wrap Up & Action Items

326
Understanding Process Capability

Process Capability:

• The inherent ability of a process to meet the expectations of the customer without any
additional efforts.

• Provides insight as to whether the process has a:


– Centering Issue (relative to specification limits)
– Variation Issue
– A combination of Centering and Variation
– Inappropriate specification limits

• Allows for a baseline metric for improvement.

*Efforts: Time, Money, Manpower, Technology, and Manipulation

327
Capability as a Statistical Problem

Our Statistical Problem: What is the probability of our process producing a defect?

Define a Practical Problem
   ↓
Create a Statistical Problem
   ↓
Correct the Statistical Problem
   ↓
Apply the Correction to the Practical Problem

328
Capability Analysis

The X’s (Inputs)  →  Y = f(X) (Process Function)  →  The Y’s (Outputs)

Critical X(s): Any variable(s) which exerts an undue influence on the important
outputs (CTQ’s) of a process.

[Diagram: a process step (Op i) feeds a verification point; verified output moves
on to Op i + 1, while failures go to off-line analysis and either correction or
scrap via a “Correctable?” decision. The verified data for Y1…Yn form two views:
a histogram of the output – Variation, the “Voice of the Process” – and the same
data against the specification limits (LSL = 9.96, USL = 10.44) – Requirements,
the “Voice of the Customer” – with defects falling beyond each limit.]

Capability Analysis numerically compares the VOP to the VOC.

329
Process Output Categories

Incapable            Off target            Capable and on target

[Diagram: three Normal curves plotted against LSL, Target and USL – one with
spread wider than the limits (incapable), one shifted away from the Target
(off target), and one Centered with reduced process spread (capable and on
target).]
330
Problem Solving Options – Shift the Mean

This involves finding the variables that will shift the process over to the target.
This is usually the easiest option.

USL
LSL
Shift

331
Problem Solving Options – Reduce Variation

This is typically not so easy to accomplish and occurs often in Six Sigma
projects.

LSL USL

332
Problem Solving Options – Shift Mean & Reduce Variation

This occurs often in Six Sigma projects.

USL
LSL Shift & Reduce

333
Problem Solving Options

Obviously this option implies making the specification limits wider, not narrower.
Customers usually do not go for this option but if they do…it’s the easiest!

LSL USL USL


Move Spec

334
Capability Studies

Capability Studies:
• Are intended to be regular, periodic, estimations of a process’s ability to meet its
requirements.
• Can be conducted on both Discrete and Continuous Data.
• Are most meaningful when conducted on stable, predictable processes.
• Are commonly reported as Sigma Level which is optimal (short term)
performance.
• Require a thorough understanding of the following:
– Customer’s or business’s specification limits
– Nature of long-term vs. short-term data
– Mean and Standard Deviation of the process
– Assessment of the Normality of the data (Continuous Data only)
– Procedure for determining Sigma level

335
Steps to Capability

#1 Select Output for Improvement

#2 Verify Customer Requirements

#3 Validate Specification Limits

#4 Collect Sample Data

#5 Determine Data Type (LT or ST)

#6 Check Data for Normality

#7 Calculate Z-Score, PPM, Yield, Capability (Cp, Cpk, Pp, Ppk)
336
Verifying the Specifications

Questions to consider:

• What is the source of the specifications?


– Customer requirements (VOC)
– Business requirements (target, benchmark)
– Compliance requirements (regulations)
– Design requirements (blueprint, system)

• Are they current? Likely to change?

• Are they understood and agreed upon?


– Operational definitions
– Deployed to the work force

337
Data Collection

Capability Studies should include “all” observations (100% sampling) for a specified period.
Short-term data:
• Collected across a narrow inference space.
• Daily, weekly; for one shift, machine, operator, etc.
• Is potentially free of special cause variation.
• Often reflects the optimal performance level.
• Typically consists of 30 – 50 data points.

Long-term data:
• Is collected across a broader inference space.
• Monthly, quarterly; across multiple shifts, machines, operators, etc.
• Subject to both common and special causes of variation.
• More representative of process performance over a period of time.
• Typically consists of at least 100 – 200 data points.

[Graph: Fill Quantity over time – Lots 1 through 5 shown as individual
short-term studies within one long-term study.]
338
Baseline Performance

Process Baseline: The average, long-term performance level of a process when all
input variables are unconstrained.

[Graph: four short-term performance snapshots (1 – 4) drifting around the
long-term baseline, plotted against LSL, TARGET and USL.]
339
Components of Variation

Even stable processes will drift and shift over time by as much as 1.5 Standard
Deviations on the average.

Long Term
Overall Variation

Short Term
Between Group Variation

Short Term
Within Group Variation

340
Sum of the Squares Formulas

SS total = SS between + SS within

[Graph: Output Y plotted over Time – clusters of points (x) show within-group
precision (short-term capability) and a shift between the groups.]
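The identity above can be checked numerically. A small sketch (the data are invented for illustration): the total sum of squares about the grand mean splits exactly into a between-group and a within-group piece.

```python
# Illustrative data: three subgroups of measurements (invented numbers).
groups = [[10.1, 10.2, 9.9], [10.6, 10.5, 10.7], [9.8, 9.7, 9.9]]

all_points = [x for g in groups for x in g]
grand_mean = sum(all_points) / len(all_points)

# SS total: squared distance of every point from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_points)

# SS within: squared distance of each point from its own group mean.
ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

# SS between: squared distance of each group mean from the grand mean,
# weighted by group size.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# SS total = SS between + SS within (up to floating-point rounding)
print(round(ss_total, 6), round(ss_between + ss_within, 6))
```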

341
Stability

A Stable Process is consistent over time. Time Series Plots and Control Charts are
the typical graphs used to determine stability.

At this point in the Measure Phase there is no reason to assume the process is stable.

Tic toc…
tic toc…

342
Measures of Capability

Mathematically Cpk and Ppk are the same and Cp and Pp are the same.

The only difference is the source of the data, Short-term and Long-term, respectively.

– Cp and Pp
• What is Possible if your process is perfectly Centered Hope
• The Best your process can be
• Process Potential (Entitlement)

– Cpk and Ppk


• The Reality of your process performance
• How the process is actually running
Reality
• Process Capability relative to specification limits

343
Capability Formulas

Cp = (USL – LSL) / (6 × s)
                         …the denominator is six times the sample Standard Deviation

Cpk = min[ (USL – X̄) / (3 × s) , (X̄ – LSL) / (3 × s) ]
                         …X̄ is the Sample Mean; each denominator is three times the
                         sample Standard Deviation

Note: Consider the “K” value the penalty for being off center.
LSL – Lower specification limit    USL – Upper specification limit
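As a sketch of the formulas above, worked in code. The inputs are the sample statistics used in the worked example later in this module (LSL 4.94, USL 5.84, X-bar 5.12, s 0.17); small rounding differences vs. the printed table are expected.

```python
# Sketch of the Cp/Cpk formulas above (function name is ours).
def cp_cpk(lsl, usl, xbar, s):
    cp = (usl - lsl) / (6 * s)      # potential: spec width vs. process width
    cpl = (xbar - lsl) / (3 * s)    # distance to the lower limit
    cpu = (usl - xbar) / (3 * s)    # distance to the upper limit
    return cp, cpl, cpu, min(cpl, cpu)

cp, cpl, cpu, cpk = cp_cpk(4.94, 5.84, 5.12, 0.17)
print(round(cp, 2), round(cpl, 2), round(cpu, 2), round(cpk, 2))
# prints 0.88 0.35 1.41 0.35
```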

344
MINITAB™ Example

Open worksheet “Camshaft.mtw”. Check for Normality.

By looking at the “P-values” the


data look to be Normal since P is
greater than .05

345
MINITAB™ Example

Assuming the same sample data set in the previous exercise, let us find
out the Capability indices of the data set.

Standard
LSL USL Xbar Deviation Cp Cpl Cpu Cpk
4.94 5.84 5.12 0.17 0.87 0.35 1.40 0.35

Review the Capability indices calculation in the Capability indices


worksheet in LSSBB Data Set.

Meaningful business decisions can be made based on the values of Cp and Cpk.

If Cp < 1, what does it mean?


If Cp and Cpk values are too different, what does it mean?

346
Process Performance

Zst 2.61
Zlt 1.04
Zst or Short Term Sigma is calculated with the
formula, Cp * 3.

Zlt or Long Term Sigma is calculated with the


formula, Cpk * 3.

1. Subgroup sizes will alter the way Cp and Cpk are calculated.

2. With subgroups, two sources of variation come into play:
   a. Within-subgroup variation
   b. Between-subgroup variation

347
Capability Steps

We can follow the steps for calculating capability for Continuous Data until we
reach the question about data Normality…

#1 Select Output for Improvement

#2 Verify Customer Requirements

#3 Validate Specification Limits

#4 Collect Sample Data

#5 Determine Data Type (LT or ST)

#6 Check Data for Normality

#7 Calculate Z-Score, PPM, Yield, Capability (Cp, Cpk, Pp, Ppk)
348
Attribute Capability Steps

Notice the difference when we come to step 5…

#1 Select Output for Improvement

#2 Verify Customer Requirements

#3 Validate Specification Limits

#4 Collect Sample Data

#5 Calculate DPU

#6 Find Z-Score

#7 Convert Z-Score to Cp & Cpk
349
Z Scores

Z Score is a measure of the distance in Standard Deviations of a sample from the


Mean.

The Z Score effectively transforms the actual data into standard normal units. By
referring to a standard Z table you can estimate the area under the Normal curve.
– Given an average of 50 with a Standard Deviation of 3, what is the proportion
beyond the upper spec limit of 54?

[Graph: Normal curve with Mean 50; the area beyond the upper spec limit of 54 is
shaded.]
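The area beyond the limit can be read from a Z table, or computed directly. A sketch using only the Python standard library:

```python
from statistics import NormalDist

mean, sd, usl = 50, 3, 54
z = (usl - mean) / sd                  # Z score: distance to the limit in SDs
p_beyond = 1 - NormalDist().cdf(z)     # area under the Normal curve past the USL

print(round(z, 2), round(p_beyond, 4))  # prints 1.33 0.0912
```

About 9.1% of the output falls beyond the upper spec limit, matching a Z-table lookup at Z = 1.33.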

350
Z Table

351
Attribute Capability

Attribute data is always long-term in the shifted condition since it requires so many samples
to get a good estimate with reasonable confidence.

Short-term Capability is typically reported, so a shifting method will be employed to estimate


short-term Capability.

                            You want to estimate:
Your data is:               ZST (Short-Term Capability)    ZLT (Long-Term Capability)
Short Term (ZST)            –                              Subtract 1.5
Long Term (ZLT)             Add 1.5                        –

Sigma     Short-Term     Long-Term
Level     DPMO           DPMO
  1       158655.3       691462.5
  2        22750.1       308537.5
  3         1350.0        66807.2
  4           31.7         6209.7
  5            0.3          232.7
  6            0.0            3.4

352
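The DPMO columns in the table follow directly from the one-sided Normal tail area, with the 1.5 sigma shift applied for long term. A sketch (values match the table to its rounding):

```python
from statistics import NormalDist

def dpmo(z):
    """One-sided defects per million opportunities for a given Z value."""
    return (1 - NormalDist().cdf(z)) * 1_000_000

for sigma in range(1, 7):
    st = dpmo(sigma)          # short term: tail beyond z = sigma level
    lt = dpmo(sigma - 1.5)    # long term: process shifted 1.5 sigma toward the limit
    print(sigma, round(st, 1), round(lt, 1))
```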


Attribute Capability

By viewing these formulas you can see there is a relationship between them.

If we divide our Z short-term by 3 we can determine our Cpk and if we divide our Z long-
term by 3 we can determine our Ppk.

353
Attribute Capability Example

A customer service group is interested in estimating the Capability of their call


center.

A total of 20,000 calls came in during the month but 2,666 of them “dropped” before
they were answered (the caller hung up).

Results of the call center data set:


Samples = 20,000
Defects = 2,666

They hung up….!


354
Attribute Capability Example

1. Calculate DPU
2. Look up DPU value on the Z-Table
3. Find Z-Score
4. Convert Z Score to Cpk, Ppk

Example:
Look up ZLT
ZLT = 1.11
Convert ZLT to ZST = 1.11+1.5 = 1.61
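The four steps can be worked in code. A sketch using the Python standard library; the 1.5 sigma shift and the divide-by-3 conversions follow the earlier slides:

```python
from statistics import NormalDist

samples, defects = 20_000, 2_666

dpu = defects / samples                  # Step 1: DPU = 0.1333
z_lt = NormalDist().inv_cdf(1 - dpu)     # Steps 2-3: long-term Z score (vs. Z table)
z_st = z_lt + 1.5                        # shift long term to short term

ppk = z_lt / 3                           # Step 4: convert Z scores
cpk = z_st / 3

print(round(z_lt, 2), round(z_st, 2), round(ppk, 2), round(cpk, 2))
# prints 1.11 2.61 0.37 0.87
```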

355
Summary

At this point, you should be able to:

• Estimate Capability for Continuous Data

• Estimate Capability for Attribute Data

• Describe the impact of Non-normal Data on the analysis presented in this


module for Continuous Capability

357
Measure Phase
Wrap Up and Action Items
Measure Phase Overview - The Goal

The goal of the Measure Phase is to:

• Define, explore and classify “X” variables using a variety of tools.


– Detailed Process Mapping
– Fishbone Diagrams
– X-Y Matrixes
– FMEA

• Demonstrate a working knowledge of Basic Statistics to use as a communication


tool and a basis for inference.

• Perform Measurement Capability studies on output variables.

• Evaluate stability of process and estimate starting point capability.

359
Six Sigma Behaviors

• Being tenacious, courageous

• Being rigorous, disciplined

• Making data-based decisions

• Embracing change & continuous learning

• Sharing best practices

Walk the Walk!

Each “player” in the Six Sigma process must be
A ROLE MODEL
for the Six Sigma culture
360
Measure Phase Deliverables

Listed below are the Measure Deliverables that each candidate should present in a
Power Point presentation to their mentor and project champion.

At this point you should understand what is necessary to provide these deliverables
in your presentation.
– Team Members (Team Meeting Attendance)
– Primary Metric
– Secondary Metric(s)
– Process Map – detailed
– FMEA
– X-Y Matrix
– Basic Statistics on Y
– MSA
– Stability graphs
– Capability Analysis
– Project Plan
– Issues and Barriers

361
Measure Phase - The Roadblocks

Look for the potential roadblocks and plan to address them before they become
problems:
– Team members do not have the time to collect data.
– Data presented is the best guess by functional managers.
– Process participants do not participate in the creation of the X-Y Matrix,
FMEA and Process Map.

It won’t all be
smooth sailing…..

362
DMAIC Roadmap

Champion / Process Owner

Define
   Identify Problem Area
   Determine Appropriate Project Focus
   Estimate COPQ
   Establish Team

Measure
   Assess Stability, Capability, and Measurement Systems

Analyze
   Identify and Prioritize All X’s
   Prove/Disprove Impact X’s Have On Problem

Improve
   Identify, Prioritize, Select Solutions to Control or Eliminate X’s Causing Problems
   Implement Solutions to Control or Eliminate X’s Causing Problems

Control
   Implement Control Plan to Ensure Problem Doesn’t Return
   Verify Financial Impact

363
Measure Phase

Detailed Problem Statement Determined

Detailed Process Mapping

Identify All Process X’s Causing Problems (Fishbone, Process Map)

Select the Vital Few X’s Causing Problems (X-Y Matrix, FMEA)

Assess Measurement System

Y
Repeatable &
Reproducible?
N

Implement Changes to Make System Acceptable

Assess Stability (Statistical Control)

Assess Capability (Problem with Centering/Spread)

Estimate Process Sigma Level

Review Progress with Champion

Ready for Analyze

364
Measure Phase Checklist

Measure Questions
Identify critical X’s and potential failure modes
• Is the “as is” Process Map created?
• Are the decision points identified?
• Where are the data collection points?
• Is there an analysis of the measurement system?
• Where did you get the data?
Identify critical X’s and potential failure modes
• Is there a completed X-Y Matrix?
• Who participated in these activities?
• Is there a completed FMEA?
• Has the Problem Statement changed?
• Have you identified more COPQ?
Stability Assessment
• Is the “Voice of the Process” stable?
• If not, have the special causes been acknowledged?
• Can the good signals be incorporated into the process?
• Can the bad signals be removed from the process?
• How stable can you make the process?
Capability Assessment
• What is the short-term and long-term Capability of the process?
• Is the problem one of centering, spread or some combination?
General Questions
• Are there any issues or barriers that prevent you from completing this phase?
• Do you have adequate resources to complete the project?

365
Planning for Action

For each item below, record WHAT, WHO, WHEN, WHY, WHY NOT and HOW:

• Identify the complexity of the process
• Focus on the problem solving process
• Define Characteristics of Data
• Validate Financial Benefits
• Balance and Focus Resources
• Establish potential relationships between variables
• Quantify risk of meeting critical needs of Customer, Business and People
• Predict the Risk of sustainability
• Chart a plan to accomplish the desired state of the culture
• What is your defect?
• When does your defect occur?
• How is your defect measured?

366
Summary

At this point, you should:

• Have a clear understanding of the specific action items

• Have started to develop a project plan to complete the action items

• Have identified ways to deal with potential roadblocks

• Be ready to apply the Six Sigma method within your business

367
Analyze Phase
Inferential Statistics

Welcome to Analyze

Inferential Statistics
   Nature of Sampling
   Central Limit Theorem

Intro to Hypothesis Testing

Hypothesis Testing ND P1

Hypothesis Testing ND P2

Hypothesis Testing NND P1

Hypothesis Testing NND P2

Wrap Up & Action Items

369
Nature of Inference

in·fer·ence (n.) “The act or process of deriving logical conclusions from


premises known or assumed to be true. The act of reasoning from factual
knowledge or evidence.” 1 1. Dictionary.com

Inferential Statistics – To draw inferences about the process or population being


studied by modeling patterns of data in a way that accounts for randomness and
uncertainty in the observations. 2
2. Wikipedia.com

Putting the pieces of


the puzzle
together….
370
5 Step Approach to Inferential Statistics

1. What do you want to know?

2. What tool will give you that information?

3. What kind of data does that tool require?

4. How will you collect the data?

5. How confident are you with your data summaries?

So many
questions….?
371
Types of Error

1. Error in sampling
– Error due to differences among samples drawn at random from the population
(luck of the draw).
– This is the only source of error that statistics can accommodate.
2. Bias in sampling
– Error due to lack of independence among random samples or due to systematic
sampling procedures (height of horse jockeys only).
3. Error in measurement
– Error in the measurement of the samples (MSA/GR&R).
4. Lack of measurement validity
– Error in the measurement does not actually measure what it intends to measure
(e.g., placing a probe in the wrong slot, or measuring temperature with a
thermometer that sits next to a furnace).

372
Population, Sample, Observation

Population
– EVERY data point that has ever been or ever will be generated from a given
characteristic.

Sample
– A portion (or subset) of the population, either at one time or over time.

[Diagram: a population cloud with the sampled points marked X.]

Observation
– An individual measurement.

373
Significance

Significance is all about differences. In general, larger differences (or deltas) are
considered to be “more significant.”
Practical difference and significance is:
– The amount of difference, change, or improvement that will be of practical,
economic or technical value to you.
– The amount of improvement required to pay for the cost of making the
improvement.
Statistical difference and significance is:
– The magnitude of difference or change required to distinguish between a true
difference, change or improvement and one that could have occurred by chance.
Six Sigma decisions will ultimately have a return on resource investment
(RORI)* element associated with them.
– The key question of interest for our decisions “is the benefit of making a change
worth the cost and risk of making it?”

* RORI includes not only dollars and assets but the time and participation of your teams.

374
The Mission

Variation
Mean Shift Both
Reduction

Your mission, which you have chosen to accept, is to reduce cycle time, reduce the error rate, reduce costs, reduce
investment, improve service level, improve throughput, reduce lead time, increase productivity… change the output
metric of some process, etc…

In statistical terms, this translates to the need to move the process Mean and/or reduce the process Standard Deviation.

You’ll be making decisions about how to adjust key process input variables based on sample data, not population data
- that means you are taking some risks.

How will you know your key process output variable really changed, and is not just an unlikely sample? The Central
Limit Theorem helps us understand the risk we are taking and is the basis for using sampling to estimate population
parameters.

375
A Distribution of Sample Means

Imagine you have some population. The individual values of this population form
some distribution.

Take a sample of some of the individual values and calculate the sample Mean.

Keep taking samples and calculating sample Means.

Plot a new distribution of these sample Means.

The Central Limit Theorem says that as the sample size becomes large, this new
distribution (the sample Mean distribution) will form a Normal Distribution, no
matter what the shape of the population distribution of individuals.
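This behavior is easy to verify numerically. Below is a minimal simulation sketch (Python with NumPy; the exponential population, the seed and the sample sizes are illustrative assumptions, not course data):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# A strongly right-skewed (non-Normal) population of individuals.
population = rng.exponential(scale=2.0, size=100_000)

def sample_mean_distribution(n, draws=5_000):
    """Draw `draws` samples of size n and return their sample Means."""
    samples = rng.choice(population, size=(draws, n))
    return samples.mean(axis=1)

for n in (2, 10, 30):
    means = sample_mean_distribution(n)
    # The center stays near the population Mean; the spread shrinks like 1/sqrt(n).
    print(f"n={n:2d}  mean of sample Means={means.mean():.3f}  "
          f"std of sample Means={means.std(ddof=1):.3f}")
```

A histogram of `means` for n = 30 looks close to a bell curve even though the population itself is heavily skewed.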

376
Different Distributions

We will draw 10 samples of 3 readings each. Let us see how the Central Limit
Theorem works:

Data (sample of 3)      Sample Mean
5.01, 5.01, 5.01        5.01
5.00, 5.01, 5.00        5.00
5.01, 4.99, 4.99        5.00
5.00, 5.00, 5.00        5.00
4.99, 5.00, 5.00        5.00
5.00, 5.00, 5.15        5.05
5.16, 5.19, 5.19        5.18
5.20, 5.32, 5.34        5.29
5.35, 5.40, 5.42        5.39
5.60, 5.42, 5.36        5.46

1. Find the average of each sample.
2. Find the grand average of all the averages.
3. This is known as the Mean of Means, and is very close to the Population Mean.

Mean of means 5.14
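The table's arithmetic can be reproduced in a few lines (the ten samples of three are transcribed from the slide):

```python
# Ten samples of three readings each, as listed in the table.
samples = [
    [5.01, 5.01, 5.01], [5.00, 5.01, 5.00], [5.01, 4.99, 4.99],
    [5.00, 5.00, 5.00], [4.99, 5.00, 5.00], [5.00, 5.00, 5.15],
    [5.16, 5.19, 5.19], [5.20, 5.32, 5.34], [5.35, 5.40, 5.42],
    [5.60, 5.42, 5.36],
]

# Step 1: the average of each sample.
sample_means = [sum(s) / len(s) for s in samples]

# Step 2: the grand average of all the averages (the Mean of Means).
mean_of_means = sum(sample_means) / len(sample_means)
print(round(mean_of_means, 2))  # 5.14
```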


377
Central Limit Theorem – For estimating standard deviation

1. The population Standard Deviation can be estimated by using the
Standard Deviation across all the sample Means. This is known as the Sampling
Standard Deviation (and not the Sample Standard Deviation).

2. The error associated with sampling is represented by the Standard Error of the
Mean (SEM), given by the formula σ/√n.

Put together, these clauses will help us deal with inferential
statistics with ease.

Of course, the concept of confidence intervals kicks in too, which we will


discuss in later chapters.

378
Observations

As the sample size (number of dice) increases from 1 to 5 to 10, there are three
points to note:
1. The Center remains the same.
2. The variation decreases.
3. The shape of the distribution changes - it tends to become Normal.

The Mean of the sample Mean distribution equals the population Mean. The
Standard Deviation of the sample Mean distribution is also known as the
Standard Error.

Good news: the Mean of the sample Mean Better news: I can reduce my uncertainty
distribution is the Mean of the population. about the population Mean by increasing my
sample size n.

379
Central Limit Theorem

If all possible random samples, each of size n, are taken from any population
with a Mean μ and Standard Deviation σ, the distribution of sample Means will:

have a Mean: μ(x̄) = μ

have a Std Dev: σ(x̄) = σ/√n

and be Normally Distributed when the parent population is Normally
Distributed, or will be approximately Normal for samples of size 30 or more
when the parent population is not Normally Distributed.

This improves with samples of larger size.

Bigger is Better!

380
So What?

So how does this theorem help me understand the risk


I am taking when I use sample data, instead of
population data?

Recall that 95% of Normally Distributed data is within ± 2 Standard Deviations


from the Mean. Therefore, the probability is 95% that my sample Mean is within
2 standard errors of the true population Mean.
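The 95% figure quoted above can be confirmed from the Normal distribution (strictly, ±1.96 Standard Deviations covers exactly 95%; ±2 covers about 95.45% — a quick SciPy check):

```python
from scipy import stats

# Probability mass of a Normal distribution within ±2 Standard Deviations.
within_2sd = stats.norm.cdf(2) - stats.norm.cdf(-2)
print(f"{within_2sd:.4f}")  # 0.9545
```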

381
A Practical Example

Let’s say your project is to reduce the setup time for a large casting:

– Based on a sample of 20 setups, you learn that your baseline average is 45


minutes, with a Standard Deviation of 10 minutes.
– Because this is just a sample, the 45 minute average is just an estimate of
the true average.
– Using the Central Limit Theorem, there is 95% probability that the true
average is somewhere between 40.5 and 49.5 minutes.
– Therefore don’t get too excited if you made a process change that resulted
in a reduction of only 2 minutes.
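The 40.5-to-49.5 interval quoted above is just the sample Mean plus or minus two standard errors; a quick check (Python, using the slide's 2-standard-error approximation):

```python
import math

mean, stdev, n = 45.0, 10.0, 20  # baseline setup-time statistics from the slide

se = stdev / math.sqrt(n)            # standard error of the Mean
lower, upper = mean - 2 * se, mean + 2 * se

print(f"SE = {se:.2f}, 95% CI ≈ ({lower:.1f}, {upper:.1f})")
# SE = 2.24, 95% CI ≈ (40.5, 49.5)
```

A 2-minute change sits well inside this interval, which is why it should not be taken as evidence of real improvement.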

382
Sample Size and the Mean

When taking a sample we have only estimated the true Mean:


– All we know is that the true Mean lies somewhere within the theoretical
distribution of sample Means or the t-distribution which are analyzed using t-tests.
– T-tests measure the significance of differences between Means.

Theoretical distribution of sample


Means for n = 2

Theoretical distribution of sample Distribution of individuals in the


Means for n = 10 population

383
Standard Error of the Mean

The Standard Deviation for the distribution of Means is called the standard
error of the Mean and is defined as:
– This formula shows that the Mean is more stable than a single observation by a
factor of the square root of the sample size.

384
Standard Error

The rate of change in the Standard Error approaches zero at about 30 samples.

[Plot: Standard Error vs. Sample Size (0–30), flattening by n = 30]
This is why 30 samples is often recommended when generating summary
statistics such as the Mean and Standard Deviation.

This is also the point at which the t and Z distributions become nearly equivalent.
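The diminishing return is visible in a short calculation (assuming, as the plot does, a Standard Deviation of 1):

```python
import math

s = 1.0  # Standard Deviation assumed to be 1
for n in (5, 10, 20, 30, 60):
    se = s / math.sqrt(n)  # standard error of the Mean
    print(f"n={n:2d}  SE={se:.3f}")

# Going from n=5 to n=30 cuts the SE by more than half;
# doubling again to n=60 gains comparatively little.
```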

385
Summary

At this point, you should be able to:

• Explain the term “Inferential Statistics”

• Explain the Central Limit Theorem

• Describe what impact sample size has on your estimates of population


parameters

• Explain Standard Error

386
Hypothesis Testing (ND)

Inferential Statistics

Intro to Hypothesis Testing
– Hypothesis Testing Purpose
– Tests for Central Tendency
– Tests for Variance
– ANOVA

Hypothesis Testing ND P1

Hypothesis Testing ND P2

Hypothesis Testing NND P1

Hypothesis Testing NND P2

Wrap Up & Action Items

387
Six Sigma Goals and Hypothesis Testing

Our goal is to improve our Process Capability; this translates to the need to move the process Mean (or
proportion) and reduce the Standard Deviation.
– Because it is too expensive or too impractical (not to mention theoretically impossible) to collect
population data, we will make decisions based on sample data.
– Because we are dealing with sample data, there is some uncertainty about the true population
parameters.

Hypothesis Testing helps us make fact-based decisions about whether there are different population
parameters or that the differences are just due to expected sample variation.

[Process Capability of Process Before: LSL 100, USL 120; Sample Mean 108.66, Sample N 150, StDev (Within) 2.35, StDev (Overall) 5.42; Cp 1.42, Cpk 1.23, Pp 0.62, Ppk 0.53; Exp. Overall PPM Total 73,271.97]

[Process Capability of Process After: LSL 100, USL 120; Sample Mean 109.86, Sample N 100, StDev (Within) 1.56, StDev (Overall) 1.54; Cp 2.14, Cpk 2.11, Pp 2.16, Ppk 2.13; PPM Total 0.00]

388
Purpose of Hypothesis Testing

The purpose of appropriate Hypothesis Testing is to integrate the Voice of the Process with
the Voice of the Business to make data-based decisions to resolve problems.

Hypothesis Testing can help avoid high costs of experimental efforts by using existing data.
This can be likened to:
– Local store costs versus mini bar expenses.
– There may be a need to eventually use experimentation, but careful data analysis
can indicate a direction for experimentation if necessary.

The probability of occurrence is based on a pre-determined statistical confidence.

Decisions are based on:


– Beliefs (past experience)
– Preferences (current needs)
– Evidence (statistical data)
– Risk (acceptable level of failure)

389
The Basic Concept for Hypothesis Tests

Recall from the discussion on classes and cause of distributions that a data set
may seem Normal, yet still be made up of multiple distributions.

Hypothesis Testing can help establish a statistical difference between factors


from different distributions.

[Overlaid distributions, frequency vs. x]

Did my sample come from this population? Or this? Or this?


390
Significant Difference

Are the two distributions “significantly” different from each other? How
sure are we of our decision?

How does the number of observations affect our confidence in detecting
a difference in the population Means?

[Sample 1 and Sample 2 distributions]

391
Detecting Significance

Statistics provide a methodology to detect differences.


– Examples might include differences in suppliers, shifts or equipment.
– Two types of significant differences occur and must be well understood,
practical and statistical.
– Failure to tie these two differences together is one of the most common
errors in statistics.

HO: The sky is not falling.

HA: The sky is falling.

392
Practical vs. Statistical

Practical Difference: The difference which results in an improvement of


practical or economic value to the company.
– For example, an improvement in yield from 96 to 99 percent.

Statistical Difference: A difference or change to the process that probably (with


some defined degree of confidence) did not happen by chance.
– Examples might include differences in suppliers, markets or servers.

We will see that it is possible to realize a statistically significant difference


without realizing a practically significant difference.

393
Detecting Significance

During the Measure Phase,


it is important that the nature of Mean Shift
the problem be well understood.

In understanding the problem,


the practical difference to be
achieved must match the
statistical difference.

The difference can be either a Variation Reduction


change in the Mean or in the
variance.

Detection of a difference is then


accomplished using statistical
Hypothesis Testing.

394
Hypothesis Testing

A Hypothesis Test is an a priori theory relating to differences between variables.

A statistical test or Hypothesis Test is performed to prove or disprove the theory.

A Hypothesis Test converts the practical problem into a statistical problem.


– Since relatively small sample sizes are used to estimate population
parameters, there is always a chance of collecting a non-representative
sample.
– Inferential statistics allows us to estimate the probability of getting a non-
representative sample.

395
DICE Example

We could throw a die a number of times and track how often each face occurred. With a
standard die, we would expect each face to occur 1/6 or 16.67% of the time.

If we threw the die 5 times and got 5 one’s, what would you conclude? How sure can you
be?
– Pr (1 one) = 0.1667 Pr (5 ones) = (0.1667)5 = 0.00013

There are approximately 1.3 chances out of 1000 that we could have gotten 5 ones with a
standard die.
Therefore, we would say we are willing to take a 0.1% chance of being wrong about our
hypothesis that the die was “loaded” since the results do not come close to our predicted
outcome.
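The 1.3-in-1000 figure is straightforward to verify:

```python
p_one = 1 / 6              # probability of a single one on a fair die
p_five_ones = p_one ** 5   # five ones in five independent throws

print(f"Pr(1 one)  = {p_one:.4f}")        # 0.1667
print(f"Pr(5 ones) = {p_five_ones:.5f}")  # 0.00013
```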

396
Hypothesis Testing

DECISIONS

α, β, Δ, n

397
Statistical Hypotheses

A hypothesis is a predetermined theory about the nature of, or relationships


between variables. Statistical tests can prove (with a certain degree of
confidence) that a relationship exists.
We have two alternatives for hypothesis.
– The “null hypothesis” Ho assumes that there are no differences or relationships.
This is the default assumption of all statistical tests.
– The “alternative hypothesis” Ha states that there is a difference or relationship.

P-value > 0.05 → Ho = no difference or relationship

P-value < 0.05 → Ha = there is a difference or relationship

Making a decision does not FIX a problem, taking


action does.

398
Steps to Statistical Hypothesis Test

1. State the Practical Problem.


2. State the Statistical Problem.
a) HO: ___ = ___
b) HA: ___ ≠ ,>,< ___
3. Select the appropriate statistical test and risk levels.
a) α = .05
b) β = .10
4. Establish the sample size required to detect the difference.
5. State the Statistical Solution.
6. State the Practical Solution.

Noooot THAT practical


solution!
399
How Likely is Unlikely?

Any differences between observed data and claims made under H0 may be real
or due to chance.

Hypothesis Tests determine the probabilities of these differences occurring


solely due to chance and call them P-values.

The α level of a test (level of significance) represents the yardstick against which
P-values are measured; H0 is rejected if the P-value is less than the α level.

The most commonly used levels are 5%, 10% and 1%.

400
Hypothesis Testing Risk

The alpha risk or Type 1 Error (generally called the “Producer’s Risk”) is the
probability that we could be wrong in saying that something is “different.” It is an
assessment of the likelihood that the observed difference could have occurred by
random chance. Alpha is the primary decision-making tool of most statistical tests.

Actual Conditions
Not Different Different
(Ho is True) (Ho is False)

Not Different Correct Type II


(Fail to Reject Ho)
Decision Error
Statistical
Conclusions
Different
Type 1 Correct
(Reject Ho) Error Decision

401
Alpha Risk

Alpha (α) risks are expressed relative to a reference distribution.

Distributions include:
– t-distribution
– z-distribution
– χ²-distribution
– F-distribution

The α-level is represented by the shaded areas.
Sample results in these areas lead to rejection of H0.

[Regions of DOUBT in both tails; the center is accepted as chance differences]

402
Hypothesis Testing Risk

The beta risk or Type 2 Error (also called the “Consumer’s Risk”) is the
probability that we could be wrong in saying that two or more things are the same
when, in fact, they are different.

Actual Conditions
Not Different Different
(Ho is True) (Ho is False)

Not Different Correct Type II


(Fail to Reject Ho)
Decision Error
Statistical
Conclusions
Different
Type 1 Correct
(Reject Ho) Error Decision

403
Beta Risk

Beta Risk is the probability of failing to reject the null hypothesis when a
difference exists.

[Distribution if H0 is true: Reject H0 beyond the critical value; α = Pr(Type I error) = 0.05, centered at the H0 value]

[Distribution if Ha is true: Accept H0 short of the critical value; β = Pr(Type II error)]

Critical value of test statistic
404
Distinguishing between Two Samples

Recall from the Central Limit Theorem: as the number of individual
observations increases, the Standard Error decreases.

In this example, when n = 2 we cannot
distinguish the difference between the
Means (> 5% overlap, P-value > 0.05).

When n = 30, we can distinguish
between the Means (< 5% overlap, P-value < 0.05):
there is a significant difference.

[Theoretical distribution of Means when n = 2: μ = 5, S = 1;
theoretical distribution of Means when n = 30: μ = 5, S = 1]

405
Delta Sigma – The Ratio between Δ and S

Delta (Δ) is the size of the difference
between two Means or between one Mean and a
target value.

Sigma (S) is the sample Standard
Deviation of the distribution of
individuals of one or both of the samples
under question.

When Δ is large relative to S, we don’t need
statistics because the differences are
obvious.

If the variance of the data is large, it is
difficult to establish differences. We need
larger sample sizes to reduce uncertainty.

[Large Δ vs. Large S]

406
Typical Questions on Sampling

Question: “How many samples should we take?”


Answer: “Well, that depends on the size of your delta and Standard Deviation”.

Question: “How should we conduct the sampling?”


Answer: “Well, that depends on what you want to know”.

Question: “Was the sample we took large enough?”


Answer: “Well, that depends on the size of your delta and Standard Deviation”.

Question: “Should we take some more samples just to be sure?”


Answer: “No, not if you took the correct number of samples the first time!”

407
The Perfect Sample Size

The minimum sample size is that required to provide
exactly 5% overlap (risk), in order to distinguish the Delta.

Note: If you are working with Non-normal Data,
multiply your calculated sample size by 1.1.

[Two overlapping population distributions plotted over the range 40–70]

408
Hypothesis Testing Roadmap

Continuous Data – Normal

Test of Equal Variance 1 Sample Variance 1 Sample t-test

Variance Equal Variance Not Equal

2 Sample T One Way ANOVA 2 Sample T One Way ANOVA

409
Hypothesis Testing Roadmap

Continuous Data – Non-Normal

Test of Equal Variance Median Test

Mann-Whitney Several Median Tests

410
Hypothesis Testing Roadmap

Attribute Data
One Factor:
– One Sample → One Sample Proportion
– Two Samples → Two Sample Proportion
Minitab: Stat - Basic Stats - 2 Proportions
If P-value < 0.05 the proportions are different
– Two or More Samples → Chi Square Test (Contingency Table)
Minitab: Stat - Tables - Chi-Square Test
If P-value < 0.05 at least one proportion is different

Two Factors:
– Chi Square Test (Contingency Table)
Minitab: Stat - Tables - Chi-Square Test
If P-value < 0.05 the factors are not independent

411
Common Pitfalls to Avoid

While using Hypothesis Testing the following facts should be borne in mind at the
conclusion stage:
– The decision is about Ho and NOT Ha.
– The conclusion statement is whether the contention of Ha was upheld.
– The null hypothesis (Ho) is on trial.
– When a decision has been made:
• Nothing has been proved.
• It is just a decision.
• All decisions can lead to errors (Types I and II).
– If the decision is to “Reject Ho,” then the conclusion should read: “There is sufficient
evidence at the α level of significance to show that [state the alternative hypothesis
Ha].”
– If the decision is to “Fail to Reject Ho,” then the conclusion should read: “There is not
sufficient evidence at the α level of significance to show that [state the alternative
hypothesis].”

412
Summary

At this point, you should be able to:


• Articulate the purpose of Hypothesis Testing
• Explain the concepts of the Central Tendency
• Be familiar with the types of Hypothesis Tests

413
Hypothesis Testing Normal Data Part 1

Inferential Statistics

Intro to Hypothesis Testing

Hypothesis Testing ND P1
– Sample Size
– Testing Means
– Analyzing Results

Hypothesis Testing ND P2

Hypothesis Testing NND P1

Hypothesis Testing NND P2

Wrap Up & Action Items

414
Test of Means (t-tests)

t-tests are used:


– To compare a Mean against a target.
• e.g.: The team made improvements and wants to compare the Mean
against a target to see if they met the target.

– To compare Means from two different samples.

• e.g.: Machine one to machine two.
• e.g.: Supplier one quality to supplier two quality.

– To compare paired data.


• Comparing the same part before and after a given process.

They don’t look the


same to me!
415
1 Sample t

A 1-sample t-test is used to compare an expected population Mean to a target.

Target μsample
MINITAB TM performs a one sample t-test or t-confidence interval for the Mean.

Use 1-sample t to compute a confidence interval and perform a Hypothesis Test of the
Mean when the population Standard Deviation, σ, is unknown. For a one or two-tailed 1-
sample t:

– H0: μsample = μtarget If P-value > 0.05 fail to reject Ho


– Ha: μsample ≠, <, > μtarget If P-value < 0.05 reject Ho

416
1 Sample t-test Sample Size

[Target distribution vs. Population distribution]

n = 2: Cannot tell the difference between the sample and the target.

n = 30: Can tell the difference between the sample and the target.

SE Mean = S / √n

417
Sample Size

Use the 1-Sample t-test Excel template provided in the Data Set. Enter your sample data in the sheet; the results appear as below.

Why use a 1-Sample t-test?

1. You wish to understand whether your sample data is good enough to meet a target.
2. You wish to understand whether, in the past, your sample data has indeed met the target.

Hypothesis

Null Hypothesis: The Hypothesized Mean / Population Mean is statistically the same as the sample Mean.
Alternate Hypothesis: The Hypothesized Mean / Population Mean is not statistically the same as the sample Mean.

Sample Size             6
Population Mean         5.5
Sample Mean             12.5
Standard Deviation      1.870829
Standard Error          0.763763
t                       9.165151
df                      5
Two-tailed Probability  0.000
One-tailed Probability  0.000
Cohen's d               3.742

α = 0.05. If the P-value is less than 0.05, reject
the null hypothesis; otherwise fail to
reject the null hypothesis.

What is your inference
from the p-value obtained from the 1 sample t test?
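The template's output can be reproduced from the summary statistics alone (a SciPy sketch; the input numbers are those shown in the template):

```python
import math
from scipy import stats

# Summary statistics from the template.
n, pop_mean, sample_mean, s = 6, 5.5, 12.5, 1.870828693

se = s / math.sqrt(n)                    # standard error
t_stat = (sample_mean - pop_mean) / se   # t statistic
df = n - 1
p_two_tailed = 2 * stats.t.sf(abs(t_stat), df)
cohens_d = (sample_mean - pop_mean) / s  # effect size

print(f"t = {t_stat:.3f}, df = {df}, p = {p_two_tailed:.3f}, d = {cohens_d:.3f}")
```

Since the P-value is well below 0.05, the inference is to reject the null hypothesis: the sample Mean of 12.5 differs from the hypothesized Mean of 5.5.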

418
Sample Size

Power and Sample Size

1-Sample t Test

Testing mean = null (versus not = null)


Calculating power for mean = null + difference
Alpha = 0.05 Assumed standard deviation = 1

The various sample sizes show how much of a difference can be
detected assuming a Standard Deviation = 1.

Sample Size   Power   Difference
10            0.9     1.15456
15            0.9     0.90087
20            0.9     0.76446
25            0.9     0.67590
30            0.9     0.61245
35            0.9     0.56408
40            0.9     0.52564
419
1-Sample t Example

1. Practical Problem:
• We are considering changing suppliers for a part that we currently purchase from a
supplier that charges us a premium for the hardening process.
• The proposed new supplier has provided us with a sample of their product. They have
stated that they can maintain a given characteristic of 5 on their product.
• We want to test the samples and determine if their claim is accurate.

2. Statistical Problem:
Ho: μN.S. = 5
Ha: μN.S. ≠ 5

3. 1-sample t-test (population Standard Deviation unknown, comparing to target).


α = 0.05 β = 0.10

420
Example

4. Sample Size:
• Open the MINITABTM worksheet: “Exh_Stat.MTW”.
• Use the C1 column: Values
– In this case, the new supplier sent 9 samples for
evaluation.
– How much of a difference can be detected with this
sample?

421
1-Sample t Example

This means we will be able to detect a


difference of only 1.24 if the population
has a Standard Deviation of 1 unit.

MINITABTM Session Window


Power and Sample Size
1-Sample t Test
Testing Mean = null (versus not = null)
Calculating power for Mean = null + difference
Alpha = 0.05 Assumed Standard Deviation = 1
Sample
Size Power Difference
9 0.9 1.23748
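MINITAB's power calculation uses the noncentral t-distribution, and its 1.23748 can be reproduced with SciPy. The bisection search below is written for illustration; it solves for the difference detectable with n = 9, power = 0.9, α = 0.05 and an assumed Standard Deviation of 1:

```python
from scipy import stats

def power_1sample_t(diff, n, alpha=0.05, sd=1.0):
    """Power of a two-sided 1-sample t-test when the true shift is `diff`."""
    df = n - 1
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    nc = diff / sd * n ** 0.5  # noncentrality parameter
    return (1 - stats.nct.cdf(t_crit, df, nc)) + stats.nct.cdf(-t_crit, df, nc)

def detectable_difference(n, power=0.9, lo=0.0, hi=5.0):
    """Bisection search for the difference giving the requested power."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if power_1sample_t(mid, n) < power:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(detectable_difference(9), 5))  # ≈ 1.23748
```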

422
Example: Follow the Road Map

5. State Statistical Solution

[Probability Plot of Values – Normal: Mean 4.789, StDev 0.2472, N 9, AD 0.327, P-Value 0.442]

Are the data in the Values column Normal?

423
1-Sample t Example

Click “Graphs…” – select all 3.
Click “Options…” – in CI enter ‘95’.

424
Histogram of Values

[Histogram of Values, with Ho and the 95% t-confidence interval for the Mean]

Note our target Mean (represented by the red Ho) is outside our population
confidence boundaries, which tells us there is a significant difference
between the population and the target Mean.
425
Box Plot of Values

[Boxplot of Values, with Ho and the 95% t-confidence interval for the Mean]

426
Individual Value Plot (Dot Plot)

[Individual Value Plot of Values, with Ho and the 95% t-confidence interval for the Mean]

427
Session Window

One-Sample T: Values

Test of mu = 5 vs not = 5

Variable  N  Mean     StDev    SE Mean  95% CI              T      P
Values    9  4.78889  0.24721  0.08240  (4.59887, 4.97891)  -2.56  0.034

s = √( Σ(Xi − X̄)² / (n − 1) )        SE Mean = S / √n

T-Calc = (Observed − Expected) / SE Mean
T-Calc = (X̄ − Target) / Standard Error
T-Calc = (4.7889 − 5) / 0.0824 = −2.56

N – sample size
Mean – calculated arithmetic average
StDev – calculated individual Standard Deviation (classical method)
SE Mean – calculated Standard Deviation of the distribution of the Means
95% CI – the population average will fall between 4.5989 and 4.9789
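The T, P and confidence interval in the session window follow directly from N, Mean and StDev (a SciPy sketch using the printed summary statistics):

```python
import math
from scipy import stats

# Summary statistics from the session window.
n, xbar, s, target = 9, 4.78889, 0.24721, 5.0

se = s / math.sqrt(n)                          # 0.0824
t_calc = (xbar - target) / se                  # -2.56
p_value = 2 * stats.t.sf(abs(t_calc), n - 1)   # 0.034

t_crit = stats.t.ppf(0.975, n - 1)             # 2.306
ci = (xbar - t_crit * se, xbar + t_crit * se)  # (4.5989, 4.9789)

print(f"T = {t_calc:.2f}, P = {p_value:.3f}, "
      f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```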

428
Evaluating the Results

Since the P-value of 0.034 is less than 0.05, reject the null hypothesis.

Based on the samples given there is a difference between the average of the sample
and the desired target.

X Ho

6. State Practical Conclusions


The new supplier’s claim that they can meet the target of 5 for the
hardness is not correct.

429
Manual Calculation of 1- Sample t

Let’s compare the manual calculations to what the computer calculates.


– Calculate the t-statistic from the data:

t = (X̄ − Target) / (s / √n) = (4.79 − 5.00) / (0.247 / √9) = −2.56
– Determine critical t-value from t-table in reference section.
• When the alternative hypothesis has a not equal sign, it is a two-sided
test.
• Split the α in half and read from the 0.975 column in the t-table for n
-1 (9 - 1) degrees of freedom.

430
Manual Calculation of 1- Sample t

T-Distribution
degrees of
freedom   .600  .700  .800  .900  .950  .975  .990  .995
1 0.325 0.727 1.376 3.078 6.314 12.706 31.821 63.657
2 0.289 0.617 1.061 1.886 2.920 4.303 6.965 9.925
3 0.277 0.584 0.978 1.638 2.353 3.182 4.541 5.841
4 0.271 0.569 0.941 1.533 2.132 2.776 3.747 4.604
5 0.267 0.559 0.920 1.476 2.015 2.571 3.365 4.032

6 0.265 0.553 0.906 1.440 1.943 2.447 3.143 3.707


7 0.263 0.549 0.896 1.415 1.895 2.365 2.998 3.499
8 0.262 0.546 0.889 1.397 1.860 2.306 2.896 3.355
9 0.261 0.543 0.883 1.383 1.833 2.262 2.821 3.250
10 0.260 0.542 0.879 1.372 1.812 2.228 2.764 3.169

[Critical regions: α/2 = .025 in each tail, critical values −2.306 and +2.306; calculated t = −2.56]

If the calculated t-value lies anywhere
in the critical regions, reject the null hypothesis.
– The data supports the alternative hypothesis that the estimate for the
Mean of the population is not 5.0.
431
Confidence Intervals for Two-Sided t-test

The formula for a two-sided t-test is:

X̄ − t(α/2, n−1) · s/√n ≤ μ ≤ X̄ + t(α/2, n−1) · s/√n

or

X̄ ± t_crit · SE Mean = 4.788 ± 2.306 × 0.0824

= 4.5989 to 4.9789

432
1-Sample t Exercise

Exercise objective: Utilize what you have learned to conduct and


analyze a one sample t-test using MINITABTM.

1. The last engineering estimation said we would achieve a product


with average results of 32 parts per million (ppm).

2. We want to test if we are achieving this performance level, we


want to know if we are on target, with 95% confidence in our
answer. Use data in column “ppm VOC”

3. Are we on Target?

433
1-Sample t Exercise: Solution

Since we do not know the population Standard Deviation, we will use the 1 sample T test to
determine if we are at Target.

434
1-Sample t Exercise: Solution

After selecting column C1 and setting


“Test mean” to 32.0, click “Graphs…”
and select “Histogram of data” to get a
good visualization of the analysis.

Depending on the test you are running


you may need to select “Options…” to
set your desired Confidence Interval
and hypothesis. In this case the
MINITABTM defaults are what we
want.

435
1-Sample t Exercise: Solution

Because we used the “Graphs…” option, we get a nice visualization of the data
in a histogram AND a plot of the null hypothesis relative to the confidence
level of the population Mean.

[Histogram of ppm VOC, with Ho and the 95% t-confidence interval for the mean]

Because the null hypothesis is within the confidence interval, you know we will
“fail to reject” the null hypothesis and accept that the equipment is
running at the target of 32.0.

436
1-Sample t Exercise: Solution

In MINITABTM’s Session Window (ctrl – M), you can see the P-value of 0.201.
Because it is above 0.05, we “fail to reject” the null hypothesis so we accept the
equipment is giving product at a target of 32.0 ppm VOC.

437
Hypothesis Testing Roadmap

Continuous Data – Normal

Test of Equal Variance 1 Sample Variance 1 Sample t-test

Variance Equal Variance Not Equal

2 Sample T One Way ANOVA 2 Sample T One Way ANOVA

438
2 Sample t-test

A 2-sample t-test is used to compare two Means.


Stat > Basic Statistics > 2-Sample t
MINITABTM performs an independent two-sample t-test and generates a confidence
interval.

Use 2-Sample t to perform a Hypothesis Test and compute a confidence interval of the
difference between two population Means when the population Standard Deviations, σ’s,
are unknown.

Two tailed test:


– H0: μ1 = μ2If P-value > 0.05 fail to reject Ho
– Ha: μ1 ≠ μ2 If P-value < 0.05 reject Ho

One tailed test:


– H0: μ1 = μ2
– Ha: μ1 > or < μ2

1 2
439
Sample Size

The first step in any 2-sample test of Means is to determine whether the variances are equal. To do so, we test homogeneity of variances between the
2 samples.

A basic data set is considered below, which will be used for testing variances as well as Means.

This data set contains data collected from 2 samples. We wish to test whether the Means are statistically the same or whether they differ.

Step 1 – Test Homogeneity of Variances

Sample 1: 4, 4.5, 5.2, 4.6, 4.7, 4.8
Sample 2: 5, 5.1, 4.9, 4.7, 5.2, 5.8

Null Hypothesis – The variances of both samples are statistically the same.
Alternate Hypothesis – The variances of both samples are statistically different.

F-Test Two-Sample for Variances
                      Sample 1    Sample 2
Mean                  4.633333    5.116667
Variance              0.154667    0.141667
Observations          6           6
df                    5           5
F                     1.091765
P(F<=f) one-tail      0.462798
F Critical one-tail   5.050329
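The Excel output can be cross-checked with SciPy. There is no single SciPy call for the two-sample F-test of variances, so the sketch below computes the ratio and its P-value directly:

```python
from statistics import variance
from scipy import stats

sample1 = [4, 4.5, 5.2, 4.6, 4.7, 4.8]
sample2 = [5, 5.1, 4.9, 4.7, 5.2, 5.8]

f = variance(sample1) / variance(sample2)  # F statistic (ratio of sample variances)
df1, df2 = len(sample1) - 1, len(sample2) - 1
p_one_tail = stats.f.sf(f, df1, df2)       # upper-tail P-value (Excel's "P(F<=f) one-tail")
f_crit = stats.f.ppf(0.95, df1, df2)

print(f"F = {f:.4f}, p (one-tail) = {p_one_tail:.4f}, F crit = {f_crit:.4f}")
# F = 1.0918, p (one-tail) = 0.4628, F crit = 5.0503
```

Since 0.4628 > 0.05, we fail to reject the null: the variances are treated as equal.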

440
2 Sample T-Test: Solution

Step 2 – Conduct the 2-Sample t test

Hypothesis

Null Hypothesis – The means between the 2 samples are statistically the same
Alternate Hypothesis – The means between the 2 samples are statistically different.

Use Data Analysis in Excel

t-Test: Two-Sample Assuming Equal Variances
                              Variable 1   Variable 2
Mean                          4.633333     5.116667
Variance                      0.154667     0.141667
Observations                  6            6
Pooled Variance               0.148167
Hypothesized Mean Difference  0
df                            10
t Stat                        -2.174864
P(T<=t) one-tail              0.027359
t Critical one-tail           1.812461
P(T<=t) two-tail              0.054718
t Critical two-tail           2.228139

The two-tail P-value here is 0.055. Although it is greater than 0.05, practical interpretation of the results would need to be done with process expertise.
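The t statistic above can also be reproduced by hand. A minimal Python sketch of the pooled (equal-variance) calculation, using the same two samples:

```python
from math import sqrt
from statistics import mean, variance

# Same two samples used in the Excel output above
sample1 = [4, 4.5, 5.2, 4.6, 4.7, 4.8]
sample2 = [5, 5.1, 4.9, 4.7, 5.2, 5.8]

n1, n2 = len(sample1), len(sample2)
# Pooled variance: weighted average of the two sample variances
sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
# t statistic for the difference in Means, with n1 + n2 - 2 = 10 degrees of freedom
t_stat = (mean(sample1) - mean(sample2)) / sqrt(sp2 * (1 / n1 + 1 / n2))

print(round(sp2, 6), round(t_stat, 3))  # 0.148167 -2.175
```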


441
Hypothesis Testing Roadmap

Continuous Data – Normal

Test of Equal Variance    1 Sample Variance    1 Sample t-test

Variance Equal: 2 Sample T, One-Way ANOVA
Variance Not Equal: 2 Sample T, One-Way ANOVA

442
Unequal Variance Example

Open MINITABTM worksheet:


“2 SAMPLE UNEQUAL VARIANCE DATA”

Don’t just sit there….


open it!
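When the variances are not equal, the 2-sample t-test no longer pools them; MINITABTM uses the Welch form of the test in that case. A sketch of the calculation with made-up illustrative samples (the worksheet's actual data are not reproduced here):

```python
from math import sqrt
from statistics import mean, variance

# Illustrative samples only (not the worksheet data),
# chosen to have clearly unequal variances
a = [10, 12, 11, 14, 13]
b = [20, 24, 18, 30, 28, 22, 26]

na, nb = len(a), len(b)
va, vb = variance(a), variance(b)

# Welch's t: no pooling; each sample keeps its own variance
se2 = va / na + vb / nb
t_stat = (mean(a) - mean(b)) / sqrt(se2)
# Welch-Satterthwaite degrees of freedom (usually non-integer)
df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))

print(round(t_stat, 2), round(df, 1))  # -6.74 8.0
```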

443
Hypothesis Testing Roadmap

Continuous Data – Normal

Test of Equal Variance    1 Sample Variance    1 Sample t-test

Variance Equal: 2 Sample T, One-Way ANOVA
Variance Not Equal: 2 Sample T, One-Way ANOVA

444
Paired t-test

• A Paired t-test is used to compare the Means of two measurements from the same samples, generally used as a before-and-after test.

• You can do a Paired t-test with Excel. This is appropriate for testing the difference between two Means when the data are paired and the paired differences follow a Normal Distribution.

• Use the Paired t command to compute a confidence interval and perform a Hypothesis Test of the difference between population Means when observations are paired. A paired t-procedure matches responses that are dependent or related in a pair-wise manner. This matching allows you to account for variability between the pairs, usually resulting in a smaller error term, thus increasing the sensitivity of the Hypothesis Test or confidence interval.
– Ho: μδ = μo
– Ha: μδ ≠ μo

• Where μδ (delta) is the population Mean of the differences and μo is the hypothesized Mean of the differences, typically zero.

Use Data Analysis in Excel.

445
Example

1. Practical Problem:
• We are interested in changing the sole material for a popular brand of shoes
for children.
• In order to account for variation in activity of children wearing the shoes, each
child will wear one shoe of each type of sole material. The sole material will
be randomly assigned to either the left or right shoe.
2. Statistical Problem:
Ho: μδ = 0
Ha: μδ ≠ 0
3. Paired t-test (comparing data that must remain paired).
α = 0.05 β = 0.10

Just checking your souls, er…soles!
446
Example

We wish to test whether the average dimension for Mat B is higher than for Mat A, with Mat B given a treatment post measurement of Mat A.

Mat A: 13, 13.2, 13.4, 13.5, 14, 13.8, 13.9, 14, 14.2
Mat B: 14, 14.5, 14.4, 13, 14.5, 13.5, 13.2, 14, 13

Null Hypothesis: the Mean of Mat B and the Mean of Mat A are statistically the same.
Alternate Hypothesis: the Mean of Mat B and the Mean of Mat A are statistically different.

Use Data Analysis in Excel.

t-Test: Paired Two Sample for Means
                              Variable 1   Variable 2
Mean                          13.666667    13.788889
Variance                      0.1675       0.393611
Observations                  9            9
Pearson Correlation           -0.386210
Hypothesized Mean Difference  0
df                            8
t Stat                        -0.420750
P(T<=t) one-tail              0.342507
t Critical one-tail           1.859548
P(T<=t) two-tail              0.685013

As the P-value is 0.69 > 0.05, we fail to reject the null, i.e. we fail to reject that the Means of Mat B and Mat A are statistically the same. This means the treatment has not worked to its best.

You can use the Paired t-test as an option to validate improvements after solutions have been implemented in a project.

447
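The paired t statistic works on the per-row differences rather than on the two raw samples. A Python sketch using the Mat A / Mat B data above:

```python
from math import sqrt
from statistics import mean, stdev

# Mat A / Mat B measurements from the example above, paired by row
mat_a = [13, 13.2, 13.4, 13.5, 14, 13.8, 13.9, 14, 14.2]
mat_b = [14, 14.5, 14.4, 13, 14.5, 13.5, 13.2, 14, 13]

# The paired t-test works on the per-pair differences
diffs = [x - y for x, y in zip(mat_a, mat_b)]
n = len(diffs)
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))  # df = n - 1 = 8

print(round(t_stat, 3))  # -0.421
```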
Paired t-test Exercise

Exercise objective: Utilize what you have learned to conduct and analyze a
paired t-test using MINITABTM.

1. A corrugated packaging company produces material which has creases to make


boxes easier to fold. It is a critical to quality characteristic to have a predictable
Relative Crease Strength. The quality manager is having her lab test some
samples labeled 1-11. Then those same samples are being sent to her colleague at
another facility who will report their measurements on those same 1-11 samples.

2. The US quality manager wants to know with 95% confidence what the average
difference is between the lab located in Texas and the lab located in Mexico when
measuring Relative Crease Strength.

3. Use the data in columns “Texas” & “Mexico” in “RM Suppliers.mtw” to


determine the answer to the quality manager’s question.

448
Paired t-test Exercise: Solution

Because the two labs made sure to report measurement results for exactly the same parts, and the results were put in the correct corresponding rows, we are able to do a paired t-test.

The first thing we must do is create a new column with the difference between the two test results.

Calc>Calculator

449
Paired t-test Exercise: Solution

We must confirm the differences (now in a new calculated column) are from a Normal
Distribution. This was confirmed with the Anderson-Darling Normality Test by doing a
graphical summary under Basic Statistics.

Summary for TX_MX-Diff

Anderson-Darling Normality Test: A-Squared 0.45, P-Value 0.222
Mean 0.22727, StDev 0.37971, Variance 0.14418
Skewness -0.833133, Kurtosis -0.233638, N 11
Minimum -0.50000, 1st Quartile -0.10000, Median 0.40000, 3rd Quartile 0.50000, Maximum 0.70000
95% Confidence Interval for Mean: -0.02782 to 0.48237
95% Confidence Interval for Median: -0.11644 to 0.50822
95% Confidence Interval for StDev: 0.26531 to 0.66637
450
Paired t-test Exercise: Solution

As we’ve seen before, this 1 Sample T analysis is found with:


Stat>Basic Stat>1-sample T

451
Paired t-test Exercise: Solution

Even though the Mean difference is 0.23, we have a 95% confidence interval that includes zero so we
know the 1-sample t-test’s null hypothesis was “failed to be rejected”. We cannot conclude the two
labs have a difference in lab results.
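That confidence interval can be reproduced from the summary statistics above. A sketch using the reported Mean and StDev of the differences; the t critical value is the standard two-sided 95% table value for 10 degrees of freedom:

```python
from math import sqrt

# Summary statistics of the TX-MX differences reported above
mean_d, stdev_d, n = 0.22727, 0.37971, 11
t_crit = 2.228  # two-sided 95% t critical value for n - 1 = 10 df (table value)

margin = t_crit * stdev_d / sqrt(n)
ci = (mean_d - margin, mean_d + margin)

print(round(ci[0], 3), round(ci[1], 3))  # -0.028 0.482 (interval spans zero)
```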

Histogram of TX_MX-Diff (with Ho and the 95% t-confidence interval for the Mean)

The P-value is greater than 0.05, so we do not have the 95% confidence we wanted to confirm a difference in the lab Means. This confidence interval could be reduced with more samples taken next time and analyzed by both labs.

452
Hypothesis Testing Roadmap

Continuous Data – Normal

Test of Equal Variance    1 Sample Variance    1 Sample t-test

Variance Equal: 2 Sample T, One-Way ANOVA
Variance Not Equal: 2 Sample T, One-Way ANOVA

453
Summary

At this point, you should be able to:


• Determine appropriate sample sizes for testing Means
• Conduct various Hypothesis Tests for Means
• Properly analyze results

454
Hypothesis Testing Normal Data Part 2

Inferential Statistics

Intro to Hypothesis Testing

Hypothesis Testing ND P1
Calculate Sample Size

Hypothesis Testing ND P2 Variance Testing

Analyze Results
Hypothesis Testing NND P1

Hypothesis Testing NND P2

Wrap Up & Action Items

455
Tests of Variance

Tests of Variance are used for both Normal and Non-normal Data.

Normal Data
– 1 Sample to a target
– 2 Samples – F-Test
– 3 or More Samples – Bartlett's Test

Non-normal Data
– 2 or More Samples – Levene's Test

The null hypothesis states there is no difference between the Standard Deviations or variances.
– Ho: σ1 = σ2 = σ3 …
– Ha: at least one is different

456
1-Sample Variance

A 1-sample variance test is used to compare an expected population variance to


a target.
Stat > Basic Statistics > Graphical Summary

If the target variance lies inside the confidence interval, fail to reject the null
hypothesis.
– Ho: σ²Sample = σ²Target
– Ha: σ²Sample ≠ σ²Target

Use the sample size calculations for a 1 sample t-test since they are rarely
performed without performing a 1 sample t-test as well.
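Behind the graphical-summary approach, the 1-sample variance test statistic is a chi-square ratio. A sketch using the numbers from the worked example that follows (n = 9, sample variance 0.0611, target variance 0.10, i.e. σ = 0.31); the critical values in the comment are standard chi-square table values:

```python
# 1-sample variance test statistic: chi2 = (n - 1) * s^2 / sigma0^2
# Values taken from the worked example that follows
n = 9
s2 = 0.0611       # sample variance
sigma0_2 = 0.10   # target (hypothesized) variance, i.e. sigma = 0.31

chi2 = (n - 1) * s2 / sigma0_2
# Standard chi-square table values for df = 8, two-sided alpha = 0.05
lower_crit, upper_crit = 2.18, 17.53

print(round(chi2, 2), lower_crit < chi2 < upper_crit)  # 4.89 True
```

Since the statistic falls between the critical values, we fail to reject the null hypothesis, consistent with the confidence-interval check above.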

457
1-Sample Variance

1. Practical Problem:
• We are considering changing suppliers for a part we currently purchase
from a supplier that charges a premium for the hardening process and has a
large variance in their process.
• The proposed new supplier has provided us with a sample of their product.
They have stated they can maintain a variance of 0.10.

2. Statistical Problem:
Ho: σ² = 0.10 or Ho: σ = 0.31
Ha: σ² ≠ 0.10 Ha: σ ≠ 0.31

3. 1-sample variance:
α = 0.05 β = 0.10

458
1-Sample Variance

4. Sample Size:
• Open the MINITABTM worksheet: “Exh_Stat.MTW”
• This is the same file used for the 1 Sample t example.
– We will assume the sample size is adequate.

5. State Statistical Solution

Stat > Basic Statistics > Graphical Summary

459
1-Sample Variance

Recall the target Standard Deviation is 0.31.

Summary for Values

Anderson-Darling Normality Test: A-Squared 0.33, P-Value 0.442
Mean 4.7889, StDev 0.2472, Variance 0.0611
Skewness -0.02863, Kurtosis -1.24215, N 9
Minimum 4.4000, 1st Quartile 4.6000, Median 4.7000, 3rd Quartile 5.0500, Maximum 5.1000
95% Confidence Interval for Mean: 4.5989 to 4.9789
95% Confidence Interval for Median: 4.6000 to 5.0772
95% Confidence Interval for StDev: 0.1670 to 0.4736

Since the target Standard Deviation of 0.31 falls inside the 95% confidence interval for StDev (0.1670 to 0.4736), we fail to reject the null hypothesis.
460
Test of Variance Example

1. Practical Problem:
We want to determine the effect of two different storage methods on the rotting
of potatoes. You study conditions conducive to potato rot by injecting potatoes
with bacteria that cause rotting and subjecting them to different temperature and
oxygen regimes. We can test the data to determine if there is a difference in the
Standard Deviation of the rot time between the two different methods.

2. Statistical Problem:
Ho: σ1 = σ2
Ha: σ1 ≠ σ2

3. Equal variance test (F-test since there are only 2 factors.)

461
Test of Variance Example

4. Sample Size:
α = 0.05 β = 0.10

Stat > Power and Sample Size > One-Way ANOVA…

EXH_AOV.MTW

MINITABTM Session Window


Power and Sample Size

One-way ANOVA
Alpha = 0.05  Assumed Standard Deviation = 1  Number of Levels = 2

Sample Size  Power  SS Means  Maximum Difference
50           0.9    0.214350  0.654752

The sample size is for each level.

462
Normality Test – Follow the Roadmap

5. Statistical Solution:

Stat>Basic Statistics>Normality Test

463
Normality Test – Follow the Roadmap

Ho: Data is Normal
Ha: Data is NOT Normal

Stat>Basic Stats>Normality Test (Use Anderson-Darling)

Probability Plot of Rot 1 (Normal)
Mean 4.871, StDev 0.9670, N 100, AD 0.306, P-Value 0.559

464
Test of Equal Variance

Stat>ANOVA>Test for Equal Variance

465
Test of Equal Variance

6. Practical Solution:
The difference between the Standard Deviations from the two samples is not
significant.

Test for Equal Variances for Rot 1

F-Test: Test Statistic 0.74, P-Value 0.298
Levene's Test: Test Statistic 0.53, P-Value 0.469

Use the F-Test for 2 samples of Normally Distributed data.
P-value (0.298) > 0.05: assume equal variance.
466
Normality Test

Perform another test using the column Rot.

Probability Plot of Rot (Normal)
Mean 13.78, StDev 7.712, N 18, AD 0.285, P-Value 0.586

The P-value is > 0.05, so we can assume our data is Normally Distributed.

467
Test for Equal Variance (Normal Data)

Test for equal variance using Temp as factor.

468
Test of Equal Variance

Test for Equal Variances for Rot (factor: Temp)

F-Test: Test Statistic 0.68, P-Value 0.598
Levene's Test: Test Statistic 0.05, P-Value 0.824

Ho: σ1 = σ2
Ha: σ1 ≠ σ2

P-value > 0.05; there is no statistically significant difference.

469
Test of Equal Variance

Use the F-Test for 2 samples of Normally Distributed data.

470
Continuous Data - Normal

471
Test For Equal Variances

Stat>ANOVA>Test for Equal Variance

472
Test For Equal Variances Graphical Analysis

Test for Equal Variances for Rot (factors: Temp, Oxygen)

Bartlett's Test: Test Statistic 2.71, P-Value 0.744
Levene's Test: Test Statistic 0.37, P-Value 0.858

The P-value > 0.05 shows an insignificant difference between the variances.

473
Test For Equal Variances Statistical Analysis

Test for Equal Variances: Rot versus Temp, Oxygen

95% Bonferroni confidence intervals for standard deviations

Temp  Oxygen  N  Lower    StDev    Upper
10    2       3  2.26029  5.29150  81.890
10    6       3  1.28146  3.00000  46.427
10    10      3  2.80104  6.55744  101.481
16    2       3  1.54013  3.60555  55.799
16    6       3  1.50012  3.51188  54.349
16    10      3  3.55677  8.32666  128.862

Bartlett's Test (Normal Distribution): Test statistic = 2.71, P-value = 0.744
Use this if the data is Normal.

Levene's Test (any continuous distribution): Test statistic = 0.37, P-value = 0.858
Use this if the data is Non-normal.

474
Tests for Variance Exercise

Exercise objective: Utilize what you have learned to conduct and analyze a
test for equal variance using MINITABTM.

1. The quality manager was challenged by the plant director as to why the VOC
levels in the product varied so much. After using a Process Map, some potential
sources of variation were identified. These sources included operating shifts and
raw material supplier. Of course, the quality manager has already clarified the
Gage R&R results were less than 17% study variation so the gage was acceptable.

2. The quality manager decided to investigate the effect of the raw material supplier.
He wants to see if the variation of the product quality is different when using
supplier A than supplier B. He wants to be 95% confident the variances are
similar when using the two suppliers.

3. Use data ppm VOC and RM Supplier to determine if there is a difference between
suppliers.

475
Tests for Variance Exercise: Solution

First we want to do a graphical summary of the two samples from the 2 suppliers.

476
Tests for Variance Exercise: Solution

In “Variables:” enter ‘ppm


VOC’

In “By variables:” enter ‘RM


Supplier’

We want to see if the two samples


are from Normal populations.

477
Tests for Variance Exercise: Solution

The P-value is greater than 0.05 for both Anderson-Darling Normality Tests so we
conclude the samples are from Normally Distributed populations because we “failed to
reject” the null hypothesis that the data sets are from Normal Distributions.

Summary for ppm VOC (RM Supplier = A)
Anderson-Darling Normality Test: A-Squared 0.33, P-Value 0.465
Mean 37.583, StDev 7.090, Variance 50.265, Skewness 0.261735, Kurtosis -0.091503, N 12
Minimum 25.000, 1st Quartile 33.250, Median 35.500, 3rd Quartile 42.000, Maximum 50.000
95% Confidence Intervals: Mean 33.079 to 42.088; Median 33.263 to 42.000; StDev 5.022 to 12.038

Summary for ppm VOC (RM Supplier = B)
Anderson-Darling Normality Test: A-Squared 0.49, P-Value 0.175
Mean 30.500, StDev 6.571, Variance 43.182, Skewness -0.555911, Kurtosis -0.988688, N 12
Minimum 19.000, 1st Quartile 25.000, Median 31.500, 3rd Quartile 37.000, Maximum 38.000
95% Confidence Intervals: Mean 26.325 to 34.675; Median 25.000 to 37.000; StDev 4.655 to 11.157
Are both Data Sets Normal?

478
Tests for Variance Exercise: Solution

479
Tests for Variance Exercise: Solution

For “Response:” enter ‘ppm VOC’


For “Factors:” enter ‘RM Supplier’
Note MINITABTM defaults to 95% confidence level which is exactly the level we want
to test for this problem.

480
Tests for Variance Exercise: Solution

Because the 2 populations were considered to be Normally Distributed, the F-test is used to evaluate
whether the variances (Standard Deviation squared) are equal.

The P-value of the F-test was greater than 0.05 so we “fail to reject” the null hypothesis.

So once again in English: The variances are equal between the results from the two suppliers on our
product’s ppm VOC level.

Test for Equal Variances for ppm VOC (factor: RM Supplier)

F-Test: Test Statistic 1.16, P-Value 0.806
Levene's Test: Test Statistic 0.02, P-Value 0.890

481
Hypothesis Testing Roadmap

Continuous Data – Normal

Test of Equal Variance    1 Sample Variance    1 Sample t-test

Variance Equal: 2 Sample T, One-Way ANOVA
Variance Not Equal: 2 Sample T, One-Way ANOVA

482
Purpose of ANOVA

Analysis of Variance (ANOVA) is used to investigate and model the relationship


between a response variable and one or more independent variables.

Analysis of variance extends the two sample t-test for testing the equality of two
population Means to a more general null hypothesis of comparing the equality
of more than two Means, versus them not all being equal.
– The classification variable, or factor, usually has three or more levels (If
there are only two levels, a t-test can be used).
– Allows you to examine differences among means using multiple
comparisons.
– The ANOVA test statistic is:

F = (Avg SS between) / (Avg SS within) = s²between / s²within
483
What do we want to know?

Is the between group variation large enough to be distinguished from the


within group variation?

(Diagram: delta (δ) is the Between Group Variation, the distance between the group Means μ1 and μ2; Within Group Variation is the spread within each level, e.g. the level of supplier 1; Total (Overall) Variation combines both.)
484
Calculating ANOVA

Where:
g = the number of groups (levels in the study)
xij = the ith individual in the jth group
nj = the number of individuals in the jth group or level
X̄ = the grand Mean
X̄j = the Mean of the jth group or level

Between Group Variation = Σ(j=1..g) nj (X̄j − X̄)²
Within Group Variation = Σ(j=1..g) Σ(i=1..nj) (xij − X̄j)²
Total Variation = Σ(j=1..g) Σ(i=1..nj) (xij − X̄)²

Total (Overall) Variation = Between Group Variation + Within Group Variation

485
Alpha Risk and Pair-Wise t-tests

The alpha risk increases as the number of Means increases with a pair-wise t-
test scheme. The formula for testing more than one pair of Means using a t-test
is:

Overall alpha risk = 1 − (1 − α)^k

where k = number of pairs of Means

So, for 7 pairs of Means and α = 0.05:
1 − (1 − 0.05)^7 = 0.30, or 30% alpha risk
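The calculation above is easy to script:

```python
# Overall alpha risk when k pair-wise t-tests are each run at level alpha
def overall_alpha(alpha, k):
    return 1 - (1 - alpha) ** k

# 7 pairs of Means at alpha = 0.05 gives roughly a 30% overall risk
print(round(overall_alpha(0.05, 7), 2))  # 0.3
```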

486
Three Samples

We have three potential suppliers that claim to have equal levels of quality.
Supplier B provides a considerably lower purchase price than either of the
other two vendors. We would like to choose the lowest cost supplier but we
must ensure that we do not affect the quality of our raw material.

File>Open Worksheet > ANOVA.MTW

Supplier A Supplier B Supplier C


3.16 4.24 4.58
4.35 3.87 4.00
3.46 3.87 4.24
3.74 4.12 3.87
3.61 3.74 3.46

We would like to test the data to determine whether there is a difference between the three suppliers.

487
Follow the Roadmap…Test for Normality and Equal Variances

All three suppliers' samples are Normally Distributed:
Supplier A P-value 0.568
Supplier B P-value 0.385
Supplier C P-value 0.910

Box Plots don't reveal unusual trends and the Equal Variances test passes.

Anova: Single Factor

SUMMARY
Groups    Count  Sum    Average  Variance
Column 1  5      18.32  3.664    0.19373
Column 2  5      19.84  3.968    0.04207
Column 3  5      20.15  4.03     0.1745

ANOVA
Source of Variation  SS        df  MS        F        P-value   F crit
Between Groups       0.383693  2   0.191847  1.40273  0.283502  3.885294
Within Groups        1.6412    12  0.136767

As the P-value is 0.28 > 0.05, we fail to reject the null hypothesis: the Means of all three suppliers are statistically the same. Practically, we infer that the three suppliers are producing parts with the same average.

488
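The ANOVA table can be reproduced directly from the supplier data. A minimal Python sketch of the between/within sum-of-squares calculation:

```python
from statistics import mean

# Supplier samples from the ANOVA example above
groups = [
    [3.16, 4.35, 3.46, 3.74, 3.61],  # Supplier A
    [4.24, 3.87, 3.87, 4.12, 3.74],  # Supplier B
    [4.58, 4.00, 4.24, 3.87, 3.46],  # Supplier C
]

grand = mean(x for g in groups for x in g)
# Between-group SS: n_j * (group mean - grand mean)^2, summed over groups
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
# Within-group SS: squared deviations from each group's own mean
ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)

df_between = len(groups) - 1                           # 2
df_within = sum(len(g) for g in groups) - len(groups)  # 12
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(round(ss_between, 6), round(ss_within, 4), round(f_stat, 3))
# 0.383693 1.6412 1.403
```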
Sample Size

Let’s check how much difference we can see with a sample of 5.

Power and Sample Size

One-way ANOVA
Alpha = 0.05  Assumed Standard Deviation = 1  Number of Levels = 3

Sample Size  Power  SS Means  Maximum Difference
5            0.9    3.29659   2.56772

The sample size is for each level.

489
ANOVA Assumptions

1. Observations are adequately described by the model.


2. Errors are normally and independently distributed.
3. Homogeneity of variance among factor levels.

In one-way ANOVA, model adequacy can be checked by either of the


following:
1. Check the data for Normality at each level and for homogeneity of
variance across all levels.
2. Examine the residuals (a residual is the difference in what the model
predicts and the true observation).
1. Normal plot of the residuals
2. Residuals versus fits
3. Residuals versus order

If the model is adequate, the residual plots will be structureless.


490
Residual Plots

Stat>ANOVA>One-Way Unstacked>Graphs

491
Histogram of Residuals

Histogram of the Residuals (responses are Supplier A, Supplier B, Supplier C)

The Histogram of residuals should show a


bell shaped curve.

492
Normal Probability Plot of Residuals

Normality plot of the residuals should follow a straight line.


Results of our example look good.
The Normality assumption is satisfied.

Normal Probability Plot of the Residuals (responses are Supplier A, Supplier B, Supplier C)

493
Residuals versus Fitted Values

The plot of residuals versus fits examines constant variance.


The plot should be structureless with no outliers present.
Our example does not indicate a problem.

Residuals Versus the Fitted Values (responses are Supplier A, Supplier B, Supplier C)

494
ANOVA Exercise

Exercise objective: Utilize what you have learned to conduct and


analyze a one way ANOVA using MINITABTM.

1. The quality manager was challenged by the plant director as to why the
VOC levels in the product varied so much. The quality manager now
wants to find if the product quality is different because of how the shifts
work with the product.

2. The quality manager wants to know if the average is different for the
ppm VOC of the product among the production shifts.

3. Use Data in columns “ppm VOC” and “Shift” in “hypoteststud.mtw” to


determine the answer for the quality manager at a 95% confidence level.

495
ANOVA Exercise: Solution

First we need to do a Stat>Basic Stat>Graphical Summary


graphical summary
of the samples from
the 3 shifts.

496
ANOVA Exercise: Solution

We want to see if the 3 samples


are from Normal populations.

In “Variables:” enter ‘ppm


VOC’

In “By variables:” enter ‘Shift’

497
ANOVA Exercise: Solution

The P-value is greater than 0.05 for all three Anderson-Darling Normality Tests, so we conclude the samples are from Normally Distributed populations because we "failed to reject" the null hypothesis that the data sets are from Normal Distributions.

Summary for ppm VOC (Shift = 1): A-Squared 0.32, P-Value 0.446; Mean 39.500, StDev 6.761, Variance 45.714, N 8; Minimum 32.000, Median 38.000, Maximum 50.000
Summary for ppm VOC (Shift = 2): A-Squared 0.37, P-Value 0.334; Mean 34.625, StDev 5.041, Variance 25.411, N 8; Minimum 25.000, Median 35.500, Maximum 42.000
Summary for ppm VOC (Shift = 3): A-Squared 0.24, P-Value 0.658; Mean 28.000, StDev 6.525, Variance 42.571, N 8; Minimum 19.000, Median 28.000, Maximum 38.000

498
ANOVA Exercise: Solution

First we need to determine if our data has


Equal Variances.

Stat > ANOVA > Test for Equal Variances…

Now we need to test the variances.

For “Response:” enter ‘ppm VOC’

For “Factors:” enter ‘Shift’

499
ANOVA Exercise: Solution

The P-value of the F-test was greater than 0.05 so we “fail to reject” the null
hypothesis.

Test for Equal Variances for ppm VOC (factor: Shift)

Bartlett's Test: Test Statistic 0.63, P-Value 0.729
Levene's Test: Test Statistic 0.85, P-Value 0.440

Are the variances equal? Yes!


500
ANOVA Exercise: Solution

We need to use the One-Way ANOVA to


determine if the Means are equal of product
quality when being produced by the 3 shifts.
Again, we want to put 95.0 for the confidence
level.

Stat > ANOVA > One-Way…

For “Response:” enter ‘ppm VOC’

For “Factor:” enter ‘Shift’

Also be sure to click “Graphs…” to select “Four in


one” under residual plots.

Also, remember to click “Assume equal variances”


because we determined the variances were equal
between the 2 samples.
501
ANOVA Exercise: Solution

We must look at the Residual Plots to be sure our ANOVA analysis is valid.
Since our residuals look Normally Distributed and randomly patterned, we will
assume our analysis is correct.

Residual Plots for ppm VOC:
Normal Probability Plot of the residuals (N 24, AD 0.255, P-Value 0.698)
Residuals Versus the Fitted Values
Histogram of the Residuals
Residuals Versus the Order of the Data
502
ANOVA Exercise: Solution

Since the P-value of the ANOVA test is less than 0.05, we “reject” the null
hypothesis that the Mean product quality as measured in ppm VOC is the
same from all shifts.
We “accept” the alternate hypothesis that the Mean product quality is
different from at least one shift.

Don’t miss that


shift!

Since the confidence intervals of the


Means do not overlap between Shift 1
and Shift 3, we see one of the shifts is
delivering a product quality with a
higher level of ppm VOC.

503
Summary

At this point, you should be able to:


• Conduct Hypothesis Testing of Variances
• Understand how to analyze Hypothesis Testing results

504
Hypothesis Testing Non Normal Data Part 1

Inferential Statistics

Intro to Hypothesis Testing

Hypothesis Testing ND P1

Hypothesis Testing ND P2
Equal Variance Tests
Hypothesis Testing NND P1
Tests for Medians
Hypothesis Testing NND P2

Wrap Up & Action Items

505
Non-Normal Hypothesis Tests

At this point we have covered the tests for determining significance for Normal
Data. We will continue to follow the roadmap to complete the tests for
Continuous Data that are not Normally Distributed.

Later in the module we will use another roadmap that was designed for Discrete
Data.
– Recall that Discrete Data does not follow a Normal Distribution, but
because it is not Continuous Data, there are a separate set of tests to
properly analyze the data.

We can test for anything!!


506
Non-Normality

Why do we care if a data set is Normally Distributed?


– When it is necessary to make inferences about the true nature of the
population based on random samples drawn from the population.
– When the two indices of interest (X-Bar and s) depend on the data being
Normally Distributed.
– For problem solving purposes, because we don’t want to make a bad
decision – having Normal Data is so critical that with EVERY statistical
test, the first thing we do is check for Normality of the data.

Recall the four primary causes for Non-normal Data:


– Skewness – Natural and Artificial Limits
– Mixed Distributions - Multiple Modes
– Kurtosis
– Granularity

We will focus on Skewness for the remaining tests for Continuous Data.
507
Hypothesis Testing Roadmap

Continuous Data: Non-normal

Test of Equal Variance          Median Test

Mann-Whitney          Several Median Tests

508
Test of Equal Variance

Levene’s test of Equal Variance is used to compare the estimated


population Standard Deviations from two or more samples with Non-
normal Distributions.

– Ho: σ1 = σ2 = σ3 …
– Ha: At least one is different.
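The course uses MINITABTM for Levene’s test. Purely as a cross-check, here is a minimal stdlib-Python sketch of the Brown-Forsythe variant of Levene’s statistic (absolute deviations from each group’s median, then a one-way ANOVA F on those deviations). The function name and structure are ours, not part of the course material:

```python
from statistics import median

def brown_forsythe(*groups):
    """Levene-type statistic for equal variances using absolute
    deviations from each group's median (robust for Non-normal data).
    Compare the returned F to an F(k-1, n-k) critical value."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    # absolute deviations from each group's median
    z = [[abs(x - median(g)) for x in g] for g in groups]
    zbar_i = [sum(zi) / len(zi) for zi in z]    # group means of the deviations
    zbar = sum(sum(zi) for zi in z) / n         # grand mean of the deviations
    ss_between = sum(len(zi) * (zb - zbar) ** 2 for zi, zb in zip(z, zbar_i))
    ss_within = sum((x - zb) ** 2 for zi, zb in zip(z, zbar_i) for x in zi)
    return ((n - k) / (k - 1)) * (ss_between / ss_within)
```

Groups with similar spread give an F near zero; very different spreads give a large F.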

509
Follow the Roadmap…

Open the MINITABTM worksheet “EXH_AOV.MTW”

P-value < 0.05 (<0.005): assume the data is not Normally Distributed.

[Probability Plot of Rot 2 (Normal): Mean 1.023, StDev 1.407, N 100,
AD 7.448, P-Value <0.005]

510
Test of Equal Variance Non-Normal Distribution

Stat>ANOVA>Test for Equal Variances

Use Levene’s Statistics for Non-normal Data.
P-value > 0.05 (0.860): assume the variances are equal.

Ho: σ1 = σ2 = σ3 …
Ha: At least one is different.

[Test for Equal Variances for Rot 2 by Factors2, with 95% Bonferroni
confidence intervals for StDevs:
F-Test: Test Statistic 1.75, P-Value 0.053
Levene's Test: Test Statistic 0.03, P-Value 0.860]

511
Making Conclusions

When testing 2 samples with Normal Distribution, use F-test:


– To determine whether two Normal Distributions have equal variance.

When testing >2 samples with Normal Distribution, use Bartlett’s test:
– To determine whether multiple Normal Distributions have equal variance.

When testing 2 or more samples with Non-normal Distributions, use Levene’s test:
– To determine whether two or more distributions have Equal Variance.

Our focus for this module when working with Non-normal Distributions.

512
Exercise

Exercise objective: To practice solving the problem presented using the appropriate Hypothesis Test.

A credit card company wants to understand the need for customer service
personnel. The company thinks there is variability impacting the efficiency
of its customer service staff. The credit card company has two types of
cards. The company wants to see if there is more variability in one type of
customer card than another. The Black Belt was selected and told to give
with 95% confidence the answer of similar variability between the two card
types.

1. Analyze the problem using the Hypothesis Testing roadmap.


2. Use the columns named CallsperWk1 and CallsperWk2.
3. Having a confidence level of 95% is there a difference in variance?

HYPOTESTSTUD.MPJ

513
Test for Equal Variance Example: Solution

First test to see if the Data is Normal or Non-Normal.


Stat>Basic Statistics>Normality Test

514
Test for Equal Variance Example: Solution

Since there are two variables we need to perform a Normality Test on
CallsperWk1 and CallsperWk2.

First select the variable ‘CallsperWk1’ and press “OK”.

Follow the same steps for CallsperWk2.

515
Test for Equal Variance Example: Solution

For the Data to be Normal the P-value must be greater than 0.05.

516
Test for Equal Variance Example: Solution

Since we know the variables are Non-normal Data, continue to follow the
Roadmap.

The next step is to test Calls/Week for Equal Variance.

Before performing a Levene’s Test we have to stack the columns for
CallsperWk1 and CallsperWk2 because currently the data is in separate
columns.

Data>Stack>Columns…

517
Test for Equal Variance Example: Solution

After stacking the Calls/Week columns the next step in the Roadmap is
performing a Levene’s Test.

Stat>ANOVA>Test for Equal Variances

518
Nonparametric Tests

A non-parametric test makes no assumptions about Normality.


For a skewed distribution:
– The appropriate statistic to describe the central tendency is the Median, rather than
the Mean.
– If just one distribution is not Normal, a non-parametric should be used.
Non-parametric Hypothesis Testing works the same way as parametric testing. Evaluate
the P-value in the same manner.

[Graphic: a target value compared to the sample Medians X̃1 and X̃2]

519
Mean and Median

In general, nonparametric tests do the following: rank order the data, sum the data by ranks, sign the data
above or below the target, and calculate, compare and test the Median.
Comparisons and tests about the Median make nonparametric tests useful with very Non-normal Data.
Note: This Graphical Summary provides the confidence interval for the Median.

With Normal Data, notice the symmetrical shape to


the distribution, and notice how the Mean and the
Median are centered.
A nderson-D arling N ormality Test
With skewed data, the Mean is influenced by the outliers.
A -S quared
P -V alue
0.30
0.574
Notice the Median is still centered.
M ean 350.51
S tD ev 5.01
V ariance 25.12
S kew ness -0.079532 A nderson-Darling N ormality Test
Kurtosis -0.635029 A -S quared 3.72
N 75 P -V alue < 0.005
M inimum 339.09 M ean 4.8454
1st Q uartile 347.48 S tD ev 3.1865
M edian 350.48 V ariance 10.1536
3rd Q uartile 353.99 S kew ness 1.11209
M aximum 359.53 Kurtosis 1.26752
340 344 348 352 356 360 95% C onfidence Interv al for M ean N 200
349.35 351.66 M inimum 0.1454
95% C onfidence I nterv al for M edian 1st Q uartile 2.4862
349.30 351.85 M edian 4.1533
3rd Q uartile 6.5424
95% C onfidence I nterv al for S tD ev
M aximum 16.4629
4.32 5.97 0 3 6 9 12 15 95% C onfidence I nterv al for M ean
95% Confidence Intervals 4.4011 5.2898
Mean 95% C onfidence Interv al for M edian
3.6296 4.7174
Median 95% C onfidence I nterv al for StD ev
2.9018 3.5336
349.0 349.5 350.0 350.5 351.0 351.5 352.0
95% Confidence Intervals

Mean

Median

3.5 4.0 4.5 5.0 5.5

520
MINITABTM’s Nonparametrics

1-Sample Sign: performs a one-sample sign test of the Median and calculates the
corresponding point estimate and confidence interval. Use this test as an alternative to
one-sample Z and one-sample t-tests.
1-Sample Wilcoxon: performs a one-sample Wilcoxon signed rank test of the Median and
calculates the corresponding point estimate and confidence interval (more
discriminating or efficient than the sign test). Use this test as a nonparametric
alternative to one-sample Z and one-sample t-tests.
Mann-Whitney: performs a Hypothesis Test of the equality of two population Medians and
calculates the corresponding point estimate and confidence interval. Use this test as a
nonparametric alternative to the two-sample t-test.
Kruskal-Wallis: performs a Hypothesis Test of the equality of population Medians for a
one-way design. This test is more powerful than Mood’s Median (the confidence
interval is narrower, on average) for analyzing data from many populations, but is less
robust to outliers. Use this test as an alternative to the one-way ANOVA.
Mood’s Median Test: performs a Hypothesis Test of the equality of population Medians in
a one-way design. Test is similar to the Kruskal-Wallis Test. Also referred to as the
Median test or sign scores test. Use as an alternative to the one-way ANOVA.

521
1-Sample Sign Test

This test is used when you want to compare the Median of one distribution to a
target value.
– Must have at least one column of numeric data. If there is more than one column
of data, MINITABTM performs a one-sample Wilcoxon test separately for each
column.
The hypotheses:
– H0: M = Mtarget
– Ha: M ≠ Mtarget
Interpretation of the resulting P-value is the same.

Note: For the purpose of calculating sample size for a non-parametric (Median)
test use:

n(non-parametric) = n(t-test) / 0.864
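As a quick sketch of that adjustment (assuming the 0.864 efficiency factor quoted above; the helper name is ours):

```python
import math

def n_nonparametric(n_t_test):
    """Sample size for a Median (non-parametric) test, inflated from
    the equivalent t-test sample size by the 0.864 factor."""
    return math.ceil(n_t_test / 0.864)
```

For example, a t-test that needs 30 samples implies about 35 samples for the equivalent sign test.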

522
1-Sample Example

1. Practical Problem:
Our facility requires a cycle time from an improved process of 63 minutes. This process supports
the customer service division and has become a bottleneck to completion of order processing. To
alleviate the bottleneck the improved process must perform at least at the expected 63 minutes.

2. Statistical Problem:
Ho: M = 63
Ha: M ≠ 63

3. 1-Sample Sign or 1-Sample Wilcoxon

Open the MINITABTM data file: DISTRIB1.MTW


Stat>Non parametric> 1 sample sign …
Or
Stat> Non parametric> 1 sample Wilcoxon

4. Sample Size:
This data set has 500 samples (well in excess of necessary sample size).

523
1-Sample Example

Stat>Non parametric> 1 Sample Sign …

For a two tailed test, choose the


“not equal” for the alternative
hypothesis.

Sign Test for Median: Pos Skew

Sign Test of Median = 63.00 versus not = 63.00

            N  Below  Equal  Above       P  Median
Pos Skew  500     37      0    463  0.0000   65.70
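The sign test itself is simple enough to verify by hand. A stdlib-Python sketch (our own helper, not MINITABTM): count points above and below the target, drop ties, and compute a two-sided binomial P-value with p = 0.5.

```python
import math

def sign_test(data, target):
    """One-sample sign test of H0: Median = target. Returns the counts
    above and below the target and a two-sided binomial P-value."""
    above = sum(1 for x in data if x > target)
    below = sum(1 for x in data if x < target)
    n = above + below                 # ties with the target are dropped
    k = min(above, below)
    # two-sided P-value: 2 * P(X <= k), where X ~ Binomial(n, 1/2)
    p = 2 * sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return above, below, min(p, 1.0)
```

A small P-value means the sample Median is significantly different from the target.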

524
1-Sample Example

Stat>Non parametric> 1 Sample Wilcoxon …

Wilcoxon Signed Rank Test: Pos Skew


Test of Median = 63.00 versus Median not = 63.00

N for Wilcoxon Estimated


N Test Statistic P Median
Pos Skew 500 500 124015.0 0.000 67.83

525
1-Sample Example

For a confidence interval, enter


desired level

Since the target of 63 is not within the confidence interval, reject the
null hypothesis.

Sign confidence interval for Median

                        Achieved   Confidence Interval
            N   Median  Confidence   Lower   Upper  Position
Pos Skew  500    65.70      0.9455   65.30   66.50       229
                            0.9500   65.26   66.50       NLI
                            0.9558   65.20   66.51       228

526
1-Sample Example

Since the target of 63 is not within the


confidence interval, reject the null
hypothesis.

Wilcoxon Signed Rank CI: Pos Skew


Confidence
Estimated Achieved Interval
N Median Confidence Lower Upper
Pos Skew 500 67.83 95.0 67.01 68.70

527
1 Sample Exercise

Exercise objective: To practice solving the problem presented using the appropriate Hypothesis Test.

A mining company is falling behind profit targets. The mine manager


wants to determine if his mine is achieving the target production of 2.1
tons/day and has some limited data to analyze. The mine manager asks
the Black Belt to say if the mine is achieving 2.1 tons/day and the Black
Belt says she will answer with 95% confidence.

1. Analyze the problem using the Hypothesis Testing roadmap.


2. Use the column Tons hauled.
3. Does the Median equal the target value?

HYPOTESTSTUD.MPJ

528
1 Sample Example: Solution

According to the hypothesis the Mine Manager feels he is achieving his target of 2.1 tons/day.
H0: M = 2.1 tons/day Ha: M ≠ 2.1 tons/day

Since we are using one sample, we have a choice of choosing either a 1 Sample-Sign or 1 Sample Wilcoxon.
For this example we will use a 1 Sample-Sign.

Stat>Nonparametrics>1-Sample Sign

529
1 Sample Example: Solution

Sign Test for Median: Tons hauled


Sign Test of Median = 2.100 versus not = 2.100
N Below Equal Above P Median
Tons hauled 17 14 0 3 0.0127 1.800

The results show a P-value of 0.0127 and a Median of 1.800.

The Black Belt in this case does not agree; based on this data the Mine
Manager is not achieving his target of 2.1 tons/day.

We disagree!

530
Mann-Whitney Example

The Mann-Whitney test is used to test if the Medians for 2 samples are different.

1. Determine if different machines have different Median cycle times.

2. Ho: M1 = M2
Ha: M1 ≠ M2

3. Perform the Mann-Whitney test.

4. There are 200 data points for each machine, well over the minimum sample
necessary.

5. Open the MINITABTM data set: “Nonparametric.mtw”
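Before looking at MINITABTM’s answer, it can help to see what the test computes. A minimal stdlib-Python sketch of the Mann-Whitney rank sum W and the U statistic (our own helper; the P-value additionally needs a normal approximation or tables):

```python
def rank_sum(sample1, sample2):
    """Mann-Whitney rank sum W for sample1 (average ranks for ties)
    and the corresponding U statistic."""
    pooled = sorted(sample1 + sample2)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2   # average of ranks i+1 .. j
        i = j
    w = sum(rank_of[x] for x in sample1)
    u = w - len(sample1) * (len(sample1) + 1) / 2
    return w, u
```

When the two samples come from the same distribution, U is near n1*n2/2; a U far from that value points to different Medians.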

531
Mann-Whitney Example

5. Statistical Conclusion
[Probability Plot of Mach A (Normal): Mean 15.24, StDev 5.379, N 200,
AD 1.550, P-Value <0.005.
Probability Plot of Mach B (Normal): Mean 16.73, StDev 5.284, N 200,
AD 0.630, P-Value 0.099.]

532
Mann-Whitney Example

6. Practical Conclusion: The Medians of the machines are different.

Stat>Nonparametric>Mann-Whitney…

If the samples are the same, zero


would be included within the
confidence interval.

Mann-Whitney Test and CI: Mach A, Mach B


N Median
Mach A 200 14.841
Mach B 200 16.346
Point estimate for ETA1-ETA2 is -1.604
95.0 Percent CI for ETA1-ETA2 is (-2.635,-0.594)
W = 36509.0
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at
0.0019

533
Exercise

Exercise objective: To practice solving the problem presented using the appropriate Hypothesis Test.

A credit card company now understands there is no variability difference in


customer calls/week for the two different credit card types. This means no
difference in strategy of deploying the workforces. However, the credit card
company wants to see if there is a difference in call volume between the two
different card types. The company expects no difference since the total sales
among the two credit card types are similar. The Black Belt was selected and told
to evaluate with 95% confidence if the averages were the same. The Black Belt
reminded the credit card company the calls/day were not Normal distributions so
he would have to compare using Medians since Medians are used to describe the
central tendency of Non-normal Populations.

1. Analyze the problem using the Hypothesis Testing roadmap.


2. Use the columns named CallsperWk1 and CallsperWk2.
3. Is there a difference in call volume between the 2 different card types?

HYPOTESTSTUD.MPJ

534
Mann-Whitney Example: Solution

Since we know our data for CallsperWk1 and CallsperWk2 are Non-normal,
we can proceed to performing a Mann-Whitney Test.

Stat>Nonparametrics>Mann-Whitney

Mann-Whitney Test and CI: CallsperWk1, CallsperWk2


N Median
CallsperWk1 22 739.0
CallsperWk2 105 770.0
Point estimate for ETA1-ETA2 is -26.5
95.0 Percent CI for ETA1-ETA2 is (-91.9,43.0)
W = 36509.0
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.4580

535
Mann-Whitney Example: Solution

The P-value of 0.4580 is greater than 0.05, so we fail to reject the null
hypothesis: the Medians of CallsperWk1 and CallsperWk2 are not
significantly different.

Therefore, there is not a difference in call volume between the two different card
types.

Mann-Whitney Test and CI: CallsperWk1, CallsperWk2


N Median
CallsperWk1 22 739.0
CallsperWk2 105 770.0
Point estimate for ETA1-ETA2 is -26.5
95.0 Percent CI for ETA1-ETA2 is (-91.9,43.0)
W = 36509.0
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.4580

536
Mood’s Median Test

1. An aluminum company wanted to compare the operation of its three facilities


worldwide. They want to see if there is a difference in the recoveries among
the three locations. A Black Belt was asked to help management evaluate the
recoveries at the locations with 95% confidence.
2. Ho: M1 = M2 = M3
Ha: at least one is different

3. Use the Mood’s Median test.


4. Based on the smallest sample of 13, the test will be able to detect a difference
close to 1.5.
5. Statistical Conclusions: Use columns named Recovery and Location for
analysis.

= = ?
537
Follow the Roadmap…Normality

Stat>Basic Statistics>Graphical Summary…

[Graphical Summary for Recovery, Location = Savannah: A-Squared 0.81,
P-Value 0.032, Mean 87.660, StDev 7.944, N 25, Median 87.500, 95%
Confidence Interval for Median (86.179, 90.080)]

538
Follow the Roadmap…Normality

[Graphical Summary for Recovery, Location = Bangor: A-Squared 0.72,
P-Value 0.045, Mean 93.042, StDev 5.918, N 13, Median 94.800, 95%
Confidence Interval for Median (90.637, 97.036)]

[Graphical Summary for Recovery, Location = Ankhar: A-Squared 0.86,
P-Value 0.022, Mean 88.302, StDev 6.929, N 20, Median 88.425, 95%
Confidence Interval for Median (86.735, 89.299)]

539
Follow the Roadmap…Equal Variance

[Test for Equal Variances for Recovery by Location, with 95% Bonferroni
confidence intervals for StDevs:
Bartlett's Test: Test Statistic 1.33, P-Value 0.514
Levene's Test: Test Statistic 1.02, P-Value 0.367]

540
Mood’s Median Test

Statistical Solution: Since the P-value of the Mood’s Median test is less than
0.05, we reject the null hypothesis.
Practical Solution: Bangor has the highest recovery of all three facilities.
We observe the confidence intervals for the Medians of
the 3 populations. Note there is no overlap of the 95%
confidence levels for Bangor—so we visually know the
P-value is below 0.05.

Mood Median Test: Recovery versus Location

Mood median test for Recovery


Chi-Square = 12.11 DF = 2 P = 0.002

Individual 95.0% CIs


Location N<= N> Median Q3-Q1 ---+---------+---------+---------+---
Ankhar 13 7 88.4 4.5 (-----*--)
Bangor 1 12 94.8 6.8 (-------------*------)
Savannah 15 10 87.5 17.6 (----*-------)
---+---------+---------+---------+---
87.0 90.0 93.0 96.0
Overall median = 88.9
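The chi-square in that output can be reproduced from first principles: classify each point as above or at/below the overall Median, then compute a chi-square on the resulting 2 x k table. A stdlib-Python sketch (our own helper, not the MINITABTM implementation):

```python
from statistics import median

def moods_median(groups):
    """Mood's Median test: count points at/below vs above the overall
    median in each group, then compute the chi-square statistic for
    the resulting 2 x k table (compare to chi-square with k-1 df)."""
    overall = median([x for g in groups for x in g])
    table = [(sum(1 for x in g if x <= overall),
              sum(1 for x in g if x > overall)) for g in groups]
    col = [sum(t[0] for t in table), sum(t[1] for t in table)]
    n = col[0] + col[1]
    chi2 = 0.0
    for le, gt in table:
        size = le + gt
        for obs, total in ((le, col[0]), (gt, col[1])):
            expected = size * total / n
            chi2 += (obs - expected) ** 2 / expected
    return overall, table, chi2
```

A group whose counts sit mostly on one side of the overall Median inflates the chi-square, which is what drives the low P-value for Bangor above.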

541
Kruskal-Wallis Test

Using the same data set, analyze using the Kruskal-Wallis test.

Kruskal-Wallis Test: Recovery versus Location

Kruskal-Wallis Test on Recovery

Location N Median Ave Rank Z


Ankhar 20 88.43 27.3 -0.73
Bangor 13 94.80 40.2 2.60
Savannah 25 87.50 25.7 -1.49
Overall 58 29.5

H = 6.86 DF = 2 P = 0.032
H = 6.87 DF = 2 P = 0.032 (adjusted for ties)

This output is the “least friendly” to interpret. Look for the P-value which tells us
we reject the null hypothesis. We have the same conclusion as with the Mood’s
Median test.
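For reference, the H statistic is just a comparison of average ranks between groups. A stdlib-Python sketch (our own helper; it uses average ranks for ties but omits MINITABTM’s additional tie adjustment):

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H: rank the pooled data (average ranks for ties)
    and compare each group's mean rank to the grand mean rank.
    Compare H to a chi-square distribution with k-1 df."""
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    return 12 / (n * (n + 1)) * sum(
        len(g) * (sum(rank_of[x] for x in g) / len(g) - (n + 1) / 2) ** 2
        for g in groups)
```

Identical groups give H near zero; well-separated groups give a large H.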

542
Exercise

Exercise objective: To practice solving the problem presented using the appropriate Hypothesis Test.

A manufacturing company making pagers is interested in evaluating the


defect rate of 3 months from one of its facilities. A customer has said
that the defect rate was surprising lately but didn’t know for sure. A
Black Belt was selected to investigate the first 3 months of this year.
She is to report back to senior management with 95% confidence about
any shift(s) in defect rates.

1. Analyze the problem using the Hypothesis Testing roadmap.


2. Use the columns named ppm defective1, ppm defective2 and ppm
defective3.
3. Are the defect rates equal for 3 months?

HYPOTESTSTUD.MPJ

543
Pagers Defect Rate Example: Solution

Let’s follow the Roadmap to see if the data is Normal.

Instead of performing a Normality Test, we can find the P-value using
the Graphical Summary in MINITABTM.

Stat>Basic Statistics>Graphical Summary


544
Pagers Defect Rate Example: Solution

Before we can perform a Mood’s Median


test we must first stack the columns
ppmdefective1, ppmdefective2 and
ppmdefective3.

545
Pagers Defect Rate Example: Solution

After stacking the columns we


can perform a Mood’s Median
test.

Stat>Nonparametric>Mood’s Median Test

546
Unequal Variance

Where do you go in the roadmap if the variance is not equal?


– Unequal variances are usually the result of differences in the shape
of the distribution.
• Extreme tails
• Outliers
• Multiple modes

These conditions should be explored through data demographics.

For Skewed Distributions with comparable Medians it is unusual for the


variances to be different without some assignable cause impacting the
process.

547
Example

Model A and Model B are similar in nature (not exact), but are manufactured in
the same plant.
– Check for Normality:
Var_Comp.mtw

[Probability Plot of Model A (Normal): Mean 10.28, StDev 0.7028, N 10,
AD 0.227, P-Value 0.747.
Probability Plot of Model B (Normal): Mean 2.826, StDev 3.088, N 10,
AD 0.753, P-Value 0.033.]

Model A is Normal, Model B is Non-normal.

548
Example

Check for Equal Variances using the Levene’s test.


[Test for Equal Variances for Data by idvar (Model A, Model B), with 95%
Bonferroni confidence intervals for StDevs:
F-Test: Test Statistic 0.05, P-Value 0.000
Levene's Test: Test Statistic 4.47, P-Value 0.049]

The P-value is just under the limit of .05. Whenever the result
is borderline, as in this case, use your process knowledge to
make a judgment.
549
Example

Let’s look at data demographics for clues.


[Graphical Summary for Model A: P-Value 0.747, Mean 10.279, StDev 0.703,
Median 10.111, N 10.
Graphical Summary for Model B: P-Value 0.033, Mean 2.8260, StDev 3.0882,
Median 1.7773, N 10.
Dotplot of Model A and Model B on a common scale.]

Graph> Dotplot> Multiple Y’s, Simple

550
Black Belt Aptitude Exercise

Exercise objective: To practice solving the problem presented using the appropriate Hypothesis Test.

• A recent deployment at a client raised the question of which educational


background is best suited to be a successful Black Belt candidate.
• In order to answer the question, the MBB instructor randomly sampled
the results of a Six Sigma pretest taken by now certified Black Belts at
other businesses.
• Undergraduate backgrounds in Science, Liberal Arts, Business and
Engineering were sampled.
• Management wants to know so they can screen prospective candidates
for educational background.

1. Analyze the problem using the Hypothesis Testing roadmap.


2. What educational background is best suited for a potential Black Belt?

HYPOTESTSTUD.MPJ

551
Black Belt Aptitude Exercise: Solution

First follow the Roadmap to check the data for Normality.

552
Black Belt Aptitude Exercise: Solution

Next we are going to check for variance.

Before performing a Test for Equal Variance, should the data be stacked?

Stat>ANOVA>Test for Equal Variance

553
Summary

At this point, you should be able to:


• Conduct Hypothesis Testing for Equal Variance

• Conduct Hypothesis Testing for Medians

• Analyze and interpret the results

554
Hypothesis Testing Non Normal Data Part 2

Inferential Statistics

Intro to Hypothesis Testing

Hypothesis Testing ND P1

Hypothesis Testing ND P2

Hypothesis Testing NND P1


Tests for Proportions
Hypothesis Testing NND P2
Contingency Tables
Wrap Up & Action Items

555
Hypothesis Testing Roadmap Attribute Data

Attribute Data

One Factor: One Sample, Two Samples, or Two or More Samples
Two Factors

One Sample Proportion

Two Sample Proportion
MINITABTM: Stat - Basic Stats - 2 Proportions
If P-value < 0.05 the proportions are different

Chi Square Test (Contingency Table), for two or more samples
MINITABTM: Stat - Tables - Chi-Square Test
If P-value < 0.05 at least one proportion is different

Chi Square Test (Contingency Table), for two factors
MINITABTM: Stat - Tables - Chi-Square Test
If P-value < 0.05 the factors are not independent

556
Sample Size and Types of Data

For Continuous Data:


– Capability analysis – a minimum of 30 samples
– Hypothesis Testing – depends on the practical difference to be
detected and the inherent variation in the process.

For Attribute Data:


– Capability analysis – a lot of samples
– Hypothesis Testing – a lot, but depends on practical difference to be
detected.

MINITABTM can estimate sample sizes, but remember the smaller the
difference that needs to be detected the larger the sample size will be!

557
Proportion versus a Target

This test is used to determine if the process proportion (p) equals some
desired value, p0.

The hypotheses:
– H0: p = p0
– Ha: p ≠ p0

The observed test statistic is calculated as follows (normal approximation):

Zobs = (p̂ − p0) / √( p0 (1 − p0) / n )

This is compared to Zcrit = Zα/2
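The Zobs formula translates directly to code. A stdlib-Python sketch (our own helper); note that MINITABTM’s 1 Proportion test reports an exact P-value by default, so this normal approximation is only a cross-check:

```python
import math

def one_proportion_z(x, n, p0):
    """Normal-approximation Z for H0: p = p0, given x successes in n."""
    p_hat = x / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
```

For the shipping data later in this module (480 accurate out of 500, target 0.99) it gives Z of about -6.74, far outside the critical value of ±1.96.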

558
Proportion versus a Target

1. Shipping accuracy must be on target of 99%; determine if the current process is


on target.

2. Hypotheses:
– H0: p = 0.99
– Ha: p ≠ 0.99

Enter multiple values for alternative values of p and MINITABTM will
give the different sample sizes.

3. One sample proportion test


– Choose a = 5%

4. Sample size:

559
Proportion versus a Target

Power and Sample Size


Test for One Proportion
Testing proportion = 0.99 (versus not = 0.99)
Alpha = 0.05

Alternative  Sample  Target
 Proportion    Size   Power  Actual Power
       0.95     140     0.9      0.900247
       0.96     221     0.9      0.900389
       0.97     428     0.9      0.900316
       0.98    1402     0.9      0.900026

Our sample included 500 shipped items of which 480 were accurate.

p̂ = X / n = 480 / 500 = 0.96
560
Proportion versus a Target

5. Statistical Conclusion: Reject the null hypothesis.


6. Practical Conclusion: We are not performing to the accuracy target of 99%

The hypothesized Mean is not


within the confidence interval,
reject the null hypothesis.

Test and CI for One Proportion

Test of p = 0.99 vs p not = 0.99

                                                    Exact
Sample    X    N  Sample p                95% CI  P-value
     1  480  500  0.960000  (0.938897, 0.975399)    0.000

Stat>Basic Statistics>1 Proportion…

561
Exercise

Exercise objective: To practice solving the problem presented using the appropriate Hypothesis Test.

You are the shipping manager and are in charge of improving


shipping accuracy. Your annual bonus depends on your ability
to prove that shipping accuracy is better than the target of 80%.

1. How many samples do you need to take if the anticipated sample


proportion is 82%?

2. Out of 2000 shipments only 1680 were accurate.


• Do you get your annual bonus?
• Was the sample size good enough?

562
Proportion vs Target Example: Solution

First we have to figure out the proper


sample size to achieve our target of
80%.

Stat>Power and Sample Size>1 Proportion

563
Proportion vs Target Example: Solution

Now let us calculate if we receive our bonus…

Out of the 2000 shipments, 1680


were accurate. Was the sample
size sufficient?

p̂ = X / n = 1680 / 2000 = 0.84

564
Comparing Two Proportions

This test is used to determine if the process defect rate (or proportion, p) of one
sample differs by a certain amount D from that of another sample (e.g., before and
after your improvement actions)

The hypotheses:
H0: p1 - p2 = D
Ha: p1 - p2 ≠ D

The test statistic is calculated as follows:


p̂1  p̂ 2  D
Zobs 
p̂1 1  p̂1  n1  p̂ 2 1  p̂ 2  n 2

This is compared to Zcritical = Za/2
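A stdlib-Python sketch of this statistic (our own helper, using the unpooled standard error shown on the slide). With the shipping-accuracy data used later in this module (510 of 600 before, 212 of 225 after) it reproduces MINITABTM’s Z = -4.33:

```python
import math

def two_proportion_z(x1, n1, x2, n2, d=0.0):
    """Z statistic for H0: p1 - p2 = D using the unpooled standard
    error from the two sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2 - d) / se
```

A |Z| greater than 1.96 rejects the null hypothesis at the 95% confidence level.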

Catch some Z’s!


565
Sample Size and Two Proportions

Take a few moments to practice calculating the minimum sample size required to
detect a difference between two proportions using a power of 0.90.

Enter the expected proportion for proportion 2 (null hypothesis).

For a more conservative estimate when the null hypothesis proportion is close to
1.0, use the smaller proportion for p1. When the null hypothesis proportion is
close to 0, use the larger proportion for p1.

a  p1 p2 n
5% .01 0.79 0.8 ___________
5% .01 0.81 0.8 ___________
5% .02 0.08 0.1 ___________
5% .02 0.12 0.1 ___________
5% .01 0.47 0.5 ___________
5% .01 0.53 0.5 ___________
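A rough way to fill in a table like this without software is the normal-approximation formula n = (Zα/2 + Zβ)² · (p1(1−p1) + p2(1−p2)) / (p1 − p2)². A stdlib-Python sketch (our own helper; the defaults assume α = 0.05 two-sided and 90% power, and the result can differ slightly from MINITABTM’s answer):

```python
import math

def n_two_proportions(p1, p2, z_alpha=1.959964, z_beta=1.281552):
    """Approximate per-group sample size to detect p1 vs p2.
    Defaults give alpha = 0.05 two-sided, power = 0.90; requires p1 != p2."""
    num = (z_alpha + z_beta) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil(num / (p1 - p2) ** 2)
```

For the 0.85 vs 0.95 case on the next slide this approximation gives 184 per group, in the same range as MINITABTM’s 188.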

566
Comparing Two Proportions

1. Shipping accuracy must improve from a historical baseline of 85% towards a target of
95%. Determine if the process improvements made have increased the accuracy.
2. Hypotheses:
– H0: p1 – p2 = 0.0 Stat>Power and Sample Size> 2 Proportions…

– Ha: p1 – p2 ≠ 0.0
3. Two sample proportion test
– Choose a = 5%
4. Sample size:

Power and Sample Size


Test for Two Proportions
Testing proportion 1 = proportion 2 (versus not =)
Calculating power for proportion 2 = 0.95
Alpha = 0.05

              Sample  Target
Proportion 1    Size   Power  Actual Power
        0.85     188     0.9      0.901451

The sample size is for each group.

567
Comparing Two Proportions

The following data were taken:

                     Total Samples  Accurate
Before Improvement             600       510
After Improvement              225       212

Calculate proportions:
Before Improvement: 600 samples, 510 accurate → p̂1 = X1 / n1 = 510 / 600 = 0.850
After Improvement:  225 samples, 212 accurate → p̂2 = X2 / n2 = 212 / 225 = 0.942

568
Comparing Two Proportions

Stat>Basic Statistics>2 Proportions…


5. Statistical Conclusion: Reject the null

6. Practical Conclusion: You have


achieved a significant difference in
accuracy.

Test and CI for Two Proportions


Sample X N Sample p
1 510 600 0.850000
2 212 225 0.942222

Difference = p (1) - p (2)


Estimate for difference: -0.0922222
95% CI for difference: (-0.134005, -0.0504399)
Test for difference = 0 (vs not = 0): Z = -4.33 P-Value = 0.000
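The Z and P-Value in the session output above can be checked by hand — a sketch assuming the unpooled standard error, which is what reproduces Minitab's Z here:

```python
from math import sqrt
from scipy.stats import norm

x1, n1 = 510, 600   # before improvement
x2, n2 = 212, 225   # after improvement
p1, p2 = x1 / n1, x2 / n2

se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # unpooled SE
z = (p1 - p2) / se
p_value = 2 * norm.cdf(-abs(z))                      # two-sided

print(round(z, 2))   # -4.33, as in the output above
```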

569
Exercise

Exercise objective: To practice solving the problem presented using
the appropriate Hypothesis Test.

Boris and Igor tend to make a lot of mistakes writing


requisitions.

1. Who is worse?
2. Is the sample size large enough?

570
2 Proportion vs Target Example: Solution

First we need to calculate our
estimated p1 and p2 for Boris and Igor.

Boris:  p̂1 = X1/n1 = 47/356 = 0.132
Igor:   p̂2 = X2/n2 = 99/571 = 0.173

571
2 Proportion vs Target Example: Solution

Now let’s see what the minimum


sample size should be…

Stat>Power and Sample Size>2 Proportions

572
Contingency Tables

Contingency Tables are used to simultaneously compare more than two


sample proportions with each other.

It is called a Contingency Table because we are testing whether the


proportion is contingent upon, or dependent upon the factor used to
subgroup the data.

This test generally works best with 5 or more observations in each cell.
Observations can be pooled by combining cells.

Some examples for use include:


– Return proportion by product line
– Claim proportion by customer
– Defect proportion by manufacturing line

573
Contingency Tables

The null hypothesis is that the population proportions of each group are the
same.
– H0: p1 = p2 = p3 = … = pn
– Ha: at least one p is different

Statisticians have shown that the following statistic forms a chi-square


distribution when H0 is true:


χ² = Σ (observed – expected)² / expected

Where “observed” is the sample frequency, “expected” is the calculated


frequency based on the null hypothesis, and the summation is over all cells in
the table.

574
Test Statistic Calculations

Chi-square Test

χ²obs = Σ(i=1 to r) Σ(j=1 to c) (Oij – Eij)² / Eij

Eij = (Frow × Fcol) / Ftotal

χ²critical = χ²α,ν   (from the Chi-Square Table)

Where:
O = the observed value (from sample data)
E = the expected value
r = number of rows
c = number of columns
Frow = total frequency for that row
Fcol = total frequency for that column
Ftotal = total frequency for the table
ν = degrees of freedom [(r-1)(c-1)]

575
Contingency Table Example

1. Larry, Curley and Moe are order entry operators and you suspect that
one of them has a lower defect rate than the others.
2. Ho: pMoe = pLarry = pCurley

Ha: at least one p is different


3. Use Contingency Table since there are 3 proportions.
4. Sample Size: To ensure that a minimum of 5 occurrences were detected,
the test was run for one day.

Moe Larry Curley


Defective 5 8 20
OK 20 30 25

Can’t you clowns get the


entries correct?!
576
Contingency Table Example

The sample data are the “observed” frequencies. To calculate the


“expected” frequencies, first add the rows and columns:

Moe Larry Curley Total


Defective 5 8 20 33
OK 20 30 25 75
Total 25 38 45 108

Then calculate the overall proportion for each row:

            Moe   Larry   Curley   Total
Defective     5       8       20      33   0.306   (33/108 = 0.306)
OK           20      30       25      75   0.694
Total        25      38       45     108

577
Contingency Table Example

Now use these proportions to calculate the expected frequencies in each cell:

            Moe   Larry   Curley   Total
Defective     5       8       20      33   0.306
OK           20      30       25      75   0.694
Total        25      38       45     108

Example expected counts: 0.306 × 45 = 13.8 (Defective, Curley); 0.694 × 38 = 26.4 (OK, Larry).
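The expected counts above come straight from the row and column totals; a sketch of the arithmetic:

```python
observed = [[5, 8, 20],    # Defective: Moe, Larry, Curley
            [20, 30, 25]]  # OK

row_totals = [sum(row) for row in observed]        # [33, 75]
col_totals = [sum(col) for col in zip(*observed)]  # [25, 38, 45]
grand_total = sum(row_totals)                      # 108

expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
print(round(expected[0][2], 2))   # 13.75 -- shown rounded as 13.8 above
```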

578
Contingency Table Example

Next calculate the χ² value for each cell in the table:

(observed – expected)² / expected

            Moe    Larry   Curley
Defective  0.912   1.123    2.841        e.g. (20 – 13.8)²/13.8 = 2.841
OK         0.401   0.494    1.250

Finally, add these numbers to get the observed chi-square:

χ²obs = 0.912 + 1.123 + 2.841 + 0.401 + 0.494 + 1.250
χ²obs = 7.02

579
Contingency Table Example

A summary of the table:

                       Moe    Larry   Curley
Defective  Observed      5        8       20
           Expected    7.6     11.6     13.8
           χ²        0.912    1.123    2.841
OK         Observed     20       30       25
           Expected   17.4     26.4     31.3
           χ²        0.401    0.494    1.250

χ²obs = 7.02

580
Contingency Table Example

Critical Value:
• Like any other Hypothesis Test, compare the observed statistic with the critical
statistic. We decide α = 0.05; what else do we need to know?
• For a chi-square distribution, we need to specify ν. In a Contingency Table:
ν = (r - 1)(c - 1), where
r = # of rows
c = # of columns
• In our example, we have 2 rows and 3 columns, so ν = 2
• What is the critical chi-square? For a Contingency Table, all the risk is in the
right hand tail (i.e. a one-tail test); look it up in MINITABTM using
Calc>Probability Distributions>Chisquare…

χ²crit = 5.99
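The same lookup can be done outside Minitab; a sketch using scipy's chi-square inverse CDF:

```python
from scipy.stats import chi2

alpha, dof = 0.05, 2                  # one-tail test, nu = (2-1)(3-1) = 2
chi2_crit = chi2.ppf(1 - alpha, dof)  # inverse CDF (percent point function)
print(round(chi2_crit, 2))            # 5.99
```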

581
Contingency Table Example

Graphical Summary:
Since the observed chi-square exceeds the critical chi-square, we reject the
null hypothesis that the defect rate is independent of which person enters the
orders.

Chi-square probability density function for ν = 2: the accept region lies to the left of
χ²crit = 5.99 and the reject region is the right-hand tail beyond it. The observed value
χ²obs = 7.02 falls in the reject region.

582
Contingency Table Example

Using MINITABTM

• Of course MINITABTM eliminates the tedium of crunching these numbers.


Type the order entry data from the Contingency Table Example into
MINITABTM as shown:

• Notice that row labels are not necessary and row and column totals are not
used, just the observed counts for each cell.

583
Contingency Table Example

Stat>Tables>Chi-Square Test (2 way table in worksheet)

5. Statistical Conclusion: Reject the null hypothesis.
6. Practical Conclusion: The defect rate for one of these stooges is different. In
other words, defect rate is contingent upon the stooge.

Chi-Square Test: Moe, Larry, Curley

Expected counts are printed below observed counts


Chi-Square contributions are printed below expected counts

Moe Larry Curley Total


1 5 8 20 33
7.64 11.61 13.75
0.912 1.123 2.841

2 20 30 25 75
17.36 26.39 31.25
0.401 0.494 1.250

Total 25 38 45 108

Chi-Sq = 7.021, DF = 2, P-Value = 0.030
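scipy's chi-square test of independence reproduces the whole table above in one call (a sketch):

```python
from scipy.stats import chi2_contingency

table = [[5, 8, 20],    # Defective: Moe, Larry, Curley
         [20, 30, 25]]  # OK

chi2_stat, p, dof, expected = chi2_contingency(table)
print(round(chi2_stat, 3), dof, round(p, 3))   # 7.021 2 0.03
```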

584
Exercise

Exercise objective: To practice solving the problem presented using
the appropriate Hypothesis Test.

• You are the quotations manager and your team thinks that the reason
you don’t get a contract depends on its complexity.
• You determine a way to measure complexity and classify lost contracts
as follows:

Low Med High


Price 8 10 12
Lead Time 10 11 9
Technology 5 9 16
1. Write the null and alternative hypothesis.
2. Does complexity have an effect?

585
Contingency Table Example: Solution

First we need to create a table in


MINITABTM

Secondly, in MINITABTM perform a


Chi-Square Test

Stat>Tables>Chi-Square Test

586
Contingency Table Example: Solution

Are the factors independent of each other?

587
Overview

Contingency Tables are another form of Hypothesis Testing.


They are used to test for association (or dependency) between two classifications.
The null hypothesis is that the classifications are independent.
A Chi-square Test is used for frequency (count) type data.
If the data is converted to a rate (over time) then a continuous type test would be
possible. However, determining the period of time that the rate is based on can be
controversial. We do not want to just pick a convenient interval; there needs to be
some rationale behind the decision. Many times we see rates based on a day
because that is the easiest way to collect data. However, a more appropriate way
would be to look at the rate distribution per hour.

Per hour? Per day? Per month?

588
Summary

At this point, you should be able to:

• Calculate and explain test for proportions

• Calculate and explain Contingency Tests

589
Analyze
Wrap Up and Action Items
Analyze Wrap Up Overview

The goal of the Analyze Phase is to:

• Locate the variables which are significantly impacting your Primary Metric. Then
establish Root Causes for “X” variables using Inferential Statistical Analysis such
as Hypothesis Testing and Simple Modeling.

• Gain and demonstrate a working knowledge of Inferential Statistics as a means of


identification of leverage variables.

591
Six Sigma Behaviors

• Embracing change

• Continuous learning

• Being tenacious and courageous

• Making data-based decisions

• Being rigorous

• Thinking outside of the box

Each “player” in the Six Sigma process must be
A ROLE MODEL
for the Six Sigma culture.
592
Analyze Deliverables

• Listed below are the Analyze Phase deliverables that each candidate will present
in a Power Point presentation at the beginning of the Control Phase training.
• At this point you should all understand what is necessary to provide these
deliverables in your presentation.
– Team Members (Team Meeting Attendance)
– Primary Metric
– Secondary Metric(s)
– Data Demographics
– Hypothesis Testing (applicable tools)
– Modeling (applicable tools)
– Strategy to reduce X’s
– Project Plan
– Issues and Barriers It’s your show!

593
Analyze Phase - The Roadblocks

Look for the potential roadblocks and plan to address them before they
become problems:
– Lack of data
– Data presented is the best guess by functional managers
– Team members do not have the time to collect data
– Process participants do not participate in the analysis planning
– Lack of access to the process

594
DMAIC Roadmap

Process Owner
Champion/

Identify Problem Area

Determine Appropriate Project Focus


Define

Estimate COPQ

Establish Team
Measure

Assess Stability, Capability, and Measurement Systems

Identify and Prioritize All X’s


Analyze

Prove/Disprove Impact X’s Have On Problem


Improve

Identify, Prioritize, Select Solutions Control or Eliminate X’s Causing Problems

Implement Solutions to Control or Eliminate X’s Causing Problems


Control

Implement Control Plan to Ensure Problem Doesn’t Return

Verify Financial Impact

595
Analyze Phase

Vital Few X’s Identified

State Practical Theories of Vital Few X’s Impact on Problem

Translate Practical Theories into Scientific Hypothesis

Select Analysis Tools to Prove/Disprove Hypothesis

Collect Data

Perform Statistical Tests

State Practical Conclusion

Statistically
Significant?
N
Y
Update FMEA

N
Practically
Significant?

Root
Cause
N
Y
Identify Root Cause

Ready for Improve and Control

596
Analyze Phase Checklist

Analyze Questions

Define Performance Objectives Graphical Analysis


• Is existing data laid out graphically?
• Are there newly identified secondary metrics?
• Is the response discrete or continuous?
• Is it a Mean or a variance problem or both?

Document Potential X’s Root Cause Exploration


• Are there a reduced number of potential X’s?
• Who participated in these activities?
• Are the number of likely X’s reduced to a practical number for analysis?
• What is the statement of Statistical Problem?
• Does the process owner buy into these Root Causes?

Analyze Sources of Variability Statistical Tests


• Are there completed Hypothesis Tests?
• Is there an updated FMEA?

General Questions
• Are there any issues or barriers that prevent you from completing this phase?
• Do you have adequate resources to complete the project?

597
Planning for Action

WHAT | WHO | WHEN | WHY | WHY NOT | HOW

Qualitative screening of vital from controllable trivial X’s
Qualitative screening for other factors
Quantitative screening of vital from controllable trivial X’s
Ensure compliance to problem solving strategy
Quantify risk of meeting needs of customer, business and people
Predict risk of sustainability
Chart a plan to accomplish desired state of culture
Assess shift in process location
Minimize risk of process failure
Modeling Continuous or Non Continuous Output
Achieving breakthrough in Y with minimum efforts

598


Summary

At this point, you should:

• Have started to develop a project plan to meet the deliverables

• Have identified ways to deal with potential roadblocks

• Be ready to apply the Six Sigma method through your project

You’re on your way!

599
Improve Phase
Process Modeling

Welcome to Improve
Correlation

Process Modeling: Regression Introduction to Regression

Simple Linear Regression


Advanced Process Modeling: MLR

Designing Experiments

Experimental Methods

Full Factorial Experiments

Fractional Factorial Experiments

601
Correlation

• The primary purpose of linear correlation analysis is to measure the strength of linear
association between two variables (X and Y).
• If X increases and there is no definite shift in the values of Y, there is no correlation or no
association between X and Y.
• If X increases and there is a shift in the values of Y, there is a correlation.
• The correlation is positive when Y tends to increase and negative when Y tends to decrease.
• If the ordered pairs (X, Y) tend to follow a straight line path, there is a linear correlation.
• The preciseness of the shift in Y as X increases determines the strength of the linear
correlation.
• To conduct a linear correlation analysis you need:
– Bivariate Data – Two pieces of data that are variable
– Bivariate data is comprised of ordered pairs (X/Y)
– X is the independent variable
– Y is the dependent variable

602
Correlation Coefficient

Ho: No Correlation Ho ho ho….


Ha: There is Correlation
Ha ha ha….

The Correlation Coefficient always assumes a value between –1 and +1.

The Correlation Coefficient of the population, R, is estimated by the sample Correlation
Coefficient, r:

r = Σ(Xi – X̄)(Yi – Ȳ) / √( Σ(Xi – X̄)² · Σ(Yi – Ȳ)² )

603
Types and Magnitude of Correlation

Six scatter plots of Output vs. Input illustrate the types: Strong, Moderate and Weak
Positive Correlation (top row) and Strong, Moderate and Weak Negative Correlation
(bottom row). The more tightly the points cluster about a straight line, the stronger
the correlation.

604
Limitations of Correlation

• A strong positive or negative correlation between X and Y does not indicate causality.
• Correlation provides an indication of the strength but does not provide us with an exact
numerical relationship (i.e. Y=f(x)).
• The magnitude of the Correlation Coefficient is somewhat relative and should be used with
caution.
• Just like any other statistic, you need to assess whether the Correlation Coefficient is
statistically significant, as well as practically significant.
• As usual, statistical significance is judged by comparing a P-value with the chosen degree
of alpha risk.
• Guidelines for practical significance are as follows:
– If | r | > 0.80, relationship is practically significant
– If | r | < 0.20, relationship is not practically significant

Area of negative          No linear          Area of positive
linear correlation        correlation        linear correlation

-1.0    -0.8          -0.2    0    0.2          0.8    +1.0


605
Correlation Example

RB Stats Correlation.mtw

The Correlation Coefficient [r]:
• Is a positive value if one variable increases as the other variable increases.
• Is a negative value if one variable decreases as the other increases.

Correlation Formula:

r = Σ(Xi – X̄)(Yi – Ȳ) / √( Σ(Xi – X̄)² · Σ(Yi – Ȳ)² )

X values (Payton carries): 196, 311, 339, 333, 369, 317, 339, 148, 314, 381, 324, 321, 146
Y values (Payton yards): 679, 1390, 1852, 1359, 1610, 1460, 1222, 596, 1421, 1684, 1551, 1333, 586

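Applying the correlation formula to the Payton data above (numpy is used for brevity; a sketch):

```python
import numpy as np

carries = np.array([196, 311, 339, 333, 369, 317, 339, 148,
                    314, 381, 324, 321, 146], dtype=float)
yards = np.array([679, 1390, 1852, 1359, 1610, 1460, 1222, 596,
                  1421, 1684, 1551, 1333, 586], dtype=float)

dx, dy = carries - carries.mean(), yards - yards.mean()
r = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())
print(round(r, 3))   # 0.935 -- a strong positive correlation
```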
606
Correlation Analysis

1. With the data set, first you need to create a scatter plot. The scatter plot will give you a visual aid to how this model appears.

2. Then, you need to determine the Pearson’s Coefficient of Correlation. If the Pearson’s Coefficient of Correlation is > 0.8 or < -0.8, proceed to Step
3.

3. Understand the R-Square value. If R-Square is > 0.64, proceed to step 4.

4. Use the p-values to decide which terms to include in the final regression equation.

5. Use the Regression equation and note down Predicted values.

6. Check whether the differences between the Predicted values and Actual values (known as Residuals) are normally distributed.

7. Non-normal residuals would often mean outliers in the residuals, which could be due to unusual observations resulting from special causes of
variations.
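The seven steps above can be run end to end in Python — a sketch on the Payton data, using scipy's linregress plus a Shapiro-Wilk test as a stand-in for the residual normality check:

```python
import numpy as np
from scipy import stats

carries = np.array([196, 311, 339, 333, 369, 317, 339, 148,
                    314, 381, 324, 321, 146], dtype=float)
yards = np.array([679, 1390, 1852, 1359, 1610, 1460, 1222, 596,
                  1421, 1684, 1551, 1333, 586], dtype=float)

# Steps 2-4: correlation, R-Square and the p-value all come from the fit
fit = stats.linregress(carries, yards)
print(round(fit.slope, 2), round(fit.rvalue ** 2, 2))   # 4.92 0.87

# Steps 5-6: residuals and a normality check on them
residuals = yards - (fit.intercept + fit.slope * carries)
w_stat, p_norm = stats.shapiro(residuals)   # p > 0.05 suggests normal residuals
```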

607
Correlation Analysis

Step 1 – The Pearson’s Coefficient of Correlation is 0.93. Check the Simple Regression sheet to see how the Pearson’s Coefficient of
Correlation was calculated.

Step 2 – As Pearson’s Coefficient of Correlation is 0.93, i.e. > 0.8, a strong correlation between Y and X is established.

Step 3 – Study the R-Square value. R-Square value is 0.8733, i.e. 87.33%. R-Square > 64%.

Step 4 – Now to study the p-values to statistically determine which of the components of the regression equation y = 4.916x – 163.5.

Step 5 – If p-values of components > 0.05, the component needs to be omitted. If p-values of components < 0.05, the component needs to be
included in the model.

608
Correlation Analysis

SUMMARY OUTPUT

Regression Statistics
Multiple R            0.934531611
R Square              0.873349332
Adjusted R Square     0.861835635
Standard Error        153.9852477
Observations          13

ANOVA
             df   SS            MS        F            Significance F
Regression    1   1798586.901   1798587   75.8530755   2.88668E-06
Residual     11   260826.0217   23711.46
Total        12   2059412.923

             Coefficients    Standard Error   t Stat     P-value      Lower 95%      Upper 95%
Intercept    -163.4974881    172.0359099      -0.95037   0.36233931   -542.1459727   215.151

Step 6 – The p-value of the Intercept indicates a non-significant term; that means the constant
would need to be omitted from the model. The X Variable shows a significant p-value, so the
final regression equation should be Y = 4.916x.

Step 7 – Let us now plot the predicted values for the same values of x, and find out the Residuals.

609
Correlation Analysis

The normal probability plot and the residuals plot do not indicate the presence of any special patterns.

The residuals are normally distributed.

The Regression model is valid to be chosen.

The Regression equation y = 4.916x can be chosen for predictive purposes.

Statistical Solution

1. The Regression equation can be used to arrive at the Statistical solution.

2. For example, with y = 4.916x: if Payton Yards is the final output we are measuring, and Payton
Yards is 700 when Payton Carries = 100, what should the value of Payton Yards be if Payton
Carries needs to improve to 150?

3. The new value of Payton Yards will be the statistical solution desired from the project.

610
Modeling Y=f(x) Exercise: Question 3 Solution

If Dorsett carries the football 325 times the predicted value would be
determined as follows…

Step 1: Dorsett Yards = -160.1 + 4.993 (Dorsett Carries)

Step 2: Dorsett Yards = -160.1 + 4.993 (325)

Step 3: Dorsett Yards = -160.1 + 1622.725

Solution: Dorsett Yards = 1462.63
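The three steps are just the prediction equation evaluated at 325 carries; a one-line check of the arithmetic (the function name is ours):

```python
def dorsett_yards(carries):
    # Prediction equation from the solution above
    return -160.1 + 4.993 * carries

print(round(dorsett_yards(325), 3))   # 1462.625, i.e. about 1462.63
```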

611
Summary

At this point, you should be able to:


• Perform the steps in a Correlation and a Regression Analysis
• Explain when Correlation and Regression is appropriate

612
Advanced Modeling & Regression

Review Corr./Regression
Process Modeling: Regression
Non-Linear Regression
Advanced Process Modeling: MLR
Transforming Process Data
Designing Experiments
Multiple Regression

Experimental Methods

Full Factorial Experiments

Fractional Factorial Experiments

613
Correlation and Linear Regression Review

Correlation and Linear Regression are used:


– With historical process data. It is NOT a form of experimentation.
– To determine if two variables are related in a linear fashion.
– To understand the strength of the relationship.
– To understand what happens to the value of Y when the value of X is increased
by one unit.
– To establish a Prediction Equation that will enable us to predict Y for any level of
X.

Correlation explores association.


Correlation and regression do
not imply a causal relationship.
Designed experiments allow
for true cause and effect
relationships.

Correlations: Stirrate, Impurity


Pearson correlation of Stirrate and Impurity = 0.966
P-value = 0.000

614
Correlation Review

Correlation is used to measure the linear relationship between two Continuous Variables
(bi-variate data).
Pearson Correlation Coefficient “r” will always fall between –1 and +1.
A Correlation of –1 indicates a strong negative relationship: as one factor increases, the
other decreases.
A Correlation of +1 indicates a strong positive relationship: as one factor increases, so does
the other.

P-value > 0.05, Ho: No relationship


P-value < 0.05, Ha: There is a relationship

“r”
Strong No Strong
Correlation Correlation Correlation

-1.0 0 +1.0

Decision Points

615
Linear Regression Review

Linear Regression is used to model the relationship between a Continuous


response variable (Y) and one or more Continuous independent variables (X).
The independent predictor variables are most often Continuous but can be
ordinal.
– Example of ordinal - Shift 1, 2, 3, etc.

P-value > 0.05, Ho: Regression equation is not significant


P-value < 0.05, Ha: Regression equation is significant

Fitted Line Plot: Impurity = –0.289 + 0.4566 Stirrate
(S = 0.919316, R-Sq = 93.4%, R-Sq(adj) = 92.7%)

The slope of the line is the change in Y-value for every one-unit change in X.

616
Regression Analysis Review

Correlation only tells us the strength of a linear relationship, not the numerical
relationship.
The last step to proper analysis of Continuous Data is to determine the
regression equation.
The regression equation can mathematically predict Y for any given X.

Prediction Equations:
Y = a + bx (Linear or 1st order model)
Y = a + bx + cx2 (Quadratic or 2nd order model)
Y = a + bx + cx2 + dx3 (Cubic or 3rd order model)
Y = a (bx) (Exponential)

617
Simple vs. Multiple Regression Review

Simple Regression
– One X, One Y
– Analysis can be done with Data Analysis and
Scatter Plots

Multiple Regression
– Two or More X’s, One Y
– Analysis can be done with Data Analysis and
Scatter Plots

In both cases the R-sq value estimates the amount of


variation explained by the model.

618
Regression Step Review

The basic steps to follow in Regression are as follows:


1. Create Scatter Plot
2. Determine Correlation
3. Determine and Interpret R-Square value
4. Check for p-values
5. Use Regression equation to find residuals
6. Check if residuals are normally distributed

One step at a time….


619
Simple Regression Example

In doing Simple Linear Regression, you may find that residuals are not
normally distributed, or you may find unusual observations.

These indicate departure from Simple Linear Regression (one X and one Y,
with a Regression equation of the form y = mx + c).

In such scenarios, you should consider Curvilinear Regression.

620
Curvilinear Regression

A sample data set for a contact center setting is as shown below. Note that the Call AHT is the eventual response variable (Y) and Call ACW is the
predictor variable we wish to investigate.

Steps for doing a Curvilinear Regression will remain the same with the only difference of one option.

As you can see, there is an immediate change in the shape of the equation.

Earlier the equation was of the form y = mx + c, but now it is y = m1x + m2x² + c.

One should use Curvilinear Regression only when Model Adequacy parameters are not met with Simple Linear Regression.
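A second-order fit of the form y = m1x + m2x² + c can be sketched with numpy; the data below are synthetic for illustration (the contact-center data set itself is not reproduced here):

```python
import numpy as np

# Synthetic data that follows y = 2x^2 + 3x + 1 exactly (illustration only)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = 2 * x ** 2 + 3 * x + 1

m2, m1, c = np.polyfit(x, y, deg=2)   # coefficients, highest power first
print(round(m2, 6), round(m1, 6), round(c, 6))   # 2.0 3.0 1.0
```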

621
Types of Non-Linear Relationships

Oh, which formula to use?!

622
Multiple Linear Regression

What do you do when you have multiple input variables (Predictors) and one output variable (Response) to test?

Excel provides us a simple option of doing Multiple Linear Regression. All the steps of Simple Linear Regression hold except for drawing Scatter plots.

Drawing Scatter plots with multiple variables is a time consuming affair. You need to draw individual scatter plots for each variable. Yet, for illustration we
will adhere to the steps.

The data set for the Multiple Linear Regression example is shown below:

Call AHT Call ACW Wait Time


4.5 0.2 1
4.6 0.3 1.1
4.7 0.4 1.2
4.8 0.5 1.3
5 0.6 1.4
5.1 0.7 1.5
5.2 0.8 1.6
5.6 0.9 1.7
5.8 1 1.8

623
Multiple Linear Regression

Scatter charts for both the variables show a good fit line passing through all the points. Their Pearson’s coefficients of correlation are high as well.

Time for us to test all the other metrics and see how well this model holds.
SUMMARY OUTPUT

Regression Statistics
Multiple R 1
R Square 1

Adjusted R Square 1

Standard Error 5.36E-17


Observations 9

ANOVA

  df SS MS F Significance F

Regression 2 0.6 0.3 1.04E+32 2.37E-95


Residual 6 1.72E-32 2.87E-33
Total 8 0.6     

  Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%

Intercept -0.8 5.57E-16 -1.4E+15 7.69E-90 -0.8 -0.8 -0.8 -0.8

X Variable 1 1 3.16E-16 3.17E+15 6.7E-92 1 1 1 1

X Variable 2 4.416E-16 1.95E-16 2.268954 0.063759 -3.5E-17 9.18E-16 -3.5E-17 9.18E-16

(RESIDUAL OUTPUT and PROBABILITY OUTPUT tables not shown.)

624
Multiple Linear Regression

The Standard Error of the model ( 1 Response and 2 Predictors) is close to 0.

The Adjusted R-Square metric is close to 1, i.e. 100% of variability in residuals/ predicted values from actual values is explained by the model.

The p-values of the Intercept and one of the X variables are significant, so both of these terms need to be included in the final regression equation.

The final regression equation for the model would be

Y = -0.8 + Call ACW

Call Wait time is taken off from the regression model because it showed a non-significant p-value.

Yet, when we did a simple scatter plot, we saw a Best Fit Line passing through the points in the Scatter.

In situations like these, help of process expertise is needed to help you decide if you need to consider both the variables for your final regression equation.

625
Transforming Process Data

Data that is asymmetric can often be transformed to make it more symmetric using a
numerical function which operates more strongly on large numbers than small ones;
such as logarithms and roots.
Transform Rules:
1. The transform must preserve the relative order of the data.
2. The transform must be a smooth and continuous function.
3. Most often useful when the ratio of largest to smallest value is greater than two. In most
cases, the transform will have little effect when this rule is violated.
4. All external reference points (spec limits, etc.) must use the same transform.

xtrans = x^p  (or log(x) when p = 0)

Transformation     Power (p)
Cube                  3
Square                2
No Change             1
Square Root           0.5
Logarithm             0
Reciprocal Root      -0.5
Reciprocal           -1

626
Effect of Transformation

Histograms before and after the transform: the right-skewed data (left panel, “Right Skew”)
becomes symmetric after a square-root transform (right panel, “Sqrt”).

The transformed data now shows a Normal Distribution.

627
Transforming Data Using MINITABTM

The Box Cox Transformation procedure in MINITABTM is a method of determining the


transform power (called “lambda” in the software) for a set of data.

Stat>Control Charts>Box-Cox Transformation


Transform.MTW
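scipy offers a Box-Cox routine as a stand-in for the MINITAB procedure above; the data values below are made up. The λ = 0 and λ = 1 special cases (the log transform and a unit shift) make the behavior easy to sanity-check:

```python
import numpy as np
from scipy.stats import boxcox

data = np.array([0.2, 0.5, 1.1, 2.7, 7.4, 20.1])  # right-skewed sample (made up)

transformed, lam = boxcox(data)   # lambda estimated by maximum likelihood
print(round(lam, 3))              # the estimated transform power

# Fixed-lambda special cases: lambda = 0 is ln(x), lambda = 1 is x - 1
logged = boxcox(data, lmbda=0.0)
shifted = boxcox(data, lmbda=1.0)
```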

628
Box Cox Transform

Box-Cox Plot of Pos skew: Lambda estimate 0.337726 (95.0% confidence interval
0.136963 to 0.537207); best (rounded) value 0.500000.

Before transform — Probability Plot of Pos skew (Normal): Mean 1.050, StDev 0.8495,
N 100, AD 2.883, P-Value < 0.005.

After transform (x^0.50, i.e. √x) — Probability Plot of BoxCox (Normal): Mean 0.9469,
StDev 0.3934, N 100, AD 0.265, P-Value 0.687.

629
Transforming Without the Box Cox Routine

Transform.MTW An alternative method of transforming data is


to use standard transforms.
The square root and natural log transforms are
the most commonly used.
A disadvantage of using the Box Cox
transformation is the difficulty in reversing the
transformation.
The column of process data is in C1, labeled Pos
Skew. Remember this data was not Normally
Distributed as determined with the Anderson
Darling Normality test.
Using the MINITABTM calculator, calculate the
square root of each observation in C1 and store
in C3, calling it “Square Root”.

630
Transforming Without the Box Cox Routine

The output should resemble this view.


Transform.MTW
Confirm if the new dataset found in C3 is
Normally Distributed.
Probability Plot of Square Root (Normal): Mean 0.9469, StDev 0.3934, N 100,
AD 0.265, P-Value 0.687 — the data in C3 is Normally Distributed.

Our transform is the square root—the same as the Box Cox transform of
lambda = 0.5

631
Multiple Linear Regression

Regressions are run on historical process data. They are NOT a form of experimentation.
Multiple Linear Regression investigates the effects of multiple input variables on an output
simultaneously.
– If R2 is not as high as desired in the Simple Linear Regression.
– Process knowledge implies more than one input affects the output.
The assumptions for residuals with Simple Regressions are still necessary for Multiple Linear
Regressions.
An additional assumption for MLR is the independence of predictors (X’s).
– MINITABTM can test for multicollinearity (Correlation between the predictors or X’s).

Model error (residuals) is impacted by the addition of measurement error for all the input
variables.

632
Definitions of MLR Equation Elements

The definitions for the elements of the Multiple Linear Regression model are as follows:

Y = 0+ 1X1 + 2X2 + 3X3 + 


Y = The response (dependent) variable.
X1, X2, X3: The predictor (independent) inputs. The predictor variables used to
explain the variation in the observed response variable, Y.
β0: The value of Y when all the explanatory variables (the Xs) are equal to zero.
β1, β2, β3 (Partial Regression Coefficient): The amount by which the response variable
(Y) changes when the corresponding Xi changes by one unit with the other input
variables remaining constant.
ε (Error or Residual): The observed Y minus the predicted value of Y from the
Regression.

633
MLR Step Review

The basic steps to follow in Multiple Linear Regression are:

1. Create matrix plot (Graph>Matrix Plot)


2. Run Best Subsets Regression (Stat>Regression>Best Subsets)
3. Evaluate R2, adjusted R2 , Mallows’ Cp, number of predictors and S.
4. Iteratively determine appropriate regression model. (Stat>Regression> Regression
>Options)
5. Analyze residuals (Stat>Regression>Regression >Graphs)
1. Normally Distributed
2. Equal variance
3. Independence
4. Confirm one or two points do not overly influence model.
6. Verify your model by running present process data to confirm your model error.

634
Multiple Linear Regression Model Selection

When comparing and verifying models consider the following:


1. Should be a reasonably small difference between R2 and R2 - adjusted (much less
than 10% difference).
2. When more terms are included in the model, does the adjusted R2 increase?
3. Use the statistic Mallows’ Cp. It should be small and less than the number of
terms in the model.
4. Models with smaller S (Standard Deviation of error for the model) are desired.
5. Simpler models should be weighed against models with multiple predictors
(independent variables).
6. The best technique is to use MINITABTM’s Best Subsets command.

635
Flight Regression Example

An airplane manufacturer wanted to see what variables affect flight speed.


The historical data available covered a period of 10 months.

Flight Regression MLR.MTW

636
Flight Regression Example Matrix Plot

Look for plots that show correlation.

[Matrix Plot of Flight Speed, Altitude, Turbine Angle, Fuel/Air ratio, ICR, Temp]

Output Response: Flight Speed (top row). Predictors: Altitude, Turbine Angle, Fuel/Air ratio, ICR, Temp.
Since 2 or more predictors show Correlation, run MLR.
637
Flight Regression Example Best Subsets

Best Subsets Regression: Flight Speed versus Altitude,


Turbine Angl, ...

Response is Flight Speed

(In the MINITAB output, X markers beside each row indicate which of the five predictors — the marker columns run Altitude, Turbine Angle, Fuel/Air ratio, ICR, Temp — enter that model.)

Vars   R-Sq   R-Sq(adj)   Mallows C-p        S
  1    72.1     71.1          38.4      28.054
  1    39.4     37.2         112.8      41.358
  2    85.9     84.8           9.0      20.316
  2    82.0     80.6          17.9      22.958
  3    87.5     85.9           7.5      19.561
  3    86.5     84.9           9.6      20.267
  4    89.1     87.3           5.7      18.589
  4    88.1     86.1           8.2      19.481
  5    89.9     87.7           6.0      18.309

638
Flight Regression Example Model Selection

Best Subsets Regression: Flight Speed versus Altitude,


Turbine Angl, ...

Response is Flight Speed

List of all the Predictors (X’s): Altitude, Turbine Angle, Fuel/Air ratio, ICR, Temp

Vars   R-Sq   R-Sq(adj)   Mallows C-p        S
  1    72.1     71.1          38.4      28.054
  1    39.4     37.2         112.8      41.358
  2    85.9     84.8           9.0      20.316
  2    82.0     80.6          17.9      22.958
  3    87.5     85.9           7.5      19.561
  3    86.5     84.9           9.6      20.267
  4    89.1     87.3           5.7      18.589
  4    88.1     86.1           8.2      19.481
  5    89.9     87.7           6.0      18.309

What model would you select? Let’s consider the 5 predictor model:
• Highest R-Sq(adj)
• Lowest Mallows Cp
• Lowest S
• However there are many terms.

639
Flight Regression Example Model Selection

Stat>Regression>Regression>Options

640
Flight Regression Example Model Selection

Regression Analysis: Flight Speed versus Altitude, Turbine Angle, ...

The regression equation is


Flight Speed = 770 + 0.153 Altitude + 5.81 Turbine Angle + 8.70 Fuel/Air ratio
- 52.3 ICR + 4.11 Temp

Predictor          Coef     SE Coef      T      P    VIF
Constant          770.4      229.7     3.35  0.003
Altitude          0.15318    0.06605   2.32  0.030   2.3
Turbine Angle     5.806      2.843     2.04  0.053   1.4
Fuel/Air ratio    8.696      3.327     2.61  0.016   3.2
ICR             -52.269      6.157    -8.49  0.000   2.6
Temp              4.107      3.114     1.32  0.200   5.4

The VIF for Temp indicates it should be removed from the model. Go back to the Best Subsets analysis and select the best model that does not include the predictor Temp.

S = 18.3088 R-Sq = 89.9% R-Sq(adj) = 87.7%

Variance Inflation Factor (VIF) detects Correlation among predictors.


• VIF = 1 indicates no relation among predictors
• VIF > 1 indicates predictors are correlated to some degree
• VIF between 5 and 10 indicates Regression Coefficients are poorly estimated and are
unacceptable.
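Behind these rules, the VIF for predictor j is 1/(1 - R²ⱼ), where R²ⱼ comes from regressing predictor j on all the other predictors. A sketch, where the R²ⱼ for Temp is back-calculated from the VIF of 5.4 shown above rather than taken from the worksheet:

```python
def vif(r_sq_j):
    """Variance Inflation Factor for predictor j, given the R^2 of
    regressing that predictor on all the other predictors."""
    return 1.0 / (1.0 - r_sq_j)

# Temp's VIF of 5.4 implies its R^2 against the other predictors is about 0.815
temp_vif = vif(0.815)    # about 5.4, in the "poorly estimated" 5-to-10 band
clean_vif = vif(0.0)     # exactly 1.0: no correlation among predictors
```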

641
Flight Regression Example Model Selection

Note: It is not necessary to re-run the Best Subsets analysis. The numbers do not change.

Best Subsets Regression: Flight Speed versus Altitude, Turbine Angl, ...

Response is Flight Speed

Vars   R-Sq   R-Sq(adj)   Mallows C-p        S
  1    72.1     71.1          38.4      28.054
  1    39.4     37.2         112.8      41.358
  2    85.9     84.8           9.0      20.316
  2    82.0     80.6          17.9      22.958
  3    87.5     85.9           7.5      19.561
  3    86.5     84.9           9.6      20.267
  4    89.1     87.3           5.7      18.589
  4    88.1     86.1           8.2      19.481
  5    89.9     87.7           6.0      18.309

Select a model with 4 terms because Temp was removed as a predictor since it had Correlation with the other variables. Re-run the Regression.

642
Flight Regression Example Model Selection

Regression Analysis: Flight Speed versus Altitude, Turbine Angle, ...

The regression equation is


Flight Speed = 616 + 0.117 Altitude + 6.70 Turbine Angle + 12.2 Fuel/Air ratio
- 48.2 ICR
Predictor          Coef     SE Coef      T      P    VIF
Constant          616.1      200.7     3.07  0.005
Altitude          0.11726    0.06109   1.92  0.067   1.9
Turbine Angle     6.702      2.802     2.39  0.025   1.3
Fuel/Air ratio   12.151      2.082     5.84  0.000   1.2
ICR             -48.158      5.391    -8.93  0.000   1.9

The VIF values are NOW acceptable.

Evaluate the P-values:
• If P > 0.05, the term(s) should be removed from the Regression.

S = 18.5889 R-Sq = 89.1% R-Sq(adj) = 87.3%


Remove altitude, re-run model.

643
Flight Regression Example Model Selection

Regression Analysis: Flight Speed versus Turbine Angl, Fuel/Air rat, ICR

The regression equation is


Flight Speed = 887 + 4.82 Turbine Angle + 12.1 Fuel/Air ratio - 55.0 ICR

Predictor Coef SE Coef T P VIF


Constant 886.6 150.4 5.90 0.000
Turbine Angle 4.822 2.763 1.75 0.093 1.1
Fuel/Air ratio 12.106 2.191 5.53 0.000 1.2
ICR -55.009 4.251 -12.94 0.000 1.1

S = 19.5613 R-Sq = 87.5% R-Sq(adj) = 85.9%

The P-value for Turbine Angle (0.093) is now greater than 0.05, indicating it should be removed. Re-run the Regression.

644
Flight Regression Final Regression Model

Regression Analysis: Flight Speed versus Fuel/Air ratio, ICR


The regression equation is
Flight Speed = 1101 + 10.9 Fuel/Air ratio - 55.2 ICR
Predictor          Coef     SE Coef      T      P    VIF
Constant        1101.04      90.00    12.23  0.000
Fuel/Air ratio    10.921      2.163    5.05  0.000   1.1
ICR              -55.197      4.414  -12.51  0.000   1.1

S = 20.3162   R-Sq = 85.9%   R-Sq(adj) = 84.8%

This is the final Regression model because all remaining terms are statistically significant (we wanted 95% confidence, or a P-value of less than 0.05) and the R-Sq shows the remaining terms explain about 85% of the variation in flight speed.

Analysis of Variance
Source           DF     SS     MS      F      P
Regression        2  65500  32750  79.35  0.000
Residual Error   26  10731    413
Total            28  76231

Source           DF  Seq SS
Fuel/Air ratio    1     951
ICR               1   64549

Note the ICR predictor accounts for 84.7% of the variation: 84.7% = 64549/76231.

Unusual Observations
       Fuel/Air   Flight
Obs    ratio      Speed     Fit     SE Fit  Residual  St Resid
  1     40.6      618.00   624.29   11.55    -6.29     -0.38 X
 22     36.3      578.00   524.45    5.43    53.55      2.74 R

R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large influence.

Consider removing this outlier (Obs 22) but be careful, this is historical data that has no further information. Remember, the objective is to get information that can be used in a Designed Experiment where true cause and effect relationships can be established.
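Once the final equation is in hand, it can be used to predict flight speed at candidate settings. A sketch using the fitted equation above; the input values are arbitrary illustrations within the observed ranges, and remember the model has error (S is about 20.3):

```python
def flight_speed(fuel_air_ratio, icr):
    """Final fitted model from the flight example, coefficients as printed above."""
    return 1101 + 10.9 * fuel_air_ratio - 55.2 * icr

# Arbitrary illustrative settings inside the observed predictor ranges
speed = flight_speed(36.0, 17.0)   # 1101 + 392.4 - 938.4 = 555.0
```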

645
Flight Regression Example Residual Analysis

646
Flight Regression Example Residual Analysis

[Residual Plots for Flight Speed: Normal Probability Plot of the Residuals, Residuals Versus the Fitted Values, Histogram of the Residuals, and Residuals Versus the Order of the Data (standardized residuals; observation 22 stands out)]

• Normally Distributed Residuals (Normal Probability Plot)


• Equal variance (Residuals vs. Fitted Values)
• Independence (Residuals vs. Order of Data)

647
Summary

At this point, you should be able to:

• Perform Non-linear Regression Analysis

• Perform Multiple Linear Regression Analysis (MLR)

• Examine Residuals Analysis and understand its effects

648
Designing Experiments

Process Modeling: Regression

Advanced Process Modeling: MLR

Designing Experiments
– Reasons for Experiments
– Graphical Analysis
– DOE Methodology

Experimental Methods

Full Factorial Experiments

Fractional Factorial Experiments

649
Project Status Review

• Understood our problem and its impact on the business. (Define)


• Established firm objectives/goals for improvement. (Define)
• Quantified our output characteristic. (Define)
• Validated the measurement system for our output characteristic. (Measure)
• Identified the process input variables in our process. (Measure)
• Narrowed our input variables to the potential “X’s” through Statistical Analysis.
(Analyze)
• Selected the vital few X’s to optimize the output response(s). (Improve)
• Quantified the relationship of the Y’s to the X’s with Y=f(x). (Improve)

650
Six Sigma Strategy

[Funnel diagram: the Six Sigma strategy progressively narrows the X’s]
SIPOC, VOC, Project Scope: Suppliers, Customers, Inputs, Outputs, Contractors, Employees
P-Map, X-Y, FMEA, Capability: (X1) (X2) (X3) (X4) (X5) (X6) (X7) (X8) (X9) (X10) (X11)
Box Plot, Scatter Plots, Regression: (X1) (X2) (X3) (X4) (X5) (X8) (X11)
Fractional Factorial, Full Factorial, Center Points: (X3) (X4) (X5) (X11)
651
Reasons for Experiments

The Analyze Phase narrowed the many inputs down to a critical few; now it is necessary to
determine the proper settings for the vital few inputs because:
– The vital few potentially have interactions.
– The vital few will have preferred ranges to achieve optimal results.
– Confirm cause and effect relationships among factors identified in Analyze Phase (e.g.
regression)
Understanding the reason for an experiment can help in selecting the design and focusing
the efforts of an experiment.
Reasons for experimenting are:
– Problem Solving (Improving a process response)
– Optimizing (Highest yield or lowest customer complaints)
– Robustness (Constant response time)
– Screening (Further screening of the critical few to the vital few X’s)

Design where you’re going - be sure you get there!


652
Desired Results of Experiments

Problem Solving
– Eliminate defective products or services.
– Reduce cycle time of handling transactional processes.
Optimizing
– Mathematical model is desired to move the process response.
– Opportunity to meet differing customer requirements (specifications or VOC).
Robust Design
– Provide consistent process or product performance.
– Desensitize the output response(s) to input variable changes including NOISE
variables.
– Design processes knowing which input variables are difficult to maintain.
Screening
– Past process data is limited or statistical conclusions
prevented good narrowing of critical factors in Analyze
Phase.

When it rains it PORS!


653
DOE Models vs. Physical Models

What are the differences between DOE modeling and physical models?
– A physical model is known by theory using concepts of physics, chemistry,
biology, etc...
– Physical models explain outside area of immediate project needs and include
more variables than typical DOE models.
– DOE describes only a small region of the experimental space.

The objective is to minimize


the response. The physical
model is not important for our
business objective. The DOE
Model will focus in the region
of interest.

654
Definition for Design of Experiments

Design of Experiments (DOE) is a scientific method of planning and


conducting an experiment that will yield the true cause-and-effect
relationship between the X variables and the Y variables of interest.

DOE allows the experimenter to study the effect of many input variables that
may influence the product or process simultaneously, as well as possible
interaction effects (for example synergistic effects).

The end result of many experiments is to describe the results as a


mathematical function.
Y = f (x)

The goal of DOE is to find a design that will produce the information
required at a minimum cost.

Properly designed DOE’s are more efficient experiments.


655
One Factor at a Time is NOT a DOE

One Factor at a Time (OFAT) is an experimental style but not a planned experiment
or DOE.
The graphic shows yield contours for a process that are unknown to the experimenter.

Trial   Temp   Press   Yield
  1     125     30      74
  2     125     31      80
  3     125     32      85
  4     125     33      92
  5     125     34      86
  6     130     33      85
  7     120     33      90

[Contour plot, unknown to the experimenter: yield contours (75, 80, 85, 90, 95) over Temperature and Pressure. Trials 1–5 vary Pressure at a fixed Temperature; trials 6–7 then vary Temperature at the best Pressure found. OFAT identifies an optimum near trial 4 (yield 92), while the true optimum (yield near 95) is available only with DOE.]
656
Types of Experimental Designs

The most common types of DOE’s are:


– Fractional Factorials
• 4-15 input variables

– Full Factorials
• 2-5 input variables

– Response Surface Methods (RSM)


• 2-4 input variables

Response
Surface
Full Factorial

Fractional Factorials

657
Nomenclature for Factorial Experiments

The general notation used to designate a full factorial design is given by 2^k:

– Where k is the number of input variables or factors.
– 2 is the number of “levels” that will be used for each factor.
• Quantitative or qualitative factors can be used.
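The 2^k run count follows directly from taking every combination of the two coded levels across the k factors; a minimal sketch:

```python
from itertools import product

def full_factorial(k):
    """All 2^k runs of a two-level full factorial in coded (-1/+1) units.
    (Run order here is systematic; in practice MINITAB randomizes it.)"""
    return list(product([-1, 1], repeat=k))

runs = full_factorial(3)   # 2^3 = 8 runs, starting with (-1, -1, -1)
```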

658
Visualization of 2 Level Full Factorial

This is a 2² design: two factors, Temp and Press, each at two levels.

Uncoded levels for the factors: Temp = 300 F / 350 F; Press = 500 / 600

Coded levels for the factors:
 T    P    T*P
-1   -1    +1
+1   -1    -1
-1   +1    -1
+1   +1    +1

Four experimental runs:
• Temp = 300, Press = 500
• Temp = 350, Press = 500
• Temp = 300, Press = 600
• Temp = 350, Press = 600
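The coded and uncoded scales are linked through each factor’s center and half-range; a sketch using the Temp and Press levels above:

```python
def uncoded(coded, low, high):
    """Convert a coded -1/+1 setting to the factor's real units."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return center + coded * half_range

temp = uncoded(+1, 300, 350)    # 350 (F)
press = uncoded(-1, 500, 600)   # 500
```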

659
Graphical DOE Analysis - The Cube Plot

Consider a 2³ design on a catapult...

Run       A: Start   B: Stop    C:        Response:
Number    Angle      Angle      Fulcrum   Meters Traveled
  1         -1         -1         -1         2.10
  2          1         -1         -1         0.90
  3         -1          1         -1         3.35
  4          1          1         -1         1.50
  5         -1         -1          1         5.15
  6          1         -1          1         2.40
  7         -1          1          1         8.20
  8          1          1          1         4.55

[Cube plot: the eight responses plotted at the corners of the Start Angle × Stop Angle × Fulcrum cube]

What are the inputs being manipulated in this design?
How many runs are there in this experiment?
660
Graphical DOE Analysis - The Cube Plot

Stat>DOE>Factorial>Factorial Plots … Cube, select response and factors

This graph is used by the experimenter to visualize how the response data is
distributed across the experimental space.

[Cube Plot (fitted means) for Distance: the eight fitted means (2.10, 0.90, 3.35, 1.50, 5.15, 2.40, 8.20, 4.55) plotted at the corners of the Start Angle × Stop Angle × Fulcrum cube]

How do you read or interpret this plot? What are the corner values?

Catapult.mtw

661
Graphical DOE Analysis - The Main Effects Plot

Stat>DOE>Factorial>Factorial Plots … Main Effects, select response and factors

This graph is used to see the relative effect of each factor on the output
response.

[Main Effects Plot for Distance (data means): panels for Start Angle, Stop Angle and Fulcrum, each showing the mean Distance at the -1 and +1 levels]

Hint: Check the slope!
Which factor has the largest impact on the output?


662
Main Effects Plots’ Creation

Avg. distance at Low Setting of Start Angle: (2.10 + 3.35 + 5.15 + 8.20)/4 = 18.80/4 = 4.70

[Main Effects Plot (data means) for Distance: panels for Start Angle, Stop Angle and Fulcrum]

Avg. distance at High Setting of Start Angle: (0.90 + 1.50 + 2.40 + 4.55)/4 = 9.35/4 ≈ 2.34
Run # Start Angle Stop Angle Fulcrum Distance
1 -1 -1 -1 2.10
2 1 -1 -1 0.90
3 -1 1 -1 3.35
4 1 1 -1 1.50
5 -1 -1 1 5.15
6 1 -1 1 2.40
7 -1 1 1 8.20
8 1 1 1 4.55
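The averages above can be reproduced directly from the run table; the main effect of Start Angle is then the difference between the two averages. A sketch:

```python
# Catapult runs from the table above: (start_angle, stop_angle, fulcrum, distance)
runs = [(-1, -1, -1, 2.10), (1, -1, -1, 0.90), (-1, 1, -1, 3.35), (1, 1, -1, 1.50),
        (-1, -1, 1, 5.15), (1, -1, 1, 2.40), (-1, 1, 1, 8.20), (1, 1, 1, 4.55)]

def level_mean(level):
    """Mean distance over the four runs where Start Angle sits at the given level."""
    d = [dist for sa, _, _, dist in runs if sa == level]
    return sum(d) / len(d)

low, high = level_mean(-1), level_mean(1)   # 4.70 and about 2.34
start_angle_effect = high - low             # about -2.36: raising Start Angle cuts distance
```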

663
Interaction Definition

Interactions occur when variables act together to impact the output of the process. Interactions
plots are constructed by plotting both variables together on the same graph. They take the form
of the graph below. Note that in this graph, the relationship between variable “A” and Y changes
as the level of variable “B” changes. When “B” is at its high (+) level, variable “A” has almost no
effect on Y. When “B” is at its low (-) level, A has a strong effect on Y. The feature of interactions
is non-parallelism between the two lines.

[Interaction plot: Output (Y) vs. factor A, with one line for each level of B (B- and B+). At one end of the A range, changing B from low to high drops the output very little; at the other end, the output drops dramatically. The two lines are not parallel.]

664
Degrees of Interaction Effect

[Five interaction plots of Y vs. A with lines for B- and B+:
Some Interaction: lines slightly non-parallel;
No Interaction: parallel lines;
Full Reversal: the B- and B+ lines cross and swap order;
Strong Interaction: sharply non-parallel lines;
Moderate Reversal: lines cross partway across the range of A]

665
Interaction Plot Creation

[Interaction Plot (data means) for Distance: mean Distance vs. Fulcrum, one line per Start Angle level (-1 and 1)]

For the Start Angle = 1 line: mean at Fulcrum = 1 is (4.55 + 2.40)/2 = 3.48; mean at Fulcrum = -1 is (0.90 + 1.50)/2 = 1.20.

Run # Start Angle Stop Angle Fulcrum Distance


1 -1 -1 -1 2.10
2 1 -1 -1 0.90
3 -1 1 -1 3.35
4 1 1 -1 1.50
5 -1 -1 1 5.15
6 1 -1 1 2.40
7 -1 1 1 8.20
8 1 1 1 4.55
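The two plotted means above can be reproduced by conditioning on both factors at once; non-parallel lines (a different Fulcrum effect at each Start Angle) are the signature of an interaction. A sketch:

```python
# Catapult runs from the table above: (start_angle, stop_angle, fulcrum, distance)
runs = [(-1, -1, -1, 2.10), (1, -1, -1, 0.90), (-1, 1, -1, 3.35), (1, 1, -1, 1.50),
        (-1, -1, 1, 5.15), (1, -1, 1, 2.40), (-1, 1, 1, 8.20), (1, 1, 1, 4.55)]

def cell_mean(start_angle, fulcrum):
    """Mean distance over the runs at one Start Angle / Fulcrum combination."""
    d = [dist for sa, _, f, dist in runs if sa == start_angle and f == fulcrum]
    return sum(d) / len(d)

# The Start Angle = +1 line of the interaction plot:
high_fulcrum = cell_mean(1, 1)    # (4.55 + 2.40) / 2 = 3.475, plotted as 3.48
low_fulcrum = cell_mean(1, -1)    # (0.90 + 1.50) / 2 = 1.20
```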

666
Graphical DOE Analysis - The Interaction Plots

Stat>DOE>Factorial>Factorial Plots … Interactions, select response and factors


When you select more than two variables, MINITAB™ generates an Interaction Plot Matrix which allows you to look at interactions simultaneously. The plot at the upper right shows the effects of Start Angle on Y at the two different levels of Fulcrum. The red line shows the effects of Fulcrum on Y when Start Angle is at its high level. The black line represents the effects of Fulcrum on Y when Start Angle is at its low level.

[Interaction Plot (data means) for Distance: panels for Start Angle, Stop Angle and Fulcrum]

Note: In setting up this graph we selected options and deselected “draw full interaction matrix”.

667
Graphical DOE Analysis - The Interaction Plots

Stat>DOE>Factorial>Factorial Plots … Interactions, select response and factors


The plots at the lower left in the graph below (outlined in blue) are the “mirror image” plots of
those in the upper right. It is often useful to look at each interaction in both representations.

[Interaction Plot (data means) for Distance: the full interaction matrix, with the mirror-image plots at the lower left]

Choose the “draw full interaction matrix” option for the additional plots.

668
DOE Methodology

1. Define the Practical Problem


2. Establish the Experimental Objective
3. Select the Output (response) Variables
4. Select the Input (independent) Variables
5. Choose the Levels for the Input Variables
6. Select the Experimental Design
7. Execute the experiment and Collect Data
8. Analyze the data from the designed experiment and draw Statistical
Conclusions
9. Draw Practical Solutions
10. Replicate or validate the experimental results
11. Implement Solutions

669
Generate Full Factorial Designs in MINITABTM

Stat>DOE>Factorial>Create Factorial Design…

670
Create Three Factor Full Factorial Design

Stat>DOE>Factorial>Create Factorial Design

671
Create Three Factor Full Factorial Design

672
Create Three Factor Full Factorial Design

673
Three Factor Full Factorial Design

Hold on! Here we go….


674
Summary

At this point, you should be able to:

• Determine the reason for experimenting

• Describe the difference between a physical model and a DOE model

• Explain an OFAT experiment and its primary weakness

• When shown Main Effects and Interaction Plots, determine which effects and
interactions may be significant.

• Create a Full Factorial Design

675
Experimental Methods

Process Modeling: Regression

Advanced Process Modeling: MLR

Designing Experiments

Experimental Methods
– Methodology
– Considerations
– Steps

Full Factorial Experiments

Fractional Factorial Experiments

676
DOE Methodology

1. Define the Practical Problem


2. Establish the Experimental Objective
3. Select the Output (response) Variables
4. Select the Input (independent) Variables
5. Choose the Levels for the input variables
6. Select the Experimental Design
7. Execute the experiment and Collect Data
8. Analyze the data from the Designed Experiment and draw Statistical Conclusions
9. Draw Practical Solutions
10. Replicate or validate the experimental results
11. Implement Solutions

677
Questions to Design Selection

Project Management Considerations

What is the process environment:


1. How much access to the process?
2. Are the team members and any subject matter experts fully involved?
3. Who are the process owners and stakeholders?
4. Are the process owners involved?
5. Do the process owners know what a DOE is?
6. Do the process owners know what the DOE means to them?
7. How many runs can you afford (time and money)?
8. Will you run the DOE at the process or in a lab?
9. What Noise variables need to be designed around?
10. How large of an experimental region will be explored for the DOE?

678
Questions to Design Selection

Technical Considerations
What are the objectives/goals of the experiment:
1. What factors are important? (narrowed from Analyze Phase)
2. What is the operating range for each factor?
3. How can I minimize both the cost of DOE and the cost of running the
process?
4. How much change in the process do we require?
5. How close to optimal does the process currently run?
6. Are we tackling a centering or variation problem?
7. What is the impact to the process while running the DOE?
8. What are the costs of competing DOE designs?
9. What do you know about the process interactions?

679
DOE Methodology Step 1

1. Define the Practical Problem

• Relate how the experiment connects with the original project scope. Practically
speaking, what is this experiment supposed to accomplish?
1. Identify Root Cause
2. Measure Variation
3. Measure Output Response
• Have the measurement systems been verified for the Input Variables and Output
Response?

A circuit board manufacturer wanted to identify what factors impact


the adhesion level between circuit boards. The factors and output had
satisfactory Gage R&R results of less than 15% study variation.

680
DOE Methodology Step 2

2. Establish the Experimental Objective

• Objective must include the critical characteristics and the desired outcome.
– If the experiment and project is tackling recurring issues, consider a different critical
characteristic.
• The characteristic may require a different physical phenomenon being measured
or with a differing measurement system.
• The measurement system precision and accuracy may influence the specific
output to be measured.

• Identify the desired experimental outcome.


1. Eliminate Root Cause
2. Reduce Variation
The output of interest is the tackiness
3. Achieve a target of the top surface of the circuit boards.
4. Maximize Output Response We want to maximize the Output
5. Minimize Output Response Response.
6. Robust process or product

681
DOE Methodology Step 3

3. Select the Output (response) Variables

• Is the output(s) qualitative or quantitative?


• What was the past Response Variable’s baseline results?
• Is the output(s) typically under statistical control?
• Does the output(s) vary with time?
• How much change in the output(s) do you want to detect?
• Is the measurement system adequate with the same units of measure as identified
in Step 1?
– For experimental reasons, this measurement may be different than your past
outputs considered.
• How many outputs?

The output is tackiness and is measured in Newtons (force).


The output measurement must be done within an hour of production and
the measurement system has not changed. We want to detect at least a
change in tackiness of 15 Newtons in the Response Variable.

682
DOE Methodology Step 4

4. Select the Input (independent) Variables

• Use the Analyze Phase and subject matter experts to select these factors.
• All factors must be independent of each other.
• Consider past results from previous experiments.
• Test the most likely candidates first.
• Factors not included in the designed experiment should be held constant and recorded.
• Noise or uncontrollable factors (typically environmental conditions) should be
monitored and the experimental design may be impacted (see Step 6).

The inputs selected by the team following the Six Sigma methodology are dwell time (sec), temperature
of solution (deg F) and concentration of solution (% solids). Noise factors of ambient temperature and
humidity were recorded and monitored.

683
DOE Methodology Step 5

5. Choose the Levels for the Input Variables


• Factor levels must be considered to create the desired change in Output Response
identified in Step 3.
• Do NOT create conditions that are unsafe or beyond the feasibility of the process.
– This does NOT mean constraining Input Variable levels to current process range.
– Be wary if operating near the extremes or operating limits.
• Realize some experimental runs may produce unacceptable product or process results.
These results must be weighed against the risk of future production.
• Even when designing your experiment with coded levels for the factors, the team MUST
be aware of what the levels mean in the process language.
• Factor levels can be impacted by the Experimental Objective in Step 2.
– Screening experiments have wider settings for factors
– Full Factorials have narrower settings than screening experiments
– Response surface Designed Experiments have quite narrow settings

684
DOE Methodology Step 5 (cont.)

5. Choose the Levels for the Input Variables


• Setting the factor levels too wide may cause the experiment to miss an important region or
change in the Output Response.

Results of experiment show


no significant difference in
settings
Output Response

“-” “+”
Factor Settings

685
DOE Methodology Step 5 (cont.)

5. Choose the Levels for the Input Variables


• Setting the factor levels too narrow will show no difference in the output or not give
enough statistical confidence in the effect of the factor on the output relative to the noise
in the experiment.
Output Response

“-” “+” Factor Settings

686
DOE Methodology Step 5 (cont.)

5. Choose the Levels for the Input Variables


• Should be set far enough apart to detect a difference in the response and to have enough
statistical confidence in the change of the output relative to the experimental noise.
Output Response

Factor Settings
“-” “+”
The experiment is using coded levels:
Dwell time: +1 (20 sec); -1 (10 sec)
Temp of sol’n: +1 (100 deg F); -1 (80 deg F)
Conc. of sol’n: +1 (40%); -1 (20%)
687
DOE Methodology Step 6

6. Select the Experimental Design


• Factorial Design (full vs.
fractional)
– Full designs typically have 5
or fewer factors
• All interactions can be
estimated
– Screening or Fractional
Factorial designs have many
factors
• Not all interactions can
be estimated

688
DOE Methodology Step 6 (cont.)

Balanced and orthogonal designs are highly encouraged and the definition of
balanced and orthogonal is covered in a later module.

Center Points are used for investigating curvature and advanced designs. Center
Points are covered in a later module.

Blocking can be used to account for noise variables and is covered in a later
module.

I’m keeping out the Noise coach!!

689
DOE Methodology Step 6 (cont.)

Randomization has an impact on statistical confidence because experimental noise is


spread across the runs.

What would happen if


another unknown significant
variable changed halfway
thru our experiment?

690
DOE Methodology Step 6 (cont.)

Sample size must be determined.


[MINITAB Power and Sample Size dialog, annotated:
– Number of factors: determined by Step 4
– Number of corner points: for full factorials, this equals 2^factors
– Effect size: specified in Step 2
– Power value: typically 0.9
– Standard deviation: σ of the process output variable
– Design: see the first slide of Step 6]

After the number of replicates is determined, we must decide the sampling strategy.

691
DOE Methodology Step 6 (cont.)

A sample size of 2 is indicated for the example shown. What does this mean?

Power and Sample Size

2-Level Factorial Design

Alpha = 0.05   Assumed standard deviation = 1

Factors: 3   Base Design: 3, 8
Blocks: none

Center                        Total   Target
Points   Effect   Reps   Runs    Power   Actual Power
  0        2       2      16      0.9      0.936743
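The Total Runs column can be reproduced by hand from the design size and the replicates; a sketch (the power values themselves come from MINITAB’s calculation and are not reproduced here):

```python
def total_runs(factors, reps, center_points=0):
    """Total experimental runs for a replicated 2^k full factorial."""
    return (2 ** factors) * reps + center_points

runs = total_runs(factors=3, reps=2)   # 8 corner runs x 2 replicates = 16
```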

692
DOE Methodology Step 6 (cont.)

Replication of an experimental run is an independent observation of the run that represents


variation from experimental run to experimental run.
– A replicate must be made at a unique time or sequence in the experiment.

Single Replicate Design Replicated Design (2)

693
DOE Methodology Step 6 (cont.)

Additional considerations are required when determining what a sample size means.

For the experimental results to be representative of the process, sample across the
largest family of variation.
– It is also necessary to determine how to define a representative sample and
experimental unit.
• Characteristics of a representative sample are:
– Repeatable measurement and represents natural variation of the
process.
• An experimental unit is the basic unit to which an experimental run
can be applied and includes all the qualities of a representative sample.

694
DOE Methodology Step 6 (cont.)

Recall from the Analyze Phase the Multi-Vari tool described the three families of
variation. Consider these families of variation to determine how to sample with
replication for an experiment.
– Within Unit or Positional
• Within piece variation related to the geometry of the part.
• Variation across a single unit containing many individual parts such as a
wafer containing many computer processors.
• Location in a batch process such as plating.
– Between Unit or Cyclical
• Variation among consecutive pieces.
• Variation among groups of pieces.
• Variation among consecutive batches.
• Temporal or Over Time
• Shift-to-Shift
• Day-to-Day
• Week-to-Week

695
DOE Methodology Step 7

7. Execute the Experiment and Collect Data

• Discuss the experimental scope, time and cost with the process owners prior to the
experiment.
• Some team members must be present during the entire experiment.
• After the experiment has started, are you getting output responses you expected?
– If not, quickly evaluate for Noise or other factors and consider stopping or
canceling the experiment.
• Use a log book to make notes of observations, other factor settings, etc.
• Communicate the experimental details, and why the experiment is being run, with the
operators, technicians and staff before running the experiment.
– This communication can prevent “helping” by the operators, technicians, etc. that
might damage your experimental design.
• Alert the laboratory or quality technicians if your experiment will increase the number
of samples arriving during the experiment.

696
DOE Methodology Step 8

8. Analyze the Data from the Designed Experiment and draw Statistical
Conclusions

• Graphical Analysis has already been covered in the previous modules.


• Further analysis of “reducing” the model to the significant terms will be
covered in the next module.
• The final model fitting will occur.
• Terms in the final DOE equation will have statistical confidence you needed.
• Diagnose the residuals similarly to that of Regression Analysis.
• Details of this step are covered in the next module.

697
DOE Methodology Step 9

9. Draw Practical Solutions

• This will be covered in detail in the next module.


• Even if terms or factors are statistically significant, for practical significance the
term might be removed.
• “Stat>DOE>Factorial>Response Optimizer” will help the project team find
where the vital few factors need to be targeted to achieve the desired output
response.
– This will be covered in detail in the next module.
• This step is how the project team determines the project’s potential success.
• Immediately share the results with the process owner for feedback on
implementation of the experimental results.

698
DOE Methodology Step 10

10. Replicate or Validate the Experimental Results

• After finding the Practical Results from Step 9, verify the results:
– Set the factors at the Practical Results found with Step 9 and see if the
process output responds as expected. This verification replicates the result
of the experiment.
– Do not forget your model has some error.

699
DOE Methodology Step 11

11. Implement Solutions

• If the objective of the experiment was accomplished and the Business Case is
satisfied, then proceed to the Control Plan which is covered in the Control Phase.
• Do not just run experiments and not implement the solutions.
• Further experiments may need to be designed to further change the output to
satisfy the Business Case.
– This possible need for another experiment is why we stated in earlier
modules that DOEs can be an iterative process.

700
Summary

At this point, you should:

 Be able to Design, Conduct and Analyze an Experiment

701
Full Factorial Experiments

Process Modeling: Regression

Advanced Process Modeling:


MLR

Designing Experiments Mathematical Models

Experimental Methods Balance and Orthogonality

Full Factorial Experiments


Fit and Diagnose Model
Fractional Factorial Experiments
Center Points

702
Why Use Full Factorial Designs

2k Full Factorial designs are used to:


• Investigate multiple factors at only two levels, requiring fewer runs than multi-level
designs.
• Investigate a large number of factors simultaneously in relatively few runs.
• Provide insight into potential interactions.
• They are frequently used in industrial DOE applications because of their simplicity and ease of analysis.
• Obtain a mathematical relationship between the X’s and the Y’s.
• Determine a numerical, mathematical relationship to identify the most important or critical
factors in the experiments.

Full Factorial designs are used when:


• There are five or fewer factors.
• You know the critical factors and need to explain interactions.
• You are optimizing processes.

703
Mathematical Output of Experiments

• The end result of a DOE is a mathematical function to describe the results of the
experiment.
• For the 2k Factorial designs this module discusses, Linear relationships are
covered.
• All models will have some error as shown by the ε in the below equation.

• The mathematical equation below is the Prediction from the experimental data.
Notice there is no error term in this form.
• Ŷ is the predicted Output Response as a function of the Input Variables used in
the experiment.
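The model equations on this slide were images and did not survive extraction. A standard way to write the two forms just described (a reconstruction from the surrounding text, not the slides' exact notation) is:

```latex
Y = f(X_1, X_2, \ldots, X_k) + \varepsilon
\qquad
\hat{Y} = f(X_1, X_2, \ldots, X_k)
```

The first form carries the error term ε; the prediction equation Ŷ drops it.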

704
Linear Mathematical Model

The Linear Model is sufficient for most industrial experimental objectives.


The Linear Model can explain response planes and twisted response surfaces because of
interactions.
– The following is the linear prediction model used in two-level full or fractional
factorials.
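The linear prediction model was also an image in the original slide. For two factors A and B at two levels, a common form (the coefficients b are illustrative placeholders) is:

```latex
\hat{Y} = b_0 + b_1 A + b_2 B + b_{12} A B
```

The b12 interaction term is what twists the response surface.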

[Two Surface Plots of % Reacted vs. coded factors: a flat response plane, and a twisted response surface produced by an interaction.]

705
Quadratic Mathematical Model

Quadratic Models can be obtained with designs not described in this module.
Quadratic Models explain curvature, maximums, minimums and twisted maximums and
minimums when interactions are active.
– The following is the quadratic prediction model used in some response surface models
not covered in this training.
– The simpler 2k models do not include enough information to generate the Quadratic
Model.
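The quadratic prediction model was an image as well. For two factors A and B, a common form (again with placeholder coefficients) is:

```latex
\hat{Y} = b_0 + b_1 A + b_2 B + b_{11} A^2 + b_{22} B^2 + b_{12} A B
```

The squared terms are what allow curvature, and they are why 2^k designs, which visit only two levels per factor, cannot estimate this model.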

[Surface Plot of C6 vs. factors A and B, showing curvature that only a Quadratic Model can fit.]

706
Nomenclature for Factorial Experiment

2-level designs are most commonly used:


– 2^k, where k is the number of factors
– The total number of runs in the design is equal to the result of the math.
• Example: 3 factors
• 2^3 = 8 runs

Other Factorial Designs have more than two levels.

– Example: a 3^4 factorial design has 4 factors at 3 levels each.
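The run counts above are a single line of arithmetic; a minimal sketch (the helper name `runs` is ours, for illustration):

```python
def runs(levels: int, factors: int) -> int:
    """Total treatment combinations in a full factorial design."""
    return levels ** factors

# Examples from the text: a 2^3 design and a 3^4 design.
two_level_three_factor = runs(2, 3)
three_level_four_factor = runs(3, 4)
print(two_level_three_factor, three_level_four_factor)
```
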

707
Treatment Combinations

Treatment combinations, or experimental runs, show how to set the levels for each
of the factors.

Minuses and plusses can be used to indicate low and high factor level settings,
Center Points are indicated with zeros.

If the process is evaluated with combinations of the temperature set at 10 and 20
degrees and pressure at 50 and 100 psi, an example of an experimental run or
treatment combination would be 20 degrees and 50 psi.
– This 2^2 design shown below has 2 factors at 2 levels.
– A total of 4 treatment combinations are in this experiment.

              Temperature
               10    20
Pressure  50    1     2
         100    3     4

The treatment combination for run number 2 is: Temperature at 20 degrees and Pressure at 50 psi.

708
Standard Order of 2 Level Designs

The design matrix for 2^k factorials is shown in standard order (not randomized).
– The low level is indicated by a “-” and the high level by a “+”.
– This order is commonly referred to as Yates standard order, after Dr. Frank Yates.

709
Full Factorial Design with 4 Factors

710
Full Factorial Design

Stat>DOE>Factorials>Create Factorial Design


This design is in coded units because it simply lists minus and plus signs for the factor levels.
Coded units provide some advantages in the analysis but are not useful for process owners
when running an experiment.

The table is also referred to as a Table of Contrasts.

Factors

711
Balanced Design

Factorial Designs should be balanced for proper interpretation of the mathematical
equation.

An experiment is balanced when each factor has the same number of experimental
runs at both high and low levels.

Summing the signs of each contrast column should yield a zero.

Balance simplifies the math necessary to analyze the experiment.
– If you always use the designs MINITABTM provides, they will always be balanced.

Run   A   B
 1    -   -
 2    +   -
 3    -   +
 4    +   +
ΣXi   0   0
712
Orthogonal Design

An Orthogonal Design allows each effect in an experiment to be measured
independently; the contrast columns are vectors at 90 degrees to each other.
If the product of every possible pair of columns sums to zero, the design is
orthogonal.
With an Orthogonal Design, if an interaction is found to be significant, it is because
of the data and not the experimental design.
– If you always use the designs MINITABTM provides, they will always be orthogonal
and balanced.

Run   A   B   C   AB  AC  BC
 1    -   -   +   +   -   -
 2    +   -   -   -   -   +
 3    -   +   -   -   +   -
 4    +   +   +   +   +   +
ΣXi   0   0   0   0   0   0

(The sum of the product of every pair of columns, ΣXiXj, is also 0.)
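Both properties can be verified numerically. A minimal sketch using the 2^2 contrast columns from the balance slide (plain Python, no MINITAB required):

```python
# Contrast columns for the 2^2 design, coded -1/+1.
A = [-1, 1, -1, 1]
B = [-1, -1, 1, 1]
AB = [a * b for a, b in zip(A, B)]  # interaction contrast

# Balance: the signs in each column sum to zero.
balanced = all(sum(col) == 0 for col in (A, B, AB))

# Orthogonality: the element-wise product of every pair of
# columns also sums to zero (the vectors are at 90 degrees).
def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

orthogonal = all(dot(x, y) == 0 for x, y in ((A, B), (A, AB), (B, AB)))
print(balanced, orthogonal)
```
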

713
Biomedical Production Example

In this example we will walk through the 11 Step DOE methodology. The
biomedical firm is attempting to increase the yield of a specific protein expression
for use in research by universities and pharmaceutical companies.

1. Define the Practical Problem


• Increase the current production yield by 50%. The Measurement System Analysis for yield
has been verified. The baseline for the primary metric, yield, is 50%. The objective in
the Project Charter required the team to achieve at least a 50% increase in yield.

2. Establish the Experimental Objective


• Maximize the yield.

714
Biomedical Production Example

3. Select the Output (response) Variables


• Yield of protein expression is the only output of interest.
• It is desirable to change the yield from 50% to at least 75%.

4. Select the Input (independent) Variables


• Temperature
• Concentration
• Catalyst
• Noise and other variables such as ambient room temperature and technician will be
recorded during the experiment.

5. Choose the Levels for the Input Variables


• The following levels were determined with tools from the Analyze Phase such as
Regression, Box Plots, Hypothesis Testing and Scatter Plots. The levels were set far
enough apart to attempt large yield changes and to gain statistical confidence in the results.
– Temperature, °C (25, 45)
– Concentration, % (5, 15)
– Catalyst (Supplier A, Supplier B)

715
Biomedical Production Example

6. Select the Experimental Design


• A Full Factorial Design is desired because the team has no knowledge of the interactions
and the number of factors is only 3.
• Randomization is desired because of statistical confidence.
• Randomization is possible because all factors can be changed easily without large, long
disruptions to the process.
• The sample size will be based on a delta of 2 Standard Deviations.

Stat>Power and Sample Size> 2-level Full Factorial

Power and Sample Size

2-Level Factorial Design
Alpha = 0.05  Assumed standard deviation = 1

Factors: 3  Base Design: 3, 8
Blocks: none

Center                 Total  Target
Points  Effect  Reps    Runs   Power  Actual Power
     0       2     2      16     0.9       0.93674

716
Biomedical Production Example

Stat>DOE>Factorial>Create Factorial Design

717
Biomedical Production Example

718
Biomedical Production Example

For ease of data entry for the results of the DOE, we have turned off “Randomize runs” by
removing the check mark in the window in the “Options…” tab.

719
Biomedical Production Example

In an empty column, C8, type in ‘Yield’; this is where we will place the experimental results.

Do NOT edit, copy, paste or alter anything in the first 7 columns or MINITABTM will not
understand the worksheet.

720
Biomedical Production Example

7. Execute the Experiment and Collect Data


• Enter the results of the experiment in the column labeled “Yield”, our output.
• The ambient room temperature and technician were recorded per our original plan but we
did not place the information into this worksheet.

721
Biomedical Production Example

8. Analyze the Data from the Designed Experiment


Stat>DOE>Factorial>Analyze Factorial Design

Select “Normal” and “Pareto” for Effects Plots.
Select “Standardized” for Residuals for Plots.

722
Biomedical Production Example

MINITABTM defaults with all effects in the model. After the significant effects are
determined, the insignificant effects will be removed.

723
Biomedical Production Example

[Normal Probability Plot of the Standardized Effects (response is Yield, Alpha = .05): effects A (Temp) and AC (Temp*Supplier) plot off the straight line.]

The Normal Probability Plot assumes that insignificant effects are due to Noise and
are therefore Normally Distributed. Any significant effects will be plotted off the
straight line and highlighted in red.

[Pareto Chart of the Standardized Effects (response is Yield, Alpha = .05): effects A and AC extend past the 2.31 reference line.]

The Pareto Chart of standardized effects graphically shows which effects are
significant based on the selected alpha level. Any effect that goes beyond the red
line is significant.

724
Biomedical Production Example

In the Session Window under the Factorial Fit, any effect that has a P-value less than 0.05 (for
an alpha of 0.05) is considered significant.

Notice that all three methods of determining what effects belong in the final model fit agree.

Factorial Fit: Yield versus Temp, Conc, Supplier

Estimated Effects and Coefficients for Yield (coded units)

Term Effect Coef SE Coef T P


Constant 61.1250 0.1811 337.44 0.000
Temp 23.4500 11.7250 0.1811 64.73 0.000
Conc 0.5750 0.2875 0.1811 1.59 0.151
Supplier 0.0000 0.0000 0.1811 0.00 1.000
Temp*Conc -0.0250 -0.0125 0.1811 -0.07 0.947
Temp*Supplier 10.0500 5.0250 0.1811 27.74 0.000
Conc*Supplier -0.4750 -0.2375 0.1811 -1.31 0.226
Temp*Conc*Supplier 0.1750 0.0875 0.1811 0.48 0.642
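The same alpha = 0.05 screen can be applied programmatically; a sketch with the P-values copied from the session window above:

```python
# P-values from the Factorial Fit session window (alpha = 0.05).
p_values = {
    "Temp": 0.000,
    "Conc": 0.151,
    "Supplier": 1.000,
    "Temp*Conc": 0.947,
    "Temp*Supplier": 0.000,
    "Conc*Supplier": 0.226,
    "Temp*Conc*Supplier": 0.642,
}
alpha = 0.05
significant = sorted(term for term, p in p_values.items() if p < alpha)
print(significant)  # only Temp and Temp*Supplier survive the screen
```
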

725
Biomedical Production Example

Re-fit the model by removing the insignificant factors.

Even though Supplier was not a significant effect, it is necessary to include it in
the model because the Temp*Supplier effect was significant.

This type of model is referred to as a Hierarchical Model.

726
Biomedical Production Example

The Residual Analysis will be discussed shortly.

727
Biomedical Production Example

Stat>DOE>Factorial>Factorial Plots

Anytime there is a significant interaction, it is useful to plot it.

Plot both the “Main Effects Plot” and the “Interaction Plot” in this example.

728
Biomedical Production Example

[Main Effects Plot (data means) for Yield: Temp, Conc and Supplier. Interaction Plot (data means) for Yield.]

Non-parallel lines in the Interaction Plot indicate significance. The lines do not
have to cross each other to be significant; they can also cross slightly and still be
insignificant.

A steep slope in the Main Effects Plot indicates significance. A flat slope
indicates no significance.

729
Biomedical Production Example

Factorial Fit: Yield versus Temp, Supplier

Estimated Effects and Coefficients for Yield (coded units)
Term           Effect     Coef  SE Coef       T      P
Constant               61.1250   0.1847  330.94  0.000
Temp          23.4500  11.7250   0.1847   63.48  0.000
Supplier       0.0000   0.0000   0.1847    0.00  1.000
Temp*Supplier 10.0500   5.0250   0.1847   27.21  0.000

S = 0.738805  R-Sq = 99.75%  R-Sq(adj) = 99.69%

Analysis of Variance for Yield (coded units)
Source              DF   Seq SS   Adj SS   Adj MS        F      P
Main Effects         2  2199.61  2199.61  1099.80  2014.91  0.000
2-Way Interactions   1   404.01   404.01   404.01   740.17  0.000
Residual Error      12     6.55     6.55     0.55
Pure Error          12     6.55     6.55     0.55
Total               15  2610.17

The Model is significant (P = 0.000).

730
Biomedical Production Example

Interpret the Residual Analysis the same as in Regression.

[Residual Plots for Yield: Normal Probability Plot of the Residuals, Residuals Versus the Fitted Values, Histogram of the Residuals, and Residuals Versus the Order of the Data.]

731
Biomedical Production Example

The Residuals-versus-variables plots are most important when deciding what level to set an
insignificant factor.

A typical guideline is a difference of a factor of 3 in the spread of the Residuals between the
low and high levels of an insignificant input variable.
– In this case Concentration was not significant, but we still need to decide
how to set it for the process. The low level for Concentration has a smaller
spread of Residuals, but the difference is not 3:1. Other considerations for
setting the variable are cost and reducing cycle time.
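The 3:1 spread guideline can be checked numerically; a sketch with illustrative residuals (assumed values, not the experiment's actual data):

```python
# Standardized residuals grouped by the insignificant factor's level.
# These numbers are illustrative only.
low_level = [-0.8, -0.3, 0.1, 0.5, 0.9, -0.6, 0.4, -0.2]
high_level = [-1.1, -0.5, 0.2, 0.8, 1.2, -0.9, 0.6, -0.3]

def spread(residuals):
    """Range of the residuals at one factor level."""
    return max(residuals) - min(residuals)

ratio = max(spread(low_level), spread(high_level)) / min(
    spread(low_level), spread(high_level))

# Under the guideline, a ratio below 3 means the level can be chosen
# on cost or cycle-time grounds instead of variation.
print(round(ratio, 2))
```
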
[Residuals Versus Temp and Residuals Versus Conc plots (response is Yield), annotated with the spread of the residuals at each level.]

732
Biomedical Production Example

9. Draw Practical Solutions


Stat>DOE> Factorial>Response Optimizer

Recall the objective was to maximize the yield. It is necessary to establish a target
and lower limit for the yield values.
733
Biomedical Production Example

Practical Solution:
- Temp 45C
- Concentration 5%
- Supplier B

734
Biomedical Production Example

10. Replicate or Validate the Experimental Results


• Verify, verify, verify.
• Verify settings determined in the last step, by producing several typical
manufacturing quantities.
• The variation or error seen in the experiment will be different than the variation
seen in the manufacturing validation.

11. Implement Solutions


• If the objective of the experiment was accomplished and the Business Case is
satisfied, then proceed to the Control Plan which is covered in the Control Phase.
• Further experiments may need to be designed to further change the output to
satisfy the Business Case.
• Implement the changes necessary to maintain the new gains to the process.

735
Center Points

A Center Point is an additional experimental run made at the physical center of the
design.
– Center Points do not change the model to quadratic.
– They allow a check for the adequacy of the Linear Model.
The Center Point provides a check to see if it is valid to say that the output response
is Linear through the center of the design space.
If a straight line connecting the high and low levels passes through the center of the
design, the model is adequate to predict inside the design space.
– “Curvature” is the statistic used to interpret the adequacy of the Linear Model.
– If curvature is significant, the P-value will be less than 0.05.
Do NOT predict outside the design space.

[Plot of Output Response vs. Factor Settings at “-”, “c” (center) and “+”, showing the Center Point check of linearity.]
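The curvature check amounts to comparing two averages; a sketch with illustrative response values (assumed, not from a real experiment):

```python
# Curvature compares the mean of the factorial (corner) responses
# with the mean of the Center Point responses.
corner_responses = [23.0, 26.0, 33.0, 38.0, 36.0, 41.0, 37.0, 43.0]
center_responses = [34.5, 34.8, 35.1]

corner_mean = sum(corner_responses) / len(corner_responses)
center_mean = sum(center_responses) / len(center_responses)

# A small difference (relative to experimental error) means the
# Linear Model is adequate through the center of the design space.
curvature_estimate = center_mean - corner_mean
print(round(curvature_estimate, 3))
```
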


736
Center Point Clues

A Center Point is always a good insurance policy but is most effective when all the
input factors are Continuous.
A guideline is to run 2-4 Center Point runs distributed uniformly through the
experiment when all the input factors are continuous in a Full or Fractional
Factorial.

[Plot of a curved response Y vs. factor x at “-”, “c” and “+” for a Maximize Response objective: does it matter that the linear model is inappropriate?]

737
Panel Cleaning Example

In this example we will walk through the 11 step DOE methodology for a panel
cleaning machine, using Center Points in the analysis. The manufacturing firm is
attempting to start up a new panel cleaning machine and would like to get it
running quickly. They have experience with this type of machine, but they do not
have experience with this particular model of equipment.

1. Define the Practical Problem


• Start up the new equipment as efficiently as possible. The need for the new equipment was
determined in the Analyze Phase.
• A Measurement System Analysis has been completed, and the measurement system was
modified to bring it within acceptable guidelines.

2. Establish the Experimental Objective


• Hit a target for Width of 40 +/- 5.
• Minimize variation as much as possible.

738
Panel Cleaning Example

3. Select the Output (response) Variables


• Width of conductor is the only response.

4. Select the Input (independent) Variables


• Dwell Time
• Temperature
• Na2S2O8
• The experts believe that ambient temperature and humidity will have no effect on
the process. Monitors will be placed in the room to record temperature and
humidity.

5. Choose the Levels for the Input Variables


– Dwell Time ( 4, 6) minutes
– Temperature (40, 80) C
– Na2S2O8 (1.8, 2.4) gm/lit

739
Panel Cleaning Example

6. Select the Experimental Design


• A Full Factorial will be used since there are only 3 input variables.
• Randomization is possible because all factors can be changed easily without large, long
disruptions to the process.
• Is the sample size adequate based on a delta of 2 Standard Deviations?

Notice the Center Points are uniformly distributed through the design. Why are the
Center Points not random?

Panel Cleaning.mtw

740
Creating Designs with Center Points

MINITABTM will place the Center Points randomly in the worksheet. Through the
next few steps we will demonstrate how to move the Center Points so they are
uniformly distributed.
• Create a 3 factor design with 3 Center Points and 2 replicates; be sure to
randomize the design.

741
Creating Designs with Center Points

Notice the Center Points are not uniformly distributed with this random design. It is
desirable to move one Center Point near or at the beginning, middle and end.

742
Creating Designs with Center Points

DO NOT move rows or generate new worksheets in MINITABTM’s DOE
platform; it will corrupt the model stored in memory!

To move the Center Points to new locations, find a Center Point and type a
‘1’ in the “RunOrder” column. Then find the original run 1 and replace its
RunOrder value with the original Center Point RunOrder number.

743
Creating Designs with Center Points

To complete the Center Point arrangement, sort the data on the RunOrder
column but DO NOT create a new worksheet.

Data>Sort

744
Panel Cleaning Example

7. Execute the Experiment and Collect Data


• The experiment has been run in the order shown below.
• One of the most common mistakes in DOE is typing the data into the data sheet
incorrectly. Always verify number entry!

745
Panel Cleaning Example

8. Analyze the Data from the Designed Experiment

Stat>DOE> Factorial>Analyze Factorial Design

746
Panel Cleaning Example

[Normal Probability Plot of the Standardized Effects (response is Width, Alpha = .05): effects C (Na2S2O8), B (Temp), A (Dwell Time) and BC plot off the straight line.]

[Pareto Chart of the Standardized Effects (response is Width, Alpha = .05): effects C, B, A and BC extend past the 2.23 reference line.]

The significant effects are Na2S2O8, Temp, Dwell Time and the interaction of
Temp with Na2S2O8.

747
Panel Cleaning Example

Notice that all three methods of determining what effects belong in
the final model fit agree.
Factorial Fit: Width versus Dwell Time, Temp, Na2S2O8
Estimated Effects and Coefficients for Width (coded units)
Term Effect Coef SE Coef T P
Constant 34.724 0.2605 133.30 0.000
Dwell Time 4.871 2.436 0.2605 9.35 0.000
Temp 6.484 3.242 0.2605 12.44 0.000
Na2S2O8 9.169 4.584 0.2605 17.60 0.000
Dwell Time*Temp 0.941 0.471 0.2605 1.81 0.101
Dwell Time*Na2S2O8 0.861 0.431 0.2605 1.65 0.129
Temp*Na2S2O8 -4.876 -2.438 0.2605 -9.36 0.000
Dwell Time*Temp*Na2S2O8 -0.199 -0.099 0.2605 -0.38 0.711
Ct Pt 0.296 0.6556 0.45 0.662

S = 1.04201 R-Sq = 98.48% R-Sq(adj) = 97.26%

748
Panel Cleaning Example

Re-fit the model by removing the insignificant factors.

749
Panel Cleaning Example

A Degree of Freedom (DF) is a measure of the number of independent pieces of information
used to estimate a parameter. It is a measure of the precision of an estimate of variability. A
typical definition is n − 1 = DF; however, it depends on what parameters are being estimated.

Analysis of Variance for Width (coded units)
Source              DF   Seq SS   Adj SS   Adj MS       F      P
Main Effects         3  599.336  599.336  199.779  148.18  0.000
2-Way Interactions   1   95.111   95.111   95.111   70.55  0.000
Curvature            1    0.221    0.221    0.221    0.16  0.692
Residual Error      13   17.527   17.527    1.348
  Lack of Fit        3    6.669    6.669    2.223    2.05  0.171
  Pure Error        10   10.858   10.858    1.086
Total               18  712.195

Reading the DF column:
– 3 DF for the 3 Main Effects and 1 DF for the interaction effect in the model.
– 1 DF for Curvature, based on the difference between the average of the
factorial points and the average of the Center Points.
– 13 DF for Residual Error, broken into two components: Lack of Fit and Pure Error.
– Lack of Fit: 3 DF for the 3 insignificant interaction effects that were removed
from the model.
– Pure Error: 10 DF, 8 from the replicated runs ((# reps − 1) × # of runs) and 2
from the Center Points (# CP − 1).
– 18 DF for the Total (# of data points − 1).

Estimated Coefficients for Width using data in uncoded units
Term            Coef
Constant        -70.4706
Dwell Time      2.43562
Temp            1.01544
Na2S2O8         39.6625
Temp*Na2S2O8    -0.406354

750
Panel Cleaning Example

Adj MS = Adj SS / DF for each respective source. F = Adj MS / MS Error.

Analysis of Variance for Width (coded units)
Source              DF   Seq SS   Adj SS   Adj MS       F      P
Main Effects         3  599.336  599.336  199.779  148.18  0.000
2-Way Interactions   1   95.111   95.111   95.111   70.55  0.000
Curvature            1    0.221    0.221    0.221    0.16  0.692
Residual Error      13   17.527   17.527    1.348
  Lack of Fit        3    6.669    6.669    2.223    2.05  0.171
  Pure Error        10   10.858   10.858    1.086
Total               18  712.195

There is no significant curvature (P = 0.692): the Linear Model is adequate.
There is no significant lack of fit (P = 0.171): the removed effects do not
belong in the model.

Estimated Coefficients for Width using data in uncoded units
Term            Coef
Constant        -70.4706
Dwell Time      2.43562
Temp            1.01544
Na2S2O8         39.6625
Temp*Na2S2O8    -0.406354

The Prediction Equation is based on these coefficients:
Ŷ = -70.47 + 2.44 * Dwell Time + 1.02 * Temp + 39.6625 * Na2S2O8 - 0.41 * Temp * Na2S2O8
751
Prediction Equation

Determine the predicted value when:
– Dwell Time = 4.2 minutes
– Temp = 75 °C
– Sodium Persulfate (Na2S2O8) = 2.0 gm/lit

Simply insert these values into the equation and do the math.
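Doing the math with the coefficients from the previous slide gives a predicted Width of about 34.1 (the function name below is ours, for illustration):

```python
# Prediction equation in uncoded units, coefficients from the fit above.
def predict_width(dwell_time, temp, na2s2o8):
    return (-70.47 + 2.44 * dwell_time + 1.02 * temp
            + 39.6625 * na2s2o8 - 0.41 * temp * na2s2o8)

width = predict_width(dwell_time=4.2, temp=75, na2s2o8=2.0)
print(round(width, 2))  # about 34.1
```
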

752
Panel Cleaning Example

[Main Effects Plot (data means) for Width: Dwell Time, Temp and Na2S2O8, with corner and center point types. Interaction Plot (data means) for Width.]

The Interaction Plot shows there is very little difference in the predicted
response as long as Sodium Persulfate is held at the high level.

753
Panel Cleaning Example

Cube Plot (data means) for Width

Dwell Time   Temp   Na2S2O8    Width
     4        40      1.8      23.025
     6        40      1.8      25.895
     4        80      1.8      33.245
     6        80      1.8      38.395
     4        40      2.4      36.010
     6        40      2.4      41.000
     4        80      2.4      36.875
     6        80      2.4      43.350
Center Point                   35.020

754
Panel Cleaning Example

The Residual Plots look good.

[Residual Plots for Width: Normal Probability Plot of the Residuals, Residuals Versus the Fitted Values, Histogram of the Residuals, and Residuals Versus the Order of the Data.]

755
Panel Cleaning Example

[Residuals Versus Dwell Time, Residuals Versus Temp, and Residuals Versus Na2S2O8 plots (response is Width).]

756
Panel Cleaning Example

9. Draw Practical Solutions

Stat>DOE> Factorial>Response Optimizer

757
Panel Cleaning Example

The Response Optimizer has a little trick: if you include Center Points in the model,
it will treat the low, center and high values as discrete points.

As you can see, the Center Points fit the Linear Model.

[Response Optimizer output: at Dwell Time = 5.6162, Temp = 40.0, Na2S2O8 = 2.40, the predicted Width is y = 40.0057 with composite desirability D = 0.99885 against a target of 40.0.]

758
Panel Cleaning Example

[Response Optimizer output: at Dwell Time = 4.9346, Temp = 80.0, Na2S2O8 = 2.40, the predicted Width is y = 40.0 with composite desirability D = 1.0000 against a target of 40.0.]

759
Panel Cleaning Example

Is this the only solution?

[Response Optimizer output: Dwell Time = 4.9346, Temp = 80.0, Na2S2O8 = 2.40; predicted Width y = 40.0, composite desirability D = 1.0000.]

Setting each factor at these settings will achieve the target output.

760
Panel Cleaning Example

What if you assume Na2S2O8 is very expensive? Where would you set the variables?

Use the mouse and slide the red line for Na2S2O8 to the low level first, then
adjust the other sliders to move the predicted response to 40. Is it possible to
achieve 40 with Sodium Persulfate set at the minimum value?

761
Panel Cleaning Example

There is another MINITABTM function that will show the complete
solution set for a targeted value.

Stat>DOE>Factorial>Overlaid Contour Plot

762
Panel Cleaning Example

[Overlaid Contour Plots of Width (35 to 45) vs. Temp and Na2S2O8, with Dwell Time held at the low (4), middle (5) and high (6) settings.]

The areas shown in white are the solution set for adjusting Temp and Sodium
Persulfate.

763
Panel Cleaning Example

10. Replicate or Validate the Experimental Results


• Verify, verify, verify.
• Verify settings determined in the last step, by producing several typical
manufacturing quantities.
• The variation or error seen in the experiment will be different than the variation
seen in the manufacturing validation.

11. Implement Solutions


• If the objective of the experiment was accomplished and the Business Case is
satisfied, then proceed to the Control Plan which is covered in the Control Phase.
• Further experiments may need to be designed to further change the output to
satisfy the Business Case.
• Implement the changes necessary to maintain the new gains to the process.

764
Summary

At this point, you should:

 Understand how to Create Balanced & Orthogonal Designs

 Explain how Fit, Diagnose and Center Points factor into an experiment

765
Fractional Factorial Experiments

Process Modeling: Regression

Advanced Process Modeling: MLR

Designing Experiments

Experimental Methods
Designs
Full Factorial Experiments
Creation
Fractional Factorial Experiments
Generators

Confounding & Resolution

766
Why Use Fractional Factorial Designs

Fractional Factorial Designs are used to:


• Analyze factors to find cause/effect relationships if the Analyze Phase was unable to
sufficiently narrow the number of factors impacting the output(s).
– Fractional Factorials are often referred to as “screening experiments”: fewer runs with a
larger number of factors.
– Fractional Factorials are usually done in early stages of the improvement process.

Fractional Factorial Design (2^4-1, 8 runs)

StdOrder   A    B    C    D
    1     -1   -1   -1   -1
    2      1   -1   -1    1
    3     -1    1   -1    1
    4      1    1   -1   -1
    5     -1   -1    1    1
    6      1   -1    1   -1
    7     -1    1    1   -1
    8      1    1    1    1

Full Factorial Design (2^4, 16 runs)

StdOrder   A    B    C    D
    1     -1   -1   -1   -1
    2      1   -1   -1   -1
    3     -1    1   -1   -1
    4      1    1   -1   -1
    5     -1   -1    1   -1
    6      1   -1    1   -1
    7     -1    1    1   -1
    8      1    1    1   -1
    9     -1   -1   -1    1
   10      1   -1   -1    1
   11     -1    1   -1    1
   12      1    1   -1    1
   13     -1   -1    1    1
   14      1   -1    1    1
   15     -1    1    1    1
   16      1    1    1    1

767
Why Use Fractional Factorial Designs

Fractional Factorial Designs are also used to:


• Study Main Effects and 2-way interactions if the experimenter and team have good process
knowledge and can assume higher order interactions are negligible.
• Reduce the time and cost of experiments because the number of runs has been lowered.
– As the number of factors increases, the number of runs required for a full 2^k factorial
experiment also increases (even without repeats or replicates):
• 3 factors: 2x2x2 = 8 runs
• 4 factors: 2x2x2x2 = 16 runs
• 5 factors: 2x2x2x2x2 = 32 runs etc….
• Serve as an initial experiment that can be augmented with another fraction to reduce
confounding and estimate the factors of interest.

The answer is in there


somewhere!!

768
Nomenclature for Fractional Factorials

The general notation for Fractional Factorials is 2^(k-p) with a Resolution subscript,
written 2_R^(k-p):
– k = number of factors to be investigated
– p = number of factors assigned to an interaction column (also called the “degree of
fractionating”: 1 = 1/2, 2 = 1/4, 3 = 1/8, etc.)
– R = design Resolution (III, IV, V, etc.). It details the amount of confounding to
compare design alternatives.
– 2^(k-p) = the number of experimental runs

The example 2_V^(5-1) clarifies how to use the nomenclature:
• How many factors are in the experiment?
• How many runs if no repeats or replicates?
• What fractional design is this (1/8, 1/4 or 1/2)?
769
Half-Fractional Experiment Creation

Recall the 2x2x2 full 3-factor, 2-level Factorial Design. Suppose we needed to investigate a fourth factor
but could NOT increase the number of runs because of time or cost. Select the highest order interaction
to represent the levels of the fourth factor: the ABC interaction will determine the levels for factor D.
When we replace the ABC interaction with factor D, we say the ABC 3-way interaction was aliased, or
confounded, with D. This experiment maintains balance and orthogonality.
– The first experimental run in the first row indicates the experiment is executed with factor D at the low level
while running the 3 other factors at the low level.

Factor D
A B C AxB AxC BxC AxBxC
-1 -1 -1 1 1 1 -1
1 -1 -1 -1 -1 1 1
-1 1 -1 -1 1 -1 1
1 1 -1 1 -1 -1 -1
-1 -1 1 1 -1 -1 1
1 -1 1 -1 1 -1 -1
-1 1 1 -1 -1 1 -1
1 1 1 1 1 1 1

This is a half-fraction 2^(4-1) design - a Resolution IV design with only 8 runs.
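The construction above can be sketched with coded -1/+1 levels; the row order may differ from the slide's table:

```python
from itertools import product

# Build the 2^(4-1) half fraction: take the full 2^3 design in A, B, C and set
# factor D equal to the ABC interaction column (design generator D = ABC).
runs = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]

for a, b, c, d in runs:
    print(a, b, c, d)  # 8 treatment combinations; D always equals A*B*C

# Balance: each factor column sums to zero across the 8 runs.
for col in range(4):
    assert sum(run[col] for run in runs) == 0
```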


770
Half-Fractional Experiment Creation

Why is the design, shown as orange rows, called a “half” fraction? This is the design just created on
the previous slide. This is a half fraction since a full 2x2x2x2 factorial would take 16 runs. With the
half fraction we can estimate the effects of 4 factors in 8 runs. What is the cost? We lose the ability to
study the higher order interaction independently!

A B C D AxB AxC BxC AxBxC
-1 -1 -1 -1 1 1 1 -1
1 -1 -1 -1 -1 -1 1 1
-1 1 -1 -1 -1 1 -1 1
1 1 -1 -1 1 -1 -1 -1
-1 -1 1 -1 1 -1 -1 1
1 -1 1 -1 -1 1 -1 -1
-1 1 1 -1 -1 -1 1 -1
1 1 1 -1 1 1 1 1
-1 -1 -1 1 1 1 1 -1
1 -1 -1 1 -1 -1 1 1
-1 1 -1 1 -1 1 -1 1
1 1 -1 1 1 -1 -1 -1
-1 -1 1 1 1 -1 -1 1
1 -1 1 1 -1 1 -1 -1
-1 1 1 1 -1 -1 1 -1
1 1 1 1 1 1 1 1

Half Fraction Alias Structure: D = ABC. Note the D settings in the half-fraction rows
are the same as the AxBxC interaction column.

Could we create a quarter fraction experiment out of the above matrix and
still study four factors at once?
Why or why not?

771
Graphical Representation of Half-Fraction

We have discussed half-fractional Experimental Designs for 4 factors:


The graphical representation shows the 8 runs we created on the previous 2 slides.

[Cube plot: the 8 half-fraction runs shown on two A-B-C cubes, one at each level of D;
the top line of the previous slide's table is the first run plotted.]

Remember that D is confounded with the ABC interaction in this half-fractional
design.

772
Design Generators

Design Generators are an easier technique than generating the Fractional Factorial
Designs by hand as done on the previous slides.

Design Generators help us EASILY find the confounding within the Fractional Design.

A Design Generator is the mathematical definition for how to begin aliasing a
Full Factorial to create a Fractional Factorial.

Example of a Design Generator:

Design Generator D = ABC

This means the D column is the same as the ABC interaction


column; they cannot be distinguished from each other so are
called “confounded”.

773
Design Generators

Design Generator D = ABC


• Because of the Design Generator we can now fill out the D column
– For each row of D, multiply the values in the columns of A, B and C together
and create the column
• You may correctly suspect some 2-factor interactions are confounded
• Create contrast columns for AD, BD, CD using a similar technique used to create
the column for D

A B C AB AC BC D AD BD CD
-1 -1 -1 1 1 1
1 -1 -1 -1 -1 1
-1 1 -1 -1 1 -1
1 1 -1 1 -1 -1
-1 -1 1 1 -1 -1
1 -1 1 -1 1 -1
-1 1 1 -1 -1 1
1 1 1 1 1 1

774
Design Generators

A B C AB AC BC D AD BD CD
-1 -1 -1 1 1 1 -1 1 1 1
1 -1 -1 -1 -1 1 1 1 -1 -1
-1 1 -1 -1 1 -1 1 -1 1 -1
1 1 -1 1 -1 -1 -1 -1 -1 1
-1 -1 1 1 -1 -1 1 -1 -1 1
1 -1 1 -1 1 -1 -1 -1 1 -1
-1 1 1 -1 -1 1 -1 1 -1 -1
1 1 1 1 1 1 1 1 1 1

775
MINITABTM Session Window

What does this mean?

Fractional Factorial Design

Factors: 4 Base Design: 4, 8 Resolution: IV


Runs: 8 Replicates: 1 Fraction: 1/2
Blocks: none Center pts (total): 0

Design Generators: D = ABC


Alias Structure
I + ABCD

A + BCD
B + ACD
C + ABD
D + ABC
AB + CD
AC + BD
AD + BC
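The alias table above can be reproduced by "multiplying" each effect by the defining relation I = ABCD, where repeated letters cancel in pairs; a small sketch:

```python
def alias(effect, defining_word="ABCD"):
    """Alias of an effect under the defining relation I = defining_word.

    Multiplying by the defining word cancels repeated letters in pairs,
    e.g. AB * ABCD -> CD, so AB is confounded with CD.
    """
    letters = set(effect) ^ set(defining_word)  # symmetric difference of letter sets
    return "".join(sorted(letters)) or "I"

for effect in ("A", "B", "C", "D", "AB", "AC", "AD"):
    print(f"{effect} + {alias(effect)}")
# Reproduces the session window: A + BCD, B + ACD, C + ABD, D + ABC,
# AB + CD, AC + BD, AD + BC
```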

776
So What is “Confounding”?

Confounding is the consequence an experimenter accepts for not running a Full


Factorial Design.
When using the “Confounding” or “Alias” pattern we assume that the higher order
interactions in a Confounded effect are not significant.
– Sparsity of effects principle indicates that higher order interactions are very rare.
• “While interactions are important they do not abound…, interactions that are more
complex than those involving two factors are rare” Thomas B. Barker
In the past example, the D factor was Confounded with the ABC 3-way interaction.
When the effect is assigned to D which is Confounded with ABC, we assume because
of the sparsity of effects principle the effect is entirely because of the D factor.
Remember when two items such as an interaction with a Main Effect are Confounded,
one cannot distinguish if the statistical significance is a result of the Main Effect or the
interaction or a combination.

Aliasing is another term for “Confounding”.

777
Confounded Effects With Fractionals

MINITABTM will automatically generate the alias structure which lists all the
Confounded Effects.
Note: For this case
– A is Confounded with BC
– B is Confounded with AC
– C is Confounded with AB
The Confounding means any effect noted cannot be specifically assigned to either of the
Confounded factors.
– Remember we will use the sparsity principle.
Note: This is a Resolution III design and is NOT recommended since Confounding exists
between Main Effects and 2-factor interactions.

Alias Structure
I + ABC
A + BC
B + AC
C + AB

Note the Confounded columns have the same levels in every run:

A BC ABC
+1 +1 +1
-1 -1 +1
-1 -1 +1
+1 +1 +1

B AC ABC
-1 -1 +1
+1 +1 +1
-1 -1 +1
+1 +1 +1

C AB ABC
-1 -1 +1
-1 -1 +1
+1 +1 +1
+1 +1 +1

778
Experimental Resolution

Remember R in the nomenclature 2^(k-p) referenced the Resolution.
This useful visual aid helps remember the definitions of the Confounding
designated by the Resolution.
Resolution III Fully Saturated Design
Hold up Three Fingers, One on one hand
and Two on the other. This illustrates the
Confounding of main effects with two way
Main Effects Two Way Interactions interactions.

Resolution IV
Next hold up four fingers
The Confounding is main effects with three
way interactions or…
Main Effects Three Way Interactions

Two way interactions Confounded with other


two way interactions.

Two Way Interactions Two Way Interactions


779
Experimental Resolution

The visual aid for 2^(k-p) designs is shown through Resolution V.
Resolution V
Hold up Five Fingers, One on one hand and
Four on the other. This illustrates the
Confounding of main effects with four way
Main Effects Four Way Interactions interactions or …

Two way interactions Confounded with


three way interactions.

Two Way Interactions Three Way Interactions

780
MINITABTM Fractional Factorial Design Creation

Fortunately, MINITABTM creates the designs for us to prevent having to create a Fractional
Factorial by hand. This output, found in the MINITABTM session window after creating a
Fractional Factorial design, should be understood because it also informs us of the Resolution
of the design.

Stat>DOE>Factorial>Create Factorial Design … 4 factors, Designs, ½ fraction

Fractional Factorial Design

Factors: 4 Base Design: 4, 8 Resolution: IV


Runs: 8 Replicates: 1 Fraction: 1/2
Blocks: none Center pts (total): 0

Design Generators: D = ABC


Alias Structure
I + ABCD

A + BCD
B + ACD
C + ABD
D + ABC
AB + CD
AC + BD
AD + BC

781
2^(5-1) Fractional Design, Resolution V

Example of a very useful Fractional Design often used for screening designs.

Run A B C D E
1 -1 -1 -1 -1 1
2 1 -1 -1 -1 -1
3 -1 1 -1 -1 -1
4 1 1 -1 -1 1
5 -1 -1 1 -1 -1
6 1 -1 1 -1 1
7 -1 1 1 -1 1
8 1 1 1 -1 -1
9 -1 -1 -1 1 -1
10 1 -1 -1 1 1
11 -1 1 -1 1 1
12 1 1 -1 1 -1
13 -1 -1 1 1 1
14 1 -1 1 1 -1
15 -1 1 1 1 -1
16 1 1 1 1 1

[Cube plot: the 16 runs shown on A-B-C cubes replicated across the levels of D and E.]

Pros:
– 5 factors (Main Effects)
– 10 2-way interactions
– Main Effects only Confounded with rare 4-way interactions

Cons:
– 16 trials to get 5 Main Effects
– 2nd order interactions are Confounded with 3rd order interactions

782
MINITABTM’s Display of Available Designs

Fractional Designs are colored boxes


without “Full”

Note: Since we discourage Design Resolution III or IV, MINITABTM shades these RED and
YELLOW as cautionary colors. GREEN is acceptable because Main Effects are not
Confounded with lower level interactions.
783
DOE Methodology

1. Define the Practical Problem


2. Establish the Experimental Objective
3. Select the Output (response) Variables
4. Select the Input (independent) Variables
5. Choose the Levels for the input variables
6. Select the Experimental Design
7. Execute the Experiment and collect data
8. Analyze the Data from the designed experiment and draw Statistical
Conclusions
9. Draw Practical Solutions
10. Replicate or Validate the Experimental Results
11. Implement Solutions

Just follow these simple steps…..

784
Fractional Factorial Example

1. Define the Practical Problem


• 8 factors are of interest in increasing the output but process knowledge is limited
because of a previously poor gauge for the output
2. Establish the Experimental Objective
• The output is to be maximized
3. Select the Output Variables
• The output is labeled Y and has a Gage R&R % study variation of less than 5%
4. Select the Input Variables
• The Input Variables are simply labeled A through H
5. Choose the Levels for the Input Variables
• For simplicity sake of this exercise, the Levels can be expected to be appropriately set
and we will only work with coded levels

785
Fractional Factorial Example (cont.)

6. Select the Experimental Design


 Select the appropriate design in MINITAB TM and create this exact worksheet in columns C1
through C12.
 We have no reason to believe curvature exists and are satisfied that no replicates are
required.
 For ease of this exercise, be sure NOT to have randomized the experiment.

786
Fractional Factorial Example (cont.)

7. Execute the Experiment and Collect the Data


 The resources and time allow us to only run the experiment with 16 treatment
combinations or experimental runs.

787
Fractional Factorial Example (cont.)

8. Analyze the Data and draw Statistical Conclusions


 Before doing any analysis, let’s review what Confounding exists in this highly fractionated
Factorial Design
 The Main Effects are Confounded with numerous 3-way interactions
 The 2-way interactions are Confounded with numerous 2-way interactions
 This is important and must be remembered in our analysis.

788
Fractional Factorial Example (cont.)

We want 95% confidence in our Statistical Conclusions for this example.

We have generated the initial Pareto of effects.

[Pareto Chart of the Effects (response is Y, Alpha = .10), factors A through H.
Effects in decreasing order: E, A, AC, H, B, AF, AE, AD, AG, C, AH, AB, G, F, D;
the significance reference line is at 0.26. Lenth's PSE = 0.129375]

789
Fractional Factorial Example (cont.)

A choice must be made in reducing our model or reducing the number of terms in the model. We
have chosen to look at the Confounding table generated by MINITAB TM.

 The AC 2 factor interaction is Confounded with other 2-way interactions but we will
assume for now using the Confounding table from MINITAB TM that the 2-way AC
interaction is actually the EH 2 factor interaction because both factors E and H are
significant.
 The second highest effect for a 2-factor interaction is AF. We will look at the
Confounding table and assume it is the BE 2-way interaction since the B and E factors
are significant.
 The 2-way interaction AE is also significant at the alpha of 0.1. We cannot find
another 2-way interaction that might be significant using just the B, E, and H factors.
 If the AE interaction is kept in the model, then to maintain “hierarchical order”
factors A and E must be kept in the model.
 We will now reduce the model and see if we can reduce it further.

790
Fractional Factorial Example (cont.)

The Reduced Model is shown here and we want 95% confidence to include terms.
Notice the AE 2-way interaction has the smallest effect of the statistically significant terms, and
factor A, kept in the model to maintain the “hierarchical order”, also has a small effect and is
statistically insignificant. We choose to reduce the model and remove those terms. R-sq should
not be severely impacted. If it were impacted severely, we would reconsider this choice.

Factorial Fit: Y versus A, B, E, H


Estimated Effects and Coefficients for Y (coded units)

Term Effect Coef SE Coef T P


Constant 22.001 0.04381 502.21 0.000
A 0.144 0.072 0.04381 1.64 0.139
B 4.939 2.469 0.04381 56.37 0.000
E 12.921 6.461 0.04381 147.48 0.000
H -6.246 -3.123 0.04381 -71.29 0.000
A*E -0.351 -0.176 0.04381 -4.01 0.004
B*E -3.836 -1.918 0.04381 -43.78 0.000
E*H 8.244 4.122 0.04381 94.09 0.000

S = 0.175232 R-Sq = 99.98% R-Sq(adj) = 99.96%


Analysis of Variance for Y (coded units)
Source DF Seq SS Adj SS Adj MS F P
Main Effects 4 921.55 921.545 230.386 7502.91 0.000
2-Way Interactions 3 331.20 331.198 110.399 3595.34 0.000
Residual Error 8 0.25 0.246 0.031
Total 15 1252.99

791
Fractional Factorial Example (cont.)

The further refit model shows an adequate model because:


 Simplicity of terms, which is desired but NOT required
 R-sq is quite high (unusually high for practical experiments)
 No or few unusual observations, which would be noted below the ANOVA in MINITABTM’s
session window
 Residuals are appropriate
Factorial Fit: Y versus B, E, H

Estimated Effects and Coefficients for Y (coded units)

Term Effect Coef SE Coef T P


Constant 22.001 0.07167 306.98 0.000
B 4.939 2.469 0.07167 34.46 0.000
E 12.921 6.461 0.07167 90.15 0.000
H -6.246 -3.123 0.07167 -43.58 0.000
B*E -3.836 -1.918 0.07167 -26.76 0.000
E*H 8.244 4.122 0.07167 57.51 0.000

S = 0.286673 R-Sq = 99.93% R-Sq(adj) = 99.90%


Analysis of Variance for Y (coded units)

Source DF Seq SS Adj SS Adj MS F P


Main Effects 3 921.46 921.462 307.154 3737.52 0.000
2-Way Interactions 2 330.70 330.705 165.352 2012.05 0.000
Residual Error 10 0.82 0.822 0.082
Lack of Fit 2 0.10 0.099 0.050 0.55 0.597
Pure Error 8 0.72 0.722 0.090
Total 15 1252.99

792
Fractional Factorial Example (cont.)

The Residuals Analysis is adequate and appropriate because:


 The residuals are concluded to be normally distributed
 No pattern for residuals in the order or versus Fitted Value

[Residual Plots for Y (four-in-one): Normal Probability Plot (N = 16, AD = 0.532,
P-Value = 0.146), Residuals Versus the Fitted Values, Histogram of the Residuals, and
Residuals Versus the Order of the Data. The residuals appear normally distributed with
no patterns versus fit or order.]
793
Fractional Factorial Example (cont.)

Statistical Conclusions to maintain terms in the model must consider:


 Maintaining hierarchical order
 A 2-way interaction must have the involved factors in the model also
 High statistical confidence with the P-value less than your alpha risk
 A higher R-sq or model explanation of the process changes is desired
 Proper residuals and few to no unusual observations

No, no unusual
observations here…

794
Fractional Factorial Example (cont.)

9. Draw Practical Solutions


 We will have to remember our Experimental Objective to increase the output Y.
 Looking at the positive coefficient for B and E, we know if we put those factors at the high
level or value of +1, the output increases
 Looking at the negative coefficient for H, we would think we should operate at the low level
or value of -1. However, the 2-way interaction of EH shows a coefficient that is larger and
would result in a net decrease in the output of Y so we must set H to a +1 or the high level.
 A big reminder is we have ASSUMED the 2-way interactions involved the factors we left in
the model.

Factorial Fit: Y versus B, E, H

Estimated Effects and Coefficients for Y (coded units)

Term Effect Coef SE Coef T P


Constant 22.001 0.07167 306.98 0.000
B 4.939 2.469 0.07167 34.46 0.000
E 12.921 6.461 0.07167 90.15 0.000
H -6.246 -3.123 0.07167 -43.58 0.000
B*E -3.836 -1.918 0.07167 -26.76 0.000
E*H 8.244 4.122 0.07167 57.51 0.000

S = 0.286673 R-Sq = 99.93% R-Sq(adj) = 99.90%

795
Fractional Factorial Example (cont.)

It can be difficult to optimize the solutions and get the Practical Solution desired.

Using Response Optimizer within MINITAB TM helps us find the Practical Solution of setting the
factors left in the model all at the high level or +1

Stat>DOE>Factorial>Response Optimizer…..set goal to maximize

796
Fractional Factorial Example (cont.)

Practical considerations for keeping terms in the model include:


 Simple models can be useful depending on the project or process requirements
 Terms need practically large enough effects, not just statistical significance
 Impact of R-sq by removing a term with low effects
 Ability to set and control the controllable inputs in the model may decide on the
use of terms
 Robust designs or minimal variation requirements may require close
inspection of interactions’ effects on the Y
 If multiple outputs are involved in the process requirements, balancing of
requirements will be necessary

That’s a lot of juggling….

797
Fractional Factorial Example (cont.)

10. Replicate or Validate the Experimental Results


 After we have determined with 95% statistical confidence, we must replicate the
results to confirm our assumptions; such as which 2-way interactions were
significant among the Confounded ones
 If the results do not match the expected results OR the project goal, further
experimentation may be needed
 In this case, we were able to achieve 29.8 on average with the process setting of E,
B and H and so the results are considered successful in the project

We win, we win…!!

798
Fractional Factorial Example (cont.)

11. Implement Solutions


 Work with the process owners and develop the Control Plans to sustain
your success

799
Fractional Factorial Exercise

Exercise objective: Open file “bhh379.mtw” and analyze using the 11 Step methodology.

1. What kind of Factorial Design is this?

2. Generate Factorial Plots in MINITABTM.

3. Create the Statistical and Practical model.

800
Summary

At this point, you should be able to:

• Explain why & how to use a Fractional Factorial Design

• Create a proper Fractional Factorial Design

• Analyze a proper model with aliased interactions

Not that kind of model!!

801
Control Phase
Advanced Capability

Welcome to Control
Capability as Monitoring Technique

Capability of Data Types


Advanced Capability
Data Distribution Identification
Lean Controls
Special Causes Impacting Capability
Defect Controls

Statistical Process Control (SPC)

803
Beginnings of Control Phase

You’ve already narrowed to the “vital few” with the Define, Measure, Analyze and
Improve Phases.

Just because you are able to achieve results with your project or through a DOE
does not mean you have Process Capability.

This module in the Control Phase gives you tools and ideas to tackle Special Causes
that may be hampering your Process Capability even if you found your “vital few”
to get an improved average.

804
Capability and Monitoring

• If the project was important enough to warrant the time and attention of
you and your team, it is important enough to ensure that performance
levels are maintained
• Monitoring the improved process is a key element of the Control Plan
• Reporting Capability and Stability should be used together as primary
components of the monitoring plan
• In the Measure Phase, Capability was used to establish baseline
performance by assessing what had occurred in the relevant past
• In the Control Phase, Capability becomes a predictive (inferential) tool to
predict the expected process performance, usually based on a sample.

805
Capability Studies

Capability Studies
• Are intended to be periodic estimations of a process’s ability to meet its
requirements
• Can be conducted on both Discrete and Continuous Data
• Are most meaningful when conducted on stable processes
• Can be reported as Sigma Level which is optimal (short term) performance
• These concepts should be remembered from using the Six Sigma toolset applied
so far:
– Customer or business specification limits
• Business specification limits cannot be wider than the specification limits
of a final product
– Nature of long term vs. short term data
– Mean and Standard Deviation of the process (for Continuous Data)
– The behavior and shape of the distribution of Continuous Data
– Procedure for determining Sigma level
– Relevance of data

806
Components of Variation

As in the Measure Phase, understanding whether you are dealing with long
term or short term data is an important first step.

If the process is stable, short term data provides a quick estimate of true
process potential since special causes are minimal.

Lot 1 Lot 5
Lot 3

Lot 2

Lot 4
Short-term studies

Long-term study

807
Capability for the Control Phase

The New Capability Road Map

Assess Stability

Verify Specifications

Distinguish between Long-Term


and Short-Term Data

[Decision flow]
Are Data Continuous?
– No → Defects or Defectives?
• Defectives → Stat> Quality Tools> Capability Analysis> Binomial
• Defects → Stat> Quality Tools> Capability Analysis> Poisson
– Yes → Are Data Normal?
• No → Determine the Shape of the Distribution → Stat> Quality Tools> Capability Analysis> Nonnormal
• Yes → Stat> Quality Tools> Capability Analysis> Normal

808
Stability Check

• To assess the stability of the performance, you need to use a tool called Control Charts.
• Control Charts vary on the type of data for the metric chosen.
• For defectives data, you should plot p chart or np chart. For defects per unit data, you should plot u chart or c chart.

In this example, we wish to track if the late coming of reports out of the total class is stable or not.

Perform the standard calculations for a p-chart using the formulas provided by the facilitator
Day Subgroup size Late Reports
1 89 14
2 101 18
3 108 13
4 80 12
5 107 13
6 77 15
7 76 18
8 101 18
9 72 16
10 105 12
11 102 14
12 82 15
13 92 16
14 76 15
15 105 20
16 99 11
17 76 20
18 105 11
19 88 19
20 78 16
21 108 17
22 89 14
23 80 15
24 105 19
25 102 20
26 105 18
27 80 31
28 89 16
29 88 12
30 97 17

Is the process stable in behavior?
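The standard p-chart calculations can be sketched as below, using the first five days of the table (the limits vary with each day's subgroup size):

```python
from math import sqrt

# Standard p-chart calculations, sketched with the first five days of the table.
subgroup_sizes = [89, 101, 108, 80, 107]
late_reports = [14, 18, 13, 12, 13]

# Center line: overall proportion of late reports across all subgroups.
p_bar = sum(late_reports) / sum(subgroup_sizes)

# Limits for each subgroup of size n: p_bar +/- 3*sqrt(p_bar*(1-p_bar)/n),
# with the LCL floored at zero since a proportion cannot be negative.
for n, x in zip(subgroup_sizes, late_reports):
    half_width = 3 * sqrt(p_bar * (1 - p_bar) / n)
    ucl = p_bar + half_width
    lcl = max(0.0, p_bar - half_width)
    in_control = lcl <= x / n <= ucl
    print(f"n={n}  p={x / n:.3f}  LCL={lcl:.3f}  UCL={ucl:.3f}  {in_control}")
```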
809
Control Phase
Discrete Capability: Poisson Output

Our target was to be less than 0.05 DPU. Based on the Excel data sheet used to make the p-chart and the
calculations shown below, the DPU is less than 0.05.

USL: 0.3
Number of items that were less than 0.3: 925
Out of how many items: 950
Defectives per unit: 0.026316
RTY: 97.40%
Sigma Level: 3.437932

Defectives per Unit is found by means of a simple calculation:
DPU = (Number of Defectives)/(Total Number of Units)
Next, calculate Rolled Throughput Yield with the formula:
RTY = e^-DPU
Then to calculate Sigma levels, you can either use the Sigma tables or use the formula:
=-NORMSINV(DPU) + 1.5
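The slide's numbers can be checked with a short computation, using Python's NormalDist in place of Excel's NORMSINV:

```python
from math import exp
from statistics import NormalDist

units = 950   # total items inspected
good = 925    # items that were less than the USL of 0.3

dpu = (units - good) / units                # defectives per unit
rty = exp(-dpu)                             # Rolled Throughput Yield, RTY = e^-DPU
sigma = -NormalDist().inv_cdf(dpu) + 1.5    # equivalent to -NORMSINV(DPU) + 1.5

print(round(dpu, 6), f"{rty:.2%}", round(sigma, 4))  # 0.026316 97.40% 3.4379
```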

810
Normal Capability Sixpack

Use the data sheet provided to you in the LSSBB Data set in the worksheet
mentioned Control Chart Data.
Step 1 – Calculate Sample Means and Sample Ranges for the data sample chosen.

Step 2 – Calculate grand average of Sample Means and Sample Ranges.

Step 3 --- Use the Control Chart formulas for both the Xbar and R Charts, as below

For Xbar Chart,

Center line = X double bar


UCL = X double bar + A2* R bar (A2 for a sample size of 4 is 0.729)
LCL = X double bar – A2 * R bar

For R Chart

Center line = R bar


UCL = D4 * R bar (D4 for a sample size of 4 is 2.282)
LCL = D3 * R bar (D3 for a sample size of 4 is 0)
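The three steps above can be sketched as follows, using the constants from the slide; the subgroup data here are illustrative, not from the LSSBB data set:

```python
# Xbar/R control-limit calculations for subgroups of size 4, using the slide's
# constants (A2 = 0.729, D3 = 0, D4 = 2.282). Subgroup data are illustrative.
A2, D3, D4 = 0.729, 0.0, 2.282

subgroups = [
    [10.1, 9.8, 10.3, 9.9],
    [10.0, 10.2, 9.7, 10.1],
    [9.9, 10.4, 10.0, 9.8],
]

xbars = [sum(s) / len(s) for s in subgroups]    # Step 1: sample means
ranges = [max(s) - min(s) for s in subgroups]   # Step 1: sample ranges

x_dbar = sum(xbars) / len(xbars)                # Step 2: X double bar
r_bar = sum(ranges) / len(ranges)               # Step 2: R bar

ucl_x, lcl_x = x_dbar + A2 * r_bar, x_dbar - A2 * r_bar   # Step 3: Xbar chart
ucl_r, lcl_r = D4 * r_bar, D3 * r_bar                     # Step 3: R chart

print(f"Xbar chart: CL={x_dbar:.3f} UCL={ucl_x:.3f} LCL={lcl_x:.3f}")
print(f"R chart:    CL={r_bar:.3f} UCL={ucl_r:.3f} LCL={lcl_r:.3f}")
```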

811
Normal Capability Sixpack

Use the data sheet provided to you in the LSSBB Data set in the worksheet
mentioned Control Chart Data.

1. X bar Chart is in control but R


Chart is out of control.

2. Points can be out of control due


to special cause of variation.

3. Also look in the control charts for


modality violations.

812
Lean Controls

Advanced Capability Vision of Lean Supporting Six Sigma

Lean Controls Lean Tool Highlights

Defect Controls Project Sustained Success

Statistical Process Control (SPC)

813
Lean Controls

You’ve begun the process of sustaining your project after finding the “vital few”
X’s for your project.

In Advanced Process Capability, we discussed removing some of the Special Causes that
create spread from Outliers in the process performance.

This module gives more tools from the Lean toolbox to stabilize your process.

Belts, after some practice, often consider this module’s set of tools a way to
improve some processes that are totally “out of control” or of significantly poor
Process Capability before applying the Six Sigma methodology.

814
The Vision of Lean Supporting Your Project

The Continuous Goal… Sustaining Results
 We cannot sustain Kanban without Kaizen.
 We cannot sustain Kaizen (Six Sigma) without Standardized Work.
 We cannot sustain Standardized Work without a Visual Factory.
 We cannot sustain a visual factory without 5S.

[Pyramid, top to bottom: Kanban, Kaizen, Standardized Work, Visual Factory,
5S Workplace Organization]

Lean tools add discipline required to further sustain gains realized


with Six Sigma Belt Projects.
815
What is Waste (MUDA)?

Waste is often the root of any Six Sigma project. The 7 basic elements of waste (muda
in Japanese) include:
– Muda of Correction
– Muda of Overproduction
– Muda of Processing
– Muda of Conveyance
– Muda of Inventory
– Muda of Motion
– Muda of Waiting
Get that garbage outta here!

The specifics of the MUDA were discussed in the Define Phase:

– The reduction of MUDA can reduce your outliers and help with defect
prevention. Outliers often arise because of differing waste among procedures, machines, etc.

816
The Goal

Don’t forget the goal -- Sustaining your Project which eliminates MUDA!

With this in mind, we will introduce and review some of the Lean tools used to sustain
your project success.

817
5 S - Workplace Organization

• 5S means the workplace is


clean, there is a place for
everything and everything is in
its place.
• 5S is the starting point for
implementing improvements to
a process.
• To ensure your gains are
sustainable, you must start with
a firm foundation.
• Its strength is contingent upon
the employees and company
being committed to maintaining
it.

818
5 S Translation - Workplace Organization

Step Japanese Literal Translation English

Step 1: Seiri Clearing Up Sorting

Step 2: Seiton Organizing Straightening

Step 3: Seiso Cleaning Shining

Step 4: Seketsu Standardizing Standardizing

Step 5: Shitsuke Training & Discipline Sustaining

Focus on using the English words, much easier to remember.

819
SORTING - Decide what is needed.

Definition:
– To sort out necessary and
unnecessary items.
– To store often used items at the
work area, infrequently used
items away from the work area
and dispose of items that are not
needed.
Why:
– Removes waste.
– Safer work area.
– Gains space.
– Easier to visualize the process.

Things to remember
• Start in one area, then sort through everything.
• Discuss removal of items with all persons involved.
• Use appropriate decontamination, environmental and safety procedures.
• Items that cannot be removed immediately should be tagged for later removal.
• If necessary, use movers and riggers.

820
A Method for Sorting

[Sorting flow: evaluate each Item as Useful, Unknown or Useless. Useful → Keep & Store →
ABC Storage. Unknown → Keep & Monitor, then sort again as Useful or Useless.
Useless → Dispose.]

821
A Method for Sorting

Use this graph as a general guide for deciding where to store items, along with the
table below.

[Graph: Frequency of Use versus Distance, with zones A (frequent use, close by), B,
and C (infrequent use, far away).]

Frequency of Utilization       Class   Keep within arms reach   Keep in local location   Keep in remote location
Daily or several times a day   A       YES                      MAYBE                    NO
Weekly                         B       MAYBE                    YES                      NO
Monthly or quarterly           C       NO                       NO                       YES
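The decision table can be expressed as a tiny lookup, with the class labels as in the table (a sketch; the function name is illustrative):

```python
# Storage-decision table: map a frequency-of-utilization class to the
# preferred storage location from the slide's table.
STORAGE_RULES = {
    "A": "within arms reach",   # daily or several times a day
    "B": "local location",      # weekly
    "C": "remote location",     # monthly or quarterly
}

def storage_location(frequency_class):
    """Return the preferred storage location for a usage class (A, B or C)."""
    return STORAGE_RULES[frequency_class.upper()]

print(storage_location("a"))  # within arms reach
```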

822
STRAIGHTENING – Arranging Necessary Items

Definition:
– To arrange all necessary items.
– To have a designated place
for everything.
– A place for everything and everything in its
place.
– Easily visible and accessible.
Why:
– Visually shows what is required or is out of place.
– More efficient to find items and documents (silhouettes/labels).
– Saves time by not having to search for items.
– Shorter travel distances.

Things to remember
• Things used together should be kept together.
• Use labels, tape, floor markings, signs, and shadow outlines.
• Sharable items should be kept at a central location (eliminates excess).

823
SHINING – Cleaning the Workplace

Definition:
– Clean everything and find
ways to keep it clean.
– Make cleaning a part of your
everyday work.
Why:
– A clean workplace indicates a quality product and process.
– Dust and dirt cause product contamination and potential health hazards.
– A clean workplace helps identify abnormal conditions.

Things to remember
• “Everything in its place” frees up time for cleaning.
• Use an office or facility layout as a visual aid to identify individual
responsibilities for cleaning. This eliminates “no man’s land.”
• Cleaning the work area is like bathing. It relieves stress and strain, removes sweat
and dirt, and prepares the body for the next day.

824
STANDARDIZING – Creating Consistency

Definition:
– To maintain the workplace at a
level that uncovers problems
and makes them obvious.
– To continuously improve your
office or facility by continuous
assessment and action.
Why:
– To sustain Sorting, Storage and Shining activities every day.

Things to remember
• We must keep the work place neat enough for visual identifiers to be effective in
uncovering hidden problems.
• Develop a system that enables everyone in the workplace to see problems when they
occur.

825
SUSTAINING – Maintaining the 5S

Definition:
– To maintain our discipline,
we need to practice and repeat
until it becomes a way of life.

Why:
– To build 5S into our everyday process.

Things to Remember
• Develop schedules and check lists.
• Good habits are hard to establish.
• Commitment and discipline toward housekeeping are essential first steps toward
being world class.

826
The Visual Factory

The basis and foundation of a Visual Factory are the 5S Standards.

A Visual Factory enables a process to manage its processes with clear indications of opportunities.
Your team should ask the following questions if looking for a project:
– Can we readily identify Downtime Issues?
– Can we readily identify Scrap Issues?
– Can we readily identify Changeover Problems?
– Can we readily identify Line Balancing Opportunities?
– Can we readily identify Excessive Inventory Levels?
– Can we readily identify Extraneous Tools & Supplies?

Exercise:
– Can you come up with any opportunities for “VISUAL” aids in your project?
– What visual aids exist to manage your process?

827
What is Standardized Work?

If the items are organized and orderly, then standardized work can be
accomplished.
– Less Standard Deviation of results
– The visual factory demands a framework of standardized work

The “one best way” to perform each operation has been identified and
agreed upon through general consensus (not majority rules).
– This defines the “Standard” work procedure

We cannot sustain Standardized Work without 5S and the Visual Factory.
(Diagram: tool ladder – Standardized Work above Visual Factory above
5S - Workplace Organization.)

828
Prerequisites for Standardized Work

Standardized work does not happen without the visual factory, which can be further
described with:

Availability of required tools (5S). Operators cannot be expected to maintain standard


work if required to locate needed tools

Consistent flow of raw material. Operators cannot be expected to maintain standard


work if they are searching for needed parts

Visual alert of variation in the process (visual factory). Operators, material handlers,
office staff all need visual signals to keep “standard work” a standard

Identified and labeled in-process stock (5S). As inventory levels of in-process stock
decrease, a visual signal should be sent to the material handlers to replenish this stock

829
What is Kaizen?

• Definition*: The philosophy of continual improvement: every
process can and should be continually evaluated and improved in terms
of time required, resources used, resultant quality and other aspects
relevant to the process.

• Kaikaku are breakthrough successes, which are the first focus of Six
Sigma projects.

* Note: Kaizen definition from: All I Needed To Know About
Manufacturing I Learned in Joe’s Garage. Miller and Schenk, Bayrock
Press, 1996, p. 75.

(Diagram: tool ladder – Kaizen above Standardized Work, Visual Factory
and 5S - Workplace Organization.)
830
Prerequisites for Kaizen

Kaizens need the following cultural elements:

Management Support. Consider the corporate support that makes the Six Sigma focus
a success in your organization.

Measurable Process. Without standardized work, we really wouldn’t have a consistent process to
measure. Cycle times would vary, assembly methods would vary, batches of materials would be
mixed, etc…

Analysis Tools. There are improvement projects in each organization which cannot be solved by
an operator. This is why we teach the analysis tools in the breakthrough strategy of Six Sigma.

Operator Support. The organization needs to understand that its future lies in the success of the
value-adding employees. Our role as Belts is to convince operators that we are here for
them; they will then be there for us.

831
What is Kanban?

Kanban is the best method of inventory control, impacting some of the 7
elements of MUDA shown earlier.
Kanban provides production, conveyance, and delivery information.
In its purest form the system will not allow any goods to be
moved within the facility without an appropriate Kanban (or
signal) attached to the goods.
– Kanban is the Japanese word for a communication signal
or card, typically a signal to begin work.
– Kanban is the technique used to “pull” products and
material through and into the lean manufacturing system.
– The actual “Kanban” can be a physical signal such as an empty
container or a small card.

(Diagram: tool ladder – Kanban above Kaizen, Standardized Work, Visual
Factory and 5S - Workplace Organization.)
832
Two Types of Kanban

There are two main categories of Kanbans:

Type 1: Finished goods Kanbans (intra-process)
– Signal Kanban: posted at the end of the processing area to signal
for production to begin.
– P.I.K. (Production Instruction Kanban): used for a much more
refined level of inventory control. The Kanban is posted as inventory
is depleted, thus ensuring only the minimum allowable level of
product is maintained.

Type 2: Incoming material (withdrawal) Kanbans
– Used to purchase materials from a supplying department either
internal or external to the organization. Regulates the amount of WIP
inventory located at a particular process or between two processes.
833
Prerequisites for a Successful Kanban System

These items support successful Kanbans:


• Improved changeover procedures.

• Relatively stable demand cycle.

• Number of parts per Kanban (card) MUST be standard and SHOULD be kept to as few
as possible parts per card.

• Small amount of variation (or defects).

• Near zero defects should be sent to the assembly process (Result of earlier belt projects).

• Consistent cycle times defined by Standardized Work.

• Material handlers must be trained in the organization of the transportation system.

834
Warnings Regarding Kanban

As we have indicated, if you do NOT have 5S, visual factory,


standardized work and ongoing Kaizens, Kanbans cannot
succeed.

Kanban systems are not quick fixes to large inventory problems,


workforce issues, poor product planning, fluctuating demand
cycles, etc...

Don’t forget that “weakest Link” thing!

835
The Lean Tools and Sustained Project Success

The Lean tools help sustain project success. The main lessons you should consider are:

1. The TEAM should 5S the project area and begin integrating visual factory indicators.
– Indications of the need for 5S are:
– Outliers in your project metric
– Loss of initial gains from project findings

2. The TEAM should develop Standardized Work Instructions


– They are required to sustain your system benefits.
– However, remember that without a workplace organized through 5S, standardized work
instructions won’t create consistency

3. Kaizens and Kanbans cannot be attempted without organized workplaces and organized
work instructions.
– Remember the need for 5S and Standardized Work Instructions to support our
projects.

4. Project Scope dictates how far up the Lean tools ladder you need to implement measures
to sustain any project success from your DMAIC efforts.

836
Class Exercise

In the boundaries for your project scope, give some examples of Lean tools in operation.

– Others can learn from those items you consider basic.

List other Lean tools you are most interested in applying to sustain your project results.

837
Summary

At this point, you should be able to:

• Describe the Lean tools

• Understand how these tools can help with project sustainability

• Understand how the Lean tools depend on each other

• Understand what tools must document the defect prevention created in the Control
Phase

838
Defect Controls

Advanced Capability

Lean Controls Realistic Tolerance and Six Sigma Design

Defect Controls Process Automation or Interruption

Statistical Process Control (SPC) Poka-Yoke

839
Purpose of Defect Prevention in Control Phase

Process improvement efforts often falter during implementation of new operating


methods learned in the Analyze Phase.

Sustainable improvements cannot be achieved without control tactics to guarantee


permanency.

Defect Prevention seeks to gain permanency by eliminating or rigidly defining human


intervention in a process.

Yes sir, we are in CONTROL!

840
σ Level for Project Sustaining in Control

BEST
5-6σ: Six Sigma product and/or process design eliminates an error condition, OR an
automated system monitors the process and automatically adjusts critical X’s to
correct settings without human intervention to sustain process improvements

4-5σ: Automated mechanism shuts down the process and prevents further
operation until a required action is performed

3-5σ: Mistake proofing prevents a product/service from passing onto the next step

3-4σ: SPC on X’s where special causes are identified and acted upon by fully
trained operators and staff who adhere to the rules

2-4σ: SPC on Y’s

1-3σ: Development of SOPs and process audits

0-1σ: Training and awareness
WORST

841
6σ Product/Process Design

Designing products and processes such that the output Y meets or exceeds the target
capability.
(Plot: the distribution of X is mapped through the relationship
Y = F(x) onto the distribution of Y, shown against the specification on Y.)

When designing the part or process, specifications on X are set such that the target capability
on Y is achieved.
Both the target and tolerance of the X must be addressed in the spec limits.

842
6σ Product/Process Design

(Plot: the same mapping, now showing the upper and lower prediction
intervals around the relationship Y = F(x).)

If the relationship between X and Y is empirically developed through Regression or
DOEs, uncertainty exists.
As a result, confidence intervals should be used when establishing the specifications for
X.
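The prediction-interval idea above can be sketched numerically. This is an illustrative helper (the function name and data are invented, not from the course or MINITABTM), applying the standard formula for a 95% prediction interval around a simple linear regression:

```python
import numpy as np
from scipy import stats

def prediction_interval(x, y, x0, alpha=0.05):
    """Prediction interval for a new y at x = x0 from a simple linear
    regression fit (illustrative sketch, not MINITAB's algorithm)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    s = np.sqrt(np.sum(resid ** 2) / (n - 2))         # residual std. error
    sxx = np.sum((x - x.mean()) ** 2)
    se = s * np.sqrt(1 + 1 / n + (x0 - x.mean()) ** 2 / sxx)
    t = stats.t.ppf(1 - alpha / 2, df=n - 2)
    center = intercept + slope * x0
    return center - t * se, center + t * se
```

Setting the X tolerance then amounts to finding the X range whose prediction band stays inside the Y specification.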

843
Product/Process Design Example

Using 95% prediction bands within MINITABTM:
Stat > Regression > Fitted Line Plot … Options … Display Prediction Interval

Generate your own data set(s) and experiment with this MINITABTM function.

(Regression Plot: Y = 7.75434 + 5.81104X, R-Sq = 88.0%, Output vs.
Input, showing the regression line and the 95% prediction interval.)

What are the spec limits for the output?
What is the tolerance range for the input?
If you want 6σ performance, remember to tighten the output’s specification
to select the tolerance range of the input.

844
Product/Process Design Example
(Two Regression Plots with 95% PI bands: Output2 vs. Input2 with
Y = 2.32891 - 0.282622X, R-Sq = 96.1%; and Output vs. Input with
Y = 7.75434 + 5.81104X, R-Sq = 88.0%.)

Note: The high output spec connects with the top prediction line in
both cases; the lower input spec is read from that intersection.

Using the top output spec determines the high or low tolerance for the
input, depending on the slope of the Regression.
845
Poor Regression Impacting Tolerancing

(Two Regression Plots with 95% PI bands: Outp1 vs. Inp1 with
Y = -0.47 + 0.811312X, R-Sq = 90.4%; and Outp2 vs. Inp1 with
Y = 1.46491 + 0.645476X, R-Sq = 63.0%.)

Poor Correlation does not allow for tighter tolerancing.

846
5 – 6 σ Full Automation

Full Automation: Systems that monitor the process and automatically adjust critical
X’s to correct settings.

• Automatic gauging and system adjustments


• Automatic detection and system activation systems - landing gear extension based
on aircraft speed and power setting
• Systems that count cycles and automatically make adjustments based on an
optimum number of cycles
• Automated temperature controllers for controlling heating and cooling systems
• Anti-Lock braking systems
• Automatic welder control units for volts, amps and distance traveled on each weld
cycle

847
Full Automation Example

A Black Belt is working on controlling rust on machined surfaces of brake rotors:


– A rust inhibiter is applied during the wash cycle after final machining is
completed
– Concentration of the inhibiter in the wash tank is a Critical X that must be
maintained
– The previous system was a standard S.O.P. requiring a process technician to
audit and add the inhibiter manually

As part of the Control Phase, the team has implemented an automatic check and
replenish system on the washer.

Full Automation

Don’t worry boss, it’s automated!!


848
4 – 5 σ Process Interruption

Process Interruption: Mechanism installed that shuts down the process and prevents
further operation until a required action is performed:
• Ground fault circuit breakers
• Child proof caps on medications
• Software routines to prevent undesirable commands
• Safety interlocks on equipment such as light curtains, dual palm buttons, ram
blocks
• Transfer system guides or fixtures that prevent over or undersized parts from
proceeding
• Temperature conveyor interlocks on ovens
• Missing component detection that stops the process when triggered

849
4 – 5 σ Process Interruption

Example:
• A Black Belt is working on launching a new electric drive unit on a transfer system
– One common failure mode of the system is a bearing failure on the main motor
shaft
– It was determined that a high press fit at bearing installation was causing these
failures
– The root cause of the problem turned out to be undersized bearings from the
supplier
• Until the supplier could be brought into control or replaced, the team implemented a
press load monitor at the bearing press with an indicator
– If the monitor detects a press load higher than the set point, it shuts down the
press and will not allow the unit to be removed from press until an interlock key
is turned and the ram reset in the manual mode
– Only the line lead person and the supervisor have keys to the interlock
– The non-conforming part is automatically marked with red dye

Process Interruption
850
3 – 5 σ Mistake Proofing

Mistake Proofing is best defined as:


– Using wisdom, ingenuity, or serendipity to create devices allowing a
100% defect free step 100% of the time

Poka-Yoke is the Japanese term for mistake proofing: to avoid (“yokeru”)
inadvertent errors (“poka”).

See if you can find the Poka-Yokes!
(Image: eight numbered everyday items, each containing a Poka-Yoke.)

851
Traditional Quality vs. Mistake Proofing

Traditional Inspection:
Worker or Machine Error → Don’t Do Anything → Defective Result →
Sort At Other Step

Source Inspection:
Discover Error → Take Action/Feedback → No Defect → Next Step

“KEEP ERRORS FROM TURNING INTO DEFECTS”

852
Styles of Mistake Proofing

There are 2 states of a defect which are addressed with mistake proofing:

ERROR ABOUT TO OCCUR (Prediction)      ERROR HAS OCCURRED (Detection)
DEFECT ABOUT TO OCCUR                  DEFECT HAS OCCURRED

In either state the device can respond with a WARNING SIGNAL, with
CONTROL / FEEDBACK, or with a SHUTDOWN (Stop Operation).

853
Mistake Proofing Devices Design

Hints to help design a mistake proofing device:


– Simple
– Inexpensive
– Give prompt feedback
– Give prompt action (prevention)
– Focused application
– Have the right people’s input

BEST ...makes it impossible for errors to occur


BETTER ……allows for detection while error is being made
GOOD ...detects defect before it continues to the next operation

854
Types of Mistake Proof Devices

Contact Method
– Physical or energy contact with the product
• Limit switches
• Photo-electric beams

Fixed Value Method
– The number of parts to be attached/assembled etc. is constant
– The number of steps done in the operation
• Limit switches

Motion-step Method
– Checks for correct sequencing
– Checks for correct timing
• Photo-electric switches and timers

Common devices: 1) guide pins of different sizes, 2) error detection
and alarms, 3) limit switches, 4) counters, 5) checklists.
855
Mistake Proofing Examples

Everyday examples of mistake-proofing:

• Home
– Automated shutoffs on electric coffee pots
– Ground fault circuit breakers for bathroom or outside electric circuits
– Pilotless gas ranges and hot water heaters
– Child proof caps on medications
– Butane lighters with safety button
• Computers
– Mouse insertion
– USB cable connection
– Battery insertion
– Power save feature
• Automobile
– Seat belts
– Air bags
– Car engine warning lights
• Office
– Spell check in word processing software
– Questioning “Do you want to delete?” after depressing the “Delete” button on your computer
• Factory
– Dual palm buttons and other guards on machinery
• Retail
– Tamper proof packaging

856
Advantages of Mistake Proofing as A Control Method

Mistake Proofing advantages include:


– Only simple training programs are required
– Inspection operations are eliminated and the process is simplified
– Relieves operators from repetitive tasks of typical visual inspection
– Promotes creativity and value adding activities
– Results in defect free work
– Requires immediate action when problems arise
– Provides 100% inspection internal to the operation

The best resource for pictorial examples of Mistake Proofing is:

Poka-Yoke: Improving Product Quality by Preventing Defects.
Overview by Hiroyuki Hirano. Productivity Press, 1988.

857
Defect Prevention Culture and Good Control Plans

Involve everyone in Defect Prevention:


– Establish process capability through SPC
– Establish and adhere to standard procedures
– Make daily improvements
– Invent Mistake-proofing devices

Make immediate feedback and action part of culture

Don’t just stop at one mistake proofing device per product

Defect Prevention is needed for all potential defects

Defect Prevention implemented MUST be documented in your living FMEA for


the process/product

858
Class Exercise

Break into your groups and discuss mistake proofing systems currently at your facilities

Identify one automation example and one process interruption example per group

Be prepared to present both examples to the class

Answer the following questions as part of the discussion and presentation:


– How was the need for the control system identified? If a Critical X is mistake
proofed, how was it identified as being critical?
– How are they maintained?
– How are they verified as working properly?
– Are they ever disabled?

You have 30 minutes!

859
Class Exercise about Defect Prevention

Prepare a probable Defect Prevention method to apply to your project.

List any potential barriers to implementation.

860
Summary

At this point, you should be able to:

• Describe some methods of Defect Prevention

• Understand how these techniques can help with project sustainability:


– Including reducing those outliers as seen in the Advanced Process Capability section
– If the vital X was identified, prevent the cause of defective Y

• Understand what tools must document the Defect Prevention created in the Control Phase

861
Statistical Process Control

Advanced Capability

Lean Controls
Elements and Purpose
Defect Controls
Methodology
Statistical Process Control (SPC)
Special Cause Tests

Examples

862
SPC Overview: Collecting Data

Population:
– An entire group of objects that have been made or will be made
containing a characteristic of interest
Sample:
– A sample is a subset of the population of interest
– The group of objects actually measured in a statistical study
– Samples are used to estimate the true population parameters

Population

Sample
Sample
Sample

863
SPC Overview: I-MR Chart

• An I-MR chart combines a Control Chart of the average moving range with the Individuals Chart.
• You can use individuals charts to track the process level and to detect the presence of Special Causes when the
sample size is 1.
• Seeing both charts together allows you to track both the process level and process variation at the same time,
providing greater sensitivity that can help detect the presence of Special Causes.

(I-MR Chart: Individuals plot with Xbar = 219.89, UCL = 226.12,
LCL = 213.67; Moving Range plot with MRbar = 2.341, UCL = 7.649,
LCL = 0; about 110 observations.)

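The limits on an I-MR chart like the one above come from the average moving range. A minimal sketch of that textbook calculation (the function name and sample data are illustrative, not MINITAB's code):

```python
import numpy as np

def imr_limits(x):
    """I-MR chart limits for individual measurements (moving range of 2).
    Uses the standard constants 2.66 (= 3/d2, d2 = 1.128) and 3.267 (= D4)."""
    x = np.asarray(x, float)
    mr = np.abs(np.diff(x))                 # moving ranges between neighbors
    mr_bar, x_bar = mr.mean(), x.mean()
    return {
        "I":  (x_bar - 2.66 * mr_bar, x_bar, x_bar + 2.66 * mr_bar),
        "MR": (0.0, mr_bar, 3.267 * mr_bar),   # LCL of an MR chart is 0
    }
```

Each tuple is (LCL, center line, UCL) for the corresponding chart.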
864
SPC Overview: Xbar-R Chart

If each of your observations consists of a subgroup of data, rather than just individual measurements, an
Xbar-R chart provides greater sensitivity. Failure to form rational subgroups correctly will make your
Xbar-R charts dangerously wrong.

(Xbar-R Chart: Sample Mean plot with grand mean = 221.13, UCL = 225.76,
LCL = 216.50; Sample Range plot with Rbar = 8.03, UCL = 16.97, LCL = 0;
about 24 subgroups.)

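The Xbar-R limits shown above follow the usual constant-based formulas. The constants table is standard, but the helper itself is invented for illustration (a sketch, not MINITAB output):

```python
import numpy as np

# Shewhart constants (A2, D3, D4) for common subgroup sizes
CONSTANTS = {2: (1.880, 0.0, 3.267), 3: (1.023, 0.0, 2.574),
             4: (0.729, 0.0, 2.282), 5: (0.577, 0.0, 2.114)}

def xbar_r_limits(subgroups):
    """Xbar-R limits from equal-size subgroups:
    grand mean +/- A2*Rbar, and D3*Rbar .. D4*Rbar for the Range chart."""
    data = np.asarray(subgroups, float)
    a2, d3, d4 = CONSTANTS[data.shape[1]]
    xbar_bar = data.mean(axis=1).mean()            # grand mean
    r_bar = (data.max(axis=1) - data.min(axis=1)).mean()
    return {"Xbar": (xbar_bar - a2 * r_bar, xbar_bar, xbar_bar + a2 * r_bar),
            "R": (d3 * r_bar, r_bar, d4 * r_bar)}
```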
865
SPC Overview: U Chart

• C Charts and U Charts are for tracking defects.


• A U Chart can do everything a C Chart can, so we’ll just learn how to do a U Chart. This
chart counts flaws or errors (defects). One “search area” can have more than one flaw or
error.
• Search area (unit) can be practically anything we wish to define. We can look for
typographical errors per page, the number of paint blemishes on a truck door or the number
of bricks a mason drops in a workday.
• You supply the number of defects on each unit inspected.

(U Chart of Defects: Ubar = 0.0546, UCL = 0.1241, LCL = 0; about 20
samples, with points flagged beyond the upper limit.)

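U-chart limits can be computed directly from the defect counts and unit counts you supply. This illustrative helper (not MINITAB's algorithm) applies the standard formula, with per-sample limits because the sample size may vary:

```python
import numpy as np

def u_chart_limits(defects, units):
    """Per-sample U-chart limits: Ubar +/- 3*sqrt(Ubar/n_i),
    with the LCL floored at 0."""
    defects, units = np.asarray(defects, float), np.asarray(units, float)
    u_bar = defects.sum() / units.sum()       # center line: defects per unit
    half = 3 * np.sqrt(u_bar / units)
    return u_bar, np.maximum(u_bar - half, 0.0), u_bar + half
```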
866
SPC Overview: P Chart

• NP Charts and P Charts are for tracking defectives.


• A P Chart can do everything an NP Chart can, so we’ll just learn how to do a P Chart!
• Used for tracking defectives – the item is either good or bad, pass or fail, accept or reject.
• Center Line is the proportion of “rejects” and is also your Process Capability.
• Input to the P Chart is a series of integers — number bad, number rejected. In addition,
you must supply the sample size.

(P Chart of Errors: Pbar = 0.2038, UCL = 0.2802, LCL = 0.1274; about
20 samples.)

867
SPC Overview: Control Methods/Effectiveness

Type 1 Corrective Action = Countermeasure: improvement made to the process which will eliminate
the error condition from occurring. The defect will never be created. This is also referred to as a long-
term corrective action in the form of mistake proofing or design changes.

Type 2 Corrective Action = Flag: improvement made to the process which will detect when the error
condition has occurred. This flag will shut down the equipment so that the defect will not move
forward.

SPC on X’s or Y’s with fully trained operators and staff who respect the rules. Once a chart signals a
problem everyone understands the rules of SPC and agrees to shut down for Special Cause
identification. (Cpk > certain level).

Type 3 Corrective Action = Inspection: implementation of a short-term containment which is likely


to detect the defect caused by the error condition. Containments are typically audits or 100%
inspection.

SPC on X’s or Y’s with fully trained operators. The operators have been trained and understand the
rules of SPC, but management will not empower them to stop for investigation.

S.O.P. is implemented to attempt to detect the defects. This action is not sustainable short-term or long-
term.

SPC on X’s or Y’s without proper usage = WALL PAPER.

868
Purpose of Statistical Process Control

Every process has Causes of Variation known as:


– Common Cause: Natural variability
– Special Cause: Unnatural variability
• Assignable: Reason for detected Variability
• Pattern Change: Presence of trend or unusual pattern

SPC is a basic tool to monitor and improve variation in a process.

SPC is used to detect Special Cause variation telling us the process is “out of control”
but does NOT tell us why.

SPC gives a glimpse of ongoing process capability AND is a visual management tool.

869
Elements of Control Charts

Developed by Dr. Walter A. Shewhart of Bell Laboratories, beginning in 1924


Graphical and visual plot of changes in the data over time
– This is necessary for visual management of your process.
Control Charts were designed as a methodology for indicating change in performance, either
variation or Mean/Median.
Charts have a Central Line and Control Limits to detect Special Cause variation.

(Control Chart of Recycle: individual values plotted over time with the
process center (usually the Mean) at 29.06 and Control Limits at
UCL = 55.24 and LCL = 2.87; one point above the UCL is flagged as
detected Special Cause variation.)
870
Understanding the Power of SPC

Control Charts indicate when a process is “out of control” or exhibiting Special Cause variation but NOT why!

SPC Charts incorporate upper and lower Control Limits.


– The limits are typically +/- 3 σ from the Center Line.
– These limits represent 99.73% of natural variability for Normal Distributions.

SPC Charts allow workers and supervision to maintain improved process performance from Six Sigma projects.

Use of SPC Charts can be applied with all processes.


– Services, manufacturing, and retail are just a few industries with SPC applications.
– Caution must be taken with use of SPC for Non-normal processes.

Control Limits describe the process variability and are unrelated to customer specifications. (Voice of the Process
instead of Voice of the Customer)
– An undesirable situation is having Control Limits wider than customer specification limits. This will exist
for poorly performing processes with a Cp less than 1.0

Many SPC Charts exist and selection must be appropriate for effectiveness.
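The Cp comparison in the bullet above can be made concrete. This one-liner is illustrative, using the usual definition of Cp:

```python
def cp(usl, lsl, sigma):
    """Cp = (USL - LSL) / (6 * sigma). A Cp below 1.0 means the process
    spread, and so the +/- 3 sigma control limits, is wider than the
    customer specification window: the undesirable case noted above."""
    return (usl - lsl) / (6.0 * sigma)
```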

871
The Control Chart Cookbook

General Steps for Constructing Control Charts


1. Select characteristic (critical “X” or CTQ) to be charted.
2. Determine the purpose of the chart.
3. Select data-collection points.
4. Establish the basis for sub-grouping (only for Y’s).
5. Select the type of Control Chart.
6. Determine the measurement method/criteria.
7. Establish the sampling interval/frequency.
8. Determine the sample size.
9. Establish the basis of calculating the Control Limits.
10. Set up the forms or software for charting data.
11. Set up the forms or software for collecting data.
12. Prepare written instructions for all phases.
13. Conduct the necessary training.

872
Focus of Six Sigma and the Use of SPC

Y = F(x)
To get results, should we focus our behavior on the Y or the X?

Y            X1 . . . XN
Dependent    Independent
Output       Input
Effect       Cause
Symptom      Problem
Monitor      Control

If we find the “vital few” X’s, first consider using SPC
on the X’s to achieve a desired Y.
873
Control Chart Anatomy

(Diagram: a run chart of data points centered on the Mean, with the
Upper and Lower Control Limits drawn at +/- 3 sigma and the x-axis as
the process sequence/time scale. Points within the limits reflect
Common Cause variation – the process is “In Control”; points beyond
either limit reflect Special Cause variation – the process is “Out of
Control”.)

874
Control and Out of Control

(Diagram: a Normal curve overlaid on control chart zones – 68% of
points fall within +/- 1 sigma, 95% within +/- 2 sigma, and 99.7%
within +/- 3 sigma; points beyond +/- 3 sigma are outliers.)

875
Size of Subgroups

Typical subgroup sizes are 3-12 for variable data:


– If gathering samples is difficult or testing is expensive, the subgroup size, n, is smaller
– 3, 5, and 10 are the most common size of subgroups because of ease of calculations
when SPC is done without computers.
Size of subgroups aid in detection of shifts of Mean indicating Special Cause exists. The larger
the subgroup size, the greater chance of detecting a Special Cause. Subgroup size for Attribute
Data is often 50 – 200.
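The effect of subgroup size on shift detection can be quantified: the Mean of a subgroup of n has standard deviation sigma/sqrt(n), so the 3-sigma limit of an Xbar chart corresponds to a raw shift of 3*sigma/sqrt(n). A small illustration (the helper is invented for this example):

```python
import math

def detectable_shift(sigma, n):
    """Mean shift (in raw units) that lands exactly on an Xbar chart's
    3-sigma limit: 3 * sigma / sqrt(n). Larger subgroups push this
    threshold down, so smaller shifts become detectable."""
    return 3 * sigma / math.sqrt(n)
```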

(Diagram: Lots 1 through 5 sampled as separate short-term studies,
combined into one long-term study.)

876
The Impact of Variation

(Diagram: three control charts. The first contains only the Natural
Process Variation as defined by subgroup selection; its spread sets
the Control Limits (UCL/LCL). The second adds a source of variation –
Different Operators. The third adds two more – Different Operators and
Supplier Source.)

First, select the spread that we will declare as the “Natural Process
Variation”, so that whenever any point lands outside these “Control
Limits” an alarm will sound. So, when a second source of variation
appears, we will know! And, of course, if two additional sources of
variation arrive, we will detect that, too!

If you base your limits on all three sources of variation, what will sound the alarm?

877
Frequency of Sampling

Sampling Frequency is a balance between cost of sampling and testing versus cost of not
detecting shifts in Mean or variation.

Process knowledge is an input to frequency of samples after the subgroup size has been
decided.
– If a process shifts but cannot be detected because of too infrequent sampling, the
customer suffers
– If choice is given of large subgroup samples infrequently or smaller subgroups
more frequently, most choose to get information more frequently.
– In some processes, with automated sampling and testing, frequent sampling is
easy.

If undecided as to sample frequency, sample more frequently to confirm detection of


process shifts and reduce frequency if process variation is still detectable.

A rule of thumb also states “sample a process at least 10X more frequent than the
frequency of ‘out of control’ conditions”.

878
Frequency of Sampling

Sampling too little will not allow for sufficient detection of shifts in the process because
of Special Causes.
(Four views of the same process output: all possible samples, and I
Charts of the data sampled every half hour (Sample_3: Xbar = 6.1,
UCL = 7.385, LCL = 4.815), every hour (Sample_6: Xbar = 6.129,
UCL = 8.168, LCL = 4.090), and 4x per shift (Sample_12: Xbar = 5.85,
UCL = 6.559, LCL = 5.141). The less frequent the sampling, the less of
the process behavior the chart can show.)

879
SPC Selection Process

Choose the appropriate Control Chart:

CONTINUOUS data – choose by subgroup size:
– Sample size 1: I-MR Chart (Individuals & Moving Range)
– Sample size 2-5: X-R Chart (Mean & Range)
– Sample size 10+: X-S Chart (Mean & Std. Dev.)

ATTRIBUTE data – choose by type of attribute data and subgroup:
– DEFECTS, constant subgroup size: C Chart (Number of Incidences)
– DEFECTS, variable subgroup size: U Chart (Incidences per Unit)
– DEFECTIVES, constant subgroup size: NP Chart (Number of Defectives)
– DEFECTIVES, variable subgroup size: P Chart (Proportion Defectives)

SPECIAL CASES:
– CumSum Chart (Cumulative Sum)
– EWMA Chart (Exponentially Weighted Moving Average)
880
Understanding Variable Control Chart Selection

Type of Chart: When do you need it?

Most common:
• Average & Range or S (Xbar and R, or Xbar and S): Production is
higher volume; allows process Mean and variability to be viewed and
assessed together; more sampling than with Individuals (I) and Moving
Range (MR) Charts, but used when subgroups are desired. Outliers can
cause issues with Range (R) charts, so Standard Deviation (S) charts
are used instead if this is a concern.
• Individual and Moving Range: Production is low volume, or cycle time
to build product is long, or a homogeneous sample represents the
entire product (batch etc.); sampling and testing is costly, so
subgroups are not desired. Control limits are wider than on Xbar
Charts. Used for SPC on most inputs.

Less common:
• Pre-Control: Set-up is critical, or the cost of setup scrap is high.
Use for outputs.
• Exponentially Weighted Moving Average (EWMA): A small shift needs to
be detected, often because of autocorrelation of the output results.
Used only for individuals or averages of outputs. Infrequently used
because of calculation complexity.
• Cumulative Sum (CumSum): Same reasons as EWMA, except past data is
as important as present data.

881
Understanding Attribute Control Chart Selection

Type of Chart: When do you need it?

• P: To track the fraction of defective units; sample size is variable
and usually > 50.
• nP: To track the number of defective units per subgroup; sample size
is usually constant and usually > 50.
• C: To track the number of defects per subgroup of units produced;
sample size is constant.
• U: To track the number of defects per unit; sample size is variable.

882
Detection of Assignable Causes or Patterns

Control Charts indicate Special Causes being either assignable causes or patterns.

The following rules are applicable for both variable and Attribute Data to detect Special Causes.

These four rules are the only applicable tests for Range (R), Moving Range (MR) or Standard Deviation (S) charts.
– One point more than 3 Standard Deviations from the Center Line.
– 6 points in a row all either increasing or all decreasing.
– 14 points in a row alternating up and down.
– 9 points in a row on the same side of the center line.

These remaining four rules are only for variable data to detect Special Causes.
– 2 out of 3 points greater than 2 Standard Deviations from the Center Line on the same side.
– 4 out of 5 points greater than 1 Standard Deviation from the Center Line on the same side.
– 15 points in a row all within one Standard Deviation of either side of the Center Line.
– 8 points in a row all greater than one Standard Deviation of either side of the Center Line.
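Three of the rules above are simple to automate. This sketch checks them against a series of points (rule numbering and defaults vary between tools such as MINITABTM, so treat it as illustrative):

```python
import numpy as np

def special_cause_signals(points, center, sigma):
    """Check three Special Cause detection rules on a series of points."""
    x = np.asarray(points, float)
    z = (x - center) / sigma
    side = np.sign(z)
    diffs = np.sign(np.diff(x))
    return {
        # one point more than 3 Standard Deviations from the Center Line
        "beyond_3_sigma": bool(np.any(np.abs(z) > 3)),
        # nine points in a row on the same side of the Center Line
        "nine_same_side": any(abs(side[i:i + 9].sum()) == 9
                              for i in range(len(x) - 8)),
        # six points in a row, all increasing or all decreasing
        "six_trending": any(abs(diffs[i:i + 5].sum()) == 5
                            for i in range(max(len(diffs) - 4, 0))),
    }
```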

883
Recommended Special Cause Detection Rules

• If implementing SPC manually without software initially, the most visually obvious violations are more
easily detected. SPC on manually filled charts are common place for initial use of defect prevention
techniques.
• These 3 rules are visually the most easily detected by personnel.
– One point more than 3 Standard Deviations from the Center Line.
– 6 points in a row all either increasing or all decreasing.
– 15 points in a row all within one Standard Deviation of either side of the Center Line.
• Dr. Shewhart, who worked with the Western Electric Co., is credited with the following 4 rules,
referred to as the Western Electric Rules.
– One point more than 3 Standard Deviations from the Center Line.
– 8 points in a row on the same side of the Center Line.
– 2 out of 3 points greater than 2 Standard Deviations from the Center Line on the same side.
– 4 out of 5 points greater than 1 Standard Deviation from the Center Line on the same side.
• You might notice the Western Electric rules vary slightly. The importance is to be consistent in your
organization and decide what rules you will use to detect Special Causes.
• VERY few organizations use all 8 rules for detecting Special Causes.

884
Special Cause Rule Default in MINITABTM

If a Belt is using MINITABTM, be aware of the default settings for the
rules. You can alter your program defaults with:
Tools>Options>Control Charts and Quality Tools> Tests

Many experts have commented on the appropriate tests and numbers to be used.
Decide, then be consistent when implementing.
885
Special Cause Test Examples

This is the MOST common Special Cause test used in SPC charts.

Test 1: One point beyond Zone A

[Control Chart graphic omitted: zones A, B, C mirrored about the Center Line, with the violating point flagged "1"]

886
Special Cause Test Examples

This test is an indication of a shift in the process Mean.

Test 2: Nine points in a row on the same side of the Center Line

[Control Chart graphic omitted]

887
Special Cause Test Examples

This test is indicating a trend or gradual shift in the Mean.

Test 3: Six points in a row, all increasing or all decreasing

[Control Chart graphic omitted]

888
Special Cause Test Examples

This test is indicating a non-random pattern.

Test 4: Fourteen points in a row, alternating up and down

[Control Chart graphic omitted]

889
Special Cause Test Examples

This test is indicating a shift in the Mean or a worsening of variation.

Test 5: Two out of three points in a row in Zone A (one side of the Center Line)

[Control Chart graphic omitted]

890
Special Cause Test Examples

This test is indicating a shift in the Mean or degradation of variation.

Test 6: Four out of five points in Zone B or beyond (one side of the Center Line)

[Control Chart graphic omitted]

891
Special Cause Test Examples

This test is indicating a dramatic improvement of the variation in the process.

Test 7: Fifteen points in a row in Zone C (both sides of the Center Line)

[Control Chart graphic omitted]

892
Special Cause Test Examples

This test is indicating a severe worsening of variation.

Test 8: Eight points in a row beyond Zone C (both sides of the Center Line)

[Control Chart graphic omitted]

893
SPC Center Line and Control Limit Calculations

Calculate the parameters of the Individual and MR Control Charts with the following:

Center Line:
  Xbar = (Σ Xi) / k
  MRbar = (Σ MRi) / (k – 1)

Control Limits:
  UCLx = Xbar + E2·MRbar        UCLMR = D4·MRbar
  LCLx = Xbar – E2·MRbar        LCLMR = D3·MRbar

Where:
Xbar: Average of the individuals; becomes the Center Line on the Individuals Chart
Xi: Individual data points
k: Number of individual data points
MRi: Moving range between individuals, generally calculated as the difference between each
successive pair of readings
MRbar: The average moving range; the Center Line on the Range Chart
UCLx: Upper Control Limit on the Individuals Chart
LCLx: Lower Control Limit on the Individuals Chart
UCLMR: Upper Control Limit on the Moving Range Chart
LCLMR: Lower Control Limit on the Moving Range Chart (does not apply for sample sizes below 7, where D3 = 0)
E2, D3, D4: Constants that vary according to the sample size used in obtaining the moving range
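A minimal sketch of these calculations, assuming a moving range of span 2 so that the standard published constants E2 = 2.660, D3 = 0 and D4 = 3.267 apply:

```python
def imr_limits(data):
    """Illustrative Individuals/Moving Range limits (not a library API)."""
    k = len(data)
    xbar = sum(data) / k
    # Moving ranges: absolute difference between successive readings.
    mrs = [abs(b - a) for a, b in zip(data, data[1:])]
    mrbar = sum(mrs) / len(mrs)
    # Constants for a moving range of span n = 2.
    E2, D3, D4 = 2.660, 0.0, 3.267
    return {
        "CL_x": xbar,
        "UCL_x": xbar + E2 * mrbar,
        "LCL_x": xbar - E2 * mrbar,
        "CL_mr": mrbar,
        "UCL_mr": D4 * mrbar,
        "LCL_mr": D3 * mrbar,  # zero for span 2, so no lower MR limit
    }
```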

894
SPC Center Line and Control Limit Calculations

Calculate the parameters of the Xbar and R Control Charts with the following:

Center Line:
  Xdoublebar = (Σ Xbari) / k
  Rbar = (Σ Ri) / k

Control Limits:
  UCLx = Xdoublebar + A2·Rbar        UCLR = D4·Rbar
  LCLx = Xdoublebar – A2·Rbar        LCLR = D3·Rbar

Where:
Xdoublebar: Average of the subgroup averages; becomes the Center Line of the Control Chart
Xbari: Average of each subgroup
k: Number of subgroups
Ri: Range of each subgroup (maximum observation – minimum observation)
Rbar: The average range of the subgroups; the Center Line on the Range Chart
UCLx: Upper Control Limit on the Averages Chart
LCLx: Lower Control Limit on the Averages Chart
UCLR: Upper Control Limit on the Range Chart
LCLR: Lower Control Limit on the Range Chart
A2, D3, D4: Constants that vary according to the subgroup sample size
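A sketch of the Xbar-R calculation, assuming subgroups of size n = 5 so that the standard published constants A2 = 0.577, D3 = 0 and D4 = 2.114 apply:

```python
def xbar_r_limits(subgroups):
    """Illustrative Xbar-R limits for subgroups of size 5 (not a library API)."""
    k = len(subgroups)
    xbars = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    xdoublebar = sum(xbars) / k
    rbar = sum(ranges) / k
    # Constants for subgroup size n = 5.
    A2, D3, D4 = 0.577, 0.0, 2.114
    return {
        "CL_x": xdoublebar,
        "UCL_x": xdoublebar + A2 * rbar,
        "LCL_x": xdoublebar - A2 * rbar,
        "CL_r": rbar,
        "UCL_r": D4 * rbar,
        "LCL_r": D3 * rbar,
    }
```

With Xdoublebar = 26.33 and Rbar = 11.69 (the drill-press example later in this section), these constants reproduce UCLx = 26.33 + 0.577 × 11.69 ≈ 33.07 and UCLR = 2.114 × 11.69 ≈ 24.7.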
895
SPC Center Line and Control Limit Calculations

Calculate the parameters of the Xbar and S Control Charts with the following:

Center Line:
  Xdoublebar = (Σ Xbari) / k
  Sbar = (Σ si) / k

Control Limits:
  UCLx = Xdoublebar + A3·Sbar        UCLS = B4·Sbar
  LCLx = Xdoublebar – A3·Sbar        LCLS = B3·Sbar

Where:
Xdoublebar: Average of the subgroup averages; becomes the Center Line of the Control Chart
Xbari: Average of each subgroup
k: Number of subgroups
si: Standard Deviation of each subgroup
Sbar: The average Standard Deviation of the subgroups; the Center Line on the S Chart
UCLx: Upper Control Limit on the Averages Chart
LCLx: Lower Control Limit on the Averages Chart
UCLS: Upper Control Limit on the S Chart
LCLS: Lower Control Limit on the S Chart
A3, B3, B4: Constants that vary according to the subgroup sample size
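The Xbar-S version follows the same pattern; this sketch assumes subgroups of size n = 5 so that the standard published constants A3 = 1.427, B3 = 0 and B4 = 2.089 apply, with each subgroup's Standard Deviation computed in the sample (n − 1) form.

```python
import statistics

def xbar_s_limits(subgroups):
    """Illustrative Xbar-S limits for subgroups of size 5 (not a library API)."""
    k = len(subgroups)
    xdoublebar = sum(sum(g) / len(g) for g in subgroups) / k
    sbar = sum(statistics.stdev(g) for g in subgroups) / k  # sample (n-1) S.D.
    # Constants for subgroup size n = 5.
    A3, B3, B4 = 1.427, 0.0, 2.089
    return {
        "CL_x": xdoublebar,
        "UCL_x": xdoublebar + A3 * sbar,
        "LCL_x": xdoublebar - A3 * sbar,
        "CL_s": sbar,
        "UCL_s": B4 * sbar,
        "LCL_s": B3 * sbar,
    }
```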

896
SPC Center Line and Control Limit Calculations

Calculate the parameters of the P Control Charts with the following:

Center Line:
  pbar = (Total number of defective items) / (Total number of items inspected)

Control Limits:
  UCLp = pbar + 3·√( pbar(1 – pbar) / ni )
  LCLp = pbar – 3·√( pbar(1 – pbar) / ni )

Where:
pbar: Average proportion defective (0.0 – 1.0)
ni: Number inspected in each subgroup
UCLp: Upper Control Limit on the P Chart
LCLp: Lower Control Limit on the P Chart

Since the Control Limits are a function of sample size, they will vary for each sample.
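A sketch of the P chart limits, computing a separate limit pair for each subgroup size and flooring the LCL at zero (a proportion cannot be negative):

```python
import math

def p_chart_limits(defectives, sample_sizes):
    """Illustrative p-chart limits; one (LCL, UCL) pair per subgroup."""
    pbar = sum(defectives) / sum(sample_sizes)
    limits = []
    for n in sample_sizes:
        half = 3 * math.sqrt(pbar * (1 - pbar) / n)
        limits.append((max(0.0, pbar - half), pbar + half))
    return pbar, limits
```

Plugging pbar ≈ 0.2038 with ni = 250 into the formula (the paycheck example later in this section) gives limits of roughly 0.280 and 0.127.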
897
SPC Center Line and Control Limit Calculations

Calculate the parameters of the nP Control Charts with the following:

Center Line:
  npbar = (Total number of defective items) / (Total number of subgroups)

Control Limits:
  UCLnp = ni·pbar + 3·√( ni·pbar(1 – pbar) )
  LCLnp = ni·pbar – 3·√( ni·pbar(1 – pbar) )

Where:
npbar: Average number of defective items per subgroup
ni: Number inspected in each subgroup
UCLnp: Upper Control Limit on the nP Chart
LCLnp: Lower Control Limit on the nP Chart

Since the Control Limits AND Center Line are a function of sample size, they will vary for each sample.
898
SPC Center Line and Control Limit Calculations

Calculate the parameters of the U Control Charts with the following:

Center Line:
  ubar = (Total number of defects identified) / (Total number of units inspected)

Control Limits:
  UCLu = ubar + 3·√( ubar / ni )
  LCLu = ubar – 3·√( ubar / ni )

Where:
ubar: Total number of defects divided by the total number of units inspected
ni: Number inspected in each subgroup
UCLu: Upper Control Limit on the U Chart
LCLu: Lower Control Limit on the U Chart

Since the Control Limits are a function of sample size, they will vary for each sample.
899
SPC Center Line and Control Limit Calculations

Calculate the parameters of the C Control Charts with the following:

Center Line:
  cbar = (Total number of defects) / (Total number of subgroups)

Control Limits:
  UCLc = cbar + 3·√cbar
  LCLc = cbar – 3·√cbar

Where:
cbar: Total number of defects divided by the total number of subgroups
UCLc: Upper Control Limit on the C Chart
LCLc: Lower Control Limit on the C Chart
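The C chart is the simplest of the attribute calculations; a sketch, again flooring the LCL at zero since a defect count cannot be negative:

```python
import math

def c_chart_limits(defect_counts):
    """Illustrative c-chart Center Line and limits (not a library API)."""
    cbar = sum(defect_counts) / len(defect_counts)
    ucl = cbar + 3 * math.sqrt(cbar)
    lcl = max(0.0, cbar - 3 * math.sqrt(cbar))  # floor at zero
    return cbar, ucl, lcl
```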

900
SPC Center Line and Control Limit Calculations

Calculate the parameters of the EWMA Control Charts with the following:

Center Line (EWMA statistic):
  Zt = λ·Xt + (1 – λ)·Zt-1

Control Limits:
  UCL = Xbar + 3·(σ/√n)·√( (λ/(2 – λ))·[1 – (1 – λ)^(2t)] )
  LCL = Xbar – 3·(σ/√n)·√( (λ/(2 – λ))·[1 – (1 – λ)^(2t)] )

Where:
Zt: EWMA statistic plotted on the Control Chart at time t
Zt-1: EWMA statistic plotted on the Control Chart at time t-1
λ: The weighting factor between 0 and 1 – suggest using 0.2
σ: Standard Deviation of the historical data (pooled Standard Deviation for subgroups;
MRbar/d2 for individual observations)
Xt: Individual data point or sample average at time t
UCL: Upper Control Limit on the EWMA Chart
LCL: Lower Control Limit on the EWMA Chart
n: Subgroup sample size
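A sketch of the EWMA recursion and its widening limits, following the formulas above (`lam` stands in for λ, since `lambda` is reserved in Python; the starting value Z0 is taken as the process mean):

```python
def ewma(data, xbarbar, sigma, n=1, lam=0.2):
    """Illustrative EWMA statistic with time-varying limits.

    Returns a list of (Z_t, LCL_t, UCL_t) tuples, one per observation.
    """
    z, out = xbarbar, []
    for t, x in enumerate(data, start=1):
        z = lam * x + (1 - lam) * z  # Zt = lam*Xt + (1-lam)*Zt-1
        width = 3 * (sigma / n ** 0.5) * (
            (lam / (2 - lam)) * (1 - (1 - lam) ** (2 * t))) ** 0.5
        out.append((z, xbarbar - width, xbarbar + width))
    return out
```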
901
SPC Center Line and Control Limit Calculations

Calculate the parameters of the CUSUM Control Charts with MINITABTM or
another program, since the calculations are even more complicated than
those for the EWMA charts.

Because of this complexity, neither this chart nor the EWMA is executed
without automation and computer assistance.

Ah, anybody got a laptop?

902
Pre-Control Charts

Pre-Control Charts use limits relative to the specification limits. This is the first and ONLY
chart where you will see specification limits plotted for Statistical Process Control. This is the
most basic and least sophisticated form of process control.

Red Zones: the zones outside the specification limits. A point here signals the process is
out of control and should be stopped.

Yellow Zones: the zones between the PC Lines and the specification limits; they indicate
caution and the need to watch the process closely.

Green Zone: the zone between the PC Lines; a point here signals the process is in control.

[Chart graphic omitted: the tolerance scaled 0.0 to 1.0, with LSL at 0.0, Target at 0.5 and USL at 1.0; zones Red | Yellow | Green | Yellow | Red, with PC Lines at 0.25 and 0.75]
903
Process Setup and Restart with Pre-Control

Qualifying Process
• To qualify a process, five consecutive parts must fall within the green zone
• The process should be qualified after tool changes, adjustments, new operators,
material changes, etc.

Monitoring Ongoing Process


• Sample two consecutive parts at predetermined frequency
– If either part is in the red, stop production and find reason for variation
– When one part falls in the yellow zone inspect the other and
• If the second part falls in the green zone then continue
• If the second part falls in the yellow zone on the same side, make an adjustment to
the process
• If second part falls in the yellow zone on the opposite side or in the red zone, the
process is out of control and should be stopped
– If any part falls outside the specification limits or in the red zone, the process is out of
control and should be stopped
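The monitoring rules above can be sketched as a decision function; the zone labels (`"green"`, `"yellow_low"`, `"yellow_high"`, `"red"`) are hypothetical names chosen for this illustration.

```python
def precontrol_decision(zone1, zone2):
    """Decide on a pair of consecutive sampled parts per the rules above.

    Zones: "green", "yellow_low", "yellow_high", "red" (illustrative labels).
    Returns "continue", "adjust", or "stop".
    """
    if zone1 == "red" or zone2 == "red":
        return "stop"  # either part in the red: stop and find the cause
    yellows = [z for z in (zone1, zone2) if z.startswith("yellow")]
    if len(yellows) == 2:
        # Same-side yellows call for an adjustment; opposite sides: stop.
        return "adjust" if zone1 == zone2 else "stop"
    return "continue"  # both green, or one yellow with the other green
```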

904
Responding to Out of Control Indications

• The power of SPC is not in finding the Center Line and Control Limits.
• The power is in reacting to Out of Control (OOC) indications with your Out of Control Action Plan
(OCAP) for the process involved. These actions are your corrective actions to return the output or
input to proper conditions.

[Individual SPC chart for Response Time omitted: UCL = 39.76, X̄ = 18.38, LCL = –3.01; one point above the UCL is flagged "1" – VIOLATION: Special Cause is indicated]

OCAP: If response time is too high, get an additional person on the phone bank.

• SPC requires immediate response to a Special Cause indication.


• SPC also requires no “sub-optimizing” by those operating the process.
– Variability will increase if operators adjust on every point that is not at the Center Line. ONLY
respond when an Out of Control condition or Special Cause is detected.
– Training is required to interpret the charts and to respond to the charts.

905
Attribute SPC Example

Practical Problem: A project has been launched to get rework reduced to less than
25% of paychecks. Rework includes contacting a manager about overtime hours to
be paid. The project made some progress but decides they need to implement SPC to
sustain the gains and track % defective. Please analyze the file “paycheck2.mtw” and
determine the Control Limits and Center Line.

Steps 3 and 5 of the methodology are the primary focus for this example.
– Select the appropriate Control Chart and Special Cause tests to employ
– Calculate the Center Line and Control Limits
– Looking at the data set, we see 20 weeks of data.
– The sample size is constant at 250.
– The number of defective units in each sample is in column C3.

Paycheck2.mtw

906
Attribute SPC Example (cont.)

The example includes % paychecks defective. The metric to be charted


is % defective. We see the P Chart is the most appropriate Attribute
SPC Chart.

907
Attribute SPC Example (cont.)

Notice specifications were never discussed. Let us calculate the Control
Limits and Center Line for this example.

We will confirm what rules for Special Causes are included in our Control
Chart analysis.

908
Attribute SPC Example (cont.)

Remember to click on the “Options…” and “Tests” tab to clarify the rules for
detecting Special Causes.
…. Chart Options>Tests

We will confirm what rules for Special Causes are included in our Control
Chart analysis. The top 3 were selected.

909
Attribute SPC Example (cont.)

No Special Causes were detected. The average % defective checks was
20.38%. The UCL was 28.0% and the LCL was 12.7%.

[P Chart of Empl_w_Errors omitted: UCL = 0.2802, P̄ = 0.2038, LCL = 0.1274; samples 1–19 in control]

Now we must see if the next few weeks are showing Special Cause from the
results. The sample size remained at 250 and the defective checks were 61, 64,
77.
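As a sketch (not MINITABTM output), the three new weekly proportions can be checked directly against the limits frozen from the first 20 weeks:

```python
# Limits frozen from the first 20 weeks (see the chart above).
ucl, lcl = 0.2802, 0.1274
defectives, n = [61, 64, 77], 250

# Flag each new weekly proportion that falls outside the frozen limits.
flags = []
for d in defectives:
    p = d / n
    flags.append(p > ucl or p < lcl)
```

Only the third week (77/250 = 30.8%) exceeds the UCL of 28.0%, which matches the single Special Cause flagged on the updated chart that follows.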

910
Attribute SPC Example (cont.)

Remember, we have calculated the Control Limits from the first 20 weeks. We must
now enter the 3 new weeks and NOT have MINITABTM calculate new Control Limits,
which it will do automatically if we do not follow this technique. We are executing
Steps 6-8
– Step 6: Plot process X or Y on the newly created Control Chart
– Step 7: Check for Out-Of-Control (OOC) conditions after each point
– Step 8: Interpret findings, investigate Special Cause variation, & make
improvements following the Out of Control Action Plan (OCAP)

Notice the new 3 weeks of data were entered into the worksheet.

911
Attribute SPC Example (cont.)

…… Chart Options>Parameters

Place the pbar from the first chart we created in the “Estimates” tab.
This will prevent MINITABTM from calculating new Control Limits,
which is Step 9.

The new updated SPC chart is shown with one Special Cause.

[P Chart of Empl_w_Errors omitted: UCL = 0.2802, P̄ = 0.2038, LCL = 0.1274; the last of the three new samples is flagged "1" above the UCL]

912
Attribute SPC example (cont.)

Because of the Special Cause, the process must refer to the OCAP or Out of Control Action Plan that states what
Root Causes need to be investigated and what actions are taken to get the process back in Control.

[P Chart of Empl_w_Errors omitted: frozen limits UCL = 0.2802, P̄ = 0.2038, LCL = 0.1274; the Special Cause point is flagged "1"]

After the corrective actions are taken, wait until the next sample is taken to see whether the Special Cause
indication has cleared.
– If still out of control, refer to the OCAP and take further action to improve the process. DO NOT make
any more changes if the process shows back in control after the next reading.
• Even if the next reading seems higher than the Center Line! Don’t cause more variability.

If process changes are documented after this project was closed, the Control Limits should be recalculated as in step
9 of the SPC methodology.

913
Variable SPC Example

Practical Problem: A job shop drills holes for its largest customer as a
final step to deliver a highly engineered fastener. This shop uses five drill
presses and gathers data every hour with one sample from each press
representing a subgroup. The data is gathered in columns C3-C7.

Steps 3 and 5 of the methodology are the primary focus for this example.
– Select the appropriate Control Chart and Special Cause tests to employ
– Calculate the Center Line and Control Limits

Holediameter.mtw

914
Variable SPC Example (cont.)

The example has Continuous Data gathered in subgroups, and we have no
interest in detecting small changes in this process output. The Xbar-R Chart
is selected; the Xbar-S Chart is not needed for this example.

915
Variable SPC Example (cont.)

Specifications were never discussed. Let us calculate the Control Limits and
Center Line for this example.

We will confirm what rules for Special Causes are included in our Control
Chart analysis.

916
Variable SPC Example (cont.)

Remember to click on the “Options…” and “Tests” tab to clarify the rules for
detecting Special Causes.
……..Xbar-R Chart Options>Tests

We will confirm what rules for Special Causes are included in our Control
Chart analysis. The top 2 of 3 were selected.
917
Variable SPC Example (cont.)

Also confirm the Rbar method is used for estimating Standard Deviation.
Stat>Control Charts>Variable Charts for Subgroups>Xbar-R>Xbar-R Chart Options>Estimate

918
Variable SPC Example (cont.)

No Special Causes were detected in the Xbar Chart. The average hole diameter was
26.33. The UCL was 33.1 and the LCL was 19.6.

[Xbar-R Chart of Part1, ..., Part5 omitted: Xbar chart with UCL = 33.07, Xdoublebar = 26.33, LCL = 19.59; Range chart with UCL = 24.72, Rbar = 11.69, LCL = 0]

Now we will use the Control Chart to monitor the next 2 hours and see if we are still
in control.
919
Variable SPC Example (cont.)

Remember, we have calculated the Control Limits from the initial data. We must
now enter 2 more hours of data and NOT have MINITABTM calculate new Control
Limits, which it will do automatically if we do not follow this step. We are executing
Steps 6-8
– Step 6: Plot process X or Y on the newly created Control Chart
– Step 7: Check for Out-Of-Control (OOC) conditions after each point
– Step 8: Interpret findings, investigate special cause variation, & make
improvements following the Out of Control Action Plan (OCAP)

Notice the new 2 hours of data were entered into the worksheet.

920
Variable SPC Example (cont.)

……..Xbar-R Chart Options>Parameters

Place the Mean from the FIRST chart we created in the “Estimates” tab. The
Standard Deviation is Rbar/d2. This will prevent MINITABTM from calculating
new Control Limits, which is Step 9. d2 is found in the table of constants shown
earlier.

The new updated SPC Chart is shown with no indicated Special Causes in the
Xbar Chart. The Mean, UCL and LCL are unchanged because of the option
completed above.

[Xbar-R Chart of Part1, ..., Part5 omitted: Xbar chart unchanged at UCL = 33.07, Xdoublebar = 26.33, LCL = 19.59; Range chart at UCL = 24.72, Rbar = 11.69, LCL = 0]

921
Variable SPC Example (cont.)

Because there are no Special Causes, the process does not refer to the OCAP (Out of Control Action
Plan) and NO actions are taken.

[Xbar-R Chart of Part1, ..., Part5 omitted: Xbar chart with UCL = 33.07, Xdoublebar = 26.33, LCL = 19.59; Range chart with UCL = 24.72, Rbar = 11.69, LCL = 0]

If process changes are documented after this project was closed, the Control Limits should be
recalculated as in Step 9 of the SPC methodology.
922
Recalculation of SPC Chart Limits

• Step 9 of the methodology refers to recalculating SPC limits.


• Processes should see improvement in variation after usage of SPC.
• Reduction in variation or known process shift should result in Center Line and
Control Limits recalculations.
– Statistical confidence of the changes can be confirmed with Hypothesis
Testing from the Analyze Phase.
• Consider a periodic time frame for checking Control Limits and Center Lines.
– 3, 6, 12 months are typical and dependent on resources and priorities
– A set frequency allows for process changes to be captured.
• Incentives to recalculate limits include avoiding false Special Cause detection with
poorly monitored processes.
• These recommendations are true for both Variable and Attribute Data.

923
SPC Chart Option in MINITABTM for σ Levels

Since many of the tests are based on the 1st and 2nd Standard Deviations from
the Center Line, some Belts prefer to have additional lines displayed. This is
possible with:

Stat>Quality Charts> ….. Options>S Limits tab

The extra lines can be helpful if users are using MINITABTM for the SPC.

924
Summary

At this point, you should be able to:

• Describe the elements of an SPC Chart and the purposes of SPC


• Understand how SPC ranks in Defect Prevention
• Describe the 13 step route or methodology of implementing a chart
• Design subgroups if needed for SPC usage
• Determine the frequency of sampling
• Understand the Control Chart selection methodology
• Be familiar with Control Chart parameter calculations such as UCL, LCL and the
Center Line

925
