Six Sigma Manual

Session 1
CONTENTS :
• Basic Statistics
• Measure of Central Tendency
• Measure of Spread
• Definition of Six Sigma
• Guidelines for Project Selection
BASIC STATISTICS
Statistics is the science of drawing information from numerical values. Such numerical values are
referred to as data. It deals with :
1) Collection of Data
2) Summarization of Data
3) Analysis of Data , and
4) Drawing valid inference from data
Data : It is a collection of any number of related observations, e.g. grades on student
exams, measures of athletic performance, the number of sales in a unit each month, the
ages of people living in a town. A collection of data is called a data set, and a single
observation is known as a data point.
Population : A population or universe consists of all members of a class or category of interest
or under consideration.
Sample : A sample is some portion or subset of a population.
Parameter : It is defined as the characteristic of the population.
Statistic : It is defined as the characteristic of the sample.
[Figure: a sample drawn from within a population. The characteristic of the population is a
parameter; the characteristic of the sample is a statistic.]
Sampling : A sample is selected, evaluated and studied in an effort to gain information about the
larger population from which the sample was drawn. Advantages of sampling are :
1) Cost : Samples can be studied at much lower cost .
2) Time : Samples can be evaluated more quickly than populations.
3) Accuracy : The larger the data set, the more opportunity there is for errors to occur. A
sample can provide a data set which is small enough to monitor carefully.
4) Feasibility : In some research situations, the population of interest is not available for study,
for example in the case of destructive testing.
5) Scope of Information : When evaluating a smaller group , it is sometimes possible to gather
more extensive information on each unit evaluated.
Measure of Central Tendency
Most sets of numerical data show a distinct tendency to group or cluster about a certain central
point. This property exhibited by data is known as central tendency. The most common
measures of central tendency are :
1) Mean : Also known as the arithmetic mean, the mean is typically what is meant by the word
average. The mean of a variable is given by (the sum of all values)/(number of values).
Despite its popularity, the mean may not be an appropriate measure of central tendency.
This is generally the case for populations having outliers or extreme values.
2) Median : The median is the 50th percentile of a distribution. To find the median of a number of
values, first arrange them in ascending or descending order, then find the observation in the
middle. The median of 5, 2, 7, 9 and 4 is 5. (Note that if there is an even number of values,
one takes the average of the middle two; the median of 4, 6, 8 and 10 is 7.) The median is
more appropriate than the mean in populations with large outliers.
3) Mode : It is the most common value in the distribution, i.e. the value occurring the maximum
number of times (having the highest frequency).
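The three measures above can be checked with a short Python sketch. It uses the standard library `statistics` module; the list `with_outlier` is a hypothetical data set added only to illustrate how an extreme value moves the mean but barely moves the median.

```python
from statistics import mean, median, multimode

values = [5, 2, 7, 9, 4]            # odd number of values
even_values = [4, 6, 8, 10]         # even number of values
with_outlier = values + [400]       # hypothetical extreme value, for illustration only

mean_values = mean(values)          # sum of all values / number of values
median_values = median(values)      # middle observation after sorting
median_even = median(even_values)   # average of the middle two observations
modes = multimode([1, 2, 2, 3, 3])  # every value tied for the highest frequency

# The outlier pulls the mean far away from the bulk of the data,
# while the median stays close to it.
mean_outlier = mean(with_outlier)
median_outlier = median(with_outlier)

print(mean_values, median_values, median_even, modes)
print(mean_outlier, median_outlier)
```

Comparing `mean_outlier` with `median_outlier` shows why the median is preferred when outliers are present.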
Measure of Spread
Spread refers to the statistical fluctuation displayed by the variables in a given data set. Statistical
fluctuation refers to the extent to which variables take on different values :
1) Range : Range is defined as the maximum value minus the minimum value.
2) Variance : Variance is a measure of how spread out a distribution is. The formula for the
variance is :
$s^2 = \frac{\sum (X - m)^2}{n - 1}$
If the sample size is > 30, n-1 is replaced by n.
Where m is the mean and n is the number of scores or population size.
3) Standard Deviation : Standard Deviation also indicates the spread in the population. It is the
square root of the variance. Standard Deviation is denoted by s.
Example : Calculate the standard deviation and variance for the given data : 1, 2, 3, 4, 5
Solution : m = (1+2+3+4+5)/5 = 15/5 = 3
X        X-m      (X-m)^2
1        -2       4
2        -1       1
3         0       0
4         1       1
5         2       4
Total             10
Sum of (X-m)^2 = 10
Variance = Sum of (X-m)^2 / (n-1) = 10/4 = 2.5
Standard Deviation = sqrt(2.5) = 1.581
Range for the above data = 5-1 = 4
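The worked example can be reproduced with the Python standard library, as a minimal sketch. `variance` and `stdev` divide by n-1 (the form used in the example), while `pvariance` divides by n, which is the form the manual switches to for larger data sets.

```python
from statistics import mean, pvariance, stdev, variance

data = [1, 2, 3, 4, 5]

m = mean(data)                          # 15 / 5
squared_deviations = [(x - m) ** 2 for x in data]
sum_of_squares = sum(squared_deviations)

sample_variance = variance(data)        # sum of squares / (n - 1)
population_variance = pvariance(data)   # sum of squares / n
sample_sd = stdev(data)                 # square root of the sample variance
data_range = max(data) - min(data)      # maximum minus minimum

print(sum_of_squares, sample_variance, round(sample_sd, 3), data_range)
```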
Definition of Six Sigma
The goal of Six Sigma is to increase profits by eliminating the variability, defects and waste that
undermine customer loyalty. It is a disciplined, data driven approach and methodology for
eliminating defects. Six Sigma can be understood at three levels :
1) Metric : The statistical representation of Six Sigma describes quantitatively how a process
is performing. To achieve Six Sigma a process must not produce more than 3.4
defects per million opportunities (DPMO) or 3.4 parts per million (PPM).
2) Methodology :
DMAIC : DMAIC refers to a data driven quality strategy for improving processes and is an
integral part of a company's Six Sigma initiative. It stands for Define, Measure, Analyze,
Improve, Control. It is a highly structured approach that includes both a set of tools and a road
map or sequence for applying those tools.
DMADV : The development of a new product or service is essentially a problem solving
process. Here we need to see what problems we can face because of the
existing design, and how effective the existing design is from the customer's point of view.
DMADV stands for Define, Measure, Analyze, Design and Validate.
The DMADV approach is applied in R&D Six Sigma and TQ Six Sigma projects.
3) Philosophy : Six Sigma tools and methods concentrate on reducing variability in the process .
Variation : A process can be defined as a series of operations performed to bring about a result.
Thus a process is itself made up of smaller sub processes. The result can be the
delivery of a service or the manufacturing of a product. Variation is the sum total of all
minute changes that occur every time a process is performed. Variation is always
present at some level. If something appears to be constant, we usually have just not
looked at it with a fine enough resolution. In delivering services or manufacturing
products, variation is our enemy. Consistency, and thus minimal variation, leads to
improved quality, reduced costs, higher profits and higher customer satisfaction.
Six Sigma believes in benchmarking against the best in the world.
TQ stands for Transactional Quality.

Meaning of Z value and Sigma
[Figure: two normal curves, each marked LSL, Target and USL. In the Six Sigma process the
limits lie 6σ from the target; in the three sigma process they lie 3σ from the target.]
In the case of a Six Sigma process the Z value or Sigma value is 6.
The Z value indicates the number of standard deviations lying between the target and the USL, or
between the target and the LSL. For a process with only one specification limit (upper or lower),
this results in six process standard deviations between the mean of the process and the customer's
specification limit (hence 6σ).
In the figure, three sigma and six sigma processes are shown.
It is clear that the acceptable area under a Six Sigma process is larger than under a three sigma
process. Hence it can be inferred that :
• The higher the Z value, the fewer the rejections.
• The Z value of a process can be increased by lowering sigma or variation, which is the ultimate
aim of the Six Sigma methodology.
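The link between Z value and rejections can be sketched numerically. The snippet below is an illustration, not the manual's own method: it assumes a normally distributed process, a single specification limit, and the conventional 1.5 sigma shift that this manual applies later when converting between Zlt and Zst. The function name `dpmo_for_sigma_level` is hypothetical.

```python
from statistics import NormalDist

def dpmo_for_sigma_level(sigma_level, shift=1.5):
    """Defects per million opportunities beyond one specification limit.

    Assumes a normal distribution whose mean has drifted `shift`
    standard deviations toward the limit (the long-term view).
    """
    tail_area = 1.0 - NormalDist().cdf(sigma_level - shift)
    return tail_area * 1_000_000

three_sigma_dpmo = dpmo_for_sigma_level(3)
six_sigma_dpmo = dpmo_for_sigma_level(6)

print(round(three_sigma_dpmo))    # 66807
print(round(six_sigma_dpmo, 1))   # 3.4
```

Under these assumptions the results agree with the 66,807 and 3.4 defects per million opportunities quoted in this manual for three sigma and six sigma processes.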
Comparison between a 3σ level company and a 6σ level company
The 3σ level company | The 6σ level company
• Believes that 99% is good enough | • Believes that 99% is unacceptable
• Spends 15-25% of sales dollars on costs of failure | • Spends 5% of sales dollars on costs of failure
• Produces 66,807 defects per million opportunities | • Produces 3.4 defects per million opportunities
• Relies on inspection to find defects | • Relies on capable processes that don't produce defects
• Believes high quality is expensive | • Knows that the high quality producer is a low cost producer
• Does not have a disciplined approach to gather and analyze data | • Uses Measure, Analyze, Improve, Control
• Benchmarks itself against its competition | • Benchmarks itself against the best in the world
• Defines CTQs internally | • Defines CTQs externally
Different Approaches
DMAIC : The DMAIC methodology is used when a product or a process is in existence at the
company but is not meeting the customer's specification or is not performing adequately. In case
of Transactional Quality projects, this is applied to services.
DMADV : The DMADV methodology is used when a product or process does not yet exist and
needs to be developed. It is used for design and development.
Stages  | DMAIC                                                                      | Stages   | DMADV
Define  | Define the project goals and customer (internal and external) deliverables | Define   | Define the project goals and customer (internal and external) deliverables
Measure | Measure the process to determine current performance                       | Measure  | Measure and determine customer needs and specification
Analyze | Analyze and determine the root cause of defect                             | Analyze  | Analyze the process options to meet the customer needs
Improve | Improve the process by eliminating defects                                 | Design   | Design (detailed) the process to meet the customer needs
Control | Control future process performance                                         | Validate | Validate the design performance and ability to meet customer demands
Session 1-Define
DEFINE
CONTENTS :
• Introduction to define
Define Basics
• Brainstorming
• Pareto Chart
• Logic Tree
• 5-Why Analysis
• Process Mapping
Define Tools
• RTY
• Yield
• QFD
• FMEA
Introduction to Define
Define is the first stage of any Six Sigma project. It aims at identifying what is important for
problem solving and hence gives us the opportunity to find defects.
It aims at theme selection and justification.

Define aims at listing all the problems and then prioritizing them. The priority order is
decided on the basis of business vision, cost impact to the organization and other
factors like severity and frequency.
Based on this priority analysis a theme is selected. The next step is defining this theme using the
"as is" vs. "should be" approach.
As is : Current status or performance of the process.

Should be : How the process should actually be.
The theme is always selected keeping in mind the customer's needs and viewpoint.
Brainstorming
Brainstorming is the name given to a situation in which a group of people meet to generate new
ideas around a specific area of interest. Using rules which remove inhibitions, people are able
to think more freely, move into new areas of thought, and so create numerous new ideas
and solutions.
The bottom line of brainstorming is that no new idea should be criticized. All ideas are noted down
and, when the brainstorming session is over, these ideas are evaluated.
Rules of Brainstorming :
1) Postpone and withhold your judgment of ideas
2) Encourage wild and exaggerated ideas
3) Quantity counts at the initial stage
4) Build on the ideas put forward by others
5) Every person and every idea has equal worth
Different types of Brainstorming :
1) Free wheel : It refers to the spontaneous flow of ideas from all team members.
2) Round Robin : Team members take turns suggesting ideas.
3) Card Method : Team members write ideas on cards with no discussion.
Brainstorming is always carried out in the presence of process experts. A process expert is a
person who is close to the process and is well versed in it. A cross functional approach is
always preferred in a brainstorming session, since inputs can be obtained from several
functions and then assimilated to form a new idea or concept.
Brainstorming is a very useful exercise before theme selection as well as during problem
solving. It explores the problem in a very effective manner.
The use of brainstorming will be seen when we discuss the various tools of the Define stage.
Pareto Chart
The theory behind the Pareto Chart was originated in 1897 by the Italian economist Vilfredo Pareto.
It says that 80% of the effects are caused by 20% of the causes. In fact, many defect distributions
follow a simple pattern, with a relatively small number of issues accounting for an overwhelming
share of the defects. The Pareto chart shows the relative frequency of defects in rank order, and
thus provides a prioritization tool so that process improvement activities can be organized.
[Figure: Pareto Chart of Defects. Bars show the count of each defect type in rank order
(left axis : Count, 0 to 700); a line shows the cumulative percentage (right axis : Percent, 0 to 100).
The values for the nine defect categories, from left to right with Other last, are :]
Count 180 120 100 85 60 28 23 10 26
Percent 28.5 19.0 15.8 13.4 9.5 4.4 3.6 1.6 4.1
Cum % 28.5 47.5 63.3 76.7 86.2 90.7 94.3 95.9 100.0
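The Percent and Cum % rows can be derived from the Count row alone. The following is a minimal sketch in Python; the function name `pareto_table` is hypothetical, and the counts are assumed to be already in rank order with the catch-all group last, as in the chart.

```python
from itertools import accumulate

# Counts in rank order, with the catch-all "Other" group last.
counts = [180, 120, 100, 85, 60, 28, 23, 10, 26]

def pareto_table(counts):
    """Return (percent, cumulative percent) lists, rounded to one decimal."""
    total = sum(counts)
    percent = [round(100 * c / total, 1) for c in counts]
    # Cumulative values are computed from running counts, not from the
    # rounded percentages, so rounding errors do not accumulate.
    cumulative = [round(100 * running / total, 1) for running in accumulate(counts)]
    return percent, cumulative

percent, cumulative = pareto_table(counts)
print(percent)
print(cumulative)
```

Reading down the cumulative list shows how few categories are needed to cover most of the defects: here the first four already account for more than three quarters of the total.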
How to access Pareto Chart in MINITAB :
Stat : Quality Tools : Pareto Chart
Put the defects in the labels field and the counts in the frequency field.
However, this Pareto Chart is constructed from one dimension only : defect frequency. One must
also consider other constraints such as cost and reliability.
Logic Tree
It is used to examine the issue in detail and identify the root causes. It breaks down the problem into
manageable groups based on MECE (Mutually Exclusive and Collectively Exhaustive). Causes
of the problem are brainstormed and are placed under 4M in case of a Manufacturing project and
5P in case of a Transactional Quality project.
Mfg. Project (4M) : Man, Machine, Method, Material
Transactional Quality Project (5P) : People, Price, Product, Promotion, Place
Logic Tree Structure
Problem (Effects)
    Cause A
        A1
        A2
    Cause B
        B1
        B2
The branches at each level of the tree (A and B, A1 and A2, B1 and B2) should be MECE.
What is MECE (Mutually Exclusive and Collectively Exhaustive)
Mutually Exclusive : Two events that have no outcomes in common are known as mutually
exclusive events. These are events that cannot occur at the same time.
E.g. a pair of dice is rolled. The event of rolling a total of 9 and the event of rolling a double
(3,3 or 4,4) have no outcome in common. These two events are mutually
exclusive events.
Collectively Exhaustive : This is the second aspect of MECE. It means that all the issue points
have been covered and no point has been left out.
E.g. while flipping a coin, getting a head and getting a tail are
collectively exhaustive events.
Examples of groupings that are not MECE :
• AB is ME, but not CE.
  Area : Creature, A : Mammal, B : Fish
• ABC is CE but is not ME.
  Area : Female, A : Unmarried, B : Married, C : OL
• AB is not ME or CE.
  Area : Class of students, A : Students who do well in Math, B : Students who do well in English
MECE : Mutually Exclusive (no overlap) and Collectively Exhaustive (no exclusion)
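The two MECE conditions map directly onto set operations, which the sketch below illustrates in Python. The function names and the small example sets (creature and student names) are hypothetical and chosen only to mirror the examples above.

```python
from itertools import combinations

def is_mutually_exclusive(groups):
    """True if no two groups share a member (no overlap)."""
    return all(a.isdisjoint(b) for a, b in combinations(groups, 2))

def is_collectively_exhaustive(groups, universe):
    """True if the groups together cover the whole universe (no exclusion)."""
    return set().union(*groups) == set(universe)

def is_mece(groups, universe):
    return is_mutually_exclusive(groups) and is_collectively_exhaustive(groups, universe)

# Coin flip: heads and tails cover every outcome without overlap.
coin = {"head", "tail"}
coin_groups = [{"head"}, {"tail"}]

# Creatures split into mammals and fish: no overlap, but birds are left out.
creatures = {"dog", "whale", "salmon", "eagle"}
creature_groups = [{"dog", "whale"}, {"salmon"}]

# Students good at Math or English: Bob is in both, Dee is in neither.
students = {"Ann", "Bob", "Cy", "Dee"}
student_groups = [{"Ann", "Bob"}, {"Bob", "Cy"}]

print(is_mece(coin_groups, coin))
print(is_mutually_exclusive(creature_groups), is_collectively_exhaustive(creature_groups, creatures))
print(is_mutually_exclusive(student_groups), is_collectively_exhaustive(student_groups, students))
```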
5-Why Analysis
This tool is based on the KEI (Knowledge, Experience and Intuition) approach. It is used to study
the symptoms and understand the true root cause of the problem. It is said that only by asking
"why" 5 times in succession can you delve into a problem deeply enough to understand the
ultimate root cause. By the time you get to the 4th and 5th why, you are likely to approach the
root cause of the problem. Here is a brief illustration of this concept :
Benjamin Franklin's 5-Why Analysis
For want of a nail a shoe was lost,
For want of a shoe a horse was lost,
For want of a horse a rider was lost,
For want of a rider an army was lost,
For want of an army a battle was lost,
For want of a battle the war was lost,
For want of the war the kingdom was lost,
And all for the want of the little horse shoe nail
• In the corporate world the kingdom is the company's vision or growth.
• Nails are the small non-conformities that affect the process or service.
• If not taken care of, these non-conformities or defects can lead to big losses and
customer dissatisfaction.
• We may not be able to recognize them by conventional approaches.
• In such cases 5-Why analysis needs to be done, so as to recognize these
non-conformities/root causes.
Case Study on 5-Why Analysis
Here is a real world example from a kitchen range manufacturer :
There is so much work in process inventory, yet we never seem to have the right parts.
Why?
The enameling process is unpredictable, and the press room does not respond quickly enough.
Why?
It takes them too long to make a changeover between parts, so the lot sizes are too big and often
the wrong parts.
Why?
Many of the stamping dies make several different parts, and must be reconfigured in the tool room
between runs, which takes as long as eight hours.
Why?
The original project management team had cost overruns on the building site work, so they
skimped on the number of dies; they traded dedicated dies and small lot sizes for high work in
process (which was not measured by their project budget).
Why?
Root Cause : Company management did not understand lean manufacturing, and did not set
appropriate project targets when the plant was launched .
Process Mapping
Process mapping is used to clarify improvement opportunities through an understanding of the
defined processes from start to finish. It is a visual representation of the work flow. A good
process map should :
1) Allow people unfamiliar with the process to understand the interaction of causes during the
work flow.
2) Contain additional information relating to the Six Sigma project, that is, information per
critical step about input and output variables, time, cost, DPU, value etc.
Application Effect of Process Mapping :
Understanding the graphical representation of the work flow
• It provides the outline or Big Picture to help understand the operation flow and standardize
the design.
• It can be used as the fundamental source for analyzing the present situation.
• It can be established as a common point for process improvement.
• It helps to clarify the bottleneck.
• It identifies unnecessary excess processes.
• It identifies loss areas due to incorrect process sequence.
Process Mapping Symbols
Use                      | Application
Start/end                | Process boundary
Activity                 | Precise description of contents of task
Decision                 | Description of decision making contents, comparison, examination
Document/form            | Result document issued, report
Task Flow                | Expression of task flow/direction
Process connection point | Connection to different page or process
A1 Activity Number       | Proceeding order of activity
D5 Decision Number       | Proceeding order of decision
SIPOC :
SIPOC stands for Supplier, Input, Process, Output and Customer. You obtain input from your
suppliers, add value through your process, and provide an output that meets or exceeds your
customer's requirements. A SIPOC diagram is a tool used by a team to identify all relevant
elements of a process improvement project before the work begins.
The SIPOC tool is particularly useful when it is not clear :

• Who supplies inputs to the process
• What specifications are placed on the inputs
• Who the true customers of the process are
• What the requirements of the customer are
RTY
Six Sigma Methodology says "DO IT RIGHT THE FIRST TIME".
In one of the convocations, Bill Smith (an engineer at Motorola and founder of the Six Sigma
methodology) emphasized the fact that a product which has been reworked will have a shorter
life than a product which has been manufactured right the first time. The same
applies to service industries.
This turned out to be a breakthrough idea and resulted in a major paradigm shift. All companies
accepted Six Sigma as an improved version of TQM (Total Quality Management).
Following this principle Bob Galvin (CEO of Motorola) advocated the use of DPO (Defects per
Opportunity) rather than DPU (Defects per Unit) for service related problems. These terms will be
discussed in detail in the coming sessions.
RTY is based on the same concept. It says that reworks are hidden losses (also referred to
as the hidden factory). These reworks have a huge negative impact on product life as well as
customer satisfaction. RTY is one of the ways through which we can calculate the extent of
hidden losses that are taking place. A higher value of RTY indicates a better process. It is one of the
major criteria through which we can select the bottleneck process on the production line.
RTY stands for Rolled Throughput Yield. It represents the probability of getting the product right
the first time. It is the product of the Yft of all the stages in case of a line, and of all the sub processes
in case of a process.
Yft stands for Yield First Time and is the probability of getting the product right the first time for
a particular stage or a particular sub process.
[Figure: a three stage line.
Stage 1 : input 100, 2 rework, 2 reject, output 98
Stage 2 : input 98, 5 rework, 0 reject, output 98
Stage 3 : input 98, 4 rework, 5 reject, output 93]

Consider the following stages :
Stage 1 : In this stage 100 is the input, 2 are rejects and 2 are rework.
Therefore the number of pieces obtained right the first time is Input - ( Reject + Rework )
= 100-(2+2) = 96
Hence Yft1 = [ Input - ( Reject + Rework ) ] / Input = 96/100
Stages or subsystems having lower values of Yft are bottleneck areas.
Stage 2 : Output from stage 1 acts as input for stage 2.
Number of pieces obtained right the first time = Input - ( Reject + Rework )
= 98-(0+5) = 93
Hence Yft2 = [ Input - ( Reject + Rework ) ] / Input = 93/98
Stage 3 : Output from stage 2 acts as input for stage 3.
Number of pieces obtained right the first time = Input - ( Reject + Rework )
= 98-(5+4) = 89
Hence Yft3 = 89/98
RTY for the entire process = Yft1 * Yft2 * Yft3 = (96/100)*(93/98)*(89/98) = .8274 = 82.74%
• Yft is calculated for an individual stage
• RTY is calculated for the entire process
• The higher the RTY, the better the process
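The three stage calculation can be sketched in Python as follows. The function names `first_time_yield` and `rolled_throughput_yield` are hypothetical; the input, reject and rework figures are those of the three stages in this example.

```python
from math import prod

def first_time_yield(units_in, rejects, reworks):
    """Yft = [Input - (Reject + Rework)] / Input for one stage."""
    return (units_in - (rejects + reworks)) / units_in

def rolled_throughput_yield(stage_yields):
    """RTY is the product of the Yft values of all stages."""
    return prod(stage_yields)

# (input, rejects, reworks) for stages 1 to 3
stages = [(100, 2, 2), (98, 0, 5), (98, 5, 4)]

yft = [first_time_yield(*stage) for stage in stages]
rty = rolled_throughput_yield(yft)

# The stage with the lowest Yft is the bottleneck candidate.
bottleneck_stage = yft.index(min(yft)) + 1

print([round(y, 4) for y in yft], round(rty, 4), bottleneck_stage)
```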
Process in Series
Two processes are said to be in series when the outcomes from both processes are required.
[Figure: Process 1 (80%), Process 2, Process 3 (65%); Process 2 is made up of
Process 2a (75%) and Process 2b (85%).]
Example
In the above case, if Process 2 requires a contribution from Process 2a as well as
Process 2b, then 2a and 2b are said to be in series. In such a case Process 2 cannot take place
unless both 2a and 2b take place.
Yft of Process 2 is given by Yft2 = Yft2a * Yft2b = .75 * .85 = 0.6375
Hence the process now becomes
80% 63.75% 65%
RTY for the process = Yft1 * Yft2 * Yft3 = .80 * .6375 * .65 = .3315 or 33.15%
Process in Parallel
Two processes are said to be in parallel when the outcome from even a single process is sufficient.
[Figure: Process 1 (80%), Process 2, Process 3 (65%); Process 2 is made up of
Process 2a (75%) and Process 2b (85%).]
In the given case, if input from 2a or 2b is sufficient for Process 2 to take place, then 2a and 2b
are in parallel. In such a case Process 2 can take place with either of the sub processes.
Yft for Process 2 = sqrt ( Yft2a * Yft2b )
= sqrt ( .75 * .85 ) = .7984
Hence the process now becomes
80% 79.84% 65%
RTY for the process = Yft1 * Yft2 * Yft3 = .80 * .7984 * .65 = .4152 or 41.52%
Yft for n processes in parallel = ( Yft1 * Yft2 * Yft3 * ... * Yftn )^(1/n)
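The series and parallel rules can be compared side by side with a short sketch. It follows this manual's convention that sub processes in parallel are combined by the geometric mean of their Yft values; the function names are hypothetical.

```python
from math import prod

def series_yield(yields):
    """Sub processes in series: every one is required, so yields multiply."""
    return prod(yields)

def parallel_yield(yields):
    """Sub processes in parallel, per this manual: geometric mean of the yields."""
    return prod(yields) ** (1 / len(yields))

sub_processes = [0.75, 0.85]            # Process 2a and Process 2b

yft2_series = series_yield(sub_processes)
yft2_parallel = parallel_yield(sub_processes)

# Process 1 and Process 3 are in series with Process 2 in both cases.
rty_series = series_yield([0.80, yft2_series, 0.65])
rty_parallel = series_yield([0.80, yft2_parallel, 0.65])

print(round(yft2_series, 4), round(rty_series, 4))
print(round(yft2_parallel, 4), round(rty_parallel, 4))
```

The only difference between the two cases is how Process 2 is collapsed into a single Yft; the line level RTY is a plain product either way.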
Case Study : Calculate the RTY of the R1 line
[Figure: Rolled Throughput Yield of R1 Line. Yft of each stage :
Door Ass'y (89.7%) : D/Plate Plate/Paint 99.0%, D/Liner Injection/Mold 99.7%, Door Form 93.4%,
Door Assembly 97.3%
Case Ass'y (73.4%) : I/Case Injection/Mold 99.6%, Front - CTQ, L Paint 99.2%,
O/Case, B/Plate Plate 91.7%, Case Form 81.0%
Cycle 97.7%
Assembly 83.8%
LQC & Appearance 96.5%
Shipping]
Consider all the stages of the given production line :
Door Assembly : In this case it is clear that all the sub processes are required for door assembly
to take place, hence all the sub processes are in series.
Yft ( Door Assembly ) = .99 * .997 * .934 * .973 = .8970
Case Assembly : In this case all the sub processes are required for case assembly to take place,
hence all the sub processes are in series.
Yft ( Case Assembly ) = .996 * .992 * .917 * .81 = .7339
Cycle : Yft = .977 ( given )
Assembly : Yft = .838 ( given )
LQC & Appearance : Yft = .965 ( given )

In the entire process all sub processes are in series, hence the final RTY is
RTY = .8970 * .7339 * .977 * .838 * .965 = .5201 = 52.0 %
Yna ( Normalized Yield )
Normalized yield is the geometric mean of all the Yft of a given process.
Yna = ( RTY )^(1/n) where n is the total number of stages in a given line or the total number of sub
processes in a given process.
Example : Calculate the normalized yield ( Yna ) for the process shown below.
80% 79.84% 65%
Solution : RTY of the given process is
RTY = Yft1 * Yft2 * Yft3 = 0.80 * 0.7984 * 0.65 = .4152
Number of stages in the given process = 3
Yna = ( RTY )^(1/3) = ( .4152 )^(1/3) = .7460 = 74.60%
Yna is used for evaluating the level of quality in a completed project.
Consider the case of two production lines having the same value of RTY but a different
number of processes. These two lines can be compared by comparing their Yna.
Line 1 : RTY = .65 , Number of processes = 3
Line 2 : RTY = .65 , Number of processes = 5
For Line 1 , Yna = (.65)^(1/3) = .8662 = 86.62%
For Line 2 , Yna = (.65)^(1/5) = .9175 = 91.75%
Since Yna for Line 2 is higher, it is a better line than Line 1. The same concept is
applicable to service industries, where we have processes and sub processes.
The value of Yft, RTY and Yna is never greater than 1.
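A minimal Python sketch of the normalized yield comparison, for two lines with an RTY of 0.65 over 3 and 5 processes. The function name `normalized_yield` is hypothetical.

```python
def normalized_yield(rty, n_stages):
    """Yna = RTY ** (1/n): the average per-stage yield that would give this RTY."""
    return rty ** (1 / n_stages)

yna_line_1 = normalized_yield(0.65, 3)   # about 0.866
yna_line_2 = normalized_yield(0.65, 5)   # about 0.917

# Raising Yna back to the power n recovers the RTY.
recovered_rty = yna_line_2 ** 5

print(round(yna_line_1, 4), round(yna_line_2, 4), round(recovered_rty, 4))
```

Because Line 2 reaches the same RTY across more processes, each of its processes must on average perform better, which is what the higher Yna expresses.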
Complex Problem Solving :

Consider the following process made up of 5 sub processes. Calculate the RTY of the entire process.
[Figure: process map with the Yft of each step.
Process 1, Order Processing : Contact 93%, Availability 88%, Order Build 100%, Pricing 98%
Process 2, Credit Review : CFS Order Review 73%
Process 3, Shipping : Order Marriage 88%, Scheduling 88%, Warehouse Loading 95.5%,
then Ship SDS 90.9% (30%) or Ship direct 96.9% (70%)
Process 4, Post Sales : Returns 98.3%, Receivables 89%
Process 5, Billing : Customer Billing 98%, Cash Application 73%,
then Freight Payment SDS 81% (30%) or Freight Payment Direct 90% (70%)]
In this case, 30% of the payments are made by freight payment SDS and 70% of the payments are made by
freight payment direct.
Yfreight = P(SDS Freight) * Y(SDS Freight) + P(Direct Freight) * Y(Direct Freight)
= 0.3*0.81 + 0.7*0.9 = .243 + .630 = 0.873
On similar lines, calculating Yft for shipping :
Yshipping = P(Ship SDS) * Y(Ship SDS) + P(Ship Direct) * Y(Ship Direct)
= .30*.909 + .70*.969 = .2727 + .6783 = 0.951
Now considering each process one by one :
Process 1 : Yft1 = Yft pricing * Yft order build * Yft availability * Yft contact
= 0.98 * 1.00 * 0.88 * 0.93 = 0.8020
Process 2 : Yft2 = Yft CFS order review = 0.73
Process 3 : Yft3 = Yft order marriage * Yft scheduling * Yft warehouse loading * Yft shipping
= 0.88 * 0.88 * 0.955 * 0.951 = 0.7033
Process 4 : Yft4 = Yft returns * Yft receivables = 0.983 * 0.89 = 0.8749
Process 5 : Yft5 = Yft customer billing * Yft cash application * Yft freight payment
= 0.98 * 0.73 * 0.873 = 0.6245
Using the Yft values rounded to two decimals :
RTY = Yft1 * Yft2 * Yft3 * Yft4 * Yft5 = 0.80 * 0.73 * 0.70 * 0.87 * 0.62 = .2205 = 22.05%
Yna = (RTY)^(1/n) , n = 5 in this case
Yna = (.2205)^(1/5) = .7390 = 73.90 %
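The whole calculation can be sketched in Python as below. The function name `weighted_yield` is hypothetical; the yields and the 30%/70% split are those of this example. The sketch also computes the RTY without intermediate rounding, which comes out slightly higher (about 0.225) than the 0.2205 obtained from Yft values rounded to two decimals.

```python
from math import prod

def weighted_yield(branches):
    """Yield of a step with alternative routes: sum of share * yield per route."""
    return sum(share * y for share, y in branches)

y_freight = weighted_yield([(0.30, 0.81), (0.70, 0.90)])
y_shipping = weighted_yield([(0.30, 0.909), (0.70, 0.969)])

process_yields = [
    prod([0.98, 1.00, 0.88, 0.93]),          # Process 1: Order Processing
    0.73,                                    # Process 2: Credit Review
    prod([0.88, 0.88, 0.955, y_shipping]),   # Process 3: Shipping
    prod([0.983, 0.89]),                     # Process 4: Post Sales
    prod([0.98, 0.73, y_freight]),           # Process 5: Billing
]

rty_rounded = prod(round(y, 2) for y in process_yields)  # as in the worked solution
rty_exact = prod(process_yields)                         # no intermediate rounding
yna = rty_rounded ** (1 / len(process_yields))

print(round(y_freight, 3), round(y_shipping, 3))
print(round(rty_rounded, 4), round(rty_exact, 4), round(yna, 3))
```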
YIELD
DPU & DPO :

As has been discussed in the earlier sessions, the Six Sigma approach focuses more on defects
rather than defectives.
Defect : Any type of undesired result is a defect. It is a failure to meet one of the acceptance
criteria of the customer.
Defective : The word defective describes an entire unit that fails to meet acceptance criteria. A
defective unit can have any number of defects.
Unit : A unit is any item that is produced or processed which is liable for measurement or
evaluation against predetermined criteria or standards.
Opportunity : An area within a product, process, service, or other system where a defect
could be produced or where you can fail to achieve the ideal product in the eyes
of the customer.
DPU ( Defects per Unit ) : DPU represents the number of defects divided by the number of
units.
DPO ( Defects per Opportunity ) : DPO represents total defects divided
by total opportunities.
The Yield Calculation using the Poisson formula
The Poisson equation is used to describe a number of processes where the process can be
described by a discrete random variable that takes on integer (whole) values such as 0, 1, 2, 3
and so on.
• To calculate yield when DPU is given, we use the equation below :
Y = e^(-DPU)      e : base of the exponential function (e = 2.718...)
• When DPO is given ( ND is non-defect ) :
Y = P(ND)^Opportunity = (1-DPO)^Opportunity
• When we have the yield with respect to DPU we can calculate Zlt and Zst.
Zlt = Normsinv(yield)
Zst = Normsinv(yield) + 1.5      Normsinv is a statistical function in Excel
• When we have DPO we can calculate Zlt and Zst.
Zlt = Normsinv[(1-DPO)^Opportunity]
Zst = Normsinv[(1-DPO)^Opportunity] + 1.5
Note : It is always advisable to calculate the Z value with respect to DPO, as
opportunity is also taken into consideration.
• We can also calculate Zst and Zlt using a Z table.
Example : 100 invoices were processed in a financial firm. After processing, 34 defects were
observed. If an invoice can fail in 20 ways, calculate DPU, DPO
and DPMO for the above process.
Solution : For the above problem, Units = 100 , Defects = 34
Opportunity = 20 ( since a single invoice can fail in 20 ways )
Defects Per Unit ( DPU ) = Total Defects / Total Units = 34/100 = 0.34
Since the opportunities for a single unit to fail = 20
For 100 units, total opportunities = 20*100 = 2000
Therefore DPO = Defects / Total Opportunities = 34 / (20*100) = 34 / 2000 = .017
DPMO ( Defects Per Million Opportunities ) = DPO * 1 Million = .017 * 1,000,000 = 17,000
Process Yield & Sigma (Example)
(Sample Question) : DPU, DPO, DPMO, Yield & Sigma Calculation
If there are 34 defects out of 750 units, with 10 opportunities per unit, calculate
DPU, DPO, Yield, DPMO and Sigma value.
1) DPU = defects / units = 34 ÷ 750 = 0.045
2) DPO = number of defects / (total number of units × opportunities)
DPO = 34 ÷ ( 750 × 10 ) = 0.0045
3) Yield : for zero defects (r = 0), the Poisson distribution gives Y = e^(-DPU)
Y = 2.7183^(-0.045) = 0.956 = 95.6%
OR Y = P(ND)^10 = (1-DPO)^10 = (1 - 0.0045)^10 = 0.956 = 95.6%
4) DPMO = DPO × 1,000,000
DPMO = 0.0045 × 1,000,000 = 4,500 (There are 4,500 ppm per opportunity;
thus there are 45,000 defects ppm per unit.)
5) Sigma Value = normsinv( 0.956 ) + 1.5 shift = 1.71 + 1.5 = 3.21
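The same steps can be sketched in Python. The function name `defect_metrics` is hypothetical, and `NormalDist().inv_cdf` from the standard library is used here as a stand-in for Excel's Normsinv. Because the sketch does not round DPU and DPO before using them, its results differ slightly from the hand calculation (for example a DPMO of about 4,533 rather than 4,500).

```python
from math import exp
from statistics import NormalDist

def defect_metrics(defects, units, opportunities_per_unit, shift=1.5):
    """DPU, DPO, DPMO, yield and Z values for a batch of inspected units."""
    dpu = defects / units
    dpo = defects / (units * opportunities_per_unit)
    dpmo = dpo * 1_000_000
    yield_from_dpu = exp(-dpu)                          # Poisson, zero defects
    yield_from_dpo = (1 - dpo) ** opportunities_per_unit
    z_lt = NormalDist().inv_cdf(yield_from_dpu)         # long-term Z
    z_st = z_lt + shift                                 # short-term Z (sigma value)
    return {
        "dpu": dpu, "dpo": dpo, "dpmo": dpmo,
        "yield_from_dpu": yield_from_dpu, "yield_from_dpo": yield_from_dpo,
        "z_lt": z_lt, "z_st": z_st,
    }

units_example = defect_metrics(defects=34, units=750, opportunities_per_unit=10)
invoice_example = defect_metrics(defects=34, units=100, opportunities_per_unit=20)

for name, value in units_example.items():
    print(name, round(value, 4))
```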

QFD
Two steps involved in QFD are :

1) Converting Customer's voice into Engineer's voice
2) Converting Engineer's voice into Technical Specifications
CTQs and CTPs are then decided from the technical specifications.
CTQ stands for Critical to Quality. It is always measurable.

CTP stands for Critical to Process. It is not always measurable.
These are the indices that control the defects. For example lea
The bottom line of QFD is to study what the customer wants. Customer demand is then correlated with
the Engineer's voice, which in turn is correlated with measurable technical specifications.
CTQs and CTPs are selected from these measurable technical specifications.
During Step 1 of QFD analysis, the customer's feedback is collected. Important guidelines for collecting
customer feedback are :
1) It should be very brief
2) It should specify what is required in a product or a process
3) It should not provide any type of countermeasure or plans for improvement
It should be ensured that feedback is taken from all types of customers.
In Step 1, all the customer requirements are correlated with the Engineer's voice on a scale of 1, 3, 9,
where 9 means maximum correlation, 3 means moderate correlation and 1 means very weak
correlation. In case of no correlation, no marks are given. Importance is given to each Voice of
customer, based on
Step 1 : Converting Customer’s Voice to Engineer’s voice
Importance Engg. V 1 Engg. V 2 Engg. V 3 Engg. V 4 Engg. V 5 Engg. V 6 Engg. V 7
VOC1 5 1 1 3 9
VOC2 5 3 3
VOC3 4 1 1 9
VOC4 4 9 3 1 9
VOC5 4 9 1 3
VOC6 3 1 1 9 3
VOC7 3 1 1 1 3
VOC8 2 3 3 3
Rating 54 30 41 16 46 37 138
VOC stands for Voice of Customer, Engg. V stands for engineer’s voice
Step 2 : Converting Engineer's voice to Technical Specification
             Tech.Sp 1  Tech.Sp 2  Tech.Sp 3  Tech.Sp 4  Tech.Sp 5  Tech.Sp 6  Rating
Engg. V 1    1          -          -          3          -          9          54
Engg. V 2    -          9          -          -          -          -          30
Engg. V 3    3          -          -          -          -          -          41
Engg. V 4    -          -          3          -          9          -          16
Engg. V 5    3          -          -          -          -          1          46
Engg. V 6    -          3          -          -          -          -          37
Engg. V 7    1          -          3          -          -          1          138
Rating       453        381        462        162        144        670
( - : no correlation )
Engg. V stands for engineer's voice, Tech. Sp stands for Technical Specification
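The Rating row of the Step 2 matrix can be reproduced as a weighted sum: each technical specification's rating is the sum, over the engineer's voices, of the correlation score multiplied by that voice's rating. This rule and the cell positions used below are assumptions inferred from the numbers in the table (they are the placement that reproduces the Rating row), not something stated in the text. The function name `column_ratings` is hypothetical.

```python
# Ratings of Engg. V 1 to Engg. V 7 carried over from Step 1.
engineer_voice_rating = [54, 30, 41, 16, 46, 37, 138]

# Correlation scores (0 = no correlation); rows are Engg. V 1..7,
# columns are Tech.Sp 1..6. Cell positions are an inferred reconstruction.
correlation = [
    [1, 0, 0, 3, 0, 9],
    [0, 9, 0, 0, 0, 0],
    [3, 0, 0, 0, 0, 0],
    [0, 0, 3, 0, 9, 0],
    [3, 0, 0, 0, 0, 1],
    [0, 3, 0, 0, 0, 0],
    [1, 0, 3, 0, 0, 1],
]

def column_ratings(weights, matrix):
    """Weighted column sums: sum of weight * score down each column."""
    n_columns = len(matrix[0])
    return [
        sum(weight * row[col] for weight, row in zip(weights, matrix))
        for col in range(n_columns)
    ]

tech_spec_rating = column_ratings(engineer_voice_rating, correlation)
top_spec = tech_spec_rating.index(max(tech_spec_rating)) + 1

print(tech_spec_rating, top_spec)
```

Ranking the technical specifications by rating gives a starting point for choosing CTQs and CTPs; here Tech.Sp 6 scores highest.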
FMEA
• FMEA is a systematic tool for identifying the effects or consequences of a potential failure of a
product or a process.
• It is a method to eliminate or reduce the chance of failure occurrence.
FMEA generates a living document that can be used to anticipate and prevent failures from
occurring.
(Note : The document should be updated regularly.)
FMEA is most effective when it is used before a design is released
rather than after the fact. The focus should be on failure prevention, not on detection.
Types of FMEA
Design FMEA : Examines the functions of a component, subsystem or main system.
Potential failures : Incorrect material choice, inappropriate specifications
Example : Air bag (excessive air bag inflator force).
Process FMEA : Examines the process used to make a component, subsystem or main
system.
Potential failures : Operator assembling a part incorrectly, excessive variation in the
process resulting in out-of-specification product
Example : Air bag assembly process (the operator may not install the air bag properly on the assembly
line, such that it may not engage during impact).
FMEA characteristics
It is team-based work, which includes a leader and process/product experts.
FMEA Terminology
Failure mode - Physical description of a failure
Ex : Noise produced in the car door while closing and opening
Failure effects - Impact of the failure on people or equipment
Ex : Driver/car owner dissatisfaction
Failure cause - Refers to the cause of the failure
Ex : Insufficient door seal
FMEA variables
Severity - A rating corresponding to the seriousness of an effect of a potential failure
mode.
• Severity rating is on a scale of 1 to 10. If the rating is 1, the failure would not be
noticeable to the customer and would not affect the customer's process or products.
• If the rating is 10, the failure is dangerously high and would injure the customer.
Occurrence - A rating corresponding to the rate at which a first-level cause and its
resultant failure mode will occur over the design life of the system or product, or before any
additional controls are applied. Occurrence rating is on a scale of 1 to 10.
• If the rating is 1, the occurrence of failure is very rare (one occurrence in five years, or fewer
than two occurrences in 1 billion events).
• If the rating is 10, failure is almost inevitable (more than one occurrence per day, or a probability of
more than 3 occurrences in 10 events).
Detection - A rating corresponding to the likelihood that the detection method or
current controls will detect the potential failure mode before the product is released
for production (for design), or before it leaves the production facility (for process).
Detection rating is on a scale of 1 to 10.
• If the rating is 1, detection is obvious : automatic inspection is in place and there is a
100% chance of detecting the failure before it leaves the production facility.
• If the rating is 10, there is absolutely no system to detect the failure.
RPN = Risk Priority Number
RPN = Severity x Occurrence x Detection
• There is no absolute rule for what is a high RPN; rather, the FMEA is often viewed
on a relative scale (i.e. the highest RPN is addressed first).
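The RPN calculation and the "highest RPN first" ordering can be sketched as follows. The failure modes reuse the examples mentioned earlier, but every rating below is hypothetical and chosen only for illustration; `FailureMode` is an assumed name, not part of any FMEA standard.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # 1 (not noticeable) .. 10 (dangerously high)
    occurrence: int  # 1 (very rare) .. 10 (almost inevitable)
    detection: int   # 1 (certain to be detected) .. 10 (no detection system)

    def rpn(self) -> int:
        """Risk Priority Number = Severity x Occurrence x Detection."""
        for rating in (self.severity, self.occurrence, self.detection):
            if not 1 <= rating <= 10:
                raise ValueError("ratings must be on the 1 to 10 scale")
        return self.severity * self.occurrence * self.detection


# Hypothetical ratings, for illustration only
modes = [
    FailureMode("Door noise while opening/closing", 4, 6, 3),
    FailureMode("Excessive air bag inflator force", 10, 2, 5),
    FailureMode("Air bag installed incorrectly", 9, 3, 7),
]

# Address the highest RPN first
ranked = sorted(modes, key=lambda m: m.rpn(), reverse=True)
for m in ranked:
    print(m.name, m.rpn())
```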
Session 3- Measurement
Session 1 : MEASURE
CONTENTS :
• Introduction to Measure
• Central Limit Theorem
• Types of Sampling
• Gage R&R
• Process Capability
Introduction to Measure
Measure comes after Define in the Six Sigma methodology.
• It estimates the present status of a process : Before starting any project, it is
imperative to know where the process stands right now. Although a part of this
exercise is covered in the Define stage, this aspect is addressed completely in the Measure stage
through several indices such as Zvalue, Zbench, Cp, Cpk etc. These indices will be discussed in
detail in the coming sessions.
• It establishes the validity of the measurement system : The measurement system includes the
operator as well as the instrument. In the Measure stage, we have to check whether the measurement
system is appropriate for a particular process. Conclusions regarding any process can only
be made after the appropriateness of the measurement system has been confirmed. The tool used
for measurement system validation is Gage R&R.
Variation
Variation : Variation is the feature by which something varies or deviates from the desired state.
Although the center of the data is a good measure for estimating the nature of a process, it cannot show
the variation that is inherent in the data, which is a very important aspect.
[Figure : cross-section of a pond. Hav = average height (depth); h1, h2, h3, h4 are heights at respective points.]
A person once went to a pond to bathe. He was told that the average depth of the pond is 1.2 m.
After learning this, he decided to dive into the pond, since this depth was well below the acceptable level.
His decision was based solely on the average depth of the pond. He failed to consider
the variation in the depth profile of the pond (as shown in the picture on the previous slide).
The same is true of the processes that we encounter in daily life. In such processes too, the mean
is not a complete descriptor of the process. Variation is a very important index to be taken into
account.
It has been observed that customer satisfaction is directly correlated with the extent to which the
process is consistent. This observation is particularly applicable to the service sector. The success of Six
Sigma in the banking sector is a result of the same philosophy. In such sectors, indices like cycle time and
transaction efficiency have been targeted to maintain consistency in the process.
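The pond example can be put into numbers. In the sketch below both sets of depths are hypothetical; they share the same 1.2 m average but have very different spread, which is exactly what the average alone hides.

```python
from statistics import mean, pstdev

# Hypothetical pond depths in metres; both sets average 1.2 m
uniform_pond = [1.1, 1.2, 1.3, 1.2]
uneven_pond = [0.3, 0.5, 3.0, 1.0]

for name, depths in (("uniform", uniform_pond), ("uneven", uneven_pond)):
    # Same mean, but the standard deviation and the deepest point differ a lot
    print(name, round(mean(depths), 2), round(pstdev(depths), 2), max(depths))
```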
Types of Variation : There are two types of variation that affect our process.
Common Cause Variation : It is also known as uncontrollable variation or white noise. This kind
of variation is inherent to the process. It cannot be controlled under the given technology.
It can be reduced by improving the technology. Whatever improvement we make in our
technology, this variation will always exist in the process.
Assignable Cause Variation : It is also known as controllable noise or black noise. This kind of
variation is caused by factors that are external to the process. It is caused by 4M changes. It
can be eliminated by improving the process control.
Probability Distribution
Probability Distribution : A probability distribution is a theoretical frequency distribution, that is,
a distribution that describes how outcomes are expected to vary.
Consider the tossing of a fair coin. Suppose the coin is tossed twice. The table below
illustrates the possible outcomes of this two-toss experiment.
First Toss   Second Toss   Number of tails on two tosses   Probability of the 4 possible outcomes
T            T             2                               0.5*0.5=0.25
T            H             1                               0.5*0.5=0.25
H            H             0                               0.5*0.5=0.25
H            T             1                               0.5*0.5=0.25
Probability distribution for the possible number of tails from two tosses of a fair coin.
Table - A
Number of tails   Tosses          Probability of this outcome, P(T)
0                 (H,H)           0.25
1                 (T,H) + (H,T)   0.50
2                 (T,T)           0.25
[Figure : bar chart of the probability distribution of the number of tails in two tosses of a fair coin. X axis : Number of Tails (0, 1, 2); Y axis : Probability (0.25, 0.50, 0.25).]
It should be kept in mind that Table-A does not represent the actual outcome of an experiment.
Rather, it is the theoretical outcome.
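The theoretical distribution in Table-A can be reproduced by enumerating the four equally likely outcomes. The sketch below uses exact fractions so that no rounding is involved; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All outcomes of two tosses: (H,H), (H,T), (T,H), (T,T)
outcomes = list(product("HT", repeat=2))
p_each = Fraction(1, 2) ** 2  # each outcome has probability 0.5 * 0.5

# Add up the probabilities of the outcomes that share a number of tails
distribution = Counter()
for outcome in outcomes:
    distribution[outcome.count("T")] += p_each

for tails in sorted(distribution):
    print(tails, float(distribution[tails]))
```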
Consider a process with specification 45 ± 2. If data is collected for this
process, the graph of the probability distribution representing the actual readings will look like
the one shown below.
[Figure : probability distribution for the readings. X axis : Specification (43, 45, 47); Y axis : Probability (0.1 to 0.4). The targeted mean and the actual mean are marked.]
Note that in real probability distribution curves, the actual average usually
differs from the targeted average.
Normal Distribution
Data is said to be normally distributed when :
1) There is a strong tendency for the data to take a central value.
2) Positive and negative deviations from this central value are equally likely.
3) The frequency of deviations falls off rapidly as the deviations become larger.
For a distribution to be a normal distribution :
1) The curve has a single peak, thus it is unimodal.
2) It has a bell shape.
3) The mean of the normally distributed population lies at the center of its normal curve.
4) The curve is symmetrical.
5) Mean, median and mode have the same value and lie at the center of the curve.
6) The two tails of the normal probability distribution extend indefinitely and never touch the
horizontal axis.
Central Limit Theorem
The central limit theorem states that as the sample size increases, the sampling distribution of the
mean approaches normality. Statisticians use the normal distribution as an approximation to the
sampling distribution whenever the sample size is at least 30.
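A small simulation makes the statement above concrete. The sketch below is an illustration under stated assumptions: the population is taken to be exponential (strongly skewed, mean 1, standard deviation 1), the seed and the number of trials are arbitrary, and `sample_means` and `skewness` are hypothetical helper names. As the sample size grows, the skewness of the sample means moves towards 0 (the value for a normal distribution) and their spread approaches sigma / sqrt(n).

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so that the run is repeatable


def sample_means(n, trials=5000):
    """Means of `trials` samples of size n from an exponential population."""
    return [mean(random.expovariate(1.0) for _ in range(n)) for _ in range(trials)]


def skewness(xs):
    """Third standardized moment; 0 for a symmetric (e.g. normal) distribution."""
    m, s = mean(xs), pstdev(xs)
    return mean(((x - m) / s) ** 3 for x in xs)


results = {}
for n in (1, 8, 30):
    means = sample_means(n)
    results[n] = (mean(means), pstdev(means), skewness(means))
    # columns: n, mean of sample means, spread, sigma/sqrt(n), skewness
    print(n, round(results[n][0], 3), round(results[n][1], 3),
          round(1 / n ** 0.5, 3), round(results[n][2], 2))
```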

[Figure : difference between the standard deviations of population and sample, plotted against sample size, with n=8 and n=30 marked.]
From the graph shown above, it is evident that for a sample size of 30 the difference
between the standard deviations of sample and population is very small. Although this
difference reduces further as the sample size increases, the reduction is negligible. Hence, while
sampling, a sample size of 30 is considered the ideal sample size.
Moreover, this difference has already decreased considerably
at a sample size of 8. Hence, when it is not possible to sample 30 pieces (in case of
destructive testing or in cases where sampling/experimentation is very expensive), a sample size of
8 is used for analysis. The minimum sample size required for any kind of analysis is 8.
Sampling
Sampling is the process of selecting units (e.g., people, organizations) from a population of interest
so that by studying the sample we may fairly generalize our results back to the population from
which they were chosen. It is usually not possible to study each and every unit of the population
to make interpretations. Not only is that time consuming and uneconomical, it can also be inaccurate.
Types of Sampling :
1) Random Sampling : In a random or probability sample, all items in the population have a
chance of being chosen in the sample.
2) Cluster Sampling : In cluster sampling, we divide the population into groups and then collect a
random sample of these clusters. We assume that these individual clusters are representative
of the population as a whole. For cluster sampling to be successful, a cluster has to be very
heterogeneous. It should be ensured that a cluster contains as much variety as the population. In
such cases variation between the clusters is less than variation within a cluster.
3) Stratified Sampling : In stratified sampling, the population is divided into relatively homogeneous groups
called strata. Strata should be as homogeneous as possible. From each stratum a specified
number of elements, corresponding to the proportion of that stratum in the population, is drawn.
When subpopulations vary considerably, it is advantageous to sample each subpopulation
(stratum) independently. Stratification is the process of grouping members of the population
into relatively homogeneous subgroups before sampling. The strata should be mutually
exclusive : every element in the population must be assigned to only one stratum. The strata
should also be collectively exhaustive : no population element can be excluded. Then random
sampling is applied within each stratum. This often improves the representativeness of the
sample by reducing sampling error.
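Proportional stratified sampling, as described above, can be sketched in a few lines. The population (parts made on three shifts), the stratum names and the function `stratified_sample` are all hypothetical; the point is only that each stratum contributes in proportion to its size and that random sampling is applied within each stratum.

```python
import random


def stratified_sample(strata, total_n, seed=0):
    """Draw a proportional random sample from each stratum.

    strata: dict mapping stratum name -> list of units (mutually exclusive,
    collectively exhaustive). Rounding means the sizes may not always add up
    to exactly total_n.
    """
    rng = random.Random(seed)
    population = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        k = round(total_n * len(units) / population)  # proportional allocation
        sample[name] = rng.sample(units, k)           # random sampling within stratum
    return sample


# Hypothetical population: parts produced on three shifts
strata = {
    "shift_A": [f"A{i}" for i in range(500)],
    "shift_B": [f"B{i}" for i in range(300)],
    "shift_C": [f"C{i}" for i in range(200)],
}
sample = stratified_sample(strata, total_n=30)
print({name: len(units) for name, units in sample.items()})
```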
Gage R&R Basics
Measurement System Analysis
If measurements are used to guide decisions, then it follows logically that the more error there is
in the measurements, the more error there will be in the decisions based on those measurements.
The purpose of Measurement System Analysis is to qualify a measurement system for use by
quantifying its accuracy, precision, and stability.
Measurement System Analysis is a critical first step that should precede any data-based decision
making, including Statistical Process Control, Correlation and Regression Analysis, and Design of
Experiments. The measurement system includes both operator and instrument. Total variation in the
measurement system is the result of variation caused by the operator as well as the instrument. Hence it
can be written as :
Variation (Measurement System) = Variation (Instrument) + Variation (Operator)
A measurement system can be characterized, or described, in five ways:
Location (Average Measurement Value vs. Actual Value):
Stability refers to the capacity of a measurement system to produce the same values over time
when measuring the same sample. As with statistical process control charts, stability means the
absence of "Special Cause Variation", leaving only "Common Cause Variation" (random variation).
Bias, also referred to as Accuracy, is a measure of the distance between the average value of
the measurements and the "True" or "Actual" value of the sample or part.
Linearity : Linearity is a measure of the consistency of Bias over the range of the measurement
device. For example, if a bathroom scale is under by 1.0 pound when measuring a 150 pound
person, but is off by 5.0 pounds when measuring a 200 pound person, the scale Bias is non-linear
in the sense that the degree of Bias changes over the range of use.
Repeatability assesses whether the same appraiser can measure the same part/sample multiple
times with the same measurement device and get the same value.
Reproducibility assesses whether different appraisers can measure the same part/sample with
the same measurement device and get the same value.
Precision : The property by virtue of which the same readings are obtained when the measurement is repeated at
different times.
[Figure : four targets around the True Value : accurate and precise; accurate but not precise; precise but not accurate; neither accurate nor precise.]
Requirements :
Following are general requirements of all capable measurement systems:
Statistical stability over time.
Variability small compared to the process variability.
Variability small compared to the specification limits (tolerance).
The resolution, or discrimination of the measurement device must be small relative to the smaller
of either the specification tolerance or the process spread (variation). As a rule of thumb, the
measurement system should have resolution of at least 1/10th the smaller of either the
specification tolerance or the process spread. If the resolution is not fine enough, process
variability will not be recognized by the measurement system, thus blunting its effectiveness.
Measurement System Analysis Fundamentals:
1.Determine the number of appraisers, number of sample parts, and the number of repeat
readings. Larger numbers of parts and repeat readings give results with a higher confidence
level, but the numbers should be balanced against the time, cost, and disruption involved.
2.Use appraisers who normally perform the measurement and who are familiar with the
equipment and procedures.
3.Make sure there is a set, documented measurement procedure that is followed by all
appraisers.
4.Select the sample parts to represent the entire process spread. This is a critical point. If the
process spread is not fully represented, the degree of measurement error may be
overstated.
5.If applicable, mark the exact measurement location on each part to minimize the impact of
within-part variation (e.g. out-of-round).
6.Ensure that the measurement device has adequate discrimination/resolution, as discussed in
the Requirements section.
7.Parts should be numbered, and the measurements should be taken in random order so that the
appraisers do not know the number assigned to each part or any previous measurement value for
that part. A third party should record the measurements, the appraiser, the trial number, and the
number for each part on a table.
Gage R&R
Stability Assessment :
1.Select a part from the middle of the process spread and determine its reference value relative
to a traceable standard. If a traceable standard is not available, measure the part ten times in a
controlled environment and average the values to determine the Reference Value. This
part/sample will be designated as the Master Sample.
2.Over at least twenty periods (days/weeks), measure the master sample 3 to 5 times. Keep
the number of repeats fixed. Take readings throughout the period to capture the natural
environmental variation.
3. Plot the data on an Xbar & R chart.
4. Referring to the Xbar & R chart, subtract the Reference Value from Xbar to yield the Bias :
Bias = Xbar - Reference Value
Process Variation = 6 x Standard Deviation (Sigma)
5. Calculate the bias percentage : BP = Bias / Process Variation
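The bias calculation in steps 4 and 5 can be sketched as below. All numbers are hypothetical (the reference value, the readings of the master sample and the process standard deviation are assumptions made for illustration), and the bias percentage is expressed here as a percentage, i.e. the ratio multiplied by 100.

```python
from statistics import mean

# Hypothetical data for a master sample
reference_value = 10.00
readings = [10.02, 10.01, 10.03, 9.99, 10.02, 10.00, 10.03, 10.02]
process_sigma = 0.05  # assumed process standard deviation

xbar = mean(readings)
bias = xbar - reference_value                  # Bias = Xbar - Reference Value
process_variation = 6 * process_sigma          # Process Variation = 6 sigma
bias_percentage = 100 * bias / process_variation

print(round(bias, 3), round(bias_percentage, 1))
```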
Analyze the result. If the bias is relatively high, the following can be the reasons behind it :
1) Appraisers not following the measurement procedure
2) An error in measuring the reference value
3) Instability in the measurement. If the SPC chart shows a trend, the measurement device could
be wearing or calibration could be drifting.
Types of Gage R&R Study :
1) Short Term Gage R&R : Requires only 2 operators and 5 parts. Each part is measured only
once by each operator. It cannot separate repeatability and reproducibility.
2) Long Term Gage R&R : In this type of Gage R&R, the prerequisites are :
a) Minimum 2 operators.
b) 10 parts and a minimum of two readings per part by each operator (20 readings in total from each operator).
c) The least count of the instrument should be 1/10th of the process tolerance.
Short Study Method
※ The height of a CTQ component assembly : Spec. = 2.000 ± 0.015

Part        Operator 1   Operator 2   |Range(1-2)|
1           2.003        2.001        0.002
2           1.998        2.003        0.005
3           2.007        2.006        0.001
4           2.001        1.998        0.003
5           1.999        2.003        0.004
Range sum                             0.015

Average range (R-Bar) = ∑R / n = 0.015 / 5 = 0.003
Tolerance = 0.030
Gage Error = (5.15 / 1.19) * (R-bar) = 4.33 * 0.003 = 0.013
GRR as a % of Tolerance = (0.013 x 100) / 0.030 = 43.3%
☆ Gage error is calculated by multiplying the average range by the constant 5.15 / d*,
where d* is determined from the following table (5 parts and 2 operators give d* = 1.19).
5.15 standard deviations span the 99% confidence interval of the measurements made by the gage.

d* values for distribution of the average range
Number of parts   Number of operators
                  2      3      4      5
1                 1.41   1.91   2.24   2.48
2                 1.28   1.81   2.15   2.40
3                 1.23   1.77   2.12   2.38
4                 1.21   1.75   2.11   2.37
5                 1.19   1.74   2.10   2.36
6                 1.18   1.73   2.09   2.35
7                 1.17   1.73   2.09   2.35
8                 1.17   1.72   2.08   2.35
9                 1.16   1.72   2.08   2.34
10                1.16   1.72   2.08   2.34
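The short-study arithmetic can be reproduced directly from the five pairs of readings. This is a minimal sketch of the calculation shown above, using d* = 1.19 for 5 parts and 2 operators; variable names are illustrative.

```python
op1 = [2.003, 1.998, 2.007, 2.001, 1.999]  # Operator 1 readings, parts 1..5
op2 = [2.001, 2.003, 2.006, 1.998, 2.003]  # Operator 2 readings, parts 1..5
tolerance = 0.030                          # 2 x 0.015
d_star = 1.19                              # from the d* table: 5 parts, 2 operators

ranges = [abs(a - b) for a, b in zip(op1, op2)]
r_bar = sum(ranges) / len(ranges)          # average range
gage_error = 5.15 / d_star * r_bar         # 5.15 sigma spread of the gage
grr_percent = 100 * gage_error / tolerance

print(round(r_bar, 4), round(gage_error, 4), round(grr_percent, 1))
```

The unrounded result (about 43.3%) is well above the 30% level at which a gage is treated as unacceptable.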
Long Study Method
Consider the following data for Gage R&R Long Study

Part   Operator 1                   Operator 2
       Reading - 1   Reading - 2    Reading - 1   Reading - 2
1      54            54             54            55
2      55            55             55            56
3      57            57             57            57
4      57            57             57            57
5      58            58             58            58
6      57            57             57            57
7      54            54             54            54
8      54            54             54            54
9      55            55             55            55
10     57            57             57            57

Accessing Gage R&R in MINITAB :
1) Quality Tools
2) Gage Study
3) Gage R&R Study (Crossed)
Fill the worksheet with the appropriate information.
Put Parts, Operator and Reading in the window.
Click Options and put the value of tolerance in the window.
Substitute 6 by 5.15.
Gage R&R
Gage R&R Study - ANOVA Method
Two-Way ANOVA Table With Interaction If significant, P-value < 0.25 indicates
that an operator is having a problem
measuring some the parts. Hence Gage
Source DF SS MS F P R&R is not acceptable.
Parts 9 81.6 9.06667 204.000 0.000
Operator 1 0.1 0.10000 2.250 0.168
Parts * Operator 9 0.4 0.04444 0.889 0.552
Repeatability 20 1.0 0.05000
Total 39 83.1
Two-Way ANOVA Table Without Interaction
Source DF SS MS F P
Parts 9 81.6 9.06667 187.810 0.000
Operator 1 0.1 0.10000 2.071 0.161
Repeatability 29 1.4 0.04828
Total 39 83.1
Source              VarComp   %Contribution (of VarComp)
Total Gage R&R      0.05086     2.21
  Repeatability     0.04828     2.09
  Reproducibility   0.00259     0.11
    Operator        0.00259     0.11
Part-To-Part        2.25460    97.79
Total Variation     2.30546   100.00

Source              StdDev (SD)   Study Var (6 * SD)   %Study Var (%SV)   %Tolerance (SV/Toler)
Total Gage R&R      0.22553       1.35316               14.85              33.83
  Repeatability     0.21972       1.31831               14.47              32.96
  Reproducibility   0.05085       0.30513                3.35               7.63
    Operator        0.05085       0.30513                3.35               7.63
Part-To-Part        1.50153       9.00919               98.89             225.23
Total Variation     1.51837       9.11024              100.00             227.76

% Study Variation and % Study Tolerance < 20% : Gage R&R is acceptable
Between 20% and 30% : Gage R&R is conditionally acceptable
Above 30% : it is unacceptable
For Gage R&R to be acceptable, number of distinct categories > 4
Number of Distinct Categories = 9
Total Measurement Variation : s²(total) = s²(part to part) + s²(Gage R&R)
  s²(part to part) : variation due to defects in the parts themselves
  s²(Gage R&R) : variation due to the measuring system
Total Gage R&R = Variation because of instrument (repeatability) + Variation because of operator (reproducibility)
s²(Gage R&R) = s²(repeatability) + s²(reproducibility)
% Study Variation = ( s(Gage R&R) / s(total) ) x 100
% Study Tolerance = ( 5.15 x s(Gage R&R) / Total Tolerance ) x 100
% Study Variation indicates the contribution of measuring system variation to the total variation.
For the measurement system to be efficient, % Study Variation should be very small.
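The variance components behind these formulas can be reproduced from the long-study readings. The sketch below is not MINITAB's code: it assumes the usual two-way ANOVA expected-mean-square formulas, with the Parts * Operator interaction pooled into repeatability (as in the "Without Interaction" table), and all variable names are illustrative.

```python
# readings[part][operator] = [reading 1, reading 2] for the long-study data
readings = [
    [[54, 54], [54, 55]], [[55, 55], [55, 56]], [[57, 57], [57, 57]],
    [[57, 57], [57, 57]], [[58, 58], [58, 58]], [[57, 57], [57, 57]],
    [[54, 54], [54, 54]], [[54, 54], [54, 54]], [[55, 55], [55, 55]],
    [[57, 57], [57, 57]],
]
p, o, r = len(readings), len(readings[0]), len(readings[0][0])  # parts, operators, repeats
n = p * o * r
values = [x for part in readings for cell in part for x in cell]
grand = sum(values) / n

part_means = [sum(x for cell in part for x in cell) / (o * r) for part in readings]
op_means = [sum(x for part in readings for x in part[j]) / (p * r) for j in range(o)]

# Sums of squares (ASSUMPTION: interaction pooled into repeatability)
ss_total = sum((x - grand) ** 2 for x in values)
ss_parts = o * r * sum((m - grand) ** 2 for m in part_means)
ss_oper = p * r * sum((m - grand) ** 2 for m in op_means)
ss_repeat = ss_total - ss_parts - ss_oper

ms_parts = ss_parts / (p - 1)
ms_oper = ss_oper / (o - 1)
ms_repeat = ss_repeat / (n - p - o + 1)

# Variance components; negative estimates are set to zero
var_repeat = ms_repeat
var_oper = max(0.0, (ms_oper - ms_repeat) / (p * r))   # reproducibility
var_part = max(0.0, (ms_parts - ms_repeat) / (o * r))  # part to part
var_grr = var_repeat + var_oper
var_total = var_grr + var_part

pct_study_var = 100 * (var_grr / var_total) ** 0.5
print(round(var_repeat, 5), round(var_oper, 5), round(var_part, 5), round(pct_study_var, 2))
```

With these readings the sketch gives the same variance components as the MINITAB output above, and a % Study Variation of about 14.85.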
% Study Tolerance indicates the ability of the measuring system to perform within the given
tolerance range. It might happen that % Study Variation is small
while % Study Tolerance is higher. In such a case it can be inferred that the measuring system is not
capable enough to take measurements within the given tolerance range.
Interpretations from MINITAB
[Figure : Gage R&R (ANOVA) for Thickness. Upper panel : Xbar Chart by Operator (operators 1, 2, 3), Sample Mean with 3.0SL=2.380, X=2.307, -3.0SL=2.235, labelled "Operator Variation". Lower panel : R Chart by Operator, Sample Range with 3.0SL=0.1252, R=0.03833, -3.0SL=0.000, labelled "Instrument Variation".]
◈ X - Bar Control-Chart
• It is a very favorable outcome when most of the measurement points are outside the control limits.
→ Control limits are calculated from variation between operators (Operator Variation).
Small operator variation means narrow control limits.
→ Measurement variation (operator, measuring system) is relatively smaller than the variation between the parts.
◈ R Chart by Operator
In this case the favorable condition is when most of the measuring points are in control. Repetition of the
same measured value indicates that the measuring system is accurate.
Gage R&R Discrete Data
In case of attribute data, the testing criterion is pass or fail. The study is similar to the Long Term Gage R&R for
continuous data.
1) Readings have to be taken by two operators
2) A minimum of 20 samples have to be measured
3) Each part has to be checked twice by each operator.

S.No.   Appraiser "A"   Appraiser "B"
        1      2        1      2
1       G      G        G      G
2       G      G        G      G
3       NG     G        G      G
4       NG     NG       NG     NG
5       G      G        G      G
6       G      G        G      G
7       NG     NG       NG     NG
8       NG     NG       G      G
9       G      G        G      G
10      G      G        G      G
11      G      G        G      G
12      G      G        G      G
13      G      NG       G      G
14      G      G        G      G
15      G      G        G      G
16      G      G        G      G
17      G      G        G      G
18      G      G        G      G
19      G      G        G      G
20      G      G        G      G

Gage R&R for any part is acceptable when all four observations are the same.
• The gage is acceptable if all the checks (four per part) agree.
Gage R&R is unacceptable if the error is more than 10%.
In this case there are three parts (3, 8 and 13) for which the readings are not all the same, so
unacceptable cases = (3 / 20) * 100 = 15%
Since the % error in this case is > 10%,
the measurement system is not valid.
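The attribute study can be tallied mechanically. The sketch below encodes the 20 parts from the table (G = good, NG = not good), counts the parts on which the four checks do not all agree, and applies the 10% limit; the variable names are illustrative.

```python
# (A1, A2, B1, B2) per part; every part not listed below reads G four times
results = {part: ("G", "G", "G", "G") for part in range(1, 21)}
results[3] = ("NG", "G", "G", "G")
results[4] = ("NG", "NG", "NG", "NG")
results[7] = ("NG", "NG", "NG", "NG")
results[8] = ("NG", "NG", "G", "G")
results[13] = ("G", "NG", "G", "G")

# A part is an unacceptable case when its four checks are not all the same
disagreements = [part for part, checks in results.items() if len(set(checks)) > 1]
error_percent = 100 * len(disagreements) / len(results)
acceptable = error_percent <= 10

print(disagreements, error_percent, acceptable)
```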
Capability Analysis
Capability analysis is a set of calculations used to assess whether a system is statistically able to meet a set of
specifications or requirements. To complete the calculations, a set of data is required, usually generated by a
control chart; however, data can be collected specifically for this purpose.
While collecting data for capability analysis, rational sub grouping should be ensured. Sampling should be done in
such a way that all the components of the population are covered.
Specifications or requirements are the numerical values within which the system is expected to operate, that is,
the minimum and maximum acceptable values. Occasionally there is only one limit, a maximum or minimum.
Customers, engineers, or managers usually set specifications. Specifications are numerical requirements, goals,
aims, or standards. It is important to remember that specifications are not the same as control limits.
Control limits come from control charts and are based on the data. Specifications are the numerical
requirements of the system.
All methods of capability analysis require that the data is statistically stable, with no special causes of variation
present. To assess whether the data is statistically stable, a control chart should be completed. If special causes
exist, data from the system will be changing. If capability analysis is performed, it will show approximately what
happened in the past, but cannot be used to predict capability in the future.
It will provide only a snapshot of the process at best. If, however, a system is stable, capability analysis shows
not only the ability of the system in the past, but also, if the system remains stable, predicts the future
performance of the system.
Capability analysis is summarized in indices; these indices show a system’s ability to meet its numerical
requirements. They can be monitored and reported over time to show how a system is changing. Various
capability indices are presented in this section; however, the main indices used are Cp and Cpk. The indices
are easy to interpret; for example, a Cpk of more than one indicates that the system is producing within the
specifications or requirements. If the Cpk is less than one, the system is producing data outside the
specifications or requirements. This section contains detailed explanations of various capability indices and
their interpretation. Capability analysis is an excellent tool to demonstrate the extent of an improvement made
to a process. It can summarize a great deal of information simply, showing the capability of a process, the
extent of improvement needed, and later the extent of the improvement achieved.
Cp (Process Capability) :
The capability index is defined as : Cp = (allowable range)/6s = (USL - LSL)/6s
The capability index shows how well a process is able to meet specifications. The higher the value of the index,
the more capable the process :
Cp < 1 (process is unsatisfactory)
1 < Cp < 1.6 (process is of medium relative capability)
Cp > 1.6 (process shows high relative capability)
For a process to be a Six Sigma process, Cp = 2
Cpk (Process Capability Index) :
Process capability shows how well a process is able to meet its specification, but it does so in
terms of spread only. It does not cover the shift of the process. To account for shift, the process capability
index is used.
For a process to be a Six Sigma process, Cpk = 1.5
Process Capability
Process capability refers to the ability of a process to produce a defect-free product or service in a
controlled production or service environment. Various indicators are used : some address overall
performance, some address potential performance.
Process Capability Indices (Cp and Cpk)
A standardised measure of the short-term performance of a process is Cp, normally equated to
six standard deviations of the dispersion : Cp = (USL - LSL) / 6s
The long-term process performance, Cpk, is a similar ratio to Cp except that it considers the
shift of the mean relative to the target value : Cpk = min{ (USL - X-bar) / 3s, (X-bar - LSL) / 3s }
Zshort = 3 x Cp
Zlong = 3 x Cpk
Statistical Concept
Process Capability Index formula
Both Spec. (balanced, LSL and USL) : Cp = (USL - LSL) / 6σ
One End Spec. (Upper Spec Limit) : Cp = (USL - X-bar) / 3σ
One End Spec. (Lower Spec Limit) : Cp = (X-bar - LSL) / 3σ
Both Spec. with shift : Cpk = (1 - k) Cp, where k = |T - X-bar| / (T - LSL) and T is the target
* In case of a One End Specification, regardless of the specific limit, Cp equals Cpk (e.g. "should be less than 25 dB").
Consider the following process.
In this case it is evident that the spread of the process is very small, but the process is shifted away from the target mean. This process
will have a good Cp value but a poor Cpk value.
Cpk = (1 - k) Cp
k = |M - X-bar| / (T/2)
where M is the target mean,
X-bar is the actual obtained mean,
T is the total tolerance.
Cp tells us about the spread of the process; Cpk tells us about both the shift and the spread of the process.
Calculating Cp and Cpk using MINITAB
Consider the following data. This data has been divided into six subgroups.

SubGroup1   SubGroup2   SubGroup3   SubGroup4   SubGroup5   SubGroup6
11          13          9           13          15          16
14          11          10          12          20          18
11          13          14          11          12          12
13          13          7           13          14          8
16          10          7           11          15          19
Stack the given data.
Select the subgroups that are to be stacked.
Conduct the normality test for the given data.
Put the column name in Variables.
Click on Normality Test.
If p-Value > .05, then the data is normal.
[Figure : Probability Plot of Data (Normal). X axis : Data (5.0 to 20.0); Y axis : Percent. Mean 12.7, StDev 3.164, N 30, AD 0.369, P-Value 0.404.]
Case 1: For Normal Data
Step 1 : Stack the given data
Step 2 : Perform process capability test on the given data.
Mention the subgroup size, lower spec and upper spec.
[Figure : Process Capability of Data. Histogram with LSL and USL marked and fitted Within and Overall curves; X axis from 6 to 20.]
Process Data
LSL               9.00000
Target            *
USL               15.00000
Sample Mean       12.70000
Sample N          30
StDev (Within)    2.77649
StDev (Overall)   3.19130
Potential (Within) Capability
Cp     0.36
CPL    0.44
CPU    0.28
Cpk    0.28
CCpk   0.36
Overall Capability
Pp     0.31
PPL    0.39
PPU    0.24
Ppk    0.24
Cpm    *
              Observed Performance   Exp. Within Performance   Exp. Overall Performance
PPM < LSL     100000.00              91328.61                  123146.19
PPM > USL     166666.67              203726.50                 235544.19
PPM Total     266666.67              295055.11                 358690.38
St Dev (Within) represents short term variation.
St Dev (Overall) represents long term variation.
Cp represents process capability.
Cpk represents process capability index.
Pp represents process performance.
Ppk represents process performance index.
Cp and Cpk are short term indices.
Pp and Ppk are long term indices.
For rational subgrouping to be there, the lines for short term variation and long term variation
should be distinct. If the lines are overlapping, it shows poor rational subgrouping.
Cp = (USL - LSL) / (6 x Standard Deviation Within)
Cpu = (USL - X-Bar) / (3 x Standard Deviation Within)
Cpl = (X-Bar - LSL) / (3 x Standard Deviation Within)
Cpk = (1 - k) Cp
In case of Pp (Process Performance Index), Standard Deviation Within is replaced by
Standard Deviation Overall.
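These indices can be recomputed from the 30 readings. The sketch below rests on an assumption about how the two standard deviations are estimated: the within value is taken as the pooled subgroup standard deviation and the overall value as the standard deviation of all the data, each divided by the unbiasing constant c4. Under that assumption the results agree with the MINITAB output shown above (StDev Within 2.77649, StDev Overall 3.19130). Cpk is computed as min(Cpu, Cpl), which for a symmetric specification equals (1 - k) Cp.

```python
import math
from statistics import mean, stdev

# Each inner list is one subgroup (one column of the data table)
subgroups = [
    [11, 14, 11, 13, 16],
    [13, 11, 13, 13, 10],
    [9, 10, 14, 7, 7],
    [13, 12, 11, 13, 11],
    [15, 20, 12, 14, 15],
    [16, 18, 12, 8, 19],
]
LSL, USL = 9.0, 15.0


def c4(n):
    """Unbiasing constant for a standard deviation estimated from n points."""
    return math.sqrt(2 / (n - 1)) * math.exp(math.lgamma(n / 2) - math.lgamma((n - 1) / 2))


data = [x for g in subgroups for x in g]
xbar = mean(data)

# Within (short term): pooled subgroup standard deviation, unbiased with c4 (assumed method)
dof = sum(len(g) - 1 for g in subgroups)
pooled_var = sum((x - mean(g)) ** 2 for g in subgroups for x in g) / dof
sd_within = math.sqrt(pooled_var) / c4(dof + 1)

# Overall (long term): standard deviation of all readings, unbiased with c4 (assumed method)
sd_overall = stdev(data) / c4(len(data))


def indices(sd):
    """Return (capability, capability index) for a given standard deviation."""
    cp = (USL - LSL) / (6 * sd)
    cpk = min((USL - xbar) / (3 * sd), (xbar - LSL) / (3 * sd))
    return cp, cpk


cp, cpk = indices(sd_within)
pp, ppk = indices(sd_overall)
print(round(cp, 2), round(cpk, 2), round(pp, 2), round(ppk, 2))
```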
Case 2: For Non Normal Data
Step 1 : If the data is non normal, a Box-Cox transformation needs to be done.
Substitute for data and subgroup size.
[Figure : Box-Cox Plot of Data. X axis : Lambda (-5.0 to 5.0); Y axis : StDev; Lower CL, Upper CL and Limit lines marked.]
Lambda (using 95.0% confidence)
Estimate      0.38924
Lower CL     -0.68642
Upper CL      1.44562
Best Value    0.50000
Select the value of Lambda (Estimate).
Select Capability Analysis Normal.
Select Box Cox Transformation.
Put the value of Lambda in the dialogue box.
Obtain the indices as in the previous case.
Binomial Distribution of data
If the data is non measurable, not continuous then capability test will be done using
Binomial method
Trial Defects1 Go to Stat/Quality Tools/Capability analysis/Binomial
1900 23
3456 567
2345 678
2345 234
Enter Defects1 in Defectives and the number of trials ( Trial ) in Use sizes in, then press OK.
Results
Binomial Process Capability Analysis of defects1
[Figure: four panels - P Chart ( Proportion versus Sample ), Rate of Defectives ( % Defective versus Sample Size ),
Cumulative % Defective ( versus Sample ) and Dist of % Defective]
P Chart : UCL = 0.1716, P-bar = 0.1495, LCL = 0.1274
Tests performed with unequal sample sizes
Summary Stats (using 95.0% confidence)
% Defective : 14.95     Lower CI : 14.26     Upper CI : 15.66
Target : 0.00
PPM Def : 149512        Lower CI : 142592    Upper CI : 156637
Process Z : 1.0385      Lower CI : 1.0084    Upper CI : 1.0687
Interpreting the results
We have to look at the process Z, which at 1.0385 is very low, so the process has to be
improved.
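The summary figures in the binomial results can be reproduced by hand. The sketch below is an illustration using only the Python standard library; it assumes that process Z is the standard normal Z value that leaves the overall proportion defective in the upper tail.

```python
from statistics import NormalDist

# Worksheet data from the binomial example above.
trials = [1900, 3456, 2345, 2345]
defects = [23, 567, 678, 234]

# Overall proportion defective (P-bar on the P chart).
p_bar = sum(defects) / sum(trials)
ppm_defective = p_bar * 1_000_000

# Assumed relation: process Z leaves p_bar in the upper tail of the normal curve.
process_z = NormalDist().inv_cdf(1 - p_bar)

print(f"% Defective = {100 * p_bar:.2f}")
print(f"PPM Def     = {ppm_defective:.0f}")
print(f"Process Z   = {process_z:.4f}")
```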
Z-Value
Z Value indicates the number of standard deviations lying between the process mean and the USL, or
between the process mean and the LSL. For a Six Sigma process with only one specification limit ( Upper or Lower ), this results in
six process standard deviations between the mean of the process and the customer’s
specification limit ( hence 6σ ).
Z = ( USL – X-Bar ) / σ or Z = ( X-Bar – LSL ) / σ
Z value gives an idea about the quality of the process. The higher the Z value, the better the process.
Zst ( Z-Short Term ) : Zst is based on data that has been taken on a short term basis. Zst only
reflects the technology of the process. It does not take into account the shift in the process that is
caused by process variation ( 4M factors ). In other words, Zst takes into
account white noise only.
Zlt ( Z-Long Term ) : Zlt takes into account both white noise and black noise. It is based on data that has
been taken over an extended period of time. It reflects technology as well as process control.
Process control involves the shift in the process that is caused by process variation ( 4M
factors ).
Examples of process shift or 4M factors : seasonal changes, variation because of external factors,
change in the skill of a worker over a period of time etc. These changes can be attributed to :
a) personnel who operate the processes;
b) materials which are used as inputs (including information);
c) machines or equipment being used in the process (in process execution or
monitoring/measurement);
d) methods (including criteria and various documentation used along the process);
e) work environment.
Estimating Z-Value from Z-Table : Calculate the Z-Value for 1000 PPM value.
Step 1: Calculate the absolute value for 1000 PPM = 1000/10^6 = 0.001
Step 2: 1000 PPM can be written as 1.0 * 10^-3.
Step 3: Notice that the leading term ( 1.0, shown in red ) lies between 1.0 and 9.9.
Step 4: Look for all 10^-3 terms in the table.
Step 5: Locate 1.0 * 10^-3 in the table.
As can be seen, the value lies in the 3.0 row and 0.09 column, hence Z value = 3.0 + 0.09 = 3.09
This table gives the Zlt value for any process.
Hence, Zst = Zlt + Zshift = Zlt + 1.5 = 3.09 + 1.5 = 4.59
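The table lookup above can be cross-checked with the inverse of the standard normal distribution. The following is a minimal sketch; the function name `z_from_ppm` is hypothetical, and the 1.5 shift is the value used in the text.

```python
from statistics import NormalDist


def z_from_ppm(ppm, z_shift=1.5):
    """Return (Zlt, Zst) for a defect rate given in parts per million."""
    p_defect = ppm / 1_000_000                  # Step 1: absolute value
    z_lt = NormalDist().inv_cdf(1 - p_defect)   # replaces the Z table lookup
    z_st = z_lt + z_shift                       # Zst = Zlt + Zshift
    return z_lt, z_st


z_lt, z_st = z_from_ppm(1000)
print(f"Zlt = {z_lt:.2f}, Zst = {z_st:.2f}")
```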
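Z bench
Definition of Z.BENCH
[Figure: normal curve with 9% of the area beyond LSL ( ZLSL = 1.34 ), 10% beyond USL ( ZUSL = 1.22 ) and a total defect area of 19% ( ZBENCH = .88 )]
Z bench is the Z value that corresponds to the total defect area taken from LSL and USL together.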
Steps involved in calculating Z bench
•PUSL is the probability of a defect relative to the USL.
Z USL = ( USL – µ ) / σ , then see the area in Z table
•PLSL is the probability of a defect relative to the LSL.
Z LSL = ( µ – LSL ) / σ , then see the area in Z table
•PTotal is the total probability of a defect. PTotal = PUSL + PLSL
Add the areas obtained from the Z table for Z USL and Z LSL.
Zbench is the Z value from the normal table which corresponds to the total probability of a defect.
That is, after adding the areas for Z USL and Z LSL, look up the table once again for the Z value of this total area.
Q 4. Calculate the following :-
1) Z bench
[Figure: normal curve of Measurement (Time) from -∞ to +∞ with µ = 25, σ = 5, LSL = 10 and USL = 50; axis ticks 10 15 20 25 30 35 50]
Z USL = ( USL – µ ) / σ
Z USL = ( 50 – 25 ) / 5 = 5 , then see the area in Z table = 4.98 * 10^-7
Z LSL = ( µ – LSL ) / σ
Z LSL = ( 25 – 10 ) / 5 = 3 , then see the area in Z table = 1.35 * 10^-3
Total area = PUSL + PLSL = 4.98 * 10^-7 + 1.35 * 10^-3 = 0.001350498
Now see the corresponding Z value for 0.001350498 ; the Z value is 3
Hence Z bench = 3
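The Z bench steps can also be written out as a short sketch. Here the tail areas are computed from the normal distribution instead of being read from the printed Z table, so the last digits of the areas can differ from the table values; the mean of 25 and σ of 5 are the values used in the worked example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

mu, sigma = 25.0, 5.0
lsl, usl = 10.0, 50.0

z_usl = (usl - mu) / sigma
z_lsl = (mu - lsl) / sigma

# Tail area beyond each specification limit (what the Z table lookup gives).
p_usl = std_normal.cdf(-z_usl)
p_lsl = std_normal.cdf(-z_lsl)

# Add the areas, then convert the total area back to a Z value.
p_total = p_usl + p_lsl
z_bench = std_normal.inv_cdf(1 - p_total)

print(f"Z USL = {z_usl:.0f}, Z LSL = {z_lsl:.0f}, Z bench = {z_bench:.2f}")
```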
115
Analysis
CONTENTS :
• Introduction to analysis
• Introduction to hypothesis
•Consumers and manufacturer risk
•Calculating Confidence interval.
• Tools used in analysis
116
ANALYZE
Analyze comes after the Define and Measure stages in Six Sigma methodology.
Define aims at theme selection and justification, along with finding the possible root causes.
Measure aims at finding the present status of a process and establishing whether the existing
measurement system is valid or not.
Analyze selects the vital few factors out of the possible factors. Possible factors are the probable
reasons that have been brainstormed in the Define stage.
The first step of Analyze is the fish bone diagram, or cause and effect diagram.
The fish bone diagram involves listing all the probable factors and classifying them under suitable
categories.
117
Fish Bone Diagram
The cause & effect diagram is the brainchild of Kaoru Ishikawa, who pioneered quality
management processes in the Kawasaki shipyards, and in the process became one of the
founding fathers of modern management. The cause and effect diagram is used to explore
all the potential or real causes (or inputs) that result in a single effect (or output). Causes are
arranged according to their level of importance or detail, resulting in a depiction of
relationships and hierarchy of events. This can help you search for root causes, identify
areas where there may be problems, and compare the relative importance of different
causes.
Causes in a cause & effect diagram are frequently arranged into four major categories. While
these categories can be anything, you will often see:
1) Man, Method, Material and Machinery (recommended for manufacturing)
2) Equipment, Policies, Procedures and People (recommended for administration
and service).
These guidelines can be helpful but should not be used if they limit the diagram or are
inappropriate. The categories you use should suit your needs.
The C&E diagram is also known as the fishbone diagram because it was drawn to resemble
the skeleton of a fish, with the main causal categories drawn as "bones" attached to the spine
of the fish, as shown below.
[Figure: fishbone diagram with the categories Man, Machine, Method and Material as bones, Cause 1 and Cause 2 attached to a bone, and the Problem or Effect at the head]
119
What is hypothesis??
•A hypothesis is an assumption which we make about a population parameter. Once we make the
assumption, we have to collect data and use sample statistics to conclude how likely it is that
our hypothesized ( assumed ) value is correct.
•When we collect the data, we compare the hypothesized value with the actual ( sample ) value. If the
difference is small, our assumption is likely to be right; if the difference is large, the assumption
is likely to be incorrect.
Analysis.
Example :
•Suppose a manager says her employee's performance level is 90%. How can we test the validity of
her hypothesis?
•For that we have to collect a sample. If the sample indicates the performance is 95%, we can
directly accept the manager's statement; if our sample statistic says the performance is 46%, then
we can directly reject the manager's statement. This accept or reject outcome is reached using our
common sense.
•Now suppose our sample says the performance level is 88%, which is very close to the manager's
statement, but we are not absolutely certain whether to accept or reject it. Therefore we
have to learn to deal with the uncertainty in our decision making.
•We cannot accept or reject a hypothesis about a population parameter simply by intuition;
instead we need to learn how to decide objectively, on the basis of sample information, whether to
accept or reject.
Null Hypothesis
•In hypothesis testing we must state the assumed or hypothesized value of the population parameter ( that is, we should state what
we are assuming about the population ) before we begin sampling.
The assumption we wish to test is called the NULL Hypothesis and is symbolized Ho.
•For example, in a medical application, in order to test the effectiveness of a new drug, the
tested hypothesis ( null hypothesis ) was that it had no effect, that is, there is no difference
between treated and untreated samples.
•When we use a hypothesized value of the population mean in our problem, we represent
it symbolically as
µHo ( the null hypothesized value of the population mean ).
•If our sample results fail to support the null hypothesis ( if the results are not as per our assumption ),
we must conclude that something else is true.
•Whenever we reject the null hypothesis, the conclusion we do accept is called the Alternate
Hypothesis and is symbolized Ha.
Interpreting the Significance Level.
•The purpose of hypothesis testing is to make a judgment about the difference between the sample statistic and
the hypothesized population parameter.
•The next step after stating the null and alternate hypotheses is to decide which criterion to use for
deciding whether to accept or reject the null hypothesis.
The term which allows us to decide whether to accept or to reject is the significance level.
What is significance level ??
Let us take this example to understand what the significance level is.
[Figure: sampling distribution centred on µHo. The central .95 of the area is the region where there is no significant
difference between the statistic and the hypothesized parameter ( accept Ho ). The two tails of .025 of the area each
are the regions where there is a significant difference between the statistic and the hypothesized parameter ( reject null hypothesis ).]
•As the total area under a distribution curve is 1, in this example a total of 5% of the area ( .025 on each side ),
marked in black, lies in the tails of the curve.
•From the Z table we can determine that 95% of all the area under the curve is included in an interval
extending 1.96σ on either side of the hypothesized mean. In this 95% of the area there is no significant
difference between the observed value of the sample statistic and the hypothesized value of the
population parameter; in the remaining 5% of the area ( coloured in black ) a significant difference exists.
•In this example, if the sample statistic falls in the .95 of the area under the curve, we would accept the null hypothesis.
If it falls in the black coloured part under the curve ( .025 on each side ), representing a total of 5% of the area,
we would reject the null hypothesis.
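The 1.96 figure quoted above follows directly from the chosen significance level. The sketch below shows the relation; the function name `two_tailed_limits` and the numbers in the usage lines ( a hypothesized mean of 1000 and a standard error of 10 ) are hypothetical and serve only as an illustration.

```python
from statistics import NormalDist


def two_tailed_limits(mu_h0, std_error, alpha=0.05):
    """Return (lower limit, upper limit, critical Z) of the acceptance region."""
    # Each tail holds alpha / 2 of the area, for example .025 when alpha is .05.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return mu_h0 - z_crit * std_error, mu_h0 + z_crit * std_error, z_crit


# Hypothetical usage: hypothesized mean 1000, standard error of the mean 10.
lower, upper, z_crit = two_tailed_limits(1000, 10, alpha=0.05)
print(f"critical Z = {z_crit:.2f}")
print(f"accept Ho if the sample mean lies between {lower:.1f} and {upper:.1f}")
```

A smaller alpha ( for example 0.01 ) gives a larger critical Z and therefore a wider acceptance region.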
Selecting a significance level
•There is no universal standard for selecting the significance level. In some instances 5%
is used and in some 1%. It is possible to test a hypothesis at any level of significance.
•But it is very important to remember that the significance level sets the boundary that
decides whether we reject or accept the null hypothesis. The higher the significance level,
the higher the probability of rejecting a null hypothesis when it is true.
Type I and type II
[Fig 1. Three sampling distributions with acceptance regions of .99, .90 and .50 of the area]
In hypothesis testing 2 major risks are involved :
1. Producer's risk
2. Consumer’s risk
•Producer's risk : This is rejecting a null hypothesis when it is true,
that is, rejecting a good production lot even though there is not
much evidence to prove that the production lot is defective. This leads to high rework.
This type of error, made by the producer, is called a type I error and is symbolized α ( alpha ).
•Consumer's risk : This is accepting a null hypothesis when it is false,
that is, accepting a bad production lot even though there is evidence to prove that the
production lot is defective. It means the producer takes a chance and releases the production
to market even though it is defective, calculating that warranty and repair at the
consumer end is relatively inexpensive compared with reworking the entire lot.
This type of error, which is a risk on the consumer, is called a type II error and is symbolized β ( beta ).
Representation of Producer and Consumer’s Risk
                 True state
Decision         Ho true                              Ha true
Accept Ho        Correct decision                     Type 2 error ( β ), Consumer risk
Accept Ha        Type 1 error ( α ), Producer risk    Correct decision
α error : the ratio of Ho being rejected even though it is true ( usually 5% ).
β error : the ratio of Ho being accepted even though it is false ( β is usually set up at 10% ).
•In the last curve of Fig 1 the acceptance region is quite small ( .50 of the area ), hence we will rarely accept a
null hypothesis when it is false, but we will often reject it when it is true. To deal with this trade-off in personal and professional
situations, the decision is made by estimating the loss, cost or penalties attached to both types of
error.
Example for Type I and Type II error
•Suppose that in a chemical company making a type I error involves the time and trouble of reworking
a batch of chemicals that should have been accepted. At the same time, making a type II error
means taking a chance that an entire group of users of this chemical will be poisoned.
Obviously the management will prefer a type I error to a type II error and, as a result, they will
set a very high level of significance in testing to get low βs.
•Suppose making a type I error involves disassembling an entire engine at the factory,
but making a type II error involves relatively inexpensive warranty repairs by the dealers. Then the
manufacturer is more likely to prefer a type II error and will set a lower significance level
in testing.
Two tailed and one tailed tests of hypotheses.
In a two tailed test there are two regions in which to reject the null hypothesis: we reject if the sample mean is
significantly higher than or lower than the hypothesized population mean.
A two tailed test is appropriate when the null hypothesis is µ = µHo and the alternate hypothesis is µ ≠ µHo.
[Figure: sampling distribution centred on µHo. Accept the null hypothesis in the central
region; reject the null hypothesis in the two tail regions.]
Ex: A manufacturer wants to produce bulbs with a mean life time of µ = µHo = 1000 hrs.
If the life of the bulb is less than 1000 hrs he will lose customers; if it is more than that, his
manufacturing cost will go up. So he does not want to deviate significantly from 1000 hrs in
either direction. Thus he should use a two tailed test, in which he will reject the null hypothesis if
the mean life of the bulbs in the sample is either too far above 1000 hrs or too far below 1000 hrs.
[Figure: two tailed test centred on µHo, with a rejection region in each tail]
•There is also the situation of a wholesaler who buys these bulbs. He will reject a lot
where the mean life time of the bulbs is less than 1000 hrs, but he will not reject it if the bulbs
measure more than 1000 hrs, as he need not pay more money for the extra hours. So
he will use a one tailed test with Ho : µ = µHo = 1000 hrs and Ha : µ < 1000 hrs.
This test is also called a left tailed test or lower tailed test.
[Figure: left tailed test centred on 1000 hrs, with the rejection region in the left tail]
There is also a test called the right tailed test or upper tailed test.
•Here the hypotheses are Ho : µ = µHo and Ha : µ > µHo, that is, only values of the sample
mean which are significantly above the hypothesized population mean will cause us to reject the
null hypothesis.
Ex. The sales manager has asked her sales persons to keep the traveling expenses to an average of
100$ per day. When we collect the sample, the null hypothesis is Ho : µ = µHo = 100$, but the manager will be
interested only in high expenses. Thus the alternate hypothesis is Ha : µ > 100$, so here a right
tailed test or upper tailed test is used.
[Figure: right tailed test centred on µHo, with the rejection region in the right tail]
One sample Z test - When historic mean and standard deviation are known
Use 1-Sample Z to perform a hypothesis test of the mean when σ is known.
Example : Measurements were made on nine metal pieces, and the distribution of measurements has
historically been close to normal with σ = 0.2. Because you know σ, and you wish to test if the
population mean is 5 and obtain a 95% confidence interval for the mean, you use the Z-procedure.
Go to Stat/Basic Statistics/1-sample Z test

Values
4.9
5.1
4.6
5
5.1
4.7
4.4
4.7
4.6
One sample Z test
1 In Samples in columns, enter Values. In Standard deviation, enter 0.2. In Test mean, enter 5.
2 Click on Options
3 Enter 95 in Confidence level
4 Click OK
5 Click on Graphs, then select Boxplot
6 Click OK each time
Results
One-Sample Z: Values
Test of mu = 5 vs not = 5
The assumed standard deviation = 0.2
Variable  N  Mean     StDev    SE Mean  95% CI              Z      P
Values    9  4.78889  0.24721  0.06667  (4.65822, 4.91955)  -3.17  0.002
[Figure: Boxplot of Values (with Ho and 95% Z-confidence interval for the Mean, and StDev = 0.2); x-axis Values, 4.4 to 5.1]
Interpreting the results
The test statistic, Z, for testing if the population mean equals 5 is -3.17.
The p-value is 0.002, which is less than .05, hence reject the null hypothesis.
The hypothesized value ( 5 ) falls outside the 95% confidence interval for the population mean
(4.65822, 4.91955), and so you can reject the null hypothesis.
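The Minitab figures for the one sample Z test can be reproduced by hand. The following is a minimal sketch using only the Python standard library; it is an illustration of the calculation, not a description of how Minitab computes it internally.

```python
from math import sqrt
from statistics import NormalDist, mean

values = [4.9, 5.1, 4.6, 5.0, 5.1, 4.7, 4.4, 4.7, 4.6]
sigma = 0.2        # historic ( known ) standard deviation
mu_0 = 5.0         # hypothesized mean, Ho : mu = 5
confidence = 0.95

n = len(values)
x_bar = mean(values)
se_mean = sigma / sqrt(n)                 # standard error of the mean
z = (x_bar - mu_0) / se_mean              # test statistic
p_value = 2 * NormalDist().cdf(-abs(z))   # two tailed p-value

z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
ci = (x_bar - z_crit * se_mean, x_bar + z_crit * se_mean)

print(f"Mean = {x_bar:.5f}, SE Mean = {se_mean:.5f}")
print(f"Z = {z:.2f}, P = {p_value:.3f}")
print(f"95% CI = ({ci[0]:.5f}, {ci[1]:.5f})")
print("Reject Ho" if p_value < 0.05 else "Accept Ho")
```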

137
One sample T test - When historic mean is known
Use 1-Sample t to compute a confidence interval and perform a hypothesis test of the mean when
the population standard deviation, σ, is unknown.
Ho : µ = µ0 versus Ha : µ ≠ µ0
where µ is the population mean and µ0 is the hypothesized population mean.
Go to Stat/Basic Statistics/1-sample T test
Values
4.9
5.1
4.6
5
5.1
4.7
4.4
4.7
4.6
1 In Samples in columns, enter Values. In Test mean, enter 5.
2 Click on Options
3 Enter 95 in Confidence level
4 Click OK
5 Click on Graphs, then select Boxplot
6 Click OK each time
One-Sample T: Values
Test of mu = 5 vs not = 5
Variable  N  Mean     StDev    SE Mean  95% CI              T      P
Values    9  4.78889  0.24721  0.08240  (4.59887, 4.97891)  -2.56  0.034
[Figure: Boxplot of Values (with Ho and 90% t-confidence interval for the mean); x-axis Values, 4.4 to 5.1]
Our criterion is based on the three indices in the output above ( P, T and 95% CI )
1) If P value > 0.05, accept Ho, and if P value < 0.05, accept Ha
2) If |Tcal| > Ttable, accept Ha, and if |Tcal| < Ttable, accept Ho
3) If the hypothesized value ( zero, when a difference is tested ) lies in between the 95% CI values, accept Ho, otherwise accept Ha
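The T value and confidence interval in the output can be recomputed as a check on criteria 2 and 3. In this sketch the table value 2.306 ( two tailed, 95% confidence, DF = 8 ) is an assumption taken from a standard t table, because the Python standard library has no t distribution; the p-value is therefore not computed.

```python
from math import sqrt
from statistics import mean, stdev

values = [4.9, 5.1, 4.6, 5.0, 5.1, 4.7, 4.4, 4.7, 4.6]
mu_0 = 5.0        # hypothesized mean, Ho : mu = 5
t_table = 2.306   # assumed: standard t table, two tailed 95%, DF = n - 1 = 8

n = len(values)
x_bar = mean(values)
s = stdev(values)                    # sample standard deviation
se_mean = s / sqrt(n)
t_cal = (x_bar - mu_0) / se_mean     # calculated T value

ci = (x_bar - t_table * se_mean, x_bar + t_table * se_mean)

# Criterion 2: compare |Tcal| with Ttable.
accept_ha_by_t = abs(t_cal) > t_table
# Criterion 3: check whether the hypothesized value lies inside the 95% CI.
accept_ha_by_ci = not (ci[0] <= mu_0 <= ci[1])

print(f"StDev = {s:.5f}, SE Mean = {se_mean:.5f}, T = {t_cal:.2f}")
print(f"95% CI = ({ci[0]:.5f}, {ci[1]:.5f})")
print("Accept Ha" if accept_ha_by_t and accept_ha_by_ci else "Accept Ho")
```

Both criteria point the same way here, which agrees with the P value of 0.034 in the Minitab output.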
140
How to use the T table
Sample size Confidence level
(Normally taken as 95 %)
In our example it is 30
141
Two sample T test – 2 sets of samples are available and we do not have any
historic data
A study was performed in order to evaluate the effectiveness of two devices for improving the
efficiency of gas home-heating systems. The energy consumption data ( BTU.In ) are stacked in one column, with a
grouping column ( Damper ) identifying the device. Now you want to
compare the effectiveness of these two devices by determining whether or not there is any evidence
that the difference between the devices is different from zero.
1. First step : Normality test to be done on the data.
BTU.In Damper
7.87 1
9.43 1
7.16 1
8.67 1
12.31 1
9.84 1
16.9 1
10.04 1
12.62 1
7.62 1
11.12 1
13.43 1
9.07 1
6.94 1
10.28 1
9.37 1
7.93 1
13.96 1
6.8 1
4 1
8.58 1
8 1
5.98 1
15.24 1
8.54 1
11.09 1
11.7 1
12.71 1
6.78 1
9.82 1
12.91 1
10.35 1
9.6 1
9.58 1
9.83 1
9.52 1
18.26 1
10.64 1
6.62 1
5.2 1
12.28 2
7.23 2
2.97 2
8.81 2
9.27 2
11.29 2
8.29 2
9.96 2
10.3 2
16.06 2
14.24 2
11.43 2
10.28 2
13.6 2
5.94 2
10.36 2
6.85 2
6.72 2
10.21 2
8.61 2
Results of normality test
[Figure: Probability Plot of Data, Normal; Percent versus Data]
Mean 10.04    StDev 2.868    N 90    AD 0.272    P-Value 0.663
Data is normal as P > .05
2. Next step is the HOV ( homogeneity of variance ) test,
where we check for a difference
in the variance of the 2 dampers.
Path : Stat/Basic statistics/2 variances
Results of equal variance test
Test for Equal Variances for Data
[Figure: 95% Bonferroni Confidence Intervals for StDevs and boxplots of Data, by Damper ( 1 and 2 )]
F-Test           Test Statistic 1.19    P-Value 0.558
Levene's Test    Test Statistic 0.00    P-Value 0.996
As the data was normal in the normality test, we have to choose the
F test results, where the P value is .558, which is greater than .05, hence the
variances are the same.
145
Next step is 2 sample test .
Check it
146
Two-Sample T-Test and CI: Data, Damper
[Figure: Boxplot of Data by Damper ( 1 and 2 )]
Two-sample T for Data
Damper  N   Mean   StDev  SE Mean
1       40  9.91   3.02   0.48
2       50  10.14  2.77   0.39
Difference = mu (1) - mu (2)
Estimate for difference: -0.235250
95% CI for difference: (-1.450131, 0.979631)
T-Test of difference = 0 (vs not =): T-Value = -0.38 P-Value = 0.701 DF = 88
P value is more than .05
0 is inside the 95% CI
Table value is 1.64, which is more than the absolute value of the calculated T value.
Hence accept Ho, that is, there is no evidence for a difference in energy use
when using an electric vent damper versus a thermally activated vent damper.
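The T value in this output can be recomputed from the summary statistics alone. The sketch below assumes the pooled ( equal variance ) form of the two sample t test, which is consistent with DF = 88 and with the equal variance result above. Because it starts from the rounded means and standard deviations in the table, the result agrees with Minitab only to about two decimals.

```python
from math import sqrt

# Summary statistics from the Minitab table above ( rounded values ).
n1, mean1, sd1 = 40, 9.91, 3.02
n2, mean2, sd2 = 50, 10.14, 2.77

# Pooled standard deviation, assuming equal variances in both groups.
df = n1 + n2 - 2
sd_pooled = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)

# Standard error of the difference between the two means.
se_diff = sd_pooled * sqrt(1 / n1 + 1 / n2)
t_cal = (mean1 - mean2) / se_diff

t_table = 1.64  # table value quoted in the text
print(f"Pooled StDev = {sd_pooled:.3f}, DF = {df}, T = {t_cal:.2f}")
print("Accept Ha" if abs(t_cal) > t_table else "Accept Ho")
```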
ANOVA : Analysis of variance
•This test is used to analyze the differences between the means of more than two samples.
Example: 1. Comparison of mileage of 5 different brands of bikes.
2. Comparison between results of 4 colleges for the same type of courses.
ANOVA expression
Ho ( null hypothesis ) : µ1 = µ2 = µ3
Ha ( alternate hypothesis ) : µ1, µ2 and µ3 are not all equal
Conditions before going to ANOVA
•Two or more samples
•Data to be of continuous type
•Normality test to be conducted
•Homogeneity of Variance test to be conducted beforehand to check
the σ² ( variance ), which has to be the same for all the sample sets.
Steps of ANOVA test
1. Conduct Normality test.
   Normal ( P > 0.05 ) : go to Bartlett’s test ( Homogeneity of Variance test ).
   Non normal ( P < 0.05 ) : go to Levene’s test ( Homogeneity of Variance test ).
2. Homogeneity of Variance test ( either test ).
   P < 0.05 : Variances are different.
   P > 0.05 : Variances are same, go for ANOVA test.
3. ANOVA test.
   P < 0.05 : Accept Ha.
   P > 0.05 : Accept Ho.

3 criteria in ANOVA to accept the null hypothesis
1. P value should be greater than .05
2. F calculated should be less than the F table value
3. Boxplot results should also be examined visually
151
Example : A vendor has 3 different plating lines in his factory and has produced the
same metal part with the same thickness on each, but he wants to validate whether the plating thickness of
each plating line is the same or different.
Plating line A   Plating line B   Plating line C
20.5             20.8             20.5
20.2             20.1             20.6
20.1             20.2             21.8
20.2             20.3             20.9
20.4             20.4             20.8
20.8             20.9             20.1
20.2             20.8             20.9
20.3             20.7             20.9
20.2             20.6             20.2
20.1             21.2             20.2
Enter all factors & responses in individual columns.
Eg. In this case plating line is the factor and thickness is the response.
1. Normality test
Path : Stat > Basic Statistics > Normality test
Results of Normality test
[Figure: Probability Plot of Thickness, Normal; Percent versus Thickness]
Mean 20.53    StDev 0.3975    N 30    AD 1.070    P-Value 0.007
P value is less than .05, hence the
data is non normal.
2. Next step is the HOV test,
where we check for a difference
in the variance of the 3 plating lines.
How To perform HoV Test in Minitab?
Minitab path
Stat > ANOVA > Test for Equal variances…
Select the columns
155
Results
(1) Sessions table
(2) Graphical analysis
Test for Equal Variances: Thickness versus Plating line
Plating line  N   Lower     StDev     Upper
A             10  0.137616  0.216025  0.45972
B             10  0.220677  0.346410  0.73719
C             10  0.318448  0.499889  1.06380
Bartlett's Test (normal distribution)
Test statistic = 5.56, p-value = 0.062
Levene's Test (any continuous distribution)
Test statistic = 2.38, p-value = 0.112
As the data was non normal in the normality test, we have to go for
Levene’s test and check the P value. The P value ( 0.112 ) is more than .05, hence we accept the null hypothesis that the variances are the same. Now go to ANOVA.
How To perform ANOVA in Minitab?
Minitab path
Stat / ANOVA / One way…..
157
One-way ANOVA: Thickness versus Plating line
Minitab gives us the following output :
(1) Sessions table
(2) Box plot
Source        DF  SS     MS     F     P
Plating line  2   0.834  0.417  3.00  0.066
Error         27  3.749  0.139
Total         29  4.583
S = 0.3726 R-Sq = 18.20% R-Sq(adj) = 12.14%
Individual 95% CIs For Mean Based on Pooled StDev
Level N Mean StDev --------+---------+---------+---------+-
A 10 20.300 0.216 (---------*---------)
B 10 20.600 0.346 (---------*---------)
C 10 20.690 0.500 (---------*--------)
--------+---------+---------+---------+-
20.25 20.50 20.75 21.00
Our criteria are again based on the three
indices in the output above
1) P Value : .066, which is > .05
2) Fcal vs Ftable : Fcal = 3.00, Ftable = 3.35 ( α = 0.05, DF 2 and 27 )
3) Visual difference in the box plot
In this example the samples are the same, that is, we accept the null hypothesis.
Degrees of freedom ( DF = N-1 )
Between plating lines : in our example there are 3 plating lines, hence DF = 3-1 = 2
Within plating lines ( Error ) : (10-1)+(10-1)+(10-1) = 27
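The sums of squares, degrees of freedom and F value in the ANOVA table can be rebuilt from the raw plating data. The sketch below is an illustration using only the Python standard library. The critical value 3.35 is an assumption taken from a standard F table ( α = 0.05, DF 2 and 27 ), because the standard library has no F distribution.

```python
from statistics import mean

lines = {
    "A": [20.5, 20.2, 20.1, 20.2, 20.4, 20.8, 20.2, 20.3, 20.2, 20.1],
    "B": [20.8, 20.1, 20.2, 20.3, 20.4, 20.9, 20.8, 20.7, 20.6, 21.2],
    "C": [20.5, 20.6, 21.8, 20.9, 20.8, 20.1, 20.9, 20.9, 20.2, 20.2],
}

all_values = [x for data in lines.values() for x in data]
grand_mean = mean(all_values)

# Between groups ( factor ) sum of squares and within groups ( error ) sum of squares.
ss_between = sum(len(d) * (mean(d) - grand_mean) ** 2 for d in lines.values())
ss_error = sum((x - mean(d)) ** 2 for d in lines.values() for x in d)

df_between = len(lines) - 1                          # 3 - 1 = 2
df_error = sum(len(d) - 1 for d in lines.values())   # 9 + 9 + 9 = 27

ms_between = ss_between / df_between
ms_error = ss_error / df_error
f_cal = ms_between / ms_error

f_table = 3.35  # assumed: standard F table, alpha = 0.05, DF 2 and 27
print(f"SS = {ss_between:.3f} / {ss_error:.3f}, DF = {df_between} / {df_error}")
print(f"F = {f_cal:.2f}")
print("Accept Ha" if f_cal > f_table else "Accept Ho")
```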
159
In discrete type data we have 3 tests in analysis
•1 Proportion test
•2 Proportion test
•Chi-Square test
160
1 proportion test : When historic proportion is known
Performs a test of one binomial proportion.
Ho: p = p0 versus Ha: p ≠ p0 , where p is the population proportion and p0 is the
hypothesized value.
•Let us see one example.
Bk500EI yari was running in cell2 for the past 1 year with DPU of 8%. Later the same model
was moved to cell1, which produced 400 units, out of which 210 units passed. Test for a
statistical difference between the performance of these cells.
161
How To perform 1 Proportion test in Minitab?
Minitab path
Stat > Basic Statistics > One proportion….
Your historic (or) DPU %
Enter sample size here
Enter no. of samples failed
162
Results
Test and CI for One Proportion
Test of p = 0.8 vs p not = 0.8
Sample  X    N    Sample p  95% CI                Exact P-Value
1       190  400  0.475000  (0.425155, 0.525217)  0.000
Decision based on the criteria in the output above :
P value is less than .05
The hypothesized value ( 0.8 ) is outside the 95% CI
Hence reject the null hypothesis.
163
2 proportion test
Performs a test of two binomial proportions.
Use the 2 Proportions command to perform a hypothesis test of the
difference between two proportions.
Ho: p1 - p2 = p0 versus Ha: p1 - p2 ≠ p0
Ex : A corporation's purchasing manager needs to authorize the purchase of twenty new
photocopy machines. After comparing many brands in terms of price, copy quality, warranty,
and features, he has narrowed the choice to two, Brand X and Brand Y, and decided that the
determining factor will be the reliability of the brands as defined by the proportion requiring service
within one year of purchase. Because the corporation already uses both of these brands,
he was able to obtain information on the service history of 50 randomly selected machines of each
brand. Records indicate that six Brand X machines and eight Brand Y machines needed service.
He uses this information to guide the choice of brand for purchase.

164
1 Choose Stat > Basic Statistics > 2 Proportions.
2 Choose Summarized data.
3 In First sample, under Trials, enter 50. Under Events, enter 44 ( machines that did not need service ).
4 In Second sample, under Trials, enter 50. Under Events, enter 42.
5 Click OK.
165
Results
Test and CI for Two Proportions
Sample  X   N   Sample p
1       44  50  0.880000
2       42  50  0.840000
Difference = p (1) - p (2)
Estimate for difference: 0.04
95% CI for difference: (-0.0957903, 0.175790)
Test for difference = 0 (vs not = 0): Z = 0.58 P-Value = 0.564
P value is greater than .05
0 is in-between the 95% CI
Hence accept the null hypothesis.
That is, the proportion of photocopy machines that needed service in the first
year did not differ depending on brand. As the purchasing manager, you need
to find a different criterion to guide your decision on which brand to purchase.
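The Z value, P value and confidence interval for the two proportions can be recomputed as follows. This sketch assumes the unpooled standard error ( each sample proportion with its own variance ), which reproduces the figures in the output above.

```python
from math import sqrt
from statistics import NormalDist

# Events ( machines that did not need service ) and trials for each brand.
x1, n1 = 44, 50
x2, n2 = 42, 50

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2

# Unpooled standard error of the difference between two proportions.
se_diff = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

z = diff / se_diff                        # test of difference = 0
p_value = 2 * NormalDist().cdf(-abs(z))   # two tailed p-value

z_crit = NormalDist().inv_cdf(0.975)      # 95% confidence
ci = (diff - z_crit * se_diff, diff + z_crit * se_diff)

print(f"Estimate for difference = {diff:.2f}")
print(f"Z = {z:.2f}, P-Value = {p_value:.3f}")
print(f"95% CI for difference = ({ci[0]:.6f}, {ci[1]:.6f})")
print("Accept Ho" if p_value > 0.05 else "Reject Ho")
```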
Chi – square
Chi square is used when we have more than 2 samples
with us.
Steps involved in calculating the chi-square :
167
Hypothesis Testing(Discrete Data)
Chi-Square Example 1 : Product defect
During 3 months, the types of refrigerator defects were classified according to production shift,
and we examine whether the defect type depends on the shift ( dependent ) or not ( independent ).
If a defect type is characteristic of a shift, the improvement activity can be developed by
further investigating the types of defects on each shift.
A total of n=309 refrigerator defects were recorded, and the defects were classified into one of the 4
categories (A, B, C, and D) listed below. At the same time, each refrigerator was identified according to
the production shift on which it was manufactured. Our objective is to test the null hypothesis (Ho), that
the type of defect is independent of shift, against the alternative hypothesis (Ha), that the defects are
dependent on the shift.
Types of Defect
A : Dents
B : Sealed system leaks
C : Switch failure
D : Missing parts
         Types of Defect
Shift    A     B     C     D
1        15    21    45    13
2        26    31    34    5
3        33    17    49    20
Ho : Defects are independent of the shift.
Ha : Defects are dependent upon the shift.
Stat/tables/chi square table(table in worksheet)
169
Hypothesis Testing(Discrete Data)
Chi-Square [Ex] 1 : Product defect
Expected value = ( Row observation Total ) * ( Column observation Total ) / Grand observation Total
Chi-Square = Sum of (O-E)^2 / E
Session Confirmation from Minitab
Chi-Square Test
Expected counts are printed below observed counts
         A       B       C       D       Total
1        15      21      45      13      94
         22.51   20.99   38.94   11.56
2        26      31      34      5       96
         22.99   21.44   39.77   11.81
3        33      17      49      20      119
         28.50   26.57   49.29   14.63
Total    74      69      128     38      309
Expected value of defect type “A” on Shift 1 : E = (94 X 74)/309 = 22.51
Chi-Square contribution of that cell = (15-22.51)^2 / 22.51 = 2.506
Chi-Sq = 2.506 + 0.000 + 0.944 + 0.179 +
0.394 + 4.266 + 0.836 + 3.923 +
0.711 + 3.449 + 0.002 + 1.967 = 19.178
DF = (r-1)(c-1) = 6, P-Value = 0.004
Higher values may show dependence.
Since P Value < 0.05 ; Reject Ho , Accept Ha
The individual χ² values will answer the question:
Where does the dependent relationship exist between the defect type
and the shift?
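The expected counts and the chi-square total can be recomputed from the observed table. The sketch below is an illustration; the critical value 12.592 is read from the 0.050 column of the chi-square distribution table for df = 6, and is used in place of a p-value because the Python standard library has no chi-square distribution.

```python
# Observed defect counts: rows are shifts 1 to 3, columns are defect types A to D.
observed = [
    [15, 21, 45, 13],
    [26, 31, 34, 5],
    [33, 17, 49, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_sq = 0.0
contributions = []
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        # Expected value = row total * column total / grand total.
        e = row_totals[i] * col_totals[j] / grand_total
        cell = (o - e) ** 2 / e
        contributions.append(((i + 1, "ABCD"[j]), cell))
        chi_sq += cell

df = (len(observed) - 1) * (len(observed[0]) - 1)   # (r-1)(c-1)
chi_table = 12.592   # chi-square table, df = 6, 0.050 column

print(f"Chi-Sq = {chi_sq:.3f}, DF = {df}")
print("Reject Ho, accept Ha" if chi_sq > chi_table else "Accept Ho")

# The largest individual contributions show where the dependence lies.
for (shift, defect), cell in sorted(contributions, key=lambda c: -c[1])[:3]:
    print(f"shift {shift}, defect {defect}: {cell:.3f}")
```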
Chi Square ( χ² ) Distribution
Hypothesis Testing(Discrete Data)
df : Degrees of Freedom, (R-1)(C-1) = df
df 0.250 0.100 0.050 0.025 0.010 0.005 0.001
1 1.323 2.706 3.841 5.024 6.635 7.879 10.828
2 2.773 4.605 5.991 7.378 9.210 10.579 13.816
3 4.108 6.251 7.815 9.348 11.345 12.838 16.266
4 5.385 7.779 9.488 11.143 13.277 14.860 18.467
5 6.626 9.236 11.070 12.832 15.086 16.650 20.515
6 7.841 10.645 12.592 14.449 16.812 18.548 22.458
7 9.037 12.017 14.067 16.013 18.475 20.278 24.322
8 10.219 13.362 15.507 17.535 20.090 21.955 26.125
9 11.389 14.684 16.919 19.023 21.666 23.589 27.877
10 12.549 15.987 18.307 20.483 23.209 25.188 29.588
11 13.701 17.275 19.675 21.920 24.725 26.757 31.264
12 14.845 18.549 21.026 23.337 26.217 28.300 32.909
13 15.984 19.812 22.362 24.736 27.688 29.819 34.528
14 17.117 21.064 23.685 26.119 29.141 31.319 36.123
15 18.245 22.307 24.996 27.488 30.578 32.801 37.697
16 19.369 23.541 26.296 28.845 32.000 34.267 39.252
17 20.489 24.769 27.587 30.191 33.409 35.718 40.790
18 21.605 25.989 28.869 31.526 34.805 37.156 42.312
19 22.718 27.204 30.144 32.852 36.191 38.582 43.820
20 23.828 28.412 31.410 34.170 37.566 39.997 45.315
21 24.935 29.615 32.671 35.479 38.932 41.401 46.797
22 26.036 30.813 33.924 36.781 40.289 42.796 48.268
23 27.141 32.007 35.172 38.076 41.638 44.181 49.728
24 28.241 33.196 36.415 39.364 42.980 45.558 51.179
25 29.339 34.382 37.652 40.646 44.314 46.928 52.620
26 30.434 35.563 38.885 41.923 45.642 48.290 54.052
27 31.518 36.741 40.113 43.194 46.963 49.645 55.476
28 32.620 37.916 41.337 44.461 48.278 50.993 56.892
29 33.711 39.087 42.557 45.722 49.588 52.336 58.302
30 34.800 40.256 43.773 46.979 50.892 53.672 59.703
40 45.616 51.806 55.758 59.342 63.691 66.766 73.402
50 56.334 63.167 67.505 71.420 76.154 79.490 86.661
60 66.981 74.397 79.082 83.298 88.379 91.953 99.607
70 77.577 85.527 90.531 95.023 100.425 104.215 112.317
80 88.130 96.578 101.879 106.629 112.329 116.321 124.839
90 98.650 107.565 113.145 118.136 124.116 128.299 137.208
100 109.141 118.498 124.342 129.561 135.807 140.169 149.449
Degrees of freedom
•Degrees of freedom is the number of values we can choose freely.
Ex: Assume we are dealing with 2 sample values a and b, and they have a mean of 18.
( a + b ) / 2 = 18
•In this case a and b can have any values whose sum is 36,
because 36/2 = 18.
•Suppose we know that a has the value 10. Now b no longer has the
freedom to take any value, but must have the value 26, because
if a is 10 then ( 10 + b ) / 2 = 18,
so 10 + b = 36, therefore b = 26.
•Hence if we have 2 elements in a sample and we know the mean, we are free to
specify only one element, as the other value is fixed in order to give the specified mean.
See one more example:
(a + b + c + d + e + f + g) / 7 = 16
In this case the degrees of freedom, or the number of variables we can freely specify, is 7 - 1 = 6.
That means we are free to give values to 6 variables and are no longer free to specify the seventh one, as it is determined automatically.
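The two examples above can be checked with a short Python sketch. The function name last_value and the six freely chosen numbers are hypothetical, picked only for illustration.

```python
def last_value(known_values, mean, n):
    """Return the one value that is forced once n - 1 values and the mean are fixed."""
    if len(known_values) != n - 1:
        raise ValueError("exactly n - 1 values must be chosen freely")
    # All n values must add up to n * mean, so the last value is whatever is left.
    return n * mean - sum(known_values)


# Two values with a mean of 18: choosing a = 10 forces b = 26.
b = last_value([10], mean=18, n=2)

# Seven values with a mean of 16: six free choices (hypothetical numbers)
# fix the seventh value.
free_choices = [12, 20, 15, 18, 9, 21]
g = last_value(free_choices, mean=16, n=7)
degrees_of_freedom = len(free_choices)  # 7 - 1 = 6

print(b, g, degrees_of_freedom)
```

Whatever six numbers are chosen, the seventh is always determined by the required mean, which is why only 6 degrees of freedom remain.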
173
Estimation
Estimation: Everyone makes estimates. When we are about to cross the road, we estimate based on the speed of the approaching car, the distance between us and the car, and our own walking speed. After weighing all of these we take a quick decision to cross the road.
•There are 2 types of estimation:
•1 Point estimation
•2 Interval estimation
•Point estimation: It is a single number used to estimate an unknown population parameter.
•Example: The manager states that by the end of Q2 2005 we should have 15 green belts in our department.
•A point estimate is often insufficient because it is either right or wrong: we have to stick to one value, and it gives no indication of how far off that value may be.
•Interval estimation: It is a range of values used to estimate a population parameter.
It indicates the error in two ways: by the extent of its range, and by the probability of the true value falling within that estimated range.
Example: The manager says that by the end of 2005 we will have 20 to 25 green belts in our department. He might have calculated this from the training given each quarter to a minimum of 50 employees and from the pass ratio at white belt level.
174
Estimator and Estimate
Any sample statistic that is used to estimate a population parameter is called an estimator.
The sample mean x̄ can be an estimator of the population mean µ, and the sample range can be used as an estimator of the population range.
Any specific observed value of a statistic is called an estimate.
Example: Suppose we want to make an estimate for a furniture company. The population is the employees of the furniture company, the population parameter is their mean turnover per year, our estimator is the mean turnover for a period of 1 month, and our estimate is 8.9% turnover per year.
175

Criteria of a good estimator:
1. Unbiasedness: An estimator is unbiased if it does not systematically overestimate or underestimate the population parameter, i.e. on average over repeated samples it equals the parameter.
2. Efficiency: Efficiency in estimation is measured using the standard error. The smaller the standard error of an estimator, the greater the chance that the estimate is close to the population parameter.
3. Consistency: A statistic is consistent if, as the sample size increases, the estimated value comes closer to the value of the population parameter. That means the estimate will be more accurate if the sample size is larger.
4. Sufficiency: An estimator is sufficient when the estimate is made using all the available information in the sample.
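The efficiency and consistency criteria can be illustrated with the standard error of the sample mean, s / sqrt(n). The sketch below is only an illustration: the sample standard deviation of 10 and the sample sizes are hypothetical values.

```python
import math


def standard_error(sample_std_dev, sample_size):
    """Standard error of the sample mean: s / sqrt(n)."""
    return sample_std_dev / math.sqrt(sample_size)


# Hypothetical sample standard deviation, held fixed while n grows.
s = 10.0
errors = {n: standard_error(s, n) for n in (25, 100, 400)}

for n, se in errors.items():
    print(f"n = {n:3d}  standard error = {se:.2f}")
```

Each time the sample size is multiplied by four the standard error is halved, which is the sense in which a larger sample gives an estimate closer to the population parameter.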
176
Confidence Interval: It is the range of the estimate that we are making. It is a major part of estimation, and the probability attached to it (the confidence level) indicates how confident we are that the interval estimate will include the population parameter. A higher probability means higher confidence.
•In estimation we commonly use 90%, 95% and 99% confidence, but we are free to apply any confidence level.
Example: When we are preparing an income report for some community at the 90% confidence level and our statement is that the mean income of the population lies between $8000 and $24000, then this range $8000-$24000 is our confidence interval.
•But normally we express the confidence interval in standard errors rather than in numerical values.
•That is, for the 90% confidence level: x̄ ± 1.64 σx̄, where σx̄ is the standard error of the mean
•where x̄ + 1.64 σx̄ = upper limit of the confidence interval
•and x̄ - 1.64 σx̄ = lower limit of the confidence interval
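As a minimal sketch of the limits above, the following Python code computes a 90% confidence interval from summary statistics. The sample mean, standard deviation and sample size are hypothetical, and the normal multiplier is assumed to be appropriate (for small samples a t multiplier is normally used instead).

```python
import math
from statistics import NormalDist


def confidence_interval(x_bar, s, n, level=0.90):
    """Return (lower, upper, z) for the mean, using x_bar +/- z * s / sqrt(n)."""
    std_error = s / math.sqrt(n)
    # Two-sided interval: put (1 - level) / 2 in each tail of the normal curve.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return x_bar - z * std_error, x_bar + z * std_error, z


# Hypothetical income survey: mean $16000, standard deviation $3000, 36 households.
lower, upper, z = confidence_interval(x_bar=16000, s=3000, n=36, level=0.90)
print(f"z = {z:.3f}, interval = ${lower:.2f} to ${upper:.2f}")
```

The multiplier comes out close to the 1.64 used in the text, and the interval is symmetric about the sample mean.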
177
Relationship between confidence level and confidence interval: We might think that we should use a high confidence level (99%) in our estimation, as high confidence means high accuracy. But in practice high confidence levels produce large confidence intervals, which are not precise.
This example gives an idea of confidence interval and confidence level.
•As the customer sets tighter targets, the confidence level goes down.
Customer question | Store manager reaction | Implied confidence level | Confidence interval
Will I get the washing m/c within 1 year? | I am absolutely sure about that. | 99% | 1 year
Will I get the washing m/c within 1 month? | Yes, I am almost sure that it can be delivered within 1 month. | 95% | 1 month
Will I get the washing m/c within 1 week? | I am pretty sure. | 80% | 1 week
Will I get the washing m/c by tomorrow? | Tomorrow?? I am not that sure about it, but we will try our best. | 40% | 1 day
Will my washing m/c get home before I could reach? | There is very little chance. | 1% | 1 Hr
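The same trade-off can be seen numerically. This is a sketch under the assumption of a normal sampling distribution and a hypothetical standard error of 1 unit; it only shows how the interval widens as the confidence level rises.

```python
from statistics import NormalDist


def z_multiplier(level):
    """Two-sided normal multiplier for a given confidence level (e.g. 0.95)."""
    return NormalDist().inv_cdf(0.5 + level / 2)


std_error = 1.0  # hypothetical standard error of the mean
widths = {}
for level in (0.90, 0.95, 0.99):
    # Full width of the interval is twice the half-width z * standard error.
    widths[level] = 2 * z_multiplier(level) * std_error
    print(f"{level:.0%} confidence -> interval width {widths[level]:.2f} standard errors")
```

Moving from 90% to 99% confidence makes the interval noticeably wider, so the statement becomes safer but less precise.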
178
Regression
• Regression is used to express the relationship between the factor X and the response Y as a mathematical equation
179
Example: To check which equation is suitable for the relationship between Age & Height
Age (In years)  Height (In Inches)
21 5
26 5.1
32 5.4
41 5.3
26 5.3
28 5.6
24 5.2
19 4.8
31 5.4
27 5.3
45 5.6
15 4.5
26 4.9
10 4
28 5.5
30 5.6
20 5
24 4.9
39 6.1
23 5.05
180
How to perform Regression in Minitab?
Minitab path
Stat > Regression > Regression….
Select the columns
181
Minitab will give you the following output:
Regression Analysis: Height (In Inches) versus Age (In years)
The regression equation is
Height (In Inches) = 3.99 + 0.0445 Age (In years)

Predictor        Coef      SE Coef   T      P
Constant         3.9864    0.1992    20.01  0.000
Age (In years)   0.044527  0.007123  6.25   0.000

S = 0.260133   R-Sq = 68.5%   R-Sq(adj) = 66.7%

Analysis of Variance
Source          DF  SS      MS      F      P
Regression      1   2.6443  2.6443  39.08  0.000
Residual Error  18  1.2180  0.0677
Total           19  3.8624

Criteria: The regression equation holds good if R-Sq & R-Sq(adj) are more than 64%.
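For readers without Minitab, the linear fit can be cross-checked with a few lines of plain Python. This is a sketch of ordinary least squares on the 20 Age/Height pairs listed earlier, not a reproduction of Minitab's internal method; the function name fit_line is hypothetical.

```python
ages = [21, 26, 32, 41, 26, 28, 24, 19, 31, 27,
        45, 15, 26, 10, 28, 30, 20, 24, 39, 23]
heights = [5, 5.1, 5.4, 5.3, 5.3, 5.6, 5.2, 4.8, 5.4, 5.3,
           5.6, 4.5, 4.9, 4, 5.5, 5.6, 5, 4.9, 6.1, 5.05]


def fit_line(x, y):
    """Least-squares line y = intercept + slope * x, with R-Sq and R-Sq(adj)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    ss_error = ss_total - slope * s_xy
    r_sq = 1 - ss_error / ss_total
    # Adjusted R-Sq compares mean squares: error has n - 2 DF, total has n - 1.
    r_sq_adj = 1 - (ss_error / (n - 2)) / (ss_total / (n - 1))
    return slope, intercept, r_sq, r_sq_adj


slope, intercept, r_sq, r_sq_adj = fit_line(ages, heights)
print(f"Height = {intercept:.3f} + {slope:.5f} Age")
print(f"R-Sq = {r_sq:.1%}  R-Sq(adj) = {r_sq_adj:.1%}")
```

The coefficients and both R-Sq values agree with the Minitab output above to the precision shown there.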
182
Linear regression equation
Select the columns
Select Linear
Then press OK
183
Regression Analysis: Height (In Inches) versus Age (In years)
The regression equation is
Height (In Inches) = 3.986 + 0.04453 Age (In years)

S = 0.260133   R-Sq = 68.5%   R-Sq(adj) = 66.7%

Analysis of Variance
Source      DF  SS       MS       F      P
Regression  1   2.64433  2.64433  39.08  0.000
Error       18  1.21804  0.06767
Total       19  3.86238

Criteria: For the equation to be OK, R-Sq and R-Sq(adj) should be more than 64%.

[Fitted Line Plot: Height (In Inches) versus Age (In years), showing the fitted line Height (In Inches) = 3.986 + 0.04453 Age (In years); S = 0.260133, R-Sq = 68.5%, R-Sq(adj) = 66.7%]
184
Quadratic regression equation
Select the columns
Select Quadratic
Then press OK
185
Polynomial Regression Analysis: Height (In Inches) versus Age (In years)
Height (In Inches) = 2.757 + 0.1402 Age (In years) - 0.001701 Age (In years)**2

S = 0.201868   R-Sq = 82.1%   R-Sq(adj) = 80.0%

Analysis of Variance
Source      DF  SS       MS       F      P
Regression  2   3.16961  1.58481  38.89  0.000
Error       17  0.69276  0.04075
Total       19  3.86238

Sequential Analysis of Variance
Source     DF  SS       F      P
Linear     1   2.64433  39.08  0.000
Quadratic  1   0.52528  12.89  0.002

Criteria: For the equation to be OK, R-Sq and R-Sq(adj) should be more than 64%.

[Fitted Line Plot: Height (In Inches) versus Age (In years), showing the fitted curve Height (In Inches) = 2.757 + 0.1402 Age (In years) - 0.001701 Age (In years)**2; S = 0.201868, R-Sq = 82.1%, R-Sq(adj) = 80.0%]
186
Cubic regression equation
Select the columns
Select Cubic
Then press OK
187
Polynomial Regression Analysis: Height (In Inches) versus Age (In years)
Height (In Inches) = 3.230 + 0.0757 Age (In years) + 0.000910 Age (In years)**2 - 0.000032 Age (In years)**3

S = 0.205991   R-Sq = 82.4%   R-Sq(adj) = 79.1%

Analysis of Variance
Source      DF  SS       MS       F      P
Regression  3   3.18346  1.06115  25.01  0.000
Error       16  0.67892  0.04243
Total       19  3.86238

Sequential Analysis of Variance
Source     DF  SS       F      P
Linear     1   2.64433  39.08  0.000
Quadratic  1   0.52528  12.89  0.002
Cubic      1   0.01384  0.33   0.576

Criteria: For the equation to be OK, R-Sq and R-Sq(adj) should be more than 64%.

[Fitted Line Plot: Height (In Inches) versus Age (In years), showing the fitted curve Height (In Inches) = 3.230 + 0.0757 Age (In years) + 0.000910 Age (In years)**2 - 0.000032 Age (In years)**3; S = 0.205991, R-Sq = 82.4%, R-Sq(adj) = 79.1%]
188
Note: After finding all 3 equations (linear, quadratic and cubic), the equation that gives the highest values of R-Sq and R-Sq(adj) (more than 64%) is the one to be used in our experiment.
If R-Sq and R-Sq(adj) are less than 64%, that equation is not valid.
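The comparison in the note can be sketched in Python from the Error and Total rows of the three Analysis of Variance tables. One assumption is made loudly here: when R-Sq and R-Sq(adj) disagree, the sketch ranks the equations by R-Sq(adj), because R-Sq can never decrease when a term is added. The 64% threshold is the one stated in this manual.

```python
N_OBSERVATIONS = 20
SS_TOTAL = 3.86238
THRESHOLD = 0.64  # acceptance criterion stated in this manual

# (number of predictor terms, error sum of squares) from the ANOVA tables.
MODELS = {
    "linear": (1, 1.21804),
    "quadratic": (2, 0.69276),
    "cubic": (3, 0.67892),
}


def r_squared(ss_error, ss_total):
    return 1 - ss_error / ss_total


def adjusted_r_squared(ss_error, ss_total, n, terms):
    # Ratio of mean squares: error DF = n - terms - 1, total DF = n - 1.
    return 1 - (ss_error / (n - terms - 1)) / (ss_total / (n - 1))


summary = {}
for name, (terms, ss_error) in MODELS.items():
    summary[name] = (
        r_squared(ss_error, SS_TOTAL),
        adjusted_r_squared(ss_error, SS_TOTAL, N_OBSERVATIONS, terms),
    )
    print(f"{name:9s} R-Sq = {summary[name][0]:.1%}  R-Sq(adj) = {summary[name][1]:.1%}")

# Assumption: keep only equations above the threshold, then rank by R-Sq(adj).
valid = {k: v for k, v in summary.items() if min(v) > THRESHOLD}
best = max(valid, key=lambda name: valid[name][1])
print("Preferred equation:", best)
```

With these numbers the cubic equation has the highest R-Sq but the quadratic has the highest R-Sq(adj), which is consistent with the cubic term's P value of 0.576 in the sequential analysis of variance.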
189
Design of Experiments
• Factorial DOE
•Response Optimizer
190
DOE Steps
1) Decide the number of factors to study & their levels
2) Create the Factorial Design table
3) Do the experiments & update the values in the table
4) Conduct the Design of Experiments analysis & find out the optimum level of each factor
Step 1
Let us assume that in our example the following are the response, the factors & their levels:
Response – Fan strength
Factors
Injection speed (2 Levels)
Injection pressure (2 Levels) and
Hold on time (2 Levels)
191
Step 2 - How to create the Factorial Design table?
Minitab path
Stat > DOE > Factorial > Create Factorial Design…
192
How to create the Factorial Design table?
3
Click this
Select the number of factors
Select ½ factorial if the number of runs is too large; otherwise select full factorial
Number of replicates you want to do for each experimental run
193
5
Now click this
7
6
Unfilled one..
Fill in the details, such as the factors & their levels, as shown above
194
8
Now click this
Deselect this if you don't want to randomise the runs
195
Minitab gives you the Factorial design table in the worksheet
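The table Minitab produces for three 2-level factors can be approximated with the sketch below. The function name full_factorial is hypothetical, the factor levels are the ones that appear on the plot axes later in this section, and the run order is not claimed to match Minitab's standard order.

```python
import random
from itertools import product

# Levels as they appear on the plot axes later in this section.
FACTORS = {
    "Inj.speed": (8, 12),
    "Inj.pressure": (5, 30),
    "Hold on time": (1.5, 3.0),
}


def full_factorial(factors, replicates=1, randomise=False, seed=None):
    """Return one dict of factor settings per experimental run."""
    names = list(factors)
    runs = [
        dict(zip(names, levels))
        for _ in range(replicates)
        for levels in product(*factors.values())
    ]
    if randomise:
        # Shuffle the run order, as Minitab does unless randomisation is deselected.
        random.Random(seed).shuffle(runs)
    return runs


design = full_factorial(FACTORS)
for run_number, run in enumerate(design, start=1):
    print(run_number, run)
```

Three factors at 2 levels give 2 x 2 x 2 = 8 runs per replicate; a ½ factorial would use 4 of them.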
10
196
Do experiments with these set values & update the table
11
197
12
Minitab path
Stat > DOE > Factorial > Analyse Factorial Design…
13
Select the column
198
14
Minitab path
Stat > DOE > Factorial > Factorial plots….
15
199
16
Click these three options
Then click these three options
17 18
Select the column
Click here
200
Follow the same procedure for all 3 “Setup” pop-ups shown above.
Minitab gives the following outputs:
1) Main effects plot
2) Interaction plot
3) Cube plot
[Main Effects Plot (data means) for Fan strength. Panels: Inj.speed (8, 12), Inj.pressure (5, 30), Hold on time (1.5, 3.0). Y-axis: Mean of Fan strength]
[Interaction Plot (data means) for Fan strength. Factors: Inj.speed (8, 12), Inj.pressure (5, 30), Hold on time (1.5, 3.0)]
[Cube Plot (data means) for Fan strength. Axes: Inj.speed (8, 12), Inj.pressure (5, 30), Hold on time (1.5, 3). Corner values: 6.66, 6.58, 6.54, 6.70, 6.06, 6.10, 6.06, 6.10]
201
How to infer the Main effects plot ?
[Main Effects Plot (data means) for Fan strength. Panels: Inj.speed (8, 12), Inj.pressure (5, 30), Hold on time (1.5, 3.0). Y-axis: Mean of Fan strength]
Criteria: The factor with the steepest slope has the greatest effect on the response.
In our example, Injection pressure has the greatest effect on Fan strength, Injection speed has a very minor effect & Hold on time has no effect at all.
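The slopes in the main effects plot can be put into numbers. The sketch below uses the eight corner means from the cube plot; how each value maps to a corner is an assumption read off the plot layout (front face taken as Hold on time = 1.5), so treat the mapping as illustrative.

```python
# Corner means keyed by (Inj.speed, Inj.pressure, Hold on time).
# Assumption: the corner assignment is inferred from the cube plot layout.
CORNER_MEANS = {
    (8, 5, 1.5): 6.06, (12, 5, 1.5): 6.10,
    (8, 30, 1.5): 6.54, (12, 30, 1.5): 6.70,
    (8, 5, 3.0): 6.06, (12, 5, 3.0): 6.10,
    (8, 30, 3.0): 6.66, (12, 30, 3.0): 6.58,
}
FACTOR_NAMES = ("Inj.speed", "Inj.pressure", "Hold on time")


def main_effect(corner_means, factor_index):
    """Mean response at the high level minus mean response at the low level."""
    low, high = sorted({settings[factor_index] for settings in corner_means})

    def mean_at(level):
        values = [y for settings, y in corner_means.items()
                  if settings[factor_index] == level]
        return sum(values) / len(values)

    return mean_at(high) - mean_at(low)


effects = {name: main_effect(CORNER_MEANS, i) for i, name in enumerate(FACTOR_NAMES)}
for name, effect in sorted(effects.items(), key=lambda item: abs(item[1]), reverse=True):
    print(f"{name}: {effect:+.2f}")
```

Under this mapping the effects come out as about +0.54 for Injection pressure, +0.04 for Injection speed and 0.00 for Hold on time, in line with the reading of the plot above.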
202
How to infer the Interaction plot ?
[Interaction Plot (data means) for Fan strength. Factors: Inj.speed (8, 12), Inj.pressure (5, 30), Hold on time (1.5, 3.0)]
Criteria: Factors whose lines cross each other in the plot have a strong interaction.
In our example, Injection speed & Hold on time have a strong interaction with each other, which can affect Fan strength; Injection speed & Injection pressure have a moderate interaction; and Injection pressure & Hold on time have no interaction.
203
How to infer the Cube plot ?
[Cube Plot (data means) for Fan strength. Axes: Inj.speed (8, 12), Inj.pressure (5, 30), Hold on time (1.5, 3). Corner values: 6.66, 6.58, 6.54, 6.70, 6.06, 6.10, 6.06, 6.10]
The eight corner points represent the response that will be achieved at these settings.
Criteria: Select the corner value that best meets your target and find out the levels at which the factors are set for that point. Those are the optimum settings.
In our example, let us assume we want the maximum fan strength. Then the best option is 6.7, so the optimum parameters for achieving this strength are:
Optimum settings for Max. Fan Strength (6.7)
Injection speed - 12
Injection pressure - 30
Hold on time - 1.5
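Picking the best corner can also be done programmatically. This sketch repeats the cube plot values with the same assumed corner mapping (front face taken as Hold on time = 1.5); the function best_corner and its goal argument are hypothetical names.

```python
# Corner means keyed by (Inj.speed, Inj.pressure, Hold on time).
# Assumption: the corner assignment is inferred from the cube plot layout.
CORNER_MEANS = {
    (8, 5, 1.5): 6.06, (12, 5, 1.5): 6.10,
    (8, 30, 1.5): 6.54, (12, 30, 1.5): 6.70,
    (8, 5, 3.0): 6.06, (12, 5, 3.0): 6.10,
    (8, 30, 3.0): 6.66, (12, 30, 3.0): 6.58,
}


def best_corner(corner_means, goal="max", target=None):
    """Return (settings, response) for the corner that best meets the goal."""
    if goal == "max":
        return max(corner_means.items(), key=lambda item: item[1])
    if goal == "min":
        return min(corner_means.items(), key=lambda item: item[1])
    if goal == "target" and target is not None:
        # Closest corner to a desired response value.
        return min(corner_means.items(), key=lambda item: abs(item[1] - target))
    raise ValueError("goal must be 'max', 'min' or 'target' with a target value")


settings, strength = best_corner(CORNER_MEANS, goal="max")
speed, pressure, hold_time = settings
print(f"Fan strength {strength}: speed {speed}, pressure {pressure}, hold on time {hold_time}")
```

For the maximum fan strength this returns speed 12, pressure 30 and hold on time 1.5, the settings listed above.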
204
DOE
Response Optimizer
205
Let us assume:
1) You are not getting the optimum response at either level of a factor, but you want to study the response with the factor set in between the extreme levels. HOW DO YOU PROCEED?
2) You want to study the effects on two responses at the various settings of the factors. For example, you want to increase the fan strength and reduce the manufacturing cost at the same time. HOW DO YOU STUDY THE OPTIMUM SETTINGS FOR ACHIEVING THESE TWO RESPONSES?
RESPONSE OPTIMISER
206
Response optimiser Minitab path
1
Stat > DOE > Factorial > Response optimiser….
2 3
Click this
207
4
Enter the lower level values of each factor
208
6
209
8
Select ‘Minimise’ if you want to reduce the response
Select ‘Maximise’ if you want to improve the response
Fill in the other fields as appropriate
210
9
In our example we want to improve the Fan strength, so we have selected ‘Maximise’.
For that goal, enter your minimum expected value (i.e. 7) and the target value.
Minitab gives the following output
10
212
Minitab gives the following output
11
Move the red lines and you will see the response value change.
Move all three lines to get the Y value you require.
213
214
215
