
Tech-Pro Consultants

Six Sigma Basic Statistics


March 2005
Dr. K.S.Ravichandran
Tech-Pro Consultants
Objectives
Review & Enhance The Basic Statistical & Quality Terms Needed
For Six Sigma Process Improvement
Begin To Enhance Minitab Operating Skills
A politician's promise: "If elected, I'll make certain that everybody gets an above-average income."
Tech-Pro Consultants
What is Statistics?
Statistics is the science that develops methods to effectively derive
information from numerical data.
Statistics is a collection of scientific methods for collecting,
organizing and interpreting data, usually with the goal of inferring
certain properties of the population from a representative sample of
the population.
Statistics is the science of collecting and classifying a group of facts according to
their relative number and determining certain values that represent
characteristics of the group.
"There are three kinds of lies: lies, damned lies, and statistics." - Mark Twain
Tech-Pro Consultants
Types of data
Measures of the Center of the data
Mean
Median
Mode
Measures of the Spread of Data
Range
Variance
Standard Deviation
Normal Distribution and Normal Probabilities
Process Stability and Process Capability
Basic Statistics
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Ask a statistician for her phone number... and get an estimate with 95% confidence
Tech-Pro Consultants
What sorts of data do you see being
collected around your area?
(List them below)
___________________________________________________
___________________________________________________
___________________________________________________
___________________________________________________
___________________________________________________
___________________________________________________
___________________________________________________
___________________________________________________
___________________________________________________
___________________________________________________
___________________________________________________
In God we trust. All others must bring data.
Tech-Pro Consultants
Two General Kinds of Data (but 3 families)
ATTRIBUTE DATA - The data is discrete (counted).
Results from using go/no-go gages, or from the inspection of
visual defects, visual problems, missing parts, or from
pass/fail or yes/no decisions.
VARIABLE DATA - The data is continuous
(measured). Results from the actual measuring of a
characteristic such as impedance of a motor winding, tensile
strength of steel, diameter of a pipe, flow rate of a pump, etc.
Statisticians do it discretely and continuously.
Tech-Pro Consultants
ATTRIBUTE DATA (Count Data)
(#1) Number of Items in a Category (Count-Based Proportions)
Heads / Tails (i.e., counting # of Heads and # of Tails)
Yes / No (Order Form Filled Out Accurately or Not)
Pass / Fail; Good / Bad (Accurate Billing/Overcharged)
(#2) Counts of Discrete Event Occurrences
# of Scratches on a Car Hood
# of Errors on a Form
# of Insulation Breaks in a Spool of Wire
# of times customer hangs up before receiving response
2 General Kinds of Data (but 3 families)
Different Types Of Data Require Different Analysis Tools
- VARIABLE DATA (Continuous Measurement Scale)
(#3) Continuous Data
> Decimal subdivisions are meaningful
Ex: Time to answer the telephone ( Exact # of secs. per call)
Just ask yourself, "Am I counting things here?" If yes, you have attributes data.
Type-I Attributes Data (Binomial)
Type-II Attributes Data (Poisson)
Tech-Pro Consultants
3 Families of Data
Manufacturing Process: Making Sheets of Glass

                                                          Sample#1  Sample#2  Sample#3  Sample#4
ATTRIBUTES DATA (Discrete Data: "Am I Counting Things?")
  TYPE-I: Any Bubbles? (accept / reject the entire item)   Reject    Reject    Accept    Reject    (Binomial Distribution)
  TYPE-II: Number of Bubbles?                              3         2         0         4         (Poisson Distribution)
VARIABLES DATA (Continuous Data, Measurement Data)
  Glass Weight                                             12.2      12.4      12.1      11.9      (Normal Distribution or Other)
Tech-Pro Consultants
3 Families of Data
Transactional Process: Converting an expense account form into a reimbursement check

                                                          Form#1    Form#2    Form#3    Form#4
ATTRIBUTES DATA (Discrete Data: "Am I Counting Things?")
  TYPE-I: Any Errors? (accept / reject the entire item)    Reject    Reject    Accept    Reject    (Binomial Distribution)
  TYPE-II: Number of Errors on Form?                       3         2         0         4         (Poisson Distribution)
VARIABLES DATA (Continuous Data, Measurement Data)
  Time to Reimburse Employee                               36.1 hrs  24.6 hrs  21.0 hrs  29.2 hrs  (Normal Distribution or Other)
Tech-Pro Consultants
CONTROL CHART FOR ATTRIBUTE DATA
[Figure: blank control chart form for attribute data (chart types p, np, c, u) with fields for plant, department, part number and name, operation number and name, date control limits calculated (Avg., UCL, LCL), average sample size, frequency, sample (n), number (np, c), proportion (p, u), and date (shift, time, etc.). Pass/Fail data from samples taken at 8:00am, 9:00am, and 10:00am are plotted on a 10% to 40% scale.]
Note printed on the form: Any change in people, equipment, materials, methods, environment, or measurement systems should be noted. These notes will help you to take corrective or process improvement action when signalled by the control chart.
She tells you that you are just average? Never mind, she is just being mean.
Tech-Pro Consultants
CONTROL CHART FOR ATTRIBUTE DATA
[Figure: the same control chart form, here used for Number of Blemishes data (scale 1 to 4) from samples taken every 10 minutes: 8:00am, 8:10am, 8:20am, 8:30am, 8:40am, 8:50am, 9:00am, 9:10am, etc.]
Tech-Pro Consultants
Exercise: Which Type of Data Is It?
DIRECTIONS: For each of the following applications, identify the type of data you
would be investigating (Attributes Type-I, Attributes Type-II, or Variables Data)
... AND EXPLAIN YOUR CHOICE
(1) Percent defective parts in hourly production
(2) Percent cream content in milk bottles (comes in four-bottle container sets)
(3) Amount of time it takes to respond to a request
(4) Number of blemishes per square yard of cloth, where pieces of cloth may be of variable size
(5) Daily test of water acidity (pH)
(6) Number of raisins per box of Raisin Bran
(7) Number of defective parts in lots of size 100
(8) Length of screws in samples of size ten from production lots
(9) Number of errors on a purchase order
Tech-Pro Consultants
The Probability Test
What is the largest probability possible? _______
What does this mean?
What is the smallest probability possible? _______
What does this mean?
What does a probability of 0.50 mean? _______________
What is the probability you will be struck by lightning during your lifetime? _____________________
What are your chances of appearing on The Tonight Show? ___________________
What is the probability of being killed by terrorists overseas? ____________________
What are your chances of being killed by an American in Baltimore? _______________
Tech-Pro Consultants
The Probability Test
Instructor Page: Answers
What is the largest probability possible? 1.0 = 100%
What does this mean?
What is the smallest probability possible? 0.0 = 0%
What does this mean?
What does a probability of 0.50 mean? 50%: just flip a coin
What is the probability you will be struck by lightning during your lifetime? 0.000001667 = 1/600,000
What are your chances of appearing on The Tonight Show? 0.00000204 = 1/490,000
What is the probability of being killed by terrorists overseas? 0.000001538 = 1/650,000
What are your chances of being killed by an American in Baltimore? 0.00025 = 1/4,000
Tech-Pro Consultants
The Probability Test (cont.)
Roll a fair die once; what is Prob(a six)? ______
Roll a fair die twice; what is Prob(a six on the second roll)? ______
Roll two fair dice; what is Prob(get two sixes)? ______
What do you think of the recent headline, "Education research shows 49.5% of all American high school students fall below the national average!"?
Tech-Pro Consultants
Probability
The Practical Problem Statement ...
The Customer Requirements
Suppose a certain customer permits only those
combinations which yield 3, 4, 5, . . . , or 11.
What is the process capability?
What is the probability of meeting the requirements?
Are capability and probability related?
Used With Permission
6 Sigma Academy Inc. 1995
Tech-Pro Consultants
Computing the Risks - The Statistical Problem Statement

Total of the two dice (rows = Die 1, columns = Die 2):

         1    2    3    4    5    6
    1    2    3    4    5    6    7
    2    3    4    5    6    7    8
    3    4    5    6    7    8    9
    4    5    6    7    8    9   10
    5    6    7    8    9   10   11
    6    7    8    9   10   11   12

Ways to form a 2 in ___ = ___
Ways to form a 12 in ___ = ___
Probability of Defect
Used With Permission
6 Sigma Academy Inc. 1995
Tech-Pro Consultants
Deeper Insight Into Probability
What is the probability of rolling a 5 using a fair pair of dice?

Die 1   Die 2   Probability
  1       4       .0278
  2       3       .0278
  3       2       .0278
  4       1       .0278
Total             .1111

[Table: 6 x 6 grid of Die 1 (rows 1 to 6) against Die 2 (columns 1 to 6); every cell has probability .0278, and the four cells whose values add up to 5 are highlighted.]
Used With Permission
6 Sigma Academy Inc. 1995
Tech-Pro Consultants
Establishing the Odds
Value Combinations Probability
2 1 .0278
3 2 .0556
4 3 .0833
5 4 .1111
6 5 .1389
7 6 .1667
8 5 .1389
9 4 .1111
10 3 .0833
11 2 .0556
12 1 .0278
Total 36 1.0000
Probability of any given value on Die 1 = 1/6 = .1667
Probability of any given value on Die 2 = 1/6 = .1667
Probability of any given combination = 1/6 x 1/6 = 1/36 = .0278
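The odds table above can be reproduced by brute-force enumeration. The following is a minimal sketch, assuming two fair six-sided dice; the variable names are illustrative and not part of the original material.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (die 1, die 2) pairs give each total.
combinations = Counter(d1 + d2 for d1, d2 in product(range(1, 7), repeat=2))

# Each pair has probability 1/6 x 1/6 = 1/36, so a total's probability is count/36.
probabilities = {total: Fraction(count, 36) for total, count in combinations.items()}

print("Value Combinations Probability")
for total in sorted(combinations):
    print(f"{total:>5} {combinations[total]:>12} {float(probabilities[total]):>11.4f}")
```

Using exact fractions avoids rounding until the values are printed, which is why the printed column matches the four-decimal figures in the table.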
Used With Permission
6 Sigma Academy Inc. 1995
Tech-Pro Consultants
Graphing the Results
Suppose a certain customer permits only those
combinations which yield 3, 4, 5, . . . , or 11.
[Figure: histogram of the Total of Dice Values (2 to 12), with the LSL and USL marked. 2.8% of outcomes fall below the LSL (a total of 2) and 2.8% fall above the USL (a total of 12). Zone of Customer Satisfaction: 94.4%.]
. . . Hence, the probability of Customer Satisfaction is 94.4%.
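The 94.4% figure can be checked directly from the customer requirement. This is a small sketch under the slide's assumptions (two fair dice, acceptable totals 3 through 11); `LSL` and `USL` here simply name the smallest and largest acceptable totals.

```python
from fractions import Fraction
from itertools import product

LSL, USL = 3, 11  # smallest and largest totals the customer accepts

# All 36 equally likely totals from two fair dice.
totals = [d1 + d2 for d1, d2 in product(range(1, 7), repeat=2)]
in_spec = sum(1 for t in totals if LSL <= t <= USL)

p_satisfied = Fraction(in_spec, len(totals))
p_defect = 1 - p_satisfied

print(f"P(meets requirement) = {float(p_satisfied):.3f}")
print(f"P(defect)            = {float(p_defect):.3f}")
```

The two defect outcomes (a total of 2 and a total of 12) each contribute 1/36, which is the 2.8% shown in each tail.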
Used With Permission
6 Sigma Academy Inc. 1995
Tech-Pro Consultants
Statistical Distributions
We can describe the behavior of any process or
system by plotting multiple data points for the
same variable
Over time
Across products or business
By different people, machines, etc...
The accumulation of these data can be viewed as
a distribution of values
Represented by:
Dot plots
Histograms
Normal curve or other smoothed distribution
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Tech-Pro Consultants
Process = Hose
1 Drop = 1 Unit of Output
Y = Weight (lbs), on an axis from 100 to 220 (center 160)
A histogram is ... a pile of individual values
Dotplot
: :
: :
. : . : :
: : : : : : : : : : : . :
. . ::.::::: :.:::.:.:.:.: : : : : . : . .
-----+---------+---------+---------+---------+---------+-C1
100 125 150 175 200 225
Tech-Pro Consultants
Dot Plots
[Figure: dot plot axis for Diameter, from 1.0 to 1.4 inches in steps of 0.05, with the 1st and 2nd observations marked as dots.]
Suppose we have a manufacturing line that is producing shafts.
Diameters range from 1.0 to 1.4 inches. As we make a measurement of a
shaft, we record the value with a dot on the above scale.
Ex:
1st Observation = 1.4 inches
2nd Observation = 1.1 inches
Tech-Pro Consultants
Dot Plots
Now suppose we continue sampling until 150 shafts have been measured.
What Statements Can You Make About Our Process?
:: :::. . .
:.. :::::: : :
. :.. ..::::::::::::::: :: ::.
:..: .::::::::::::::::::::::::.:::..:.: .
1.0   1.05   1.1   1.15   1.2   1.25   1.3   1.35   1.4
Diameter (inches)
Tech-Pro Consultants
Dot Plots
:: :::. . .
:.. :::::: : :
. :.. ..::::::::::::::: :: ::.
:..: .::::::::::::::::::::::::.:::..:.: .
1.0   1.05   1.1   1.15   1.2   1.25   1.3   1.35   1.4
Diameter (inches)
Now imagine the same data, grouped into intervals,
with bars used to represent how the data looks.
Tech-Pro Consultants
Histogram Distribution
[Figure: histogram of the shaft diameters; x-axis from 1.0 to 1.4 in steps of 0.05, y-axis Frequency from 0 to 35.]
Data represented with just dots is called a Dot Plot.
Data represented in the above bar format is called a Histogram.
Tech-Pro Consultants
Histogram
Now we've combined the Histogram with our Lower and Upper Specifications.
Question #1: What are Specifications? Where do they come from?
Question #2: What can you say about our process now?
[Figure: histogram on the diameter axis (1.0 to 1.4 in steps of 0.05) with Lower Specification = .001 and Upper Specification = 2.0.]
Tech-Pro Consultants
Histogram
Suppose the customer has given us new specifications!
Question: What can you say about our process now?
[Figure: the same histogram on the diameter axis (1.0 to 1.4 in steps of 0.05) with Lower Specification = 1.1 and Upper Specification = 1.3.]
Tech-Pro Consultants
Dotplot Distribution
Imagine a customer service help line in which the business knows that, to
stay competitive, it must return the customers' telephone calls in less
than 30 minutes. The actual response time was measured 150 times and
plotted below.
: : :::. . .
:.. :::::: : :
. :.. ..::::::::::::::: :: ::.
:..: .::::::::::::::::::::::::.:::..:.: .
-+---------+---------+---------+---------+-------
28.0 29.0 30.0 31.0 32.0
Tech-Pro Consultants
Process Capability Analysis
Smoothed (Normal) Distribution
[Figure: histogram of Time (axis from 26 to 34) with a fitted normal curve; Lower Spec and Upper Spec are marked.]

Statistic   Value
s           0.8986
Mean-3s     27.3735
Mean+3s     32.7649
Mean        30.0692
n           150.000
k           0.014
LSL         25.000
USL         35.000
Targ        *
Cpm         *
Cpk         1.83
CPL         1.88
CPU         1.83
Cp          1.85

            Obs    Exp
PPM<LSL     0      0.00
PPM>USL     0      0.00
%<LSL       0      0.00
%>USL       0      0.00

Finally, we can view the data as a smoothed distribution (red line), in this
example using the normal distribution assumption. It provides an
approximation of how the data might look if we were to collect an infinite
number of data points.
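The capability indices in the output above can be re-derived from the reported mean, standard deviation, and spec limits. The sketch below assumes the usual textbook definitions of Cp, CPU, CPL, and Cpk; that the software used exactly these formulas is an assumption, although the rounded results agree with the reported values.

```python
# Values read from the capability output (mean, s, LSL, USL).
mean, s = 30.0692, 0.8986
lsl, usl = 25.0, 35.0

cp = (usl - lsl) / (6 * s)    # spec width relative to the 6-sigma process width
cpu = (usl - mean) / (3 * s)  # room between the mean and the upper spec
cpl = (mean - lsl) / (3 * s)  # room between the mean and the lower spec
cpk = min(cpu, cpl)           # the worse of the two sides

print(f"Cp={cp:.2f} CPU={cpu:.2f} CPL={cpl:.2f} Cpk={cpk:.2f}")
```

Because the mean sits slightly above the midpoint of the spec range, CPU is the smaller one-sided index and therefore sets Cpk.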
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Tech-Pro Consultants
Forming the Normal Curve
[Figure: histogram bars over the Units of Measure axis, with a smooth curve interconnecting the center of each bar to form the normal curve. A Performance Limit at x = a separates the Area of Yield from the Probability of a Defect.]
$$p(x > a) = \int_{a}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left[\frac{x - \mu}{\sigma}\right]^{2}} \, dx$$
Given that 100% of the area under the normal curve lies
between -infinity and +infinity, we may
calculate that area which lies
beyond the performance limit.
Doing so would reveal the
random chance probability of
creating a defect.
Note: The tails of the normal curve will touch the baseline at infinity.
Used With Permission
6 Sigma Academy Inc. 1995
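The area beyond the performance limit can be evaluated numerically without tables. This is a minimal sketch using the complementary error function from Python's standard library; the process values in the example call (mean 30, standard deviation 1, limit 33) are hypothetical and chosen only for illustration.

```python
import math

def prob_defect(a, mu, sigma):
    """P(X > a) for X ~ Normal(mu, sigma): the area beyond the performance limit a."""
    z = (a - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical process: the limit sits 3 standard deviations above the mean.
print(f"{prob_defect(33.0, 30.0, 1.0):.5f}")
```

The identity used is P(Z > z) = erfc(z / sqrt(2)) / 2, which is the same integral as the density formula on the slide.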
Tech-Pro Consultants
Types of data
Measures of the Center of the Data
Mean
Median
Mode
Measures of the Spread of Data
Range
Variance
Standard Deviation
Shape: Normal Distribution and Normal Probabilities
Process Stability and Process Capability
Basic Statistics
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Tech-Pro Consultants
Data Example
(Actual # of Days from Order to Ship)
140 170 215 130 136 130 150
145 175 150 155 123 120 110
160 175 145 150 155 130 116
190 170 155 148 140 131 108
155 180 155 155 120 120 95
165 135 150 150 130 118 125
150 170 155 140 138 125 133
190 157 150 180 121 135 110
195 130 180 190 125 125 150
138 185 160 145 116 118 108
160 190 135 150 145 122
155 155 160 164 150 115
153 170 140 112 102
145 155 142 125 115
Tech-Pro Consultants
Where is the Center of the Data?
Described in 2 ways:
Mean = The average value
(the Center of Gravity)
$\bar{X} = \frac{\text{Sum of the data points}}{\text{Number of data points}}$
- Uses all data points
- Heavily influenced by
extreme values
Median = the 50% point
(or the middle number)
To find the median of a data set,
(1) arrange data in order from
smallest to largest
(2) the middle number is the median!
Example: 1, 2, 3, 14, 85
The median is 3
- Not heavily influenced by
extreme values
Tech-Pro Consultants
As head of the university's Communications Dept., you are asked
to summarize the average starting salaries of Communications
graduates.
$10, 20, 30, 40, 50 ($ in thousands)
What is the average income
(or center of gravity)?
What is the median
income?
However, on the advice of the Public Relations Dept., you consider
including one of your former Communications majors:
Shaquille O'Neal (a rather wealthy rookie basketball star)
$10, 20, 30, 40, 5000 ($ in thousands)
What is the average income
(or center of gravity)?
What is the median
income?
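The effect of one extreme value on the two measures of center can be seen with the salary lists from this slide. A brief sketch using Python's standard `statistics` module (the variable names are illustrative):

```python
from statistics import mean, median

salaries = [10, 20, 30, 40, 50]             # $ in thousands
salaries_with_star = [10, 20, 30, 40, 5000]  # same list with one extreme value

for label, data in (("without star", salaries), ("with star", salaries_with_star)):
    print(f"{label:>12}: mean = {mean(data)}, median = {median(data)}")
```

The median stays at the middle value in both cases, while the mean is pulled toward the single extreme salary.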
Tech-Pro Consultants
Where is the Center?
Mode (not used as much): The value that occurs most often.
The Mode may not exist; and if it does exist, it may
not be unique.
- Can be used with categorical/attribute data
What is the mode for the following set of defect data?
# of change notices issued:
- Price change: 13
- Spec change: 112
- Ship to address change: 40
- Delivery date changed: 79
What does "Bimodal" mean?
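For categorical counts like the change notices above, the mode is the category with the largest count. The sketch below also shows a data set with two modes; that second list is hypothetical and is not the salary data from any slide.

```python
from statistics import multimode

# Counts of change notices, as listed on the slide.
change_notices = {
    "Price change": 13,
    "Spec change": 112,
    "Ship to address change": 40,
    "Delivery date changed": 79,
}
mode_category = max(change_notices, key=change_notices.get)
print("Mode:", mode_category)

# Hypothetical values where two values tie for most frequent (a bimodal set).
values = [18, 18, 19, 20, 20, 25]
print("Modes:", multimode(values))
```

`statistics.multimode` (Python 3.8+) returns every value tied for the highest frequency, which makes a non-unique mode easy to detect.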
Tech-Pro Consultants
Breakout
Example
Suppose your son or daughter is
considering going to work for a small, family
owned business after graduation. The
owner of the business proudly states that,
of the last 7 college graduates hired, the
mean salary was $25,000; the salaries were
bimodal, with modes of $18,000 and
$20,000; and the median salary was
$19,000. He refuses to identify the
individual salaries
Use your knowledge of the mean, median,
From Introductory Statistics William D. Ergle
Tech-Pro Consultants
Exercise
Minitab can easily calculate the Mean and Median
1. Open up Minitab
2. Open file: Distskew.mtw
3. Perform the following: Stat > Basic Statistics > Descriptive Statistics
4. Enter the variable names
5. Evaluate results
Tech-Pro Consultants
Descriptive Statistics For 3 Distributions (TABULAR FORM)
Variable    N     Mean     Median   TrMean   StDev
Normal      500   70.000   69.977   70.014   10.000
Pos Skew    500   70.000   65.695   68.554   10.000
Neg Skew    500   70.000   73.783   71.368   10.000
Look For This In Your Session Window!
Tech-Pro Consultants
Graphical Form
Tech-Pro Consultants
Different Distributions
[Figure: three histograms (Frequency against value) titled "Comparison of Distributions": C1, a symmetric distribution (x-axis 20 to 110); C2, positive skew with its tail marked (x-axis 60 to 130); C3, negative skew with its tail marked (x-axis 0 to 80).]
Sketch in the Means and Medians on each Distribution.
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Tech-Pro Consultants
Graphical Reminder
* The 3 Charts On The Previous Page
Were Created Under The Minitab Histogram Option
Graph>Histogram
Tech-Pro Consultants
Relationship Of The Mean & Median
[Figure: the same three histograms with the mean and median marked. Normal (x-axis 20 to 110): the mean and median coincide. Neg Skew (x-axis 0 to 80): the median and mean are marked at different positions. Pos Skew (x-axis 60 to 130): the median and mean are marked at different positions.]
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Tech-Pro Consultants
Population Parameters vs Sample Statistics
μ = Population Mean
σ = Population Standard Deviation
Examples of POPULATION:
Entire United States
Year's Worth of Accounts Payable
Every Grain of Sand On The Beach
X-bar = Sample Mean
s = σ-hat = Sample Standard Deviation
Examples of SAMPLE:
1000 US Citizens
Hour's Worth of Accounts Payable
Handful of Sand
Tech-Pro Consultants
3 ways to describe how far the data is spread:
Range = R = the difference between the largest and smallest observations
Standard Deviation = s
Variance = $s^2$ (just the square of the std dev!)
Tech-Pro Consultants
CLASS EXERCISE
Calculate manually the Variance and Standard
Deviation of These 5 Data Points

  X     (X - X-bar)    (X - X-bar)^2
  5         2               4
  4        ___             ___
  3        ___             ___
  1        ___             ___
  2        -1               1

Avg = X-bar = (Sum of the data points) / (Number of data points) = ___
Sum of the last column = _______
Divide the Sum by (n-1): Variance = $S^2$ = __________
Square Root of the Variance: Std. Dev. = $S = \sqrt{S^2}$ = _________
Tech-Pro Consultants
CLASS EXERCISE
Calculate manually the Variance and Standard
Deviation of These 5 Data Points

  X     (X - X-bar)    (X - X-bar)^2
  5         2               4
  4         1               1
  3         0               0
  1        -2               4
  2        -1               1

Avg = X-bar = (Sum of the data points) / (Number of data points) = 3
Sum of the last column = 10
Divide the Sum by (n-1): Variance = $S^2$ = 2.5
Square Root of the Variance: Std. Dev. = $S = \sqrt{S^2}$ = 1.58
Instructor Page
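The manual steps of the class exercise translate line for line into code. A minimal sketch for the five data points on the slide:

```python
from math import sqrt

data = [5, 4, 3, 1, 2]
n = len(data)

avg = sum(data) / n                         # X-bar
deviations = [x - avg for x in data]        # (X - X-bar) column
squared = [d ** 2 for d in deviations]      # (X - X-bar)^2 column
sum_of_squares = sum(squared)               # sum of the last column
variance = sum_of_squares / (n - 1)         # divide the sum by (n - 1)
std_dev = sqrt(variance)                    # square root of the variance

print(avg, sum_of_squares, variance, round(std_dev, 2))
```

The standard library's `statistics.variance` and `statistics.stdev` use the same (n - 1) divisor, so they can serve as a cross-check.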
Tech-Pro Consultants
Computational Equations
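Population Mean
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$
Sample Mean
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
Population Standard Deviation
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
Sample Standard Deviation
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$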
Used With Permission
6 Sigma Academy Inc. 1995
Tech-Pro Consultants
The Standard Deviation

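[Figure: normal curve with the Point of Inflection marked 1σ from the mean, 3σ between the target (T) and the USL, and the tail area p(d) beyond the USL.]
Upper Specification Limit (USL)
Target Specification (T)
Lower Specification Limit (LSL)
Mean of the distribution (μ)
Standard Deviation of the distribution (σ)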
The distance between the point of inflection and
the mean constitutes the size of a standard
deviation. If three such deviations can be fit
between the target value and the specification limit,
we would say the process has three sigma
capability.
Used With Permission
6 Sigma Academy Inc. 1995
Tech-Pro Consultants
The Normal Distribution
The Normal Distribution is a distribution of
data which has certain consistent properties.
These properties are very useful in our
understanding of the characteristics of the
underlying process from which the data were
obtained.
Many natural phenomena and man-made
processes are distributed approximately normally, or can be
represented as normally distributed.
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Tech-Pro Consultants
The Normal Distribution
Property 1: A normal distribution can be
described completely by knowing only the:
mean, and
standard deviation
[Figure: three normal curves, labeled Distribution One, Distribution Two, and Distribution Three.]
What is the difference among these three normal distributions?
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Tech-Pro Consultants
Statistical Number Line
Exercise
Suppose the weights of players on a football
team had μ = 300 lbs and σ = 10 lbs.
You fill in the X-axis values (weights) below.
X Axis:            -3σ    -2σ    -1σ     μ     +1σ    +2σ    +3σ
Weight (pounds):   ___    ___    ___    300    ___    ___    ___
(Moving right from the mean: add 10, add 10, add 10.)
Tech-Pro Consultants
Statistical Number Line
Exercise
Suppose the weights of players on a football team
had μ = 300 lbs and σ = 10 lbs.
You fill in the X-axis values (weights).
X Axis:            -3σ    -2σ    -1σ     μ     +1σ    +2σ    +3σ
Weight (pounds):   270    280    290    300    310    320    330
Instructor Page
Tech-Pro Consultants
X Axis:            -3σ    -2σ    -1σ     μ     +1σ    +2σ    +3σ
Weight (pounds):   270    280    290    300    310    320    330
μ +/- 1σ = 68% of the individuals
Instructor Page
Tech-Pro Consultants
X Axis:            -3σ    -2σ    -1σ     μ     +1σ    +2σ    +3σ
Weight (pounds):   270    280    290    300    310    320    330
μ +/- 2σ = 95% of the individuals
Instructor Page
Tech-Pro Consultants
X Axis:            -3σ    -2σ    -1σ     μ     +1σ    +2σ    +3σ
Weight (pounds):   270    280    290    300    310    320    330
μ +/- 3σ = 99.7% of the individuals
Instructor Page
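The 68%, 95%, and 99.7% figures follow from the normal curve itself. A short sketch, assuming the football-team values from the exercise (mean 300 lbs, standard deviation 10 lbs) and an exactly normal distribution:

```python
import math

def fraction_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

mu, sigma = 300, 10  # pounds, from the exercise
for k in (1, 2, 3):
    low, high = mu - k * sigma, mu + k * sigma
    print(f"{low} to {high} lbs (+/- {k} sigma): {fraction_within(k):.2%}")
```

The fraction depends only on the number of standard deviations, not on the particular mean or standard deviation, which is why the same percentages apply to any normal distribution.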
Tech-Pro Consultants
The Normal Curve and Probability Areas
Associated with the Standard Deviation
[Figure: normal curve; x-axis: Number of standard deviations from the mean (-4 to 4); y-axis: Probability of sample value (0% to 40%). Bands mark 68% within 1, 95% within 2, and 99.73% within 3 standard deviations of the mean.]
Property 2: The area under sections of the curve
can be used to estimate the cumulative probability
of a certain event occurring.
Cumulative probability
of obtaining a value
between two values
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Tech-Pro Consultants
Empirical Rule of Standard Deviation
The previous rules of cumulative probability apply even when a set of data is
not perfectly normally distributed. Let's compare the values for a theoretical
(perfect) normal distribution to empirical (real-world) distributions.

Number of Standard Deviations   Theoretical Normal   Empirical Normal
+/- 1σ                          68%                  60-75%
+/- 2σ                          95%                  90-98%
+/- 3σ                          99.7%                99-100%
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Tech-Pro Consultants
How can I tell if my data is bell-shaped?
(i.e., Normally Distributed)
Tech-Pro Consultants
Normal Probability Plots
We can test whether a given data set can be described as normal
with a test called a Normal Probability Plot
If a distribution is close to normal, the normal probability plot will be a
straight line.
Minitab makes the normal probability plot easy. Using Distskew.mtw,
choose: Stat > Basic Stats > Normality Tests
Produce a normal plot of each of the first 3 columns. Which appear to
be normal?
Tech-Pro Consultants
3 Ways To See If Your Data Is Normally
Distributed
[Figure: histograms (Frequency against value) of C1 (x-axis 20 to 110), C2 (x-axis 60 to 130), and C3 (x-axis 0 to 80), each paired with its Normal Probability Plot (probability scale from .001 to .999).]

Anderson-Darling Normality Test
Distribution                    Average   Std Dev   N of data   A-Squared   p-value
Normal Distribution             70        10        500         0.418       0.328
Positive Skewed Distribution    70        10        500         46.447      0.000
Negative Skewed Distribution    70        10        500         43.953      0.000

If the Normality Test shows a P-value that is less than 0.05, then the data is NOT represented well by a normal distribution.
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Tech-Pro Consultants
P Value for Normality Test
If your P value is less than .05, then
the data is NOT approximately normal.
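To make the idea behind the test concrete, the sketch below computes an Anderson-Darling statistic by hand using only Python's standard library. It is an illustration, not a reproduction of Minitab's output: it does not compute a p-value, and the 0.752 cutoff is the commonly tabulated 5% critical value for the small-sample-adjusted statistic when the mean and standard deviation are estimated from the data (treat it as an assumption). Both data sets are synthetic.

```python
import math
from statistics import NormalDist, mean, stdev

CRITICAL_5PCT = 0.752  # assumed 5% critical value for the adjusted statistic

def anderson_darling_normal(data):
    """Adjusted Anderson-Darling statistic for a normality check."""
    x = sorted(data)
    n = len(x)
    m, s = mean(x), stdev(x)
    std_normal = NormalDist()
    cdf = [std_normal.cdf((v - m) / s) for v in x]
    total = 0.0
    for i in range(n):
        # (2i - 1) weight in 1-based indexing becomes (2i + 1) with 0-based i.
        total += (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
    a_squared = -n - total / n
    return a_squared * (1 + 0.75 / n + 2.25 / n ** 2)

# Synthetic data: evenly spaced quantiles of a normal and of an exponential curve.
n = 200
normal_like = [70 + 10 * NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
skewed = [-math.log(1 - (i + 0.5) / n) for i in range(n)]

for name, data in (("normal-like", normal_like), ("skewed", skewed)):
    stat = anderson_darling_normal(data)
    verdict = "looks normal" if stat < CRITICAL_5PCT else "NOT approximately normal"
    print(f"{name}: A*^2 = {stat:.3f} -> {verdict}")
```

A large statistic plays the same role as a small p-value: both indicate that the data depart from the straight line expected on a normal probability plot.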
Tech-Pro Consultants
Mystery Distribution
Generate a Normal Probability Plot for the Mystery variable
in Mystery.mtw
What is your conclusion? Is this a normal distribution?
[Figure: Normal Probability Plot of the Mystery variable (x-axis 50 to 150; probability scale from .001 to .999).]
Anderson-Darling Normality Test for the Mystery Distribution:
Average: 100
Std Dev: 32.3849
N of data: 500
A-Squared: 27.108
p-value: 0.000
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Tech-Pro Consultants
Central Limit Theorem
The central limit theorem states that the distribution of the sample means, our estimate of μ, can be
approximated with a normal distribution, provided the samples are reasonably large, even though the original population may be non-normal.
Given this, we may say that the grand average (resulting from averaging sets of samples) approaches
the universe mean as the number of sample sets approaches infinity. This property is at the core of
many statistical tests and is very important for resolving a wide array of industrial problems.
[Figure: various sampling distributions of individual measurements; a random sample of g sets, with n measurements assigned to each set, produces a distribution of the set averages (X-bar).]
Used With Permission
6 Sigma Academy Inc. 1995
For more detail, see
the next few pages.
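A small simulation illustrates the theorem. The sketch below draws g sets of n measurements from a clearly non-normal (exponential) population; the population choice, the seed, and the values of n and g are all assumptions made for illustration.

```python
import random
from statistics import mean, stdev

random.seed(2005)  # fixed seed so the run is repeatable

n, g = 25, 4000    # n measurements per set, g sets
# Exponential population with mean 1 and standard deviation 1 (strongly skewed).
sample_means = [mean([random.expovariate(1.0) for _ in range(n)]) for _ in range(g)]

grand_average = mean(sample_means)
spread_of_means = stdev(sample_means)

print(f"grand average   = {grand_average:.3f} (population mean is 1)")
print(f"spread of means = {spread_of_means:.3f} (sigma / sqrt(n) = {1 / n ** 0.5:.3f})")
```

Plotting `sample_means` as a histogram would show a roughly bell-shaped pile centered on the population mean, even though the individual measurements are far from normal.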
Tech-Pro Consultants
The Distribution of Averages
The Distribution of Individuals
VS
Important Distinctions:
Tech-Pro Consultants
What would the Distribution of Individuals look like?
[Figure: flashlight battery example; Y = Lifetime (Hrs) on an axis from 74 to 96, centered at 85. The legend distinguishes an Individual Measurement from the Average of the Subgroup. The shape of the Distribution of Individuals is left as a question.]
Tech-Pro Consultants
What would the Distribution of Individuals look like?
[Figure: flashlight battery example; Y = Lifetime (Hrs) on an axis from 74 to 96, centered at 85, with the Distribution of Individuals drawn in. The legend distinguishes an Individual Measurement from the Average of the Subgroup.]
Tech-Pro Consultants
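What would the Distribution of Averages look like?
[Figure: Y = Weight (lbs) on an axis from 9.5 to 10.5, centered at 10. The legend distinguishes an Individual Measurement from the Average of the Subgroup. The shape of the Distribution of Averages is left as a question.]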
Tech-Pro Consultants
What would the Distribution of Averages look like?
[Figure: Y = Weight (lbs) on an axis from 9.5 to 10.5, centered at 10, with the Distribution of Averages drawn in. The legend distinguishes an Individual Measurement from the Average of the Subgroup.]
Tech-Pro Consultants
                           Distribution of Individuals        Distribution of Averages
1 point is ...             1 Individual                       1 Avg (i.e., 1 X-Bar)
Histogram is ...           A Pile of Individuals              A Pile of X-Bars
Spread is ...              $\sigma_X$                         $\sigma_{\bar{X}} = \sigma_X / \sqrt{n}$, i.e. SE(Mean)
Graphically ...            axis from 74 to 96, center 85      same axis, compressed by $\sqrt{n}$
The question might be ...
  Individuals: What is the probability that an individual battery will last beyond 87 hours?
  Averages: What is the probability that the average lifetime of an n=20 sample will exceed 87 hours?
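The two questions can be answered side by side once a mean and standard deviation are fixed. The slide does not state them, so the sketch below assumes a mean of 85 hours and reads the 74-to-96 axis as mean +/- 3 sigma (sigma = 11/3 hours); both values are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist

mu = 85.0           # assumed mean lifetime (hours)
sigma = 11.0 / 3.0  # assumed: the axis 74..96 spans mean +/- 3 sigma
n = 20              # subgroup size from the slide
limit = 87.0        # hours

individuals = NormalDist(mu, sigma)
averages = NormalDist(mu, sigma / sqrt(n))  # spread compressed by sqrt(n)

p_individual = 1 - individuals.cdf(limit)
p_average = 1 - averages.cdf(limit)

print(f"P(individual battery > {limit} h) = {p_individual:.4f}")
print(f"P(average of {n} > {limit} h)      = {p_average:.4f}")
```

The same 87-hour limit is a modest distance from the mean for an individual but several standard errors away for an average of 20, so the second probability is far smaller.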
Tech-Pro Consultants
[Figure: distributions plotted on a common axis from 73 to 97 for Individuals (n=1) and for averages of subgroups of size n=2, n=4, n=12, n=20, and n=50; the spread narrows as n increases.]
Dist. of Avgs spread compresses by a factor of $\sqrt{n}$:
$$\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}}$$
Tech-Pro Consultants
Basic Statistics
Variability
Is the process on target with minimum variability?
We use the mean to determine if the process is on target.
We use the Standard Deviation (σ) to determine variability.
Stability
How does the process perform over time?
Represented by a constant mean and predictable variability over time.
Which process is the best process?
[Figure: X-Bar Chart for Process A; Sample Mean (65 to 75) against Sample Number (0 to 25); center line = 70.91, UCL = 77.20, LCL = 64.62.]
[Figure: X-Bar Chart for Process B; Sample Mean (50 to 80) against Sample Number (0 to 25); center line = 70.98, UCL = 77.27, LCL = 64.70.]
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Tech-Pro Consultants
While every process displays variation, some processes display
controlled variation, while other processes display uncontrolled
variation (Walter Shewhart).
Controlled Variation is characterized by a stable and consistent
pattern of variation over time. Associated with Common Causes.
Uncontrolled Variation is characterized by variation that changes
over time. Associated with Special Causes.
Process A shows controlled variation.
Process B shows uncontrolled variation (Special Causes Variation).
[Figure: X-Bar Chart for Process A; Sample Mean (65 to 75) against Sample Number (0 to 25); center line = 70.91, UCL = 77.20, LCL = 64.62.]
[Figure: X-Bar Chart for Process B; Sample Mean (50 to 80) against Sample Number (0 to 25); center line = 70.98, UCL = 77.27, LCL = 64.70.]
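One simple signal of uncontrolled variation is a sample mean falling outside the control limits. The sketch below uses the UCL and LCL shown on the Process B chart; the sample means themselves are hypothetical, since the chart's underlying data are not given.

```python
UCL, LCL = 77.27, 64.70  # control limits from the Process B chart

# Hypothetical subgroup means, for illustration only.
sample_means = [70.1, 72.4, 69.8, 55.2, 71.0, 78.9, 70.6]

# Flag any subgroup whose mean lies beyond either control limit.
out_of_control = [
    (number, value)
    for number, value in enumerate(sample_means, start=1)
    if value > UCL or value < LCL
]

for number, value in out_of_control:
    print(f"Sample {number}: mean {value} is outside [{LCL}, {UCL}]")
```

A point beyond the limits is only one of several signals used on control charts; runs and trends inside the limits can also point to special causes.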
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Tech-Pro Consultants
Can We Tolerate Variability?
There will always be variability present in any process.
We can tolerate variability if:
The total variability of the Output is relatively small compared to the
process specifications and the process is on target
The process is stable over time
[Figure: two Cost curves drawn over the LSL, Nom, USL range. OLD: the traditional "Goal Post Mentality", with an Acceptable zone between the LSL and USL. New: a second Cost curve over the same range.]
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Tech-Pro Consultants
Expanding On The Goal Post Mentality
LSL
USL Nom
UNDER THE OLD RULES,
The field goal kicker gets 3 points for his team as long as
the ball falls between the LSL and USL.
3 Points
Tech-Pro Consultants
Expanding On The Goal Post Mentality
LSL
USL Nom
UNDER THE NEW RULES,
The Field Goal Kicker Might Get...
3 points: Target & +/-1σ
2 points: Between +/-1σ & +/-2σ
1 point: > +/-2σ, Out To The LSL & USL
[Figure: goal posts divided into zones scoring 1, 2, 3, 2, 1 Points from the LSL to the USL.]
Tech-Pro Consultants
Data Analysis Tasks For Improvement
Determine If Process Is Stable
If process is not stable, identify and remove causes of
instability
Determine The Location Of The Process Mean.
Is It On Target?
If not, identify the variables which affect the mean and
determine optimal settings to achieve target value
Estimate The Magnitude Of The Total Variability.
Is it acceptable with respect to the customer requirements (spec limits)?
If not, identify the sources of the variability and eliminate or
reduce their influence on the process
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Tech-Pro Consultants
Visualizing the Process Dynamics - Is The Process Stable?
Inherent Capability of the Process
. . . also called short-term capability
Sustained Capability of the Process
. . . also called long-term capability
General Assumption: Over time, a typical process
will shift and drift by approx. 1.5σ
[Figure: short-term distributions at Time 1, Time 2, Time 3, and Time 4, shifting between the LSL, T, and USL, shown together with the wider long-term distribution.]
Used With Permission
6 Sigma Academy Inc. 1995
Tech-Pro Consultants
The Goal Is ...
Variables Data: Target
Attributes Data: 0% Rejected
Tech-Pro Consultants
How We Progress Toward The Goal
PHASE ONE - Unpredictable Performance
- VARIATION (SPECIAL / NATURAL CAUSES)
- UNPREDICTABLE (HOURLY, DAILY)
- DETECT AND ELIMINATE SPECIAL CAUSES
PHASE TWO - Stability
- IN CONTROL
- NATURAL VARIATION ONLY
Not capable of getting all the water output into the clown's mouth?
Tech-Pro Consultants
How We Progress Toward The Goal
IN CONTROL, BUT NOT CAPABLE
(Variation from common causes excessive)
IN CONTROL AND CAPABLE
(Variation from common causes reduced)
Now it is capable of getting all the water output into the clown's mouth.
[Figure: distribution of SIZE between the LOWER SPECIFICATION LIMIT and the UPPER SPECIFICATION LIMIT; diameter axis from 1.0 to 1.4 in steps of 0.05, with Lower Specification = .001 and Upper Specification = 2.0.]
Tech-Pro Consultants
Is The Process on Target ? - Accurate ?
[Figure: Manufacturing Distribution of the Widget Part on an axis from 1.233 to 1.247, with the LSL, T, and USL marked; the distribution is drawn at five positions (1 to 5) to show the increase in nonconformance due to a shift in process centering.]
Recognize that the process center (μ) is
independent of the design center (T). In
other words, the ability of a process to
repeat any given centering condition is
independent of the design specifications.
Used With Permission
6 Sigma Academy Inc. 1995
Tech-Pro Consultants
Is The Process on Target? - Precise?
[Figure: Manufacturing Distribution of the Widget Part on an axis from 1.235 to 1.247, with the LSL, T, and USL marked.]
Recognize that the process width is
independent of the design width. In
other words, the inherent precision of
a process is not determined by the
design specifications.
Used With Permission
6 Sigma Academy Inc. 1995
Tech-Pro Consultants
Is The Variability Acceptable
To Customer Requirements ?
$$Y = f(X_1, \ldots, X_N)$$
The variation inherent to any dependent variable (Y) is determined by
the variations inherent to each of the independent variables.
[Figure: two distributions drawn between the LSL and USL. Poor Process Capability: Very High Probability of Defects in both tails. Excellent Process Capability: Very Low Probability of Defects in both tails.]
Used With Permission
6 Sigma Academy Inc. 1995
Tech-Pro Consultants
Summary
Reviewed & Enhanced The Basic Statistical & Quality Terms
Needed For Six Sigma Process Improvement
Began to Build Up Minitab Operating Skills
Tech-Pro Consultants
Six Sigma