Analyze Phase: Inferential Statistics
Inferential Statistics
Welcome to Analyze
Hypothesis Testing ND P1
Hypothesis Testing ND P2
OSSS LSS Black Belt v10.3 - Analyze Phase 2 © Open Source Six Sigma, LLC
Nature of Inference
So many questions…?
Types of Error
1. Error in sampling
– Error due to differences among samples drawn at random from the
population (luck of the draw).
– This is the only source of error that statistics can accommodate.
2. Bias in sampling
– Error due to lack of independence among random samples or due to
systematic sampling procedures (for example, estimating height from
horse jockeys only).
3. Error in measurement
– Error in the measurement of the samples (MSA/GR&R).
4. Lack of measurement validity
– The measurement does not actually measure what it is intended to
measure (for example, placing a probe in the wrong slot, or measuring
temperature with a thermometer that sits right next to a furnace).
Population, Sample, Observation
Population
– EVERY data point that has ever been or ever will be generated
from a given characteristic.
Sample
– A portion (or subset) of the population, either at one time or over
time.
Observation
– An individual measurement.
Significance
* RORI includes not only dollars and assets but the time and participation of your teams.
The Mission
[Figure: three ways to improve a process: Mean Shift, Variation Reduction, or Both]
Your mission, which you have chosen to accept, is to change the output metric of some
process: reduce cycle time, error rate, costs, investment or lead time, or improve
service level, throughput or productivity.
In statistical terms, this translates to the need to move the process Mean and/or reduce
the process Standard Deviation.
You’ll be making decisions about how to adjust key process input variables based on
sample data, not population data, which means you are taking some risks.
How will you know your key process output variable really changed, and that the result
is not just an unlikely sample? The Central Limit Theorem helps us understand the risk
we are taking and is the basis for using sampling to estimate population parameters.
A Distribution of Sample Means
The Central Limit Theorem says that as the sample size becomes
large, this new distribution (the distribution of sample Means)
approaches a Normal Distribution, whatever the shape of the population
distribution of individuals, provided the population has a finite
Standard Deviation.
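The claim above can be explored with a small simulation. The sketch below is illustrative only: it assumes a strongly skewed population (an exponential distribution with Mean 1, chosen here as an example and not taken from the course material) and reports how the skewness of the sample Means changes as the sample size n grows. The helper names `skewness` and `sample_means` are hypothetical.

```python
import random
import statistics


def skewness(values):
    """Skewness: mean cubed deviation divided by the cubed standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


def sample_means(n, num_samples, rng):
    """Draw num_samples samples of size n from an exponential population
    (assumed for illustration) and return the Mean of each sample."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]


rng = random.Random(42)  # fixed seed so the run is repeatable
for n in (1, 5, 30):
    means = sample_means(n, 5000, rng)
    print(
        f"n={n:2d}  mean={statistics.fmean(means):.3f}  "
        f"sd={statistics.pstdev(means):.3f}  skewness={skewness(means):.2f}"
    )
```

A skewness near zero is one sign that the histogram of sample Means is becoming symmetric, as a Normal Distribution is. It is not a full test of Normality.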
Sampling Distributions—The Foundation of Statistics
Population (values listed on the slide): 3, 5, 2, 12, 10, 1, 6, 12, 5, 6, 12, 14, 3, 6, 11, 9, 10, 10, 12
• Samples from the population, each with five observations:
        Sample 1   Sample 2   Sample 3
            1          9          2
           12          8          3
            9          5          6
            7         14         11
            8         10         10
Mean      7.4        9.2        6.4
• In this example, we have taken three samples out of the population, each with five observations in it. We computed a mean for each sample. Note that the means are not the same!
• Why not?
• What would happen if we kept taking more samples?
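The two questions above can be tried out in code. The sketch below recomputes the three sample Means from the slide, then keeps drawing further samples of five. It assumes that the single column of numbers on the slide is a listing (possibly partial) of the population, and that samples are drawn without replacement. The function name `draw_sample_means` is hypothetical.

```python
import random
import statistics

# The three samples shown on the slide, five observations each.
samples = {
    "Sample 1": [1, 12, 9, 7, 8],
    "Sample 2": [9, 8, 5, 14, 10],
    "Sample 3": [2, 3, 6, 11, 10],
}
slide_means = {name: statistics.fmean(obs) for name, obs in samples.items()}
for name, mean in slide_means.items():
    print(f"{name}: mean = {mean:.1f}")

# Population values as listed on the slide (assumption: the listing may be partial).
population = [3, 5, 2, 12, 10, 1, 6, 12, 5, 6, 12, 14, 3, 6, 11, 9, 10, 10, 12]


def draw_sample_means(values, n, num_samples, rng):
    """Return the Means of num_samples random samples of size n (without replacement)."""
    return [statistics.fmean(rng.sample(values, n)) for _ in range(num_samples)]


rng = random.Random(1)  # fixed seed so the run is repeatable
more_means = draw_sample_means(population, 5, 1000, rng)
print(f"population mean      = {statistics.fmean(population):.2f}")
print(f"mean of 1000 means   = {statistics.fmean(more_means):.2f}")
print(f"spread (sd) of means = {statistics.pstdev(more_means):.2f}")
```

Each sample holds different observations, so each sample Mean differs. As more samples are taken, the sample Means pile up around the population Mean.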
Constructing Sampling Distributions
Sampling Distributions
Sampling Error
Calculate the Mean and Standard Deviation for each column and
compare the sample statistics to the population.
Stat > Basic Statistics > Display Descriptive Statistics…
Select all 6 columns into the “Variables” window.
Sampling Error - Reduced
Calculate the Mean and Standard Deviation for each column and
compare the sample statistics to the population.
Stat > Basic Statistics > Display Descriptive Statistics…
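For readers without Minitab, the same comparison can be approximated in Python. This is a minimal sketch under stated assumptions: the course worksheet is not reproduced here, so the population (10,000 values from a Normal process with Mean 50 and Standard Deviation 10), the six columns, and the sample sizes 5 and 50 are all hypothetical, as are the function names.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population standing in for the course worksheet.
population = [rng.gauss(50, 10) for _ in range(10_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)


def describe_columns(values, n, num_columns, rng):
    """Return (Mean, Standard Deviation) for each of num_columns random samples of size n."""
    columns = [rng.sample(values, n) for _ in range(num_columns)]
    return [(statistics.fmean(col), statistics.stdev(col)) for col in columns]


def worst_mean_error(stats, true_mean):
    """Largest absolute gap between a column Mean and the population Mean."""
    return max(abs(mean - true_mean) for mean, _ in stats)


print(f"population: mean={pop_mean:.2f} sd={pop_sd:.2f}")
for n in (5, 50):
    stats = describe_columns(population, n, 6, rng)
    for i, (mean, sd) in enumerate(stats, start=1):
        print(
            f"n={n:2d} column {i}: mean={mean:.2f} sd={sd:.2f} "
            f"error in mean={mean - pop_mean:+.2f}"
        )
    print(f"n={n:2d} worst error in mean: {worst_mean_error(stats, pop_mean):.2f}")
```

The gap between each column's statistics and the population values is the sampling error. With larger samples that gap tends to shrink, although any single small set of columns can still get lucky or unlucky.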
Feeling lucky…?
Sampling Distributions
Select “Same X, including same bins” to facilitate comparison.
Different Distributions
[Figure: distribution of Sample Means compared with the distribution of Individuals]
Observations
Good news: the Mean of the sample Mean distribution is the Mean of the population.
Better news: I can reduce my uncertainty about the population Mean by increasing my sample size n.
Central Limit Theorem
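If all possible random samples, each of size n, are taken from any
population with a Mean μ and Standard Deviation σ, the distribution
of sample Means will:
– have a Mean equal to the population Mean μ
– have a Standard Deviation equal to σ/√n
– approach a Normal Distribution as n becomes large
Bigger is Better!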
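The phrase "all possible random samples" can be taken literally for a very small population. The sketch below is an illustration only: it assumes a hypothetical six-value population and sampling with replacement, enumerates every ordered sample of size n, and compares the Mean and Standard Deviation of the sample Means with μ and with σ/√n, the standard result for the spread of sample Means. The function name `sampling_distribution` is hypothetical.

```python
import itertools
import math
import statistics


def sampling_distribution(population, n):
    """Means of every possible ordered sample of size n, drawn with replacement."""
    return [
        statistics.fmean(sample)
        for sample in itertools.product(population, repeat=n)
    ]


population = [1, 2, 3, 4, 5, 6]        # hypothetical tiny population
mu = statistics.fmean(population)      # population Mean
sigma = statistics.pstdev(population)  # population Standard Deviation

for n in (1, 2, 4):
    means = sampling_distribution(population, n)
    print(
        f"n={n}: {len(means)} samples, "
        f"mean of means={statistics.fmean(means):.4f} (mu={mu:.4f}), "
        f"sd of means={statistics.pstdev(means):.4f} "
        f"(sigma/sqrt(n)={sigma / math.sqrt(n):.4f})"
    )
```

Enumeration is only practical for tiny populations and small n, since the number of samples grows as the population size raised to the power n. For real processes the same relationships are relied on without listing every sample.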
So What?
A Practical Example
Sample Size and the Mean
[Figure: theoretical distribution of sample Means for n = 2]
Standard Error of the Mean
[Figure: Standard Error (vertical axis) versus Sample Size (horizontal axis, 0 to 30)]
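The shape of a Standard Error versus Sample Size curve can be reproduced from the standard formula for the Standard Error of the Mean, σ/√n. The sketch below is illustrative: the population Standard Deviation of 10 is a hypothetical value, not one taken from the course data, and `standard_error` is a hypothetical helper name.

```python
import math


def standard_error(sigma, n):
    """Standard Error of the Mean: sigma divided by the square root of the sample size."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / math.sqrt(n)


sigma = 10.0  # hypothetical population Standard Deviation
for n in (1, 5, 10, 20, 30):
    print(f"n={n:2d}  standard error={standard_error(sigma, n):.3f}")
```

Because of the square root, the returns diminish: quadrupling the sample size only halves the Standard Error.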