
Analyze Phase

Hypothesis Testing Non Normal Data Part 1

Welcome to Analyze

“X” Sifting

Inferential Statistics

Intro to Hypothesis Testing

Hypothesis Testing ND P1

Hypothesis Testing ND P2
Equal Variance Tests
Hypothesis Testing NND P1
Tests for Medians
Hypothesis Testing NND P2

Wrap Up & Action Items

LSS Green Belt v11.1 MT - Analyze Phase 2 © Open Source Six Sigma, LLC
Non-Normal Hypothesis Tests

At this point we have covered the tests for determining significance for
Normal Data. We will continue to follow the roadmap to complete the
tests for Non-normal Continuous Data.

Later in the module we will use another roadmap designed for Discrete Data.
– Recall that Discrete Data does not follow a Normal Distribution, but
because it is not Continuous Data a separate set of tests is needed
to analyze it properly.

We can test for anything!!

Non-Normality

Why do we care if a data set is Normally Distributed?


– When it is necessary to make inferences about the true nature of
the population based on random samples drawn from the
population.
– When the two indices of interest (X-Bar and s) depend on the
data being Normally Distributed.
– For problem solving purposes, because we do not want to make a
bad decision. Normality matters for so many statistical tests that
the first thing we do is check for it.

Recall the four primary causes for Non-normal Data:


– Skewness: Natural and Artificial Limits
– Mixed Distributions: Multiple Modes
– Kurtosis
– Granularity

We will focus on Skewness for the remaining tests for Continuous Data.
Hypothesis Testing Roadmap

[Roadmap diagram, Non Normal branch: Test of Equal Variance; Median Test; Mann-Whitney; Several Median Tests]

Test of Equal Variance

Levene’s Test of Equal Variance is used to compare the estimated
population Standard Deviations from two or more samples with
Non-normal Distributions.

– Ho: σ1 = σ2 = σ3 …
– Ha: At least one is different.
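To make the idea behind the test concrete, here is a minimal sketch of how a Levene-type statistic can be computed by hand. It assumes the median-centered (Brown-Forsythe) form of the statistic; the function name and the two small samples are hypothetical and are not taken from any worksheet in this module. The P-value still has to come from an F distribution, which this sketch does not compute.

```python
from statistics import mean, median


def levene_statistic(*groups):
    """Median-centered Levene statistic (Brown-Forsythe form), assumed here for illustration."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every observation from its own group Median.
    deviations = []
    for g in groups:
        center = median(g)
        deviations.append([abs(x - center) for x in g])
    group_means = [mean(d) for d in deviations]
    grand_mean = sum(sum(d) for d in deviations) / n_total
    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(d) * (m - grand_mean) ** 2
                  for d, m in zip(deviations, group_means))
    within = sum((z - m) ** 2
                 for d, m in zip(deviations, group_means) for z in d)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical samples, not data from the course worksheets.
sample_a = [1, 2, 3, 4, 5]
sample_b = [2, 4, 6, 8, 10]
print(round(levene_statistic(sample_a, sample_b), 3))
```

A large statistic (and therefore a small P-value) suggests at least one Standard Deviation differs; a small statistic gives no reason to reject Ho.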

Follow the Roadmap…

Open the MINITAB™ worksheet “EXH_AOV.MTW”

Stat > Basic Statistics > Normality test…

[Figure: Probability Plot of Rot 2 (Normal). Mean 1.023, StDev 1.407, N 100, AD 7.448, P-Value <0.005. X axis: Rot 2; Y axis: Percent]

P-value < 0.05 (0.00). Assume data is not Normally Distributed.

Test of Equal Variance Non-Normal Distribution

Stat>ANOVA>Test for Equal Variance

Use Levene’s Statistics for Non-Normal Data.
Ho: σ1 = σ2 = σ3 …
Ha: At least one is different.

[Figure: Test for Equal Variances for Rot 2, by Factors2 (levels 1 and 2). Upper panel: 95% Bonferroni Confidence Intervals for StDevs. Lower panel: Rot 2 by Factors2]

F-Test: Test Statistic 1.75, P-Value 0.053
Levene's Test: Test Statistic 0.03, P-Value 0.860

P-value > 0.05 (0.860). Assume variance is equal.

Making Conclusions

When testing 2 samples with Normal Distribution use F-test:
– To determine whether two Normal Distributions have Equal Variance.

When testing >2 samples with Normal Distribution use Bartlett’s test:
– To determine whether multiple Normal Distributions have Equal Variance.

When testing two or more samples with Non-normal Distributions use Levene’s Test:
– To determine whether two or more distributions have Equal Variance.

Our focus for this module is working with Non-normal Distributions.
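The selection rule above can be written as a small lookup. This is only a sketch of the rule as stated on this slide; the function name and its arguments are hypothetical.

```python
def choose_variance_test(n_samples: int, all_normal: bool) -> str:
    """Return the Equal Variance test suggested by the rule above."""
    if n_samples < 2:
        raise ValueError("At least two samples are needed to compare variances.")
    if not all_normal:
        # Two or more samples, at least one Non-normal.
        return "Levene's Test"
    # All samples Normal: the choice depends on the number of samples.
    return "F-test" if n_samples == 2 else "Bartlett's test"


print(choose_variance_test(2, all_normal=False))  # Levene's Test
```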

Exercise

Exercise objective: To practice solving a problem presented using the
appropriate Hypothesis Test.

A credit card company wants to understand the need for customer service
personnel. The company thinks there is variability impacting the
efficiency of its customer service staff. The credit card company has
two types of cards. The company wants to see if there is more
variability in one type of customer card than another. The Black Belt
was selected and asked to state, with 95% confidence, whether the
variability is similar between the two card types.

1. Analyze the problem using the Hypothesis Testing roadmap.
2. Use the columns named CallsperWk1 and CallsperWk2 in
Minitab worksheet “Hypoteststud.mtw”.
3. Having a confidence level of 95% is there a difference in
variance?

Test for Equal Variance Example: Solution

First test to see if the Data is Normal or Non-Normal.

Test for Equal Variance Example: Solution

Since there are two variables we need to perform a Normality Test
on CallsperWk1 and CallsperWk2.

First select the variable ‘CallsperWk1’ and press “OK”.

Follow the same steps for CallsperWk2.

Test for Equal Variance Example: Solution

For the Data to be Normal the P-value must be greater than 0.05.

Test for Equal Variance Example: Solution

Since we know the variables are Non-Normal Data continue to follow
the Roadmap.

The next step is to test Calls/Week for Equal Variance.

Before performing a Levene’s Test we have to stack the columns for
CallsperWk1 and CallsperWk2 because currently the data is in
separate columns.
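Stacking simply means putting all observations in one column and recording the source column in a second one. The sketch below shows the idea outside of MINITAB; the function name is hypothetical and the numbers are made up, not values from “Hypoteststud.mtw”.

```python
def stack_columns(columns: dict[str, list[float]]) -> tuple[list[float], list[str]]:
    """Stack several columns into one value column plus one subscript (label) column."""
    values: list[float] = []
    labels: list[str] = []
    for name, data in columns.items():
        values.extend(data)
        # One label per observation so every value keeps its origin.
        labels.extend([name] * len(data))
    return values, labels


# Hypothetical call counts, for illustration only.
values, labels = stack_columns({
    "CallsperWk1": [710.0, 745.0, 760.0],
    "CallsperWk2": [690.0, 780.0],
})
print(values)
print(labels)
```

The stacked value column is the response and the label column is the factor in the Test for Equal Variances dialog.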

Test for Equal Variance Example: Solution

Stat>ANOVA>Test for Equal Variances

After stacking the Calls/Week columns the next step in the Roadmap
is performing a Levene’s Test.

Nonparametric Tests

A non-parametric test makes no assumptions about Normality.

For a skewed distribution:
– The appropriate statistic to describe the central tendency is the
Median rather than the Mean.
– If just one distribution is not Normal a non-parametric test should be
used.

Non-parametric Hypothesis Testing works the same way as parametric
testing. Evaluate the P-value in the same manner.
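[Figure: a sample Median X̃ compared with a Target (difference δ), and two sample Medians X̃1 and X̃2 compared with each other]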

Mean and Median

This Graphical Summary provides the confidence interval for the Median.

With Normal Data notice the symmetrical shape of the distribution and
how the Mean and the Median are centered. With skewed data the Mean is
influenced by the Outliers. Notice the Median is still centered.

[Figure: two Graphical Summaries (histogram plus 95% Confidence Intervals for Mean and Median), Normal data on the left and skewed data on the right]

Statistic                              Normal data          Skewed data
Anderson-Darling A-Squared             0.30                 3.72
Anderson-Darling P-Value               0.574                < 0.005
Mean                                   350.51               4.8454
StDev                                  5.01                 3.1865
Variance                               25.12                10.1536
Skewness                               -0.079532            1.11209
Kurtosis                               -0.635029            1.26752
N                                      75                   200
Minimum                                339.09               0.1454
1st Quartile                           347.48               2.4862
Median                                 350.48               4.1533
3rd Quartile                           353.99               6.5424
Maximum                                359.53               16.4629
95% Confidence Interval for Mean       349.35 to 351.66     4.4011 to 5.2898
95% Confidence Interval for Median     349.30 to 351.85     3.6296 to 4.7174
95% Confidence Interval for StDev      4.32 to 5.97         2.9018 to 3.5336
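A quick way to see why the Median is preferred for skewed data is to compute both statistics on a small sample with one large Outlier. The data below are hypothetical and chosen only to make the effect obvious.

```python
from statistics import mean, median

# Hypothetical right-skewed sample: most values are small, one is an Outlier.
skewed = [1, 2, 2, 3, 3, 4, 5, 40]

sample_mean = mean(skewed)      # pulled toward the Outlier
sample_median = median(skewed)  # stays with the bulk of the data

print(sample_mean, sample_median)  # 7.5 3.0
```

Removing the value 40 changes the Mean considerably but leaves the Median at 3, which is the behavior the two summaries above illustrate.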

MINITAB™’s Nonparametrics

1-Sample Sign: performs a one-sample sign test of the Median and calculates
the corresponding point estimate and confidence interval. Use this test as
an alternative to one-sample Z and one-sample t-tests.
1-Sample Wilcoxon: performs a one-sample Wilcoxon signed rank test of the
Median and calculates the corresponding point estimate and confidence
interval (more discriminating or efficient than the sign test). Use this test as
a nonparametric alternative to one-sample Z and one-sample t-tests.
Mann-Whitney: performs a Hypothesis Test of the equality of two population
Medians and calculates the corresponding point estimate and confidence
interval. Use this test as a nonparametric alternative to the two-sample t-
test.
Kruskal-Wallis: performs a Hypothesis Test of the equality of population
Medians for a one-way design. This test is more powerful than Mood’s
Median (the confidence interval is narrower, on average) for analyzing data
from many populations, but is less robust to outliers. Use this test as an
alternative to the one-way ANOVA.
Mood’s Median Test: performs a Hypothesis Test of the equality of population
Medians in a one-way design. Test is similar to the Kruskal-Wallis Test.
Also referred to as the Median test or sign scores test. Use as an alternative
to the one-way ANOVA.
1-Sample Sign Test

This test is used to compare the Median of one distribution to a
target value.
– Must have at least one column of numeric data. If there is more
than one column of data MINITAB™ performs a one-sample
sign test separately for each column.
The hypotheses:
– H0: M = Mtarget
– Ha: M ≠ Mtarget
Interpretation of the resulting P-value is the same as for parametric tests.

Note: For the purpose of calculating sample size for a non-parametric
(Median) test use:

$$ n_{\text{non-parametric}} = \frac{n_{t\text{-test}}}{0.864} $$
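As a worked example of the sample size note, the sketch below applies the 0.864 divisor and rounds up to a whole number of samples. The function name is hypothetical, and the starting value of 30 is just an illustrative t-test sample size, not one from this module.

```python
import math


def nonparametric_sample_size(n_t_test: int) -> int:
    """Inflate a t-test sample size by the 0.864 factor from the note above."""
    # Round up: a fractional sample cannot be collected.
    return math.ceil(n_t_test / 0.864)


print(nonparametric_sample_size(30))  # 35
```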

1-Sample Example

1. Practical Problem:
Our facility requires a cycle time from an improved process of 63 minutes. This
process supports the customer service division and has become a bottleneck to
completion of order processing. To alleviate the bottleneck the improved process
must perform at least at the expected 63 minutes.

2. Statistical Problem:
Ho: M = 63
Ha: M ≠ 63

3. 1-Sample Sign or 1-Sample Wilcoxon

Open the MINITAB™ worksheet: DISTRIB1.MTW

Stat>Nonparametrics>1-Sample Sign…
Or
Stat>Nonparametrics>1-Sample Wilcoxon…

4. Sample Size:
This data set has 500 samples (well in excess of necessary sample size).

1-Sample Example

Stat>Nonparametrics>1-Sample Sign…

For a two tailed test choose “not equal” for the alternative hypothesis.

Sign Test for Median: Pos Skew
Sign Test of Median = 63.00 versus ≠ 63.00
            N  Below  Equal  Above       P  Median
Pos Skew  500     37      0    463  0.0000   65.70

1-Sample Example

Stat>Nonparametrics>1-Sample Wilcoxon…

Wilcoxon Signed Rank Test: Pos Skew
Test of Median = 63.00 versus Median not = 63.00

               N for   Wilcoxon         Estimated
            N   Test  Statistic      P     Median
Pos Skew  500    500   124015.0  0.000      67.83

1-Sample Example

For a confidence interval enter the desired level.

Stat>Nonparametrics>1-Sample Sign…

Sign confidence interval for Median
                                        Confidence
                          Achieved       Interval
            N  Median   Confidence    Lower   Upper  Position
Pos Skew  500   65.70       0.9455    65.30   66.50       229
                            0.9500    65.26   66.50       NLI
                            0.9558    65.20   66.51       228

Since the target of 63 is not within the confidence interval reject
the null hypothesis.
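The “Achieved Confidence” values in the sign interval output can be reproduced from the binomial distribution. The sketch below assumes that a Position d means the interval runs from the d-th smallest to the d-th largest observation, so its confidence is 1 − 2·P(X ≤ d − 1) with X ~ Binomial(n, 0.5); that reading of the output and the function name are assumptions, but the results agree with the table for N = 500.

```python
from math import comb


def sign_ci_confidence(n: int, position: int) -> float:
    """Confidence of the interval between the position-th smallest and largest values."""
    # P(X <= position - 1) for X ~ Binomial(n, 0.5), computed exactly with integers.
    tail = sum(comb(n, k) for k in range(position)) / 2 ** n
    return 1 - 2 * tail


print(round(sign_ci_confidence(500, 229), 4))
print(round(sign_ci_confidence(500, 228), 4))
```

Because only whole positions are possible, no interval lands exactly on 95%; the 0.9500 row labeled NLI sits between the two achievable intervals.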

1-Sample Example

Since the target of 63 is not within the confidence interval reject
the null hypothesis.

Wilcoxon Signed Rank CI: Pos Skew
                                        Confidence
               Estimated    Achieved     Interval
            N     Median  Confidence   Lower  Upper
Pos Skew  500      67.83        95.0   67.01  68.70

1 Sample Example: Solution

Exercise objective: To practice solving a problem presented using the
appropriate Hypothesis Test.

A mining company is falling behind profit targets. The mine manager
wants to determine if his mine is achieving the target production of
2.1 tons/day with some limited data to analyze. The mine manager asks
the Black Belt to determine if the mine is achieving 2.1 tons/day and
the Black Belt says she will answer with 95% confidence.

1. Analyze the problem using the Hypothesis Testing roadmap.
2. Use the column Tons hauled within the Minitab worksheet
“Hypoteststud.mtw”.
3. Does the Median equal the target value?

1 Sample Example: Solution

According to the hypothesis the Mine Manager feels he is achieving his target of
2.1 tons/day.
H0: M = 2.1 tons/day Ha: M ≠ 2.1 tons/day

Since we are using one sample we have a choice of choosing either a 1 Sample-Sign or 1
Sample Wilcoxon. For this example we will use a 1 Sample-Sign.

1 Sample Example: Solution

Sign Test for Median: Tons hauled
Sign Test of Median = 2.100 versus ≠ 2.100
              N  Below  Equal  Above       P  Median
Tons hauled  17     14      0      3  0.0127   1.800

The results show a P-value of 0.0127 and a Median of 1.800.

The Black Belt in this case does not agree; based on this data
the Mine Manager is not achieving his target of 2.1 tons/day.

We disagree!
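The sign test P-value comes directly from the counts of values below and above the hypothesized Median. A minimal sketch, assuming the standard two-sided exact binomial calculation with ties dropped (the function name is hypothetical), reproduces the 0.0127 reported for Tons hauled:

```python
from math import comb


def sign_test_p_value(below: int, above: int) -> float:
    """Two-sided exact sign test; values equal to the target are excluded."""
    n = below + above
    smaller = min(below, above)
    # Probability of a split at least this lopsided when the Median equals the target.
    tail = sum(comb(n, k) for k in range(smaller + 1)) / 2 ** n
    return min(1.0, 2 * tail)


print(round(sign_test_p_value(below=14, above=3), 4))  # 0.0127
```

With 14 of 17 values below 2.1, a split this uneven would occur only about 1.3% of the time if the Median really were 2.1, which is why the null hypothesis is rejected at 95% confidence.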

Mann-Whitney Example

The Mann-Whitney test is used to test if the Medians for 2 samples
are different.

1. Determine if different machines have different Median cycle times.

2. Ho: M1 = M2
Ha: M1 ≠ M2

3. Perform the Mann-Whitney test. Use the data provided in the
MINITAB™ worksheet: “Nonparametric.mtw”

4. There are 200 data points for each machine, well over the
minimum number of samples necessary.

Mann-Whitney Example

First run a Normality Test…of course!

[Figure: Probability Plot of Mach A (Normal). Mean 15.24, StDev 5.379, N 200, AD 1.550, P-Value <0.005]

[Figure: Probability Plot of Mach B (Normal). Mean 16.73, StDev 5.284, N 200, AD 0.630, P-Value 0.099]

Mann-Whitney Example

Now run the Mann-Whitney test. Based on the results we conclude the
Medians of the machines are different.
Stat>Nonparametrics>Mann-Whitney…

If the Medians of the two samples were the same, zero would be
included within the confidence interval.

Mann-Whitney Test and CI: Mach A, Mach B


N Median
Mach A 200 14.841
Mach B 200 16.346
Point estimate for ETA1-ETA2 is -1.604
95.0 Percent CI for ETA1-ETA2 is (-2.635,-0.594)
W = 36509.0
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is
significant at 0.0019
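The significance level in this output can be checked from W alone. The sketch below assumes W is the rank sum of the first sample and uses the Normal approximation with a continuity correction and no adjustment for ties; the function name is hypothetical. With W = 36509.0 and 200 points per machine it returns approximately 0.0019.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_p_from_w(w: float, n1: int, n2: int) -> float:
    """Two-sided P-value for a rank sum W via the Normal approximation (no tie adjustment)."""
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    # Continuity correction of 0.5 because W only takes discrete values.
    z = (abs(w - mean_w) - 0.5) / sd_w
    return 2 * (1 - NormalDist().cdf(z))


print(round(mann_whitney_p_from_w(36509.0, 200, 200), 4))
```

The expected rank sum under Ho is 40100, so the observed W sits about three Standard Deviations below it, consistent with a confidence interval that excludes zero.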

Exercise

Exercise objective: To practice solving a problem presented using the
appropriate Hypothesis Test.

A credit card company now understands there is no variability
difference in customer calls/week for the two different credit card
types. This means no difference in strategy of deploying the
workforces. However the credit card company wants to see if there is
a difference in call volume between the two different card types. The
company expects no difference since the total sales of the two credit
card types are similar. The Black Belt was told to evaluate with 95%
confidence if the averages were the same. The Black Belt reminded
the credit card company the calls/week were not Normally distributed,
so he would have to compare using Medians since Medians are used
to describe the central tendency of Non-normal Populations.

1. Analyze the problem using the Hypothesis Testing roadmap.
2. Use the columns named CallsperWk1 and CallsperWk2 in MINITAB™
worksheet “Hypoteststud.mtw”.
3. Is there a difference in call volume between the 2 different card
types?

Mann-Whitney Example: Solution

Since we know the data for CallsperWk1 and CallsperWk2 are Non-normal we can proceed to
performing a Mann-Whitney Test.
Stat>Nonparametrics>Mann-Whitney

Mann-Whitney Test and CI: CallsperWk1, CallsperWk2


N Median
CallsperWk1 22 739.0
CallsperWk2 105 770.0
Point estimate for ETA1-ETA2 is -26.5
95.0 Percent CI for ETA1-ETA2 is (-91.9,43.0)
W = 36509.0
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.4580

Mann-Whitney Example: Solution

As you can see there is no significant difference in the Median
between CallsperWk1 and CallsperWk2.

Therefore, there is no significant difference in call volume between
the two different card types.


Mood’s Median Test

1. An aluminum company wanted to compare the operation of its three
facilities worldwide. They want to see if there is a difference in the
recoveries among the three locations. A Black Belt was asked to help
management evaluate the recoveries at the locations with 95% confidence.
2. Ho: M1 = M2 = M3
Ha: at least one is different
3. Use the Mood’s Median test.
4. Based on the smallest sample of 13 the test will be able to detect a
difference close to 1.5.
5. Statistical Conclusions: Use the data in the columns named “Recovery” and
“Location” in the MinitabTM worksheet “Hypoteststud.mtw” for analysis.

Follow the Roadmap…Normality

Stat>Basic Statistics>Graphical Summary…

Summary for Recovery, Location = Savannah

[Figure: histogram of Recovery with 95% Confidence Intervals for Mean and Median]

Statistic                              Savannah
Anderson-Darling A-Squared             0.81
Anderson-Darling P-Value               0.032
Mean                                   87.660
StDev                                  7.944
Variance                               63.113
Skewness                               -0.15286
Kurtosis                               -1.11764
N                                      25
Minimum                                75.300
1st Quartile                           79.000
Median                                 87.500
3rd Quartile                           96.550
Maximum                                99.200
95% Confidence Interval for Mean       84.381 to 90.939
95% Confidence Interval for Median     86.179 to 90.080
95% Confidence Interval for StDev      6.203 to 11.052

Follow the Roadmap…Normality

Summary for Recovery, Location = Bangor and Location = Ankhar

[Figure: histograms of Recovery with 95% Confidence Intervals for Mean and Median, one Graphical Summary per location]

Statistic                              Bangor               Ankhar
Anderson-Darling A-Squared             0.72                 0.86
Anderson-Darling P-Value               0.045                0.022
Mean                                   93.042               88.302
StDev                                  5.918                6.929
Variance                               35.017               48.008
Skewness                               -1.81758             -0.105610
Kurtosis                               4.66838              0.182123
N                                      13                   20
Minimum                                76.630               73.500
1st Quartile                           90.600               85.150
Median                                 94.800               88.425
3rd Quartile                           97.350               89.700
Maximum                                99.700               99.450
95% Confidence Interval for Mean       89.466 to 96.617     85.059 to 91.545
95% Confidence Interval for Median     90.637 to 97.036     86.735 to 89.299
95% Confidence Interval for StDev      4.243 to 9.768       5.269 to 10.120

Follow the Roadmap…Equal Variance

[Figure: Test for Equal Variances for Recovery, by Location (Ankhar, Bangor, Savannah). 95% Bonferroni Confidence Intervals for StDevs]

Bartlett's Test: Test Statistic 1.33, P-Value 0.514
Levene's Test: Test Statistic 1.02, P-Value 0.367

Mood’s Median Test

Stat>Nonparametrics>Mood’s Median Test… [Session Output]

Mood Median Test: Recovery versus Location

Mood median test for Recovery


Chi-Square = 12.11 DF = 2 P = 0.002

Individual 95.0% CIs


Location N<= N> Median Q3-Q1 ---+---------+---------+---------+---
Ankhar 13 7 88.4 4.5 (-----*--)
Bangor 1 12 94.8 6.8 (-------------*------)
Savannah 15 10 87.5 17.6 (----*-------)
---+---------+---------+---------+---
87.0 90.0 93.0 96.0
Overall Median = 88.9

We observe the confidence intervals for the Medians of the three
populations. Note the 95% confidence interval for Bangor does not
overlap the other two, so visually we expect the P-value to be below 0.05.

Statistical Conclusion: Since the P-value of the Mood’s Median
test is less than 0.05 we reject the null hypothesis.
Practical Conclusion: Bangor has the highest recovery of all three
facilities.
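The Chi-Square value in the Mood’s Median output is an ordinary chi-square test on the N<= and N> counts for each location. The sketch below recomputes it from the counts in the session output; the function name is hypothetical, and the P-value shortcut exp(−χ²/2) is exact only because DF = 2 here.

```python
from math import exp


def mood_median_chi_square(counts):
    """Chi-square statistic for a table of (n_at_or_below, n_above) counts per group."""
    total_low = sum(low for low, _ in counts)
    total_high = sum(high for _, high in counts)
    total = total_low + total_high
    chi_sq = 0.0
    for low, high in counts:
        n = low + high
        for observed, column_total in ((low, total_low), (high, total_high)):
            # Expected count if every group had the same Median.
            expected = n * column_total / total
            chi_sq += (observed - expected) ** 2 / expected
    return chi_sq


# (N<=, N>) for Ankhar, Bangor and Savannah, taken from the output above.
counts = [(13, 7), (1, 12), (15, 10)]
chi_sq = mood_median_chi_square(counts)
p_value = exp(-chi_sq / 2)  # upper tail of chi-square, valid for DF = 2 only
print(round(chi_sq, 2), round(p_value, 3))  # 12.11 0.002
```

Most of the statistic comes from Bangor, where 12 of 13 values lie above the overall Median.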
Kruskal-Wallis Test

Using the same data set analyze using the Kruskal-Wallis test.

Kruskal-Wallis Test: Recovery versus Location

Kruskal-Wallis Test on Recovery

Location    N  Median  Ave Rank      Z
Ankhar     20   88.43      27.3  -0.73
Bangor     13   94.80      40.2   2.60
Savannah   25   87.50      25.7  -1.49
Overall    58              29.5

H = 6.86  DF = 2  P = 0.032
H = 6.87  DF = 2  P = 0.032 (adjusted for ties)

This output is the “least friendly” to interpret. Look for the
P-value, which tells us to reject the null hypothesis. We reach
the same conclusion as with the Mood’s Median test.
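To see where H comes from, it can be approximated from the sample sizes and average ranks in the table. The sketch below uses the standard unadjusted formula H = 12 / (N(N + 1)) · Σ nᵢ (R̄ᵢ − (N + 1)/2)²; the function name is hypothetical. Because the average ranks are printed to one decimal, the result is close to, but not exactly, the reported 6.86.

```python
from math import exp


def kruskal_wallis_h(groups):
    """Unadjusted Kruskal-Wallis H from (sample size, average rank) pairs."""
    n_total = sum(n for n, _ in groups)
    overall_avg_rank = (n_total + 1) / 2
    spread = sum(n * (avg_rank - overall_avg_rank) ** 2 for n, avg_rank in groups)
    return 12 / (n_total * (n_total + 1)) * spread


# (N, Ave Rank) for Ankhar, Bangor and Savannah, taken from the output above.
groups = [(20, 27.3), (13, 40.2), (25, 25.7)]
h_stat = kruskal_wallis_h(groups)
p_value = exp(-h_stat / 2)  # upper tail of chi-square, valid for DF = 2 only
print(round(h_stat, 2), round(p_value, 3))
```

Bangor’s average rank of 40.2 against an overall 29.5 contributes most of H, matching its Z value of 2.60.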

Exercise

Exercise objective: To practice solving a problem presented using the
appropriate Hypothesis Test.

A company making cell phones is interested in evaluating the
defect rate over 3 months at one of its facilities. A customer felt
the defect rate had been unusual lately but was not sure. A
Black Belt was selected to investigate the first three months of
this year. She is to report back to senior management with 95%
confidence about any shift(s) in defect rates.

1. Analyze the problem using the Hypothesis Testing roadmap.
2. Use the columns named ppm defective1, ppm defective2 and
ppm defective3 in MINITAB™ worksheet “Hypoteststud.mtw”.
3. Are the defect rates equal for three months?

Cell Phone Defect Rate Example: Solution

Let’s follow the Roadmap to see if the data is Normal.

Instead of performing a Normality Test we can find the P-value
using the Graphical Summary in MINITAB™.

Stat>Basic Statistics>Graphical Summary

Cell Phone Defect Rate Example: Solution

Before we can perform a Mood’s Median Test we must first stack the
columns ppm defective1, ppm defective2 and ppm defective3.

Cell Phone Defect Rate Example: Solution

Stat>Nonparametrics>Mood’s Median Test

After stacking the columns we can perform a Mood’s Median Test.

Unequal Variance

Where do you go in the roadmap if the variance is not equal?
– Unequal variances are usually the result of differences in
the shape of the distribution.
• Extreme tails
• Outliers
• Multiple modes

These conditions should be explored through data demographics.

For Skewed Distributions with comparable Medians it is unusual
for the variances to be different without some assignable cause
impacting the process.

Example

First open Minitab™ worksheet “Var_Comp.mtw”. Then check for Normality
using “Stat > Basic Statistics > Normality Test…”.

Model A and Model B are similar in nature (not exact) but are
manufactured in the same plant.

[Figure: Probability Plot of Model A (Normal). Mean 10.28, StDev 0.7028, N 10, AD 0.227, P-Value 0.747]

[Figure: Probability Plot of Model B (Normal). Mean 2.826, StDev 3.088, N 10, AD 0.753, P-Value 0.033]

Model A is Normal, Model B is Non-normal.
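The link between the AD statistic and the P-value on a probability plot can be sketched with a published approximation. The code below uses the small-sample adjustment and piecewise formulas commonly attributed to D’Agostino and Stephens; that MINITAB computes its P-value exactly this way is an assumption, and the function name is hypothetical. It does, however, land on the values shown for Model A and Model B.

```python
from math import exp


def anderson_darling_p_value(a_squared: float, n: int) -> float:
    """Approximate Normality P-value from the AD statistic and sample size."""
    # Small-sample adjustment of the statistic.
    a_star = a_squared * (1 + 0.75 / n + 2.25 / n ** 2)
    if a_star >= 0.6:
        return exp(1.2937 - 5.709 * a_star + 0.0186 * a_star ** 2)
    if a_star > 0.34:
        return exp(0.9177 - 4.279 * a_star - 1.38 * a_star ** 2)
    if a_star > 0.2:
        return 1 - exp(-8.318 + 42.796 * a_star - 59.938 * a_star ** 2)
    return 1 - exp(-13.436 + 101.14 * a_star - 223.73 * a_star ** 2)


print(round(anderson_darling_p_value(0.227, 10), 3))  # Model A, reported 0.747
print(round(anderson_darling_p_value(0.753, 10), 3))  # Model B, reported 0.033
```

A larger AD value means the points stray further from the fitted line, which drives the P-value down.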

Example

Now let’s check for Equal Variances using Levene’s Test but remember
first you will need to stack the data so you can run this test…

[Figure: Test for Equal Variances for Data, by idvar (Model A, Model B). Upper panel: 95% Bonferroni Confidence Intervals for StDevs. Lower panel: Data by idvar]

F-Test: Test Statistic 0.05, P-Value 0.000
Levene's Test: Test Statistic 4.47, P-Value 0.049

The P-value is just under the limit of .05. Whenever the result is borderline,
as in this case, use your process knowledge to make a judgment.

Example

Let’s look at data demographics for clues.


Summary for Model A and Summary for Model B

[Figure: Graphical Summaries (histogram plus 95% Confidence Intervals for Mean and Median) for Model A and Model B]

Statistic                              Model A              Model B
Anderson-Darling A-Squared             0.23                 0.75
Anderson-Darling P-Value               0.747                0.033
Mean                                   10.279               2.8260
StDev                                  0.703                3.0882
Variance                               0.494                9.5370
Skewness                               0.330968             1.29887
Kurtosis                               -0.614597            0.92377
N                                      10                   10
Minimum                                9.213                0.2253
1st Quartile                           9.779                0.3488
Median                                 10.111               1.7773
3rd Quartile                           10.816               5.5508
Maximum                                11.496               9.4440
95% Confidence Interval for Mean       9.776 to 10.782      0.6169 to 5.0352
95% Confidence Interval for Median     9.767 to 10.848      0.3465 to 5.5873
95% Confidence Interval for StDev      0.483 to 1.283       2.1242 to 5.6379

Graph> Dotplot> Multiple Y’s, Simple

[Figure: Dotplot of Model A, Model B. X axis: Data, from 0.0 to 11.2]

Black Belt Aptitude Exercise

Exercise objective: To practice solving a problem presented using the
appropriate Hypothesis Test.

• A recent deployment at a client raised the question of which
educational background is best suited to be a successful
Black Belt candidate.
• In order to answer the question the MBB instructor randomly
sampled the results of a Six Sigma pretest taken by now
certified Black Belts at other businesses.
• Undergraduate backgrounds in Science, Liberal Arts,
Business and Engineering were sampled.
• Management wants to know so they can screen prospective
candidates for educational background.

1. Analyze the problem using the Hypothesis Testing roadmap.
2. What educational background is best suited for a potential
Black Belt?
3. Use the data within Minitab™ worksheet “BBaptitude.mtw”

Black Belt Aptitude Exercise: Solution

First follow the Roadmap to check the data for Normality.


Stat > Basic Statistics > Normality Test…

Black Belt Aptitude Exercise: Solution

Next we are going to check for Equal Variance.
(Remember, stack the data first!)
Stat>ANOVA>Test for Equal Variance

Summary

At this point you should be able to:


• Conduct Hypothesis Testing for Equal Variance

• Conduct Hypothesis Testing for Medians

• Analyze and interpret the results
