Analyze
Test Hypotheses
Analyze List Vital Few Xs
www.effortsconsulting.com ‹#›
Learning Objectives
Types of Analysis
Individual Experience
Group Experience
Planning the Tests
For each theory (or group of theories), list all data required,
each tool to be used, the results that will support the theory,
and results that will rule out the theory.
Theory or group to be tested | Data required | Tool to be applied | Results that will support theory | Results that will rule out theory
Planning the Data Collection
FILE: ANALYZE-DATA COLLECTION PLAN.XLS
Stratification
Stratification
of Data
Learning Objectives
What Is Stratification?
Concept
What Is Stratification?
Example: Data on Invoice Errors
Invoice Number 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15
Day of Week M Th T M T W T Th F F M W Th F T
Week of Month 1 4 3 4 4 4 1 2 4 3 4 1 4 3 4
Accountant A B A C B A B C A B C A B C A
Invoice Number 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Day of Week Th F F Th M T T T W T W Th F M M
Week of Month 4 2 4 4 4 4 1 3 4 4 4 4 3 4 1
Accountant A B A C C D D D D A B C D A A
Invoice Number 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
Day of Week M W W T F Th Th M W F Th T T W F
Week of Month 4 3 2 4 2 3 4 3 2 4 4 2 4 3 2
Accountant A C D D A C C D A C D B A B C
Invoice Number 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Day of Week F F Th W W W Th M T Th W F M F W
Week of Month 4 4 3 4 4 4 2 1 4 4 2 4 2 1 4
Accountant D D D C A D C D A C C C A D C
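The invoice data above can be stratified by simply counting records per category. The sketch below is a minimal illustration, assuming each column of the table is one invoice with an error; the variable names are illustrative and not part of the course files.

```python
from collections import Counter

# Stratification variables transcribed from the invoice table above,
# one entry per invoice (invoices 01 to 60, in order).
day_of_week = (
    "M Th T M T W T Th F F M W Th F T "
    "Th F F Th M T T T W T W Th F M M "
    "M W W T F Th Th M W F Th T T W F "
    "F F Th W W W Th M T Th W F M F W"
).split()
week_of_month = (
    "1 4 3 4 4 4 1 2 4 3 4 1 4 3 4 "
    "4 2 4 4 4 4 1 3 4 4 4 4 3 4 1 "
    "4 3 2 4 2 3 4 3 2 4 4 2 4 3 2 "
    "4 4 3 4 4 4 2 1 4 4 2 4 2 1 4"
).split()
accountant = (
    "A B A C B A B C A B C A B C A "
    "A B A C C D D D D A B C D A A "
    "A C D D A C C D A C D B A B C "
    "D D D C A D C D A C C C A D C"
).split()

# Count the records in each stratum.
by_day = Counter(day_of_week)
by_week = Counter(week_of_month)
by_accountant = Counter(accountant)

for name, counts in [("Day of week", by_day),
                     ("Week of month", by_week),
                     ("Accountant", by_accountant)]:
    print(name, dict(sorted(counts.items())))
```

Raw counts alone can mislead when the strata differ in volume, so a count by week is usually read together with the number of invoices processed in that week.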
Errors on Invoices
[Figure: bar charts of Number of Errors (axis 0 to 20), one by day of week (M, T, W, Th, F) and one by accountant (A, B, C, D)]
Errors by Week: Total and Percent in Error
[Figure: two bar charts by week of month (Weeks 1 to 4): total errors (axis 0 to 30) and percent in error (axis 0 to 3)]
How to Stratify
How to Interpret Stratification
Pitfalls
Cascading Stratification
[Figure: cascading Pareto charts of Frequency: Pareto A by Product Line, then Pareto B by Location, then Pareto C by Operator]
Stratification—Histograms
[Figure: histograms of Frequency vs. Measure: Total, then stratified into Supplier A and Supplier B]
Exercise: Interpretation
15 Minutes
Stratification – Summary
Hypothesis Testing
Hypothesis
Testing Introduction
Learning Objectives
Hypothesis Testing Example
Hypothesis Testing Example
Alpha and Beta Risk
Decision | Truth: Ho | Truth: Ha
Fail to Reject Ho | Correct Decision | Type II Error (β)
Reject Ho | Type I Error (α) | Correct Decision
Note: Type I error is also called Producers’ Risk.
Note: Type II error is also called Consumers’ Risk.
Process Example
FILE: ANALYZE-HYPOTHESIS TESTING PROCESS 2.MTW
Process A Process B
Yield Yield
89.7 84.7
81.4 86.1
84.5 83.2
84.8 91.9
87.3 86.3
79.7 79.3
85.1 82.6
81.7 89.1
83.7 83.7
84.5 88.5
Process Example
FILE: ANALYZE-HYPOTHESIS TESTING PROCESS 2.MTW
Process Example
FILE: ANALYZE-HYPOTHESIS TESTING PROCESS 2.MTW
Statistical Concept:
In actuality, do the yield measurements from the processes represent two different populations?
[Figure: dot plot of Process A and Process B yields, axis 80.0 to 92.5]
Concept: Formulating the “Null” and “Alternative” Hypotheses
Ho: µa = µb
§ Statistical Interpretation: There is no difference between the population means for Processes A and B.
§ Practical Interpretation: There is no difference in the average yields of the two processes, i.e., your changes did not help.
Ha: µa ≠ µb
§ Statistical Interpretation: The population means for Process A and Process B are different.
§ Practical Interpretation: The average yield of Process B is different than that of Process A.
Goal: You must show that the values you observed were so unlikely to come from the same population that Ho must be wrong!
You want to reject Ho, thus defaulting to Ha.
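As a rough illustration of this decision, the sketch below applies a pooled 2-sample t statistic to the Process A and Process B yields listed in the Process Example. The choice of a pooled t-test, the 0.05 alpha level, and the table value 2.101 (two-sided, 18 degrees of freedom) are assumptions made for this sketch; the slides run the analysis in MINITAB®.

```python
import math
import statistics

# Yields from the Process Example table.
process_a = [89.7, 81.4, 84.5, 84.8, 87.3, 79.7, 85.1, 81.7, 83.7, 84.5]
process_b = [84.7, 86.1, 83.2, 91.9, 86.3, 79.3, 82.6, 89.1, 83.7, 88.5]

n_a, n_b = len(process_a), len(process_b)
mean_a, mean_b = statistics.mean(process_a), statistics.mean(process_b)
var_a, var_b = statistics.variance(process_a), statistics.variance(process_b)

# Pooled variance (assumes the two processes have equal variances).
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
t_stat = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

# Two-sided critical value for alpha = 0.05 and 18 degrees of freedom,
# taken from a standard t table (an assumption of this sketch).
T_CRIT = 2.101
reject_h0 = abs(t_stat) > T_CRIT

print(f"mean A = {mean_a:.2f}, mean B = {mean_b:.2f}, t = {t_stat:.3f}")
print("Reject Ho" if reject_h0 else "Fail to reject Ho")
```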
Definitions
Hypothesis Testing: P-Value
P-Value ≤ α: There is a statistical difference
Steps in Hypothesis Testing
Summary
Confidence Intervals
Confidence
Intervals
Learning Objectives
Confidence Intervals Agenda
Definition of the Confidence Interval
Practical
A range determined by sample data, which tells you where the true
population parameter is located with a certain degree of confidence.
Statistical
A 100(1 − α)% confidence interval for a population parameter, μ or σ, is
a random interval that is located between upper and lower
confidence limits (CLs):
§ Probability [ Lower CL ≤ μ ≤ Upper CL ] = 1 − α
§ Probability [ Lower CL ≤ σ ≤ Upper CL ] = 1 − α
regardless of the value of the parameter.
Why Study CIs?
§ Sample statistics, such as the sample mean and the sample standard
deviation, are only estimates of the true population parameters, μ and σ.
§ Because of inherent sample to sample variability in these estimates,
uncertainty is quantified using statistically based Confidence Intervals
(CI).
§ The automotive industry calculates 95% CI for their data.
§ Some medical applications calculate 99.9% CI for their data.
§ Confidence Intervals can be interpreted as follows:
– Approximately 95 samples out of 100 will yield a CI that contains the
true population parameter, or
– Approximately 5 samples out of 100 will yield a CI that does not
contain the true population parameter
– 95% certainty that the true population parameter is inside the interval
§ The CI provides a way to investigate sample to sample variation.
§ Use CIs to obtain statistical confidence for the population mean, standard
deviation, and Cp parameters.
The Confidence Interval Equation
Confidence Interval for the Population Mean
When process performance is estimated, it is from a relatively small sample
of the process. To calculate, use estimates of the mean and standard
deviation taken from a sample. Both estimates have inherent variation. The
goal is to quantify certainty of where the true process is performing.
First, look at the confidence interval for the population mean.
Ø Example: Suppose you want to determine the 95% Confidence Interval
for the population mean from 10 samples (n = 10) off a reactor. Sample
the reactor and obtain the following sample statistics:
§ Sample Mean = x̄ = 249.56
§ Sample Standard Deviation = s =14.15
With this sample information, use the general formula for Confidence
Intervals to estimate the population mean:
$$\bar{x} - t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{x} + t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}}$$
Where
§ x̄ = sample mean
§ s = sample standard deviation
§ n = sample size
§ t n-1, 1-α/2 = t value for n - 1 degrees of freedom and probability 1 - α/2
What Is the t Distribution?
t Distribution vs. Normal Distribution
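t Distribution ( n ≤ 30 ) vs. Normal (Z) Distribution ( n > 30 )
$$t_{n-1} = \frac{\bar{X} - \mu}{s/\sqrt{n}} \quad \text{if } n \le 30 \qquad\qquad Z = \frac{\bar{X} - \mu}{s/\sqrt{n}} \quad \text{if } n > 30$$
[Figure: t distribution curve compared with the Z (normal) curve]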
Back to the Example
Ø Example: Suppose you want to determine the 95% Confidence Interval
for the population mean from 10 samples (n = 10) off a reactor.
Sample the reactor and obtain the following sample statistics:
§ Sample Mean = X = 249.56
§ Sample Standard Deviation = S =14.15
$$\bar{x} - t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{x} + t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}}$$
t n-1, 1-α/2:
df (Degrees of Freedom) = n - 1 = 9
1 - α/2 = 1 - .05/2 = .975
t9, .975 = 2.262
$$249.56 - 2.262 \cdot \frac{14.15}{\sqrt{10}} \;\le\; \mu \;\le\; 249.56 + 2.262 \cdot \frac{14.15}{\sqrt{10}}$$
249.56 - 10.12 ≤ µ ≤ 249.56 + 10.12
239.44 ≤ µ ≤ 259.68
§ Solution: You are 95% confident that the actual process mean is
somewhere between 239.44 and 259.68.
§ Notice that the t value is very close to 2.00.
§ Do the problem in MINITAB®.
Example: Calculating CI for Population Mean
FILE: ANALYZE-PROCESS 1-T.MTW
Using MINITAB®
Confidence Intervals
Variable N Mean StDev SE Mean 95.0 % CI
Process 10 249.56 14.15 4.47 ( 239.44, 259.68 )
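The MINITAB® interval can be reproduced by hand from the summary statistics. The sketch below is a minimal check using the sample mean, standard deviation, and t value given in the example; the variable names are illustrative.

```python
import math

n = 10            # sample size
x_bar = 249.56    # sample mean
s = 14.15         # sample standard deviation
t_value = 2.262   # t for 9 degrees of freedom and probability 0.975

# Margin of error: t * s / sqrt(n)
margin = t_value * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"SE Mean = {s / math.sqrt(n):.2f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

The standard error and the limits agree with the SE Mean and 95.0 % CI columns in the MINITAB® output.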
Testing Hypothesis
Testing Hypotheses
When Y Is Categorical and
X Is Categorical
Learning Objectives
2. Compare 2 proportions.
Test This Theory: Is it a “Fair” Coin?
Is it a “Fair” Coin?
A “fair” coin will flip heads 50% of the time. In other words,
the proportion is 0.5.
§ Look at the 95% confidence interval. You can say that
based on 8 heads out of 10 flips, you are 95% confident
that the proportion of heads you can expect to get from
this coin lies between 0.44 to 0.97.
§ Why do you have such a large interval?
§ Since this interval (0.44 to 0.97) includes 0.5, it is
concluded that you do not have sufficient evidence to
reject Ho.
§ Before you conclude, you should evaluate the power of
this test. The power of this test is only 47% (0.4688),
which means that the ability of this test to detect a
difference when there is one is only 47% (the probability
of being correct in rejecting Ho is 0.4688). This test does
not have enough power due to the small sample size.
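The same conclusion can be checked with an exact binomial P-Value. The sketch below is an illustration only: it assumes a two-sided test of Ho: p = 0.5 and computes the P-Value by doubling the upper tail, which is one common convention and not necessarily the method MINITAB® uses.

```python
from math import comb

n_flips, heads, p0 = 10, 8, 0.5

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of 8 or more heads if the coin is fair.
upper_tail = sum(binom_pmf(k, n_flips, p0) for k in range(heads, n_flips + 1))

# Two-sided P-Value (the fair-coin distribution is symmetric,
# so doubling the upper tail covers both tails).
p_value = min(1.0, 2 * upper_tail)

print(f"P(X >= {heads}) = {upper_tail:.4f}, two-sided P-Value = {p_value:.4f}")
print("Reject Ho" if p_value <= 0.05 else "Fail to reject Ho")
```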
Is It a “Fair” Coin?
Is It a “Fair” Coin?
Compare:
Notice that as your sample size increased (from 10 flips to
100 flips), your 95% Confidence Interval narrowed. In
other words, your estimate is now more precise
because you have more data.
Are Office A and Office B Different?
Ha: PA ≠ PB or PA - PB ≠ 0
Office A vs. Office B
Problem:
The quality assurance department for the ABC television company
wants to estimate the proportion of their 35-inch television sets that
needed to be repaired within four years of purchase. The department is
particularly interested in whether the need for repairs for their brand of
television is different than all other brands currently on the market. It is
known that the overall repair rate for all other televisions currently on
the market is 6.8% or 0.068.
Determine whether or not the television has the same repair rate as the
average of all brands of televisions currently on the market.
15 Minutes
Exercise Two: Power and Sample Size
POWER AND SAMPLE SIZE
Repair rate for television sets
Problem:
For exercise one, the repair rate for 35-inch televisions from the ABC
television company was significantly different than the repair rate of
similar televisions from other manufacturers. You tested the following
hypotheses:
Ho: p=0.068
Ha: p ≠0.068 or p > 0.068
If you repeat this study many times, sampling 2,856 sets each time,
how often will the test show ABC to be different . . .
IF the true rate for ABC sets is in fact 0.068?
IF the true rate for ABC sets is in fact 0.083 (as measured)?
15 Minutes
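One way to approach these two questions is a normal approximation to the 1-proportion test. The sketch below assumes a two-sided test at alpha = 0.05 (z = 1.96); the function names are illustrative, and MINITAB®'s Power and Sample Size tool may give slightly different figures.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def rejection_rate(p_true, p0=0.068, n=2856, z_crit=1.96):
    """Approximate chance that the two-sided test rejects Ho: p = p0."""
    se_null = math.sqrt(p0 * (1 - p0) / n)          # spread assumed under Ho
    se_true = math.sqrt(p_true * (1 - p_true) / n)  # actual spread of p-hat
    upper_cut = p0 + z_crit * se_null
    lower_cut = p0 - z_crit * se_null
    reject_high = 1 - normal_cdf((upper_cut - p_true) / se_true)
    reject_low = normal_cdf((lower_cut - p_true) / se_true)
    return reject_high + reject_low

rate_if_null_true = rejection_rate(0.068)   # this is the alpha risk
rate_if_083 = rejection_rate(0.083)         # this is the power

print(f"True rate 0.068: test shows a difference {rate_if_null_true:.1%} of the time")
print(f"True rate 0.083: test shows a difference {rate_if_083:.1%} of the time")
```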
What You Have Learned
§ 1-Proportion Test
§ 2-Proportions Test
§ Power and Sample Size
§ Next: Is Y independent of X?
– Chi-Square Contingency Tables Test
Contingency Table Description
“Expected” frequency:
$$f_e = \{\%\text{ of total in column}\} \times \{\%\text{ of total in row}\} \times \{\text{Total}\} = \frac{T_{col}}{N} \times \frac{T_{row}}{N} \times N = \frac{T_{col} \times T_{row}}{N}$$
Table columns: Invoices (no errors), Invoices (with errors)
Χ2 Critical Table
Chi-Square Test for Independence
FILE: ANALYZE-CHI SQUARE—INVOICE EXAMPLE.MTW
$$\chi^2_{calc} = \sum_{j=1}^{g} \frac{(f_o - f_e)^2}{f_e} = \frac{(20-23.33)^2}{23.33} + \frac{(50-46.67)^2}{46.67} + \frac{(40-36.67)^2}{36.67} + \frac{(70-73.33)^2}{73.33} = 1.169$$
χ² critical = 3.841
Since χ² calc < χ² crit: Fail to reject H0. Quality is independent of shift.
Also, P-Value > α
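The calculation above can be sketched in a few lines. The observed counts (20, 50, 40, 70) come from the formula terms; their arrangement into a 2 x 2 table is an assumption inferred from the expected values, and the variable names are illustrative.

```python
# Observed counts, arranged so that the expected values match the slide
# (23.33, 46.67, 36.67, 73.33). The row/column layout is assumed.
observed = [[20, 50],
            [40, 70]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n_total = sum(row_totals)

chi2_calc = 0.0
for i, row in enumerate(observed):
    for j, f_o in enumerate(row):
        # Expected frequency = (row total x column total) / N
        f_e = row_totals[i] * col_totals[j] / n_total
        chi2_calc += (f_o - f_e) ** 2 / f_e

CHI2_CRIT = 3.841  # critical value used on the slide
reject_h0 = chi2_calc > CHI2_CRIT

print(f"chi-square calc = {chi2_calc:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```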
Steps in Chi-Squared Hypothesis Testing
Chi-Square Test: Exercise
FILE: ANALYZE-CHI SQUARE—MOLDING.MTW
Normality Test
Normality Tests
Learning Objectives
Normality Test Method
FILE: ANALYZE-NORMALITY TESTS.MTW
Population 1
85.3
97.1
67.9
77.6
99.3
84.0
82.4
86.1
88.8
Normality Test Example
FILE: ANALYZE-NORMALITY TESTS.MTW
Note the P-Value in the summary to the right of the plot. Because the P-Value
(0.640) is greater than the alpha level of 0.05, you fail to reject the null
hypothesis. In other words, the data may be treated as Normal. The points fall
reasonably close to the reference line indicating that the data follow a normal
distribution.
Alternative MINITAB® for Steps 3, 4, and 5
FILE: ANALYZE-NORMALITY TESTS.MTW
Mean 85.389
StDev 9.488
Variance 90.026
Skewness -0.284629
Kurtosis 0.508642
N 9
Minimum 67.900
1st Quartile 80.000
Median 85.300
3rd Quartile 92.950
Maximum 99.300
95% Confidence Interval for Mean: 78.096 to 92.682
95% Confidence Interval for Median: 78.694 to 95.208
95% Confidence Interval for StDev: 6.409 to 18.177
[Figure: histogram (70 to 100) with 95% Confidence Intervals for Mean and Median]
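The mean, standard deviation, median, and confidence interval for the mean in this summary can be checked directly from the nine Population 1 values. In the sketch below, the t value 2.306 (8 degrees of freedom, probability 0.975) is taken from a standard t table and is an assumption of this example.

```python
import math
import statistics

# Population 1 values from the Normality Test Method slide.
population_1 = [85.3, 97.1, 67.9, 77.6, 99.3, 84.0, 82.4, 86.1, 88.8]

n = len(population_1)
mean = statistics.mean(population_1)
stdev = statistics.stdev(population_1)      # sample standard deviation
median = statistics.median(population_1)

# 95% confidence interval for the mean: mean +/- t * s / sqrt(n)
T_8_975 = 2.306
margin = T_8_975 * stdev / math.sqrt(n)
ci_mean = (mean - margin, mean + margin)

print(f"N = {n}, Mean = {mean:.3f}, StDev = {stdev:.3f}, Median = {median:.3f}")
print(f"95% CI for Mean: {ci_mean[0]:.2f} to {ci_mean[1]:.2f}")
```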
Exercise
FILE: ANALYZE-NORMALITY TESTS.MTW
Summary
Test for
Equal Variance
Learning Objectives
Comparing Two Variances
BEFORE CHANGE (or A) vs. AFTER CHANGE (or B)
Equal Variances (MINITAB®)
§ If the data are not normally distributed, then use Levene’s Test
Statistic presented by MINITAB®.
Example
FILE: ANALYZE-PROCESS 2.MTW
Example: Comparing 2 Variances
FILE: ANALYZE-PROCESS 2.MTW
Ho: σ² Process A = σ² Process B
Ha: σ² Process A ≠ σ² Process B
Testing Hypothesis
Testing Hypotheses
When Y Is Continuous
and X Is Categorical
Learning Objectives
1-Sample t-Test
Example: 1-Sample t-Tests
FILE: ANALYZE-1-SAMPLE t POPULATIONS.MTW
[Figure: individual value plot of Population 1 (axis 70 to 100) with Ho and x̄ marked]
Example: 1-Sample t-Tests (continued)
[Figure: individual value plot of Population 1 (axis 70 to 100) with Ho and x̄ marked]
1-Sample t-Test
You also can ask, is the population mean greater than the
target value? This creates a one-sided test.
Ho: population mean = 75
Ha: population mean > 75
Individual Value Plot of Population 1
(with Ho and 95% t-confidence interval for the mean)
1-Sample t-Test
[Figure: individual value plot of Population 1 (axis 70 to 100) with Ho and x̄ marked]
Exercise: 1-Sample t-Tests
FILE: ANALYZE-1-SAMPLE t POPULATIONS.MTW
What You Have Learned: 1-Sample t-Tests
Comparing 2 Means
Difference in
Output Means
Why Test Population Parameters Against Each
Other?
[Figure: sample parameters µ and σ, Sample 2]
Then you can conclude, with some degree of statistical confidence, that
these parameters came from one population or two.
2-Sample t-Tests: Example
FILE: ANALYZE-2 SAMPLE T.MTW
Steps in 2-Sample t-Tests
2-Sample t-Tests: Example
FILE: ANALYZE-2 SAMPLE T.MTW
Population A (N = 7): 81.29, 86.56, 86.20, 85.33, 80.95, 87.76, 88.45
Population B (N = 11): 75.6, 73.2, 71.1, 75.3, 70.3, 75.0, 79.1, 70.0, 80.1, 74.5, 82.2
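To make the comparison concrete, the sketch below computes a 2-sample t statistic for the Population A and Population B values listed above. It uses the unequal-variance (Welch) form, which is an assumption of this example; the slides run the test in MINITAB®. Because the Population B values are shown rounded to one decimal, results may differ slightly from the MINITAB® figures.

```python
import math
import statistics

population_a = [81.29, 86.56, 86.20, 85.33, 80.95, 87.76, 88.45]
population_b = [75.6, 73.2, 71.1, 75.3, 70.3, 75.0, 79.1,
                70.0, 80.1, 74.5, 82.2]

n_a, n_b = len(population_a), len(population_b)
mean_a, mean_b = statistics.mean(population_a), statistics.mean(population_b)
var_a, var_b = statistics.variance(population_a), statistics.variance(population_b)

# Standard error of the difference without assuming equal variances.
se_diff = math.sqrt(var_a / n_a + var_b / n_b)
t_stat = (mean_a - mean_b) / se_diff

# Welch-Satterthwaite approximation for the degrees of freedom.
df = se_diff**4 / ((var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1))

print(f"mean A = {mean_a:.2f}, mean B = {mean_b:.2f}")
print(f"t = {t_stat:.2f} with about {df:.1f} degrees of freedom")
```

A t statistic this far from zero would be compared against the t table (or the P-Value from MINITAB®) to decide whether to reject Ho.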
2-Sample t-Tests: Example
FILE: ANALYZE-2 SAMPLE T.MTW
[Figure: individual value plot of Population A and Population B; Data axis 70.0 to 87.5]
2-Sample t-Tests: Example
FILE: ANALYZE-2 SAMPLE T.MTW
Probability Plot of Population B (Normal): StDev 3.985, N 11, AD 0.292, P-Value 0.542
Probability Plot of Population A (Normal): Mean 85.22, StDev 2.980, N 7, AD 0.452, P-Value 0.185
2-Sample t-Tests: Example
FILE: ANALYZE-2 SAMPLE T.MTW
Exercise: 2-Sample t-Tests
FILE: ANALYZE-2 SAMPLE T EXERCISE.MTW
What You Have Learned: 2-Sample t-Test
Analysis of Variance
(ANOVA)
Learning Objectives
Analysis of Variance Highlights
Example: Steps 3-6 (ANOVA Analysis)
FILE: ANALYZE-ANOVA DIET DATA.MTW
Example: Step 7
FILE: ANALYZE-ANOVA DIET DATA.MTW
[Figure: Boxplot of Diet A, Diet B, Diet C, Diet D; Data axis 55.0 to 72.5]
Example: Step 8
FILE: ANALYZE-ANOVA DIET DATA.MTW
[Figure: probability plots for Diet A, Diet B, Diet C, and Diet D, and a test for equal variances with 95% Bonferroni Confidence Intervals for StDevs. Levene's Test: Test Statistic 0.65, P-Value 0.593]
Example: Step 9
FILE: ANALYZE-ANOVA DIET DATA.MTW
Example: Step 9a
FILE: ANALYZE-ANOVA DIET DATA.MTW
[Figure: residual plots: normal probability plot of the residuals, residuals versus the fitted values, Histogram of the Residuals, Residuals Versus the Order of the Data]
... values of the fits.
Histogram: bell curve? Ignore for small datasets (<30).
Individual residuals: trends? or outliers?
Residuals Versus the Order of the Data: This graph investigates how the Residuals behave across the experiment. This is an important graph, since it will signal that something outside the experiment may be operating. Nonrandom patterns are warnings.
Example: Step 9b
FILE: ANALYZE-ANOVA DIET DATA.MTW
Example: Step 10
FILE: ANALYZE-ANOVA DIET DATA.MTW
Exercise
FILE: ANALYZE-ANOVA HOTELS.MTW
Are Hotels in New York City More Expensive Than Hotels in Other Cities?
25 Minutes
City Hotel Stars Price
LA NEW OTANI 3 119
LA HILTON 3 150
LA BEVERLY PLZA 3 110
LA HOL INN CONV 2 79
LA LE DUFY 2 145
LA BILTMORE 4 140
LA LE PARC 2 165
LA SHERATON GRD 3 175
SF HOL INN FIN 2 99
SF STOUFFER 5 185
SF MANDARIN 4 265
SF DIVA 2 109
SF GRAND HYATT 4 169
SF HOL INN GATE 2 99
SF NOB HILL LAM 2 175
SF INN AT OPERA 3 110
DC LOMBARDY 2 115
DC SHERATON 2 185
DC HILTON 3 166
DC GRAND HYATT 3 189
DC ONE WASH CIR 3 125
DC COMFORT INN 1 64
DC CAPITOL HILL 1 120
DC RAD PRK TERR 3 119
NY EASTGATE 1 170
NY HELMSLEY MID 2 135
NY HOL INN CRWN 2 185
NY THE MARK 3 250
NY PENINSULA 4 250
NY WARWICK 2 170
NY GRAND HYATT 3 210
NY THE REGENCY 4 215
Non Parametric Tests
Nonparametric Tests
Learning Objectives
2. Perform tests.
§ Wilcoxon
§ Mann-Whitney
§ Kruskal-Wallis
Agenda
Definition of Nonparametric
Why Study Nonparametric Tests?
Nonparametric Tests
Hypothesis Testing Method: Non-normal Data
Concept: Wilcoxon
This test can be used when the sample data are not
normally distributed, provided they come from a population
whose measurement scale is at least ordinal (or continuous)
and whose distribution is assumed to be symmetrical.
1-Sample Wilcoxon in MINITAB®
FILE: ANALYZE-MEDIAN HOUSING PRICES.MTW
[Figure: probability plot of Boston Suburb; axis 0 to 800]
1-Sample Wilcoxon in MINITAB®
FILE: ANALYZE-MEDIAN HOUSING PRICES.MTW
1-Sample Wilcoxon Conclusion
FILE: ANALYZE-MEDIAN HOUSING PRICES.MTW
Boston Suburb: N 20, Estimated Median 293, Achieved Confidence 95.0, Confidence Interval Lower 235, Upper 373
Note: MINITAB® eliminates the sample values that equal the hypothesized median and reduces N accordingly.
State the Practical Conclusion
Concept: Mann-Whitney
It requires:
§ The measurement scale is at least ordinal
§ The two sample distributions are of similar shape
§ The two variances are not significantly different
But...
§ Does not assume normality
§ Distributions do not need to be symmetrical
§ The two samples do not need to be the same size
Example: Mann-Whitney
FILE: ANALYZE-MANN-WHITNEY 2 PROCESS LOSS.MTW
Step 1: State the Practical Problem: You want to compare the medians of
two processes to see if one reduces loss.
Step 2: State the null and alternate hypotheses.
Ho: Median 1 = Median 2
Step 3: Test medians, original not normal.
Step 4: Alpha level 0.05.
[Figure: Probability Plot of Process 1 Loss (Normal): Mean 252.5, StDev 54.91, N 28]
Example: Mann-Whitney
FILE: ANALYZE-MANN-WHITNEY 2 PROCESS LOSS.MTW
[Figure: plot of Process 1 Loss and Process 2 Loss; Data axis 100 to 350]
Example: Mann-Whitney
FILE: ANALYZE-MANN-WHITNEY 2 PROCESS LOSS.MTW
Step 8: Check distribution
assumptions.
§ Normality
§ Similar shape—if not
use Mood’s Median
Example: Mann-Whitney
Mann-Whitney Test
N Median
Process 1 Loss 28 268.00
Process 2 Loss 28 235.00
Exercise: Mann-Whitney
FILE: ANALYZE-DEPARTMENT CLAIMS—KRUSKAL WALLIS.MTW
You have claim resolution time data (in days) from 2 claims-processing
departments. You want to determine whether one
department is processing claims faster than the other.
10 Minutes
Department 1 Department 2
150 130
335 315
148 128
146 126
160 140
156 136
152 132
148 128
144 124
55 35
Concept: Kruskal-Wallis
It requires:
§ The measurement scale is at least ordinal
§ The distributions are of similar shape
§ Distribution is any continuous one
But . . .
§ Does not assume normality
§ Distribution does not need to be symmetric
§ The samples do not need to be the same size
The Kruskal-Wallis is the most robust test to use when you are testing
medians from more than 2 samples.
Example: Kruskal-Wallis
FILE: ANALYZE-DEPARTMENT CLAIMS—KRUSKAL-WALLIS.MTW
Example: Kruskal-Wallis
Example: Kruskal-Wallis
FILE: ANALYZE-DEPARTMENT CLAIMS—KRUSKAL-WALLIS.MTW
Step 5-6: Determine sample size (n=10 each), plan, and collect
data.
Department 1 Department 2 Department 3
150 130 125
335 315 310
148 128 123
146 126 121
160 140 135
156 136 131
152 132 127
148 128 123
144 124 119
55 35 30
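The Kruskal-Wallis statistic for these three departments can be sketched by hand from the pooled ranks. The example below computes H without the tie adjustment and compares it with 5.991, the chi-square table value for 2 degrees of freedom at alpha = 0.05; the alpha level and the helper names are assumptions of this sketch, and MINITAB® also reports an H value adjusted for ties.

```python
departments = {
    "Department 1": [150, 335, 148, 146, 160, 156, 152, 148, 144, 55],
    "Department 2": [130, 315, 128, 126, 140, 136, 132, 128, 124, 35],
    "Department 3": [125, 310, 123, 121, 135, 131, 127, 123, 119, 30],
}

# Pool all observations and sort them to assign ranks.
pooled = sorted(v for values in departments.values() for v in values)
n_total = len(pooled)

def average_rank(value):
    """1-based rank; tied values share the mean of the positions they occupy."""
    first_position = pooled.index(value) + 1
    tie_count = pooled.count(value)
    return first_position + (tie_count - 1) / 2

rank_sums = {
    name: sum(average_rank(v) for v in values)
    for name, values in departments.items()
}

# H = 12 / (N (N + 1)) * sum(R_j^2 / n_j) - 3 (N + 1), not adjusted for ties.
h_stat = (
    12 / (n_total * (n_total + 1))
    * sum(rank_sums[name] ** 2 / len(values) for name, values in departments.items())
    - 3 * (n_total + 1)
)

CHI2_CRIT = 5.991  # chi-square table value, 2 degrees of freedom, alpha = 0.05
print("Rank sums:", rank_sums)
print(f"H = {h_stat:.2f}")
print("Reject Ho" if h_stat > CHI2_CRIT else "Fail to reject Ho")
```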
Example: Kruskal-Wallis
7 Minutes
Summary
Testing Hypothesis
Testing Hypotheses When Y
Is Continuous and X Is
Continuous
Correlation and Regression
and
Scatter Diagrams
Learning Objectives
Scatter Diagram
Concept
A scatter diagram is a graphic representation of the
relationship between two variables.
[Figure: scatter diagram with the Output Variable (Y) on the vertical axis]
Scatter Diagrams and Cause-Effect Relationships
RH 10: V = 43.1, 39.9, 41.3, 42.2, 40.4
RH 20: V = 45.2, 42.9, 42.6, 44.3
RH 30: V = 46.0, 43.2, 45.5, 45.8, 44.1
RH 40: V = 49.1, 45.0, 48.4, 48.9
[Figure: Scatterplot of Volts vs RH; RH axis 0 to 100, Volts axis 40 to 60]
Constructing Scatter Diagrams
Steps in Constructing and Interpreting a Scatter
Diagram
1. Develop a plausible and relevant theory about the
suspected relationship between two variables of interest.
2. Obtain the table of raw data, and determine the high and
low values for each variable.
3. Decide which variable will be plotted on the horizontal axis.
4. Draw and label the horizontal and vertical axes.
5. Plot the paired data.
6. Title the chart, and provide other appropriate notations.
7. Identify and classify the pattern of correlation.
8. Check for potential pitfalls in your analysis. Consider
confounding factors and other possible explanations for
the observed pattern, and decide on the team’s next steps.
Typical Patterns of Correlation
[Figure: six scatter diagrams of Y vs. X showing the patterns Strong, Positive; Strong, Negative; Complex; Weak, Positive; Weak, Negative; None]
Confounding Factors
[Figure: scatter diagram with Months of Service (0 to 72) on the horizontal axis; vertical axis 0 to 18]
Stratified: Errors vs. Experience
[Figure: stratified scatter diagram of errors vs. experience, with Months of Service (0 to 72) on the horizontal axis; vertical axis 0 to 18]
Shelf Life Study 1
[Figure: scatter diagram of Weight of Active Ingredient (mg) vs. Time Since Production (months), with Specified Shelf Life and Minimum Effective Weight marked]
Shelf Life Study by Supplier
[Figure: scatter diagram of Weight of Active Ingredient (mg) vs. Time Since Production (months), stratified by Supplier A and Supplier B, with Specified Shelf Life and Minimum Effective Weight marked]
Example: Scatter Diagrams Using MINITAB®
FILE: ANALYZE-STICKY.MTW
Example: Scatter Diagrams Using MINITAB®
Example: Scatter Diagrams Using MINITAB®
[Figure: Scatterplot of Volts vs RH; RH axis 0 to 100, Volts axis 40 to 60]
Scatter Diagrams
Warning! Correlation Does Not Imply Causation
[Figure: scatter diagram of Population (In Thousands) vs. Number of Storks]
Source: Box, Hunter, Hunter. Statistics For Experimenters. New York, NY: John Wiley & Sons. 1978
Scatter Diagrams: Pitfalls
Pitfalls:
§ Assuming correlation = causation
- Rum prices and ministers’ salaries are known to be correlated.
Does an increase in salary cause a rum price increase? Or the
other way around?
- Patient days have increased without staffing increases while
computer budget has increased. Does more spending on
computers boost nursing staff productivity?
Pitfalls and Problems in Interpretation
[Figure: two scatter diagrams of Y vs. X. First panel (X from 0 to 100): True Relationship between X & Y, Complex Correlation. Second panel (X from 0 to 40): Examining Only Part of Range Yields False Conclusion of Strong Positive Correlation]
Pitfalls and Problems in Interpretation
[Figure: panels B and C, scatter diagrams of Y vs. X]
When to Use Scatter Diagrams
When to Use
§ Testing hypotheses and identifying vital few Xs
§ Designing remedies and controls
Summary—Scatter Diagrams
Correlation and Regression Topics
Definitions
§ Correlation
§ Regression Equation
§ Coefficient of Determination
Why are These Tools Used?
§ Regression can provide prediction equations with
[Figure: fitted line plot against KNOB-1 (0 to 8): Y = -10.3333 + 7.75X, R-Squared = 0.941]
How Much Does X Impact the Y?
What Is the Size of the Correlation?
The Correlation Formula:
$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}$$
[Figure: four scatter plots of C1 vs. C2, axes 20 to 30]
Correlation Relationship Significance Table
Guideline:
§ If | r | > 0.80, then relationship is important.
§ If | r | < 0.20, then relationship is not significant.
n Decision point
5 0.878
6 0.811
7 0.754
8 0.707
9 0.666
10 0.632
11 0.602
12 0.576
13 0.553
14 0.532
15 0.514
16 0.497
17 0.482
18 0.468
19 0.456
20 0.444
22 0.423
24 0.404
26 0.388
28 0.374
30 0.361
40 0.312
50 0.279
60 0.254
80 0.22
100 0.196
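The correlation coefficient can be computed directly from its formula and compared with the decision point for the sample size. The sketch below uses a small hypothetical data set (not from the course files) and assumes that | r | above the decision point indicates a significant relationship.

```python
import math

# Hypothetical paired data, used only to illustrate the formula.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# Sums of cross-products and squares about the means.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

r = s_xy / math.sqrt(s_xx * s_yy)

DECISION_POINT_N5 = 0.878  # table entry for n = 5
significant = abs(r) > DECISION_POINT_N5

print(f"r = {r:.3f}")
print("Significant" if significant else "Not significant at this sample size")
```

With only five pairs, even a fairly large r falls short of the decision point, which is why the table values shrink as n grows.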
Correlation Data Requirements
NOTE: Correlation study can also be done with more than one X.
Correlation Example
FILE: ANALYZE-TEMP VS. MORTALITY.MTW
Identical Values of Correlation Coefficient
[Figure: four scatter diagrams of Y vs. X (both axes 0 to 5) that share the same value of the correlation coefficient]
Regression
Linear Relationship
A simple linear relationship can be described mathematically by Y = b + mX
b = Y intercept = the Y value when the line intersects the Y axis at X = 0
m = slope = rise / run
[Figure: line on X-Y axes showing the intercept b and the rise and run of the slope]
Regression Example (Fitted Line Plot)
FILE: ANALYZE-SCATT 39.MTW
[Figure: fitted line plot of Mass vs. Speed (Speed 3.50 to 4.75, Mass 30 to 38), with residual plots: normal probability plot, residuals vs. Fitted Value, histogram, residuals vs. Observation Order]
MINITAB® Output:
The regression equation is
Simple Regression Practice Exercises
FILE: ANALYZE-EROSION.MTW
[Figure: plot of Abrasion, axis 300 to 600]
Simple Regression Practice Exercises, Solutions
FILE: ANALYZE-EROSION.MTW
Problem 1, MINITAB® Analysis
Regression Analysis
The regression equation is
Abrasion = 2693 - 3.16 Hardness
Predictor Coef StDev T p
Constant 2692.8 242.9 11.09 0.000
Hardness -3.1607 0.3462 -9.13 0.000
S = 41.9371 R-Sq = 78.4% R-Sq(adj) = 77.4%
Analysis of Variance
Source DF SS MS F p
Regression 1 146569 146569 83.34 0.000
Error 23 40451 1759
Total 24 187020
The P-Value is ≤ 0.05, so the relationship is statistically significant, and
R-Sq shows that the regression explains 78% of the variation in Abrasion.
Check the residuals.
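The summary figures in this output follow from the Analysis of Variance table. The sketch below is a quick consistency check using only the DF and SS values shown above; the variable names are illustrative.

```python
import math

# Values from the Analysis of Variance table.
ss_regression, df_regression = 146569, 1
ss_error, df_error = 40451, 23
ss_total = 187020

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

r_sq = ss_regression / ss_total        # share of variation explained
f_stat = ms_regression / ms_error      # F = MS Regression / MS Error
s = math.sqrt(ms_error)                # standard error of the regression

print(f"R-Sq = {r_sq:.1%}, F = {f_stat:.2f}, S = {s:.2f}")
```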
Summary
Checkpoints
Project Plan Deliverables
Additional Summary Exercises
FILE: ANALYZE-FINAL EXERCISES.MTW
45 Minutes
Efforts Consulting
Our services
• Business Transformation
• Enterprise Management System
• Strategy & Digital Marketing
• IRMM (Digital Factories)
• Lean Six Sigma
• Time and Motion Study
• MFCA
• Productivity Norms
Get in touch with us
Efforts Consulting Pvt. Ltd.
508, Palladium Business Centre,
Opp. 4D Square Mall,
Near IIT Engg. College,
Gandhinagar Road, Ahmadabad – 382424
(+91) 9879391004
www.effortsconsulting.com