
5/7/2015

ANALYZE PHASE

Strategic Management and Business Analysis - 2014 1


Sources of Variations


 Multi-vari Plots.
• Confidence Intervals on Mean.
• Hypothesis Test on Mean.
• Hypothesis Test on Paired Mean.
• Hypothesis Test on Mean of Two Samples.
• Hypothesis Test on Variance of Two Samples.
• Contingency tables.
• Power & Sample Size.
• Non-parametric Tests.


What are Multi Vari Plots…?


 A graphical tool for examining interactions among factors. An interaction
occurs when the change in response from one level of a factor to another
depends upon the level of another factor.

 It displays the response means at each level of every factor. For
example, in a two-factor model (Factor A and Factor B, each with two
levels), the plot shows the mean response for each of the four level
combinations.

 This categorization of input factors helps to:

1. Reduce the number of input factors.
2. Detect the effect of interactions.
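
A minimal sketch of the numbers a multi-vari plot summarises, assuming a hypothetical two-factor data set (the values below are invented for illustration and are not from the course). It computes the mean response at each level combination and a difference of differences; a value far from zero suggests an interaction.

```python
from statistics import mean

# Hypothetical responses (invented for illustration), keyed by
# (level of Factor A, level of Factor B).
data = {
    ("A1", "B1"): [10.0, 11.0, 12.0],
    ("A1", "B2"): [10.5, 11.5, 11.0],
    ("A2", "B1"): [14.0, 15.0, 16.0],
    ("A2", "B2"): [9.0, 10.0, 11.0],
}

# Mean response at each level combination: the points a multi-vari plot shows.
cell_means = {cell: mean(values) for cell, values in data.items()}

# Effect of moving from B1 to B2 at each level of A.
effect_b_at_a1 = cell_means[("A1", "B2")] - cell_means[("A1", "B1")]
effect_b_at_a2 = cell_means[("A2", "B2")] - cell_means[("A2", "B1")]

# If the effect of B depends on the level of A, the two factors interact.
interaction = effect_b_at_a2 - effect_b_at_a1

for cell, m in cell_means.items():
    print(cell, m)
print("interaction contrast:", interaction)
```

Here the effect of Factor B is zero at A1 but negative at A2, so the contrast is non-zero, which is the pattern the plot would show as non-parallel lines.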


Multi Vari Plot Example…



 From the multi-vari plot, we can draw the following conclusions:

• Reda’s performance shows no big difference between production lines
(1 or 2), while Ali’s does. This means that there is an interaction
between operator and production line.

• For Ali, the night shift is better: the response (lead time) is generally
lower regardless of production line.

• The best results (lowest lead time) are achieved with the following
conditions: operator Ali working the night shift on production line 2.

• etc.

• Multi-vari Plots.
 Confidence Intervals on Mean.
• Hypothesis Test on Mean.
• Hypothesis Test on Paired Mean.
• Hypothesis Test on Mean of Two Samples.
• Hypothesis Test on Variance of Two Samples.
• Contingency tables.
• Power & Sample Size.
• Non-parametric Tests.


Statistical Inference…
 Statistical inference is the field of statistics that allows us to compare
sample statistics, such as the sample mean and sample standard
deviation, to known populations or to other samples.

 For example, if we sample customer response times before and after a
process change, we will likely see a difference. Does that difference
suggest that there was an improvement, or are the two samples similar
enough to have come from the same population?

 Due to sampling error, an estimate from a given sample may lie anywhere
within a range of values, just as multiple samples from Deming’s Red Bead
Experiment yielded different estimates of the percent of red beads, even
though no one changed the bucket.

 Similarly, the true population parameter may lie anywhere within a
given range of our estimates. This is the basis for a confidence interval.
A key assumption is that the population is both constant (it does not change
over time) and homogeneous (a given sample is representative of the
population).

                 Sample    Population
Average          Xbar      µ
Standard Dev.    s         σ


1. 1 Sample Z test
(Confidence intervals for Mean, while (σ) is known)…
 When we sample from a population and have historical evidence of the
sigma value, we can estimate the confidence interval of the mean at a
given confidence level (for example, 95%).

 In Minitab, to calculate the 95% confidence interval on the mean, use

(Stat > Basic statistics > 1 sample Z).

 As the sample size n increases, the confidence interval gets narrower.
That is, a smaller sample leaves us less certain of where the true mean
lies.

 An assumption is that the samples are from a population with a Normal
distribution. Otherwise we should use non-parametric tests.

 As a rule of thumb, needs a large sample size (30 or more points).

 Not commonly used, as (σ) is rarely known.


1. 1 Sample Z test (Example)…


 The average waiting time for 25 patients is 35.7 minutes, with a known
standard deviation of 1.8 minutes. Calculate the 95% confidence interval on
the mean.


1. 1 Sample Z test (Solution)…


 Using Minitab,

(Stat > Basic statistics > 1 sample Z).

Final answer is: 34.99 ≤ µ ≤ 36.41
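
The same interval can be reproduced outside Minitab. The sketch below is one possible hand calculation using only Python's standard library; the function name `z_confidence_interval` is an illustrative choice, not a Minitab or library name.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Two-sided confidence interval on the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for 95%
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Waiting-time example: n = 25, Xbar = 35.7, known sigma = 1.8.
low, high = z_confidence_interval(xbar=35.7, sigma=1.8, n=25)
print(f"{low:.2f} <= mu <= {high:.2f}")
```

Because the half width is z·σ/√n, quadrupling the sample size halves the width of the interval.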


2. 1 Sample t test
(Confidence intervals for Mean, while (σ) is unknown)…
 Often in practice, we do not know the true population standard
deviation. Rather, we estimate it from our sample as well. In that
case, we use the Student “t” distribution, which approaches a Normal
distribution as the sample size n increases. An additional parameter of the
Student “t” distribution is the degrees of freedom, often shown using the
Greek letter ν (pronounced nu), which equals n-1 for this test statistic.

 Can be done with a small sample size.

 In Minitab, to calculate the 95% confidence interval on the mean, use

(Stat > Basic statistics > 1 sample t).


2. 1 Sample t test (Example)…


 The average waiting time for 25 patients is 35.7 minutes, with a sample
standard deviation of 1.8 minutes. Calculate the 95% confidence interval on
the mean.


2. 1 Sample t test (Solution)…


 Using Minitab,

(Stat > Basic statistics > 1 sample t).

Final answer is: 34.96 ≤ µ ≤ 36.44

 Note that the confidence interval is wider than the one calculated (using
the same statistics) assuming a known population sigma. The wider interval
reflects the additional uncertainty that comes from estimating the
population standard deviation from the sample.
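
As a hand check of the Minitab result, the interval can be worked out from the usual formula. The critical value 2.064 is the tabulated t value for α/2 = 0.025 and ν = 24, looked up in a standard t table rather than taken from the slides.

```latex
\bar{X} \pm t_{\alpha/2,\,\nu}\,\frac{s}{\sqrt{n}}
  = 35.7 \pm 2.064 \times \frac{1.8}{\sqrt{25}}
  = 35.7 \pm 0.743
\quad\Longrightarrow\quad 34.96 \le \mu \le 36.44
```

With σ known, the multiplier would be 1.960 instead of 2.064, giving the narrower half width of 0.706 from the Z example.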

• Multi-vari Plots.
• Confidence Intervals on Mean.
 Hypothesis Test on Mean.
• Hypothesis Test on Paired Mean.
• Hypothesis Test on Mean of Two Samples.
• Hypothesis Test on Variance of Two Samples.
• Contingency tables.
• Power & Sample Size.
• Non-parametric Tests.


What is a hypothesis?


 Example of Fantastic Jump !!!!!

 Example of Risky Games !!!!!

 A hypothesis is a tentative statement that proposes a possible
explanation for some phenomenon or event. A useful hypothesis is a
TESTABLE statement which may include a prediction.

 Usually, a hypothesis is based on some previous observations.


Why do we use hypothesis testing?

 The Analyze phase in Six Sigma closely examines the many process
inputs identified in the Measure phase to determine whether they are related
to outputs and, if a relationship does exist, whether it is statistically
significant. An important tool for this analysis is hypothesis testing.
Hypothesis testing uses statistical analysis to determine if the observed
relationship between two or more samples is real or due to random chance.
A variety of tests are used to find statistical evidence to reject or fail
to reject a hypothesis. Once this is accomplished, the Six Sigma team is
ready to move forward with identifying, testing, and implementing solutions
to address the root causes of failure.


What is a hypothesis test?


 It is a test performed to learn how two variables might be related.

 How to perform a hypothesis test:

1. State the two hypotheses: the NULL hypothesis “H0” (the statement of
no change) and the ALTERNATIVE hypothesis “H1 Or Ha” (the statement
of a change).
2. Agree on the Alpha (α) level (the accepted error level, or
SIGNIFICANCE). It is commonly set to 5%, but may be as low as 1% or
even 0.1% in some critical tests.
3. Calculate the (P) Value (a probability with a value ranging from zero
to one).
4. Compare the (P) Value with (α).


Definitions…
 Null Hypothesis H0 : Statement of no change. For example, the means of
the two samples are equal.

 Alternative Hypothesis H1 Or Ha : Statement of a change. For example, the
means of the two samples are not equal.

 P Value : Probability (with a value ranging from zero to one) of
obtaining results as extreme as, or more extreme than, the sample data if
the null hypothesis is true. In other words, samples this extreme or more
extreme would occur a fraction p of the time if the null is true.

 Alpha (α) Error : Probability of wrongly rejecting the null hypothesis
(rejecting it when it is true). Also called Type I error.

 Beta (β) Error : Probability of wrongly failing to reject the null
hypothesis (not rejecting it when it is false). Also called Type II error.

 Power of the test : 1 – β.


REMEMBER!!!!

 Ha … Means that there is a change …

 H0 (Hum…) Means that there is no change …


[Figure: three distributions (Distribution 1, Distribution 2, Distribution 3) with their means (Mean 1, Mean 2, Mean 3)]


Hypothesis tests conclusion…


 Consider the Hypothesis on the Mean. In this case, the null hypothesis
is defined to test whether the population mean is equal to a specified
value. The alternative is that the population mean is not equal to that
value.

 This is known as a two-sided test, since we must test two alternatives,
one on either side of the specified value:
1. The mean is larger than the specified value.
2. The mean is smaller than the specified value.

 In two-sided tests, we must split the level of significance alpha
between the two alternatives. For example, if we used a level of
significance of 5%, then alpha/2 = 2.5% would be applied to each
alternative to achieve a total level of significance of 5%.



 Note that we could also specify our Null and Alternatives to use the
entire level of significance in a one-sided test. Say, for example, we were
interested in asserting that the average fill volume of a bottle of medicine
is 3 ml or more. In that case, we might specify the Null as the mean is
greater than or equal to 3 ml, with the alternative that the mean is less
than 3 ml. If we use a level of significance of 5%, it would be applied
entirely to the one-sided test.

 If (P < α) >> Reject Null hypothesis.

 If (P > α) >> Fail to reject Null hypothesis.


REMEMBER!!!!
 We don’t ACCEPT the null hypothesis: we either reject it (Strong
conclusion) or fail to reject it (Weak conclusion).

 If (P) is low, NULL will go.. If (P) is high, NULL is the guy.


• Multi-vari Plots.
• Confidence Intervals on Mean.
• Hypothesis Test on Mean.
 Hypothesis Test on Paired Mean.
• Hypothesis Test on Mean of Two Samples.
• Hypothesis Test on Variance of Two Samples.
• Contingency tables.
• Power & Sample Size.
• Non-parametric Tests.


3. Paired t test
Hypothesis on paired sample…
 We can also compare the means of two samples to test if they are from
populations with equal means (or from the same population).

 In this first case, we will consider samples that are paired: each
observation has a corresponding observation in the other sample batch.
For example, if we have two operators, or two test methods, and we make
measurements for each piece from each operator or method, then the
data is paired: Observation 1 from Operator A is paired with Observation 1
from Operator B.

 In the case of paired observations, we can calculate the difference
between the two samples for each piece, and we would expect the
average difference to be close to zero if the samples were from
populations of equal mean.
 We can use the Hypothesis Test on Mean discussed previously, applied to
the paired differences: testing µ=0 vs. µ≠0, where µ is the mean
difference.

(Stat > Basic statistics > Paired t).


3. Paired t test (Example)…


 To evaluate the effect of a new process design on cycle time, five
orders are processed by both the current & new process. Test at 5%
significance (2 sided) if µCURR = µNEW

Current    New
13.75      11.5
13.5       9.5
16.75      12.5
13.25      14.5
15.5       14.5

3. Paired t test (Solution)…


 Using Minitab,

(Stat > Basic statistics > Paired t).

Final answer is: P = 0.114 > 0.05, and consequently we fail to reject the
Null. There is no significant difference between the mean cycle times of
the two processes.
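
The Minitab result can be cross-checked by hand. The sketch below is an illustrative calculation, not Minitab's own algorithm: it computes the paired differences, the t statistic, and a two-sided p-value obtained by numerically integrating the Student t density with Simpson's rule (a stand-in chosen so that only the Python standard library is needed).

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * integral of the density from 0 to |t|
    (composite Simpson's rule; steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * total * h / 3

current = [13.75, 13.5, 16.75, 13.25, 15.5]
new = [11.5, 9.5, 12.5, 14.5, 14.5]

# One difference per order; H0 says the mean difference is zero.
diffs = [cur - nw for cur, nw in zip(current, new)]
n = len(diffs)
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
p_value = two_sided_p(t_stat, df=n - 1)

print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```

The mean difference is 2.05 with only four degrees of freedom, so even a difference this large does not reach significance at the 5% level.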


• Multi-vari Plots.
• Confidence Intervals on Mean.
• Hypothesis Test on Mean.
• Hypothesis Test on Paired Mean.
 Hypothesis Test on Mean of Two Samples.
• Hypothesis Test on Variance of Two Samples.
• Contingency tables.
• Power & Sample Size.
• Non-parametric Tests.


4. 2 sample t test
Hypothesis on two sample means…
 In the more general two-sample case, we can use Hypothesis Tests to
compare the means of two samples, to test whether the samples came
from populations of equal mean (or from the same population).
Assuming the population distribution is Normal, we calculate the test
statistic t0 and the degrees of freedom ν for the critical value of the
test statistic as follows:

$$t_0 = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
\qquad
\nu = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$

4. 2 sample t test (Example)…

 Given the following data, calculate P value.


n1 = 25; Xbar1 = 15.7; s1 = 1.8; s1² = 3.24
n2 = 50; Xbar2 = 21.2; s2 = 2.5; s2² = 6.25


4. 2 sample t test (Solution)…

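t0 = (15.7 − 21.2) / SQRT[(3.24/25) + (6.25/50)] = −10.90, so |t0| = 10.90

ν = ((3.24/25) + (6.25/50))² / (((3.24/25)²/24) + ((6.25/50)²/49))
ν = 63.6, rounded down to 63; t.025, 63 = 2.00;
|t0| is far larger than the critical value, so p = 0.000 (to three decimals)

Reject Null: Means are different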
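
The arithmetic of this solution can be sketched in a few lines. The helper name `welch_t` is an illustrative choice (this unequal-variance form of the two-sample t test is commonly called Welch's test); it follows the formulas for t0 and ν given with the test.

```python
from math import sqrt, floor

def welch_t(xbar1, s1, n1, xbar2, s2, n2):
    """Test statistic and degrees of freedom for two samples,
    without assuming equal variances."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2          # s^2 / n for each sample
    t0 = (xbar1 - xbar2) / sqrt(v1 + v2)
    nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t0, nu

t0, nu = welch_t(15.7, 1.8, 25, 21.2, 2.5, 50)
print(f"t0 = {t0:.2f}, nu = {nu:.1f} (use {floor(nu)})")
```

The sign of t0 only reflects which sample is listed first; for a two-sided test it is the absolute value that is compared with the critical value.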


• Multi-vari Plots.
• Confidence Intervals on Mean.
• Hypothesis Test on Mean.
• Hypothesis Test on Paired Mean.
• Hypothesis Test on Mean of Two Samples.
 Hypothesis Test on Variance of Two Samples.
• Contingency tables.
• Power & Sample Size.
• Non-parametric Tests.


Hypothesis on two samples variance…


 We can also test whether samples come from populations of equal variance
using the F distribution. The F statistic is calculated as the ratio of the
two sample variances. For the two-sided case (H1: σ1 ≠ σ2), the critical
value of the F statistic is calculated for alpha/2; for a one-sided case
(ex: H1: σ1 > σ2), we use alpha. The degrees of freedom for the F statistic
are based on the sample size n of each sample; equal sample sizes are
not required.

 DOF ν1 = n1-1; ν2 = n2-1

(Stat > Basic statistics > 2 variances).
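
Written out, the two-sided test described above takes the following form. The notation is an assumption for illustration: F with a subscript probability denotes the upper-tail critical value of the F distribution.

```latex
F_0 = \frac{s_1^2}{s_2^2}, \qquad \nu_1 = n_1 - 1, \quad \nu_2 = n_2 - 1

\text{Reject } H_0 : \sigma_1 = \sigma_2 \ \text{ if } \
F_0 > F_{\alpha/2,\ \nu_1,\ \nu_2}
\ \text{ or } \
F_0 < F_{1-\alpha/2,\ \nu_1,\ \nu_2}
```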


• Multi-vari Plots.
• Confidence Intervals on Mean.
• Hypothesis Test on Mean.
• Hypothesis Test on Paired Mean.
• Hypothesis Test on Mean of Two Samples.
• Hypothesis Test on Variance of Two Samples.
 Contingency tables.
• Power & Sample Size.
• Non-parametric Tests.


R x C Contingency tables…
 Contingency tables, also known as R x C Contingency Tables, refer to
data that can be assembled in tables (of rows and columns) for
comparison. For example:

 We may have five healthcare plans to choose from, and wish to
determine if there is a detectable difference between how these different
plans are rated by hourly and salaried employees.

 We may be interested to see if there is a difference between how men
and women rate three different television shows.

 We may need to check whether the repair rate for 4 machines is
different from shift to shift.

 The methodology for analyzing the (r) rows by (c) columns involves
using the chi-square statistic to compare the observed values with the
expected values, assuming independence.

 The Null Hypothesis is that the proportions (p) are equal for each column
in each row. The alternative is that at least one of the proportions is
different.

 The degrees of freedom equals (r-1)*(c-1), where r is the number of
rows and c is the number of columns in the Contingency table.

(Stat > tables > Chi-square test).


Chi-square test…
 Used to test whether two discrete variables are associated or not (that
is, whether one variable depends on the other).

 Two variables are associated if the distribution of observations for one
variable differs depending on the category of the second variable.

 Two variables are independent if the distribution of observations for one
variable is similar for all categories of the second variable.

 It compares the actual data readings with the values expected in case
of independence.

 When testing chi-square, use counts, not percentages.

 To use contingency tables, the expected frequency count for each cell of
the table should be at least 5.
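
A minimal sketch of the calculation, using a hypothetical 2 x 3 table (the counts are invented for illustration). The closed-form p-value used at the end holds only for 2 degrees of freedom; other table sizes would need a chi-square table or a statistics package.

```python
from math import exp

# Hypothetical counts (invented for illustration): rows are two groups of
# viewers, columns are three television shows.
observed = [
    [30, 20, 50],
    [20, 30, 50],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

# Rule of thumb: every expected count should be at least 5.
counts_ok = all(e >= 5 for row in expected for e in row)

# For dof == 2 only, the chi-square upper-tail probability is exp(-x / 2).
p_value = exp(-chi_square / 2) if dof == 2 else None

print(chi_square, dof, counts_ok, p_value)
```

With these counts the statistic is 4.0 on 2 degrees of freedom, so P is about 0.135 and the Null of independence is not rejected at the 5% level.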

Hypothesis tests problems…


 There are some predictable problems that can occur with Hypothesis
Testing that should be considered in our samples and our analysis:

 We must address and validate the key assumptions of the tests.

 Samples must be random, and we must ensure they are representative
of the population we are investigating. In surveys, low response rates
would typically provide extreme value estimates (that is, responses come
mainly from the sub-population of people who have strong opinions one way
or the other).

 Samples must be from a single population. If the population is changing
over time, then estimates will be biased, with associated increases in
alpha and beta risks.



 Many of the hypothesis tests, and associated alpha and beta risks, are
dependent on normality of the population. If the population is significantly
non-normal, then the tests are not meaningful. We should test this
assumption using the Goodness of Fit tests. Some tests additionally
require equal variance, which should also be tested.

 Realize that failure to reject is NOT acceptance. Rather, it means we
don’t have proof yet that the hypothesis should be rejected.

 We need to recognize that the alpha risk is real!

 Finally, we should consider the power of the samples to detect real
differences. High power implies that we are more likely to detect a
difference that truly exists.


Alpha risk is real…


 Alpha is the probability of rejecting the null hypothesis when it is true.
For example, if alpha is 0.05, then there are 5 chances in 100 of
incorrectly rejecting a true null hypothesis. If n investigators are
independently researching the issue, then the probability that at least one
researcher (incorrectly) rejects the null hypothesis is 1 − (1 − α)^n. For
example, the chance that at least one of ten researchers, each with an
alpha risk of 0.05, will (incorrectly) reject the true null hypothesis is
1 − 0.95^10, or about 40%!

 Consider this the next time you read the report of the Surprising Results
Of A New Study in your newspaper. Would the unsurprising results of the
other nine researchers warrant a headline?


Beta risk…
 The Beta risk is the probability of not rejecting a false null hypothesis.
Sometimes, we speak instead of the power of the test, which is the
probability of correctly rejecting the false null hypothesis. Either way, we
need to recognize that even though we fail to reject, the null hypothesis
may still be false.

 What influences our ability to correctly reject a false null hypothesis?
In other words, what gives higher power (smaller beta risk)?

1. A large difference between the null & alternative means
2. A small population sigma
3. A large sample size
4. A large Significance (α) Level


Alpha & Beta risk relation…


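[Figure: distributions for the Null Hypothesis (centered at µo) and the True Condition, separated by the Decision Line, with the areas α and β marked]

As alpha decreases, the decision line moves to the right, which increases
beta risk.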


• Multi-vari Plots.
• Confidence Intervals on Mean.
• Hypothesis Test on Mean.
• Hypothesis Test on Paired Mean.
• Hypothesis Test on Mean of Two Samples.
• Hypothesis Test on Variance of Two Samples.
• Contingency tables.
 Power & Sample Size.
• Non-parametric Tests.


Power & Sample size…


 It links sample size and test power: it calculates the sample size
required for a given test power, or vice versa.

(Stat > Power & sample size).


Power & Sample size (Example)…


 Calculate the sample size required to achieve a power of 0.90 in a t-test
with α = 0.05, σ = 1.5

Null hypothesis H0: µ = 50
Alternative H1: µ = 53


Power & Sample size (Solution)…


 Using Minitab,

(Stat > Power & sample size).

Final answer is: a sample size of 5
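
For readers without Minitab, the sketch below searches for the smallest sample size by computing the power of a two-sided one-sample t-test directly. It is an illustrative calculation under stated assumptions, not Minitab's algorithm: all function names are hypothetical, and both the t critical value and the power are obtained by numerical integration so that only the Python standard library is needed.

```python
import math
from statistics import NormalDist

def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def t_critical(alpha, df):
    """Two-sided critical value of Student's t, found by bisection."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        tail = 1 - 2 * simpson(pdf, 0.0, mid)     # P(|T| > mid)
        if tail > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_test_power(n, delta, sigma, alpha=0.05):
    """Power of the two-sided one-sample t-test for a true shift of delta."""
    df = n - 1
    crit = t_critical(alpha, df)
    shift = delta / (sigma / math.sqrt(n))        # shift in standard-error units
    phi = NormalDist().cdf
    k = 2 * (df / 2) ** (df / 2) / math.gamma(df / 2)

    def integrand(u):
        # u = s / sigma, where df * u^2 is chi-square with df degrees of freedom.
        density = k * u ** (df - 1) * math.exp(-df * u * u / 2)
        # Chance that (Z + shift) / u falls beyond +/- crit for this u.
        return (phi(shift - crit * u) + phi(-shift - crit * u)) * density

    return simpson(integrand, 0.0, 8.0)

def required_sample_size(delta, sigma, power=0.90, alpha=0.05):
    """Smallest n whose power reaches the target."""
    n = 2
    while t_test_power(n, delta, sigma, alpha) < power:
        n += 1
    return n

print(required_sample_size(delta=53 - 50, sigma=1.5))
```

The difference to detect (3) is twice σ, which is why such a small sample is enough; halving the difference would require a considerably larger sample.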


• Multi-vari Plots.
• Confidence Intervals on Mean.
• Hypothesis Test on Mean.
• Hypothesis Test on Paired Mean.
• Hypothesis Test on Mean of Two Samples.
• Hypothesis Test on Variance of Two Samples.
• Contingency tables.
• Power & Sample Size.
 Non-parametric Tests.


Non parametric tests…


 We sometimes choose to use non-parametric tests, particularly when
distributional assumptions cannot be met. A non-parametric test is one in
which there are no distributional requirements, such as Normality, for the
validity of the test.

 Typically, non-parametric tests require larger sample sizes than the
parametric tests.


Non-Parametric Sign Tests on Median…


 Basic form: under the null hypothesis, 1/2 of the data should have a
positive difference from the stated median

 Ignores the magnitude of the difference
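
A minimal sketch of a two-sided sign test, assuming the usual binomial form of the test (under the Null, the number of positive differences follows a Binomial(n, 1/2) distribution). The function name is illustrative; the data reuse the paired differences from the paired t test example, tested against a median difference of zero.

```python
from math import comb

def sign_test_p(values, median0):
    """Two-sided sign test of H0: median == median0 (ties are dropped)."""
    plus = sum(v > median0 for v in values)
    minus = sum(v < median0 for v in values)
    n = plus + minus
    k = min(plus, minus)
    # Under H0 the number of plus signs is Binomial(n, 1/2).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Paired differences (current - new) from the paired t test example.
diffs = [2.25, 4.0, 4.25, -1.25, 1.0]
print(sign_test_p(diffs, 0.0))
```

Only the signs matter here (four positive, one negative), which is why the test needs more data than a t test to reach the same conclusion.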


Wilcoxon Signed Rank Test on Median…


 Includes magnitude & sign of difference from median

 Assumes symmetric continuous distribution

 Compare to tabulated values based on n, alpha

 Can be applied to differences between paired observations


Non-Parametric Two Sample Test …


 Wilcoxon Rank-Sum Test

 Aka Mann-Whitney test

 Assumes distributions of X1 & X2 have same shape & spread

 Compare to tabulated values based on n, alpha

 If normal, 95% as efficient as t-test in large samples

 Always ≥86% as efficient as t-test


ANOVA


 Assumptions.

• One way ANOVA.

• Two Way ANOVA.

• Multi Factor ANOVA.


Why ANOVA …
 Imagine that we need to compare 4 means (µ1, µ2, µ3, µ4) using the
2 sample t test.

 First we need to check if (µ1 = µ2)
then (µ1 = µ3)
then (µ1 = µ4)
then (µ2 = µ3)
then (µ2 = µ4)
then (µ3 = µ4)

So, six tests must be performed to get the result.

 In addition to the time and effort needed to do so, there is an important
factor to consider, which is the (α) risk.

 If each of the six tests is performed at a 95% confidence level
(α = 0.05), and the tests are treated as independent, the overall (α)
risk value will be as follows:

(α) risk value = 1 – overall confidence level
= 1 – (0.95 * 0.95 * 0.95 * 0.95 * 0.95 * 0.95)
= 1 – (0.735)
= 0.265
= 26.5%

So, the probability of wrongly rejecting the Null hypothesis (4 equal means)
will be 26.5%.
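
The same calculation generalises to any number of means. The sketch below is illustrative (the function name is hypothetical) and assumes, as above, that the pairwise tests are independent.

```python
from math import comb

def familywise_alpha(num_means, alpha=0.05):
    """Chance of at least one false rejection across all pairwise tests,
    assuming the tests are independent."""
    num_tests = comb(num_means, 2)          # 4 means -> 6 pairs
    return num_tests, 1 - (1 - alpha) ** num_tests

tests, risk = familywise_alpha(4)
print(tests, round(risk, 3))
```

The number of pairs grows quickly with the number of means, so the overall risk rises fast: this is the motivation for a single test that compares all the means at once.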


What ANOVA is …

 ANalysis Of VAriance.
 It is a tool used for comparison of two or more means using analysis of
variance and F statistics.


ANOVA Assumptions…
1. The treatments follow a normal distribution. There is, however, strong
evidence that the test is robust to departures from Normality. In other
words, Normality is not generally an issue.

2. Most important is the assumption of a common level of variance
amongst the treatments. The F-test has been found to be quite
sensitive to unequal variances, so this must always be verified before
applying the F-test. Minitab includes Bartlett’s test, for data that
follows a normal distribution, and Levene’s test, which is recommended if
the data departs significantly from a normal distribution. (Stat >>
ANOVA >> Test for equal variances)

3. Finally, each treatment is influenced only by random effects. In other
words, any other possible factor influencing the treatment is equally
represented within each treatment.

• Assumptions.

 One way ANOVA.

• Two Way ANOVA.

• Multi Factor ANOVA.


One Way ANOVA …


 One-way ANOVA compares the means of several treatments of a variable.

 Ex: Revenue for each of 4 sales representatives. (Each representative
is a treatment.)

 Null hypothesis (H0) is µ1 = µ2 = µ3 = µ4

 Alternative hypothesis (Ha) is that one or more (µ) is different.

 The F-statistic is calculated to check the hypothesis by dividing the
variation between treatments (the Treatment Mean Square, MST) by the
variation within treatments (the Error Mean Square, MSE), as shown in the
following slides.


One Way ANOVA Components …


 ANOVA partitions the total variation in the data amongst the following
two components:

1. Variation between treatment means
MST: Treatment Mean Square

2. Variation within treatments
MSE: Error Mean Square
This is the random error common to all treatments (due to the equal
variance assumption)

 The ratio of between to within variation (which is the F statistic) is
near 1 if the treatment means are equal. In other words, when the variation
between the treatments is no different from the variation within the
treatments.


One Way ANOVA Example …


 For the data below, and assuming equal variances between the four
representatives, is the variation between the representatives simply due
to random error, or are their mean revenues significantly different, at a
significance alpha of 0.05?

Rep A    Rep B    Rep C    Rep D
11.7     11.6     13.6     12.3
10.7     10.8     11.4     14.1
12.2     10.4     11.9     11.5
13.3     11.5     11.2     13.2
13.2     11.0     10.4     12.0
13.7     11.8     12.7     13.0


One Way ANOVA Solution …


 Using Minitab, the output from the session window will be:

Source    DF    SS        MS       F       P
Factor     3     8.157    2.719    2.88    0.062
Error     20    18.883    0.944
Total     23    27.040

 Since the P value (0.062) is greater than 0.05, we conclude that there is
no significant difference between the representatives’ mean revenues.

 The same result follows from comparing the calculated value of the
F statistic with the critical F value … How??
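
The ANOVA table can be rebuilt from the raw data. The sketch below is an illustrative hand calculation of the sums of squares and the F statistic; the critical value F(0.05; 3, 20) = 3.10 is read from a standard F table and is an assumption here, since the Python standard library has no F distribution.

```python
from statistics import mean

revenue = {
    "Rep A": [11.7, 10.7, 12.2, 13.3, 13.2, 13.7],
    "Rep B": [11.6, 10.8, 10.4, 11.5, 11.0, 11.8],
    "Rep C": [13.6, 11.4, 11.9, 11.2, 10.4, 12.7],
    "Rep D": [12.3, 14.1, 11.5, 13.2, 12.0, 13.0],
}

all_values = [x for values in revenue.values() for x in values]
grand_mean = mean(all_values)
rep_means = {rep: mean(values) for rep, values in revenue.items()}

# Variation between treatments (Factor) and within treatments (Error).
ss_factor = sum(
    len(values) * (rep_means[rep] - grand_mean) ** 2
    for rep, values in revenue.items()
)
ss_error = sum(
    (x - rep_means[rep]) ** 2 for rep, values in revenue.items() for x in values
)

df_factor = len(revenue) - 1                    # 3
df_error = len(all_values) - len(revenue)       # 20

mst = ss_factor / df_factor
mse = ss_error / df_error
f_stat = mst / mse

# Assumed table value for alpha = 0.05 with (3, 20) degrees of freedom.
F_CRITICAL = 3.10
reject_null = f_stat > F_CRITICAL

print(f"SS factor = {ss_factor:.3f}, SS error = {ss_error:.3f}, F = {f_stat:.2f}")
print("reject null:", reject_null)
```

The calculated F falls below the critical value, which agrees with the conclusion drawn from the P value.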


What is the critical value for a statistic…


 The critical value is the value corresponding to a given significance
level. This cutoff value determines the boundary between rejecting the null
hypothesis and failing to reject it. If the absolute value of the
calculated value from the statistical test is greater than the critical
value, then the null hypothesis is rejected in favor of the alternative
hypothesis, and vice versa.

 In other words, it is the statistic value which should be reached or
exceeded in order to have a (P Value) equal to or less than the required
significance level, so as to be able to reject the null hypothesis…


How can we calculate the critical value for a statistic…


 Open Minitab.

 Go to Calc >> Probability Distributions >> then choose the distribution.

 Choose “Inverse cumulative probability”.
 Enter the required parameters, such as the degrees of freedom (we can get
them from the session window of the statistical test, for example one way
ANOVA).
 Choose “Input constant” and enter the cumulative probability, which is
one minus the significance level (0.95 if not stated).

Then press OK.

 From the session window, record the ABSOLUTE value of “X”.


One Way ANOVA with Confidence Intervals…


 Minitab goes a step further by providing the confidence intervals of the
mean for each treatment.

 If the F-test had detected a difference in one of the means (with a p <
0.05), then we could use these confidence intervals on the mean to
determine which mean is significantly different.

 In this example, the intervals overlap, consistent with our inability to
detect a statistical difference between the means. If one of the means
were statistically different, its confidence interval would not overlap one
or more of the others.


A (----------*---------)
B (----------*---------)
C (---------*----------)
D (----------*---------)


• Assumptions.

• One way ANOVA.

 Two Way ANOVA.

• Multi Factor ANOVA.


Two Way ANOVA …


 Two-Way ANOVA is used when there is a second factor that occurs
within each treatment. For example, does the product type impact each
sales representative’s revenue?

 We sometimes refer to this second factor as a block effect within each
treatment (the treatments may also be called main effects).

 The same basic assumptions are necessary as with the One-way
ANOVA, with the added provision that all other possible factors are
randomized within each block (when multiple samples occur within a
block).


Two Way ANOVA Components …


 Like One Way, Two Way ANOVA partitions the total variation in the data
amongst the following two components:

1. Variation between treatment means

2. Variation within treatments, which is divided into three parts:

i. The random error common to all treatments (equal variance)
ii. The effect of the block variable
iii. The interaction of the treatment & the block. For example, is the
revenue for a given sales agent affected by product type?


Two Way ANOVA Example …


 For the data below, and assuming equal variances between the four
representatives, is the variation between the representatives simply due
to random error, or are their mean revenues significantly different, at a
significance alpha of 0.05? Are there differences due to the Product Type
(P1 or P2)?

Product    Rep A    Rep B    Rep C    Rep D
P1         11.7     11.6     13.6     12.3
P2         14.2     13.3     14.7     15.3
P1         12.2     10.4     11.9     11.5
P2         13.3     12.5     15.6     13.2

Two Way ANOVA Example …


 In the above data, because we have repeated (or replicated)
conditions, we can also estimate the interaction between the Product
Types and Representatives. In other words, does the effect of Product
Type on sales vary depending on the Rep? Are some Reps better at
selling certain products, but not others?


Two Way ANOVA Solution …


 In Minitab, first stack the values of all representatives in one
column (Data >> Stack >> Columns), then run the two-way ANOVA. The output
from the session window will be:

Source           DF   SS        MS        F       P
Product           1   17.8506   17.8506   24.02   0.001
Representative    3    8.1019    2.7006    3.63   0.064
Interaction       3    0.2819    0.0940    0.13   0.942
Error             8    5.9450    0.7431
Total            15   32.1794
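
The sums of squares in the table above can be reproduced by hand. The following is a minimal sketch in plain Python (not Minitab's own procedure), assuming the balanced layout of the example with two replicates per Product and Rep cell. The helper names (`ss_between`, `stacked`) are illustrative only, and p-values are left out because they need an F distribution table or a statistics library.

```python
# Two-way ANOVA with replication for the example data (balanced design).
data = {
    ("P1", "A"): [11.7, 12.2], ("P1", "B"): [11.6, 10.4],
    ("P1", "C"): [13.6, 11.9], ("P1", "D"): [12.3, 11.5],
    ("P2", "A"): [14.2, 13.3], ("P2", "B"): [13.3, 12.5],
    ("P2", "C"): [14.7, 15.6], ("P2", "D"): [15.3, 13.2],
}

# "Stack" the table: one (product, rep, value) record per observation.
stacked = [(p, r, y) for (p, r), ys in data.items() for y in ys]


def mean(values):
    return sum(values) / len(values)


grand = mean([y for _, _, y in stacked])


def ss_between(key):
    """Sum of squares between the groups defined by key(record)."""
    groups = {}
    for rec in stacked:
        groups.setdefault(key(rec), []).append(rec[2])
    return sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())


ss_total = sum((y - grand) ** 2 for _, _, y in stacked)
ss_product = ss_between(lambda rec: rec[0])
ss_rep = ss_between(lambda rec: rec[1])
ss_cells = ss_between(lambda rec: (rec[0], rec[1]))
# For a balanced design the interaction is what the cells explain
# beyond the two main effects.
ss_interaction = ss_cells - ss_product - ss_rep
ss_error = ss_total - ss_cells

df_product, df_rep, df_interaction, df_error = 1, 3, 3, 8
ms_error = ss_error / df_error
f_product = (ss_product / df_product) / ms_error
f_rep = (ss_rep / df_rep) / ms_error
f_interaction = (ss_interaction / df_interaction) / ms_error

for name, ss, f in [("Product", ss_product, f_product),
                    ("Representative", ss_rep, f_rep),
                    ("Interaction", ss_interaction, f_interaction)]:
    print(f"{name:15s} SS={ss:8.4f}  F={f:6.2f}")
print(f"{'Error':15s} SS={ss_error:8.4f}")
print(f"{'Total':15s} SS={ss_total:8.4f}")
```

The subtraction used for the interaction term relies on the design being balanced; with unequal cell sizes a general linear model fit would be needed instead.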


Two Way ANOVA with Confidence Intervals…


 Sales Rep. confidence intervals

A (-------*------)
B (------*-------)
C (-------*------)
D (-------*------)
 Product type confidence intervals

P1 (------*-----)
P2 (-----*-----)


Two Way ANOVA with Confidence Intervals…


 Minitab’s confidence intervals (shown with the Display Means option)
show that the difference in means for Sales Reps B and C accounts for
the relatively low p-value for Rep (0.064). The means for Product Types P1
and P2 are shown to be significantly different.

Note that this is not a preferred method for testing whether the means are
different, as the combined Type I error for the confidence intervals is much
higher than the alpha significance of each individually (as was discussed
in the first portion of this topic). A better approach using Tukey’s HSD
statistic is shown later in this topic.


• Assumptions.

• One way ANOVA.

• Two Way ANOVA.

 Multi Factor ANOVA.


Multi Factor ANOVA Example…


 Example data is shown of revenue for four sales representatives
(expressed in thousands of dollars). Is the variation between the reps
simply due to random error, or are their mean revenues significantly
different, at a significance alpha of 0.05?

 Are there differences due to the Product Type (P1 or P2)? Or to the
geographic area (region) they service?

Prod   Area   Rep A   Rep B   Rep C   Rep D
P1     X      11.7    11.6    13.6    12.3
P2     X      14.2    13.3    14.7    15.3
P1     Y      12.2    10.4    11.9    11.5
P2     Y      13.3    12.5    15.6    13.2

Multi Factor ANOVA Example…


 In the above data, we have the correct combination of runs to estimate
the two-factor interactions between the main factors: Rep and Product;
Rep and Area; Product and Area. The interactions test whether there is a
significant difference in the Rep means for different Product Types (in
other words, are some Reps better at selling certain products, but not
others?) or for different Areas, and whether there is a significant
difference in the Product means for different Areas.


Multi Factor ANOVA Solution…


 In Minitab, go to Stat >> ANOVA >> General Linear Model

 Enter the required data, including interactions between factors using
the symbol (*). Don’t forget to stack the data points first.

 The output from the session window will be:

Source          DF   Seq SS    Adj SS    Adj MS    F       P
Rep.             3    8.1019    8.1019    2.7006    3.07   0.191
Product          1   17.8506   17.8506   17.8506   20.31   0.020
Area             1    2.3256    2.3256    2.3256    2.65   0.202
Rep.*Product     3    0.2819    0.2819    0.0940    0.11   0.951
Rep.*Area        3    0.9769    0.9769    0.3256    0.37   0.782
Product*Area     1    0.0056    0.0056    0.0056    0.01   0.941
Error            3    2.6369    2.6369    0.8790
Total           15   32.1794
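
As a cross-check on the table above, the same sums of squares can be computed directly from the stacked data. This is a hedged sketch in plain Python, not the General Linear Model routine itself: it assumes the balanced design of the example, where each interaction equals the between-cell sum of squares minus the two main effects. The names `FACTORS` and `ss_cells` are illustrative.

```python
from itertools import combinations

reps = ["A", "B", "C", "D"]
# Rows of the example table: (product, area, revenue for Reps A..D).
rows = [
    ("P1", "X", [11.7, 11.6, 13.6, 12.3]),
    ("P2", "X", [14.2, 13.3, 14.7, 15.3]),
    ("P1", "Y", [12.2, 10.4, 11.9, 11.5]),
    ("P2", "Y", [13.3, 12.5, 15.6, 13.2]),
]
# Stacked records: (rep, product, area, revenue).
stacked = [(rep, prod, area, y)
           for prod, area, ys in rows
           for rep, y in zip(reps, ys)]
FACTORS = {"Rep": 0, "Product": 1, "Area": 2}  # position in each record


def mean(values):
    return sum(values) / len(values)


grand = mean([rec[3] for rec in stacked])


def ss_cells(names):
    """Between-cell sum of squares for cells formed by the named factors."""
    cells = {}
    for rec in stacked:
        key = tuple(rec[FACTORS[n]] for n in names)
        cells.setdefault(key, []).append(rec[3])
    return sum(len(c) * (mean(c) - grand) ** 2 for c in cells.values())


# Main effects, then two-factor interactions by subtraction (balanced data).
ss = {name: ss_cells([name]) for name in FACTORS}
for a, b in combinations(list(FACTORS), 2):
    ss[f"{a}*{b}"] = ss_cells([a, b]) - ss[a] - ss[b]

ss_total = sum((rec[3] - grand) ** 2 for rec in stacked)
ss_error = ss_total - sum(ss.values())

df = {"Rep": 3, "Product": 1, "Area": 1,
      "Rep*Product": 3, "Rep*Area": 3, "Product*Area": 1}
ms_error = ss_error / 3  # 3 error degrees of freedom remain
f_ratio = {term: (ss[term] / df[term]) / ms_error for term in ss}

for term in ss:
    print(f"{term:13s} SS={ss[term]:8.4f}  F={f_ratio[term]:6.2f}")
print(f"{'Error':13s} SS={ss_error:8.4f}")
```

With only 3 error degrees of freedom left, the F tests here have little power, which is consistent with the larger p-values in this table compared with the two-way analysis.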

Tukey’s HSD Test…


 As mentioned earlier, a simple check of the confidence intervals
between all pairs of treatment means will inflate the Type I error. Tukey’s
HSD (Honestly Significant Difference) test compares all pairs of treatment
means while holding the combined significance level at the stated alpha.

 Reject equal means if the absolute difference in means ≥ HSD

 Minitab: use the Comparisons button within the GLM dialog

 Select the main factors as Terms


Tukey’s HSD Example…


 Using the Multi-Factor example data, only Tukey’s 95%
simultaneous confidence interval for the difference P2 - P1 excludes
the value 0.0. This indicates that the treatment means P1 and P2 are
significantly different.

Product = P1 subtracted from P2:


Lower: 0.6207
Center: 2.112
Upper: 3.604
(--------------*--------------)
----+---------+---------+---------+--
1.0 2.0 3.0 4.0
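
The interval above can be approximated by hand. When a factor has only two levels, the Tukey simultaneous interval coincides with the ordinary two-sample t interval built on the ANOVA error mean square. The sketch below relies on that fact and takes two inputs as assumptions: the error mean square (0.8790, 3 df) read from the Multi Factor ANOVA output, and the tabulated two-sided 95% t critical value for 3 degrees of freedom (3.182). For the four Reps a studentized range critical value would be needed instead, which is not shown here.

```python
import math

# Revenue by product type, stacked across Reps and Areas.
p1 = [11.7, 11.6, 13.6, 12.3, 12.2, 10.4, 11.9, 11.5]
p2 = [14.2, 13.3, 14.7, 15.3, 13.3, 12.5, 15.6, 13.2]

ms_error = 0.8790  # Adj MS for Error from the GLM output (3 df)
t_crit = 3.182     # two-sided 95% t critical value for 3 df (table value)

diff = sum(p2) / len(p2) - sum(p1) / len(p1)
std_err = math.sqrt(ms_error * (1 / len(p1) + 1 / len(p2)))
half_width = t_crit * std_err
lower, upper = diff - half_width, diff + half_width

print(f"P2 - P1: lower={lower:.3f} center={diff:.3f} upper={upper:.3f}")
# Zero lies outside the interval, so the product means differ.
print("significant:", lower > 0 or upper < 0)
```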

Regression Analysis


 Cause & Effect Analysis.

• Scatter Diagram.

• Regression Linear Model.


Cause & Effect Diagram …


 Structured approach to brainstorm root causes

 Graphical representation of your understanding of relationships

 Also known as Fishbone Diagrams because of their form

 Sometimes called Ishikawa Diagrams in reference to Kaoru Ishikawa, the
Japanese engineer who popularized their use for Quality Improvement

 Listing all the causes for a given effect in a clear, organized way makes
it easier to separate out potential problems and target areas for
improvement.


Cause & Effect Diagram …

[Figure: generic fishbone layout. Cause branches, each with Sub-causes, lead into the Effect.]


Cause & Effect Diagram Example …


Cause & Effect Diagram Methodology …


 To create a Cause and Effect diagram, begin by brainstorming the
potential relationships between the process and the outcome. The
outcome, or effect, is typically stated in terms of a problem rather than a
desired condition, which tends to help the brainstorming.

 Bear in mind that the causes listed are all potential causes, since we
have no data at this point to support whether any of the causes really
contribute to the problem. In that regard, as in all brainstorming activities,
avoid judging the merits of each cause as it is offered. Only data can lead
to that judgment.

 The major branches of the Fishbone are chosen to assist in
brainstorming, or afterwards to categorize the potential problems.
Sub-causes are added as needed, and it’s often helpful to go down several
levels of sub-causes.

Cause & Effect Diagram Methodology …


 You may find it convenient to use either the 5 M’s and E or the 4 P’s,
either to categorize causes on the final Fishbone or to ensure that all
areas are considered during brainstorming.

• 5 M’s & E: Manpower, Machines, Methods, Material, Measurement,
Environment
• 4 P’s: Policy, Procedures, Plant, People


Cause & Effect Diagram Uses …


 Measure Phase:
To brainstorm potential areas / defects / causes on which to focus process
measurements

 Analyze Phase:
To brainstorm potential basic process factors, which can be investigated
in a designed experiment


• Cause & Effect Analysis.

 Scatter Diagram.

• Regression Linear Model.


Scatter Diagram …
 Scatter Diagrams are used to investigate possible correlation, or
(inter-dependence), of one variable to another.

 Correlation indicates that as one variable changes, the other also
changes. Although this may indicate a cause and effect relationship, this
is not always the case, since there may be other characteristics (possibly
many more) that are the actual cause, with both of the characteristics
being investigated as their effects.

 In spite of this ambiguity about cause and effect, it can still be
useful to establish correlation. If we know, for instance, that there is
considerable correlation between two characteristics, we can use one to
predict the other, particularly if one characteristic is easy to measure and
the other isn’t.


Scatter Diagram …
 For example, if we can show that weight gain in the first three months
of pregnancy correlates well with baby development, we can use weight
gain as a predictor of healthy fetal development. The alternative would be
expensive tests to monitor the actual development of the baby.

 A Scatter Diagram is a graphical tool to examine the relationship
between data collected for two different characteristics. Although the
Scatter Diagram cannot determine the cause of such a relationship, it
can show whether or not such a relationship exists, and if so, just how
strong it is.


Scatter Diagram Example …

X      Y
1.5    76
1.3    85
1.1    75
1.6    84
2      88
1.75   90

[Figure: scatter plot of Y (dependent variable, axis ticks 70% to 90%) against X (independent variable, axis ticks 1.0 to 2.0).]

Interpolation and extrapolation…


 From a Scatter Diagram we can predict the Y variable for known values of
the X variable. Prediction within the range of values in the dataset used for
model-fitting is known informally as INTERPOLATION. Prediction outside
this range of the data is known as EXTRAPOLATION. Performing
extrapolation relies strongly on the model assumptions. The further the
extrapolation goes outside the data, the more room there is for the model
to fail due to wrong assumptions.

 It is generally advised that when performing extrapolation, one should
accompany the estimated value of the dependent variable with a
prediction interval that represents the uncertainty. Such intervals tend to
expand rapidly as the values of the independent variable(s) move
outside the range covered by the observed data.
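
To illustrate how the prediction interval widens, here is a minimal sketch using the six (X, Y) points from the Scatter Diagram Example. It assumes a simple straight-line fit and the usual normal-error prediction interval formula. The critical value 2.776 is the tabulated two-sided 95% t value for 4 degrees of freedom, and the query points 1.5 and 3.0 are arbitrary choices for illustration.

```python
import math

x = [1.5, 1.3, 1.1, 1.6, 2.0, 1.75]
y = [76, 85, 75, 84, 88, 90]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx            # slope
b0 = y_bar - b1 * x_bar   # intercept

# Residual standard deviation with n - 2 degrees of freedom.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))
t_crit = 2.776  # two-sided 95% t critical value for 4 df (table value)


def prediction_interval(x0):
    """95% prediction interval for a new observation at x0."""
    y_hat = b0 + b1 * x0
    half = t_crit * s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    return y_hat - half, y_hat + half


inside = prediction_interval(1.5)    # interpolation: within 1.1 .. 2.0
outside = prediction_interval(3.0)   # extrapolation: beyond the data
width_inside = inside[1] - inside[0]
width_outside = outside[1] - outside[0]
print(f"width at x=1.5: {width_inside:.1f}")
print(f"width at x=3.0: {width_outside:.1f}")
```

The term (x0 - x_bar)^2 / sxx is what drives the growth: it is near zero close to the centre of the data and increases with the square of the distance from it.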


Extrapolation …


Correlation & Scatter plot…


 Correlation is a statistical technique that can show whether and how
strongly pairs of variables are related. For example, height and weight are
related; taller people tend to be heavier than shorter people.

[Figure: scatter plot of the dependent variable against the independent variable.]


Positive Correlation…

[Figure: scatter plot with an upward trend, dependent variable against independent variable.]

 Slope is positive.

 As X increases, Y will increase.


Negative Correlation…

[Figure: scatter plot with a downward trend, dependent variable against independent variable.]

 Slope is negative.

 As X increases, Y will decrease.


Weak Correlation…

[Figure: widely scattered points around a fitted line, dependent variable against independent variable.]

 As X varies, the response Y is not well predicted by the line
(high error).


Strong Correlation…

[Figure: points tightly clustered around a fitted line, dependent variable against independent variable.]

 As X varies, the response Y is well predicted by the line
(low error).


Correlation coefficient …
• The main result of a correlation is called the correlation coefficient (or
"r"). It ranges from -1.0 to +1.0. The closer r is to +1 or -1, the more
closely the two variables are related. If r is close to 0, it means there is
no linear relationship between the variables.

• If r is positive, it means that as one variable gets larger, the other
gets larger.

• If r is negative, it means that as one gets larger, the other gets smaller
(often called an "inverse" correlation).
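
As a worked illustration, the correlation coefficient for the six points of the Scatter Diagram Example can be computed as follows. This is a small sketch of the standard Pearson formula in plain Python; the function name `pearson_r` is illustrative.

```python
import math

x = [1.5, 1.3, 1.1, 1.6, 2.0, 1.75]
y = [76, 85, 75, 84, 88, 90]


def pearson_r(xs, ys):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    sxx = sum((a - x_bar) ** 2 for a in xs)
    syy = sum((b - y_bar) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(x, y)
print(f"r = {r:.3f}")  # about 0.72: a positive, moderately strong correlation
```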


Determination coefficient …
 Measures how much of the variability in the dependent variable is
explained by the fitted line, that is, by variability in the independent
variable. Also known as R-squared.

 As a rule of thumb, trust the regression model for prediction only if
R-squared is higher than 70%.
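
A short sketch of how R-squared can be computed, again using the six points from the Scatter Diagram Example and assuming a simple least-squares line. It uses the definition R-squared = 1 - SSE / SST, where SSE is the residual sum of squares and SST the total sum of squares of Y.

```python
x = [1.5, 1.3, 1.1, 1.6, 2.0, 1.75]
y = [76, 85, 75, 84, 88, 90]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx            # least-squares slope
b0 = y_bar - b1 * x_bar   # least-squares intercept

sst = sum((yi - y_bar) ** 2 for yi in y)                        # total variation
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))   # unexplained
r_squared = 1 - sse / sst

print(f"R-squared = {r_squared:.1%}")
print("meets the 70% rule of thumb:", r_squared > 0.70)
```

For this small dataset R-squared comes out at roughly 52%, so by the 70% rule of thumb the fitted line would not be relied on for prediction.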


Non Linear Correlation…

[Figure: scatter plot showing a non-linear correlation, dependent variable against independent variable.]

 Correlation may also be non-linear in nature. This is neither good nor
bad in a general sense, although it may serve our specific interests one
way or the other. Statistically, we prefer linear regression only because the
analysis is less complicated. When the correlation is non-linear, we can
use the stratification technique.

Stratification…

[Figure: scatter plot of the dependent variable against the independent variable, all points in one series.]
 In this Scatter Diagram, there is no clear correlation between the two
variables. As we change the x-variable, we can see no clear pattern in the
y-values.

 So, we will analyze the same data by grouping them into three series
based on the value of a third variable.

Stratification…

[Figure: the same scatter plot with points colored by series, dependent variable against independent variable.]

 Suppose each of the three series, shown in the graph as yellow, orange
and blue points, represents a different cooking temperature.
• Series one, displayed here in yellow, shows a clear positive correlation.
• Series two, in orange, shows a negative correlation.
• Series three, in blue, shows no correlation.

• Cause & Effect Analysis.

• Scatter Diagram.

 Regression Linear Model.


What is the meaning of regression analysis…


 Technique for modeling and analyzing several variables, when the
focus is on the relationship between a dependent variable and one or
more independent variables. More specifically, regression analysis helps
us to understand how the typical value of the dependent variable changes
when any one of the independent variables is varied, while the other
independent variables are held fixed. Most commonly, regression analysis
estimates the expected value of the dependent variable given the
independent variables.


What is the benefit of regression analysis…


 The objective of the regression analysis is to determine if the
dependent variable can be adequately predicted by the independent
variable. We can never observe every potential value of x to determine y,
so it is useful to observe a sample of the possible outcomes instead and
define the relationship in the form of an equation. We can then use the
equation, or model, to predict the dependent variable for other values of
the independent variable.


How to build regression model…


 The Regression Model used for Simple Linear Regression is that of a
straight line. You might recall this equation as y = m x + b,
where y is the dependent variable, x is the independent variable, m is the
slope, and b is the value of y when x equals zero (b is sometimes called
the intercept). In the notation below, β0 is the intercept (b) and β1 is
the slope (m).

Y = β0 + β1 X + error

 The error term is an acknowledgement that even if we could sample all
possible values, there would most likely be some unpredictability in the
outcome. It could be due to many possibilities, including measurement
error in either the dependent or independent variable, the effects of
other unknown variables, or non-linear effects.
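
The model can be fitted by ordinary least squares. The sketch below estimates β0 and β1 for the six points of the Scatter Diagram Example in plain Python; the residuals it prints are the sample counterpart of the error term. The function name `fit_line` is illustrative, and a statistics package such as Minitab would report the same coefficients along with their standard errors.

```python
x = [1.5, 1.3, 1.1, 1.6, 2.0, 1.75]
y = [76, 85, 75, 84, 88, 90]


def fit_line(xs, ys):
    """Least-squares estimates (b0, b1) for Y = b0 + b1 * X + error."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((a - x_bar) ** 2 for a in xs)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(x, y)
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]  # observed minus fitted

print(f"Y = {b0:.2f} + {b1:.2f} X")
for xi, yi, ei in zip(x, y, residuals):
    print(f"x={xi:<5} y={yi:<3} residual={ei:+.2f}")
```

The residuals of a least-squares line with an intercept always sum to zero, which is a quick sanity check on the fit.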
