UNCLASSIFIED / FOUO
National Guard
Black Belt Training
Module 36
This material is not for general distribution, and its contents should not be quoted, extracted for publication, or otherwise
copied or distributed without prior coordination with the Department of the Army, ATTN: ETF.
ACTIVITIES
• Identify Potential Root Causes
• Reduce List of Potential Root Causes
• Confirm Root Cause to Output Relationship
• Estimate Impact of Root Causes on Key Outputs
• Prioritize Root Causes
• Complete Analyze Tollgate
TOOLS
• Value Stream Analysis
• Process Constraint ID
• Takt Time Analysis
• Cause and Effect Analysis
• Brainstorming
• 5 Whys
• Affinity Diagram
• Pareto
• Cause and Effect Matrix
• FMEA
• Hypothesis Tests
• ANOVA
• Chi Square
• Simple and Multiple Regression
Note: Activities and tools vary by project. Lists provided here are not necessarily all-inclusive.
Learning Objectives
Terminology and data requirements for conducting a
regression analysis
Interpretation and use of scatter plots
Interpretation and use of correlation coefficients
The difference between correlation and causation
How to generate, interpret, and use regression
equations
Application Examples
Administrative – A financial analyst wants to predict
the cash needed to support growth and increases in
training
Market/Customer Research – The main exchange
wants to determine how to predict a customer’s
buying decision from demographics and product
characteristics
Hospitality – The MWR Guest House wants to see if
there is a relationship between room service delays
and order size
[Figure: tool selection grid by data type; recoverable labels: Regression, ANOVA, Attribute]
The tool depends on the data type. Regression is typically used with a continuous
input and a continuous response but can also be used with count or categorical
inputs and outputs.
Simple Linear Regression
Regression Terminology
Types of Variables
Input Variable (Xs)
These are also called predictor variables or independent variables
Best if the variables are continuous, but can be count or categorical
Output Variable (Ys)
These are also called response variables or dependent variables (what we’re trying to predict)
Best if the variables are continuous, but can be count or categorical
[Diagram: inputs X1, X2, and X3 feed a Process or Product that yields output Y; Error also acts on the process]
Be Careful
Correlation does not
guarantee causation!
Other examples?
Average life expectancy
Gas mileage
Lurking
variables!
When is it correct to infer causation?
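A lurking variable can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not data from this module: a hypothetical hidden factor Z drives both X and Y, and neither X nor Y influences the other, yet the two still show a strong correlation.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical lurking variable Z (made-up values).
z = [rng.uniform(0, 10) for _ in range(200)]
x = [zi + rng.gauss(0, 0.5) for zi in z]  # X depends only on Z plus noise
y = [zi + rng.gauss(0, 0.5) for zi in z]  # Y depends only on Z plus noise

r_xy = pearson_r(x, y)
print(f"r between X and Y: {r_xy:.3f}")  # strong, although X does not cause Y
```

Changing X in this toy would do nothing to Y, which is why a high r alone does not justify a causal claim.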
[Scatter plot: Call Length (Y axis) versus Broker Experience (X axis)]
Does it look like a relationship exists between Broker Experience and Call Length?
[Scatter plot anatomy: Y axis (result?) is Call Length; X axis (suspected influence) is Broker Experience; each point is one paired data observation]
Paired Data?
To use a scatter plot, you must have measured two factors for a single observation or item (for example, for a
given measurement, you need to know both the call length and the broker’s experience). You have to
make sure that the data “pair up” properly in Minitab, or the diagram will be meaningless.
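Outside Minitab, the same pairing requirement can be sketched in a few lines of Python. The values below are made up for illustration only; the point is that each row carries both measurements for one call, and that any reordering is done on whole rows.

```python
# Hypothetical records: each tuple is ONE call holding both measurements,
# in the order (broker experience, call length). Values are made up.
calls = [
    (12, 48),
    (18, 41),
    (25, 30),
    (30, 22),
]

# Splitting into columns keeps the pairing because row order is preserved.
experience = [exp for exp, _ in calls]
call_length = [length for _, length in calls]

# Sorting one column on its own would break the pairing;
# sort whole rows so each X stays with its own Y.
calls_sorted = sorted(calls, key=lambda row: row[1])

# A quick sanity check: both columns must have the same number of entries.
assert len(experience) == len(call_length)
```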
Scatter Plots
[Figure: three example scatter plots (Example One, Example Two, Example Three)]
5. Double click on C5 Wait Time to enter it as the Y variable, then double click on C6 Deliveries to enter it as the X variable
7. Click OK
[Scatter plot: Wait Time (Y axis) versus Deliveries (X axis)]
[Example scatter plot: r = -0.8]
[Scatter plot: Call Length (Y axis) versus Broker Experience (X axis); r = -0.896 (a strong negative correlation)]
Exercise: Correlation
The scatter plot shows that the customers are waiting
longer when Anthony’s Pizza has to make more
deliveries
Next, the Belt wants to quantify the strength of that
relationship
To do that, we will calculate the Pearson Correlation
Coefficient, r
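For reference, the standard definition of the Pearson correlation coefficient for n paired observations is shown below. The symbols x_i, y_i, x-bar, and y-bar are notation assumed here, not taken from the slides; in the pizza exercise x would be Deliveries and y would be Wait Time.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate little or no linear relationship.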
Pizza Correlation
1. Choose Stat > Basic Statistics > Correlation
Correlation Coefficient
Interpreting Coefficients – r²
First, we obtained r from the Correlation analysis
Next, in Regression, we will look at r² to see how good our model (regression equation) is
r²: computed by multiplying r × r (the Pearson correlation squared)
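As a quick worked example, the two correlation values quoted in this module can be squared directly. This is only arithmetic on the quoted r values; the variable names are illustrative.

```python
# Correlation values quoted in this module.
r_call_length = -0.896  # Broker Experience vs. Call Length
r_wait_time = 0.970     # Deliveries vs. Wait Time

for label, r in [("Call Length", r_call_length), ("Wait Time", r_wait_time)]:
    r_squared = r * r  # squaring removes the sign
    print(f"{label}: r = {r:+.3f}, r^2 = {r_squared:.3f}")
# prints:
# Call Length: r = -0.896, r^2 = 0.803
# Wait Time: r = +0.970, r^2 = 0.941
```

In simple linear regression, r² can be read as the share of the variation in Y that the fitted line accounts for, so a strong negative r and a strong positive r both give a high r².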
Regression Analysis
Regression Analysis is used in conjunction with
Correlation and Scatter Plots to predict future
performance using past results
While Correlation shows how much linear relationship
exists between two variables, Regression defines the
relationship more precisely
Use this tool when there is existing data over a
defined range
Regression analysis is a tool that uses data on
relevant variables to develop a prediction equation, or
model
Linear Regression
In Simple Linear Regression, a single variable “X” is
used to define/predict “Y”
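[Plot: a straight line fitted through data points, y on the vertical axis and x on the horizontal axis]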
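Written out, a simple linear regression model has the form below. The coefficient names b_0 (intercept) and b_1 (slope) are notation assumed here for illustration; the slides only name X and Y.

```latex
\hat{y} = b_0 + b_1 x
```

Here \hat{y} is the predicted value of Y for a given x, b_0 is the predicted value when x = 0, and b_1 is the change in the prediction for a one-unit increase in x.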
Exercise: Regression
Since the Pearson Correlation (r) was 0.970, we know
that there is a strong positive correlation between the
number of deliveries and the wait time
Next, the Belt would like to get an equation to predict
how long the customers will be waiting
Regression (Cont.)
1. Choose Stat > Regression > Fitted Line Plot
2. Double click on C5 Wait Time to enter it as the Response (Y) variable
3. Double click on C6 Deliveries to enter it as the Predictor (X) variable
4. Make sure Linear is checked for the type of Regression
5. Edit dialog box options (optional)
6. Click OK
[Fitted line plot: Wait Time (Y axis) versus Deliveries (X axis)]
Prediction Equation (Regression Model)
[Fitted line plot: Wait Time (Y axis) versus Deliveries (X axis). Ŷ is the “fitted” observation (the line); Y is the true observation (the data point).]
Minitab will find the “best fitting” line for us. How does it do that?
• We want as little difference as possible between the true observations and the fitted line
• Minitab minimizes the sum of the squared vertical distances between the fitted and true observations
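The least-squares idea can be sketched in plain Python using the standard closed-form solution for one predictor. This is a minimal illustration, not Minitab's internal code; the function names and the five data points are made up for the example.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_error(xs, ys, intercept, slope):
    """Sum of squared differences between true and fitted observations."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Made-up illustration data (not from the module).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)  # intercept 2.2, slope 0.6 for this data
best = sum_squared_error(xs, ys, b0, b1)

# Shifting the fitted line away from the least-squares solution raises the sum.
worse = sum_squared_error(xs, ys, b0 + 0.1, b1)
print(best < worse)  # True
```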
Multiple Regression
Use this when you want to consider more than one
predictor variable
The benefit is that you might need more predictors to
create an accurate model
In the case of our Anthony’s Pizza example, we may
want to look at the impact that incorrect orders,
damaged pizzas, and cold pizzas have on wait time
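A multiple regression fit can be sketched with the normal equations in plain Python. Everything here is an assumption for illustration: the two predictors (deliveries and incorrect orders), the data values, and the helper names are made up, and the wait times are generated from an exact equation so the fit can be checked by eye. Minitab's own procedure is not shown in this module.

```python
def solve(a, b):
    """Solve the square system a * coef = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - known) / m[r][r]
    return coef

def fit_multiple(rows, ys):
    """Least-squares coefficients [b0, b1, ..., bk] via the normal equations."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 gives the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)

# Made-up data: (deliveries, incorrect orders) for six hypothetical periods.
rows = [(10, 1), (15, 0), (20, 2), (25, 1), (30, 3), (35, 2)]
# Wait times generated from an exact equation so the answer is known in advance.
ys = [5 + 1.0 * d + 2.0 * e for d, e in rows]

b0, b1, b2 = fit_multiple(rows, ys)
print(round(b0, 6), round(b1, 6), round(b2, 6))  # recovers roughly 5, 1, 2
```

With real data the points will not lie exactly on a plane, so the coefficients become estimates and r² is needed to judge the fit.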
Absentee Rate
1. Open a blank Minitab worksheet and input the data
2. Create a scatter plot and decide whether a straight line is a reasonable model
3. Conduct a regression analysis and get the linear prediction equation
4. Predict the number of absences for employees with 19.5 months of experience
Experience  Absences
18.1        31.5
20.0        33.1
20.8        27.4
21.5        24.5
22.0        27.0
22.4        27.8
22.9        23.3
24.0        24.7
25.4        16.9
27.3        18.1
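For readers without Minitab, steps 3 and 4 can be cross-checked with a short Python sketch that applies the standard least-squares formulas to the ten data pairs in the exercise. The variable names are illustrative, and the result should be treated as a check on the Minitab output rather than a replacement for it.

```python
# Data from the Absentee Rate exercise (Experience in months, Absences).
experience = [18.1, 20.0, 20.8, 21.5, 22.0, 22.4, 22.9, 24.0, 25.4, 27.3]
absences = [31.5, 33.1, 27.4, 24.5, 27.0, 27.8, 23.3, 24.7, 16.9, 18.1]

n = len(experience)
mean_x = sum(experience) / n
mean_y = sum(absences) / n

# Least-squares slope and intercept for one predictor.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(experience, absences))
sxx = sum((x - mean_x) ** 2 for x in experience)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Step 4: 19.5 months lies inside the observed range (18.1 to 27.3).
predicted = intercept + slope * 19.5

print(f"Absences = {intercept:.2f} + ({slope:.3f}) * Experience")
print(f"Predicted absences at 19.5 months: {predicted:.1f}")
```

The slope comes out negative, which matches the downward trend visible in the table: more experience goes with fewer absences.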
Takeaways
Start with a visual tool – create a scatter plot
Calculate the Pearson correlation coefficient, r, to
quantify the strength of the relationship
Remember that correlation does not guarantee
causation!
Create and interpret the Regression Plot
Use the prediction equation
Validate the prediction model’s r-squared using new
data (not part of the data set used in creating the
prediction equation)
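One way to carry out that validation step is to compute r-squared on observations that were held back from the fit. The sketch below assumes a hypothetical prediction equation and made-up new observations; neither comes from this module.

```python
def r_squared(actual, predicted):
    """1 - SSE/SST: share of variation in `actual` explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - sse / sst

def predict(x):
    """Hypothetical prediction equation from an earlier fit: y_hat = 20 + 0.8 * x."""
    return 20 + 0.8 * x

# Made-up NEW observations that were not used to build the equation.
new_x = [12, 18, 24, 30]
new_y = [30.0, 34.0, 40.0, 43.5]

validation_r2 = r_squared(new_y, [predict(x) for x in new_x])
print(f"r-squared on new data: {validation_r2:.3f}")
```

If the r-squared on new data is much lower than the value reported when the equation was built, the model may not generalize beyond the original data set.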