
Lean-Sigma Black Belt

Development program
Week 3 booklet for trainees
Becoming a Green Belt

July 2015 – version A


Process Mapping

2/ Black Belt development program - Week 3


This document and the information it contains are the property of Snecma. They must not be copied or communicated to a third party without the prior written authorisation of Snecma.
WHAT IS A PROCESS MAP?

• Process mapping is a diagnostic tool used to visually illustrate how a product, document or service flows through a process. The map is:
  - A simplified representation of reality
  - A graphical means of representing business activities
  - A layered approach that can be broken down into sub-processes

• It provides the opportunity for:
  - analysis
  - evaluation and quantification
  - simulation and improvement
  - communication

This document and the information it contains belong to Safran. They must not be copied or communicated to any other person without prior written authorisation from Safran.
HOW ARE WE ORGANISED TO DELIVER?

Executive
  - Research / development: functional objectives
  - Operations: functional objectives
  - Sales / marketing: functional objectives
  - HR: functional objectives
  - Finance / admin: functional objectives

Companies are normally structured hierarchically, with each area having its own functional responsibilities.

WHAT THE CUSTOMER SEES

But when a customer looks at a company, he sees process outputs, not functional outputs.

Functions: Research & Development | Operations | Sales / Marketing | HR | Finance / Admin
Process output: delivered service (Q, C, D)

However, we can see that processes operate across functional boundaries.

WHY IDENTIFY PROCESSES?

To satisfy customers and / or improve the business, we need to identify processes.

MAPPING THE CURRENT STATE

Phases: Measuring / Analyzing & Improving
Steps: Share current state -> Define the target -> Define the solution -> Plan actions

TWO NEW TOOLS FOR MAPPING THE PROCESS

• Flow Chart
• Operations / Process Sheet

PROCESS MAPPING SHEET
Date: 1st Apr                                Page 1 of 1
COMPANY: SMMT Industry Forum
PART NUMBER / PROCESS: Machine and Paint blank

No  Process step                  Activity type   Time (s)  Distance (m)
 1  Storage of blanks             Storage           604800
 2  Transport to cell             Transport           1200           250
 3  Turning operation             Operation           1200
 4  Wait for batch of 30          Delay              36000
 5  Transport to furnace          Transport           2000           350
 6  Heat Treatment Cycle          Operation          43200
 7  Transport to vertical borer   Transport           2000           350
 8  Vertical Bore                 Operation           1200
 9  Wait for batch of 30          Delay              36000
10  Transport to miller           Transport            250            50
11  Mill                          Operation            900
12  Wait for batch of 30          Delay              27000
13  Transport to paint            Transport            250            50
14  Paint operation               Operation           1200
15  Wait for batch of 30          Delay              36000
16  Transport to Stores           Transport           1200           250
17  Store for despatch            Storage           604800

Activity type   Total time (s)   Percentage
Operation                47700         3.4%
Inspection                   0         0.0%
Transport                 6900         0.5%
Delay                   135000         9.6%
Storage                1209600        86.4%

Grand total time: 1399200 s
Total distance: 1300 m

SUMMARY GRAPH
Share of throughput time per activity type: Operation 3.4%, Inspection 0.0%, Transport 0.5%, Delay 9.6%, Storage 86.4%

COMMENTS

The part is stored for 86.4% of the throughput time: this is the delay at the stores, both incoming and outgoing.
In the process, the part is delayed for 9.6% of the time due to working in batches of 30.
The part is transported 1300 metres around the plant.
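
The totals and percentages on the process mapping sheet can be recomputed from the individual steps. The following is a minimal Python sketch, not part of the original booklet. The activity type attached to each step is inferred from the step name (an assumption), and the names `STEPS` and `summarise` are illustrative only.

```python
from collections import defaultdict

# (process step, activity type, time in seconds, distance in metres)
# Activity types are inferred from the step names (assumption).
STEPS = [
    ("Storage of blanks", "Storage", 604800, 0),
    ("Transport to cell", "Transport", 1200, 250),
    ("Turning operation", "Operation", 1200, 0),
    ("Wait for batch of 30", "Delay", 36000, 0),
    ("Transport to furnace", "Transport", 2000, 350),
    ("Heat treatment cycle", "Operation", 43200, 0),
    ("Transport to vertical borer", "Transport", 2000, 350),
    ("Vertical bore", "Operation", 1200, 0),
    ("Wait for batch of 30", "Delay", 36000, 0),
    ("Transport to miller", "Transport", 250, 50),
    ("Mill", "Operation", 900, 0),
    ("Wait for batch of 30", "Delay", 27000, 0),
    ("Transport to paint", "Transport", 250, 50),
    ("Paint operation", "Operation", 1200, 0),
    ("Wait for batch of 30", "Delay", 36000, 0),
    ("Transport to stores", "Transport", 1200, 250),
    ("Store for despatch", "Storage", 604800, 0),
]


def summarise(steps):
    """Return time per activity type, grand total, percentages and distance."""
    totals = defaultdict(int)
    distance = 0
    for _name, kind, seconds, metres in steps:
        totals[kind] += seconds
        distance += metres
    grand_total = sum(totals.values())
    percentages = {k: 100.0 * t / grand_total for k, t in totals.items()}
    return dict(totals), grand_total, percentages, distance


if __name__ == "__main__":
    totals, grand_total, percentages, distance = summarise(STEPS)
    for kind in ("Operation", "Inspection", "Transport", "Delay", "Storage"):
        print(f"{kind:<11}{totals.get(kind, 0):>9} s {percentages.get(kind, 0.0):5.1f}%")
    print(f"Grand total: {grand_total} s, distance: {distance} m")
```

Keeping one row per step makes it easy to re-run the summary after a change, for example a smaller batch size that shortens the waiting steps.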

FLOW CHART

• Flow charts are made with the following symbols:
  - Start/End
  - Decision
  - Process
  - Flow

Example (swim lanes: HR Department, Production):
  - Start: Interview requested
  - Decision: Require "pre-interview" chat? (Yes / No)
  - Process: Arrange "pre-interviews"
  - Process: Hold "pre-interview" chat
  - Decision: Proceed further? (Yes / No)
  - Process: Arrange formal interview
  - Process: Hold interview
  - Decision: Want to offer? (Yes / No)
  - Process: Make offer
  - Process: Decline
  - End: Offer/decline made

FLOW CHART
STARTING THE FLOWCHART

1. Create the ‘swim lanes’ for the involved functional areas (here: HR Department, Production).
2. Draw a ‘start’ box at the top (here: Interview requested).
3. Agree the first activity (here: Require “pre-interview” chat?).
4. Draw an ‘end’ terminator box at the bottom (here: Offer/decline made).

FLOW CHART
CONTINUING THE FLOWCHART

Continue adding process steps until the ‘end’ is reached by all paths.

[Figure: the completed interview flow chart across the HR Department and Production swim lanes, from "Interview requested" to "Offer/decline made"]

PROCESS MAPPING SHEET EXAMPLE

The example sheet (SMMT Industry Forum, part number / process: Machine and Paint blank, dated 1st Apr) contains:
• A line for each activity
• Time for each activity
• Distance travelled
• Marks for each activity (Operation, Inspection, Transport, Delay, Storage)
• Totals for time and distance
• Totals for each activity type
• Percentage of each type of activity
• Graph of activity split into value added, non value added and waste
• Summary of findings and focus areas of waste

STANDARD SYMBOLS

• Operation: step where the product, document or service is changed in line with customer requirements.
• Inspection: indicates a check for quality or quantity.
• Transportation: movement of worker, material or equipment.
• Delay: indicates a delay in the process, or an object laid aside until required.
• Storage: accumulation of material held under controlled conditions.

PROCESS MAPPING TIPS

• Keep the focus on the product / document / service flow, not on the people
• Make sure everyone understands what is being mapped
• Watch just one part! When the part is waiting and not being worked on, it is a delay
• One walking step is equivalent to 0.8 m and takes 1 second
• If you can’t get a single time measurement for an activity, use the min & max times
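
The walking-step rule of thumb above can be turned into a small helper for filling in the distance and time columns of a mapping sheet. This is a minimal sketch, assuming the 0.8 m and 1 second per step given in the tips; the function name is illustrative.

```python
METRES_PER_STEP = 0.8   # rule of thumb from the tips above
SECONDS_PER_STEP = 1.0  # rule of thumb from the tips above


def steps_to_distance_and_time(steps):
    """Convert a count of walking steps into (metres, seconds)."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return steps * METRES_PER_STEP, steps * SECONDS_PER_STEP


if __name__ == "__main__":
    metres, seconds = steps_to_distance_and_time(50)
    print(f"50 steps is about {metres:.0f} m and {seconds:.0f} s")
```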

30’

Are you interested in the process of making your morning coffee...

Are there things to do to speed up this process?

Operation mode
• Groups of 3-4
• Draw up the process sheet for the process ‘make the morning coffee’.
• Review = 5’ per group
Utensils / Ingredients

ADDITIONAL CONTENT ON PROCESS MAPPING
PROCESS MAP

• You need to know the “real As Is” to guide your data collection and subsequent analysis
• Do not leap to solutions too quickly in your process mapping. You may begin to ‘fix’ the wrong things

What people THINK it is | What it ACTUALLY is | What it COULD or SHOULD be

WHICH MAP DO WE USE?

• Flow Diagrams
  - Help understand the basics of a process
  - Show inputs & outputs (required information to create value)
  - Allow visibility of customers and suppliers

• Flow Charts
  - Detailed levels to communicate procedures and show tasks
  - Easily understood at all levels of the organisation

• Mapping sheets
  - Well adapted to industrial processes
  - Enable decomposition into standard tasks
  - Show order and hierarchy from task to task
  - Allow analysis of cycle times from a “Lean” perspective

PROCESS MAPPING TIPS & PITFALLS

• Check for previous work on your current process
  - The company’s previous process maps might be available
  - Investigate first & use existing documents as a baseline

• Ask the “doers” for information rather than the managers
  - Capture information from people who carry out the process.
  - Don’t assume managers or more senior people know more.

• Map the real process
  - Regularly check by asking ‘Is that what actually happens?’
  - Verify the process map with people who operate the process
  - ‘Walk through’ the process as it happens.

• Include
  - Enough detail to understand where problems occur
  - But not so much detail that problems can no longer be seen clearly.

• Check the decision points cover all the alternatives

60’

1st part: the Burger Queen company faced commercial difficulties.

You are a Black Belt and you are assigned to help this restaurant…

Operating mode
• 2 groups read the available instructions and map out the processes
• Time = 15' per group
BUILD A TARGET

• Analytic way to build a target
  - Analyse the causes of deficient performance
  - Define SMART objectives based on VA, NVA, PARETO, …

• Empirical way to set objectives
  - Use an internal benchmark and reduce by 50%
  - On the process you have identified, benchmark the different sites’ performances

                Operations   # persons   Lead time
Site 1                  30           8     8 weeks
Site 2                  15           6     6 weeks
Site 3                  18          10     5 weeks
Target (-50%)            7           3     3 weeks
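
One plausible reading of the empirical method is: take the best (lowest) value of each metric across the sites, then halve it. The sketch below works under that assumption. The source does not state the rounding rule, so the code returns unrounded values. The names `SITES` and `empirical_target` are illustrative.

```python
# Benchmark values of the three sites, as given in the table above.
SITES = {
    "Site 1": {"operations": 30, "persons": 8, "lead_time_weeks": 8},
    "Site 2": {"operations": 15, "persons": 6, "lead_time_weeks": 6},
    "Site 3": {"operations": 18, "persons": 10, "lead_time_weeks": 5},
}

REDUCTION = 0.5  # "reduce by 50%"


def empirical_target(sites, reduction=REDUCTION):
    """Halve the best (lowest) observed value of each metric (assumed rule)."""
    metrics = next(iter(sites.values())).keys()
    return {
        metric: min(site[metric] for site in sites.values()) * (1 - reduction)
        for metric in metrics
    }


if __name__ == "__main__":
    for metric, value in empirical_target(SITES).items():
        print(f"{metric}: {value}")
```

The unrounded results (7.5 operations, 3.0 persons, 2.5 weeks) are consistent with the whole-number target row of 7, 3 and 3 weeks.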

ELABORATE SOLUTIONS - SIMPLIFY

• Use the process map to identify value added activities and metrics

• Eliminate or dramatically reduce any activity that does not add value. This is the “war” on waste
  - Avoid over-production, eliminate waiting time, balance the work stations and level the demand, eradicate multiple transportation, remove pointless operations, reduce WIP & movements (persons or files), and avoid rework, which means doing the same things twice!

Low value added process:  NVA | NVA | VA | NVA | VA | NVA  (the NVA steps are crossed out)
High value added process: VA | VA

ELABORATE SOLUTIONS

• Group products into lines to increase the value added proportion
  - For instance, in services, we may gather people to improve communication & information flow, product by product. Everyone is focused on customer care rather than on department objectives.

Expertise & Departments -> Product line

ELABORATE SOLUTIONS

• The Value Chain is in movement

• Eliminate anything that interrupts the flow!

Intermittent flow process -> One piece flow process

• Prefer a one piece flow process
  - In production, split up the fabrication order

ELABORATE SOLUTIONS

• Balance the flow to maximise it
  - The capacity of a process is set by its bottleneck
  - If you notice some waiting time, ask these questions:
      - Is the line balanced?
      - Is it possible to transfer work from one station to another?
  - To eradicate bottlenecks, we need:
      - A balanced line
      - Multi-skilled staff to absorb absences and unforeseen events


ELABORATE SOLUTIONS

• Workstation layout
  - Organise and standardize to address non value added activities
  - Who is the most appropriate person to complete the task?
  - Who is responsible for the task?
  - Does it make sense to have information and material as close as possible?

ELABORATE SOLUTIONS
CREATE VALUE

• Manufacturing process
  - Value added is defined by the customer. We cannot add value to the product beyond the specification

• Office / Engineering
  - You can always create some added value when customer demand is not really explicit
  - If you already satisfy the customer's basic requirements, how can you delight him with a new service?
      - Ex: offer a 24/7 helpdesk service, discounts on products, do better and quicker, …

ELABORATE SOLUTIONS

• Promote parallel tasks
  - When the takt time is too demanding, when resources are unique or when completion of a task doesn’t require completion of the previous task, you may define parallel tasks.
  - Parallel tasks reduce lead time. Example:

Sequential:
  Start computer (10’) -> Launch coffee (5’) -> Drink coffee (5’) -> Read the newspaper (8’)
  Lead time: 28’

Parallel:
  Switch on computer (10’)
  Launch coffee (5’) -> Drink coffee (5’)
  Read newspaper (8’)
  Lead time: 10’
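
The two lead times in the example follow from simple arithmetic: in sequence the durations add up, while in parallel the lead time is that of the longest branch. Below is a minimal sketch of that calculation; the function names and the `BRANCHES` structure are illustrative.

```python
# Task durations in minutes, from the coffee example above.
DURATIONS = {
    "Switch on computer": 10,
    "Launch coffee": 5,
    "Drink coffee": 5,
    "Read newspaper": 8,
}

# Each branch is a chain of tasks done one after another;
# the branches themselves run in parallel.
BRANCHES = [
    ["Switch on computer"],
    ["Launch coffee", "Drink coffee"],
    ["Read newspaper"],
]


def sequential_lead_time(durations):
    """All tasks done one after another: durations add up."""
    return sum(durations.values())


def parallel_lead_time(branches, durations):
    """Branches run side by side: the longest branch sets the lead time."""
    return max(sum(durations[task] for task in branch) for branch in branches)


if __name__ == "__main__":
    print("Sequential:", sequential_lead_time(DURATIONS), "min")
    print("Parallel:", parallel_lead_time(BRANCHES, DURATIONS), "min")
```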

ELABORATE SOLUTIONS

• Split standard cases from complex & specific ones
  - Standard tasks are standardized, well known and predictable
      - You can easily optimize the flow of documentation or parts
  - Complex, specific or random cases cause stoppages, waiting time and chaos, and require specific knowledge
  - Mixing standard and complex cases decreases the efficiency and effectiveness of the organisation => define decision points to choose the appropriate flow

Decision point (?): Standard flow / Specific flow
Specific cases don’t affect the standard flow

REVIEW AND COMPLETION

• The end review meeting with the Steering Committee is imperative
  - Introduce the current process map
  - Present the mapping target
  - Propose the prioritized action plan
  - The Steering Committee must sign off

• We recommend that you schedule a daily review at the end of each day in order to
  - Reorient the project if necessary
  - Prepare the sponsor and the Steering Committee for the proposed developments.

60’

2nd part: the Burger Queen company faced commercial difficulties.

You are a Black Belt and need the best restaurant...

Operation Mode
• 2 groups
• Read the instructions available and look for solutions to your main service problems
• Time = 15’ per group
CONCLUSION

• Process maps allow us to visualize the flow of products or services

"I'm a product that passes through the value stream..."

• Process Steps
  - Start with a macro process
  - Detail what seems important to you
  - Add actual data to reinforce your point of view

• Look for simple improvements which address the sources of waste

Smart decisions
HOW TO USE THE FIGURES AND STATISTICS IN YOUR PROJECTS
20’
You are interested in the punctuality of 4 airlines. You have told your team (4 people) to do a comparative study.

They each took 30 flights at random per airline and calculated the percentage of flights cancelled or delayed by more than 15 minutes.

You have just received their report (next page)...

Operation Mode
• Groups of 3-4
• Analyze the results and state your own conclusions
• Time = 5’ per group
CASE STUDY: PERCENTAGE OF AIRCRAFT WHICH TOOK OFF MORE THAN 15 MINUTES LATE - SAMPLES OF 30 DEPARTURES

[Four bar charts, one per team member, each showing the percentage of late departures for Air France, Aeromexico, American and Lufthansa. Bar values range from 27 to 50 (John), 30 to 43 (Mary), 20 to 43 (Elias) and 33 to 40 (Sarah).]

Conclusions - John: Aeromexico and American are much better than others. Lufthansa is to be avoided.
Conclusions - Mary: American is much better. The others are equivalent.
Conclusions - Elias: Air France and American are much better. Aeromexico is to be avoided.
Conclusions - Sarah: the companies are equivalent.
Solution

The samples of 30 were drawn from 4 groups of 3000 data points that were STRICTLY identical, in which 33% of the flights were late / cancelled.

True percentage of late / cancelled flights: Air France 33, Aeromexico 33, American 33, Lufthansa 33

Lessons learned?
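
The effect can be reproduced with a short simulation: four identical populations of 3000 flights with 33% late or cancelled, and one random sample of 30 flights per airline. This is a minimal sketch, not the tool used to build the case study; the seeds and function names are arbitrary and illustrative.

```python
import random

POPULATION_SIZE = 3000
LATE_RATE = 0.33
SAMPLE_SIZE = 30
AIRLINES = ["Air France", "Aeromexico", "American", "Lufthansa"]


def build_population():
    """One population of flights: True = late or cancelled, False = on time."""
    late = round(POPULATION_SIZE * LATE_RATE)
    return [True] * late + [False] * (POPULATION_SIZE - late)


def sample_late_percentage(population, rng):
    """Percentage of late flights in one random sample of SAMPLE_SIZE flights."""
    sample = rng.sample(population, SAMPLE_SIZE)
    return 100.0 * sum(sample) / SAMPLE_SIZE


def run_study(seed):
    """One analyst's study: a sample per airline, all from identical populations."""
    rng = random.Random(seed)
    population = build_population()
    return {airline: sample_late_percentage(population, rng) for airline in AIRLINES}


if __name__ == "__main__":
    for analyst, seed in [("John", 1), ("Mary", 2), ("Elias", 3), ("Sarah", 4)]:
        results = run_study(seed)
        print(analyst, {a: f"{p:.0f}%" for a, p in results.items()})
```

Each run prints a different ranking of the airlines even though the underlying rate is the same for all of them, which is the point of the exercise.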

SAMPLING & CONFIDENCE INTERVAL
BASICS ON ESTIMATION
CONFIDENCE INTERVAL COMPUTATION
SAMPLING SIZE
SAMPLING STRATEGY
Green Belt reminder
VOCABULARY

• In order to study the weight of the population, data is recorded for all company employees

Variable: WEIGHT
Population: THE WORLD
Item / Individual: A PERSON
Sample: THIS CLASS

Green Belt reminder
VOCABULARY DEFINITION

• Variable
  - Characteristic of the individuals/items.
  - Ex: Salaries of employees, diameter of a part, age of a model

• Item/Individual
  - A single observation of the population under study.
  - Ex: An employee, a part, a model.

• Sample
  - A collection of observations / individuals
  - Ex: 100 employees, 1000 parts, 10 models.

• Population
  - The totality of items/individuals under consideration and from which a sample has been taken.
  - Ex: All employees, all parts built, all models sold.

Green Belt reminder
VOCABULARY POPULATION

• Descriptive Statistics is the field of statistics that defines or characterizes a population based upon the data points (i.e. values) taken from that population

• A Population is made up of all the values fitting a particular description taken from a product or process

• Rule: The letter N is used to describe the number of values in a population (the population size). The Greek letters μ and σ represent respectively the mean and the standard deviation of the entire population.

Green Belt reminder
VOCABULARY SAMPLE

• A Sample is a subset of data taken from a population.

• Statistics (or sample statistics) are terms used to describe the key characteristics of a sample.
  - We usually measure Sample Statistics in order to learn something about Population Parameters.
  - Inferential Statistics is the field of statistics that draws conclusions about a population based upon our analysis of the sample data.

• Rule: The letter n is used to describe the number of values in a sample (the sample size). The symbol x̄ (x-bar) and the letter s represent respectively the mean and the standard deviation of the sample.

Population: N, μ, σ
Sample: n, x̄, s

VOCABULARY AND NOTATIONS FOR SAMPLE / POPULATION

POPULATION

Average: μ
Standard dev.: σ
Size: N

SAMPLE

Average: X̄
Standard dev.: s
Size: n

Green Belt reminder
SAMPLING

• Benefits
  - Saves time & money
  - Allows for more meaningful data
  - Simplifies measurement over time

• Drawbacks
  - Accept a degree of uncertainty as the cost of not measuring the whole population

Green Belt reminder
PRECAUTIONS WHEN SAMPLING

• Respect random occurrences (no sorting when collecting)

• The sample size depends on the aim:
  - Make sure that the goal is reached
  - Discover whether the performance is far from the specifications

• Choose measuring equipment based upon the precision required

KNOWN SAMPLE SIZE

NON representative sample NON representative sample Representative Sample

Green Belt reminder
VOCABULARY: DATA TYPES

Continuous Data
• When the variable being measured is expressed on a continuous scale (there is an infinity of values between two fixed values), the distribution of data is called “continuous”
• Example: measurement results for 105 N.m ±15%: 106, 107, 106.7, 106.8, 106.75, 106.753

Discrete Data
• When the parameter being measured can only take on certain qualitative values such as “pass/fail,” or categorized values such as “red/yellow/green,” the distribution of data is called “discrete”
• Examples: customer without written complaint / letter of complaint sent; traffic lights

CONTINUOUS OR DISCRETE DATA?

• Number of rejected products in today's production?
• Errors in an invoice?
• Oven temperature
• # of employees absent
• # of IT tickets
• Length of a rubber band
• On time delivery

CONTINUOUS VERSUS DISCRETE

• Why is it important?
  - Discrete data is generally faster and easier to capture
  - The sample size of discrete data has to be considerably higher compared to continuous data.
  - With continuous data you can perform more analysis.
  - If you have a choice and you can afford the time and resources, you will want to collect continuous data whenever possible, e.g. lateness of a delivery:
      - if you measure how late the delivery is, this is continuous data
      - if you only record whether the delivery exceeded the due date, this is discrete data
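
The delivery example can be made concrete with a few invented numbers. The sketch below uses hypothetical lateness values (not data from the booklet) to show how the same deliveries look as continuous data and as discrete data; all names are illustrative.

```python
# Hypothetical delivery lateness in days (negative = early); invented for illustration.
LATENESS_DAYS = [-1.5, 0.0, 0.25, 2.0, -0.5, 3.75]


def mean_lateness(lateness):
    """Continuous view: how late deliveries are on average."""
    return sum(lateness) / len(lateness)


def late_flags(lateness):
    """Discrete view: only whether each delivery exceeded the due date."""
    return [days > 0 for days in lateness]


def late_proportion(lateness):
    """Share of deliveries that exceeded the due date."""
    flags = late_flags(lateness)
    return sum(flags) / len(flags)


if __name__ == "__main__":
    print("Mean lateness (days):", round(mean_lateness(LATENESS_DAYS), 2))
    print("Late flags:", late_flags(LATENESS_DAYS))
    print("Proportion late:", late_proportion(LATENESS_DAYS))
```

The discrete view keeps one yes/no answer per delivery, so a delivery 0.25 days late and one 3.75 days late count the same. This loss of information is one reason discrete data needs larger samples.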

                 SAMPLE   POPULATION
Average          X̄        μ
Standard dev.    S        σ
Proportion       p̂        p

ARE YOU FLUENT IN STATISTICIAN LANGUAGE?

Why use the concepts of 95% confidence and 5% risk?

CONFIDENCE INTERVAL

• Fact of the day
  - I bought 15 baguettes today. What is the 95% confidence interval for the mean weight of all the baguettes sold by the baker?

• Data = weight of 15 baguettes
  - 249.4 g, 246.9 g, 247.4 g, 249.9 g, 248.3 g, 252.3 g, 247.8 g, 247.2 g, 249.1 g, 251.8 g, 249.3 g, 249.7 g, 248.2 g, 250.6 g, 249.8 g

• Result = we are 95% confident that the mean baguette weight is between 248.3 g and 250.1 g
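
The stated interval can be checked by hand from the 15 listed weights, using the sample mean, the sample standard deviation and Student's t distribution. The sketch below uses only the Python standard library. The critical value 2.1448 (two-sided 95%, 14 degrees of freedom) is hard-coded from a standard t table, and the function name is illustrative.

```python
import math
import statistics

# The 15 baguette weights listed above, in grams.
WEIGHTS_G = [
    249.4, 246.9, 247.4, 249.9, 248.3,
    252.3, 247.8, 247.2, 249.1, 251.8,
    249.3, 249.7, 248.2, 250.6, 249.8,
]

# Two-sided 95% critical value of Student's t for n - 1 = 14 degrees of freedom,
# taken from a standard t table.
T_CRITICAL_95_DF14 = 2.1448


def confidence_interval_mean(data, t_critical):
    """Return (lower, upper) bounds of the confidence interval on the mean."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)           # sample standard deviation (n - 1)
    standard_error = s / math.sqrt(n)    # SE mean = s / sqrt(n)
    margin = t_critical * standard_error
    return mean - margin, mean + margin


if __name__ == "__main__":
    lower, upper = confidence_interval_mean(WEIGHTS_G, T_CRITICAL_95_DF14)
    print(f"95% CI for the mean: {lower:.1f} g to {upper:.1f} g")
```

For a different sample size the critical value changes, so it would have to be looked up again for the new number of degrees of freedom.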

CONFIDENCE INTERVAL AND STANDARD DEVIATION

• When the variation within the sample is small, the confidence in the calculated mean is high and the confidence interval is small.

• When the variation within the sample is high, the confidence in the calculated mean is low and the confidence interval is large.

[Figure: calculated mean and confidence interval for the mean, for a low-variation and a high-variation sample]

The Confidence Interval also depends on the standard deviation!

CONFIDENCE INTERVAL AND STANDARD DEVIATION

• Let's accept the following:

• When we calculate a mean from a sample of n data points, there is a degree of error in the calculation. This error, called the « Standard Error of the mean » (SE mean), is calculated by

$$ SE_{mean} = \frac{s}{\sqrt{n}} $$

• Confidence interval on the mean: it is centred on $\bar{X}$ and runs from

$$ \bar{X} - t_{n-1;\,(1-\alpha)/2} \cdot \frac{s}{\sqrt{n}} \quad \text{to} \quad \bar{X} + t_{n-1;\,(1-\alpha)/2} \cdot \frac{s}{\sqrt{n}} $$

• For a given s, the higher the number of data points, the lower the error on the mean: low n = big error, large n = small error

• The confidence interval on the mean decreases as n increases and as s decreases
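
The effect of n on the standard error can be illustrated with a few lines of Python. This is only a sketch: the standard deviation of 10 and the sample sizes are hypothetical values chosen for the illustration, and `standard_error` is an illustrative helper name.

```python
import math


def standard_error(s: float, n: int) -> float:
    """Standard error of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)


s = 10.0  # hypothetical sample standard deviation
for n in (4, 25, 100, 400):
    # Quadrupling n only halves the standard error
    print(f"n = {n:3d}  ->  SE mean = {standard_error(s, n):.2f}")
```

Because the error shrinks with the square root of n, dividing the error by 2 requires 4 times as many data points.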

CONFIDENCE INTERVAL
A PRACTICAL EXAMPLE

• Let’s continue with our example: BB_Exercises_Eng.xls / CI
  o Column A contains the amounts of the last 1000 orders.
  o Columns B and C are random extracts from the whole population.

• Using Stat/Basic Statistics/Graphical summary
  o Check the confidence interval for the mean of each sample.

• What are your conclusions?

• Other exercise: BB_Exercises_Eng.xls / Baguettes

CONFIDENCE INTERVAL
CONCLUSION

• A larger sample size decreases the error on the mean

• A lower standard deviation decreases the error on the mean

• A similar approach allows us to calculate the confidence interval on the median and on the standard deviation, as shown below:

Summary for 30sample1 (Minitab graphical summary)

Anderson-Darling Normality Test
  A-Squared       0,28
  P-Value         0,618

  Mean            317,92
  StDev           10,54
  Variance        111,13
  Skewness        0,153662
  Kurtosis        -0,132055
  N               30

  Minimum         297,92
  1st Quartile    308,41
  Median          318,07
  3rd Quartile    325,63
  Maximum         343,97

  95% Confidence Interval for Mean      313,98   321,85   (CI on mean)
  95% Confidence Interval for Median    313,67   322,49   (CI on median)
  95% Confidence Interval for StDev     8,40     14,17    (CI on standard deviation)

[Figure: histogram of 30sample1 (axis 300 to 340) and plot of the 95% confidence intervals for the mean and the median (axis 314 to 322)]
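
The confidence interval on the mean in this summary can be approximately reproduced from the mean, the standard deviation and N alone. The sketch below assumes a two-sided 95% Student t quantile of 2.045 for 29 degrees of freedom (read from a t-table); the summary uses decimal commas, written here as decimal points.

```python
import math

# Values read from the Minitab summary (decimal commas written as points)
mean, stdev, n = 317.92, 10.54, 30

# Assumption: two-sided 95% Student t quantile for n - 1 = 29 degrees of
# freedom, taken from a t-table.
T_95_DF29 = 2.045

half_width = T_95_DF29 * stdev / math.sqrt(n)
ci_low, ci_high = mean - half_width, mean + half_width
print(f"95% CI for the mean: {ci_low:.2f} to {ci_high:.2f}")
```

The result should land within rounding distance of the interval reported by Minitab, since the inputs above are themselves rounded.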

20’

Resume using John’s data and integrate the concept of confidence in your decision...

Operational Mode
• Groups of 2-3
• Use Minitab to calculate the confidence intervals of the percentages given by John
• Review = 15’ each
4 CASE STUDY: PERCENTAGE OF AIRCRAFT WHICH TOOK OFF MORE THAN 15 MINUTES LATE - SAMPLES OF 30 DEPARTURES

                                   Air France   Aeromexico   American   Lufthansa
# late                             13           8            8          15
n                                  30           30           30         30
% late                             43%          27%          27%        50%
Lower confidence interval bound    …            …            …          …
Upper confidence interval bound    …            …            …          …

[Bar chart: % of late departures per airline (Air France 43, Aeromexico 27, American 27, Lufthansa 50)]

John's conclusions: Aeromexico and American are much better than the others. Lufthansa is to be avoided.

USING MINITAB TO CALCULATE A PROPORTION CONFIDENCE INTERVAL

Personal reading

CALCULATING SAMPLE SIZE

• When you determine a characteristic from a sample, you can be sure that you will not find the exact value of this characteristic, but “only” an estimate.

• TWO important questions:
  o How accurate should this estimate be?
  o How confident do you want to be in the estimate (usually 95%)?

• We will say that we find the characteristic within a certain precision d, for instance 10 min
  o We want to find the average production time within +/- 10 min
  o From the sample we find an average production time of 100 min.
  o From this we draw the following conclusion:
      o We do not know the real average production time,
      o But we are 95% sure that the real average is somewhere between 90 and 110 min (found average +/- d)

Personal reading

NOTION OF CONFIDENCE CASE STUDY

• Each month, the Purchasing Department issues, on average, 5000 orders. Some vendors point to the large number of errors in these orders to justify their delivery delays. I want to estimate the percentage of incorrect orders.

• Out of 10 orders, 2 have at least one error. What is the actual percentage of erroneous orders?

• Out of 100 orders, 20 have at least one error. What is the actual percentage of erroneous orders?

• Out of 1000 orders, 200 have at least one error. What is the actual percentage of erroneous orders?

Personal reading
CALCULATING SAMPLE SIZE
DISCRETE DATA

CONTINUOUS DATA

$$ n = \left(\frac{2s}{d}\right)^{2} $$

DISCRETE DATA

$$ n = \left(\frac{2}{d}\right)^{2} p\,(1-p) $$

• n = sample size
• p = proportion
• s = standard deviation
• d = precision
• 95% confidence

Personal reading

CALCULATING SAMPLE SIZE

• So, there are 4 important factors

• Continuous or discrete data

• How accurately do you want to estimate the characteristic?
  o Precision is inversely proportional to the square root of the sample size
  o The more accurate the estimate, the larger the sample size needed

• The standard deviation / the proportion

• How confident do you want to be? (The formulas on the previous page give 95% confidence intervals.)

Personal reading

CALCULATING SAMPLE SIZE

• To calculate the sample size
  o With continuous data, we need the standard deviation
  o With discrete data, we need the proportion

• But we haven’t sampled yet!

• How do we resolve this problem?
  o With continuous data
      o Educated guess / historical data
      o Look for extreme values: the range between the lowest and the highest value is expected to be about 6 times the standard deviation
      o Estimate the standard deviation by taking a small sample
  o With discrete data
      o Same as for continuous data
      o In addition, we may use the worst case for p (largest sample size) → p = 0,5

Personal reading
CALCULATING SAMPLE SIZE
IF WE KNOW THE POPULATION SIZE

• Assumptions in the sample size calculations
  o The formula for proportions is only valid if 0,01 < p < 0,5
  o The formula assumes the sample size n < 0.05 x N (population)
  o If you are sampling more than 5% of the population, the formula can be adjusted as follows:

$$ n_{adjusted} = \frac{n}{1 + \dfrac{n}{N}} $$

  o The formulas are only valid when sampling from a population
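
A small sketch of the adjustment, assuming a hypothetical unadjusted sample size of 400 and a population of 3500 units; the function name and the rounding up are illustrative choices, not part of the original slide.

```python
import math


def adjust_for_population(n: float, population: int) -> int:
    """Finite-population adjustment: n_adjusted = n / (1 + n / N), rounded up."""
    return math.ceil(n / (1 + n / population))


n_unadjusted = 400     # hypothetical result of the sample size formula
population = 3500      # hypothetical population size N

if n_unadjusted > 0.05 * population:
    # More than 5% of the population would be sampled: apply the adjustment
    print(adjust_for_population(n_unadjusted, population))
else:
    print(n_unadjusted)
```

The adjusted value is always smaller than the unadjusted one, and the two converge when N is much larger than n.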

Personal reading

CALCULATING SAMPLE SIZE

• Example
  o We have a box of coloured beads, and we want to know the proportion of white beads in this box.
  o Henri takes a sample of 25 and finds 2 white beads. Conclusion?
  o John takes a second sample of 25 and finds 5 white beads. Conclusion?

• Questions
  o Which one is more accurate: Henri (8% white beads) or John (20% white beads)?
  o Can the sample size formula help us draw better conclusions?

Personal reading

CALCULATING SAMPLE SIZE

• Let’s apply the formula for the confidence interval
  o n = 25 and p = proportion found in the sample

Henri:
$$ d = 2\sqrt{\frac{p(1-p)}{n}} = 2\sqrt{\frac{0{,}08\,(1-0{,}08)}{25}} = 2\sqrt{\frac{0{,}0736}{25}} \approx 11\% $$

John:
$$ d = 2\sqrt{\frac{p(1-p)}{n}} = 2\sqrt{\frac{0{,}2\,(1-0{,}2)}{25}} = 2\sqrt{\frac{0{,}16}{25}} = 16\% $$

• Conclusions:
  o Henri would say he is 95% sure that the real proportion is somewhere between 0% and 19%
  o John would say he is 95% sure that the real proportion is somewhere between 4% and 36%

• What would d be for samples of 50 or 100 beads?
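
The same calculation can be sketched in Python. The helper name `margin_95` is illustrative; the formula is the approximate 95% margin d = 2·sqrt(p(1-p)/n) used on this slide, and the intervals are clipped to the 0% to 100% range as on the slide.

```python
import math


def margin_95(p: float, n: int) -> float:
    """Approximate 95% margin on a proportion: d = 2 * sqrt(p (1 - p) / n)."""
    return 2 * math.sqrt(p * (1 - p) / n)


# Henri and John each sampled 25 beads
for name, white, n in (("Henri", 2, 25), ("John", 5, 25)):
    p = white / n
    d = margin_95(p, n)
    low, high = max(0.0, p - d), min(1.0, p + d)
    print(f"{name}: p = {p:.0%}, d = {d:.0%}, interval {low:.0%} to {high:.0%}")

# Worst case p = 50% for larger samples
for n in (25, 50, 100):
    print(f"n = {n}: d max = {margin_95(0.5, n):.0%}")
```

The two intervals overlap widely, which suggests that samples of 25 beads are too small to tell Henri's and John's results apart.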

Personal reading

CALCULATING SAMPLE SIZE

• If n = 50 and p = 50% → d = 14%
  o The confidence interval on the proportion is + or - 14%
• If n = 100 and p = 50% → d = 10%
  o The confidence interval on the proportion is + or - 10%
• Now, we sample 100 beads and find 13 white
  o dmax is 10% (worst case, p = 50%)
  o But we have a better estimate for p (13%), so we can use the formula with the observed value of p:

$$ d = 2\sqrt{\frac{p(1-p)}{n}} = 2\sqrt{\frac{0{,}13\,(1-0{,}13)}{100}} = 2\sqrt{\frac{0{,}113}{100}} = 0{,}067 \approx 7\% $$

• So we can say that we are 95% sure the real proportion of white beads is between 6% and 20%.

• If this is not precise enough, increase the sample size.

Personal reading
COMPUTE THE SAMPLE SIZE
USING THE EXCEL SPREADSHEET

 Case 1 : Description
 You want to know the average length for the last production of wiper blades.
This last production run contained 3500 units.
 The standard deviation in the length is about 13 mm.

 Question ?
 How many units should be sampled?
 This is not a destructive test, the only cost involved is the extra operator.

Personal reading
COMPUTE SAMPLE SIZE
USING THE EXCEL SPREADSHEET

 Case 2 - Description
 You want to know the proportion of wiper blades that will fail the end
inspection.
 You guess about 2% of the wiper blades will fail this test.
 There is an average of 3500 units produced per shift.

 Question ?
 How many units should be sampled?

POPULATION OR PROCESS
DISCUSSION

• What is the difference between sampling water from a standing pond and from a running stream?

 Population:
 Static situation: you can make a photograph
 The boundaries are defined
 Sampling done to either quantify a characteristic
or to compare two (or more) groups
 e.g.: Is there a change in the active ingredients
in a drug after 1 year? Record the age of all goods in stock.

 Process:
 There is a time element
 Important to show the dynamics
 Characteristics might change from one moment
to the other
 Sampling done to understand the process and track improvements
 E.g.: record call volume every 15 min for a help desk.

POPULATION VERSUS PROCESS

• The sampling formulas are only applicable when sampling from a population.

• In 6 sigma projects, we will often sample from processes.

• What if we sample from a process?
  o Check whether the process is stable (we will discuss this later)
  o If the process is stable, you can use the formulas as a conservative estimate

SAMPLING STRATEGIES

 Random sampling
 all elements have an equal chance of being selected

 Systematic sampling
 shows time element in a process

 Judgment sampling
 based on prior knowledge of the population

 When do we use which sampling strategy?

SAMPLING STRATEGIES
RANDOM SAMPLING

 Random sampling is the preferred strategy when sampling from a population.

 Each unit has the same chance of being selected

 To have a true random sample, use a random generator.


Population Sample

SAMPLING STRATEGIES
RANDOM SAMPLING

 Description
 Your company wants to know invoice payment lead time
 Your database contains 1 345 customers
 You cannot measure lead time for all your customers. You need to choose a
set of 200.

 How are we going to organize 200 samples?
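
One possible way to organise the draw is to number the customers and let a random number generator pick 200 of them. The sketch below assumes the customers carry IDs 1 to 1345 (a hypothetical numbering) and uses a fixed seed only so that the draw can be reproduced.

```python
import random

# Hypothetical customer IDs: 1 .. 1345
customer_ids = list(range(1, 1346))

# Fixed seed so the same sample can be drawn again for an audit
rng = random.Random(2015)

# Simple random sample without replacement: every customer has the same
# chance of being selected
sample_ids = sorted(rng.sample(customer_ids, k=200))
print(len(sample_ids), sample_ids[:10])
```

In Excel or Minitab, the equivalent is to generate a random number per customer row and keep the 200 rows with the smallest values.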

SAMPLING STRATEGY
SYSTEMATIC SAMPLING

• Systematic sampling is the sampling strategy for processes

• The time order in the process is preserved
  o e.g.: sample every 100th part produced

• It is also used for population sampling, since it is sometimes easier to organise than true random sampling
  o e.g.: ask every 15th person entering a store how pleased he or she is with the service.
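
Systematic sampling is simple to express in code. In this sketch the list of 1000 part numbers is hypothetical and `systematic_sample` is an illustrative helper; the point is that the selected items keep their production order.

```python
def systematic_sample(items, step, start=0):
    """Keep every `step`-th item, preserving the original (time) order."""
    return items[start::step]


# Hypothetical part numbers 1 .. 1000, in production order
parts = list(range(1, 1001))

# Every 100th part produced: the 100th, 200th, ..., 1000th
sampled = systematic_sample(parts, step=100, start=99)
print(sampled)
```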

SAMPLING STRATEGY
SYSTEMATIC SAMPLING: SUBGROUPS

 Especially used in high volume processes.

SAMPLING STRATEGY
JUDGMENT SAMPLING

• When random sampling is used, the values of X that are frequently observed will occur more often in the sample.
  o The low and high values (which are rather rare) tend not to show up in the sample.

• Judgment sampling is the preferred sampling method for regression analysis (or scatter plots)
  o It ensures that you have low values and high values of X (= wide range) so that you can investigate relations

• It ensures that you cover the whole data range of X

CONCLUSION

• Most of our field data are collected by samples

• When comparing averages or proportions, a non-negligible part of the measured differences is attributable to sampling fluctuations

• Before rushing to a conclusion, calculate the confidence intervals of your averages or proportions

Hypothesis testing
PROVE THE ROOT CAUSES AND
IMPROVEMENTS
Flights cancelled or delayed more than 15 minutes (%)

[Bar chart: Air France 43, Aero Mexico 27, American 27, Lufthansa 50]

What tools can I use to know if American and Aero Mexico are actually more punctual than Lufthansa and Air France?

Initial findings from John: Aeromexico and American are much better than the others. Lufthansa is to be avoided.

WHY TEST HYPOTHESES?

• Graphical analysis allows us to:
  o Reduce the number of factors to investigate
  o « Confirm » some root causes: « we can see it » and not only « we think it is an important root cause »
  o Focus on the areas to brainstorm even before starting the analysis (for example, stratification).

• Graphical analysis may be misleading
  o We may take the wrong decision after reading a graph
  o A graph may not be explicit enough for people to make the right decision

• Statistical analysis and hypothesis testing allow us to confirm or reject the intuition we got from the graphics.

IDENTIFYING ROOT CAUSES

Graphics and statistics to use, depending on the type of Y (of your project) and of X (factors / causes to test):

X discrete / attributes, Y continuous
• Graphics
• Statistics
  o Tests for the mean: t test (1 or 2 samples), paired t test, ANOVA
  o Tests for the standard deviation: test for 1 or 2 standard deviations, variance test

X discrete / attributes, Y discrete / attributes
• Graphics: Pareto, proportions
• Statistics: proportion tests, Chi² test

X continuous, Y continuous
• Graphics
• Statistics: correlation test, regression analysis

X continuous, Y discrete / attributes
• Statistics: logistic regression
DEMONSTRATE AN IMPROVEMENT

[Bar chart: parts delivered per month (quantity), January to October, split into a "Before" and an "After" period; monthly values range from 22 to 41]

Test to be used
• 2-Sample t test (before / after)
• Test for equal variances / standard deviation

SUMMARY OF STATISTICAL TESTS

Decision risks

Case study

Y continuous
• Influence on the average
  o t test for 2 samples
  o ANOVA
• Influence on variability
  o Test for 2 or more standard deviations / variances
• Correlation between 2 continuous variables
  o Linear regression

Y discrete
• Influence on the proportions
  o Chi² test

DECISION RISKS

• To better understand the risks in hypothesis testing, let’s look at a trial where a person is judged for murder. In this case, the hypotheses to be tested are:
  o H0: Defendant is innocent vs H1: Defendant is guilty

• Whatever the jury’s conclusion, there is a risk of making the wrong decision…
  o Concluding « H0 » does not mean that the defendant is certainly innocent.
  o Concluding « H1 » does not prove that he is guilty…

• The verdict has a probability of error, and we can define 2 types of error:
  o 1st type of error: the jury concludes “he is guilty” but in fact “he is innocent”.
  o 2nd type of error: the jury concludes “he is innocent” but in fact “he is guilty”.

• Those errors lead to 2 types of risk for society:
  o keeping an innocent person in jail
  o releasing a murderer into the population

RISK DEFINITION

• Those 2 risks have specific definitions:
  o Type 1 (or α risk) = risk of “rejecting a hypothesis” when the “hypothesis is true”.
  o Type 2 (or β risk) = risk of “accepting a hypothesis” when the “hypothesis is false”.

DECISION ON HYPOTHESIS versus CORRECTNESS OF HYPOTHESIS

                      Hypothesis is true                  Hypothesis is false
Accept hypothesis     Correct judgment                    Error of the second kind (β risk)
Reject hypothesis     Error of the first kind (α risk)    Correct judgment

20’

Make the link between the Alpha/Beta risks and the case study on the punctuality of the companies

[Bar chart: flights cancelled or delayed more than 15 minutes (%): American 27, Lufthansa 50]

How:
• Groups of 2
• Read the proposals and connect those that seem to match
• Review: 10 minutes

Exercise

Hypothesis H0: "No difference between companies."

Proposals:
A. Risk of deciding that the two companies are different when, in reality, the two companies have the same performance
B. Probability of detecting a difference between companies when they actually are different

Symbols to match: α, 1 - α, β, 1 - β

PRINCIPLES OF A STATISTICAL TEST

Inputs
Prerequisites:
• Define the issue to be treated (root causes, proof of an improvement)
• Have data available and of good quality
Equipment:
• Minitab or other equivalent statistical processing software

Process: a test is carried out in 5 steps
1 Make a chart to visualize your data
2 Set H0
3 Choose the appropriate test
4 Run the test and compare the p-value to the 5% threshold
    P-value < 5% → Reject H0
    P-value > 5% → Accept H0
5 Conclude in practical (non-statistical) terms

Outputs: decisions (choose one)
• The data confirms our analysis / improvement
• The data does not provide sufficient evidence to confirm our analysis / improvement

1 GRAPHICS FOR THE ROOT CAUSES

Graphic to use, depending on the type of Y (of your project) and of X (factors / causes to test):

X discrete / attributes, Y continuous
• Graphic = Box plot

X discrete / attributes, Y discrete / attributes
• Graphics: Pareto, proportions

X continuous, Y continuous
• Graphic = Scatter plot

1 CHARTS TO PROVE AN IMPROVEMENT

Option 1 - Parts delivered per month (quantity)
[Bar chart: January to October, "Before" and "After" periods, monthly values between 22 and 41]

Option 2 - Parts delivered per month (quantity)
[Chart: same data, January to October, "Before" and "After" periods, vertical axis from 0 to 45]

2 3 DEFINE HYPOTHESIS H0 DEPENDING ON THE PROBLEM TO SOLVE

X DISCRETE AND Y CONTINUOUS

Problem: Does my cause X (discrete) have an impact on the average of Y (continuous)?

Case 1: causes at 2 levels
• Examples: 2 suppliers (X) vs diameter (Y); 2 Guest (X) vs Sustainability parts (Y)
• Hypothesis H0: no influence of X on Y → average diameter of supplier 1 = average diameter of supplier 2
• Test to use: 2-sample t test
• Decision if P-value < 5%: X is important → there is a difference in the mean of Y between the 2 possible levels of my factor

Case 2: causes at 3+ levels
• Examples: 3 machines (X) vs dimension (Y); 5 days of the week (X) vs parts delivery time (Y)
• Hypothesis H0: no influence of X on Y → the average delivery time is identical between Monday, Tuesday, ...
• Test to use: ANOVA
• Decision if P-value < 5%: X is important → at least 1 of the X levels has an average different from the others

Problem: Does my cause X (discrete) have an impact on the variability of Y?
• Examples: 2 suppliers (X) vs diameter (Y); 2 Guest (X) vs Sustainability parts (Y); 3 machines (X) vs dimension (Y); 5 days of the week (X) vs parts delivery time (Y)
• Hypothesis H0: no influence of X on the variability of Y → the variability of the delivery time is identical between Monday, Tuesday, ...
• Test to use: standard deviation test (or test of variance)
• Decision if P-value < 5%: X is important → at least 1 of the X levels has a variability different from the others

2 3 DEFINE HYPOTHESIS H0 DEPENDING ON THE PROBLEM TO SOLVE

X AND Y CONTINUOUS

Problem: Does my cause X (continuous) have an impact on Y (continuous)?
• Examples:
  o Bath temperature (X) vs deformation of the component (Y)
  o Age of the employee (X) vs travel time (Y)
  o Dimension of the workpiece (X) vs motor consumption (Y)
  o Number of rooms to repair (X) vs global repair time (Y)
  o Seniority in the position (X) vs derogations processing time (Y)
• Hypothesis H0: no influence of X on Y → the slope of the regression line is 0
• Test to use: regression test
• Decision if P-value < 5%: X is important. Changes in factor X affect Y

X AND Y DISCRETE

Problem: Does my cause X (discrete) have an impact on Y (discrete)?
• Examples:
  o Team (X) vs quality of parts (Y)
  o Machines (X) vs type of defects (Y)
  o Weekday (X) vs type of part produced (Y)
  o Brand of IT equipment (X) vs socio-professional categories (Y)
• Hypothesis H0: no influence of X on Y → X and Y are independent → the proportions of defects are the same in both teams
• Test to use: Chi² test
• Decision if P-value < 5%: X is important. Y is dependent on X. One of the teams has a higher proportion of defects.

2 3 DEFINE HYPOTHESIS H0 DEPENDING ON THE PROBLEM TO SOLVE
PROVE AN IMPROVEMENT OR A DIFFERENCE BEFORE AND AFTER

Problem: Is the average better BEFORE or AFTER?

Case 1: using the same units or individuals
• Examples:
  o Software update time (Y) BEFORE modification and AFTER modification, taking the measurement on the same physical units
  o Dimensions (Y) BEFORE operation and AFTER operation on the same physical parts
• Hypothesis H0: the difference in Y found on each unit is 0 → average of Delta = 0
• Test to use: paired t test
• Decision if P-value < 5%: improvement is confirmed → there is a difference between before and after

Case 2: using different units, parts or components for the 2 measurement series
• Examples:
  o Quantities of parts produced per day BEFORE and AFTER improvement
  o Flow time BEFORE and AFTER improvement
  o Physical size measured on parts BEFORE and AFTER retrofit of the machine
• Hypothesis H0: no difference between BEFORE and AFTER → the average flow time is identical BEFORE and AFTER
• Test to use: 2-sample t test
• Decision if P-value < 5%: improvement is confirmed → the average of all units produced BEFORE is different from AFTER

Problem: Is the variability better AFTER than BEFORE?
• Examples:
  o Dimensions of produced parts BEFORE and AFTER improvement
  o Flow time BEFORE and AFTER improvement
• Hypothesis H0: no difference between BEFORE and AFTER → the variability of the geometric dimensions is identical BEFORE and AFTER
• Test to use: standard deviation test (variance test)
• Decision if P-value < 5%: improvement is confirmed → the variability of all units produced BEFORE is different from AFTER

4 RUN THE TEST AND COMPARE THE P-VALUE TO THE 5% THRESHOLD
USE THE MINITAB WIZARD FOR HYPOTHESIS TESTING

EXAMPLE: UNDERSTANDING AIRLINE DELAYS

1 Graph = flights cancelled or delayed by more than 15' (%)
[Bar chart: Air France 43, Aero Mexico 27, American 27, Lufthansa 50]

2 H0 = no influence → 'the proportions are equivalent across the 4 companies'.

3 Test → comparison of 4 proportions (see Minitab Wizard)

5 Conclusion in non-statistical terms
• With regard to the data collected, we do not have enough evidence to detect a difference in punctuality among the airlines. They are considered to be equivalent.
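
The comparison of the 4 proportions can be sketched outside Minitab with a chi-square test of homogeneity, using the counts from the case study (13, 8, 8 and 15 late departures out of 30). This is an illustrative sketch, not necessarily the exact procedure run by the Minitab Wizard; the helper `chi2_sf_df3` is a hypothetical name for the closed-form upper tail of a chi-square distribution with 3 degrees of freedom, which keeps the example within the Python standard library.

```python
import math

# Late departures out of 30 per airline (from the case study)
late = {"Air France": 13, "Aero Mexico": 8, "American": 8, "Lufthansa": 15}
n_per_airline = 30

# Under H0 all airlines share the same (pooled) proportion of late flights
p_late = sum(late.values()) / (n_per_airline * len(late))
exp_late = n_per_airline * p_late
exp_on_time = n_per_airline * (1 - p_late)

# Chi-square statistic over the 4 x 2 table (late / on time)
chi2 = 0.0
for count in late.values():
    chi2 += (count - exp_late) ** 2 / exp_late
    chi2 += ((n_per_airline - count) - exp_on_time) ** 2 / exp_on_time

df = len(late) - 1  # (4 rows - 1) x (2 columns - 1)


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability of a chi-square with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_df3(chi2)
print(f"chi2 = {chi2:.2f}, df = {df}, p-value = {p_value:.3f}")
print("Reject H0" if p_value < 0.05 else "H0 cannot be rejected at the 5% threshold")
```

With these counts the p-value stays above 5%, which is in line with the conclusion on the slide: samples of 30 departures are too small to show a difference between the airlines.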
30’

You work for the famous company n & n's, and for a few months you have seen many delays on deliveries by truck

Operation mode
• Groups of 2
• Read the 6-pack and the list of project factors
• Prepare a list of tests to be carried out (15')
• Review = 5' each
REDUCING THE FLOW TIME REQUIRED FOR STORE DELIVERIES BY TRUCK

Impact on the Group
• Reduction in requests for emergency deliveries (commercial actions): 30 k€ / year
• Drivers' overtime: 200 k€ / year
• Reduction of the fleet of available trucks: 120 k€ / year

Problem Description
• The flow time required for the delivery of the stores is about 12 h
Definition of a defect: Lead Time > 9 h

Operational Objective
• 95% of all deliveries by truck within 9 h

Scope
Included: n & n's chocolate, almond and peanut. Store in the Paris area for warehouse
Excluded: Other deliveries or emergency (different process)

Planning
Start: March
End Analysis: May
Implementation: August
End of the project: September

Team
Sponsor: Chuck OLAT
Project manager Black Belt: You
Team members: Logistics driver, drivers of trucks, part shipping operator

MACRO PROCESS AND POTENTIALLY INFLUENTIAL FACTORS

• Project context
  o The project is in the ANALYSE phase; you have already identified a series of potentially influential factors
  o and you have launched the collection of the associated data

• Summary table

Inputs (X):
• Week (X1)
• Weekday (X2)
• Date (X3)
• Delivery quality (X4)
• Weight of the load (X5)
• Truck type (X6)
• Driver (X7)

Process: Deliver packages of n-n’s

Output (Y):
• Flow time (Y1)

QUESTION: WHAT ARE THE FACTORS THAT IMPACT THE DELIVERY TIME?
(LIST OF THE ANALYSES TO BE CARRIED OUT)

Activities to be undertaken Associated Graphic

SUMMARY OF STATISTICAL TESTS

Decision risks

Case study

Y continuous
• Influence on the average
  ▪ 2-sample t-test
  ▪ ANOVA
• Influence on variability
  ▪ Test of 2 or more standard deviations / variances
• Correlation between two continuous variables
  ▪ Linear regression
Y discrete
• Influence on proportions
  ▪ Chi² test
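
The decision logic of this summary can be sketched as a small lookup. This is only an illustration of the slide: the function name `choose_test` and its string arguments are hypothetical and are not part of Minitab.

```python
def choose_test(y_type: str, x_type: str, x_levels: int = 2, effect: str = "average") -> str:
    """Return the test suggested by the summary slide (hypothetical helper)."""
    if y_type == "continuous" and x_type == "discrete":
        if x_levels < 2:
            raise ValueError("a discrete X needs at least 2 levels")
        if effect == "average":
            # 2 levels -> t-test, 3 or more levels -> ANOVA
            return "2-sample t-test" if x_levels == 2 else "ANOVA"
        if effect == "variability":
            return "standard deviation test"
    if y_type == "continuous" and x_type == "continuous":
        return "linear regression"
    if y_type == "discrete" and x_type == "discrete":
        return "Chi² test"
    raise ValueError(f"no test listed for Y={y_type}, X={x_type}, effect={effect}")


print(choose_test("continuous", "discrete", x_levels=2))
print(choose_test("continuous", "discrete", x_levels=5))
print(choose_test("continuous", "continuous"))
```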

2-SAMPLE T-TEST IN A NUTSHELL

Application cases
• Compare the average of Y (continuous) across 2 levels of a factor X (discrete)

Path in Minitab
Assistant / Hypothesis Tests / 2-Sample t-Test

Examples
• The 2 groups can be data before and after improvement, or groups split by a factor X (data from supplier 1 vs data from supplier 2)
• 2 suppliers (X) vs diameter (Y)
• Repair time (Y) collected in May compared to June
• 2 special-process baths (X) vs removed thickness (Y)

Null hypothesis H0
Average of population 1 = average of population 2
• Note: in the supplier example, population 1 means "the process of supplier 1" and population 2 means "the process of supplier 2".

Prerequisites / conditions of use
• Make a box plot to see the importance (or not) of the factor, and check the prerequisites below
• Normality and stability (no drift, trend or special causes) of the data in each sample

Interpretation
• If P-value < 5% → reject H0 → the 2 measured samples indicate that the averages are different → the X factor has an influence on the average of Y

EXERCISE

■ With the catapult, we would like to check whether there is a significant difference between 2 rubber bands (RB):
▪ 50 shots are made with RB1: mean = 245 cm, standard deviation = 5 cm
▪ 50 shots are made with RB2: mean = 250 cm, standard deviation = 5 cm

■ Open: BB_Exercises_Eng.xls / T-Test_2S

■ Question
▪ Is the difference (245 cm versus 250 cm) significant, or is it just due to chance during sampling?
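
As a rough cross-check of what Minitab reports, the test can be sketched from the summary statistics alone. This is a minimal sketch, not the Minitab procedure: it assumes the Welch form of the t-test (equal variances not assumed) and computes the p-value by numerical integration of the t density. The function names are illustrative.

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic and degrees of freedom (Welch: equal variances not assumed)."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def t_two_sided_p(t, df, steps=20000):
    """Two-sided p-value, by Simpson integration of the Student t density."""
    if t == 0:
        return 1.0
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)

    def density(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    half_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * half_area)


# RB1: mean 245 cm, SD 5 cm, 50 shots; RB2: mean 250 cm, SD 5 cm, 50 shots
t, df = welch_t_from_summary(245, 5, 50, 250, 5, 50)
p_value = t_two_sided_p(t, df)
print(f"t = {t:.2f}, df = {df:.0f}, p-value = {p_value:.2e}")
```

With these figures the standard error of the difference is 1 cm, so the 5 cm gap is 5 standard errors wide and the p-value is far below 5%.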

SOLUTION

NOTES AND ADDITIONAL EXPLANATIONS

DATA SETS FOR ADDITIONAL PRACTICE

■ T-test catapult Fr.mtw
▪ Compare the average of the shots for rubber bands 1 and 3
▪ Compare the average of the shots for rubber bands 2 and 3

■ Capabilité.mtw
▪ Compare the height of dimension X 213 produced by 4 different machines → compare the average of the parts produced on the machines, two by two

NOTES AND ADDITIONAL EXPLANATIONS

30’

Among the identified analyses, use the 2-sample t-test on the relevant cases...

Operating mode
• Groups of 2
• Identify the analyses where the 2-sample t-test is applicable
• Perform the tests and complete your study report
• Review = 5' each
NOTES AND ADDITIONAL EXPLANATIONS


THE ANOVA IN A NUTSHELL

Application cases
• ANOVA means Analysis Of Variance, but it is used to compare the average of Y (continuous) across 3 or more levels of a factor X (discrete)

Path in Minitab
Assistant / Hypothesis Tests / One-Way ANOVA

Examples
• Samples = groups split by a factor X (data from supplier 1 vs data from supplier 2, ...)
• 4 suppliers (X) vs diameter (Y)
• Repair time (Y) in May, June and July
• 3 positions in a special-process bath (X) vs removed thickness (Y)

Null hypothesis H0
Average of population 1 = average of population 2 = ... = average of population k
• Note: in the supplier example, population 1 means "the process of supplier 1", population 2 means "the process of supplier 2", ... population 4 means "the process of supplier 4".

Prerequisites / conditions of use
• Make a box plot to visualize the importance (or not) of the factor, and check the requirements below
• Homogeneity of variances (homoscedasticity): the test of equal variances has a P-value > 5%
• Normality and stability (no drift, trend or special causes) of the residuals

Interpretation
• If P-value < 5% → reject H0 → at least one of the levels of X has an average different from the others → the X factor has an influence on the average of Y
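
The calculation behind a one-way ANOVA can be sketched as follows. This is a minimal sketch with made-up values (they are not taken from any course data file); the p-value is obtained by numerically integrating the beta density, which assumes at least 2 degrees of freedom within groups. Function names are illustrative.

```python
import math
from statistics import mean


def one_way_anova(groups):
    """F statistic with its between-group and within-group degrees of freedom."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean(v for g in groups for v in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


def f_upper_tail(f, d1, d2, steps=2000):
    """P(F > f) as a regularised incomplete beta, by Simpson's rule (assumes d2 >= 2)."""
    if f <= 0:
        return 1.0
    a, b = d2 / 2, d1 / 2
    x = d2 / (d2 + d1 * f)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        if t == 0.0:
            return 0.0 if a > 1 else math.exp(-log_beta)
        return math.exp((a - 1) * math.log(t) + (b - 1) * math.log1p(-t) - log_beta)

    h = x / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3


# Hypothetical measurements for 3 levels of a factor X
level_a = [101.0, 102.0, 103.0]
level_b = [102.0, 103.0, 104.0]
level_c = [103.0, 104.0, 105.0]

f_stat, d1, d2 = one_way_anova([level_a, level_b, level_c])
p_value = f_upper_tail(f_stat, d1, d2)
print(f"F = {f_stat:.3f}, df = ({d1}, {d2}), p-value = {p_value:.3f}")
```

Here the p-value is above 5%, so H0 is not rejected: with only 3 values per level, a one-unit shift between averages is not distinguishable from noise.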

ANOVA TEST
EXERCISE OF UNDERSTANDING (Anova Concentrations.mtw)

■ Case study
▪ In a paint shop, the concentration of chemical elements in the paint must remain constant. An important chemical component must be distributed evenly in the paint to ensure good corrosion resistance of the parts.

▪ The concentration of this component is measured in the storage tank and expressed in ppm. Samples are taken at 5 different depths in the storage tank. At each depth, 8 different samples are collected.

▪ We need to know whether or not the component is distributed homogeneously. Should we make changes to the storage tank, yes or no?

ADDITIONAL DATA SETS FOR PRACTICE

■ RÉGRESSION MULTIPLE CATAPULTE.mtw
▪ This data file contains the information collected during test launches with the catapult.
▪ Analyze whether the shooters have the same repeatability and whether the different rubber band brands have the same variability in shooting distance.

■ Capabilité.mtw
▪ Compare the height of dimension X 213 produced by 4 different machines → compare the standard deviation of dimension X 213 on the 4 machines in a single analysis. Is any machine more or less dispersed than the others?

NOTES AND ADDITIONAL EXPLANATIONS

30’

Among the identified analyses, use the ANOVA test on the relevant cases...

Operating mode
• Groups of 2
• Identify the analyses where ANOVA is applicable
• Perform the tests and complete your study report
• Review = 5' each
NOTES AND ADDITIONAL EXPLANATIONS


STANDARD DEVIATION TEST IN A NUTSHELL

Usage
• Compare the standard deviation of Y (continuous) across 2 or more levels of a factor X (discrete)
• It lends itself particularly well to machine capability analysis (study of machine Cp)

Path in Minitab
If the X factor has 2 levels: Assistant / Hypothesis Tests / 2-Sample Standard Deviation
If the X factor has 3 or more levels: Assistant / Hypothesis Tests / Standard Deviations Test

Examples
• Samples = groups split by a factor X (data from supplier 1 vs data from supplier 2, ...)
• 4 machines (X) vs diameter (Y)
• Repair time (Y) in May, June and July
• 3 positions in a special-process bath (X) vs removed thickness (Y)
• BEFORE / AFTER improvement

Null hypothesis H0
Standard deviation of population 1 = standard deviation of population 2 = ... = standard deviation of population k
• Note: in the machine example, population 1 means "machine 1", population 2 means "machine 2", ... population 4 means "machine 4".

Prerequisites / conditions of use
• Make a box plot to see the importance (or not) of the factor, and check the prerequisites below
• Normality and stability (no drift, trend or special causes) of the data in each sample

Interpretation
• If P-value < 5% → reject H0 → at least one of the levels of X has a standard deviation different from the others → the X factor has an influence on the variability of Y
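
One way to see what a test of equal standard deviations does is the sketch below. It uses the Brown-Forsythe variant of Levene's test (a one-way ANOVA on the absolute deviations from each group median), which is a stand-in chosen for simplicity: Minitab's Assistant may use a different method. The data are made up, and the p-value helper assumes at least 2 degrees of freedom within groups.

```python
import math
from statistics import mean, median, stdev


def one_way_anova(groups):
    """F statistic with its between-group and within-group degrees of freedom."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean(v for g in groups for v in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k)), k - 1, n - k


def f_upper_tail(f, d1, d2, steps=2000):
    """P(F > f) as a regularised incomplete beta, by Simpson's rule (assumes d2 >= 2)."""
    if f <= 0:
        return 1.0
    a, b = d2 / 2, d1 / 2
    x = d2 / (d2 + d1 * f)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        if t == 0.0:
            return 0.0 if a > 1 else math.exp(-log_beta)
        return math.exp((a - 1) * math.log(t) + (b - 1) * math.log1p(-t) - log_beta)

    h = x / steps
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3


def brown_forsythe(groups):
    """ANOVA on absolute deviations from each group's median."""
    deviations = [[abs(v - median(g)) for v in g] for g in groups]
    return one_way_anova(deviations)


# Hypothetical diameters from 3 machines
machines = [[9.0, 10.0, 11.0], [8.0, 10.0, 12.0], [5.0, 10.0, 15.0]]
print("standard deviations:", [stdev(m) for m in machines])

w_stat, d1, d2 = brown_forsythe(machines)
p_value = f_upper_tail(w_stat, d1, d2)
print(f"W = {w_stat:.3f}, df = ({d1}, {d2}), p-value = {p_value:.3f}")
```

Although the sample standard deviations differ by a factor of 5, the p-value stays above 5% because each machine contributes only 3 values; comparing variability reliably needs larger samples.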

STANDARD DEVIATION TEST
EXERCISE OF UNDERSTANDING (Anova Concentrations.mtw)

■ Case study (identical to that of the ANOVA test, but for variances)
▪ In a paint shop, the concentration of chemical elements in the paint must remain constant. An important chemical component must be distributed evenly in the paint to ensure good corrosion resistance of the parts.

▪ The concentration of this component is measured in the storage tank and expressed in ppm. Samples are taken at 5 different depths in the storage tank. At each depth, 8 different samples are collected.

■ Questions
▪ We want to know whether the variability of the concentration is identical throughout the storage tank.
▪ Apply the standard deviation test and conclude...

NOTES AND ADDITIONAL EXPLANATIONS

30’

Among the identified analyses, use the standard deviation test on the relevant cases...

Operating mode
• Groups of 2
• Identify the analyses where the standard deviation test is applicable
• Perform the tests and complete your study report
• Review = 5' each
NOTES AND ADDITIONAL EXPLANATIONS


LINEAR REGRESSION IN A NUTSHELL

Application cases
• Linear regression establishes the link between a continuous Y and a continuous X.
• It quantifies the strength of the link and generates a mathematical model of the relationship, Y = aX + b, where a and b are both constants

Path in Minitab
Assistant / Regression

Examples
• Quantity (in kg) of products manufactured (Y) vs machining speed (continuous X)
• Thickness of plasma treatment (Y) vs spraying pressure (X)

Null hypothesis H0
In the equation Y = aX + b, a = 0 → no influence of the X factor on Y

Prerequisites / conditions of use
• Samples are collected as (X, Y) pairs
• Make a scatter plot to see the importance (or not) of the factor
• Normality and stability (no drift, trend or special causes) of the residuals

Interpretation
• If P-value < 5% → reject H0 → a ≠ 0 → the X factor has an influence on Y

LINEAR REGRESSION: CASE STUDY
■ Is there a linear relationship between the angle and the ball impact (= distance)?

■ Or can we predict the ball impact based on the angle?

[Figure: catapult diagram with the Angle and Distance settings labelled]

SIMPLE LINEAR REGRESSION
EXERCISE OF UNDERSTANDING WITH THE CATAPULT

■ Consider our catapult: the further back we pull the catapult arm, the farther the golf ball will go. A slight increase in the drawback angle will slightly increase the shooting distance.

■ What is the input parameter X?
▪ Is it discrete or continuous?

■ What is the output parameter Y?
▪ Is it discrete or continuous?

■ The X parameter can be used to predict the output parameter if there is a relationship between X and Y. X is also called the predictor of Y.

■ In this case we can control X, and so we can influence the output response Y.

LINEAR REGRESSION: CASE STUDY

■ The scatter plot shows a linear relationship between the angle and the distance.

[Figure: scatter plot of shooting distance (Y axis, 2000 to 3800) vs drawback angle (X axis, 30 to 60)]

LINEAR REGRESSION

■ A linear model describes mathematically the relationship between X and Y.

■ The linear model is a straight line described by the equation:

Y = aX + b

Y: response
X: input
a: for each one-unit increase in X, Y increases by a
b: the value of Y when X = 0

■ This model makes it possible to predict Y for a given X
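
For reference, the coefficients a and b of such a line are usually estimated by ordinary least squares. The slides do not state the estimation method, so this is given as the standard textbook result rather than as Minitab's documented procedure. With n pairs (x_i, y_i) and sample means x̄ and ȳ:

```latex
\hat{a} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{b} = \bar{y} - \hat{a}\,\bar{x}
```

The fitted line therefore always passes through the point (x̄, ȳ).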

LINEAR REGRESSION BETWEEN X AND Y

[Figure: scatter plot of shooting distance (Y axis, 2000 to 3800) vs drawback angle (X axis, 30 to 60), with the fitted line Y = aX + b]

For a given value of X, we can predict the value of Y

LINEAR REGRESSION

■ Knowing X makes it possible to predict Y (range)

■ To shoot at a given distance, we can now define the drawback angle.

■ Benefit
▪ Control the output response of our process by using the correct value of the input parameter X.

■ Vocabulary
▪ X is called the controllable input of this process (adjustment variable).
▪ X is called the predictor for this process.

LINEAR REGRESSION: EXERCISE

■ Let's imagine that the equation between the angle and the distance is

Y = 50*X + 445

■ What will be the distance in mm if the angle is:
▪ X = 45°?
▪ X = 75°?
▪ X = 0°?

■ What will be the increase in the distance if the angle increases by 1°?

LINEAR REGRESSION: EXERCISE

■ Answers for Y = 50*X + 445

■ Y = 50*45 + 445 = 2695 mm

■ Y = 50*75 + 445 = 4195 mm
BUT: in this case, we are outside the model (the model has been estimated only between 30° and 60°). It can be hazardous to use the model outside the evaluation range.

■ Y = 50*0 + 445 = 445 mm
▪ Again, we are outside the evaluation range. Here, with no angle, the model says the ball would still travel 445 mm!

■ ΔY = 50*1 = 50 mm. For each additional degree, the distance increases by 50 mm.
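
The answers above can be reproduced with a few lines of code that also flag extrapolation. This is a minimal sketch: the function name `predict_distance` and its parameters are illustrative, and the 30° to 60° range is the evaluation range quoted in the answers.

```python
def predict_distance(angle_deg, slope=50.0, intercept=445.0, evaluated_range=(30.0, 60.0)):
    """Return the predicted distance (mm) and whether the angle lies in the evaluated range."""
    distance_mm = slope * angle_deg + intercept
    low, high = evaluated_range
    return distance_mm, low <= angle_deg <= high


for angle in (45, 75, 0):
    distance, inside = predict_distance(angle)
    note = "inside the evaluation range" if inside else "OUTSIDE the evaluation range"
    print(f"X = {angle}°: Y = {distance:.0f} mm ({note})")

# Effect of one additional degree: equal to the slope
increase_per_degree = predict_distance(46)[0] - predict_distance(45)[0]
print(f"Increase per degree: {increase_per_degree:.0f} mm")
```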

LINEAR REGRESSION: LIMITS OF THE MODEL

[Figure: regression plot of shooting distance (Y axis, 0 to 4000) vs drawback angle (X axis, 0 to 90), with the line Y = 50 X + 445 and the evaluation zone marked]

The model is linear only within the evaluation zone

30’

You want to know the forecast model that predicts the shooting distance as a function of the angle...

Does such a model exist?

Is it accurate enough?

Operating mode
• Groups of 2
• Open the file "Regression catapult Fr" in Minitab
• Follow the instructions on the following pages, step by step
• Review = 15' each
LINEAR REGRESSION STEP BY STEP

1. Represent the data graphically

2. Run the regression in Minitab

3. Study the residuals

4. Look for unusual observations

5. Check the p-values for the coefficients a and b

6. Do we have a good model? Check the p-value for the regression.

7. Do we have a good model? Check R²
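
Steps 2, 3 and 7 can be sketched outside Minitab as follows. This is a minimal least-squares sketch on made-up catapult readings (not the course data file); it does not compute the p-values of steps 5 and 6, which are left to Minitab.

```python
def fit_line(xs, ys):
    """Least-squares fit of Y = aX + b; returns a, b, the residuals and R²."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = y_bar - a * x_bar
    residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
    ss_res = sum(r ** 2 for r in residuals)        # unexplained variation
    ss_tot = sum((y - y_bar) ** 2 for y in ys)     # total variation of Y
    r_squared = 1 - ss_res / ss_tot
    return a, b, residuals, r_squared


# Hypothetical readings: drawback angle (degrees) and shooting distance (mm)
angles = [30, 40, 50, 60]
distances = [1950, 2440, 2950, 3440]

a, b, residuals, r_squared = fit_line(angles, distances)
print(f"Y = {a:.1f} X + {b:.1f}")
print("residuals:", [round(r, 1) for r in residuals])
print(f"R² = {r_squared:.4f}")
```

The residuals should show no pattern and sum to zero; an R² close to 1 means the line explains almost all of the variation in Y.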

1 THE SCATTER PLOT MAKES IT POSSIBLE TO READ THE COLLECTED DATA, DETECT ERRORS AND GET A FIRST IDEA OF THE STRENGTH OF THE RELATIONSHIP

2 REGRESSION IN MINITAB

3

5 6 7

YOUR CONCLUSIONS

LINEAR REGRESSION: STRONG OR WEAK?

[Figure: two scatter plots, X from 0 to 10 and Y from 0 to 30. Left: strong regression. Right: weak regression]

LINEAR REGRESSION: CORRELATION COEFFICIENT R

The linear correlation coefficient R is an indicator between -1 and +1 which indicates the strength of the relationship between X and Y.

[Figure: two regression plots, X from 0 to 10]
Left: Y = 5 + 1 X; S = 0; R-Sq = 100.0 %; R-Sq(adj) = 100.0 %; caption: r = 100%
Right: Y = 5.10526 + 1 X; S = 2.11145; R-Sq = 65.3 %; R-Sq(adj) = 63.2 %; caption: r = 63%

LINEAR REGRESSION: CORRELATION COEFFICIENT R

The linear correlation coefficient R is an indicator between -1 and +1 which indicates the strength of the relationship between X and Y.

[Figure: two regression plots, X from 0 to 10]
Left: Y = 4.78947 + 1 X; S = 4.22289; R-Sq = 32.0 %; R-Sq(adj) = 28.0 %; caption: r = 28%
Right: Y = 5.00009 + 0.00000009 X; S = 2.80324; R-Sq = 0 %; R-Sq(adj) = 0.0 %; caption: r = 0%

LINEAR REGRESSION: R COEFFICIENT

[Figure: six scatter plots]
• No correlation: r = 0%
• Moderate and positive correlation: r = 60%
• Strong and positive correlation: r = 80%
• Perfect and positive correlation: r = 100%
• Strong and negative correlation: r = -80%
• Non-linear correlation: r = 0%
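
The cases in this list can be reproduced with a direct computation of the correlation coefficient. This is a minimal sketch on made-up points; the last case shows that a perfect but non-linear relationship (a parabola) can still give r = 0.

```python
import math


def pearson_r(xs, ys):
    """Linear correlation coefficient between two lists of equal length."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [-2, -1, 0, 1, 2]
r_perfect_positive = pearson_r(xs, [5 + x for x in xs])   # straight line going up
r_perfect_negative = pearson_r(xs, [5 - x for x in xs])   # straight line going down
r_non_linear = pearson_r(xs, [x * x for x in xs])         # parabola: Y depends on X, but not linearly

print(r_perfect_positive, r_perfect_negative, r_non_linear)
```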

LINEAR REGRESSION: POINTS TO WATCH

■ Check whether the relationship is linear (b)

■ Check whether some points may influence the regression (c & d)


[Figure: four scatter plots, panels a) to d), each with the same fitted line and coefficient: y = 0.5x + 3.0, r = 0.82. Panels a) to c) have X from 2 to 14; panel d) has X from 2 to 22]

■ Always draw a graph: all four graphs have the same equation and the same r coefficient!
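
The four panels appear to correspond to the well-known Anscombe quartet; this is an assumption, since the slide does not name its data source. Under that assumption, the sketch below uses the published quartet values to show that four very different data sets give practically the same line and the same r.

```python
import math


def fit_and_r(xs, ys):
    """Least-squares slope and intercept, plus the correlation coefficient r."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar, sxy / math.sqrt(sxx * syy)


# Anscombe's quartet (assumed to be the data behind panels a to d)
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "a": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "b": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "c": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "d": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

results = {name: fit_and_r(xs, ys) for name, (xs, ys) in quartet.items()}
for name, (slope, intercept, r) in results.items():
    print(f"{name}) y = {slope:.2f}x + {intercept:.2f}, r = {r:.2f}")
```

Only a plot reveals that one set is curved, one contains an outlier and one is driven by a single extreme X value.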

ADDITIONAL DATA SETS FOR PRACTICE

■ RÉGRESSION MULTIPLE CATAPULTE.mtw
▪ This data file contains the information collected during test launches with the catapult.
▪ Analyze whether the temperature has an impact on the distance

■ Multiple regression – Cleaning process.mtw
▪ The aim is to control the amount of impurity in the bonding process for electronic chips
▪ What is the influence of temperature and pressure?

NOTES AND ADDITIONAL EXPLANATIONS

30’

Among the identified analyses, perform a linear regression analysis on the relevant cases...

Operating mode
• Groups of 2
• Identify the analyses where linear regression is applicable
• Perform the tests and complete your study report
• Review = 5' each
NOTES AND ADDITIONAL EXPLANATIONS


THE CHI-SQUARE TEST IN A NUTSHELL

Application cases
• Establish the existence of a link between a discrete Y and a discrete X.
• Does Y vary dependently on X or independently of X?

Path in Minitab
Stat / Tables / Cross Tabulation and Chi-Square

Examples
• Proportion of defective parts depending on the team
• Error type by country
• Engine type repaired depending on the repair centre

Null hypothesis H0
X and Y vary independently → the distribution of proportions is the same for each level of X → observed frequency = expected frequency

Prerequisites / conditions of use
• An expected count of at least 5 in each (X, Y) cell of the cross table

Interpretation
• If P-value < 5% → reject H0 → certain proportions are higher or lower in one of the levels of X → X and Y are dependent

CHI² TEST

■ Let's toss a coin: what is the probability of getting TAILS? … 50%, isn't it?

■ Let's suppose that we get:
▪ 8 x TAILS when the coin is tossed 10 times
▪ 80 x TAILS when the coin is tossed 100 times
▪ 800 x TAILS when the coin is tossed 1000 times

■ When can we say the coin is biased?

■ In this example, we expect a 50% chance of getting TAILS.

■ When the observed frequency is too far from the expected frequency, we can reject the hypothesis that the observed situation is just due to chance.
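
The three coin cases can be compared with a Chi-square goodness-of-fit calculation. This is a minimal sketch: with two outcomes there is 1 degree of freedom, for which the p-value can be written with the complementary error function. The function name is illustrative.

```python
import math


def coin_chi_square(tails, tosses, p_tails=0.5):
    """Chi-square statistic and p-value (1 degree of freedom) for a coin."""
    heads = tosses - tails
    expected_tails = tosses * p_tails
    expected_heads = tosses - expected_tails
    chi2 = ((tails - expected_tails) ** 2 / expected_tails
            + (heads - expected_heads) ** 2 / expected_heads)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # valid for 1 degree of freedom only
    return chi2, p_value


chi2_10, p_10 = coin_chi_square(8, 10)
chi2_100, p_100 = coin_chi_square(80, 100)
chi2_1000, p_1000 = coin_chi_square(800, 1000)

for tosses, chi2, p in ((10, chi2_10, p_10), (100, chi2_100, p_100), (1000, chi2_1000, p_1000)):
    print(f"{tosses} tosses: Chi² = {chi2:.1f}, p-value = {p:.3g}")
```

The proportion of TAILS is 80% in all three cases, but with only 10 tosses the p-value stays just above 5%; from 100 tosses onwards the coin can be declared biased.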

CHI² TEST

■ Let's suppose we collect the volume produced by 2 shifts, and also the number of good and bad parts

■ At this stage, we don't know the exact quantities, only the overall totals

■ If both teams work at the same quality level, what should the expected values a, b, c, d be?

          Morning   Evening   Total
Good      (a)       (b)       1700
Bad       (c)       (d)       300
Total     1300      700       2000
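
If quality is the same on both shifts, each cell is expected to contain (row total x column total) / grand total. A minimal sketch of that calculation, with an illustrative function name:

```python
def expected_counts(row_totals, col_totals):
    """Expected cell counts under independence of rows and columns."""
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: Good, Bad. Columns: Morning, Evening.
expected = expected_counts([1700, 300], [1300, 700])
(a, b), (c, d) = expected
print(f"a = {a:.0f}, b = {b:.0f}, c = {c:.0f}, d = {d:.0f}")
```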

CHI² TEST

■ Now, let's look at the actual production per shift, which is slightly different from the expected quantities

■ Is this difference significant?

■ The answer to this question will show whether or not there is a relationship between the production yield and the 2 shifts

Observed frequency, with (expected frequency) in brackets:

          Morning        Evening       Total
Good      1130 (1105)    570 (595)     1700
Bad       170 (195)      130 (105)     300
Total     1300           700           2000

CHI² TEST
The principle of the test is to compare the observed frequencies with the expected frequencies.

If the difference is small enough, we cannot reject the hypothesis that the two variables are independent; otherwise we conclude that there is a significant difference between the two populations with respect to the factor studied.

In this specific example, the question is: given the total production achieved on each shift, does one shift have a significantly higher defect rate than the other?

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

f_o: observed frequency
f_e: expected frequency

χ² follows a chi-square distribution (right-skewed, similar in shape to the F distribution) with (R-1) x (C-1) degrees of freedom, R being the number of rows and C the number of columns.
Given the degrees of freedom, if χ² is big enough, the probability (P-value) will be low (< 5%) and we will conclude that there is a statistically significant difference.
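
Applied to the two-shift table (observed counts 1130, 570, 170 and 130), the formula can be sketched as follows. The p-value shortcut with the complementary error function is valid only because this 2 x 2 table has 1 degree of freedom; the function name is illustrative.

```python
import math


def pearson_chi_square(table):
    """Pearson Chi-square statistic and degrees of freedom for a cross table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Rows: Good, Bad. Columns: Morning, Evening.
observed = [[1130, 570], [170, 130]]
chi2, df = pearson_chi_square(observed)
p_value = math.erfc(math.sqrt(chi2 / 2))  # valid for df == 1 only
print(f"Chi² = {chi2:.3f}, DF = {df}, p-value = {p_value:.3f}")
```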

CHI² TEST
SESSION WINDOW

Each cell shows the observed count, the expected count and the contribution to Chi-square.

                       Morning   Afternoon   All
Good   observed        1130      570         1700
       expected        1105      595
       contribution    0.566     1.050
Bad    observed        170       130         300
       expected        195       105
       contribution    3.205     5.952
All                    1300      700         2000

                   Chi-Square   DF   P-Value
Pearson            10.774       1    0.001
Likelihood Ratio   10.508       1    0.001

Some graphics here... Statistical results only!

Conclusion: the quality level depends on the team. The team is a factor that affects the quality.

CHI² TEST: PRACTICAL APPLICATION

• An express transportation company is facing a cash-flow problem because it is
  not being paid on time. Invoicing is identified as the weak point of the
  process: many errors are found in the invoices, which dramatically increases
  the payment lead time.

• Accurate data have been collected across all sites and the manager would like
  to know, before investigating root causes, whether there are differences
  between locations.

• Open: BB_Exercises_Eng.xls / CHI2

X = Country  ->  Invoicing process  ->  Y = Type of defect
CHI² TEST: SESSION WINDOW
Chi-square test: Spain; Brazil; Vietnam; Australia; South Africa
Expected counts are printed below observed counts; chi-square contributions are
printed below expected counts.

Defect type   Spain    Brazil   Vietnam  Australia  S Africa  Total

1             23       25       27       42         28        145
              26.60    32.45    27.92    30.40      27.63
              0.488    1.710    0.030    4.423      0.005

2             15       19       13       19         11        77
              14.13    17.23    14.83    16.15      14.67
              0.054    0.181    0.225    0.505      0.918

3             45       50       46       49         32        222
              40.73    49.68    42.74    46.55      42.30
              0.448    0.002    0.248    0.129      2.506

4             32       36       39       41         33        181
              33.21    40.51    34.85    37.95      34.48
              0.044    0.501    0.494    0.245      0.064

5             50       70       45       40         73        278
              51.00    62.21    53.53    58.29      52.97
              0.020    0.974    1.358    5.739      7.578

6             17       22       21       17         12        89
              16.33    19.92    17.14    18.66      16.96
              0.028    0.218    0.871    0.148      1.449

Chi-square = 31.605; DF = 20; p-value = 0.048

Conclusion: the p-value is < 5%. The defect types depend on the country.
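The session window figures can be cross-checked outside Minitab. The sketch below recomputes the statistic from the observed counts; the helper names are hypothetical, and the p-value formula used is a closed form that only applies when the number of degrees of freedom is even (here 20).

```python
import math


def chi_square_statistic(table):
    """Return (chi2, degrees_of_freedom) for an R x C table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = sum(
        (obs - row_totals[i] * col_totals[j] / grand_total) ** 2
        / (row_totals[i] * col_totals[j] / grand_total)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


def chi_square_p_value_even_dof(chi2, dof):
    """P(X > chi2) for an even number of degrees of freedom only."""
    if dof % 2 != 0:
        raise ValueError("closed form valid only for even degrees of freedom")
    half = chi2 / 2
    term, total = 1.0, 0.0
    for i in range(dof // 2):
        total += term            # adds half**i / i!
        term *= half / (i + 1)
    return math.exp(-half) * total


# Rows = defect types 1 to 6; columns = Spain, Brazil, Vietnam,
# Australia, South Africa
invoices = [
    [23, 25, 27, 42, 28],
    [15, 19, 13, 19, 11],
    [45, 50, 46, 49, 32],
    [32, 36, 39, 41, 33],
    [50, 70, 45, 40, 73],
    [17, 22, 21, 17, 12],
]
chi2, dof = chi_square_statistic(invoices)
p_value = chi_square_p_value_even_dof(chi2, dof)
print(f"chi2 = {chi2:.3f}, DF = {dof}, p-value = {p_value:.3f}")
```

The largest contributions come from defect type 5 in Australia and South Africa, which is where a root-cause investigation would reasonably start.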


ADDITIONAL DATASETS FOR PRACTICE (2 OF 2)

• Compagnies aériennes.mtw
  • Can we consider that the 4 people who collected the information have
    comparable results on takeoff delays?
  • If there is a difference, which collector / company is different?

• Elections présidentielle 2007.mtw
  • The file contains the count of voters for the 1st and 2nd rounds of the
    2007 presidential elections in France.
  • Are there differences between the regions in terms of votes?

• Khi 2 tireur.mtw
  • Following sled tests, the file contains the number of times each shooter
    hit the target (in) or missed the target (off).
  • Who is your best shooter? Why?
  • Who is your worst shooter? Why?

CHAPTER CONCLUSION

• The craft of Black Belts is not statistics but finding the root causes of
  problems and making decisions based on facts and data.

• Graphs and statistics are a means of understanding your process or processes.

• Be methodical! When you have to drive this step, always ask yourself these
  four questions, in this order:
  • What do I want to show?
  • What is the nature of my X and my Y?
  • What are the available tools?
  • What are the conditions required for their use?

Hypothesis Tests
POWER AND SAMPLE SIZE

[Bar chart: flights cancelled or delayed by more than 15 minutes (%), for
Air France, Aero Mexico, American and Lufthansa; bar labels: 50, 43, 27, 27]

Initial findings from John: Aeromexico and American are much better than the
others. Lufthansa is to be avoided.

Conclusion of the statistical test: we do not have enough evidence to say that
there is a difference...

OK, we have only 30 flights sampled per company... But what happens if I find
the same graph with 100 flights per company... or 1000 flights per company?

RISK DEFINITION

2 risks have a specific definition:

• Type 1 (risk α) = wrongly reject the hypothesis Ho
• Type 2 (risk β) = wrongly accept the hypothesis Ho

                                      ACCURACY OF HYPOTHESIS
DECISION TAKEN           Ho is true                   Ho is false
Accept hypothesis Ho     Correct judgement: 1 - α     Risk β (type 2 error)
Reject hypothesis Ho     Risk α (type 1 error)        Correct judgement: 1 - β

SUMMARY: ALPHA AND BETA ERRORS

• When we perform a hypothesis test, 2 types of errors may occur:
  • We reject H0, but the groups are the same: error α
  • We accept H0, but the groups are different: error β

• Example on the punctuality of the airlines: we find a p-value > 0.05. This
  may mean either:
  • the companies have the same punctuality,
  or
  • the companies do not have the same punctuality, but the test did not
    detect it (risk = β).

SUMMARY: SIMULATION EXERCISE

• Imagine that I know in advance the true proportions of delay for AMERICAN
  and LUFTHANSA:
  • Delay ratio for AMERICAN = 30%
  • Delay ratio for LUFTHANSA = 40%

• What are the chances of detecting a difference when taking "n" random
  flights from each company?

• Results

Sample size (n)    Chance of detecting a difference with the statistical test (1)
30                 13%
100                32%
300                73%
1000               99.7%
(1): power of the test

For 30 flights per company, only 13 samples out of 100 will lead to the
rejection of H0.
It will take more than 300 flights per company to have a good chance of
detecting this gap of 10 points.
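The exercise can be reproduced approximately by simulation. The sketch below is an assumption-laden illustration, not the method behind the table: it draws random flights for two companies, applies a pooled two-proportion z-test at the 5% level, and counts how often H0 is rejected. The function names, the number of trials and the seed are all arbitrary choices.

```python
import random
from statistics import NormalDist

Z_CRIT = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-sided 5% threshold


def rejects_h0(delays_a, delays_b, n):
    """Pooled two-proportion z-test: True if 'same delay ratio' is rejected."""
    pooled = (delays_a + delays_b) / (2 * n)
    se = (2 * pooled * (1 - pooled) / n) ** 0.5
    if se == 0:  # no delay at all, or only delays: nothing to compare
        return False
    return abs(delays_a - delays_b) / n / se > Z_CRIT


def rejection_rate(p_a, p_b, n, trials=2000, seed=1):
    """Share of simulated studies (n flights per company) that reject H0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = sum(rng.random() < p_a for _ in range(n))
        b = sum(rng.random() < p_b for _ in range(n))
        hits += rejects_h0(a, b, n)
    return hits / trials


# True delay ratios of 30% and 40%: the rejection rate estimates the power
detect_30 = rejection_rate(0.30, 0.40, 30)
detect_100 = rejection_rate(0.30, 0.40, 100)
# Identical delay ratios (hypothetical 35%): the rejection rate estimates alpha
false_alarm_30 = rejection_rate(0.35, 0.35, 30)
print(detect_30, detect_100, false_alarm_30)
```

With 30 flights per company the rejection rate stays low even though the gap is real, and when the two ratios are identical the test still rejects H0 in roughly 5% of the studies, which is the α risk.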
POWER OF THE TEST

• The quality of a hypothesis test is characterized by:
  • the p-value
  • the error β

• We will not use the β error directly, but the notion of power of the test:

POWER OF THE TEST = 1 - ERROR β

Power of the test: if there really is a difference between the groups, what is
the probability that we detect this difference?

POWER OF THE TEST

• The standard value in industry: power > 80%.

• If there really is a difference between the groups, the probability of
  finding this difference is at least 80%.

• How to calculate the power of the test:
  • Minitab menu
  • Stat > Power and Sample Size > 2 Proportions

• We will use the simulation exercise as an example.

HOW TO CALCULATE THE POWER OF THE TEST?

Annotations on the Minitab dialog box:
• Specify values for 2 of the first 3 parameters; Minitab will calculate the
  3rd parameter.
• Number of samples to calculate.
• We want to detect the situation 30% vs. 40%.
• We want to know the power of the test for different sample sizes (leave the
  box blank).

CALCULATION OF THE POWER OF THE TEST

[Power curve: power of the test for each sample size; labelled points: 13%,
32%, 73%, 99.7%]

The graph shows the power if Lufthansa is at 40%, but you can directly read the
power for other cases (10%, 20%, ...).
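The labelled points of the curve can be approximated with a short calculation. The sketch below uses the usual normal approximation for a two-sided test of two proportions; it is offered as an illustration under that assumption, and the function name `power_two_proportions` is hypothetical.

```python
from statistics import NormalDist


def power_two_proportions(p1, p2, n, alpha=0.05):
    """Approximate power of a two-sided test of p1 vs p2, n items per group."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    # Standard error of the difference under H0 (pooled) and under H1
    se_null = (2 * p_bar * (1 - p_bar) / n) ** 0.5
    se_alt = ((p1 * (1 - p1) + p2 * (1 - p2)) / n) ** 0.5
    diff = abs(p1 - p2)
    upper_tail = norm.cdf((diff - z_crit * se_null) / se_alt)
    lower_tail = norm.cdf((-diff - z_crit * se_null) / se_alt)
    return upper_tail + lower_tail


# AMERICAN at 30%, LUFTHANSA at 40%
powers = {n: power_two_proportions(0.30, 0.40, n) for n in (30, 100, 300, 1000)}
for n, power in powers.items():
    print(f"n = {n:4d}  power = {power:.1%}")

# Another point of the curve: LUFTHANSA at 20% instead of 40%
power_20_vs_30 = power_two_proportions(0.30, 0.20, 100)
```

Changing the second argument gives the other points of the curve, which is what reading the graph at 10% or 20% amounts to.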

ADDITIONAL DATASETS: EXAMPLE PROPORTIONS

• A management controller wants to test whether there is a difference in the
  proportion of reports containing an error between the old method and a new,
  improved method.

• With the old method, 14% of reports contain errors. The management
  controller expects a reduction of 10 points (from 14% to 4%) in errors with
  the new method.

Entries in the Minitab dialog box:
• Comparison proportion: write 0.14 - 0.10 = 0.04
• Power values: write 0.7 0.8 0.9
• Baseline proportion: write 0.14

EXAMPLE PROPORTIONS

Power and Sample Size

Comparison p    Sample Size    Target Power    Actual Power
0.04            101            0.7             0.702420
0.04            128            0.8             0.801925
0.04            171            0.9             0.901635

The sample size is for each group.

The accountant needs 128 reports produced with each of the two methods to have
an 80% chance of detecting a decrease in the error rate from 14% to 4%.

Note: the sample size required for a Chi-square test (comparison of
proportions) will be larger than the sample size required when comparing
averages.
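The sample sizes can be approximated by searching for the smallest n whose power reaches the target. This is a sketch based on the normal approximation for two proportions, not a description of Minitab's internals; both function names are hypothetical.

```python
from statistics import NormalDist


def power_two_proportions(p1, p2, n, alpha=0.05):
    """Approximate power of a two-sided test of p1 vs p2, n items per group."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    se_null = (2 * p_bar * (1 - p_bar) / n) ** 0.5
    se_alt = ((p1 * (1 - p1) + p2 * (1 - p2)) / n) ** 0.5
    diff = abs(p1 - p2)
    return (norm.cdf((diff - z_crit * se_null) / se_alt)
            + norm.cdf((-diff - z_crit * se_null) / se_alt))


def sample_size_two_proportions(baseline, comparison, target_power, alpha=0.05):
    """Smallest sample size per group whose power reaches target_power."""
    n = 2
    while power_two_proportions(comparison, baseline, n, alpha) < target_power:
        n += 1
    return n


# Old method: 14% of reports with errors; new method expected at 4%
sizes = [sample_size_two_proportions(0.14, 0.04, target)
         for target in (0.7, 0.8, 0.9)]
print(sizes)
```

Because the search stops at the first whole number that reaches the target, the actual power is always slightly above the target power, as in the session window output.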

ADDITIONAL DATASETS: EXAMPLE PROPORTIONS

• Suppose that a biologist wants to test whether there is a difference in the
  proportions of fish affected by pollution in two lakes.

• Prior research suggests that approximately 25% of the fish were infected.

• The biologist would like to detect a difference in proportion of the order
  of 3% (0.03).

• What is the recommended sample size (for each of the two lakes)?

ADDITIONAL DATASETS: EXAMPLE PROPORTIONS

• A producer of ice cream wants to study the influence of the storage time of
  finished products on non-quality.
  • For this study, four groups of samples (after 1 week, 2 weeks, 3 weeks and
    4 weeks) are taken from batches that have stayed in the store.
  • The expected proportion of off-specification product is 5%.
  • An increase in off-specification product of the order of 2% is considered
    important.
  • How many samples must be taken per week?

• The journal Biometrika conducted a study to determine whether the proportion
  of intelligent students is the same for athletes and non-athletes. The
  researchers wanted to detect a 5% difference.
  • How many students should be tested in each of the groups?

2-Sample t Test
POWER OF THE TEST TO COMPARE 2 MEANS
HOW TO CALCULATE THE POWER OF THE TEST?

Annotations on the Minitab dialog box:
• Specify values for 2 of the first 3 parameters; Minitab will calculate the
  3rd parameter.
• Number of samples to calculate.
• The smallest difference that we want to detect. In our example data, the
  difference between groups is known.
• We want to know the power of the test for different sample sizes (leave the
  box blank).

CALCULATION OF THE POWER OF THE TEST

Power and Sample Size
2-Sample t Test
Testing mean 1 = mean 2 (versus ≠)
Calculating power for mean 1 = mean 2 + difference
α = 0.05  Assumed standard deviation = 100

Results
Difference    Sample Size    Power
50            10             0.185096
50            25             0.410100
50            50             0.696893
50            100            0.940427
The sample size is for each group.

[Graph: Power Curve for 2-Sample t Test]

With a sample size of 10, we have only an 18% chance of finding a difference of
50 between the two groups, if the difference actually exists.
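The trend in these results can be reproduced roughly with a normal approximation. This is only a sketch: an exact calculation for a t-test relies on the noncentral t distribution, so the approximation below runs slightly higher than the session window figures, especially for small samples. The function name is hypothetical.

```python
from statistics import NormalDist


def approx_power_two_means(difference, sigma, n, alpha=0.05):
    """Normal-approximation power for comparing 2 means, n items per group."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Difference expressed in standard errors of (mean 1 - mean 2)
    shift = abs(difference) / (sigma * (2 / n) ** 0.5)
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Difference of 50 with an assumed standard deviation of 100
approx = {n: approx_power_two_means(50, 100, n) for n in (10, 25, 50, 100)}
for n, power in approx.items():
    print(f"n = {n:3d}  approximate power = {power:.3f}")
```

The approximation is good enough to see the main message: with 10 items per group a real difference of half a standard deviation is missed about four times out of five.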

POWER AND SAMPLE SIZE CALCULATION

• 1. We can only afford a sample size of 10. What is the smallest difference
  that we can detect with a power of 80%?

• 2. We know that a difference of 20 in the test is important for our
  conclusions. What sample size do we need to detect this difference with a
  power of 80%?

• 3. We believe that we made a mistake when estimating the standard deviation:
  the real standard deviation is 150. What sample size do you propose to
  detect a real difference of 100 (with a power of 80%)?

• 4. What happens if you change the probability of error to 0.01?

• If the difference we want to detect decreases → n increases
• If the standard deviation increases → n increases
• If the probability of error decreases → n increases

COMPREHENSION EXERCISES

T-test catapulte.mtw

• We performed a t-test once with 50 shots and once with 20 shots.

• With 20 shots, we found no difference between the two groups; with 50 shots,
  we found a difference.

• Calculate the power of the test for 20 and 50 shots respectively.

Pied à coulisse.mtw

• What is the power of the test?

• Would you write in your conclusions that there is no difference between the
  two calipers?

ANOVA Test
POWER OF THE TEST TO COMPARE 3 OR MORE MEANS
POWER AND SAMPLE SIZE FOR ANOVA

• Minitab can calculate the power, the sample size or the minimum difference
  that can be detected between the largest and the smallest group average
  (Minitab calls this the maximum difference).
• STAT > POWER AND SAMPLE SIZE > ONE-WAY ANOVA

Annotations on the Minitab dialog box:
• The number of groups.
• You must provide data for 2 of the 3 parameters. Minitab will calculate the
  value of the 3rd parameter.
• Do not forget to indicate the value of the standard deviation.

POWER AND SAMPLE SIZE: EXAMPLE ANOVA

• Suppose we have 4 normal populations with averages of 50, 60, 50, 60.
  • 1. How many observations should we take from each of these populations so
    that the probability of rejecting Ho (the hypothesis that the population
    means are equal, when they are actually different) is at least 90%?
    Set α = 0.05.
  • A reasonable estimate of the standard deviation is σ = 5.

  • 2. What does the sample size become when σ = 6?

POWER AND SAMPLE SIZE: EXAMPLE ANOVA
Power and Sample Size
One-way ANOVA
α = 0.05  Assumed standard deviation = 5

Maximum Difference    Sample Size    Target Power    Actual Power
10                    9              0.9             0.932577

The sample size is for each level.

We need a sample size of 9.

POWER AND SAMPLE SIZE: EXAMPLE ANOVA

Power and Sample Size
One-way ANOVA
α = 0.05  Assumed standard deviation = 6

Maximum Difference    Sample Size    Target Power    Actual Power
10                    12             0.9             0.921388

We need a sample size of 12.

ADDITIONAL ANOVA DATASETS: EXAMPLES

Anova concentrations.mtw

• We assume that the sampling does not influence the measurement result.

• Calculate the power of the test. Are the conclusions valid, or should we
  increase the sample size?

• If a maximum difference of 1 is important, what sample size do you suggest?

30’

You are finalizing your project and want to calculate the number of samples to
be taken after improvement...

Operation Mode
• Groups of 2
• Available data: average time before = 10 h, standard deviation before = 1.5 h
• You expect an improvement of 2 hours in the average. What sample size should
  you take to confirm the results?
• Review = 10’ each
POWER AND SAMPLE SIZE: CONCLUSION

• Before collecting data
  • Large sample sizes give more power, but are more expensive.
  • What difference do you hope to detect?
  • Smaller differences will require more samples.

• After gathering data
  • When there is no significant difference (we cannot reject Ho), but the
    difference is important for the activity:
    • If the power is low, you need more data to validate the findings.
    • If the power is high, you can accept that there is no (significant)
      difference.
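As a rough guide to these trade-offs, the sample size per group needed to compare two means can be approximated with the normal distribution. This is a sketch under that assumption, not the exact calculation (a t-based calculation gives slightly larger sizes); δ is the difference to detect, σ the estimated standard deviation, α the probability of error and 1 - β the power:

```latex
n \approx \frac{2\,\sigma^{2}\,\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2}}{\delta^{2}}
```

With α = 0.05 and a power of 80% (z values of about 1.96 and 0.84), this gives n ≈ 15.7 σ²/δ²: halving the difference to detect multiplies the sample size by four.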

POWER AND SAMPLE SIZE SUMMARY

• When performing comparisons, we do not calculate the accuracy.

• To compare groups:
  • STAT > POWER AND SAMPLE SIZE >

• To compare the averages of 2 groups:
  • > 2-SAMPLE T

• To compare the means of several groups:
  • > ONE-WAY ANOVA

• To compare two (or more!) group proportions:
  • > 2 PROPORTIONS

POWER AND SAMPLE SIZE SUMMARY

• The sample size depends on:
  • The standard deviation (estimated) or the proportion of the group
    (estimated)
  • The significance level: the probability of error α. The default value of
    0.05 is usually used.
  • The power of the test: 1 - probability of error β. We want the power to be
    greater than 80%.
  • The size of the difference we want to detect.

• Do not forget the basic assumptions, which remain to be checked:
  • Sampling of the population
  • If the samples come from a process, the process is stable.
  • Random sampling.

Hypothesis testing
CASE STUDY
120’
You have a new project for the HR department: reduce home-to-work travel
time...

Operation Mode
• Groups of 2
• Learn about the process on the following page and the associated data
• Open the file "BB exercises Eng.xlsx" and select the data in the tab
  "complete exercise"
• Identify the most important factors affecting travel time and propose some
  additional studies and suggestions
• Review = 10’ each
HOME-TO-WORK COMMUTING TIME REDUCTION

Operational Impact
• Reduce delays and disruptions when teams start work

Problem Description
• Travel time of employees is unusually long and variable
Definition of a defect: suburb-to-suburb travel time > 25'

Operational Objective
• 95% of travel times < 25’

Scope
Included: suburb to suburb (between 15 and 30 km distance)
Excluded: alternate routes departing from the closest communes

Planning
Start: March
End of analysis: May
Implementation: August
End of the project: September

Team
Sponsor: V. Low
Project Manager / Black Belt: You
Team Members: 5 employees, HR manager

MACRO PROCESS AND POTENTIALLY INFLUENTIAL FACTORS

• Project Context
  • The project is in the ANALYZE phase
  • You have already identified a number of factors that are potentially
    influential and have started collecting the needed data

• Summary table

Factors (X): day, week, names, itinerary, climate, type of transport, jams?,
temperature, humidity
Process: Deliver packages of n-n’s
Output (Y): travel time