
INTRODUCTION

Introduction
• Your Name
• Organization
• Position
• Expectations of the Course
Introduction
• Who are we?
• What do we do?
• Why should you listen?
Pain + Solution
• What’s the pain?
• What are the current solutions?
• How can we eliminate the pain?
Design
• How is equipment designed?
• Who is involved in the design process?
• What design precautions are taken for safety?
• What design precautions are taken for maintenance?
Purchase
• How is purchasing done?
• Who is involved in the purchasing process?
• What influences the purchase decision: quality or price?
Operations & Maintenance
• Where to start with the involvement of Operations & Maintenance?
• Who designs the O&M Manual Plans?
• What is considered in the setting of O&M Manual Plans?
• When is a Maintenance Plan optimized?
Disposal
• Who is the custodian of an asset: finance, maintenance or operations?
• Is the disposal policy amended and, if so, how often?
• How will plant equipment be disposed of?
WHAT IS A
CULTURE OF
SAFETY?
SHE Policy
Reliability Policy
HOW
KNOWLEDGEABLE
ARE YOU IN RCA?
“QUICK POLL”
How Knowledgeable Are You in RCA?
“Quick Poll”
1 – Not very knowledgeable
4 – Moderate
8 – Good
10 – Excellent
What is Root Cause Analysis
• A root cause is defined as a factor that
caused a non-conformance and should
be permanently eliminated through
process improvement.
What is Root Cause Analysis
• The root cause is the core issue – the
highest-level cause – that sets in motion
the entire cause-and-effect reaction
that ultimately leads to the problem(s).
What is Root Cause Analysis
• Root cause analysis (RCA) is a
collective term that describes a wide
range of approaches, tools, and
techniques used to uncover causes of
problems.
What is Root Cause Analysis
• Some RCA approaches are geared more
toward identifying true root causes
than others, some are more general
problem-solving techniques, and others
simply offer support for the core
activity of root cause analysis.
TOP REASONS
WHY COMPANIES
USE RCA?
Top 5 Reasons Why Companies Do Not Use
RCA Effectively
1. RCA is ad hoc at best
2. The organization has not been formally
trained in RCA
3. The organization does not see the value
in RCA (takes too long)
4. KPI dashboards are not used to ensure
everyone knows the “score in the
game”
5. The CMMS is not effective in providing
the right information at the right time.
Top 5 Reasons Why Companies Use
RCA
1. To mitigate failures
2. To optimize asset reliability
3. To optimize process reliability
4. To reduce cost
5. To reduce stress
THE 5 ELEMENTS
OF AN EFFECTIVE
AND SUSTAINABLE
RCA PROGRAM
5 Elements of An Effective and
Sustainable RCA Program
FOREMOST RCA
TOOLS
AND
TECHNIQUES
Foremost RCA Tools and Techniques
ROOT CAUSE
ANALYSIS
TECHNIQUES
Root Cause Analysis Techniques
The five (5) most common Root Cause
Analysis tools include:
• Pareto Chart
• Ishikawa or Fishbone
• The 5 Whys
• Scatter Diagram
• Failure Mode and Effects Analysis
(FMEA)
ROOT CAUSE
TRIGGERS
Root Cause Analysis Triggers
Root Cause Analysis Triggers
Triggers determine the following:
• RCA Type and # of Resources Applied
Example:
Problem: Rolling element bearing consumption
trending up over past 6 months.
RCA Team: 1 Maintenance Tech, 1 Reliability
Engineer, Storeroom Manager
RACI Chart: Roles and Responsibilities Defined
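One way to make a trigger like the bearing example concrete is to fit a straight line to the monthly consumption and start an RCA when the slope exceeds a threshold. The sketch below is only an illustration: the monthly counts, the function names and the threshold of one extra bearing per month are all hypothetical and do not come from the course material.

```python
def trend_slope(values):
    """Least-squares slope of the values against their position (units per period)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def rca_triggered(monthly_usage, min_slope=1.0):
    """Return True when consumption rises by at least min_slope units per month."""
    return trend_slope(monthly_usage) >= min_slope


# Hypothetical bearing consumption for the past 6 months
bearings_used = [10, 12, 11, 15, 17, 19]
print(round(trend_slope(bearings_used), 2), rca_triggered(bearings_used))
```

A plant would choose its own threshold and window length; the point is that the trigger is written down and applied the same way every month.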
PARETO
CHART
What is PARETO CHART?
A Pareto Chart is a type of chart that
contains both bars and a line graph, where
individual values are represented in
descending order by bars, and the
cumulative total is represented by the line.
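The numbers behind a Pareto chart can be prepared with a few lines of code: sort the categories in descending order and accumulate a running percentage. The sketch below uses hypothetical failure counts and a hypothetical helper name, `pareto_table`; plotting is left out.

```python
def pareto_table(failure_counts):
    """Return rows of (cause, count, cumulative percent), largest count first."""
    total = sum(failure_counts.values())
    rows = []
    running = 0
    ordered = sorted(failure_counts.items(), key=lambda kv: kv[1], reverse=True)
    for cause, count in ordered:
        running += count
        rows.append((cause, count, round(100.0 * running / total, 1)))
    return rows


# Hypothetical counts of failures by cause
counts = {"Misalignment": 20, "Lubrication": 54, "Other": 5,
          "Imbalance": 12, "Seal leakage": 9}
rows = pareto_table(counts)
for cause, count, cumulative in rows:
    print(f"{cause:<14}{count:>4}{cumulative:>8}%")
```

The bars of the chart are the counts in this order, and the line is the cumulative percentage column.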
What is PARETO CHART?
5 WHY?
What is the 5 Why?
Five whys is an iterative interrogative
technique used to explore the cause-and-effect
relationships underlying a particular
problem.
• The primary goal of the technique is to
determine the root cause of a defect or
problem by repeating the question “Why?”.
• Each answer forms the basis of the next
question.
What is the 5 Whys?
The 5 Why Template
ISHIKAWA
/
FISHBONE
DIAGRAM
Ishikawa/Fishbone Diagram
• Ishikawa (fishbone) diagrams are causal
diagrams created by Kaoru Ishikawa
that show the potential causes of a
specific event.
• Common uses of the Ishikawa diagram
are maintenance process design and
quality defect prevention, to identify
potential factors causing an overall
effect.
Ishikawa(Fishbone) Diagram
SIPOC
SIPOC
• SIPOC stands for Suppliers, Inputs,
Process, Outputs and Customers.
• A SIPOC diagram is a high-level map of a
process that lists these five elements
in a single table.
SIPOC
SCATTER
DIAGRAM
/
PLOT
Scatter Diagram/Plot
• A scatter plot is a type of plot or
mathematical diagram using
coordinates to display values for
typically two variables for a set of data.
• If the points are coded, one additional
variable can be displayed.
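A scatter diagram is often read together with a correlation coefficient, which puts a number on how closely the two variables move together. The sketch below computes the Pearson coefficient by hand; the paired readings (bearing temperature against vibration) are hypothetical and only illustrate the calculation. A strong correlation does not by itself prove cause.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired readings: bearing temperature (deg C) and vibration (mm/s)
temperature = [60, 65, 70, 75, 80]
vibration = [2.0, 2.4, 3.1, 3.3, 4.0]
r = pearson_r(temperature, vibration)
print(round(r, 3))
```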
Scatter Diagram/Plot
FAILURE MODE
AND EFFECTS
ANALYSIS (FMEA)
FMEA
Failure Mode and Effects Analysis (FMEA)
is a method used to identify and
fully understand the potential
failure modes and their causes, and
the effects of failure on the system or end
users, for a given product or process.
FMEA
Fundamentals of FMEA
• Identify and fully understand potential failure
modes and their causes, and the effects of
failure on the system or end users, for a given
product or process.
• Assess the risk associated with the identified
failure modes, effects, and causes, and
prioritize issues for corrective action.
• Identify and carry out corrective actions to
address the most serious concerns.
FMEA Spreadsheet
FMEA
Purpose of the FMEA
• Methodology that facilitates process
improvement
• Identifies and eliminates concerns early in
the development of a process or design
• Improves internal and external customer
satisfaction
• Risk management tool; focuses on
prevention
• FMEA may be a customer requirement (likely
contractual, Level 3 PPAP, ISO 9001)
Learning FMEA, Training Objectives
Training Objectives:
• To understand the use of Failure Modes
and Effect Analysis (FMEA)
• To learn the steps to developing FMEAs
• To summarize the different types of FMEAs
• To learn how to link the FMEA to other
Process tools
FMEA, Summary
FMEA, a mathematical way to identify:
• failure modes, the ways in which a product
or process can fail
• the Effects and Severity of a failure mode
• Potential causes of the failure mode
• the Occurrence of a failure mode
• the Detection of a failure mode
• the level of risk (Risk Priority Number)
• actions that should be taken to reduce the
RPN
FMEA, Summary
FMEA, a mathematical way to quantify risk:
RPN = Severity × Occurrence × Detection
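The RPN formula is a plain product of three ratings. The sketch below assumes the 1 to 10 scales used later in this course and rejects values outside that range; the function name `rpn` is only illustrative.

```python
def rpn(severity, occurrence, detection):
    """Risk Priority Number: the product of three ratings on a 1-10 scale."""
    for name, value in (("severity", severity),
                        ("occurrence", occurrence),
                        ("detection", detection)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return severity * occurrence * detection


print(rpn(8, 3, 5))     # 8 * 3 * 5 = 120
print(rpn(10, 10, 10))  # highest possible value on 1-10 scales: 1000
```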
Benefits
Inputs might include other tools such as:
D-FMEA (Part and Assembly level) Defines VOC
• Customer requirements
• CTQ Flow down analysis
• Quality Function Deployment (House of
Quality)
• Risk assessments
Benefits
Inputs might include other tools such as:
P-FMEA (Process level) Delivers VOC
• Process flowchart
• Sequence Of Events
• Process Tooling
• Poka-Yoke list
FMEA, Applications Example
There are several situations where an FMEA is the
optimal tool to identify risk:
• Process-FMEA:
• Introducing a new process
• Reviewing existing processes after modifications
• Introducing new part numbers on an existing
production line
FMEA, Applications Example
There are several situations where an FMEA is the
optimal tool to identify risk:
• Design-FMEA:
• Introducing a new design, part, sub-assembly
or assembly
• Using an existing design for another application
• Reviewing existing designs after modifications
What is Failure Mode?
A Failure Mode is:
• The way in which the component, subassembly,
product or process could fail to perform its
intended function
• Failure modes may be the result of previous
operations or may cause subsequent operations to fail
• Things that could go wrong INTERNALLY:
Warehouse
Production Process
• Things that could go wrong EXTERNALLY:
Supplier Location
Final Customer
When to Conduct an FMEA
When to Conduct an FMEA?
• Early in the New Product Introduction (A-Build);
complete for B-Build.
• When new systems, products, and processes are
being designed
• When existing designs or processes are being
changed; FMEAs are to be updated
• When process improvements are made due to
Corrective Action Requests
History of FMEA
History of FMEA:
• First used in the 1960s in the aerospace industry
during the Apollo missions
• In 1974, the Navy developed MIL-STD-1629
regarding the use of FMEA
• In the late 1970s, the automotive industry was
driven by liability costs to use FMEA
• Later, the automotive industry saw the advantages
of using this tool to reduce risks related to poor
quality (QS-9000, VDA and ISO/TS 16949 standards)
Types of FMEAs
Design FMEA
• Analyzes product design before release to
production, with a focus on product function
• Analyzes systems and subsystems in early
concept and design stages
Process FMEA
• Used to analyze manufacturing and assembly
processes before they are implemented
FMEA – A Team Tool
• A team approach is necessary. For example, the
AubieSat-1 communication problems could have
been avoided by involving a practical, experienced
team!
• The team should be led by the right person (a Design,
Manufacturing or Quality Engineer, etc.) who is familiar
with FMEA
FMEA – A Team Tool
• The following Team members should be
considered:
- Design Engineers
- Process Engineers
- Supply Chain Engineers
- Line Design Engineers
- Suppliers
- Operators
- Practical Experts
FMEA – A Team Tool
Identify failure modes and their effects → Identify causes of the failure modes and controls → Prioritize → Determine and assess actions
FMEA – Form
FMEA Procedure
1. For each process input, determine the ways in
which the input can go wrong (failure mode)
2. For each failure mode, determine effects
- Select a Severity level for each effect
3. Identify potential causes of each failure mode
- Select an Occurrence level for each cause
4. List current controls for each cause
- Select a Detection level for each cause
RPN = Severity × Occurrence × Detection
FMEA Procedure
5. Calculate the Risk Priority Number (RPN)
6. Develop recommended actions, assign responsible
persons, and take actions
- Give priority to high RPNs
- MUST look at highest severity
7. Assign the predicted Severity, Occurrence, and
Detection levels and compare RPNs (before and
after risk reduction)
RPN = Severity × Occurrence × Detection
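Step 6 gives two ordering rules: work on high RPNs first, but never overlook the highest severities. One possible reading is sketched below, where any failure mode at or above a severity threshold is listed ahead of the rest and ties are broken by RPN. The threshold of 9, the class name and the example failure modes are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int
    occurrence: int
    detection: int

    @property
    def rpn(self):
        return self.severity * self.occurrence * self.detection


def prioritize(modes, severity_flag=9):
    """Order failure modes: flagged severities first, then descending RPN."""
    return sorted(modes,
                  key=lambda m: (m.severity >= severity_flag, m.rpn),
                  reverse=True)


# Hypothetical failure modes with 1-10 ratings
modes = [
    FailureMode("V-belt slips", 5, 6, 4),    # RPN 120
    FailureMode("Seal leaks", 7, 4, 5),      # RPN 140
    FailureMode("Guard missing", 10, 2, 3),  # RPN 60, but severity 10
]
for mode in prioritize(modes):
    print(f"{mode.name:<14} S={mode.severity:<3} RPN={mode.rpn}")
```

In this example the missing guard has the lowest RPN but is listed first, because a pure RPN ranking would hide a severity-10 item.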
Rating Scales
• Preferred scales are 1–10
• Adjust Occurrence scales to realistic figures for your
company
Severity:
1 = Not Severe, 10 = Very Severe
Occurrence:
1 = Not Likely, 10 = Very Likely
Detection:
1 = Easy to Detect, 10 = Not Easy to Detect
Trickling Filter
FMEA – Form
Identify failure modes and their effects → Identify causes of the failure modes and controls → Prioritize → Determine and assess actions
FMEA – Form
Severity
FMEA – Form
Occurrence
FMEA – Form
Detection
FMEA Procedure
RPN = Severity × Occurrence × Detection
FMEA, 10 Steps Checklist
1. Review the process: use a process flowchart to
identify each process component
2. Brainstorm potential failure modes: review
existing documentation and data for clues
3. List potential effects of failure: there may be more
than one for each failure
4. Assign Severity rankings, based on the severity of
the consequences of failure
5. Assign Occurrence rankings, based on how
frequently the cause of the failure is likely to occur
FMEA, 10 Steps Checklist
6. Assign Detection rankings, based on the chances
the failure will be detected prior to the customer
finding it
7. Calculate the RPN: S × O × D
8. Develop the action plan: define who will do what
by when
9. Take action: implement the improvements
identified by your PFMEA team
10. Calculate the resulting RPN: re-evaluate each of
the potential failures once improvements have been
made and determine the impact of the
improvements
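Step 10 compares the RPN before and after the improvements. A minimal sketch of that comparison follows; the ratings are hypothetical, and the helper name `rpn_reduction_percent` is not part of any standard.

```python
def rpn_reduction_percent(before, after):
    """Percentage drop in RPN between two (severity, occurrence, detection) tuples."""
    rpn_before = before[0] * before[1] * before[2]
    rpn_after = after[0] * after[1] * after[2]
    return 100.0 * (rpn_before - rpn_after) / rpn_before


# Hypothetical ratings: severity is unchanged, but the cause occurs less
# often and the control detects it more reliably after the improvement.
before = (7, 4, 5)  # RPN 140
after = (7, 2, 2)   # RPN 28
print(rpn_reduction_percent(before, after))
```

Severity usually stays the same unless the design itself changes, so most of the reduction comes from the Occurrence and Detection ratings.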
8-D PROBLEM-
SOLVING
PROCESS

8-D Problem Solving Process
• It includes the identification of the root cause
and contributory factors, determination
of risk reduction strategies and
development of action plans along with
measurement strategies to evaluate the
effectiveness of the plans.
8-D Problem Solving Process
• RCA is based on the basic idea that
effective management requires more
than merely “putting out fires” for
problems that develop; it requires finding a
way to prevent them.
8-D Problem Solving Process
• Essentially, RCA means finding the specific
source(s) that created the problem so that
effective action can be taken to prevent
recurrence of the situation.
8-D PROBLEM-SOLVING PROCESS
FAULT TREE
ANALYSIS
/
DIAGRAM
FTA Process
• Step 1: Define the fault condition, and write down
the top-level failure.
• Step 2: Using technical information and
professional judgment, determine the possible
reasons for the failure to occur.
• Step 3: Continue to break each element down with
additional gates to lower levels. Consider the
relationships between the elements to help you
decide whether to use an “and” or an “or” logic gate.
FTA Process
• Step 4: Finalize and review the complete diagram.
The chain can only be terminated in a basic fault:
human, hardware or software.
• Step 5: If possible, evaluate the probability of
occurrence for each of the lowest-level elements
and calculate the statistical probabilities from the
bottom up.
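The bottom-up calculation in Step 5 can be sketched with two small functions, assuming that the basic events are statistically independent. Under that assumption an AND gate multiplies the input probabilities, and an OR gate is one minus the probability that none of the inputs occurs. The event names and probabilities below are hypothetical.

```python
from math import prod


def and_gate(probabilities):
    """All inputs must occur: multiply the probabilities (independence assumed)."""
    return prod(probabilities)


def or_gate(probabilities):
    """At least one input occurs: 1 minus the chance that none occurs."""
    return 1 - prod(1 - p for p in probabilities)


# Hypothetical tree: the pump stops if the power supply fails, OR if
# both the duty pump AND the standby pump fail.
p_duty_fails = 0.1
p_standby_fails = 0.2
p_power_fails = 0.05

p_both_pumps = and_gate([p_duty_fails, p_standby_fails])
p_top = or_gate([p_both_pumps, p_power_fails])
print(round(p_top, 4))
```

If the basic events share a common cause, the independence assumption no longer holds and these simple products will understate the risk.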
Fault Tree Analysis/
Summary of FTA Symbols
DMAIC
RCA
PROCESS
5-Step RCA Process
• Step 1: Define (What problem needs to be solved?)
• Step 2: Measure (Collect data and evidence to
determine the scope and magnitude of the
problem)
• Step 3: Analyze (Identify and classify the root
cause(s) of the problem)
5-Step RCA Process
• Step 4: Improve (What countermeasures or
solutions will solve the problem?)
• Step 5: Control (Evaluate the effectiveness, impact
and sustainability of the implemented solution)
DMAIC RCA Process (Illustration)
Step 1: Define
• Objective: Identify and define the problem
• Specify the nature, the magnitude, the location(s)
and the timing of events
• Use an integrated methodology (quantitative and
qualitative techniques)
• Outcome: Definition of a well-formulated Problem
Statement
Step 2: Measure
• Objective: Determine the scope and significance
of the problem through the collection of data and
evidence.
• Preferred Tools
• CATWOE or Situational Analysis
• Outcome: Development of a Scope-Significance
Matrix and an in-depth understanding of the
problem.
CATWOE Analysis
Scope-Significance Matrix
Data Collection Methods and Sources
of Data
• The two (2) categories of data collection
methods:
• Quantitative – numeric, e.g. statistics
• Qualitative – subjective, e.g. surveys and
focus groups.
• The predominant sources of data:
• Electronic | Documentary | Experimental |
Human
Data Collection Process
• Step 1: Develop a Data Collection Plan/Strategy
• Step 2: Data Collection
• Step 3: Data Collation
• Step 4: Data Analysis
Data Collection Process
• Step 5: Data Interpretation
• Step 6: Data Verification
• Step 7: Publication
Step 3: Analyze
• Objective: To identify the root cause(s) of the problem
• Preferred Tool – 5 Whys Analysis
• Outcome: Identification and classification of the
underlying cause that must be addressed to
alleviate/remedy the problem.
Step 3: Analyze
Step 4: Improve
• Identify countermeasures/solutions for the problem
(by means of creative and analytical thinking)
• Evaluate the proposed solutions (by means of a
Decision Matrix) against the following criteria:
• Viability | Feasibility | Sustainability
• Apply risk mitigation techniques:
• FMEA | Impact Analysis | Force-Field Analysis
• Implement the solution (by means of an Action Plan)
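A decision matrix of the kind mentioned above can be reduced to a weighted score for each option. The sketch below scores two hypothetical solutions on viability, feasibility and sustainability; the weights, the 1 to 10 scores and the option names are assumptions chosen only to show the mechanics.

```python
CRITERIA = ("viability", "feasibility", "sustainability")


def weighted_score(scores, weights):
    """Sum of score times weight over the three criteria."""
    return sum(scores[c] * weights[c] for c in CRITERIA)


def best_option(options, weights):
    """Return the name of the option with the highest weighted score."""
    return max(options, key=lambda name: weighted_score(options[name], weights))


# Hypothetical weights (they sum to 10) and 1-10 scores per option
weights = {"viability": 4, "feasibility": 3, "sustainability": 3}
options = {
    "Scheduled V-belt replacement": {"viability": 8, "feasibility": 9, "sustainability": 6},
    "Belt tension monitoring": {"viability": 7, "feasibility": 5, "sustainability": 9},
}
for name, scores in options.items():
    print(name, weighted_score(scores, weights))
print("Selected:", best_option(options, weights))
```

The team should agree on the weights before scoring, so that the matrix records the reasoning behind the choice.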
Force-Field Analysis
Implementation Action Plan Template
Step 5: Control
• Objective: Evaluate the effectiveness, impact and
sustainability of the implemented solution.
• Management
• Observe
• Monitor
• Evaluate
• Review
• Amend
Step 5: Control
DEFINING
ROLES
&
RESPONSIBILITIES
Roles and Responsibility
Steps to Creating an Effective Failure
Mitigation (RACI)
R – Responsible
A – Accountable
C – Consulted
I – Informed
Roles and Responsibility
Creating an Effective Failure Mitigation (RACI)
1. Assemble a team of people involved in Failure
Mitigation/Elimination (e.g. Planner, Supervisor,
Technician, Reliability Engineer, Production)
2. Educate the team in the Failure Mitigation
Strategy
3. Define the process/tasks/steps required for
success of Failure Elimination
4. Facilitate the team through the RACI Process
Roles and Responsibility
Creating an Effective Failure Mitigation (RACI)
5. Post the RACI Chart along with a KPI Dashboard
focused on the resolution for all to see
6. Perform RCA when a costly failure impacts
production and cost.
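A RACI chart can be held as a simple table of tasks and role codes, which makes it easy to check before it is posted. The sketch below applies a commonly used convention, assumed here rather than taken from the slides, that each task has exactly one Accountable role and at least one Responsible role. The tasks and role assignments are hypothetical.

```python
def raci_problems(chart):
    """List the tasks that break the one-Accountable, some-Responsible convention."""
    problems = []
    for task, roles in chart.items():
        codes = list(roles.values())
        if codes.count("A") != 1:
            problems.append(f"{task}: needs exactly one Accountable")
        if "R" not in codes:
            problems.append(f"{task}: needs at least one Responsible")
    return problems


# Hypothetical chart: task -> {role: RACI code}
chart = {
    "Write work order": {"Planner": "A", "Technician": "R", "Production": "I"},
    "Replace V-belt": {"Supervisor": "C", "Technician": "R"},
}
problems = raci_problems(chart)
for problem in problems:
    print(problem)
```

Here the second task is reported because nobody is accountable for it, which is the kind of gap the RACI process is meant to expose.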
INCIDENT
[ACCIDENT]
INVESTIGATIONS
WHY SHOULD ALL
MAINTENANCE
ORGANIZATIONS
EMBED RCA INTO
THEIR CULTURE?
Why Should All Maintenance
Organizations Embed “RCA” into their
Culture?
1. To reduce “Human Induced Failures”
2. To reduce stress
3. To reduce Maintenance and Total Cost
4. To reduce employee turnover
5. To ensure equipment “meets
expectations of the owners”
HOW TO REDUCE
OR MITIGATE
FAILURES IN
MAINTENANCE
How to Reduce or Mitigate Failure in
Maintenance
1. Create a cross-functional team to resolve
repeat and major failures
2. Define what constitutes a failure
3. Educate your staff in Root Cause Analysis
Techniques
4. Create Triggers to initiate specific Root Cause
Analysis Events
5. Measure the Outcomes of Root Cause Analysis
Events
1. Creating a Plant Cross-Functional Team to
Resolve Repeat and Major Loss Failures
• Identify the stakeholders on the team
• Production Manager
• Maintenance Manager
• Safety Manager
• Stores Manager
• Finance
• Maintenance Supervisor
• Senior Maintenance Technician
2. Define What Constitutes a Failure
• Partial Functional Failure
2. Define What Constitutes a Failure
• Total Functional Failure
3. Educate Your Staff in Root Cause Analysis
Techniques
As an Example – 5 Whys
Five whys (or 5 whys) is an iterative interrogative
technique used to explore the cause-and-effect
relationships underlying a particular problem.
The primary goal of the technique is to determine
the root cause of a defect or problem by repeating
the question “Why?”.
3. Educate Your Staff in Root Cause Analysis
Techniques
Each answer forms the basis of the next question. The “five” in the name
derives from an anecdotal observation on the number of iterations needed to
resolve the problem.
1. Why? – Production line stopped. (First why)
2. Why? – V-belt failure on blower. (Second why)
3. Why? – V-belt problem reported by production, however no action taken.
(Third why)
4. Why? – No one wrote a Work Order to replace the V-belt. (Fourth why)
5. Why? – What do you think was the root cause? Text in your answer. (Fifth
why, a root cause)
4. Create Triggers to initiate specific RCA Events
5. Measure the Outcomes of Root Cause Analysis
Events
Begin with Current Metrics
• PM Labour Hours
• Emergency/Urgent Labour Hours
• Schedule Compliance
• Rework
• OEE
Display a KPI Dashboard so everyone
knows how effectively this process is working.
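Some of the metrics above reduce to simple ratios that a dashboard can recalculate each week. The sketch below uses the common definition of OEE as availability × performance × quality, and treats schedule compliance as completed scheduled hours divided by scheduled hours. The weekly figures and function names are hypothetical.

```python
def oee(availability, performance, quality):
    """Overall Equipment Effectiveness as the product of three ratios (0 to 1)."""
    return availability * performance * quality


def schedule_compliance(completed_hours, scheduled_hours):
    """Share of scheduled maintenance hours that were actually completed."""
    return completed_hours / scheduled_hours


def emergency_share(emergency_hours, total_hours):
    """Share of all maintenance labour spent on emergency or urgent work."""
    return emergency_hours / total_hours


# Hypothetical weekly figures
print(f"OEE: {oee(0.90, 0.95, 0.99):.1%}")
print(f"Schedule compliance: {schedule_compliance(340, 400):.1%}")
print(f"Emergency labour share: {emergency_share(120, 600):.1%}")
```

Tracking the same ratios before and after each RCA event shows whether the corrective actions are moving the numbers.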
CONNECT THE
DOTS
Connect all 9 dots by drawing
4 straight consecutive lines.
ROOT CAUSE
PROCESS
Step 1
-
Define the Problem
Define the Problem
• The Elephant in the Room – Selecting the Right
Problem
• Scoping the Problem Appropriately
• The Problem Statement (5W2H)
• Draw Different Run Charts in Excel
• Case Study Examples of Problem Statements
• Draw a Project Decision Matrix in Word
5W2H
5W2H
• Who? Individuals/customers associated with the
problem
• What? The problem statement or definition
• When? Date and time the problem was identified
• Where? Location of complaints (area, facilities,
customers)
• Why? Any previously known explanation
• How? How did the problem happen (root cause)
and how will the problem be corrected (corrective
action)?
• How many? Size and frequency of the problem
Step 2
-
Understand the
Process Flow
Understand the Process Flow
• Setting the Appropriate Boundaries
• Complete the Process Flow Chart Templates
• Why is process so important, and why does it fail?
• Complete the SIPOC Excel Diagram Template
Step 3
-
Human Error and
Incident
Human Error and Incident
• The 12 Causes of Communication Problems
• The 12 Causes of Poor Motivation
• The 12 Steps to Create an Effective Team
Step 4
-
Start Improving &
Corrective Actions
Project
Start Improving & Corrective Actions
Project
• Creating a Problem Solving Project Charter
• Creating a Continuous Improvement Project Scope
• Creating a Visual RCA Project Management Tool
Problem Solving Walk-Around
• Having a technical perspective
• Checking the numbers closely
• Examining things critically
• Concentrating on fixing it
• Calling in an expert
• Analyzing in depth
• Doing research
Problem Solving Walk-Around
• Getting excited, maybe impatient
• Generating lots of ‘crazy’ ideas
• Looking for new perspectives
• Breaking the rules to solve
• Looking at ‘the big picture’
• ‘Sleeping on it’
• Brainstorming
Problem Solving Walk-Around
• Making a plan
• Minimizing the risk
• Taking first things first
• Organizing the information
• Focusing on time and timelines
• Searching for overlooked details
• Considering steps to be completed
Problem Solving Walk-Around
• Talking it out
• Building a team
• Calling a friend for help
• Listening to own intuition
• Getting emotionally engaged
• Considering own values/feelings
• Persuading those involved to help
The Problem?
What?
Why?
How? Who?
The Elephant in the Room
“An important and obvious topic, which everyone present is
aware of, but which isn’t discussed, as such discussion is
considered to be uncomfortable.”
Image source: www.davepear.com
Whole Brain Problem Solving?
Image source: www.davepear.com
REACTIVE
MAINTENANCE
ATTRIBUTES
Reactive Maintenance Attributes
• Ineffective or no planning and scheduling
• PM compliance has a wide variance
• Performing PM on equipment that continues to
break down
• Overnight deliveries sit for weeks or months
• Everyone works as hard as they can with little, if
any, movement seen toward proactive maintenance
• Storeroom is in chaos (people standing in line at
7:00 am waiting on parts)
5W2H
• Who? Individuals/customers associated with the
problem
• What? The problem statement or definition
• When? Date and time the problem was identified
• Where? Location of complaints (area, facilities,
customers)
• Why? Any previously known explanation
• How? How did the problem happen (root cause)
and how will the problem be corrected (corrective
action)?
• How many? Size and frequency of the problem
Do you want to be Proactive or Reactive?
Proactive | Reactive
Not enough hours in
the day
We need to cut back on
contractors
Hurry and get those
PMs done
We need to stop
ordering parts
World Class Standards
World Class Standards
Discussion Points Assets
Pumps
• Design, installation, and operation are dominant
factors that affect a pump’s mode of failure.
Centrifugal
• Centrifugal pumps are especially sensitive to
variations in liquid condition (i.e., viscosity, specific
gravity, and temperature); suction variations, such
as pressure and availability of a continuous volume
of fluid; and variations in demand.
Discussion Points Assets
• Mechanical failure may occur for a number of reasons.
• Some failures are induced by:
• Cavitation,
• Hydraulic instability, or other
• System-related problems
• Others are the direct result of improper maintenance
• Maintenance-related problems include:
• Improper lubrication
• Misalignment
• Imbalance
• Seal leakage
Discussion Points Assets
Cavitation
• Cavitation in a centrifugal pump, which has a significant,
negative effect on performance, is the most common failure
mode.
• Cavitation not only degrades a pump’s performance but also
greatly accelerates the wear on its internal components.
Causes
• Three causes of cavitation in centrifugal pumps are change of
phase, entrained air or gas, and turbulent flow.
Discussion Points Assets
Change of Phase
• The formation or collapse of vapour bubbles in either
the suction piping or inside the pump is one of the
causes of cavitation.
• This Failure Mode normally occurs in applications, such
as Boiler Feed, where the incoming liquid is at a
temperature near its saturation point.
• A slight change in suction pressure can cause the liquid
to flash into its gaseous state.
The Question
• How long are bearings designed to last?
• What is L10 life?
• The age to which at least 90% of a sufficiently large population
of the same bearings under the same conditions can
reasonably be expected to survive.
• Predicted life is 5 × L10
Lubrication Dynamics
For any lubricant –
• As pressure goes up…
• Viscosity goes up exponentially.
• At 1,378 MPa the lubricant becomes harder than the metal.
• And thus, the metal surfaces never touch.
Hydrodynamic Lubrication
• High Conformity
• Large contact area
• Low contact pressure
• Oil Wedge between surfaces
• Oil film supports the load
• Contact pressure may be 400 MPa
• Typically only several hundred psi
• Journal Bearing
Elastohydrodynamic Lubrication
• Low Conformity
• Small contact area
• High contact pressure
• Oil wedge between surfaces
• Oil film supports the load
• Contact pressure 3103 MPa
• Rolling Element Bearing
Bearing Packaging
Bearing Installation
Lubrication Dynamics
Lubrication Dynamics
Lubrication Dynamics
Normal Race
Normal Race
Release Relieve Relationship
How often does this happen in a Day?
• 1775 rpm (29.583 cycles per second)
• 8.193 Ball Pass Frequency Inner Race
24 hours
× 60 minutes per hour
× 60 seconds per minute
× 29.583 cycles per second
× 8.193 BPFI
= 20,941,072 cycles per day
= 628,232,161 cycles per month (30 days)
≈ 7.54 billion cycles per year (12 months of 30 days)
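The load-cycle arithmetic can be reproduced in a few lines. The sketch below follows the slide in rounding the shaft speed to 29.583 cycles per second and in treating 8.193 as the number of inner-race ball passes per shaft revolution. The 30-day month and the year of twelve such months are assumptions made to match the slide's daily and monthly totals.

```python
RPM = 1775
BPFI_ORDER = 8.193  # inner-race ball passes per shaft revolution, from the slide

shaft_hz = round(RPM / 60, 3)            # 29.583, rounded as on the slide
seconds_per_day = 24 * 60 * 60           # 86,400
cycles_per_day = seconds_per_day * shaft_hz * BPFI_ORDER
cycles_per_month = cycles_per_day * 30   # assumes a 30-day month
cycles_per_year = cycles_per_month * 12  # assumes twelve 30-day months

print(f"{cycles_per_day:,.0f} cycles per day")
print(f"{cycles_per_month:,.0f} cycles per month")
print(f"{cycles_per_year:,.0f} cycles per year")
```

The yearly total comes out at roughly 7.5 billion load cycles on the inner race, which is why small surface defects grow so quickly once they start.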
Subsurface Fatigue
Subsurface Fatigue - Advanced
Spalling
Vibration Data from Motor Bearing
Vibration from Screen Bearing
Surface Fatigue
• Looks like:
• Contact Fatigue
• Spalling
• Pitting
• Brinelling
• Fretting
• Begins with denting due to particles or shock loads
• Causes surface imperfections that see repeated high loading,
which causes a pop-out of surface material.
First Sign of Trouble
P-F Curve
Accelerators - Fatigue
• Imbalance
• Misalignment
• V-belts Too Tight
• Broken Bolts
• Loose Bolts
• Broken Welds
• Overload
• Over Speed
• Shock Load
Lubrication Dynamics - Bearings
Lubrication Dynamics - Bearings
Lubrication Dynamics - Bearings
Accelerators – Adhesive Wear
• Lubricant Too Light
• Lubricant Too Heavy
• Water
Lubrication Dynamics - Bearings
Lubrication Dynamics - Bearings
Abrasive Wear
• Looks like:
• Curls
• Corkscrew
• Long needles
• Edges serrated
• 5-100 microns
• Results from abrasive particles
• Also called: plowing, cutting, gouging
• Caused by 2- or 3-body abrasion with sliding
contacts
Accelerators of Wear
• Blown Seal
• Poor Lubricant Handling Practices
• Poor Lubrication Procedures
Often Misquoted
Bearing Failure Analysis
Bearing Failure Analysis
Bearing Failure Analysis
Bearing Failure Analysis - Vibration
I-P-F Curve
Precision Maintenance
Lubrication Related Failures
• Lubrication-related failures account for 54% of all
bearing failures!
• Any serious effort to permanently reduce the cost
of maintenance must include upgrading the quality
of the lubrication.
Gearbox
Gearbox (Background)
Background continued …
• Over the next few days they replaced the gearbox with a
spare
• Vendor was consulted. They “knew exactly what went
wrong”
• Insurance company requested an independent Root
Cause Failure Analysis
Gearbox Failure Root Cause Analysis
Gearbox Failure Root Cause Analysis
Gearbox Data Collection
• Loading, both before the incident and historically
• Equipment design, ratings (what was it expected to do?)
• Maintenance History
• Vibration Analysis/Report
Gearbox Data Collection
Interviews and Data Collection
Interviews and Data Collection
Root Cause Conclusion
Gearbox Inspection
• Results from the 10-yearly inspection:
• Gear backlash and clearances within spec, although it was
noted that not all readings could be obtained.
• Fine metal particles observed in oil
Gearbox Inspection
Gearbox Routine Inspection
• The Gearbox was next inspected in April 2012.
• When the oil was drained and the covers lifted, rust was found on the underside
of the lid.
• 2nd stage intermediate gear appeared to have ‘pieces’ of teeth
missing. (16 helical teeth @ 246 rpm)
• Condition Monitoring Group called in for investigation
• “How could you guys have missed this?!”
Condition Monitoring Initial Inspection
• Visual inspection and NDT were performed on the 6
affected helical gears.
Condition Monitoring Initial Inspection
Condition Monitoring Initial Inspection
• Recent and past vibration data, together with oil reports, were
reviewed.
• No immediate action items or trends were identified indicating that this gearbox had a
problem.
• The earliest Condition Monitoring data from this machine is from
November 2000, although there had been sporadic vibration
analysis before this.
• Oil was totally drained from the gearbox and further samples taken.
Detailed Investigation
• No upward trending of vibration (6-weekly
schedule) or of contaminants in oil (24-weekly schedule)
• Bottom of gearbox provided several ‘chunks’ of gear
tooth material, up to 30 mm long.
Detailed Investigation (cont.)
• Additional oil tests from the bottom oil showed metal particles
consisting of:
• Fatigue and sliding gear particles up to 220 µm in size and
showing aged and oxidised appearances.
• Metallurgical testing of gear ‘pieces’:
• Pitting and spalling fatigue
• Causes could include lack of lubrication, misalignment of gears, shaft
deflection at start-up, increased load, excessive shaft clearances or
backlash.
• Progressive failure, not a single event.
Detailed Investigation (cont.)
• Rust in tank
• Tonnage/load rate increased for conveyor around year 2000, but
still within gearbox design.
• Last major inspection: incomplete measurements.
• Oil level lower than design, i.e. 200 L lower than the design level of 600 L;
wrong sight glass.
Recommendations
• Report given to Coal Plant Maintenance Strategy (CPMS).
• Monitor and increase CM frequency and re-inspect
at 6 months.
• CPMS wanted immediate repair; Asset
Management became involved, requesting more
evidence.
Last Inspection
• Most recent inspection has shown damage to be
increasing, i.e. more scuffing noted on surface of
2/3 of helical teeth.
• Still no significant oil or vibration data.
• Recommended planned change-out in near future.
• Motor/gearbox change planned for March/April
2012.
Improvement Opportunities
• Accurate History and Data
• Use all forms of Condition Monitoring and
Inspection
• Communication Flow.
Task Team Work
RCA
Personnel Error
Diagnostic Chart
Low Alertness High Stress High Confusion Overconfidence Low Compliance
Boredom Inadequate Inadequate Habit
Illness Fatigue High Cognitive Unawareness Embarrassed
(Non- Tracking Motivation Intrusion
(Inadequate (Mental and
Stimulating
Complexity Overload (Information
(Non-
(Fear of
Deficiency) (Place-Keeping (Inadequate
Exposure)
Action) Physical) (KB Perf. Error) (Repeat Mistakes)
Work) Failure) Attention to Detail) Compliance)
• Mental or • Inadequate • Poor work • Inadequate • Multi-tasking • Lack of • Inadequate • Poor • Reflex • Task too
physical sleep planning knowledge KB teamwork written accountability • Poor complex
weakness • Poor shift • Poor • Not familiar performance • Information instructions • Mindset supervision • Lack of
• Distraction work rotation motivation with task • Inadequate not available • High rate of • Perceived • Perceived procedures
due to illness • Circadian • Goal • Perceived time training or • Not enough distraction time pressure time pressure • Poor
• Excessive fatigue confusion pressure procedures information • Insufficient • Low supervision
work • Post meal/ • Poor • High rate of information consequence • Poor training
schedule mid- procedures interruptions • Wrong
without days afternoon • Poor • Too much info assumptions
to rest and • Repeated teamwork • Perceived time • Low
recover errors at • Spatial pressure expectations
certain times disorientation • Distractive
environment
• Low Morale (High Stress): high emotion; job danger; fear of unemployment; lack of confidence; poor teamwork; perceived time pressure
• Poor Supervision (Inadequate Work Assignment): job assignment error; staffing error; poor accountability; poor command; poor communication
• Fear of Failure (Fear Results of Error): task confusion; poor accountability; poor supervision; perceived time pressure; poor teamwork; inadequate motivation
• Underestimated Complexity (Inadequate Info Checking): long time on same job; low management standards; poor procedures
• Shortcuts (Wrong Action): poor supervision; poor accountability; long time on same job; perceived time pressure
• Tunnel Vision (Information Deficiency): tasks outside of procedure; reflex; poor teamwork; perceived time pressure
• Mindset (Actions Based on Reflex): tasks outside of procedure; assumptions; tribal knowledge; perceived time pressure
© 2006-2013 Reliability Center, Inc. (www.Reliability.com)
Manufacturing
Root Cause Analysis Example #1
Identify Problem
Part polarity reversed on circuit board
Selecting the Team
Team members:
Team Leader – Terry
Inspector – Jane
Worker – Tammy
Worker - Joe
Quality Eng – Rob
Engineer – Sally
Immediate Action
• Additional inspection added after this assembly
process step to check for reversed part defects
• Last 10 lots of printed circuit boards were
re-inspected to check for similar errors
RCA
Part reversed
Why?
Worker not sure of correct part orientation
Why?
Part is not marked properly
Why?
Engineering ordered it that way from vendor
Why?
Process didn’t account for possible manufacturing issues
Corrective Action
Permanent – Changed part to one that can only be
placed in correct direction (mistake proofed).
Found other products with similar problem and
made same changes.
Preventive – Required that any new parts selected
must have orientation marks on them.
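The "keep asking why" chain from Example #1 can be written down as a small data structure. The sketch below is illustrative only: the `WhyChain` class and its method names are assumptions made for this example, not part of any RCA tool, and treating the deepest recorded answer as the root cause is a simplification of the team's judgement.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WhyChain:
    """A problem statement plus the successive answers to "Why?" (hypothetical helper)."""
    problem: str
    answers: List[str] = field(default_factory=list)

    def ask_why(self, answer: str) -> "WhyChain":
        """Record the answer to the next "Why?" and return self for chaining."""
        self.answers.append(answer)
        return self

    def root_cause(self) -> str:
        """Deepest answer recorded so far (the problem itself if none)."""
        return self.answers[-1] if self.answers else self.problem

    def render(self) -> str:
        """Lay the chain out the way the slides do: statement, Why?, answer."""
        lines = [self.problem]
        for answer in self.answers:
            lines.extend(["Why?", answer])
        return "\n".join(lines)


# Example #1 from the slides, entered one "Why?" at a time.
chain = (
    WhyChain("Part reversed")
    .ask_why("Worker not sure of correct part orientation")
    .ask_why("Part is not marked properly")
    .ask_why("Engineering ordered it that way from vendor")
    .ask_why("Process didn't account for possible manufacturing issues")
)
print(chain.render())
print("Root cause:", chain.root_cause())
```

The same structure fits the other worked examples; only the problem statement and the answers change.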
Manufacturing
Root Cause Analysis Example #2
Identify Problem
A manager walks past the assembly line and notices a
puddle of water on the floor. Knowing that the water is
a safety hazard, she asks the supervisor to have
someone get a mop and clean up the puddle. The
manager is proud of herself for “fixing” a potential
safety problem.
RCA
But What is the Root Cause?
The supervisor looks for a root cause by asking “Why?”
Immediate Action
Knowing that the water is a safety hazard, the
manager asks the supervisor to have
someone get a mop and clean up the puddle.
RCA
Puddle of water on the floor
Why?
Leak in overhead pipe
Why?
Water pressure is set too high
Why?
Water pressure valve is faulty
Why?
Valve not in preventative maintenance program
Corrective Action
• Permanent – Water pressure valves placed in preventative
maintenance program.
• Preventive – Developed checklist form to ensure new
equipment is reviewed for possible inclusion in preventative
maintenance program.
Manufacturing
Root Cause Analysis Example #3
Identify Problem
Customers are unhappy because they are being
shipped products that don't meet their specifications.
Immediate Action
Inspect all finished and in-process product to ensure it meets
customer specifications.
RCA
Product doesn’t meet specifications
Why?
Manufacturing specification is different from
what customer and sales person agreed to
Why?
Sales person tries to expedite work by calling
head of manufacturing directly
Why?
Manufacturing schedule is not available for
sales person to provide realistic delivery date
Why?
Confidence in manufacturing schedule is not
high enough to release/link with order system
Why?
Parts sometimes not available, thereby
creating schedule changes
Why?
Expediting and priority changes consume
parts not planned for
Why?
Manufacturing schedule does not reflect
realistic assembly and test time
Why?
No ongoing review of manufacturing standards
Corrective Action
Permanent – Manufacturing standards
reviewed and updated.
Preventive – Regular ongoing review of actuals
vs standards is implemented.
RCA
Identify Problem
Department didn’t complete their project on time
Determine the Team
Team members:
Boss – Jim
Worker – Tom
Worker - Karen
Project Mgr – Bob
Admin – Sally
Immediate Action
• Additional resources applied to help get the
project team back on schedule
• No new projects started until Root Cause
Analysis completed
RCA
Didn’t complete project on time
Why?
Cause and Effect
[Fishbone diagram. Effect: Didn’t complete project on time.
Categories: Procedures, Personnel, Materials, Equipment.
Causes shown: Poor project plan; Poor project mgmt skills;
Lack of worker knowledge; Lack of resources; Inadequate computer
programs; Poor documentation; Inadequate computer system.]
RCA
Didn’t complete project on time
Why?
Resources unavailable when needed
Why?
Took too long to hire Project Manager
Why?
Lack of specifics given to
Human Resources Dept
Why?
Root Cause
No formal process for submitting job opening
Corrective Action
• Permanent – Hired another worker to meet
needs of next project team
• Preventive – Developed checklist form with HR
for submitting job openings in the future
RCA
Identify Problem
High pyrogen count on finished medical
catheter product using molded components.
Immediate Action (and panic!)
• Quarantine all finished and in-process products
(over $2 million worth!)
• Analyze location of pyrogen to find common
denominator
Panic Driven Action
Panic-driven Immediate Reaction
(without root cause analysis)
• Pyrogen traced to molding cooling water leak. Holy
cow! … The cooling water system hasn’t been cleaned
in 15 years!
• Shut down 24/7 molding operation for 2 days to
clean cooling water system
• Implement system for weekly analysis of cooling
water for pyrogens
• Threaten to fire anyone who doesn’t report a
cooling water leak
Panic Driven Action Results
Results of Panic-driven Immediate Reaction
(without root cause analysis)
• Day 1 after cooling water system cleaning: water tests
clean of pyrogens
• Day 2: cooling water is saturated with pyrogens. Uh
oh.
• All operators and technicians reporting “possible
water leaks” on all presses, all molds, all shifts… “just
in case”.
• Molding operation shuts down. Operations manager
nearly fired.
• “Help” flying in from corporate offices and other
molding plants.
• Hourly conference calls to give status updates to
executives.
Logic Returns
• There must be a better way! How about trying
something called “Root Cause Analysis”?
RCA
Pyrogens on molded components
Why?
Parts released from molding even though they
had been sprayed with leaking cooling water
Why?
Disposition of contaminated parts procedure
does not discuss water
Why?
Oil, grease, dust, human contact believed to
be primary sources of contamination
Why?
Root Cause
No formal evaluation of contamination sources,
types, severity, and disposition action.
Corrective Action
• Permanent – Disposition of contaminated parts
procedure re-written to include water.
• Preventive – Formal study of contamination
sources, consequences, and disposition
requirements.
Remember Just Keep Asking
Why did it happen?
PROBLEM: Didn’t get to work on time. --- Why?
Direct Causes: Car wouldn’t start. --- Why?
Contributory Causes:
- Battery was dead. --- Why?
- Dome light ON all night. --- Why?
Root Causes:
- Kids played in car, left door ajar.
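The direct / contributory / root split above can be sketched as a simple positional rule over the ordered answers to "Why?". This is a minimal sketch under a stated assumption: the function name `classify_causes` is hypothetical, and the rule "first answer is direct, last is root, the rest are contributory" is a simplification for illustration, since real investigations classify causes by evidence rather than position.

```python
from typing import Dict, List


def classify_causes(answers: List[str]) -> Dict[str, List[str]]:
    """Split ordered "Why?" answers into direct, contributory and root causes.

    Assumption (illustrative only): the first answer is the direct cause, the
    last is the root cause, and everything in between is contributory. A chain
    with a single answer has only a direct cause.
    """
    return {
        "direct": answers[:1],
        "contributory": answers[1:-1],
        "root": answers[-1:] if len(answers) > 1 else [],
    }


# The "didn't get to work on time" chain from the slide.
late_for_work = [
    "Car wouldn't start",
    "Battery was dead",
    "Dome light on all night",
    "Kids played in car, left door ajar",
]
causes = classify_causes(late_for_work)
for kind, items in causes.items():
    print(f"{kind}: {'; '.join(items)}")
```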
Room 1 Task
Perform a Root Cause Analysis on water not
reaching Community A as per the original
specification.
Room 2 Task
Perform a Root Cause Analysis on lead residue
found in your Water Quality Report.
Room 3 Task
Perform a Root Cause Analysis on why your
plants have a lot of LTIs (Lost Time
Injuries).
Questions to Ask?
PEOPLE
• Was the document properly interpreted?
• Was the information properly disseminated?
• Did the recipient understand the information?
• Was the proper training to perform the task administered to the
person?
• Was too much judgment required to perform the task?
• Were guidelines for judgment available?
• Did the environment influence the actions of the individual?
• Are there distractions in the workplace?
• Is fatigue a mitigating factor?
• How much experience does the individual have in performing
this task?
Questions to Ask?
MEASUREMENT
• Does the gage have a valid calibration date?
• Was the proper gage used to measure the part, process,
chemical, compound, etc.?
• Was a gage capability study ever performed?
- Do measurements vary significantly from operator to
operator?
- Do operators have a tough time using the prescribed gage?
• Is the gage fixturing adequate?
• Does the gage have proper measurement resolution?
• Did the environment influence the measurements taken?
Questions to Ask?
MATERIAL
• Is a Material Safety Data Sheet (MSDS) readily available?
• Was the material properly tested?
• Was the material substituted?
• Is the supplier’s process defined and controlled?
• Were quality requirements adequate for part function?
• Was the material contaminated?
• Was the material handled properly (stored, dispensed, used
& disposed)?
• Was the correct tool used?
• Is the equipment affected by the environment?
Questions to Ask
MACHINE
• Is the equipment being properly maintained (i.e., daily/weekly/
monthly preventative maintenance schedule)
• Was the machine properly programmed?
• Is the tooling/fixturing adequate for the job?
• Does the machine have an adequate guard?
• Was the tooling used within its capabilities and limitations?
• Are all controls including emergency stop button clearly labeled
and/or color coded or size differentiated?
• Is the machine the right application for the given job?
Questions to Ask
ENVIRONMENT
• Is the process affected by temperature changes over the
course of a day?
• Is the process affected by humidity, vibration, noise, lighting,
etc.?
• Does the process run in a controlled environment?
Questions to Ask
METHODS
• Was the canister, barrel, etc. labeled properly?
• Were the workers trained properly in the procedure?
• Was the testing performed statistically significant?
• Have I tested for true root cause data?
• How many “if necessary” and “approximately” phrases are
found in this process?
• Was this a process generated by an Integrated Product
Development (IPD) Team?
• Was the IPD Team properly represented?
• Did the IPD Team employ Design for Environmental (DFE)
principles?
Questions to Ask
METHODS
• Has a capability study ever been performed for this process?
Questions to Ask
METHODS
• Is the process under Statistical Process Control (SPC)?
• Are the work instructions clearly written?
• Are mistake-proofing devices/techniques employed?
• Are the work instructions complete?
• Is the tooling adequately designed and controlled?
• Is handling/packaging adequately specified?
• Was the process changed?
Questions to Ask
METHODS
• Was the design changed?
• Was a process Failure Modes Effects Analysis (FMEA)
ever performed?
• Was adequate sampling done?
• Are features of the process critical to safety clearly
spelled out to the Operator?
Edward de Bono’s
Six Thinking Hats
An aid to decision making and
problem solving.
The Red Hat
• What do you feel
about the
suggestion?
• What are your gut
reactions?
• What intuitions
do you have?
• Don’t think too
long or too hard.
The White Hat
• The information
seeking hat.
• What are the facts?
• What information is
available? What is
relevant?
• When wearing the
white hat we are
neutral in our thinking.
The Yellow Hat
• The sunshine hat.
• It is positive and
constructive.
• It is about effectiveness
and getting a job done.
• What are the benefits, the
advantages?
The Black Hat
• The caution hat.
• In black hat the thinker
points out errors or pitfalls.
• What are the risks or
dangers involved?
• Identifies difficulties and
problems.
The Green Hat
• This is the creative mode of thinking.
• Green represents growth and
movement.
• In green hat we look to new ideas
and solutions.
• Lateral thinking wears a green hat.
The Blue Hat
• The control hat, organising thinking
itself.
• Sets the focus, calls for the use of
other hats.
• Monitors and reflects on the
thinking processes used.
• Blue is for planning.
Six Thinking Hats
• Red: Intuitive
• White: Informative
• Yellow: Constructive
• Black: Cautious
• Green: Creative
• Blue: Reflective
Questions?
We now want to open the workshop up to
your questions.
Please submit your questions electronically
and we will answer as many of them as we
have time for.
If you have more questions about Technical Training, Lizwe can help!
Contact us anytime on 071 440 5741 or email sales at
sales@lizwe-engineers.co.za
Thank You
