# Reliability, FMEA and TPM

© Tapan Bagchi TQM IEM Reliability


Reliability
• Generally defined as the ability of a product to perform as expected over time
• Formally defined as the probability that a product, piece of equipment, or system performs its intended function for a stated period of time under specified operating conditions

Maintainability
• The probability that a system or product can be retained in, or one that has failed can be restored to, operating condition in a specified amount of time.


Types of Failures
• Functional failure – failure that occurs at the start of product life due to manufacturing or material defects
• Reliability failure – failure after some period of use

These relate to the "bathtub curve".

Types of Reliability
• Inherent reliability – predicted by product design
• Achieved reliability – observed during use; based on observed failure data


How do you measure Reliability?
• Failure rate (λ) – number of failures per unit time
• Alternative measures
– Mean time to failure (MTTF)
– Mean time between failures (MTBF)


Failure Rate Curve

“Infant mortality period”


Cumulative Failure Rate Curve


Average Failure Rate = 0.02


Typical Forms of Failure
• Early failure: due to design faults, poor quality components, manufacturing faults, installation errors, operator & maintenance errors
• Useful life: has a low, constant failure rate
• Wear-out failure: parts approach the end of life

[Figure: failure rate vs. time, with regions labelled Early Failure, Useful Life and Wear-out Failure]


Measuring Reliability
• Reliability R(t): The probability of operating to an agreed level of performance
• Unreliability F(t): The probability of failing to operate to an agreed level of performance

$$R(t) + F(t) = 1$$

Reliability Function for Service Life
• Probability density function of failure time is exponential: $f(t) = \lambda e^{-\lambda t}$ for $t > 0$
• Probability of failure in the interval $(0, T)$: $F(T) = 1 - e^{-\lambda T}$
• Failure rate $= \lambda$
• Reliability function: $R(T) = 1 - F(T) = e^{-\lambda T}$
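The exponential service-life model can be checked numerically. The sketch below is only illustrative: the function names are hypothetical, and the failure rate of 0.02 per unit time and the horizon T = 10 are assumed values, not results from the slides.

```python
import math


def exponential_reliability(failure_rate: float, t: float) -> float:
    """R(T) = exp(-lambda * T) for a constant failure rate lambda."""
    return math.exp(-failure_rate * t)


def exponential_unreliability(failure_rate: float, t: float) -> float:
    """F(T) = 1 - exp(-lambda * T), the probability of failure in (0, T)."""
    return 1.0 - exponential_reliability(failure_rate, t)


# Assumed illustrative values (not taken from a real data set).
lam = 0.02   # failures per unit time
T = 10.0     # service period, in the same time unit

R = exponential_reliability(lam, T)
F = exponential_unreliability(lam, T)
print(f"R(T) = {R:.4f}, F(T) = {F:.4f}, R + F = {R + F:.4f}")
```

Because R(T) and F(T) are complementary, their sum is 1 for any choice of λ and T.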

In general, Failure Times fit Weibull Distribution
In probability theory and statistics, the Weibull distribution is a continuous probability distribution with the probability density function

$$f(x; k, \lambda) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}}$$

for x ≥ 0 and f(x; k, λ) = 0 for x < 0, where k > 0 is the shape parameter and λ > 0 is the scale parameter of the distribution. The Weibull distribution is often used in the field of life data analysis due to its flexibility: it can mimic the behavior of other statistical distributions such as the normal and the exponential.
• If the failure rate decreases over time, then k < 1.
• If the failure rate is constant over time, then k = 1.
• If the failure rate increases over time, then k > 1.
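The link between the shape parameter k and the trend of the failure rate can be seen from the Weibull hazard function h(x) = (k/λ)(x/λ)^(k-1), which is the density divided by the survival probability. The sketch below is a minimal illustration; the function name and the parameter values are assumptions chosen for the example.

```python
def weibull_hazard(x: float, k: float, lam: float) -> float:
    """Weibull failure (hazard) rate h(x) = (k / lam) * (x / lam) ** (k - 1), for x > 0."""
    return (k / lam) * (x / lam) ** (k - 1)


# Assumed scale parameter and evaluation times, for illustration only.
lam = 2.0
times = [1.0, 2.0, 4.0]

for k in (0.5, 1.0, 2.0):
    rates = [weibull_hazard(x, k, lam) for x in times]
    print(f"k = {k}: hazard at {times} -> {[round(r, 3) for r in rates]}")
```

With k = 0.5 the printed rates fall over time, with k = 1 they stay constant at 1/λ, and with k = 2 they rise, matching the three cases listed above.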


The Weibull Distribution expressions


Life Testing Data

| Time t | Observed No. of Failures n(t) | Cumulative No. of Failures | Surviving S(t) | f(t) = n(t)/2000 | r(t) = n(t)/avg S | Reliability R(t) = S(t)/2000 | F(t) = 1 - R(t) |
|---|---|---|---|---|---|---|---|
| 0 | 650 | 0 | 2000 | 0.325 | 0.388 | 1.000 | 0.000 |
| 1 | 350 | 650 | 1350 | 0.175 | 0.298 | 0.675 | 0.325 |
| 2 | 210 | 1000 | 1000 | 0.105 | 0.235 | 0.500 | 0.500 |
| 3 | 166 | 1210 | 790 | 0.083 | 0.235 | 0.395 | 0.605 |
| 4 | 131 | 1376 | 624 | 0.066 | 0.235 | 0.312 | 0.688 |
| 5 | 103 | 1507 | 493 | 0.052 | 0.233 | 0.247 | 0.754 |
| 6 | 82 | 1610 | 390 | 0.041 | 0.235 | 0.195 | 0.805 |
| 7 | 65 | 1692 | 308 | 0.033 | 0.236 | 0.154 | 0.846 |
| 8 | 120 | 1757 | 243 | 0.060 | 0.656 | 0.122 | 0.879 |
| 9 | 123 | 1877 | 123 | 0.062 | 2.000 | 0.062 | 0.939 |
| 10 | | 2000 | 0 | | | 0.000 | 1.000 |

r(t) and R(t) Calculations displayed
[Figure: Failure Rate vs. Time. Failure rate r(t) and reliability R(t) plotted against time t = 0 to 10; vertical axis from 0.000 to 2.500]
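The life-testing columns can be recomputed from the observed failure counts alone. The sketch below assumes, as the column headings suggest, that 2000 units start the test and that "avg S" is the mean of the number surviving at the start and at the end of each interval; the variable names are illustrative.

```python
# Observed failures n(t) in each interval starting at t = 0, 1, ..., 9 (from the life-test data).
failures = [650, 350, 210, 166, 131, 103, 82, 65, 120, 123]
N = 2000  # units on test at t = 0

# Survivors at the start of each interval, plus the final count at t = 10.
surviving = [N]
for n in failures:
    surviving.append(surviving[-1] - n)

hazard = []       # r(t) = n(t) / average number surviving over the interval
reliability = []  # R(t) = S(t) / N
for t, n in enumerate(failures):
    avg_s = (surviving[t] + surviving[t + 1]) / 2  # assumed meaning of "avg S"
    hazard.append(n / avg_s)
    reliability.append(surviving[t] / N)

for t, n in enumerate(failures):
    print(f"t={t}: n={n:4d}  S={surviving[t]:4d}  f={n / N:.3f}  "
          f"r={hazard[t]:.3f}  R={reliability[t]:.3f}  F={1 - reliability[t]:.3f}")
```

The roughly constant r(t) of about 0.235 between t = 2 and t = 7 corresponds to the useful-life region, with higher rates at the start and at the end of the test.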


Reliability of Non-Repairable Items
[Timeline figure: T1, Td1, T2, Td2, T3, Td3]

• Mean Time To Fail (MTTF): ratio of total up time to number of failures.
• Mean Failure Rate (λ): inverse of MTTF.
• Mean Down Time (MDT): ratio of total down time to number of failures.


Reliability of Repairable Items
[Timeline figure: total time T divided into Tup1, Td1, Tup2, Td2, Tup3, Td3]

• Total Up Time (Tup): total time minus total down time.
• Mean Time Between Failures (MTBF): ratio of total up time to number of failures.
• Mean Failure Rate (λ): inverse of MTBF.
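These definitions translate directly into a short calculation. The sketch below uses hypothetical up and down times (in hours) for three failure cycles; the numbers are invented for illustration and are not from the slides.

```python
# Hypothetical observation record: three up periods, each ended by a failure and a repair.
up_times = [120.0, 95.0, 145.0]   # Tup1, Tup2, Tup3 (hours, assumed)
down_times = [4.0, 6.0, 2.0]      # Td1, Td2, Td3 (hours, assumed)

number_of_failures = len(down_times)
total_time = sum(up_times) + sum(down_times)
total_up_time = total_time - sum(down_times)          # total time minus total down time

mtbf = total_up_time / number_of_failures             # mean time between failures
mean_failure_rate = 1.0 / mtbf                        # inverse of MTBF
mean_down_time = sum(down_times) / number_of_failures # MDT

print(f"Total up time     = {total_up_time:.1f} h")
print(f"MTBF              = {mtbf:.1f} h")
print(f"Mean failure rate = {mean_failure_rate:.5f} per h")
print(f"Mean down time    = {mean_down_time:.1f} h")
```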

Availability
• Operational availability

$$A_O = \frac{\mathrm{MTBM}}{\mathrm{MTBM} + \mathrm{MDT}}$$

• Inherent availability

$$A_I = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}$$

MTBM = mean time between maintenance
MDT = mean down time
MTBF = mean time between failures
MTTR = mean time to repair
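As a rough illustration of the two availability measures, the sketch below takes operational availability as MTBM / (MTBM + MDT) and inherent availability as MTBF / (MTBF + MTTR). The function names and all input values are assumptions made for the example.

```python
def operational_availability(mtbm: float, mdt: float) -> float:
    """A_O = MTBM / (MTBM + MDT)."""
    return mtbm / (mtbm + mdt)


def inherent_availability(mtbf: float, mttr: float) -> float:
    """A_I = MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)


# Hypothetical figures, all in hours.
a_o = operational_availability(mtbm=100.0, mdt=5.0)
a_i = inherent_availability(mtbf=120.0, mttr=3.0)
print(f"Operational availability = {a_o:.4f}")
print(f"Inherent availability    = {a_i:.4f}")
```

Operational availability counts all maintenance and all down time, so for the same equipment it is normally the lower of the two figures.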


Design for Reliability
• Element selection: elements with well-established failure rate data
• Environment: elements can withstand normal working environment
• Minimum complexity: fewer elements (series systems)
• Redundancy: several identical elements in parallel
• Diversity: a given function is carried out by two parallel systems


Series Systems

[Block diagram: components 1, 2, ..., n connected in series]

$$R_S = R_1 R_2 \cdots R_n$$


Reliability of Series System
Reliability of a series system is the product of individual element reliabilities.

[Block diagram: input I, elements R1, R2, ..., Rn in series, output O]

$$R_{system} = R_1 \times R_2 \times \cdots \times R_n = e^{-\lambda_1 t} \, e^{-\lambda_2 t} \cdots e^{-\lambda_n t} = e^{-(\lambda_1 + \lambda_2 + \cdots + \lambda_n) t}$$

System reliability is lower than the lowest element reliability.
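A small numerical check of the series rule is sketched below. The element reliabilities and failure rates are hypothetical values, and the function names are illustrative only.

```python
import math


def series_reliability(element_reliabilities):
    """Product of the individual element reliabilities."""
    return math.prod(element_reliabilities)


def series_reliability_exponential(failure_rates, t):
    """exp(-(lambda_1 + ... + lambda_n) * t) for elements with constant failure rates."""
    return math.exp(-sum(failure_rates) * t)


# Hypothetical element reliabilities.
elements = [0.99, 0.95, 0.90]
r_series = series_reliability(elements)
print(f"Series system reliability = {r_series:.4f} (lowest element = {min(elements)})")

# Hypothetical constant failure rates (per hour) and mission time (hours).
rates = [0.001, 0.002, 0.0005]
t = 100.0
r_from_sum = series_reliability_exponential(rates, t)
r_from_product = series_reliability([math.exp(-lam * t) for lam in rates])
print(f"From summed rates = {r_from_sum:.6f}, from product = {r_from_product:.6f}")
```

Both routes give the same number, which is why the failure rates of series elements can simply be added.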

Parallel Systems
[Block diagram: components 1, 2, ..., n connected in parallel]

$$R_S = 1 - (1 - R_1)(1 - R_2) \cdots (1 - R_n)$$

Reliability of Parallel System
[Block diagram: input I, elements R1, R2, ..., Rn in parallel, output O]

$$R_{system} = 1 - F_1 F_2 \cdots F_n = 1 - (1 - e^{-\lambda_1 t})(1 - e^{-\lambda_2 t}) \cdots (1 - e^{-\lambda_n t})$$

The unreliability of a parallel system is the product of the individual element unreliabilities. System reliability is greater than the greatest element reliability.
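The parallel rule can be illustrated the same way. The element reliabilities below are hypothetical, and the sketch assumes the elements fail independently, which the product formula requires.

```python
def parallel_reliability(element_reliabilities):
    """1 minus the product of element unreliabilities (independent elements assumed)."""
    system_unreliability = 1.0
    for r in element_reliabilities:
        system_unreliability *= (1.0 - r)
    return 1.0 - system_unreliability


# Hypothetical element reliabilities.
elements = [0.90, 0.80, 0.70]
r_parallel = parallel_reliability(elements)
print(f"Parallel system reliability = {r_parallel:.4f} (best element = {max(elements)})")
```

The system fails only if every element fails, so adding even a modest redundant element raises the result above the best single element.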

Series-Parallel Systems
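[Block diagram: A (RA) and B (RB) in series, followed by two C components (RC each) in parallel, followed by D (RD)]

• Convert to equivalent series system

[Block diagram: A (RA), B (RB), C' and D (RD) in series]

$$R_{C'} = 1 - (1 - R_C)(1 - R_C)$$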
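The reduction to an equivalent series system can be sketched in a few lines: the two parallel C components are first collapsed into a single block C', and the result is multiplied with A, B and D. The reliabilities used here are hypothetical values chosen for the example.

```python
# Hypothetical component reliabilities.
r_a, r_b, r_c, r_d = 0.99, 0.98, 0.90, 0.97

# Step 1: collapse the two parallel C components into one equivalent block C'.
r_c_prime = 1.0 - (1.0 - r_c) * (1.0 - r_c)

# Step 2: treat A, B, C', D as a plain series system.
r_system = r_a * r_b * r_c_prime * r_d

print(f"R_C'     = {r_c_prime:.4f}")
print(f"R_system = {r_system:.4f}")
```

Without the redundant C the same chain would give 0.99 × 0.98 × 0.90 × 0.97, so the duplicate component removes most of the loss contributed by the weakest element.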

Reliability Management
• Define customer performance requirements
• Determine important economic factors and relationship with reliability requirements
• Define the environment and conditions of product use
• Select components, designs, and vendors that meet reliability and cost criteria
• Determine reliability requirements for machines and equipment
• Analyze field reliability for improvement

Configuration Management
1. Establish approved baseline configurations (designs)
2. Maintain control over all changes in the baseline programs (change control)
3. Provide traceability of baselines and changes (configuration accounting)


Design Issues
• Access of parts for repair
• Modular construction and standardization
• Diagnostic repair procedures and expert systems


Maintainability
• Maintainability is the totality of design factors that allows maintenance to be accomplished easily
• Preventive maintenance reduces the risk of failure
• Corrective maintenance is the response to failures

Reliability Engineering
• Standardization
• Redundancy
• Physics of failure
• Reliability testing
• Burn-in
• Failure mode and effects analysis (FMEA)
• Fault tree analysis (FTA)

FTA

http://www.weibull.com/basics/fault-tree/index.htm

Fault Tree Analysis (FTA)
Example:
[Fault tree figure. Top event: Bulb Fails. Other events shown: No electricity, Glass Broken, Filament Broken, Vacuum Leak, Power Plant Fails, Power Line Fails, Connector Corroded, Impurities, Vibrations, Wind Breaks Line, Tree Breaks Line]

Exercise: the bicycle fails when I rush to class. Draw the FTA:

Hint: Draw an FTA diagram for the total system first.


Faults/Pathways Magnified N-fold for a Simple Manufacturing Process!


FMEA

http://www.npd-solutions.com/fmea.html


FMEA
Failure Mode and Effect Analysis


Failure, Likelihood, Impact…
• Most real systems are designed to serve a purpose or deliver some function
• But few systems are perfect; most are liable to failure, and then they fail to deliver their designed functionality
• A car may not start, or its braking system may fail
• The consequence of such failure may be drastic and its occurrence is generally uncertain
• It is possible to plan contingent actions, or modify the design, to reduce (a) the likelihood of a failure, or (b) its impact, or (c) both
• FMEA is an analytical procedure that helps one mitigate the risks by proactively reducing (a) the severity of the adverse situation, or (b) the likelihood (probability) of its occurrence

Steps for doing FMEA
• Identify possible causes (modes) of failure
• Estimate the likelihood of the cause being active
• Determine the potential impact (severity) of the consequent failure
• Calculate RPN (the Risk Priority Number) for this failure mode
• Order the modes in descending order of RPN
• Plan actions to reduce RPN, starting with the mode with the highest RPN, by reducing the likelihood (probability) of this failure mode becoming active, and/or by reducing its potential impact
• Implement the preventive actions

A Simple Example of performing FMEA
• Mission: A family vacation at Goa
• Modes that may cause the mission to fail: Sickness, Wallet lost, Strike, Accident, Travel mix up, Can’t find hotel

[Diagram: the six failure modes all lead to the outcome "Vacation is spoiled"]

Severity × Likelihood = RPN
| Mode | Severity (Impact) | Likelihood | RPN | Possible causes of failure |
|---|---|---|---|---|
| Sickness | 1 | 0.1 | 0.1 | Exposure; infection |
| Wallet lost | 9 | 0.25 | 2.2 | Unsafe acts; one wallet |
| Strike | 3 | 0.1 | 0.3 | Did not see news |
| Accident | 8 | 0.2 | 1.6 | Hazards; unsafe actions |
| Travel mix up | 5 | 0.1 | 0.5 | No agent; unreliable reservation |
| Can’t find hotel | 5 | 0.25 | 0.63 | No reservation; no map; no car |

The "Risk Map" before FMEA

[Figure: risk map with Impact (0 to 10) on the vertical axis and Probability (0 to 1.0) on the horizontal axis, plotting Wallet lost, Accident, Strike, Travel mix up, No hotel and Sickness]

Mitigation actions facilitated by FMEA
| Causes (things that may go wrong or fail) | Mitigation & proactive actions | Severity (Impact) | Likelihood | RPN |
|---|---|---|---|---|
| Exposure; infection | Avoid | 1 | 0.1 | 0.1 |
| Unsafe acts; one wallet | Safe keeping; split \$; use Visa | 3 | 0.1 | 0.3 |
| Did not see news | Check news | 1 | 0.1 | 0.1 |
| Hazards; unsafe actions | Identify and … | 2 | 0.1 | 0.2 |
| No agent; unreliable reservation | | 1 | 0.1 | 0.1 |
| No reservation; no map; no car | | 2 | 0.05 | 0.1 |

The Risk Map—After FMEA
[Figure: risk map after mitigation, with Impact (0 to 10) on the vertical axis and Probability (0.0 to 1.0) on the horizontal axis, plotting Wallet lost, Accident, Strike, Travel, No Hotel and Sickness]

Benefits of doing FMEA
• It enhances system performance by helping one to identify adverse factors that may impact performance
• It makes most of the risks visible, and helps one to quantify their impact and probability of occurrence
• It helps one take proactive steps to prevent problems ahead of the system’s being put into service, e.g. in new product design and launch
• It helps in reduction of waste and costs due to nonperformance caused by failures
• Today FMEA is an indispensable tool in the hands of engineers, product and process designers, and trouble-shooters

High-Level Combinations of Severity and Probability
[Figure: risk matrix with Increasing Probability of Occurrence on one axis and Increasing Severity of Harm/Consequence on the other, divided into High Risk, Medium Risk and Low Risk regions]

FMEA – Why?
• Why FMEAs?
• Definition, Purpose, Types, Benefits
• Team Approach

Introduction

FMEA – Definition
FMEA is a structured group of activities which:

• Identify potential failure modes
• Prioritize actions
• Document the process

FMEA – Purpose
[Figure: Failures plotted against Time, with labels "FMEA", "Crisis" and "(Production start)"]

FMEA – Purpose
FMEAs are intended to:

• Rate severity of failure modes
• Identify actions to reduce occurrence
• Test adequacy of controls

Potential failure Modes
| Failure Mode Type | Example |
|---|---|
| No function | Not operational |
| Partial function | Not all of function operating |

Severity (Weight factor)
What is the severity of each effect identified?

Rating criteria for Severity (Weight factor)

| Effect | Criteria: Severity of effect | Class |
|---|---|---|
| Non-conforming with safety | Safety failure | S |
| Unacceptable risk | Correction is necessary | A |
| Relatively big risk | Correction is recommended | B |
| Minimum risk | Correction is useful | C |
| None | Accepted failure | D |

Potential Cause of Failure
A potential cause of failure is a weakness in the design that has a failure mode as its effect (see next slide).

Manufacturing misbuilds
Due to design Deficiencies


Manufacturing misbuilds
Robust Design done after FMEA


Searching for Causes of failure
Use Fishbone Diagram:

[Fishbone diagram: effect "Text unreadable"; causes shown include "Text in wrong location" and "Ink of poor quality"]

• Level 1: Ink doesn’t stick. WHY?
• Level 2: Surface roughness not OK. WHY?
• Level 3: Design requirement. WHY?

Sentencing Technique: Is it an effect or a cause?
• Failure Mode, could result in: Effect
• Failure Mode, due to: Cause

Sentencing Technique Example
• Could result in: Dissatisfied customer
• Due to: Surface roughness (design requirement)

Product-FMEA – Occurrence

What is the probability that the failure will occur?


Rating criteria of occurrence
| Probability of failure | Possible Failure Rates | Ranking |
|---|---|---|
| Very high | 1 of 3 | 5 |
| | > 1 of 20 | 4 |
| Moderate | > 1 of 400 | 3 |
| | > 1 of 15000 | 2 |
| Low | < 1 of 15000 | 1 |
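The occurrence rating criteria amount to a threshold lookup. The sketch below is one possible reading of them: the treatment of the top band (a rate of 1 in 3 or more scores 5) and of rates exactly on a boundary is an assumption, and the function name is hypothetical.

```python
def occurrence_ranking(failure_rate: float) -> int:
    """Map an estimated failure rate (failures per unit produced) to a 1-5 occurrence ranking."""
    if failure_rate >= 1 / 3:       # "1 of 3" band, assumed to include higher rates
        return 5
    if failure_rate > 1 / 20:
        return 4
    if failure_rate > 1 / 400:
        return 3
    if failure_rate > 1 / 15000:
        return 2
    return 1                        # below 1 of 15000 (boundary case assumed to score 1)


# Hypothetical failure rates for a few failure modes.
for rate in (0.5, 0.1, 0.01, 0.001, 0.00001):
    print(f"failure rate {rate:<8} -> ranking {occurrence_ranking(rate)}")
```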

Product-FMEA – Actions/Solutions

What are the possible actions to:
- eliminate the failure
- reduce effect
- reduce occurrence

Do the Bicycle exercise again—by FMEA


Total Productive Maintenance (TPM)
The Stars of TPM are the Japanese!

What is TPM?

What does maintenance mean anyway…
Maintenance = The act of maintaining
Maintain = To keep in a state of order. To keep in due (rightful, proper, fitting) condition, operation, or force; keep unimpaired.

Definition cont…
Fix and Repair = Maintenance
HOZEN (Maintenance in Japanese) = Maintaining and preserving perfection through Asset Management.

A common sight: Equipment breakdown, waiting for repair

The goal of TPM is to change “Buttonpusher Operators” to Process Owners or change Firefighters to Maintainers.
It cuts downtime and losses and saves \$.

The Five Pillars of TPM
• Autonomous Maintenance
• Maintenance Process Improvement
• Systematic Equipment Improvement
• Training and Skill Development
• Early Equipment Management

These are actualized through cross-functional Team-based improvement activities

TPM Goals

Characteristics of TPM

With TPM, maintenance is no longer the job of only the "Maintenance Staff"!

Like Six Sigma, TPM is best executed by Cross-Functional Project Teams

Typical Cross-Functional TPM Team at work

Results delivered by TPM

TPM programs deliver Real \$

Note carefully that these bring direct savings to the plant

But moving to TPM requires a Paradigm Shift

SEI—Systematic Equipment Improvement

Autonomous Maintenance

What is Autonomous Maintenance?

Seven Steps of Autonomous Maintenance —Operators not only run the machines, they also maintain them
1. Conduct initial cleaning/inspection
2. Eliminate sources of contamination
3. Establish provisional standards
4. Develop general inspection training
5. Conduct general inspections
6. Improve workplace management and control
7. Participate in advanced improvement activities

Poor or neglected maintenance—no TPM thinking by users

What losses could it lead to?

Still not that rare in a factory!

Dr Bagchi saw a similar sight at a Jute Mill 6 months ago

TPM changes the scene: Cross-Functional Teams take over maintenance activities

Maintenance Process Improvement (MPI)
(Planned, Scheduled Maintenance System)

The Different Maintenance Techniques

Maintenance Process Improvement (MPI) Activities

The Case of a TPM Culture at work

Old Chain guard

After TPM MPI action by operators

Six Major Losses due to Inadequate Equipment Maintenance

Motivation for doing TPM: How the 6 Losses reduce Effectiveness
Overall Equipment Effectiveness = Availability × Performance Efficiency × Quality

6 Major Losses, by the factor they reduce:
• Availability: Equipment Failures
• Performance Efficiency: Reduced Speed; Minor Stops and Idling
• Quality: Rejects and Defects; Startup Losses
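Overall Equipment Effectiveness is the product of its three factors, so a shortfall in any one of them pulls the whole figure down. The sketch below uses hypothetical factor values; the function name is illustrative.

```python
def overall_equipment_effectiveness(availability: float,
                                    performance_efficiency: float,
                                    quality: float) -> float:
    """OEE = Availability x Performance Efficiency x Quality (each a fraction between 0 and 1)."""
    return availability * performance_efficiency * quality


# Hypothetical factors for one machine over one shift.
oee = overall_equipment_effectiveness(availability=0.90,
                                      performance_efficiency=0.95,
                                      quality=0.99)
print(f"OEE = {oee:.1%}")
```

Even with every factor at 90% or better, the combined effectiveness in this example is below 85%, which is the motivation for attacking each group of losses.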

Systematic Equipment Improvement
(Improve Equipment Effectiveness)

SEI
A systematic approach to eliminate waste through analysis of the six major losses, utilizing cross-functional teams to continuously investigate, test, and implement improvements with a goal of maximizing equipment effectiveness.
SEI is a DATA DRIVEN PROCESS. The goal is to systematically reduce equipment failures, adjustments and setups, correct speed, and eliminate stops and idling. This engages reliability engineering and knowledge of the machines.

Training and Skill Development

Why does TPM require Training and Skills Development?

TPM Training needs are often obvious…

Just looking…taking a walk-around the workplace

A walk through the offices

"The part must be somewhere in here…"

Well the drip seems fixed… must be busy time.

Clogged motor casing intake—that isn’t good. I know that, but the manager should get it cleaned!

I guess there is enough light … yeah, I pasted the sheet a bit low ☺

"They should know what they are looking at…"

Enter TPM

I like that!

And that too!

God! Don’t they have anything else to do other than shining dials all day? What did you say, "TPM?!"

Zero Breakdown Strategies
• Restore equipment
• Maintain basic equipment conditions
• Improve operator maintenance skills
• Don’t stop at emergency fixes
• Correct design weaknesses
• Study breakdowns relentlessly

… if you care to survive in business.

TPM pushes down the floor and pushes out the wear-out end


TPM Implementation—as the Japanese do it
• Announce top management’s decision to introduce TPM
• Launch educational campaign
• Create organizations to promote TPM
• Establish basic TPM policies and goals
• Formulate master plan for TPM development
• Hold TPM "kickoff"
• Improve equipment effectiveness
• Establish an autonomous maintenance program for operators
• Set up a scheduled maintenance program for the maintenance department
• Conduct training to improve operator and maintenance skills
• Develop initial equipment management program
• Implement TPM fully and aim for higher goals