
Reliability Improvement through Preventive Maintenance and Optimal Spares Stocking Policies

Ali Zuashkiani, PhD


Director of Educational Programs
Certified RCM II Practitioner
Center for Maintenance Optimization and Reliability
Engineering
University of Toronto

© A.K.S. JARDINE
About the instructor
• Ali Zuashkiani, PhD, CRL, CMRP is CEO of PAMCo, a Canadian Consulting Company with
projects across the globe. Ali is a graduate of Harvard Kennedy School of Policy, Said
Business School of Oxford, and business executive programs of WITS Business School
(South Africa), and INCAE business school (Costa Rica) and holds a PhD from the
University of Toronto. He has been Director of Educational Programs at C-MORE for 13
years.

• Ali has more than 20 years of practical experience combined with scientific rigour in
optimizing asset management decisions in more than 200 plants in 30+ countries. His
consulting endeavours include numerous Life Cycle Costing management projects for
utility and gas distribution companies in North America, RCM implementation projects
in power plants, oil and gas companies, and the electricity distribution industry, and
assignments dealing with asset management practices in 85 plants in the Middle East
and South America.

• Ali is the author of Expert Knowledge Based Reliability Models and a frequent global
speaker on a range of topics in asset management. He has been Chair of the
International Physical Asset Management Conference for the last 14 years. Ali was
named by the Asia Society as one of the world’s most dynamic young leaders in 2008
and was recognized by the World Economic Forum as a Young Global Leader of 2013.
Component Replacement
Decisions

Maintenance Optimization

Optimizing Equipment Maintenance and Replacement Decisions

• Component Replacement
• Inspection Procedures
• Capital Equipment Replacement
• Resource Requirements

Maintenance Management System (CMMS/EAM/ERP)

Making Systems more Reliable
through Component Replacement

Class Example

Air filter change:


Cr = $20 (cost of a replacement filter)
Distance driven = 2000 km/month
Gas cost = $0.75/litre

Assuming 15 km/litre when the filter is new, the month 1 operating cost is (2000 / 15) × $0.75 = $100.00. Operating costs for months 1 to 4 are summarized in the table below. When is the best time to replace the air filter?

t (month)   c(t) ($)
1           100
2           105
3           110
4           112
Short–Term Deterministic Replacement

[Figure: boiler and air heater schematic. Labels: cold air, hot air, fuel, boiler, steam, hot flue gases, air heater, soot deposits.]
Short–Term Deterministic Replacement
[Figure: operating cost ($/lb of steam generated) rising with time; tr marks the replacement time.]

Where on the increasing operating cost curve is it economically justifiable to make a replacement (that is, clean the air heater)?
Optimization Problem

[Figure: cost per unit time against tr, showing the operating cost (flue) curve, the replacement cost (cleaning cost) curve, and the total cost curve, with the optimal tr marked.]

tr: interval between replacements.
Example

Air filter change:


Cr = $20 (cost of a replacement filter)
Distance driven = 2000 km/month
Gas cost = $0.75/litre
Assuming 15 km/litre when new, the month 1 cost = $100.00.

t (month)   c(t) ($)   C(t) ($/month)
1           100        (20+100)/1 = 120
2           105        (20+100+105)/2 = 112.5
3           110        (20+100+105+110)/3 = 111.67
4           112        (20+100+105+110+112)/4 = 111.75
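
The C(t) column can be reproduced with a short calculation. The sketch below is an illustration in Python (the function and variable names are not from the slides). It assumes the filter is replaced at the end of month t, so the average cost per month is the replacement cost plus the operating costs to date, divided by t.

```python
def average_cost_per_month(replacement_cost, operating_costs):
    """Return C(t) for t = 1..n, where C(t) = (Cr + c(1) + ... + c(t)) / t."""
    averages = []
    running_total = replacement_cost
    for month, cost in enumerate(operating_costs, start=1):
        running_total += cost
        averages.append(running_total / month)
    return averages


c_of_t = [100, 105, 110, 112]            # operating cost c(t) from the table
C_of_t = average_cost_per_month(20, c_of_t)
best_month = C_of_t.index(min(C_of_t)) + 1

for month, value in enumerate(C_of_t, start=1):
    print(f"t = {month}: C(t) = {value:.2f}")
print("Lowest average cost per month at t =", best_month)
```

The same function can be reused for any deterministic operating cost trend by swapping in a different cost list.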
Example

Therefore replace at the end of month three, since the average cost per month is lowest there:

[Figure: cost ($)/month against t (month): C(1) = 120, C(2) = 112.5, C(3) = 111.67, C(4) = 111.75.]
Class Example
GAS METER PROBLEM
A gas meter costs $120 to replace (a new gas meter costs $50 and its installation costs $70). The company carries out replacement surveys every 5 years.

For the first 10 years the meter works well and measures gas consumption very accurately. On average, from year 10 to year 15 it underestimates gas consumption by $2 per year, which amounts to $10 over that 5-year period.

Between 15 and 20 years of age a gas meter underestimates gas consumption by $4 per year. Between 20 and 25 years of age it underestimates gas consumption by $6 per year, and from age 25 to age 30 by $8 per year. What is the best time to replace a gas meter?

(Note that the replacement age must be a multiple of 5 years, so a meter can be replaced only at ages 5, 10, 15, 20, 25, or 30.)
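
One way to approach the gas meter problem is to compare the average cost per year at each allowed replacement age. The sketch below is a possible working, not the instructor's solution. It assumes the only costs are the $120 replacement and the revenue lost to under-measurement, and that the loss is zero during the first 10 years. All names are illustrative.

```python
REPLACEMENT_COST = 120.0   # $50 meter + $70 installation

# Under-measurement in $/year for each 5-year age band, from the problem statement.
LOSS_PER_YEAR = {
    (0, 5): 0, (5, 10): 0, (10, 15): 2,
    (15, 20): 4, (20, 25): 6, (25, 30): 8,
}


def cost_per_year(replacement_age):
    """Average annual cost if every meter is replaced at `replacement_age`."""
    lost_revenue = sum(
        rate * (end - start)
        for (start, end), rate in LOSS_PER_YEAR.items()
        if end <= replacement_age
    )
    return (REPLACEMENT_COST + lost_revenue) / replacement_age


candidate_ages = [5, 10, 15, 20, 25, 30]
costs = {age: cost_per_year(age) for age in candidate_ages}
best_age = min(costs, key=costs.get)

for age, cost in costs.items():
    print(f"Replace at {age:2d} years: ${cost:.2f} per year")
print("Lowest average annual cost at age", best_age)
```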
For Preventive Replacement

Cost minimization
Check two conditions:
1. Total cost of a failure replacement is
greater than total cost of a preventive
replacement
2. Wear-out effect occurring

For Preventive Replacement

Availability maximization/downtime
minimization
Check two conditions:
1. Total outage (downtime) of a failure
replacement is greater than total
outage (downtime) of a preventive
replacement
2. Wear-out effect occurring
RCM Methodology Logic

What is Age?

Time-Based Discard
• Operating hours
• Calendar time
• Cycles
– Operating
– Launch

Reliability Centered Maintenance View
Items are replaced at specified frequencies regardless of their
condition at the time

Replace at fixed interval regardless of condition

Scheduled Discard Tasks

These tasks resolve age related failures


Adopted from RCM II book by John Moubray
Preventive Maintenance
If a group of similar components is subjected to similar stresses over a period of time, they can be expected to reach a failed state at about the same age.

[Figure: asset condition against age, from initial state to end state, with the number of failures plotted against age.]
Adopted from RCM II book by John Moubray
Preventive Maintenance
If we know the minimum age at which any one of a group of items is likely to reach a failed state, we can prevent most of the failures by restoring or discarding the items before they reach this age (usually called the life or useful life).

[Figure: asset condition against age, from initial state to end state, with the number of failures plotted against age.]
Adopted from RCM II book by John Moubray
Conditional Probability of Failure

A failure frequency curve that looks like this

[Figure: number of failures against age.]

corresponds to a conditional probability of failure curve that looks like this

[Figure: conditional probability of failure against age.]
Adopted from RCM II book by John Moubray
Scheduled Discard Tasks

A scheduled discard task is technically feasible if there is an age at which there is a rapid increase in the conditional probability of failure.

[Figure: conditional probability of failure against age; most items will survive to the point where the curve rises rapidly.]
Adopted from RCM II book by John Moubray
Scheduled Discard Tasks
A scheduled discard task is technically feasible if there is an age at which there is a rapid increase in the conditional probability of failure
▪ If most of the items survive to this age
(unless the failure has safety or environmental consequences, in which case all* the items must survive to this age)

* By "all" we mean that the conditional probability of failure must be below the maximum tolerable level.

[Figure: conditional probability of failure against age.]
Adopted from RCM II book by John Moubray
Factors that Influence Tolerance
Regarding Safety
In any one year, what probability do you tolerate of being killed by
any event in each situation?

• I have no control and no choice about exposure (off-site exposure to industrial accident): ?
• I believe I have no control but I have some choice about exposure (in a passenger plane): ?
• I believe I have some control and have some choice about exposure (at work): ?
• I believe I am in control and I have complete choice (in my car or home workshop): ?

Probability scale: 10^-8, 10^-7, 10^-6, 10^-5, 10^-4, 10^-3
Adopted from RCM II book by John Moubray
Factors that Influence Tolerance
Regarding Safety
In any one year, what probability do you tolerate of being killed by
any event in each situation?

• I have no control and no choice about exposure (off-site exposure to industrial accident): ?
• I believe I have no control but I have some choice about exposure (in a passenger plane): ?
• I believe I have some control and have some choice about exposure (at work): ?
• I believe I am in control and I have complete choice (in my car or home workshop): ?

[Figure: the same probability scale (10^-8 to 10^-3), with tolerable and intolerable regions marked for each situation.]
Adopted from RCM II book by John Moubray
Frequency: Scheduled Discard Tasks
The frequency of scheduled restoration and scheduled discard
tasks is governed by the "life" of the item (in other words, the age
at which there is a rapid increase in the conditional probability of
failure)...
[Figure: conditional probability of failure against age.]
Adopted from RCM II book by John Moubray
Probability of Dying at Work During One Year

• North Sea Fishermen: 1/1000

• Steel Industry: 1/10,000

• Light Industry: 1/100,000

• Office Workers: 1/1,000,000

Conditional Probability of Failure:
Reality

Preventive Replacement Cost Conflicts

[Figure: cost ($/week) against preventive replacement time tp, showing the failure replacement cost per week, the preventive replacement cost per week, and the total cost per week, C(tp), with the optimal value of tp marked.]
Constant Interval Replacement Policy

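[Figure: timeline starting from a new item. Preventive replacements (Cp) occur at fixed intervals of length tp; failure replacements (Cf) occur between them as required.]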
Construction of Model

Cp: total cost of a preventive replacement [labour, parts, outage cost, etc.]

Cf: total cost of a failure replacement.

f(t): p.d.f. of failure times.

C(tp): total cost per unit time for preventive replacement at intervals of length tp, plus failure replacements as required.
Failure causes downtime -
But what is downtime?
[Figure: production rate against time, with events T1 to T9 marked on the time axis. Event labels: first sign of performance degradation; call to Maintenance; Maintenance arrives; diagnose; get parts, start repair; complete repair, start test; complete test, handover to production; equipment available; full production rate.]
What is downtime? Who’s asking???

[Figure: the same production rate timeline (T1 to T9), with downtime as seen by the Executive, Production, and Maintenance marked.]
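We have:

[Figure: one cycle of length tp, with failure replacements (Cf) during the cycle and a preventive replacement (Cp) at its end.]

$$C(t_p) = \frac{\text{Expected cost per cycle}}{\text{Cycle length}} = \frac{C_p + H(t_p)\,C_f}{t_p}$$

where H(tp) = expected number of renewals in the interval (0, tp).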
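
To make the role of H(tp) concrete, the sketch below approximates it numerically and evaluates C(tp) on a grid of candidate intervals. It is an illustration only: the Weibull failure distribution, the discretisation (each failure is treated as occurring at the end of a small time slice), and every numeric input are assumptions, not values from the slides.

```python
import math


def weibull_cdf(t, beta, eta):
    """F(t) for a two-parameter Weibull distribution (assumed failure model)."""
    return 1.0 - math.exp(-((t / eta) ** beta))


def renewal_function(tp, beta, eta, steps=400):
    """Approximate H(tp), the expected number of failures in (0, tp)."""
    dt = tp / steps
    # p[i] = probability that a new item fails in slice i + 1
    p = [weibull_cdf((i + 1) * dt, beta, eta) - weibull_cdf(i * dt, beta, eta)
         for i in range(steps)]
    H = [0.0] * (steps + 1)
    for n in range(1, steps + 1):
        # First failure in slice i, then on average H[n - i] further failures.
        H[n] = sum(p[i - 1] * (1.0 + H[n - i]) for i in range(1, n + 1))
    return H[steps]


def interval_cost_rate(tp, cp, cf, beta, eta):
    """C(tp) = (Cp + Cf * H(tp)) / tp."""
    return (cp + cf * renewal_function(tp, beta, eta)) / tp


# Hypothetical inputs, chosen only to exercise the formula.
CP, CF, BETA, ETA = 100.0, 500.0, 2.5, 20.0
candidates = [2.0 * k for k in range(1, 16)]   # tp = 2, 4, ..., 30 time units
best_tp = min(candidates,
              key=lambda tp: interval_cost_rate(tp, CP, CF, BETA, ETA))
print("Lowest-cost interval on the grid:", best_tp)
```

A grid search is used rather than a closed-form optimum because H(tp) generally has no simple expression outside the exponential case.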

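Age-based Replacement Policy

[Figure: timeline starting from a new item. A preventive replacement (Cp) is made when an item reaches age tp; a failure replacement (Cf) is made when an item fails before reaching age tp.]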
Optimal Preventive Replacement Age
of Equipment Subject to Breakdown

[Figure: time axis from 0 showing failure replacements and preventive replacements; each preventive replacement occurs when the item in service reaches age tp.]
Determination of Optimal Preventive
Replacement Age
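Construction of model:

Cp, Cf and f(t) as before.

There are two possible cycles:
• Good cycle: the item survives to age tp and is replaced preventively at cost Cp (cycle length tp).
• Failed cycle: the item fails before age tp and is replaced at cost Cf (mean cycle length M(tp)).

M(tp) is the mean time to failure of the items that fail before age tp.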
Determination of Optimal Preventive
Replacement Age
Construction of model:
C(tp) = total cost per unit time when preventive replacements occur at age tp.

$$C(t_p) = \frac{\text{Expected cost per cycle}}{\text{Expected cycle length}} = \frac{C_p\,R(t_p) + C_f\,[1 - R(t_p)]}{t_p\,R(t_p) + M(t_p)\,[1 - R(t_p)]}$$
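
The age-based cost formula can be evaluated numerically as sketched below. The sketch relies on the identity (integration by parts) that the expected cycle length tp·R(tp) + M(tp)·[1 − R(tp)] equals the integral of R(t) from 0 to tp, which avoids computing M(tp) separately. The Weibull model and all numeric inputs are assumptions made for illustration.

```python
import math


def reliability(t, beta, eta):
    """R(t) for a two-parameter Weibull distribution (assumed failure model)."""
    return math.exp(-((t / eta) ** beta))


def age_cost_rate(tp, cp, cf, beta, eta, steps=2000):
    """C(tp) = [Cp*R + Cf*(1 - R)] / [tp*R + M(tp)*(1 - R)], with R = R(tp)."""
    dt = tp / steps
    # Trapezoidal estimate of the integral of R(t) over (0, tp),
    # which equals the expected cycle length.
    expected_cycle_length = sum(
        0.5 * (reliability(i * dt, beta, eta)
               + reliability((i + 1) * dt, beta, eta)) * dt
        for i in range(steps)
    )
    r = reliability(tp, beta, eta)
    expected_cost = cp * r + cf * (1.0 - r)
    return expected_cost / expected_cycle_length


# Hypothetical inputs, chosen only to exercise the formula.
CP, CF, BETA, ETA = 100.0, 500.0, 2.5, 20.0
candidates = [float(tp) for tp in range(1, 41)]
best_tp = min(candidates, key=lambda tp: age_cost_rate(tp, CP, CF, BETA, ETA))
print("Lowest-cost replacement age on the grid:", best_tp)
```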
A Primer on Statistics

Weibull Distribution

$$f(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1} \exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\right]$$

β: shape parameter
η: characteristic life

[Figure: f(t) against t for β = 1/2 (hyperexponential), β = 1 (exponential), β = 2 (Rayleigh), and β = 3.5 (approximately Normal).]
Hazard Rate [h(t)]

For Weibull distribution:

h(t)
β>1

β =1

β <1
time
Summary
[Figure: f(t), F(t), and R(t) against time.]

F(t) + R(t) = 1.0
Hazard rate [h(t)]

This is a conditional probability, with h(t)δt being the probability that an item fails during the interval δt, given that it has survived to time t.

$$h(t) = \frac{f(t)}{1 - F(t)} = \frac{f(t)}{R(t)}$$
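
The relationships between f(t), F(t), R(t), and h(t) can be checked numerically. The sketch below assumes a two-parameter Weibull distribution with minimum life zero and t > 0; the function names are illustrative.

```python
import math


def weibull_pdf(t, beta, eta):
    """f(t) = (beta/eta) * (t/eta)**(beta - 1) * exp(-(t/eta)**beta)."""
    return (beta / eta) * (t / eta) ** (beta - 1.0) * math.exp(-((t / eta) ** beta))


def weibull_reliability(t, beta, eta):
    """R(t) = exp(-(t/eta)**beta)."""
    return math.exp(-((t / eta) ** beta))


def weibull_cdf(t, beta, eta):
    """F(t) = 1 - R(t)."""
    return 1.0 - weibull_reliability(t, beta, eta)


def hazard_rate(t, beta, eta):
    """h(t) = f(t) / R(t)."""
    return weibull_pdf(t, beta, eta) / weibull_reliability(t, beta, eta)


# The hazard falls with age for beta < 1, is constant for beta = 1,
# and rises with age for beta > 1.
for beta in (0.5, 1.0, 2.0):
    early, late = hazard_rate(1.0, beta, 10.0), hazard_rate(5.0, beta, 10.0)
    print(f"beta = {beta}: h(1) = {early:.4f}, h(5) = {late:.4f}")
```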

System Hazard Function
Equipment Life Periods

[Figure: hazard function against time across three life periods: infant mortality, useful life, and wearout. The overall life characteristic curve is shown together with its components: quality failures, stress-related failures, and wearout failures.]
Note:

An age-based policy is cheaper than the constant-interval policy, but it has disadvantages:

- it is more "difficult" to implement;
- a record of component age must be kept;
- replacements need to be rescheduled.
Sugar Refinery Centrifuge Case

[Figure: sugar refinery centrifuge; wet sugar in, dry sugar out.]

6 months of data, 36 problems analyzed, top 5.
Failure frequency: cloth interval

Class interval (weeks)   Frequency   Cumulative relative frequency (%)
0 < 2                    24          10.5
2 < 4                    36          26.2
4 < 6                    27          38.0
6 < 8                    23          48.0
8 < 10                   15          54.6
10 < 12                  9           58.5
12 < 14                  12          63.8
14 < 16                  11          68.6
16 < 18                  13          74.2
18 < 20                  4           76.0
20 < 22                  12          81.2
22 < 24                  5           83.4
24 < 26                  4           85.2
26 < 28                  4           86.9
28 < 30                  1           87.3
30 < 32                  4           89.1
32 < 34                  4           90.8
34 < 36                  5           93.1
36 < 38                  2           93.9
38 < 40                  2           94.8
40 < 42                  2           95.6
42 < 44                  2           96.5
44 < 46                  2           97.4
50 < 52                  4           99.1
56 < 58                  1           99.6
76 < 78                  1           100.0
TOTAL:                   229
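
The cumulative column can be rebuilt from the class frequencies, which is a useful check before plotting the data on Weibull paper. The sketch below also estimates the mean interval from class midpoints; that midpoint estimate is an assumption of this illustration rather than a step shown on the slides, and recomputed percentages may differ from the slide in the last digit because of rounding.

```python
# (lower bound of class in weeks, frequency); every class is 2 weeks wide.
cloth_data = [
    (0, 24), (2, 36), (4, 27), (6, 23), (8, 15), (10, 9), (12, 12),
    (14, 11), (16, 13), (18, 4), (20, 12), (22, 5), (24, 4), (26, 4),
    (28, 1), (30, 4), (32, 4), (34, 5), (36, 2), (38, 2), (40, 2),
    (42, 2), (44, 2), (50, 4), (56, 1), (76, 1),
]

total = sum(freq for _, freq in cloth_data)

cumulative_percent = []
running = 0
for lower, freq in cloth_data:
    running += freq
    cumulative_percent.append(round(100.0 * running / total, 1))

# Grouped-data mean: each observation is placed at its class midpoint.
mean_interval = sum((lower + 1) * freq for lower, freq in cloth_data) / total

print("Sample size:", total)
print("Cumulative relative frequency (%):", cumulative_percent)
print(f"Estimated mean interval: {mean_interval:.1f} weeks")
```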

Weibull Analysis

2-Cycle Weibull Paper

Estimation of Parameters

Minimum Life, γ = 0
Shape Factor, β = 2.4
Characteristic Life, η = 830
Mean Life, μ = 720
Median Life
Bq Life
Characteristic Life

[Figure: f(t) against time for β = 2.4; 63.2% of failures occur before the characteristic life, η = 830.]
Mean Life

[Figure: f(t) against time for β = 2.4; 52.7% of failures occur before the mean life, μ = 720.]
Median Life

[Figure: f(t) against time for β = 2.4; 50% of failures occur before the median life.]

B50 life = 700
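
The characteristic, mean, and median lives can also be computed directly from the Weibull parameters. The sketch below assumes a two-parameter Weibull distribution (minimum life zero) with β = 2.4 and η = 830; values read from Weibull paper are graphical estimates, so the computed mean and median differ from them by a few percent.

```python
import math

BETA, ETA = 2.4, 830.0


def weibull_cdf(t):
    """F(t) = 1 - exp(-(t/eta)**beta)."""
    return 1.0 - math.exp(-((t / ETA) ** BETA))


def b_life(q):
    """Age by which a fraction q of the items is expected to have failed."""
    return ETA * (-math.log(1.0 - q)) ** (1.0 / BETA)


mean_life = ETA * math.gamma(1.0 + 1.0 / BETA)   # mu = eta * Gamma(1 + 1/beta)
median_life = b_life(0.5)                         # B50 life

print(f"F(eta)      = {weibull_cdf(ETA):.3f}")
print(f"Mean life   = {mean_life:.0f}, F(mean) = {weibull_cdf(mean_life):.3f}")
print(f"Median life = {median_life:.0f}")
print(f"B10 life    = {b_life(0.10):.0f}")
```

The fraction failed by the characteristic life is 1 − 1/e ≈ 63.2% whatever the value of β, which is why η can be read off Weibull paper at the 63.2% line.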

Maintenance Management

Parameter Estimation for Cloth Replacement

[Figure: Weibull plot of the cloth replacement data (sample size 229), with the estimation point and the perpendicular used to read off the parameters. Estimates: β̂ = 1, η̂ = 13, and hence μ̂ = η̂ = 13.]
Bearing Replacement

Historical data (weeks): 12, 25, 9, 13, 19

[Figure: timeline of the five historical bearing lives, ending today.]

Shortest time: 9 weeks
Longest time: 25 weeks

Then: establish the risk of the bearing failing as it ages.
Failure Distribution

[Figure: bearing failure distribution, f(t) (0 to 0.06) against time (1 to 35).]
The Best Time

• Risk curve
• Economics (Cf & Cp)
• Blend to establish the optimal tp
PUMP FAILURE DATA
Columns: running time to failure (months); suspension or censored time (months)

3
6 6
9

MEAN LIFE = ? MONTHS
PUMP FAILURE DATA
[Figure: 10 items (numbered 1 to 10) on test over a testing time of 10 weeks, with failures marked: 4 failures (F) + 6 suspensions.]
Source: AHC Tsang
PUMP FAILURE DATA

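[Figure: timeline from new showing nine successive replacements: P.R., P.R., P.R., F.R., F.R., P.R., P.R., F.R., P.R. Each preventive replacement (P.R.) gives a suspension (S) and each failure replacement (F.R.) gives a failure (F): S, S, S, F, F, S, S, F, S.]

3F + 6S
F: failure; S: suspension (or censored observation)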
Life-time distribution of water pumps
[Figure: life-time distribution against miles × 10^3 (0 to 250), with the mean time to failure marked.]

MEAN = 85,000 MILES
T-33 Silver Star Aircraft

T-33 Silver Star Aircraft
The T-33 aircraft engine is supplied with fuel by two fuel pumps (upper and lower). The fuel system is designed so that either pump can provide the fuel pressure and quantity needed to operate the engine satisfactorily. That is, the system is redundant and the failure of a pump is not a catastrophic event.

The decision to be arrived at is: should the pump be removed after "x" hours and overhauled and relifed, or should we repair/overhaul it after failure only?

Failure Data
Collected over a 2-year period. Censored items represent a "snapshot" of all pumps still operating successfully on one specific day.

Interval (hours)   Failures          Censored items
                   Upper   Lower     Upper   Lower
0 – 200            1       2         7       5
200 – 400          5       1         6       5
400 – 600          10      1         5       1
600 – 800          4       1         4       10
800 – 1000         1       1         6       3
1000 – 1200        6       1         9       3
1200 – 1400        2       1         10      6
1400 – 1600        2       1         0       4
1600 – 1800        4       2         0       4
Fuel Pump Failures
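[Figure: Weibull plot of the fuel pump failure data (sample size 82), plotted at the endpoints of the intervals, with the estimation point and the perpendicular used to read off the parameters. Estimates: β̂ = 2.25, μ̂ = 1170 hours, η̂ = 1320 hours.]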
CATERPILLAR D10N Track-Type Tractor

Steering Clutch, L.H.
(from a group of 6 CAT D10 Dozers)

[Figure: timeline for dozer MG707 from new to today: failure replacements after 7979 h and 2027 h, followed by 9671 h of operation to date.]

Failure intervals (F): 7979 h, 2027 h
Suspension interval (S): 9671 h
Assume the clutch is re-built to "as new" condition (assumption can be checked).

Similar data were obtained for the 5 other dozers: F = 7, S = 6, sample size = 13.

Statistical Analysis of Failure Data
From Weibull analysis: MTTF = 6500 h, β = 1.79
Cost Data
Cp = $5640
  Labour: 16 h × $40/h                       = $  640
  Parts                                      = $ 2600
  Vehicle off the road (VOR): 8 h × $300/h   = $ 2400
  Total                                      = $ 5640

Cf = $7160
  Labour: 24 h × $40/h                       = $  960
  Parts                                      = $ 2600
  VOR: 12 h × $300/h                         = $ 3600
  Total                                      = $ 7160

Cheapest policy: Replace only on Failure (R-o-o-F) @ $1.10/h
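
The $1.10/h figure is consistent with Cf divided by the MTTF. The sketch below reproduces it and compares it with the age-based cost rate at a few preventive replacement ages. It is a check under stated assumptions, not the analysis used in the case: it assumes a two-parameter Weibull distribution with β = 1.79 and derives η from the MTTF of 6500 h, and the grid of ages is arbitrary.

```python
import math

CP = 16 * 40 + 2600 + 8 * 300     # preventive: labour + parts + VOR
CF = 24 * 40 + 2600 + 12 * 300    # failure:    labour + parts + VOR
MTTF, BETA = 6500.0, 1.79
# Assumption: minimum life is zero, so MTTF = eta * Gamma(1 + 1/beta).
ETA = MTTF / math.gamma(1.0 + 1.0 / BETA)


def reliability(t):
    return math.exp(-((t / ETA) ** BETA))


def age_policy_cost_rate(tp, steps=2000):
    """[Cp*R(tp) + Cf*(1 - R(tp))] / (integral of R(t) from 0 to tp)."""
    dt = tp / steps
    expected_cycle_length = sum(
        0.5 * (reliability(i * dt) + reliability((i + 1) * dt)) * dt
        for i in range(steps)
    )
    r = reliability(tp)
    return (CP * r + CF * (1.0 - r)) / expected_cycle_length


run_to_failure_rate = CF / MTTF
print(f"Replace only on failure: ${run_to_failure_rate:.2f}/h")
for tp in (2000, 4000, 6000, 8000, 10000, 12000):
    print(f"Preventive replacement at {tp:5d} h: ${age_policy_cost_rate(tp):.2f}/h")
```

Because Cf is only about 1.27 times Cp, the saving from avoiding a failure is too small to pay for the useful life discarded by early replacement at any of these ages.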
Remarks: L.H. Steering Clutch

A run-to-failure policy was a surprising conclusion since the clutch was exhibiting wearout characteristics. However, the economic considerations did not justify preventive replacement according to a fixed-time maintenance policy.
CP Rail

A SUCCESS STORY IN THE ANALYSIS OF
FAILURE DATA: CP RAIL

COMPONENT       POWER ASSEMBLY         TRACTION MOTOR      TURBOCHARGER
ALTERNATIVES    • Replace with new     Major overhaul      Send to GM for rebuild
                • Wash & wear
                • Overhaul
FORMER POLICY   Wash & wear at 5 yrs   Overhaul at 5 yrs   Rebuild at 2 yrs
NEW POLICY      Overhaul at 4 yrs      Run to failure      Run to failure
SAVING          $410,000/yr            > $400,000/yr       > $500,000/yr

GRAND BENEFIT: > $1,310,000/YR
Optimum Preventive Replacement of Cat
3524B Engines
ABC currently operates a fleet of 13 CAT 797A haul trucks at its oil sands facility at Fort McMurray, Alberta. All of these machines are powered by CAT 3524 diesel engines, which have experienced failures on different systems. The current part contract expects 18,000 GOH for engine change-out. The purpose of this analysis is to assess the optimal engine replacement time, based on ages and costs, using the available data.

Optimum Policy
Optimum policy is to replace the engines every 7000 hours. Expected savings for the whole fleet of 13 trucks compared to the OEM recommended replacement time of 18,000 hours: $40,000,000 per year.
Spare Parts Provisioning: Slow-moving Spares
Repairable Spares

System

Stock

OUT OF STOCK

Repair Shop
Criteria for Decision
Making

1. Instant reliability (service level)
2. Interval reliability
3. Cost minimization
4. (Process) Availability
Conveyor Systems: Electric Motors (Table 2.12, Page 80)

Scenario
  Number of motors: 62
  Planning horizon: 1825 days (5 years)
Reliability and maintainability
  MTBRemovals: 3000 days (8 years)
  MTTRepair: 80 days
Cost
  Cost of spare motor: $15,000
  Value of unused spare: $10,000
  Cost of emergency spare: $75,000
  Downtime cost: $1000/day
  Holding cost: $4.11/day

QUESTION: HOW MANY SPARE PARTS TO STOCK?

Results: Repairable Motors

• Instant reliability: 95% reliability requires 4 spares


• Interval reliability: 95% reliability requires 7 spares
• Cost minimization: requires 6 spares.
• Availability of 99%: requires 2 spares.

Optimum Number of Capital
Spares-TFT Pumps
Number of Pumps 6
3 Barges on Sand Dump 8
MTBremovals (days) 456.25
MTTRepair (days) 14
Cost of Spare Pump $110,000.00
Down time cost ($/day) $125,000.00
Holding Cost ($/day) $33.15

Optimum Number of Capital
Spares-TFT Pumps

Total Annual Costs

No. of spares   Annual cost    Reliability
0               $8,154,100     97.02%
1               $724,525       99.74%
2               $66,832        99.98%
3               $38,216        100.00%   (most economical no. of spare pumps)
4               $48,399        100.00%
5               $60,517        100.00%

[Figure: total annual cost ($0 to $800,000) against no. of spare pumps (1 to 6).]
Thank You
Email me at:
ali.zuashkiani@utoronto.ca

