Reliability and Maintainability

Introduction

❖Quality: “conformance to specification”.

❖Failure: “non-conformance to some defined performance criterion”.

❖Reliability: the probability that a device will perform its intended
function for a specified period of time under stated conditions. The
terms used in this definition need some attention.

❖Maintainability: the probability that a failed item will be restored
to operational effectiveness within a given period of time when the
repair actions are performed in accordance with prescribed procedures.

❖ Why do we need to talk of reliability and maintainability?

a. Reliability
❖determines frequency of repair

❖fixes spares requirements

❖determines loss of revenue

❖Affects customer satisfaction

b. Maintainability
❖determines downtime

❖determines manpower requirement

❖affects training
❖test equipment
Reliability Function
❖Consider the probability of an item failing in the interval between
t and t+dt.
❖Given the failure rate λ(t), the probability that the item fails in
the interval t to t+dt, provided it has survived until time t, is
given by the conditional probability

$$P\{E_2 \mid E_1\} = \lambda(t)\,dt$$

✓ where E1 is survival up to time t, with the survival probability
given by the reliability

$$R(t) = P\{E_1\}$$

✓ and E2 is the item failing between time t and t+dt.
❖The unconditional probability of failure in the interval t to t+dt
is f(t)dt,
✓ where f(t) is the failure probability density function.
❖This probability is obtained from the multiplication theorem, which
states that

$$f(t)\,dt = P\{E_1 \text{ and } E_2\}$$

❖Noting that

$$P\{E_1 \text{ and } E_2\} = P\{E_1\}\,P\{E_2 \mid E_1\}$$

we obtain

$$f(t)\,dt = P\{E_1 \text{ and } E_2\} = R(t)\,\lambda(t)\,dt$$

❖Thus, the failure rate is

$$\lambda(t) = \frac{f(t)}{R(t)}$$

❖The probability that an item fails between running times 0 and t is

$$F(t) = \int_0^t f(t)\,dt = 1 - R(t)$$

❖Differentiating this equation,

$$f(t) = -\frac{dR(t)}{dt}$$

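❖Substituting for f(t) in the equation for λ(t),

$$\lambda(t) = -\frac{1}{R(t)}\,\frac{dR(t)}{dt}$$

❖Integrating both sides from 0 to t gives

$$-\int_0^t \lambda(t)\,dt = \int_0^t \frac{dR(t)}{R(t)}$$

❖Since R(0) = 1, the right-hand side equals ln R(t), which gives the
reliability function

$$R(t) = \exp\left[-\int_0^t \lambda(t)\,dt\right]$$
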
❖Assuming a constant failure rate λ,

$$R(t) = \exp(-\lambda t) = e^{-\lambda t}$$
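
As a rough numerical check of the reliability function, the sketch below integrates a failure rate with the trapezoidal rule and compares the result with the closed form for a constant rate. The function name `reliability`, the step count and the value of 0.001 failures per hour are illustrative assumptions, not values taken from the slides.

```python
import math


def reliability(hazard, t, steps=10_000):
    """Approximate R(t) = exp(-integral from 0 to t of hazard(u) du)."""
    if t <= 0:
        return 1.0
    h = t / steps
    # Trapezoidal rule: half weight on the two end points.
    total = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        total += hazard(i * h)
    return math.exp(-total * h)


lam = 0.001  # assumed constant failure rate, failures per hour
r_numeric = reliability(lambda u: lam, 500.0)
r_closed = math.exp(-lam * 500.0)
print(f"numeric R(500) = {r_numeric:.6f}, closed form = {r_closed:.6f}")
```

Any other callable can be passed as `hazard`, for example a time-varying failure rate, in which case only the numerical value is available.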

MTBF
❖Consider N items, of which k have failed and (N−k) survive at time t.
Let (N−k) be Ns(t). Then

$$R(t) = \frac{N_s(t)}{N}$$

❖In each time interval dt, the operating time accumulated is

$$N_s(t)\,dt$$

❖At time t = ∞ the total time accumulated is

$$\int_0^\infty N_s(t)\,dt$$

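❖MTBF is then given by

$$\mathrm{MTBF} = \frac{1}{N}\int_0^\infty N_s(t)\,dt = \int_0^\infty \frac{N_s(t)}{N}\,dt = \int_0^\infty R(t)\,dt$$

❖For constant λ,

$$\mathrm{MTBF} = \int_0^\infty R(t)\,dt = \int_0^\infty e^{-\lambda t}\,dt = \frac{1}{\lambda}$$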
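
The MTBF integral can be checked numerically. The sketch below is a minimal illustration that truncates the infinite upper limit at a large multiple of 1/λ and applies the trapezoidal rule; the function name `mtbf_numeric`, the truncation multiple and the step count are assumptions chosen for the example.

```python
import math


def mtbf_numeric(lam, upper_multiple=50.0, steps=100_000):
    """Approximate the integral of exp(-lam*t) from 0 to infinity.

    The upper limit is truncated at upper_multiple / lam, where the
    integrand is already negligibly small.
    """
    upper = upper_multiple / lam
    h = upper / steps
    total = 0.5 * (1.0 + math.exp(-lam * upper))
    for i in range(1, steps):
        total += math.exp(-lam * i * h)
    return total * h


lam = 0.001  # assumed constant failure rate, failures per hour
print(f"numeric MTBF = {mtbf_numeric(lam):.3f} hr, 1/lambda = {1 / lam:.3f} hr")
```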

Example 8.1
What is the highest failure rate expressed in failures/10^6 hr that a
piece of equipment can have if it is to operate with a probability of
survival of 90% for 5000hr?
Solution:
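The reliability is given by:

$$R(t) = e^{-\lambda t} = e^{-\lambda \times 5000} = 0.9$$

Solving for λ:

$$\lambda = \frac{-\ln 0.9}{5000} \approx 21 \times 10^{-6}\ \text{hr}^{-1} = 21 \text{ failures}/10^6 \text{ hr}$$
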
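
The arithmetic of Example 8.1 can be reproduced in a few lines. This is a sketch only; the helper name `max_failure_rate` is hypothetical and the constant failure rate model is the one assumed in the example.

```python
import math


def max_failure_rate(target_reliability, mission_hours):
    """Largest constant failure rate meeting R(t) = exp(-lam * t)."""
    return -math.log(target_reliability) / mission_hours


lam = max_failure_rate(0.90, 5000.0)
print(f"{lam * 1e6:.1f} failures per 10^6 hr")  # about 21.1
```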
Example 8.2
The MTBF of a piece of equipment is 1000 hr. What is the probability
of survival of this equipment for a) 200 hr of operation?
b) 500 hr of operation? c) 1000 hr of operation?
Solution:
From the MTBF of the equipment, we obtain the failure rate to be:

$$\lambda = 1/\mathrm{MTBF} = 1/1000 = 0.001\ \text{hr}^{-1}$$

a) For 200 hr, $R(t) = e^{-\lambda t} = e^{-0.001 \times 200} = 0.819$
b) For 500 hr, $R(t) = e^{-\lambda t} = e^{-0.001 \times 500} = 0.607$
c) For 1000 hr, $R(t) = e^{-\lambda t} = e^{-0.001 \times 1000} = 0.368$
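
The three survival probabilities of Example 8.2 can be checked with a short script. It is a minimal sketch under the constant failure rate assumption of the example; the variable names are illustrative.

```python
import math

mtbf = 1000.0      # hr, from the example
lam = 1.0 / mtbf   # constant failure rate, per hr

# Survival probability R(t) = exp(-lam * t) for each operating time.
survival = {t: math.exp(-lam * t) for t in (200, 500, 1000)}
for t, r in survival.items():
    print(f"R({t} hr) = {r:.3f}")
```

Note that at t = MTBF the survival probability is only about 37%, not 50%.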
System Reliability with Weibull Failure Probability Distribution
❖ In the Weibull probability distribution of failure, the failure rate
varies with time. The failure rate and reliability function are
given by

$$\lambda(t) = \frac{\beta}{\eta^{\beta}}\,(t - t_0)^{\beta - 1}$$

$$R(t) = \exp\left[-\left(\frac{t - t_0}{\eta}\right)^{\beta}\right]$$

Where:
✓ β = shape parameter (slope)
✓ η = scale parameter (characteristic life)
✓ t0 = location parameter (time before which no failure occurs)
❖For β > 1 the failure rate increases with time, resulting in
decreasing reliability.
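
The Weibull expressions can be evaluated directly. The sketch below assumes t > t0 and uses made-up parameter values (η = 1000 hr, β = 1 and β = 2) purely for illustration; the function names are hypothetical.

```python
import math


def weibull_failure_rate(t, beta, eta, t0=0.0):
    """Failure rate (beta / eta**beta) * (t - t0)**(beta - 1), for t > t0."""
    return (beta / eta ** beta) * (t - t0) ** (beta - 1)


def weibull_reliability(t, beta, eta, t0=0.0):
    """Reliability exp(-((t - t0) / eta)**beta), for t >= t0."""
    return math.exp(-(((t - t0) / eta) ** beta))


eta = 1000.0  # assumed characteristic life, hr
for beta in (1.0, 2.0):
    rates = [weibull_failure_rate(t, beta, eta) for t in (250.0, 500.0, 1000.0)]
    print(f"beta = {beta}: failure rates = {rates}, "
          f"R(eta) = {weibull_reliability(eta, beta, eta):.3f}")
```

With β = 1 the failure rate is constant and the exponential case is recovered; with β = 2 it grows linearly with time. In both cases R(η) = e^(−1), which is why η is called the characteristic life.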
Maintainability
❖ Maintainability is a characteristic of design and installation which
is expressed as the probability that an item will conform to specified
conditions within a given period of time, when maintenance action is
performed in accordance with prescribed procedures and resources.

❖ The objective of a maintainability program should be to influence
equipment design to assure that maintenance of the equipment can be
accomplished efficiently and safely.

❖ Maintainability requirements are usually contractual, and in such
cases it is essential that the test method and the conditions under
which it is to be carried out are carefully defined. Achievement of
specified Mean Time to Repair is rather expensive.
Maintainability Function (Exponentially Distributed)
❖ The time-to-restore probability density function g(t) for the
exponential times-to-restore distribution is

$$g(t) = \mu e^{-\mu t}$$

✓ μ is the equipment repair, replacement or restoration rate.

❑ Given μ, the mean time to correctively repair, replace or restore
the equipment to satisfactory operation is given by

$$\mathrm{MTTR} = \frac{1}{\mu}$$

so that

$$g(t) = \frac{1}{\mathrm{MTTR}}\,e^{-t/\mathrm{MTTR}}$$

❖ The maintainability function for the exponential time-to-restore
distribution is

$$M(t_1) = P(t \le t_1) = \int_0^{t_1} g(t)\,dt = \int_0^{t_1} \mu e^{-\mu t}\,dt = 1 - e^{-\mu t_1}$$

❖ M(t1) is the probability that a repair will be completed
successfully in time t1.
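
A minimal sketch of the exponential maintainability function follows. The function name `maintainability` and the MTTR of 2 hr are assumptions made for the illustration.

```python
import math


def maintainability(t1, mttr):
    """M(t1) = 1 - exp(-mu * t1) with repair rate mu = 1 / MTTR."""
    mu = 1.0 / mttr
    return 1.0 - math.exp(-mu * t1)


mttr = 2.0  # hr, assumed value
for t1 in (1.0, 2.0, 4.0, 8.0):
    print(f"M({t1} hr) = {maintainability(t1, mttr):.3f}")
```

Under this model only about 63% of repairs are completed within one MTTR, so a specified repair-time limit usually has to be set well above the mean.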

Determination of Mean Time To Repair MTTR
❖ In many practical applications, determination of MTTR is not easy.

❖ MTTR is the mean of the distribution of equipment repair time and
can be estimated from

$$\mathrm{MTTR} = \frac{\sum_{i=1}^{n} \lambda_i T_i}{\sum_{i=1}^{n} \lambda_i}$$

Where:
✓ Ti is the time needed to repair the equipment when the ith part
fails
✓ λi is the constant failure rate of the ith repairable part of the
equipment
✓ n is the number of repairable parts in the equipment
Example 8.4.
A piece of equipment is composed of three replaceable subsystems.
Subsystem failure rates and replacement times are given in the table.
Estimate the equipment mean time to repair. Estimate the probability
that the equipment repair will be accomplished in 3/2 hrs.

Subsystem   Failure rate   T (hr)
I           0.005          2
II          0.01           1
III         0.02           3/2

Solution:

$$\mathrm{MTTR} = \frac{\lambda_1 T_1 + \lambda_2 T_2 + \lambda_3 T_3}{\lambda_1 + \lambda_2 + \lambda_3} = \frac{0.005 \times 2 + 0.01 \times 1 + 0.02 \times 3/2}{0.005 + 0.01 + 0.02} = 1.429\ \text{hr}$$

$$M(t) = 1 - e^{-\mu t}, \qquad \mu = \frac{1}{\mathrm{MTTR}}$$

$$M(1.5) = 1 - e^{-1.5/1.429} = 0.650$$
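
The figures of Example 8.4 can be reproduced with the failure-rate-weighted mean. This is a sketch; the data are those of the table, while the names `subsystems` and `mean_time_to_repair` are illustrative.

```python
import math

# (failure rate, replacement time in hr) for subsystems I, II and III
subsystems = [(0.005, 2.0), (0.01, 1.0), (0.02, 1.5)]


def mean_time_to_repair(parts):
    """Failure-rate-weighted mean of the repair times."""
    weighted = sum(lam * t_rep for lam, t_rep in parts)
    total_rate = sum(lam for lam, _ in parts)
    return weighted / total_rate


mttr = mean_time_to_repair(subsystems)
m_15 = 1.0 - math.exp(-1.5 / mttr)  # exponential repair-time model
print(f"MTTR = {mttr:.3f} hr, M(1.5 hr) = {m_15:.3f}")
```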

❖ Performance of equipment depends on the reliability and
availability of:
➢ the equipment used,
➢ operating environment,
➢ maintenance efficiency,
➢ operation process and technical expertise of operators, etc.

Reasons for interest in the concept of Reliability and Maintainability
a. Complexity:
➢ more complex machinery has more intrinsic failures,
➢ failures are more difficult to diagnose, and
➢ failures are less likely to be foreseen by the designer.
b. Mass Production:
➢ requires a very high degree of control over material procurement,
manufacture and assembly, engineering changes, etc.
c. Cost and Tolerances:
➢ A product is designed to meet a production cost objective, which
imposes a severe restriction.
➢ This in turn leads to the calculation of tolerance margins which
satisfy the requirement.

d. Maintenance:
✓ Field diagnosis and repair costs are much greater than those
incurred in the factory.
✓ As a result, reductions in failure rate and repair time justify a
reasonable investment.

These major reasons, taken together, make reliability and
maintainability factors which have to be considered properly during
design, manufacture and operation.
Interdependence of R and M
Reliability and Maintainability are interdependent for three
basic reasons.
I. The design and assurance activities required to achieve R and M
are, in many cases, the same.

II. Maintainability is a parameter that greatly contributes to the
reliability of a system.

III. Both R and M contribute to the overall availability of the system.


❖ Careful consideration of reliability and maintainability factors at
the design stage helps in predicting
➢ the expected life of a plant,
➢ the availability of a plant, and
➢ the expected maintenance work load.

FMEA (Failure Mode and Effect Analysis)
✓ A failure mode:
• is defined as any event which is likely to cause a functional
failure of an equipment.
• It describes the manner in which a failure occurs and the potential
effects or consequences associated with that failure.

✓ Failure Mode and Effect Analysis (FMEA):
• is a systematic methodology used to identify, evaluate, and
prioritize potential failure modes within a system, process, or
product.
• It aims to proactively identify and mitigate risks by analyzing the
potential effects of failures and determining appropriate preventive
or corrective measures.
❑ Failure modes are classified into three groups.

i) Falling capacity: when the initial capability of the equipment
falls below the desired performance, we have falling capacity
of the equipment.
The main causes for reduced capability are:
✓ deterioration due to wear & tear
✓ lubricant failure, dirt, disassembly (falling apart),
✓ human error

ii) Increase in desired performance: when desired
performance rises above initial capability of the
equipment, there is failure of equipment. The reasons
for increase in desired performance are:
➢ sustained, deliberate overloading,

➢ sustained, unintentional overloading,

➢ sudden, unintentional overloading,

➢ Incorrect process materials which are out of specifications.

iii) Initial Incapability: when the equipment is not capable of
doing what it is expected to do from the outset we have initial
incapability and the equipment is unfit for operation.

Classification of failures
By Cause
➢ Production-related failure
➢ Stress-related failure
➢ Misuse failure
➢ Inherent weakness failure
➢ Wear-out failure
➢ Maintenance-induced failure
By Suddenness
➢ Immediate failure
➢ Gradual degradation failure
By Degree
➢ Catastrophic failure
➢ Intermediate failure
➢ Partial failure
By Result
➢ Critical failure
➢ Major failure
➢ Minor failure
By Definition
➢ Applicable to the specification
➢ Not applicable

Failure Effects:
❑ Failure effects refer to the consequences or impacts that occur as
a result of a failure mode in a component, system, or process.

❑ Failure effects describe what happens when a failure mode occurs.
✓ In describing failure effects the following must be noted.
A. Evidence of failure:
❖Is the failure evident to operating crew?
❖Is the failure accompanied by obvious physical effects?
❖Does the equipment/machine stop functioning as a result of the
failure?
B. Safety and environment hazards:
❖Is it possible that someone could get hurt?
❖Are environmental regulations and standards breached?
C. Production effects:
❖ Is process stoppage caused?

❖ How is production affected?

❖ How long is the downtime associated with the failure?

D. Secondary effects:
❖ How is product quality affected?

❖ Is customer service and satisfaction affected?

❖ What is the increase in the operating cost?

❖ What secondary damages are caused?


E. Corrective action:
❖ What must be done to repair the failure?

❖ What resources are required for the repair?

❖ To make a comprehensive failure mode and effects analysis, one
needs information about the modes and effects, which is obtained
from various sources including:
➢ the manufacturer/supplier of the equipment,
➢ other users of the equipment,
➢ the people who operate and maintain the equipment.

Key steps involved in conducting an FMEA
The key steps involved in conducting an FMEA typically include:
➢ Identification of Failure Modes: The first step is to identify all
possible failure modes that could occur within the system or process
being analyzed. A failure mode refers to a specific way in which a
component or process can fail or malfunction.
➢ Determination of Effects: For each identified failure mode, the
potential effects or consequences associated with it are assessed.
This includes evaluating the impact on the system's performance,
safety, reliability, quality, and other relevant factors.
➢ Estimating Severity: The severity of each failure mode is evaluated
to determine the potential impact on the system or process. This
assessment helps prioritize the failures based on their severity
levels, ranging from minor inconveniences to critical safety hazards.
➢ Analysis of Causes: The underlying causes or factors that contribute
to each failure mode are identified. This involves examining the root
causes, such as design flaws, material defects, human errors,
environmental factors, or other sources of failure.
➢ Evaluation of Current Controls: The existing preventive and
mitigating measures already in place to address the identified
failure modes are assessed. This includes reviewing design features,
safety precautions, quality control practices, maintenance
procedures, and any other measures implemented to prevent or minimize
failures.
➢ Determination of Detection Ability: The ability to detect or
identify the occurrence of each failure mode is evaluated. This helps
assess the effectiveness of existing monitoring, inspection, or
testing methods in detecting failures before they lead to significant
consequences.
➢ Calculation of Risk Priority Numbers (RPNs): An RPN is a numerical
value assigned to each failure mode based on the severity, occurrence
probability, and detection ability. The RPN helps prioritize the
failure modes for further action, with higher RPNs indicating higher
risks.
➢ Development of Action Plans: Based on the RPNs, appropriate actions
are determined to mitigate the identified risks. This can involve
implementing design changes, enhancing preventive measures, improving
monitoring and inspection processes, or other actions to reduce the
likelihood and impact of failure modes.
➢ Follow-Up and Monitoring: After implementing the recommended
actions, regular monitoring and review are essential to ensure their
effectiveness and address any new failure modes that may arise.
❖ In general, FMEA is widely used in various industries, including
manufacturing, healthcare, automotive, aerospace, and many others, to
enhance reliability, safety, and quality by proactively addressing
potential failure risks. It is a valuable tool for risk management,
process improvement, and product development.
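
The RPN step can be sketched in code. The slides do not state how the three ratings are combined; a common convention, assumed here, is to rate severity, occurrence and detection on a 1 to 10 scale and multiply them. The failure modes and ratings below are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # 1 (minor) to 10 (critical), assumed scale
    occurrence: int  # 1 (rare) to 10 (frequent), assumed scale
    detection: int   # 1 (easily detected) to 10 (undetectable), assumed scale

    @property
    def rpn(self) -> int:
        # Assumed convention: RPN = severity * occurrence * detection.
        return self.severity * self.occurrence * self.detection


# Hypothetical failure modes and ratings, not taken from the slides.
modes = [
    FailureMode("bearing seizure", severity=8, occurrence=3, detection=4),
    FailureMode("lubricant leak", severity=5, occurrence=6, detection=2),
    FailureMode("sensor drift", severity=4, occurrence=4, detection=7),
]

# Highest RPN first: these modes are addressed first in the action plan.
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for mode in ranked:
    print(f"{mode.name}: RPN = {mode.rpn}")
```

In this made-up data the least severe mode ranks first because it is hard to detect, which is the kind of trade-off the RPN is meant to expose.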

The Whole-life Equipment Failure Profile:
(The Bathtub Curve)
❑ The whole life of equipment may be divided into three major
distinct periods:

a. infant mortality period, or early failure

b. useful life period

c. wear-out period

❑ The failure rate curve, commonly known as the bathtub curve, is the
sum of three separate overlapping failure rate distributions known as
burn-in (early failure), random failure, and wear-out failure.
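
The idea of summing three overlapping failure rates can be illustrated numerically. The sketch below assumes, for illustration only, that each component follows a Weibull failure rate with shape parameter below 1 (early failure), equal to 1 (random failure) and above 1 (wear-out); all parameter values are invented.

```python
def weibull_hazard(t, beta, eta):
    """Weibull failure rate (beta / eta) * (t / eta)**(beta - 1), for t > 0."""
    return (beta / eta) * (t / eta) ** (beta - 1)


def bathtub_hazard(t):
    """Sum of three assumed failure rate components at time t (hr)."""
    early = weibull_hazard(t, beta=0.5, eta=2000.0)     # decreasing rate
    random_ = weibull_hazard(t, beta=1.0, eta=5000.0)   # constant rate
    wearout = weibull_hazard(t, beta=5.0, eta=10000.0)  # increasing rate
    return early + random_ + wearout


for t in (10, 100, 1000, 5000, 10000, 15000):
    print(f"t = {t:>6} hr, failure rate = {bathtub_hazard(t):.6f} per hr")
```

The printed rates fall during the early period, stay low through the middle of life and rise again at large t, which is the bathtub shape.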
[Figure: The Bathtub Curve]

Reasons for burn-in failures

➢ inadequate quality control

➢ inadequate manufacturing methods

➢ substandard materials & workmanship

➢ wrong startup & installation

➢ inadequate processes and human error

➢ inadequate handling methods

Reasons for useful life failures

➢unexplainable causes

➢human error, abuse, natural failures

➢undetectable failures

➢low safety factors

➢higher random stress than expected.

Causes for wear-out failures

➢inadequate maintenance

➢wear due to friction

➢wear due to aging

➢wrong overhaul practices

➢corrosion failure

Effect of PM on equipment failure rate

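[Figure: failure rate versus time. The overall failure curve (bathtub
curve) is shown together with the early failure, random failure and
wear-out failure curves across the burn-in, useful life and wear-out
periods, with t1 and t2 marked on the time axis. An annotation shows
the effect of PM in elongating useful equipment time.]
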
Availability:
a) Steady State Availability (inherent availability)
Availability measures the available up-time of equipment. It is the
probability that an equipment, when used under stated conditions and
ideal support environments, will operate satisfactorily at any given
time.

$$A_{ss} = \frac{\mu}{\mu + \lambda}$$

where Ass = steady state availability
μ = system constant repair rate
λ = system constant failure rate
❖Substituting for repair rate and failure rate (μ = 1/MTTR and
λ = 1/MTBF), steady state availability is

$$A_{ss} = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}$$

❖ In the calculation of Ass, preventive maintenance down time, supply
down time, queuing down time and administrative down time are
excluded.
❖ Ass is useful to designers.

b) Operational Availability:

❖This can be defined as the probability that an equipment, when used
under stated conditions in an actual environment, will operate
satisfactorily at any time.

$$A_o = \frac{\mathrm{MTBM}}{\mathrm{MTBM} + \mathrm{MDT}}$$

where
✓ MTBM = Mean Time Between Maintenance actions

✓ MDT = Mean Down Time; sum of the mean corrective and preventive
maintenance time intervals including supply down time,
administrative down time, etc.
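
The two availability measures can be compared side by side. The sketch below uses invented figures (MTBF 1000 hr, MTTR 2 hr, MTBM 800 hr, MDT 10 hr) chosen only to show the calculation; the function names are hypothetical.

```python
def steady_state_availability(mtbf, mttr):
    """Inherent availability: only corrective repair time is downtime."""
    return mtbf / (mtbf + mttr)


def operational_availability(mtbm, mdt):
    """Operational availability: all maintenance and delay time is downtime."""
    return mtbm / (mtbm + mdt)


a_ss = steady_state_availability(mtbf=1000.0, mttr=2.0)
a_o = operational_availability(mtbm=800.0, mdt=10.0)
print(f"Ass = {a_ss:.4f}, Ao = {a_o:.4f}")
```

With these assumed figures the operational value is lower than the steady state value, because preventive maintenance, supply and administrative delays are counted as down time.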

❖Activities involved to achieve good R and M

a) Design

b) Manufacture

c) Field service Operation

a) Design
Many parameters have to be brought together so as to build
reliability and maintainability into the system.
✓ During the design stage the following have to be considered
adequately.
▪ Reduction in complexity
▪ Use of standard proven methods
▪ Duplication of modules to increase fault tolerance
▪ Derating: the practice of using components of higher stress rating
than the minimum requirement
▪ Prototype testing
▪ Subsequent feedback of information into the design.

b) Manufacture
The following considerations have a direct effect on failure rate
and should be well accounted for during manufacture.
✓ Control of materials, methods etc.
✓ Control of work standards
c) Field service Operation
During operation the following items should be observed carefully.
✓ Following adequate operating and maintenance instructions
✓ Use of preventive maintenance
✓ Feedback of accurate failure information to design and
manufacture
[Figure: Activities involved to achieve good reliability and
maintainability]

❖ An unreliable system may cause many unexpected direct and indirect
losses.
❖ These losses could be due to
➢ accidents,
➢ penalties for not meeting delivery due dates,
➢ defective products,
➢ stoppage time,
➢ workers' health, and
➢ machine breakdown.
• Both human abilities and machine capabilities play an important
role in an integrated human-machine manufacturing system.