RC Sharma
Consultant (O&M)
Louis Berger Consulting Pvt. Ltd.
Hyderabad
Things Do Fail:
Poor Design
Faulty Design
Inadequate Safety Margin
Poor Construction
Poor Workmanship
Poor Quality of Material Used
Randomness, a Characteristic of Electronic / Electrical Components
RAMS:
1. RAMS will provide Indicators as to how Sturdy and Reliable a
System Design can potentially be.
2. It can help to identify which Parts of a System are likely to
have the major impacts on System Level Failure, and also
which Failure Modes to expect and which Risks they pose to
the Users.
3. RAMS can assist in the Planning of Cost-effective Maintenance
and Replacement Operations.
4. RAMS can provide Indicators for avoiding Hazards / Accidents.
Risk Assessment helps to improve Safety Levels. RAMS Analysis
is increasingly used in the Assessment of Safety Integrity
Levels (SIL).
5. RAMS supports Assessment of how well a Design Enhancement,
such as Implementation of a new Part or Redundancy, will work
out in a Real Life Situation.
[Figure: Design Process flowchart. Steps: Specify Reliability Goals; Allocate Reliability to Components; Implement Design Methods; Failure Analysis (FMEA/FMECA); System Effectiveness & Life-cycle Costs; decision "Goals Achieved?" (Yes / No); System Safety Analysis (FTA); decision "Safety Goals Achieved?" (Yes / No); Ready for Production.]
FTA: Fault Tree Analysis.
FMEA: Failure Mode and Effects Analysis.
FMECA: Failure Mode, Effects and Criticality Analysis.
[Figure: Bathtub curve. Failure Rate plotted against Time, with three regions: Burn-in Period (Early Failures), Useful Life (Random Failures) and Wear-out (Wear-out Failures).]
Reliability:
1. Continuous Working & NOT Failing over a given Period of Time.
2. A Reliable System will last for a long Time.
3. Preventive Maintenance (PM) brings up the Reliability.
4. Important Parameters: MTTF & MTBF.
5. Shall depend upon Design.
[Figure: Cost plotted against Reliability. Curves: Acquisition Costs, Cost of Failures and Total Cost.]
Availability:
Ability of a certain Entity to be in the State of providing a Certain Function
under Certain Conditions, at a given Time Instant. It can be measured by
the Probability of an entity E not being failed at a time instant t.
$A(t) = \Pr[\text{Entity } E \text{ not failed at time } t]$
It can be expressed as the Ratio of UP Time over Total Working Time:
$A_{INH} = \dfrac{MTBF}{MTBF + MTTR}$
$A_{ACH} = \dfrac{MTBM}{MTBM + MDT}$
$A_{OP} = \dfrac{MTBF}{MTBF + MTR}$
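As a rough numerical illustration of the MTBF / (MTBF + MTTR) ratio above, the sketch below computes availability from a mean up time and a mean down time. The function name and the figures used (MTBF = 1000 h, MTTR = 10 h) are assumptions chosen for illustration; they do not come from the presentation.

```python
def availability(mean_up_time: float, mean_down_time: float) -> float:
    """Ratio of up time to total time, e.g. MTBF / (MTBF + MTTR)."""
    if mean_up_time < 0 or mean_down_time < 0:
        raise ValueError("times must be non-negative")
    total = mean_up_time + mean_down_time
    if total == 0:
        raise ValueError("total time must be positive")
    return mean_up_time / total


# Hypothetical figures: MTBF = 1000 h, MTTR = 10 h.
a_inh = availability(1000.0, 10.0)
print(f"{a_inh:.4f}")  # 1000 / 1010
```

The same helper applies to the other two ratios by passing the corresponding pair of times.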
[Figure: Cost plotted against Availability. Curves: Acquisition & Support Cost, Cost of Down Time and Total Cost.]
Repair rate:
Limit of the Ratio of the Conditional Probability that the Corrective
Maintenance Action ends in a Time Interval [t, t + Δt] to the length of
that Interval, Δt, when Δt tends to zero, given that the Entity is Faulty
at time t = 0.
Repair Rate is represented by μ(t).
Maintainability:
Ability of an Entity to be restored into or be kept in a Condition or State
that enables it to perform a Required Function, when Maintenance
Operations are performed under Given Conditions and are carried using
Stated Procedures and Resources. It is, thus, the Ability of an Entity to
be repaired in a given time.
Maintainability is normally measured by the Probability that the
Maintenance Procedure of a certain Entity E, performed under Certain
Conditions, is finished by time t, given that the Entity Failed at time t = 0.
$M(t) = \Pr[\text{Maintenance of } E \text{ is completed by time } t, \text{ when } E \text{ fails at time } t = 0]$
Maintainability:
1. Deals with Repair Time: MTTR (Staff, Facilities
& Logistics).
2. A Good (Maintainable) System can be easily
Repaired.
3. Shall depend upon Design.
Reliability (continued):
6. Reliability is 100% if you do not use the
System; it decreases with Use.
7. Poor Reliability results in increased Maintenance
Effort, reduced Revenue and increased Down Time.
[Figure: Maintainability Features. Factors shown around Maintainability: Skills (Levels); Human Factors (Capabilities / Limitations); Repairs; Resources (Levels); Accessibility & Modularisation; Reliability; Spares (Inventory) (Levels); Diagnostics (Manual / Auto); Training (Manuals) (Levels); Preventive / Predictive Maintenance; Fault Isolation (Self Diagnosis); Repairs vs. Discarding (Unit Level); Life-cycle Costs; Standardisation & Level of Interchangeability; Repairs (Sub-systems); Facilities.]
Safety:
1. If compromised, can cause Loss of
Human Life & Property.
2. Causes Repercussions.
3. Safety Standards: CENELEC SIL 4,
3, 2, 1.
4. Probabilistic Fail-safe: No single
point of Failure. One Failure should
not lead to Catastrophe, and the First
Failure should be detected as and
when it occurs.
The System cannot cause harm when it fails.
Example: Redundancy.
Quality:
1. Conformance to laid down
Specifications.
2. A Static Measure of product
meeting its Specifications.
3. Reliability is a Dynamic Measure
of Product Performance.
Reliability:
Ability of an Entity to perform a Required Function under Given Conditions
for a given Time Interval. In other words, an Entity is Reliable if it has not
Failed, i.e. it has stayed within the Specifications over a Time Interval.
$R(t) = \Pr[\text{Entity } E \text{ not failed over Time } (0, t)]$, the Entity being assumed to be
operating at time t = 0.
[Figure: MTTR plotted against MTBF. A line of Slope = (1 - A)/A bounds the Design Region; limits marked Max. MTTR, Min. MTBF and Min. Av.]
Hazard:
Situation, which has the Potential to cause Damage to the System, Damage
to its Surrounding Environment, Injuries or Loss of Human Lives.
Hazard Analysis:
An Analysis comprising Hazard Identification & Causal Analysis.
Hazard Log:
The Document in which Hazards Identified, Decisions Made, Solutions
Adopted and their Implementation Status are recorded.
Safety:
Freedom from Unacceptable Risk of Harm.
Safety Case:
The Documented Demonstration that the Product, System or Process
complies with the appropriate Safety Requirements.
Risk:
Combination of two Criteria: the Probable Frequency of Occurrence of a
Hazard and the Degree of Severity of its Impact.
Frequency of Occurrence    Risk Levels
of a Hazardous Event       Insignificant   Marginal      Critical      Catastrophic
Frequent                   Undesirable     Intolerable   Intolerable   Intolerable
Probable                   Tolerable       Undesirable   Intolerable   Intolerable
Occasional                 Tolerable       Undesirable   Undesirable   Intolerable
Remote                     Negligible      Tolerable     Undesirable   Undesirable
Improbable                 Negligible      Negligible    Tolerable     Tolerable
Incredible                 Negligible      Negligible    Negligible    Negligible

Risk Evaluation: categories are Intolerable (Shall be eliminated), Undesirable, Tolerable and Negligible.
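The risk matrix above is essentially a lookup from (frequency, severity) to a risk level, which can be sketched as a small table in code. This is a minimal illustration only: the names `RISK_MATRIX` and `risk_level` are hypothetical, and the "Incredible" row is read here as Negligible in every column, which is an assumption about the flattened slide.

```python
SEVERITIES = ["Insignificant", "Marginal", "Critical", "Catastrophic"]

# Rows: frequency of occurrence; columns follow SEVERITIES.
RISK_MATRIX = {
    "Frequent":   ["Undesirable", "Intolerable", "Intolerable", "Intolerable"],
    "Probable":   ["Tolerable", "Undesirable", "Intolerable", "Intolerable"],
    "Occasional": ["Tolerable", "Undesirable", "Undesirable", "Intolerable"],
    "Remote":     ["Negligible", "Tolerable", "Undesirable", "Undesirable"],
    "Improbable": ["Negligible", "Negligible", "Tolerable", "Tolerable"],
    # Assumed reading of the slide: all four entries Negligible.
    "Incredible": ["Negligible", "Negligible", "Negligible", "Negligible"],
}


def risk_level(frequency: str, severity: str) -> str:
    """Look up the risk level for a hazard's frequency and severity."""
    return RISK_MATRIX[frequency][SEVERITIES.index(severity)]


print(risk_level("Occasional", "Critical"))    # Undesirable
print(risk_level("Remote", "Insignificant"))   # Negligible
```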
Methods to improve Reliability & Availability of a Product or System:
1. Redundancy or Duplication for Critical Sub-systems.
2. Derating: Operating the System below its Rated Stress Level.
3. Choice of Technology (State-of-Art).
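[Figure: Exponential Reliability Function. R(t) plotted against t for failure rates $\lambda_1 < \lambda_2 < \lambda_3$; each curve starts at R(0) = 1.]
$\lim_{t \to \infty} R(t) = 0$
$F(0) = 0$ and $\dfrac{dF(t)}{dt} = -\dfrac{dR(t)}{dt}$
$f(t) \ge 0$ and $\int_0^\infty f(t)\,dt = 1$
[Figure: Exponential Failure Function. F(t) plotted against t.]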
$MTTF = E(T) = \int_0^\infty t\,f(t)\,dt = -\int_0^\infty t\,\dfrac{dR(t)}{dt}\,dt = \big[-t\,R(t)\big]_0^\infty + \int_0^\infty R(t)\,dt$
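The integration by parts above leaves the integral of R(t) as the surviving term, which can be checked numerically. The sketch below assumes an exponential reliability function R(t) = exp(-λt) and a hypothetical failure rate of 0.002 per hour; the helper name `mttf_by_integration` is illustrative only. The trapezoidal estimate should land very close to 1/λ.

```python
import math


def mttf_by_integration(reliability, t_end: float, steps: int = 200_000) -> float:
    """Trapezoidal estimate of the integral of R(t) from 0 to t_end."""
    dt = t_end / steps
    total = 0.5 * (reliability(0.0) + reliability(t_end))
    for k in range(1, steps):
        total += reliability(k * dt)
    return total * dt


lam = 0.002  # hypothetical failure rate, per hour
# Integrate out to 40 mean lifetimes, where R(t) is negligible.
estimate = mttf_by_integration(lambda t: math.exp(-lam * t), t_end=40.0 / lam)
print(estimate, 1.0 / lam)
```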
Probability Concepts:
If an Experiment can result in any one of N different
equally likely outcomes, and if exactly n of these
outcomes correspond to event A, then the Probability of
event A is P(A) = n/N.
Probability of an Event A, P(A), obeys the following
Postulates:
1. P(A) is non-negative and at most 1: $0 \le P(A) \le 1$.
2. Probability of a Certain Event equals 1.
3. If A & B are Mutually Exclusive Events, $P(A \cup B) = P(A) + P(B)$.
Probability Concepts:
Two Events A & B are independent if the occurrence of A does
NOT depend on the occurrence of B. The Joint Probability of
occurrence of two Independent Events A & B is:
$P(A \cap B) = P(A)\,P(B)$
(The Joint Probability, i.e. the Probability of the Intersection, is
equal to the Product of their Probabilities.)
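[Figure: Venn diagrams of Events A and B, showing their Intersection and the Mutually Exclusive case.]
$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$
$P(A \cap B) = P(B)\,P(A \mid B)$
If two Events A & B are independent, $P(A \mid B) = P(A)$ and $P(B \mid A) = P(B)$.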
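Binomial Distribution:
$p(x) = \binom{n}{x}\,p^{x}\,(1-p)^{n-x}$, where $\binom{n}{x} = \dfrac{n!}{x!\,(n-x)!}$
Mean = $np$
Variance $\sigma^2 = np\,(1-p)$
For p = 0.01 (a Component having 1 chance in 100 of failing) and n = 5,
$p(x=1) = \binom{5}{1}\,(0.01)^{1}\,(0.99)^{4} = 0.048$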
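The worked figure above can be reproduced with a few lines of code. Note that the number of trials is an assumption here: the slide text does not show n explicitly, but n = 5 reproduces the stated result of 0.048 for p = 0.01. The function name `binomial_pmf` is illustrative.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x failures among n independent components."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 5, 0.01  # n = 5 is assumed; p = 0.01 is from the example
print(round(binomial_pmf(1, n, p), 3))  # 0.048
print(n * p, n * p * (1 - p))           # mean and variance
```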
Exponential Distribution:
Failures that are completely Random in nature follow this Distribution.
The PDF (Probability Density Function) is:
$f(t) = \lambda e^{-\lambda t}$ for $t \ge 0$, and
$f(t) = 0$ for $t < 0$
$R(t) = e^{-\lambda t}$
Mean = MTTF = $1/\lambda$
Variance = $\sigma^2 = 1/\lambda^2$ (variability of Failure Time increases as the Reliability increases)
Standard Deviation = $\sigma = 1/\lambda$
[Figure: Exponential Reliability Function. R(t) plotted against t for $\lambda_1 < \lambda_2 < \lambda_3$.]
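A short sketch of the exponential reliability function follows. The failure rate of 1e-4 per hour is a hypothetical value and `exp_reliability` is an illustrative name. One property worth seeing numerically: at t = MTTF the reliability is exp(-1), roughly 0.368, whatever the failure rate.

```python
import math


def exp_reliability(lam: float, t: float) -> float:
    """R(t) = exp(-lam * t) for a constant failure rate lam."""
    if lam <= 0 or t < 0:
        raise ValueError("lam must be positive and t non-negative")
    return math.exp(-lam * t)


lam = 1e-4          # hypothetical failure rate, per hour
mttf = 1.0 / lam    # mean time to failure, hours
print(round(exp_reliability(lam, 1000.0), 3))  # 0.905
print(round(exp_reliability(lam, mttf), 3))    # 0.368
```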
Poisson Distribution:
The Poisson Distribution is a Discrete Distribution. It is applicable for a
Constant Failure Rate. If a Component having a Constant Failure Rate
is immediately repaired or replaced on Failure, the number of Failures
observed over a Time t has a Poisson Distribution.
If $\lambda$ is the Failure Rate,
$p_n(t) = \dfrac{e^{-\lambda t}\,(\lambda t)^{n}}{n!}$, $p_n(t)$ being the probability of n Failures in Time t.
An Example:
Let x be a Discrete Random Variable representing the number of Failures of
a Restorable System over a one year period. If x has a Poisson
Distribution with a Mean of $\lambda$ = 2 Failures per year, the Probability of no
more than one Failure a year shall be:
$\Pr(X \le 1) = F(1) = \sum_{x=0}^{1} \dfrac{e^{-2}\,2^{x}}{x!}$
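The sum in the example above can be evaluated directly. The sketch below uses the mean of 2 failures per year from the example; the function names are illustrative. The result is 3·exp(-2), about 0.406.

```python
import math


def poisson_pmf(n: int, rate: float, t: float = 1.0) -> float:
    """Probability of exactly n failures in time t at a constant failure rate."""
    m = rate * t
    return math.exp(-m) * m**n / math.factorial(n)


def poisson_cdf(n_max: int, rate: float, t: float = 1.0) -> float:
    """Probability of at most n_max failures in time t."""
    return sum(poisson_pmf(n, rate, t) for n in range(n_max + 1))


# Mean of 2 failures per year, observed over one year.
print(round(poisson_cdf(1, 2.0), 3))  # 0.406
```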
Normal Distribution:
The Normal Distribution is NOT a Reliability
Distribution since the random Variable
ranges from $-\infty$ to $+\infty$. The Normal
Distribution, however, has been successfully
used to model Fatigue and Wear-out
Phenomena. The Density Function of the
Normal Distribution is a Bell-shaped Curve.
The PDF is:
$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left[-\dfrac{1}{2}\left(\dfrac{x-\mu}{\sigma}\right)^{2}\right], \quad -\infty < x < \infty$
[Figure: Normal density curves labelled $\sigma^2$ = 0.2, 0.5, 1 and 0.5.]
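To illustrate how the Normal density might be applied to wear-out, the sketch below evaluates the PDF and the cumulative probability of failure before a given time. Everything numeric here is an assumption for illustration: a wear-out life with mean 10 000 h and standard deviation 1 000 h. The function names are hypothetical; the cumulative probability uses the standard identity Φ(z) = ½[1 + erf(z/√2)].

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Bell-shaped density f(x) with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """Probability that the random variable is at most x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


mu, sigma = 10_000.0, 1_000.0           # hypothetical wear-out life, hours
p_fail = normal_cdf(8_000.0, mu, sigma)  # failure before 8 000 h (2 sigma early)
print(round(p_fail, 4), round(1.0 - p_fail, 4))
```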
Series System:
Success of all the Components of the System is essential
for System Success. System Reliability is the Product of
Component Reliabilities:
$R_s = \prod_{i=1}^{n} r_i$
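A minimal sketch of the series rule follows: the system reliability is the product of the component reliabilities. The three component values are hypothetical. The result is lower than the weakest component, which is the practical consequence of a series arrangement.

```python
from math import prod


def series_reliability(component_reliabilities) -> float:
    """All components must work: multiply their reliabilities."""
    return prod(component_reliabilities)


# Hypothetical component reliabilities.
print(round(series_reliability([0.99, 0.98, 0.95]), 4))  # 0.9217
```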
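Parallel System:
If any one Component works well, the System will
work well. The System will fail only if all Components
fail.
System Reliability $R_s = 1 - \prod_{i=1}^{n} (1 - r_i)$
(one minus the Product of Un-reliabilities)
$R_s(t) \ge \max\{r_1(t), r_2(t), r_3(t), \ldots, r_n(t)\}$
For a two Component system in Parallel,
$R_s(t) = 1 - (1 - e^{-\lambda_1 t})(1 - e^{-\lambda_2 t}) = e^{-\lambda_1 t} + e^{-\lambda_2 t} - e^{-(\lambda_1 + \lambda_2)t}$
System $MTTF_s = \int_0^\infty R_s(t)\,dt = \int_0^\infty e^{-\lambda_1 t}\,dt + \int_0^\infty e^{-\lambda_2 t}\,dt - \int_0^\infty e^{-(\lambda_1 + \lambda_2)t}\,dt$
$MTTF_s = \dfrac{1}{\lambda_1} + \dfrac{1}{\lambda_2} - \dfrac{1}{\lambda_1 + \lambda_2}$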
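The parallel formulas above can be sketched as follows. The component reliabilities and failure rates are hypothetical values, and the function names are illustrative. The second function applies only to two components with constant failure rates, as in the derivation above.

```python
from math import prod


def parallel_reliability(component_reliabilities) -> float:
    """System fails only if every component fails."""
    return 1.0 - prod(1.0 - r for r in component_reliabilities)


def parallel_mttf_two(lam1: float, lam2: float) -> float:
    """MTTF of two exponential components in parallel."""
    return 1.0 / lam1 + 1.0 / lam2 - 1.0 / (lam1 + lam2)


print(round(parallel_reliability([0.9, 0.9]), 4))  # 0.99
print(parallel_mttf_two(0.001, 0.002))             # hours, for rates per hour
```

With equal failure rates the second function reduces to 1.5 times the single-component MTTF.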
Bridge Network
Path / Tie Set Method:
Tie Sets of the above Bridge Network: {1,3} {2,4} {1,4,5}
These Tie Sets reveal that the System will work if 1 & 3 work, or
2 & 4 work, or 1, 5 & 4 work.
Cut Set Method:
Cut Sets of the above Bridge Network: {1,2} {3,4}
The System shall NOT function if all Components of a Cut Set fail
simultaneously.
For working out System Reliability, we need to work out the
probabilities of either the Tie Sets or the Cut Sets.
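One way to turn the tie sets into a system reliability is to enumerate every up/down combination of the components and add the probabilities of the combinations in which at least one tie set is fully working. The sketch below uses the tie sets exactly as listed in the text and assumes independent components; the reliability of 0.9 per component and the function name are hypothetical.

```python
from itertools import product

# Tie sets as listed in the text for the bridge network.
TIE_SETS = [{1, 3}, {2, 4}, {1, 4, 5}]


def system_reliability(tie_sets, r: dict) -> float:
    """Sum the probability of every component state in which some tie set works."""
    comps = sorted(r)
    total = 0.0
    for states in product([True, False], repeat=len(comps)):
        working = {c for c, up in zip(comps, states) if up}
        if any(ts <= working for ts in tie_sets):
            p = 1.0
            for c, up in zip(comps, states):
                p *= r[c] if up else 1.0 - r[c]
            total += p
    return total


r = {c: 0.9 for c in range(1, 6)}  # hypothetical, identical components
print(round(system_reliability(TIE_SETS, r), 5))  # 0.97119
```

Enumeration grows as 2^n in the number of components, so it suits small networks such as this one.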
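K out of M System:
At least K of the M Sub-systems must function for System Success.
M-K+1 or more Failures will result in System Failure.
$R_s = \sum_{X=K}^{M} \binom{M}{X}\, r^{X}\,(1-r)^{M-X}$
$MTTF_s = \sum_{X=K}^{M} \dfrac{1}{X\lambda} = \dfrac{1}{\lambda} \sum_{X=K}^{M} \dfrac{1}{X}$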
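The K out of M formulas can be sketched for identical, independent sub-systems as below. The 2 out of 3 arrangement, the sub-system reliability of 0.9 and the function names are illustrative assumptions.

```python
from math import comb


def k_out_of_m_reliability(k: int, m: int, r: float) -> float:
    """Probability that at least k of m identical sub-systems work."""
    return sum(comb(m, x) * r**x * (1 - r) ** (m - x) for x in range(k, m + 1))


def k_out_of_m_mttf(k: int, m: int, lam: float) -> float:
    """MTTF for identical sub-systems with constant failure rate lam."""
    return sum(1.0 / (x * lam) for x in range(k, m + 1))


print(round(k_out_of_m_reliability(2, 3, 0.9), 3))  # 0.972
print(round(k_out_of_m_mttf(2, 3, 1.0), 4))         # 0.8333, in units of 1/lam
```

Setting K = 1 recovers the parallel system and K = M the series system.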
[Figure: Reliability block diagram with blocks R1 to R6 and C; labelled reliability values 0.9, 0.9, 0.98, 0.98, 0.99 and 0.99.]
Failure Rate:
It is the Transition Function between a Working State and a Failed
State of a Component, Sub-system or System. It can be analytically
expressed as the Probability of a Failure to occur in a Time Interval
given that the Component was working up until then. Being a
Transition Function, it deals with Short Time Intervals.
$\lambda(t) = \lim_{\Delta t \to 0} \dfrac{P\{t < T \le t + \Delta t \mid T > t\}}{\Delta t}$
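The limit above can be approximated numerically from a reliability function, since the conditional probability of failing in a short interval equals [R(t) - R(t + Δt)] / R(t). The sketch below assumes an exponential R(t) with a hypothetical rate of 0.5 per unit time, for which the estimate should come out close to that constant rate at any t. The function name is illustrative.

```python
import math


def failure_rate(reliability, t: float, dt: float = 1e-6) -> float:
    """Approximate P{t < T <= t + dt | T > t} / dt for a small dt."""
    r_t = reliability(t)
    return (r_t - reliability(t + dt)) / (dt * r_t)


lam = 0.5  # hypothetical constant failure rate
h = failure_rate(lambda t: math.exp(-lam * t), t=3.0)
print(round(h, 4))
```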
[Figure: Markov state transition diagram for a two-Component system; transition labels carry the subscripts 1 and 2.]
State   Component 1   Component 2
1       Operating     Operating
2       Failed        Operating
3       Operating     Failed
4       Failed        Failed
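$MTTF = \int_0^\infty R(t)\,dt = \int_0^\infty (2e^{-\lambda t} - e^{-2\lambda t})\,dt = \left[\dfrac{2e^{-\lambda t}}{-\lambda}\right]_0^\infty - \left[\dfrac{e^{-2\lambda t}}{-2\lambda}\right]_0^\infty$
$MTTF = \dfrac{2}{\lambda} - \dfrac{1}{2\lambda} = \dfrac{1.5}{\lambda}$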
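The 1.5/λ result can be cross-checked by stepping a four-state Markov model forward in time and accumulating R(t), the probability of not being in the state where both components have failed. This is a rough sketch under stated assumptions: two identical components with the same constant failure rate, no repair, and a simple Euler time step. The state numbering and the function name are illustrative.

```python
def parallel_markov_mttf(lam: float, steps_per_unit: int = 1000,
                         horizon: float = 40.0) -> float:
    """Euler integration of a 4-state model; horizon is in units of 1/lam."""
    dt = 1.0 / (lam * steps_per_unit)
    # p1: both operating, p2 and p3: exactly one failed, p4: both failed.
    p1, p2, p3, p4 = 1.0, 0.0, 0.0, 0.0
    mttf = 0.0
    for _ in range(int(horizon * steps_per_unit)):
        mttf += (p1 + p2 + p3) * dt      # accumulate R(t) * dt
        d1 = -2.0 * lam * p1             # either component may fail
        d2 = lam * p1 - lam * p2         # enter from p1, leave on second failure
        d3 = lam * p1 - lam * p3
        d4 = lam * (p2 + p3)             # absorbing failed state
        p1, p2, p3, p4 = p1 + d1 * dt, p2 + d2 * dt, p3 + d3 * dt, p4 + d4 * dt
    return mttf


lam = 0.01  # hypothetical failure rate
print(round(parallel_markov_mttf(lam) * lam, 3))  # 1.5
```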
Thanks for
Kind Attention