Reliability Engineering
Presented By
Protik Chandra Biswas
Assistant Professor
Department of Electrical and Electronic Engineering
Khulna University of Engineering & Technology,
Khulna-9203, Bangladesh
Syllabus
Reliability Concepts
Failure rate, outage, mean time of failure, series and parallel systems
and redundancy, reliability evaluation techniques
➢ Reliability study is considered essential for proper utilization and maintenance of engineering systems and
equipment.
➢ It makes possible more effective use of resources and results in an increase in productivity and decrease in wastage
of money, material and manpower.
➢ Reliability studies of high risk systems like chemical projects, nuclear reactors, space missions, aircraft systems are
a must to minimize the risk of failure of such systems.
➢ This is necessary because human error, or the failure of some parts or components, may lead to catastrophic
damage to the system, resulting in heavy losses of men, money and machines.
➢ The disasters at Bhopal and Chernobyl, the failure of the space lab and Apollo 13 are a few examples of such threats. These disasters
demand serious investigation and research to ensure a high degree of reliability of such high-risk systems before
they are commissioned.
➢ Maintaining safety and producing reliable products are the two aspects of the reliability field.
➢ The reliability of a device is a quality of that device; however, it is not a quality which can be measured directly.
➢ Reliability is the probability of a successful operation of the device in the manner and under the
conditions of intended customer use.
➢ Reliability, in its simplest form, means the probability that a failure will not occur in a given time
interval.
➢ Reliability of a unit (or product) is the probability that the unit performs its intended function
adequately for a given period of time under the stated operating conditions or environment.
$R(t) = P(T > t)$ … (1)
$R(0) = 1,\; R(\infty) = 0$ … (2)
Reliability (R)
$R = e^{-ft} = e^{-t/T}$
or $R(t) = e^{-ft}$ … (4)
where $f = \dfrac{1}{T}$ … (5)
➢ f is called the constant failure rate and is the average number of failures which occur in a
standard interval of time.
[Figure: reliability R versus time t, with T/20, T/10 and T marked on the time axis; R falls to 0.37 at t = T.]
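The exponential law in Eqs. (4) and (5) can be evaluated numerically. The following is a minimal sketch, assuming a constant failure rate; the function name `reliability` and the mean life of 1000 hours are hypothetical values chosen only for illustration.

```python
import math


def reliability(t: float, mean_life: float) -> float:
    """Return R(t) = exp(-f t) for a constant failure rate f = 1 / T."""
    if mean_life <= 0:
        raise ValueError("mean_life must be positive")
    failure_rate = 1.0 / mean_life          # Eq. (5): f = 1/T
    return math.exp(-failure_rate * t)      # Eq. (4): R(t) = exp(-f t)


if __name__ == "__main__":
    T = 1000.0  # hypothetical mean life in hours (assumption, not from the slides)
    for t in (0.0, T / 20, T / 10, T):
        print(f"t = {t:7.1f} h   R(t) = {reliability(t, T):.3f}")
```

At t = T the function returns e^{-1}, about 0.37, which is the value marked on the reliability curve.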
➢ Early failures are primarily due to manufacturing defects, such as weak parts, poor insulation, bad assembly,
poor fits, etc. Since the defective units are eliminated during the initial failure period, this period is known as the
debugging or burn-in period.
➢ After the initial failure period, there is a long period of operation in which fewer failures are reported, but it is difficult to determine
their cause. Failures during this period are often called random failures or catastrophic failures. This is the
period of normal operation and is characterized by an (approximately) constant number of failures per unit time.
➢ As time passes, the units wear out and begin to deteriorate. This region is called the wear-out region.
[Figure: failure rate versus time, with t1 and t2 marked on the time axis and points S1 and S2 marked on the failure-rate axis.]
Unreliability
➢ The opposite of reliability is known as unreliability.
➢ Unreliability is the probability of failure in time t. If R(t) is the probability of success and Q(t) is the probability of
failure of a system in time t, then the sum of R(t) and Q(t) is unity. So,
$R(t) + Q(t) = 1$ … (6)
or $Q(t) = 1 - R(t)$ … (7)
➢ If the reliability R(t) of a system for time t is, for example 90%, then the unreliability Q(t) for the same system and
same time is 10%.
[Figure: $R = e^{-t/T}$ and $Q = 1 - R = 1 - e^{-t/T}$ versus time t; the vertical axis marks 0.37, 0.50, 0.63 and 1.0.]
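The relation Q(t) = 1 − R(t) can be checked numerically for the exponential case. This is a small sketch under the constant-failure-rate assumption; the helper names `unreliability` and `time_to_unreliability` and the mean life of 1000 hours are hypothetical.

```python
import math


def unreliability(t: float, mean_life: float) -> float:
    """Return Q(t) = 1 - R(t) = 1 - exp(-t / T)."""
    return 1.0 - math.exp(-t / mean_life)


def time_to_unreliability(q: float, mean_life: float) -> float:
    """Solve Q(t) = q for t, which gives t = -T ln(1 - q)."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must satisfy 0 <= q < 1")
    return -mean_life * math.log(1.0 - q)


if __name__ == "__main__":
    T = 1000.0  # hypothetical mean life in hours
    print(f"Q(T)          = {unreliability(T, T):.2f}")
    print(f"t where Q=0.5 = {time_to_unreliability(0.5, T):.1f} h")
```

With these definitions Q(T) is about 0.63 and R(T) about 0.37, and the two curves cross at Q = R = 0.50 when t = T ln 2, roughly 0.69 T.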
Maintenance
➢ The important period in the life cycle of a product or a system is its operating period. Since no product is perfect, it is
likely to fail.
➢ However, its life time can be increased if it can be repaired and put into operation again.
➢ In many cases preventive measures are possible and a judiciously designed preventive maintenance policy can help
eliminate failures to a large extent.
➢ The adage “prevention is better than cure” applies to products and equipment as well.
Human reliability
➢ It is impossible to completely eliminate the human involvement in the operation and maintenance of systems.
➢ The contribution of human errors to the unreliability may be at various stages of the product cycle.
➢ Failures caused by human error can be due to
▪ Lack of understanding of the equipment
▪ Lack of understanding of the process
▪ Carelessness
▪ Forgetfulness
▪ Poor judgmental skills
▪ Absence of correct operating procedures and instructions and
▪ Physical inability
➢ Although it is not possible to eliminate all human errors, it is possible to minimize some of them by the proper selection and
training of personnel, simplification of control schemes and other incentive measures.
➢ The designer should ensure that the operation of the equipment is as simple as possible with practically minimum
probability for error.
$\text{Availability} = \dfrac{\text{Uptime}}{\text{Uptime} + \text{Downtime}}$
➢ The denominator is equal to the total time for which the equipment is required to function and the up-time is the
actual period for which the equipment is available for use.
➢ The down-time can include, in addition to active repair-time, administrative and other delays related to repair.
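The availability ratio can be computed directly from recorded up-time and down-time. The sketch below is illustrative only: the function name `availability` and the hours used in the example are hypothetical, and the down-time is assumed to already include administrative and other repair-related delays.

```python
def availability(uptime: float, downtime: float) -> float:
    """Return uptime / (uptime + downtime).

    downtime is assumed to include active repair time plus
    administrative and other delays related to repair.
    """
    total = uptime + downtime
    if uptime < 0 or downtime < 0 or total <= 0:
        raise ValueError("times must be non-negative and not both zero")
    return uptime / total


if __name__ == "__main__":
    # Hypothetical year of operation: 40 h active repair + 20 h delays.
    active_repair, delays = 40.0, 20.0
    up = 8760.0 - (active_repair + delays)
    print(f"Availability = {availability(up, active_repair + delays):.4f}")
```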
➢ The factors which mainly contribute to the above measures are:
➢ The methods should be as general as possible and applicable to systems with:
➢ The information at the IN end will reach the OUT end only if all the n components function satisfactorily.
➢ Let $E_i$ denote the event that component i is good (i.e. functions satisfactorily) and $\bar{E}_i$ the event that
component i is bad.
➢ The event representing system success is then the intersection of E1, E2, …., En. The reliability of the system is the
probability of this event and is given by
$R = \Pr(E_1 \cap E_2 \cap \cdots \cap E_n)$
$\;\;\, = \Pr(E_1)\,\Pr(E_2 / E_1)\,\Pr(E_3 / E_1 E_2) \cdots$ … (1)
➢ $\Pr(E_2 / E_1)$ means the probability of the event $E_2$ on the condition that $E_1$ has occurred.
➢ However, we have assumed that the components are independent and therefore
$R = \Pr(E_1)\,\Pr(E_2) \cdots \Pr(E_n)$ … (2)
$R = 1 - \Pr(\bar{E}_1 \cup \bar{E}_2 \cup \cdots \cup \bar{E}_n)$ … (3)
$R(t) = p_1(t) \cdots p_n(t) = \prod_{i=1}^{n} p_i(t)$ … (4)
➢ where $p_i(t)$ is the probability that component i is good at time t. If the times to failure of the components are
exponentially distributed, then
$p_i(t) = \exp(-f_i t) = \exp(-t / T_i)$
$R(t) = \prod_{i=1}^{n} \exp(-f_i t) = \exp\left(-t \sum_{i=1}^{n} f_i\right)$ … (5)
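The product rule for a series system, and its exponential form in which the component failure rates simply add, can be sketched as follows. The function names and the failure rates in the example are hypothetical; independence of the components is assumed, as in the text.

```python
import math
from typing import Sequence


def series_reliability(p: Sequence[float]) -> float:
    """Series system of independent components: R = product of p_i."""
    return math.prod(p)


def series_reliability_exponential(failure_rates: Sequence[float], t: float) -> float:
    """Series system with exponential lifetimes: R(t) = exp(-t * sum(f_i))."""
    return math.exp(-t * sum(failure_rates))


if __name__ == "__main__":
    rates = [1e-4, 2e-4, 5e-5]  # hypothetical failure rates per hour
    t = 1000.0                  # mission time in hours (assumption)
    component_p = [math.exp(-f * t) for f in rates]
    print(f"product of p_i(t)  = {series_reliability(component_p):.6f}")
    print(f"exp(-t * sum(f_i)) = {series_reliability_exponential(rates, t):.6f}")
```

Both printed values agree, which illustrates that a series system of exponential components behaves like a single component whose failure rate is the sum of the individual rates.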
$R = p^n = (1 - q)^n$ … (7)
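[Figure: units connected in parallel between IN and OUT.]
$R(t) = 1 - \prod_{i=1}^{m} q_i(t) = 1 - \prod_{i=1}^{m} [1 - p_i(t)]$ … (8)
or, for identical units, $Q(t) = [q(t)]^m$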
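Eq. (8) for a parallel system can be sketched in a few lines. The function name `parallel_reliability` and the unit reliability of 0.6 are hypothetical; the units are assumed independent.

```python
import math
from typing import Sequence


def parallel_reliability(p: Sequence[float]) -> float:
    """Parallel system of independent units: R = 1 - product of (1 - p_i)."""
    system_unreliability = math.prod(1.0 - p_i for p_i in p)  # Q = product of q_i
    return 1.0 - system_unreliability


if __name__ == "__main__":
    p = 0.6  # hypothetical reliability of each identical unit
    for m in (1, 2, 3):
        print(f"m = {m}: R = {parallel_reliability([p] * m):.3f}")
```

For identical units this reduces to R = 1 − q^m, so each added unit multiplies the system unreliability by q.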
➢ Structural redundancy provides a very effective means of improving system reliability. This involves duplication of
paths at component, subsystem or even system level and appears to be the only effective solution when components
with high reliability and/or over rated components are not available.
➢ Maintenance and repair, whenever possible, undoubtedly boost the system reliability. A maintained system when
combined with redundancy may have a reliability of almost one.
Redundancy technique
➢ A redundant system having m units connected in parallel is referred to as an m-order system.
➢ The performance requirements of such a system may impose a condition that at least a minimum of k of its m units
should be operational for the system success.
➢ These k units are known as basic units and the remaining (m-k) units are known as redundant units (added for the
purpose of increasing the system reliability).
➢ Such systems are classified as k-out-of-m systems.
➢ Series (k = m) and parallel (k = 1) are the special cases of a k-out-of-m model. Examples of this type of system are
as follows:
▪ In an eight-cylinder automobile it may be possible to drive the car if only four cylinders are firing, but if fewer
than four fire, then the automobile cannot be driven (k = 4, m = 8).
▪ In a communication system with three transmitters, the average message load may be such that at least two
transmitters must be operational at all times, otherwise critical messages will be lost (k = 2, m = 3).
▪ A four engine aircraft needs only two engines to perform critical functions (m = 4, k = 2)
▪ A bridge supported by n cables may require only r cables to support the maximum load (m = n, k = r)
➢ When incorporating redundancy into a system, the main problem is to determine the values of m and k under certain
constraints of reliability, mean life, cost, performance, etc.
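The reliability of a k-out-of-m system is not given in the text above. A common way to evaluate it, assuming m identical and statistically independent units each with reliability p, is the binomial sum over the number of working units. The sketch below uses that assumption; the function name and the value p = 0.9 are hypothetical.

```python
from math import comb


def k_out_of_m_reliability(k: int, m: int, p: float) -> float:
    """Probability that at least k of m identical, independent units work.

    Assumes every unit has the same reliability p (binomial model).
    """
    if not 1 <= k <= m:
        raise ValueError("require 1 <= k <= m")
    return sum(comb(m, i) * p**i * (1.0 - p) ** (m - i) for i in range(k, m + 1))


if __name__ == "__main__":
    p = 0.9  # hypothetical unit reliability
    print(f"2-out-of-3 transmitters: R = {k_out_of_m_reliability(2, 3, p):.3f}")
    print(f"2-out-of-4 engines:      R = {k_out_of_m_reliability(2, 4, p):.4f}")
    print(f"series   (k = m = 3):    R = {k_out_of_m_reliability(3, 3, p):.3f}")
    print(f"parallel (k = 1, m = 3): R = {k_out_of_m_reliability(1, 3, p):.3f}")
```

Setting k = m reproduces the series result p^m and setting k = 1 reproduces the parallel result 1 − (1 − p)^m, consistent with the special cases noted above.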
Types of redundancy
➢ Redundancy applied either in component level or in system level may be any one of the following four types:
▪ Active redundancy (Hot redundancy)
▪ Standby redundancy (Cold redundancy)
▪ Voting redundancy and
▪ Spinning redundancy
➢ In active redundancy, all the units used in the system remain active for all the time.
➢ In standby redundancy, the excess unit remains inactive and becomes active only when an active unit fails to
operate. However, standby redundancy also involves the use of failure sensing and switching devices. If the sensing
and switching devices themselves are not reliable, standby redundancy can become a useless proposition.
➢ Voting redundancy may also be used in systems where computer application is necessary.
➢ Spinning redundancy is used where a system/unit needs warm up time before taking full load. The redundant
unit remains active without load and takes over the charge immediately after the failure of the main unit, thereby
saving the warm up time. For example, in a power generation system, a redundant steam turbine always rotates
without load and when the system fails, the turbine takes over the charge.
▪ The simplest and most straightforward approach is to provide a duplicate path for the entire system itself. This is
known as system or unit redundancy.
▪ Another approach is to provide redundant paths for each component individually. This is called component
redundancy.
▪ The third method suggests that the weak components should be identified and strengthened for reliability.
This approach is useful when we consider reliability and cost optimization problems.
▪ The last approach is to appropriately mix the above techniques depending upon the system configuration and
reliability requirements. This approach is known as mixed redundancy.
[Figure: a unit of two components C1 and C2 in series, shown with unit redundancy and with component redundancy.]
➢ Assuming statistically independent and identical units at each element level, the reliability of the unit redundant
system is
$R_u = 1 - (1 - p_1 p_2)^2$
and the reliability of the component redundant system is
$R_c = [1 - (1 - p_1)^2][1 - (1 - p_2)^2]$
➢ If $p_1 = p_2 = p$, these become
$R_u = 2p^2 - p^4$ … (3)
$R_c = p^2 (2 - p)^2$ … (4)
➢ Then $R_c - R_u = p^2 [(2 - p)^2 - (2 - p^2)] = 2p^2 (1 - p)^2$ … (5)
➢ Eq. (5) clearly shows that $R_c - R_u > 0$ for $0 < p < 1$. Of course, $R_c - R_u = 0$ for $p = 1$ and $p = 0$.
➢ This proves that redundancy at the component level is better than redundancy at the unit level as
far as reliability is concerned.
Redundancy
➢ This analysis can be extended to a more general case where the unit consists of n components in series. Assuming
that m-1 components are put in parallel in each stage, the reliability of the system would be
$R_c = [1 - (1 - p)^m]^n$ … (6)
➢ In the case of unit redundancy, m-1 units are added across the primary unit and therefore its reliability is
$R_u = 1 - (1 - p^n)^m$ … (7)
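Eqs. (6) and (7) can be compared numerically. The following sketch assumes identical, independent components with reliability p, n components in series per unit and m parallel copies; the function names and sample values are hypothetical.

```python
def component_redundancy(p: float, n: int, m: int) -> float:
    """Eq. (6): each of n series stages has m parallel components."""
    return (1.0 - (1.0 - p) ** m) ** n


def unit_redundancy(p: float, n: int, m: int) -> float:
    """Eq. (7): m parallel copies of a unit of n series components."""
    return 1.0 - (1.0 - p**n) ** m


if __name__ == "__main__":
    n, m = 2, 2
    for p in (0.5, 0.7, 0.9):
        rc = component_redundancy(p, n, m)
        ru = unit_redundancy(p, n, m)
        print(f"p = {p:.1f}:  R_c = {rc:.4f}  R_u = {ru:.4f}  R_c - R_u = {rc - ru:.4f}")
```

For n = m = 2 the two functions reduce to p^2 (2 − p)^2 and 2p^2 − p^4, and the printed difference is never negative, in line with the conclusion that component-level redundancy gives the higher reliability.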
➢ Another important redundancy technique is partial redundancy, popularly known as the k-out-of-m
system. We know that a k-out-of-m system becomes a series structure when k = m and a parallel structure when k
= 1.
Weakest-link technique
➢ The reliability of a series structure is at most equal to the reliability of the weakest component of the structure.
➢ Consider a simple system having two items of equipment, A and B, in series. Their probabilities of failure-free operation are
0.9 and 0.6 respectively. This system has a reliability of 0.9 × 0.6 = 0.54, which is less than 0.6, the reliability of the
weaker item.
➢ The system reliability can be improved by one of the following ways:
▪ Apply redundancy across A only
▪ Apply redundancy across B only
▪ Apply redundancy across A and B individually (component redundancy)
▪ Apply redundancy across both A and B together (unit redundancy)
➢ Various system configurations and their resultant reliabilities are shown in Table A. Their reliabilities show that
the application of redundancy across a weaker equipment results in higher reliability compared to the
redundancy across the stronger equipment.
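Table A: System configurations and their reliabilities ($p_a = 0.9$, $p_b = 0.6$)
(b) Two A units in parallel, in series with B: $R = [1 - (1 - p_a)^2]\,p_b = 0.594$
(c) A in series with two B units in parallel: $R = p_a [1 - (1 - p_b)^2] = 0.756$
(e) Two A units in parallel, in series with two B units in parallel: $R = [1 - (1 - p_a)^2][1 - (1 - p_b)^2] = 0.832$
(f) A in series with three B units in parallel: $R = p_a [1 - (1 - p_b)^3] = 0.842$
Redundancy (Mathematics)
(g), (h), (i): Home Task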
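The reliabilities quoted for the weakest-link example can be reproduced with two small helpers for series and parallel blocks. This is a sketch assuming independent units with p_a = 0.9 and p_b = 0.6 as in the example; the helper names are hypothetical, and the unit-redundancy value is computed here rather than taken from Table A.

```python
import math


def series(*p: float) -> float:
    """Independent blocks in series: product of the block reliabilities."""
    return math.prod(p)


def parallel(*p: float) -> float:
    """Independent blocks in parallel: 1 - product of block unreliabilities."""
    return 1.0 - math.prod(1.0 - x for x in p)


PA, PB = 0.9, 0.6  # reliabilities of A and B from the example

configs = {
    "A and B in series": series(PA, PB),
    "(b) redundancy across A only": series(parallel(PA, PA), PB),
    "(c) redundancy across B only": series(PA, parallel(PB, PB)),
    "(e) redundancy across A and B individually": series(parallel(PA, PA), parallel(PB, PB)),
    "(f) A with three B units in parallel": series(PA, parallel(PB, PB, PB)),
    # Not read from Table A: computed from the unit-redundancy definition.
    "unit redundancy (A-B path duplicated)": parallel(series(PA, PB), series(PA, PB)),
}

if __name__ == "__main__":
    for name, r in configs.items():
        print(f"{name:45s} R = {r:.3f}")
```

Comparing (b) with (c) shows the point of the technique: duplicating the weaker unit B raises the reliability far more than duplicating the stronger unit A.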
Mixed redundancy
➢ Component and unit redundancies are simple to design and easy to implement.
➢ However, they are not the best configurations and there might be scope for further improvement in their
reliability-cost ratios.
➢ In configuration (h) in Table A, the reliability of a weak component is improved first and then unit redundancy is
applied. Its reliability is higher than the reliability of the simple component redundancy.
Cost
➢ The effect of an increase in reliability on the cost of any product is shown in the figure.
➢ The initial cost increases but the operating cost decreases with reliability, and hence there exists a value of
reliability for which the cost is minimum.
[Figure: cost versus reliability, with curves for design and production cost and for maintenance and repair cost.]
[Figure: reliability versus cost; the reliability axis marks 0.7, 0.8 and 0.9, with increments ΔR and ΔC indicated.]
➢ The figure shows that beyond a certain point, any increase in expenditure, no matter how large, will not result in a
significant increase in the system reliability.
THANKS TO ALL