A GOAL WITHOUT A PLAN…
IS JUST A WISH
Intro
FOR DESIGN AND
MANUFACTURING
MTBFMONTREAL.CA
ISBN-13: 978-1607730606
ISBN-10: 160773060X
REFERENCE 1
REFERENCE 4
(OLD REFERENCE)
Definition phase:
- benchmarking,
- reliability by similarity (similar products performance analysis)
Design phase:
- evaluate design capabilities by reliability predictions
- FMEA (failure mode and effect analysis),
- fault trees, physics of failure
- HALT for design (identify weakness and increase the final design’s
reliability by early testing)
Dr. Sorin Voiculescu
Validation phase:
- reliability by similarity (taking into account design changes)
- reliability growth
Manufacturing phase:
- reduce the risks potentially induced by the production line using ESS testing
Operational phase:
- FRACAS (follow-up field performance)
- early trends detection and corrective actions
- real-time health monitoring
- optimize the maintenance tasks
- maintenance models
- reliability centered maintenance
- MSG3
Reference: WEB
Above values are fictive numbers intended to highlight the need of a reliability program plan.
Disclaimer: This example is only intended to present a specific problem on a particular product of Honeywell. It is not intended to harm in any way the good image of Honeywell. Remember: there is an improvement
potential in any product of any Company. Honeywell removed the product from the market.
Reliability
Failure
Types of Reliability
System break-down
FTA (basics)
[Figure: reliability R(t), between 1 and 0, versus time t over the life time]
Conditions
time to failure
physical, chemical, mechanical stresses…
number of cycles to failure
❑ Item
The item is what we are studying; it embeds:
Components: component quality plays an important role
❑ Required function
This should be defined for every part, subassembly, and product. The
statement of the required function should explicitly state or imply a
failure definition. For example, a pump's required function might be
Reference: Z. Klim
Reference: WIKIPEDIA
[Figure: function level versus time, with failure states: degraded, failed, intermittent, nuisances]
Failure is an event which causes the system performance to deviate
from the specified performance
1 https://www.consumerreports.org/cro/news/2009/11/q-a-why-do-car-batteries-die-in-winter/index.htm
Reference: WEB
In order to assess the impact of the failure on the top level function,
one must assess the link between these two. A typical breakdown
represents the link starting from the functional level, going down to
technical functions ensuring the upper function and linking these
technical functions to the physical piece-part/component/installation
Reference: WEB
Two of the most popular tools used to assess the impact and the link
between these failures are referenced in the following slides. Specifics
on these tools will be the topic of a future lecture.
Reference: WEB
Time can be counted in hours, km / miles, or number of cycles, and measured:
- up to the 1st failure
- between 2 consecutive failures
one cannot provide the precise time of arrival but can give the probability of
occurrence before a certain date
risk
Severity
Probability RISK
Reference: WIKIPEDIA
❑ safety
❑ media impact
❑ availability
❑ Design Reliability
The design reliability of a product is the predicted reliability performance of
the product at the end of the development phase.
The prediction may be based on field experience from similar products, testing,
[Figure: progression of estimates, from intrinsic characteristics to real characteristics (depending on usage and maintenance) and operational availability]
INDU 6391 DR. SORIN VOICULESCU 75
TYPES OF RELIABILITY
customer and differing from the nominal values used in the design process.
It also depends on the maintenance actions carried out by the customers during
the use of the product.
Reference: Z. Klim
E.g. the risk of derailment of a train car is 7.8 derailments per billion
freight car-miles (FCM)
This is equivalent to 7.8E-9 per car-mile.
For a 1,500 miles distance (Montreal to Orlando), this probability
becomes 1.17E-5 / mission
1.17E-5 = 7.8E-9 * 1,500
For a life of 10 years and 180 missions a year, the probability
becomes 2.8E-2*
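The arithmetic of the derailment example can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the per-mission value uses the rare-event approximation (rate times exposure), and the life-time value assumes independent missions. With only the inputs stated above, the life-time result comes out at about 2.1E-2; the footnoted slide value may rest on assumptions that are not reproduced here.

```python
# Inputs copied from the derailment example; everything else is an assumption.
RATE_PER_CAR_MILE = 7.8e-9   # 7.8 derailments per billion freight car-miles
MISSION_MILES = 1_500        # Montreal to Orlando
MISSIONS_PER_YEAR = 180
YEARS = 10


def mission_probability(rate_per_mile: float, miles: float) -> float:
    """Rare-event approximation: probability is roughly rate * exposure."""
    return rate_per_mile * miles


def cumulative_probability(p_mission: float, n_missions: int) -> float:
    """Probability of at least one event over n independent missions."""
    return 1.0 - (1.0 - p_mission) ** n_missions


p_mission = mission_probability(RATE_PER_CAR_MILE, MISSION_MILES)
n_missions = MISSIONS_PER_YEAR * YEARS
p_life = cumulative_probability(p_mission, n_missions)

print(f"per mission: {p_mission:.3e}")
print(f"over {n_missions} missions: {p_life:.3e}")
```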
The maintenance action, ideally, reduces the risk value to its original
value (as good as new).
Risk-based maintenance (RBM) prioritizes maintenance resources
toward assets that carry the most risk if they were to fail. It is a
MTTF
MTBF
Maintainability
Availability
Failure rate
For a given value of time tfix, one can compute the reliability of a
system, R(tfix), and express it as a fixed value. Industry often
uses a statement like “Reliability of 93%”; such a statement always
implies a fixed time.
[Figure: m/n redundancy block diagram with components C2, …, Ci, …, Cn]
The manner in which the replicates are put to use depends on the type of
redundancy:
❑ In active redundancy, all (M) components of the module are in their
operational state, or “fully energized,” when put into use.
❑ In passive redundancy, only one component is in its fully energized state
and the remaining are either partially energized (warm standby) or kept
Reference: Z. Klim
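For active redundancy, the reliability of an m/n module can be sketched with the binomial formula. This is a minimal illustration, assuming n identical, independent components that each survive with probability r, and a module that works when at least m of them work; the function name is hypothetical.

```python
from math import comb


def m_out_of_n_reliability(m: int, n: int, r: float) -> float:
    """Reliability of an active-redundant module that needs at least
    m working components out of n identical, independent ones."""
    if not 0 <= m <= n:
        raise ValueError("m must be between 0 and n")
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must be a probability")
    # Sum the probabilities of exactly i survivors, for i = m .. n.
    return sum(comb(n, i) * r**i * (1.0 - r) ** (n - i) for i in range(m, n + 1))


r = 0.9
series = m_out_of_n_reliability(3, 3, r)        # all three needed
two_of_three = m_out_of_n_reliability(2, 3, r)  # 2/3 voting
parallel = m_out_of_n_reliability(1, 3, r)      # any one is enough

print(series, two_of_three, parallel)
```

The 3/3 case reduces to a series system and the 1/3 case to a pure parallel system, which gives two easy sanity checks.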
http://www.ecs.umass.edu/ece/koren/FaultTolerantSystems/simulator/NonSerPar/nsnpframe.html
MTBF
Let’s assume the following data: over the last 12 months, the fleet has cumulated 100,000km (so time definition is in km) up to the first failure.
100,000km/5 = 20,000km
The above is leading to a MTTF = 400,000km/20 = 20,000km

Data table (rows 14 to 30; column headers not preserved):
14   1,600.00   1
15   1,500.00
16   1,000.00   1
17   2,500.00   1
18   5,500.00   1
19   5,900.00
20   1,700.00   1
21   2,600.00   1
22   3,400.00
23   4,500.00   1
24   4,600.00   1
25   5,800.00   1
26   6,800.00   1
27   1,300.00   1
28   5,900.00   1
29   1,400.00   1
30   6,800.00   2
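The point estimate used in this example is simply the cumulated fleet time divided by the number of failures. A minimal sketch, with a hypothetical function name and the two pairs of numbers quoted above:

```python
def mean_time_to_failure(cumulated_time: float, failures: int) -> float:
    """Point estimate: cumulated operating time (here in km) per failure."""
    if failures <= 0:
        raise ValueError("at least one failure is needed for a point estimate")
    return cumulated_time / failures


estimate_small_fleet = mean_time_to_failure(100_000, 5)
estimate_large_fleet = mean_time_to_failure(400_000, 20)

print(estimate_small_fleet, estimate_large_fleet)
```

Both data sets give the same point estimate; the number of failures behind each estimate is what differs.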
1 Removal
  2 Unscheduled Removal: known or suspected malfunction
    4 Failure / Fault: failure/fault found
      6 Failure / Fault (labelled "Predictions"): unit used within specification
        8 Confirmed / Accepted Failure / Fault: failure/fault substantiates the reason for removal
        9 Unconfirmed Failure / Fault: failure/fault does not substantiate the reason for removal
      7 Induced Failure / Fault: unit used out of specification
    5 Unjustified Removal: No Failure / No Fault Found (NFF)
  3 Scheduled Removal: unit removed to perform maintenance
DATA – HOURS TO FAILURE EXAMPLE
The following pages present an example of a unit that should meet, by
contract, a minimum MTBUR value of 4,000 hours.
The overall analysis of the unit’s performance, measured by the MTBUR = (sum
of time to failure) / (number of failures), shows compliance with the contractual
value.
The first approach is to take each individual item contributing to the analysis
and to enter its known operating time (first column), removal (second
column) or suspension (3rd column).
Using the time to removal or to suspension for each individual unit in the fleet
can offer the performance over the life of the product since entry into service.
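The per-unit approach can be sketched as follows. This is an illustration only: the record layout, the function name and the four fleet records are hypothetical, and the sketch assumes that the operating time of suspended units (no removal yet) still counts in the numerator. Only the 4,000-hour contractual value comes from the example.

```python
from dataclasses import dataclass


@dataclass
class UnitRecord:
    hours: float   # known operating time of the unit
    removals: int  # unscheduled removals; 0 means the unit is a suspension


def mtbur(records: list[UnitRecord]) -> float:
    """Cumulated operating time divided by the number of unscheduled removals."""
    total_hours = sum(r.hours for r in records)
    total_removals = sum(r.removals for r in records)
    if total_removals == 0:
        raise ValueError("no removals recorded: MTBUR point estimate undefined")
    return total_hours / total_removals


CONTRACT_MTBUR_HOURS = 4_000  # contractual minimum from the example

# Hypothetical fleet data, for illustration only.
fleet = [
    UnitRecord(hours=5_000, removals=1),
    UnitRecord(hours=7_000, removals=1),
    UnitRecord(hours=6_000, removals=0),  # suspension
    UnitRecord(hours=2_000, removals=2),
]

fleet_mtbur = mtbur(fleet)
compliant = fleet_mtbur >= CONTRACT_MTBUR_HOURS
print(fleet_mtbur, compliant)
```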
approach should give the same result as the previous approach using the individual
contribution of each item in the sample size.
Reference: https://havrel.honeywell.com/docs/index.cfm?content=help/FAQ.cfm#8
[Figure: monthly trend chart, horizontal axis from 201101 to 201611, vertical axis marks 18,000 and 1,800]
Identify the driving removal reason and the driving failure mode(s)
Understand the field root-causes initiating these failure modes
Update FMEA
Implement actions to reduce/eliminate the impact of these root-causes (either by eliminating the root
causes or by reducing/eliminating the impact)
Monitor the effectiveness of the measures by tracing the performance of the units with these corrective
actions implemented
EXAMPLE OF USE
Reference: Google books: Final Report on the Fuel Control system of the F100 Engine
[Figure: UNIT KM (0 to 25,000) versus date, Feb-05 to Feb-13]
B10 is the operating time by which 10% of a sample of devices
will have failed.
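As an illustration of the B10 definition, the sketch below computes a Bx life by inverting the reliability function. It assumes a two-parameter Weibull law with shape beta and scale eta (an assumption made here for the example; beta = 1 gives the exponential case), and the function names and parameter values are hypothetical.

```python
import math


def weibull_reliability(t: float, beta: float, eta: float) -> float:
    """R(t) = exp(-(t/eta)^beta) for a two-parameter Weibull law."""
    return math.exp(-((t / eta) ** beta))


def bx_life(x: float, beta: float, eta: float) -> float:
    """Time by which a fraction x of the population has failed,
    obtained by solving R(t) = 1 - x for t."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must be strictly between 0 and 1")
    return eta * (-math.log(1.0 - x)) ** (1.0 / beta)


# Hypothetical parameters, for illustration only.
BETA, ETA = 2.0, 1_000.0
b10 = bx_life(0.10, BETA, ETA)
print(f"B10 = {b10:.1f}, R(B10) = {weibull_reliability(b10, BETA, ETA):.3f}")
```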
[Figure: R(t) versus t for two products, 1 and 2, with $B10_1 < B10_2$; at a fixed time the two curves give reliabilities X% and Y%]
AVAILABILITY
$MDT = \frac{\text{Cumulated Down-time}}{\text{Number of failures after the 1st one}}$
$MTBF = MUT + MDT$
$Availability = \frac{MUT}{MUT + MDT}$
Reference: WIKIPEDIA MTBF
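The relations between MUT, MDT, MTBF and availability can be checked numerically. A minimal sketch, assuming availability is taken as the ratio of mean up-time to MTBF; the function names and the numbers are hypothetical.

```python
def mean_down_time(cumulated_down_time: float, n_failures: int) -> float:
    """MDT: cumulated down-time divided by the number of failures counted."""
    if n_failures <= 0:
        raise ValueError("n_failures must be positive")
    return cumulated_down_time / n_failures


def availability(mut: float, mdt: float) -> float:
    """Availability as the share of time spent up: MUT / (MUT + MDT)."""
    return mut / (mut + mdt)


# Hypothetical values, in hours.
mut = 950.0
mdt = mean_down_time(cumulated_down_time=200.0, n_failures=4)
mtbf = mut + mdt
avail = availability(mut, mdt)

print(f"MDT = {mdt}, MTBF = {mtbf}, availability = {avail:.3f}")
```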
EXAMPLE IN CLASS
❑ rebuild time
❑ verification time
Reference: https://users.ece.cmu.edu/~koopman/des_s99/sw_reliability/
situations and a certain class of problems. We have to carefully choose the
right model that suits our specific case. Furthermore, the modeling results cannot
be blindly believed and applied.
Reference: https://users.ece.cmu.edu/~koopman/des_s99/sw_reliability/
data collected is therefore used to calculate failure density, Mean Time Between
Failures (MTBF) or other parameters to measure or predict software reliability.
Reference: https://users.ece.cmu.edu/~koopman/des_s99/sw_reliability/
As more and more software is creeping into embedded systems, we must make sure they don't
embed disasters. If not considered carefully, software reliability can be the reliability
bottleneck of the whole system. Ensuring software reliability is no easy task. As hard as the
problem is, promising progress is still being made toward more reliable software. More
standard components and better processes are being introduced in the software engineering field.
Reference: https://users.ece.cmu.edu/~koopman/des_s99/sw_reliability/
[Figure: tools. FMEA and FTA; system and component levels; qualitative and quantitative analysis]
Reference: WEB
[Figure: choosing a distribution (Exponential, Normal, Log-Normal) from the failure mechanism. Terms shown: component, random failure, degradation, ageing, wear-out, fatigue, software + external events, systematic failures, mechanical force, corrosion, vibration, heat, chemical, variable or constant load, etc.]
For cases not covered by the previous page or when the association
to the proposed model on the previous page is under question, one
should consider some more extensive work before making the choice.
Other means to select the law might be:
❑ Internet search
distribution,
this can be translated into minimum parameters requirement as presented in the graph below:

H = table of (𝛽, 𝜂) couples:
row   𝛽     𝜂
3     1.6   696525.2
4     1.7   475370
5     1.9   349129
6     2.1   270738.5
7     2.3   218819.1
8     2.5   182645
9     2.6   156395.3
10    2.8   136704
11    3     121519.9

[Figure: 𝜂 (from 1.215×10^5 to 1×10^6) versus 𝛽 (from 1.4 to 3); the curve separates the OK region from the reject region]
❑ maintenance
All the previous lectures gave you means to achieve the most reliable
design before testing. Reliability modeling and statistics are here now to
confirm the compliance of the design to requirements.
EXPONENTIAL DISTRIBUTION
$R(t) = \exp(-\lambda t)$
The most popular law in industry, due to:
❑ ease of use (one parameter)
❑ maintenance (linearizes the failure rate)
[Figures: unreliability F(t) and PDF f(t) of the exponential distribution versus t]
Reference: RELIAWIKI.COM
$P(T \ge t + s \mid T \ge s) = e^{-\lambda t} = P(T \ge t)$
This result indicates that the conditional reliability function for the lifetime
of a component that has survived to time s is identical to that of a new
component. This term is the so-called "used-as-good-as-new" assumption.
The lifetime of a fuse in an electrical distribution system may be assumed
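The conditional reliability result follows in one line from the definition of conditional probability and $R(t) = e^{-\lambda t}$. The derivation below is a sketch that uses only the symbols already defined on this page:

```latex
\begin{align*}
P(T \ge t + s \mid T \ge s)
  &= \frac{P(T \ge t + s)}{P(T \ge s)}
   = \frac{R(t + s)}{R(s)} \\
  &= \frac{e^{-\lambda (t + s)}}{e^{-\lambda s}}
   = e^{-\lambda t}
   = R(t)
   = P(T \ge t)
\end{align*}
```

The elapsed age $s$ cancels out, which is why the surviving component behaves like a new one.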
Implications:
MTTF = MTBF (replacement is as good as new)
A time interval Δ𝑡 has the same impact (in percentage) over the
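$R(t) - R(t + \Delta t) = e^{-\lambda t} - e^{-\lambda (t + \Delta t)}$
$= e^{-\lambda t} - e^{-\lambda t} e^{-\lambda \Delta t} = e^{-\lambda t} \left(1 - e^{-\lambda \Delta t}\right)$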
If, for example, during Δ𝑡 = 100 operating hours, a new product
loses 50% of its reliability, R(t = 0h + Δ𝑡 = 100h) = 0.5,
then after 200 hours it will lose 50% of the remaining value:
R(200h) = R(100h) * (% decrease due to functioning Δ𝑡 = 100)
= 0.5 * 0.5 = 0.25.
For the same assumptions, if after 41.49h a product reaches
75% reliability, R(41.49h) = 0.75, then after 141.49 hours (100 more
operational hours), R(41.49 + 100) will reduce to half of
R(41.49), this means:
R(141.49) = R(41.49) * 0.5 = 0.75 * 0.5 = 0.375
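These numbers can be reproduced with a short script. It is a sketch under the same assumption as the example (exponential law, reliability halved every 100 operating hours); the function names are hypothetical. The time at which reliability reaches 75% comes out at about 41.5 h, which matches the 41.49 h quoted above to rounding.

```python
import math


def rate_from_halving_time(halving_time: float) -> float:
    """Failure rate for which R drops by 50% over halving_time hours."""
    return math.log(2.0) / halving_time


def reliability(t: float, lam: float) -> float:
    """Exponential law: R(t) = exp(-lambda * t)."""
    return math.exp(-lam * t)


lam = rate_from_halving_time(100.0)

r_100 = reliability(100.0, lam)
r_200 = reliability(200.0, lam)

# Time at which reliability reaches 75%, then 100 hours later.
t_75 = -math.log(0.75) / lam
r_after_100_more = reliability(t_75 + 100.0, lam)

print(f"R(100) = {r_100:.3f}, R(200) = {r_200:.3f}")
print(f"t(75%) = {t_75:.2f} h, R(t(75%) + 100) = {r_after_100_more:.3f}")
```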
Knowing that:
$\lambda = \frac{1}{MTTF}$
Dr. Sorin Voiculescu Reliability from Concept to Culture MTBFMONTREAL.CA
RISK-BASED MAINTENANCE (RBM)
$MTTF = \eta \, \Gamma\left(1 + \frac{1}{\beta}\right)$
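The Weibull MTTF expression can be evaluated directly with the gamma function from the Python standard library. A minimal sketch; the function name and the parameter values are hypothetical.

```python
import math


def weibull_mttf(beta: float, eta: float) -> float:
    """MTTF = eta * Gamma(1 + 1/beta) for a two-parameter Weibull law."""
    if beta <= 0 or eta <= 0:
        raise ValueError("beta and eta must be positive")
    return eta * math.gamma(1.0 + 1.0 / beta)


# beta = 1 is the exponential case: MTTF equals the scale parameter.
mttf_exponential = weibull_mttf(beta=1.0, eta=500.0)

# beta = 2 (wear-out): Gamma(1.5) = sqrt(pi) / 2.
mttf_wear_out = weibull_mttf(beta=2.0, eta=1_000.0)

print(mttf_exponential, mttf_wear_out)
```

Two different (𝛽, 𝜂) couples can therefore lead to the same MTTF, which is one reason to look at both parameters and not only at the mean.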
IMPORTANT REMARK: depending on the requirement definition, the shape of the graph can change and the green-red areas can reverse.
Always pay attention to the meaning of each 𝛽, 𝜂 couple.

H = table of (𝛽, 𝜂) couples:
row   𝛽     𝜂
1     1.2   2128771.4
2     1.4   1130483.9
3     1.6   696525.2
4     1.7   475370
5     1.9   349129
6     2.1   270738.5
7     2.3   218819.1
8     2.5   182645
9     2.6   156395.3
10    2.8   136704
11    3     121519.9

[Figure: 𝜂 (from 1.215×10^5 to 1×10^6) versus 𝛽 (from 1.4 to 3); the curve separates the OK region from the NOK region]
Reference:
Reference: RELIAWIKI.COM
[Figure: OK region plotted over a horizontal axis m from 1.1 to 1.6 and a vertical axis q from 0.04 to 0.08]
Reference: web
❑ one scale parameter: 𝜂 for Weibull and 𝜇 for Normal and Lognormal. Design
improvements (except technological changes, materials changes, manufacturing
process) should directly impact the scale parameter. This will allow us later to
model the accelerated testing.
* Disclaimer: Literature and references exist that support cases where the shape
parameter changes for the same failure mode under different operating
conditions, but the approach is less practical to use. CSS stands for Changing
Shape and Scale approach. Historical data shows that, except for some specific
technologies, there is very low added value in using the CSS
$p = 1 - R(t \mid T) = 1 - \frac{R(t + T)}{R(T)}$
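The conditional probability of failure can be illustrated numerically. The sketch below assumes a two-parameter Weibull law (shape beta, scale eta), reads T as the age already survived and t as the additional operating time, and uses hypothetical function names and values.

```python
import math


def weibull_reliability(t: float, beta: float, eta: float) -> float:
    """R(t) = exp(-(t/eta)^beta)."""
    return math.exp(-((t / eta) ** beta))


def conditional_failure_probability(
    t: float, age: float, beta: float, eta: float
) -> float:
    """p = 1 - R(t + T) / R(T): probability of failing during the next t,
    given survival up to age T."""
    return 1.0 - weibull_reliability(t + age, beta, eta) / weibull_reliability(
        age, beta, eta
    )


# Hypothetical wear-out case (beta > 1): the same 100 hours are riskier
# for an aged unit than for a new one.
p_new = conditional_failure_probability(100.0, age=0.0, beta=2.0, eta=1_000.0)
p_aged = conditional_failure_probability(100.0, age=800.0, beta=2.0, eta=1_000.0)

# Exponential case (beta = 1): the age has no influence.
p_exp_new = conditional_failure_probability(100.0, 0.0, 1.0, 1_000.0)
p_exp_aged = conditional_failure_probability(100.0, 800.0, 1.0, 1_000.0)

print(p_new, p_aged, p_exp_new, p_exp_aged)
```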
When operating with hard time T, at time T the non-failed units are
set back to a state “as good as new” (by overhaul/maintenance or by
being replaced with new units). The intrinsic performance of the unit
does not change.
The observed MTTF is what the user notices in the field, based on the
cumulated operating time and observed number of failures.
For a failure mode with an increasing failure rate (wear-out), this observed MTTF is larger than the intrinsic one because, by
the action taken at hard time, the user renews the fleet (or performs an action equivalent
to renewal).
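One way to quantify the gap between observed and intrinsic MTTF is the classical age-replacement result: expected operating time per renewal cycle divided by the probability of failing within the cycle. This formula is brought in here as an assumption (it does not appear in the slides), the Weibull law and all parameter values are hypothetical, and the integral is evaluated with a simple Simpson rule.

```python
import math


def weibull_reliability(t: float, beta: float, eta: float) -> float:
    return math.exp(-((t / eta) ** beta))


def simpson(f, a: float, b: float, n: int = 2_000) -> float:
    """Composite Simpson rule; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def observed_mttf(hard_time: float, beta: float, eta: float) -> float:
    """Operating time accumulated per cycle / probability of failure per cycle."""
    up_time = simpson(lambda t: weibull_reliability(t, beta, eta), 0.0, hard_time)
    p_fail = 1.0 - weibull_reliability(hard_time, beta, eta)
    return up_time / p_fail


def intrinsic_mttf(beta: float, eta: float) -> float:
    return eta * math.gamma(1.0 + 1.0 / beta)


# Wear-out example: hard time at 500 h, beta = 2, eta = 1,000 h.
observed = observed_mttf(500.0, beta=2.0, eta=1_000.0)
intrinsic = intrinsic_mttf(beta=2.0, eta=1_000.0)
print(f"observed = {observed:.0f} h, intrinsic = {intrinsic:.0f} h")
```

With beta = 1 (constant failure rate) the same function returns the intrinsic MTTF, so a hard time brings no gain in that case.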
FROM PROGRAM TARGETS TO
RELIABILITY TARGETS
[Figures: sequence of charts with regions marked OK and NOK]
❑LogNormal
❑Function to integrate 1-CDF[LogNormalDistribution[µ, s], t]
❑Variable : t
❑Lower limit: 0
❑Upper limit: ∞
❑Weibull
❑Function to integrate 1-CDF[WeibullDistribution[b, h], t]
❑Variable : t
❑Lower limit: 0
❑Upper limit: ∞
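The same integrals can be approximated in plain Python as a cross-check. This is a sketch under stated assumptions: the infinite upper limit is replaced by a finite one chosen large enough for the tail to be negligible, the integration uses a simple Simpson rule, the parameter values are hypothetical, and b, h, µ, s are used as in the list above (Weibull shape and scale, lognormal log-mean and log-standard-deviation).

```python
import math


def weibull_survival(t: float, b: float, h: float) -> float:
    """1 - CDF of a Weibull law with shape b and scale h."""
    return math.exp(-((t / h) ** b))


def lognormal_survival(t: float, mu: float, s: float) -> float:
    """1 - CDF of a lognormal law with log-mean mu and log-std s."""
    if t <= 0.0:
        return 1.0
    return 0.5 * math.erfc((math.log(t) - mu) / (s * math.sqrt(2.0)))


def simpson(f, a: float, b: float, n: int = 200_000) -> float:
    """Composite Simpson rule; n must be even."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3.0


# Hypothetical parameters; upper limits truncated where the tail is negligible.
mttf_weibull = simpson(lambda t: weibull_survival(t, 2.0, 1_000.0), 0.0, 10_000.0)
mttf_lognormal = simpson(lambda t: lognormal_survival(t, 1.0, 0.5), 0.0, 200.0)

# Closed forms for comparison.
weibull_closed = 1_000.0 * math.gamma(1.0 + 1.0 / 2.0)
lognormal_closed = math.exp(1.0 + 0.5**2 / 2.0)

print(mttf_weibull, weibull_closed)
print(mttf_lognormal, lognormal_closed)
```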