Reliability Prediction
A Quest for Reliable Parameters
By
Yair Shai
Goals
Compare the MTBCF and MTTCF parameters in view of complex systems engineering.
Failure repair policy as the backbone for realistic MTBCF calculation.
Motivation for modification of the technical specification requirements.
Promo : Description of Parameters
Timeline of failure events of an item: t1, t2, t3, t4, t5, ... (time)
r = number of failures
Repairable items: Mean Time Between Failures, $MTBF = \frac{1}{r}\sum_{i=1}^{r} t_i$
Non-repairable items: Mean Time To Failure, $MTTF = \frac{1}{r}\sum_{i=1}^{r} t_i$
MTBF = MTTF ??
An assumption:
Failed item returns to As Good As New status after repair or renewal.
note: Time To Repair is not considered.
[Figure: timeline alternating between UP TIME and DOWN periods]
Critical Failures
Moving towards System Design
Critical failure: a system failure resulting in (temporary or permanent) mission termination.
[Diagram: a subsystem of two computers in parallel, each marked X]
A simple configuration of parallel hot redundancy.
A failure: any computer failure.
A critical failure: two computers failed.
Critical Failures
A clue for Design Architecture
MTBCF: Mean Time Between Critical Failures
MTTCF: Mean Time To Critical Failure
SAME?
Remember the assumptions
Determining the failure repair policy: COLD REPAIR
No time for repair actions during the mission
Functional System Design
[Block diagram: switch control, CPUs, power supplies, a 4-channel receiver controller with receiver units A, B, C and D (2/4), a switch (sw) and antennas]
Operational Demand: At least two receiver units and one antenna should work to operate the system.
From System Design to Reliability Model
[Reliability block diagram with blocks CPU, PS1, PS2, CONT, A, B, C, D (2/4), sw and ANT; some blocks are marked x]
Is this a critical failure?
Serial model: $R_s = R_1 \times R_2$
Parallel model: $R_s = 1 - (1 - R_1)(1 - R_2)$
K-out-of-N model: $R_s$ = binomial solution
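The three block models can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the K-out-of-N function assumes N identical, independent components with the same reliability, which the slide does not state.

```python
from math import comb, prod


def series(reliabilities):
    """Serial model: every block must work."""
    return prod(reliabilities)


def parallel(reliabilities):
    """Parallel (hot redundancy) model: at least one block must work."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


def k_out_of_n(k, n, r):
    """K-out-of-N model for n identical blocks of reliability r (assumption).

    Binomial solution: probability that at least k of the n blocks work.
    """
    return sum(comb(n, j) * r**j * (1.0 - r) ** (n - j) for j in range(k, n + 1))


if __name__ == "__main__":
    print(series([0.9, 0.9]))      # two blocks in series
    print(parallel([0.9, 0.9]))    # two blocks in parallel
    print(k_out_of_n(2, 4, 0.9))   # a 2/4 block, as in the receiver units
```

With R = 0.9 for every block, the serial pair gives 0.81, the parallel pair 0.99 and the 2/4 block 0.9963.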
From RBD Logic Diagram to Reliability Function
Simple mathematical manipulation:
Rsys(t)= f( serial / parallel / K out of N)
Classic parameter evaluation:
$MTBF\ (MTBCF) = \int_0^\infty R_{sys}(t)\,dt = MTTF\ (MTTCF)$
WARNING !!!
Is this realistic ?
After each repair of a critical failure, the whole system returns to status As Good As New.
[S. Zacks, Springer-Verlag 1991, Introduction To Reliability Analysis, Par 3.5]
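As a rough illustration of the classic evaluation, the integral of the system reliability function can be approximated numerically. The sketch below uses a plain trapezoid rule. The function names, the failure rate of 0.001 per hour and the integration horizon are assumptions chosen for the example, not values from the presentation.

```python
import math


def mean_time_to_failure(r_sys, t_max, steps=100_000):
    """Approximate the integral of r_sys(t) from 0 to t_max (trapezoid rule).

    t_max must be large enough that r_sys(t_max) is practically zero.
    """
    dt = t_max / steps
    total = 0.5 * (r_sys(0.0) + r_sys(t_max))
    for i in range(1, steps):
        total += r_sys(i * dt)
    return total * dt


def hot_parallel_pair(t, lam=1e-3):
    """Reliability of two identical exponential components in hot parallel."""
    r = math.exp(-lam * t)
    return 1.0 - (1.0 - r) ** 2


if __name__ == "__main__":
    # Horizon of 20 mean component lives, where the tail is negligible.
    print(mean_time_to_failure(hot_parallel_pair, t_max=20_000.0))
```

For this pair the closed form is 1/λ + 1/(2λ) = 1500 hours, so the numerical result should land very close to that. The number is a time to the first critical failure of an all-new system, which is exactly the assumption the warning above questions.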
MTBCF vs. MTTCF
A New Interpretation
Common practice interpretation:
MTBCF = MTTCF = MTTCFF (Mean Time To Critical First Failure)
Each repair resets the time count to idle status, or: each failure is the first failure.
Realistic interpretation:
MTBCF ≠ MTTCF
Only the failed items which caused the failure are repaired to idle status. All other components keep on aging.
Presentation I
Simple 3 aging components serial system model
[Figure: timelines of components A, B and C, drawn as if the future were known ("HAD WE KNOWN THE FUTURE"), with the resulting system TTCF intervals marked]
Presentation II
Simple 3 aging components serial system model
[Figure: timelines of components A, B and C, drawn as if the future were known ("HAD WE KNOWN THE FUTURE"), with the resulting system TBCF intervals marked]
Presentation III
Simple 3 aging components serial system model
[Figure: the TBCF timelines and the TTCF timelines of components A, B and C shown together ("HAD WE KNOWN THE FUTURE")]
MTBCF < MTTCF
Simulation Method
MONTE CARLO
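MATHCAD, N = 100,000 sets

First column (three lifetimes X per set):
min(X_{1,1}, X_{2,1}, X_{3,1})
min(X_{1,2}, X_{2,2}, X_{3,2})
...
min(X_{1,N}, X_{2,N}, X_{3,N})
Estimate: $\frac{1}{N}\sum_{i=1}^{N} \min_i$

Second column (two symbols per row are missing in the source, shown as ?):
min(X_{1,1}, X_{2,1}, X_{3,1})
min(X_{1,2}, ?_{1,2}, ?_{2,2})
...
min(X_{1,N}, ?_{1,N}, ?_{2,N})
Estimate: $\frac{1}{N}\sum_{i=1}^{N} \min_i$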
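The Monte Carlo comparison can be sketched in Python instead of Mathcad. This is a minimal sketch under stated assumptions, not the author's worksheet: three identical Weibull components in series, a scale of 1000 hours and a shape of 2 (both hypothetical values), and a repair policy in which only the component that caused the critical failure is renewed while the others keep their remaining life. The function names are made up for the example.

```python
import random


def serial_mttcf(n_sets, scale=1000.0, shape=2.0, n_components=3, seed=0):
    """Every set starts all-new: average of min(X1, X2, X3) over n_sets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sets):
        total += min(rng.weibullvariate(scale, shape) for _ in range(n_components))
    return total / n_sets


def serial_mtbcf(n_sets, scale=1000.0, shape=2.0, n_components=3, seed=0):
    """Only the failed component is renewed; the others keep on aging."""
    rng = random.Random(seed)
    # Remaining life of each component, starting from an all-new system.
    remaining = [rng.weibullvariate(scale, shape) for _ in range(n_components)]
    total = 0.0
    for _ in range(n_sets):
        tbcf = min(remaining)              # time to the next critical failure
        failed = remaining.index(tbcf)     # component that caused it
        total += tbcf
        remaining = [r - tbcf for r in remaining]                # survivors age
        remaining[failed] = rng.weibullvariate(scale, shape)     # renew the failed one
    return total / n_sets


if __name__ == "__main__":
    n = 100_000
    print("MTTCF estimate:", serial_mttcf(n))
    print("MTBCF estimate:", serial_mtbcf(n))
```

Under these assumptions the MTBCF estimate comes out clearly below the MTTCF estimate, because the surviving components are no longer new when the clock restarts.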
How BIG is the Difference ?
1. Depends on the system architecture.
2. Depends on the Time-To-Failure distribution of each component.
3. The difference in a specific complex electronic system was found to be ~40%.
Note: True in redundant systems even when all components have constant failure rates.
Why Does It Matter ?
Suppose a specification demands a system reliability of: MTBCF = 600 hours
Suppose the manufacturer's prediction of the parameter is: MTBCF = 780 hours
ATTENTION !!! How was it CALCULATED ????
Is this MTBCF or MTTCF ????
If the predicted value is in fact MTTCF, the gap of about -40% applies:
Real MTBCF = 480 < 600 (spec)
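The arithmetic behind this slide can be checked with a few lines of Python. The helper names are hypothetical; the three numbers (600, 780 and 480 hours) are the ones quoted above.

```python
def relative_gap(predicted_hours, real_hours):
    """Fraction by which the real value falls short of the prediction."""
    return (predicted_hours - real_hours) / predicted_hours


def meets_spec(value_hours, spec_hours):
    """True when the value satisfies the specified minimum."""
    return value_hours >= spec_hours


if __name__ == "__main__":
    spec, predicted, real = 600.0, 780.0, 480.0
    print(f"implied gap: {relative_gap(predicted, real):.1%}")    # 38.5%
    print("prediction meets spec:", meets_spec(predicted, spec))  # True
    print("real MTBCF meets spec:", meets_spec(real, spec))       # False
```

Going from 780 to 480 hours is a drop of 38.5%, consistent with the quoted gap of about 40%, and it moves the system from apparently passing the specification to failing it.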
Example 1
Aging serial system; the time to failure of each component is Weibull distributed.
Example 2
Two redundant subsystems in series; the time to failure of each component is exponentially distributed.
[Figures: constant failure rate results for the serial and parallel configurations]
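A small simulation can illustrate why the gap appears in redundant systems even with constant failure rates. The sketch below assumes two blocks in series, each block a hot-redundant pair of identical exponential components with a failure rate of 0.001 per hour. It also assumes no repair during the mission, and that after a critical failure only the block that caused it is renewed, so a component that has already failed in the other block stays failed. The function names and the numbers are hypothetical.

```python
import random


def redundant_mttcf(n_runs, lam, rng):
    """Mean time to the first critical failure, every run starting all-new."""
    total = 0.0
    for _ in range(n_runs):
        # A block fails when the later of its two components fails;
        # the system fails when the first block fails.
        block_lives = [
            max(rng.expovariate(lam), rng.expovariate(lam)) for _ in range(2)
        ]
        total += min(block_lives)
    return total / n_runs


def redundant_mtbcf(n_failures, lam, rng):
    """Mean time between critical failures when only the failed block is renewed."""
    working = [2, 2]   # working components in each block
    clock = 0.0
    failures = 0
    while failures < n_failures:
        alive = working[0] + working[1]
        # Memoryless components: the next failure arrives at rate lam * alive.
        clock += rng.expovariate(lam * alive)
        # The failing component is equally likely to be any working one.
        block = 0 if rng.random() < working[0] / alive else 1
        working[block] -= 1
        if working[block] == 0:    # both components of this block are down
            failures += 1
            working[block] = 2     # renew only the block that caused the failure
    return clock / n_failures


if __name__ == "__main__":
    rng = random.Random(0)
    print("MTTCF estimate:", redundant_mttcf(100_000, 1e-3, rng))
    print("MTBCF estimate:", redundant_mtbcf(100_000, 1e-3, rng))
```

For this configuration the closed forms are MTTCF = 11/(12λ), about 917 hours, and a long-run MTBCF of 3/(4λ) = 750 hours, so the estimates should differ by roughly 18%. The gap comes from redundancy that is already partly consumed when the clock restarts, not from aging.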
A Comment about Asymptotic Availability
$A = \frac{E\{TTF\}}{E\{TTF\} + E\{TTR\}}$ (*) versus $\frac{E\{TBF\}}{E\{TBF\} + E\{TTR\}}$
(*) [S. Zacks, Springer-Verlag 1991, Introduction To Reliability Analysis, Par 4.3]
Repair policies
1. Hot repair is allowed for redundant components.
2. All components are renewed on every failure event.
3. All failed components are renewed on every failure event.
4. Failed components are renewed only in blocks which caused the system failure.
5. Failed subsystems are only partially renewed.
Conclusions
System configuration and the distribution of components determine the gap.
Repair policy should be specified in advance to determine the calculation method.
Flexible software solutions are needed to simulate real MTBCF for a given RBD.
Predict MTBCF, not MTTCF.