
Failure Analysis of Engineering Systems

Instructor: Professor Steve Maher

Module 3:
Scripture of the Module
5 Fault-Tree Analysis
6 Fault-Tree Analysis Special Topics
7 Fault-Tree Analysis Quantification

Scripture of the Module


A prudent man sees danger and takes refuge, but the simple keep going and suffer for it.
- Proverbs 22:3

Failure Analysis of Engineering Systems ENGR 5323

Assignment
Read Chapters 5, 6, and 7 of Systems Failure Analysis
Do quizzes on Bb as they appear
Quiz next week at beginning of class


Berk's Overall FA Process


Designate a team
Gather all related information
Review and define problem
Identify all potential failure causes
List causes in FMA&A
Converge on root cause
Determine Corrective Actions
Implement Corrective Actions
Assess Corrective Actions
Evaluate for Preventive Actions
Incorporate FA Findings


Fault-Tree Analysis

What is Fault-Tree Analysis?


It is a somewhat formal graphical process that identifies all the possible events, and combinations of events, that can cause a single undesired event.
Developed to address the Air Force's concern about inadvertently launching a nuclear ICBM.
Developed by Bell Labs, working with the Air Force and Boeing.
Starts with the undesired event at the top and develops all causes from there.
Can be done either before or after a failure occurs
  In FA, typically after
  In system development/design, typically before

Fault-Tree Analysis (FTA) Concepts


Start with a single undesired event; be as clear as possible in the problem definition
FTA can be computer generated
  In teams, developing it by hand is typically better
  Develop by hand, then record on computer
FTA does not evaluate ALL potential failure modes of all parts of the system
FTA does consider ALL potential causes of a single system failure

Fault-Tree Analysis (FTA) Approach


Start with a single undesired event
  System does not do what it is supposed to do
  System does right thing at wrong time
  System does right thing but out of spec
  System does something it is not supposed to do

Use charting symbols to graph out possible causes of the failing event


FTA Charting Symbols


[Figure: FTA charting symbols, labeled as follows]
Command Event
Basic Event
Normal Event
Undeveloped Event or Human Error
OR gate
AND gate
INHIBIT Gate
Transfer Symbol


FTA Symbol Definitions


Command event: Condition that is induced by the events immediately below it.
Basic failure event: Bottom-level basic cause, e.g. part failure, software code error, bad connection.
Normal event: Condition that is normally present.
Human error or undeveloped event: 1) Human mistake in operating the system; 2) Condition that needs no further analysis in the context of the system.

FTA Symbol Definitions (Cont)


OR gate: If any of the events beneath it happens, the event above occurs.
AND gate: The event above occurs only if all the events beneath it happen simultaneously.
INHIBIT gate: Constrains an event. The constraint is shown in an ellipse (oval); if the constraint is present and the event below happens, then the event above the gate occurs.
Transfer symbol: Diagram continues on another page.
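
The gate definitions can also be read as Boolean logic. The following is a minimal sketch in Python, assuming each event is simply true (it occurred) or false. The function names and the lamp events are hypothetical illustrations, not taken from the textbook.

```python
def or_gate(*inputs: bool) -> bool:
    """Event above occurs if any event beneath the gate occurs."""
    return any(inputs)


def and_gate(*inputs: bool) -> bool:
    """Event above occurs only if all events beneath the gate occur together."""
    return all(inputs)


def inhibit_gate(event: bool, constraint: bool) -> bool:
    """Event above occurs only if the event below happens while the constraint is present."""
    return event and constraint


# Hypothetical example: a lamp that does not light.
bulb_burned_out = False
switch_fails_open = True
main_feed_lost = True
backup_feed_lost = False

# Power is lost only if BOTH feeds are lost (AND gate).
no_power = and_gate(main_feed_lost, backup_feed_lost)

# The lamp is dark if ANY of the three causes is present (OR gate).
lamp_dark = or_gate(bulb_burned_out, switch_fails_open, no_power)
print(lamp_dark)
```

Here the AND branch is false because only one feed is lost, but the OR gate still fires because the switch failure alone is enough.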

Fault-Tree Analysis Example: Light Bulb


(in the textbook)
Let's go through it


Fault-Tree Analysis Special Topics


FTA Special Topics


Transfer symbols
INHIBIT gates
Sequential AND gates
State-of-the-part vs. state-of-the-system
Identifying redundancy-defeating failures


Transfer Symbols
An FTA can include hundreds of events and conditions, or more
For a system of any complexity, one page will not hold the whole FTA
Transfer symbols are used
  To show connections between pages (or sheets)
  In conjunction with Command events only
Usage:
  Transfer out (to sheet #) and transfer in (from sheet #)
  Each has a transfer designation number inside the triangle
Good to maintain a fault-tree transfer log
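
One possible way to keep a transfer log is as a small table that can be checked for dangling transfers. The sketch below is only an illustration: the `Transfer` record, its field names, and the sample entries are assumptions, not a format prescribed by the textbook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    number: int       # designation number shown inside the triangle
    direction: str    # "out" or "in"
    sheet: int        # sheet the symbol is drawn on
    other_sheet: int  # sheet it points to (out) or comes from (in)


def unmatched_transfers(log: list[Transfer]) -> list[Transfer]:
    """Return log entries that have no matching counterpart on the referenced sheet."""
    entries = {(t.number, t.direction, t.sheet, t.other_sheet) for t in log}
    missing = []
    for t in log:
        opposite = "in" if t.direction == "out" else "out"
        # The counterpart has the same number, the opposite direction,
        # and the two sheet numbers swapped.
        if (t.number, opposite, t.other_sheet, t.sheet) not in entries:
            missing.append(t)
    return missing


log = [
    Transfer(1, "out", sheet=1, other_sheet=2),
    Transfer(1, "in", sheet=2, other_sheet=1),
    Transfer(2, "out", sheet=1, other_sheet=3),  # no matching transfer in on sheet 3
]

for t in unmatched_transfers(log):
    print(f"Transfer {t.number} ({t.direction}) on sheet {t.sheet} "
          f"has no match on sheet {t.other_sheet}")
```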



INHIBIT Gates
Special form of the AND gate
Both events must happen for the Command event above to occur (see Figs. 6.2 and 6.3)
Many times this is used to show the probability of the INHIBIT event happening
  Probability can be assigned
  Can stack events to stack probabilities
[Figure: INHIBIT gate with gating event, Probability = p]


Sequential AND gates


AND gate: all events must happen for the Command event above to occur
Sometimes the ORDER of events is critical to the occurrence of the Command event
  In one sequence, the failing event occurs
  In another sequence, the failing event does not occur
Use an ellipse connected to the AND gate to show the necessary order.
  Ex: R, then L
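
The order dependence can be sketched as a check on the sequence in which events occurred. This is a minimal illustration in Python; the function name `sequential_and` and the idea of passing events as an ordered list of labels are assumptions made for the example.

```python
def sequential_and(occurred: list[str], required_order: list[str]) -> bool:
    """True only if every required event occurred, in the required relative order.

    `occurred` lists event labels in the order they happened.
    """
    try:
        # Position of the first occurrence of each required event.
        positions = [occurred.index(event) for event in required_order]
    except ValueError:
        # At least one required event never happened.
        return False
    return positions == sorted(positions)


print(sequential_and(["R", "L"], ["R", "L"]))  # R, then L: the gate fires
print(sequential_and(["L", "R"], ["R", "L"]))  # wrong order: the gate does not fire
print(sequential_and(["R"], ["R", "L"]))       # L never happened: the gate does not fire
```

A plain AND gate would fire in both of the first two cases; the sequence condition is what separates them.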


State-of-the-Part vs. State-of-the-System


At every Command event, ask: state-of-the-part or state-of-the-system?
State-of-the-part = A single component or part failure can induce the condition
  Typically not the only thing that can cause it
  Use OR gate
  Look for secondary events, environmental conditions, and human error
State-of-the-system = A single component or part cannot cause the condition
  Two or more conditions must be present
  Use AND gate

Suggested Process to do FTA


Start at the point where the failure manifests itself
Work through the system point-to-point
At each point, identify conditions that can induce the failure
Determine if state-of-the-part or state-of-the-system
For state-of-the-part, use an OR gate and look for secondary events, environmental conditions, and human error
For state-of-the-system, use an AND gate


Identifying Redundancy-Defeating Failures


Sophisticated system design often contains redundancy, especially in mission-critical systems
Ideally, FTA shows that multiple failures must occur to defeat the redundancy
Once complete, the FA team should ask:
  What is the probability of the combined independent events occurring?
  Are the conditions independent, or are there any single-point failures or conditions that can defeat the redundancy?
Look for duplications in each redundant branch
Re-review undeveloped events, environmental conditions, human errors, and event relationships
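
Looking for duplications across redundant branches amounts to intersecting the sets of bottom-level events under each branch. The sketch below illustrates that idea in Python; the function name `shared_events`, the two-pump system, and all event labels are hypothetical.

```python
def shared_events(branches: dict[str, set[str]]) -> set[str]:
    """Return the bottom-level events that appear in every redundant branch."""
    return set.intersection(*branches.values())


# Hypothetical redundant pair joined by an AND gate: both pumps must fail.
branches = {
    "pump A fails": {
        "pump A motor fails",
        "loss of bus 1 power",
        "operator closes wrong valve",
    },
    "pump B fails": {
        "pump B motor fails",
        "loss of bus 1 power",
        "operator closes wrong valve",
    },
}

common = shared_events(branches)
print(sorted(common))
```

Any event in `common` can take out both branches by itself, so it is a candidate single-point failure even though the tree shows an AND gate.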

Fault-Tree Analysis Quantification


Can Quantify Fault-Tree Analysis


A beauty of FTA is that you can calculate probabilities of failures and rank failure causes
On many systems, it is imperative to know the probability of catastrophic failure before it happens
  So it WON'T happen!
  Military, anything involving nuclear energy, anything involving significant safety issues
FTA is a very good tool for this requirement


Important Terms and Concepts


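Probability
MTBF (mean time between failures)
Failure rate: $\lambda = 1/\mathrm{MTBF}$
Probability of success over operating time $t$: $P_s = e^{-\lambda t}$
Probability of failure: $P_f = 1 - P_s$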
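
As a worked example of these terms, the sketch below converts an MTBF into a failure rate and then into success and failure probabilities for a given operating time. It assumes a constant failure rate (the exponential model); the MTBF of 10,000 hours and the 100-hour operating time are made-up illustration values.

```python
import math


def failure_rate(mtbf_hours: float) -> float:
    """Failure rate (lambda) as the reciprocal of MTBF."""
    return 1.0 / mtbf_hours


def prob_success(lam: float, t: float) -> float:
    """Ps = exp(-lambda * t): probability of no failure during operating time t."""
    return math.exp(-lam * t)


def prob_failure(lam: float, t: float) -> float:
    """Pf = 1 - Ps."""
    return 1.0 - prob_success(lam, t)


mtbf = 10_000.0  # hours (hypothetical part)
t = 100.0        # operating hours (hypothetical mission)
lam = failure_rate(mtbf)

print(f"lambda = {lam:.6f} per hour")
print(f"Ps = {prob_success(lam, t):.5f}")
print(f"Pf = {prob_failure(lam, t):.5f}")
```

When the product of failure rate and operating time is small, as it is here (0.01), the failure probability is close to that product.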


FTA Probability Determination Common Practices


Human error probabilities (typically $10^{-5}$ to 0.3; in practice, maybe 0.01 to 0.1)
Normal event probabilities (depends on system)
Basic failure events: failure rates ($\lambda$) and operating times
Undeveloped events: probabilities, or failure rates ($\lambda$) and operating times
Inhibiting conditions: probabilities (expected frequency)
Command events: calculated based on the probabilities of all the events that can cause them


Calculating Command Event Probabilities


For OR gates:
  2 input: $P = P_1 + P_2 - P_1 P_2$
  3 input: $P = P_1 + P_2 + P_3 - P_1 P_2 - P_1 P_3 - P_2 P_3 + P_1 P_2 P_3$

For AND gates:
  2 input: $P = P_1 P_2$
  3 input: $P = P_1 P_2 P_3$

For INHIBIT gates:
  $P = P_1 P_2$
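
These gate formulas can be sketched as small Python functions. The sketch assumes the inputs to each gate are independent; the OR gate is written in the equivalent form "one minus the probability that no input occurs", which extends to any number of inputs and matches the 2-input and 3-input expansions. The function names and sample probabilities are illustrative assumptions.

```python
from math import isclose, prod


def p_and(*p: float) -> float:
    """AND gate: all independent inputs occur."""
    return prod(p)


def p_or(*p: float) -> float:
    """OR gate: at least one independent input occurs."""
    return 1.0 - prod(1.0 - x for x in p)


def p_inhibit(p_event: float, p_condition: float) -> float:
    """INHIBIT gate: the event occurs while the inhibiting condition is present."""
    return p_event * p_condition


p1, p2, p3 = 0.1, 0.2, 0.05  # made-up input probabilities

# Expanded 2-input and 3-input OR formulas, for comparison.
two_input = p1 + p2 - p1 * p2
three_input = p1 + p2 + p3 - p1 * p2 - p1 * p3 - p2 * p3 + p1 * p2 * p3

print(isclose(p_or(p1, p2), two_input))
print(isclose(p_or(p1, p2, p3), three_input))
print(p_and(p1, p2, p3), p_inhibit(p1, p2))
```

For small probabilities the OR result is close to the plain sum of the inputs, which is why a simple sum is often used as a quick estimate.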


Calculating Probabilities for Top Undesired Events


Start at the bottom of the FTA, quantifying the probability of all possible causes
Move up a level, calculating the probability for the first level of Command events
Move up through the FTA, calculating the probability for each Command event until the top event is quantified
Probability of each branch (or path) is also quantified
Highest probability branch(es) can be identified
Do not spend too much effort trying to be too precise
  Some estimates will be high, some low
  The overall estimate tends to balance out and be good (Big number theory)
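
The bottom-up procedure can be sketched as a short recursive calculation. This is a minimal illustration, assuming independent events and a tree encoded as nested tuples; the encoding, the function name `quantify`, and every branch name and probability are made up for the example.

```python
from math import prod

# A node is either a number (probability of a bottom-level event)
# or a tuple (gate, [children]) where gate is "AND" or "OR".
# An INHIBIT gate can be entered as a 2-input "AND".


def quantify(node) -> float:
    """Work up from the bottom-level events to the probability of this node."""
    if isinstance(node, (int, float)):
        return float(node)
    gate, children = node
    probs = [quantify(child) for child in children]
    if gate == "AND":
        return prod(probs)
    if gate == "OR":
        return 1.0 - prod(1.0 - p for p in probs)
    raise ValueError(f"unknown gate: {gate}")


# Hypothetical first-level branches under an OR gate at the top event.
branches = {
    "bulb fails": 0.01,
    "switch fails or is left open": ("OR", [0.001, 0.02]),
    "both power feeds lost": ("AND", [0.05, 0.05]),
}

top = quantify(("OR", list(branches.values())))
ranked = sorted(branches, key=lambda name: quantify(branches[name]), reverse=True)

print(f"top event probability = {top:.4f}")
for name in ranked:
    print(f"{quantify(branches[name]):.5f}  {name}")
```

Ranking the branches this way shows where the top-event probability comes from, even when the individual estimates are rough.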

FTA Quantification vs. Systems FA


FTA quantification is typically used for predictive analysis
Very common to have specifications for system failure rate, e.g. system loss, system down time
In design, FTA can point to needs for re-design
For FA (i.e. the failure has already occurred), FTA is typically used in a way similar to the Ishikawa diagram
  Quantification is generally not done
  FTA will point to likely analytical paths
  For recurring failures, the likely cause is probably obscure
Once FA is done, the FA team may want to take further action on other high-probability conditions identified by FTA

Group Activity (if time)