Presented by
Dr. Raymond Wright
FSEglobal Course Administration 2
Breaks
Lunch
Stretch, refreshment, etc.
Personal belongings
18 May 2008
Alarm Management
Alarm Definition
Alarm Prioritisation
Alarm Implementation
FSE global
Functional Safety Engineering
Training - Classroom
Functional Safety Management
IEC 61508 / IEC 61511
Preparation for Certification Examination
Software Tools
exSILentia
F&G Simulation
Instructor
Raymond Wright
Background / experience
Classmates
Name, company, position
Background / experience
Course Objectives
Introduction
Standards – IEC 61508, IEC 61511, ISA 84.01
Philosophy of Safe Design
Introduction to the Safety Lifecycle
Risk Management
Tolerable Risk
Components of Risk
Consequence
Likelihood
Risk Matrix and Risk Graph
Risk Reduction
Process Risk
Incidents – Causes & Consequences
Preventative Controls (reduce frequency)
Mitigative Controls – (reduce consequence)
Bow-Tie Diagrams
Analysis Phase
Determination of Tolerable Risk
Hazard Identification
Risk Analysis (frequency and consequence)
Identifying Safety Instrumented Functions (SIF)
Determining the Safety Integrity Level (SIL) using Layer of Protection
Analysis (LOPA)
Writing the Safety Requirement Specification
Other Design Considerations
Realisation Phase 1
System Technologies – Relay, Solid State, Programmable
Subsystems – Sensor, Logic Solver, Final Element
Architectures – 1oo1, 1oo2, 2oo2, 2oo3, 1oo2D
Sensor Subsystem
Logic Solver Subsystem
Final Element Subsystem
Effects of Field Devices on SIF Performance
Common Cause – Separation, Diversity, Physical Environment
Reliability
Definition of terms
Probability
Failure Modes
Fault Tree Analysis
Reliability Block Diagrams
Markov Analysis
Realisation Phase 2
SIL Verification – PFDavg and Architectural Constraints
Factory Acceptance Testing
Commissioning
Analysis Models
Operation Phase
Maintenance
Decommissioning
Documentation
Management of Change
Course Limitations
No specific manufacturer's equipment
No specific programming of equipment
No specific maintenance of systems
No specific regulatory requirements for different areas, countries or industries
Acknowledgements
Parts of this course borrowed material from the following sources:
Paul Gruhn & Harry Cheddie – Safety Shutdown Systems: Design, Analysis
and Justification (ISA)
ISA Course EC50 – Designing and Applying Emergency Shutdown Systems
(ISA)
Ed Marszal & Eric Scharpf – Safety Integrity Level Selection: Systematic
Methods Including Layer of Protection Analysis (ISA)
Bill Goble & Harry Cheddie – Safety Instrumented Systems Verification:
Practical Probabilistic Calculations (ISA)
Pre-Instructional Survey
Introduction
Flixborough
Three Mile Island
Bhopal
Chernobyl
BP Texas
Buncefield
After Three Mile Island, but before Chernobyl, the head of the Soviet
Academy of Sciences said,
When the Bhopal plant works manager was informed of the accident,
he said in disbelief,
“The gas leaks just can’t be from my plant. The plant is shut
down. Our technology just can’t go wrong. We just can’t have
leaks.”
Specification
44%
HSE
PES NAMUR
IEC 61508
*Safety Lifecycle
Technical
Installation &
Requirements
Commissioning
Operation &
Maintenance
Competence
of Persons Changes after
Commissioning
* Simplified view
Installation &
Commissioning
Back to
appropriate
Safety Validation phase
Decommissioning
Modification Decommissioning
Sub-clause 15.4 Sub-clause 16
Clause Sub-clause Sub-clause
5 6.2 7, 12.7
Functional Safety
Safety Instrumented System (SIS): a system composed of sensors, logic solvers and final control elements for the
purpose of taking the process to a safe state when process conditions are outside
normal limits. It is separate from the Basic Process Control System.
Figure: REACTOR with instruments PT 1A, PT 1B, FT and I/P
A Safety Instrumented Function is defined as a “Function to be implemented by a SIS which is intended to achieve or maintain a safe state for the process with respect to a specific hazardous event.”
Figure: Loop 1 to Loop 4, Logic Solver, Final elements
Safe Failures (λS), 60%
SAFE DETECTED (λSD)
SAFE UNDETECTED (λSU)
Dangerous Failures (λD), 40%
DANGEROUS DETECTED (λDD)
DANGEROUS UNDETECTED (λDU)
$PFD_{avg} = 1 - e^{-\lambda_{DU} \cdot TI/2}$
$RRF = 1 / PFD_{avg}$
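The two formulas on this slide can be checked numerically. The following is a minimal sketch, assuming a single channel, a constant dangerous undetected failure rate and a fixed proof test interval TI. The function names and the example figures (2.0e-6 failures per hour, a one-year test interval) are illustrative assumptions and are not taken from the course material.

```python
import math


def pfd_avg(lambda_du_per_hour: float, test_interval_hours: float) -> float:
    """PFDavg = 1 - exp(-lambda_DU * TI / 2), as written on the slide."""
    return 1.0 - math.exp(-lambda_du_per_hour * test_interval_hours / 2.0)


def risk_reduction_factor(pfd: float) -> float:
    """RRF = 1 / PFDavg."""
    return 1.0 / pfd


# Illustrative values only (assumed, not from the course).
LAMBDA_DU = 2.0e-6   # dangerous undetected failures per hour
TI = 8760.0          # proof test interval: one year expressed in hours

pfd = pfd_avg(LAMBDA_DU, TI)
rrf = risk_reduction_factor(pfd)
print(f"PFDavg = {pfd:.2e}, RRF = {rrf:.0f}")
```

For small values of λDU × TI the exponential form is very close to the simpler product λDU × TI / 2, which is why both forms appear in practice.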
Review of Introduction
Learning from Incidents
Safety Standards – IEC 61508, IEC 61511, ISA 84.01
Concepts in safety standards
Safety Lifecycle
Class Exercise
Question Sheet – SIS (Safety) Terminology
Safety Life Cycle (SLC)
Analysis Phase
Realisation Phase
Installation &
Commissioning
Safety Validation
Operation Phase
Back to
appropriate
phase
Decommissioning
Operation
Maintenance including Periodic
Inspection and Testing
Management of Change for
Modification and Retrofit
Decommissioning
Figure: Functional Safety Lifecycle. Recoverable steps:
3. Layer of Protection Analysis (Failure Probabilities, Hazard Frequencies)
4. Consequence Analysis (Consequence Database, Hazard Consequences)
Allocation: Develop Non-SIS Layers, Target SILs
7b. Select Architecture (Redundancy: 1oo1, 1oo2, 2oo3, 1oo2D)
11. Validation Planning
Review Procedures, Management of Change, Emergency Plans, etc.
13. Operation & Maintenance Planning
14. SIS Startup, Operation, Maintenance, Periodic Functional Tests
Decommission
Based on the Functional Safety Lifecycle model provided in the Safety Engineering I course from exida.com
Class Exercise
Question Sheet – Safety Life Cycle
Analysis Phase
Identify the Risk associated with each hazard. This is the Inherent Risk
associated with the particular hazard.
What are the consequences?
How often does it happen?
Identify the risk reduction required when the inherent risk minus the risk
reduction provided by existing control measures is still higher than the
tolerable risk.
Do the existing control measures reduce the risk below the tolerable
risk level?
Can other control measures be identified or developed?
Use the required risk reduction as the safety performance target for the
Safety Instrumented Function (SIF). The safety performance target is
specified as the Safety Integrity Level (SIL) of the SIF.
What safety performance do I need from the identified SIF?
Risk Management
Tolerable Risk
Components of Risk
Consequence
Likelihood (frequency)
Risk Reduction
Process Risk
Bow Tie Diagrams to Visualise Risk
Risk Management
Definition of Risk
Risk is a measure of the likelihood and consequence of an adverse
event or incident. That is, how often it happens and what the effects are
when it does.
Risk Receptors
Injury to Personnel
Damage to the Environment
Financial Loss
Equipment / Property Damage
Business Interruption
Business Liability
Company Image
Lost Market Share
Tolerable Risk
Organisations have a moral, legal and financial responsibility to limit the risk their operations pose
Organisations have a moral duty to limit the risk to employees and the public
Organisations also have a responsibility to consider the risks to the environment, property and business
The concept is simple, but the determination of tolerable risk is complex. It considers the consequences of risk in a number of different ways:
Individual risk of injury
Damage to the environment
Economic loss due to lost
Moral, Legal and Financial Responsibilities
Moral: Make plant as safe as possible, disregard costs
Legal: Comply with regulations, regardless of costs or level of risk
Financial: Build lowest cost plant, keep operating budget as small as possible
Layers of Protection
Each layer provides risk reduction
LIKELIHOOD
Reduce the likelihood of the
incident
Reduce the consequence of the
incident
Reduce both the likelihood and
the consequence
CONSEQUENCE
CONSEQUENCE
Legal Implications
Loss of Public Image
Consequence for each category can be assessed in two ways:
Qualitatively – uses the judgement of competent people
Quantitatively – uses specific effects for calculation
Serious: Severe or Permanent Injury
Minor: Single Injury, not severe
Incidental: Minor Injury
Copyright © FSEglobal 2008
Consequence
Example of Consequence Categories for Damage to the Environment
Serious: Significant with Serious Offsite impact
Incidental: Recordable, no Agency involvement
Example of Consequence Categories for Economic Loss to the Business
Serious: $100K – $1M
Incidental: < $10K
Consequence
Category | Incidental | Minor | Serious | Major | Catastrophic
Costs (C) | < $10k | $10k - $100k | $100k - $1M | $1M - $10M | > $10M
People (P) | Minor injury | Single injury, not severe, possible lost time | Severe or permanent disabling injury | Single fatality (internal only) | Multiple fatalities (internal & external)
Environment (E) | Recordable, but no agency involvement | Licence breach and/or agency involvement | Significant with serious offsite impact | Serious offsite impact, possible long-term public health effects | Long term impact, adverse international publicity
Legal (L) | Recordable, but no agency involvement | Agency involvement with possible prosecution | Agency involvement with prosecution | Major prosecution with significant fine | Major prosecution with company officer imprisonment
Community & Reputation (R) | No impact | Complaint from public, minor reputation damage | Widespread complaints, local government action | Community outrage, state government action | Widespread outrage, federal government action
LIKELIHOOD
Quantitatively – uses specific frequencies for calculation
Seldom: 1/100 yrs
Rarely: 1/1,000 yrs
Frequently Event likely to occur once in 1 year to once in 10 years. > 1.00E-01 per year
Sometimes Event likely to occur once in 10 years to once in 100 years. > 1.00E-02 per year
Seldom Event likely to occur once in 100 years to once in 1,000 years. > 1.00E-03 per year
Never Event likely to occur less than once in 10,000 years. < 1.00E-04 per year
Likelihood | Incidental | Minor | Serious | Major | Catastrophic
1/10 yr | 1 | 2 | 3 | 3 | 4
1/100 yr | 1 | 1 | 2 | 2 | 3
1/1,000 yr | - | 1 | 1 | 2 | 3
1/10,000 yr | - | - | 1 | 1 | 2
Risk Reduction
The final step in risk management is to determine the difference
between the inherent risk and the tolerable risk. The difference
determines the level of risk reduction required.
Inherent Risk: based on the unmitigated consequence and likelihood
Risk Reduction: Layers of Protection, each layer provides risk reduction
Tolerable Risk: based on corporate risk tolerance
Risk Reduction
As the components of risk are Consequence and Likelihood, reducing
either of these components will reduce the risk
Figure (Likelihood against Consequence):
Inherent Risk of the Process
Non-SIS Risk Reduction, e.g. DCS, Alarm System
Consequence Reduction, e.g. bunds, fire protection, reduce hazardous material
SIS Risk Reduction: SIL 1, SIL 2, SIL 3
Unacceptable Risk Region
ALARP Risk Region
Tolerable Risk Region
Questions?
Class Exercise
Question Sheet – Risk Management
Process Risk
Examples:
Lightning hits an aeroplane and CAUSES the aeroplane to crash, and
the CONSEQUENCE is many people die. (Injury)
An oil storage tank overflows and CAUSES oil to leak into a river, and
the CONSEQUENCE is contaminated water. (Environment)
No maintenance CAUSES a valve to stick, and the CONSEQUENCE is
$1 million lost production. (Economic)
Mitigative Control Measures
Cause → Incident → Consequence
Example
A fire from a tank overflow costs $1 million
A fire detection system provides an alarm and triggers a deluge system.
This reduces the damage by 80%.
How much does a fire cost with and without the fire detection system?
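The question above can be worked through with a few lines of arithmetic. This is a minimal sketch, assuming the detection and deluge system always operates when the fire occurs; the function name is hypothetical.

```python
def mitigated_cost(unmitigated_cost: float, damage_reduction: float) -> float:
    """Cost of the incident after a mitigative control removes a fraction of the damage."""
    return unmitigated_cost * (1.0 - damage_reduction)


FIRE_COST = 1_000_000     # dollars, from the example
DELUGE_REDUCTION = 0.80   # the deluge system reduces the damage by 80%

cost_without = FIRE_COST
cost_with = mitigated_cost(FIRE_COST, DELUGE_REDUCTION)
print(f"Without fire detection: ${cost_without:,.0f}")
print(f"With fire detection:    ${cost_with:,.0f}")
```

The control does not change how often the fire happens, only how much it costs when it does.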
Preventative Control Measures
Cause → Incident → Consequence
Example
A level transmitter fails once a year and causes a tank to overflow
A high level switch provides an alarm and an operator shuts the inlet valve.
This is effective 90% of the time.
How often does the tank overflow with and without the alarm?
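The question above can be answered the same way as the mitigative example, but this time the control acts on frequency. This is a minimal sketch, assuming every transmitter failure would otherwise lead to an overflow; the function name is hypothetical.

```python
def frequency_with_control(cause_per_year: float, effectiveness: float) -> float:
    """Incident frequency once a preventative control stops a fraction of the causes."""
    return cause_per_year * (1.0 - effectiveness)


TRANSMITTER_FAILURES_PER_YEAR = 1.0   # from the example
ALARM_EFFECTIVENESS = 0.90            # alarm and operator action work 90% of the time

overflow_without = TRANSMITTER_FAILURES_PER_YEAR
overflow_with = frequency_with_control(TRANSMITTER_FAILURES_PER_YEAR, ALARM_EFFECTIVENESS)
print(f"Without alarm: {overflow_without:.1f} overflows per year")
print(f"With alarm:    {overflow_with:.1f} overflows per year "
      f"(about one in {1 / overflow_with:.0f} years)")
```

The consequence of an overflow is unchanged; only the likelihood is reduced.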
Cause → Incident (Tank Overflow)
1 Faulty limit switch; OR
2 High level alarm ignored; OR
3 Inlet valve sticks open
Example:
A faulty limit switch; OR
An operator ignores a high level alarm; OR
An inlet valve sticks open
Incident Consequence
Tank Overflow
1 Lost product
OR
2 Environmental damage
OR
3 Fire, Explosion
Example:
A storage tank overflow could cause
Lost product; OR
Fire, explosion; OR
Environmental damage
! IMPORTANT
CONCEPT !
Questions?
Class Exercise
Question Sheet – Bow Tie Diagrams
Conceptual Design
Scope Definition
This is the framework for the project.
The purpose of the project in terms of goals and outcomes is defined.
Operational and safety objectives are defined.
Adequate resourcing and realistic scheduling should be made
available to achieve these objectives.
Responsibilities are assigned and reporting mechanisms are put in
place.
Hazard Analysis
Existing layers of protection are identified, and their effect on the level
of risk is determined.
Identify control measures that can completely prevent the hazard, or
completely mitigate it
What level of risk reduction can each control measure realistically
offer?
The level of risk reduction required determines the Safety Integrity Level
(SIL) requirement of the SIF.
A SIF with a SIL 3 requirement is expensive to implement, and
expensive to maintain
If a SIL-rated function is implemented in a BPCS (DCS), the BPCS must
have the required safety rating – this is not easy to achieve
If an Alarm and Operator Response is identified as a layer of
protection, it must be independent from other layers of protection
Inherent Risk: based on the unmitigated consequence and likelihood; the risk inherent in the process
Layers of Protection (SIS, ALARM, BPCS): each layer provides risk reduction
Tolerable Risk: based on corporate risk tolerance
Checklist
The disadvantage is that all items may not be listed, and therefore some
items may be overlooked
What if …
FMEA
HAZOP
Node: C3 Column 1
Parameter: Liquid Level

Deviation: Too high
Cause: Loss of reboiler heating element
Consequence: Potential distributor damage
Safeguards: High level alarm
Risk ranking: D4, D3
Recommendation: Add high-high level switch to activate drain valve
By: J. Jones, May 2008

Deviation: Too high
Cause: Bottoms valve blockage
Consequence: Potential distributor damage
Safeguards: High level alarm
Risk ranking: D4, D3
Recommendation: Same as above
By: J. Jones, May 2008

Deviation: Too low
Cause: Loss of feed flow
Consequence: Potential reboiler tube damage
Safeguards: Low level alarm
Risk ranking: D3, D2
Recommendation: Verify if tubes will be damaged
By: S. Smith, March 2008

Deviation: More
Cause: Column Steam Reboiler Pressure Control fails, causing excessive heat input.
Consequence: Column Overpressure and potential mechanical failure of the vessel and release of its contents.
Safeguards: Pressure relief valve, operator intervention to high pressure alarms, Mechanical design of vessel.
Risk ranking: E4, E2
Recommendation: Install SIF (SIL 2) to stop reboiler steam flow upon high column pressure
By: J. Jones, May 2008

Deviation: More
Cause: Steam reboiler tube leak causes high pressure steam to enter vessel
Consequence: Column Overpressure and potential mechanical failure of the vessel and release of its contents.
Safeguards: Pressure relief valve, operator intervention to high pressure alarms.
Risk ranking: E4, E2
Recommendation: Same as above
By: J. Jones, May 2008

Deviation: More
Cause: Low flow through pump causes pump failure and subsequent seal failure
Consequence: Pump seal fails and releases flammable material
Safeguards: Low Outlet flow pump Shutdown SIF (SIL 2).
Risk ranking: D3, D1
Recommendation: Existing safeguards adequate
Hazard &
Consequences
Initiating Events
In HAZOP, Initiating events are found in the “Causes” column
What-If and Checklist questions
Potential for multiple initiating events per hazard
Safeguards
Find both non-SIS and SIS Safeguards (other than SIS under study)
Safeguards apply to initiating events. Multiple safeguards per initiating event
may exist
Safeguards apply to a
specific Initiating Event
SIF Description
Find all of the existing and recommended SIF
Recommended SIF found in Recommendations Column, Existing SIF found in
Safeguards Column
Hazard Definition
Process Hazard Analysis
HAZOP
HAZOP Results
Questions?
Class Exercise
Question Sheet – Process Hazard Analysis
Consequence Analysis
prices and cause shareholders
to sell shares?
Incident Outcome
The physical manifestation of
the incident.
For toxic materials, the incident
outcome is a toxic release,
while for flammable materials,
the incident outcome could be a
Boiling Liquid Expanding Vapor
Cloud Explosion (BLEVE), flash
fire, unconfined vapor cloud
explosion, toxic release, etc.
For Example
For a 10 lb/sec leak of
ammonia, the incident outcome
is a toxic release
Consequence Analysis
should consider:
Toxic Hazards
Consequence Categorisation
Consequence
Category | Incidental | Minor | Serious | Major | Catastrophic
Costs (C) | < $10k | $10k - $100k | $100k - $1M | $1M - $10M | > $10M
People (P) | Minor injury | Single injury, not severe, possible lost time | Severe or permanent disabling injury | Single fatality (internal only) | Multiple fatalities (internal & external)
Environment (E) | Recordable, but no agency involvement | Licence breach and/or agency involvement | Significant with serious offsite impact | Serious offsite impact, possible long-term public health effects | Long term impact, adverse international publicity
Legal (L) | Recordable, but no agency involvement | Agency involvement with possible prosecution | Agency involvement with prosecution | Major prosecution with significant fine | Major prosecution with company officer imprisonment
Community & Reputation (R) | No impact | Complaint from public, minor reputation damage | Widespread complaints, local government action | Community outrage, state government action | Widespread outrage, federal government action
Example:
In a five year period there were 235 explosions of industrial boilers.
As a result of those explosions, 17 people were killed and 84 people
were injured.
Probable Loss of Life (PLL) = 17 / 235 = 0.072 per incident
Personal Injury (PI) = 84 / 235 = 0.357 per incident
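The statistical method in this example is a simple ratio of recorded harm to recorded incidents. A minimal sketch using the figures from the example; the variable names are illustrative.

```python
# Historical record from the example: five years of industrial boiler explosions.
EXPLOSIONS = 235
FATALITIES = 17
INJURIES = 84

# Average harm per incident, estimated from the historical record.
fatalities_per_incident = FATALITIES / EXPLOSIONS
injuries_per_incident = INJURIES / EXPLOSIONS

print(f"Fatalities per incident: {fatalities_per_incident:.3f}")
print(f"Injuries per incident:   {injuries_per_incident:.3f}")
```

The estimate is only as good as the record behind it: it assumes the plant under study resembles the population of boilers in the statistics.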
Consequence Modelling
Calculates “Effect Zones” and “Effect Distances”
Typically uses mathematical models
Figure: effect distances of 112 meters and 87 meters
Consequence Modelling
Effect Zone
For an incident outcome of toxic release, the area over which the
airborne concentration exceeds some level of concern.
For example: given an IDLH* for ammonia of 500 ppm (v), an effect
zone of 4.6 square miles is estimated for a 10 lb/s leak.
Zones for thermal effects and explosion overpressure are described in a
similar fashion.
Types of Consequence
Consequence Analysis Methods
Categorisation
Statistical Analysis
Consequence Modeling
Questions?
Class Exercise
Question Sheet – Consequence Analysis
Likelihood Analysis
average
path of aircraft, terrorist
activity, sabotage
Initiating Event → Control System Fails → Operator does not respond appropriately → Mechanical Relief Fails → Incident
Statistical Analysis
Questions?
Class Exercise
Question Sheet – Likelihood Analysis
Fault Trees
Quantitative Analysis of Fault Trees - combine probabilities using probability multiplication.
Figure: fault tree with top event System Failure (PTOP) and basic event Battery
What is the probability the valve fails to close?
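The gate arithmetic behind a quantitative fault tree can be sketched as follows. This is a minimal illustration that assumes all basic events are independent; the event names and probabilities for the valve are hypothetical and are not the figures from the class exercise.

```python
import math
from typing import Iterable


def and_gate(probabilities: Iterable[float]) -> float:
    """All inputs must fail: multiply the probabilities."""
    return math.prod(probabilities)


def or_gate(probabilities: Iterable[float]) -> float:
    """Any input failing is enough: 1 minus the product of the success probabilities."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


# Hypothetical basic events for "valve fails to close" (assumed values).
P_SOLENOID_FAILS = 0.01
P_VALVE_STICKS = 0.02

# The valve fails to close if the solenoid fails OR the valve sticks.
p_valve_fails = or_gate([P_SOLENOID_FAILS, P_VALVE_STICKS])

# Two such valves in series on the same line: both must fail (AND gate).
p_both_valves_fail = and_gate([p_valve_fails, p_valve_fails])

print(f"Single valve fails to close: {p_valve_fails:.4f}")
print(f"Both valves fail to close:   {p_both_valves_fail:.6f}")
```

The AND result is only valid if the two valves share no common cause of failure, which is the point made elsewhere in the course about separation and diversity.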
Questions?
Class Exercise
Question Sheet – Fault Tree Analysis
Event Trees
Figure: event tree from Initiating Event through Branch 1 and Branch 2 to Outcome 1, Outcome 2, Outcome 3, Outcome 4, Outcome 5 and Outcome 6
Data:
Accident, 1/7 years
Probability of overturn, 1/10
Probability of leak after overturn, 1/3; otherwise 1/6
Probability of ignition, 20% in all spill cases
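The data above can be run through an event tree as follows. This is a minimal sketch that assumes the branches are overturn, then leak, then ignition, and that "spill" and "leak" mean the same thing; exact fractions are used so that the branch frequencies add up without rounding error.

```python
from fractions import Fraction

ACCIDENT_FREQ = Fraction(1, 7)        # accidents per year
P_OVERTURN = Fraction(1, 10)
P_LEAK_IF_OVERTURN = Fraction(1, 3)
P_LEAK_IF_UPRIGHT = Fraction(1, 6)
P_IGNITION = Fraction(1, 5)           # 20% in all spill cases


def event_tree() -> dict:
    """Return the frequency per year of each end outcome of the tree."""
    outcomes = {}
    branches = (
        ("overturn", P_OVERTURN, P_LEAK_IF_OVERTURN),
        ("no overturn", 1 - P_OVERTURN, P_LEAK_IF_UPRIGHT),
    )
    for label, p_branch, p_leak in branches:
        branch_freq = ACCIDENT_FREQ * p_branch
        outcomes[f"{label}, no leak"] = branch_freq * (1 - p_leak)
        outcomes[f"{label}, leak, no ignition"] = branch_freq * p_leak * (1 - P_IGNITION)
        outcomes[f"{label}, leak, ignition"] = branch_freq * p_leak * P_IGNITION
    return outcomes


outcomes = event_tree()
fire_freq = sum(f for name, f in outcomes.items() if name.endswith("leak, ignition"))

for name, freq in outcomes.items():
    print(f"{name}: {float(freq):.5f} per year")
print(f"Total fire frequency: {float(fire_freq):.5f} per year")
```

A useful check on any event tree is that the outcome frequencies sum back to the initiating event frequency.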
Questions?
Class Exercise
Question Sheet – Event Tree Analysis
LOPA
Analyse the results of the HAZOP (PHA) together with the Risk
Matrix (tolerable risk) to determine if the existing control
measures (safeguards) reduce risk sufficiently
This form of analysis is a specific form of event tree analysis,
but is only interested in the likelihood of the failure outcome
Layers of Protection
Consequence
Mitigation Emergency
Response
Safety
Active Protection Layer
Likelihood
Instrumented Emergency
System Shutdown
Trip level alarm
Operator Process
Intervention Shutdown
Process Alarm
Basic
Process Process Process Control Layer
time
Specificity
An independent protection layer must be specifically designed to
prevent the consequences of one potentially hazardous event.
Independence
The operation of the protection layer must be completely
independent from all other protection layers, no common equipment
can be shared with other protection layers.
Dependability
The device must be able to dependably prevent the consequence
from occurring. Both systematic and random faults need to be
considered in its design
Auditability
The device should be proof tested and maintained. These audits of
operation are necessary to ensure that the specified level of risk
Initiating Events
Examples:
Fire
No Incident
Initiating Event
Cooling water failure frequency is 0.5 /yr
The BPCS and SIS are physically separate devices, including sensors,
logic solver and final elements.
Failure of the BPCS is not responsible for initiating the unwanted
accident.
The BPCS has the proper sensors and actuators available to perform a
function similar to the one performed by the SIS.
1 Response Unlikely – Not all of the conditions for a normal operator response 1.0
have been satisfied
3 Drilled Response - All of the conditions for a normal operator response have 0.01
been satisfied, and a ‘drilled response’ program is in place at the facility. Drilled
response exists when written procedures, which are strictly followed, are drilled
or repeatedly trained. The drilled set of actions forms a small part of all alarms
where response is highly practised – that is, its implementation is ‘automatic’.
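The LOPA arithmetic for this scenario can be sketched as below. It is a minimal illustration that treats the values 1.0 and 0.01 listed against operator response categories 1 and 3 as probabilities of failure on demand, which is an assumption about the unlabeled column. Only the categories shown above are included, and the function name is hypothetical.

```python
import math
from typing import Iterable

COOLING_WATER_FAILURES_PER_YEAR = 0.5   # initiating event frequency from the slide

# Operator response values as listed above (read here as PFD; an assumption).
OPERATOR_RESPONSE_PFD = {
    1: 1.0,    # Response Unlikely
    3: 0.01,   # Drilled Response
}


def lopa_frequency(initiating_per_year: float, layer_pfds: Iterable[float]) -> float:
    """Frequency of the unwanted outcome: the initiating frequency times the PFD of each independent layer."""
    return initiating_per_year * math.prod(layer_pfds)


freq_unlikely = lopa_frequency(COOLING_WATER_FAILURES_PER_YEAR, [OPERATOR_RESPONSE_PFD[1]])
freq_drilled = lopa_frequency(COOLING_WATER_FAILURES_PER_YEAR, [OPERATOR_RESPONSE_PFD[3]])

print(f"Response unlikely: {freq_unlikely:.3f} per year")
print(f"Drilled response:  {freq_drilled:.3f} per year")
```

Multiplying the PFDs is only valid when the layers are independent of each other and of the initiating event, which is why the independence conditions are stated first.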
$P = \dfrac{\text{Time at Risk}}{\text{Total Time}}$
In some organizations
PFD = 0.0 if vessel designed to withstand pressure
Example:
OREDA says 1.0 x 10^-7 /hr rate for “significant leakage”
$PFD = (1.0 \times 10^{-7} \times 8760) \times 1 \approx 0.0009$
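The conversion from an hourly failure rate to a probability can be checked with a short calculation. This is a minimal sketch that reads the final factor of 1 in the example as a one-year exposure, which is an assumption; the function name is hypothetical.

```python
HOURS_PER_YEAR = 8760


def pfd_from_hourly_rate(rate_per_hour: float, exposure_years: float = 1.0) -> float:
    """Approximate probability of failure as rate x time, valid while the result is much less than 1."""
    return rate_per_hour * HOURS_PER_YEAR * exposure_years


LEAK_RATE_PER_HOUR = 1.0e-7   # "significant leakage" rate quoted in the example

pfd = pfd_from_hourly_rate(LEAK_RATE_PER_HOUR)
print(f"PFD = {pfd:.2e} (about {pfd:.4f})")
```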
Relief Valves
Rupture Disks
Fusible Plugs
Occupancy
$P = \dfrac{\text{Time of Occupancy}}{\text{Total Time}}$
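The occupancy factor is a simple fraction of time. A minimal sketch follows; the two hours per day used in the example is an assumed figure and the function name is hypothetical.

```python
def occupancy_probability(occupied_hours: float, total_hours: float) -> float:
    """P = time of occupancy / total time."""
    if total_hours <= 0 or not 0 <= occupied_hours <= total_hours:
        raise ValueError("occupied_hours must lie between 0 and total_hours")
    return occupied_hours / total_hours


# Assumed example: an operator is in the hazard zone for 2 hours in every 24.
p_occupied = occupancy_probability(2, 24)
print(f"Probability that the area is occupied: {p_occupied:.3f}")
```

In LOPA this factor multiplies the event frequency in the same way as the PFD of a protection layer.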
Questions?
Class Exercise
Question Sheet – Layer of Protection Analysis
Non-SIS Layers
SIL Selection
Reminder!
Different from a SIS, which can encompass multiple functions and act in
multiple ways to prevent multiple harmful outcomes
SIS may have multiple SIF with different individual SIL, so it is
incorrect and ambiguous to define a SIL for an entire Safety
Instrumented System
Hazard Matrix Procedure
Categorize consequence
Categorize likelihood
Select SIL from matrix corresponding to identified consequence and likelihood categories
3 X 3, 4 X 4, 5 X 5, …
Figure: hazard matrix with likelihood rows Low, Moderate, High and Hazardous Event Severity Rating columns Minor, Serious, Extensive; cells give the SIL (1, 2, 3a, 3b, Note c)
a) One Level 3 Safety Instrumented Function does not provide sufficient risk reduction at this risk
level. Additional modifications are required in order to reduce risk (see note d);
b) One Level 3 Safety Instrumented Function may not provide sufficient risk reduction at this risk
level. Additional review is required (see note d);
Assignment of Consequence
Based on IEC 61511-3 Annex C
consequence categories to determine the SIL required
Example 1
A SIF was identified during a HAZOP study
The HAZOP team determined:
the consequence is Serious
the likelihood is High
Example 1 (continued)
Further analysis showed that this scenario yielded a consequence of 0.21 Probable Loss of Life (PLL) and a likelihood of 1/576 incidents per year
What is the SIL?
Likelihood | Recordable Injury | Lost Time Injury | Permanent Injury/Death | Many Deaths
1 per 1000 yrs | Acceptable | Acceptable | Moderate | Extreme
1 per 10,000 yrs | Acceptable | Acceptable | Moderate | Moderate
1 per 100,000 yrs | Acceptable | Acceptable | Acceptable | Moderate
SIL 1 (RRF>10), SIL 2 (RRF>100), SIL 3 (RRF>1000)
Rule Set:
All extreme risk will be reduced
All moderate risks will be reduced where practical.
Risk Matrix
Example of a Risk Matrix
This example shows relative risk levels
Rules can be developed to determine if risk related to a particular incident is acceptable
Figures in the cells represent Safety Integrity Level
Likelihood (rows) against Consequence (columns):
1/yr | 1 | 2 | 3 | 4 | NA
1/10yr | 1a | 1 | 2 | 3 | 4
1/100yr | - | 1a | 1 | 2 | 3
1/1000yr | - | - | 1a | 1 | 2
1/10,000yr | - | - | - | 1a | 1
Rule Set:
Risk must be reduced
Risk to be reduced if cost-effective
Risk is tolerable
Requires SIL 3 SIF
Requires SIL 2 SIF
Requires SIL 1 SIF
0 1 1
1 2 2 1
Each additional layer of
Likelihood
0 1 2
Consequence
Consequence
Ca Minor Injury
Cb Serious Injury, Single Death
Cc Several Deaths
Cd Many Deaths
Frequency & Exposure
Fa Rare to Frequent
Fb Frequent to Continuous
Possibility of Avoidance
Pa Sometimes Possible
Pb Almost Impossible
Probability of Occurrence
W1 Very Slight
Output | W3 | W2 | W1
X1 | a | - | -
X2 | 1 | a | -
X3 | 2 | 1 | a
X4 | 3 | 2 | 1
X5 | 4 | 3 | 2
X6 | b | 4 | 3
Risk Graph
[Figure: risk graph. Branches on consequence (Ca to Cd), frequency and exposure (Fa, Fb) and possibility of avoidance (Pa, Pb) lead to rows X1 to X6; columns W3, W2 and W1 give the result.]
Consequence
Ca Minor Injury
Cb Serious Injury, Single Death
Cc Several Deaths
Cd Many Deaths
Frequency & Exposure
Fa Rare to Frequent
Fb Frequent to Continuous
Possibility of Avoidance
Pa Sometimes Possible
Pb Almost Impossible
Probability of Occurrence
W1 Very Slight
      W3   W2   W1
X1    a    -    -
X2    1    a    -
X3    2    1    a
X4    3    2    1
X5    4    3    2
X6    b    4    3
Parameters Description
Demand Rate W The number of times per year that the hazardous event
would occur if no SIS was fitted. This can be determined
by considering all the failures that can lead to one
V = 1 Rupture or explosion
Questions ?
Class Exercise
Question Sheet – Qualitative SIL Selection
SIL Selection
[Figure: safety lifecycle, highlighting Safety Requirements Allocation.]
Example
A known consequence requires a likelihood of 1 x 10^-5 to achieve a
tolerable risk level.
FTA shows that existing control measures reduce the likelihood to
1 x 10^-3.
The gap between the two is a factor of 1 x 10^-2, so the required risk
reduction factor is 100.
From the SIL table, the SIF needs to be SIL 2.
LOPA
If the consequence is known, then LOPA can be used to calculate the
likelihood of the event with all existing control measures in place.
This is compared with the likelihood required to achieve a tolerable risk
level, and the difference is directly related to the SIL requirement of the
SIF
Example
A known consequence requires a likelihood of 1 x 10^-6 to achieve a
tolerable risk level.
LOPA shows that existing control measures reduce the likelihood to
1 x 10^-3.
The gap between the two is a factor of 1 x 10^-3, so the required risk
reduction factor is 1000.
From the SIL table, the SIF needs to be SIL 3.
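The comparison made in the FTA and LOPA examples can be sketched in Python. This is a minimal sketch, not the course's own tool: the function names are hypothetical, and the band edges are an assumption chosen to match the slide examples, where a required risk reduction factor of 100 is assigned SIL 2 and 1000 is assigned SIL 3.

```python
import math

def required_rrf(unmitigated_freq, target_freq):
    """Risk reduction factor needed to get from the unmitigated to the target frequency."""
    return unmitigated_freq / target_freq

def sil_for_rrf(rrf):
    """Map a required RRF to a SIL band (assumed: SIL n covers 10**n <= RRF < 10**(n+1))."""
    if rrf <= 0:
        raise ValueError("RRF must be positive")
    # Small tolerance so a ratio such as 99.9999999 still counts as 100.
    sil = math.floor(math.log10(rrf) + 1e-9)
    if sil < 1:
        return 0  # below the SIL 1 band
    if sil > 4:
        raise ValueError("required RRF is beyond the SIL 4 band")
    return sil

fta_sil = sil_for_rrf(required_rrf(1e-3, 1e-5))   # FTA example
lopa_sil = sil_for_rrf(required_rrf(1e-3, 1e-6))  # LOPA example
print(fta_sil, lopa_sil)
```

A required RRF that falls inside a band, such as 210, maps to SIL 2 here; the required RRF then has to be carried forward as a stated minimum for the SIF.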
Severity    Impact                                                  Target Frequency
Rating                                                              (per year)
Minor       Temporary injury to personnel and damage to the         1.0 x 10^-3
            environment. Minor damage to equipment. No shutdown
            of the process.
Serious     Serious injury to personnel and the environment.        1.0 x 10^-4
            Damage to equipment. Short shutdown of the process.
Extensive   Catastrophic consequence to personnel and the           1.0 x 10^-6
            environment. Large scale damage of equipment.
            Shutdown of a process for a long time.
$RRF_{SIF} = \frac{F_{\text{Unmitigated Event}}}{F_{\text{Target}}}$
$PFD = \frac{1}{RRF_{SIF}}$
Table columns: Safety Integrity Level | Probability of failure on demand, average (Low Demand mode of operation) | Risk Reduction Factor
Example
Event Freq.: 1 / 10 yr
Target Freq.: 1 / 1000 yr
RRF: 100
PFD: 0.01
Example
Calculate the required risk reduction and assign the SIL with the same method as
the general frequency-based method, using
$F_{\text{target}} = \frac{F_{\text{individual risk}}}{PLL}$
Example
An accident scenario yielded a consequence of 0.21 Probable Loss of
Life (PLL) and a likelihood of 1/576 incidents per year.
Tolerable individual risk of fatality at this facility is 1 x 10^-4.
What SIL should be selected?
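The arithmetic for this example can be sketched as follows. The slide's own worked answer is not in the text, so this only applies the two formulas given earlier (target frequency = tolerable individual risk / PLL, then RRF = event frequency / target frequency); the variable and function names are hypothetical.

```python
def target_frequency(tolerable_individual_risk, pll):
    """Tolerable event frequency for a scenario with the given Probable Loss of Life."""
    return tolerable_individual_risk / pll

f_target = target_frequency(1e-4, 0.21)  # events per year
f_event = 1 / 576                        # events per year, from the scenario
rrf = f_event / f_target                 # required risk reduction factor

print(f"F_target = {f_target:.3e} per year, required RRF = {rrf:.2f}")
```

The resulting RRF is then looked up in the SIL table in the same way as for the general frequency-based method.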
SIL Assignment
SIL selection is performed based on the RRF calculated for the SIF
Assuming the RRF required = 210
Target SIL = SIL 3
The minimum risk reduction for SIF of 1000 guarantees that any
SIL 3 system will achieve the required risk reduction factor
Or Target SIL = SIL 2 with RRF > 210
Topics:
Risk and the Context of SIL Selection
Safety Instrumented Functions
Required risk reduction leading to SIL assignment
Questions?
Class Exercise
Question Sheet – Quantitative SIL Selection
Safety Requirement Specification
[Figure: safety lifecycle, highlighting Safety Requirements Allocation.]
Definition
IEC 61511: “specification that contains all the requirements of the safety
instrumented functions in a safety instrumented system”
Objective
Specify all requirements of SIS needed for detailed engineering and
process safety information purposes
Tasks
Identify all Safety Instrumented Functions
Document the SIL requirement or Risk Reduction requirement of each
SIF
Document the cause (frequency)/hazard/consequence the SIF is
guarding against
Provide a functional description of each SIF, and document this as
The hazard
Frequency of occurrence
Consequence
SIL level required
SRS Elements
Functional Requirements
Description of the function of the SIF
How it should work
Integrity Requirements
The risk reduction and reliability requirements
How well it should work
Trip Point
Units
SIL
Tag# Description
BS-01 Burner Loss of Flame 1 ~ ~ PSIG X X X
Structured Text
Strengths: extremely flexible; no special knowledge required
Weaknesses: time consuming; transposition to program code difficult
and error prone
Cause-and-Effect Diagrams
Strengths: low level of effort; clear visual representation
Weaknesses: rigid format (some functions cannot be represented with
C-E diagrams); can oversimplify
Binary Logic Diagrams (ANSI/ISA-5.2-1976)
Strengths: more flexible than C-E diagrams; direct transposition to a
function block diagram program
Weaknesses: time consuming; knowledge of standard logic
representation required
Describe the logic for a SIF where a low-pressure condition can cause
flame-out in a fired heater. In this case, the inputs are from burner
monitor switch BS-01 and pressure switch PSL-01. The output is to a
double block and bleed assembly whose upstream and downstream block
valves are XV-03A and XV-03B respectively, with XV-03C as the bleed
valve. The valves can be moved to their safe position by de-energizing
solenoid XY-03. The system is de-energize-to-trip.
Trip Point
Units
SIL
Tag# Description
BS-01 Burner Loss of Flame 1 ~ ~ PSIG X X X
PSL-01 Fuel Gas Pressure Low ~ 7 PSIG X X X
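The same SIF logic can also be written as a short piece of code. This is a minimal sketch under stated assumptions: the function names are hypothetical, the trip is assumed to occur at or below the listed 7 PSIG trip point, the safe state is assumed to be block valves closed and bleed valve open, and latching and reset are left out.

```python
LOW_PRESSURE_TRIP_PSIG = 7.0  # trip point listed for the fuel gas pressure switch

def xy03_energized(flame_detected, fuel_gas_pressure_psig):
    """Return True while solenoid XY-03 should stay energized (no trip)."""
    pressure_ok = fuel_gas_pressure_psig > LOW_PRESSURE_TRIP_PSIG
    # De-energize to trip: both inputs must be healthy to keep the solenoid energized.
    return flame_detected and pressure_ok

def valve_positions(solenoid_energized):
    """Assumed positions: block valves open and bleed closed only while energized."""
    if solenoid_energized:
        return {"XV-03A": "open", "XV-03B": "open", "XV-03C": "closed"}
    return {"XV-03A": "closed", "XV-03B": "closed", "XV-03C": "open"}

print(valve_positions(xy03_energized(True, 10.0)))
print(valve_positions(xy03_energized(True, 5.0)))
```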
The Binary Logic Diagram below also describes the gas flow shutdown
example.
[Figure: binary logic diagram. BS-01 and PSL-01 (1 = energized) feed an AND gate that drives the solenoid; XV-03A and XV-03B are marked FC and vent valve XV-03C is marked FO.]
Exceptions can be justified, but great care must be taken with all
aspects of the design including verification calculations and the data
used for such calculations.
[Figure: process variable against time, with High level, Trip level, Detection, Operator takes action, Accident and Process Safety Time marked.]
Reset Functions
Most SIFs should latch when a trip occurs. This means that an operator
reset is normally required, to ensure that control valves are in their
proper position and that the process is safe to restart.
Automatic resetting is used only when immediate restart of the
equipment is desired (circulation pumps, drain valves, etc.). Any
additional restart risk must be considered as part of the SIL selection
process.
Integrity Requirements
SRS Format
1. Introduction
   1.1 Overview of system
   1.2 Description of operation
   1.3 Other
2. General Requirements
   2.1 Requirements common to all SIF
3. SIF Requirements
   3.1 Functional Requirements
   3.2 Integrity Requirements
Checklists B.2.5
SRS Quality
The measure of quality for any document, including an SRS, is not the
number of pages or the document weight but rather how precisely,
quickly, and clearly all required information is passed to the reader.
SRS Review
Questions ?
Class Exercise
Question Sheet – Safety Requirement Specification
Design Considerations
Power
Grounding
(Annex B.7.2)
System Environment
Operator Interfaces
Resets
Bypassing
Requirements
Allocation
Safety
Realisation Phase
System Technologies
The IEC 61508 / IEC 61511 Standards describe functional safety for
Electric / Electronic / Programmable systems. These technologies
include:
Electric - Relay systems
Electronic - Solid State systems
Programmable – PLC and DCS systems
Tasks
Choose the right equipment for the purpose. All criteria used for process
control still apply.
Obtain reliability and safety data for ALL of the equipment.
Relay Systems
Used in relatively simple logic applications
Generally fail safe
Logic reconfigured by rewiring
Advantages
Fail-safe
Low initial cost
Can be distributed
Immune to interference
Suits most voltages
Considerations
Nuisance trips
No diagnostics
No serial communications
Large systems are complex
Reprogramming by rewiring
Not self documenting
High cost of ownership
Advantages Considerations
Built-in functionality
Flexibility
Can be distributed
Not self documenting
Serial communication available
High cost of ownership
Good diagnostic capability
No common cause
Programmable Systems
Modular construction
Microprocessors/software perform logic
Reprogrammed through software
Advantages
Flexibility
Modular
Highest packing density
Self documenting
Considerations
Software dependent (possible reliability/security issues)
Common cause
Mode of Operation
Safety Systems are Static Systems - it is sometimes years before they
operate
[Figure: two 24 V input circuits carrying an input signal from the field to the CPU (5 V side), one with diagnostics for short circuit and open circuit; a field switch, CPU and output relay circuit.]
Questions ?
Class Exercise
Question Sheet – SIS Technologies
Architecture
A 1oo2
A
B
2oo3
B
A
2oo2 C
1oo2 (channels A and B, voted): 0.08, 0.0004 (very safe, but more nuisance trips than simplex)
2oo3 (channels A, B and C, voted): 0.0048, 0.0012
The optimum solution!
Compromise!
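The pairs of figures quoted for 1oo2 and 2oo3 can be reproduced with simple rare-event approximations. This is a sketch under assumptions that are not stated on the slide: the first figure is read as the probability of a spurious (safe) trip, the second as the probability of failing dangerously, and the single-channel values 0.04 and 0.02 are inferred because they reproduce the quoted numbers. The function names are hypothetical.

```python
def one_out_of_two(p_safe, p_dangerous):
    """1oo2: either channel can trip; both must fail dangerously to miss a demand."""
    return 2 * p_safe, p_dangerous ** 2

def two_out_of_three(p_safe, p_dangerous):
    """2oo3: any two channels decide the outcome (three possible pairs)."""
    return 3 * p_safe ** 2, 3 * p_dangerous ** 2

P_SAFE, P_DANGEROUS = 0.04, 0.02  # assumed single-channel values, not from the slide

spurious_1oo2, dangerous_1oo2 = one_out_of_two(P_SAFE, P_DANGEROUS)
spurious_2oo3, dangerous_2oo3 = two_out_of_three(P_SAFE, P_DANGEROUS)
print(spurious_1oo2, dangerous_1oo2)
print(spurious_2oo3, dangerous_2oo3)
```

Under these assumptions 1oo2 gives the lowest dangerous failure probability at the cost of roughly doubling spurious trips, while 2oo3 improves on a single channel in both respects.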
Diagnostics
Designed for
1oo2D
Designed for Safety
Safety & Availability
B
Diagnostics
Basic Formulae
Where:
MTTR = Mean Time To Repair
TI = Test Interval
MDT (Mean Down Time) = (MTTR + TI/2)
S = Safe (initiating) failure
D = Dangerous (inhibiting) failure
Assumption: 1/MDT >> failure rate
Source: Reliability, Maintainability, and Risk, by D.J. Smith
Questions ?
Class Exercise
Question Sheet – SIS Architecture
Test Philosophy
The choice of test frequency influences the level of reliability of the
SIS.
Some processes cannot tolerate frequent shutdowns for preventative
maintenance, and need high reliability systems.
Some processes such as batch processes stop and start frequently
making it easier to perform necessary maintenance.
Average Probability
$PF_{avg} = \frac{1}{T} \int_0^T PF(t)\,dt$
Approx. $PF = \lambda \cdot TI$
Approx. $PF_{avg} = \lambda \cdot TI / 2$
[Figure: PF(t) against operating time, rising over each test period and resetting at each test, with PFavg shown against the IEC 61511 SIL 1 to SIL 4 bands; a second version of the plot is marked CPT.]
Questions ?
Class Exercise
Question Sheet – Test Philosophy
SIL Verification
SILs Achieved
Background
The SRS provided the SIL requirement of each SIF
The technology has been chosen
The architecture has been chosen
The test philosophy has been documented
Failure Data
The failure data for each component in each subsystem of the SIF is needed to
calculate if the SIL requirements have been met.
As this phase requires that performance of each SIF is verified, it is important to
understand the data received on failure rates and failure modes for the various
equipment used, and to be able to use that data for performance verification.
The failure data is used to calculate the PFDavg for each subsystem and for the
whole SIF
Failure Data
Components
Modules
Complete system
$\lambda$ (Failure Rate) $= \frac{\text{No. Failures}}{\text{Total Unit Hours of Operation}}$ failures/hr
[Figure: bathtub curve of failure rate λ against time, with Infant Mortality, Useful Life and Wearout Failures regions.]
Bath tub curve shows infant mortality and aging failures
(which may not be included in data bases).
Example
50 solenoids have been operating in the field for 5 years. During that
period 5 solenoids have failed. What is the failure rate expressed in:
Failures per year
Failures per million hours
FITs
λ = 5 / (50 x 5) = 0.02 failures per year
0.02 / 8760 = 2.28 x 10^-6 failures per hr
2.28 failures per million hrs = 2280 failures per billion hrs
λ = 2280 FITs
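The unit conversions in the solenoid example can be checked with a few lines of Python. This is a minimal sketch; the function name is hypothetical and 8760 hours per year is assumed.

```python
HOURS_PER_YEAR = 8760

def failure_rate_per_year(failures, units, years):
    """Failures divided by total unit-years of operation."""
    return failures / (units * years)

lam_year = failure_rate_per_year(failures=5, units=50, years=5)
lam_hour = lam_year / HOURS_PER_YEAR
per_million_hours = lam_hour * 1e6
fits = lam_hour * 1e9  # FIT = failures per billion hours

print(f"{lam_year} per year, {per_million_hours:.2f} per million hrs, {fits:.0f} FITs")
```

The unrounded result is about 2283 FITs; the slide rounds to 2.28 failures per million hours before converting, which gives 2280.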
MTTF = 1 / λ
(valid for single components or a series of components with a constant failure rate)
[Figure: timeline alternating between Success and Failure; each failure period is made up of Time to Detect Fault and Time to Repair Fault, with MTBF marked along the time axis.]
Example 1
An industrial I/O module has an MTTF of 87,600 hr. It takes an
average of 2 hr to repair the module. What is the MTBF?
MTBF = MTTF + MTTR = 87,600 + 2 = 87,602 hr
When repair time is short, MTBF is approximately equal to MTTF.
Example 2
An industrial I/O module has an MTTF of 87,400 hr. It takes an
average of 400 hr to repair the module. What is the MTBF?
MTBF = MTTF + MTTR = 87,400 + 400 = 87,800 hr
Reliability
Topics
Reliability / Unreliability
F(t) = P(T ≤ t)
Reliability / Unreliability
and MTTF = 1 / λ
Example
A pressure transmitter has an MTTF of 250 yrs. What is the failure rate in
failures per year and FITs?
The failure rate per year equals 1/MTTF = 1/250 = 0.004 failures per yr.
To convert to FITs, find failures per hr = 0.004/8760 = 4.57 x 10^-7.
This is 457 FITs (failures per billion hrs).
Example
A pressure transmitter has an MTTF of 250 yrs. What is the reliability for a
mission time of 5 years?
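Both transmitter examples can be worked through in a few lines. This sketch assumes a constant failure rate, so that λ = 1/MTTF and R(t) = e^(-λt); the variable names are hypothetical.

```python
import math

HOURS_PER_YEAR = 8760

mttf_years = 250
lam = 1 / mttf_years                   # failures per year
fits = lam / HOURS_PER_YEAR * 1e9      # failures per billion hours
reliability_5y = math.exp(-lam * 5)    # probability of surviving a 5 year mission

print(f"lambda = {lam} per year = {fits:.0f} FITs, R(5 yr) = {reliability_5y:.4f}")
```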
Useful Approximations
Some functions can be approximated by a series of other functions:
PF(t) = λt
Availability / Unavailability
A=1-U
Availability / Unavailability
Useful equations:
PFavg = λt / 2
Example
A transmitter has a failure rate of 0.005 failures per year. What is the
average probability of failure if the transmitter is 100% tested and
calibrated every two years?
PFavg = λt / 2
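The transmitter example can be evaluated directly, and the approximation compared with the exact time average of PF(t) = 1 - e^(-λt) over one test interval. This is a sketch; the function names are hypothetical.

```python
import math

def pfavg_approx(lam, test_interval):
    """Approximation used on the slide: PFavg = lambda * t / 2."""
    return lam * test_interval / 2

def pfavg_exact(lam, test_interval):
    """Time average of PF(t) = 1 - exp(-lambda * t) over one test interval."""
    x = lam * test_interval
    return 1 - (1 - math.exp(-x)) / x

approx = pfavg_approx(0.005, 2)  # 0.005 failures per year, tested every two years
exact = pfavg_exact(0.005, 2)
print(f"approx = {approx:.6f}, exact = {exact:.6f}")
```

For small values of λ multiplied by the test interval, as here, the two results differ by well under one percent.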
Questions ?
Class Exercise
Question Sheet – Failure Data
[Figure: Power Supply and Controller blocks, with a Markov model whose states are OK (0), DDN (1), DUN (2), Fail De-Energized (3) and Fail Energized.]
Markov Models
Looks at success and failure on one drawing. Flexible, solved for
probabilities as a function of time interval.
[Figure: reliability block diagram with AC Power (A), Motor (B) and Pump (C).]
Series System: AC Power (A), Motor (B)
$R_S = R_A \cdot R_B$
$A_S = A_A \cdot A_B$
[Figure: parallel system of two Power Supplies; series system of Power Supply (A) and Controller (B).]
Example:
[Figure: system with an Upper Leg and a Lower Leg built from Power Supply and Controller blocks.]
$R_{PS} = 0.6$
$R_C = 0.8$
(for a one year interval)
$R_{System}$?
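The series and parallel rules can be applied to this example in a few lines. This is a sketch under an assumption about the lost figure: each leg is taken to be a power supply in series with a controller, and the two identical legs are taken to be in parallel. The slide's own answer is not recoverable from the text, and the function names are hypothetical.

```python
import math

def series(*reliabilities):
    """All blocks must work: multiply the reliabilities."""
    return math.prod(reliabilities)

def parallel(*reliabilities):
    """At least one block must work: one minus the product of the unreliabilities."""
    return 1 - math.prod(1 - r for r in reliabilities)

R_PS, R_C = 0.6, 0.8          # one year reliabilities from the example
leg = series(R_PS, R_C)       # one power supply plus one controller
system = parallel(leg, leg)   # upper and lower leg (assumed identical)
print(round(leg, 4), round(system, 4))
```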
R = 0.8
= 0.999424
Example 1
The system will fail if any element fails, so use an OR gate.
Are the failures mutually exclusive?
Solution
System
The failures are not mutually exclusive Failure
0.02534
P(system failure)
= 0.02534
Example 1
System
Failure
0.02534
P(system failure)
= 0.0255
Example 2
A system has five components. All are needed for proper operation.
Component failure rates are:
λ AC POWER = 0.001 failures per year;
λ DC POWER SUPPLY = 0.04 failures per year;
λ LEVEL SWITCH = 0.1 failures per year;
λ TIME DELAY RELAY = 0.2 failures per year;
λ SOLENOID VALVE = 0.25 failures per year.
What is the probability of system failure for a one year time interval?
Example 2
System
Failure
0.4462
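The result of 0.4462 for Example 2 can be reproduced by summing the failure rates of the five components (all are needed, so the system behaves as a series system) and applying 1 - e^(-λt) for one year. This sketch assumes constant failure rates; the dictionary keys are just labels.

```python
import math

failure_rates_per_year = {
    "AC power": 0.001,
    "DC power supply": 0.04,
    "level switch": 0.1,
    "time delay relay": 0.2,
    "solenoid valve": 0.25,
}

lam_system = sum(failure_rates_per_year.values())  # series system: rates add
p_fail_one_year = 1 - math.exp(-lam_system * 1.0)

print(f"lambda = {lam_system:.3f} per year, P(failure in 1 yr) = {p_fail_one_year:.4f}")
```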
The frequency (F) at which a hazardous event will occur will be:
F = Fa x P1 x P2 x P3 x P4
For the system to fail, the initiating event has to happen AND protection
layer 1 has to fail AND protection layer 2 has to fail AND protection layer 3
has to fail AND protection layer 4 has to fail.
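The product above is straightforward to compute. In this sketch the initiating frequency and the four layer failure probabilities are made-up illustration values, not figures from the course, and the function name is hypothetical.

```python
import math

def hazardous_event_frequency(initiating_frequency, layer_failure_probabilities):
    """F = Fa x P1 x P2 x ...: every protection layer must fail on the same demand."""
    return initiating_frequency * math.prod(layer_failure_probabilities)

# Illustration only: initiating event once per 10 years, four layers that each fail 1 time in 10.
f_event = hazardous_event_frequency(0.1, [0.1, 0.1, 0.1, 0.1])
print(f"{f_event:.1e} events per year")
```

Multiplying the probabilities in this way treats the layers as independent of each other and of the initiating event.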
Markov Models
[Figure: Markov model with states OK (0), DDN (1), DUN (2), Fail De-Energized (3) and Fail Energized.]
Markov Models
Repairable system (repair transition probability $\mu \Delta t$)
Markov Models
redundancy.
Markov Models
$R = e^{-\lambda t}$
System Engineering
Reliability Block Diagrams
Fault Trees
Markov Models
Multiple Failure Modes
Questions ?
Class Exercise
Question Sheet – Reliability Engineering
Failure Modes
With a safety system, the concern is not how the system operates, but
how the system fails.
Systems can fail in two ways:
Safe Failures (λS), 60% of failures in the example shown
Dangerous Failures (λD), 40% of failures in the example shown
Failure Modes
Failures can be further divided into those that are detected and those
that are undetected
Safe failures can be divided into Safe Detected failures and Safe
Undetected failures
Dangerous failures can be divided into Dangerous Detected failures and
Dangerous Undetected failures
[Figure: the failures split into Safe Failures (λS, 60%) and Dangerous Failures (λD, 40%), each divided into detected and undetected parts.]
Safe Detected (λSD), Safe Undetected (λSU)
Dangerous Detected (λDD), Dangerous Undetected (λDU)
λS = λSD + λSU
λD = λDD + λDU
Failure Modes
The safety integrity level is derived from the Probability of Failure on
Demand (PFD)
The Probability of Failure on Demand (PFD) is derived from the
dangerous undetected failure rate
$PFD_{avg} = 1 - e^{-\lambda_{DU} \cdot TI / 2}$
Failure mode data can be derived from knowing the total failure rate and
the percentage of Safe or Dangerous failures
For example, if the total failure rate is 0.01 and the percentage Safe
failures is 85%, then failures can be split into
Safe Failures = 0.01 x 0.85 = 0.0085
Dangerous failures = 0.01 – 0.0085 = 0.0015
Failure mode data can be split further from knowing the Safe and
Dangerous failure rates and the diagnostic coverage factors (C) for Safe
and Dangerous failures
For example if the diagnostic coverage factor for Safe failures is 90%,
and for Dangerous failures is 60%, then the failures can be split into:
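The two-stage split described above can be sketched as a small function. The function name and dictionary keys are hypothetical; the numbers are the ones used in the example (total failure rate 0.01, 85% safe, diagnostic coverage 90% for safe and 60% for dangerous failures).

```python
def split_failure_rate(total, safe_fraction, coverage_safe, coverage_dangerous):
    """Split a total failure rate into SD, SU, DD and DU parts."""
    safe = total * safe_fraction
    dangerous = total - safe
    return {
        "SD": safe * coverage_safe,
        "SU": safe * (1 - coverage_safe),
        "DD": dangerous * coverage_dangerous,
        "DU": dangerous * (1 - coverage_dangerous),
    }

rates = split_failure_rate(0.01, 0.85, 0.90, 0.60)
for mode, value in rates.items():
    print(mode, round(value, 6))
```

The four parts always add back up to the total failure rate, which is a useful check on the split.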
$SFF = \frac{\lambda_{SD} + \lambda_{SU} + \lambda_{DD}}{\lambda_{Total}}$
Example
A valve has a failure rate of 0.05. Analysis indicates that the percentage
of safe failures is 80%, and diagnostics can detect 90% of safe failures,
but only 60% of dangerous failures. What is the safe failure fraction
(SFF) of the valve?
PFD is derived from failure rate, failure mode and test interval
Failure rate is divided into Safe Failures (failures that cause a false trip)
and Dangerous Failures (failures that can prevent operation)
For the purposes of safety and calculating PFDavg we are only
interested in the Dangerous Undetected Failures
Many databases that provide failure rates list the different failure modes
for an equipment item
An untested device’s PFD gets larger as the operational time interval
increases
For devices subject to periodic inspection and test, an average PFD can
be used
PFDavg ~ λDU∗TI/2
Example
A valve has a failure rate of 0.05. Analysis indicates that the percentage
of safe failures is 80%, and diagnostics can detect 90% of safe failures,
but only 60% of dangerous failures. The valve is tested every four
years. What is the PFDavg of the valve?
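The valve example can be worked through as follows. This is a sketch that assumes the failure rate of 0.05 is per year (the slide gives no unit) and uses the approximation PFDavg = λDU x TI / 2; it also evaluates the safe failure fraction as one minus the dangerous undetected share. The variable names are hypothetical.

```python
lam_total = 0.05          # assumed to be failures per year
safe_fraction = 0.80
coverage_dangerous = 0.60
test_interval_years = 4

lam_dangerous = lam_total * (1 - safe_fraction)
lam_du = lam_dangerous * (1 - coverage_dangerous)  # dangerous undetected
sff = (lam_total - lam_du) / lam_total             # everything except DU counts as safe
pfd_avg = lam_du * test_interval_years / 2

print(f"lambda_DU = {lam_du:.4f}, SFF = {sff:.2f}, PFDavg = {pfd_avg:.4f}")
```

Note that the diagnostic coverage of safe failures does not affect either result, because safe failures count towards the SFF whether or not they are detected.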
Failure Modes
Safe Failures
Dangerous Failures
Safe Failure Fraction
Questions ?
Class Exercise
Question Sheet –Failure Modes
SIL Verification Metrics
Table columns: Safety Integrity Level | Probability of failure on demand, average (Low Demand mode of operation) | Risk Reduction Factor
SIL Verification
The system fails if any one of the elements fails. The system is tested
once per year.
PF = λ (system) x TI = 0.0543 x 1 = 0.0543
PFDavg (SIF) = PFDavg (sensor) + PFDavg (logic solver) + PFDavg (final element)
Sensor Subsystems
PSH, Lambda D (λD)
PFDavg = λDU TI / 2
PFDavg = (0.000006 * 8760) x 1/2
PFDavg = 0.0263
RRF = 1/PFDavg = 38
Architectural Constraints
b) The behavior of the subsystem under fault conditions can be
completely determined; and
c) There is sufficient dependable failure data from field experience
to show that the claimed rates of failure for detected and
undetected dangerous failures are met.”
Examples of Type A devices:
Table rows as extracted (Safe Failure Fraction against Hardware Fault Tolerance 0, 1, 2):
60% < 90%    SIL 2   SIL 3   SIL 4
90% < 99%    SIL 3   SIL 4   SIL 4
> 99%        SIL 3   SIL 4   SIL 4
IEC 61508 Table 3, Type B
Safe Failure Fraction    Hardware Fault Tolerance
                         0       1       2
< 60%                    NA      SIL 1   SIL 2
Architectural Constraints
b) The behavior of the subsystem under fault conditions cannot be
completely determined; or
c) No dependable failure data from field experience exists for
the subsystem, sufficient to show that the required target failure is
met.”
Examples of Type B devices:
Transmitters
Table rows as extracted (Safe Failure Fraction against Hardware Fault Tolerance 0, 1, 2):
60% < 90%    SIL 2   SIL 3   SIL 4
90% < 99%    SIL 3   SIL 4   SIL 4
> 99%        SIL 3   SIL 4   SIL 4
IEC 61508 Table 3, Type B
Safe Failure Fraction    Hardware Fault Tolerance
                         0       1       2
< 60%                    NA      SIL 1   SIL 2
Assume from the previous tables that these subsystems meet the
following requirements:
Subsystem 1, Type A, HFT = 0, SFF = 50%. Meets SILac ?
Subsystem 2, Type B, HFT = 1, SFF = 80%. Meets SILac ?
Subsystem 3, Type B, HFT = 0, SFF = 70%. Meets SILac ?
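The three checks can be worked through with a small lookup. This sketch assumes the full architectural constraint tables of IEC 61508-2 (maximum SIL by safe failure fraction and hardware fault tolerance, for Type A and Type B subsystems). Only fragments of those tables survive on the slides, so the values in `TYPE_A` and `TYPE_B` are an assumption to be checked against the standard. The function name is hypothetical.

```python
# Each row: (upper SFF bound in percent, max SIL for HFT 0, 1, 2). None means not allowed.
TYPE_A = [(60, (1, 2, 3)), (90, (2, 3, 4)), (99, (3, 4, 4)), (float("inf"), (3, 4, 4))]
TYPE_B = [(60, (None, 1, 2)), (90, (1, 2, 3)), (99, (2, 3, 4)), (float("inf"), (3, 4, 4))]

def max_sil(device_type, hft, sff_percent):
    """Highest SIL the architecture can claim for a subsystem."""
    table = TYPE_A if device_type == "A" else TYPE_B
    for upper_bound, sils in table:
        if sff_percent < upper_bound:
            return sils[hft]

print(max_sil("A", 0, 50))  # first subsystem in the exercise
print(max_sil("B", 1, 80))  # second subsystem
print(max_sil("B", 0, 70))  # third subsystem
```

The overall architectural constraint for a SIF is then limited by the lowest of its subsystem results.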
Field devices are the most critical, and probably the most neglected
elements in safety systems
Field devices provide input information to the logic solver, and carry out
the trip function when the logic solver demands it
Field devices typically contribute considerably more to the PFDavg
value than the logic solver, and therefore have the greatest potential to
create problems
Equipment                  Fail to Danger Rate   PFDavg   PFDavg
                           (per year)                     % Contribution
Sensor                     0.05                  0.025    42
Logic System (4 relays)    0.01                  0.005    8
Solenoid and Valve         0.06                  0.03     50
Total                      0.12                  0.06     100
Field devices (sensor plus solenoid and valve): 92%
Common Cause
The most common methods used to reduce the effect of common cause
problems are separation and diversity.
Examples
Redundant transmitters could be diverse in either process
measurement (one could measure pressure, the other could measure
temperature), or in technology, or both
Separate routing for field cables from redundant devices
Separate process connections for redundant devices
Ensuring test equipment is not faulty. For example, a faulty calibrator
may mean that multiple devices have been calibrated incorrectly
Others?
Diagnostics
Automatic diagnostics are available in most microprocessor-based
systems such as PLCs, and some smart field devices
The effect of diagnostics is to detect faults so that appropriate action
can be taken – such as initiate an alarm or a shutdown
When dangerous faults are detected they can be recognised and
converted to safe failures. The effect of this is to improve the SFF
Questions ?
Class Exercise
Question Sheet – SIL Verification
Analysis Models
Modelling Formula
The equation groups given in this section apply to the individual
subsystems as well as the whole SIF
Failures
Equation Group 1
Leading to loss
of production Detectable Undetectable
Modelling Formula
Equation Group 2 is used to calculate the PFDavg for any system (or
part of system) with automatic diagnostics. Note that the test interval for
automatic diagnostics TIa is very short, and should be no more than
50% of the process safety time (fault tolerant time of the process).
Typically, TIa would be in the range of 1 to 10 seconds.
Equation Group 3 is used to calculate the PFDavg for any system with
manual proof testing. Note that the manual proof test interval has to be
short compared with the MTBF of the system.
Modelling Formula
Detectable by Detectable by
Self Diagnostics Manual Proof Testing
Group 1 2 3
Analysis Models
Using the equations the PFDavg for each subsystem and the SIF can
be calculated.
Example of a Single Channel Model (using example values)
TI = 1 year; TIa = 1 hr; MTTR = 10 hrs
The relative values of the different subsystems allow us to detect weak areas
Sensor
Logic Solver
Actuator (Final Element)
SIF
Sensor Actuator
Logic
Sensor Actuator
Next we calculate the safe and dangerous trip rates for a single channel
of the sensor subsystem, and then combine both sensors.
To do this we split the sensor subsystem into components
Sensor 1 Actuator
Logic
Sensor 2 Actuator
Transmitter Barrier
We can now use the failure rate data and failure mode data for each
Using the failure rate data and failure mode data (percentage split
between safe and dangerous failures) for each component, we split the
overall failure rate into safe and dangerous failures.
Transmitter Barrier
λ1 λ2
λd
Automatic Test Manual Test
(1-ß) * λd
ß * λd
(1-ß) * λd
As common cause failures are common to all channels, they are placed
in series with the redundant channels.
We now need to account for the common cause fraction in the model.
Sensor 1 Actuator
Sensor 2 Actuator
Redundant Section:
$PFD_{avg}(R) = 2 \cdot C \cdot [(1-\beta)\lambda_d]^2 \cdot (MTTR)^2 + (1-C) \cdot [(1-\beta) \cdot \lambda_d \cdot TI]^2 / 3$
Sensor Subsystem:
$PFD_{avg} = PFD_{avg}(R) + PFD_{avg}(C)$
Sensor Actuator
Sensor Actuator
We may have achieved the PFDavg values we wanted, but it may have
been at the cost of a higher spurious trip rate.
The next step in our model should be to determine the effect on
spurious trip rate. This time we use equation group 1 for λs.
Sensor Actuator
Sensor Actuator
Sensor Actuator
Sensor Actuator
1oo2 2oo3
the complete SIF – it may not have much effect on the overall spurious
trip rate.
Reliability Equations
Building a Model
Effect of Diagnostics
Effect of Common Cause
Questions ?
Class Exercise
Question Sheet – Analysis Models
Operations Phase
Decommission
Functional Testing
MOC required
(Section 11)
Functional Testing
Maintenance
Management of Change
Modifications
Decommissioning
Questions ?
Class Exercise
Question Sheet – Operation Phase
Functional Safety Management
[Figure: safety lifecycle, with Management of Functional Safety, Documentation and Verification shown alongside the phases.]
1. Concept
2. Overall scope definition
3. Hazard and risk analysis
5. Safety requirements allocation
6. Overall operation and maintenance planning
7. Overall safety validation planning
8. Overall installation and commissioning planning
9. SRS E/E/PES realization
16. Decommissioning or disposal
Objectives
Safety Planning
Analyze
Hazard Analysis /
Risk Assessment: Document
Define Design Targets
Evaluate Design:
Verify Reliability Analysis of Safety
Document
Integrity & Availability
Modify
OK
Personnel Competency
Competency Certification
CFSE + Exida
FSEng + TUV
ISA SIS Certification
Safety Lifecycle Documentation
Topics :
Documentation must:
Safety requirements: Specification (overall safety requirements, comprising overall safety functions and overall safety integrity)
Safety requirements allocation: Description (safety requirements allocation)
Operation and maintenance planning: Plan (overall operation and maintenance)
Safety validation planning: Plan (overall safety validation)
Ensures and justifies that safety requirements are met with brief details
of:
Safety analysis
Verification & validation
Documentation to be generated
Brief description of the intended testing and validation activities
Factory and site acceptance tests
Tests for unexpected behavior
Regression testing
Management procedures that will be applied
Quality management
Configuration management
Recording mechanisms
Detection
Documentation Summary
Post-Instructional Survey
Answer the questions to the best of your ability
The results will help the instructor improve the course
30 minutes
Thank you
Questions: Please send any questions to
ray.wright@optusnet.com.au
We will respond as soon as possible.