
Fault Tree Development Minimum Cut Set Finding

• Top-down approach

• Bottom-up approach


Procedure for top-down approach

• Uniquely identify all gates and basic events.
• Place the top gate in the first row of a matrix.
• Replace all gates by basic events, using (a) or (b):
  (a) Replace an “OR” gate by a vertical arrangement (each input starts a new row).
  (b) Replace an “AND” gate by a horizontal arrangement (inputs are listed in the same row).
• Delete all supersets (sets that contain another set as a subset).

Example. Starting from the top gate G1, each gate is successively replaced by its input events (“AND” produces a horizontal arrangement, “OR” a vertical one):

G1 (top gate)
G2, G3
A, G3
G4, G3
A, C
A, G5
B, G3
C, G3
A, C
A, B
B, C
B, G5
C, C
C, G5
A, C
A, B
B, C
A, B
C          (by Boolean algebra, A × A ≡ A, so C, C reduces to C)
C, A, B

The cut sets for the tree are {C}, {A, B}, {A, C}, {B, C}, {A, B, C}; after deleting duplicates and supersets (every set containing C is a superset of {C}), the minimal cut sets are {C} and {A, B}.
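The top-down expansion can be sketched as a short program. The gate definitions below are a small hypothetical tree (the slides’ full gate structure is not reproduced here), chosen so that its minimal cut sets come out as {C} and {A, B}; the `GATES` layout is an illustrative assumption.

```python
from itertools import product

# Hypothetical fault tree: top gate G1 = OR(G2, C), G2 = AND(A, B).
GATES = {
    "G1": ("OR",  ["G2", "C"]),
    "G2": ("AND", ["A", "B"]),
}

def cut_sets(node):
    """Top-down expansion: return the list of cut sets below `node`."""
    if node not in GATES:                       # basic event
        return [frozenset([node])]
    kind, inputs = GATES[node]
    expanded = [cut_sets(child) for child in inputs]
    if kind == "OR":                            # OR gate: vertical arrangement (new rows)
        return [cs for child in expanded for cs in child]
    # AND gate: horizontal arrangement (one choice per input merged into a row);
    # the frozenset union applies the Boolean identity A * A = A automatically.
    return [frozenset().union(*choice) for choice in product(*expanded)]

def minimal_cut_sets(top):
    """Delete duplicates and supersets, leaving only the minimal cut sets."""
    candidates = set(cut_sets(top))
    return {s for s in candidates if not any(other < s for other in candidates)}
```

For this tree, `minimal_cut_sets("G1")` yields the two sets {A, B} and {C}.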

Bottom-up approach

Similar to the top-down approach, except that it starts with the gates containing only basic events:

1) Generate two columns: one for gates and the other for cut sets.
2) Start with gates that have only basic events as inputs.
3) Generate the cut sets for each of these gates in the table.
4) For an “OR” gate, use the union rule and represent the basic events separately. Example: A OR B = (A), (B).
5) For an “AND” gate, use the intersection rule and put the events into the same set.

Example: applying the bottom-up approach to the same fault tree as above produces the cut sets {C}, {A, B}, {A, C}, {B, C}, {A, B, C}; after deleting supersets, the minimal cut sets are again {C} and {A, B}.
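The bottom-up table-building can be sketched in the same way: resolve gates whose inputs are already in the table, leaves first. The tree below is a small hypothetical one (G1 = OR(G2, C), G2 = AND(A, B)); gate names are illustrative assumptions.

```python
from itertools import product

GATES = {"G2": ("AND", ["A", "B"]), "G1": ("OR", ["G2", "C"])}

def bottom_up_cut_sets(gates):
    """Build a gate -> cut-set-list table, starting from gates with only resolved inputs."""
    table = {}
    def done(n):
        return n in table or n not in gates         # resolved gate or basic event
    while len(table) < len(gates):
        progressed = False
        for g, (kind, inputs) in gates.items():
            if g in table or not all(done(i) for i in inputs):
                continue
            child = [table[i] if i in table else [frozenset([i])] for i in inputs]
            if kind == "OR":                        # union rule: list entries separately
                table[g] = [cs for col in child for cs in col]
            else:                                   # AND: intersection rule, same set
                table[g] = [frozenset().union(*pick) for pick in product(*child)]
            progressed = True
        if not progressed:
            raise ValueError("cyclic gate definitions")
    return table

sets_ = set(bottom_up_cut_sets(GATES)["G1"])
mcs = {s for s in sets_ if not any(o < s for o in sets_)}   # delete supersets
```

As with the top-down sketch, `mcs` here comes out as {A, B} and {C}.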


Top-event probability estimation

Probabilistic analysis using basic failure data:

• Gate-by-gate approach: straightforward and simple. Uses the union rule (OR gates) and the intersection rule (AND gates) to calculate the top-event probability.

• Cut-set approach: the quickest method, applicable when the fault tree is large and the failure rates/failure probabilities of the basic events are small. Under the rare-event approximation,

  P_TOP ≈ Σ_j P(C_j),  with  P(C_j) = Π_{i ∈ C_j} q_i

  where P_TOP is the probability of the top event, P(C_j) is the probability of minimal cut set C_j, and q_i (i = 1, 2, …, n) is the failure probability of the corresponding component or basic
event.
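The cut-set approach can be sketched in a few lines. The cut sets {A, B} and {C} and the q values below are illustrative assumptions, not data from the slides.

```python
# Rare-event approximation: P_TOP is approximated by the sum of the minimal cut
# set probabilities, each being the product of its basic-event probabilities.
q = {"A": 0.1, "B": 0.2, "C": 0.05}
minimal_cut_sets = [frozenset({"A", "B"}), frozenset({"C"})]

def cut_set_prob(cs):
    p = 1.0
    for event in cs:
        p *= q[event]          # intersection (AND) rule over the events in one cut set
    return p

p_top = sum(cut_set_prob(cs) for cs in minimal_cut_sets)   # 0.1*0.2 + 0.05 = 0.07
```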

Importance factor estimation

• Basic-event (component) importance BI_i: the sum of the probabilities of occurrence of all cut sets containing the basic event (component) i, divided by the total probability of occurrence for the system. The Σ in the equation runs over all cut sets that contain basic event i as one of their basic events:

  BI_i = Σ_{j: i ∈ C_j} P(C_j) / P_TOP

• Cut-set importance CI_j: the ratio of the cut-set characteristic to the system
characteristic:

  CI_j = P(C_j) / P_TOP
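Both importance measures follow directly from the cut-set probabilities. The numbers below reuse the same hypothetical q values as in the earlier probability sketch.

```python
# Importance measures on illustrative (hypothetical) basic-event probabilities.
q = {"A": 0.1, "B": 0.2, "C": 0.05}
cut_sets = [frozenset({"A", "B"}), frozenset({"C"})]

def cs_prob(cs):
    p = 1.0
    for e in cs:
        p *= q[e]
    return p

p_top = sum(cs_prob(cs) for cs in cut_sets)

# Basic-event importance: fraction of the top-event probability carried by the
# cut sets that contain event i.
BI = {e: sum(cs_prob(cs) for cs in cut_sets if e in cs) / p_top for e in q}

# Cut-set importance: cut-set probability over the top-event probability.
CI = {cs: cs_prob(cs) / p_top for cs in cut_sets}
```

Here BI for C is 0.05/0.07 ≈ 0.71, i.e. the single-event cut set {C} dominates the system risk.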


Event Tree Analysis

• An inductive procedure that maps all possible outcomes resulting from an initiating event (any accidental release or occurrence), e.g., gas leakage, equipment failure, or human error.
• It determines the probability of the various outcomes (final consequences) resulting from the initiating event.

The steps of ETA:

• Identification of the initiating event
• Identification of safety functions. Examples:
  • Automatic shutdown
  • Alarms which alert the operator
  • Operator actions in response to alarms
  • Barriers or containment systems to limit the effects of the initiating event
The steps of ETA (continued):

• Qualitative evaluation of safety functions: a qualitative judgment such as success/failure, true/false, or yes/no, used to evaluate the safety functions or the events’ consequences in the different branches of the tree.


Construct the Event Tree

a. Enter the initiating event and safety functions.

Example safety functions (listed in the order in which they are intended to occur):
• Oxidation reactor high-temperature alarm alerts the operator at temperature T1.
• Operator reestablishes cooling water flow to the oxidation reactor.
• Automatic shutdown system stops the reaction at temperature T2 (T2 > T1).

Initiating event: loss of cooling water to the oxidation reactor.

[Figure: first step in constructing the event tree — the initiating event on the left, followed by the three safety functions as column headings]

Construct the Event Tree

b. Evaluate the safety functions.

Each safety function is evaluated in turn: a branch point splits the accident path into a Success branch and a Failure branch of that function. If a safety function does not affect the course of the accident, the accident path proceeds with no branch point to the next safety function.

[Figures: representation of the first and second safety functions in the event tree]

b. Evaluate the safety functions (completed event tree).

Describe the Accident Sequences

With A the initiating event (loss of cooling water to the oxidation reactor) and a letter in a sequence denoting failure of that safety function (B: high-temperature alarm, C: operator reestablishes cooling water flow, D: automatic shutdown system), the completed tree yields:

A — Safe condition, return to normal operation
AC — Safe condition, process shutdown
ACD — Unsafe condition, runaway reaction, operator aware of problem
AB — Unstable condition, process shutdown
ABD — Unsafe condition, runaway reaction, operator unaware of problem

[Figures: completed event tree; accident sequences]
The computational sequence across a safety function

Initiating event frequency: 0.5 occurrences/yr. Safety function failure probability: 0.01 failures/demand.

• Success of safety function: (1 − 0.01) × 0.5 = 0.495 occurrences/yr
• Failure of safety function: 0.01 × 0.5 = 0.005 occurrences/yr

Figure 11-10: The computational sequence across a safety function in an
event tree.

Bow-Tie Model
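The computational sequence of Figure 11-10 is a repeated split of the incoming frequency; chaining splits gives full accident-sequence frequencies. The 0.1 value for a second safety function below is a hypothetical number for illustration.

```python
# The frequency entering a safety function splits into success and failure branches.
def split(frequency, p_fail):
    """Return (success, failure) branch frequencies across one safety function."""
    return frequency * (1.0 - p_fail), frequency * p_fail

# Numbers from Figure 11-10: 0.5 occurrences/yr, 0.01 failures/demand.
success, failure = split(0.5, 0.01)   # 0.495 and 0.005 occurrences/yr

# Chaining: frequency of the sequence in which a (hypothetical) second safety
# function with p_fail = 0.1 also fails.
_, failure2 = split(failure, 0.1)     # 0.005 * 0.1 = 0.0005 occurrences/yr
```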


Bow-TieXP by CGERISK

Bayesian network – how does it work?

A Bayesian network combines data with expert knowledge and supports diagnosis, decision making, and optimization.

Bayesian Network Software for BN-based approaches

• GeNIe Modeler by BayesFusion, LLC (free for academic use)
  (https://www.bayesfusion.com/)
• AgenaRisk for risk analysis (https://www.agenarisk.com/)
• HUGIN Expert (https://www.hugin.com/)

Khakzad et al. (2013) Dynamic safety analysis of process systems by mapping bow-tie into Bayesian network, PSEP, 91, 46-53.

Dynamic Risk Modelling

• Risk is estimated and updated at every project stage over time: conceptual design → FEED → detailed design → installation → operation.
• Dynamic Risk = F{s(c, p, k), t}

Accident Precursor Concept for risk

Escalating precursors, from causes to consequences:
• Safe (normal) state
• Near miss – gas leaked but vented to a safe location
• Mishap – gas leaked and vented onto the rig
• Incident – small-scale fire
• Accident – fire and major accidents

Yang, M., Khan, F., Lye, L. (2013). Precursor-based hierarchical Bayesian approach for rare event frequency estimation: a case of oil spill accidents. Process Safety and Environmental Protection, 91(5), 333-342.
Dynamic risk assessment framework

Unit selection & scenario identification → consequence assessment → probability assessment → posterior risk estimation:
• Consequence assessment: asset loss, human fatality, environmental loss, and business loss, each expressed as an estimated dollar value.
• Probability assessment: a prior failure probability from design-stage data is combined with a likelihood function built from real-time process data to obtain the posterior failure probability via Bayesian theory.

Yang, M., Khan, F., Amyotte, P. (2015). Operational risk assessment: a case of Bhopal disaster. Process Safety and Environmental Protection, 97, 70-79.

Consequence Analysis

• Purpose: to assess the extent of damage.
• Typical hazards: toxic release, fire and explosion.
• Modelling of the hazard scenario:
  – Toxic release: source (release) model, dispersion
  – Fire and explosion: source model, fire and explosion models, heat dispersion
  – Fatality assessment: probit analysis
  – Nonfatal consequences: skin burn, property damage
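The prior → likelihood → posterior step of the framework can be illustrated with a conjugate Beta-Binomial update. This is a simplifying assumption made here for illustration (the cited work uses full Bayesian-network models), and all numbers are hypothetical.

```python
# Conjugate Beta-Binomial sketch of Bayesian failure-probability updating.
alpha, beta = 1.0, 99.0      # prior failure probability ~ Beta(1, 99): mean 0.01 (design-stage estimate)
failures, demands = 3, 50    # hypothetical real-time process data (Binomial likelihood)

# Conjugacy: the posterior is Beta(alpha + failures, beta + successes).
alpha_post = alpha + failures
beta_post = beta + (demands - failures)
posterior_mean = alpha_post / (alpha_post + beta_post)   # updated failure probability
```

Here the observed data pull the failure probability estimate up from the 0.01 prior mean to 4/150 ≈ 0.027.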

Loss functions – an example

• The loss function of the reactor can take different forms, e.g., asymmetric inverted normal loss functions and quartic loss functions.

CFD Modeling

• The common methods of estimating the overpressure caused by an explosion (the Multi-Energy method and the TNT-Equivalence method) assume that the blast is similar in all directions, with no directional effects. These methods do not take into account factors such as:
  – Directional effects
  – Focusing effects
  – Reflection effects
  – Factors related to the source of the explosion (e.g., initial strength, shape)
  Thus, Computational Fluid Dynamics (CFD) modeling has been introduced to allow better predictions of the strength of blast waves generated by gaseous explosions.

• Computational Fluid Dynamics (CFD): a powerful and useful tool for predictive analysis of the flow, mass, momentum, and energy associated with dispersion and explosion phenomena.

FLACS – a CFD fire and explosion software.


Phast by DNV

QRA software overview
Risk Estimation and Evaluation

• Purpose: to assess risk and make safety judgments.
• Methods:
  – Individual risk
  – Societal risk
• Tolerability criteria

Risk Management

• Purpose: to propose mitigating measures that reduce the potential impact of the hazard and possibly reduce the risk level.
• Methods:
  – Safe work procedures at every project stage
  – Emergency response management
  – Emergency response procedures

Case study: a risk-based winterization approach for vessel operations in Arctic environments

• An approach is needed to answer:
  – What are the environmental loading and the operating envelope?
  – How much winterization is required?
  – What technology will be most appropriate?
  – What is the residual risk after winterization?

Risk-based Approach

• Decision structure: available options of engineering design and technology; related information.
• Risk Assessment: establish the scope of the analysis; generate risk-related information.
• Risk Management: risk acceptance criterion; assess possible risk management options; optimize risk reduction options.
• Impact Assessment: monitor the effectiveness of the decisions or actions made through the risk-based approach; assess possible response strategies.

Risk Communication underpins all of the above.
The Proposed Methodology

1. Select a system.
2. Environmental load modeling.
3. Take one loading scenario.
4. Probability assessment and consequence assessment.
5. Risk estimation.
6. Does the risk exceed the acceptable level? If no: safe, do not apply winterization. If yes: winterization method selection → risk-based decision analysis → apply winterization → residual risk assessment.
7. If the residual risk still exceeds the acceptable level, redesign the winterization plan; otherwise move to the next loading scenario until all loading scenarios are analyzed. The system is then safe.

Step 1.0 Environmental Load Modelling

• Loading scenarios are defined in two dimensions:
  – Annual extreme low temperatures
  – Various consecutive hours of exposure
  e.g., the probability of a temperature between −40 and −30 °C for 24 consecutive hours.
• Through statistical analysis, probability distributions of the environmental load can be determined [presented by Dr. Lye].

Step 2.1 Probability of Failure Assessment

• Limit state function: g(x) = ΔT_actual − ΔT_limit
  (1) ΔT_actual: the difference between the environmental load and the operating envelope
  (2) ΔT_limit: the acceptable temperature difference
• Probability of Failure (PoF):
  – Assume ΔT_actual > ΔT_limit is a failure state
  – Estimate Pr(g(x) > 0) based on the environmental load
  – Rank the probability [Definitely, Likely, Occasional, Seldom, Unlikely]

Step 2.2 Consequence Assessment

• Losses due to the failure are assessed in terms of dollar value and injury/fatality:

Importance class | Description | Financial loss | Injury/fatality | Severity value
Critical | Failure causes the system to stop functioning | > $1 million | One or more fatalities | 8–10
Important for good operation | Failure causes impaired performance and adverse consequences | > $0.5 million | Permanent injury or fatality | 6–8
Required for good operation | Failure may affect the performance and lead to subsequent failure of the system | > $0.2 million | Serious injury requiring weeks to recover | 4–6
Part of good operation | Failure may not affect the performance immediately, but prolonged failure may lead to failure of the system | > $10,000 | Injury requiring rest and recovery | 2–4
Optional for operation | Failure may not affect the performance of the system | < $10,000 | First aid | 0–2
Step 2.3 Risk Estimation

• Risk = probability of failure × losses
• A risk matrix is used to define risk levels.

Step 3.0 Applying Winterization

• Winterization methods:
  – Insulation
  – Heat tracing: electric and steam
  – Air bubbler and circulation
  – Ice-repellent coatings
  – Deicing and anti-icing chemicals
  – …
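The risk-matrix lookup in Step 2.3 can be sketched as a small function. The slides reference a matrix without reproducing it, so the score thresholds below are assumptions; only the probability rank names come from the slides.

```python
# Hypothetical risk matrix: probability rank x severity (0-10) -> risk level.
PROB_RANKS = ["Unlikely", "Seldom", "Occasional", "Likely", "Definitely"]

def risk_level(prob_rank, severity):
    """Combine a probability rank with a 0-10 severity value into a risk level."""
    score = (PROB_RANKS.index(prob_rank) + 1) * severity
    if score >= 20:            # assumed threshold for "high"
        return "high"
    if score >= 8:             # assumed threshold for "medium"
        return "medium"
    return "low"
```

With these assumed thresholds, `risk_level("Definitely", 4)` gives "high" and `risk_level("Unlikely", 4)` gives "low", consistent with the worked example later in the slides.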


Step 4.0 Residual Risk Assessment

• Probability of failure (PoF):
  – g(x) = ΔT_actual − E − ΔT_limit, where E is the efficacy of the winterization in reducing the temperature difference
  – Calculate Pr(g(x) > 0) based on the same environmental load
  – Rank the probability
• Consequence assessment: the same severity value will be used
• Evaluate the risk level using the risk matrix

An example

• System: a pipeline on deck
  – Temperature data from Barrow Station, Alaska
  – The operating envelope (T_op) also follows a normal distribution, with a mean temperature to be maintained of 10 °C and a possible variation of 4 °C
  – The maximum allowable temperature difference is ΔT_limit = 25 °C
• Step 1.0 Environmental load
  – The load (extreme low temperature for a 24-hour duration, 100-year return period) follows a normal distribution with mean −45.8 °C and standard deviation 1.1 °C
• Step 2.1 Probability of failure assessment
  – ΔT_actual = |load − T_op|, which then follows a normal distribution with mean 55.8 and standard deviation 4.2
  – PoF = Pr(ΔT_actual > 25) = 1.00 [100% chance of exceeding the allowable temperature difference]
• Step 2.2 Consequence assessment
  – Severity value = 4 (failure may affect the performance and lead to subsequent failure of the system)
• Step 2.3 Risk estimation
  – The risk is considered “high” according to the risk matrix; winterization must be applied
• Step 3.0 Apply winterization
  – Electric heat tracing, option 1, providing Q = 8 W per foot
  – Efficacy (E) in terms of temperature difference: E = ΔT = Q·ln(Do/Di)/(2π·k) ≈ 43 °C, where 2π is part of the formula for the surface area of a cylinder; Do is the outer insulation diameter (6.5 in); Di is the inner insulation diameter (4.5 in); and k is the conductivity factor, 0.25 BTU·in/(hr·ft²·°F) (fiberglass)
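The numbers above can be checked end to end. A minimal sketch, assuming the “possible variation of 4 °C” is a standard deviation and that ΔT_actual is well approximated as normal; the unit conversions (W → BTU/hr, BTU·in → BTU·ft, °F → °C difference) are made explicit.

```python
import math

def norm_sf(x, mu, sigma):
    """P(X > x) for X ~ Normal(mu, sigma)."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))

# Slide values: environmental load and operating envelope, both modeled as normal.
mu_load, sd_load = -45.8, 1.1        # deg C, 24 h extreme low, 100-yr return period
mu_top, sd_top = 10.0, 4.0           # deg C; "possible variation of 4 C" taken as std dev
dT_limit = 25.0                      # deg C, maximum allowable temperature difference

# deltaT_actual = |load - T_op|; with these means T_op - load is essentially always
# positive, so deltaT_actual is approximately Normal with:
mu_dT = mu_top - mu_load             # 55.8
sd_dT = math.hypot(sd_load, sd_top)  # about 4.15 (the slides round to 4.2)

pof = norm_sf(dT_limit, mu_dT, sd_dT)       # about 1.0 -> winterization required

# Heat-tracing efficacy E = Q ln(Do/Di) / (2 pi k), with unit conversions.
Q = 8.0 * 3.412                      # 8 W/ft expressed in BTU/hr per ft of pipe
Do, Di = 6.5, 4.5                    # insulation diameters, inches (only the ratio matters)
k = 0.25 / 12.0                      # BTU-in/(hr ft^2 F) -> BTU/(hr ft F)
E_degF = Q * math.log(Do / Di) / (2.0 * math.pi * k)   # about 77 F of temperature difference
E_degC = E_degF * 5.0 / 9.0          # about 43 C, matching the slides

# Residual PoF after winterization: g = deltaT_actual - E - dT_limit.
pof_residual = norm_sf(dT_limit + E_degC, mu_dT, sd_dT)  # order 1e-3, as on the slides
```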


• Step 4.0 Residual risk assessment after winterization
  – PoF = 0.001 [0.1% chance of exceeding the maximum allowable temperature difference]; severity value = 4
  – The risk is considered “low” according to the risk matrix
  – If the risk still exceeds the acceptable level, the winterization method needs to be redesigned

Example of RBW – Task 1

• The vessel’s exterior stairs do not meet the LTE guide requirement of ~35 deg. It was confirmed that the stairs cannot be altered due to space restrictions.
• A qualitative risk assessment was carried out comparing the two ships, using the history of stair icing for the Earl Windsor.
Example of RBW – Task 2

• Comment No. 25 – Fire and wash water lines pass through the car deck (42.4 mm).
1) Set the minimum insulation thickness
2) Set the acceptable risk level
3) Take the environmental loading
4) Adjust the insulation thickness and heat tracing to the minimums that meet the acceptable risk level

Take-aways

• Quantitative risk assessment (QRA) requires a variety of models to obtain “pure” quantitative results.
• Much attention and effort have been devoted to probability assessment; consequence (loss) modeling needs more investigation.
• QRA can help decision-makers understand the value of safety interventions and optimize resource allocation.


Thank you!

Questions?

Human error: Classification and Quantification


Ming Yang, PhD, P.Eng.
Assistant Professor
Safety and Security Science Section,
Faculty of Technology, Policy, and Management,
TU Delft, the Netherlands
Email: m.yang-1@tudelft.nl
Part A: Human factors and Errors

Human Factor

• Human factor is the term used to describe the interaction of individuals with:
  • each other
  • facilities and equipment
  • management systems
• Human factors analysis focuses on how these interactions contribute towards the creation of a safe workplace.

The three interacting elements:
• People: characteristics, human behavior, fitness, reliability, stress, fatigue
• Facilities and equipment: ergonomics, workspace, design
• Management systems: procedure, training, leadership, commitment

*Source: OGP guidelines on “Human factors: a means of improving HSE performance”.

Why should we care?

Benefits of taking human factors into account:
• Fewer accidents/near misses
• A safer workplace
• Improved efficiency (reduced downtime)
• Lower lifetime costs (maintenance is cheaper, and re-engineering is not needed)
• A more productive workforce

[Figure: the rate of incidents decreases over time with improved engineering, then improved safety management, then the incorporation of human factors]

*Source: OGP guidelines on “Human factors: a means of improving HSE performance”.

Human Performance Shaping Factors (PSF)

• Factors that specifically degrade or improve human performance.

*Source: Hughes, G., & Kornowa-Weichel, M. (2004). Whose fault is it anyway?: A practical illustration of human factors in process safety. Journal of Hazardous Materials, 115(1), 127-132.

Human Error

An inappropriate or undesirable human decision or behaviour that reduces, or has the potential for reducing:
• effectiveness
• safety
• system performance

Human Error Classification

Why is classification important?
• For human error identification
• For human error quantification
• For preventing errors from happening (how to make the best use of limited resources)

Different classification approaches:
• Discrete action classifications
• Information processing classifications
• Behavior type based classifications

Discrete action classification

Very simple but powerful (Swain & Guttmann, 1983):
• Error of omission – acts not carried out
• Errors of commission – acts carried out either inadequately, in the wrong sequence, or too early/too late
• Extraneous act – a wrong/unrequired act performed
Information processing classification

• Follows the scheme of information processing assumed to occur when humans operate and control systems such as aircraft, ships, and power plants (Rouse & Rouse, 1983).
• Information processing scheme:
  • The operator observes the state of the system
  • Formulates a hypothesis
  • Chooses a goal
  • Selects a procedure to achieve the desired goal
  • Executes the procedure
• Specific categories of errors can occur at each stage (e.g., incorrect interpretation of the state of the system).

Behavior type based classification

• Skill-based (slips and lapses)
  • controlled by sub-conscious behavior and stored patterns of behavior
  • usually errors of execution
• Rule-based (RB mistakes)
  • applies to familiar situations – stored rules are applied
  • errors involve recognizing the salient features of the situation
• Knowledge-based (KB mistakes)
  • errors result from inadequate analysis or decision making
Human Error Modeling

• Skill-rule-knowledge based model (Rasmussen, 1981)
• Generic error modelling system (Reason, 1987)

Frequency of different error types

• In raw frequencies, SB >> RB > KB:
  • 61% of errors are at the skill-based level
  • 27% of errors are at the rule-based level
  • 11% of errors are at the knowledge-based level
• But if we look at opportunities for error, the order reverses:
  • humans perform vastly more SB tasks than RB, and vastly more RB than KB
  • so a given KB task is more likely to result in error than a given RB or SB task

Error detection and correction

• Three modes of error detection:
  • Self-monitoring: periodic attentional checks, measurement of progress toward the goal, discovery of surprise inconsistencies
  • Environmental error cueing
  • Error detection by other people
• Effectiveness of self-detection of errors:
  • SB errors: 75–95% detected, average 86%
  • RB errors: 50–90% detected, average 73%
  • KB errors: 50–80% detected, average 70%
• Including correction tells a different story:
  • SB: ~70% of all errors detected and corrected
  • RB: ~50% detected and corrected
  • KB: ~25% detected and corrected

Dealing with human error

Three generic design approaches for dealing with human error:
• Exclusion designs – make the error impossible
• Prevention designs – make the error difficult but not impossible
• Fail-safe designs – reduce the consequences but not the possibility of the error
Human error and accident theory

• Major systems accidents (“normal accidents”) start with an accumulation of latent errors.
• Latent errors: errors whose adverse consequences may lie dormant within the system for a long time, becoming evident only when they combine with other factors.
• Most of those latent errors are human errors: by designers, high-level decision makers, construction workers, managers, and maintenance personnel.
• Invisible latent errors change the system reality without altering the operator’s models; seemingly correct actions can then trigger accidents.
• Some of the factors affecting performance:
  • Lack of experience with the system in failure states: training is rarely sufficient to develop a rule base that captures system response outside of normal bounds, resulting in RB errors.
  • System complexity and cognitive strain: system complexity prohibits mental modeling, and the stress of an emergency encourages RB approaches and diminishes KB effectiveness.
  • Limited system visibility due to automation and “defense in depth”: results in improper rule choices and faulty KB reasoning.

Reducing Accidents

• Eliminate/reduce risk through design
• Apply HF principles to design
• Provide procedural checklists
• Provide training
• Provide appropriate & meaningful feedback
• Incentive programs

Procedure to reduce human error within a project:
1) Identification of error causes
   • Task analysis
   • Action error analysis
   • Performance shaping factors
2) Design solutions to address the error causes

Ensure a safe work culture (increasingly informed, increasing trust up the ladder):
• Pathological – Who cares as long as we are not caught!
• Reactive – Safety is important; we do a lot every time we have an accident
• Calculative – We have systems in place to manage all hazards
• Proactive – We work on the problems that we still find
• Generative – Safety is how we do business around here
Human error in industries

Airline industries
• http://www.youtube.com/watch?v=s2PkViQWPeA
• http://www.youtube.com/watch?v=RjnqePtCaCI&feature=related
• http://www.youtube.com/watch?v=RDNnldonjZE&feature=related

Subway crash
• http://www.youtube.com/watch?v=0r2gvlTMG-Q

Monorail crash
• http://www.youtube.com/watch?v=dCis-KGEolo

Part B: Quantifying human error

Human Error Quantification

• We can only measure the likelihood of the errors involved.
• Human reliability quantification techniques quantify the human error probability (HEP).
• The HEP is the probability that an operator will fail in the assigned task.
• The HEP is used:
  • To prevent death or injury of the workers
  • To prevent death or injury to the general public
  • To avoid damage to a plant
  • To stop any harmful effects on the environment

Human Error Quantification techniques

• First generation techniques
  • Consider the human as a mechanical component
  • Perform probabilistic risk assessment (PRA) to quantify the failure probability
  • Include:
    • Success likelihood index method (SLIM)
    • Technique for human error rate prediction (THERP)
    • Human error assessment and reduction technique (HEART)
• Second generation techniques
  • Incorporate cognitive aspects of the human
  • Focus on the causes of errors rather than their frequency
  • Include:
    • A technique for human error analysis (ATHEANA)
    • Cognitive reliability and error analysis method (CREAM)
Success likelihood index method (SLIM)

Purpose: evaluating the probability of a human error occurring throughout the completion of a specific task.

Outcome: measures to reduce the likelihood of errors occurring within a system, leading to an improvement in the overall level of safety.

SLIM Process

Step 1: Selection of the expert panel
Step 2: Definition of situations and subsets
Step 3: Elicitation of PSFs
Step 4: Rating of the tasks on the PSF scale
Step 5: PSF weighting
Step 6: Calculation of the Success Likelihood Index (SLI)
Step 7: Conversion of SLIs into probabilities

SLI Calculation

SLI_j = Σ_{i=1..n} (R_ij × W_i)

Where:
SLI_j = the success likelihood index for task j
W_i = the normalized importance weight for the i-th PSF
R_ij = the scaled rating of task j on the i-th PSF
n = the number of PSFs considered

In-Class Exercise

Case study: an operator decoupling a filling hose from a chemical (chlorine) road tanker.

Human error of interest: failure to close valve V0101 prior to decoupling the filling hose.

Step 1: Two operators with a minimum of 10 years’ experience, one human factors analyst, and a reliability analyst who is familiar with the system and also has operational experience.

Step 2: Possible human errors:
• V0101 open
• Alarm mis-set
• Alarm ignored
In-Class Exercise (cont.)

Step 3: Five PSFs identified: training, procedures, feedback, the perceived level of risk, and the time pressure involved.

Step 4: PSF rating

Errors | Training | Procedures | Feedback | Perceived risk | Time
V0101 open | 6 | 5 | 2 | 9 | 6
Alarm mis-set | 5 | 3 | 2 | 7 | 4
Alarm ignored | 4 | 5 | 7 | 7 | 2

Step 5: PSF weighting

PSF | Importance
Training | 0.15
Procedures | 0.15
Feedback | 0.30
Perceived risk | 0.30
Time | 0.10
Sum | 1.00

Step 6: SLI calculation (V0101 open)

PSF | Rating | Weighting | SLI
Training | 6 | 0.15 | 0.9
Procedures | 5 | 0.15 | 0.75
Feedback | 2 | 0.30 | 0.6
Perceived risk | 9 | 0.30 | 2.7
Time | 6 | 0.10 | 0.6
Total SLI | | | 5.55

Calculate for “Alarm mis-set” and “Alarm ignored”.

In-Class Exercise (cont.)

Step 6: SLI calculation (cont.)

PSFs | V0101 open | Alarm mis-set | Alarm ignored
Training | 0.9 | 0.75 | 0.6
Procedures | 0.75 | 0.45 | 0.75
Feedback | 0.6 | 0.6 | 2.1
Perceived risk | 2.7 | 2.1 | 2.1
Time | 0.6 | 0.4 | 0.2
Total SLI | 5.55 | 4.3 | 5.75

Step 7: Calculating the HEP

Log(HEP) = a × SLI + b

Two additional tasks, X and Y, were assessed; they had HEP values of 0.5 and 10⁻⁴, and SLIs of 4.00 and 6.00, respectively.

Solve these two equations to find a and b.

Finally, Log(HEP) = −1.85 × SLI + 7.1

The calculated HEPs are:
V0101 open = 0.0007
Alarm mis-set = 0.14
Alarm ignored = 0.0003
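The whole exercise can be reproduced in a few lines, using the ratings, weights, and the two calibration tasks from the slides:

```python
import math

# PSF weights and ratings from the in-class exercise
# (columns: Training, Procedures, Feedback, Perceived risk, Time)
weights = [0.15, 0.15, 0.30, 0.30, 0.10]
ratings = {
    "V0101 open":    [6, 5, 2, 9, 6],
    "Alarm mis-set": [5, 3, 2, 7, 4],
    "Alarm ignored": [4, 5, 7, 7, 2],
}

def sli(task):
    """Success Likelihood Index: weighted sum of the PSF ratings."""
    return sum(r * w for r, w in zip(ratings[task], weights))

# Calibration log10(HEP) = a*SLI + b through the two reference tasks:
# HEP = 0.5 at SLI = 4.00 and HEP = 1e-4 at SLI = 6.00.
a = (math.log10(1e-4) - math.log10(0.5)) / (6.00 - 4.00)   # about -1.85
b = math.log10(0.5) - a * 4.00                             # about 7.1

def hep(task):
    return 10.0 ** (a * sli(task) + b)
```

This reproduces the slide values: SLIs of 5.55, 4.3, and 5.75, and HEPs of roughly 0.0007, 0.14, and 0.0003.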
HRA of a muster sequence

Determination of human error probabilities for offshore platform musters (adapted from DiMattia et al., 2005)

Muster scenario description

Component | Scenario 1 | Scenario 2 | Scenario 3
Situation | A person falls overboard, resulting in the activation of the muster alarm. | A hydrocarbon gas release in the process units. | A fire and explosion in the process units.
Muster person in question | A very experienced (15 years) operator who at the time of the muster alarm is in the process units draining a process vessel. | An experienced (3 years) operator who at the time of the muster alarm is changing filters in a solids removal unit. | An inexperienced (6 months) operator who at the time of the muster alarm is in the process units working valves to isolate a vessel.
Weather | The incident occurs in good weather and calm seas. | The incident occurs in cold, wet weather. | The incident occurs during a winter storm.
Time of day | The muster is conducted during daylight hours. | The muster is conducted during daylight hours. | The muster is conducted during nighttime hours.
Location of muster initiator | The operator is on a different deck than the person who has fallen overboard and does not see or hear the muster initiator. | The operator is on the same deck as the gas release. | The operator is on the same deck as the fire and explosion.

Hierarchical Task Analysis

Detect alarm → Identify alarm → Act accordingly → Ascertain if danger is imminent → (if imminent) Muster → Return process equipment to a safe state → Make the workplace as safe as possible in the limited time → Listen to and follow PA announcements → Evaluate potential egress paths and choose a route → Move along the egress route, assessing its quality while en route to the TSR (if not tenable, choose an alternate route) → Assist others if needed → Register at the TSR → Provide pertinent feedback attained while en route to the TSR → Collect/don personal survival suit if directed → Follow the OIM’s instructions.
Descriptions of PSFs

PSF | Description
Stress | PSF affecting the completion of actions as quickly as possible to muster effectively and safely. This is essentially the effect of the muster initiator on the consequences of not completing the task.
Complexity | PSF that affects the likelihood of a task being completed successfully because of the intricacy of the action and its sub-actions. Combined with a high level of stress, this can make actions that are normally simplistic in nature complicated or cumbersome, and can cause individuals to take shortcuts (violations) to perform a task as quickly as possible, or not to complete the task.
Training | PSF that directly relates to an individual’s ability to most effectively identify the muster alarm and perform the necessary actions to complete the muster effectively. Training under simulation can introduce a complacency factor, as a highly trained individual may lack a sense of urgency because of training’s inherent repetitiveness.
Experience | PSF related to real muster experience. An individual may not be as highly trained as others but may have experienced real musters and the stressors that accompany real events. Strong biases may be formed through these experiences.
Event factors | PSF that is a direct result of the muster initiator and the location of the individual with respect to the initiating event. Distractions that can affect the successful completion of a muster include smoke, heat, fire, pressure waves, and noise.
Atmospheric factors | PSF that influences actions due to weather. High winds, rain, snow, or sleet can affect manual dexterity and make egress paths hazardous when traversing slippery sections. Extremely high winds negatively impact hearing and flexibility of movement.

PSF Rating

Rating scale | Stress | Complexity | Training | Experience | Event factors | Atmospheric factors
100 | no stress | not complex | highly trained | very experienced | no effect | no effect
50 | some stress | somewhat complex | some training | somewhat experienced | some effect | some effect
0 | highly stressed | very complex | no training | no experience | large effect | large effect

PSF Weight [weights table not recoverable from the source]

Predicted HEPs (awareness and evaluation phases; HEP columns are for muster scenarios 1–3)

No. | Action | Phase | HEP (1) | HEP (2) | HEP (3) | Loss of defenses
1 | Detect alarm | Awareness | 0.00499 | 0.0308 | 0.396 | Do not hear the alarm. Do not properly identify the alarm. Do not maintain composure (panic).
2 | Identify alarm | Awareness | 0.00398 | 0.0293 | 0.386 |
3 | Act accordingly | Awareness | 0.00547 | 0.0535 | 0.448 |
4 | Ascertain if danger is imminent | Evaluation | 0.00741 | 0.0765 | 0.465 | Misinterpret muster initiator seriousness and fail to muster in a timely fashion. Do not return the process to a safe state. Leave the workplace in a condition that escalates the initiator or impedes others’ egress.
5 | Muster if in imminent danger | Evaluation | 0.00589 | 0.0706 | 0.416 |
6 | Return process equipment to safe state | Evaluation | 0.00866 | 0.0782 | 0.474 |
7 | Make workplace as safe as possible in limited time | Evaluation | 0.00903 | 0.0835 | 0.489 |
Predicted HEPs (egress and recovery phases; HEP columns are for muster scenarios 1–3)

No. | Action | Phase | HEP (1) | HEP (2) | HEP (3) | Loss of defenses
8 | Listen and follow PA announcements | Egress | 0.00507 | 0.0605 | 0.420 | Misinterpret or do not hear PA announcements. Misinterpret tenability of egress path. Fail to follow a path which leads to the TSR; decide to follow a different egress path with lower tenability. Fail to assist others. Provide incorrect assistance, which delays or prevents egress.
9 | Evaluate potential egress paths and choose route | Egress | 0.00718 | 0.0805 | 0.476 |
10 | Move along egress route | Egress | 0.00453 | 0.0726 | 0.405 |
11 | Assess quality of egress route while moving to TSR | Egress | 0.00677 | 0.0788 | 0.439 |
12 | Choose alternate route if egress path is not tenable | Egress | 0.00869 | 0.1000 | 0.500 |
14 | Assist others if needed or as directed | Egress | 0.01010 | 0.0649 | 0.358 |
15 | Register at TSR | Recovery | 0.00126 | 0.0100 | 0.200 | Fail to register while in the TSR. Fail to provide pertinent feedback, or provide incorrect feedback. Do not don the personal survival suit in an adequate time for evacuation. Misinterpret the OIM’s instructions or do not follow them.
16 | Provide pertinent feedback attained while en route to TSR | Recovery | 0.00781 | 0.0413 | 0.289 |
17 | Don personal survival suit or TSR survival suit if instructed to abandon | Recovery | 0.00517 | 0.0260 | 0.199 |
18 | Follow OIM’s instructions | Recovery | 0.00570 | 0.0208 | 0.210 |

Advantages of SLIM

• Given that the HEPs are calibrated with other known HEPs, they are likely to be reasonable estimates.
• It is a flexible technique: one can deal with the entire range of human error forms without requiring a detailed decomposition of the task, as is required, for example, with THERP.

Human Error Assessment and Reduction Technique (HEART)

• First-generation technique, developed by Williams in 1986

• Considers all factors which may negatively affect performance of a task

• Quantifies each of these factors to obtain an overall human error probability (HEP) as the collective product of the factors

HEART Process

Step 1: Identify the full range of sub-tasks.

Step 2: Determine a nominal human unreliability score for the particular task by consulting with local experts.

Step 3: Determine the error-producing conditions (EPCs) which are apparent in the given situation and highly probable to have a negative effect on the outcome. Obtain the total HEART effect of each EPC.

Step 4: Obtain experts' assessed proportion of effect (from 0 to 1) for each EPC.

Step 5: Calculate each assessed effect = ((Max effect − 1) × Proportion of effect) + 1.

Step 6: Calculate the final HEP as the product of all assessed effects and the nominal human unreliability.
HEART Example

A reliability engineer has the task of assessing the probability of a plant operator failing to carry out the task of isolating a plant bypass route as required by procedure. However, the operator is fairly inexperienced in fulfilling this task and therefore typically does not follow the correct procedure; the individual is therefore unaware of the hazards created when the task is carried out.

From the relevant tables, it can be established that the type of task in this situation is of type (F), defined as "Restore or shift a system to original or new state following procedures, with some checking". This task type has the proposed nominal human unreliability value of 0.003.

Error-producing condition (EPC)   Total HEART effect   Assessed proportion of effect   Assessed effect
Inexperience                      ×3                   0.4                             (3−1)×0.4+1 = 1.8
Opposite technique                ×6                   1.0                             (6−1)×1.0+1 = 6.0
Risk misperception                ×4                   0.8                             (4−1)×0.8+1 = 3.4
Conflict of objectives            ×2.5                 0.8                             (2.5−1)×0.8+1 = 2.2
Low morale                        ×1.2                 0.6                             (1.2−1)×0.6+1 = 1.12

The final HEP can therefore be calculated as: 0.003 × 1.8 × 6.0 × 3.4 × 2.2 × 1.12 = 0.27

HEART Advantages

• HEART is very quick and straightforward to use, and makes only a small demand on resources

• The technique provides the user with useful suggestions as to how to reduce the occurrence of errors
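The HEART arithmetic in the worked example can be sketched in a few lines. The EPC multipliers and assessed proportions below are the ones from the example table; the helper function name is ours.

```python
# Sketch of the HEART calculation: each assessed effect is
# (max_effect - 1) * proportion + 1, and the final HEP is the nominal
# human unreliability times the product of all assessed effects.
def heart_hep(nominal, epcs):
    """epcs: list of (total HEART effect, assessed proportion) pairs."""
    hep = nominal
    for max_effect, proportion in epcs:
        hep *= (max_effect - 1.0) * proportion + 1.0
    return hep

# Values from the bypass-isolation example above
epcs = [(3, 0.4), (6, 1.0), (4, 0.8), (2.5, 0.8), (1.2, 0.6)]
print(round(heart_hep(0.003, epcs), 2))  # 0.27
```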
References

Kirwan, B. (1994). A guide to practical human reliability assessment. London: Taylor & Francis.

Reason, J. (1990). Human error. Cambridge: Cambridge University Press.

System Reliability Modeling

Ming Yang, PhD, P.Eng.
Assistant Professor
Safety and Security Science Section, Faculty of Technology, Policy, and Management, TU Delft
TPM 024A

Learning objectives

• Understand basic concepts in reliability engineering

• Understand the constant failure rate model, time-dependent models and reliability models of systems

Introduction

• Things fail in nature.

• Examples of such failures are the cracking of a lawn-mower chassis, failure of a washing machine or car battery, burning of a toaster-oven electrical plug, leakage of a water heater, failure of a CD drive, failure to function of a TV remote control or stereo amplifier, failure of an automobile engine starter, leakage of a house roof, etc.
Concepts, Terms and Definitions

Reliability
• Reliability is the probability that a component or system will perform its desired operation for a given period of time under the defined operating conditions.
  – Probability: the numerical parameter
  – Desired operation, time and operating conditions: the engineering parameters
• It is the probability of non-failure over time. In some cases reliability is defined not over time but over another measure of use, such as miles traveled or units/batches produced.

Maintainability
• Maintainability is the probability that a failed component or system will be restored or repaired to specified conditions within a period of time using specified maintenance procedures.
• In simple terms, it is the probability of repair in a given time.

Availability
• Availability is the probability that a component or system is performing its required function at a given point of time under specified operating conditions. In other words, it is the probability or degree to which the system will be ready to start a mission when needed.
• Availability = operating (up) time / (operating time + repair time)

Quality
• Quality is defined as the degree to which a product satisfies the user's (customer's) requirements.
• Quality is a function of design specification. The quality of a product and its reliability are dependent on each other.
• A high-reliability product will have high quality, but the converse may or may not be true.
Failure
• A state in which a component is unable to perform its required function satisfactorily at a given time.
• A failure may be treated as random or deterministic by studying the physics of the failure process.
• Time to failure is the time elapsed from the onset of the function (mission start) to the failure of the device or process (mission failure).

Mean Time to Failure (MTTF)
• When applied to non-repairable items, time to failure is the basic measure of reliability and is usually expressed in terms of the average time to failure, known as the mean time to failure.
• MTTF is estimated as the total measured operating time of a population of items divided by the total number of failures within the population during the measured time period.

Mean Time between Failures (MTBF)
• MTBF is used for repairable systems.
• It is the time elapsed, or number of operations, between successive failures of the same repairable system.
• It may be the time between occurrences of the same (single) failure mode or, if so specified, of a group of failure modes.

Mean Time to Repair (MTTR)
• The maintainability of a product or process is measured in terms of the mean time to repair.
• It is the statistical mean of the distribution of times to repair; in other words, the sum of active repair times during a given period of time divided by the total number of malfunctions during the same time interval.
Operating Time
• The time during which an item is performing a function: the period between turn-on and turn-off of a system, subsystem, component or part during which operation is specified.

Repair Time
• Time measured from the beginning of correction of a malfunction to the completion of that correction; the time during which one or more technicians are actually working to repair a failure. It includes preparation time, fault-location time, correction time and checkout time.

Failure Rate
• A value expressing the frequency of failure occurrence over a specified time interval or number of cycles of operation.

Failure Modes
• The various manners or ways in which failures occur, and the resulting operating condition of the item at the time of failure.

Common Cause
• A cause resulting in the failure of all affected systems.

Reliability Function

• Reliability is defined as the probability that a component or system will function over some time period t. Define a continuous random variable T as the time to failure of the component or system.
• The reliability function can then be expressed as
  R(t) = Pr{T ≥ t}
  where R(t) ≥ 0, R(0) = 1, and lim t→∞ R(t) = 0.
• For a given value of t, R(t) is the probability that the time to failure is greater than or equal to t.

Failure Function

• The cumulative distribution function (CDF) is the probability that a failure occurs before time t:
  F(t) = 1 − R(t) = Pr{T < t}, where F(0) = 0 and lim t→∞ F(t) = 1.
• The probability density function (PDF) is defined as
  f(t) = dF(t)/dt = −dR(t)/dt
  and describes the shape of the failure distribution. The PDF has two properties:
  f(t) ≥ 0 and ∫₀^∞ f(t) dt = 1.
Failure Time

• Mean time to failure (MTTF): the mean of a failure distribution is one of several measures of central tendency. It is the statistical mean of the failure probability distribution.

By definition, MTTF = E(T) = ∫₀^∞ t f(t) dt

Also, MTTF = ∫₀^∞ −t (dR(t)/dt) dt

Using integration by parts,
MTTF = [−t R(t)]₀^∞ + ∫₀^∞ R(t) dt = ∫₀^∞ R(t) dt
since lim t→∞ t R(t) = lim t→∞ t·exp[−∫₀^t λ(t′) dt′] = 0.

Relation between PDF and Reliability Function/CDF

[Figure: the PDF f(t), with total area 1 and F(t) = ∫₀^t f(t′) dt′ as the area to the left of t; the CDF F(t) rising toward 1.0; and the reliability function R(t) = ∫ₜ^∞ f(t′) dt′ falling from 1.0.]
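The identity MTTF = ∫₀^∞ R(t) dt can be checked numerically. A minimal sketch (our own helper, with an illustrative constant failure rate, not a value from the slides): integrating R(t) = e^{−λt} should recover MTTF = 1/λ.

```python
# Approximate MTTF = ∫ R(t) dt with the trapezoidal rule on [0, t_max].
import math

def mttf_from_reliability(R, t_max, n=100_000):
    dt = t_max / n
    total = 0.5 * (R(0.0) + R(t_max))  # trapezoid end points
    for i in range(1, n):
        total += R(i * dt)
    return total * dt

lam = 0.5  # illustrative failure rate
mttf = mttf_from_reliability(lambda t: math.exp(-lam * t), t_max=60.0)
print(round(mttf, 3))  # ≈ 2.0, i.e. 1/λ
```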

Median Time to Failure

• The median divides the failure distribution into two halves, with 50% of failures occurring before the median time to failure and 50% occurring after it:
  R(t_med) = 0.5 = Pr{T ≥ t_med}
• This value may be preferred for measuring the central tendency of highly skewed failure distributions.

Mode Time to Failure

• The mode is the time at which the density is maximal: f(t_mode) = max f(t) for 0 ≤ t < ∞.
• For a small fixed interval of time centered around the mode, the probability of failure will generally be greater than for an interval of the same size located elsewhere within the distribution.
Time Variance

• Variance is a measure of the spread, or dispersion, of the failure times about the mean. Mathematically, it is defined by:
  σ² = ∫₀^∞ (t − MTTF)² f(t) dt
• It represents the average squared distance of a failure time from the MTTF. The square root of the variance is the standard deviation. Another, computationally simpler, form of the variance is given by:
  σ² = ∫₀^∞ t² f(t) dt − (MTTF)²

[Figure: comparison of the measures of central tendency on a right-skewed density f(t), showing t_mode < t_median < MTTF.]

Hazard Rate Function

• This function describes the instantaneous (at time t) rate of failure, which is an alternative way of describing a failure distribution.
• The conditional probability of a failure in the time interval from t to t + Δt, given that the system has survived up to time t, is:
  Pr{t ≤ T ≤ t + Δt | T ≥ t} = [R(t) − R(t + Δt)] / R(t)
  where, by definition, Pr{t ≤ T ≤ t + Δt} = R(t) − R(t + Δt).
• The conditional probability of failure per unit time (failure rate) is then
  [R(t) − R(t + Δt)] / [R(t) Δt]
• Therefore, the hazard rate function is
  λ(t) = lim Δt→0 {[R(t) − R(t + Δt)] / Δt} · (1/R(t)) = −(dR(t)/dt)·(1/R(t)) = f(t)/R(t)
Hazard Rate Function (continued)

Depending upon the characteristics of λ(t), the failure rate may be an increasing (IFR), decreasing (DFR), or constant (CFR) function. The hazard rate (failure rate) function λ(t) uniquely determines the reliability function, as shown below:

λ(t) = −(dR(t)/dt)·(1/R(t)), or λ(t) dt = −dR(t)/R(t)

Integrating, ∫₀^t λ(t′) dt′ = −ln R(t), since R(0) = 1, and therefore

R(t) = exp[−∫₀^t λ(t′) dt′]

The cumulative failure rate over a period of time t is L(t) = ∫₀^t λ(t′) dt′.

Average Failure Rate

The average failure rate is defined between two times t₁ and t₂:

AFR(t₁, t₂) = (1/(t₂ − t₁)) ∫_{t₁}^{t₂} λ(t) dt = [ln R(t₁) − ln R(t₂)] / (t₂ − t₁)

Conditional Reliability

• Conditional reliability is useful in describing the reliability of a component or system following a burn-in period T₀, or after a warranty period T₀. It is defined as the reliability of a system given that it has operated for a time T₀. Mathematically:

R(t | T₀) = Pr{T > T₀ + t | T > T₀} = Pr{T > T₀ + t} / Pr{T > T₀} = R(T₀ + t) / R(T₀)
= exp[−∫₀^{T₀+t} λ(t′) dt′] / exp[−∫₀^{T₀} λ(t′) dt′] = exp[−∫_{T₀}^{T₀+t} λ(t′) dt′]

Bathtub Failure Curve

[Figure: λ(t) versus t, showing early failures (burn-in) with decreasing rate up to t = c₀/c₁, random failures at a constant rate λ during the useful life, and wear-out failures with increasing rate after t₀.]

A piecewise hazard rate function for the bathtub curve:
λ(t) = c₀ − c₁t + λ for 0 ≤ t ≤ c₀/c₁
λ(t) = λ for c₀/c₁ < t ≤ t₀
λ(t) = c₂(t − t₀) + λ for t₀ < t
Characteristics of the Bathtub Curve

Burn-in: characterized by a decreasing failure rate; caused by manufacturing defects, poor quality control, welding flaws, cracks, etc.; safeguarded against by burn-in testing, screening, quality control and acceptance testing.

Useful life: characterized by a constant failure rate; caused by environment, random loads, human error and chance events; safeguarded against by redundancy and excess strength.

Wear-out: characterized by an increasing failure rate; caused by fatigue, corrosion, aging, friction and cyclic loading; safeguarded against by derating, preventive maintenance and replacement.

Constant Failure Rate Model

• A theoretical model for analyzing the failure process during the useful life, where failures are random and the failure rate is constant.
• A failure distribution that has a constant failure rate is called the exponential distribution.

Exponential Reliability Function

• A common, simple failure distribution that is easy to analyze.
• It represents completely random or chance failure and dominates during the useful life of a component.
• Assume λ(t) = λ for t ≥ 0, with λ > 0. Then:

R(t) = exp[−∫₀^t λ dt′] = e^{−λt}, t > 0
F(t) = 1 − R(t) = 1 − e^{−λt}
f(t) = −dR(t)/dt = λ e^{−λt}
MTTF = ∫₀^∞ e^{−λt} dt = 1/λ
Var = ∫₀^∞ (t − 1/λ)² λ e^{−λt} dt = 1/λ²

Design life (for a specified reliability R):
R(t_R) = e^{−λ t_R} = R, so t_R = −(1/λ) ln R

Median time:
t_med = −(1/λ) ln 0.5 = 0.69315/λ = 0.69315·MTTF
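The constant-failure-rate formulas above can be sketched directly. The failure rate below is an illustrative value, not one from the slides.

```python
# Minimal sketch of the exponential (CFR) model:
# R(t) = e^{-λt} and the design life t_R = -(1/λ) ln R.
import math

def exp_reliability(lam, t):
    return math.exp(-lam * t)

def exp_design_life(lam, target_R):
    return -math.log(target_R) / lam

lam = 0.001  # failures per hour (illustrative)
print(round(exp_design_life(lam, 0.9)))      # ≈ 105 hours for 90% reliability
print(round(exp_reliability(lam, 1000.0), 3))  # R at t = MTTF: e^-1 ≈ 0.368
```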
Interesting observations!

• Variability of failure time increases as reliability (MTTF) increases.
• Mean time to failure is the reciprocal of the failure rate (λ).
• R(MTTF) = e^{−MTTF/MTTF} = e^{−1} = 0.368: a component having a CFR has a slightly better than one-third chance of surviving to its mean time to failure.
• The median is always less than the mean, since the exponential distribution is skewed to the right.

Exponential PDF, CDF, Hazard Rate and Reliability Functions

[Figure: F(t) rising toward 1, the constant hazard rate λ(t) = λ, and the exponentially decaying f(t) and R(t).]

The Weibull Distribution

• Models hazard rate functions that are not constant over time; it is one of the most useful probability distributions in reliability engineering.
• It may be used to model both increasing and decreasing failure rates.
• The general expression for the hazard rate function is a power function, λ(t) = a t^b, which is increasing for a > 0, b > 0 and decreasing for a > 0, b < 0.
• For convenience it is better expressed as:

λ(t) = (β/θ)(t/θ)^{β−1}, θ > 0, β > 0, t ≥ 0
R(t) = exp[−∫₀^t (β/θ)(t′/θ)^{β−1} dt′] = e^{−(t/θ)^β}
f(t) = −dR(t)/dt = (β/θ)(t/θ)^{β−1} e^{−(t/θ)^β}

• θ is a scale parameter that influences both the mean and the spread, or dispersion, of the distribution. It is called the characteristic life and has units identical to those of the failure time.
• β is referred to as the shape parameter. For β < 1 the probability density function is similar in shape to the exponential; for larger values of β (roughly 3 to 4) it is somewhat symmetrical, like the normal distribution.
[Figure: Weibull density f(t) and reliability R(t) for shape parameters β = 0.5, 1.5, 2.0 and 4.0 at a fixed θ.]

[Figure: Weibull density f(t) and reliability R(t) for scale parameters θ = 0.5, 1.0 and 2.0 at a fixed β.]
The Weibull Distribution (properties)

• For 1 < β < 3, the density function is skewed.
• When β = 1, λ(t) is constant and the distribution is identical to the exponential with λ = 1/θ.
• When t = θ, R(t) = exp[−(θ/θ)^β] = 0.368; therefore 63.2% of all Weibull failures will occur by the characteristic life θ, whatever the value of β.

MTTF = θ·Γ(1 + 1/β)
Var = θ²{Γ(1 + 2/β) − [Γ(1 + 1/β)]²}
where Γ(x) = ∫₀^∞ y^{x−1} e^{−y} dy

Shape factor description:
0 < β < 1 — decreasing failure rate
β = 1 — exponential distribution (constant failure rate)
1 < β < 2 — increasing failure rate (concave)
β = 2 — Rayleigh distribution
2 < β < 3 — increasing failure rate (convex)
3 ≤ β ≤ 4 — increasing failure rate, approximately symmetrical density
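The Weibull summary statistics above use the gamma function, which is available in the standard library. A minimal sketch with illustrative parameter values (the function names are ours):

```python
# Weibull MTTF = θ·Γ(1 + 1/β) and reliability R(t) = exp[-(t/θ)^β]
import math

def weibull_mttf(theta, beta):
    return theta * math.gamma(1.0 + 1.0 / beta)

def weibull_reliability(theta, beta, t):
    return math.exp(-((t / theta) ** beta))

# β = 1 reduces to the exponential, so MTTF = θ
print(weibull_mttf(1000.0, 1.0))  # 1000.0
# 63.2% of failures occur by t = θ, regardless of β
print(round(weibull_reliability(1000.0, 2.5, 1000.0), 3))  # 0.368
```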

The Weibull Distribution (continued)

• Design life: t_R = θ(−ln R)^{1/β}
• Median time to failure: t_med = t_{0.50} = θ(−ln 0.5)^{1/β} = θ(0.69315)^{1/β}
• Mode, the time maximizing the density, f(t_mode) = max_{t≥0} f(t):
  t_mode = θ(1 − 1/β)^{1/β} for β > 1, and t_mode = 0 for β ≤ 1
• Burn-in screening for the Weibull:
  R(t | T₀) = exp{−[(t + T₀)/θ]^β} / exp[−(T₀/θ)^β] = exp[−((t + T₀)/θ)^β + (T₀/θ)^β]
• Failure modes: for a system comprised of n independent failure modes, each having a Weibull distribution with shape parameter β and scale parameter θᵢ, the system failure rate is:
  λ(t) = Σ_{i=1..n} (β/θᵢ)(t/θᵢ)^{β−1} = β t^{β−1} Σ_{i=1..n} (1/θᵢ)^β
Identical Weibull Components

• For n serially related components having identical hazard rate functions, with the same shape and scale parameters:
  λ(t) = Σ_{i=1..n} (β/θ)(t/θ)^{β−1} = n (β/θ^β) t^{β−1}
  R(t) = exp[−n(t/θ)^β]
  which is again a Weibull system, with shape parameter β and scale parameter θ/n^{1/β}.

Three-Parameter Weibull Function

• When there is a minimum guaranteed life t₀, such that T > t₀, the three-parameter Weibull is most appropriate. This distribution assumes that no failures will take place prior to time t₀.
  R(t) = exp[−((t − t₀)/θ)^β] for t ≥ t₀
  λ(t) = (β/θ)((t − t₀)/θ)^{β−1} for t ≥ t₀
  MTTF = t₀ + θ·Γ(1 + 1/β)
  t_median = t₀ + θ(0.69315)^{1/β}
  t_R (design life) = t₀ + θ(−ln R)^{1/β}
• The parameter t₀ is called the location or threshold parameter. The variance of this distribution is the same as that of the two-parameter model. It is possible to transform a three-parameter Weibull into a two-parameter Weibull with the transformation t′ = t − t₀.

Redundancy in Weibull Failures

• If two identical and independent components with R(t) = e^{−(t/θ)^β} are used to form a redundant system (both must fail for the system to fail), the system reliability is:
  Rs(t) = 1 − [1 − R(t)]² = 1 − [1 − e^{−(t/θ)^β}]² = 2 e^{−(t/θ)^β} − e^{−2(t/θ)^β}
  MTTF = θ·Γ(1 + 1/β)·(2 − 2^{−1/β})
  λs(t) = (β/θ)(t/θ)^{β−1} · [2 − 2e^{−(t/θ)^β}] / [2 − e^{−(t/θ)^β}]

Applications of the Weibull Distribution

• Because of its flexibility, the Weibull distribution is often the first choice when attempting to model a population with an increasing failure rate. Some common applications are:
  – Determining the breaking strength of components, or the stress required to cause fatigue failure of metals
  – Estimating the time to failure of mechanical/electrical components
  – Calculating the time to failure of items that wear out, such as automobile tires, thinning of pipe-wall thickness, etc.
  – Analyzing systems that fail when the weakest component in the system fails; in this case the Weibull distribution represents an extreme value distribution
Normal (Gaussian) Distribution

• The normal distribution has also been used to model fatigue and wear-out phenomena.
• Because of its relationship with the lognormal distribution, it is also useful in analyzing lognormal probabilities.

f(t) = (1/(σ√(2π))) exp[−(t − μ)²/(2σ²)], −∞ < t < ∞

[Figure: normal density curves for σ = 0.5 and σ = 1.0.]

• The parameters μ and σ² are the mean and variance of the distribution, and the distribution is symmetrical about its mean.
• Reliability: R(t) = ∫ₜ^∞ (1/(σ√(2π))) exp[−(t′ − μ)²/(2σ²)] dt′
• There is no closed-form solution for the reliability function; it must be evaluated numerically. If the transformation z = (t − μ)/σ is made, then z is normally distributed with a mean of zero and a variance of one:
  standard normal PDF: φ(z) = (1/√(2π)) e^{−z²/2}
  standard normal CDF: Φ(z) = ∫_{−∞}^{z} φ(z′) dz′
• The standardized normal table can be used to find the cumulative probabilities of any normally distributed random variable, by making use of:
  F(t) = Pr{T ≤ t} = Pr{(T − μ)/σ ≤ (t − μ)/σ} = Pr{z ≤ (t − μ)/σ} = Φ((t − μ)/σ)
  R(t) = 1 − Φ((t − μ)/σ)
  λ(t) = f(t)/R(t) = f(t)/[1 − Φ((t − μ)/σ)]

Lognormal Distribution

• If the random variable T, the time to failure, has a lognormal distribution, the logarithm of T has a normal distribution.
• The lognormal density function may be written as
  f(t) = (1/(√(2π) s t)) exp[−(1/(2s²)) (ln(t/t_med))²] for t ≥ 0
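Instead of a standardized table, R(t) = 1 − Φ((t − μ)/σ) can be evaluated with the error function from the standard library. A minimal sketch with illustrative values:

```python
# Standard normal CDF via erf: Φ(z) = (1/2)[1 + erf(z/√2)],
# then normal reliability R(t) = 1 - Φ((t - μ)/σ).
import math

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_reliability(mu, sigma, t):
    return 1.0 - std_normal_cdf((t - mu) / sigma)

# At t = μ, half the population has failed (illustrative μ and σ)
print(normal_reliability(mu=5000.0, sigma=400.0, t=5000.0))  # 0.5
```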

• The distribution is defined only for positive values of t and is therefore more appropriate than the normal as a failure distribution.
• Like the Weibull distribution, the lognormal can take on a variety of shapes.
• The mean, variance and mode of the lognormal are:
  MTTF = t_med · exp(s²/2)
  σ² = t_med² · exp(s²)·[exp(s²) − 1]
  t_mode = t_med / exp(s²)

[Figure: the effect of the shape parameter (s = 0.1, 0.5, 1.0) on the lognormal density function.]
• To compute failure probabilities, the relationship between the normal and lognormal is used. Given that T is a lognormal random variable:

  Lognormal mean: t_med·exp(s²/2); corresponding normal mean: ln t_med
  Lognormal variance: t_med²·exp(s²)·[exp(s²) − 1]; corresponding normal variance: s²

• Since the logarithm is a monotonically increasing function,
  F(t) = Pr{T ≤ t} = Pr{ln T ≤ ln t}
       = Pr{(ln T − ln t_med)/s ≤ (ln t − ln t_med)/s}
       = Pr{z ≤ (1/s) ln(t/t_med)}
       = Φ((1/s) ln(t/t_med))
  R(t) = 1 − Φ((1/s) ln(t/t_med))
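The lognormal-to-normal reduction above makes lognormal reliability a one-liner once Φ is available. A minimal sketch with illustrative t_med and s:

```python
# Lognormal reliability via R(t) = 1 - Φ((1/s) ln(t / t_med))
import math

def lognormal_reliability(t_med, s, t):
    z = math.log(t / t_med) / s
    phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # Φ(z)
    return 1.0 - phi

# At t = t_med the standardized value is z = 0, so R = 0.5 by definition
print(lognormal_reliability(t_med=2000.0, s=0.4, t=2000.0))  # 0.5
```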

• Similar to the normal, the hazard rate function for the lognormal distribution cannot be solved analytically.
• The lognormal hazard rate can be calculated numerically at selected points in time by finding f(t)/R(t).
• The hazard rate function increases until it reaches a peak and then slowly decreases, which is an uncommon failure rate behavior for most components.

[Figure: the effect of the shape parameter (s = 0.4, 0.6, 0.8, 1.0) on the lognormal hazard rate curve.]
Serial Configuration

• Components within a system may be related to one another in two primary ways: serial or parallel configuration.
• In a serial configuration all components are critical: if any component fails, the system fails.

[Diagram: components 1, 2, 3, …, N connected in series.]

Let E₁ = the event that component 1 does not fail, and E₂ = the event that component 2 does not fail. Then P(E₁) = R₁ and P(E₂) = R₂, where R₁ and R₂ are the reliabilities of components 1 and 2.

Rs = P(E₁ ∩ E₂) = P(E₁)·P(E₂) = R₁·R₂ (assuming independence)

Generalizing to n components in series, the system reliability Rs is:
Rs(t) = R₁(t)·R₂(t)·R₃(t)···Rn(t) ≤ min{R₁(t), R₂(t), …, Rn(t)}

Serial Configuration: Constant Failure Rate

• In this case the system reliability can never be greater than the smallest component reliability; it is important for all n components to have high reliability for the system reliability to be high.
• If each component has a constant failure rate, the system reliability is

Rs(t) = ∏_{i=1..n} Rᵢ(t) = ∏_{i=1..n} exp(−λᵢt) = exp(−Σᵢ λᵢ t) = exp(−λs t)

where λs = Σ_{i=1..n} λᵢ is the system failure rate. This means that the system also has a constant failure rate.
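The series CFR result (failure rates add) can be sketched directly; the rates below are illustrative values, not data from the slides.

```python
# Series system with constant failure rates: λ_s = Σ λ_i, R_s(t) = e^{-λ_s t}
import math

def series_cfr_reliability(rates, t):
    lam_s = sum(rates)  # system failure rate
    return math.exp(-lam_s * t)

rates = [0.001, 0.002, 0.0005]  # per hour (illustrative)
print(round(series_cfr_reliability(rates, 100.0), 4))  # e^-0.35 ≈ 0.7047
```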
Serial Configuration: Weibull Failure Model

The Weibull system reliability is given by:
Rs(t) = ∏_{i=1..n} exp[−(t/θᵢ)^{βᵢ}] = exp[−Σ_{i=1..n} (t/θᵢ)^{βᵢ}]
λ(t) = (−dRs(t)/dt)·(1/Rs(t)) = Σ_{i=1..n} (βᵢ/θᵢ)(t/θᵢ)^{βᵢ−1}

It is evident from this expression for λ(t) that the system does not, in general, exhibit Weibull-type failure even though every component has a Weibull failure distribution.

Parallel Configuration

• If components are in parallel, or redundant, all units must fail for the system to fail. The system reliability of parallel, independent components is found as 1 minus the probability that all components fail (i.e., it equals the probability that at least one component does not fail).

[Diagram: components 1, 2, 3, …, N connected in parallel.]

Rs(t) = P(E₁ ∪ E₂) = 1 − P[(E₁ ∪ E₂)ᶜ] = 1 − P(E₁ᶜ ∩ E₂ᶜ) = 1 − P(E₁ᶜ)·P(E₂ᶜ) = 1 − (1 − R₁)(1 − R₂)

Parallel Configuration (continued)

Generalizing: Rs(t) = 1 − ∏_{i=1..n} [1 − Rᵢ(t)]

It is always true that Rs(t) ≥ max(R₁, R₂, R₃, …, Rn): in a parallel configuration the system reliability is always at least as large as the largest component reliability.

Parallel Configuration: Constant Failure Rate System

• For a redundant system consisting of all CFR components, the system reliability is:
Rs(t) = 1 − ∏_{i=1..n} (1 − e^{−λᵢt})
where λᵢ is the failure rate of the i-th component.
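The parallel rule Rs = 1 − ∏(1 − Rᵢ) is equally short in code; the component reliabilities below are illustrative.

```python
# Parallel (redundant) system: fails only if every component fails
def parallel_reliability(component_rs):
    q = 1.0
    for r in component_rs:
        q *= (1.0 - r)  # probability that all components fail
    return 1.0 - q

print(round(parallel_reliability([0.9, 0.9]), 3))       # 0.99
print(round(parallel_reliability([0.9, 0.8, 0.7]), 3))  # 1 - 0.1*0.2*0.3 = 0.994
```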
Parallel Configuration: Weibull System

• For a redundant system consisting of components having a Weibull failure law, the system reliability is given by:
Rs(t) = 1 − ∏_{i=1..n} {1 − exp[−(t/θᵢ)^{βᵢ}]}

Combined Series and Parallel Systems

• A complex system contains both series and parallel components.
• Such systems need to be broken down into series and parallel subsystems.
• Finally, the reliability of the system may be obtained based on the relationships among the subsystems.

Combined Series and Parallel Systems (example)

[Diagram: R₁ and R₂ in parallel forming subsystem A; A in series with R₃ forming branch B; R₄ in series with R₅ forming branch C; branches B and C in parallel, in series with R₆.]

In this network, the subsystems have the reliabilities:
R_A = 1 − (1 − R₁)(1 − R₂)
R_B = R_A·R₃ and R_C = R₄·R₅

Since R_B and R_C are in parallel with one another and in series with R₆, the system reliability is:
Rs = [1 − (1 − R_B)(1 − R_C)]·R₆

Low-Level Redundancy

• Each component comprising the system may have one or more parallel components.
• Let each component have reliability R.

[Diagram: two parallel A components in series with two parallel B components.]

R_low = [1 − (1 − R)²]² = (2R − R²)²
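The decomposition of the combined network into subsystems can be sketched with two small helpers (the component reliabilities are illustrative assumptions, not values from the slides):

```python
# Series and parallel reduction helpers applied to the R1..R6 network
def series(*rs):
    out = 1.0
    for r in rs:
        out *= r
    return out

def parallel(*rs):
    q = 1.0
    for r in rs:
        q *= (1.0 - r)
    return 1.0 - q

R1 = R2 = R3 = R4 = R5 = R6 = 0.9   # illustrative
RA = parallel(R1, R2)               # R_A = 1 - (1-R1)(1-R2)
RB = series(RA, R3)                 # R_B = R_A * R3
RC = series(R4, R5)                 # R_C = R4 * R5
Rs = series(parallel(RB, RC), R6)   # (B ∥ C) in series with R6
print(round(Rs, 4))  # 0.8814
```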
High-Level Redundancy

• The entire system may be placed in parallel with one or more identical systems.
• Let each component have reliability R.

[Diagram: a series pair A–B in parallel with an identical series pair A–B.]

R_high = 1 − (1 − R²)² = 2R² − R⁴

Comparison of High and Low Levels of Redundancy

• By comparing the two reliabilities, it may be seen that the reliability of low-level redundancy is greater than (or equal to) the reliability of high-level redundancy:
R_low − R_high = (2R − R²)² − (2R² − R⁴)
= R²(4 − 4R + R²) − R²(2 − R²)
= 2R²(1 − 2R + R²) = 2R²(1 − R)² ≥ 0
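The algebraic comparison above can be checked numerically for a few values of R:

```python
# Verify (2R - R^2)^2 - (2R^2 - R^4) = 2R^2(1 - R)^2 >= 0
def r_low(R):
    return (2 * R - R**2) ** 2

def r_high(R):
    return 2 * R**2 - R**4

for R in (0.1, 0.5, 0.9):
    diff = r_low(R) - r_high(R)
    assert abs(diff - 2 * R**2 * (1 - R) ** 2) < 1e-12 and diff >= 0.0
print("low-level redundancy dominates")
```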

k-out-of-n Redundancy

• A k-out-of-n system requires k of its n identical and independent components to function for the system to function; obviously k ≤ n.
• If k = 1, complete redundancy occurs, and if k = n, the n components are, in effect, in series.
• The reliability may be obtained from the binomial probability distribution. If each component is viewed as an independent trial with a constant probability of success R (its reliability), then
  P(x) = C(n, x)·R^x·(1 − R)^{n−x}
  is the probability of exactly x components operating: C(n, x) is the number of ways in which x successes can be obtained from n components, and R^x(1 − R)^{n−x} is the probability of x successes and n − x failures for a single arrangement of successes and failures.
• Therefore the probability of k or more successes from among the n components is
  Rs = Σ_{x=k}^{n} P(x) = Σ_{x=k}^{n} C(n, x)·R^x·(1 − R)^{n−x}
• The mean time to failure of the system is MTTF = ∫₀^∞ Rs(t) dt.
k-out-of-n CFR Case

• For a constant failure rate system, the reliability is:
  Rs = Σ_{x=k}^{n} C(n, x)·e^{−λxt}·(1 − e^{−λt})^{n−x}
• The MTTF is
  MTTF = ∫₀^∞ Rs(t) dt = (1/λ) Σ_{x=k}^{n} (1/x)

Complex Configuration

• The component configuration may be such that the system reliability cannot be simply decomposed into series and parallel relationships.
• Such networks may be analyzed either by the method of decomposition or by enumeration.
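The binomial k-out-of-n reliability and the CFR MTTF sum can be sketched as follows (parameter values are illustrative):

```python
# k-out-of-n reliability from the binomial distribution, plus the
# CFR closed form MTTF = (1/λ) Σ_{x=k}^{n} 1/x
from math import comb

def k_out_of_n_reliability(k, n, R):
    return sum(comb(n, x) * R**x * (1 - R) ** (n - x) for x in range(k, n + 1))

def k_out_of_n_cfr_mttf(k, n, lam):
    return sum(1.0 / x for x in range(k, n + 1)) / lam

# 2-out-of-3 with R = 0.9: 3(0.81)(0.1) + 0.729 = 0.972
print(round(k_out_of_n_reliability(2, 3, 0.9), 3))  # 0.972
print(round(k_out_of_n_cfr_mttf(2, 3, 0.001), 1))   # (1/2 + 1/3)/0.001 ≈ 833.3
```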

Decomposition

[Diagram: a bridge network with components A and C on the top path, B and D on the bottom path, and E bridging the two paths.]

The network is broken down into two sub-networks: one in which component E has failed (II) and one in which component E is functioning, with reliability R_E (I). The reliability of each sub-network is determined separately, and the total reliability of the system is computed as:

Rs = R_E·R_I + (1 − R_E)·R_II

Enumeration

• For small networks, enumeration may be used to determine the system reliability. The steps are:
  – Identify all possible combinations of success (S) and failure (F) of each component and the resulting success or failure of the system.
  – For each possible combination of component successes or failures, compute the probability of the intersection of these events (the combinations are mutually exclusive).
  – The system reliability is the sum of the success probabilities, or one minus the sum of the failure probabilities.
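Enumeration can be sketched for the bridge network. The path logic below (A–C top, B–D bottom, E bridging) is an assumption about the pictured network, and the component reliability is illustrative; the enumeration result is cross-checked against the decomposition formula.

```python
# Enumerate all 2^5 component states of the A,B,C,D,E bridge network
from itertools import product

def bridge_works(a, b, c, d, e):
    # assumed paths: A-C, B-D, A-E-D, B-E-C
    return (a and c) or (b and d) or (a and e and d) or (b and e and c)

def bridge_reliability(R):
    total = 0.0
    for states in product([True, False], repeat=5):
        p = 1.0
        for up in states:
            p *= R if up else (1.0 - R)
        if bridge_works(*states):
            total += p  # mutually exclusive combinations
    return total

# Cross-check against decomposition: Rs = R_E*R_I + (1 - R_E)*R_II
R = 0.9
R_I = (1 - (1 - R) ** 2) ** 2       # E up: (A ∥ B) in series with (C ∥ D)
R_II = 1 - (1 - R * R) ** 2         # E down: (A·C) ∥ (B·D)
print(round(bridge_reliability(R), 6))
print(round(R * R_I + (1 - R) * R_II, 6))  # same value
```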
System Structure Function

• A very general alternative approach for analyzing the reliability of complex systems is the use of the system structure function. Define:
  Xᵢ = 1 if component i operates, 0 if component i has failed
  Y(X₁, X₂, …, Xn) = 1 if the system operates, 0 if the system has failed

• Reliability for a series system:
  Pr{Y(X₁, X₂, …, Xn) = 1} = Pr{X₁ = 1, X₂ = 1, …, Xn = 1}
  = Pr{X₁ = 1}·Pr{X₂ = 1}···Pr{Xn = 1} = R₁·R₂···Rn

• Reliability for a parallel system:
  Pr{Y(X₁, X₂, …, Xn) = 1} = Pr{max(X₁, X₂, …, Xn) = 1}
  = 1 − Pr{all Xᵢ = 0} = 1 − (1 − R₁)(1 − R₂)···(1 − Rn)

• Reliability for a k-out-of-n system:
  Pr{Y(X₁, X₂, …, Xn) = 1} = Pr{Σ_{i=1}^{n} Xᵢ ≥ k}
  In case R₁ = R₂ = … = Rn, the reliability can be determined using the binomial probability distribution.
• Cutsets and minimal cutsets: a cutset is a set of components whose failure will result in system failure. A minimal cutset is one in which all the components must fail in order for the system to fail.
• Pathways and minimal pathways: a pathway is a set of components whose functioning ensures that the system functions. A minimal pathway is one in which all components within the set must function for the system to function.

System Bounds

• Lower bound: the lower bound will be attained if the minimal cutsets share no common components. The lower-bound reliability is:
  R_L = ∏_{i=1}^{c} [1 − ∏_{k∈Sᵢ} (1 − R_k)], where c is the number of minimal cutsets Sᵢ.
• Upper bound: the upper bound will be attained if the minimal path sets share no common components (i.e., they are independent):
  R_U = 1 − ∏_{i=1}^{p} [1 − ∏_{k∈Tᵢ} R_k], where p is the number of minimal pathways Tᵢ.
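The two bounds can be sketched for the bridge network of the decomposition slide. The cutsets and path sets listed below are assumptions about that network, and equal component reliability R is an illustrative simplification.

```python
# Min-cut lower bound and min-path upper bound with equal reliability R
def lower_bound(min_cutsets, R):
    out = 1.0
    for cut in min_cutsets:
        prob_all_fail = 1.0
        for _ in cut:
            prob_all_fail *= (1.0 - R)  # all components in the cutset fail
        out *= (1.0 - prob_all_fail)
    return out

def upper_bound(min_paths, R):
    out = 1.0
    for path in min_paths:
        prob_all_work = 1.0
        for _ in path:
            prob_all_work *= R          # all components in the path work
        out *= (1.0 - prob_all_work)
    return 1.0 - out

# Assumed minimal cutsets and pathways of the A,B,C,D,E bridge network
cuts = [("A", "B"), ("C", "D"), ("A", "E", "D"), ("B", "E", "C")]
paths = [("A", "C"), ("B", "D"), ("A", "E", "D"), ("B", "E", "C")]
R = 0.9
print(round(lower_bound(cuts, R), 4), "<=", round(upper_bound(paths, R), 4))
```

The true reliability of the network lies between the two printed values.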
Common-Mode Failure

• Several components may fail through a common-mode failure, e.g., a shared power source, external loads, vibrations, etc.
• A common-mode failure can be depicted in series with those components sharing the failure mode.
• In order to represent the system in series, it must be possible to separate independent failures from common-mode failures.

[diagram: three parallel components R1, R2, R3 in series with the common-mode block R']

Rs = [1 - (1 - R1)(1 - R2)(1 - R3)] · R'

Three-state devices

• Three-state devices are components that have three states: an open failure mode, a short failure mode, and an operating state. Examples include electrical circuits, flow valves, etc.
– It is interesting to note that, for systems comprising these components, redundancy may either increase or decrease the system reliability.
– An alarm system is a three-state device, which may fail safe (false alarm) or may fail to danger (failure to function when needed).
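The common-mode formula above can be sketched as follows (all reliability values are illustrative assumptions):

```python
def system_with_common_mode(r1, r2, r3, r_common):
    """Rs = [1 - (1 - R1)(1 - R2)(1 - R3)] * R': the redundant trio
    in series with the shared common-mode 'component' R'."""
    redundant_part = 1 - (1 - r1) * (1 - r2) * (1 - r3)
    return redundant_part * r_common

# Even a highly redundant trio is capped by the common-mode term:
print(round(system_with_common_mode(0.9, 0.9, 0.9, 0.95), 5))  # 0.94905
```

No matter how much parallel redundancy is added, Rs can never exceed R', which is why separating out common-mode failures matters.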

Three-state devices

• Assumptions
– Failure modes are mutually exclusive
– All components composing the system are independent

Series structure

• For the system to fail short, both switches must fail short. For the system to fail open, at least one switch must fail open.
• E1 = the event that both switches fail short
• E2 = the event that at least one switch fails open
and, since E1 and E2 are mutually exclusive,

Q = P(E1 ∪ E2) = P(E1) + P(E2)
R = 1 - Q
Series structure

qoi = the probability that component i fails open
qsi = the probability that component i fails short

Then, Q = (qs1·qs2) + (qo1 + qo2 - qo1·qo2) and therefore,

R = 1 - [(qs1·qs2) + (qo1 + qo2 - qo1·qo2)]
  = (1 - qo1)(1 - qo2) - qs1·qs2

System reliability is the probability that no component fails open minus the probability that all components fail short.

This can be generalized for n components in series as:

R = Π_{i=1..n} (1 - qoi) - Π_{i=1..n} qsi

Parallel structure

• For the system to fail short, one or both switches must fail short. For the system to fail open, both switches must fail open.
• E1 = the event that both switches fail open
• E2 = the event that at least one switch fails short

Q = P(E1 ∪ E2) = P(E1) + P(E2)
R = 1 - Q

Parallel structure

Q = (qo1·qo2) + (qs1 + qs2 - qs1·qs2) and

R = 1 - [(qo1·qo2) + (qs1 + qs2 - qs1·qs2)]
  = (1 - qs1)(1 - qs2) - qo1·qo2

Reliability is the probability that no component fails short minus the probability that all components fail open.

This can be generalized for n components in parallel as:

R = Π_{i=1..n} (1 - qsi) - Π_{i=1..n} qoi

Low-level Redundancy

[diagram: m positions in series, each position replaced by n identical components in parallel]

R_L = Π_{i=1..m} (1 - qoi^n) - Π_{i=1..m} [1 - (1 - qsi)^n]
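The generalized n-component series and parallel formulas for three-state devices can be sketched as follows (the open/short failure probabilities are assumed illustrative values):

```python
from math import prod

def series_reliability_3state(q_open, q_short):
    """R = prod(1 - qoi) - prod(qsi): no component open, minus all short."""
    return prod(1 - qo for qo in q_open) - prod(q_short)

def parallel_reliability_3state(q_open, q_short):
    """R = prod(1 - qsi) - prod(qoi): no component short, minus all open."""
    return prod(1 - qs for qs in q_short) - prod(q_open)

q_open, q_short = [0.05, 0.05], [0.02, 0.02]
print(series_reliability_3state(q_open, q_short))    # 0.95^2 - 0.02^2
print(parallel_reliability_3state(q_open, q_short))  # 0.98^2 - 0.05^2
```

With open failures dominant (qo > qs), the parallel arrangement comes out more reliable here; with short failures dominant, the series arrangement would. This is the quantitative side of the earlier remark that redundancy may either increase or decrease reliability for three-state devices.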
High-level Redundancy

[diagram: n identical branches in parallel, each branch consisting of m components in series]

R_H = (1 - Π_{i=1..m} qsi)^n - [1 - Π_{i=1..m} (1 - qoi)]^n

Thank you!

Questions?

Ming Yang, PhD, P.Eng.
Assistant Professor of Process and Offshore Safety
Safety and Security Science Section,
Faculty of Technology, Policy, and Management,
TU Delft, the Netherlands
Email: m.yang-1@tudelft.nl
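Under the identical-component assumption, the low-level and high-level redundancy formulas for three-state devices can be compared numerically (m, n and the failure probabilities below are illustrative assumptions):

```python
def low_level_redundancy(m, n, qo, qs):
    """R_L = prod_i (1 - qo^n) - prod_i [1 - (1 - qs)^n]:
    m series positions, each built from n parallel components."""
    return (1 - qo**n)**m - (1 - (1 - qs)**n)**m

def high_level_redundancy(m, n, qo, qs):
    """R_H = (1 - prod_i qs)^n - [1 - prod_i (1 - qo)]^n:
    n parallel branches, each a series chain of m components."""
    return (1 - qs**m)**n - (1 - (1 - qo)**m)**n

m, n, qo, qs = 3, 2, 0.05, 0.02
print(low_level_redundancy(m, n, qo, qs))
print(high_level_redundancy(m, n, qo, qs))
```

For these assumed values (open failures dominant), redundancy at the component level gives the higher system reliability; with other failure-probability mixes the ranking can flip, so each case must be checked.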

Operational Safety Economics

Lecture 05 of TPM024A

Dr. Ming Yang, Assistant Professor of Safety and Security Science
Faculty of Technology, Policy, and Management
May 30th 2022
Email: m.yang-1@tudelft.nl

Why? - Safety and Security Concerns

• Prudence regarding industrial activities should be present in every industry, and certainly also in the industries using hazardous materials
• Characteristics of the chemical-using industries: use of hazardous materials, existence of chemical industrial parks, license to operate/acceptability linked with reputation, high uncertainties linked with debatable opinions
• The Netherlands & Belgium: densely populated areas combined with highly concentrated chemical industrial activities
• The Rotterdam & Antwerp Port Areas are part of the “ARRRA” and are extremely important for the Dutch (/Belgian/German/European) economies
Presentation outline
1. What? (Definition / what exactly are we discussing)

2. Why? (Some figures and psychological info)

3. How? (Conceptual models and theoretical info)

4. With? (Practical models/tools and concrete data/examples)

5. New? (Innovative ways of dealing with economics of


operational safety and prevention investments)

6. Recommendations? (Things to remember)


1. What?

• What exactly are we discussing?

Operational safety and prevention economics?

• We are talking about:
• Decision-making regarding prevention investments for avoiding or mitigating possible future consequences (losses) due to unwanted events related to company operations and activities.
One of the domains of operational risk management

2. Why?

• Some figures and psychological info

Some accident & accident cost figures

• Daily 18,900 accidents at work in Europe (EU-27, data from 2007),
• Of which 13,700 result in a sick leave of at least one day
• Of which 4,100 result in a sick leave of at least one month
• Of which 275 lead to permanent incapacity to work
• Of which 16 are fatalities

• Costs to employers in France due to workplace accidents and work-related ill-health (extrapolations based on a study in the UK): 4.5 – 8.9 billion € (EU, 2011)

• The benefit-cost ratio based on 56 prevention projects varies between 1.21 and 2.18 (benOSH case studies, EU, 2011)

• According to a large-scale study (involving 300 companies), 75% of the companies indicate that additional investments in occupational safety would lead to company costs remaining the same or decreasing over the long term (ISSA, 2012); the perceived B/C ratio of these companies was between 1 and 1.99
Economic consequences of safety and prevention for a company

Some derived facts

• Too many accidents still happen, even in the EU with all the regulations and experience

• It pays off to invest in prevention and safety

→ So why is it then so difficult to have managers invest in safety and prevention in organisations?

Individual Psychological background: Loss aversion and prevention investments

• Suppose you are offered two options:
– (A) You receive 5,000€ from me (with certainty); and
– (B) We toss a coin. You receive 10,000€ from me if it is heads; otherwise (if it is tails), you receive nothing.
What will you choose?

• Let’s now consider two different options:
– (C) You have to pay me 5,000€ (with certainty); and
– (D) We toss a coin. You need to pay me 10,000€ if the coin turns up heads; otherwise (in case of tails), you don’t need to pay me anything.
What will you choose?

• By far most people prefer option (A) in the first case and (D) in the second case.

• Hence, they go for certainty regarding the positive risk (getting 5,000€ with certainty), and at the same time they take the gamble as regards the negative risk, risking having to pay 10,000€ with a level of uncertainty (there is a 50% probability that they will not have to pay anything) instead of paying 5,000€ for certain.

→ The result is NOT LOGICAL: “Loss Aversion”
Loss Aversion

Translating this psychological principle into safety terminology, it is clear that company management would be more inclined to invest in production (‘certain gains’) than to invest in prevention (‘uncertain gains’).

Also, management is more inclined to risk highly improbable accidents (‘uncertain losses’) than to make large investments (‘certain losses’) in dealing with such accidents.

We do not gamble with gains, while we tend to gamble with losses (because we really hate to lose)!

The brain of the manager and the position of operational safety – how it should be

Balance between safety and production
Prediction

What operational safety economics can do is help to find the balance.

Operational safety and prevention economics = emerging field of interest to academia and industry

→ Will be much more important in future academic research AND industrial decision-making

Let us now see the current theories, R&D, and industrial practices of the field of interest!

3. How?

• Conceptual models and theoretical info

Costs and benefits w.r.t. prevention: how should we interpret them?

• “Costs” = costs of prevention measures (‘control costs’)
OR: Costs = costs of accidents that happened (‘failure costs’)

• “Benefits” = averted costs
Thus: hypothetical benefits of losses that never occurred (accidents that never happened) due to the taking of prevention measures
OR: Benefits = cost savings through prevention of disruptions + added value generated by prevention
Costs of prevention measures (‘control costs’)

• Staffing costs of the company HSE department
• Staffing costs for the rest of the personnel (time needed to implement safety measures, time required to read working procedures and safety procedures, etc.)
• Procurement and maintenance costs of safety equipment (e.g., fire hoses, fire extinguishers, emergency lighting, cardiac defibrillators, pharmacy equipment, etc.)
• Costs related to training and education w.r.t. working safely
• Costs related to preventive audits and inspections
• Costs related to exercises, drills, and simulations w.r.t. safety (e.g., evacuation exercises, etc.)
• A variety of administrative costs
• Prevention-related costs for early replacement of installation parts, etc.
• Investigation of near-misses and incidents
• Maintenance of the machine park, tools, etc.
• Good housekeeping
• …

Costs of accidents (non-exhaustive), per interested party

Victim(s)
– Non-quantifiable consequences: pain and suffering; moral and psychic suffering; loss of physical functioning; loss of quality of life; health and domestic problems; reduced desire to work; anxiety; stress
– Quantifiable consequences: loss of salary and bonuses; limitation of professional skills; time loss (medical treatment); financial loss; extra costs

Colleagues
– Non-quantifiable consequences: bad feeling; anxiety or panic attacks; reduced desire to work; stress
– Quantifiable consequences: time loss; potential loss of bonuses; heavier work load; training and guidance of temporary employees

Organisation
– Non-quantifiable consequences: deterioration of social climate; poor image, bad reputation
– Quantifiable consequences: internal investigation; transport costs; medical costs; lost time (informing authorities, insurance company, etc.); damage to property and material; reduction in productivity; reduction in quality; personnel replacement; new training for staff; technical interference; organisational costs; higher production costs; higher insurance premiums; administrative costs; sanctions imposed by the parent company; sanctions imposed by the government; modernization costs (ventilation, lighting, etc.) after inspection; new accidents indirectly caused by the accident (due to personnel being tired, inattentive, etc.); loss of certification; loss of customers or suppliers as a direct consequence of the accident; a variety of administrative costs; loss of bonuses; loss of interest on lost cash/profits; loss of shareholder value

Insured and uninsured costs of accidents

Insured costs:
• Insurance premiums
• Medical bills
• Indemnity payments
• Temporary disability payments
• Employers liability
• Public liability
• Product liability
• …

Uninsured costs:
• Product and material damage
• Lost production time
• Legal costs
• Overtime & temporary labour
• Investigation time / administration
• Supervisors’ time
• Fines
• Loss of expertise/experience
• Loss of morale
• Bad publicity
• …

Hypothetical benefits

Take the accident costs for a variety of scenarios: these are all ‘avoided accident costs’ if certain prevention investments (related to the scenarios considered) are made!

Other hypothetical benefits (difficult to calculate):
Added value generated by:
– Increased employee motivation and satisfaction
– Sustained focus on quality and better quality of products
– Product innovations
– Better corporate image
– …
Hypothetical benefits

Hypothetical benefits are all costs related to accidents which have never occurred.

Costs of incidents and accidents that happened (‘failure costs’) and costs of incidents and accidents that were avoided and never happened (‘hypothetical benefits’) are different in nature, due to the number of scenarios of possible accidents.

Nonetheless, their analogy is clear, and therefore they can easily be confused when making economic considerations.

However, lower failure costs do not imply an equivalent amount of higher hypothetical benefits; there are many more hypothetical benefits when prevention investments go up!

Two kinds of hypothetical benefits

• The hypothetical benefit of a risk treatment option can be regarded in two ways:

• Definition (i): as the difference between the highest possible costs of an accident in the current situation and those of an accident after applying the treatment measure. Hence:

Maxmax Hypothetical Benefit = Maximum possible accident cost without any treatment – Maximum possible accident cost after the risk treatment

• Definition (ii): as the difference between the costs of retention when doing nothing (taking no action) and those of the possible accident after applying the treatment measure. Hence:

Expected Hypothetical Benefit = Cost of retention – Expected possible cost of accident after the risk treatment

Minimum total cost point (‘prevention costs’ = control costs; ‘accident costs’ = failure costs)

• For each company the break-even point is different
• No hypothetical benefits are taken into consideration in this figure
• Only a re-active way of economic analysis (based on accident/failure statistics over time, linked with prevention efforts)! (→ not usable for disasters)
• Nonetheless, all accidents are treated similarly
Accident Typology

• Accidents where a satisfactory amount of information is available (Type I accidents or non-major accidents)

• Accidents where very scarce information is available (Type II accidents or major accidents)

• Accidents where no information is available (Type III accidents or black swan accidents) = extremum of Type II accidents

Risk typology

Type I accidents: the Egyptian pyramid model

Type I and II accidents: the Mayan pyramid model
Hence: different types of risk, also for decision-making and operational economics!

Risks possibly leading to minor occupational accidents (“Type I”) are not to be confused with risks possibly leading to disasters and major accidents (“Type II and Type III”) when making prevention investment decisions.

Statistical cost-benefit methods may yield reliable results for Type I risks, whereas the (maxmax) hypothetical benefits of Type II risks almost always outweigh the prevention costs for these types of risks.

Some further considerations on economic prevention analyses

• Economic analyses can support normative risk control decisions but cannot be used to determine the efficiency and effectiveness of prevention measures.

• Economic analyses require debatable information, e.g., the price of a fatality, the price of a finger cut off, the question of who pays which costs, the question of who receives which benefits, etc.

Some considerations on economic approaches in prevention decision making

Choices between safety and prevention measures, constrained by the available H&S budget.

Example: (low probability, high consequence) or (higher probability, lower consequence) accident reduction?

→ Evaluate the best value-for-money risk reduction measures

Economic approaches in safety decision-making

• Many critiques can be formulated on the concept of ‘economic approaches for safety decisions’, e.g.:
– An economic approach for safety decisions gives the industry the aura of being more scientific about the prevention measures taken
– Economic approaches and processes allow governments and organisations to hide behind ‘rationality’ and ‘objectivity’
– Analysts know that economic assessments are often based on selective information, arbitrary assumptions, and enormous uncertainties. Nonetheless they accept that the assessments are used to conclude on risk acceptability.

• HOWEVER: THERE IS NO ALTERNATIVE (!): to support and continuously improve decision-making about prevention and safety measures, we need to make economic assessments. The right way forward is not to reject the economic approach in safety decision-making, but to improve the tools and their use!
Economic approaches in safety decision-making

• Two fundamental scientific requirements should be met:

1. Reliability requirements of the economic assessment:
• The degree to which the economic analysis methods produce the same results at reruns of these methods
• The degree to which the economic analysis produces identical results when conducted by different analysis teams, but using the same methods and data
• The degree to which the economic analysis produces identical results when conducted by different analysis teams with the same analysis scope and objectives, but no restrictions on methods and data

2. Validity requirements of the economic assessment:
• The degree to which the produced economic/financial numbers are accurate compared to the underlying true number
• The degree to which the assigned probabilities adequately describe the assessor’s uncertainties about the unknown quantities considered
• The degree to which the epistemic uncertainty assessments are complete
• The degree to which the economic analysis addresses the right quantities

Strengths and weaknesses of economic analyses w.r.t. safety

Things to consider when using an economic analysis for safety decisions:
• An economic analysis is only as accurate as its input information
• It is often easier to obtain data on costs than on potential benefits
• Indirect, invisible costs and benefits might play an important role in the (lack of) accuracy of an economic analysis
• The perceptions of the people carrying out the analysis should be as objective as possible
• An analysis creates an image of precision, but it is not precise
• Net present values cannot be guaranteed to be realistic
• Philosophical difficulties: valuation of ill-health, life, etc.

4. With?

• Practical models/tools and concrete data/examples
Helping decision makers to take prevention investment decisions: current decision models and decision support systems

• “Quick and dirty” calculations
• Calculations for costs and prevention investments regarding Type I accidents
• Calculations for costs and prevention investments regarding Type II accidents

Five-step approach

1. Preparation (scope, goals, suitable technique, parties to be involved)
2. Selecting variables and indicators (smart choice: in line with goals etc., data available, agreed upon)
3. Finding data (available, extrapolation, new)
4. Valuations and calculations (attach money values to quantified variables and indicators; understandable presentation of results)
5. Interpretation of results and refinement

Helping decision makers to take prevention investment decisions: current decision models and decision support systems

• Cost-benefit analysis: incorporates valuations of costs and benefits into the calculations, to make decisions.

• Cost-effectiveness analysis: is used to compare the costs associated with a range of risk control measures that achieve similar benefits, in order to identify the least-cost option and maximize the return from a given budget.

(Both approaches can be used for Type I and Type II prevention investments, but in a different way for the two types!)

C/B analyses for accidents: basic idea/approach
“Quick and dirty” accident cost calculations: only possible for Type I accidents

• Use e.g. the Type I pyramid.

Quick-calculation example

• Use e.g. the Bird pyramid:

Sort of incident/accident   Bird pyramid ratio   Number of incidents/accidents   Cost per sort   Cost
Serious                     1                    N                               x               N·x
Minor injury                10                   10·N                            y               10·N·y
Property damage             30                   30·N                            z + t           30·N·(z + t)
Incident                    600                  600·N                           s               600·N·s

Total cost: N·(x + 10·y + 30·(z + t) + 600·s)
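The pyramid total can be sketched as a one-line function (N and the unit costs below are illustrative assumptions, not figures from the lecture):

```python
def bird_pyramid_cost(N, x, y, z, t, s):
    """Total cost = N*(x + 10*y + 30*(z + t) + 600*s), following
    the 1 : 10 : 30 : 600 Bird-pyramid ratios."""
    return N * (x + 10 * y + 30 * (z + t) + 600 * s)

# e.g. 2 serious accidents per year, unit costs in euro (assumed values):
print(bird_pyramid_cost(N=2, x=50_000, y=2_000, z=1_000, t=500, s=50))  # 290000
```

Only the count of serious accidents (N) and a handful of average unit costs are needed, which is exactly why the method is “quick and dirty” and limited to Type I accidents.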

Quick and dirty CB-analysis of safety measures for Type I risks

{(Cwithout × Fwithout) - (Cwith × Fwith)} × Prcontrol > Safety measure cost

Or, if not sufficient information regarding the initiating events’ frequencies is available for using the previous equation:

(Cwithout - Cwith) × Faccident × Prcontrol > Safety measure cost

With:
Cwithout = cost of the accident without the safety measure
Cwith = cost of the accident with the safety measure
Fwithout = statistical frequency of the initiating event if the safety measure is not implemented
Fwith = statistical frequency of the initiating event if the safety measure is implemented
Faccident = statistical frequency of the accident
Prcontrol = probability that the safety measure will perform as required

Type I models/tools?

• A practical tool for the estimation of the direct and indirect costs and prevention investments of non-major accidents.
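The two screening inequalities above can be sketched as simple checks (all the numbers in the example are illustrative assumptions):

```python
def worthwhile_with_frequencies(c_without, c_with, f_without, f_with,
                                pr_control, measure_cost):
    """{(Cwithout*Fwithout) - (Cwith*Fwith)} * Prcontrol > measure cost."""
    return (c_without * f_without - c_with * f_with) * pr_control > measure_cost

def worthwhile_simple(c_without, c_with, f_accident, pr_control, measure_cost):
    """(Cwithout - Cwith) * Faccident * Prcontrol > measure cost."""
    return (c_without - c_with) * f_accident * pr_control > measure_cost

# Accident cost drops from 200 k€ to 50 k€, initiating-event frequency
# from 0.1/yr to 0.05/yr, and the measure works on demand 90% of the time:
print(worthwhile_with_frequencies(200_000, 50_000, 0.1, 0.05, 0.9, 10_000))  # True
```

Note that both checks compare an expected yearly saving against the measure cost, so the measure cost should be expressed on the same (e.g., annualized) basis as the frequencies.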
More rigorous calculations for costs and prevention investments regarding Type I accidents

• Definitions
– Occupational accidents: a lesion, caused by a sudden event, from an external source, occurring during the execution of work and due to the execution of work
– Incidents

[figure: accident pyramid with ratios 1 : 29 : 300]

• Ratios of direct to indirect costs:

Author         Fixed ratio
Heinrich       1/4
Bird           1/6
Rikhardsson    1/1.5
RIVM           1/5
E&P FORUM      1/4
France         1/3

Safety and prevention economic calculations: a number of models/tools exist

Models are often quite simplistic, providing only a very rough idea of, for example:
– The costs of incidents/accidents
– The possible benefits of a prevention measure
– The different cost categories and their importance
– Annual cost savings due to prevention
– …

Three types of available tools

• Analyze occupational accidents and assess their economic impact (only costs)

• Develop and evaluate prevention measures and their costs and benefits (= averted accident costs), eventually linked with an investment analysis

• Carry out an investment analysis (compare different investment alternatives)

Investment analysis parameters

• NPV (net present value)
• IRR (internal rate of return): the rate r for which NPV(r) = 0
• Profitability Index or Benefit-Cost Ratio: NPV / C0
• PBP (payback period): the time P for which the cumulative cash flow y(P) = ∫0..P x(t) dt = 0
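The four parameters can be sketched for a single prevention investment (the cash flows below are assumptions; IRR is found here by simple bisection on NPV(r) = 0):

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] is the investment at t = 0 (negative)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

def irr(cash_flows, lo=0.0, hi=1.0, tol=1e-7):
    """Internal rate of return: the r with NPV(r) = 0, assuming NPV
    is positive at lo and negative at hi."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if npv(mid, cash_flows) > 0 else (lo, mid)
    return (lo + hi) / 2

def payback_period(cash_flows):
    """First period in which the cumulative (undiscounted) cash flow >= 0."""
    cumulative = 0.0
    for t, cf in enumerate(cash_flows):
        cumulative += cf
        if cumulative >= 0:
            return t
    return None

flows = [-50_000] + [20_000] * 4             # investment, then 4 years of savings
print(round(npv(0.05, flows), 2))            # discounted at an assumed 5%
print(round(irr(flows), 4))                  # rate where NPV crosses zero
print(payback_period(flows))                 # 3
print(round(npv(0.05, flows) / 50_000, 3))   # profitability index NPV / C0
```

The payback period ignores discounting and anything after the break-even year, which is why NPV/IRR are usually preferred for comparing prevention investments with long lifespans.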
An example of a Type I model/tool for cost calculations and prevention investment calculations – building the model

The model

• Porter Business in Context Model
• MEEMO Model

[diagram: MEEMO scheme with Man, Equipment, Energy, Material and Organisation as inputs, and Product and Environment as outputs]

Influencing parameters

A. MAN
B. EQUIPMENT
C. INTERNAL WORK ENVIRONMENT
D. MATERIAL/PRODUCTS/...
E. ORGANIZATION
F. PRODUCT (OUTPUT)
G. CUSTOMERS
H. SUPPLIERS
I. EXTERNAL ENVIRONMENT

[diagram: MEEMO influence scheme linking Man, Equipment, Material, Energy and Organisation to consequences such as health damage, damage, exhaustion, energy loss, social disturbances, bad products, waste of raw materials, lost deals, raised premiums, fines and bad publicity]

DIRECT COSTS
• medical costs, costs for prostheses, costs of orthopaedic material, travelling costs;
• costs due to the temporary disability of the victim;
• costs due to the possible permanent disability of the victim;
• costs due to the disease of the victim.

In case of a fatal accident, aside from the possible costs above, there are other costs such as the funeral costs, a percentage of the wage of the victim as an indemnity for the persons left behind, etc. This is also clearly described in the Belgian legislation.
Integrate supplementary information

[screenshot: ‘E. ORGANIZATION’ sheet of the tool – days spent and average cost per day for dealing with the incident (first-aid treatment, taking the injured person to hospital/home), investigating it (safety department, direct management, HR and legal administration, time spent with the local OSH authority, external OSH services, reporting to the board of directors / head office), getting back to business (recruitment and salary of the replacer, reorganisation of work and training) and safeguarding future business (in-company promotion, communication to personnel, social disturbances with trade unions, personnel turnover due to the accident(s))]

Tool can be used for estimating the accident costs

[screenshot: ‘Cost Estimator of Occupational Accidents’ spreadsheet – general data (business sector, NACE code, name of the company, victim(s), function and criticality for business continuity, causing event, type of lesion, type of accident under Belgian legislation, description), direct costs (A. MAN: employer charges on the payment, medical costs, travelling costs, prostheses and orthopaedic material, temporary and permanent disability as % invalidity, and, in case of death, funeral costs and indemnities), and indirect costs per category (B. EQUIPMENT/MACHINES: making the area safe, damaged protective equipment, repairs and replacement of working gear, vehicles or installations; C. INTERNAL WORK ENVIRONMENT; D. MATERIAL/PRODUCTS: damaged stocks, substitution of products; E. ORGANIZATION; F. PRODUCT (OUTPUT): lost production, rework and rejections, lost or cancelled orders, warranties; G. CUSTOMERS: communication, price reductions and penalties, lost deals and clients; H. SUPPLIERS: external OSH service, psychological support, raised insurance premium, temp office; I. EXTERNAL ENVIRONMENT: extra government control, legal costs and fines, press releases, environmental damage, loss of image on the customer and labour markets), summing to a total cost and a ratio of direct to indirect costs]

Tool can be used for carrying out an investment analysis

[screenshot: investment-analysis sheet of the ‘Cost Estimator of Occupational Accidents’ – inputs: invested amount, government subsidies, yearly accident costs, number of workers (FTE), number of accidents and target reduction of injuries; outputs: payback period and rate of return for Case 1 (target reduction achieved immediately) and Case 2 (reduction achieved gradually: 20% after 1 year up to 100% after 5 years), the amount of sales needed to replace lost profits (if the profit margin is 5%, it takes 20 euro of sales to replace every euro of loss; linked to a balance-sheet database of value added per employee), and the decrease of the insurance premium]

Example:

Occupational accident: after putting away the goods, the victim had not lowered the forks of a forklift truck and consequently drove the forks against a supporting beam of the ceiling. As a result, the forklift truck overturned.

Injury: fractures
Time of absence: 181 calendar days
Type II models/tools?

• A practical tool for the estimation of the direct and indirect costs and prevention investments of major accidents.

Example result for the forklift accident (translated from Dutch):

Direct cost (15,502 €) + indirect cost (29,289.40 €) = total cost 44,791.88 €
Ratio of direct cost to indirect cost: 1/1.89

Remarks: The reach truck would have been written off by September 2008 (residual value at total loss: 3,047 euro). The machine was a total loss and the insurance only paid out the residual value. Renting a reach truck for three months: 3,350 euro. The new value of a reach truck is 25,050 euro.

Cost-benefit analysis for Type II risks: need for a specific tool

• Identification of costs and benefits (being different in nature) for major accident scenarios (hence: hypothetical scenario-based benefits!)

• Calculation of the net present value of all costs and benefits (using a discount rate, probabilities or frequencies, and an installation lifespan)

• Comparison of total (summed) discounted costs and benefits

• Use of a “disproportion factor”:
Costs > Benefits × Disproportion factor

Scope: “major accident”

Definition (for example):
“Major accidents are accidents that deviate from normal expectations, which cause at least several fatalities on site and one fatality and many injured off site, and/or important environmental damage, and/or material damage of tens of millions of euros, and/or international press attention.”
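The decision rule and the NPV step can be sketched together (the discount rate, lifespan, yearly averted loss, and the disproportion factor of 3 are all illustrative assumptions):

```python
def discounted_benefit(annual_averted_loss, rate, lifespan_years):
    """NPV of the yearly averted accident cost over the installation lifespan."""
    return sum(annual_averted_loss / (1 + rate) ** t
               for t in range(1, lifespan_years + 1))

def grossly_disproportionate(cost, benefit, disproportion_factor):
    """Costs > Benefits x Disproportion factor: if True, the measure's
    cost may be judged grossly disproportionate to the risk reduction."""
    return cost > benefit * disproportion_factor

benefit = discounted_benefit(annual_averted_loss=40_000, rate=0.05,
                             lifespan_years=20)
print(round(benefit, 2))
print(grossly_disproportionate(cost=2_000_000, benefit=benefit,
                               disproportion_factor=3))   # True
```

The disproportion factor deliberately tilts the comparison in favour of safety: a measure is only rejected when its cost exceeds the hypothetical benefit by that factor, not merely when it exceeds the benefit.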
Costs and benefits?
(Prevention) cost categories

• Initial costs: research, selection and design, material, training and information, adaptation of the guidelines & work procedures, etc.
• Installation costs: production losses, start-up, installation team, tools and utensils
• Exploitation costs: utility costs
• Maintenance costs: material, maintenance team, production losses, start-up
• Inspection costs: inspection team

Costs and benefits?
Benefit (avoided accidents) categories

• Production chain benefits: production losses, start-up, planning
• Benefits regarding damages and losses: material/property/asset damages and losses of the own company, material/property/asset damages and losses of other companies, of neighbouring living areas, of public property, …
• Judicial benefits: penalties, interim lawyers, specialized lawyers, internal investigation team, experts, legislation changes, permits
• Insurance benefits: insurance premium

Costs and benefits? Reputation benefits?


Benefit (avoided accidents) categories
BP Share Price (GBP)
700
• Human- and environmental benefits: victims, injuries, recruitment, 650
600
550
environmental damages 500
450
400
350
300 BP Share Price (GBP)
250
• Intervention benefits: intervention

02/02/2010
29/03/2010
26/05/2010
21/07/2010
15/09/2010
09/11/2010
06/01/2011
02/03/2011
28/04/2011
27/06/2011
19/08/2011
14/10/2011
08/12/2011
06/02/2012
30/03/2012
29/05/2012
25/07/2012
19/09/2012
13/11/2012
10/01/2013
06/03/2013
02/05/2013
Table 13: BP Share Price, Key Values
Source: Based on (BP, 2013)

• Reputation benefits: stock exchange value of shares maximum: £655,40 20/apr/10


minimum: £302,90 29/jun/10
drop: £352,50

• Other benefits: managers’ time, clean-up drop in %: 53,78%

current value: £466,25 2/mei/13

69 70
Disproportion factor:
research in progress

[Figure: disproportion factor (vertical axis, from 1 up to 30) plotted against risk region:
Level 1 risks (Intolerable region): first priority;
Level 2 risks (ALARP region): tolerable if ALARP;
Level 3 risks (Broadly Acceptable region): acceptable]

Tool for type II risk economic analyses

– Cost-benefit analysis for x measures for 1 scenario
– Cost-effectiveness analysis for x measures for 1 scenario
– Cost-effectiveness analysis for y measures for z scenarios

1. ‘Guidelines and instructions’ sheet
2. ‘Data Input’ sheet
3. ‘Cost-Benefit Analysis’ sheet
4. ‘Cost structure’ sheet
5. ‘Benefit structure’ sheet
6. ‘Cost-Effectiveness Analysis’ sheet
7. ‘Report’ sheet
8. ‘History’ sheet
9. ‘Optimization tool’ sheet
71 72

Use of the tool?

• Optimize the current/existing decision process
regarding prevention measures investment

• Approach for safety/prevention managers to
convince others of safety measure investments

Type I and II cost-effectiveness
analyses?

• A practical approach.
73 74
Building blocks for the approach

• Building block 1: The risk matrix
• Building block 2: Costs and benefits (needed as input)

Cost-effectiveness analysis

Approach: “knapsack problem” formulation, a method used in combinatorial
optimization (the so-called ‘knapsack problem’):

max Σi Bi xi
s.t. Σi Ci xi ≤ Butot
     xi ∈ {0,1}

A number of assumptions are implicitly taken in this formulation:
• A measure is either taken or not (it cannot be partially taken);
• The total benefit of all measures taken is the sum of the individual benefits of the chosen
measures;
• The total cost of all measures taken is the sum of the costs of the individual measures;
• Measures can be independently implemented, without consequences for the other measures.

75 76
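The 0/1 knapsack formulation above can be sketched in a few lines; the costs, benefits, and budget below are hypothetical illustration values (exhaustive enumeration is used for clarity, not a production solver):

```python
from itertools import product

# Pick x_i in {0,1} maximizing sum(B_i x_i) subject to sum(C_i x_i) <= budget.
def best_portfolio(costs, benefits, budget):
    best_x, best_benefit = None, -1.0
    for x in product((0, 1), repeat=len(costs)):
        cost = sum(c * xi for c, xi in zip(costs, x))
        benefit = sum(b * xi for b, xi in zip(benefits, x))
        if cost <= budget and benefit > best_benefit:
            best_x, best_benefit = x, benefit
    return best_x, best_benefit

costs    = [35, 295, 400, 13000]      # C_i, hypothetical, in euro
benefits = [67.5, 675, 742.5, 22500]  # B_i, hypothetical benefits in euro
x, total = best_portfolio(costs, benefits, budget=13500)
print(x, total)
```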

Building block 1: The risk matrix
(DoD, 2000)

[Risk matrix: severity of consequences (I Catastrophic, II Critical, III Marginal,
IV Negligible) versus probability of hazard (F Impossible, E Improbable, D Remote,
C Occasional, B Probable, A Frequent); the cells are grouped into regions 1.–4.]

Risk Code / Actions:
1. Unacceptable
2. Undesirable
3. Acceptable with controls
4. Acceptable

Discretization of the risk matrix

Every cell of the risk matrix corresponds with a certain cost Ci.
(This cost is the total cost of all risks of a certain type (e.g. fire risks) together
within that cell – hence, first one looks at the likelihood as well as at the
consequences, second one sums the consequences per cell, and third a final cell is
assigned to the package of risks.)

77 78
Building block 2: Cost-benefit analysis

• “Costs” = costs of prevention measures for decreasing from risk cell i to
risk cell j; called COPij

• “Benefits” = averted costs (thus: “hypothetical benefits due to the taking of
prevention measures”): calculated by determining the decrease of costs
related to risk cell i and risk cell j. This decrease can be calculated by
subtracting Cj from Ci.

The risk matrix, with cell cost per year

Likelihood | Cell cost Ci (in €/year), per consequence class (financial impact)
[year-1]   | < €7,500 | < €75,000 | < €750,000 | < €2,500,000
> 1        |    7,500 |    75,000 |    750,000 |    2,500,000
> 10-1     |      750 |     7,500 |     75,000 |      250,000
> 10-2     |       75 |       750 |      7,500 |       25,000
> 10-3     |      7.5 |        75 |        750 |        2,500
> 10-4     |     0.75 |       7.5 |         75 |          250

(Ideally, we have real cost values Ci, based on financial information available about risks in the organization)
(cf. as used in the tool under construction)
79 80
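The benefit calculation Bij = Ci − Cj can be sketched directly from the cell-cost matrix; cells are addressed here by (likelihood row, consequence column) indices, since the numbering of cells 1..20 used later is not reproduced here (an assumption of this sketch):

```python
# Rows: likelihood > 1, > 1e-1, > 1e-2, > 1e-3, > 1e-4 (per year).
# Columns: consequence classes < 7,500 / < 75,000 / < 750,000 / < 2,500,000 euro.
CELL_COST = [
    [7500.0, 75000.0, 750000.0, 2500000.0],
    [750.0, 7500.0, 75000.0, 250000.0],
    [75.0, 750.0, 7500.0, 25000.0],
    [7.5, 75.0, 750.0, 2500.0],
    [0.75, 7.5, 75.0, 250.0],
]

def hypothetical_benefit(cell_i, cell_j):
    """Benefit of moving a package of risks from cell i to a lower-cost cell j."""
    (ri, ci), (rj, cj) = cell_i, cell_j
    return CELL_COST[ri][ci] - CELL_COST[rj][cj]

# Reducing the likelihood of a < 7,500 euro consequence from > 1e-1 to > 1e-4:
print(hypothetical_benefit((1, 0), (4, 0)))  # 750 - 0.75 = 749.25
```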

Cost-benefit analyses

Needed input data (in general):
– n: the number of risk cells in the matrix
– Nc: the cells where risks do exist for the organization (Nc ⊂ {1, …, n})
– Ci: the cell costs
– Butot: the total available safety budget
– CoPij: the costs of prevention for going from risk cell i to risk cell j, ∀i, j with i ∈ Nc

Solving the knapsack problem

• Although the knapsack problem is NP-hard, it can be
solved efficiently even for very large instances.
• It can be solved by standard off-the-shelf commercial
software for mixed-integer programming, such as
– CPLEX (http://www.ibm.com/software/integration/optimization/cplex-optimizer/)
– Gurobi (http://www.gurobi.org)
– GLPK (http://www.gnu.org/software/glpk/)
– lpsolve (http://lpsolve.sourceforge.net)
– Even spreadsheet software such as Excel or LibreOffice includes a
solver that can be used to approach and optimize the safety
measures portfolio using the method described in this paper.

81 82
Illustrative example (1)

• Input information:

n = 20
Nc = 6: risk cells 3, 7, 10, 12, 13 and 15 from Figure 2
Ci = see risk matrix from Figure 1
Butot = €50,000
Costs of Prevention for going from risk cell i to risk cell j (CoPij) = see Table 7b

Illustrative example (2)

• Input information:

Prevention measure i→j | (Illustrative) costs of prevention CoPij (€) | Hypothetical benefits for going from i to j (€)
Start = Risk cell 3
3→2   |     35 |     67.5
3→1   |     42 |     74.25
Start = Risk cell 7
7→6   |    325 |    675
7→5   |    460 |    742.5
7→3   |    295 |    675
7→2   |    420 |    742.5
7→1   |    590 |    749.25
Start = Risk cell 10
10→9  |    330 |    675
10→6  |    350 |    675
10→5  |    390 |    742.5
10→2  |    400 |    742.5
10→1  |    880 |    749.25
Start = Risk cell 12
12→11 | 13,500 | 17,500
12→10 | 13,750 | 24,250
12→9  | 14,800 | 24,925
12→8  | 13,000 | 22,500
12→7  | 15,000 | 24,250
12→6  | 16,500 | 24,925
12→5  | 26,000 | 24,992.5
12→4  | 13,900 | 24,750
12→3  | 17,000 | 24,925
12→2  | 27,500 | 24,992.5
12→1  | 38,000 | 24,999.25
Start = Risk cell 13
13→9  |    410 |    675
13→5  |    550 |    742.5
13→1  |    700 |    749.25
Start = Risk cell 15
15→14 | 31,000 | 67,500
15→13 | 36,650 | 74,250
15→11 | 29,880 | 67,500

83 84

Illustrative example (3) Solution of the illustrative example (4) chosen cost benefit
Start = Risk cell
1
3
3 2 35 67.5 0 0 0
3 1 42 74.25 1 42 74.25
Start = Risk cell
to solve this problem, four conditions have to be met: (i) the total benefit of measures taken, 7
7 6 325 675
1

0 0 0
7 5 460 742.5 0 0 0

needs to be maximized; (ii) the available budget constraint needs to be respected; (iii) 7 3
7 2
295
420
675
742.5
1
0
295
0
675
0
7 1 590 749.25 0 0 0
Start = Risk cell
maximum 1 decrease per risk cell is allowed; and (iv) a measure can be taken, or not. These 10
10 9 330 675
0

0 0 0
10 6 350 675 0 0 0
10 5 390 742.5 0 0 0
conditions translate into the following mathematical expressions: 10 2 400 742.5 0 0 0
10 1 880 749.25 0 0 0
Start = Risk cell
1
12
12 11 13500 17500 0 0 0
12 10 13750 24250 0 0 0

åB
12 9 14800 24925 0 0 0

(i) max i , j ij xij 12 8


12 7
13000
15000
22500
24250
1
0
13000
0
22500
0
12 6 16500 24925 0 0 0
12 5 26000 24992.5 0 0 0

(ii) å CoP
i, j
ij £ Bu tot 12 4
12 3
12 2
13900
17000
27500
24750
24925
24992.5
0
0
0
0
0
0
0
0
0
12 1 38000 24999.25 0 0 0
Start = Risk cell
0
13
(iii) åxj
ij £1 13 9
13 5
13 1
410
550
700
675
742.5
749.25
0
0
0
0
0
0
0
0
0
Start = Risk cell
1
15

x ij Î {0,1}
15 14 31000 67500 0 0 0
(iv) 15 13
15 11
36650
29880
74250
67500
1
0
36650
0
74250
0

• Solution: total cost = 49,987€; total hypothetical benefit = 97,499.25€. Total hypothetical profit =
47,512.25€.

85 86
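The optimum of this illustrative example can be reproduced without a MIP solver by brute-force enumeration, which is feasible here because at most one decrease may be chosen per start cell (a sketch, using the cost/benefit data from the example tables):

```python
from itertools import product

# Per start cell: the list of (cost, benefit) pairs for each possible decrease.
OPTIONS = {
    3:  [(35, 67.5), (42, 74.25)],
    7:  [(325, 675), (460, 742.5), (295, 675), (420, 742.5), (590, 749.25)],
    10: [(330, 675), (350, 675), (390, 742.5), (400, 742.5), (880, 749.25)],
    12: [(13500, 17500), (13750, 24250), (14800, 24925), (13000, 22500),
         (15000, 24250), (16500, 24925), (26000, 24992.5), (13900, 24750),
         (17000, 24925), (27500, 24992.5), (38000, 24999.25)],
    13: [(410, 675), (550, 742.5), (700, 749.25)],
    15: [(31000, 67500), (36650, 74250), (29880, 67500)],
}
BUDGET = 50000

best = (0.0, 0)  # (benefit, cost) of the best feasible portfolio so far
for choice in product(*[[None] + opts for opts in OPTIONS.values()]):
    picked = [opt for opt in choice if opt is not None]
    cost = sum(c for c, _ in picked)
    benefit = sum(b for _, b in picked)
    if cost <= BUDGET and benefit > best[0]:
        best = (benefit, cost)

print(best)  # expected: benefit 97,499.25 at cost 49,987
```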
Possible approach refinements for further application in real
industrial practice

Relationships between measures:

In general, the portfolio of safety measures chosen by a company is subject to a
number of extra constraints that express relationships between these measures.
Fortunately, these relationships are generally easily added to the knapsack-based
approach, usually by introducing additional constraints.

Binary relationships:

Mutually dependent prevention measures (e.g., installing safety fire equipment and
training): the relationship between the risk cell decrease from r to s and from t to u
can be expressed in the approach by an extra constraint:

x(r→s) = x(t→u)

Another possibility:
if risk cell r is decreased, risk cell t has to be decreased, but the
reverse is not true.
(e.g. fire door and extra layer of fireproof coating)

Risk cell decrease r→s = installing the coating
Risk cell decrease t→u = installing the door

Then: extra constraint:
x(r→s) <= x(t→u)
87 88

Possible approach refinements for further application in real
industrial practice

Relationships between measures:

Yet another possibility:
either risk cell r or risk cell t needs to be decreased, but not both risk cells at
the same time.

(e.g. two measures are redundant (they duplicate each other’s effects), and
the organisation judges it superfluous to invest in both measures
simultaneously)

(e.g. protection of a machine from fire by procedural measures or by physical
measures (fire wall), but not by both)

Extra constraint:
x(r→s) = 1 – x(t→u)

Another possibility:
either risk cell r, or risk cell t, or both, need to be decreased.

(e.g. either invest in a company fire department or in a sprinkler
system, or in both)

Mathematical constraint:
x(r→s) + x(t→u) >= 1

89 90
Possible approach refinements for further application in real
industrial practice

Relationships between measures:

Another possibility:
if risk cell t is decreased, risk cell r cannot be decreased, and vice
versa. The possibility also exists that both measures are not taken.

(e.g., management has decided that at most one part of a facility will be
protected by a sprinkler system, but not two parts)

Mathematical constraint:
x(r→s) <= 1 – x(t→u)

Other relationships:

• In principle, all relationships between measures can be expressed in a
mathematical way as constraints

• Logical relationships can also be used, and expressed by operators
– NOT (risk cell i is not decreased)
– AND (risk cell i and risk cell j are decreased)
– OR (risk cell i or risk cell j is decreased)
– IMPLICATION (if risk cell i is decreased, then risk cell j is decreased)

These logic operators can be combined to express the most complex
logical relationships between safety measures.

91 92

Possible approach refinements for further application in real Possible approach refinements for further application in real
industrial practice industrial practice

Relationships between measures: Relationships between measures:

Example for logical operators: Logical equivalent: (M1 AND M2) AND NOT(M3) IF THEN M4 OR
M1: automatic fire door is installed [e.g. x(4à2)] M5
M2: fire alarm system is installed [e.g. x(6à2)]
M3: electricity system is upgraded [e.g. x(3à1)]
Converted into its conjunctive normal form:
M4: back-up generator is installed [e.g. x(7à3)]
M5: a link to an additional electricity system is installed [e.g. x(5à2)] (NOT(M1) OR NOT(M2) OR M4 OR M5) AND (M3 OR M4 OR M5)

The condition is the following: Which, in turn, can be translated into the following mathematical
If both the automatic fire door and the alarm system are installed, and the constraints which need to be met both:
electricity system is not upgraded, then either a back-up generator should be
installed, or a link to an additional power system should be purchased.
X(4à2) + x(6à2) – x(7à3) – x(5à2) <= 1
X(3à1) + x(7à3) + x(5à2) >= 1
Logical equivalent: (M1 AND M2) AND NOT(M3) IF THEN M4 OR M5

93 94
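A logic-to-linear-constraint translation of this kind can be sanity-checked by enumerating all binary assignments; here for the condition IF (M1 AND M2 AND NOT M3) THEN (M4 OR M5), whose single-clause CNF yields the constraint M1 + M2 − M3 − M4 − M5 ≤ 1:

```python
from itertools import product

# Condition: IF (M1 AND M2 AND NOT M3) THEN (M4 OR M5).
def condition(m1, m2, m3, m4, m5):
    return (not (m1 and m2 and not m3)) or (m4 or m5)

# CNF clause NOT(M1) OR NOT(M2) OR M3 OR M4 OR M5 as a linear constraint.
def constraint(m1, m2, m3, m4, m5):
    return m1 + m2 - m3 - m4 - m5 <= 1

# The constraint must accept exactly the assignments that satisfy the condition.
assert all(condition(*m) == constraint(*m) for m in product((0, 1), repeat=5))
print("constraint matches the logical condition on all 32 assignments")
```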
Possible approach refinements for further application in real
industrial practice

Relationships between measures:

Non-additivity situations:

The benefits and costs may not simply be additive.
(e.g. the effect of one fire door instead of none will be greater than the effect
of installing two doors instead of one, due to the diminishing rate of return of
the second door)

In such situations, ‘virtual’ measures need to be created in the cost-benefit
table to represent the action of taking both measures. If you want to ensure
that each measure is taken only once, additional constraints are also needed.

Example for the non-additive case:

The effect of combining risk cell decreases (3→1 and 7→3) does not yield a benefit of
A + B = C, but it yields a benefit of D (< C).
Suppose further that the cost of implementing (3→1 and 7→3) is not R + S = T, but is S
(< T).

This can be handled by adding an extra risk cell decrease with cost S and hypothetical
benefit D. Additionally, constraints are necessary to ensure that this extra measure is
not taken if either 3→1 or 7→3 is taken.

Thus:
x(extra risk cell decrease) <= 1 – x(3→1)
x(extra risk cell decrease) <= 1 – x(7→3)
x(3→1) + x(7→3) <= 1 (that is, 3→1 and 7→3 are not both chosen at the same time)

95 96

5. New?

• Innovative ways of dealing with the
economics of operational safety and
prevention investments?

Optimal diversified selection of prevention
measures within a budget constraint

97 98
Optimal diversified selection – application
of the Lagrange multiplier method

Assume the following functions for the curves of diminishing marginal rate of return on investment for
Technology, Organisation, and People respectively:

yT = 0.5 xT / (xT + 5000);  yO = 0.2 xO / (xO + 200);  yP = 0.3 xP / (xP + 2000)

Furthermore, the safety budget is, for example, set to be €20,000. Hence, since xi represents the safety
budget for the safety measure of type i, the condition xT + xO + xP = 20,000 can be drafted. The following
maximization problem thus arises:

max(xT, xO, xP) [ 0.5 xT / (xT + 5000) + 0.2 xO / (xO + 200) + 0.3 xP / (xP + 2000) ]
s.t. xT + xO + xP ≤ 20,000

To solve the problem, the Lagrange function is formulated:

L = 0.5 xT / (xT + 5000) + 0.2 xO / (xO + 200) + 0.3 xP / (xP + 2000) + λ · (20,000 − xT − xO − xP)

The four first-order conditions look as follows:

∂L/∂xT = 2500 / (xT + 5000)² − λ = 0
∂L/∂xO = 40 / (xO + 200)² − λ = 0
∂L/∂xP = 600 / (xP + 2000)² − λ = 0
∂L/∂λ = 20,000 − xT − xO − xP = 0

Solving this system of equations gives xT ≈ 11,700; xO ≈ 1,920; xP ≈ 6,350. Hence, under the made
assumptions, the safety budget of €20,000 should be allocated to technological measures for some
€11,700, to organisational measures for some €1,920, and to people-related measures for some
€6,350. Therefore, using a budget of €20,000 allows one to achieve a safety benefit of some
0.35 + 0.18 + 0.23 = 0.76, i.e. 76%.

Evaluating the + and – side of risk treatment

• Method for Risk Treatment decisions using:

– Cost = Cost of Risk Treatment (retention, transfer,
prevention, combination; taking probabilities into account)
– Min cost: NO accident cost; Max cost: worst-case
accident cost
– Variability = Max cost – Min cost
– Uncertainty = Variability / Max cost
– Hypothetical benefit = Cost of retention – Cost of
retention AFTER treatment
– Defining a maximum uncertainty level for the company
(e.g. 30%)

99 100
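The first-order conditions of this Lagrange problem have a closed-form solution: each condition gives x_i = sqrt(a_i b_i / λ) − b_i for a return curve y_i = a_i x_i / (x_i + b_i), and the budget constraint then fixes λ directly. A sketch (the exact optimum differs slightly from the rounded allocation quoted on the slide):

```python
from math import sqrt

CURVES = {"technology": (0.5, 5000.0),   # (a_i, b_i) for y_i = a_i x / (x + b_i)
          "organisation": (0.2, 200.0),
          "people": (0.3, 2000.0)}
BUDGET = 20000.0

# sum_i (sqrt(a_i b_i) / sqrt(lambda) - b_i) = BUDGET  =>  solve for 1/sqrt(lambda).
inv_sqrt_lam = (BUDGET + sum(b for _, b in CURVES.values())) / \
               sum(sqrt(a * b) for a, b in CURVES.values())

alloc = {name: sqrt(a * b) * inv_sqrt_lam - b for name, (a, b) in CURVES.items()}
benefit = sum(a * x / (x + b)
              for (a, b), x in ((CURVES[n], alloc[n]) for n in CURVES))

print({n: round(x) for n, x in alloc.items()})  # roughly 11,800 / 1,930 / 6,240
print(round(benefit, 2))                        # about 0.76
```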

Event trees and economics –


Scenario thinking approach for a

Evaluating + and – side of risk treatment


runaway reaction

101 102
Event trees and economics – cost-
Other promising techniques under
variable approach for domino effects
investigation
prevention
- Safety value function approach

- Multi-attribute utility approach

- Bayesian theory approach

- Limited memory influence diagram approach ?


- Monte Carlo simulations for safety economics

- Game theory applications for safety economics

103 104

Heuristic for economic decision-making

6. Recommendations

105 106
Conclusions & recommendations

• Preventing accidents is an important expenditure on a yearly
basis for organizations

• Optimizing prevention investments and making investment
decisions in a cost-efficient way is very rewarding for companies

• The two types of risks should be treated differently when making
economic analyses, and different techniques should be used

• Economic analyses should be explored and elaborated more in
depth to make them more accurate, more user-friendly, and
usable for companies

Thank you!
Questions?

Ming Yang, PhD, P.Eng.
Assistant Professor
Safety and Security Science Section,
Faculty of Technology, Policy, and Management,
TU Delft, the Netherlands
Email: m.yang-1@tudelft.nl

107 108

On the quantitative
resilience assessment of
complex engineered
systems

Dr. Ming Yang
Assistant Professor of Safety and Security Science
Faculty of Technology, Policy, and Management
TU Delft, The Netherlands

Outline

• Why assess resilience?
• What is to be assessed?
• How to assess resilience quantitatively?

1 2
Safety?

• What is Health?
  • Absence of illness/sickness
• How to measure and maximize?
  • Check for any illness or sickness
  • Avoid unhealthy conditions
  • Do regular checkups

• What is Safety?
  • Absence of accidents
• How to measure and maximize?
  • Check for any accident-causing situation – RISK
  • Minimize Risk
  • Monitor and Manage Risk

3 4

Safety? (resilience-based thinking)

• What is Health?
  • Strong immunity and absence of illness
• How to measure and maximize?
  • Do healthy things (exercise regularly,
    hydrate, get plenty of sleep)
  • Do regular checkups (both immunity and
    disease)

• What is Safety?
  • Strong ability to handle disruptions and
    maintain desired performance
• How to measure and maximize?
  • Check for functionality variability balancing
    productivity and safety – Resilience
  • Maximize Resilience
  • Monitor and Manage Resilience

Health to the human is Safety to the System.
Let’s build up our immunity!

The need for resilience engineering

• VUCA (Volatility, Uncertainty, Complexity, and
Ambiguity) world

• Resilience engineering accepts that:
– Residual risk cannot be eliminated
– Root causes may not be found in complex systems

[Diagram: system capability versus an adverse event over time – risk governs the
pre-accident phase; resilience governs the during- and post-accident phases]
5 6
Is risk assessment useful in resilience
assessment and management?

• Broad qualitative risk assessment of
events, recovery, and uncertainties (Aven, 2017):
– Making a judgment of the type of events that
can occur (what we know and do not know)
– Making a distinction between known and
unknown types of events, and surprising
events
– Assessing the probability of these types of
events using subjective probabilities
– Conducting assessments to reveal unknown
and surprising events

What is Resilience Engineering?

• The first definition (Hollnagel, 2006, p. 16)
states: "The essence of resilience is,
therefore, the intrinsic ability of an
organization (system) to maintain or regain a
dynamically stable state, which allows it to
continue operations after a major mishap
and/or in the presence of a continuous
stress."

• Resilience Engineering is a paradigm for
safety management that focuses on systems
coping with complexity and balancing
productivity with safety (Patriarca et al.,
2018)

Aven, T. (2017). How some types of risk assessment can support resilience analysis and management, RESS, 167, 536-543.
Hollnagel, E. (2006). Resilience – the Challenge of the Unstable. In E. Hollnagel, D. D. Woods, & N. Leveson (Eds.), Resilience Engineering: Concepts and Precepts (pp. 9–17). Ashgate Publishing, Ltd.
Patriarca, R. et al. (2018). Resilience Engineering: Current status of the research and future challenges. Safety Science, 79-100.

7 8

A horse as an engineered system

A wild horse → A tamed and trained horse

What does man do to the horse?

• Tame and train the wild horse
• Equip the trained horse
• Grow a new breed of horse
• Train the new breed of horse
• Equip the trained horse
• …

Anticipate? Absorb? Adapt? Restore?

9 10
Tong, Q., Yang, M., Zinetullina, A. (2020). A dynamic Bayesian network-based approach to resilience assessment of engineered systems, JLPPI, 65, 104152

Questions to be answered

• How to define resilience for complex engineered systems?
• How to measure resilience?

Research field | Reference | Definition | Key words
Ecological system | Holling (1973) [5] | The measure of the persistence of systems and of the ability to absorb change and disturbance and still maintain the same relationships between populations or state variables. | Absorb; Maintain
Organization system | Kendra and Wachtendorf (2003) [29] | An ability to sustain a shock without completely deteriorating; that is, most conceptions of resilience involve some idea of adapting to and ‘bouncing back’ from a disruption. | Sustain a shock; Adapt
Organization system | Burnard and Bhamra (2011) [30] | Resilience is the emergent property of organizational systems that relates to the inherent and adaptive qualities and capabilities that enable an organization’s adaptive capacity during turbulent periods. The mechanisms of organizational resilience thereby strive to improve an organization’s situational awareness, reduce organizational vulnerabilities to systemic risk environments and restore efficacy following the events of a disruption. | Adaptive capacity; Improve awareness; Reduce vulnerability; Restore efficacy
Economics | Charles Perrings (2006) [31] | Economic resilience refers to the ability or capacity of a system to absorb or cushion against damage or loss. | Absorb
Economics | Joseph Fiksel (2006) [32] | Enterprise resilience refers to the capacity for an enterprise to survive, adapt, and grow in the face of turbulent change. | Survive; Adapt; Grow
Economics | Rose and Liao (2005) [33] | Economic resilience refers to inherent ability and adaptive response that enables firms and regions to avoid maximum potential losses. | Inherent ability; Adaptive response; Avoid losses
Social system | Adger (2000) [34] | Ability of groups or communities to cope with external stresses and disturbances as a result of social, political and environmental change. | Cope with external stresses and disturbances
Social-ecological system | Cumming et al. (2005) [35] | Ability of a system to maintain its identity in case of disturbances. | Maintain
Social-ecological system | Kinzig et al. (2006) [36] | Resilience in a social-ecological system refers to … | Survive

11 12

Dinh, Pasman, Gao, and Mannan (2012)

• Resilience, the ability to recover quickly after an upset, has been recognized
as an important characteristic of a complex organization…

• Six principles: flexibility, controllability, early detection, minimization of
failure, limitation of effects, administrative controls/procedures.

Dinh et al. (2012). Resilience Engineering of industrial processes: principles and contributing factors, JLPPI, 25(2), 233-241.

Sharma, Tabandeh, and Gardoni (2017)

• “The resilience of a system is related to its ability to withstand stressors,
adapt, and rapidly recover from disruptions.”

• Cumulative resilience function
• Resilience density function
• Resilience mass function

Sharma, N., Tabandeh, A. and Gardoni, P. (2017). Resilience analysis: a mathematical formulation to model resilience of engineering systems. Sustainable and Resilient Infrastructure, 3(2), 49-67.

13 14
Tong, Yang, Zinetullina (2020)

• The four attributes:
– Adaptation, Absorption, Restoration, Learning

• The terminology of “functionality” is used: resilience is “the probability of the
system’s functionality sustaining a ‘high’ state or restoring to a ‘high’ state
from a ‘low’ state during and after the occurrence of disruptions in the
operation of a system within a specific time.”

• If the system is more probable to sustain a “high functionality” state, or to
restore to a “high functionality” state from a “low functionality” state subject
to the disruptions, it can be considered more resilient.

Time-dependent functionality curve under disruptions

• S1: the normal operating state with the functionality of F1 when a disruption occurs at t1.
• S2: the disrupted state with the functionality of F2 at t2.
• S3: the state with the functionality of F3 when the adaptation action finishes at t3.
• S4: the new state with the functionality of F4 when restoration finishes at t4.

The resilience can now be quantified as the sum of the probabilities of states S1 and S4 at
each time step.

Fig. 1. Time-dependent functionality curve under a disruption

Tong, Q., Yang, M.*, Zinetullina, A. (2020). A dynamic Bayesian network-based approach to resilience assessment of engineered systems, JLPPI, 65, 104152.
15 16
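The state-probability bookkeeping behind this quantification can be sketched with a plain discrete-time Markov chain over the four states S1 (normal), S2 (disrupted), S3 (adapted), S4 (restored); the transition probabilities below are hypothetical illustration values, not from the paper:

```python
STATES = ["S1", "S2", "S3", "S4"]
P = [  # row: current state, column: next state (hypothetical values)
    [0.95, 0.05, 0.00, 0.00],  # S1: small chance a disruption occurs
    [0.00, 0.60, 0.40, 0.00],  # S2: adaptation actions start
    [0.00, 0.00, 0.70, 0.30],  # S3: restoration completes over time
    [0.00, 0.00, 0.00, 1.00],  # S4: new stable state
]

def step(dist):
    """One time-slice update: new_dist[j] = sum_i dist[i] * P[i][j]."""
    return [sum(dist[i] * P[i][j] for i in range(4)) for j in range(4)]

dist = [1.0, 0.0, 0.0, 0.0]  # start in the normal operating state S1
resilience_profile = []
for _ in range(20):
    dist = step(dist)
    resilience_profile.append(dist[0] + dist[3])  # P(S1) + P(S4) at each step

print(round(resilience_profile[-1], 3))
```

A full DBN additionally decomposes the transition probabilities over attribute nodes (absorption, adaptation, restoration, learning); this sketch keeps only the resulting state chain.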

DBN as a system resilience modeling tool

• Ability of modeling time-dependent factors
• Ability of decomposing the primary attributes
• Ability of modeling multiple states
• Ability of incorporating both subjective and objective data

Tong, Q., Yang, M.*, Zinetullina, A. (2020). A dynamic Bayesian network-based approach to resilience assessment of engineered systems, JLPPI, 65, 104152.
Zinetullina, A., Yang, M.*, Khakzad, N., Golman, N. (2020). Dynamic resilience assessment for process units operating in Arctic environments. SIEE, 2, 113-125.

17 18
Resilience profile: Zinetullina and Yang et al. (2021)

• Integration of FRAM with DBN
• Use of a process simulator (Aspen Hysys) to estimate the probabilities

Workflow:

I. FRAM modeling
   – Define the system
   – Develop a FRAM model for the system
   – Identify critical couplings using MSC
II. Hysys simulation
   – Simulate the system in Aspen Hysys
   – Obtain PoFs based on simulation results
III. Generate DBN parameters
   – Establish linkages between the identified critical couplings and the attributes of resilience
   – Develop CPTs based on data from simulation, literature and surveys
IV. Resilience assessment before improvement
   – Develop the DBN model
   – Run the DBN to obtain the resilience profile
V. Resilience assessment after improvement
   – Propose additional safety measures based on the FRAM model
   – Reassess resilience and compare with the previous profile
   – Conduct sensitivity analysis to identify the most influential nodes

Zinetullina, A., Yang, M., et al. (2020). Quantitative resilience assessment of chemical process systems using functional resonance analysis method and dynamic Bayesian network. RESS, 205, 107232.
Zinetullina, A., Yang, M.*, Khakzad, N., Golman, N. (2020). Dynamic resilience assessment for process units operating in Arctic environments. SIEE, 2, 113-125.

19 20

Typical performance curve for
resilience quantification

Chen, Yang, Reniers (2021)

• Resilience is represented by a physical
parameter (e.g., storage resilience)

Chen et al. (2021). A dynamic stochastic methodology for quantifying HAZMAT storage resilience.
RESS, 215, 107909.
21 22
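A common performance-curve resilience metric (a generic sketch, not the specific metric of any one paper cited here) is the area under the normalized functionality curve F(t) over the observation window, R = (1/T) ∫ F(t)/F_target dt, computed with the trapezoidal rule; the curve values below are hypothetical:

```python
def resilience(times, functionality, f_target):
    """Trapezoidal area under F(t)/f_target, normalized by the window length."""
    area = 0.0
    for (t0, f0), (t1, f1) in zip(zip(times, functionality),
                                  zip(times[1:], functionality[1:])):
        area += 0.5 * (f0 + f1) / f_target * (t1 - t0)
    return area / (times[-1] - times[0])

# Disruption at t=2 drops functionality from 100 to 40; recovery completes at t=8.
t = [0, 2, 3, 6, 8, 10]
f = [100, 100, 40, 60, 100, 100]
print(round(resilience(t, f, f_target=100), 3))
```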
Cai and Xie et al. (2017)

• Availability-based engineering resilience metric
• Steady-state availability is used as a resilience metric

Cai et al. (2017). Availability-based engineering resilience metric and its corresponding evaluation
methodology. RESS, 6041.

Sun, Wang, Yang, Reniers (2021)

• Application of a resilience metric for safety barrier performance assessment

Sun et al. (2021). Resilience-based approach to safety barrier performance assessment in process
facilities, JLPPI, 73, 104599.
23 24
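A steady-state availability calculation of the kind used as a resilience metric can be sketched as follows (a textbook single-unit formula, assuming constant failure and repair rates, not the full evaluation methodology of the cited paper):

```python
def steady_state_availability(mttf_hours, mttr_hours):
    """Long-run fraction of time the unit is functional: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)

# A unit failing on average every 1000 h and repaired in 10 h:
print(steady_state_availability(1000.0, 10.0))  # 1000/1010, about 0.99
```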

Summary on resilience quantification

• Quantification using performance over a time period
• Quantification using probabilistic indicators
• Quantification using multiple indicators

Essential elements of a generic
framework for resilience quantification

• System classification and stakeholder analysis
• Disruption analysis
• Functionality assessment
• Resilience quantification
25 26
The goal is to improve resilience

Health to the human is Safety to the System.
Let’s build up the immunity of complex systems!

References

• Aven, T. (2017). How some types of risk assessment can support resilience analysis and
management, RESS, 167, 536-543.
• Cai et al. (2017). Availability-based engineering resilience metric and its corresponding evaluation
methodology. RESS, 6041.
• Dinh et al. (2012). Resilience Engineering of industrial processes: principles and contributing
factors, JLPPI, 25(2), 233-241.
• Hollnagel, E. (2006). Resilience – the Challenge of the Unstable. In E. Hollnagel, D. D. Woods, & N.
Leveson (Eds.), Resilience Engineering: Concepts and Precepts (pp. 9–17). Ashgate Publishing, Ltd.
• Patriarca, R. et al. (2018). Resilience Engineering: Current status of the research and future
challenges. Safety Science, 79-100.
• Sharma, N., Tabandeh, A. and Gardoni, P. (2017). Resilience analysis: a mathematical formulation
to model resilience of engineering systems. Sustainable and Resilient Infrastructure, 3(2), 49-67.
• Sun, H., Wang, H., Yang, M., Reniers, G. (2021). Resilience-based approach to safety barrier
performance assessment in process facilities, JLPPI, 73, 104599.
• Tong, Q., Yang, M., Zinetullina, A. (2020). A dynamic Bayesian network-based approach to resilience
assessment of engineered systems, JLPPI, 65, 104152.
• Woods, D. (2003). Creating foresight: how resilience engineering can transform NASA’s approach
to risky decision-making.
• Zinetullina, A., Yang, M., et al. (2020). Quantitative resilience assessment of chemical process
systems using functional resonance analysis method and dynamic Bayesian network. RESS, 205,
107232.

27 28

Thank you for your attention!
& Questions?

Ming Yang, PhD, P.Eng.
Assistant Professor of Safety and Security Science
Faculty of Technology, Policy, and Management,
TU Delft, The Netherlands
Email: m.yang-1@tudelft.nl

Risk reduction and
management

Lecture 5 of SPM 9448

Dr. Ming Yang, Assistant Professor of Safety and Security Science
Safety and Security Science Section
May 17th, 2021
Email: m.yang-1@tudelft.nl

29 1
Outline

• Risk reduction measures in general
• Inherently safer design: from safety and
environmental perspectives

Incident Pyramid (G. Creedy/CSChE)

       1  Serious Injuries/Fatalities
      10  Medical Aid Cases
      30  Property Losses/First Aid Treatments
     600  Near Misses
  10,000  Unsafe Behaviors/Conditions

2 3

RISK MANAGEMENT: A GENERIC FRAMEWORK (D. McCutcheon)

[Flowchart:
Identification of hazards → Risk analysis/assessment → Is the risk acceptable?
– If no: Can the risk be reduced? If yes, reduce the risk and re-assess; if no, discontinue the activity.
– If yes: Manage the residual risk.
Planned reviews (management activities) track company actions against policy.
Risk analysis/assessment activities track, analyze and assess hazards or concerns that arise and challenge policy.
Management activities ensure company activities keep risks under control.]

Scope of the risk analysis/assessment:
• Knowledge uncertainty
– Various techniques
• Physical scope
– Definition of the system boundary for assessment
• Analytical scope
– Nature of hazards under consideration
• Perception, in addition to event likelihood & severity of consequences

4 5
Hierarchy of Controls (D. Hendershot)

Risk reduction/treatment – (i)

• Techniques of control and risk reduction:
– Substitution: replacing substances and
procedures by less hazardous ones,
improving construction work, etc.
– Elimination of risk exposure: this consists in
not creating, or completely eliminating, the
condition which could give rise to the
exposure.
– Prevention: combines techniques to reduce
the likelihood/frequency of potential losses.
Observation and analysis of past accidental
events enable the improvement and
intensification of prevention measures.
6 7

Risk reduction/treatment – (ii)

• Techniques of control and risk reduction:
– Reduction/mitigation: techniques whose
goal is to reduce the severity of accidental
losses when an accident occurs:
• Measures applied before the occurrence
of the event (often also have an effect on
the likelihood/frequency)
• Measures applied after the occurrence of
the event (often aim to accelerate and
enhance the effectiveness of the rescue)

Risk reduction/treatment – (iii)

• Techniques of control and risk reduction:
– Segregation summarizes the techniques
which are to minimize the overlapping of
losses from a single event. It may imply very
high costs.
• Segregation by separation of high-risk
units
• Segregation by duplication of high-risk
units

8 9
Risk reduction/treatment – (iv)

• Techniques of control and risk reduction:
  – Transfer, risk transfer by:
    • Contractual transfer of the risk financing, essentially insurance.
    • Risk financing by retention (self-financing): financial planning of potential losses from your own resources.
    • Alternative risk transfer (ART) solutions comprise elements of both self-financing and contractual transfer and so cannot be classified in either of the above categories.

Risk reduction/treatment options

• Risk reduction:
  – Risk avoidance (inherent safety or safety-by-design)
  – Risk control (prevention, protection, mitigation)
• Risk acceptance:
  – Risk retention (conscious or unconscious)
  – Risk transfer (insurance)
Design Basis Protection Layers (HSE, TOTAL PETROCHEMICALS)

[Onion diagram of protection layers, from the core outward:]
• Inherently safer design
• Process control via DCS
• Safety Management System
• Critical Alarms
• SIS
• Passive Protection Measures (e.g. PSV)
• Effect Mitigating Measures
• Internal Emergency Plan
• External Emergency Plan
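Layered protection diagrams like this can be read quantitatively in the spirit of a layer-of-protection analysis (LOPA): each independent layer fails to stop the scenario only with its probability of failure on demand (PFD). A minimal sketch with purely illustrative frequencies and PFD values (not design data):

```python
# Hypothetical initiating-event frequency (events per year).
initiating_frequency = 0.1

# Assumed PFDs for a few independent protection layers (illustrative only).
pfd_by_layer = {
    "process control via DCS": 0.1,
    "critical alarm + operator response": 0.1,
    "SIS": 0.01,
    "passive protection (PSV)": 0.01,
}

# Mitigated frequency = initiating frequency x product of the layer PFDs,
# valid only if the layers are truly independent of one another.
mitigated_frequency = initiating_frequency
for pfd in pfd_by_layer.values():
    mitigated_frequency *= pfd
```

The multiplication makes the design argument concrete: several modestly reliable, independent layers reduce the scenario frequency by many orders of magnitude, which is why independence between layers matters so much.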
INHERENTLY SAFER DESIGN

[Pyramid, from most to least robust:]
1. INHERENT SAFETY
2. PASSIVE ENGINEERED (ADD-ON) SAFETY
3. ACTIVE ENGINEERED (ADD-ON) SAFETY
4. PROCEDURAL (ADMINISTRATIVE) SAFETY

Principles of Inherent Safety (EARLY)

• Minimization (Intensification)
  Minimize the amount of hazardous material in use (when use of such materials cannot be avoided – i.e. elimination)
Principles of Inherent Safety (EARLY)

• Substitution
  Replace substance with less hazardous material; replace process route with one involving less hazardous materials

• Moderation (Attenuation)
  Use hazardous materials in least hazardous forms; run process equipment with less severe operating conditions (e.g. T and P)
Principles of Inherent Safety (EARLY)

• Simplification
  Simplify equipment and processes that are used; avoid complexities; make equipment robust; eliminate opportunities for error

With respect to stairs, an inherently safer option to a two-story house is a…
With respect to stability, an inherently safer option to a bicycle is a…

Safety by Design
Inherently Safe Design

• Reduction of product inventory
• Alternative fabrication processes (operating conditions, chemicals used, …)
• Spacing
• Location
• Equipment design (full containment design, …)
• Location of buildings in low-risk areas
• Reduction of occupancy of buildings in high-risk areas
• …
Safety by Design
Safety Management System / Control

• Planned inspections/maintenance
• Emergency preparedness
• Knowledge and skill training
• Engineering and change management
• Communication
• Materials and services management
• Hiring and placement, contractor selection
• …

Safety by Design
Safety Instrumented Systems

• Redundancy/diversity of instrumentation
• ESD system
• Blowdown systems
• Interlock systems
• …
Safety by Design
Passive Protective Measures

• Safety valves
• Containment (in building, …)
• Blast resistant buildings
• Flow orifices
• Mounded bullets or spheres
• Diking
• …

Safety by Design
Effect Mitigating Measures

• Water curtain
• Control of ignition sources
• Release direction
• Fire fighting measures (foam application, …)
• Explosion suppression systems
• Explosion venting
• …
Safety by Design
Inspection and Maintenance

INSPECTION PLANNING

[Diagram: inputs – Corporate Philosophy, Local Legislation, Corporate Policy, Codes and Standards, RP's, RAGAGEP; a database of Design, Construction, Operation and Inspection data; monitoring info, housekeeping and general "good findings" – feed an RBI analysis/prioritization that produces the Inspection Plan. The inspection planning activity then: i. Inspect, ii. Onsite assessment, iii. Detailed FfS if needed. Data analysis of anomalies and damage mechanisms (DM's) updates the database.]

Safety by Design
Inspection and Maintenance

• Change inspection frequencies
• Change inspection scope/thoroughness
• Change inspection tools/techniques
  – Acoustic Emission Testing (AE)
  – Eddy Current Testing (ET)
  – Infrared/Thermal Testing (IR)
  – Leak Testing (LT)
  – Magnetic Particle Testing (MT)
  – Neutron Radiographic Testing (NR)
  – Penetrant Testing (PT)
  – Radiographic Testing (RT)
  – Ultrasonic Testing (UT)
  – Visual Testing (VT)
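The RBI analysis/prioritization step can be sketched as a simple risk ranking. The equipment tags and the 1–5 probability/consequence scores below are made up purely for illustration:

```python
# (tag, probability-of-failure score 1-5, consequence-of-failure score 1-5)
# -- all entries are hypothetical examples, not real plant data.
equipment = [
    ("V-101 separator", 4, 5),
    ("P-203 pump", 2, 2),
    ("E-305 exchanger", 3, 4),
]

# Risk score = PoF x CoF; the highest-risk items are inspected first
# and most frequently.
ranked = sorted(equipment, key=lambda item: item[1] * item[2], reverse=True)
inspection_order = [tag for tag, _, _ in ranked]
```

Real RBI schemes use calibrated damage-mechanism models rather than bare 1–5 scores, but the prioritization logic (risk = probability × consequence, then rank) is the same.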
Prevention

• Prevention is an attitude and/or a series of measures to be taken to avoid degradation of a certain situation (social, environmental, economic, technological, etc.) or to prevent accidents, epidemics or illness. It acts mainly on the likelihood of occurrence and the causality chain, trying to lower the probability that an event happens. Prevention actions are also intended to keep a hazard risk problem from getting worse. They ensure that future development does not increase hazard losses.

Prevention

• 9 Principles of prevention:
  – 1. Avoid risks: remove the hazard or the exposure to it.
  – 2. Assess risks that cannot be avoided: assess their nature and importance, identify actions to ensure safety and guarantee the health of workers.
  – 3. Fight risks at the source: integrate prevention as early as possible, from the design of processes, equipment, procedures and the workplace.
  – 4. Adapt work to man: design positions, choose equipment and methods of work and production to reduce the effects of work on health.
  – 5. Consider the state of technological developments: implement preventative measures in line with technical and organizational developments.
Prevention

• 9 Principles of prevention:
  – 6. Replace the hazardous by what is less hazardous: avoid the use of harmful processes or products when the same result can be obtained with a method with fewer hazards.
  – 7. Plan prevention integrated in a coherent package: a) technique, b) work organization, c) working conditions, d) social relations, e) environment.
  – 8. Take collective protection measures and give them priority over individual protective measures: use personal protective equipment (PPE) only to supplement collective protection or to cover its shortfalls.
  – 9. Give appropriate instructions to employees: provide them the necessary elements for understanding the risks and thus involve them in the preventative approach.

Environmental risk minimization – pollution prevention
Wastes? Pollutants? Pollution?

• Wastes
  – Wastewater
  – Air emissions
  – Solid wastes
  – Energy
  – Time & Money
• Pollutants are wastes that contaminate air, water or land and make it unfit for use
• Pollution is the release of pollutants from a particular activity

Why do we care about pollution?

• Pollutants can easily contaminate our drinking water
• We all breathe the same air
What is Pollution Prevention (P2)?

• POLLUTION PREVENTION is the use of processes, practices, materials or energy that avoid or minimize the creation of pollutants and wastes without creating or shifting new risks to communities, workers, consumers or the environment (Environment Canada's definition)

P2 vs. Pollution Control

• Past industrial practices:
  Raw material + Energy → Process → Products, with wastes released directly to the environment
• Recent industrial practices:
  Raw material + Energy → Process → Products; Wastes → Treatment (pollution control) → residual waste to the environment
P2 vs. Pollution Control, Cont'd

• Current P2 practices:
  Cleaner raw material + Energy → Process → Products; Residuals → Reuse & Recycle (Recycles, Secondary products); remaining residuals → Waste treatment → residual waste to the environment (the treatment step itself is not a P2 practice). Define your system boundary first!

Ideal P2 practices

• Cleaner raw material + Cleaner energy → Process → Products; Residuals → Reuse, Reprocess & Recycle (Recycles, Secondary products)
• Minimize wastes in raw material extraction and energy generation
• Minimize wastes during usage and disposal
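The difference between end-of-pipe control and prevention can be made concrete with a toy material balance. All tonnages and efficiencies below are hypothetical, chosen only to illustrate the comparison:

```python
def residual_to_environment(waste_generated, treatment_efficiency):
    """End-of-pipe view: generated waste is treated, and the untreated
    fraction reaches the environment."""
    return waste_generated * (1 - treatment_efficiency)

# Pollution control only: 100 t/y of waste generated, 90% removed in treatment.
control_only = residual_to_environment(100.0, 0.90)

# P2 first: process changes avoid creating 70% of the waste; the remaining
# 30 t/y still passes through the same treatment step.
p2_plus_control = residual_to_environment(100.0 * (1 - 0.70), 0.90)
```

Even with an identical treatment step, avoiding waste creation upstream cuts the residual release proportionally, which is the core P2 argument.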
P2 Principles

• Elimination
• Minimization
• Substitution
• Moderation

P2 practice or not?

• Offsite recycling
• Waste treatment such as detoxification, incineration
• Concentrating hazardous or toxic constituents to reduce volume
• Diluting constituents to reduce toxicity
• Transferring hazardous or toxic constituents from one environmental medium to another
Group Discussion #1

• As a restaurant manager, reduce wastes from the daily operation. What are the potential P2 opportunities?
  – What are the waste streams?
  – Can we eliminate or reduce those wastes through P2 principles? (Elimination, minimization, substitution, and moderation)

P2 Examples in a Restaurant

• To reduce energy use:
  – Replace outdated equipment and appliances with more energy-efficient ones
  – Change air conditioning filters regularly to help the system run more efficiently
• To reduce water use:
  – Turn off faucets and hoses when not in use
  – Install low-volume toilets
  – Serve water only to guests who request it
P2 Examples in a Restaurant, Cont'd

• To reduce solid wastes
  – Work with the suppliers to take back and reuse corrugated cardboard boxes, five-gallon buckets and other packaging
  – Serve soft drinks on tap
• To reduce food wastes
  – Buy in bulk to reduce container waste, but avoid buying too much of a product that might spoil
  – Strict inventory control to prevent usable materials from needlessly becoming waste

Group Discussion #2

• Standardize the process to identify P2 opportunities?
Benefits of P2

• Reduced material, operation, and production costs
• Reduced waste treatment and future clean-up costs
• Improved business efficiency and profitability
• Improved company image
• Reduced risks to employees and communities
• …

Application of P2 to OOG Operations – A Brief Discussion
Four Main Stages of OOG Operations

• Geological and geophysical survey
  – Identify areas with favorable geological conditions
  – Aerial and seismic surveys provide detailed geological information
• Exploration
  – Exploration drilling to verify presence and quantity of hydrocarbons
  – Determine if the reservoir is feasible to develop
• Development and production
  – Platform and pipeline emplacement
  – Drilling for production
  – Produced hydrocarbon separation and gas processing
  – Oil and gas export
• Decommissioning
  – Removal of platform facilities
  – Remediation of environmental impacts

Environmental Impacts

Stage                      | Activity                                     | Type or nature of impacts
Geo. survey                | Seismic survey                               | Acoustic source, short-term disturbance to marine organisms and fish population
Geo. survey                | Test drilling                                | Increase in turbidity, disturbance on bottom
Exploration                | Exploratory drilling                         | Emissions and discharges of pollutants, disturbance to fisheries, accidental blowouts
Exploration                | Plugging well and abandonment                | Long-term impacts on benthic and pelagic habitats and biodiversity
Development and production | Platform emplacement and pipeline laying     | Construction discharges, long-term and chronic effects of discharges on benthic and pelagic biota
Development and production | Drilling for production and injection wells  | Drilling fluids and cuttings discharge, produced water, accidental spillage, impacts on fisheries, physical disturbance
Development and production | Vessel traffic                               | Operational emissions and discharges, impacts on marine birds, mammals and other organisms
Decommissioning            | Facility removal                             | Operational emissions and discharges, impacts on fisheries and marine organisms if explosive charges are used

Source: Patin (1999)
Waste Streams at Production Stage

• Produced water
• Drilling wastes
  – Drilling fluids
  – Drilling cuttings
• Storage displacement water
• Deck drainage
• Flaring emissions

P2 Opportunities

• Produced water
  – Separation of produced water from the oil down the well using sub-sea separation technology
  – Produced water re-injection for pressure maintenance and oil recovery
  – …
P2 Opportunities, Cont'd

• Drilling wastes
  – Use of SBF is considered a P2 practice compared to WBF and OBF because it reduces cuttings, emissions and energy use, and SBF is environmentally benign due to its potential to biodegrade
  – Application of advanced drilling tools that enable drilling to penetrate precise targets
  – …
• Deck drainage
  – An appropriate platform size design, since the volume of deck drainage is proportional to the size of the platform
  – Efficient wash-down operations to reduce the frequency of regular cleaning
P2 Opportunities, Cont'd

• Storage displacement water
  – This water can be directed into the water injection system after minor treatment
• Flaring emissions
  – Reduce the number of well tests
  – Collect waste gas (light hydrocarbons) and, if economically and technically feasible, compress it and use it for onsite power generation

Thank you! Questions?

Ming Yang, PhD, P.Eng.
Assistant Professor
Safety and Security Science Section,
Faculty of Technology, Policy, and Management,
TU Delft, the Netherlands
Email: m.yang-1@tudelft.nl
Risk Management

Dr. Ming Yang
Assistant Professor of Safety and Security Science
Faculty of Technology, Policy, and Management
TU Delft, The Netherlands

It's the Risk Governance, stupid!

Risk Governance can be defined as the totality of actors, rules, conventions, processes and mechanisms concerned with how relevant risk information is collected, analysed and communicated, and how management decisions are taken.

Risk Governance starts with good Corporate Governance and integrated board management, conditioned by:
(i) Diversity: strategically targeted composition of the board team
(ii) Trust: constructive and open-minded board culture
(iii) Network: efficient board structure
(iv) Vision: stakeholder-oriented board measures of success

Although these preconditions of success have been shown in a variety of studies, they seem to be very hard for organizations to achieve. In light of the recent economic crises, put it this way: "How can a team of committed board members with individual IQs above 120 have a collective IQ of 60?"
Remember from Lecture 1: Uncertainty management? Risk Governance

Risk management – Approach to adequately govern risks?

• Principal-agent theory / shareholder model / Anglo-Saxon model
• Stakeholder model / Rijnland model
• Anyway, Risk Governance requires generalism, a strategic viewpoint, and holism

A Risk Governance PDCA Mindset: Risk Governance Model
Knowledge and know-how: Loop Risk Management

Safety Management System – 12 items

1. Safe work practices (procedural and administrative control of work activities, safety reviews, MOC procedures, …), including emergency planning and procedures!
2. Safety training
3. Group meetings
4. Pursuing in-house safety rules and complying with regulations
5. Set of basic safety rules and regulations
6. Safety promotion
7. Contractor and employee evaluation, selection and control
8. Safety inspection, monitoring and auditing
9. Maintenance regimes
10. Hazard analysis and incident investigation and analysis
11. Control of movement and use of dangerous goods
12. Documentation control and records
Stakeholders and expertise: Risk Governance Framework

Securing organizations: Playing it safe or Playing with safety
Presentation outline

• Why? – Safety Concerns
• What? – Safety Matters
• Easy? – Safety Bothers
• How? – Playing with Safety
• New? – Safety Futures
• Who? – Safety Scores
• The End. – The Safety Tail/Tale: A Never-ending Story.

Remark: 'Safety' = 'Safety + Security'

Why? Safety Concerns
(i) everyone
(ii) safety anxieties
Why? - Safety Concerns (i) / (ii)

• All stakeholders
• Prudence due to industrial activities should be present in every industry, and certainly also in the industries using hazardous materials
• Characteristics of chemicals-using industries: use of hazardous materials, existence of chemical industrial parks, license to operate/acceptability linked with reputation, high uncertainties linked with debatable opinions
• Belgium & The Netherlands: densely populated area combined with highly concentrated chemical industrial activities
• The Rotterdam Port Area is part of the "ARRRA" and is extremely important for the Dutch (/Belgian/German/European) economy
What? Safety Matters

(i) Good News: Focus on Safety
(ii) Improvement News: The "AND" Story

What? – Safety Matters (i)

http://www.youtube.com/watch?v=2MpsArclaxw
What? – Safety Matters (i)

• Specialistic AND Generalistic
• Technology AND HOFS
• Reactive AND Proactive
• Individual AND Group
• Short-term AND Long-term
• Top-down AND Bottom-up
• Normal acc. AND Disaster
• Operational AND Strategic
• Blue-collar AND White-collar
• Simple AND Complicated

What? – Safety Matters (i)

• Analytic
• Current practice
• Linear
• Confidential
• Static
• Practical
• Realist/Pragmatic
What? – Safety Matters (ii)

• Specialistic AND Generalistic
• Analytic AND Systemic
• Technology AND HOFS
• Reactive AND Proactive
• Individual AND Group
• Current practice AND Innovation
• Linear AND Cyclic
• Short-term AND Long-term
• Top-down AND Bottom-up
• Normal acc. AND Disaster
• Operational AND Strategic
• Blue-collar AND White-collar
• Simple AND Complicated
• Confidential AND Transparent
• Static AND Dynamic
• Practical AND Theoretical/Fundamental/Conceptual
• Realist/Pragmatic AND Dreamer/Idealistic

Easy? Safety Bothers

(i) Safety Leads to Headaches
(ii) Safety Disturbs
Easy? - Safety Bothers (i)

• Measures to take to be safe (how safe is safe enough?)
• Possible costs (all types, not only financial) of minor and major accidents
• Safety investments
• Strategic decisions (game theory – what do others do w.r.t. safety?)
• Competences available / Company memory
• …
• License to operate / authorities & politics
• Media
• Academia

Easy? - Safety Bothers (ii)

• 1999: "Throughout the evolution of the chemical industry, safety has been treated as an afterthought. It is the tag-along in a group of kids on the playground: at times annoying yet unavoidable."
  Osborne L., Process Safety Progress 18(4): W5
  Chemical engineering students were invited to write an essay on a safety topic. The quotation is from the winning essay.

• 2012: "A picture is slowly emerging of chemical industrial clusters that will set their own sustainability standards through intensive collaboration."
  Reniers G. & Amyotte P., Journal of Loss Prevention in the Process Industries, 25, p. 227-231.
How? Playing with Safety

(i) …so that it gets dangerous
(ii) …so that it gets safer

How? - Playing with Safety (i)

• Closed-mindedness ('been there, done that, seen that')
• No or few innovations (a.o. regarding the use of risk assessments)
• Inadequate investments in prevention
• Perceiving safety as a cost
• Considering safety as being evident (complacency)
• Insufficient transparency (towards any stakeholder)
• Insufficient collaboration (with competitors, authorities, and academia)
• Inadequate integration of safety in the business management system
• Too much focus on compliance
• Inadequate company memory
• …
How? - Playing with Safety (ii)
(so that it gets safer!)

• Open-mindedness (open to creative new ideas / techniques / solutions to further improve safety) linked with available budget
• Technological and HOFS innovations (e.g., using innovative risk assessment techniques, company memory conceptual models and/or software, a new model of safety culture linked with performance management and total respect management)
• Trying new collaborations (e.g., with academia), new safety projects, new safety investments on top of usual investments, high transparency, applying game-theoretical models, …

New? Safety Futures

(i) Economics of Safety
(ii) What holds the future?
New? - Safety Futures (i)

• Agreement between parties committing themselves to trade, at a specified time in the future, a good or service at a predefined price (good = safety; predefined price = prevention investment) → Impact of prevention investment decisions on the future of a company
• From a prevention investment point of view, different types of risks should be dealt with differently: high-uncertainty decisions possibly leading to huge profits always go hand in hand with huge possible losses (Disasters)! Hence, focus on low-uncertainty decisions for making 'normal' profits.
• The credo 'Safety not for sale' or 'safety before sale' is not true: the story of Safety versus Productivity is comparable to the story of the chicken and the egg!
• THERE IS NO ALTERNATIVE: The right way forward is not to reject the economic approach in safety decision-making, but to improve the tools and their use (much like risk assessments)!
• Loss Aversion: We do not gamble with gains, while we tend to gamble with losses (because we really hate to lose)!

New? - Safety Futures (ii)

• Megatrends: Communication devices, Big data, Collaboration, Sustainability, Performance-based decision-making, Aging, Accelerated urbanization, Resource scarcity
• Technological improvements and innovations will lead to more accuracy of risk perception and assessment, better knowledge of uncertainties, better knowledge dispersion, dynamic risk assessment results and real-time risk data processing, more complete data and information, calculation of systemic risks, serious games, and a changing role of media in the communication of risks
• Globalization: decrease of risk perception differences, decrease of differences in safety cultures, more integration of safety within other domains, decrease in differences of values of life, decrease in ethical differences
• 'Uncertainty measurement' devices for people and organisations: risk radars, risk dashboards, risk watches, …
• Safety apps, safety QR codes for equipment, google glasses
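The "economics of safety" argument can be illustrated with a one-line expected-value comparison. All probabilities and euro amounts below are invented for illustration only:

```python
# Hypothetical yearly accident probabilities and costs (EUR).
p_accident_no_invest = 0.02    # without the prevention investment, assumed
p_accident_invest = 0.002      # with the prevention measures in place, assumed
accident_loss = 5_000_000      # loss if the accident scenario occurs
prevention_cost = 50_000       # yearly cost of the prevention investment

expected_cost_no_invest = p_accident_no_invest * accident_loss
expected_cost_invest = prevention_cost + p_accident_invest * accident_loss

# Under these assumptions, investing is cheaper in expectation -- though,
# as the loss-aversion point above notes, real decisions rarely follow
# expected value alone.
```

This is exactly the kind of tool the "THERE IS NO ALTERNATIVE" bullet argues should be improved rather than rejected: the arithmetic is trivial, but the quality of the probability and loss estimates is not.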
Who? Safety Scores

(i) Good Leadership
(ii) Excellent Leadership

Who? – Safety Scores (i)

• Quality of Perception of Leaders: The map is NOT the territory!
The End.
Who? – Safety Scores (ii) The Safety Tail/Tale: A Never-ending Story.

• Improve the perception of reality of every member


of an organization, and this way make better (individual
and group) decisions
• Organizational alignment (i) Does Safety have a tail?
• Developing the “Respect culture” within the
organization (we are evolving from a ‘risk society’ (Beck, (ii) The Story of the Dinosaurs
1986) towards a ‘respect society’)
• Thinking in “AND” terms
• Focus on 7 domains for excellence: Productivity,
Effectiveness, Quality, Safety & Security, Efficiency,
Ecology, Ergonomics

31 32
The End.
The Safety Tail/Tale: A Never-ending Story. (i) & (ii)

• The Future will be 'Safe' and 'Excellent', or there will be no future at all!
• Safety is a circle – it never ends!

Some recommendations:
• Safety (or rather: 'dealing with uncertainty') should be taught at all levels of education, and in all studies
• Safety thinking should always in some form be part of technological innovation
• Safety Science should be a true pillar of society if it wants to excel

"All I'm saying is now is the time to develop the technology to deflect an asteroid."
(from: Risk-benefit analysis, Wilson and Crouch, Harvard Univ. Press, 2001)
Thank you for your attention! & Questions?

End of the course

Ming Yang, PhD, P.Eng.
Assistant Professor of Safety and Security Science
Faculty of Technology, Policy, and Management,
TU Delft, The Netherlands
Email: m.yang-1@tudelft.nl