
Black Swans and how to avoid them

by
David L Russell, PE, ASP
President
Reading List (books by Nassim Taleb):
The Black Swan: The Impact of the Highly Improbable, Second Edition, with a new section: "On Robustness and Fragility"
Fooled by Randomness: The Hidden Role of Chance in Life and in the Markets
Antifragile: Things That Gain from Disorder


What are Black Swans?
• 1 The disproportionate role of high-profile, hard-to-predict, and
rare events that are beyond the realm of normal expectations in
history, science, finance, and technology.
• 2 The non-computability of the probability of the consequential
rare events using scientific methods (owing to the very nature of
small probabilities).
• 3 The psychological biases which blind people, both individually
and collectively, to uncertainty and to a rare event's massive role in
historical affairs.
• In other words, a black swan is the unpredictable, that which can’t
happen, but which does, and which no one expected, except by
hindsight.
• Black swans arise from risk.
What’s Risk? Answer: Anything which can harm a facility.

Technical Basis for Risk:


Risk = C * L(S|A) * L(A)
Where L(A) is the likelihood of an attack being attempted
L(S|A) is the likelihood of adversary success given attack
C is the consequences following a successful attack, and
The total risk is obtained by summing the results of the equation for all relevant scenarios
The consequence variable C represents the “worst reasonable case” consequences given a
successful attack.
Another definition: Risk = Vulnerability * Damage, or R = V*D.
The only difference is that the formula above adds the term L(S|A), the probability that the
attack will be successful. Either definition can be used.
Risk is nonlinear and not normally distributed.
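The summation over scenarios described above can be sketched in a few lines of Python. This is only an illustration: the scenario names and every number below are hypothetical placeholders, not data from this presentation.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    consequence: float             # C: worst reasonable case loss (assumed USD)
    p_success_given_attack: float  # L(S|A)
    p_attack: float                # L(A)

    def risk(self) -> float:
        # Risk = C * L(S|A) * L(A) for a single scenario
        return self.consequence * self.p_success_given_attack * self.p_attack


def total_risk(scenarios):
    # Total risk is the sum of the per-scenario risks
    return sum(s.risk() for s in scenarios)


# Hypothetical scenarios, for illustration only
scenarios = [
    Scenario("tank farm vandalism", 2_000_000, 0.5, 0.01),
    Scenario("control room intrusion", 10_000_000, 0.1, 0.001),
]

for s in scenarios:
    print(f"{s.name}: {s.risk():,.0f}")
print(f"total: {total_risk(scenarios):,.0f}")
```

Note that the smaller-consequence scenario dominates the total here because its likelihood is so much higher, which is why every relevant scenario has to be carried through the sum.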

Black Swans in this area

Figure: Histogram of individual scaled non-negligible claim sizes based on Texas windstorm data.
Source: www.Catrisk.net/its-close-to-a-Weibull-again
With Weibull parameters 1.30023, 0165684, the fit is good at the 95% level.
Risk Assessment is used to measure the Risk
Most are partial risk assessments because of complexity and time required
A complete Risk or threat assessment considers the full spectrum of
threats (i.e., natural, criminal, terrorist, accidental, process, blunders, spills,
lack of proper maintenance, etc.) for a given facility/location.
Examples of partial risk assessments: ASTM E1528 Property Transfer
Assessment, Environmental Audits, Safety Audits, etc.
The assessment should examine supporting information to evaluate the
likelihood of occurrence for each threat. For natural threats, historical data
concerning frequency of occurrence for given natural disasters such as
tornadoes, hurricanes, floods, fire, or earthquakes can be used to
determine the credibility of the given Risk or Threat.
For criminal threats, the crime rates in the surrounding area provide a good
indicator of the type of criminal activity that may threaten the facility.
In addition, the type of assets and/or activity located in the facility may also
increase the target attractiveness in the eyes of the aggressor.
(How are your relations with the neighborhood? And how attractive is your
facility?)
Example: Commercial Solvents plant manufacturing trimethylamine,
which is very smelly (dead fish odor), persistent, and highly
detectable.
The type of assets and/or activity located in the facility will also relate directly
to the likelihood of various types of accidents.
Examples:
Explosives manufacturing: accidents are legendary!
Nitroparaffins manufacturing: high hazards associated with the process
Ag/fertilizer storage: oiling the NH4NO3 prills creates potential ANFO
Petroleum & chemical handling: visible, highly volatile, with some remote
storage for pumping wells that is easily accessed and vandalized.
The risks/threats from negligence and lack of proper supervision are often
greater threats than the natural occurrences or terrorist threats because they
involve people and daily activities.
Examples: chemical or petroleum spills; railroad disasters; the fire and
explosion at the Savannah sugar refinery; petroleum fires (BP Texas City,
2005, and Deepwater Horizon).
European practices can be nerve-wracking, depending upon the country
and how it handles its level of risk.
There are some good databases out there, and one of the best accessible ones is
from the UK Health and Safety Executive (HSE), which I'll discuss later.
The frequency of accidents and incidents can be computed from the OSHA (US) and
EU databases:
https://www.osha.gov/oshstats/index.html
http://ec.europa.eu/eurostat/statistics-explained/index.php/Accidents_at_work_statistics#Incidence_rates
Other sources:
http://www.who.int/quantifying_ehimpacts/methods/en/takala.pdf
https://www.deepdyve.com/lp/wiley/the-frequency-of-industrial-accidents-Nbw0U9j2bg (a pay site to read the article)
And the International Labour Organization database:
http://www.ilo.org/ilostat/faces/wcnav_defaultSelection;ILOSTATCOOKIE=ujQnyprhiXzTSptTdNwbeaV3XIGN1PFCawl6zx-ACAgq_WY0LbpY!-1193313289?_afrLoop=69481885654601&_afrWindowMode=0&_afrWindowId=null#!%40%40%3F_afrWindowId%3Dnull%26_afrLoop%3D69481885654601%26_afrWindowMode%3D0%26_adf.ctrl-state%3D136wp6ayl2_4
How we measure and express RISKS and THREATS

Risk matrix: each ITEM (with its description) is given a Criticality Rating and a
Vulnerability Rating (Very High, High, Medium, Low). The combined rating is read as follows:

Rating      Meaning
Very High   Risks are very high. Immediate countermeasures recommended.
High        Risks are high. Countermeasures should be started as soon as possible.
Moderate    Risks are moderate. Countermeasures can be postponed until plant turnaround.
Low         Little or no risk. Countermeasures may be put in place when convenient.
Alternative Ways of Expressing Risk: Annual Loss Expectation
Estimated Replacement Cost
Cost expressed as $X.XX × 10^N; Rating = N
Frequency of Occurrence of undesirable event
(3 years approximates 1000 days)
1/300 years   f=1   Type of event
1/30 years    f=2   Type of event
1/3 years     f=3   Type of event
1/100 days    f=4   etc.
1/10 days     f=5
1/day         f=6
10/day        f=7   etc.
Calculated Annual Loss Expectation = 10^(f+N-3) / 3
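As a quick check on the formula above, here is a minimal Python sketch. The function name and the example ratings are illustrative assumptions; only the formula and the f and N scales come from the slide.

```python
def annual_loss_expectation(f: int, n: int) -> float:
    """Annual Loss Expectation = 10^(f + N - 3) / 3.

    f: frequency rating (1 = once in 300 years ... 7 = 10 times per day)
    n: cost rating, where the cost is about $X.XX * 10^N
    """
    return 10 ** (f + n - 3) / 3


# Illustrative case: a $1,000,000 loss (N = 6) expected once in 3 years (f = 3)
print(f"{annual_loss_expectation(3, 6):,.0f} USD per year")

# Same annual loss from a $1,000 loss (N = 3) occurring once a day (f = 6)
print(f"{annual_loss_expectation(6, 3):,.0f} USD per year")
```

Both cases give roughly 333,000 USD per year. The division by 3 comes from the "3 years approximates 1000 days" convention: a $1,000 daily loss is about 1,000 × 1,000 USD over 3 years. The rating also shows how a rare, expensive event and a frequent, cheap one can carry the same expected annual loss, even though their consequences are very different.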
When we look at it graphically:
Figure: grid of value rating N (9 down to 1) against frequency rating f (9 down to 1).
The corner with the highest value and highest occurrence is the highest risk; the corner
with the lowest value and lowest occurrence is the lowest risk.
So what does a Black Swan Cost?
Answer:
It's whatever you want to make it, up to and including the total
destruction of the facility + lives lost + property damage +
rebuilding costs + lost sales, etc. It can even include lost opportunity costs
and, if you can monetize it, the cost of rebuilding reputation and
community relations.

The risk of a Category 9 Black Swan (N=9), of every other Black Swan event, and in fact
of every risk event depends upon what the frequency is and how it is measured.
The associated problem is that many Black Swan events have a very
low frequency but a very high cost.
So what causes a Black Swan Event?
At its heart is our lack of understanding of the
interconnectedness (network) of our computer and
non-computer facilities, and the lack of a good guide to the
frequency of failures.
Often the result of a single failure will cascade through a
network of connected pipes, facilities, or even computers. Then
the whole thing becomes unstable or unmanageable and
collapses into a Black Swan event.
Examples: Chernobyl, Bhopal, Deepwater Horizon, Buncefield,
and many more.
We often fail to examine our own shortcomings and neglect to look at
possible routes for accident propagation.
Ask yourself: what are the limits of an event? Obviously, things like a
snowstorm in the central Middle East are highly unlikely and can be
neglected as possible problems.
But how many times have you heard someone in a “position of authority”
say, “That will never happen here,” or “It hasn’t happened in my ___ years
with this company!”
But it could happen tomorrow if the conditions are right!
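A short calculation shows why "it hasn't happened in my years here" is weak evidence. The sketch below assumes, purely for illustration, that events arrive independently at a constant average rate (a Poisson process); that assumption and the 1-in-30-year rate are mine, not the presentation's.

```python
import math


def prob_no_event(rate_per_year: float, years: float) -> float:
    """Probability of seeing zero events in `years`, assuming a Poisson
    process with a constant average rate (an illustrative assumption)."""
    return math.exp(-rate_per_year * years)


# A hypothetical 1-in-30-year event, observed over a 20-year career
p_quiet = prob_no_event(1 / 30, 20)
print(f"Chance of 20 quiet years: {p_quiet:.0%}")

# Chance that the same event occurs at least once in the next 10 years
p_next = 1 - prob_no_event(1 / 30, 10)
print(f"Chance of at least one event in the next 10 years: {p_next:.0%}")
```

Under these assumptions, about half of all 20-year careers would never see the event, yet the chance of it occurring in the next decade is still more than one in four.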
Part of the problem is a failure in how we apply statistical theory. We’ve been
conditioned to treat the normal and Student’s t distributions as the norm.
Mother Nature and the Universe use a different (fat-tailed) distribution for
events. Look at the Texas windstorm damage cost data cited above.
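To see how much a fat tail matters, the sketch below compares the chance of an outcome five standard deviations above the mean under a normal distribution and under a Pareto distribution. The Pareto distribution and its parameters are stand-ins chosen for illustration; they are not the Weibull fit to the Texas data cited above.

```python
import math


def normal_tail(k: float) -> float:
    """P(X > mean + k*sd) for a normal distribution."""
    return 0.5 * math.erfc(k / math.sqrt(2))


def pareto_tail(k: float, alpha: float = 3.0, xm: float = 1.0) -> float:
    """P(X > mean + k*sd) for a Pareto distribution with shape alpha and
    scale xm. Requires alpha > 2 so that the variance is finite."""
    mean = alpha * xm / (alpha - 1)
    sd = math.sqrt(alpha * xm ** 2 / ((alpha - 1) ** 2 * (alpha - 2)))
    threshold = mean + k * sd
    # Pareto survival function: (xm / x)^alpha for x >= xm
    return (xm / threshold) ** alpha


k = 5
print(f"normal tail beyond {k} sd: {normal_tail(k):.2e}")
print(f"Pareto tail beyond {k} sd: {pareto_tail(k):.2e}")
print(f"ratio: {pareto_tail(k) / normal_tail(k):,.0f}x")
```

With these assumed parameters, the "five sigma" event that a normal model treats as essentially impossible is thousands of times more likely under the fat-tailed model.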
One method of accident reconstruction is the “Five Whys.”
It consists of asking “Why did X occur?” when the situation is evaluated.
It requires at least 5 repetitions of the “Why” question, each one digging further into
the cause of the incident.
An example:
Why did the car hit the tree? Answer: Because it went off the road.
Why did the car go off the road? Answer: Because the driver lost control of the vehicle.
Why did the driver lose control of the vehicle? Answer: Because the roadway was wet.
Note: This is too simplistic an answer, so one must investigate a bit further.
Why did the driver lose control on the wet pavement (what else caused it)? Answer:
He swerved to avoid a pedestrian.
Why was the pedestrian on the road? Answer: Because the sidewalk had not been shoveled
after a recent snowfall.
At this level of detail, one will have a pretty good idea of what happened and what, if
anything, can be done to prevent future occurrences. One could ask why the
sidewalk had not been shoveled, but that may move the cause beyond what can be
managed to prevent future incidents.
Sometimes the cause of an incident or accident can best be explained
by analyzing it with Ishikawa’s fishbone (cause-and-effect) diagram, where causes are
listed and can be evaluated. See the example on the next page.

The fishbone method considers causes of an incident, which include:


• Manpower
• Methods
• Machines & Machinery
• Metrics
• Materials and
• Minutes (time)
Most often a disaster does not arise from a single act.
It is a cascade of events which includes:
• Violations of safety protocols,
• carelessness,
• Poor maintenance, and poor housekeeping,
• Deferred maintenance
• Lack of inspection or poor inspection
• Failure to test or quality control critical elements in a process
• And Other Causes which tend to act cumulatively to create the conditions
ripe for a Black Swan Event.
Look, for example, at Bhopal:
Example of a Black Swan and Its Causes: Bhopal, India, Dec. 1984
Drought in central India drove crop production way down, reducing the demand for
Sevin
Decision to produce Sevin via the methyl isocyanate (MIC) route
Poor siting of the plant, with inadequate plant area* for safety
Budget cuts in maintenance: pressure indicators on the operator’s board were
inaccurate due to leaks
Failure to follow safety, maintenance, and operating procedures: failure to
install a slip blind during maintenance, allowing water to enter the MIC tank,
and lack of enforcement of good safety practices, resulting in chemical
exposures**
* There is some disagreement about the Government’s role in allowing a community to develop at the fence line
of the plant.
** There has been the contention that one of the causes of the incident was sabotage: workers who
believed that they were underpaid closed critical valves, leading to the disaster.
Bhopal, India Dec. 1984
Management failures and inadequate risk prevention considerations:
MIC cooler taken out of service months before, despite the high
Indian summer temperatures
Emergency vent scrubber down for repairs
Failure to enforce safety protocols
Failure to adequately maintain critical items
When the release occurred (at night), it killed over 5000 people and,
according to some sources, may have injured 10,000 or more.
https://www.youtube.com/watch?v=HsuUQzhP2Ds The video is
lengthy (about an hour), but it raises sharp points and is well done,
with personnel interviews.
Another Way of Looking at Risk is the EU Approach
Analysis of risk example for rail transport of
hazardous goods.
The analysis involves the use of an event tree
(reverse of a fault tree) to determine order and
likelihood of contributing causes
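The arithmetic behind an event tree can be sketched as follows: start from the frequency of an initiating event and multiply along each path by the probability of every branch taken. The branch names and all numbers are hypothetical, and the full yes/no tree is a simplification, since real event trees usually prune branches that cannot occur.

```python
from itertools import product


def event_tree(initiating_frequency, branches):
    """Return {path: frequency} for every yes/no path through the tree.

    initiating_frequency: events per year for the initiating event
    branches: list of (name, probability_of_yes) in the order they occur
    """
    outcomes = {}
    for path in product([True, False], repeat=len(branches)):
        freq = initiating_frequency
        for (_, p_yes), happened in zip(branches, path):
            # Multiply by the probability of the branch actually taken
            freq *= p_yes if happened else (1.0 - p_yes)
        outcomes[path] = freq
    return outcomes


# Hypothetical rail example: derailment, then tank breach, then ignition
branches = [("tank breach", 0.1), ("ignition", 0.3)]
outcomes = event_tree(1e-3, branches)

for path, freq in outcomes.items():
    label = ", ".join(
        f"{name}={'yes' if hit else 'no'}"
        for (name, _), hit in zip(branches, path)
    )
    print(f"{label}: {freq:.2e} per year")
```

The outcome frequencies always add back up to the initiating frequency, which is a useful sanity check on any hand-built tree.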
The EU uses the Colored Books for risk assessment evaluation and frequency. The
following books are freely available in electronic form as PDF-files:
YELLOW BOOK
Methods for the calculation of Physical Effects Due to releases of hazardous
materials (liquids and gases) – Third edition Second revised print 2005
GREEN BOOK
Methods for the determination of possible damage to people and objects resulting
from releases of hazardous materials – First edition 1992
PURPLE BOOK
Guidelines for quantitative risk assessment – First edition 1999/2005
RED BOOK
Methods for determining and processing probabilities – Second edition 1997/2005
-------------
The USEPA CAMEO Suite is good for defining scenarios and evaluating hazards, and
while the city data base is tailored to the US, it can be easily adapted to meet any
conditions worldwide.
EU Event Tree Diagram for flammable liquid cartage incident example
EU Approach to Risk Assessment
Assesses a number of factors

Types of Harm:
• People killed during or shortly after accident
• People injured
• Damage to important buildings and structures
• Environmental pollution linked to cargo released

Scenarios Developed:
• Explosion
• Flash fire or pool fire
• Atmospheric releases
• Contamination of surrounding water or soil

Relevant Information:
• Population density & time of day
• Traffic density & congestion
• Type and use of structures
• Accessibility of emergency services
• Atmospheric conditions (worst case and most likely)
• Topography of the area
Figure: Number of fatalities versus annual frequency.
Note: the left-hand axis is 1/Frequency (N).

Cassini,P., Hall, R. and Pons, P. Transport of Dangerous Goods Through Road Tunnels, QRA model Vers. 3.6
OECD/PIARC/EU (CDROM) FEB 2003
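A plot of fatalities against frequency is typically built by accumulating scenario frequencies. The sketch below assumes the common F-N convention, where F(N) is the total annual frequency of all scenarios causing N or more fatalities; that convention and the scenario list are assumptions for illustration and are not taken from the QRA model cited above.

```python
def fn_curve(scenarios):
    """Build F-N points from (annual_frequency, fatalities) pairs.

    Returns a list of (N, F) sorted by N, where F is the summed annual
    frequency of every scenario with N or more fatalities.
    """
    levels = sorted({n for _, n in scenarios if n > 0})
    return [
        (n, sum(freq for freq, deaths in scenarios if deaths >= n))
        for n in levels
    ]


# Hypothetical scenarios: (frequency per year, number of fatalities)
scenarios = [(1e-3, 1), (1e-4, 10), (1e-6, 100)]
curve = fn_curve(scenarios)

for n, f in curve:
    print(f"N >= {n:>3}: F = {f:.3e} per year")
```

On log-log axes these points form the descending staircase of an F/N graph, and the rare, high-fatality tail at the right is where Black Swan scenarios sit.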
Why the US could NEVER Follow this approach to its logical conclusion

Figure: Example of an F/N graph for presentation. Note: not chemical specific.

The US would NEVER consider this type of approach because of legal and liability concerns. Under this approach a
corporate executive might be held personally liable for MURDER because he/she allowed this level of risk.
Possible Data Sources: Primarily applicable to the US
Determine Frequency (F) from historical data (rail, roadway,
airline, & other statistics) Data are available in terms of
accidents per _ mile etc.
Homeland Security has made movement data on some types
of HAZMAT shipments unavailable.
Traffic data and emergency response data are available.
Example:
https://www.ntsb.gov/investigations/data/Pages/Data_Stats.aspx
lists causes for all modes.
http://safetydata.fra.dot.gov/officeofsafety/default.aspx has the
frequency
Use the Colored Books from the UK and EU
1 EU guidelines on Risk Assessment of the Carriage of Dangerous Goods by Rail
https://www.unece.org/fileadmin/DAM/trans/danger/publi/adr/guidelines/Calculation%20of%20risks_e.pdf
2. Green Book: Methods for the determination of possible damage to people and objects resulting
from releases of hazardous materials (CPR 16E) (377 pages)
https://www.scribd.com/doc/61170131/Green-Book-Methods-for-the-Determination-of-Possible-Damage-CPR-16E
3. Orange Book: Risk management principles and concepts (52 pages)
https://www.gov.uk/government/publications/orange-book
4. The Purple Book: Guidelines for quantitative Risk Assessment (237 pages)
https://www.scribd.com/document/60474471/Guidelines-for-Quantitative-Risk-Assessment
5. The Red Book: Methods for determining and processing probabilities (CPR 12E) (604 pages)
https://www.scribd.com/document/55826988/Red-Book
6. The Yellow Book: Methods for the Calculation of Physical Effects due to the releases of hazardous
materials (liquids and gases). (870 pages)
http://content.publicatiereeksgevaarlijkestoffen.nl/documents/PGS2/PGS2-1997-v0.1-physical-effects.pdf
or https://www.scribd.com/doc/49833247/TNO-Yellow-Book-CPR-14E

Note: as of 10/28/2020, the links work


Detailed Contents of the Colored Books

Carriage by Rail (1)
As described above: a framework

Green Book (2)
a. Radiation
b. Explosive effects on humans & buildings
c. Fire products
d. Shelter in place
e. Calculation of people and exposure

Orange Book (3)
a. Risk management & appetite for risk
b. How to identify risk

Purple Book (4)
a. Guidelines for QRA
b. Loss of containment events
c. Modeling source term
d. Modeling exposure and damage
e. Modeling societal risk from transport
f. Calculating & presenting results

Red Book (5)
a. Probability theory
b. Bayesian analysis
c. Reliability theory
d. Data analysis
e. Failure scenario analysis
f. Event trees
g. Accident sequencing
h. Dependent failure analysis
i. Human factors
j. Uncertainty analysis

Yellow Book (6)
a. Outflow & spray releases
b. Pool evaporation
c. Cloud vapor dispersion
d. Heat flux from fires
e. Vessel ruptures
f. Interfacing models
Other Tools
ADIOS: Oil spill analysis program
Chemical Reactivity Worksheet (from EPA & NOAA)
CAMEO Suite: Computer-Aided Management of Emergency
Operations; includes ALOHA
CARVER + SHOCK: Process-based flowsheet program used
primarily in the food industry to determine routes of contamination,
but highly adaptable to chemical and other industries
CAFE: Chemical Aquatic Fate and Effects database
The Dow Hazard Index: Dow’s Fire and Explosion Index Hazard
Classification Guide, 7th Edition, published by AIChE; useful for siting
tanks & equipment
PSM (Process Safety Management), by OSHA: More of a tool for prevention
than for risk assessment

Another good book reference:
Critical Infrastructure Protection in Homeland Security:
Defending a Networked Nation, by Ted G. Lewis, Director of Naval
Postgraduate School for Homeland Security
Specific Case History
Texas City, Texas, Refinery Fire and Explosion Disaster -2005
BP refinery in Texas City, Texas (2005): 3rd largest refinery in the US.
The location of a trailer used for meetings was justified on the basis
that the potentially hazardous location was unoccupied most of
the time. Result: about 20 people occupying the trailer at the time of the
incident were injured or killed.
Poor maintenance activity dating back years.
6 plant managers in 6 years.
Maintenance failure on repair of the level sensor on the isomerization unit’s
raffinate splitter column (50 m stack)
Improper design of the column level sensors, which gave inaccurate
readings above the safety point
Improper maintenance on a control valve, allowing liquid to flow to
the safety flare, which caused the explosion
BP refinery, cont’d.
Shortcut procedures were used for filling the column at restart. The
inexperienced operator on the board had never performed
the filling operation before. The lead operator did not keep written
records.
Supervisory replacement inadequate to the task when the regular
supervisor was called away due to a family emergency
Casual attitude toward safety
Truck with its engine running parked in the plant area (proximate
cause of ignition)
Result: 15 people killed, 80 injured, and damage in the tens of millions
of dollars
Other major Black Swan Events of Note:
Deepwater Horizon, Chernobyl, and many others

https://www.youtube.com/results?search_query=chernobyl+disaster+what+really+happened

There is a large quantity of existing information on lots of these incidents, and you are invited to
investigate the causes for yourself. Some incidents have multiple video sources on YouTube and that
may be a good place to start– but don’t stop there.
Gulf Oil Company Refinery, Philadelphia, PA, 1975
In 1975, at the Gulf Oil refinery in Philadelphia, a tank caught fire. The fire was being
successfully fought by the Philadelphia Fire Department. The large oil storage tanks
were diked, and the fire was well on its way to being put out. There was, however,
a large problem.
The fire quickly escalated to 5 and then 8 alarms as it grew and involved other tanks in
their diked area. One unlucky engine crew was working in an adjacent dike, spraying
foam. They did not notice that the foam they were wading in had a layer of
petroleum lying on top of the water, and that it was only the foam which
prevented the petroleum layer from flashing. Then someone breached
the foam cover, causing the diked area to flash, killing 8 firefighters and
injuring another 14.
The fire reached 11 alarms before it was finally declared extinguished on August
26, 1975.
• Another Black Swan Event.
Example: 2017 Atlanta I-85 highway fire and bridge
failure
Causes: A contractor went bankrupt and left polyethylene
pipe beneath an elevated portion of the highway.
The material was improperly stored for 10 years. A crack addict
started the fire. Major traffic consequences: traffic was blocked
on a major N-S highway system and disrupted for
about 3 months. Economic losses were in the millions of USD.
Conclusions:
Large failures don’t occur in isolation; they are a collection or cascade of minor
events which lead up to big events.
If the devil is “in the details,” it is the details which tend to trip us up.

LESSONS
When you design or build: Ask yourself what can go wrong?
Let your imagination run, but temper it with reality. You need a group
setting for this activity.
List your scenarios
Review them
USE “5 Whys”, Fault Tree Analysis, Bowtie Analysis, or Fishbone Analysis for
determining possible failure modes and remedies.
Determine how they can be prevented—SIMPLY
Remember that when we design something to be FOOLPROOF, NATURE
INVENTS A BETTER CLASS OF FOOL.
