
Treatment of Multiple Failures in Process Hazard Analysis
Paul Baybutt
Primatech Inc., Columbus, OH; paulb@primatech.com (for correspondence)
Published online 3 June 2013 in Wiley Online Library (wileyonlinelibrary.com). DOI 10.1002/prs.11601

Process hazard analysis (PHA) is used to identify hazard scenarios. These hazard scenarios may be initiated by single or multiple failures. Other scenario events, such as the responses of safeguards, are also subject to multiple failure, and enabling events may combine with initiating events or other scenario events to produce a type of multiple failure. If multiple failures are not addressed in PHA, important scenarios may be missed, and the risks of scenarios may be underestimated. This article describes the meaning and types of multiple failures that are possible, discusses their importance, and provides an approach for addressing them in PHA. © 2013 American Institute of Chemical Engineers Process Saf Prog 32: 361–364, 2013

Keywords: multiple failures; process hazards analysis; dependent failures; common cause failures; enablers

INTRODUCTION
Process hazard analysis (PHA) methods such as the hazard and operability (HAZOP) study and what-if analysis are used to determine hazard scenarios for processes [1]. Their starting point is to identify initiating events for scenarios. An initiating event is the minimum combination of failures necessary to start the propagation of a hazard scenario. It may be a single initiating cause, multiple simultaneous causes, or initiating cause(s) in the presence of enablers. Hazard scenarios also involve other events that occur as a result of the initiating event, such as the response of safeguards, which eventually lead to a consequence. Multiple failures may occur for these other scenario events, and between them and the initiating event. For example, redundant safeguards may fail at the same time, and some safeguards may fail in a related sequence.

Enablers are events or conditions that do not directly cause a scenario but must be present or active for a scenario to proceed [2]. They do not initiate a hazard scenario by themselves; rather, they make scenarios possible. Enablers are sometimes called contributing causes or contributing factors. They may apply not only to the initiating event but also to other events in the scenario such as safeguard responses. Enabling events usually occur prior to the initiating event and are sometimes called latent failures, for example, a failed or disabled alarm. Enabling conditions exist at the time the scenario occurs and are sometimes called latent conditions, for example, low environmental temperature. Enabling events combine with initiating events or other scenario events, such as safeguard failures, to produce a type of multiple failure. Often, PHA practitioners do not identify enablers. However, they can be key elements of scenarios and should be addressed [3]. Examples of enablers are provided in Table 1.

Some practitioners focus on the identification of scenarios with single failure initiating events and do not address multiple failures, or for that matter, enablers. Indeed, some practitioners specifically exclude the consideration of multiple failures in PHA. However, such practice is poor as important scenarios may be missed, though the inclusion of multiple failures may pose challenges for PHA teams. Also, some practitioners do not address multiple failures that may occur within scenarios. Their omission may lead to underestimates of scenario risks resulting in flawed decisions on risk reduction. Furthermore, PHA is often used to screen scenarios for further analysis using methods such as layers of protection analysis and underestimates of scenario risk may cause important scenarios to be omitted. This article describes the meaning and types of multiple failures that are possible in processes, discusses their importance, and provides an approach for addressing them in PHA.

MEANING AND TYPES OF MULTIPLE FAILURES
Multiple failures involve two or more events occurring together. The events may be equipment failures, human failures, external events, or combinations thereof. External events include natural events such as lightning strikes, human-induced events such as a vehicle collision with a pipeline, loss of utilities and services, and domino events. Sometimes multiple failures are referred to as “double jeopardy” or “triple jeopardy,” and so forth and also as double contingency, and so forth. Failures that occur some time prior to another failure are usually considered to be latent events and can be treated as enablers. They include equipment that may have been taken out of service or left in a disabled state, for example, a disabled alarm.

Multiple failures in hazard scenarios may involve the initiating event and/or other elements of the scenario, for example, safeguards. An example of an initiating event multiple failure is the level controller on one fractionation column failing at the same time as the level controller on another fractionation column causing a higher-than-expected load of liquids in the overhead system that may not be designed to handle both simultaneous failures. An example of a safeguard multiple failure is the failure of dual relief valves on a vessel at the same time resulting in over-pressurization

Process Safety Progress (Vol.32, No.4) December 2013 361


failure of the vessel. Of course, at issue is the credibility of such multiple failures, that is, their likelihood of occurrence.

Table 1. Examples of enabling events and conditions.
Disabled equipment, e.g., bypassed interlock
Override of inhibit condition, e.g., during startup
Failed equipment, e.g., alarm
Extreme ambient conditions, e.g., low temperature
Utility failure, e.g., loss of inerting

IMPORTANCE OF MULTIPLE FAILURES
It can be argued that actions taken to protect against single failures will also protect against multiple failures, because they help to protect against the individual contributors to the multiple failures. It can then be posited that it is sufficient to address single failures and that multiple failures need not be addressed. Certainly, actions taken to prevent single failures that contribute to multiple failures will help to prevent the multiple failures. However, that is not the whole story.

Multiple failures may occur as a result of dependency between failures. Usually, the causes of such dependencies will not be eliminated by implementing measures to reduce the likelihood of the individual failures. For example, if the cause of the dependent failure of both column level controllers is a set point error by the control systems engineer, improving their reliability will do nothing to address this dependent failure. The same is true for the dual relief valves, if the cause of their dependent failure is plugging of their inlets.

Furthermore, multiple failure scenarios may have more severe consequences than scenarios involving any one of their contributors. They may merit additional safeguards beyond those taken to protect against the single failures. Also, protective actions against single failure events may not have been taken, as they may have been deemed unnecessary for the lesser consequences involved. For example, the scenario with the initiating event of the failure of both column level controllers is more serious than a scenario with the failure of either one alone. Similarly, the scenario with the failure of both relief valves is more serious than a scenario with the failure of either one alone.

An author from the United States Environmental Protection Agency has expressed the view, “. . . major accidents generally involve more than one cause. Virtually none of the accidents that EPA and OSHA investigated involved only a single cause. More commonly, half a dozen root and contributing causes were identified” [4].

TREATMENT OF MULTIPLE FAILURES IN PHA
PHA studies should consider credible multiple failures. The design basis for the process should be reviewed, as it may contain protections against some multiple failures and may provide insight into what the designers thought was possible. Ground rules can be established for the consideration of multiple failures, ideally as part of corporate PHA guidance. For example, some possible guidelines for initiating events are:

• Two concurrent human failures are credible.
• A single equipment failure coupled with a single human failure is credible.
• The simultaneous failure of two or more pieces of equipment may not be credible.
• A single equipment or human failure with an external event may not be credible.
• The simultaneous occurrence of two or more external events may not be credible.

These guidelines are based on this general relationship between failure rates:

Human failure > Equipment failure > External events

However, PHA practitioners must be alert for exceptions.

The guidelines assume that the contributing failures to each multiple failure are independent, so that a multiple failure may be classified as noncredible when the contributors lower the overall likelihood below a credibility threshold. For example, a multiple failure with three contributors, each with a probability of $1 \times 10^{-3}$, has a probability of $1 \times 10^{-9}$, which is below any reasonable de minimis criterion.

Despite any guidelines, other multiple failures that the team views as credible should not be eliminated, for example, triple failures, although the identification of such failures is unlikely. Regardless of their likelihood, some hazard scenarios that involve two or more failures may merit documentation because of their extremely severe consequences or the existence of specific safeguards that protect against the multiple failure.

Usually, PHA teams must be prompted to consider multiple failures; otherwise, there is a strong tendency toward considering only single failures. This practice involves a mental shift for most teams. Unfortunately, there is no simple or formal approach to identifying multiple failure scenarios using inductive PHA methods, such as the HAZOP study. Although this situation is unsatisfactory, feasible alternatives generally are not available. Deductive techniques such as fault tree analysis (FTA) do provide a formal means of addressing multiple failures, but their application for an entire process is not feasible owing to the time and effort involved. Furthermore, the events they analyze must be identified by another method, some multiple failures are difficult to include, and there is still no guarantee that all multiple failures will be identified by the analysts.

The identification of multiple failures for initiating events is particularly challenging, especially if the contributors arise in different parts of the process. It is less challenging for multiple failures that occur within the elements of a scenario, since the elements are defined as part of the scenario. However, the team must still search for them, as they may not be obvious. The PHA team should use guidelines on the types of multiple failures that are considered credible for the process and examine the process for such possibilities using their imagination.

DEPENDENT FAILURES
For multiple failures to be classified as noncredible because the contributors lower the overall likelihood below a credibility threshold, the contributors must be independent. However, some apparently independent failures may be dependent. For example, if a process contains a primary pump and a backup, both electrically powered, failure of electric power will cause both pumps to fail. However, if the primary pump fails mechanically, the backup pump will not necessarily fail, especially if it is of a different design. Dependent failures involve two or more failure events that are related causally.

For dependent failures, the likelihood of the multiple failure will be higher than otherwise would be estimated. For independent failures, $P(A \text{ and } B) = P(A) \times P(B)$ but for dependent failures, $P(A \text{ and } B) = P(A) \times P(B \mid A)$. In the case of loss of electric power to the two pumps, $P(A \text{ and } B) = P(A)$, because $P(B \mid A) = 1$. In words, the probability of
failure of both pumps is the same as the probability of failure of the single primary pump, when the cause is failure of electric power.

There are various types of dependent failure. The best known is common cause failure (CCF) wherein simultaneous, or near-simultaneous, multiple failures result from a single shared cause. The cause may be internal or external to the system. Examples are the failure of two redundant relief valves owing to the internal cause of plugging, or the failure of two pumps owing to the external cause of power failure. CCFs may result from the random failure of a single, nonredundant component, for example, multiple pump failures owing to a breaker failure, or systematic failures in redundant components, for example, two pumps failing owing to maintenance error.

Random failures occur at predictable rates but unpredictable times. They result from a variety of degradation mechanisms. Systematic failures are related in a deterministic way to a particular cause. They are nonrandom and usually occur under the same conditions such that once one failure has occurred another is assured. Systematic failures cannot be predicted, nor can their rates be quantified. They result from hidden faults in design or implementation and can only be eliminated by a modification of the design, operating procedures, and so forth. Generally, they are controlled by ensuring that all life cycle activities are performed correctly, for example, employing suitable procedures to reduce the likelihood of incorrect equipment specifications.

Common mode failures are CCFs in which multiple items fail in the same way, or mode, for example, multiple relief valves stuck closed. Cascade or sequential failures are another type of dependent failure. They are failures in a system of interconnected components where the failure of a preceding component triggers the failure of successive components in a type of chain reaction. They are also called propagating failures, because one event leads to another in a sequence. Cascade failures commonly occur in electrical distribution systems and computer networks. Overloading one part of the system or network results in transfer of the load or data and overloading of other parts of the system or network, which ultimately results in the failure of the entire system. The successive dependent failure of safeguards in a process is a cascade failure.

IMPORTANCE OF CCFs
Redundancy is commonly used to improve system reliability for processes by protecting against random hardware failures, which reduces the likelihood of system failure. However, redundancy introduces the potential for CCFs by providing duplication of functions or components. Moreover, process complexity increases with redundancy, and the potential for CCFs increases owing to the increase in shared attributes for the elements of complex systems. Furthermore, redundancy is not effective against design or other systematic errors. Nuclear power plant generating stations employ highly redundant systems, and considerable effort is invested in addressing CCFs [5].

Some sources of CCF reinforce others, for example, similar equipment has the same vulnerability to abnormal environments and is susceptible to the same misoperation, misalignment, miscalibration, and so forth, by a person. For example, a technician may be assigned to calibrate identical pressure detectors in different systems on a given shift, but the same miscalibration error is made during each of the calibrations owing to a misunderstanding by the technician. There are now multiple miscalibrated pressure detectors in the process in different systems. The technician represents a source of CCF for scenarios where the pressure detectors play a role, but the identical design of the pressure detectors reinforced this cause. Some further examples of situations with the potential for dependent failures are provided in Table 2.

Table 2. Examples of dependent failure situations.
Control and shutdown functions with a common power supply
Fire water system and process water use same supply
Failure of a single I/O card resulting in the loss of several control channels
Multiple flow meters, analyzers, etc. with a calibration error due to human error, faulty calibration instruments, etc.
Functional deficiency in a type of valve, sensor, etc. used in multiple systems
Plugging of shared instrument lead lines

Usually, it is not possible to eliminate CCFs completely from a process. There will always be common equipment, manufacturing processes, raw materials, equipment components, people, utilities, and so forth. At best, CCFs can be minimized. Defenses against CCFs include functional diversity, using different equipment to achieve the same purpose, spatial separation of equipment, physical protection of equipment, and staggered testing/maintenance.

TREATMENT OF DEPENDENT FAILURES IN PHA
Ideally, the role of dependent failures should have been considered during the design of the process. However, PHA provides the opportunity for a critical examination of the process for potential dependent failures and its ability to withstand them. For this critical examination to be performed effectively, it is essential that the PHA team understands the meaning and importance of dependent failures. Further details on the types of dependent failures that may occur in processes can be found in the literature [6]. The PHA team should try to identify their presence. A checklist of typical sources of dependent failures can be a useful aid (Table 3).

Dependent failures should be addressed for both initiating events and events within hazard scenarios. Dependent failures within scenarios may occur between the initiating event and other scenario events, such as safeguard responses, and between other events within a scenario such as the responses of redundant safeguards or sequential safeguard responses.

Dependent failures for initiating events may lead to higher estimates of the risk of multiple failure scenarios, or the inclusion of multiple failure scenarios that otherwise would be excluded on the grounds of credibility. Dependent failures for events within hazard scenarios impact the assessment of scenario risk. Conservatively, it may be assumed that scenario elements subject to dependent failure, such as redundant safeguards, would not all act to reduce the scenario likelihood. For example, if two relief valves may fail dependently, the failure likelihood of only one may be taken into account in estimating the scenario likelihood, that is, the failure probability of the second relief valve is assumed to be 1.

TREATMENT OF ENABLERS IN PHA
The combination of enabling events with initiating events and other scenario events, such as safeguard responses, produces a type of multiple failure. Both enabling events and conditions should be addressed in PHA. Typically, this is accomplished best by including an additional column in the PHA worksheet to record them, usually placed immediately after the safeguards column [3]. The team should be prompted to brainstorm the identification of enablers. Checklists of possible enablers and examples are useful aids.
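
The probability relationships described under DEPENDENT FAILURES, and the conservative treatment of redundant safeguards, can be illustrated with a short calculation. The sketch below is not part of the original article: the function names, the pump failure probability, and the credibility threshold are hypothetical values chosen only for illustration. Only the three-contributor example (each contributor at 1 × 10^-3) comes from the text.

```python
import math


def joint_probability(p_a: float, p_b_given_a: float) -> float:
    """Return P(A and B) = P(A) * P(B|A).

    For independent failures P(B|A) equals P(B); for a fully dependent
    failure (for example, a shared power supply) P(B|A) equals 1.
    """
    return p_a * p_b_given_a


def is_credible(probability: float, threshold: float) -> bool:
    """Compare a multiple-failure probability with a credibility threshold."""
    return probability >= threshold


# Hypothetical values, assumed only for this illustration.
P_PUMP = 1e-2      # assumed failure probability of a single pump
THRESHOLD = 1e-6   # assumed credibility threshold

# Three independent contributors at 1e-3 each (the example given in the text).
p_triple = math.prod([1e-3] * 3)

# Two pumps treated as independent, versus two pumps sharing a power supply.
p_independent = joint_probability(P_PUMP, P_PUMP)
p_dependent = joint_probability(P_PUMP, 1.0)  # P(B|A) = 1

print(f"triple independent: {p_triple:.1e}, credible: {is_credible(p_triple, THRESHOLD)}")
print(f"two pumps, independent: {p_independent:.1e}")
print(f"two pumps, dependent: {p_dependent:.1e}")
```

Setting P(B|A) to 1 is the same conservative assumption as taking no credit for the second relief valve: the dependent pair is then no less likely to fail than a single item, so a screening rule that assumes independence would understate its likelihood.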

Table 3. Typical sources of dependent failures.
Utilities, e.g., electrical power, instrument air, etc.
People, e.g.
  Designers, e.g., specification errors, hardware, and software design errors, human-process interface design errors
  Manufacturers, e.g., manufacturing errors
  Constructors, e.g., construction errors
  Operators, e.g., misoperation
  Mechanics, e.g., maintenance errors, miscalibration
Maintenance, e.g., tools, procedures, training
Operations, e.g., procedures, training
External factors, e.g., lightning, flooding
Common locations
Environmental factors, e.g.
  Electrical issues, e.g., power spike, voltage surge, high current level, static discharge, radiofrequency radiation
  Mechanical issues, e.g., shock, vibration
  Chemical issues, e.g., corrosive atmosphere, salt air, humidity, presence of water
  Physical issues, e.g., temperature, fire, dust, debris
  Usage issues, e.g., heavy or infrequent
Control systems, e.g., DCS
Similar technologies or the same type of redundant equipment
Process corrosion, plugging, or fouling, e.g., plugging of relief valves and sensors in a shutdown system
Single elements supporting multiple systems, e.g., common process taps, common conduit, single energy sources, single field devices, etc.
Susceptibility to misoperation, e.g., training, procedures, activity under abnormal stress

CONCLUSIONS
PHA provides a means to identify hazard scenarios for processes, including their initiating events and safeguard responses. Both single and multiple failures are possible for initiating events and within scenarios. They should be addressed in PHA; otherwise, important hazard scenarios may be missed, and the risks of hazard scenarios may be underestimated. Multiple failures may be independent or dependent. PHA teams should address these types of failures:

• Single failures, for example, a valve failure.
• Dependent multiple failures including CCFs, for example, a breaker failure resulting in multiple pump failures.
• Independent multiple failures if they meet guidelines established by the company or are considered credible by the team.
• Enabling events in combination with initiating events and safeguard failures.

Dependent multiple failures can be as likely as single failures since they reduce to a single failure. Generally, independent multiple failures are less likely, but they may still be credible. Inductive PHA methods are not well suited to identifying multiple failures, but teams should address them to the extent possible. Brainstorming approaches that are characteristic of PHA can be used. Deductive methods, such as FTA, do a better job, but their application for an entire process is not feasible owing to the time and effort involved. Moreover, the events they analyze must be identified by another method, some multiple failures are difficult to include, and completeness is not assured. Other recent Process Safety Progress articles have addressed PHA [7–9].

LITERATURE CITED
1. Center for Chemical Process Safety/American Institute of Chemical Engineers, Guidelines for Hazard Evaluation Procedures, 3rd edition, Center for Chemical Process Safety/American Institute of Chemical Engineers, New York, NY, 2008.
2. Center for Chemical Process Safety/American Institute of Chemical Engineers, Layer of Protection Analysis: Simplified Process Risk Assessment, Center for Chemical Process Safety/American Institute of Chemical Engineers, New York, NY, 2001.
3. P. Baybutt, “Analytical methods in process safety management and system safety engineering—Process hazards analysis,” Handbook of Loss Prevention Engineering, J.M. Haight (Editor), Wiley-VCH, Weinheim, Germany, 2013.
4. J.C. Belke, Recurring causes of recent chemical accidents, US EPA, Chemical Emergency Preparedness and Prevention Office, International Conference and Workshop on Reliability and Risk Management, AIChE/CCPS, San Antonio, TX, 1998.
5. T.E. Wierman, D.M. Rasmuson, and A. Mosleh, Common-Cause Failure Database and Analysis System: Event Data Collection, Classification, and Coding, NUREG/CR-6268, Rev. 1, U.S. Nuclear Regulatory Commission, Office of Nuclear Regulatory Research, Washington, DC, 2007.
6. Methods for Determining and Processing Probabilities, Report CPR-12E, VROM, The Hague, 2005.
7. M. Kaszniak, Oversights and omissions in process hazard analyses: Lessons learned from CSB investigations, Process Saf Prog 29 (2010), 264–269.
8. P. Baybutt, Process hazard analysis for phases of operation in the process life cycle, Process Saf Prog 31 (2012), 279–281.
9. P. Baybutt, What risk reduction measures should be credited in process hazard analysis? Process Saf Prog 31 (2012), 359–362.

