
Safety Science 92 (2017) 94–103


An Accident Causation Analysis and Taxonomy (ACAT) model of complex industrial system from both system safety and control theory perspectives

Weijun Li *, Laibin Zhang, Wei Liang **
Research Center of Oil & Gas Safety Engineering Technology, College of Mechanical and Transportation Engineering, China University of Petroleum, Beijing 102249, China

ARTICLE INFO

Article history:
Received 7 June 2016
Received in revised form 27 September 2016
Accepted 1 October 2016
Available online 6 October 2016

Keywords:
Accident causes classification
Risk analysis
Complex industrial system
System safety

ABSTRACT

Accident causation analysis is a good way to trace industrial accident causes and ultimately to prevent similar accidents from happening again. Classifying accident causes not only provides a comprehensive understanding of an accident but also supports cause statistics. Although many accident cause classification models or taxonomies have been proposed, some are domain-specific while others are too general or too complicated for practical application. To address the two basic issues of accident analysis, namely (1) what is the failure and (2) how does the failure happen, a new model is presented from both the system safety perspective and the control theory perspective. First, from the view of system safety factors, a complex system is decomposed into six components: machine, man, management, information, resources, and environment. From the control theory perspective, actuator, sensor, controller, and communication are defined as the functional abstractions of these system factors. The combinations of system factors and control functions form a matrix model for accident causation analysis and classification, named the Accident Causation Analysis and Taxonomy (ACAT) model. A comparison with existing cause classification schemes is then made, and the case of the BP Texas City refinery accident is used to illustrate the model's capability.

© 2016 Elsevier Ltd. All rights reserved.

1. Introduction

With the increasing complexity of engineered socio-technical systems, risks from any element can contribute to system failure. For a complex system, the incident frequency is low but the consequences are serious (Venkatasubramanian, 2011). Therefore, how to gain valuable and sufficient information from limited accidents remains a key problem. So far, a good way is to use accident causation analysis theories or models to summarize laws and common patterns from different failures so that similar accidents can be reduced and prevented (Dulac, 2007). With the evolution of methods, there are basically two research directions: structural decomposition and functional abstraction (Rasmussen and Suedung, 2000). Structural decomposition deconstructs a system into objects and defines causal explanations for each object's failures. It is the most common methodology for analyzing and categorizing complex system failures. Due to its simplicity and intuitiveness, structural decomposition has been widely used in risk analysis in the aviation area (Miller, 1991), information systems (Irani et al., 2001), industrial engineering (Lu and Liao, 2013), etc. However, the explanations are usually merely plausible and subjective. The other type of method is functional abstraction, which focuses on the representation of system functional relationships. It provides a way to analyze the behaviors of a system, such as closed-loop or feedback functions (Rasmussen and Suedung, 2000). Though functional abstraction was proposed later than structural decomposition, it has attracted great attention over the last decade (Debrincat et al., 2013; Goh et al., 2010; Hata et al., 2015; Ouyang et al., 2010). Although the functional abstraction method is more systemic in depicting a system failure as a dynamic or multi-level control process, its complexity limits its practical application (Rasmussen, 1997). To address these problems, a new accident causation model is needed that covers system elements as well as their functions.

* Corresponding author at: Research Center of Oil & Gas Safety Engineering Technology, Department of Safety Engineering, College of Mechanical and Transportation Engineering, China University of Petroleum, No. 18 Fuxue Road, Changping District, Beijing, China.
** Corresponding author at: College of Mechanical and Transportation Engineering, China University of Petroleum, No. 18 Fuxue Road, Changping District, Beijing, China.
E-mail addresses: weijunli2013@163.com (W. Li), tongxunzuozhe1@163.com, lw@cup.edu.cn (W. Liang).

http://dx.doi.org/10.1016/j.ssci.2016.10.001
0925-7535/© 2016 Elsevier Ltd. All rights reserved.

Before defining a complete accident analysis model, we should first reconsider why accidents occur. Undoubtedly, the simplest

and most straightforward way is to answer the following two questions: (1) what is the failure and (2) how does the failure happen? Apparently, the elements of a system are the subjects of failures. Previous studies (Edwards, 1972; Harris, 2006) have defined many system elements, such as man, machine, material, information, etc. The 5M (Man, Machine, Media, Management, and Mission) model has been generally accepted as a structured form to describe system factors and identify systemic risks. Conventionally, management is a broad concept which includes communication, supervision, decision making, regulations, etc. With the growing scale of management risk control (Waring, 2015), it is necessary to decompose the system in more detail and define a new model for system factors. Apart from the conventional 5M, information and resources are considered in this paper. Information consists of procedures, programs, methods, standards, regulations, etc. Resources include training, experts, raw materials, funds, products, etc. On the other hand, it is because a subject fails to perform its function that an accident happens. Hence, another important part of an accident analysis model should be function definition. Control theory has proven to be a useful method for the safety management of complex systems (Wahlström and Rollenhagen, 2014). According to control theory, a process comprises functions like actuator, sensor, controller, etc. As Leveson (2004) stated in her research, the safety problem can be treated as a control problem, and failures occur because the controller cannot handle components adequately. Therefore, control theory is used to describe the functions of each system factor. Based on both system elements and control theory, it is concluded that system elements and their functions dictate the type of failure. In other words, accidents result from objects' function failures. The new model presented in this paper uses a failure taxonomy (subjects) defined from the system safety perspective to guide causation analysis, and uses control theory to describe safety constraint (or function) failures.

The rest of this study is organized as follows. In Section 2, a brief description of the basic theories, including system safety factors and control theory, is introduced. Based on those theories, an Accident Causation Analysis and Taxonomy (ACAT) model is proposed, and its elements and functions are defined in this section. In Section 3, a comparison with other existing accident causation analysis models is made. Then the BP Texas refinery accident is used to illustrate the new model, and a comparison with the logic tree is provided in Section 4. Finally, in Section 5, conclusions are made.

2. Concepts and model

2.1. System safety factors

Since the man-machine-media (environment) model was first proposed by T.P. Wright, it has had a profound effect on accident analysis and prevention (Miller, 1991). Afterward, Management and Mission were introduced and the 5M model was established. In consideration of the complexity of system failure, more system factors have been incorporated into the 5M model. For instance, Miller (1967) summarized seven system safety factors: man, machine, media, management, time, cost, and information. Irani et al. (2001) proposed a variant "5M" (i.e., Man, Machine, Method, Material, and Money) model to evaluate the impact of human, process, and technology factors on information system failure. Kozuba (2013) suggested that though many efforts had been made to prevent undesirable flight-related events, human factors, technical factors, and organizational factors were still the main causes. Of all these systemic safety factor models, the initial 5M model is the most widely used one and has been generally accepted in many areas, especially in aviation domains. It is a structured method which describes the subjects of safety analysis.

For a long time, man, machine, media, management, and mission have been recognized as the main elements contributing to accidents. However, it is too vague to lump failures caused by supervision, decision making, regulations, or safety attitudes into management failure. The traditional management factor is a general subject which cannot provide more detailed types of failure. Based on a review of accidents, we identified six system safety factors: Man (M), Machine (M), Management (M), Environment (E), Information (I), and Resources (R). Among these system safety factors, machine refers to hardware in the plant, including all kinds of instruments, equipment, or vehicles. Man, also called human, refers to on-site personnel like operators, maintenance workers, office staff, installers, or field supervisors; their duties are to implement the decisions from managers. Management refers to supervision or decisions made by managers from plant units, companies, agencies, or government. Information includes procedures, programs, methods, standards, regulations, or laws. Resources include training, experts, raw materials, funds, energy, or products. Environment does not mean the physical environment but the social environment, because the physical environment, like weather, is beyond controllability. It usually includes safety culture, attitude, or issues left over by history. Take the BP Texas City refinery accident as an example: the inadequate preliminary hazard analysis and mechanical integrity programs (Baker et al., 2007) are categorized as information failures. To prevent this type of failure, attention should be paid to program formulation and evaluation.

So far, the first problem, what is the failure, has been settled. Further, it should be noted that every subject has dynamic characteristics rather than being a static element. In other words, the nature of system safety factor failure is that the factors did not perform their definitive functions. In the next section, the problem of how the failure happens will be addressed.

2.2. Control theory

The essence of control theory is to use sensors to measure the output, then compare the output performance with the desired performance by a monitor, and finally send feedback to the input actuators. Generally, control theory is used in control system engineering in the industrial field. At present, the philosophy of control theory has been applied in the system safety analysis of complex systems. Coze (2005) presents that the nature of a complex system is that different system components interact with each other to implement their functions. It means that when a system safety factor fails to perform its function, hazards or accidents occur.

Conventionally, different considerations are defined for each factor of a system to detail its potential risks (Everdij and Scholte, 2013). However, due to the lack of standards for the interpretation of these factors, different references present varied considerations. This leads to poor consistency in application. Hence, a structured theory is needed to guide the establishment of subgroups. Control theory can describe factors' functions and their communications with a closed loop. Each component in a control structure indicates a particular function that one factor should complete. A simplified diagram of a control structure is shown in Fig. 1. The basic components are actuator, sensor, controller, and communication, respectively.

Fig. 1. Simplified diagram of a control system.

Therefore, from the perspective of control theory, it is assumed that each system safety factor has these four functional characteristics. The definitions of the functional characteristics are shown in Table 1. For example, if the mission is to open a valve, the control system can be described as follows: (1) the operator who opens the valve is the actuator of the mission, (2) the sensor refers to the field supervisor whose job is to monitor the operation process, (3) an audit or evaluation should be made by a controller, and (4) all of these works require effective communication. Any missing function may lead to mission failure.

Table 1
Definitions of control system factors.

Functional factor    Function definition
Actuator             Take measures or execute commands
Sensor               Measure and monitor the output
Controller           Compare output performance with the reference
Communication        Connect elements and convey information

2.3. Model and definition

As stated above, from the safety engineering perspective, a complex system can be deconstructed into six subjects. From the control engineering perspective, every subject has four functional characteristics. Consequently, a matrix table is established by combining these two lists. Since failure diagnosis can also be viewed as a classification process (Venkatasubramanian, 2005), the new structured model is named the Accident Causation Analysis and Taxonomy (ACAT) model. The new model and each element's definition are shown in Table 2.

The code Hxx represents each type of hazard scenario or failure, which is a description of how a subject fails to perform its function. For example, H11 means that man does not implement the function of actuator successfully. With the guidance of this model, it will be less casual or subjective to identify or classify accident causes.

Fig. 2. 3M model (adapted from Song and Xie, 2014).

3. Comparison with other models

Six frequently used accident analysis models are chosen for comparison with ACAT: 3M, 5M, HFACS, AcciMap, STAMP, and TeCSMART. A brief introduction of these models is given below.

Table 2
ACAT model and elements' definitions.

Subject \ Function    Actuator (A)    Sensor (S)    Controller (C)    Communication (O)
Man (M)               H11             H12           H13               H14
Machine (M)           H21             H22           H23               H24
Management (M)        H31             H32           H33               H34
Information (I)       H41             H42           H43               H44
Resources (R)         H51             H52           H53               H54
Environment (E)       H61             H62           H63               H64

No     Description
H11    Fail to take effective actions
H12    Fail to monitor, or fail to detect the human failure in time
H13    Fail to follow procedures
H14    Lack of effective communication between operators
H21    Design deficiency or malfunction
H22    Fail to monitor or detect the machine failure in time
H23    Lack of sufficient machine maintenance
H24    Information from equipment is not captured or interpreted
H31    Fail to manage workers, equipment, or organization appropriately
H32    Fail to monitor organizational failure or manage change
H33    Fail to follow procedures, or inadequate organizational decision
H34    Lack of communication within decision levels
H41    Wrong or inadequate information
H42    Fail to monitor or update information
H43    Fail to establish information
H44    Fail to deliver or interpret information
H51    Lack of training, experts, raw materials, funds, energy, or products
H52    Fail to monitor the resource spending or changes
H53    Inadequate allocation of resources
H54    Fail to deliver resources or resource needs
H61    Ignore warnings or issues in previous events
H62    Fail to monitor the environment change
H63    No response to poor safety culture or attitude
H64    Lack of communication culture
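The matrix structure of Table 2 is regular enough to be generated mechanically. The following sketch (ours, not the paper's) builds the 24 Hij codes from the cross product of the six subjects and four functions:

```python
# Sketch: generate the ACAT matrix of Table 2 as a dictionary mapping each
# hazard code Hij to its (subject, function) pair. Names follow Table 2;
# the variable names are our own.
SUBJECTS = ["Man", "Machine", "Management", "Information", "Resources", "Environment"]
FUNCTIONS = ["Actuator", "Sensor", "Controller", "Communication"]

ACAT = {
    f"H{i}{j}": (subject, function)
    for i, subject in enumerate(SUBJECTS, start=1)
    for j, function in enumerate(FUNCTIONS, start=1)
}

print(len(ACAT), ACAT["H11"], ACAT["H34"])
# → 24 ('Man', 'Actuator') ('Management', 'Communication')
```

Row index i picks the failure subject and column index j the failed control function, so every cell of the matrix is a distinct, mutually exclusive hazard category.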

3.1. 3M model

3M represents man (human), machine, and media (environment), see Fig. 2. It contradicts the traditional accident causation theories which blame a single operator or equipment failure. Based on it, a new scientific field called Man-Machine-Environment System Engineering (MMESE) emerged (Long and Dhillon, 2015). However, due to its over-simplicity, it was quickly replaced by frameworks with more factors.

3.2. 5M model

The widely accepted 5M model refers to Man, Machine, Media, Management, and Mission (FAA, 2000). Man, Machine, and Media overlap to form Mission, and Management surrounds the other four factors, as shown in Fig. 3. It provides a structured method to decompose a system and has various variations and applications (Everdij and Scholte, 2013; Luo et al., 2014). Because it is too general and less detailed, more explanations are needed for each factor, and different users have varied considerations. Therefore, its poor consistency has restricted its application.

Fig. 3. 5M model (adapted from Miller, 1991).

3.3. HFACS

Although the Human Factors Analysis and Classification System (HFACS) proposed by Wiegmann and Shappell (2003) mainly addresses human factors failure, it shows great advantages as an accident causes taxonomic approach (Lenné et al., 2012). Four types of failures were presented: unsafe acts, preconditions for unsafe acts, unsafe supervision, and organizational influences, as shown in Fig. 4. Apart from human errors, underlying latent conditions were also considered in the HFACS framework, including environmental factors such as weather, lighting, equipment design, automation, etc. Unlike the 5M model, subcategories were defined for each of these failures, which increases its consistency.

Fig. 4. HFACS framework (adapted from Shappell et al., 2007).



3.4. AcciMap model

The Accident Map (AcciMap) model is a six-layer framework composed of government, regulators and associations, company, management, staff, and work, as shown in Fig. 5. Rasmussen (1997) designed this six-level model mainly for industrial risk management. A vertical interaction control mechanism was first explained by treating an accident as control failures between actors at each of these levels (Cassano-Piche et al., 2009). It decomposes a system from the view of organizational levels. However, some comments pointed out that it lacks a failure taxonomy to guide analysis (Salmon et al., 2012).

3.5. STAMP model

Leveson's STAMP (Systems-Theoretic Accident Model and Processes) model describes a socio-technical system as a dynamic control process, see Fig. 6. She described some basic system components and focused on the upper levels of the AcciMap model. For example, Congress and Government Regulatory Agencies were defined in much more detail compared with AcciMap. Besides, it provided detailed guidance for accident factor classification (Leveson, 2004).

3.6. TeCSMART model

To identify common failure mechanisms and modes across different domains, Venkatasubramanian proposed a seven-level conceptual framework called the Teleo-Centric System Model for Analyzing Risks and Threats (TeCSMART) (Venkatasubramanian and Zhang, 2016). It comprises society, government, regulatory, market, company, plant, and equipment levels, as shown in Fig. 7. It provides a more complete and broad view of system failure. Compared with the AcciMap model, it considers two more factors: society and market. However, modeling a system at multiple levels requires a substantial amount of expert knowledge, which makes the analysis process too complicated and time-consuming.

Apparently, all of these researches have contributed a lot to the accident causation analysis of complex systems. Meanwhile, arguments, comparisons, and improvements have never stopped (Ergai et al., 2016; Salmon et al., 2012; Underwood and Waterson, 2014).

The ACAT model developed in this paper can be seen as an extension of the conventional 5M model. It defines the subgroups of each element in more detail by incorporating control theory. Meanwhile, it provides a structured accident causes taxonomy approach.

In order to make an overall comparison, seven characteristics are chosen: generality, communicability, integration, consistency, taxonomy, completeness, and simplicity. The number of circles "○" indicates the extent to which one model has a certain characteristic. Models with more circles have wider applicability than those with fewer circles. For example, generality was evaluated for each model in the first row of Table 3. TeCSMART was derived from a comparative analysis of 13 systemic accidents in different domains, including the aerospace field, petrochemical field, public health, economics, medicine, etc. (Venkatasubramanian and Zhang, 2016). Therefore, TeCSMART was assigned seven circles, which means that it has the strongest universality or generality compared with the other six models. The comparisons are shown in Table 3.

4. Model evaluation with BP Texas refinery case

4.1. Incident description and logic tree

One of the largest industrial disasters occurred at the BP Texas City refinery on March 23, 2005, leading to 15 fatalities, 180 injuries, and over 1.5 billion dollars of financial losses (CSB, 2007). Like most other industrial disasters, what happened at the BP Texas City refinery was a doomed event rather than a random anomaly. A brief description of this accident follows.

The accident occurred during startup of the isomerization unit (ISOM) raffinate splitter section. Fig. 8 illustrates the brief

Fig. 5. AcciMap model (adapted from Rasmussen, 1997).



Fig. 6. STAMP model (adapted from Leveson, 2004).

processes from ISOM startup to accident. According to the investigation report, potential hazards existed in every step of the process, which finally contributed to the severe consequences.

In order to trace the root causes, the US Chemical Safety Board (CSB) report provided a detailed logic tree map (CSB, 2007). Generally, the operable basic events of a logic tree, which are drawn at the bottom of the figure, are regarded as root causes. It has proven a convenient method for analyzing accident causes. However, for a large and complex system, the logic tree will be too complex and voluminous for readers to understand. In this case, the original report showed the logic tree over thirteen whole pages. According to those detailed but also complicated logic trees, 75 bottom events were identified. Failures in procedures, training, policies, communications, oversight, operations, software systems, and budgets were all placed in the layer of root causes. However, there are some limitations to the logic tree method. First, no taxonomies or summaries were provided, since some bottom events are repetitive or refer to the same type of causes. For example, there are five basic events revealing that "lack of training" is a crucial cause, but these events were not further grouped into one category. Second, some root causes were not detailed enough for the convenience of developing corresponding measures. For example, "production pressures" was summarized as one of the root causes, but whether these pressures were caused by budget or market was not indicated. Besides, this deductive and top-down method cannot include all possible initiating events identified by investigators. In other words, some evidence indicated in reports was not identified as possible causes in the logic tree.

4.2. Causes taxonomy with new model

There have been many excellent articles studying this disaster, see (Holmstrom et al., 2006; Manca and Brambilla, 2012; Saleh et al., 2014). Admirably, not only technical reasons or human errors but also managerial and organizational factors were considered in these researches. It suggests that a broad view of accident analysis and prevention is the general trend. Official accident reports usually dissect accidents in extreme detail, because many experts and scholars in various fields are deployed to investigate root causes; there is no doubt that they elaborate the most detailed analysis results, and subsequent researches were mostly based on them. However, the major drawback of most official reports is that these large blocks of text lack a structured or regular framework for readers to capture the main contributing factors. Furthermore, they are not helpful for recognizing common accident patterns. So in this paper, the method described above was used to recognize and classify the causes of the BP Texas City disaster.

Based on the event sequence introduced in the CSB report, the accident causes taxonomy with the new model is shown in Table 4. The detailed explanations or evidence for each cause can be found in the supplementary material. Take an example to illustrate how to recognize causes with the new model. The report stated that malfunctioning instrumentation, like the control valve and sight glass, was reported to the Day Board Operator but no action was taken. In this scenario, the Day Board Operator is defined as the failure subject, and his function as an actuator failed. So this cause can be categorized into H11 (fail to take effective actions). To eliminate this type of hazard and prevent it from happening again, measures such as training, supervision, and working specifications should be taken.

Each cell in Table 4 represents a type of failure cause. The four columns represent the four types of functions and the six rows represent the six system factors. Agents or subjects who failed to perform their functions can be identified according to these causal factors. For example, the third row (H31–H34) indicates the failure of management.

Fig. 7. TeCSMART framework (adapted from Venkatasubramanian and Zhang, 2016).

Table 3
Comparison of ACAT model with other models.

Models              3M        5M        HFACS     ACAT      AcciMap   STAMP     TeCSMART
Generality          ○         ○○○       ○○        ○○○○      ○○○○○     ○○○○○○    ○○○○○○○
Communicability     ○○        ○○○       ○         ○○○○      ○○○○○○    ○○○○○○○   ○○○○○
Integration         ○○○○○     ○○○○○○    ○○○○      ○○○○○○○   ○○        ○         ○○○
Consistency         ○         ○○○       ○○○○○     ○○○○○○○   ○○○○      ○○○○○○    ○○
Failure taxonomy    ○○        ○○○○      ○○○○○○    ○○○○○○○   ○○○       ○         ○○○○○
Completeness        ○         ○○        ○○○       ○○○○      ○○○○○     ○○○○○○    ○○○○○○○
Simplicity          ○○○○○○○   ○○○○○○    ○○○○○     ○○○○      ○○○       ○○        ○
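Since the circle counts in Table 3 rank the seven models from 1 to 7 on each characteristic, they can also be tallied numerically. The equal-weight sum below is our simplification for illustration, not a scoring scheme proposed by the paper:

```python
# Circle counts transcribed from Table 3, one row per model, columns in the
# order: generality, communicability, integration, consistency,
# failure taxonomy, completeness, simplicity. Equal weighting is assumed.
scores = {
    "3M":       [1, 2, 5, 1, 2, 1, 7],
    "5M":       [3, 3, 6, 3, 4, 2, 6],
    "HFACS":    [2, 1, 4, 5, 6, 3, 5],
    "ACAT":     [4, 4, 7, 7, 7, 4, 4],
    "AcciMap":  [5, 6, 2, 4, 3, 5, 3],
    "STAMP":    [6, 7, 1, 6, 1, 6, 2],
    "TeCSMART": [7, 5, 3, 2, 5, 7, 1],
}

totals = {model: sum(row) for model, row in scores.items()}
print(max(totals, key=totals.get), totals["ACAT"])  # → ACAT 37
```

Under this crude weighting, ACAT's balanced profile (strong on integration, consistency, and failure taxonomy, middling elsewhere) gives it the highest total, which matches the qualitative argument of the comparison.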

Fig. 8. Event chain from ISOM startup to accident (startup preparations; shift turnover; tower filling; tower shutdown; tower re-startup; tower overfills; tower overflows; safety relief valves open; liquid flows into blowdown drum; blowdown stack overflows; liquid flow into sewer system; fire and explosion).



Table 4
BP Texas city accident causes taxonomy with ACAT.

H11 H12 H13 H14


11-1 Day Board Operator did not 12-1 The frontline supervisor 13-1 Didn’t undertake 14-1 Supervisor did not review startup
take actions even if the malfunc- didn’t meet requirements procedures procedure
tioning control valve and sight 12-2 Lack of oversight 13-2 Didn’t implement 14-2 shift turnover
glass were reported to him 12-3 Inadequate supervision program 14-3 No record
13-3 Lack of risks 14-4 Miscommunication
understanding 14-5 Communication Less than Adequate (LTA)
H21 H22 H23 H24
21-1 Wrong sitting of trailers 22-1 No evaluation for blow- 23-1 Level transmitter, sight 24-1 Inadequate information on control
21-2 Inappropriate blowdown down drum glass, and pressure valve were board
system 22-2 No checks to instrument identified as malfunctioning 24-2 Unnoticed switch failure
21-3 No automatic shutdowns or 22-3 Lack of work order tracking but were not repaired 24-3 fault reading
safety interlocks
21-4 Malfunctioned control board
21-5 Inappropriate DCS system
H31 H32 H33 H34
31-1 Managers did not encourage 32-1 No Management of Change 33-1 Managers did not 34-1 The Health and Safety Executive report
the reporting of incidents (MOC) implement OSHA safety identified needed changes, but BP Group did
31-2 Process Hazard Analysis 32-2 Lack of monitor for mal- regulations not systematically review its refinery operations
(PHA) team did not provide functions by BP 33-2 Managers did not and corporate governance
recommendations 32-3 Lack of oversight of pro- implement safety policies 34-2 Production pressures
31-3 Managers did not identify grams by BP 33-3 Siting of trailers was
H31 (continued):
… hazard scenarios
31-4 OSHA does not require employers to evaluate changes
31-5 OSHA did not identify the likelihood of a catastrophic incident
31-6 BP's wrong determination

H32 (continued):
32-4 Lack of oversight of people by BP
32-5 Supervisors' underestimation

H33 (continued):
… against BP policy
33-4 Startup process was against BP safety guidelines
33-5 Pre-Startup Safety Review (PSSR) was not conducted

H41:
41-1 Incomplete API 752
41-2 Inappropriate vehicle policy
41-3 Inadequate shift turnover procedures
41-4 Wrong procedure statements
41-5 Job requirement and training program LTA
41-6 Vulnerability based on non-applicable criteria
41-7 Inadequate reporting programs
41-8 Inappropriate occupancy quantification method
41-9 Inappropriate operating envelope
41-10 Poor traffic policy
41-11 Incomplete API 521
41-12 Inappropriate checklist

H42:
42-1 Lack of oversight of major accident prevention programs management
42-2 Outdated equipment data
42-3 Outdated procedures did not address recurring operational problems during startup
42-4 Outdated PSI (Process Safety Information)

H43:
43-1 No formal shift turnover or policy
43-2 Lack of a human fatigue-prevention policy
43-3 No malfunctioning level transmitter repair policy
43-4 No formal policy to prevent startup across different control rooms
43-5 Separating startup phase was not included in procedures

H44:
44-1 Inadequate information in logbooks, databases or reports
44-2 Inadequate information in maintenance work order system
44-3 No serious safety failures information in reports
44-4 No feed or product-routing instructions
44-5 No central database or repository for data sheets

H51:
51-1 Inadequate operator training and staffing
51-2 Lack of technically trained personnel during the startup
51-3 OSHA's lack of inspectors

H52:
52-1 Money problems
52-2 Inadequate evaluation by BP Group and Texas City board member

H53:
53-1 No extra board operator
53-2 Overworked operators
53-3 No health and safety …
53-4 Inappropriate work assignment

H54:
54-1 Budget cuts did not consider the maintenance needs
54-2 Recommendations for extra board operator not followed

H61:
61-1 Previous instruments problems were not fixed
61-2 Previous warning signs were ignored
61-3 History of allowing variance to procedures
61-4 Incident investigations LTA

H62:
62-1 Poorly managed corporate mergers, leadership and organizational changes, staff reductions, and budget cuts

H63:
63-1 Poor work environment
63-2 Unsafe compromises
63-3 Inappropriate BP Group oversight and Texas City management culture
63-4 Lack of reporting and learning culture

H64:
64-1 Reporting bad news was not encouraged
64-2 Fear of reprisal
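The cell labels above encode each cause's position in the ACAT matrix: the first digit indexes the system-factor row and the second digit indexes the control-function column. The following Python sketch (not from the paper; the row and column orderings follow the factor and function lists given earlier in the text, and the sample codes are an illustrative subset, not the full list of 90 causes) shows how such codes can be tallied to produce the percentage breakdowns reported in the statistics:

```python
from collections import Counter

# Assumed orderings: rows H1x-H6x follow the paper's factor list
# (machine, man, management, information, resources, environment);
# columns x1-x4 follow actuator, sensor, controller, communication.
FACTORS = ["Machine", "Man", "Management", "Information", "Resources", "Environment"]
FUNCTIONS = ["Actuator", "Sensor", "Controller", "Communication"]

def tally(cause_codes):
    """Count causes per system factor and per control function.

    A code like "41-3" encodes row 4 (Information) and column 1
    (Actuator); the digit after the dash is just an item index.
    """
    factor_counts = Counter()
    function_counts = Counter()
    for code in cause_codes:
        row, col = int(code[0]), int(code[1])
        factor_counts[FACTORS[row - 1]] += 1
        function_counts[FUNCTIONS[col - 1]] += 1
    return factor_counts, function_counts

def percentages(counts):
    """Convert raw counts to whole-number percentages of the total."""
    total = sum(counts.values())
    return {k: round(100 * v / total) for k, v in counts.items()}

# Illustrative subset of codes from the rows above (hypothetical sample).
codes = ["41-1", "41-2", "42-1", "43-1", "44-1", "61-1", "63-1", "64-1"]
factor_counts, function_counts = tally(codes)
print(percentages(factor_counts))
print(percentages(function_counts))
```

With the paper's full set of 90 codes, the same tally reproduces the reported figures (e.g. 12 "Man" causes out of 90 rounds to 13%).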

It is obvious that the BP Group, managers, OSHA, and the hazard analysis team are held responsible for the managerial failures. From the view of control failure, inappropriate behaviors of agents or functions of subjects can be recognized in the four columns. For instance, the fourth column (H14–H64) indicates failures in communication, including miscommunication or a lack of communication between supervisors and operators, between machine and operators, between the Health and Safety Executive and BP, between databases and logbooks, between budget cuts and maintenance needs, between market and company, etc.
The causes in italics represent the bottom events identified in the logic tree. By comparing the causes listed in the logic tree and in the ACAT table, it can be concluded that the ACAT model can not only help to identify accident causes from different levels but also classify
causal factors for convenience and feasibility. Furthermore, the proportions of the different types of causes can be obtained by counting percentages. For example, 12 of the total 90 failure causes are categorized as "Man" failures; therefore, the percentage of "Man" failure causes is 12 divided by 90, which comes to 13%. The other percentages are obtained in the same way. Two pie charts are drawn, as shown in Figs. 9 and 10.

Fig. 9. Factor failure chart. (Pie chart: Information 29%, Management 20%, Man 13%; Machine, Resources, and Environment account for the remaining 14%, 12%, and 12%.)

Fig. 10. Functional failure chart. (Pie chart: actuator failure 35%; sensor, controller, and communication failures account for the remaining 24%, 21%, and 20%.)

It can be seen from Fig. 9 that information factor failures (29%) and management factor failures (20%) contribute most to the accident. These results warn managers that more attention should be paid to procedures, programs, methods, standards, and regulations. Besides, management flaws at the refinery site, in the BP Group, and at OSHA are revealed as another major cause, which should also be noted.

Fig. 10 shows that actuator failure (35%) accounts for the highest proportion of functional failures, which means that execution was poor. Therefore, more attention should be paid to strengthening execution and enforcement. In addition, the other three functions also affected the safety of the system profoundly. This confirms that each component in the feedback control loop must perform its function well to keep the whole system running safely.

5. Conclusions

For complex systems, accidents are usually caused by complex failures. This requires an accident analysis tool that is detailed enough to address each type of possible hazard but, for convenience of operation, is also expected to be as simple as possible. Structural decomposition models simplify systemic analysis by decomposing a system into several factors (such as human, machine, and material), but they lack consistent consideration of these factors' behaviors. Functional abstraction models use control-constraint concepts to describe a system's behaviors, but these multi-level frameworks are too complicated to be applied in hazard identification and cause classification. Considering this dilemma, a model called Accident Causation Analysis and Taxonomy (ACAT) was established based on both system safety and control theory. Comparing it with 3M, 5M, HFACS, AcciMap, STAMP, and TeCSMART shows that ACAT has advantages in integration, consistency, and taxonomy.

The case study of the BP Texas refinery explosion confirmed that ACAT can identify and further categorize accident causes so that managers or analysts can grasp a complete picture of the accident. Ninety causes of the accident were analyzed and categorized. Based on the statistical results, 29% of the causes are related to the information factor and 20% stem from management flaws. Although the CSB did not perform a statistical analysis, it roughly summarized the root causes as failures of the BP Group Board, senior executives, and BP Texas City managers (CSB, 2007), which is consistent with our results. However, as mentioned at the beginning of the text, management is a broad concept that should be described in detail. Thus, specific measures can be taken with the guidance of the ACAT model; for the BP Texas refinery accident, more attention should be paid to procedures, programs, methods, standards, and regulations.

We assumed that any complex system can be considered a control system: actuator, sensor, controller, and communication coordinate to make the system work smoothly and continuously. When carrying out an ACAT analysis, we need to recognize which function is ineffective so that targeted measures can be taken to fix the control loop.

Compared to existing complex systemic analysis methods, the ACAT model can benefit not only accident analysis but also accident statistics. It is based on a system framework, solid theories, and accident reviews. It helps accident investigators to understand accidents from a broad view and gather more information, and it warns managers to consider all of the system factors to identify hazards and prevent accidents.

Acknowledgments

The authors greatly appreciate the supervision by Professor Venkat Venkatasubramanian during Weijun's academic visit at Columbia University. The authors are also grateful to all the accident report boards and organizations. This paper is supported and funded by the National Science and Technology Major Project of China (Grant No. 2011ZX05055), the Science Foundation of China University of Petroleum, Beijing (No. 2462015YQ0406), and the China Scholarship Council ([2015]3022).

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.ssci.2016.10.001.

References

Baker, III J.A., Leveson, N., Bowman, F.L., Priest, S., Erwin, G., Rosenthal, I., Gorton, S., Tebo, P.V., Hendershot, D., Wiegmann, D.A., Wilson, L.D., 2007. The Report of the BP US Refineries Independent Safety Review Panel. <http://www.csb.gov/assets/1/19/Baker_panel_report1.pdf> (Feb. 16, 2016).
Cassano-Piche, A.L., Vicente, K.J., Jamieson, G.A., 2009. A test of Rasmussen's risk management framework in the food safety domain: BSE in the UK. Theor. Issues Ergon. Sci. 10 (4), 283–304.
Coze, J.C.L., 2005. Are organisations too complex to be integrated in technical risk assessment and current safety auditing? Saf. Sci. 43 (8), 613–638.

CSB, US Chemical Safety and Hazard Investigation Board, 2007. Investigation report: refinery explosion and fire. <http://www.csb.gov/assets/1/19/csbfinalreportbp.pdf> (Dec. 11, 2015).
Debrincat, J., Bil, C., Clark, G., 2013. Assessing organisational factors in aircraft accidents using a hybrid Reason and AcciMap model. Eng. Fail. Anal. 27, 52–60.
Dulac, N., 2007. A framework for dynamic safety and risk management modeling in complex engineering systems. Doctoral dissertation, Massachusetts Institute of Technology.
Edwards, E., 1972. Man and machine: systems for safety. In: Proceedings of British Airline Pilots Association Technical Symposium. British Pilots Association, London, pp. 21–36.
Ergai, A., Cohen, T., Sharp, J., Wiegmann, D., Gramopadhye, A., Shappell, S., 2016. Assessment of the human factors analysis and classification system (HFACS): intra-rater and inter-rater reliability. Saf. Sci. 82, 393–398.
Everdij, M.H.C., Scholte, J.J., 2013. Unified framework for FAA risk assessment and risk management: toolset of methods for safety risk management. Federal Aviation Administration. <http://www.nlr-atsi.nl/downloads/rarm-toolset-of-methods-for-safety-risk-manage.pdf> (Feb. 16, 2016).
FAA, Federal Aviation Administration, 2000. FAA System Safety Handbook, Chapter 15: Operational Risk Management. <https://www.faa.gov/regulations_policies/handbooks_manuals/aviation/risk_management/ss_handbook/media/Chap15_1200.pdf> (Feb. 16, 2016).
Goh, Y.M., Brown, H., Spickett, J., 2010. Applying systems thinking concepts in the analysis of major incidents and safety culture. Saf. Sci. 48 (3), 302–309.
Harris, D., 2006. The influence of human factors on operational efficiency. Aircr. Eng. Aerosp. Technol. 78 (1), 20–25.
Hata, A., Araki, K., Kusakabe, S., Omori, Y., Lin, H.H., 2015. Using hazard analysis STAMP/STPA in developing model-oriented formal specification toward reliable cloud service. In: Platform Technology and Service (PlatCon), 2015 International Conference. IEEE, pp. 23–24.
Holmstrom, D., Altamirano, F., Banks, J., Joseph, G., Kaszniak, M., Mackenzie, C., Wallace, S., 2006. CSB investigation of the explosions and fire at the BP Texas City refinery on March 23, 2005. Process Saf. Prog. 25 (4), 345–349.
Irani, Z., Sharif, A.M., Love, P.E., 2001. Transforming failure into success through organizational learning: an analysis of a manufacturing information system. Eur. J. Inf. Syst. 10 (1), 55–66.
Kozuba, J., 2013. The role of the human factor in maintaining the desired level of air mission execution safety. In: International Conference of Scientific Paper AFASES, Brasov. <http://213.177.9.66/ro/afases/2013/air_force/Kozuba.pdf> (Dec. 11, 2015).
Lenné, M.G., Salmon, P.M., Liu, C.C., Trotter, M., 2012. A systems approach to accident causation in mining: an application of the HFACS method. Accid. Anal. Prev. 48, 111–117.
Leveson, N., 2004. A new accident model for engineering safer systems. Saf. Sci. 42 (4), 237–270.
Long, S., Dhillon, B.S., 2015. Proceedings of the 13th International Conference on Man-Machine-Environment System Engineering. Springer, Berlin Heidelberg.
Lu, W., Liao, T., 2013. Preliminary discussion on strengthening safety management of urban metro equipment based on 5M1E factors. In: Advances in Industrial Engineering, Information and Water Resources, 311. WIT Press.
Luo, X., Zhao, S., Zeng, X., Li, L., 2014. Research on fatigue risk management of airport staff. In: Proceedings of the 13th International Conference on Man-Machine-Environment System Engineering. Springer, Berlin Heidelberg, pp. 3–12.
Manca, D., Brambilla, S., 2012. Dynamic simulation of the BP Texas City refinery accident. J. Loss Prev. Process Ind. 25 (6), 950–957.
Miller, C.O., 1967. The Role of Systems Safety in Aerospace Management. Institute of Aerospace Safety and Management, University of Southern California.
Miller, C.O., 1991. Investigating the management factors in an airline accident. Flight Safety Digest 10 (5), 1–15.
Ouyang, M., Hong, L., Yu, M.H., Fei, Q., 2010. STAMP-based analysis on the railway accident and accident spreading: taking the China–Jiaoji railway accident for example. Saf. Sci. 48 (5), 544–555.
Rasmussen, J., 1997. Risk management in a dynamic society: a modelling problem. Saf. Sci. 27 (2), 183–213.
Rasmussen, J., Suedung, I., 2000. Proactive Risk Management in a Dynamic Society. Swedish Rescue Services Agency.
Saleh, J.H., Haga, R.A., Favarò, F.M., Bakolas, E., 2014. Texas City refinery accident: case study in breakdown of defense-in-depth and violation of the safety–diagnosability principle in design. Eng. Fail. Anal. 36, 121–133.
Salmon, P.M., Cornelissen, M., Trotter, M.J., 2012. Systems-based accident analysis methods: a comparison of Accimap, HFACS, and STAMP. Saf. Sci. 50 (4), 1158–1170.
Shappell, S., Detwiler, C., Holcomb, K., Hackworth, C., Boquet, A., Wiegmann, D.A., 2007. Human error and commercial aviation accidents: an analysis using the human factors analysis and classification system. Human Factors: J. Hum. Fact. Ergon. Soc. 49 (2), 227–242.
Song, X., Xie, Z., 2014. Application of man-machine-environment system engineering in coal mines safety management. Procedia Eng. 84, 87–92.
Underwood, P., Waterson, P., 2014. Systems thinking, the Swiss Cheese Model and accident analysis: a comparative systemic analysis of the Grayrigg train derailment using the ATSB, AcciMap and STAMP models. Accid. Anal. Prev. 68, 75–94.
Venkatasubramanian, V., 2005. Prognostic and diagnostic monitoring of complex systems for product lifecycle management: challenges and opportunities. Comput. Chem. Eng. 29 (6), 1253–1263.
Venkatasubramanian, V., 2011. Systemic failures: challenges and opportunities in risk management in complex systems. AIChE J. 57 (1), 2–9.
Venkatasubramanian, V., Zhang, Z., 2016. TeCSMART: a hierarchical framework for modeling and analyzing systemic risk in sociotechnical systems. AIChE J. http://dx.doi.org/10.1002/aic.15302.
Wahlström, B., Rollenhagen, C., 2014. Safety management – a multi-level control problem. Saf. Sci. 69 (1), 3–17.
Waring, A., 2015. Managerial and non-technical factors in the development of human-created disasters: a review and research agenda. Saf. Sci. 79, 254–267.
Wiegmann, D.A., Shappell, S.A., 2003. A Human Error Approach to Aviation Accident Analysis: The Human Factors Analysis and Classification System. Ashgate, Burlington, VT.
