
Safety Science 166 (2023) 106238

Contents lists available at ScienceDirect

Safety Science
journal homepage: www.elsevier.com/locate/safety

Hazards correlation analysis of railway accidents: A real-world case study based on the decade-long UK railway accident data
Ning Wang a, Xin Yang a,*, Jianhua Chen b,*, Hongwei Wang c, Jianjun Wu a
a State Key Laboratory of Advanced Rail Autonomous Operation, Beijing Jiaotong University, Beijing, China
b Shenzhen Polytechnic, Shenzhen, China
c National Research Center of Railway Safety Assessment, Beijing Jiaotong University, Beijing, China

ARTICLE INFO

Keywords: Hazards correlation analysis; Knowledge graph; Railway accidents; Safety management

ABSTRACT

With the continuous development and construction of railway transportation, railway accidents occur frequently, which greatly threatens the safety of passengers' lives and the further development of the railway industry. Discussing and summarizing experience from past accidents is beneficial for improving railway safety. This paper proposes a modeling method for the correlation analysis of hazards in railway accidents based on knowledge graph theory. By describing the association between accidents and hazards in the knowledge graph network, the potential law of accident occurrence is revealed. The innovation of this study is that it considers the correlations between hazards. In addition, the hazards are further refined, and new topology indexes adapted to the heterogeneous structure of the knowledge graph are presented. Based on actual railway accident data from the UK, a number of key hazards are identified using the proposed methods. The experimental results show that, as key hazards are controlled one by one, the harmful consequences that hazards bring to accidents decrease continuously. Finally, based on the experimental exploration of the correlation between key hazards, corresponding preventive measures are developed. The knowledge-graph-based method is expected to be applied to explore the relationship between hazards in railway accidents and to provide additional decision-making information for the prevention of railway accidents.

1. Introduction

Railway transportation is a high-speed, high-capacity, and high-frequency mode of transportation. It involves the safety of numerous personnel, goods and materials. If an accident occurs during railway transportation, it will not only cause serious casualties and property losses, but also have a significant impact on social and economic development. In the railway transportation system, train derailment, runaway and collision are common accidents, and major railway accidents have happened in various countries. Most of these accidents have caused serious consequences such as heavy casualties. For example, on April 28, 2008, the T195 train from Beijing to Qingdao derailed on the Jiaoji railway and collided with the 5034 train from Yantai to Xuzhou, causing 71 deaths and 416 injuries. On February 22, 2012, in Buenos Aires, Argentina, a train hit a buffer barrier at the end of the rail and derailed, killing 49 people and injuring more than 600. Therefore, ensuring railway transportation safety is an important link in safeguarding national security and people's lives and property.

Given these facts, railway accidents have received extensive attention from scholars. For example, Evans (2021), among others, examines fatal train accidents on European railways up to 2019. Many hazards can cause railway accidents. In order to prevent railway accidents, it is necessary to explore and study the key hazard factors behind them. In recent years, many scholars have analyzed the hazard factors of railway accidents. Chang et al. (2009) examine the causality among driving performance, traffic actors and intersection accidents using path analysis. Stanton and Walker (2011) discuss the psychological factors involved in the Ladbroke Grove rail accident. Based on an ordered probit model and data analysis, Hao et al. (2016) find that driver, environment and weather characteristics have a strong impact on the severity of injuries in accidents at highway-rail grade crossings. Haleem (2016) investigates hazard factors of traffic casualties at private highway-railroad grade crossings in the United States, including temporal crash characteristics,

* Corresponding authors.
E-mail addresses: 11111047@bjtu.edu.cn (X. Yang), cjh@szpt.edu.cn (J. Chen).

https://doi.org/10.1016/j.ssci.2023.106238
Received 20 September 2022; Received in revised form 20 April 2023; Accepted 16 June 2023
Available online 25 June 2023
0925-7535/© 2023 Elsevier Ltd. All rights reserved.

geometry, railroad, traffic, vehicle, and environment. Li et al. (2019) integrate the systems-theoretic accident model and processes (STAMP) with the human factors analysis and classification system (HFACS), and reveal a number of prominent accident causes attributable to human factors. However, most railway accidents are not caused by a single factor. The hazards that lead to railway accidents can be divided into four categories: personnel factors, equipment factors, environment factors and management factors. Scholars also discuss the connections between multiple hazard factors. Mirabadi and Sharifian (2010) use association rule mining to analyze data on past accidents on Iran's railways, consider human factors, vehicles, tracks, signal systems and other hazard factors, and identify the potential relations between them. Evans and Hughes (2019) study the relations between travelers, delays and failures to road users at railway level crossings in the UK. Exploring the relations between different hazards and obtaining valuable knowledge from them will contribute to the development of railway safety.

In safety hazard analysis across many fields, network models are commonly used to reveal the characteristics and hidden knowledge of complex systems (Wu et al., 2021; Zeng et al., 2021). At present, the commonly used network models include Bayesian networks, complex networks and, more recently, knowledge graphs.

Bayesian networks are often used to infer uncertain knowledge in railway transportation. Logical structure diagrams and probability calculations are used for reasoning to explore useful information in railway transportation. Wang et al. (2017a, 2017b) build a Bayesian network fault prediction model with fault causes as model variables to evaluate the impact of weather on railway switches. Dindar et al. (2018) use a Bayesian network to study the impact of weather on railway turnout systems and analyze the probability of train derailment. Huang et al. (2020) use interpretative structural modeling combined with Bayesian networks to quantitatively analyze the relationship between various factors in the railway dangerous goods transportation system. However, when Bayesian networks are used to explore railway accident information, most scholars describe the relationship between nodes only as a causal relationship between hazards. Other types of nodes, including time and accident consequences, are missing, and so are the edges that can describe the relationships between the various types of nodes. On the other hand, although a Bayesian network is a directed network, it cannot represent a ring (cyclic) graph. In the real complex system of railway accident hazards, however, hazard factors can influence each other and form a directed ring graph. Therefore, the Bayesian network also has some limitations for the correlation analysis of hazards in complex systems.

Complex network theory has risen in recent years. It is an abstraction of a large number of real complex systems and can describe various interactions or relations within them, such as causal relationships (Liu et al., 2018). A model based on complex network theory can form a directed ring graph, explore the correlation between a large number of hazards, and break through the limitations of the Bayesian network. Therefore, it is widely used in research exploring the relations between pieces of knowledge, and it provides a new way to understand the complex interactions between railway hazards. In subway safety research, Li et al. (2017) establish a subway hazard network using complex network theory and discuss the correlation between hazards. In subway construction safety research, in order to analyze and understand the complexity of the accident network, Zhou et al. (2014) establish a complex network model of subway construction accidents and show that it can help control the original accident and prevent secondary and derivative accidents. Zhou et al. (2019) combine the complex network with association rule mining and propose an improved Apriori algorithm to explore the types of anomaly monitoring, revealing the association rules between the safety hazard monitoring type and hazard coupling. In railway safety, complex network theory has also been widely used. Klockner and Toft (2018) propose a safety net method that can be used to understand the complexity of the relationship between system factors in railway safety accidents. Liu et al. (2019) build an iterative causal network model for railway operation accidents and develop indicators to analyze the causes of railway accidents. Based on the Pearson correlation coefficient, Hua and Zheng (2019) analyze the correlation between railway cause factors, establish a railway accident cause network model, and analyze the overall structure of the network by using the basic topology indexes of complex networks. Lam and Tai (2020) propose a hazard analysis framework for railway accidents and consider the correlation of railway accident data from three perspectives: local, global and scenario. Li and Wang (2018) propose a hazard monitoring model based on complex networks to identify and analyze the correlation between accident cause factors.

However, most of the railway accident networks proposed by the above scholars are one-dimensional complex networks, which can only provide limited information. In general, the process of a railway accident is relatively complex and affected by various factors. In order to explore the correlation between hazard factors and obtain more useful information about railway accidents, the multi-dimensional network model provides a new perspective. In one-dimensional complex networks, degree, clustering coefficient, centrality and other network topology analysis indexes are widely used. By comparison, relatively few topology analysis indexes exist for multi-dimensional complex networks. Although Liu et al. (2021) expand the multi-dimensional indexes for railway accidents, their model only contains qualitative causal relationships and hazard types, without considering quantitative causality, hazard occurrence time and other valuable information. Therefore, in order to further explore the correlation between hazards and provide effective information for the prevention of railway accidents, it is necessary to study and innovate the multi-dimensional network of railway accidents and its topological analysis indexes.

As a new type of network, the knowledge graph was proposed by Google in 2012. It is a typical multi-relational graph composed of nodes (entities) and edges (relationships between entities) (Zhang et al., 2022). A knowledge graph represents the knowledge entities of a domain as connected network nodes and can describe various relations between them. Its greatest advantage over a complex network is that it can contain various types of nodes and the association relationships between them, so relationships other than causal ones can also be explored. At the same time, the knowledge graph can break through the limitation of the Bayesian network, build a directed ring knowledge graph network, and explore the relationship between hazards in complex systems. At present, knowledge graphs have been applied to knowledge modeling in various fields. Ma (2022) summarizes the construction and application of knowledge graphs in geoscience. They have also been explored by many scholars in food science and industry (Min et al., 2022), medicine (Gong et al., 2021), drug discovery (Bonner et al., 2022) and social networks (Agouti, 2022). In recent years, scholars have also conducted considerable research in the field of transportation using knowledge graph theory. In urban transportation, Tan et al. (2021) explore the relations between transportation entities by using knowledge graph theory. Tian et al. (2022) propose a method for tracking people exposed to an epidemic in public transport based on knowledge graph theory. Liu et al. (2021) propose a method to build a railway operational accident knowledge graph model and formulate new heterogeneous topology indexes to analyze the topology characteristics of the network. However, that work overlooks the correlation strength between accidents and hazards and considers only a small number of hazards, so it cannot fully reveal the correlation between accident hazards.

This study proposes a modeling method of railway accidents based on knowledge graph, and analyzes the correlation between hazards, so


Table 1
Description of railway accident.

| Railway accident report | Description of accident | Accident occurrence time | All the hazards and accidents | Relations | Casualties |
| --- | --- | --- | --- | --- | --- |
| R072011 | At 00:25 hrs on Tuesday 4 May 2010 five wagons ran away and two of them derailed close to Ashburys station in Manchester. The runaway was caused by ineffective handbrakes on the wagons. The investigation found deficiencies in the maintenance plan for the wagons. Nobody was injured in the derailment. | Night | Management hazard: There are defects in the inspection, maintenance and management of all components of the train; Equipment hazard: Brake system problem; Types of accident: Runaway accident; Derailment accident. | There are defects in the inspection, maintenance and management of all components of the train → Brake system problem → Runaway accident → Derailment accident | No casualties |
| R142013 | Around 07:06 on 28 June 2012, a train moving along a straight track hit a washed-out embankment near Knockmore, Northern Ireland. The investigation found that on 27 June 2012 heavy rain fell in the catchment area of the nearby brokerage River, and the culvert system on the river could not cope with the resulting flow, causing water to back up behind the railway embankment and flood the area locally. Weather preparation procedures did not include planning for flood or rainstorm events and failed to respond appropriately to the rainfall. There were no injuries. | Daytime | Management hazard: Insufficient preparation plans for severe weather conditions such as floods and storms; Environmental hazards: The flood caused by heavy rain; Embankments eroded by the current; Types of accident: Collision accident. | Insufficient preparation plans for severe weather conditions such as floods and storms → The flood caused by heavy rain → Embankments eroded by the current → Collision accident | No casualties |
| R062018 | At around 18:05 on 28 July 2017, a northbound train entered Abergavenny station and a power cable hanging from the station footbridge caught on its roof. The train dragged the cable out of its remaining fixtures until one end of the cable came off the distribution cabinet it was connected to. The free end of the cable whipped round and struck a group of passengers on the footbridge stairs, three of whom suffered minor injuries. The investigation found that the cable had not undergone regular inspections in accordance with the electrical installation requirements. | Daytime | Management hazard: Insufficient inspection and maintenance of cables and wires; Equipment hazard: Cable and wire fracture defect fault; Types of accident: Electrical accidents; Collision accidents. | Insufficient inspection and maintenance of cables and wires → Cable and wire fracture defect fault → Electrical accidents → Collision accidents | Three people were slightly injured |
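The "Relations" column of Table 1 is an ordered causal chain, and each arrow in it corresponds to one causal link between two entities. As a minimal sketch (not the authors' implementation), the chain of report R072011 can be turned into `Result-In` triples as follows. The entity codes M14, EI01, A03 and A02 are the ones the paper assigns to this report in Table 5; the function name `chain_to_triples` and the choice to count repeated links as the value appended to `Result-In` are assumptions made for illustration.

```python
from collections import Counter

def chain_to_triples(chain):
    """Turn an ordered causal chain into (head, 'Result-In<count>', tail) triples.

    Assumption: the number appended to 'Result-In' is how often the same
    head -> tail link occurs in the chain.
    """
    link_counts = Counter(zip(chain, chain[1:]))
    return [
        (head, f"Result-In{count}", tail)
        for (head, tail), count in link_counts.items()
    ]

# Causal chain of report R072011, written with the entity codes of Table 5.
r072011_chain = ["M14", "EI01", "A03", "A02"]
triples = chain_to_triples(r072011_chain)

for triple in triples:
    print(triple)
```

Keeping the chain as a plain ordered list makes the direction of each causal edge explicit, which matters because the paper treats causal relations as directed.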

Table 2
Mortality and weighted injury quantification.

| Injury degree | Weighting |
| --- | --- |
| Class 1 minor injury / multiple Class 2 injuries | 0.008 |
| Multiple Class 1 minor injuries / more severe injury | 0.04 |
| 1–2 major injuries | 0.2 |
| Multiple major / single fatality | 1 |
| Multiple fatalities | 5 |
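The weights in Table 2 lend themselves to a simple lookup. The sketch below is an illustration only: the dictionary keys are hypothetical labels for the five rows of Table 2, and the averaging rule (sum of weights divided by the number of years observed) is an assumption, since the text here does not state how a per-year consequence value is aggregated.

```python
import math

# Weights copied from Table 2; the keys are hypothetical labels for its rows.
INJURY_WEIGHTS = {
    "class1_minor_or_multiple_class2": 0.008,
    "multiple_class1_minor_or_more_severe": 0.04,
    "one_to_two_major": 0.2,
    "multiple_major_or_single_fatality": 1.0,
    "multiple_fatalities": 5.0,
}

def average_consequence(outcomes, years):
    """Sum the Table 2 weights of the recorded outcomes and average per year.

    Assumption: the per-year value is a plain arithmetic mean over `years`.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    total = sum(INJURY_WEIGHTS[outcome] for outcome in outcomes)
    return total / years

# One accident with several minor injuries, observed over a ten-year window.
example = average_consequence(["multiple_class1_minor_or_more_severe"], years=10)
print(example)
```

A dictionary keeps the weighting scheme in one place, so a different severity scale can be swapped in without touching the aggregation code.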

Table 3
Railway accidents and average consequences.

| Accident | Description of accident | Average consequence (DWI/Year) |
| --- | --- | --- |
| A01 | Collision accident | 2.652 |
| A02 | Derailment accident | 0.1624 |
| A03 | Runaway accident | 0.0504 |
| A04 | Falling accident | 0.2224 |
| A05 | Dragging accident | 0.1024 |
| A06 | Electrical accident | 0.0288 |
| A07 | Overspeed accident | 0.004 |
| A08 | Near miss | 0.0008 |

Fig. 1. Distribution of occurrence proportion of various accident types.

as to further explore the correlation between hazards of railway accidents and provide useful information for the prevention of railway accidents. In this study, a hazards correlation analysis knowledge graph (HCAKG) model is established. When constructing HCAKG, four matrices are used to extract various relations between knowledge entities. In order to analyze the topological structure of HCAKG, topological analysis indexes that consider time factors are proposed to explore the useful information of HCAKG in more dimensions. Finally, the effectiveness of the method is verified by analyzing real-life accidents.


2. Method

2.1. Modeling of hazards correlation analysis knowledge graph

To analyze the hazard correlation of railway accidents, HCAKG needs to be established. HCAKG is a knowledge graph containing knowledge entities (nodes) and relations (different types of edges) between knowledge entities. Each edge is represented as a triple of the form (head entity, relation, tail entity) (Wang et al., 2017a, 2017b). Knowledge entities and their relationships are usually described by knowledge triples, written as <Entity_head, Relation, Entity_tail>, where Entity_head and Entity_tail represent two entities and Relation represents the relationship between them. Therefore, building HCAKG requires identifying knowledge entities and their relations.

2.1.1. Step 1: Identify knowledge entities

The first step in establishing HCAKG is to identify knowledge entities. The interactions between hazards in railway accidents eventually lead to the occurrence of accidents. In order to prevent railway accidents, it is necessary to explore the relations between the hazards that cause them. Therefore, hazards and railway accidents are selected as knowledge entities first. In addition, in order to obtain more information related to hazards and formulate targeted prevention strategies, types of hazard, hazard occurrence time and accident consequences are also considered as knowledge entities.

Table 4
Hazards and description.

| Hazards | Description | Type of hazards |
| --- | --- | --- |
| H01 | Insufficient experience or ability of train driver | H-Type |
| H02 | Train driver's misjudgment of dangerous situation | H-Type |
| H03 | Train driver ignores or misunderstands the warning | H-Type |
| … | … | H-Type |
| EI01 | Brake system problems | EI-Type |
| EI02 | Braking system equipment and performance problems | EI-Type |
| EI03 | Road rail vehicle fault or abnormal state | EI-Type |
| … | … | EI-Type |
| E01 | Leaves | E-Type |
| E02 | Landslide | E-Type |
| E03 | High Weather temperature | E-Type |
| … | … | E-Type |
| M01 | Inadequate preparation plan for flood and rainstorm | M-Type |
| M02 | Inappropriate shift scheduling or work planning | M-Type |
| M03 | The supervision and review of the safety work system is invalid | M-Type |
| … | … | M-Type |

2.1.2. Step 2: Identify the relations between knowledge entities

The second step of building HCAKG is to determine the relations between knowledge entities. The relations determined in this study include the causal relationship between hazards, the correlation between hazards and accidents, the correlation between hazards and types of hazard, the correlation between hazards and their occurrence time, and the correlation between accidents and accident consequences. The relationships between knowledge entities are described by four keywords, namely 'Result-In', 'Type-Is', 'Value-Is' and 'Happened-When'.

The keyword 'Result-In' indicates a causal link between knowledge entities. The knowledge entities include hazards and accidents. For example, if and only if H01 is the direct causal factor leading to H02, their knowledge triple can be expressed as <H01, Result-In1, H02>.

The keyword 'Type-Is' is used to describe the correlation between hazards and types of hazard. For example, H01 is an H-Type hazard, so the knowledge triple can be expressed as <H01, Type-Is, H-Type>.

The keyword 'Value-Is' shows the relation between an accident and its consequence, and is recorded as a specific value. For example, assume that the consequence of accident A01 is quantified as 0.008. The knowledge triple can be expressed as <A01, Value-Is0.008, Con>.

The keyword 'Happened-When' reveals the correlation between hazards and their occurrence time. For example, if hazard H01 occurs once at night, the knowledge triple can be expressed as <H01, Happened-When1, Night>.

Table 5
Knowledge entities and triples of R072011, R142013 and R062018.

| Railway accident report | Knowledge entities | Knowledge triples |
| --- | --- | --- |
| R072011 | M14: There are defects in the inspection, maintenance and management of all components of the train; EI01: Brake system problem; A03: Runaway accident; A02: Derailment accident; M-type; EI-type; Night; Con. | <M14, Result-In1, EI01>, <EI01, Result-In1, A03>, <A03, Result-In1, A02>, <M14, Type-Is, M-type>, <EI01, Type-Is, EI-type>, <M19, Happened-When1, Night>, <EI01, Happened-When1, Night>, <A02, Value-Is0, Con>, <A03, Value-Is0, Con> |
| R142013 | M01: Insufficient preparation plans for severe weather conditions such as floods and storms; E07: The flood caused by heavy rain; E12: Embankments eroded by the current; A01: Collision accident; M-type; E-type; Daytime; Con. | <M01, Result-In1, E07>, <E07, Result-In1, E12>, <E12, Result-In1, A01>, <M01, Type-Is, M-type>, <E07, Type-Is, E-type>, <E12, Type-Is, E-type>, <M01, Happened-When1, Daytime>, <E07, Happened-When1, Daytime>, <E12, Happened-When1, Daytime>, <A01, Value-Is0, Con> |
| R062018 | M06: Inadequate inspection and maintenance of overhead lines; EI18: Broken cables and wires or defective capacitors; A01: Collision accident; A06: Electrical accident; M-type; EI-type; Daytime; Con. | <M06, Result-In1, EI18>, <EI18, Result-In1, A06>, <A06, Result-In1, A01>, <M06, Type-Is, M-type>, <EI18, Type-Is, EI-type>, <M06, Happened-When1, Daytime>, <EI18, Happened-When1, Daytime>, <A01, Value-Is0.04, Con>, <A06, Value-Is0.04, Con> |

2.1.3. Step 3: Build the model

Because the relations between knowledge entities are of different types, seven matrices are defined to construct the model.

The first matrix is the causal strength matrix (CSM), and CSM_ij is defined by Eq. (1). CSM_ij is determined by the knowledge triples that include the keyword 'Result-In'. In these triples, i represents a hazard or an accident, j also represents a hazard or an accident, δ represents a specific value, and KTs represents all knowledge triples identified in step 2. CSM describes the causal relations between hazards and accidents and maps them into a network graph. Because the causal relationship between knowledge entities is one-way, the causal relationship edge is described as a directed edge in this study. The higher the frequency of a causal relationship, the higher its correlation strength. Therefore, weighted edges are used to describe the correlation strength of causal relationships. In other words, when CSM_ij = δ, there are δ causal relationship edges from knowledge entity i to knowledge entity j.

$$
\mathrm{CSM}_{ij} =
\begin{cases}
\delta, & \langle i, \text{Result-In}_{\delta}, j \rangle \in KTs \\
0, & \langle i, \text{Result-In}_{\delta}, j \rangle \notin KTs
\end{cases}
\tag{1}
$$

The second matrix is the type matrix (TYM), and TYM_ij is defined by Eq. (2). TYM_ij is determined by the knowledge triples that include the keyword 'Type-Is'. In these triples, i represents a hazard, j represents the type of hazard, and KTs represents all


knowledge triples identified in step 2. Through TYM, the knowledge triples describing hazards and types of hazard can be mapped to a graph; that is, if TYM_ij = 1, there is an edge marked 'Type-Is' from entity i to entity j.

$$
\mathrm{TYM}_{ij} =
\begin{cases}
1, & \langle i, \text{Type-Is}, j \rangle \in KTs \\
0, & \langle i, \text{Type-Is}, j \rangle \notin KTs
\end{cases}
\tag{2}
$$

Fig. 2. Knowledge graph of R072011, R142013 and R062018.
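Eqs. (1) and (2) both read a matrix entry directly off the set of knowledge triples. The following is a minimal sketch of that mapping, using the R072011 triples listed in Table 5. It is not the authors' code: the sparse dictionary representation (a missing key means 0), the function names, and the choice to add up δ when the same link appears in several triples are all assumptions made for illustration.

```python
import re

# Causal and type triples of report R072011, as listed in Table 5.
triples = [
    ("M14", "Result-In1", "EI01"),
    ("EI01", "Result-In1", "A03"),
    ("A03", "Result-In1", "A02"),
    ("M14", "Type-Is", "M-type"),
    ("EI01", "Type-Is", "EI-type"),
]

def build_csm(triples):
    """Eq. (1): CSM[(i, j)] = delta for every <i, Result-In delta, j> triple.

    Assumption: deltas for the same (i, j) pair are summed across triples.
    """
    csm = {}
    for head, relation, tail in triples:
        match = re.fullmatch(r"Result-In(\d+)", relation)
        if match:
            csm[(head, tail)] = csm.get((head, tail), 0) + int(match.group(1))
    return csm

def build_tym(triples):
    """Eq. (2): TYM[(i, j)] = 1 for every <i, Type-Is, j> triple."""
    return {(head, tail): 1 for head, relation, tail in triples if relation == "Type-Is"}

csm = build_csm(triples)
tym = build_tym(triples)

print(csm.get(("M14", "EI01"), 0))   # entry for a causal link that is present
print(csm.get(("EI01", "M14"), 0))   # reverse direction, absent from the triples
print(tym.get(("EI01", "EI-type"), 0))
```

A sparse dictionary is used here because most entity pairs have no relation; a dense array would work equally well once entities are given integer indexes.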

The third matrix is the result matrix (RM), and RM_ij is defined by Eq. (3). RM_ij is determined by the knowledge triples that include the keyword 'Value-Is'. In these triples, i represents an accident, j represents the consequence of the accident, δ represents a specific value, and KTs represents all knowledge triples identified in step 2. RM maps the knowledge triples describing the result value into the graph; in other words, if RM_ij > 0, there is an edge marked 'Value-Isδ' from entity i to entity j.

$$
\mathrm{RM}_{ij} =
\begin{cases}
\delta, & \langle i, \text{Value-Is}_{\delta}, j \rangle \in KTs \\
0, & \langle i, \text{Value-Is}_{\delta}, j \rangle \notin KTs
\end{cases}
\tag{3}
$$
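In the tables, the value δ is written directly after the keyword (for example `Value-Is0.04` or `Happened-When1`), so a small parser is needed before Eq. (3) can be applied. The sketch below shows one possible way to split a relation label into keyword and value and then fill RM from the R062018 triples of Table 5. The regular expression, the function names and the dictionary representation are assumptions for illustration, not part of the paper.

```python
import re

# A keyword followed by an optional numeric value, e.g. "Value-Is0.04" or "Type-Is".
_RELATION = re.compile(r"(Result-In|Type-Is|Value-Is|Happened-When)\s*(\d*\.?\d+)?")

def parse_relation(label):
    """Split a relation label into (keyword, delta); delta is None if absent."""
    match = _RELATION.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"unrecognised relation label: {label!r}")
    keyword, delta = match.groups()
    return keyword, (float(delta) if delta is not None else None)

def build_rm(triples):
    """Eq. (3): RM[(i, j)] = delta for every <i, Value-Is delta, j> triple."""
    rm = {}
    for head, label, tail in triples:
        keyword, delta = parse_relation(label)
        if keyword == "Value-Is":
            rm[(head, tail)] = delta
    return rm

# Selected triples of report R062018 from Table 5.
r062018 = [
    ("A06", "Result-In1", "A01"),
    ("A01", "Value-Is0.04", "Con"),
    ("A06", "Value-Is0.04", "Con"),
]
rm = build_rm(r062018)
print(rm)
```

The same `parse_relation` helper would also serve the `Happened-When` triples of the time matrix, since they follow the same keyword-plus-value pattern.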

Fig. 3. Causal strength matrix.

The fourth matrix is the time matrix (TIM), which is defined by Eq. (4). TIM_ij is determined by the knowledge triples that include the keyword 'Happened-When'. In these triples, i represents a hazard, j represents the occurrence time of the hazard, δ represents a specific value, and KTs represents all knowledge triples identified in step 2. Through TIM, the knowledge triples describing the time relationship can be mapped into a graph. Because the relationship between a hazard and its occurrence time is one-way, the edge of the correlation between hazards and time is described as a directed edge in this study. The higher the frequency of the correlation is, the higher the


Fig. 4. Type matrix, Result matrix and Time matrix.

Fig. 5. Shortest path matrix and Causal reachability matrix.

intensity of the correlation is. Therefore, weighted edges are used to describe the intensity of the correlation between hazards and time. That is, when TIM_ij = δ, there are δ time relationship edges from knowledge entity i to knowledge entity j.

$$
\mathrm{TIM}_{ij} =
\begin{cases}
\delta, & \langle i, \text{Happened-When}_{\delta}, j \rangle \in KTs \\
0, & \langle i, \text{Happened-When}_{\delta}, j \rangle \notin KTs
\end{cases}
\tag{4}
$$

Based on the first matrix, the weight-removed causal strength matrix (RWCSM) is constructed, and RWCSM_ij is defined by Eq. (5). In order to facilitate the subsequent construction of the shortest path matrix, the weights of the causal relationships are not considered here, and causality is described by the RWCSM matrix.

$$
\mathrm{RWCSM}_{ij} =
\begin{cases}
1, & \mathrm{CSM}_{ij} > 0 \\
0, & \mathrm{CSM}_{ij} = 0
\end{cases}
\tag{5}
$$

The shortest path matrix (SPM) is constructed based on the RWCSM, and SPM_ij is defined by Eq. (6), where SPM_ij represents the shortest path length from hazard i to hazard j or accident j, or from accident i to accident j; u and v represent two entities; and N represents all entities on the shortest path. SPM can capture the length


characteristics of causal paths in the network model.

$$
\mathrm{SPM}_{ij} = \sum_{u,v \in N} \mathrm{RWCSM}_{uv}
\tag{6}
$$

The final matrix is the causal accessibility matrix (CAM), and CAM_ij is defined by Eq. (7), where CAM_ij indicates whether there is a causal path from hazard i to hazard j or accident j. Moreover, CAM can also capture whether there is a causal path from accident i to accident j.

$$
\mathrm{CAM}_{ij} =
\begin{cases}
1, & \mathrm{SPM}_{ij} > 0 \\
0, & \mathrm{SPM}_{ij} = 0
\end{cases}
\tag{7}
$$

Table 6
Relationship between knowledge entities.

| Accident report | Knowledge triples |
| --- | --- |
| R012011 | <M19, Result-In1, M03>, <M03, Result-In1, H32>, <H32, Result-In1, M04>, <M04, Result-In1, H28>, <H28, Result-In1, A01>, <M19, Type-Is, M-type>, <M03, Type-Is, M-type>, <H32, Type-Is, H-type>, <M04, Type-Is, M-type>, <H28, Type-Is, H-type>, <M19, Happened-When1, Night>, <M03, Happened-When1, Night>, <H32, Happened-When1, Night>, <M04, Happened-When1, Night>, <H28, Happened-When1, Night>, <A01, Value-Is0.008, Con>. |
| R032011 | <M13, Result-In1, EI02>, <EI02, Result-In1, A02>, <M13, Type-Is, M-type>, <EI02, Type-Is, EI-type>, <M13, Happened-When1, Daytime>, <EI02, Happened-When1, Daytime>, <A02, Value-Is0.04, Con>. |
| …… | …… |
| R152020 | <M10, Result-In1, EI15>, <EI15, Result-In1, EI17>, <EI17, Result-In1, A10>, <M10, Type-Is, M-type>, <EI15, Type-Is, EI-type>, <EI17, Type-Is, EI-type>, <M10, Happened-When1, Night>, <EI15, Happened-When1, Night>, <EI17, Happened-When1, Night>, <A10, Value-Is0, Con>. |
| R162020 | <M13, Result-In1, EI01>, <EI01, Result-In1, A03>, <A03, Result-In1, A02>, <M13, Type-Is, M-type>, <EI01, Type-Is, EI-type>, <M13, Happened-When1, Daytime>, <EI01, Happened-When1, Daytime>, <A03, Value-Is0, Con>, <A02, Value-Is0, Con>. |

2.2. Network topology analysis indexes

As a knowledge graph model, the HCAKG model contains various entities and relations. The traditional one-dimensional topological analysis indexes are no longer applicable to this multi-dimensional model. Although Liu et al. (2021) propose topological analysis indexes that are suitable for multi-dimensional models, they ignore time and other information and are still not applicable to this model. Therefore, it is necessary to propose new network topology analysis indexes that contain more information, such as time, to explore the correlation between hazards in HCAKG. In the HCAKG model, N represents the collection of all nodes, HN is the collection of all hazard nodes, AN is the collection of all accident nodes, TN is the collection of nodes of all types of hazard, TIN is the collection of all time type nodes, and CN is the consequence node. Therefore, N = {HN, AN, TN, TIN, CN}.

The causal correlation intimacy of hazards is an index that measures the degree of causal relationship between hazards in different time periods, including active causal correlation intimacy and passive causal correlation intimacy. Active causal correlation intimacy $Acci_h^{TI}$ refers to the […] of correlation between hazards and other hazards is stronger.

$$
Acci_h^{TI} = \sum_{i \in HN} \mathrm{CAM}_{hi} \times \mathrm{TIM}_i^{TI} \Big/ \sum_{i \in HN} \mathrm{SPM}_{hi}
\tag{8}
$$

$$
Pcci_h^{TI} = \sum_{i \in HN} \mathrm{CAM}_{ih} \times \mathrm{TIM}_i^{TI} \Big/ \sum_{i \in HN} \mathrm{SPM}_{ih}
\tag{9}
$$

The type distribution proportion of hazards is an index that measures the relationship between a specific hazard and other types of hazard in different time periods, including the active type distribution proportion and the passive type distribution proportion. The active type distribution proportion is intended to measure the distribution of the hazard types caused by a specific hazard in different time periods. Active type distribution proportion $Ahtdp_{hT}^{TI}$ refers to the proportion of T-Type hazards among all hazards that can be directly caused by a specific hazard h in the TI time, and can be calculated by Eq. (10). Similarly, the passive type distribution proportion is intended to measure the distribution of the hazard types that lead to a specific hazard in different time periods. Passive type distribution proportion $Phtdp_{Th}^{TI}$ represents the proportion of T-Type hazards among all hazards that can directly lead to a specific hazard h in the TI time, and can be calculated by Eq. (11). Compared with previous studies, both the strength of the correlation between hazards in the causal strength matrix and the time dimension are considered here. The stronger the causal relationship between hazards, the higher the type distribution proportion. Likewise, the more often hazards occur in a certain period of time, the higher the type distribution proportion of the corresponding hazard types in that period.

$$
Ahtdp_{hT}^{TI} = \sum_{j \in HN} \left( \mathrm{CSM}_{hj} \times \mathrm{TYM}_{jT} \times \mathrm{TIM}_j^{TI} \right) \Big/ \sum_{i \in HN} \left( \mathrm{CSM}_{hi} \times \mathrm{TIM}_i^{TI} \right)
\tag{10}
$$

$$
Phtdp_{Th}^{TI} = \sum_{j \in HN} \left( \mathrm{CSM}_{jh} \times \mathrm{TYM}_{jT} \times \mathrm{TIM}_j^{TI} \right) \Big/ \sum_{i \in HN} \left( \mathrm{CSM}_{ih} \times \mathrm{TIM}_i^{TI} \right)
\tag{11}
$$

The accessibility between types of hazards is an index that measures the degree of association between different types of hazards in different time periods, including direct accessibility between types of hazards and indirect accessibility between types of hazards. The two indexes can reveal the strength of the association between hazard types from both local and global perspectives. The direct accessibility between types of hazards is intended to measure the strength of the direct correlation between two types of hazards in different time periods. That is, the direct accessibility between types of hazards $Da_{PQ}^{TI}$ refers to the strength of the direct causal relationship between P-Type hazards and Q-Type hazards in the TI time, and can be calculated by Eq. (12). Similarly, the indirect accessibility between types of hazards is intended to measure the strength of the indirect correlation between two types of hazards in different time periods. The indirect accessibility between types of hazards $Ia_{PQ}^{TI}$ refers to the strength of the indirect causal relationship between P-Type hazards and Q-Type hazards in the TI time, and can be calculated by Eq. (13).

$$
Da_{PQ}^{TI} = \sum_{i,j \in HN} \left( \mathrm{RWCSM}_{ij} \times \mathrm{TYM}_{iP} \times \mathrm{TYM}_{jQ} \times \mathrm{TIM}_i^{TI} \times \mathrm{TIM}_j^{TI} \right)
\tag{12}
$$



IaTI
PQ = (CAM ij × TYM iP ×TYMjQ × TIM TI TI
i × TIM j ) (13)
degree of difficulty that a specific hazard h causes other hazards in TI i,j∈HN
time, which can be calculated by Eq. (8). The higher the active causal
correlation intimacy of hazards, the easier it is for hazards to cause other The accessibility between types of hazards and time is an index to
hazards at that time. Similarly, the passive causal correlation intimacy measure the degree of correlation between the types of hazards and
different times, including direct accessibility between types of hazards
PcciTI
h refers to the difficulty degree of a specific hazard h caused by other
and time and indirect accessibility between types of hazards and time.
hazards in TI time, which can be calculated by Eq. (9). When the passive
These two indexes can reveal the strength of the relationship between
causal intimacy of hazards is higher, the hazards are more likely to be
different types of hazards and occurrence time from both local and
caused by other hazards at that time. Compared with previous studies,
global perspectives. The direct accessibility between types of hazards
the information of time dimension is considered here. When the number
and time is intended to measure the strength of the direct correlation
of occurrence of hazards is more in a certain period of time, the higher
between different types of hazards and different times. That is, the direct
the causal correlation intimacy of hazards is, indicating that the degree


Fig. 6. Knowledge graph of hazards of railway accidents.

accessibility between types of hazards and time $Dtia_W^{TI}$ indicates the degree of direct correlation between W-Type hazards and TI time, which can be calculated by Eq. (14). Similarly, the indirect accessibility between types of hazards and time is intended to measure the strength of the indirect correlation between different types of hazards and different times. That is, the indirect accessibility between types of hazards and time $Itia_W^{TI}$ indicates the degree of indirect correlation between W-Type hazards and TI time, which can be calculated by Eq. (15).

$$Dtia_W^{TI} = \sum_{i,j \in HN} \left( RWCSM_{ij} \times TYM_{iW} \times TIM_j^{TI} \right) \tag{14}$$

$$Itia_W^{TI} = \sum_{i,j \in HN} \left( CAM_{ij} \times TYM_{iW} \times TIM_j^{TI} \right) \tag{15}$$

Harmful consequence indexes of hazards measure the severity of the harm caused by a given hazard to personnel in railway accidents, including the direct harmful consequence of a hazard and the intermediate harmful consequence of a hazard. The direct harmful consequence reveals the severity of the harmful consequence from the perspective of each hazard, and the intermediate harmful consequence reveals the severity of the harmful consequence from the perspective of the whole knowledge graph network model. These two indicators can provide a decision-making basis for investment in accident prevention.

The direct harmful consequence of hazards is intended to measure the severity of injury to personnel when a specific hazard occurs in TI time. The direct harmful consequence generated by hazard h in TI time can be calculated by Eq. (16). The TIM matrix records the intensity of hazard occurrence time, that is, the number of hazards occurring in TI time in N years, so $TIM_h^{TI}/N$ represents the likelihood of occurrence of the hazard h in TI time; $CAM_{hAi}$ is an element of the causal reachability matrix. $RM_{AiCon}$ is an element of the result matrix, representing the average consequence of accident Ai.
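The type-level indexes of Eqs. (10)-(15) are sums of products of matrix entries, so each one reduces to a few matrix multiplications. The following is a minimal NumPy sketch, not the authors' implementation. It assumes that the causal matrices (CSM, RWCSM or CAM) are restricted to hazard rows and columns, that TYM is a hazard-by-type 0/1 matrix, that TIM is a hazard-by-time count matrix, and that a zero denominator in Eqs. (10)-(11) yields a proportion of 0. The function names and the three-hazard toy matrices are hypothetical and are not taken from the UK data.

```python
import numpy as np


def type_distribution(csm, tym, tim, ti):
    """Eqs. (10)-(11): active (hazard x type) and passive (type x hazard) proportions."""
    csm, tym = np.asarray(csm, float), np.asarray(tym, float)
    t = np.asarray(tim, float)[:, ti]          # occurrence counts in period ti
    act_num = (csm * t[None, :]) @ tym         # sum_j CSM_hj * TIM_j * TYM_jT
    act_den = (csm @ t)[:, None]               # sum_i CSM_hi * TIM_i
    pas_num = tym.T @ (csm * t[:, None])       # sum_j TYM_jT * TIM_j * CSM_jh
    pas_den = (t @ csm)[None, :]               # sum_i TIM_i * CSM_ih
    # Assumption: a hazard with no weighted links gets proportion 0.
    active = np.divide(act_num, act_den, out=np.zeros_like(act_num), where=act_den != 0)
    passive = np.divide(pas_num, pas_den, out=np.zeros_like(pas_num), where=pas_den != 0)
    return active, passive


def type_accessibility(m, tym, tim, ti):
    """Eqs. (12)-(13): type x type accessibility; pass RWCSM (direct) or CAM (indirect)."""
    a = np.asarray(tym, float) * np.asarray(tim, float)[:, ti][:, None]
    return a.T @ np.asarray(m, float) @ a


def type_time_accessibility(m, tym, tim):
    """Eqs. (14)-(15): type x time accessibility; pass RWCSM (direct) or CAM (indirect)."""
    return np.asarray(tym, float).T @ np.asarray(m, float) @ np.asarray(tim, float)


# Hypothetical toy model: hazards h0, h1 are of type 0 and h2 is of type 1.
TYM = [[1, 0], [1, 0], [0, 1]]
TIM = [[2, 0], [1, 1], [0, 3]]              # columns: period 0, period 1
CSM = [[0, 1, 1], [0, 0, 2], [0, 0, 0]]     # also reused as a stand-in for RWCSM

active, passive = type_distribution(CSM, TYM, TIM, ti=0)
da = type_accessibility(CSM, TYM, TIM, ti=0)
dtia = type_time_accessibility(CSM, TYM, TIM)
print(active, passive, da, dtia, sep="\n")
```

Passing the binary CAM instead of the weighted matrix to the last two functions gives the indirect variants, which is why a single function serves each pair of equations.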


Fig. 7. Distribution of causal correlation intimacy.

The intermediate harmful consequence of hazards is intended to measure the severity of personal injury caused by a specific hazard in TI time. It should be noted that this particular hazard can be caused by other hazards, and will also lead to the occurrence of other hazards, so it plays an intermediate role in the spread of hazards. From the perspective of the whole knowledge graph network model, if this particular hazard is controlled or eliminated, it will affect the direct harmful consequence of other hazards that produce this hazard, and also affect the direct harmful consequence of other hazards caused by this hazard. That is, it will reduce the direct harmful consequence of other hazards, and play a role in mitigating and controlling the consequences of accidents. The intermediate harmful consequence of hazards can be calculated by Eq. (17), where $CAM_{ih}$ and $CAM_{hAi}$ are elements of the causal reachability matrix and $Dhc_i^{TI}$ is the harmful consequence generated by the hazard i in TI time.

$$Dhc_h^{TI} = \left( TIM_h^{TI} / N \right) \times \sum_{Ai \in AN} \left( CAM_{hAi} \times RM_{AiCon} \right) \tag{16}$$

$$Ihc_h^{TI} = \sum_{i \in HN,\, Ai \in AN} \left( CAM_{ih} \times CAM_{hAi} \times Dhc_i^{TI} \right) \tag{17}$$

3. Case analysis

3.1. Data collection

In this study, the decade-long UK railway accident data from 2011 to 2020 are selected. Relevant information can be obtained from the website of the UK railway accident investigation office. A total of 176 railway accident reports were selected, excluding subway accidents and tram accidents.

In the process of counting railway accident data, the information required to be recorded is the number of the railway accident, the occurrence time of the railway accident, and the basic situation of the hazards and the type of hazards in the railway accident. The statistics of railway accident data with accident numbers R072011, R142013 and R062018 are shown in Table 1 below. It is worth noting that, in the statistics of accident types, some accident reports record the occurrence of multiple types of accidents. For example, in the accident report R072011, the record shows that the Runaway accident occurred first and then the Derailment accident occurred. Both types of accidents are counted, and there is a ‘Result − In’ causal relationship between the two accidents.
Based on the comprehensive analysis of the accident data reports
over the past 10 years, the collected accident types are Collision acci­
dent, Derailment accident, Runaway accident, Falling accident, Over­
speed accident, Dragging accident, Electrical accident and Near Miss.
The proportion of each accident type is shown in Fig. 1.
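The recording scheme described above (report number, occurrence time, hazards with their types, and chained accidents such as Runaway followed by Derailment) can be sketched as a small data structure that emits knowledge triples. This is only an illustration under stated assumptions: the class `AccidentRecord`, the function `to_triples`, the hazard name `hazard_x` and the time value used for R072011 are hypothetical placeholders, since the contents of Table 1 are not reproduced here. The relation keywords follow the paper's ‘Result − In’, ‘Type − Is’ and ‘Happened − When’ triples.

```python
from dataclasses import dataclass


@dataclass
class AccidentRecord:
    report_id: str       # e.g. "R072011"
    occurred: str        # "Daytime" or "Night"
    hazards: dict        # hazard id -> hazard type
    hazard_links: list   # (cause, effect) pairs among hazards and accidents
    accidents: list      # accident types in order of occurrence


def to_triples(rec):
    """Turn one accident record into (head, relation, tail) triples."""
    triples = []
    for hazard, hazard_type in rec.hazards.items():
        triples.append((hazard, "Type-Is", hazard_type))
        triples.append((hazard, "Happened-When", rec.occurred))
    for cause, effect in rec.hazard_links:
        triples.append((cause, "Result-In", effect))
    # Consecutive accidents in one report are linked causally.
    for first, second in zip(rec.accidents, rec.accidents[1:]):
        triples.append((first, "Result-In", second))
    return triples


# Placeholder content: only the Runaway -> Derailment order comes from the text.
record = AccidentRecord(
    report_id="R072011",
    occurred="Daytime",                      # hypothetical value
    hazards={"hazard_x": "M-Type"},          # hypothetical hazard
    hazard_links=[("hazard_x", "Runaway")],  # hypothetical link
    accidents=["Runaway", "Derailment"],
)
triples = to_triples(record)
for triple in triples:
    print(triple)
```

Keeping the accident order in a list makes the causal link between accidents within one report explicit without a separate field.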


Fig. 8. The type distribution proportion of H-Type hazards in daytime.

Fig. 9. The type distribution proportion of EI-Type hazards.


Fig. 10. The type distribution proportion of E-Type hazards.

Fig. 11. The type distribution proportion of M-Type hazards.

3.2. HCAKG construction

First of all, railway accidents, railway accident consequences, hazards, types of hazards, and hazard occurrence times are identified as knowledge entities from the 176 railway accident reports collected above. In this model, the accident consequences are measured by the severity of casualties, which is quantified by the death and weighted injury (DWI) shown in Table 2 (RSSB, 2014).
If only accidents numbered R072011, R142013 and R062018 have occurred within 3 years, take the accident reports R072011, R142013


Fig. 12. Accessibility between different types of hazards in daytime.

Fig. 13. Accessibility between different types of hazards in night.

and R062018 in Table 1 as examples to calculate the harmful consequences of the accidents. According to Table 1 and Table 2, there are no casualties in R072011 and R142013, so the degree of damage is not estimated, and the Weighting is 0. In R062018, three people were slightly injured, so the Weighting is 0.04 according to Table 2. For these three reports, the average consequence of collision accidents in 3 years is (0 + 0.04) / 3 = 0.0133 DWI/Year, and the average consequence of Electrical accidents in 3 years is 0.04 / 3 = 0.0133 DWI/Year. The average consequences of derailment accidents and runaway accidents within three years are both 0 / 3 = 0 DWI/Year. According to the above calculation method, the average consequences of different accident types during 10 years can be calculated, as shown in Table 3.

The railway accidents numbered A01 ~ A08 are identified as accident knowledge entities, as shown in Table 3, and their consequence is identified as the accident consequence knowledge entity numbered Con. The 89 hazards numbered H01 ~ H34, EI01 ~ EI21, E01 ~ E15 and M01 ~ M19 and the four types (H-Type, EI-Type, E-Type and M-Type) of the 89 hazards are identified, as shown in Table 4. Due to space limitations, see the appendix for details. Meanwhile, in this study, Daytime and Night are identified as time knowledge entities.

The links between knowledge entities are described by knowledge triples with keywords ‘Result − In’, ‘Type − Is’, ‘Value − Is’ and ‘Happened − When’.

The railway accidents numbered R072011, R142013 and R062018 are used as examples to identify knowledge entities and knowledge triples; the results are shown in Table 5 below.

Taking the knowledge triples identified in Table 5 as an example, Neo4j is used to build the knowledge graph shown in Fig. 2.

The matrices required by the model are constructed from the knowledge graph in Fig. 2. The CSM matrix is shown in Fig. 3; the TYM matrix, the RM matrix, and the TIM matrix are shown in Fig. 4 (a), 4 (b), and 4 (c); and the RWCSM matrix and the CAM matrix are shown in Fig. 5 (a) and 5 (b), respectively. The subsequent index calculation can be performed according to the constructed matrices.

Using the data processing method applied to the above three accident reports, the other accident reports from the past 10 years can be analyzed and all knowledge triples can be obtained. Due to space limitations, only some knowledge triples are listed here, as shown in Table 6.

Based on all the identified knowledge entities and knowledge triples,


Fig. 14. Accessibility between types of hazards and time.

Neo4j is used to build a knowledge network model of railway accidents that is shown in Fig. 6. The knowledge graph network model is composed of 105 nodes and 525 edges. 105 nodes include 89 hazards, 4 types of hazards, 8 accidents, Con, Daytime, and Night. The 525 edges consist of 89 edges containing the keyword ‘Type − Is’, 274 edges containing the keyword ‘Result − In’, 8 edges containing the keyword ‘Value − Is’, and 154 edges containing the keyword ‘Happened − When’.

Correspondingly, the correlation matrices can be obtained from Fig. 6, the subsequent indicators can be calculated, and the hidden information in the data can be analyzed.

3.3. Topology analysis results

3.3.1. Causal correlation intimacy of hazards

First, Eqs. (8) and (9) are used to calculate the active causal correlation intimacy and passive causal correlation intimacy of each hazard.

Fig. 7(a) shows the active causal correlation intimacy and passive causal correlation intimacy of H-Type hazards in daytime and night. On the whole, in daytime and night, the passive causal correlation intimacy of hazards is generally higher than their active causal correlation intimacy. The passive causal correlation intimacy of H04, H05, H13, H19 in daytime and H04 and H34 in night is significantly higher than that of most hazards. These hazards are more likely to be caused by other hazards and require special attention.

Fig. 7(b) shows the active causal correlation intimacy and passive causal correlation intimacy of EI-Type hazards in daytime and night. On the whole, the passive causal correlation intimacy of hazards in daytime and night is generally higher than their active causal correlation intimacy. Among them, the passive causal correlation intimacy of EI11, EI14, EI16, EI19 in daytime and EI14, EI16 in night is higher than that of most hazards. This shows that these hazards are more likely to be caused by other hazards during these time periods and require special attention.

Fig. 7(c) shows the active causal correlation intimacy and passive causal correlation intimacy of E-Type hazards in daytime and night. On the whole, the passive causal correlation intimacy of hazards is generally higher than their active causal correlation intimacy in daytime and night. The passive causal correlation intimacy of E04, E07, E10 in daytime and E07, E10 in night is higher than that of most hazards, indicating that it is more likely to be caused by other hazards during this

Fig. 15. Direct harmful consequences of hazards.


Fig. 16. Intermediate harmful consequences of hazards.

time period, which requires special attention.

Fig. 7(d) shows the active causal correlation intimacy and passive causal correlation intimacy of M-Type hazards in daytime and night. On the whole, the passive causal correlation intimacy of hazards is generally lower than their active causal correlation intimacy. The passive causal correlation intimacy of most hazards is 0, indicating that they cannot be caused by other hazards. Only M04 and M05 in daytime and M03, M04 and M05 in night have a passive causal correlation intimacy greater than 0, indicating that they can be caused by other hazards in these time periods. Among them, M04, M07 and M12 in daytime have a high degree of active causal correlation, indicating that these hazards are more likely to lead to other hazards in this period of time and require special attention.

Obtaining the correlation between different hazards in different time periods is conducive to developing prevention strategies for hazards.

3.3.2. Type distribution proportion of hazards

The active type distribution proportion of hazards and the passive type distribution proportion of hazards can be calculated by Eqs. (10) and (11).

The active type distribution proportion and the passive type distribution proportion of H-Type hazards in daytime and night are shown in Fig. 8. The analysis shows that in daytime and night, most H-Type hazards can cause H-Type hazards, and a small number of hazards can cause other types of hazards. For a small number of hazards, such as H07, H10, H11, H28 and H30, the active type distribution proportion is 0, indicating that they cannot cause other hazards and instead lead directly to accidents. In daytime and night, most H-Type hazards are caused by H-Type and M-Type hazards, indicating that there is a strong correlation not only between H-Type hazards and M-Type hazards, but also among H-Type hazards. The passive type distribution proportion of H09, H15, H16 and H33 is 0 in daytime, indicating that they cannot be directly affected by other hazards in daytime.

The active type distribution proportion and the passive type distribution proportion of EI-Type hazards in daytime and night are shown in Fig. 9. The analysis shows that in daytime and night, most EI-Type hazards are more likely to lead to EI-Type hazards. In daytime and night, EI-Type hazards are mainly caused by M-Type and EI-Type hazards.

The active type distribution proportion and the passive type distribution proportion of E-Type hazards in daytime and night are shown in Fig. 10. The analysis shows that in daytime and night, E-Type hazards are more likely to lead to E-Type and EI-Type hazards, indicating that the occurrence of general natural disasters will lead to the occurrence of other secondary natural disasters, as well as mechanical equipment and other failures. In daytime and night, most of the E-Type hazards are caused by M-Type hazards and E-Type hazards, which also indicates that improper management will lead to the occurrence of environmental hazards.

The active type distribution proportion and the passive type distribution proportion of M-Type hazards in daytime and night are shown in Fig. 11. The analysis shows that all M-Type hazards can lead to the occurrence of other types of hazards in daytime and night, among which H-Type hazards and EI-Type hazards account for a large proportion. In daytime and night, only M03, M04 and M05 are M-Type hazards caused by other hazards; most of those causes are M-Type hazards, with a small number of H-Type hazards.

Exploring the type distribution proportion index is conducive to the subsequent development of targeted prevention strategies for each hazard and the prevention of railway accidents.

3.3.3. Accessibility between different types of hazards

The direct accessibility between different types of hazards and the


Fig. 17. Harmful consequences of all hazards.

Fig. 18. Changes of harmful consequences in daytime.

indirect accessibility between different types of hazards can be calculated by Eqs. (12) and (13). Fig. 12 shows the statistical results of the direct type accessibility index and the indirect type accessibility index between different types of hazards in daytime. Fig. 12 shows that during the daytime period, the direct accessibility between H-Type hazards and H-Type hazards, EI-Type hazards and EI-Type hazards, M-Type hazards and H-Type hazards, M-Type hazards and EI-Type hazards are all high, indicating that the relations between these hazards are relatively close from a local perspective. The indirect accessibility between H-Type hazards and H-Type hazards, M-Type hazards and H-Type hazards, M-Type hazards and EI-Type hazards are all high, indicating that from a global perspective, the relationships between these hazards are relatively close.

Fig. 13 shows the statistical results of direct accessibility between different types of hazards and indirect accessibility between different types of hazards in night. As seen in Fig. 13, in the night time period, the indirect accessibility between H-Type hazards and H-Type hazards, M-


Fig. 19. Changes of harmful consequences at night.

Fig. 20. Prevention strategy development process.


Table 7
Prevention suggestions for some key hazards.

Key hazard: H07 (Train driver accelerates, overspeed or fails to decelerate in time)
Recommended precautions:
- Improving the job requirements of train drivers to ensure that they have sufficient ability and experience before taking up their posts
- Strengthening the warning training for train drivers
- Reasonably arrange the shift arrangement and rest time of train drivers, and carry out health and safety inspection before train drivers take up their posts
- Provide more complete management training for management and monitoring staff to ensure their post ability
- Improve track monitoring, cleaning and early warning, and timely ensure the stable and suitable operation status of tracks
- Guarantee and maintain the protection and warning system of the train on time to ensure that the train driver receives accurate alarm information and makes timely adjustment
Safety inspection plan: Appropriately increase the inspection schedule during the daytime

Key hazard: EI07 (Vehicle with broken train components)
Recommended precautions:
- Replacing faulty wheels in time, checking and ensuring the normal use status of wheels on time
- Improving the inspection procedures of train parts and components, improving the safety inspection standards
- Ensuring the operation status of train track, and improving the safety inspection standards
- Timely clean up fallen leaves and other dirt that may damage and pollute train components
Safety inspection plan: Arrange the daytime and night shift inspection without any difference

Key hazard: E08 (Blocking lines of instruments and equipment)
Recommended precautions:
- Strengthening the management training for the operators of mechanical equipment to ensure the operation specification and accuracy
- Timely check the circuit, electrical and other systems to ensure that the wires are in a safe position
- Timely clean up the blocking of track lines caused by strong winds
- Improve the safety inspection system, timely inspect and maintain the equipment and environment around the track lines, and timely eliminate potential safety hazards
Safety inspection plan: Appropriately increase the inspection schedule during the daytime

Key hazard: M04 (No or inappropriate safe work system implemented)
Recommended precautions:
- Reasonably planning the work arrangement of railway staff to ensure moderate workload and prohibit long-term fatigue operation
- Strengthening the pre job training of on-site safety controllers and conducting more operation training before taking up their posts to ensure their ability to match their posts
- Conducting more complete management training for management and monitoring staff to ensure their ability to take up their posts
- More detailed division and arrangement of the responsibilities of personnel at each post to ensure that the responsibilities of personnel at each post do not conflict
- Strengthen the training management of management personnel and track workers, and increase the supervision plan
Safety inspection plan: Appropriately increase the inspection schedule during the daytime

Type hazards and H-Type hazards, M-Type hazards and EI-Type hazards are high, indicating that from a global perspective, the relationships between these hazards are relatively close. From a local perspective, the direct accessibility between H-Type hazards and H-Type hazards, EI-Type hazards and EI-Type hazards, M-Type hazards and H-Type hazards, M-Type hazards and EI-Type hazards are high, indicating that the relationships between these hazards are relatively close.

Exploring the accessibility between different types of hazards helps to obtain the correlation between different types of hazards both locally and as a whole, and to formulate prevention strategies for railway accidents.

3.3.4. Accessibility between types of hazards and time

The direct accessibility between types of hazards and time and indirect accessibility between types of hazards and time can be calculated by Eqs. (14) and (15). Fig. 14 shows the statistical results of direct accessibility between types of hazards and time and indirect accessibility between types of hazards and time in daytime and night. As can be seen from Fig. 14, direct accessibility and indirect accessibility of H-Type hazards and daytime period, M-Type hazards and daytime period, and M-Type hazards and night time period are all high, indicating that these hazard types are closely related to the corresponding time period, whether from a local or global perspective.

Exploring the connection between different types of hazards and different time periods is conducive to formulating prevention strategies for railway accidents.

3.3.5. Harmful consequences of hazards

The direct harmful consequences and intermediate harmful consequences of hazards can be calculated by Eqs. (16) and (17). The direct harmful consequences of H-Type hazards in daytime and night are shown in Fig. 15 (a). From the figure, it can be seen that among H-Type hazards, H04 and H14 cause greater harm in daytime. The direct harmful consequences of EI-Type hazards in daytime and night are shown in Fig. 15 (b). From the figure, it can be found that among EI-Type hazards, EI06, EI11, EI12, EI16, etc. cause greater harm in daytime. The direct harmful consequences of E-Type hazards in daytime and night are shown in Fig. 15 (c). From the figure, it can be found that among E-Type hazards, E07 causes greater harm in daytime. The direct harmful consequences of M-Type hazards in daytime and night are shown in Fig. 15 (d). From the figure, it can be found that M07, M14 and M17 among M-Type hazards cause greater harm in daytime, and these hazards need special attention. On the whole, the harmful consequences of most hazards in daytime are greater than those in night.

The intermediate harmful consequences of H-Type hazards in daytime and night are shown in Fig. 16 (a). From the figure, it can be found that among H-Type hazards, H07, H11, H14, etc. cause greater


intermediate harmful consequences in daytime. The intermediate harmful consequences of EI-Type hazards in daytime and night are shown in Fig. 16 (b). From the figure, it can be found that among EI-Type hazards, EI04, EI07, EI09, etc. cause greater intermediate harmful consequences in daytime. The intermediate harmful consequences of E-Type hazards in daytime and night are shown in Fig. 16 (c). From the figure, it can be found that among E-Type hazards, E08 causes the greatest intermediate harmful consequences in daytime. The intermediate harmful consequences of M-Type hazards in daytime and night are shown in Fig. 16 (d). From the figure, it can be found that among M-Type hazards, M04 causes the greatest intermediate harmful consequences in daytime. These hazards need to be focused on. On the whole, the intermediate harmful consequences of all hazards in daytime are greater than or equal to those in night.

Table A1
Hazards, hazards type and occurrence frequency.

Hazard | Hazards description | Type of hazard
H01 | Insufficient experience or ability of train driver | H-Type
H02 | Train driver’s misjudgment of dangerous situation | H-Type
H03 | Train driver ignores or misunderstands the warning | H-Type
H04 | Train driver stress, fatigue or distraction | H-Type
H05 | The train driver’s inspection before departure is not in place | H-Type
H06 | Train driver fails to operate or misoperates in violation of regulations | H-Type
H07 | The train driver overspeed or fails to slow down in time | H-Type
H08 | Illegal operation by road vehicle drivers | H-Type
H09 | Insufficient experience or ability of road vehicle drivers | H-Type
H10 | Road vehicle driver drives the vehicle in an unsafe position | H-Type
H11 | Passengers and pedestrians in unsafe positions | H-Type
H12 | Passengers and pedestrians are distracted, tired or uncomfortable | H-Type
H13 | Wrong judgment of pedestrians, passengers and riders on dangerous situations | H-Type
H14 | Unsafe behavior by pedestrians, passengers and riders | H-Type
H15 | Pedestrians, passengers and riders ignore or misunderstand the warning | H-Type
H16 | Excessive workload of staff | H-Type
H17 | The signalman is distracted or tired | H-Type
H18 | Signalman ignores relevant information or makes wrong judgment | H-Type
H19 | Signalman fails to conduct command operation or conducts wrong command operation | H-Type
H20 | The controllers of site safety fatigue or distraction | H-Type
H21 | Misjudgment or wrong decision of controllers of site safety | H-Type
H22 | Violation or wrong operation of controllers of site safety | H-Type
H23 | The controllers of site safety lacks experience in implementing safe system of work | H-Type
H24 | The controllers of site safety did not implement or implemented an inappropriate safety work system | H-Type
H25 | Person in charge of possession stressed or tired or distracted | H-Type
H26 | Person in charge of possession operation error or illegal operation | H-Type
H27 | Distractions of other track workers | H-Type
H28 | Other track workers in unsafe position | H-Type
H29 | Miscalculation of dangerous situation by other track workers | H-Type
H30 | Other track workers’ illegal operation or wrong operation | H-Type
H31 | Other track workers have not received, ignored or misunderstood warnings | H-Type
H32 | Manage and monitor the wrong or illegal operation of the staff | H-Type
H33 | Inadequate safety inspection of relevant staff before departure | H-Type
H34 | Track - road vehicles, mechanical equipment and other operators’ illegal operation or operation error | H-Type
EI01 | Brake system problems | EI-Type
EI02 | Braking system equipment and performance problems | EI-Type
EI03 | Road rail vehicle fault or abnormal state | EI-Type
EI04 | Train wheel failure or load change | EI-Type
EI05 | Uneven train load | EI-Type
EI06 | Train door design problems | EI-Type
EI07 | Damage and falling off of train components | EI-Type
EI08 | Train direction and line error | EI-Type
EI09 | Suspension and suspension fault of train | EI-Type
EI10 | Train protection and warning system fault | EI-Type
EI11 | Bogie distortion, damping failure, power shaft and other defects | EI-Type
EI12 | Failure of train mechanical parts such as screws and bearings | EI-Type
EI13 | Guard rail not installed | EI-Type
EI14 | Track twist fault | EI-Type
EI15 | Track slippery pollution | EI-Type
EI16 | Track tilt damage, cycle top and other defects | EI-Type
EI17 | Fault and defect of circuit and electrical system | EI-Type
EI18 | Broken cables and wires or defective capacitors | EI-Type
EI19 | Failure or damage of signal device or failure of signal system | EI-Type
EI20 | Line kerbs, guardrails, etc. are damaged or blocked effectively | EI-Type
EI21 | Inadequate warning devices at intersections | EI-Type
E01 | Leaves | E-Type
E02 | Landslide | E-Type
E03 | High Weather temperature | E-Type
E04 | Strong wind or aerodynamics | E-Type
E05 | The tree is in danger | E-Type
E06 | Low visibility caused by heavy fog and glare | E-Type
E07 | Rainfall, snow melting, river water, flood, water flow, etc | E-Type
E08 | Blocking lines of vehicles, instruments and equipment | E-Type
E09 | Cows, trees and other animals and plants block the line | E-Type
E10 | Retaining wall, retaining wall, stone, landslide debris, ice block track | E-Type
E11 | Intrusion of foreign vehicles, objects and animals | E-Type
E12 | Decrease of embankment stability (water erosion) | E-Type
E13 | Collapse and falling of reinforced concrete and retaining wall | E-Type
E14 | Ice accumulation on contact rail (device on electric traction vehicle) | E-Type
E15 | Viaduct damaged and sunk (supporting track) | E-Type
M01 | Inadequate preparation plan for flood and rainstorm | M-Type
M02 | Inappropriate shift scheduling or work planning | M-Type
M03 | The supervision and review of the safety work system is invalid | M-Type
M04 | No or inappropriate safe work system implemented | M-Type
M05 | Open for operation under abnormal or unsafe conditions | M-Type
M06 | Inadequate inspection and maintenance of overhead lines | M-Type
M07 | Inadequate track structure inspection and maintenance system | M-Type
M08 | Missing, insufficient or inconsistent provision of safety information communication | M-Type
M09 | Ineffective risk management of vehicle intrusion into the track | M-Type
M10 | Defects in risk management of level crossings | M-Type
M11 | Defects in the monitoring, contact and management of the signal system plan and its staff | M-Type
M12 | Defects in the evaluation, inspection and maintenance of equipment and environment around the track line | M-Type
M13 | The acceptance, test and inspection of train braking performance are invalid | M-Type
M14 | Defects in the inspection, maintenance and management of each component of the train | M-Type
M15 | The design and approval process of mechanical equipment, parts, systems, etc. is flawed | M-Type
M16 | Unclear assignment of responsibilities of site staff | M-Type
M17 | Inadequate training, supervision and management of train drivers | M-Type
M18 | Invalid competency management system for rail road vehicle machine operators | M-Type
M19 | Inadequate training, supervision and management of relevant management personnel and track workers | M-Type
Fig. 17 (a) and 17 (b) show the harmful consequences of all hazards
in daytime and night respectively. From the perspective of each hazard,

each hazard can bring great harm to the participants of railway accidents. In general, however, the direct harmful consequences of H-Type and M-Type hazards are larger than those of other types of hazards in daytime. At night, the direct harmful consequences of some M-Type hazards are larger than those of other types of hazards. In both daytime and night, the intermediate direct harmful consequences of H-Type hazards are generally larger than those of other types of hazards. Because some hazards play an intermediate role in the causal path of accidents, eliminating or controlling the hazards with high intermediate harmful consequences reduces the harmful consequences of a large number of hazards, and thereby helps prevent and control the injuries caused by accidents.

4. Discussion

The results of the case study show that the method based on the knowledge graph is useful for exploring railway accidents. The proposed topological indexes can adapt to the multidimensional structural characteristics of railway accidents, and provide useful information for understanding the potential laws of railway accidents and the correlation between hazards.

4.1. Comparison of harmful consequences related to hazards

According to the intermediate harmful consequence index of hazards, the hazards with a high intermediate harmful consequence index can be identified. When the hazards with higher intermediate harmful consequences are controlled one by one in the two time periods of daytime and night, the direct harmful consequence of the whole accident also decreases gradually, as shown in Figs. 18 and 19. Of the two periods, the harmful consequences in daytime are relatively heavy, so eliminating a hazard in daytime also has a correspondingly large effect. Therefore, from the perspective of the whole knowledge graph, eliminating the hazards with higher intermediate harmful consequences can greatly reduce the direct harmful consequences in accidents.
As shown in Figs. 18 and 19, 17 hazards were removed in turn, accounting for 19.1% of all hazards. In daytime, removing these 19.1% of the hazards reduces the direct harmful consequences of all hazards by 21.9% and the intermediate harmful consequences of all hazards by 57.5%. At night, removing the same 19.1% of the hazards reduces the direct harmful consequences of all hazards by 20% and the intermediate harmful consequences of all hazards by 56.9%. Controlling the hazards ranked by the proposed indexes therefore effectively reduces both the direct and the intermediate harmful consequences of all hazards, and the impact on the intermediate harmful consequences is greater.

4.2. Developing preventive measures

The third step is to consider the influence of different time factors according to the correlation between the key hazard and other hazards. Through the topological analysis indexes, the correlation between hazards can be obtained. According to these relations, relevant strategies can be formulated to provide reference for the prevention of railway accidents. The prevention strategy shall be formulated in accordance with the following three steps, as shown in Fig. 20. The first step is to identify the key hazards according to the intermediate harmful consequence index of the hazards in different time periods. The second step is to analyze the relationship between key hazards and other hazards by using causal correlation intimacy. The third step is to analyze other hazards that cause the key hazards, and formulate specific prevention strategies considering the influence of different time periods.
Because a large number of intermediate hazards with heavy intermediate harmful consequences are identified in the two time periods of daytime and night, four hazards of different types are chosen as examples to introduce the prevention strategies for each hazard. H11, EI07, E08 and M04 have heavy intermediate harmful consequences in the two time periods of daytime and night. Therefore, these four hazards are selected for the formulation of specific prevention strategies, as shown below.
H07 (Train driver accelerates, overspeeds or fails to decelerate in time): It can be seen from Fig. 7 (a) that in daytime and night, the active causal correlation intimacy of H07 is 0, indicating that it is caused by other hazards, cannot lead to other hazards, and will directly lead to accidents. The passive causal correlation intimacy of H07 in daytime is greater than that at night, indicating that H07 is more likely to be caused by other hazards in daytime. In this knowledge graph model, the hazards that directly lead to H07 are mainly H01, H03, H04, H06, EI10, E10 and M03. Therefore, some targeted measures can be formulated for these hazards, for example, “Improving the job requirements of train drivers, ensuring that they have sufficient ability and experience to work again”. More measures can be seen in Table 5. At the same time, when carrying out daily safety hazard protection, the management department can choose to appropriately increase the daytime inspection scheduling plan. H07 can be eliminated by blocking the occurrence of these hazards in time, so as to block the relationship with other hazards and prevent accidents.
EI07 (Vehicle with broken train components): As can be seen from Fig. 7 (b), the active causal correlation intimacy of EI07 is 0 in daytime and night, indicating that it is caused by other hazards, cannot lead to other hazards, and will directly lead to accidents. The passive causal correlation intimacy of EI07 in daytime and night is nonzero and equal, indicating that EI07 can be caused by other hazards with the same intensity in both periods. In this knowledge graph model, the hazards that directly lead to EI07 are EI04, EI12, EI15 and E01. Therefore, some targeted measures can be formulated for these hazards, such as “Replacing faulty wheels in time, checking and ensuring the normal use status of wheels on time”. More measures can be seen in Table 7. At the same time, during the daily protection of safety hazards, inspections shall be arranged without distinguishing between daytime and night, so as to ensure that the inspection and scheduling plans during the day and night are generally consistent. EI07 can be eliminated by blocking the occurrence of these hazards in time, so as to block the relationship with other hazards and prevent accidents.
E08 (Blocking lines of instruments and equipment): As can be seen from Fig. 7 (c), the active causal correlation intimacy of E08 is 0 in daytime and night, indicating that it is caused by other hazards, cannot lead to other hazards, and will directly lead to accidents. In both daytime and night, the passive causal correlation intimacy of E08 is not 0, and it is higher in daytime, indicating that E08 is more easily caused by other hazards in daytime. In this knowledge graph model, the hazards that directly lead to E08 are H34, EI17, E04 and M12. Therefore, some targeted measures can be formulated for these hazards, for example, “Strengthening the management training for the operators of mechanical equipment to ensure the operation specification and accuracy”. More measures can be seen in Table 5. At the same time, when carrying out daily safety hazard protection, the management department can choose to appropriately increase the daytime inspection scheduling plan. E08 can be eliminated by blocking the occurrence of these hazards in time, so as to block the relationship with other hazards and prevent accidents.
M04 (Not implemented or unsuitable safety work system implemented): As can be seen from Fig. 7 (d), M04 has a high degree of active causal correlation intimacy in daytime and a low degree of active causal correlation intimacy at night, indicating that it is more likely to lead to the occurrence of other hazards in daytime, which requires special attention. The low degree of passive causal correlation intimacy in both daytime and night indicates that M04 is not easily caused by other hazards in either period. In this knowledge graph model, the hazards that lead to M04 are H16, H22, H32, M16 and M19. Therefore, some targeted measures can be formulated for these hazards, for instance “Reasonably planning the work arrangement of railway staff, ensuring moderate workload and
prohibiting long-term fatigue operation”; more measures can be seen in Table 7. At the same time, during the daily protection of safety hazards, the daytime inspection scheduling should be appropriately increased. M04 can be eliminated by blocking the occurrence of these hazards in time, so as to block the relationship with other hazards and prevent accidents.
It is noteworthy that most of the identified hazards with heavy intermediate harmful consequences are H-Type hazards. That is, personnel factors greatly affect the occurrence and severity of accidents. By identifying the relationship between these hazards and other hazards and taking specific prevention and management measures, the spread and impact of hazards can be greatly controlled. Based on the above analysis, corresponding preventive measures and safety inspection schemes have been determined and formulated as shown in Table 7 below.

4.3. Further application prospects

The correlation analysis method based on knowledge graph has high application value in the field of safety. This paper discusses the application of this method in the correlation analysis of railway accident hazards. The correlation between various hazards in railway accidents is obtained. The key hazards of railway accidents have been explored. Preventing and controlling key hazards is conducive to the prevention of railway accidents. The method proposed in this paper is not limited to the field of railway safety, but can also be applied in other safety fields such as maritime safety, highway safety, and subway safety.
The real benefit of this method for the development of the safety field is that it can identify the knowledge entities and the relations between them in the corresponding fields based on the accident and disaster data in different fields, thereby establishing a knowledge graph model that conforms to a specific accident. By analyzing the constructed model using topological analysis indexes that conform to the characteristics of the model, the relations between hazards in different fields can be explored. Obtaining key hazards and carefully preventing and controlling them is conducive to the development of the safety field.

5. Conclusion

In this paper, a new correlation analysis model of railway accident hazards based on knowledge graph is proposed. In order to find the deeper correlation between hazards and prevent railway accidents, new topological analysis indexes are formulated. This method has been applied to actual railway accidents in the UK. The results show that this method can effectively find the relations between accidents and hazards. Further analysis of the revealed relations provides an effective decision-making basis for the prevention of railway accidents. The method based on knowledge graph is expected to be applied to explore other types of railway accidents and provide additional decision-making information for railway accident prevention.
However, this research still has some shortcomings that require further study and improvement in future work. More abundant data can be considered and more factors can be added in model building. Moreover, more knowledge entities and their relations should be included and a knowledge model with more information should be established so as to further explore the relations between hazards.

CRediT authorship contribution statement

Ning Wang: Writing – original draft, Methodology, Data curation. Xin Yang: Writing – original draft, Methodology, Conceptualization. Jianhua Chen: Writing – review & editing, Methodology. Hongwei Wang: Writing – review & editing, Conceptualization. Jianjun Wu: Writing – review & editing, Methodology.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Nos. 72071015, 72288101, 71890972/71890970), and the 111 Project (No. B20071).

Appendix A

See Table A1
