Learning Incident Prediction Models Over Large Geographical Areas For Emergency Response

2021 IEEE International Conference on Smart Computing (SMARTCOMP)
Learning Incident Prediction Models Over Large

Geographical Areas for Emergency Response
2021 IEEE International Conference on Smart Computing (SMARTCOMP) | 978-1-6654-1252-0/21/$31.00 ©2021 IEEE | DOI: 10.1109/SMARTCOMP52413.2021.00091
Sayyed Mohsen Vazirizade Ayan Mukhopadhyay Geoffrey Pettet

Vanderbilt University. Nashville. TN Vanderbilt University. Nashville. TN Vanderbilt University. Nashville. TN
Said El Said Hiba Baroud Abhishek Dubey

Tennessee Department of Transportation Vanderbilt University. Nashville. TN Vanderbilt University. Nashville. TN
Abstract—Emergency Response Management (ERM) necessi- earliest methods, known as ‘crash frequency analysis’, uses the
tates the use of models capable of predicting the spatial-temporal frequency of incidents in a specific discretized spatial area as
likelihood of incident occurrence. These models are used for a measure of the inherent risk the area possesses [2]. This
proactive stationing in order to reduce overall response time.
Traditional methods simply aggregate past incidents over space approach also forms the basis of hotspot analysis [3], [4].
and time; such approaches fail to make useful short-term predic- Since then, a range of different statistical and data-mining
tions when the spatial region is large and focused on fine-grained based approaches have been used to forecast incidents such
spatial entities like interstate highway networks. This is partially as traffic accidents [5], [6], [7], [8], [9], [10]. The main
due to the sparsity of incidents with respect to space and time. shortcoming of prior approaches is the inability to make
Further, accidents are affected by several covariates. Collecting,
cleaning, and managing multiple streams of data from various accurate predictions at high spatial-temporal resolutions due
sources is challenging for large spatial areas. In this paper, to high data sparsity. The majority of prior work in accident
we highlight how this problem is being solved in collaboration prediction either focuses on the spatial dynamics of incident
with the Tennessee Department of Transportation (TDOT) to occurrence or considers extremely coarse temporal resolutions;
improve ERM in the state of Tennessee. Our pipeline, based unfortunately such predictions are not particularly useful for
on a combination of synthetic resampling, clustering, and data
mining techniques, can efficiently forecast the spatio-temporal allocating and dispatching responders.
dynamics of accident occurrence, even under sparse conditions. This paper discusses a framework for prediction of spa-
Our pipeline uses data related to roadway geometry, weather, tiotemporal incidents. Over the past year, our project has
historical accidents, and traffic to aid accident forecasting. To focused on developing principled approaches to improve ERM
understand how our forecasting model can affect allocation and for the state of Tennessee. While we have extensively tackled
dispatch, we improve and employ a classical resource allocation
approach. Experimental results show that our approach can emergency response in past collaborations with several gov-
noticeably reduce response times and the number of unattended ernment bodies restricted to cities [11], [12], [13], planning
incidents in comparison to current approaches followed by first emergency response on scale of a large state is significantly
responders. The developed pipeline is efficacious, applicable in more challenging due to the aforementioned challenges of high
practice, and open-source. sparsity. The problems are exacerbated when we limit the area
of interest to only interstate highways across the state, which
I. I NTRODUCTION
reduces the number of samples available and leads to increased
A constant threat that affects humans across the globe are sparsity. The sparsity of incidents creates highly imbalanced
incidents like traffic accidents, fires, and crimes. Such inci- data, particularly at high spatiotemporal resolutions. However,
dents result in injuries, loss of life, and damage to properties our collaboration with first responders revealed that forecasting
and are collectively labeled as emergencies. For example, models that operate at high spatiotemporal granularity can be
road accidents alone account for 1.25 million deaths globally extremely beneficial to emergency response.
and about 240 million emergency medical services (EMS) Our contributions in this paper are the following: 1) we
calls are made in the U.S. annually [1]. Emergency response develop a pipeline that can effectively forecast incidents that
management (ERM) is defined as the set of procedures and are sparsely scattered in space and time. We show how a
tools that first responders use to deal with incidents. It includes combination of synthetic resampling and non-spatial clustering
specific mechanisms to forecast incidents, detect incidents, can result in the creation of accurate spatiotemporal models
allocate and dispatch resources, and mitigate the post-effects for short-term forecasting of road accidents. 2) It is natural
of incidents [1]. Arguably, the most important component of to compare forecasting approaches through metrics such as
ERM is to understand the spatiotemporal dynamics of incident likelihood values on test data, error rates, precision, and recall.
occurrence, as this aids resource allocation and dispatch and However, our conversations with first responders revealed that
helps design policies and safety codes. in practice, the relative hazard that each discretized spatial
A variety of approaches have been used to understand the area possesses is more important than a generative distribution
spatial and temporal dynamics of road accidents. One of the over incident occurrence. This is intuitive since accurately
978-1-6654-1252-0/21/$31.00 ©2021 IEEE 424

DOI 10.1109/SMARTCOMP52413.2021.00091
Authorized licensed use limited to: ESCUELA POLITECNICA DEL LITORAL (ESPOL). Downloaded on February 09,2023 at 05:43:23 UTC from IEEE Xplore. Restrictions apply.
in unobserved heterogeneity in the likelihood of incident
occurrence across large spatial regions.
Sparsity: It is also crucial to take into account the frequency
of incident occurrence. While the frequency of road accidents
is alarming, incidents are actually extremely sporadic when
Fig. 1: Blue lines represent TN’s roadway network. Yellow segments viewed from the perspective of total time and space. For
represent interstate highway segments under the jurisdiction of TDOT example, there were a total of approximately 78,000 road
and are the area of study for this paper. accidents reported between 2017-2020 on interstate highways
forecasting the risk at each segment relative to other segments in Tennessee. Performing spatiotemporal discretization that
is important for allocating resources. We show how standard considers about 5,000 road segments and time windows of
measures of correlation can be used to evaluate how models 4 hours each results in > 99% sparsity.
over incident prediction can identify the relative risks that Data Collection and Integration: Road accidents are af-
areas pose. 3) The overall goal of ERM is to reduce incident fected by a large number of determinants. For predicting
response times by better allocating resources in anticipation accidents in large geographic areas, it is challenging to collect,
of future incidents. To improve allocation, we modify the clean, understand, and analyze data from different sources and
classical p-median problem for resource allocation to balance integrate them for incident prediction.
the geographic spread of responders and their availability in
high density regions. 4) Finally, through extensive simulations III. DATA
on real-world data, we show how our forecasting pipeline and Before describing our pipeline, we describe the covariates
allocation algorithm provide a significant reduction in response used and their sources (Table I). Our exercise of collecting data
times compared to the existing methods used in practice. The for features that affect accidents was guided by the invaluable
more detailed version of this manuscript is available in [14]. domain expertise of first responders and our collaborators at
the Tennessee Department of Transportation (TDOT). Using
II. P ROBLEM F ORMULATION the data sources shown in Table I, we extract the following
features: 1) Roadway Information: To learn a predictive model
Consider a spatial area of interest S, in which incidents (like for accidents over a network of roadways, the information and
accidents) occur in space and time. The decision-maker ob- geometry of the roadways are needed. We collect roadway
serves a set of samples denoted by {(s1 , t1 , w1 ), (s2 , t2 , w2 ), information from INRIX [15], a private entity that provides
. . . , (sn , tn , wn )}, where si is location of the ith incident, location-based data and analytics. We retrieved information for
and in our problem is limited to the Interstate Highway about 80,000 roadway segments in the state of Tennessee, out
network of the state of Tennessee, Fig. 1. ti denote the time of which about 5,000 are interstate highway segments. We also
of occurrence of the ith incident. wi ∈ m represents a retrieved static features associated with each segment that are
vector of features associated with the environment defined by immutable (relatively) over time (e.g., like number of lanes).
the location and time of the ith incident. We refer to this 2) Traffic: The correlation of traffic and road accidents is well-
tuple of vectors as D, which denotes the input data that the established [1]. We collected traffic data for each of the road
decision-maker has access to. The vector w can contain spatial, segments through INRIX at a temporal resolution of 5-minute
temporal, or spatio-temporal features and it captures covariates intervals for about three years. Then, we calculated congestion
that potentially affect incident occurrence. For example, w based on the free flow speed of traffic and the estimated current
typically includes features such as weather, traffic volume, and speed of the vehicles. 3) Weather: Weather is inherently spatio-
time of day. The most general form of incident prediction can temporal, and can play an important role in accident rates [1].
then be stated as learning the parameters θ of a function over a We collected hourly weather data (temperature, precipitation,
random variable X conditioned on w. We denote this function visibility, and wind) from 40 different weather stations in and
by f (X | w, θ). The random variable X represents a measure around the state of Tennessee. 4) Incidents: We look at every
of incident occurrence such as a count or presence of incidents accident reported in Tennessee from January 2017 to May
during a specific time period on any part of a segment. The 2020, a total of approximately 78,000 accidents. Incident data
goal of the incident prediction problem is to find the optimal for this project is provided by TDOT.
parameters θ∗ that best describe D. This can be formulated as
a maximum likelihood estimation (MLE) problem. IV. A PPROACH
Below, we highlight the major challenges this problem Our goal is to learn a function f (Section II) that outputs
formulation poses in the context of designing an emergency the likelihood of incident occurrence conditioned on a set
response pipeline for a large spatial area: of features. Discretization reduces the problem to learning
Irregular incident occurrence: It is well-established in liter- the likelihood of incident occurrence on each spatiotemporal
ature that predicting road accidents is extremely difficult due unit. A straightforward way to do so is to learn a separate
to their inherent randomness [1]. While accidents are affected model over each segment. However, such an approach results
by various features, it is difficult to take all determinants into in overfitting; each segment contributes a relatively small
account while designing forecasting models, thereby resulting amount of data which ignores structural similarities between
425
TABLE I: Data Features, Size, and Sources
Dataset Range Size Rows Features Source Frequency Type Description
- - - - Time of day Derived - Temporal We divide each day into six 4-hour time windows.
- - - - Weekend Derived - temporal A binary feature that denotes weekdays.
02/01/2017 Past Incidents in the last window Derived - Spatio-temporal Number of incidents on the segment in the last time window of 4 hours
to Past Incidents in a day Derived - Spatio-temporal Number of incidents on the segment in the last day
Incident 21MB 80,000
05/01/2020 Past Incidents in a week Derived - Spatio-temporal Number of incidents on the segment in the last week
Past Incidents in a month Derived - Spatio-temporal Number of incidents on the segment in the last month
02/01/2017 Visibility Weatherbit 1 hour Spatio-temporal A measure of the distance at which an object or light can be clearly discerned.
to Wind Speed Weatherbit 1 hour Spatio-temporal Speed of wind.
Weather 300MB 1,400,000
06/01/2020 Precipitation Weatherbit 1 hour Spatio-temporal Amount of precipitation.
Temperature Weatherbit 1 hour Spatio-temporal It is the reported temperature.
04/01/2017 Congestion Derived 5 minutes Spatio-temporal Congestion is the ratio of the difference between free flow speed and the current speed to free flow speed
Traffic to 1.2TB 30,000,000,000 Free Flow Speed INRIX 5 minutes Spatial The speed at which drivers feel comfortable if there is no traffic and adverse weather condition.
12/01/2020 Traffic Confidence INRIX 5 minutes Spatio-temporal A confidence score regarding the accuracy of the traffic data (we collect this directly from INRIX).
Lanes INRIX static Spatial Number of lanes for a roadway segment.
Roadways Static 81MB 80,000 Miles Derived static Spatial Length of a roadway segment.
iSF Derived static Spatial Inverse scale factor which represents the the curvature of a roadway segment.
patterns of incident occurrence across the entire spatial region Balanced Random Forest method [17]. 4) Finally, we also
of interest. The other approach is to learn one model for the use simple artificial neural networks to learn a model over
entire area. However, a universal model fails to capture any incident occurrence. An important note to practitioners is the
heterogeneity that is not explicitly modeled in the feature non-interpretability of neural networks can be a barrier when
space. In order to balance these considerations, we try to deploying systems in the real-world that affect government
identify segments that observe similar patterns for incident policies. While we refrain from presenting complete descrip-
occurrence [11]. While it is possible to identify distinct spatial tions of each approach due to brevity, the accompanying
regions (hotspots) and learn a separate model for each area, codebase released with this paper provides all the details.
it is possible that there exists generalizable information in
V. A LLOCATION AND D ISPATCH
the entire area that is spatially invariant. In order to do so,
we seek to identify common areas irrespective of spatial The primary purpose of incident prediction models is to
contiguity by clustering all the available segments based on make informed resource allocation decisions. However, prior
their frequency of incident occurrence. Specifically, we used literature rarely evaluates ERM pipelines in their entirety.
k-means algorithm [16] to group the segments into clusters. Our goal in this project is to ensure that first responders
Given clusters of roadway segments that share similar controlled by the state of Tennessee save vital response time
patterns of spatio-temporal incident occurrence, learning the when dealing with accidents. Hence we seek to evaluate the
patterns is still challenging due to the sparsity of the data. To impact incident models have on response time outcomes. Our
address this concern, we perform synthetic under-sampling and evaluation process is guided by the following steps:
over-sampling to balance our data. However, naive synthetic 1) Understanding existing policies: Through our collabora-
sampling performs poorly in our case since the relative fre- tion with first responders, we first understand how emer-
quencies of incident occurrence are markedly different among gency resources are allocated and deployed in practice. We
the clusters. Therefore, it is impractical to ‘balance’ data in describe this approach below and use it as our baseline.
each cluster in the same manner. To alleviate this, we start with 2) Resource Allocation: In practice, models like the well-
the cluster with the highest frequency of incident occurrence known p-median formulation [18], [19] are widely used
(cluster A, say) and perform synthetic sampling such that the to allocate emergency resources [1]. A shortcoming of
number of positive data points (spatial segments in temporal such approaches is that the service time of resources (for
windows that have accidents) is the same as the number of example, the time that ambulances are busy responding
negative data points (spatial segments in temporal windows to accidents) is not taken directly into account in the
that do not have accidents). Then, we perform synthetic allocation process. We introduce a novel modification to
sampling in the other clusters such that the ratio of accidents the p-median problem to design a heuristic approach that
occurring for any given cluster (cluster B, say) to the frequency bridges this crucial gap.
in cluster A is the same as in the original dataset. 3) Evaluation: Using the proposed allocation model, we
Clustering and synthetic sampling provide the foundation compare the performance of existing prediction models and
to learn spatio-temporal forecasting models over accident our pipeline by creating a black-box simulator that imitates
occurrence. We use the following well-known models to this emergency response.
end. 1) We use Logistic Regression to treat the occurrence of
accidents as a dichotomous output and model the likelihood A. Resource Allocation
that any accidents occur. 2) We use count-based models to Our interaction with first responders revealed that resource
model accident occurrence conditional on spatial temporal allocation is based on identifying hotspots of incident occur-
features. We use Poisson regression as well as zero-inflated rence. First, a map based on historical accidents is created.
regression models, both of which have been widely used Then a group of experienced engineers determine the location
to model accident occurrence [1], with zero-inflated models of the responders; typically, responders are placed in areas
demonstrating significantly improved predictive power [6]. 3) with the highest historical accident rates.
We also use the random forest classifier, which is an ensemble While allocation based on hotspots is intuitive, it has
of decision trees and can implicitly address sparsity using the been shown to underperform in practice [1]. The allocation
426
formulation we use is based on the p-median problem [18], Algorithm 1: Greedy-Add Algorithm
which is commonly applied to ambulance allocation. The input : Demand Edges E, Potential Responder Locations L, Segment
objective of the standard p-median problem is to locate p Incident Likelihoods ai ∀ei ∈ E, Segment to Location Distances
d(i, j) ∀ei ∈ E, ∀lj ∈ L, Number of Responders p, Balance
facilities (i.e. responders) such that the average demand- Factor α
weighted distance between edges and their nearest facility is output: Responder Locations X
1 Initialize k := 0, Xk := ∅ ;
minimized. One shortcoming of the p-median formulation is 2 while k < p do
that it does not account for responders becoming unavailable 3 k := k + 1;
4 for location lj ∈ L, where j ∈ / Xk−1 do
when attending to incidents. To address this, we modify the 5 Xk := Xk−1 ∪ lj ;
standard p-median formulation by adding a balancing term 6 E, where yei ∈ Xk ;
Find nearest facilities yi ∀ei ∈
e ∈E ai ψ α
to the objective function. Intuitively, this balancing term. 7 Compute balance terms bj := ( i
ei ∈E ai
) ∀lj ∈ L where
α, penalizes responders that cover disproportionately large ψ := 1 if yi = lj , ψ := 0 otherwise;

8 Compute Zjk := e ∈E ae d(ei , yi )bye ;
demand compared to other facilities, encouraging multiple i i
9 end
responders to congregate near high demand areas. Formally, 10 Best location lj∗ := argminj Zjk ;
we solve the following optimization problem: 11 Xk := Xk−1 ∪ j ∗ ;
12 Return Xk
13 end
|E| |L|

min ai dij Yij bj (1a) that minimizes the modified p-median score (step 10) and add
i=1 j=1
|L|
it to the set of allocated responder locations (step 11).

s.t. Yij = 1, ∀i ∈ {1, . . . , |E|} (1b) Given the allocation of responders, we simulate response to
j=1
real incidents to evaluate the efficacy of our model. Response
|L|
to emergency incidents is typically greedy [1]; the closest
Xj = p (1c)
j=1 available responder to the scene of the incident is dispatched.
Yij ≤ Xj , ∀i ∈ {1, . . . , |E|}, ∀j ∈ {1, . . . , |L|} (1d) This a direct consequence of the critical nature of the incidents
Xj , Yij ∈ {0, 1}, ∀i ∈ {1, . . . , |E|}, ∀j ∈ {1, . . . , |L|} (1e) that emergency responders address.
where E is the set of demand nodes (roadway segments), L VI. E XPERIMENTAL E VALUATION
is the set of possible responder locations, p is the number To evaluate our models using the datasets explained in Sec-
of responders to be located, ai is the likelihood of accident tion III, we train separate models based on a rolling temporal
occurrence on node ei ∈ E, and dij is the distance between window. The first train set starts from April/2017 to Jan-
node ei ∈ E and location lj ∈ L. Yij and Xj are two sets of uary/2019 and the corresponding test set is Feb/2019.The sec-
decision variables; Xj = 1 if a responder is located at lj ∈ L ond train set is created by adding the first test set (Feb/2019)
and 0 otherwise, and Yij = 1 if edge ei ∈ E is covered by to the the first train set (April/2017 to January/2019), and
a responder located at lj ∈ L (i.e. the responder at j is the March/2019 is considered for the second test set. This rolling
nearest placed responder to e) and0 otherwise. The balancing window continues to the last test set, which is Jan/2020; 12
e∈E ae Yej α
term we add is denoted by bj = ( ) , and represents pairs of test and train sets in total. For the sake of repro-
e∈E ae
the proportion of total demand covered by a responder located ducibility, the code used in this study is publicly available at
at j. The influence of the balancing term is controlled by the https://github.com/StatResp/smart-comp IncidentPrediction.
hyper-parameter α; intuitively, as α increases, responders are
more ‘tightly packed’ around high demand areas. If α = 0, TABLE II: Summary of performance evaluation metrics for
our formulation reduces to the standard p-median formulation. each model in percentage
Constraint (1b) expresses that the demand of each node must Model Cluster Resamp. Name Acc. Pre. Rec. F1 Pea. Spe.
Naive Naı̈ve 95.5 3.8 4.2 4.0 82.1 60.8
be met, (1c) ensures that p responders are located, and (1d) No LR+NoR+NoC1 94.0 13.8 27.4 18.2 70.4 55.2
No RUS LR+RUS+NoC1 93.0 12.8 32.3 18.3 63.1 54.7
shows that nodes must be covered only by locations where ROS LR+ROS+NoC1 93.0 12.8 32.3 18.3 63.2 54.7
LR
No LR+NoR+KM2 93.0 12.5 30.9 17.7 76.6 58.4
responders have been located. Yes RUS LR+RUS+KM2 92.3 12.1 34.4 17.8 74.2 58.1
ROS LR+ROS+KM2 92.4 12.2 34.2 17.9 74.2 58.1
The p-median problem is known to be NP-hard [20], No NN+NoR+NoC1 94.9 19.2 32.8 24.0 71.7 58.5
therefore heuristic methods are employed to find approximate No RUS NN+RUS+NoC1 95.0 19.2 32.6 24.1 73.2 59.3
ROS NN+ROS+NoC1 94.9 19.1 32.8 23.9 69.3 54.7
NN
solutions in practice. We use the Greedy-Add algorithm [21] to No NN+NoR+KM2 95.0 19.0 31.6 23.7 75.6 58.9
Yes RUS NN+RUS+KM2 94.7 18.4 32.7 23.3 73.1 54.6
optimize the locations of responders. We show the algorithm ROS NN+ROS+KM2 94.7 18.3 33.1 23.3 74.5 55.4
No RF+NoR+NoC1 95.0 19.0 31.8 23.6 78.7 63.4
in Algorithm 13. First, we initialize the iteration counter k RUS RF+RUS+NoC1 95.2 19.3 30.5 23.5 67.4 56.9
No
and the set of allocated responder locations Xk to the empty ROS
Weighted
RF+ROS+NoC1
RF+CW+NoC1
95.3
95.4
18.6
20.6
27.6
30.4
22.1
24.4
79.2
77.1
64.6
62.5
Tree
set (step 1). Then, as long as there are responders awaiting No RF+NoR+KM2 95.1 18.9 30.5 23.2 79.8 62.3
RUS RF+RUS+KM2 95.0 19.4 32.5 24.2 73.8 57.6
allocation, we iterate through the following loop: (1) update Yes
ROS RF+ROS+KM2 95.1 18.3 28.7 22.2 80.1 63.6
Weighted RF+CW+NoC1 95.4 20.6 30.4 24.4 77.1 62.5
counter k current iteration (step 3), (2) for each potential No ZIP+NoR+NoC1 94.4 14.6 26.8 18.9 74.0 58.0
location not already in the allocation, compute the modified No RUS ZIP+RUS+NoC1 94.2 13.9 26.1 18.1 61.1 50.6
ROS ZIP+ROS+NoC1 94.2 13.9 26.7 18.2 61.2 50.6
ZIP
p-median score (equation 1a) of the allocation which includes No ZIP+NoR+KM2 93.1 13.1 31.9 18.5 77.6 61.8
Yes RUS ZIP+RUS+KM2 93.0 12.7 30.8 17.8 74.2 57.1
the potential location (steps 5 - 8), and (3) find the location ROS ZIP+ROS+KM2 93.0 12.8 30.9 18.0 74.3 57.0
427
TABLE III: Total travel distance of responders per accident in each generally decreases those metrics. The effect of clustering
4-h window and resampling on other parameters is somewhat unclear.
p
alpha 0
10
0.5 1 0
15
0.5 1 0
20
0.5 1
We observe that typically, the combination slightly under-
Naı̈ve 42.81 42.14 45.82 25.61 25.49 27.80 19.31 19.62 21.41 performs in comparison to using one of the two approaches.
LR+NoR+NoC1 43.06 42.67 42.08 25.11 24.57 25.83 18.59 17.30 17.52
LR+RUS+NoC1 45.73 44.78 42.81 25.62 25.12 25.99 19.27 18.07 17.39 We present three major takeaways from this observation —
LR+ROS+NoC1 45.69 44.72 42.67 25.64 25.10 25.93 19.28 18.03 17.35
LR+NoR+KM2 42.76 42.56 43.35 24.31 24.65 26.02 18.54 18.73 19.01 first, NN and RF outperform the Naive, LR regression, and
LR+RUS+KM2 44.62 42.96 43.06 24.75 24.62 25.76 18.68 18.72 17.72
LR+ROS+KM2 44.72 42.89 43.04 24.75 24.56 25.88 18.70 18.72 17.74
ZIP models. Second, count-based models are not suitable
NN+NoR+NoC1
NN+RUS+NoC1
40.31
40.05
40.42
40.45
41.36
41.30
22.72
22.47
23.36
23.42
24.91
24.86
16.41
16.18
16.75
16.86
18.05
18.18
for such fine spatio-temporal resolution since there are few
NN+ROS+NoC1 40.73 40.74 41.63 22.77 23.49 25.05 16.24 16.94 18.19 samples showing more than one incident occurrence. Third,
NN+NoR+KM2 40.39 40.92 42.75 22.68 23.95 25.67 16.71 17.29 18.80
NN+RUS+KM2 40.65 40.65 42.24 22.71 23.99 25.32 16.69 17.21 18.63 synthetic sampling and clustering improves forecasting in
NN+ROS+KM2 40.81 40.78 41.98 22.80 23.74 25.52 16.71 17.21 18.72
RF+NoR+NoC1 43.14 40.58 41.80 23.42 23.52 24.95 17.28 16.81 17.78 sparse datasets compared to approaches that do not use them.
RF+RUS+NoC1 41.77 40.29 42.43 23.26 23.63 25.54 16.71 17.38 18.70
RF+ROS+NoC1 44.36 42.20 41.95 24.62 24.60 25.44 18.57 17.31 17.42
However, we recommend practitioners carefully evaluate each
RF+CW+NoC1 42.34 41.13 42.35 23.33 23.94 25.49 16.72 17.35 18.44 component of the proposed incident prediction pipeline on
RF+NoR+KM2 43.60 41.37 41.86 23.71 23.77 25.05 17.56 16.99 17.84
RF+RUS+KM2 42.00 41.18 42.77 23.29 24.12 25.79 16.79 17.56 18.85 unseen data (test set) before deployment. Fourth, while values
RF+ROS+KM2 43.20 41.13 42.08 23.86 23.89 25.24 17.55 16.98 17.95
RF+CW+NoC1 42.54 40.95 42.27 23.65 24.29 25.56 17.04 17.49 18.81 of F1 scores might seem low in comparison to approaches on
ZIP+NoR+NoC1 42.37 42.06 41.53 24.29 24.17 25.12 18.00 17.07 17.54
ZIP+RUS+NoC1 47.17 46.73 43.20 26.45 25.74 26.46 19.59 19.13 17.78 other data-driven problems, we claim that the improvement is
ZIP+ROS+NoC1
ZIP+NoR+KM2
47.21
42.22
46.74
42.14
43.14
42.56
26.43
23.88
25.74
24.35
26.54
25.89
19.55
17.92
19.12
18.21
17.79
18.68
significant in the context of extremely sparse and inherently
ZIP+RUS+KM2 45.95 43.50 43.97 25.37 25.20 26.41 19.22 19.44 17.76 random incidents like road accidents. We show the validity of
ZIP+ROS+KM2 46.02 43.47 44.00 25.39 25.22 26.38 19.20 19.51 17.71
this claim by simulating allocation and dispatch to accidents.
B. Allocation and Dispatch

A. Evaluation Our final goal is to help our community partners reduce
To evaluate the performance, we use both conventional accident response times. We now discuss how our forecasting
performance metrics (accuracy, precision, recall, and F1 score) models aid allocation and dispatch. We evaluate the entire
and metrics desired by responders such as the Pearson and the combination of forecasting and dispatch on a one year simu-
Spearman correlations of the average prediction vector (the lation of emergency response in Tennessee using 4-hour time
element ith in this vector is the average predicted likelihood of windows. We vary the hyper-parameter α and the number
incident occurrence over a test period for segment ith.) and the of available responders p to observe their effect. We use
average incident occurrence vector (the element ith is average two metrics to evaluate performance — the average distance
incident occurrence over the same test period and segment) We traveled by the responders, Table III, and the (average and
refer to the following abbreviations for brevity: LR (logistic max) number of incidents that cannot be attended due to
regression), NN (neural networks), RF (random forests), ZIP responder unavailability, Table IV. We observe that the models
(zero-inflated Poisson), RUS (random under sampling), ROS with high F1 and precision metrics in Table II typically
(random over sampling), NoC1 (No clustering) and KM2 (k- perform best in terms of distance traveled and response times,
means clustering with 2 clusters). Our baseline model emulates with the best model reducing the average distance traveled by
the forecasting model currently used by first responders in 4.5 km compared to the Naive model.
Tennessee, which is essentially a long term hotspot map We also observe that our forecasting pipeline results in a
which captures the average number of accidents for each noticeable improvement to response times (upto 19%) and
roadway segment over a long period of time. We refer to reduces the total number of unattended accidents, with NN
the baseline as the Naive model. We present results for each models providing the best results (and RF taking a close
approach in Table II. To understand the role and efficacy of second). We provide three major takeaways based on our allo-
each component of the pipeline, we present results with and cation and dispatch experiments. First, forecasting models that
without synthetic resampling and clustering. Furthermore, we provide the highest accuracy might not be the best candidates
investigate the capability of the aforementioned performance for allocation. This observation shows the importance of using
metrics using a comprehensive simulation and show the results a metric that focuses on false negatives and false positives
in Tables III and IV. In all of the aforementioned tables, the (like the F-1 score) for sparse emergency incidents. Second,
values in each column are color coded; green represents the while traditional allocation models based on long-term (tempo-
best results while red represents the worst. ral) hotspots are widely used, accurate short-term forecasting
We present the major observations. Based on Table II, we models can result in significant reduction in response times
observe that the Naive model shows good performance in to accidents. Finally, leveraging the structure of the problem
terms of Pearson correlation and accuracy but not F1 score, to improve classical resource allocation formulations can aid
precision, and recall due to under-predicting accidents. Using emergency response in the field.
just clustering generally improves Pearson and Spearman cor- VII. C ONCLUSION
relation while reducing F1-score and precision. We further ob-
serve a similar trend with combining resampling and clustering Emergency response to incidents like road accidents is a
for Pearson and Spearman correlation, while resampling alone major concern for first responders. Standard approaches to
428
TABLE IV: Average and maximum number accidents in each 4-h window that responders are not able to immediately respond
average number of unattended accidents maximum number of unattended accidents
p 10 15 20 10 15 20
α 0 0.5 1 0 0.5 1 0 0.5 1 0 0.5 1 0 0.5 1 0 0.5 1
Naı̈ve 0.59 0.55 0.51 0.06 0.05 0.05 0.01 0.01 0.01 31.00 34.00 33.00 22.00 22.00 22.00 9.00 10.00 10.00
LR+NoR+NoC1 0.57 0.49 0.49 0.06 0.05 0.04 0.00 0.00 0.00 32.00 31.00 30.00 21.00 21.00 15.00 6.00 5.00 4.00
LR+RUS+NoC1 0.61 0.54 0.52 0.06 0.05 0.05 0.01 0.00 0.00 31.00 31.00 33.00 19.00 18.00 21.00 8.00 6.00 6.00
LR+ROS+NoC1 0.62 0.54 0.52 0.06 0.05 0.05 0.01 0.00 0.00 31.00 31.00 33.00 19.00 18.00 21.00 8.00 6.00 6.00
LR+NoR+KM2 0.56 0.49 0.48 0.06 0.05 0.05 0.00 0.00 0.00 33.00 31.00 34.00 22.00 21.00 19.00 7.00 8.00 6.00
LR+RUS+KM2 0.60 0.52 0.49 0.06 0.05 0.05 0.00 0.00 0.00 31.00 32.00 32.00 23.00 21.00 19.00 7.00 5.00 8.00
LR+ROS+KM2 0.60 0.51 0.50 0.06 0.05 0.05 0.00 0.00 0.00 31.00 31.00 32.00 23.00 23.00 19.00 7.00 6.00 8.00
NN+NoR+NoC1 0.51 0.47 0.48 0.04 0.04 0.04 0.00 0.00 0.00 31.00 32.00 35.00 19.00 16.00 18.00 6.00 4.00 7.00
NN+RUS+NoC1 0.49 0.46 0.46 0.04 0.04 0.04 0.00 0.00 0.00 31.00 31.00 34.00 19.00 20.00 17.00 5.00 6.00 6.00
NN+ROS+NoC1 0.51 0.47 0.48 0.05 0.04 0.05 0.00 0.00 0.00 31.00 34.00 34.00 21.00 19.00 19.00 6.00 6.00 6.00
NN+NoR+KM2 0.50 0.47 0.48 0.04 0.04 0.04 0.00 0.00 0.00 30.00 35.00 34.00 20.00 19.00 19.00 7.00 6.00 6.00
NN+RUS+KM2 0.51 0.47 0.47 0.05 0.05 0.04 0.00 0.00 0.00 30.00 32.00 35.00 20.00 19.00 18.00 8.00 6.00 6.00
NN+ROS+KM2 0.51 0.47 0.47 0.05 0.04 0.05 0.00 0.00 0.00 32.00 32.00 32.00 19.00 17.00 18.00 7.00 6.00 7.00
RF+NoR+NoC1 0.57 0.48 0.49 0.05 0.04 0.04 0.00 0.00 0.00 33.00 31.00 32.00 20.00 20.00 18.00 7.00 5.00 5.00
RF+RUS+NoC1 0.53 0.47 0.47 0.05 0.05 0.04 0.00 0.00 0.00 31.00 32.00 34.00 21.00 21.00 19.00 6.00 6.00 5.00
RF+ROS+NoC1 0.61 0.52 0.50 0.06 0.05 0.05 0.01 0.00 0.00 33.00 31.00 32.00 22.00 19.00 23.00 8.00 6.00 8.00
RF+CW+NoC1 0.54 0.48 0.46 0.05 0.04 0.04 0.00 0.00 0.00 31.00 31.00 31.00 22.00 18.00 17.00 8.00 9.00 6.00
RF+NoR+KM2 0.58 0.50 0.48 0.05 0.04 0.05 0.01 0.00 0.00 33.00 30.00 31.00 22.00 20.00 19.00 9.00 4.00 5.00
RF+RUS+KM2 0.53 0.47 0.47 0.05 0.05 0.04 0.00 0.00 0.00 32.00 31.00 34.00 21.00 21.00 13.00 5.00 6.00 5.00
RF+ROS+KM2 0.58 0.50 0.48 0.05 0.05 0.05 0.00 0.00 0.00 31.00 31.00 31.00 19.00 20.00 18.00 6.00 5.00 7.00
RF+CW+NoC1 0.55 0.48 0.47 0.05 0.05 0.04 0.00 0.00 0.00 32.00 31.00 33.00 22.00 21.00 21.00 5.00 6.00 8.00
ZIP+NoR+NoC1 0.54 0.49 0.46 0.06 0.04 0.04 0.00 0.00 0.00 32.00 33.00 31.00 21.00 19.00 20.00 6.00 5.00 7.00
ZIP+RUS+NoC1 0.64 0.56 0.54 0.06 0.06 0.05 0.01 0.00 0.00 31.00 32.00 33.00 19.00 21.00 21.00 8.00 7.00 6.00
ZIP+ROS+NoC1 0.64 0.56 0.54 0.06 0.06 0.05 0.01 0.00 0.00 31.00 32.00 33.00 19.00 21.00 21.00 8.00 7.00 6.00
ZIP+NoR+KM2 0.54 0.49 0.47 0.06 0.04 0.05 0.00 0.00 0.00 33.00 32.00 33.00 21.00 20.00 18.00 7.00 5.00 6.00
ZIP+RUS+KM2 0.62 0.53 0.51 0.05 0.05 0.05 0.01 0.00 0.00 31.00 32.00 34.00 23.00 21.00 19.00 8.00 7.00 6.00
ZIP+ROS+KM2 0.63 0.53 0.52 0.05 0.05 0.05 0.01 0.00 0.00 31.00 32.00 32.00 23.00 18.00 20.00 8.00 7.00 6.00
predict road accidents rarely scale to large geographic areas [7] A. Pande and M. Abdel-Aty, “Assessment of freeway traffic parameters
due to extremely high sparsity in data and difficulties in leading to lane-change related collisions,” Accident Analysis & Preven-
tion, vol. 38, no. 5, pp. 936–948, 2006.
gathering data. In collaboration with TDOT, we present a [8] L. Zhu, F. Guo, R. Krishnan, and J. W. Polak, “The use of convolutional
framework for forecasting extremely sparse spatiotemporal neural networks for traffic incident detection at a network level,” in
incidents like road accidents. We show how our approach for Transportation Research Board Annual Meeting, 2018.
[9] J. Bao, P. Liu, and S. V. Ukkusuri, “A spatiotemporal deep learning
forecasting, based on a combination of non-spatial clustering, approach for citywide short-term crash risk prediction with multi-source
synthetic resampling, and learning from multiple data sources, data,” Accident Analysis & Prevention, vol. 122, pp. 239–254, 2019.
outperforms forecasting methods used in the field. We also [10] X. Li, D. Lord, Y. Zhang, and Y. Xie, “Predicting motor vehicle
crashes using support vector machine models,” Accident Analysis and
present a novel modification to a classical formulation for Prevention, vol. 40, no. 4, pp. 1611–1618, 2008.
resource allocation. Through extensive simulations, we show [11] A. Mukhopadhyay, Y. Vorobeychik, A. Dubey, and G. Biswas, “Prior-
how our pipeline results in a significant improvement. This itized allocation of emergency responders based on a continuous-time
incident prediction model,” in International Conference on Autonomous
open-source implementation can be used by other organiza- Agents and MultiAgent Systems, 2017, pp. 168–177.
tions looking for emergency response optimization. [12] A. Mukhopadhyay, G. Pettet, C. Samal, A. Dubey, and Y. Vorobeychik,
“An online decision-theoretic pipeline for responder dispatch,” in Inter-
VIII. ACKNOWLEDGEMENT national Conference on Cyber-Physical Systems, 2019, pp. 185–196.
We acknowledge the support from the National Sci- [13] G. Pettet, A. Mukhopadhyay, M. Kochenderfer, and A. Dubey, “Hierar-
chical planning for resource allocation in emergency response systems,”
ence Foundation (IIS-1814958), the Tennessee Department of arXiv preprint arXiv:2012.13300, 2020.
Transportation (TDOT) for funding the research, and Google [14] S. M. Vazirizade, A. Mukhopadhyay, G. Pettet, S. E. Said, H. Baroud,
for providing computational resources. and A. Dubey, “Learning incident prediction models over large ge-
ographical areas for emergency response systems,” arXiv preprint
R EFERENCES arXiv:2106.08307, 2021.
[15] “INRIX.” [Online]. Available: https://inrix.com/
[1] A. Mukhopadhyay, G. Pettet, S. Vazirizade, D. Lu, S. E. Said, A. Jaimes, [16] J. MacQueen et al., “Some methods for classification and analysis of
H. Baroud, Y. Vorobeychik, M. Kochenderfer, and A. Dubey, “A review multivariate observations,” in Proceedings of the fifth Berkeley sympo-
of incident prediction, resource allocation, and dispatch models for sium on mathematical statistics and probability, vol. 1, no. 14, 1967,
emergency management,” 2021. pp. 281–297.
[2] J. A. Deacon, C. V. Zegeer, and R. C. Deen, “Identification of hazardous [17] C. Chen, A. Liaw, and L. Breiman, “Using random forest to learn
rural highway locations,” Transportation Research Record, vol. 543, imbalanced data,” University of California, Berkeley, 2004.
1974. [18] M. Dzator and J. Dzator, “An effective heuristic for the p-median
[3] W. Cheng and S. P. Washington, “Experimental evaluation of hotspot problem with application to ambulance location,” Opsearch, vol. 50,
identification methods,” Accident Analysis & Prevention, vol. 37, no. 5, no. 1, pp. 60–74, 2013.
pp. 870–881, 2005. [19] L. Caccetta, M. Dzator et al., “Heuristic methods for locating emergency
[4] J. Eck, S. Chainey, J. Cameron, M. Leitner, and R. Wilson, “Mapping facilities,” in Proceedings Modsim. Citeseer, 2005.
crime: Understanding hot spots,” National Institute of Justice, Tech. [20] O. Kariv and S. L. Hakimi, “An algorithmic approach to network loca-
Rep., 2005. tion problems. i: The p-centers,” SIAM Journal on Applied Mathematics,
[5] W. Ackaah and M. Salifu, “Crash prediction model for two-lane rural vol. 37, no. 3, pp. 513–538, 1979.
highways in the ashanti region of ghana,” IATSS Research, vol. 35, no. 1, [21] M. S. Daskin, Network and discrete location: models, algorithms, and
pp. 34–40, 2011. applications. John Wiley & Sons, 1995.
[6] D. Lord, S. Washington, and J. N. Ivan, “Further notes on the appli-
cation of zero-inflated models in highway safety,” Accident Analysis &
Prevention, vol. 39, no. 1, pp. 53–57, 2007.
429

Learning Incident Prediction Models Over Large Geographical Areas For Emergency Response

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Learning Incident Prediction Models Over Large Geographical Areas For Emergency Response

Uploaded by

Copyright:

Available Formats

2021 IEEE International Conference on Smart Computing (SMARTCOMP)

Learning Incident Prediction Models Over Large

Sayyed Mohsen Vazirizade Ayan Mukhopadhyay Geoffrey Pettet

Said El Said Hiba Baroud Abhishek Dubey

978-1-6654-1252-0/21/$31.00 ©2021 IEEE 424

B. Allocation and Dispatch

You might also like