
KNUST

PREDICTING THE SEVERITY OF ROAD ACCIDENTS IN GHANA USING MACHINE LEARNING ALGORITHMS
Reindorf Nartey Borkor1,† , Salomey Osei1 , Bernard Opoku2 , Kwame Owusu Edusei2,3 , Prince Mensah4,∗ , Miheso Juanita Afi4,5 ,
Lowor Divine Adjogli5,6 and Effah Ebenezer Danso6,7
† Department of Mathematics, Kwame Nkrumah University of Science and Technology (KNUST), Kumasi, Ghana.

∗ Corresponding author: Dr. Reindorf Nartey Borkor


Department of Mathematics
KNUST, Ghana
Email: reinbork@gmail.com.

Abstract

In Ghana, traffic safety has long been a major concern for sustainable transportation development, and predicting the severity of traffic accidents remains an important and challenging problem. Predicting crash injury severity is an important part of reducing the consequences of traffic crashes. This study developed machine learning (ML) models, namely Random Forest (RF), Logistic Regression (LR) and Artificial Neural Network (ANN), to predict crash injury severity from several crash-related parameters. The input parameters mainly include vehicle attributes, road condition and light condition attributes. The study employed the crash database of Ghana for the years 1998–2011. The performance of the algorithms was measured and compared based on accident severity prediction accuracy, precision, recall, F1-score, Receiver Operating Characteristic (ROC) scores, and the confusion matrix, while the relevance of the feature attributes was determined using a feature selection technique. The RF and ANN classifiers performed beyond the acceptable threshold of 70% for precision, recall, F1-score and accuracy. The RF classifier achieved the highest accuracy, 87.97%. The ANN classifier was the second-best performer, followed by the LR classifier, with overall accuracies of 70.80% and 48.68% respectively. The study has demonstrated the potential of ML as a reliable accident forecasting technique, based on predictive performance and accuracy. The findings are expected to be useful in establishing or improving an effective traffic safety system within a sustainable transportation system, which is critical in assisting government managers in developing timely, proactive traffic accident prevention strategies and effectively improving road traffic safety.

Keywords: machine learning; logistic regression model; random forest model; artificial neural network; accident severity; feature importance.

Background

Road traffic accidents are one of the leading causes of injuries and deaths. More than 1.2 million people die each year on the world's roads, according to the World Health Organization (2015).

In Ghana, studies have shown that road traffic crashes, most of which occur in urban areas, are one of the leading causes of death and injury. Over the last decade, 72 people per 100,000 have experienced a serious bodily injury, 2,080 people have died in traffic accidents, and over 8% of the population has died as a result of traffic accidents Blankson and Lartey (2020). According to the Statista Research Department, there were almost 12,100 road traffic accidents in Ghana from January to October 2020, involving over 20,400 vehicles.

Furthermore, the collisions resulted in 2,080 deaths and 12,380 injuries. According to the source, more males than females were involved in traffic accidents in 2016. The source further indicated that buses and minibuses were the leading vehicles involved in accidents of this nature, after cars. In Ghana, road accidents are still a major public safety concern. Traffic crashes incur enormous expenses to people and society, including productivity loss, medical bills, legal and judicial costs, emergency costs, insurance fees, property damage, congestion costs, and employment loss Blincoe et al. (2015).

Road traffic accidents affect a huge number of countries each year, including Ghana. Several factors influence road accidents in Ghana, which motivates predicting the severity of these incidents. Statistical modeling techniques have traditionally been used to forecast crashes and categorize their severity Savolainen et al. (2011), Kidando et al. (2019). However, estimating the severity of a road traffic accident using statistical modeling techniques is not very accurate Wahab and Jiang (2019). For example, the assumptions about the data distribution and about a linear relationship between explanatory and dependent variables can be untrue and lead to inaccurate inferences. An approach based on supervised learning (machine learning and deep learning) is therefore proposed to improve the performance of accident severity prediction and to overcome such limitations.

The National Road Safety Commission (NRSC) and the Motor Traffic and Transport Unit (MTTU) have taken many measures and have made significant commitments to improve travel safety. However, a traffic accident may occur at any time and in any
2 Journal of ...

location. Drivers, on the other hand, might be given important information to help them prevent or lessen their chances of being involved in an accident. For preventing and reducing the incidence of traffic accidents, forecasting and identifying associated components under varied situations are critical. As a result, traffic accident prediction models have been developed to disclose the important factors that influence traffic accidents so that traffic safety can be enhanced.

The rest of this paper is organized as follows: Chapter 2 introduces some previous studies which are closely related to traffic accident severity prediction. Chapter 3 identifies the key crash types and investigates the impacts of risk factors on different types of crashes using the data resources obtained from the Motor Traffic and Transport Unit. The proposed model for predicting crash severity is also introduced there. Chapter 4 presents results and discussion on the proposed model and compares it with some other models. Chapter 5 outlines the main conclusions and explains the limitations and recommendations for future study.

Literature Reviews

Road safety managers and researchers have looked at a variety of strategies and data to improve road traffic safety. Effective road safety management demands a thorough understanding of the factors that contribute to road traffic accidents. Over the years, experts have worked hard to uncover some of the elements that influence the severity and frequency of accidents. This chapter summarizes the adopted models, such as statistical models, machine learning models, and the progress of deep learning, as well as accident severity prediction.

Statistical Model in Accident Severity Prediction

According to previous studies, the most prevalent methods of linear and nonlinear regression analysis utilized for traffic severity prediction include linear regression modeling, logistic regression modeling, and negative binomial regression. A statistical model is a mathematical model that encapsulates a set of statistical assumptions about the generation of sample data, with the modeling processes largely depicted in idealized form Cox (2006). These methods are limited in their applicability, because some are not always viable, such as when the outcome is discrete, and they also demand strong assumptions about the data distribution James and Kim (1996).

Models such as simple multiple linear regression and negative binomial regression were some of the early models of traffic accident prediction, based on an approach that assumes a normal distribution of errors. The general form of the linear accident prediction model can be expressed as follows:

$$y \mid \theta \sim \mathrm{Dist}(\theta) \quad \text{with} \quad \theta = f(X, \beta, \epsilon)$$

where,
$y$: the response variable (i.e. accident frequency)
$\theta$: the accident dataset
$\mathrm{Dist}(\theta)$: the model distribution
$X$: the vector of explanatory variables
$\beta$: the vector of regression coefficients

Machine Learning Model in Accident Severity Prediction

Machine learning models have been employed for traffic accident severity prediction and have proved to have some advantages over statistical models.

ML can model the non-linear relationship that exists between the target variable and the related explanatory variables Assi et al. (2020). Statistical models rest on assumptions about the relationship between the explanatory variables and the target variable, and the model fails if those assumptions are violated, whereas machine learning methods do not depend on such inherent assumptions. Machine learning models can help model the complexity between the explanatory variables and are able to capture non-linear relationships, which can be difficult to achieve using statistical models Chang (2005). The studies of Lord and Mannering (2010) and Mannering and Bhat (2014) indicate that the growth of transportation research could be greatly elevated by new dataset resources provided by the rise of current technologies.

Logistic Regression (LR), Random Forest (RF), K Nearest Neighbor (KNN), Support Vector Machine (SVM), and Decision Tree models are among the most widely employed models in studies examining the advantage of ML models over statistical models in predicting crash severity Delen et al. (2017).

Neural Network Model in Accident Severity Prediction

Neural networks have been utilized in the past as computer-based models for knowledge processing and prediction in a variety of domains Mussone et al. (1999).

Neural networks have been successfully used to learn and memorize feature datasets, analyze data, and draw comparisons between new and old data Ertugrul and Hizal (2005), and to learn the dynamics of non-linear systems without any form of mathematical modeling Singh and Deo (2007). Many studies have employed neural network models to predict traffic accidents and severity in transportation research Alkheder et al. (2017), and these models have demonstrated high accuracy in predicting accident severity as compared to statistical models Abdel-Aty and Abdelwahab (2004).

A comprehensive study conducted by Chang (2005) compared a 3-layer Artificial Neural Network (ANN) model with the Negative Binomial model for the prediction of crash frequency. The results indicated that the ANN model performed better than the Negative Binomial model. Research conducted by Xie et al. (2007) compared the Back Propagation Neural Network (BPNN), the Bayesian Neural Network (BNN) and the Negative Binomial model for predicting the frequency of traffic accidents on rural roads. The study indicated that both neural network models had better prediction performance than the Negative Binomial model. Convolutional Neural Networks (CNN) can examine spatial information and have been extensively employed in image classification problems Wenqi et al. (2017). The study of Ren et al. (2018) used a traffic accident dataset and a GPS dataset to construct a deep learning model in order to better understand the relationship between human movement and traffic accidents. Given real-time GPS data, the model could assess the likelihood of accidents and their position on a map.

While the many studies and the significant contribution of the surveyed models to road safety should not be overlooked, it is very important to undertake different types of accident prediction so that specific countermeasures can be made.
Borkor et al. 3

Methodology to predict β in a logistic regression model in order to maximize


the likelihood of reproducing the data given the estimates of
In this section, we review the Logistic Regression (LR) Random
the parameters. The null hypothesis for full models shows that
forest (RF), and Artificial Neural Network (ANN) and discuss
all available number of β S are zero. If the null hypothesis is
their application for predicting traffic accident severity. The
rejected, then it means at least one β is not zero, which indicates
models were implemented using the python programming lan-
that the logistic regression model predicts the probability of the
guage.
outcome better than the mean of the response variables.
Final interpretations of the outcomes are evaluated using the
Logistic Regression
odds ratio of the predictors Peng et al. (2002). The odds ratio
Logistic regression is a model employed to examine and to fit defined as a measure of the association between an exposure
accident systems. This technique has been used to model prob- and an output Szumilas (2010), as derived from exp( β); if an
abilistic systems and as well as to predict future events. This explanatory variable experience a unit increase whiles other fac-
model is a direct probability model that has no requirements on tors remain constant, then we say the odds ratio has increased
the distributions of the independent variables or predictors Har- by a factor of exp () β). An odds ratio greater than 1 or less than
rell Jr (2015). In this study, the model is used to predict the 1 indicates exposure associated with higher or lower odds of
severity of traffic accidents. It finds the model that best de- output for a unit increase in the explanatory variable. A confi-
scribes the relationship between a response variable and a set of dence interval of 95% of the odds ratio is mostly employed to
explanatory variables. evaluate the result; when the confidence interval is low it repre-
The logistic regression model has been used over the years sents a lower level of precision. It can be used as a proxy to find
to analyze binary dependent variable. The dependent variable statistical significance if the confidence interval does not include
used in logistic regression model takes the form of true/false an odds ratio of 1 in the interval. The odds ratio can be used
(1/0), where ‘1’ usually represents true, and ‘0′ represents false. to compare levels of individual explanatory variables Szumilas
The true/false form can be converted to match any binary re- (2010) Hosmer Jr et al. (2013).
sponse Abdel-Aty and Abdelwahab (2004). Generally, the linear
model assumes that responses and the loss terms are normally
Random Forest
Gaussian distribution, and the outcomes are independent Hilbe
(2011). When binary data are modeled using this technique, the The Random forests method was first proposed by Ho (1995).
first two assumptions are violated because the binary response He discovered that if forests of trees splitting with oblique hy-
variable is a derivative of the Bernoulli distribution, whiles nor- perplanes are arbitrarily constrained to be sensitive to only a few
mal regression is centered entirely on the Gaussian probability feature dimensions, they can gain accuracy as they grow without
distribution function. overtraining, as long as the forests are randomly restricted to
The natural logarithm or logit of an odds ratio, is the basic be sensitive to only selected feature dimensions. The model is a
mathematical concept about the logistic regression. The logistic widely used extension of bagged trees Lee et al. (2020), it’s able
technique predicts the logit of Y from X. The odds can be ex- to make predictions after it has been trained on a dataset.
pressed as the ratios of probabilities (π) of True events of Y to The random forest model evaluates each class separately and
probabilities of false events of Y. The simple logistic regression compiles the prediction that receives the most votes as the cho-
model can be expressed in the form: sen prediction. Initially, each tree in the classifiers takes data
from samples in the dataset and then selects features at random
to employ in the tree’s growth at each node. The bagging tech-
π
logit(Y ) = ln( ) = β 0 + β 1 x1 + β 2 x2 + β n x n = β 0 + ∑ β i x i nique is used in the random forest’s training algorithm. Bagging,
1−π also known as bootstrap aggregation, is a technique for reduc-
(1)
where; ing variances in estimated prediction functions that works well
π : is the probability of True or success for high-variance, low-bias techniques like trees Breiman (2001).
x . . . xn : denotes explanatory variables in the model The bagging technique fits many trees by using several random
β : denotes the regression coefficient for each variable. Once both sampling of records in the dataset. The training examples used
sides of the equation are changed with antilog, the expression in this study were K1 = ( xn , yn ) = ( x1 , y1 ), . . . , Kn . An input
now the the form: value x was then applied to the prediction problem and a func-
tion θ̂ = t( x; K1∗ , . . . , Kn∗ ) termed as the base learner is utilized. If
e β 0 + β 1 x1 + β 2 x2 + β n x n t( x; Ki ) is a decision tree trained on dataset, then the predicted
π = P (Y = 1 ) = (2)
1 + e β 0 + β 1 x1 + β 2 x2 + β n x n output of the tree with input x would be the quantity θ ( x ). In
applying the bagging technique, it is required to help stabilize
where:
the base learner t by resampling the training data. The bagged
π : is the probability of or True success (Y = 1). Even though
version of the base learner is defined below:
Equation 1 shows a linear relationship between logit (Y ) and
( X ), Equation (2) shows that the relationship between Y and θ̂ ∞ ( x ) = E∗ [t( x; K1∗ , . . . , Kn∗ )] (3)
X is nonlinear. Hence, the natural log transformation of the
odds must be employed to make a linear relationship between where:
categorical response variables and predictor variables. The role Ki∗ series represents sampling with replacement, which is drawn
of the β coefficient is to interpret the direction of the relationship independently from the original dataset when applying the boot-
that exit between X and logit (Y ). A bigger β value (ie.β > 0) strap method.
indicates that the large logit (Y ) is associated with large X values E∗ is the expectation obtained in terms of the bootstrap measure.
and vice versa. On the other hand, small ( β < 0) indicates Because we are focused on evaluating exact values, the monte
that small logit (Y ) is associated with bigger X values and vice Carlo bagged estimator was adopted. The concepts of Monte
versa. The maximum likelihood technique is mostly deployed Carlo algorithms was provided by Brassard and Bratley (1988),
4 Journal of ...

the algorithm terminates answers which may be occasionally


incorrect and avoids the output of two distinct correct answers
for the same problem Esposito and Saitta (2004). In the equation
bellow;
1 B ∗
B b∑
θ̂ B ( x ) = tb ( x ) (4)
=1
where:
t∗b ( x ) = t( x; Kb1
∗ , . . . , K∗
bn
B represents the number of trees grown.
Kbi∗ represents an element i in the bth bootstrap sample.
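The Monte Carlo bagged estimator of Equation (4) can be illustrated with a short, standard-library Python sketch. This is not the study's implementation: the base learner here is a 1-nearest-neighbour rule swapped in for a decision tree to keep the example short, and the toy data, the numeric severity scores and the names `base_learner`, `bootstrap` and `bagged_predict` are all hypothetical.

```python
import random
from statistics import mean


def base_learner(x, sample):
    """t(x; K): return the label of the training point nearest to x.

    A 1-nearest-neighbour rule stands in for a decision tree (assumption).
    """
    nearest = min(sample, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def bootstrap(data, rng):
    """Draw K*_b1, ..., K*_bn: n records sampled with replacement."""
    return [rng.choice(data) for _ in data]


def bagged_predict(x, data, n_trees=200, seed=0):
    """Equation (4): average the base learner over B bootstrap samples."""
    rng = random.Random(seed)
    return mean(base_learner(x, bootstrap(data, rng)) for _ in range(n_trees))


# Hypothetical toy data: (feature value, numeric severity score) pairs.
data = [(0.0, 0), (1.0, 0), (2.0, 1), (3.0, 1), (4.0, 2), (5.0, 2)]
estimate = bagged_predict(2.4, data)
```

Each bootstrap sample may omit the point nearest to the query, so individual base learners disagree; averaging over `n_trees` samples is what smooths the prediction. For a classifier, the average would typically be replaced by a majority vote.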

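To make Equations (1) and (2) of the Logistic Regression subsection concrete, the following sketch evaluates the logit, the predicted probability and an odds ratio for a two-variable model. The intercept and coefficients are invented for illustration and are not estimates from the Ghana crash data; the function names are likewise hypothetical.

```python
import math


def predict_probability(x, intercept, coefs):
    """Equation (2): pi = e^z / (1 + e^z), with z the linear predictor."""
    z = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))  # algebraically equal to e^z / (1 + e^z)


def logit(p):
    """Equation (1): natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


def odds(p):
    return p / (1.0 - p)


# Hypothetical coefficients (beta_0, beta_1, beta_2), for illustration only.
intercept, coefs = -1.0, [0.8, -0.5]

p_before = predict_probability([1.0, 2.0], intercept, coefs)
p_after = predict_probability([2.0, 2.0], intercept, coefs)  # x1 raised by one unit

# Raising x1 by one unit multiplies the odds by exp(beta_1).
odds_ratio = odds(p_after) / odds(p_before)
```

The probabilities change nonlinearly with `x1`, but the logit changes by exactly `coefs[0]`, which is why the odds ratio equals `exp(0.8)` regardless of the starting point.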
Artificial Neural Network (ANN)

The first step toward artificial neural networks came in 1943 when Warren McCulloch, a neurophysiologist, and a young mathematician, Walter Pitts, wrote a paper on how neurons might work. They modeled a simple neural network with electrical circuits. The ANN is based on the structure and functions of biological neural networks and works in a way similar to how the human brain processes information. Figure ?? illustrates the structure of the ANN model. An ANN includes a large number of connected processing units that work together to process information. It learns from examples and experience and then applies what it has learned to test data.

There are several kinds of ANNs, but the most widely used is the multilayer perceptron (MLP), which consists of an input layer, hidden layers and an output layer, each of which has nodes and activation functions Delen et al. (2006). This study uses a neural network with MLP architecture, which was designed and implemented using the Python programming language. The Perceptron, proposed by Frank Rosenblatt in 1958, accepts inputs, moderates them with certain weight values, then applies the transformation function to output the final result. The structure of the Perceptron is presented in Figure 1. It consists of inputs ($x_i$), weights ($w_i$), a summing function ($\sum$), an activation function $\phi(\cdot)$, a bias ($b_k$), and an output ($y_k$). This can be expressed mathematically as:

$$y_k = \phi\left(\sum_{i=1}^{N} w_i x_i + b_k\right)$$

Figure 1 Structure of a Perceptron. Pal and Mitra (1992)

In the Perceptron structure represented in Figure 1, $x_1, x_2, \ldots, x_N$ denote the model inputs. The inputs are multiplied by the weights ($w_i$), and the net input is obtained by summing with the biases, which vary between −1 and +1. The role of the activation function is to introduce non-linearity into the neural network; this is useful because it helps the neural network to learn more complex information. For each layer in the network, the number of neurons and the preferred activation function must be specified. The rectified linear activation unit (ReLU), sigmoid, hyperbolic tangent (tanh) and softmax are some of the most widely used activation functions. The tanh activation function is employed mainly for binary classification Sharma (2019). However, when handling complex and high-dimensional datasets, ReLU has proved to be faster and more effective than tanh and sigmoid Liu et al. (2021). The softmax activation function is mostly used in the output layer for multiclass classification Sharma (2019). In this study, the target variable is multiclass and our dataset is high-dimensional, so we adopted the ReLU activation function for the input and hidden layers, while softmax was used as the activation function for the output layer. It is desirable to have a continuous and differentiable activation function Lee et al. (2020). The ReLU activation function is a piecewise linear function that outputs the input directly if it is positive and outputs zero otherwise. It is defined as follows:

$$f(x) = \max(0, x)$$

The neural network is trained using backpropagation (BP) in a Python package. Backpropagation is a method employed to calculate the gradient of the loss function with respect to the weights in the network. It is a widely used method for optimizing the network's performance by adjusting the weights.

Classifiers Evaluation

The machine learning algorithms utilised in this study were evaluated according to their predictive strength. To that end, we calculated the algorithms' Precision, Recall, F1-score, and overall Accuracy. Precision is the proportion of instances assigned to a class by the classifier that actually belong to that class. Recall is the proportion of the test instances of a class that were correctly identified by the classifier. The F1-score, or F-measure, is the harmonic mean of Precision and Recall. Precision, Recall and the F1-score are given in Equations (5) to (7), where $tp$, $fp$ and $fn$ denote the numbers of true positives, false positives and false negatives.

$$\text{Precision} = \frac{tp}{tp + fp} \qquad (5)$$

$$\text{Recall} = \frac{tp}{tp + fn} \qquad (6)$$

$$\text{F1 Score} = \frac{2 \cdot (\text{Recall} \cdot \text{Precision})}{\text{Recall} + \text{Precision}} \qquad (7)$$

Additionally, we computed the Area Under the Receiver Operating Characteristic curve (AROC), the training and validation curves and the confusion matrix of the algorithms.

The confusion matrix, also called the error matrix, is a tabular representation of the performance of an algorithm. It shows the number of instances that were correctly predicted and incorrectly predicted. Figures 2 to 4 show the confusion matrix for each of the algorithms. In the confusion matrix tables, the predicted class is on the rows while the actual class is on the columns. The ROC curve, which depicts the predictive ability of each algorithm, typically features the true positive rate (sensitivity) on the y-axis and the false positive rate (1 − specificity) on the x-axis. Figures 6 to 9 show the predictive strength of the various algorithms.
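Equations (5) to (7) can be evaluated directly from a confusion matrix. The sketch below follows the layout stated above (predicted class on the rows, actual class on the columns) and computes per-class precision, recall and F1 plus overall accuracy. The counts are invented for illustration and are not the study's results; `class_metrics` and `accuracy` are hypothetical helper names.

```python
def class_metrics(matrix, k):
    """Precision, recall and F1 for class k.

    matrix[i][j] counts instances predicted as class i whose actual class is j.
    """
    tp = matrix[k][k]
    fp = sum(matrix[k]) - tp                  # predicted k, actually another class
    fn = sum(row[k] for row in matrix) - tp   # actually k, predicted another class
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(matrix):
    """Share of all instances that lie on the diagonal."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical counts for the three severity classes.
labels = ["fatal", "serious", "slight"]
matrix = [
    [50, 10, 5],   # predicted fatal
    [8, 60, 12],   # predicted serious
    [2, 10, 43],   # predicted slight
]

report = {label: class_metrics(matrix, k) for k, label in enumerate(labels)}
overall = accuracy(matrix)
```

In a multiclass setting such as this one, the per-class values are usually combined into a single figure by averaging them, for example weighted by class frequency; the paper does not state which averaging was used.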

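As an illustration of the perceptron computation and of the ReLU and softmax choices described in the Artificial Neural Network subsection, the following sketch runs one forward pass through a tiny network with a ReLU hidden layer and a three-class softmax output. The layer sizes, weights and input are hypothetical, and training by backpropagation is not shown.

```python
import math


def relu(values):
    """f(x) = max(0, x), applied element-wise."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into class probabilities that sum to 1."""
    shift = max(values)                      # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Weighted sum plus bias for every neuron in a layer."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, params):
    """One hidden ReLU layer followed by a softmax output layer."""
    hidden = relu(dense(inputs, params["W1"], params["b1"]))
    return softmax(dense(hidden, params["W2"], params["b2"]))


# Hypothetical parameters: 3 inputs, 2 hidden neurons, 3 severity classes.
params = {
    "W1": [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]],
    "b1": [0.0, -0.1],
    "W2": [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]],
    "b2": [0.0, 0.0, 0.0],
}

probs = forward([1.0, 0.0, 2.0], params)
predicted_class = probs.index(max(probs))
```

The predicted class is simply the index of the largest softmax probability, which is how a multiclass output layer of this kind is usually read.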
Figure 2 Confusion Matrix of Random Forest Algorithm

Figure 3 Confusion Matrix of Logistic Regression Algorithm

Figure 4 Confusion Matrix of Artificial Neural Network Algorithm

Figure 5 The Correlation Matrix of all Features
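A correlation matrix such as the one in Figure 5 is the raw material for the correlation-based feature selection (CFS) used in this study. As a rough illustration, the sketch below computes Pearson correlations and the subset merit heuristic commonly attributed to Hall (1999), $k\,\bar{r}_{cf} / \sqrt{k + k(k-1)\,\bar{r}_{ff}}$, where $\bar{r}_{cf}$ is the mean feature-class correlation and $\bar{r}_{ff}$ the mean feature-feature correlation. The paper does not state the exact formula or encoding it used, so the formula choice, the integer coding of the variables and all data values here are assumptions.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def cfs_merit(features, target, subset):
    """Merit of a feature subset: high class correlation, low redundancy."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[name], target)) for name in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = (
        sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
        if pairs
        else 0.0
    )
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Hypothetical, integer-coded toy data (not the Ghana crash records).
target = [0, 0, 1, 1, 2, 2, 1, 0]                 # accident severity
features = {
    "light": [0, 1, 1, 2, 2, 3, 1, 0],
    "light_copy": [0, 1, 1, 2, 2, 3, 1, 0],       # fully redundant duplicate
    "surface": [1, 0, 1, 0, 1, 0, 1, 0],
}

merit_single = cfs_merit(features, target, ["light"])
merit_with_copy = cfs_merit(features, target, ["light", "light_copy"])
```

Adding the duplicated column leaves the merit unchanged, which is the behaviour the heuristic is designed for: a redundant feature contributes no new information and so is not rewarded.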


Feature Selection

We deploy this technique to select the features that contribute most to the prediction performance of our models. Redundant or irrelevant features in the dataset can decrease the model's accuracy and make the model learn from unnecessary features. In building a machine learning model, not all the features present in a dataset are suitable. A general dataset usually contains three kinds of features: relevant features, irrelevant features and redundant features. Only the relevant features enhance the efficiency and effectiveness of the learning algorithm. Dimensionality errors often occur in our algorithms since we are not always informed about which features would be effective for our prediction. It is therefore necessary to select relevant features that are suitable for the efficiency of the learning algorithm, particularly when handling complex datasets. Several methods have been deployed for feature selection in different fields Cai et al. (2018), Ou et al. (2017), Hall (1999). In this work, we adopted correlation-based feature selection (CFS). CFS ranks attributes according to a heuristic evaluation function based on correlations. The function examines subsets made of feature vectors which are correlated with the labeled class but independent of each other. The CFS technique assumes that unwanted features show a low correlation with the class and therefore should be ignored by the algorithm. It also screens for redundant features, as they are usually strongly correlated with one or more of the other features Czarnowski et al. (2018). Figure 5 depicts the correlation of the features. For the models proposed to classify accident severity, the features considered most suitable in our work are: road type, age of driver, crash type, road class, weather conditions, light conditions, road surface conditions, and the number of vehicles involved in the crash.

Results and Discussion

In this section, the results of the models developed to predict traffic accident severity are discussed. The response variable "Accident Severity" was classified as fatal, serious, or slight, and 18 independent variables were used to predict the severity of a road traffic accident, including road type, road class, number of vehicles, light conditions, weather conditions, road surface conditions, number of casualties, crash type, and so on. Finally, the respective performance measures and methodologies are examined, interpreted, and compared to demonstrate performance accuracy. In comparing the Random Forest, Logistic Regression and Artificial Neural Network models for accident severity prediction, on average the proposed methods achieved a good performance. The models were able to classify almost all the severity classes correctly. However, the performance varies from model to model. The overall accuracy and the performance measures of the three models are listed in Table 1 and Table 2.

The results in Table 1 and Table 2 indicate that the Random Forest model achieved the best prediction performance, with an overall accuracy of 87.97% and an F1 score of 87.88%, followed by the Artificial Neural Network (with an overall accuracy of 70.80% and an F1 score of 70.41%) and the Logistic Regression model (with an overall accuracy of 48.68% and an F1 score of 47.78%). Logistic Regression is considered one of the simplest machine learning models and is easy to implement, yet it was inferior across the entire prediction task, which shows that the simplest model cannot accurately capture the complex relationships that exist between risk factors. Although the overall accuracy of the ANN model was good, it was less impressive than that of the Random Forest and still has room for improvement. Thus, it is worth noticing that both models performed well in terms of sensing the differences that exist between the classes.

Figure 6 Receiver operating characteristics of the Random Forest Algorithm

Figure 7 Receiver operating characteristics of the Logistic Regression Algorithm

Figure 8 Training and Validation loss of the Artificial Neural Network

Figure 9 Training and validation of the Artificial Neural Network

Conclusion

This study focused on deploying machine learning algorithms to predict the severity of road traffic accidents. The study has shown that ML can generate accurate predictions of accident severity and provide potentially meaningful information. We identified the RF algorithm as the best algorithm for predicting the severity of road accidents when compared with LR and ANN. It is hoped that the findings of this study will guide the selection of an appropriate ML algorithm for the prediction of accident severity, and help decision-makers, transportation engineers and other road safety agencies to make well-informed and better decisions to improve road traffic safety in Ghana. Moreover, strategic crash forecasting would be extremely important for improving travel safety, so an interesting future study would be to develop visualization techniques by feeding our model with real-time data and forecasting the probabilities of a road traffic accident in real time while progressively showing potential hotspots or areas under different conditions.

Acknowledgments

We fully recognize with appreciation, the tremendous help and incredible support from ...

Funding

This study was supported by ...

Conflicts of interest

The authors declare no conflict of interest.

References

Abdel-Aty MA, Abdelwahab HT. 2004. Predicting injury severity levels in traffic crashes: a modeling comparison. Journal of Transportation Engineering. 130:204–210.
Alkheder S, Taamneh M, Taamneh S. 2017. Severity prediction of traffic accident using an artificial neural network. Journal of Forecasting. 36:100–108.
Assi K, Rahman SM, Mansoor U, Ratrout N. 2020. Predicting crash injury severity with machine learning algorithm synergized with clustering technique: A promising protocol. International Journal of Environmental Research and Public Health. 17:5497.
Blankson PK, Lartey M. 2020. Road traffic accidents in Ghana: contributing factors and economic consequences. Ghana Medical Journal. 54:131–131.
Blincoe L, Miller TR, Zaloshnja E, Lawrence BA. 2015. The economic and societal impact of motor vehicle crashes, 2010 (revised). Technical report.
Brassard G, Bratley P. 1988. Algorithmics: theory & practice. Prentice-Hall, Inc.
Breiman L. 2001. Random forests. Machine Learning. 45:5–32.
Cai J, Luo J, Wang S, Yang S. 2018. Feature selection in machine learning: A new perspective. Neurocomputing. 300:70–79.
Chang LY. 2005. Analysis of freeway accident frequencies: negative binomial regression versus artificial neural network. Safety Science. 43:541–557.
Cox DR. 2006. Principles of statistical inference. Cambridge University Press.
Czarnowski I, Jedrzejowicz P, Chao KM, Yildirim T. 2018. Overcoming "big data" barriers in machine learning techniques for the real-life applications.
Delen D, Sharda R, Bessonov M. 2006. Identifying significant predictors of injury severity in traffic accidents using a series of artificial neural networks. Accident Analysis & Prevention. 38:434–444.
Delen D, Tomak L, Topuz K, Eryarsoy E. 2017. Investigating injury severity risk factors in automobile crashes with predictive analytics and sensitivity analysis methods. Journal of Transport & Health. 4:118–131.
Ertugrul S, Hizal NA. 2005. Neuro-fuzzy controller design via modeling human operator actions. Journal of Intelligent & Fuzzy Systems. 16:133–140.
Esposito R, Saitta L. 2004. A Monte Carlo analysis of ensemble classification. In: . p. 34.
Hall MA. 1999. Correlation-based feature selection for machine learning.
Harrell Jr FE. 2015. Regression modeling strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. Springer.
Hilbe JM. 2011. Logistic regression. International Encyclopedia of Statistical Science. 1:15–32.
Ho TK. 1995. Random decision forests. In: . volume 1. pp. 278–282. IEEE.
Hosmer Jr DW, Lemeshow S, Sturdivant RX. 2013. Applied logistic regression. volume 398. John Wiley & Sons.
James JL, Kim KE. 1996. Restraint use by children involved in crashes in Hawaii, 1986–1991. Transportation Research Record. 1560:8–12.
Kidando E, Moses R, Sando T. 2019. Bayesian regression approach to estimate speed threshold under uncertainty for traffic breakdown event identification. Journal of Transportation Engineering, Part A: Systems. 145:04019013.
Lee J, Yoon T, Kwon S, Lee J. 2020. Model evaluation for forecasting traffic accident severity in rainy seasons using machine learning algorithms: Seoul city study. Applied Sciences. 10:129.
Liu YC, Ma CY, He Z, Kuo CW, Chen K, Zhang P, Wu B, Kira Z, Vajda P. 2021. Unbiased teacher for semi-supervised object detection. arXiv preprint arXiv:2102.09480.
Lord D, Mannering F. 2010. The statistical analysis of crash-frequency data: A review and assessment of methodological alternatives. Transportation Research Part A: Policy and Practice. 44:291–305.
Mannering FL, Bhat CR. 2014. Analytic methods in accident research: Methodological frontier and future directions. Analytic Methods in Accident Research. 1:1–22.
Mussone L, Ferrari A, Oneta M. 1999. An analysis of urban collisions using an artificial intelligence model. Accident Analysis & Prevention. 31:705–718.
Organization WH. 2015. Global status report on road safety 2015. World Health Organization.
Ou J, Xia J, Wu YJ, Rao W. 2017. Short-term traffic flow forecasting for urban roads using data-driven feature selection strategy and bias-corrected random forests. Transportation Research Record. 2645:157–167.
Pal SK, Mitra S. 1992. Multilayer perceptron, fuzzy sets, classification.
Peng CYJ et al. 2002. An introduction to logistic regression analysis and reporting 96 j. Educ. Res. 3.
Ren H, Song Y, Wang J, Hu Y, Lei J. 2018. A deep learning approach to the citywide traffic accident risk prediction. In: . pp. 3346–3351. IEEE.
Savolainen PT, Mannering FL, Lord D, Quddus MA. 2011. The statistical analysis of highway crash-injury severities: a review and assessment of methodological alternatives. Accident Analysis & Prevention. 43:1666–1676.
Sharma S. 2019. Towards Data and Model Confidentiality in Outsourced Machine Learning. Ph.D. thesis. Wright State University.
Singh P, Deo M. 2007. Suitability of different neural networks in daily flow forecasting. Applied Soft Computing. 7:968–978.
Szumilas M. 2010. Explaining odds ratios. Journal of the Canadian Academy of Child and Adolescent Psychiatry. 19:227.
Wahab L, Jiang H. 2019. A comparative study on machine learning based algorithms for prediction of motorcycle crash severity. PLoS ONE. 14:1–20.
Wenqi L, Dongyu L, Menghua Y. 2017. A model of traffic accident prediction based on convolutional neural network. In: . pp. 198–202. IEEE.
Xie Y, Lord D, Zhang Y. 2007. Predicting motor vehicle collisions using Bayesian neural network models: An empirical analysis. Accident Analysis & Prevention. 39:922–933.

Table 1 Performance Accuracy for all Classifiers

Model                        Training Accuracy (%)    Testing Accuracy (%)
Random Forest                92.27                    87.97
Artificial Neural Network    70.41                    70.80
Logistic Regression          48.74                    48.68

Table 2 Performance measure for all Classifiers

Model                        Precision (%)    Recall (%)    F1 Score (%)
Random Forest                87.90            87.97         87.88
Artificial Neural Network    70.31            70.80         70.41
Logistic Regression          47.84            48.66         47.78
