
Analysis and predict the nature of road traffic accident using data mining
techniques in Maharashtra, India

Article · October 2017



International Journal of Engineering Technology Science and Research
IJETSR
www.ijetsr.com
ISSN 2394 – 3386
Volume 4, Issue 10
October 2017

Analysis and Predict the Nature of Road Traffic Accident Using Data Mining Techniques in Maharashtra, India

Baye Atnafu
M.Tech student, Dept of CS/IT
Symbiosis Institute of Technology
Pune, India

Gagandeep Kaur
Assistant Professor, Dept of CS/IT
Symbiosis Institute of Technology
Pune, India

Abstract—Traffic accidents are a leading cause of death and serious injury worldwide. India is among the emerging countries where the rate at which traffic accidents occur exceeds the critical limit. Everyone wants to avoid traffic accidents and stay safe. To that end, careful analysis of roadway traffic accident data is important for identifying the nature of the road traffic accidents that result in fatal, grievous injury, minor injury, and non-injury outcomes. A number of classification and association rule mining algorithms exist for this purpose. In the proposed method, the Random tree, J48, and Naïve Bayes algorithms, which showed better performance in previous studies, are selected and applied to a data set of road accidents in Maharashtra state, India. The results of the three algorithms are compared, and the prediction model is built using the algorithm that proves to be the best. Further, the Apriori association rule mining algorithm is applied to find the relationships between the independent variables with respect to the nature of accidents. This study also investigates the influence of other contributing factors (road-related, driver-related, and environment-related) on the likelihood of severe accidents in Maharashtra, India.

Keywords—road accident, data mining, random tree, J48, Naïve Bayes, association rule mining

I. INTRODUCTION
Every year, road traffic accidents kill 1.25 million people and leave 20-50 million with non-fatal injuries [1]. According to the road traffic accident data provided by the states, Maharashtra records the third highest number of fatal accidents (13,212) [2]. However, this trend can change in the future, because the rate at which road traffic accidents occur is hard to predict: an accident can occur in any situation. Therefore, we need to investigate the hidden patterns that influence traffic accident severity levels using data mining techniques.
A number of data mining classification algorithms are available (such as Random tree, J48, Random forest, CART, and Naïve Bayes). They predict the target class by analyzing the training data set to obtain boundary conditions that can be used to determine each target class. Once the boundary conditions are determined, the subsequent task is to predict the target class based on them.
A number of data mining algorithms are also available for finding associations between independent variables in large data sets. Association rule mining is one of the most popular methodologies for detecting significant associations between the data stored in a large database. Of the available association rule mining algorithms, Apriori, predictive Apriori, and FP-growth are the most common methods for finding the associations between the various factors that influence traffic accident severity levels in Maharashtra state, India.

I. RELATED WORK
Sachin Kumar & Durga Toshniwal [3] initially applied three popular classification algorithms, CART, Naïve Bayes, and SVM, to a powered two-wheeler (PTW) accident data set and compared the results. The accuracy of the CART classification algorithm was found to be superior to that of the other two algorithms. Hence they used CART to find the various factors that influence the severity of powered two-wheeler accidents in the entire state of Uttarakhand and its
13 districts separately [3]. The results show that each district has different factors associated with the severity of powered two-wheeler accidents.
Sachin Kumar & Durga Toshniwal [4] used the k-means clustering algorithm to identify high- and low-frequency accident locations. Further, they used association rule mining to identify the associations between the various factors related to road traffic accidents at locations with varying accident frequencies. The results show that more accidents occur on highways, that pedestrians are more vulnerable to accidents on roads that have intersections, that curves on roads bordered by agricultural land are risky for multi-vehicle crashes, and that intersections on roads near marketplaces are more vulnerable to severe accidents.
Wang, Yubian, and Wei Zhang [5] used a logistic regression model to isolate and quantify the impact of various roadway and environmental factors on traffic crash severity and to predict accident severity levels. The authors found that factors such as crash location, road function class, road alignment, light condition, road surface condition, and speed limit have a significant impact on crash severity. The results show that higher crash severity is linked with rural roadways, major arterials, locations without intersections, locations with curves, night-time, dry roadway conditions, and high speed limits.
L. Mussone, M. Bassani and P. Masci [6] evaluated the factors that affect accident severity levels at urban road intersections using a back propagation neural network and a generalized linear mixed model. Both methods demonstrate that traffic flows play a significant role in predicting severity; this role is not limited to the flow at the time of the crash, but also extends to the flow data before and after the crash.
Yina Wu, Mohamed Abdel-Aty & Jaeyoung Lee [7] used a logistic regression model to identify the various factors contributing to increased vehicle crash risk during fog and to investigate the situations in which crashes are more likely to occur. The analysis shows that drivers are more careful when fog is present and that the increase in crash risk is greater near ramp areas.
Sachin Kumar, Durga Toshniwal, Manoranjan Parida [8] used the Latent Class Clustering and k-modes clustering algorithms to form homogeneous clusters from heterogeneous road accident data. Further, the FP-growth algorithm was applied to the resulting clusters to determine which clustering algorithm performs better at reducing the heterogeneity of traffic accident data [8]. The results show that neither clustering algorithm is superior to the other; both perform well at reducing the heterogeneity of accident data [8].
Yannis George, Theofilatos Athanasios & Pispiringos George [9] investigated road accident severity per vehicle type using log-normal regression techniques. The results show that bad weather and night-time accidents increase accident severity. Furthermore, the authors concluded that crash type has a major impact on accident severity.
Hao, Wei, Camille Kamga, Xianfeng Yang, JiaQi Ma, Ellen Thorson, Ming Zhong, and Chaozhong Wu [10] explored the factors affecting the injury severity of truck drivers in the United States using an ordered probit regression model. The analysis shows that bad weather and poor visibility increase the chances of high injury severity for truck drivers.
Bhaven Naik, Li-Wei Tung, Shanshan Zhao, & Aemal J. Khattak [11] combined two diverse sources of data and applied a random parameters (mixed) ordered logit model to account for the heterogeneity in the data. The results showed that rain, air temperature, humidity, and wind speed were factors in injury severity. Of these factors, warmer air temperatures and rain were associated with more severe injuries, while higher levels of humidity were associated with less severe injuries.
Dursun Delen, Leman Tomak, Kazim Topuz & Enes Eryarsoy [12] focused on identifying the person-, vehicle-, and accident-related risk factors that significantly explain the variance in the injury severity levels sustained by a driver in a car crash. The authors used a number of predictive analytics algorithms to find the complex relationships between the various levels of injury severity and the crash-related risk factors. The authors also determined the importance of crash-related risk factors after
applying a systematic series of information fusion-based sensitivity analyses to the trained predictive models. The sensitivity analysis results show that the use of a restraint system (i.e., a seatbelt), the manner of the crash, and drug use are the main predictors of injury severity.
Liling Li, Sharad Shrestha & Gongzhu Hu [13] applied statistical analysis and data mining algorithms, namely Apriori rule mining, Naïve Bayes, and k-means clustering, to the FARS fatal accident data set in order to investigate the relationship between the fatality rate and other attributes such as weather condition, collision manner, light condition, drunk driving, and surface condition. The analysis shows that environmental factors such as roadway surface, weather, and light conditions do not strongly affect the fatality rate, while human factors such as whether the driver was drunk and the collision type have a stronger effect on it.
Hasan Mehdi Naqvi & Geetam Tiwari [14] used a binomial logistic regression model to identify the various causes that influence fatal motorcycle crashes. The study results show that, for the variable "collision type", rear-end, sideswipe, and head-on collisions are 42, 35, and 25 times more likely, respectively, than hit-pedestrian crashes; for the variable "number of vehicles", the probability of a fatal crash is higher in single-vehicle crashes than in crashes involving two or more vehicles; and for the variable "number of lanes", the probability of a fatal crash is higher on two-lane national highways than on four-lane national highways.

II. METHODOLOGY
1. CLASSIFICATION ALGORITHMS
Classification is a data analysis method used to construct models that predict future data. The choice of classification algorithm depends on the target variable. The target variable in this paper is a categorical variable with six possible outcomes (overturning, head-on collision, rear-end collision, right turn collision, left turn collision, and others). Accordingly, the problem is characterized as a nominal classification problem, and according to the existing literature there are various data mining techniques that can handle this type of classification problem, such as Random tree, J48, and Naïve Bayes.
a. RANDOM TREE
The random tree (RndTree) operator works just like the decision tree operator except that, for each split, only a random subset of attributes is available [16]. A RndTree is a tree drawn at random from the set of possible trees. In this context, "at random" means that every tree in the set has an equal chance of being selected.
b. J48
J48, WEKA's implementation of C4.5, is an advanced version of ID3; it decides the target value of a new test instance from the attribute values seen in the training data set [15]. The internal nodes of the decision tree represent the different attributes, while the branches represent the possible values of these attributes. The leaf nodes denote the values of the dependent variable [15]. Increasing the number of trees yields a stronger learner, just as a large and varied group is more capable of reaching a sound conclusion [17].
c. NAÏVE BAYES
The naïve Bayesian classifier is one of the most effective and widely used supervised learning algorithms for classifying road accident data. It is a statistical model that predicts class membership probabilities based on Bayes' theorem, under the assumption that each pair of variables is independent given the class.
2. ASSOCIATION RULE MINING ALGORITHMS
Association rule mining is one of the most popular methodologies for detecting significant associations between the data stored in a large database. Of the available association rule mining algorithms, Apriori, predictive Apriori, and FP-growth are the most commonly used in the area of road traffic accident analysis to generate the best rules that show the associations between the various attributes in large data sets.
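To make the notion of a significant association concrete, the sketch below computes support and confidence, the two standard measures used to rank association rules, on a handful of made-up accident records. The records, the attribute=value item encoding, and the function names are illustrative assumptions; none of them come from the study's data set.

```python
# Hypothetical accident records encoded as sets of "attribute=value" items.
transactions = [
    {"weather=fine", "road=straight", "severity=fatal"},
    {"weather=fine", "road=straight", "severity=fatal"},
    {"weather=fine", "road=curve", "severity=minor"},
    {"weather=rain", "road=straight", "severity=minor"},
]


def support(transactions, itemset):
    """Fraction of records that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent).

    Assumes the antecedent occurs in at least one record.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Rule under test: weather=fine -> severity=fatal
rule_support = support(transactions, {"weather=fine", "severity=fatal"})
rule_confidence = confidence(transactions, {"weather=fine"}, {"severity=fatal"})
```

A rule is normally reported only when both measures exceed thresholds chosen by the analyst.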

a. APRIORI ALGORITHM
The Apriori rule mining algorithm is the basic method of finding the frequent itemsets in a large database: it generates the possible combinations of items and then computes the support for each of them. However, the number of possible combinations increases exponentially as the number of items in the itemset increases, which makes this method impractical [19].
b. PREDICTIVE APRIORI ALGORITHM
The predictive Apriori algorithm is also used for discovering hidden and novel patterns in a large database. It differs from the Apriori algorithm in that the confidence and support measures are combined into a single measure called predictive accuracy [20].
c. FP-GROWTH ALGORITHM
The frequent-pattern growth (FP-growth) association rule algorithm is an enhanced version of the Apriori rule mining algorithm presented by Jiawei Han et al. [21]. It compresses the data set into an FP-tree, scans the database twice, does not produce candidate itemsets during rule mining, and greatly improves mining efficiency [22]. However, the FP-growth algorithm needs to build an FP-tree that contains the whole data set, and this FP-tree has a high memory requirement [23].
From these data mining algorithms, we chose the Random tree and J48 classification algorithms to predict the accident severity levels and to identify the various factors that influence them. Further, we chose the Apriori association rule mining algorithm to identify the associations between the various attributes. The authors selected these algorithms based on their performance in previous studies.
3. DATA MINING TOOLS
Data mining makes it possible to discover novel, previously unknown patterns using various open source data mining tools. Many tools are currently available for data mining, such as WEKA, RAPID MINER, R, and KNIME.
a. WEKA
WEKA, developed at the University of Waikato, is one of the most widely used tools for finding hidden patterns [25]. WEKA offers three ways to use the tool: the Java API, a GUI, and a command line interface (CLI). WEKA contains classification, clustering, and association rule mining algorithms as well as data preprocessing tools [25].
b. RAPID MINER
Rapid Miner is an open source data mining tool developed by Ingo Mierswa and Ralf Klinkenberg. Rapid Miner, also known as YALE (Yet Another Learning Environment), is based on XML and is used to implement numerous machine learning and data mining classification and clustering algorithms [25].
c. R
R is an open source data mining tool based on the C and FORTRAN programming languages, developed by Ross Ihaka and Robert Gentleman for statistical computing and graphics. R provides less support for data mining algorithms than Rapid Miner and WEKA, although it does implement a few of them [26].
d. KNIME
KNIME is an easy-to-operate data mining tool that provides a platform for data integration, data processing, data analysis, and exploration and that runs inside IBM's Eclipse [27]. KNIME is easy to extend with plugins [27].

Table I: Comparison of data mining tools used for road accident data analysis
No. | Criterion                | R                           | WEKA                        | RAPID MINER                     | KNIME
----+--------------------------+-----------------------------+-----------------------------+---------------------------------+------------
1   | Memory usage             | More memory                 | Less memory                 | More memory                     | -
2   | Language                 | C, Fortran and R            | Java                        | Language independent            | Java
3   | Speed                    | Works faster on any machine | Works faster on any machine | Requires more memory to operate | -
4   | Usage                    | Complicated to use          | Easiest to use              | Easy to use                     | Easy to use
5   | Interface type supported | CLI                         | GUI/CLI                     | GUI                             | GUI

Of these data mining tools, we chose WEKA, based on the above review, to analyze the road accident data and find the various factors that influence accident severity levels.
5. DATA PREPROCESSING
The data set used in this study was obtained from the National Highways Authority of India (NHAI) and covers historical accident data from September 2014 to July 2017 [28]. After preprocessing, the data set contains 19,166 accident records and 10 attributes. The details of the data set and its attributes and values are given in Table V. We used WEKA 3.8, selected on the basis of the features identified during the review, for classification, prediction, model evaluation, attribute selection, data cleaning, data integration, and management of the road accident data obtained from the National Highways Authority of India. Figure 1 shows the general block diagram of the proposed work. The first task is data preprocessing, which includes data cleaning, integration, transformation, and reduction. Once the data is preprocessed, the next step is to apply the data mining techniques to the data.

Figure 1: Block diagram of proposed work

Many data mining techniques are available, but the proposed method uses three classification algorithms (Random tree, J48, and Naïve Bayes) for prediction and the Apriori association rule mining algorithm to detect significant associations between the data stored in the large database. After applying these algorithms, the next step is to visualize the outcomes of the experiments. The detailed process of the above-mentioned tasks is shown in Figure 2.

Figure 2: Model building using classification algorithms

The above diagram shows each step followed in the study, from data collection to prediction of the nature of road traffic accidents. After the data set is collected from the National Highways Authority of India, a feature selection method is applied to select the desired attributes. The selected attributes are then checked for duplication, missing values, and outliers. After preprocessing, the data set is split into two sets: a training set and a testing set. The next step is to apply the classification algorithms to the data set and check whether all classifiers have been trained. Once all the classifiers are trained, they are tested and the results are generated. The nature of road traffic accidents is then predicted. Further, we applied the Apriori rule mining algorithm to discover the relationships between the various factors that frequently influence the nature of road traffic accidents. Finally, the results are interpreted for both the classification and the association rule mining techniques.
6. RESULT ANALYSIS
The results of our study include association rules among the variables and a classification of the nature of road traffic accidents as overturning, head-on collision, rear-end collision, right turn collision, or left turn collision. We used the WEKA data analysis tool to perform this analysis.
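The workflow described in the data preprocessing section (clean the records, split them into training and testing sets, train a classifier, then test it) can be sketched in a few lines. The study itself used WEKA 3.8; the Python below is only an illustration. The records, the attribute names, the 75/25 split, and the use of Laplace smoothing in the Naïve Bayes classifier are all assumptions made for the example.

```python
import math
import random
from collections import Counter, defaultdict


def clean(records):
    """Drop records with missing values (None) and exact duplicates."""
    seen, kept = set(), []
    for rec in records:
        key = tuple(sorted(rec.items(), key=lambda kv: kv[0]))
        if None in rec.values() or key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept


def split(rows, test_fraction, seed):
    """Shuffle reproducibly and split into (training, testing) lists."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]


def train_nb(rows, target):
    """Count classes and per-class attribute values for Naive Bayes."""
    class_counts = Counter(r[target] for r in rows)
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    domains = defaultdict(set)           # attribute -> values seen in training
    for r in rows:
        for attr, value in r.items():
            if attr != target:
                value_counts[(attr, r[target])][value] += 1
                domains[attr].add(value)
    return class_counts, value_counts, domains


def predict_nb(model, rec, target):
    """Return the class with the highest (log) posterior score."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())

    def log_score(cls):
        score = math.log(class_counts[cls] / total)
        for attr, value in rec.items():
            if attr == target or attr not in domains:
                continue
            # Laplace smoothing keeps unseen values from zeroing the product.
            num = value_counts[(attr, cls)][value] + 1
            den = class_counts[cls] + len(domains[attr])
            score += math.log(num / den)
        return score

    return max(class_counts, key=log_score)


# Hypothetical records; the real data set has 19,166 rows and 10 attributes.
records = [
    {"road": "straight", "weather": "fine", "nature": "rear-end"},
    {"road": "straight", "weather": "fine", "nature": "rear-end"},   # duplicate
    {"road": "straight", "weather": "rain", "nature": "rear-end"},
    {"road": "curve", "weather": "fine", "nature": "overturning"},
    {"road": "curve", "weather": "rain", "nature": "overturning"},
    {"road": "curve", "weather": None, "nature": "overturning"},     # missing
]
rows = clean(records)
train, test = split(rows, test_fraction=0.25, seed=0)
model = train_nb(train, "nature")
correct = sum(predict_nb(model, r, "nature") == r["nature"] for r in test)
test_accuracy = correct / len(test)
```

With so few rows the held-out accuracy is not meaningful; the point is only the order of the steps.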

Figure 3 shows how the nature of road traffic accidents influences the accident severity levels (fatal, severe injury, moderate injury, and non-injury). The key to the nature of road traffic accidents is given in Table II. Unsurprisingly, rear-end collisions account for the highest number of accidents across the severity levels, because the rear-end collision is the most common type of accident resulting in fatal and non-fatal injuries.
a. CLASSIFICATION RESULTS
The Random tree classifier was built on the preprocessed data of 19,166 records; 19,096 instances were correctly classified, giving a 98.3% accuracy rate.
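For readers unfamiliar with the random tree classifier, the following is a minimal sketch of the idea described in the methodology: an ordinary decision tree in which each split may only choose from a random subset of the attributes. It is not WEKA's RandomTree implementation. The Gini impurity criterion, the subset size `k`, and the toy records are assumptions made for illustration.

```python
import random
from collections import Counter


def majority(rows):
    """Most common class label among (record, label) pairs."""
    return Counter(label for _, label in rows).most_common(1)[0][0]


def gini(rows):
    """Gini impurity of the class labels in `rows`."""
    n = len(rows)
    counts = Counter(label for _, label in rows)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def partition(rows, attr):
    """Group rows by their value for `attr`."""
    groups = {}
    for rec, label in rows:
        groups.setdefault(rec[attr], []).append((rec, label))
    return groups


def build_random_tree(rows, attrs, k, rng):
    """Grow a tree; each split considers only k randomly chosen attributes."""
    if len({label for _, label in rows}) == 1 or not attrs:
        return majority(rows)  # leaf: a class label
    candidates = rng.sample(attrs, min(k, len(attrs)))

    def weighted_gini(attr):
        groups = partition(rows, attr).values()
        return sum(len(g) / len(rows) * gini(g) for g in groups)

    best = min(candidates, key=weighted_gini)
    remaining = [a for a in attrs if a != best]
    return {
        "attr": best,
        "default": majority(rows),  # used for values unseen in training
        "children": {
            value: build_random_tree(group, remaining, k, rng)
            for value, group in partition(rows, best).items()
        },
    }


def predict_tree(tree, record):
    """Follow branches until a leaf (class label) is reached."""
    while isinstance(tree, dict):
        tree = tree["children"].get(record[tree["attr"]], tree["default"])
    return tree


# Hypothetical training records: (attributes, nature of accident).
rows = [
    ({"road": "straight", "weather": "fine", "lanes": "four"}, "rear-end"),
    ({"road": "straight", "weather": "rain", "lanes": "four"}, "rear-end"),
    ({"road": "curve", "weather": "fine", "lanes": "two"}, "overturning"),
    ({"road": "curve", "weather": "rain", "lanes": "two"}, "overturning"),
    ({"road": "straight", "weather": "fine", "lanes": "two"}, "head-on"),
    ({"road": "straight", "weather": "rain", "lanes": "two"}, "head-on"),
]
tree = build_random_tree(rows, ["road", "weather", "lanes"], k=2,
                         rng=random.Random(0))
```

Because the attribute subset is random, different seeds can give differently shaped trees that still fit the same training records.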
Figure 3: The influence of the nature of accidents on accident severity levels (x-axis: nature of accidents, codes 1-7)
The various evaluation measures for the J48 classifier are given in Table II.

The J48 classifier was also built on the preprocessed data of 19,166 records; 18,681 instances were correctly classified, giving a 97.5% accuracy rate. This shows that the Random tree classifier performs better than J48 at predicting the nature of road traffic accidents. For this reason, the prediction model was built using the Random tree classifier. The various evaluation measures for the J48 classifier are given in Table III.
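As an arithmetic check on the figures quoted above, accuracy is the number of correctly classified instances divided by the total number of instances. The sketch below recomputes it from the reported counts; the function name is illustrative. The J48 counts reproduce the stated 97.5%, whereas the Random tree counts give roughly 99.6% rather than the stated 98.3%, so one of the two reported Random tree figures may contain a transcription error. The text does not say which.

```python
def accuracy_percent(correct, total):
    """Percentage of instances that were classified correctly."""
    return 100.0 * correct / total


TOTAL_RECORDS = 19166  # preprocessed records, as reported

random_tree_accuracy = accuracy_percent(19096, TOTAL_RECORDS)
j48_accuracy = accuracy_percent(18681, TOTAL_RECORDS)
```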

b. APRIORI ASSOCIATION RULE MINING RESULTS
The Apriori association rule mining algorithm was applied to the preprocessed data of 19,166 records with 11 attributes. The best 10 rules produced by the Apriori algorithm in WEKA are shown in Table IV. We can see that "Straight road" for the variable road condition, "Four or more lanes with central divider" for the variable road feature, "Unmanned rail crossing" for the variable road intersection and control type, and "Fine" for the variable weather condition are the most influential factors for fatal and serious injury accident severity.
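The level-wise search that Apriori performs can be sketched as follows. This is a minimal illustration rather than WEKA's implementation. The transactions, the thresholds, and the function names are assumptions, and the toy records do not reproduce the rules reported in the study.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frequent itemset: support}, found level by level."""
    n = len(transactions)

    def sup(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items if sup(frozenset([i])) >= min_support}
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = sup(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        level = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
            and sup(c) >= min_support
        }
    return frequent


def rules(frequent, min_confidence):
    """Yield (antecedent, consequent, support, confidence) tuples."""
    found = []
    for itemset, s in frequent.items():
        for size in range(1, len(itemset)):
            for ante in combinations(itemset, size):
                ante = frozenset(ante)
                conf = s / frequent[ante]  # subsets of frequent sets are frequent
                if conf >= min_confidence:
                    found.append((ante, itemset - ante, s, conf))
    return found


# Hypothetical records encoded as sets of "attribute=value" items.
transactions = [
    {"road=straight", "weather=fine", "severity=fatal"},
    {"road=straight", "weather=fine", "severity=fatal"},
    {"road=straight", "weather=rain", "severity=minor"},
    {"road=curve", "weather=fine", "severity=minor"},
]
frequent = apriori(transactions, min_support=0.5)
strong_rules = rules(frequent, min_confidence=0.9)
```

The prune step is what separates Apriori from brute-force enumeration: a candidate is counted against the data only if all of its smaller subsets were already found to be frequent.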

Table IV: Best 10 rules generated by Apriori association rule mining algorithms.

Table V: Description of data set and its attributes with value.
Attribute                     | Number of distinct values | Attribute value                                                                             | Code
------------------------------+---------------------------+---------------------------------------------------------------------------------------------+-----
Nature of Accident            | 7                         | Overturning                                                                                 | 1
                              |                           | Head on collision                                                                           | 2
                              |                           | Rear end collision                                                                          | 3
                              |                           | Collision brush or side swipe                                                               | 4
                              |                           | Left turn collision                                                                         | 5
                              |                           | Right turn collision                                                                        | 6
                              |                           | Skidding                                                                                    | 7
Classification of Accident    | 4                         | Fatal                                                                                       | 1
                              |                           | Grievous injury                                                                             | 2
                              |                           | Minor injury                                                                                | 3
                              |                           | Non injured                                                                                 | 4
Causes of accident            | 5                         | Drunk                                                                                       | 1
                              |                           | Over-speeding                                                                               | 2
                              |                           | Vehicle out of control                                                                      | 3
                              |                           | Fault of driver of motor vehicle / driver of other vehicle / cyclist / pedestrian / passenger | 4
                              |                           | Defect in mechanical condition of motor vehicle / road condition                            | 5
Road Feature                  | 4                         | Single lane                                                                                 | 1
                              |                           | Two lanes                                                                                   | 2
                              |                           | Three or more lanes without central divider                                                 | 3
                              |                           | Four or more lanes with central divider                                                     | 4
Road Condition                | 8                         | Straight road                                                                               | 1
                              |                           | Slight curve                                                                                | 2
                              |                           | Sharp curve                                                                                 | 3
                              |                           | Flat road                                                                                   | 4
                              |                           | Gentle incline                                                                              | 5
                              |                           | Steep incline                                                                               | 6
Road Condition (continued)    |                           | Hump                                                                                        | 7
                              |                           | Dip                                                                                         | 8
Weather Condition             | 12                        | Fine                                                                                        | 1
                              |                           | Mist/fog                                                                                    | 2
                              |                           | Cloud                                                                                       | 3
                              |                           | Light rain                                                                                  | 4
                              |                           | Heavy rain                                                                                  | 5
                              |                           | Hail/sleet                                                                                  | 6
                              |                           | Snow                                                                                        | 7
                              |                           | Strong wind                                                                                 | 8
                              |                           | Dust storm                                                                                  | 9
                              |                           | Very hot                                                                                    | 10
                              |                           | Very cold                                                                                   | 11
                              |                           | Other weather condition                                                                     | 12
Intersection Type and Control | 8                         | T-junction                                                                                  | 1
                              |                           | Y-junction                                                                                  | 2
                              |                           | Four arm junction                                                                           | 3
                              |                           | Staggered junction                                                                          | 4
                              |                           | Junction with more than four arms                                                           | 5
                              |                           | Roundabout junction                                                                         | 6
                              |                           | Manned rail crossing                                                                        | 7
                              |                           | Unmanned rail crossing                                                                      | 8
Time of day                   | 6                         | 7:00am - 10:59am                                                                            | T1
                              |                           | 11:00am - 2:59pm                                                                            | T2
                              |                           | 3:00pm - 6:59pm                                                                             | T3
                              |                           | 7:00pm - 10:59pm                                                                            | T4
                              |                           | 11:00pm - 2:59am                                                                            | T5
                              |                           | 3:00am - 6:59am                                                                             | T6
Date                          | Many                      | 09-09-2015                                                                                  | 09-09-2015
Season                        | 4                         | Jan - Feb                                                                                   | Winter
                              |                           | Mar - May                                                                                   | Summer
                              |                           | Jun - Sept                                                                                  | Rainy
                              |                           | Oct - Dec                                                                                   | Monsoon

III. CONCLUSION
This paper, "Analysis and Predict the Nature of Road Traffic Accident Using Data Mining Techniques in Maharashtra, India", discusses the latest work in the field of road accident analysis and prediction. Road traffic accident severity keeps changing over time and continues to increase. This changing and increasing severity leads to problems in understanding the nature of accidents and the factors influencing them, and in properly managing the large volumes of data obtained from various sources. Many researchers have tried to solve these issues, but there are still gaps in predicting the nature of road accidents and in finding contributory factors such
as the season and time in which accidents frequently occur. This leads to challenges in the field of accident analysis and prediction. Some of these challenges include modeling accidents to find suitable algorithms for detecting accident severity levels, data preparation, transformation, and processing time. Therefore, to fill some of these gaps, we were motivated to study road traffic accident data to find the influence of the nature of road accidents on accident severity levels in Maharashtra, India. As seen in the classification and Apriori association rule mining results, road condition, weather condition, road feature, and road intersection type and control strongly affect the nature of the accidents that lead to fatal and serious injury severity.

IV. REFERENCES
1. WHO. Road traffic safety. Available from http://www.who.int/mediacentre/factsheets/fs358/en/ Accessed on 21 September 2017.
2. 400 road deaths per day in India; up 5% to 1.46 lakh in 2015. Available from http://timesofindia.indiatimes.com/lifestyle/health-fitness/health-news/400-road deaths-per-day-in-India-up-5-to-1-46-lakh-in-2015/articleshow/51920988.cms Accessed on 21 September 2017.
3. Kumar, S., & Toshniwal, D. (2017). Severity analysis of powered two-wheeler traffic accidents in Uttarakhand, India. European Transport Research Review, 9(2), 24.
4. Kumar, S., & Toshniwal, D. (2016). A data mining approach to characterize road accident locations. Journal of Modern Transportation, 24(1), 62-72.
5. Wang, Y., & Zhang, W. (2017). Analysis of Roadway and Environmental Factors Affecting Traffic Crash Severities. Transportation Research Procedia, 25, 2124-2130.
6. Mussone, L., Bassani, M., & Masci, P. (2017). Analysis of factors affecting the severity of crashes in urban road intersections. Accident Analysis & Prevention, 103, 112-122.
7. Wu, Y., Abdel-Aty, M., & Lee, J. (2017). Crash risk analysis during fog conditions using real-time traffic data. Accident Analysis & Prevention.
8. Sachin Kumar, Durga Toshniwal, Manoranjan Parida. "A comparative analysis of heterogeneity in road accident data using data mining techniques", Evolving Systems, 2016.
9. George, Y., Athanasios, T., & George, P. (2017). Investigation of road accident severity per vehicle type. Transportation Research Procedia, 25, 2081-2088.
10. Hao, W., Kamga, C., Yang, X., Ma, J., Thorson, E., Zhong, M., & Wu, C. (2016). Driver injury severity study for truck involved accidents at highway-rail grade crossings in the United States. Transportation research part F: traffic psychology and behavior, 43, 379-386.
11. Naik, B., Tung, L. W., Zhao, S., & Khattak, A. J. (2016). Weather impacts on single-vehicle truck crash injury severity. Journal of safety research, 58, 57-65.
12. Delen, D., Tomak, L., Topuz, K., & Eryarsoy, E. (2017). Investigating injury severity risk factors in automobile crashes with predictive analytics and sensitivity analysis methods. Journal of Transport & Health, 4, 118-131.
13. Li, L., Shrestha, S., & Hu, G. (2017, June). Analysis of road traffic fatal accidents using data mining techniques. In Software Engineering Research, Management and Applications (SERA), 2017 IEEE 15th International Conference on (pp. 363-370). IEEE.
14. Naqvi, H. M., & Tiwari, G. (2017). Factors Contributing to Motorcycle Fatal Crashes on National Highways in India. Transportation Research Procedia, 25, 2089-2102.
15. Classification, prediction, and clustering. Available from http://shareengineer.blogspot.in/2012/09/classification-and-clustering.html Accessed on 22 September 2017.
16. Sewaiwar, Purva, and Kamal Kant Verma. "Comparative Study of Various Decision Tree Classification Algorithm Using WEKA." International Journal of Emerging Research in Management & Technology 4 (2015): 2278-9359.
17. Rapidminer. Random tree. Available from https://docs.rapidminer.com/studio/operators/modeling/predictive/trees/random_tree.html Accessed on 22 September 2017.
18. Zhao, Yongheng, and Yanxia Zhang. "Comparison of decision tree methods for finding active objects." Advances in Space Research 41.12 (2008): 1955-1959.
19. Lai, K., & Cerpa, N. (2001, October). Support vs. confidence in association rule algorithms. In Proceedings of the OPTIMA Conference, Curicó.
20. Aher, S. B., & Lobo, L. M. R. J. (2012). A comparative study of association rule algorithms for course recommender system in e-learning. International Journal of Computer Applications, 39(1), 48-52.
21. L. Zhichun and Y. Fengxin, “An improved frequent pattern tree growth algorithm,” Applied Science and Technology, vol. 35, no. 6, pp. 47–51, 2008.
22. C. Jun and G. Li, “An improved FP-growth algorithm based on item head table node,” Information Technology, vol. 12, pp. 34–35, 2013.
23. B. Zheng and J. Li, “An improved algorithm based on FP-growth,” Journal of Pingdingshan Institute of Technology, vol. 17, no. 4, pp. 9–12, 2008.
24. S.K. David, Amr T.M. Saeb, K.A. Rubeaan, Comparative Analysis of Data Mining Tools and Classification Techniques using WEKA in Medical Bioinformatics, Computer Engineering and Intelligent System, 4(13), 2013.
25. Rangra et al. “Comparative Study of Data Mining Tools”. In Proceedings of International Conference on Advanced Research in Computer Science and Software Engineering, vol. 4, no. 6, pp. 216-223, June 2014.
26. Bhinge, A. V. (2015). A Comparative Study on Data Mining Tools (Doctoral dissertation, California State University, Sacramento).
27. Patel, P. S., & Desai, S. G. (2015). A Comparative Study on Data Mining Tools. International Journal of Advanced Trends in Computer Science and Engineering, 4(2).
28. National Highway Authority of India. Regional Office – Mumbai. Available from http://nhai.org.in Accessed on July 30, 2017.
