
Road Accidents Prediction and Classification

U. Vivek Krishna, S. Sudha Karan, S. Sanju, E. Vignesh and Dr. R. Kaladevi


Department of Computer Science and Engineering
Saveetha Engineering College, Chennai, Tamil Nadu
Affiliated to Anna University

Abstract: Road safety is a major concern due to the rapidly growing number of automobiles. Every year, 1.2 million people die in road accidents. Statistics show that around 22,000 people died in road accidents in Tamil Nadu between 2017 and 2020. According to 'Accidental Deaths & Suicides in India 2021', the report released by the National Crime Records Bureau (NCRB), Tamil Nadu ranks second in deaths caused by negligence on the road. The State witnessed 15,384 casualties in 14,747 accidents, ranking after Uttar Pradesh with 18,972 casualties in 18,228 accidents. In this project, machine learning algorithms such as decision tree and random forest are used to predict the severity of an accident occurring at a particular location and time. Factors such as speed limit, age, weather, vehicle type, light conditions and day of the week are used as parameters for training the model. This model can play an important role in the planning and management of traffic and could help reduce road accidents in the future.

Keywords: Road safety, casualties, machine learning, Random Forest, Decision Tree

CHAPTER - 1

INTRODUCTION

According to the death statistics released by the World Health Organization, the number of traffic accidents occurring annually in the world is alarming. Traffic accidents kill 1.2 million people and injure 50 million people each year; approximately 3,300 people are killed and 137,000 are injured every day. With direct economic losses of 43 billion dollars, the frequent occurrence of traffic accidents directly threatens human life and property.

Road accident prediction is one of the most important research areas in traffic safety. The occurrence of road traffic accidents is mainly affected by the geometric characteristics of the road, traffic flow, the characteristics of drivers and the road environment. Many studies have been conducted to predict accident frequencies and analyze the characteristics of traffic accidents, including studies on hazardous location/hot spot identification, accident injury-severity analysis, and accident duration analysis. Some studies focus on the mechanism of accidents. Other factors include the weather and light conditions of the road.

Traffic accident prediction will play an important role in the integrated planning and management of traffic. The reasons for the high randomness of traffic accidents include nonlinear elements such as people, cars, roads, climate and so on.

CHAPTER - 2

LITERATURE SURVEY

With the exponentially increasing number of vehicles, road safety is a matter of huge concern. Road accidents kill around 1.2 million people every year. Road crashes cost $518 billion globally, costing individual countries 1-2% of their economy. In 2021, a total of 15,384 people lost their lives in road accidents in Tamil Nadu, compared to 14,527 in 2020. Steps are being taken to combat this issue, but they have been ineffective.

T. Augustine and S. Shukla propose an accident prediction system that can help analyze potential safety issues and predict whether an accident will occur or not [1].

J. Tang et al. compared statistical methods and machine learning models for predicting incident clearance time. Deep learning models have also been employed by researchers to predict the duration of road incidents [2].

S. Seid and Pooja used an artificial neural network (ANN) to model the injury severity of traffic accidents by classifying it into five categories (no injury, possible injury, minor non-incapacitating injury, incapacitating injury and fatality) [3].

Lu Wenqi, Luo Dongyu and Yan Menghua developed a probabilistic model relating significant crash precursors to changes in crash potential [4].

Fu Huilin and Zhou Yucai argue that traditional linear analysis cannot reveal the real situation and that its prediction results are not satisfactory. They compare the traditional BP (back-propagation) network with their proposed solution [5].

The occurrence of road traffic accidents is mainly affected by the geometric characteristics of the road, traffic flow, the characteristics of drivers and the road environment [6].

Many studies have been conducted to predict accident frequencies and analyze the characteristics of traffic accidents, including studies on hazardous location/hot spot identification [7] and accident injury-severity analysis [8].
CHAPTER - 3

PROPOSED SYSTEM

In this project, decision tree and random forest algorithms are used. These algorithms can be used for both classification and regression. First, a raw dataset is obtained and pre-processing operations such as cleaning, transformation and reduction are performed. After pre-processing, the dataset is split into two sets: a training set and a testing set. After the split, the machine learning algorithms are applied and the trained model is obtained. The model is then evaluated, and the result of the analysis is reported in the form of accuracy. The process is shown step by step in Fig 3.1.

Fig 3.1: Proposed Methodology Diagram.

CHAPTER - 4

IMPLEMENTATION AND ANALYSIS

4.1 Data Set

Fig 4.1.1 shows a snippet of the dataset used for training the machine learning model.

Fig 4.1.1: Snippet of the dataset.

4.2 ALGORITHM

In this project we use two machine learning algorithms: Decision Tree and Random Forest.

4.2.1 DECISION TREE

Decision Tree is a supervised learning technique that can be used for both classification and regression problems, but it is mostly preferred for solving classification problems. It is a tree-structured classifier in which internal nodes represent the features of a dataset, branches represent the decision rules and each leaf node represents an outcome.

A decision tree has two types of nodes: decision nodes and leaf nodes. Decision nodes are used to make a decision and have multiple branches, whereas leaf nodes are the outputs of those decisions and do not contain any further branches. The decisions, or tests, are performed on the basis of the features of the given dataset.

4.2.2 RANDOM FOREST

Random Forest is a popular machine learning algorithm that belongs to the supervised learning technique. It can be used for both classification and regression problems in ML. It is based on the concept of ensemble learning, which is the process of combining multiple classifiers to solve a complex problem and to improve the performance of the model.
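As a rough illustration of the two algorithms described in this section, the following minimal sketch grows a small Gini-based classification tree and then combines several such trees, each trained on a bootstrap sample, by majority vote. It is not the implementation used in this project. All function names (gini, build_tree, build_forest and so on) are hypothetical, and the toy rows (speed limit in km/h and a darkness flag) are invented for the example rather than taken from the project dataset. The ensemble is plain bagging, a simplified stand-in for a random forest, since it does not sample features at each split.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the list is pure)."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels):
    """Most common label in the list."""
    return Counter(labels).most_common(1)[0][0]


def build_tree(rows, labels, depth=0, max_depth=3):
    """Return a leaf (a label) or a decision node (feature, threshold, left, right)."""
    if depth == max_depth or len(set(labels)) == 1:
        return majority(labels)  # leaf node: no further branches
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not right:
                continue  # this threshold does not split the data
            # Weighted Gini impurity of the two children; lower is better.
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return majority(labels)
    _, f, t, left, right = best
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left],
                       depth + 1, max_depth),
            build_tree([rows[i] for i in right], [labels[i] for i in right],
                       depth + 1, max_depth))


def predict_tree(node, row):
    """Follow decision nodes until a leaf label is reached."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def build_forest(rows, labels, n_trees=15, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in range(len(rows))]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Majority vote over the predictions of all trees."""
    return majority([predict_tree(tree, row) for tree in forest])


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)


# Invented toy data: (speed_limit_kmh, is_dark) -> accident severity.
train_rows = [(30, 0), (40, 1), (35, 0), (40, 0), (80, 1), (90, 0), (100, 1), (85, 1)]
train_labels = ["slight"] * 4 + ["severe"] * 4
test_rows = [(20, 1), (25, 0), (110, 0), (120, 1)]
test_labels = ["slight", "slight", "severe", "severe"]

tree = build_tree(train_rows, train_labels)
forest = build_forest(train_rows, train_labels)
tree_pred = [predict_tree(tree, r) for r in test_rows]
forest_pred = [predict_forest(forest, r) for r in test_rows]
print("decision tree accuracy:", accuracy(test_labels, tree_pred))
print("bagged trees accuracy:", accuracy(test_labels, forest_pred))
```

The toy data is deliberately easy to separate, so the printed accuracies say nothing about the figures reported for the real dataset. The sketch only shows how decision nodes, leaf nodes, bootstrap subsets and voting fit together.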
As the name suggests, "Random Forest is a classifier that contains a number of decision trees on various subsets of the given dataset and takes the average to improve the predictive accuracy of that dataset."

4.3 RESULT

The Decision Tree algorithm showed an accuracy of 66.67%, whereas the Random Forest algorithm showed an accuracy of 88.89%.

CHAPTER - 5

5.1 CONCLUSION

This project aims at using various machine learning classification techniques to predict the severity of an accident at any particular location. Machine learning has made it possible to analyze meaningful data and provide solutions with greater accuracy than humans can. A model has been built with an accuracy 17% greater than that of the conventional system. This project can be used by governments to prevent accidents.

5.2 FUTURE WORKS

In future, the effectiveness of the model in predicting and classifying road accidents will be established by using more machine learning algorithms. We plan to develop a fully-fledged web app for user and police interaction that can be published for use in real time. It can be used for Indian states or cities if proper accident data is provided by the Indian Government.

REFERENCES

[1] T. Augustine and S. Shukla, "Road Accident Prediction using Machine Learning Approaches," 2022 2nd International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE), 2022, pp. 808-811, doi: 10.1109/ICACITE53722.2022.9823499.
[2] J. Tang, L. Zheng, C. Han, W. Yin, Y. Zhang, Y. Zou, and H. Huang, "Statistical and machine-learning methods for clearance time prediction of road incidents: A methodology review," Anal. Methods Accident Res., vol. 27, Sep. 2020.
[3] S. Seid and Pooja, "Road accident data analysis: Data preprocessing for better model building," J. Comput. Theor. Nanosci., vol. 16, no. 9, pp. 4019-4027, Sep. 2019.
[4] Lu Wenqi, Luo Dongyu and Yan Menghua, "A Model of Traffic Accident Prediction," INSPEC Accession Number: 17239218, doi: 10.1109/ICITE.2017.8056908.
[5] Fu Huilin and Zhou Yucai, "The Traffic Accident Prediction Based on Neural Network," 2017.
[6] Thineswaran Gunasegaran and Yu-N Cheah, "Evolutionary Cross validation," INSPEC Accession Number: 17285520, doi: 10.1109/ICITECH.2017.8079960.
[7] Simon Bernard, Laurent Heutte and Sebastien Adam, "On the Selection of Decision Trees in Random Forests," INSPEC Accession Number: 10802866, doi: 10.1109/IJCNN.2009.5178693.
[8] Rafael G. Mantovani, Ricardo Cerri and Joaquin Vanschoren, "Hyper-parameter Tuning of a Decision Tree Induction Algorithm," INSPEC Accession Number: 16651860, doi: 10.1109/bracis.2016.018.
