Professional Documents
Culture Documents
Hwang2018 PDF
Hwang2018 PDF
velocity
Time(sec) Intake Air Flow
Rate json_string
Engine Coolant
WebAPI: Data Retrieve and Preprocessing
Temperature
Engine Load
json_array json_array json_array
Fuel
Driver
Consumption Road Traffic Vehicle Health
Rate Behavior
Monitor Care
Battery Voltage Analysis
sample/sec
Output Data
Driver Behavior: Defensive, Weak
Defensive, Weak Aggressive,
Aggressive, …
(a) With redundant zero velocities
json_array
Road Traffic Monitor: Traffic Jam,
Continuous curve, Uphill,
Downhill, …
Vehicle Health Care: Malfunctions
of engine, brake, lamps, …
Cloud Computing
Platform
velocity
Fig. 1 Abstract system architecture of driver behavior analysis
637
furious acceleration or deceleration are further counted by the standardized moment defined by Eq. 2.
defined thresholds. Then, the acceleration parameter and
frequencies of furious acceleration or deceleration are
appended into tuple of Pk to create Qk as the output of data (2)
transformation process. Where X is a random parameter vector, is the mean, and is
the variance.
The skewness is also the unitless value called the
skewness coefficient. Referring to the categories illustrated in
Fig. 7. A skewness coefficient equal to 0 indicates a symmetric
distribution of the probability density function centered on the
mean. A skewness coefficient less than 0 is called a negative
skewness which probability density function tilted to the right.
A skewness coefficient greater than 0 is called positive
skewness which probability density function tilted to the left.
Latitude Longitude Speed RPM Airflow Engintemp
Load Fuel Volt In reference to acceleration-based driving behavior analysis,
(a) Chi-squared (χ2) test method the counts of acceleration and deceleration are more frequently
when traffic jam. The probability density function tilted to the
right.
s>0 s<0
s=0
X X X
Latitude Longitude Speed RPM Airflow Engintemp Fig. 7 Categories of skewness coefficient
Load Fuel Volt
(b) Pearson's correlation coefficient method There are 100 logged route datasets of metropolitan bus
Fig. 5 Data selection are selected to as inputs of data mining process. The decision
tree classifier in scikit-learn package has been adopted to
3.5 Data mining and knowledge evaluation/display realize the function of data mining process based on the
In the data mining process, taking the velocity and kurtosis and skewness of velocity as well as acceleration. Fig.
acceleration in Qk as the input parameters, the decision tree 8 is a trained decision tree by 60 kurtosis coefficients of
method with statistical kurtosis and skewness are used to velocity and acceleration from 100 logged dataset. The left
analyze the driver behavior. The analyzed qualitative driver subtree from root represents lower velocity. The left most leaf
behavior classes are defined semantic labels in knowledge represents the lower velocity and acceleration which is labeled
evaluation/display process. The kurtosis measures the defensive. On the other hand, the right most leaf represents the
tailedness of a pdf (probability density function) distribution, higher velocity and acceleration which is labeled aggressive.
which is a fourth standardized moment defined by Eq. 1. Referring to Fig. 8, other two leaves are labeled weak defensive
and aggressive, respectively. After that, the remainder 40
(1) kurtosis coefficients are used to evaluate the error rate of
trained decision tree. It is about 5.8%. Fig. 9 is a trained
Where X is a random parameter vector, is the mean, and is
decision tree by 60 skewness coefficients of velocity and
the variance.
acceleration from 100 logged dataset. Its leaves are labeled
A kurtosis is a value without unit. It is also called a
defensive, weak defensive, weak aggressive, and aggressive
kurtosis coefficient. Referring to Fig. 6, a kurtosis coefficient
from right to left. The evaluated error rate is about 2.8%.
equal to 0 is defined as a normal kurtic, a less than 0 is called a
platy kurtic, and a greater than 0 is called a kurtosis letpo kurtic.
In reference to velocity-based driving behavior analysis, can be
used to distinguish between highway driving (letpo kurtic) or
Lower velocity
urban driving (platy kurtic)
638
transformation process. Furthermore, statistical kurtosis and
skewness of velocity and acceleration are also derived in data
transformation process that prepared for data mining process.
In this paper, the decision tree classifier of Scikit-learn
have been used for data mining based on the kurtosis and
skewness of velocity and acceleration from metropolitan bus
route datasets. Significantly, there are four categories being
preset as the results of decision tree classifier. Finally, the
aggressive weak weak defensive
aggressive defensive
knowledge evaluation/display process mainly summarizes the
classification rules from the completed decision tree classifier,
Fig. 9 Trained decision tree based on skewness coefficients of as well as interpret driver behavior category semantics to
velocity and acceleration defensive, weak defensive, weak aggressive, and aggressive to
complete the overall processes of driver behavior analysis. The
IV. CONCLUSION research will continue to involve more parameters to improve
The analysis of driver behavior has been accomplished the driving behavior classification models that hopes to further
through the iterative processes of data cleaning, data optimize the performance of the driving behavior model to
integration, data selection, data transformation, data mining, enhance the classification of driving behavior degree.
and knowledge evaluation/display based on the datasets of
cloud computing platform. Moreover, the iterative processes ACKNOWLEDGMENT
have been implemented by Python based on Scikit-learn class This study is conducted under the “Project for 4G Advanced
package as well as the datasets have recorded the dynamical Business and video services platform” of the Institute for
vehicle parameters of every routes from OBD port including Information Industry which is subsidized by the Ministry of
velocity, engine RPM, intake air flow rate, engine coolant Economic Affairs of the Republic of China.
temperature, engine load, fuel consumption rate, and battery
voltage. The route dataset is downloaded from cloud REFERENCES
computing platform accordance with the authorized user key
and route key. A Kalman filter and redundant zero remover [1] Hyun-Jeong, Y., Shin-Kyung, L., Oh-Cheon, K., "Vehicle-
have been realized to remove noisy and redundant data in the generated data exchange protocol for remote OBD
downloaded route dataset in data clean process to avoid inspection and maintenance," 6th International Computer
interference with driver behavior analysis. Data integration Sciences and Convergence Information Technology
process binds the individual vehicle parameter arrays into a (ICCIT 2011), 81-84, 2011.
matrix for the successive steps of driver behavior analysis. [2] Chipan Hwang, Mu-Song Chen, and Hsuan-Fu Wang,
Then, Data selection process uses the chi-squared (χ2) "Development of Car Cloud Sensor Network Application
test and Pearson's correlation coefficient methods to rank the Platform," 30th IEEE International Conference on
vehicle parameters in the matrix that designed to reduce the Advanced Information Networking and Applications
complexity of data mining. The results of data selection process (AINA-2016), pp. 975-978, Mar. 23-25, 2016, Switzerland
reflects the velocity, engine RPM, and fuel consumption rate [3] Scikit-Learn, http://scikit-learn.org/stable/, July 2017.
are the highest ranks, respectively. Acceleration is another [4] Jiawei Han, Micheline Kamber, and Jian Pei, Data Mining:
important data for driver behavior analysis which are Concepts and Techniques, 3e, Morgan Kaufmann, 2011.
transformed from the differential of vehicle velocity in data
639