IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 21, NO. 10, OCTOBER 2020

A Comprehensive Review of Driver Behavior Analysis Utilizing Smartphones

Teck Kai Chan, Cheng Siong Chin, Senior Member, IEEE, Hao Chen, Member, IEEE, and Xionghu Zhong, Member, IEEE
Abstract— Human factors are the primary catalyst for traffic accidents. Among different factors, fatigue, distraction, drunkenness, and/or recklessness are the most common types of abnormal driving behavior that lead to an accident. With technological advances, modern smartphones have the capabilities for driving behavior analysis. There has not yet been a comprehensive review on methodologies utilizing only a smartphone for drowsiness detection and abnormal driver behavior detection. In this paper, different methodologies proposed by different authors are discussed, including the sensing schemes, detection algorithms, and their corresponding accuracy and limitations. Challenges and possible solutions such as integration of the smartphone behavior classification system with the concepts of context-awareness, mobile crowdsensing, and active steering control are analyzed. The issue of model training and updating on the smartphone and in the cloud environment is also included.

Index Terms— Driving behavior analysis, detection algorithms, drowsiness detection, abnormal driving detection, smartphone telematics solution.

Manuscript received March 25, 2019; revised June 30, 2019 and September 1, 2019; accepted September 6, 2019. Date of publication September 19, 2019; date of current version October 2, 2020. The Associate Editor for this article was M. Brackstone. (Corresponding author: Cheng Siong Chin.)
T. K. Chan and C. S. Chin are with the Faculty of Science, Agriculture and Engineering, Newcastle University in Singapore, Singapore 599493 (e-mail: t.k.chan2@newcastle.ac.uk; cheng.chin@newcastle.ac.uk).
H. Chen and X. Zhong are with the College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China (e-mail: haochen@hnu.edu.cn; xzhong@hnu.edu.cn).
Digital Object Identifier 10.1109/TITS.2019.2940481
1524-9050 © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

Technological advances in automobiles have brought humans great benefits and convenience with a high level of mobility. However, such inventions cause several social and environmental problems such as traffic congestion, traffic accidents, and environmental pollution [1]. Undeniably, the loss of life or even injuries caused by traffic accidents are the most undesirable of these problems. Yet, it has been discovered that the main catalyst of traffic accidents is human factors [2]–[4].

A driver's behavior can be defined as the way a driver responds to his or her existing driving state (e.g., the vehicle's speed or distance to the vehicle in front) by performing a certain action (e.g., accelerate or steer) [5]. However, drivers differ in how they operate the accelerator and brake pedals, how they turn the steering wheel, and the distance they keep when following a vehicle [6]. Furthermore, such behavior is also susceptible to external factors, as shown in Table I [7]–[14].

TABLE I. FACTORS INFLUENCING DRIVING BEHAVIOUR

Among the different factors, it was found that fatigue, distraction, drunkenness, and/or recklessness are the most common reasons for abnormal driving leading to accidents [15], [16]. On the other hand, studies have also shown that if a passenger can provide a timely warning to the driver, the driver is less likely to cause an injury-related collision [17]. Hence, it is critical for active systems such as the Advanced Driver Assistance System (ADAS) or Intelligent Driver Assistance Systems (IDAS) to provide accurate and timely warnings to the driver in the event of potential danger, so that the driver has ample time to escape from an unwanted situation.

To enable such a warning system, assistance systems have to categorize an unsafe driver or detect abnormal driving patterns. When a driver is drowsy, distracted, reckless, or drunk, the driver usually exhibits certain changes in behavior or body movement, resulting in an abnormal driving pattern. A drowsy driver usually exhibits symptoms such as rapid and constant blinking, nodding or swinging the head, and frequent yawning [18]. On the other hand, a driver may be distracted by unpredicted causes that occur at random [19]. Such a driver has an increased response time to external stimuli and may unintentionally slow down or have sudden longitudinal and transversal movements [16], [20]. A driver intoxicated by alcohol usually tends to accelerate or decelerate suddenly with a delayed response. A reckless driver can be considered similar to a drunk driver, in that the driver may be affected by emotional or external factors that cause him or her to accelerate or decelerate the vehicle suddenly and violate the speed limit while awake [16].

As seen in Fig. 1, there are six types of abnormal driving patterns. Weaving can be described as serpentine driving or
driving in an S-shape. Swerving can be described as an abrupt redirection when driving along a generally straight course. Sideslipping occurs when a driver deviates from the normal driving direction. A fast U-turn is a U-turn at high speed. Turning with a wide radius is similar to a fast U-turn; however, the vehicle drives along a curve with a large radius and can drift out of the lane. Sudden braking is similar to an emergency brake, but a sudden brake is usually done for no apparent reason [21].

Fig. 1. Different Types of Abnormal Driving Pattern. a) Weaving, b) Swerving, c) Side slipping, d) Fast U-turn, e) Turning with Wide Radius, f) Sudden Braking [21].

Driver monitoring can be categorized into direct or indirect driver monitoring. The former includes monitoring of heart rate and/or driver body movement using different sensors. The latter includes analysis of facial expression, pedal and steering activities, and reactions to certain events [15], [22], [23]. As reviewed by Kaplan et al. [15], direct driver monitoring has a high success rate, especially through the monitoring of the driver's electroencephalogram (EEG). However, the method is not user-friendly because it is extremely intrusive: the driver must wear an electrode helmet while driving to collect data. Although nonintrusive alternatives exist, such as placing electrodes on the steering wheel or in the driver's seat, they come with lower accuracy due to improper electrode contact. Furthermore, personal differences, such as gender and personality, influence the signals, which complicates the detection process.

Therefore, the most viable and attractive solution would be the use of indirect driver monitoring. Telematics boxes dedicated to this purpose have been introduced in the market. However, the main drawbacks of such systems are their high cost and low customer acceptance, which limit wide and rapid platform deployment [15], [24]. On the other hand, smartphones are cheap and easy to deploy for driver behavior analysis. A modern smartphone is equipped with multiple sensors that include an accelerometer, gyroscope, magnetometer, microphone, cameras, thermometer, and Global Positioning System (GPS). Thus, the smartphone has the potential to monitor not just the vehicle's acceleration but also the driver's distraction and proximity to external obstacles or objects. Moreover, the network capability of a smartphone allows fast and reliable transmission of sensor data [25].

While there has been an increasing interest in driver behavior analysis using smartphones, there has not yet been a comprehensive review in this area. Kaplan et al. [15] provided a review focused on the detection of the driver's drowsiness and distraction through visual and non-visual approaches. However, the details on utilizing smartphone solutions were limited. Ferreira Jr. et al. [26] provided an investigation with different Android smartphone sensors and classification algorithms. However, the paper only investigated four machine learning algorithms, namely, Artificial Neural Network (ANN), Support Vector Machine (SVM), Random Forest (RF), and Bayesian Network (BN), missing out on the latest methodologies proposed in the literature. On the other hand, Handel et al. [27] focused on smartphone-based insurance telematics, where they only provided some form of comparison between different smartphones for harsh braking detection. The review paper by Meiring and Myburgh [19] focused on the general picture of driving behavior analysis and the types of algorithms applied, thus lacking emphasis on smartphone solutions.

Additionally, Vlahogianni and Barmpounakis [28] provided a general overview of driving behavior analysis using a smartphone, but missed the methodologies proposed in the literature. Although Engelbrecht et al. [29] provided a survey of smartphone-based sensing in vehicles for intelligent transportation system applications, it reviewed only a limited number of papers, and only a few were targeted at driver behavior analysis using a smartphone. On the other hand, Martinez et al. [7] focused on the types of machine learning algorithms used in the characterization of driving styles, disregarding the types of platform where the algorithms were applied and their integration with the Advanced Driver Assistance System (ADAS). The review by Wahlstrom et al. [30] was comprehensive, but the paper focused on the smartphone's role as a measuring instrument and user-interactive services instead of drivers' behavior classification.

Hence, this paper aims to fill the current gaps with contributions that include the following.
• A comprehensive review of non-intrusive smartphone solutions for driving behavior analysis. Publications were selected for review on the condition that they utilized only a smartphone in their application.
• The identification of the limitations of different methodologies, with an in-depth discussion of the challenges of smartphone solutions and the possibility of integration with other technologies.

The organization of the paper is as follows. A selection of publications targeted towards using a smartphone for driving behavior analysis is reviewed in Section III and Section IV. The list of methodologies can be seen in Fig. 2. This is followed by a discussion of the challenges of smartphone solutions. In Section VI, possible solutions are discussed, and finally, the paper ends with a conclusion.

Fig. 2. Categories of Methodology Reviewed.

II. APPLICATIONS OF SMARTPHONE

In the following section, discussion on driver behavior analysis using a smartphone together with other hardware is
presented. Details are provided to suggest why such an idea may not always be feasible. Applications of a smartphone in other domains are also discussed, including human mobility analysis, activity recognition, structural monitoring, transportation analytics, and applications in the healthcare domain.

A. Driver Behavior Analysis Using Smartphone with other Hardware

Lee and Chung [31] proposed a multi-sensory fusion approach for driver's drowsiness detection. The proposed system combined the PERcentage of CLOSure of eyelid (PERCLOS), obtained through the smartphone's front camera, with the smartphone's 3-axis accelerometer, electrocardiography (ECG), photoplethysmography (PPG), and in-vehicle temperature for drowsiness detection, as seen in Fig. 3.

Fig. 3. Proposed System by Lee and Chung [31].

The Hue-Saturation-Value (HSV) color model approach is used for face detection, and the extracted eye feature is used for calculation of PERCLOS. As explained by the authors [31], the ECG signal is used to calculate the heart rate and represents the subject's overall activity level, which is a good indicator of drowsiness. On the other hand, PPG signals are used to calculate the blood pressure, as it was found that subjects who worked long hours of overtime and lacked sleep have higher blood pressure. The temperature and the 3-axis accelerometer are included to predict the driver's vigilance index. It was suggested that sleep symptoms are more likely to occur in a warm environment, and the accelerometer can be used to track the vehicle's speed. Data from the ECG sensor, PPG sensor, and temperature sensor are sent to the smartphone through a Bluetooth connection. The data collected by the smartphone were used as the inputs to a Fuzzy Bayesian network. Each fuzzy variable defines the membership degree to the output state, which in this case is the level of fatigue. The closer the output value is to 1, the more tired the driver is. An output value of 0.6 to 0.75 indicates that the driver is in a partial sleep condition and is advised to take a break. Any value above 0.75 will trigger a warning alarm.

This system is comprehensive, but the PPG sensor can be considered slightly intrusive. Moreover, the driver cannot possibly place his finger on the sensor throughout the driving journey to measure the PPG signals. While the use of electrode fabric to collect ECG signals is not as intrusive as collecting EEG signals by wearing an electrode helmet, ECG signals may not be as accurate as EEG signals for drowsiness detection [15]. Furthermore, it is also unclear whether the electrode fabric will dramatically increase the price of the system. Finally, the proposed system is also considered to be under development, as the design of the system requires further improvement. This is because the placement of the sensors may interfere with the normal operation of the steering wheel.

Araujo et al. [32] proposed an application that is capable of advising and teaching the driver to follow efficient driving patterns. Although the authors' focus was to reduce fuel consumption by placing more emphasis on coaching the driver, the methodology also aims to detect dangerous driving behavior such as high acceleration and a high level of throttle aggression. In addition, even though most of the data were collected from the CAN bus, the authors' proposal was a low-cost solution called Torque Pro (a smartphone application) that collects the CAN bus data through a low-cost On-Board Diagnostics-II (OBD-II) Bluetooth adaptor. Therefore, this solution is considered relevant to this paper.

The methodology of Araujo et al. [32] begins with the collection of data such as the vehicle's speed, acceleration, altitude, throttle signal, instant engine fuel consumption, and engine rotations over a 200 s window. Features such as the average, minimum, maximum, and duration of vehicle stoppage were then extracted from the collected data.

Driving conditions (urban, highway, or combined) can be easily classified using the vehicle's speed data. A normalized consumption metric was calculated using the result and the manufacturer's consumption metric. It was then converted into a fuzzy number, where a low number would indicate good fuel consumption. The authors then adopted center-average defuzzification to deliver a crisp value to the driver. Finally, the driving hint for driving behavior improvement was evaluated through fuzzy logic by taking the driving conditions and fuel consumption into consideration. The hint with the highest value was delivered to the driver.

This methodology has several limitations. Firstly, it could not be assessed, as it did not include any accuracy measurement involving the false positive rate or false negative rate. As the driving condition is determined through a linear discriminant, such a system cannot detect speeding, because speeding is simply classified as driving on the highway. In the study, the system can only provide a driving hint after 200 s of data collection. While the window can change, it is unclear how high data transmission can affect the data measurements or classification accuracy. As explained earlier, the normalized consumption metric is derived by dividing the fuel consumption by the manufacturer's consumption metric for the respective driving condition. However, several other factors affect the fuel consumption rate, such as the vehicle's age and the condition of the air filters, which in turn affect the classification results. Finally, each automobile manufacturer has proprietary protocols, making it difficult to find a generic and cheap compatible OBD-II Bluetooth adaptor [105].
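The window features, normalized consumption metric, and center-average defuzzification described for Araujo et al. [32] above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the sample period, the rule that a zero-speed sample counts as a stoppage, and the example fuzzy set centers and activation weights are all assumptions made purely for illustration.

```python
from statistics import mean


def window_features(speeds_kmh, sample_period_s=1.0):
    """Summarize one window of speed samples (e.g. a 200 s window).

    Assumption: a sample with zero speed counts as the vehicle being stopped.
    """
    stopped_samples = sum(1 for v in speeds_kmh if v == 0)
    return {
        "avg": mean(speeds_kmh),
        "min": min(speeds_kmh),
        "max": max(speeds_kmh),
        "stop_duration_s": stopped_samples * sample_period_s,
    }


def normalized_consumption(measured, manufacturer_reference):
    """Measured fuel consumption divided by the manufacturer's figure for the
    same driving condition. Both values must use the same unit."""
    if manufacturer_reference <= 0:
        raise ValueError("manufacturer reference must be positive")
    return measured / manufacturer_reference


def center_average_defuzzify(centers, weights):
    """Center-average defuzzifier: the mean of the output set centers,
    weighted by how strongly each fuzzy set is activated."""
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one fuzzy set must be activated")
    return sum(c * w for c, w in zip(centers, weights)) / total


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the functions.
    print(window_features([0, 0, 30, 60]))
    print(normalized_consumption(9.0, 6.0))
    print(center_average_defuzzify([0.0, 0.5, 1.0], [0.2, 0.6, 0.2]))
```

Because the ratio depends only on the measured and reference consumption, factors such as vehicle age or air filter condition would shift the crisp value without any change in driving behavior, which is the limitation noted above.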
Hong et al. [33] utilized a naïve Bayes classifier with 5-bin discretization for classification of aggressive driving behavior. In addition to smartphone sensors, Hong et al. [33] also proposed the use of a low-cost OBD-II Bluetooth reader as well as an additional IMU mounted behind the steering wheel. As explained by the authors, with the inclusion of these two pieces of hardware, more features could be extracted, and classification could be more accurate.

Proposed features include the maximum, average, and standard deviation of speed, speed change, and longitudinal and lateral acceleration from the smartphone's GPS and accelerometer. The OBD-II reader provides the maximum, average, and standard deviation of speed, speed change, engine RPM, engine RPM change, throttle position, and throttle position change. Finally, the IMU mounted on the steering wheel is used to detect turn events based on its z-axis acceleration change. To prevent features from being dependent on the context of any specific trip, Hong et al. [33] defined six specific driving conditions under which drivers' driving styles can be compared more independently of the trips. These conditions consist of 1) Start: for 5 seconds after a start, 2) Stop: for 5 seconds before a stop, 3) H-speed: when the vehicle is traveling at a speed above 50 km/h, 4) Turn: when the vehicle is turning, 5) B-turn: for 5 seconds just before the vehicle makes a turn, and 6) A-turn: for 5 seconds after the vehicle makes a turn. To ensure that all collected sensory signals have a consistent orientation, a virtual recalibration method is used to repeatedly adjust the 3D orientation of the smartphone and IMU with respect to their orientation when the vehicle stops moving.

Features were extracted from a set of driving data collected over three weeks from 22 participants. It consists of 1017 trips with a total duration of 542 hours. To obtain the ground truth of whether a driver is aggressive, Hong et al. [33] proposed the use of historical traffic violation records for the past 3 years and a driving behavior questionnaire. The model was then validated using leave-one-out cross-validation and achieved an accuracy of 90.5% when the ground truth was obtained using historical traffic violation records. The accuracy drops to 81% when the ground truth is obtained using the driving behavior questionnaire. While the accuracy of this implementation is high when the ground truth is obtained using historical traffic violation records, such records may be biased. This is because they do not distinguish between the causes of accidents or tickets, which may result in nonrepresentative data being collected for aggressive driving behavior. Secondly, each automobile manufacturer has proprietary protocols, and it may not be easy to find a generic and cheap compatible OBD-II Bluetooth reader [105].

B. Human Mobility Analysis

One particularly interesting domain related to driver behavior analysis is human mobility analysis. A slight difference between the two is that the latter is interested in discovering subgroups of vehicles and travels that can be characterized by some common movement behavior [34]. Such a study can be carried out due to the widespread availability of GPS technology in navigation devices, vehicles, or smartphones, which allows tracking of an individual's movement with a high degree of accuracy and temporal frequency. These rich sources of data, which represent society-wide proxies of human mobile activities, are ideal for statistical analysis and inference, which can help in understanding and characterizing human mobility patterns during daily activities [34]–[37]. Such analysis improves decision making and root cause analysis in many mobility tasks, such as monitoring and planning of traffic or public transportation systems, as well as detecting extraordinary events and forecasting or simulating traffic-related phenomena [34], [38].

As demonstrated in [34], an extraordinary event can be detected by analyzing collective trajectory patterns. A trajectory pattern is a pattern that reflects an individual's travel behavior [39]. Therefore, extraordinary events such as concerts or sports competitions are detected when the destinations of many individual trips are set toward a specific small area (the event location), which subsequently becomes the origin of return trips after the event ends [34].

C. Application of Smartphone in Other Domains

A modern smartphone is equipped with multiple sensors that include an accelerometer, gyroscope, magnetometer, microphone, cameras, thermometer, and Global Positioning System (GPS). Thus, a smartphone is capable of tasks other than driver behavior analysis. Mobile sensing has been applied in human activity recognition [40], structural monitoring [41], and transportation analytics [42], as well as in the healthcare domain [43], [44].

In [40], several classifiers were trained using sensory data collected from the accelerometer and gyroscope to perceive the user's motion state, which includes descending stairs, ascending stairs, walking, jogging, and jumping, and the results showed that such an application was a success.

In addition, the authors of [41] showed that acceleration data collected using a smartphone contained consistent and significant indicators of the first three modal frequencies of the bridge. The results can be more precise if acceleration data collected from multiple smartphones are combined, supporting the hypothesis that crowdsensing with smartphones can be used for structural health monitoring.

On the other hand, Lu et al. [42] proposed a collaborative framework for transportation service analytics that applied spatiotemporal analytics to transportation infrastructural data to detect anomalous transportation events. The detected anomalous events can trigger mobile sensing and thereby recognize specific commuting-related activities of interest with much higher accuracy and lower energy cost.

Kelly et al. [43] showed that health status could be estimated using motion sensors within a smartphone. Another interesting application of a smartphone is the ability to estimate Parkinson's disease severity [44], as the data collected from an accelerometer can be used to differentiate between patients with Parkinson's disease and healthy people by measuring their walking patterns. Although smartphones are inexpensive and readily available, they have huge potential to solve problems in our daily lives.
III. DROWSINESS DETECTION

In this section, drowsiness detection methodologies proposed by different authors are discussed in detail. The proposed methodologies are classified into visual detection methodology and sound detection methodology.

A. Visual Detection Methodology

As mentioned earlier, when a driver is drowsy, he usually exhibits symptoms such as rapid and constant blinking, nodding or swinging the head, and frequent yawning [18]. In addition, there will also be physiological changes in a driver [15]. Hence, drowsiness detection can be carried out by analyzing different features, as shown in Table II.

TABLE II. FEATURES AND ANALYSIS METHODS AVAILABLE FOR DROWSINESS DETECTION

With the availability of only a smartphone, non-visual features (EEG, ECG and EOG) are usually not applicable. A common idea to detect a drowsy driver visually is to use the smartphone's front camera to capture an image of the driver's face. Based on the captured image, further analysis can be used to determine if the driver is drowsy. For such methodology, a reliable and valid metric known as PERCLOS is commonly adopted to determine the alertness of the driver [15].

PERCLOS is the percentage of eye closure [45]. Fig. 4 illustrates different states of an open eye. Theoretically, if a driver is tired, the duration of eye closure will be longer, leading to a higher PERCLOS. Therefore, a threshold can be set to indicate whether the driver is in a dangerous state of drowsiness.

Fig. 4. The Different States of an Open Eye [46].

Such an idea was adopted in the system proposed by Xu et al. [47]. Using the smartphone front camera and an Android API, Xu et al. [47] captured the position of the eye and zoomed the eye image to a size of 40 × 20 pixels for consistency, which also eliminates redundant information. A single-hidden-layer Artificial Neural Network (ANN) with 16 hidden neurons was used to classify the states of the eyes. A quadratic sum-of-weights penalty factor was introduced into the error function to minimize underfitting and overfitting.

Based on the classification results of the eye state, PERCLOS was calculated as the percentage of closed-eye frames in the number of frames captured, and blink time was derived using the times of the first and the last frame of a run of consecutive closed eyes. Blink rate was then determined by the total number of blinks over the total time of the recent 500 frames. As defined by the authors [47], the driver is deemed to be severely drowsy if PERCLOS exceeds 25% in the recent 500 frames or if a blink lasts more than 2500 ms. On the other hand, the driver is deemed to have mild drowsiness when the blink time is more than 1000 ms and the blink rate is more than 0.5, or when the driver has been driving continuously for 4 hours. Using such thresholds, the authors [47] reported a detection rate (drowsy driver detection) of 95% accuracy with a zero false positive rate and a 10% false negative rate.

While the detection rate reported by the authors [47] is impressive, there are several limitations to the system. Firstly, to achieve a high detection rate, the system must have a personalized set of training data (eye state images) for different drivers. Secondly, training of the backpropagation ANN was not performed in the smartphone environment. Model parameters have to be manually loaded into the smartphone after training in an external environment using MATLAB. The possibility of training the ANN in a smartphone environment is also unclear, as the required level of computation power can be difficult to provide on all smartphones [58]. While glasses or skin complexion have little influence on this methodology, light is a factor that can affect the performance; therefore, applicability is limited at night. Such a system also requires the phone to be in a fixed position in order to capture the facial image. If the face of the driver cannot be detected, the system is not useful.

Similarly, Dasgupta et al. [48] also adopted PERCLOS for drowsiness detection. However, as mentioned by Dasgupta et al. [48], a single image-based cue may not be reliable enough. Therefore, to increase the reliability of the system, the authors [48] suggested a three-stage detection process. The three-stage detection framework begins with the computation of PERCLOS, followed by the classification of the voiced-to-unvoiced ratio (VUR) based on the driver's voice, as speech signals can be used as an indicator of drowsiness. The final stage is a reaction time test in which the driver must touch the screen of the smartphone within a time limit. The captured image was first processed to increase the local contrast, enhance details, and provide an orientation correction if the driver's head was tilted by more than 30 degrees. Detection of the face was then executed using Haar-like features. Based on the detected face, a Haar classifier trained with eye images taken during daytime and nighttime was used to detect the eyes. A linear Support Vector Machine (SVM) was then trained to classify between open and closed eyes. To compute the VUR, Dasgupta et al. [48] used the microphone in a smartphone to capture speech signals sampled at 20 kHz, and Singular Value Decomposition (SVD) was applied to remove external noise. Again, an SVM was adopted to return the voiced and unvoiced lengths with Mel frequency cepstral coefficients (MFCC) as input. VUR was then calculated using the two outputs.
The last test requested the driver to touch the screen of the smartphone within 10 s to test alertness. If PERCLOS over a 10 s sliding window is more than 20%, the second stage, which classifies the driver's VUR, is activated. If the driver fails the voice test and the reaction test, an alarm is generated through the speaker, and an SMS is sent to the registered emergency number. Based on this system, Dasgupta et al. [48] achieved an average detection rate of 93%, compared with an average detection rate of 88% for a system using only PERCLOS, based on four different test sets. However, the system using only PERCLOS achieved a higher detection rate in one of the test sets. Although Dasgupta et al. [48] claimed that SVD could be performed to remove noise from the audio, the paper lacks details on whether such a system can extract the driver's voice from polyphonic audio that may contain other speech. Separating overlapping sound events is a challenging task [49], [50], and it is highly doubtful that using only SVD is sufficient. Therefore, if the radio is switched on, it may cause a false negative indicating that the driver is alert. Furthermore, asking the driver to touch the screen within 10 s while driving is a form of distraction. Moreover, training of the SVM model appears to be performed outside of the smartphone environment. Finally, the position of the phone must be fixed to capture the facial image.

Similarly, Chang et al. [51] also adopted the idea of PERCLOS in their drowsiness detection system, but at the same time, they also track the distance to the vehicle in front, so that an alarm can be given if the distance gets too close. Chang et al. [51] used Haar-like features for face and eye detection. Based on the detected region of interest, a skin color model was then applied to the eye image to classify the eye state. As explained by the authors [51], white pixels indicate the skin area, and black pixels indicate the eyeball area. If the ratio of black pixels to white pixels drops below 0.8, the eye is considered closed, and the driver is deemed drowsy if the eye is closed for more than 400 ms. The vehicle in front of the driver was also detected using Haar-like features, and the distance to that vehicle was calculated according to the equation given as

$$D = \frac{W_c}{W_r} \cdot \frac{W_p}{2\tan(FOV/2)} \tag{1}$$

where $W_c$ is the real width value of the detected vehicle, $W_r$ is the width of the detected square, $W_p$ is the width [...]

[...] of the driver. Moreover, the performance with regard to relative position accuracy, detection of vehicles, and the number of tracked objects can be affected by the limited processing capabilities of smartphones [52], [53]. Finally, the smartphone has to be fixed at a position where it can capture the facial image and the vehicle in front of the driver simultaneously.

On the other hand, Qiao et al. [54] captured facial features such as eye blinking, head shaking, and yawning for drowsiness detection. The image was first processed using a Gaussian filter for noise removal, followed by a mean filter for image enhancement. Detection of the driver's face was carried out using a Haar classifier, and the locations of the eyes and mouth were approximated based on the size of the driver's face. As explained by Qiao et al. [54], an open eye should contain more black pixels since it has a larger visible pupil. Therefore, after locating the eyes, the eye image was converted to a grey image, and the ratio of black pixels was calculated. The movement of the head was calculated based on the variance of the face's centroid. The contour of the mouth was computed using the Canny contour-finding algorithm and then measured to indicate a yawn. Finally, a threshold was set for each of the collected measurements to determine whether the eye was closed, the head was shaking, or the driver was yawning. Detection of a driver performing any of these actions indicates that the driver is tired. The system was tested on 500 images of eye blinks, head shakes, and yawning and reported an accuracy of 90.5%. Although the proposed system aims to lower the chance of false negatives, it has a higher false positive rate, as it raises the alarm upon detecting any of the indicators (eye closed, head shaking, or yawning). A driver who performs any one of these actions is not necessarily tired; only when the action is performed continuously can one determine that the driver is tired. Moreover, the position of the smartphone has to be fixed to capture facial features.

For accurate eye detection without training data, Li et al. [55] proposed the Progressive Locating Method (PLM). The system aimed to eliminate the need for a statistical method and to focus on the apparent and geometric relationships within the facial region, in order to reduce the uncertainty of eye detection caused by unrelated facial regions and to avoid high computational cost. As described by Li et al. [55], the system consists of three major parts. The first part was the detection of the driver's face on the
of image and FOV is the field of view of the rear camera. image captured by the smartphone using a skin color model.
If D is less than 7m, then an alarm is given. In this sys- The second part of the system was the use of PLM algorithm
tem, Adaboost algorithm was used to train a face classifier that consists of three smaller subparts. The first subpart
(using 7240 images), eye classifier (3500 images) and vehicle was to find the facial feature as a reference to detect the
classifier (using 5900 images). Audio was used but was only coarse region where the right eye was, and in this case, Li et
considered for engine state classification and voice command. al. [55] proposed using the driver’s mouth. Once the mouth
Therefore, audio, in this case, is considered as irrelevant to was detected, the bottom and right points were chosen as a
drowsiness detection and excluded from further discussion. reference point to get the coarse eye region.
The accuracy of such a system was not reported in the In the second subpart, the grey level projection was used to
paper. The drawback of this system is a large amount of locate the right eye region. Finally, an accurate eye coordinate
different images is required to train each classifier. It is also was measured by subsequent analysis. In the final part of the
unknown how Wc can be obtained or estimated as it is system, the extracted eye image was analyzed. An open eye
difficult to predict the actual width of the vehicle in front size (height and dimension) was first determined and used as
a threshold. If the eye size collected in a subsequent image frame is smaller than the threshold value, then the eye is considered closed. If the system detects three closed eyes out of five frames, then a drowsiness alert is given to warn the driver. Although such a system is not affected by skin color variation, it cannot work well in a dark environment. The presence of glasses and a long fringe can affect the detection accuracy of the system. This is due to the reflection of light on the surfaces of the glasses, while a long fringe affects the distribution of greyscale around the right eye region. Thus, such a system over-relies on the grey distribution of the facial region [56]. In addition, the position of the smartphone has to be fixed to detect the face.

Zhang et al. [57] also proposed to use a smartphone camera to capture facial features. However, instead of capturing just the eye blinks and yawns, they included the extraction of Blood Volume Pulse (BVP) for drowsiness detection. The idea is based on the imaging photoplethysmography (PPG) technique, which monitors changes in an image sequence without concerning the specific content. Thus, physiological processes recorded in a video can be extracted and analyzed. In their work, Zhang et al. [57] utilized the Second Order Blind Identification (SOBI) extended-PPG for such purposes. The front camera of a smartphone was set at a distance of 15 cm to 80 cm away from the driver depending on the camera's specification. The captured image stream was then analyzed by a nine-channel SOBI to estimate the underlying signals from different physiological activities.

As signals were extracted in a random order, an identification algorithm was required for correct identification of the signal that corresponds to the physiological response. The analysis was then carried out to estimate the Heart Rate Variability (HRV), blink duration, blink frequency, and yawn frequency. Drowsiness was determined if any of the indicators exceeded the threshold set by the authors [57]. Following this implementation, the system reported a detection rate of 91.5%. Although this system proposed a novel idea of multi-information fusion utilizing only a single camera, there are several limitations.

The proposed method cannot work well in poorly illuminated environments or environments with illumination interference, when the driver's head is tilted or shaking, or on a bumpy road. Zhang et al. [57] also mentioned that the system could not be readily deployed without checking the driver's physiological characteristics as they differ from person to person, and the system will require further adjustment for optimal performance. Finally, the system requires a fixed position of the smartphone, and it is also not applicable for drivers with small eyes or for drivers wearing a pair of sunglasses.

B. Sound Detection Methodology

Xie et al. [58] proposed an interesting system, D3-Guard, to detect a drowsy driver using the smartphone microphone. As the movement of the human body produces a Doppler shift, Xie et al. [58] proposed to detect nodding, yawning, and operating of the steering wheel in a drowsy state. Doppler shift refers to the change in frequency or wavelength of audio signals for an observer who moves relative to the signal source. The speakers first send out acoustic signals at 20 kHz, and the microphone receives the reflected signals. A band-pass filter and undersampling were then applied to improve frequency resolution without distorting frequency spectra before transforming the signal into the time-frequency domain through FFT. Effective features from the Doppler profiles of audio signals for each drowsy driving action were extracted and used as inputs of a Long Short Term Memory (LSTM) network. Two LSTM models were trained for this purpose. The first LSTM classifies between normal state, nodding, and yawning. The second LSTM classifies between normal and abnormal steering operation. Outputs of both LSTMs are passed to a NN to predict if the driver is drowsy. Based on such a system, Xie et al. [58] reported an accuracy of 94% in detecting a drowsy driver.

However, the main limiting factor is that a large amount of data is required to train the classifier. Moreover, road conditions also affect the accuracy of such a system. It was found that when the driver needs to operate the steering wheel more often on a crooked road, it will result in low detection accuracy. Furthermore, it is also unclear if the presence of a passenger will affect the accuracy as passenger movement will also affect the Doppler shift. In addition, it is unknown if the LSTM or NN can be trained in a smartphone environment.

C. Summary

This section presents the discussion on different methodologies proposed by different authors. The summary of the proposed methodologies and their limitations can be found in Table III and Table IV, respectively. Tables III and IV contain information such as the smartphone sensor used, pre/post-processing procedure, the algorithm used, feature extraction and calculation, drowsiness detection accuracy, and the limitations. In the pre/post-processing column, a rough idea of how the processing is performed and how the algorithm is used is also indicated. For example, Median Filter was used for noise removal. The features listed are those given in the authors' papers. They are the features used in the system and not just the features for the final classifier. As seen in Table III, the main idea for drowsiness detection is through the direct analysis of driver characteristics such as eyelid, head, or body movement in a continuous fashion. Thus, the phone must be mounted in a secured position to capture the driver's face regardless of the illumination condition.

IV. ABNORMAL DRIVING PATTERN DETECTION

As mentioned earlier, a distracted, drunk, or reckless driver may unintentionally slow down or have sudden longitudinal and transversal movements [16], [20]. Such actions can be detected by the different sensors of a smartphone. In the following subsections, different types of methodologies are discussed in detail.

A. Thresholding Methodology

As evaluated by Fazeen et al. [59], extreme driving behavior will quickly produce a higher value in the data collected by the accelerometer. Therefore, by setting a threshold, one can quickly detect abnormal driving behavior. Moreover, such an idea can also be extended to detect road anomalies such as speed bumps and potholes. To allow for such detection, the smartphone is first aligned with the vehicle coordinate system so that the y-axis of the accelerometer points in the moving direction of the vehicle. After the alignment, data from the x-axis of the accelerometer can be used to detect turning or lane change, the y-axis can be used to detect acceleration or braking, and finally, the z-axis can be used to detect vibrations or road anomalies. However, Fazeen et al. [59] did not report the accuracy or detection rate of abnormal driving but indicated an accuracy of 85.6% for the overall road anomaly classification system. Such a simple idea can be readily deployed on any smartphone and does not require a high computational cost.

TABLE III
DETAILS ON PROPOSED METHODOLOGIES FOR DROWSINESS DETECTION

However, the threshold value defined by such a method is only applicable to a specific type of vehicle, road, and phone combination [60]. This is because different phones and cars possess different characteristics, so the threshold value used in car A with smartphone A cannot be used in car B with a different smartphone.

Moreover, poor road conditions also affect the readings of the sensors. Thus, it cannot accurately distinguish the differences in various driving behavioral patterns [61]. In addition, it only works well in low speed and short distance cases due to the accelerometer's accumulated error [62], [63]. Finally, this system also requires the smartphone to be in a fixed position and can produce inaccurate readings if the phone's orientation tilts because of a road bump [24], [64].

B. Pattern Matching Methodology

Another methodology rather similar to the setting of a threshold is the use of Dynamic Time Warping (DTW) to compare the similarity of the collected sensor signals with the signals of aggressive maneuvers. Such methodology was applied by [65]–[68].

Based on such an idea, Johnson and Trivedi [65] proposed a system called MIROAD utilizing DTW on the sensor-fusion output of the accelerometer, gyroscope, and magnetometer to classify the maneuvers performed by the driver. As explained by the authors [65], the gyroscope gives a clearer indication of turns, and by using the accelerometer and magnetometer in conjunction with the gyroscope, a more accurate reading of the device orientation can be obtained. Johnson and Trivedi [65] then proposed to use an array consisting of readings from the x-axis of the gyroscope, the y-axis of the accelerometer and the x-axis of the magnetometer as they were found to be the most accurate indicators of different driving maneuvers.
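The DTW-based template matching introduced above can be illustrated with a short sketch. This is not the MIROAD implementation: it assumes a single one-dimensional sensor channel, an absolute-difference local cost, and two hypothetical templates (`gentle_turn`, `aggressive_turn`) whose values are invented purely for the example.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched again (b sample repeated)
                cost[i][j - 1],      # b[j-1] matched again (a sample repeated)
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def classify_maneuver(signal, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(signal, templates[label]))


# Hypothetical templates and observation (made-up values).
templates = {
    "gentle_turn": [0.0, 0.1, 0.2, 0.1, 0.0],
    "aggressive_turn": [0.0, 0.8, 1.5, 0.8, 0.0],
}
observed = [0.0, 0.0, 0.7, 1.4, 1.4, 0.9, 0.1]
label = classify_maneuver(observed, templates)
```

The nested loops visit every pair of samples, so the cost grows with the product of the two signal lengths and with the number of templates compared.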
TABLE IV
LIMITATIONS OF PROPOSED METHODOLOGIES FOR DROWSINESS DETECTION
In order to detect the start and end time of an event, Simple Moving Average (SMA) of the rotational energy about the x-axis was used via the x-axis of the gyroscope. If SMA passes a predefined threshold, then the event has started, and subsequent samples are concatenated until the values drop below the threshold to give the end time. Based on these
collected samples, templates of different maneuvers (collected in the same way) were compared using DTW. The smallest distance value will indicate that the collected samples are almost identical to the template of that category. Based on the distance calculated, Johnson and Trivedi [65] proposed to use K-Nearest Neighbor (K-NN) with K as 3 to determine the type of event. Based on such a system, the overall detection accuracy of events was 91.9%.

On the other hand, Eren et al. [66] also utilized the same sensors for event detection. They first collected the safe driving templates for each driving event (turning, acceleration, slowing down and lane change) from five safe drivers by manually selecting and labeling each safe driving habit based on the collected data. For identification of an event within the selected portion of the signal, windowed samples collected during driving were z-score normalized and matched with each template to compute the similarity between them using the Dynamic Time Warping (DTW) algorithm. The classification of driver behavior was performed through the comparison of probabilities computed using Bayesian inference.

$$P(r_n \mid s) = \frac{P(r_n)\,P(s \mid r_n)}{P(s)} \tag{2}$$

where $r_n$ is the class of driving event. $P(r_1)$ denotes the prior unsafe probability and $P(r_2)$ denotes the prior safe probability. $P(r_1)$ was chosen as 0.2 and $P(r_2)$ was chosen as 0.8 based on the estimation of ten different drivers on whether they were safe or not. If $P(r_1 \mid s)$ is higher than $P(r_2 \mid s)$, then the driver is classified as an unsafe driver. Based on such a system, the authors [66] reported a detection rate of 93%.

The proposal of Singh et al. [67] was to detect brakes and lateral maneuvers applied to the car. As explained by the authors [67], the brakes can produce a subtle measurement in the form of an acceleration signal that can be collected by the accelerometer, whereas gravity sensor and gyroscope data can be fused to detect lateral maneuvers such as sudden left, sudden right, and lane changes. In addition, through the use of crowdsensing, congestion on the road can be detected when multiple smartphones detect the braking event in the same geographical area. Such information can be used to distinguish braking due to congestion from sudden braking due to harsh driving. In order to allow matching of templates, different maneuver profiles have to be collected. Before a drive, the phone axes were aligned with the vehicle axes, where the y-axis of the phone must be in the same direction as the moving direction of the car. Collected data from the accelerometer were first reoriented to its original axes using Euler's angle, and only the data from the y-axis was kept for detection of brakes. An SMA filter was then applied to smooth the data to remove vibrations caused by the car. The smoothened data were passed through a bandpass filter to remove noise due to hardware sensitivity. The yaw angle calculated from data collected by the gravity sensor was fused with data collected along the z-axis of the gyroscope to compute the angular velocity. The angular velocity was also passed to the SMA filter to smoothen the data. The data were then matched individually with the template signals, and the pair with the lowest similarity score determines the event of the collected data. Based on this system, Singh et al. [67] achieved an average detection accuracy of 94.6%.

Finally, Ali et al. [68] did an evaluation between DTW and K-Nearest Neighbor (K-NN) utilizing the accelerometer, gyroscope, and GPS in a smartphone. As explained by the authors [68], the y and z axes of the accelerometer can easily pinpoint a road bump present on the road, while the x-axis of the accelerometer can be used to detect aggressive driving behavior. On the other hand, the z-axis of the gyroscope can be used to detect aggressive maneuvers. Finally, GPS can provide the speed of the vehicle. To facilitate the collection of data, the axes of the smartphone must be aligned with the axes of the vehicle. The data is filtered through a low pass filter to remove the presence of noise. A total of 540 driving events were recorded over two months. These events belong to 9 different categories, namely, 1) normal right turn, 2) normal left turn, 3) normal right lane change, 4) normal left lane change, 5) aggressive right turn, 6) aggressive left turn, 7) aggressive right lane change, 8) aggressive left lane change and 9) road anomalies.

In the study, Ali et al. [68] compared the two algorithms for two different tasks: 1) classification of normal and aggressive driving as well as road anomalies and 2) classification of normal and aggressive turns and lane changes. The difference between the two tasks is that task 1 is a more generalized classification where the algorithm can only detect normal or aggressive driving regardless of turns or lane changes but taking road anomalies into consideration. On the other hand, task 2 classified the events into their respective categories without considering road anomalies. Based on the results, K-NN was able to perform better in task 1 as compared to DTW (98.67% vs. 90.22%). However, K-NN achieved a much lower average accuracy in task 2 as compared to DTW (78.06% vs. 96.75%).

The main limitation of using the DTW algorithm is the need to collect templates of different maneuvers for similarity calculation. Much effort is required to select and label each driving habit manually. At the same time, it is also impossible to collect all types of templates due to the unique driving styles and behaviors of different drivers [69]. Moreover, DTW is a computationally expensive operation since the algorithm computes the similarity between two elements for all elements in two signals. Furthermore, dangerous templates can only be collected in a safe environment when no safety concern complicates the data collection process. As such, template matching appears to be unfeasible for a large and heterogeneous sensing environment when considering different types of vehicles and devices [70]. A common drawback among the four methodologies proposed is that they require the phone to be in a fixed position.

Furthermore, in [65], only turn events were considered, and other events such as speeding and hard braking were not considered. Despite the fact that the overall accuracy is high, such a system was not able to achieve high classification accuracy for U-turns, which achieved only 77% accuracy. In [66], only ten drivers were considered for computing the prior probability. This is insufficient to represent all drivers. Moreover, the accuracy of the classifier is based on completed surveys by test participants, where the results could have been influenced by
subjectivity [29]. For [67], the detection accuracy of events at high speed is not clear. The authors [67] mentioned that templates would be discarded if the recognition rate for test samples is low, and another template will be selected. Therefore, it is assumed that only one template is used for each event. As such, if a non-representative template is used, it will only be updated after some time, and it is unclear how updating will take place. Although the use of crowdsensing allows collective sharing of data, the required cost and time to implement crowdsensing-based applications can be a limiting factor [71]. In [68], the study cannot be considered conclusive because several details were left out. Firstly, the authors [68] did not specify the value of K (number of nearest neighbors) used, and there was no study on the effects of K on the classification accuracy. Secondly, the study did not include hard braking and speeding.

Chaovalit et al. [72] also proposed a template matching system similar to [65]–[68]. As opposed to using DTW, Chaovalit et al. [72] proposed to use Symbolic Aggregate Approximation (SAX). In their system, the accelerometer was used for considering movement along the lateral and longitudinal axes that correspond to turning, lane changing, braking, and acceleration. The raw data from the magnetometer was used as an indicator for the detection of driving events in the lateral domain. Finally, the GPS data was used to provide the location and speed data of the vehicle. In the paper, it was unclear how representative samples of different aggressive driving behavior were collected. It was only mentioned that samples representing different aggressive driving patterns (sudden acceleration, sudden brake, and other aggressive driving behaviors) were available, and such data were manually validated [72]. The idea of SAX was to convert the data into a string. Before the string conversion, piecewise aggregation was performed where samples during a time window were averaged. The primary motivation of SAX was to reduce computational cost [72]. During a drive, streaming data was collected and converted to a string using SAX and compared with the converted representative data. The similarity is measured in terms of distance and is calculated using min-dist [73].

If the similarity passed a pre-defined threshold, the corresponding driving behavior could be identified from the streaming data. This system is easy to use and implement with a lower computational cost as compared to the system proposed by Eren et al. [66]. However, by using SAX, there is a possibility that temporal information is averaged out in the time window, which may affect accuracy. Such a system is also not applicable for real-life deployment as the detection rate of aggressive events is only 25%. In addition, it also requires the smartphone to be in a fixed position.

C. Fuzzy Logic Methodology

Castignani et al. [24] proposed a fuzzy logic mechanism, SenseFleet, to detect risky driving events that include hard acceleration, hard braking, aggressive steering, and over speeding. Such a mechanism consisted of a calibration phase to make it adaptive and was proposed with the intention to overcome the limitation of hard thresholding to detect different abnormal driving behavior. The system utilized an accelerometer, gravity sensor, and magnetic sensor as well as GPS to compute the degree of jerk, orientation rate, speed variation and bearing variation for abnormal driving detection. The internal linear accelerometer was used to compute the rate of change of acceleration with respect to time, which constitutes a jerk. In order to compute the vehicle's direction variation, magnetic and gravity sensors were used to compute the orientation vector. Such a vector consists of yaw, roll, and pitch that characterize rotation around the different axes. All sensor data were fused with GPS data to reduce noise due to electromagnetic interference and device vibration. In addition, the system also considered speed limits for each particular road to detect speeding events. Events were also combined with weather information and time of day to better determine the riskiness of events and score the driver. The outputs were then fed to a fuzzy system with a set of fuzzy rules. Each rule evaluates a combination of different possible fuzzy values of input variables and outputs a type of event (hard acceleration, hard braking, aggressive steering or over speeding). However, as different smartphones and vehicles have different components and characteristics, a newly set up system requires calibration. After calibrating, the system dynamically adjusts the fuzzy sets for the variables. As explained by the authors [24], calibration was done only to adjust the jerk and yaw rate since speed variation and bearing variation can be fixed regardless of device or vehicle. Based on such a system, the paper [24] reported a detection rate of at least 90% if there are at least 1500 calibration samples.

The main drawback of this system is the calibration samples required. It can lead to further complications if non-representative calibration samples are collected [29]. Usage of non-representative calibration samples can affect the system reliability as each set of rules can only be applied to a single driver. Although several solutions were provided, they possessed several limitations. Castignani et al. [24] suggested that the system can dynamically stop the calibration phase and decide if the information is representative or not. However, it is unknown how it can affect the system detection accuracy if the system stops calibration with only a small set of data collected. The second solution provided is to perform a periodic update of the fuzzy rules. However, it would imply the system can be inaccurate for the first few drives if the first calibration samples collected are non-representative of the driver's behavior. The third solution is to combine information from different drivers and consolidate the information to give a universal set of rules. However, this solution will require a large amount of data to provide a universal set of rules. Fuzzy logic also requires the setting of various thresholds, which is sensitive to noise [74]. Finally, such a system also requires the position of the smartphone to be fixed.

Eftekhari and Ghatee [75] proposed a hybrid of Discrete Wavelet Transformation (DWT) and Adaptive Neuro-Fuzzy Inference System (ANFIS) to recognize overall driving behaviors. Four features relating to angular velocity, lateral acceleration, longitudinal acceleration, and angle variation were used in this system. Before calculating these four features, a two-level
decomposition was applied on the time series data using DWT. The features are then used as the input of ANFIS. Six ANFIS were trained to measure the similarity and dissimilarity between the collected data concerning safe, semi-aggressive, and aggressive driving patterns. Three decision-making mechanisms were proposed. Among the three mechanisms, hierarchical ordering gave the highest classification accuracy compared to majority voting and ratio of similarity over dissimilarity. The system was evaluated on twenty drivers who were asked to complete a questionnaire to determine if they are a safe, semi-aggressive or aggressive driver. Around eight minutes of driving data were collected for each of them with a 2 Hz sampling rate, resulting in a total of 960 samples, where 70% were used for training and 30% for testing. Based on such implementation, Eftekhari and Ghatee [75] reported an accuracy of 92%. However, it may not be accurate to classify the driver based on a questionnaire. Moreover, the system was only validated on a small number of samples. In addition, such a system cannot be applied on a smartphone as high computational power is needed to run a large number of algorithms (DWT, 6 ANFIS, hierarchical ordering). The position of the smartphone has to be fixed such that the coordinate systems of the smartphone and the vehicle are aligned.

Subsequently, Eftekhari and Ghatee [76] proposed a new system making use of fuzzy logic in combination with a neural network for aggressive driving detection utilizing the smartphone's gyroscope, accelerometer and magnetometer. This system evaluated the driver's driving behavior by determining the similarity between each driver's behavior and safe and aggressive driving behavior. The system consists of four stages that analyzed segments of sensor data where each segment consisted of 3s data with six samples and 50% overlap (half of the data used belong to the previous segment). In the first stage, the energy of a segment, $F_{Gyr}$, was calculated via the sum of squared values collected by the gyroscope in the z-axis during the 6s. A threshold was set based on this energy value, and any exceedance will indicate a maneuvering action including lane changes, turns and U-turns. In the second stage, the maneuver type (lane change, turn or U-turn) was determined using the peak of $F_{Gyr}$, angular variation and duration of each maneuver as inputs to a single hidden layer backpropagation NN with 10 neurons. Eftekhari and Ghatee [76] proposed using 500 training inputs and reported a detection accuracy of 89%. In the third stage, the detected maneuver was assigned a fuzzy number. This number was calculated based on the peak of the variance of the squared values measured by the accelerometer in the lateral axis. In the final stage, the fuzzy number calculated in the previous stage was used to evaluate the driving behavior. The membership function was first determined by considering 750 maneuvers collected from three aggressive and three safe drivers.

The similarity between the calculated fuzzy number and the fuzzy number of safe and unsafe driving was then tabulated using the max-min composition. Finally, scores for the maneuver were computed to give a safe and an aggressive driving score. These two scores with the driver anger level were then used to classify the driving behavior of the driver. If the safe driving score is lower than the aggressive driving score and if the driver has an anger level of above 51, then the driver is considered aggressive. The system was tested on twenty drivers with different smartphones and vehicles. As Eftekhari and Ghatee [76] reported the results in the form of scores for safe and unsafe driving with the classification of behavior, there is no detection rate available.

The limitation of the system of Eftekhari and Ghatee [76] can be said to be similar to [24] as the samples collected from a small number of drivers cannot be used to represent all the drivers. Hence, they have a high rate of false alarm or missed detection. In addition, maneuvers considered in this system were only lane changes, turns and U-turns. Furthermore, there are instances of drivers whose behavior cannot be classified using such a system as their safe driving score was the same as their aggressive driving score. The anger score tabulated by the questionnaire is not representative of the anger level throughout the journey. Assuming the questionnaire was done before the drive, the driver may be calm, but due to an unforeseen event, the anger level may rise during the drive, leading to system inaccuracy. Moreover, the position of the smartphone has to be fixed such that the coordinate systems of the smartphone and the vehicle are aligned.

D. Non-Ensembled Learning Methodology

To address the limitation of using non-representative calibration samples, Castignani et al. [70] extended their work by proposing a Multivariate Normal Model (MVN) to detect risky driving maneuvers and profile drivers as well as including a periodic update of the system with the collection of new samples. The largest advantage of such a system is that it does not require any labeled data for detection as it uses the probability of an observation, which allows the assessment of driver behavior in a continuous fashion.

Data collected by the accelerometer was again used to calculate the jerk, and the data measured by the magnetometer was used to calculate the average yaw rate and the angular velocity. GPS was used in this system for measuring the speed, speed variation as well as the bearing variation. Based on empirical results, Castignani et al. [70] suggested that a feature set comprising only 1) the product of speed variation and standard deviation of jerk and 2) the product of bearing variation and average yaw rate is sufficient for the task, but a feature set comprising only 1) the product of speed, speed variation and standard deviation of jerk and 2) the product of speed, bearing variation and average yaw rate provides fewer false positives and does not produce any high severity during calm driving. The feature set used will then generate the first MVN model using Maximum Likelihood Estimator (MLE). An MVN can be defined by the multivariate Gaussian probability density function given as

$$p(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^{T}\,\Sigma^{-1}\,(x-\mu)\right) \tag{3}$$
where p(x|μ, Σ) is the multivariate Gaussian probability density function, x is the vector of features considered, d represents the dimension of the feature space and θ = (μ, Σ) represents the parameters of the distribution. In this system, θ was dynamically estimated. Using the probability density function with the estimated θ, the likelihood of an observation can be calculated and anomalies can be identified. After the training phase, the system can periodically update the model's parameters by considering both the new samples and the old samples, where Castignani et al. [70] suggested using 2900 old samples and 1200 new samples based on their evaluation. To prevent adding samples of low quality (non-representative samples), Castignani et al. [70] added a criterion that calculates the ratio of the determinant of the new samples against the determinant of the old samples. If it does not exceed the pre-defined threshold, then the system does not perform the update. Similar to SenseFleet [24], once the system detects an anomaly, it deducts from the score of the driver, taking the weather, daylight and environmental information into consideration. It was reported that such a system was able to detect 600,000 risky maneuvers from 4800 drivers using iPhone 6 and Samsung Galaxy S5 in a large-scale implementation. The paper included an evaluation of the average score, but it did not include a comparative study between ground truth and detected anomalies. Therefore, details such as the detection rate, false positives or false negatives were excluded.
As mentioned earlier, Castignani et al. [70] suggested using 2900 old samples and 1200 new samples for a system update, and this criterion was based on the highest recall value of the detection rate, which was approximately 0.7. Since recall is defined as the ratio of true positives over the sum of true positives and false negatives, a recall of about 0.7 means that roughly 30% of actual anomalies go undetected, so there may be a need to increase the recall even though such a system is robust against different car and phone combinations. Similar to their earlier work [24], the position of the smartphone has to be fixed.
Yu et al. [21] utilized data from the accelerometer and orientation sensor on a smartphone together with a Support Vector Machine (SVM) and NN to detect abnormal driving behaviors such as weaving, swerving, side slipping, fast U-turn, turning with a wide radius and sudden braking. The proposed system consists of two phases, namely an offline phase and an online phase. The offline phase consists of data collection, data processing, feature extraction, and finally, model training. During the data collection stage, the coordinate systems of the vehicle and the smartphone were aligned by aligning the accelerometer's y-axis along the moving direction of the vehicle. To remove the noise present in the collected data, a low pass filter known as the Dynamic Exponential Smoothing Filter (DESF) was used to preserve effective features of different driving behaviors. Based on the processed data, sixteen basic features were extracted to differentiate the different driving behaviors. To further improve the distinction between the different types of driving behavior, the authors [21] suggested using another 136 features. Sixteen of them were the squares of the basic features, and the rest were the products of any two different features. However, SVM only utilized the 16 basic features for training while NN used all
152 features. Based on such an implementation, SVM obtained an accuracy of 95.4%, and NN obtained an accuracy of 96.9%. As different features were used by the two classifiers, a direct comparison between them can be quite biased.
Although it is unclear how well SVM will perform with 152 features, it can be concluded that it may not be worthwhile to increase the computational load for a small increment in accuracy. While sudden braking is indeed an abnormal driving behavior, a driver may brake suddenly to avoid a potential collision ahead where the fault is not on the driver. Therefore, there is a need to consider the reason behind sudden braking to prevent any bias/demerit being given to the driver when the fault is not on the driver. As explained by Yu et al. [21], side slipping occurs when a driver is driving in a straight line but deviates from the normal driving direction, as seen in Fig. 2. However, such behavior is very similar to a lane change, and Yu et al. [21] did not include a detailed explanation of how the system may distinguish between the two scenarios. It was only mentioned that data were collected in a real driving environment with normal driving behavior, with few and small fluctuations in sensors' readings due to abnormal driving. Therefore, it is unclear whether any lane change was conducted during the data collection phase and whether it resulted in a false alarm by the system. Finally, the model was trained outside of the smartphone environment and the phone was required to be mounted in a fixed position.
Ahmed et al. [77] proposed a Random Forest (RF) to classify between undistracted and distracted drivers. Distractions can be due to picking up a phone to call, text or read, and Ahmed et al. [77] found that when drivers picked up their phone to perform any of these actions, it caused subtle changes in the accelerometer and gyroscope measurements. A total of 28 features were calculated, and 6 were selected using a filter-based approach based on linear correlation and information gain. These features were then fed into an RF as inputs for driving behavior classification. Based on such a system, the authors [77] reported an accuracy of 96%. While this method is accurate in detecting if a driver has picked up his phone, it has poor accuracy in classifying whether the driver is reading, calling or texting on the phone. Ahmed et al. [77] suggested that this was due to people having different habits when they read, call or text using a phone. This system only considered distraction caused by a driver picking up the phone to perform certain actions, and it is unclear if such a system can detect driver distraction if he did not pick up his phone. The method was also tested on a younger group of participants, and results may differ for other age groups, particularly for older drivers. Furthermore, such a method is also not useful in detecting distraction caused by other scenarios such as eating or falling asleep.
Bejani and Ghatee [78] proposed the use of a Convolutional Neural Network (CNN) for feature extraction and classification of the driving style. As CNN suffers from the drawback of overfitting, the original algorithm was modified with an adaptive regularization function, which the authors [78] called CNNAR. The idea of adaptive regularization is to achieve an adaptive dropout and weight decay. To enable adaptive dropout, Bejani and Ghatee [78] keep track of the ratio of mean square error on
the validation set over mean square error on the training set. Based on this ratio and two predefined thresholds, the keeping probability is updated adaptively and passed to the dropout layer. In weight decay, a controlling term is added to the error function of CNN.
$$\mathit{Error}_{new} = \mathit{MSE}_{Train} + a\,\|w\|_{F}^{2} \tag{4}$$
where a is the scaling parameter and ‖w‖_F is the Frobenius norm of the synaptic weights. Based on the pre-defined threshold and the ratio of mean square error on the validation set over mean square error on the training set, the scaling parameter is updated adaptively. Two CNNAR were trained, one for feature extraction and one for driving style classification. The proposed system was able to achieve 95% classification accuracy between a dangerous driver and a safe driver. CNN is a computationally expensive algorithm and would take a substantial amount of time to train the model as compared to other machine learning algorithms. The proposed methodology was only applied on data collected previously, and it is also unclear if such a system can actually be implemented in a smartphone.
On the other hand, Carvalho et al. [79] performed driver behavior profiling using three different types of Recurrent Neural Network (RNN), namely a simple RNN, Gated Recurrent Unit (GRU) and LSTM. Their study was based on using only the accelerometer readings and aimed to classify between aggressive braking, aggressive acceleration, aggressive left turn, aggressive right turn, aggressive left lane change, aggressive right lane change and non-aggressive event. To ensure both the smartphone and vehicle's coordinate systems are aligned, the smartphone was fixed on the car's windshield using a car mount and a rotation matrix. Data were then collected from 4 car trips with a total driving duration of approximately an hour, which provided a total of 69 events. The authors [79] then used 70% of the data as training data and 30% of the data as testing data. Based on their results, GRU and LSTM were the best performing RNNs, achieving an accuracy of above 95%. However, the accuracy of LSTM was easily affected by the number of neurons.
Even though the results were good, only 69 events from two experienced drivers were considered. In addition, two of the categories (aggressive left and right lane change) were imbalanced as compared to other categories. Moreover, data were collected during sunny weather on a dry, asphalt-paved road. It is unknown how wet weather conditions will affect the accuracy of the classifiers. Furthermore, the position of the smartphone must be fixed throughout the trip, which may be affected by road anomalies such as potholes or speed bumps. Finally, model training and validation were conducted offline, so it is unclear whether model training can work in a smartphone environment and whether the good performance is consistent for streaming data.
E. Ensembled Learning Methodology
To improve the detection accuracy, Bejani and Ghatee [80] proposed the use of ensemble learning. To allow reliable data collection, the smartphone must be placed in a specific position followed by a calibration process to align the smartphone axes
with the car axes. The t-SNE algorithm was then applied on raw data to extract features relevant to different labels. The combination of C4.5 (a type of decision tree) and Radial Basis Function (RBF) was applied to classify the type of maneuver (U-turn, turn or lane change) using a set of piecewise aggregated magnetometer data (different windows of samples are averaged) as the input. A Multi-Layer Perceptron (MLP) was then applied to determine the traffic situation (congested or not congested) based on the speed derived from GPS and the zero crossing rate of the data collected by the y-axis of the accelerometer. MLP was again applied to classify the type of car into high and low sensitivity cars by considering the zero crossing rate, variance, minimum and maximum of the data collected by the z-axis of the accelerometer. The fusion of K-nearest neighbor (KNN), MLP and SVM with RBF was then applied to determine if the maneuver was safe or dangerous under different traffic conditions. The results of the three algorithms were fused with a voting mechanism to produce the initial evaluation. Based on the output of the algorithm, a Fuzzy Inference System (FIS) was applied to determine the risk level of the driver. The result was then passed to another FIS to provide suggestions if the maneuver was deemed dangerous. Based on this implementation, the authors [80] reported an accuracy of 94%.
The main drawback of this system is the large number of algorithms, which makes it complicated and difficult to implement in a real system. Much work is involved in retraining all the models, as the system cannot be used in different cities due to different traffic conditions. In addition, such a methodology also requires a system with a high level of computational power to run all the algorithms. Moreover, all training procedures were performed in a MATLAB environment, which prompts the question of how a user can update the system. Also, this system only takes into account turns, U-turns and lane changes. Besides, the ground truth was based on expert scoring of driving style, which is subject to the subjective judgments of other drivers or experts [81]. Furthermore, the use of piecewise aggregation to smooth the magnetometer data may cause some temporal information to be averaged out. Finally, the smartphone must also be placed in a specific position for proper calibration.
Xie et al. [82] also proposed a methodology utilizing ensemble learning using data collected from the accelerometer, gyroscope, and GPS. In their system, data collected were first filtered by a moving median and a moving mean filter for noise removal. Feature extraction was then performed on the filtered data to form a feature vector of dimension 273, which was z-score normalized. Ensemble learning using KNN, Logistic Regression (LR), Naïve Bayes (NB) and RF was then applied, where the output of each classifier was fused to give the final output. Based on such a system, Xie et al. [82] achieved a weighted F1-score of 87%. However, this system was tested on only one case of driver distraction, namely talking inside the car. An experienced driver is less likely to be affected by talking to a passenger. In addition, the dimension of the feature vector is very large, which requires a high computational effort. Finally, such a methodology was not tested in a smartphone environment.
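As an illustration of the voting-based fusion used by the ensemble systems discussed in this subsection, the following minimal Python sketch fuses the labels produced by several base classifiers through simple majority voting. It is a sketch under stated assumptions rather than the implementation of [80] or [82]: the function name `majority_vote`, the label strings and the tie-breaking rule are hypothetical choices made here, since the reviewed papers do not specify them.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Fuse per-classifier label sequences by simple majority voting.

    predictions[i][j] is the label that classifier i assigned to sample j.
    Ties are broken in favour of the earliest classifier in the list,
    which is an arbitrary choice made for this sketch.
    """
    if not predictions:
        return []
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must label the same samples")

    fused = []
    for j in range(n_samples):
        votes = [p[j] for p in predictions]
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first label, in classifier order, that reaches the top count.
        fused.append(next(label for label in votes if counts[label] == top))
    return fused


# Hypothetical outputs of three base classifiers (e.g. KNN, MLP, SVM)
# for three maneuvers.
knn = ["safe", "dangerous", "safe"]
mlp = ["safe", "safe", "dangerous"]
svm = ["dangerous", "dangerous", "dangerous"]
fused_labels = majority_vote([knn, mlp, svm])
```

With an odd number of binary classifiers, as in this usage example, no ties can occur; the tie-breaking rule only matters for an even number of classifiers or for more than two classes.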
Guo et al. [83] proposed a novel algorithm combining Autoencoder and Self-Organizing Map (AESOM) for driving behavior clustering using only the GPS sensor in a smartphone. As GPS data may be affected by noise or outliers, software embedded within the GPS receiver automatically filters the data with a proprietary filtering algorithm. Based on the collected GPS data, a total of 10 input features related to speed and acceleration were estimated. Features were then z-score normalized and fed into the autoencoder for extraction of latent features, which were then passed to the SOM for clustering of different driving behaviors. In their paper, Guo et al. [83] did not report any detection accuracy but showed that the loss value of the autoencoder is between 0.03 and 0.08, together with the clustering results for the different types of driving behavior, such as clustering acceleration behavior into slight, moderate and heavy. As a result, there is no way to determine the accuracy of the algorithm. Moreover, validation of detected abnormal driving behavior against the ground truth was not performed. As such, this methodology lacks reliability. The algorithm was also not applied in real time. Instead, it was applied on data collected over a period of time. It is also unclear how newly collected data is related to the clustering results and how it is clustered into one category. Therefore, such an algorithm lacks practicality, as it can only be applied to a group of data for post-event analysis and cannot be used for online behavior analysis. Finally, Guo et al. [83] did not provide any details on how the ground truth (abnormal driving) was obtained.
Daptardar et al. [84] proposed a Hidden Markov Model (HMM) based driving event detection methodology. Prior to data collection, the smartphone coordinate system must be aligned with the vehicle coordinate system such that the y-axis of the accelerometer is aligned with the moving direction of the vehicle. Data collected were then filtered with a low pass filter and a Kalman filter to remove noise due to road anomalies. Data collected from the z-axis of the gyroscope were leveraged to detect the lateral driving maneuvers (lane changes and turns). The data collected from the y-axis of the accelerometer were leveraged to detect the longitudinal driving maneuvers (hard accelerating and hard braking). Gyroscope data were linearly fitted to reduce dimensionality, and the slopes of the linearly fitted data were calculated as the input to the HMM, which detects the occurrences of events. If input slopes are high, represented by high gyroscope values followed or preceded by low slopes, then an event had taken place. The output of the HMM was then passed to an RF classifier to categorize the event as a lane change or turn. On the other hand, a severity index was used to detect hard accelerating and hard braking. First, the data from the y-axis of the accelerometer was used to calculate the jerk energy, defined as the rate of change of acceleration. The severity index was then calculated using the jerk energy, given as
$$\mathit{Index} = \frac{RoV \times (JE_{Max} - JE_{Min})}{100} \tag{5}$$
where RoV is the rate of change of velocity, and JE_Max and JE_Min are the maximum and minimum jerk energy in the event window. Samples of a particular window which produce
value above 50 will be classified as aggressive events. However, it was not mentioned how this value could be classified into hard accelerating or hard braking, but it was reported that such a system achieved a detection accuracy of 95%.
In this methodology, Daptardar et al. [84] predefined 4 states for the HMM considering slopes and acceleration, but it is highly difficult to generalize the states as the two parameters are bound to change on different kinds of roads as well as at different locations [85]. Moreover, the hard threshold of the severity index also limits the system to a specific location with a specific phone and vehicle. This is because it is unclear if the same threshold can be used for different phones and cars, since the sensitivity of the accelerometer and the characteristics of cars are different. While the detection accuracy is high, the system can only detect hard acceleration and hard braking. Lastly, the smartphone must be placed in a specific position where its coordinate system is aligned with the vehicle's coordinate system.
F. Unsupervised Methodology
On the other hand, Yang and Hansen [86] proposed the use of an unsupervised clustering approach to assess driving behavior using data retrieved from in-vehicle smartphone Inertial Measurement Unit (IMU) and GPS sensors. Their methodology was specifically designed to grade the performance of 5 maneuvers: left turn, right turn, gas hit, brake hit, and normal driving. For this methodology, the data collected were first filtered by a median filter to remove noise, followed by resampling of the data to a uniform rate using linear interpolation. A coordinate transformation was then performed to convert the smartphone-centered X, Y, Z accelerations to vehicle-centered accelerations. Global-referenced GPS bearing directions were also converted to vehicle-referenced moving angles. Subsequently, an alignment step was done to assign the axes by referring to gravity and GPS information. To identify the event start and end, GPS data was used. As suggested by Yang and Hansen [86], peaks and troughs of the GPS speed derivative indicated the occurrences of gas-hit/brake-hit events, while the unwrapped GPS bearing derivative indicated a right or left turn. A total of 7 features (mean, variance, mean crossing rate, peak frequency, spectral energy, entropy, and correlation) were extracted from each axis of the accelerometer. Based on the features, a clustering scheme was applied iteratively three times, where each of them was labeled with a grade (A to D). It is assumed that for an experienced driver, the majority of the maneuvers will have grade A due to safe driving. The results were only compared with a grading provided by the car insurance. Therefore, results were inconclusive as they were not verified against the ground truth. In addition, the smartphone was only tested in one location and may not be free-positioned as suggested by the authors [86].
G. Combination of Visual and Non-Visual Methodology
On the other hand, Bergasa et al. [20] proposed an application called DriveSafe that used computer vision and pattern recognition techniques to detect abnormal driving due to drowsy and inattentive drivers. As explained by
Bergasa et al. [20], lane weaving and drifting can be used to infer drowsiness. On the other hand, sudden longitudinal and transversal movements can be used to infer distractions. Since drowsiness is inferred from lane weaving and drifting, Bergasa et al. [20] proposed to use the rear camera of a smartphone to capture the image of the road ahead. The captured image is pre-processed by converting it into greyscale and resizing it to 320x240 pixels. An adaptive Canny algorithm and Hough transform were used to capture the lane markings on the processed image, followed by Kalman filtering to track the lane. Lane drifting detection was based on Lanex (fraction of lane exit), which is defined as the fraction of a given time interval spent outside a virtual driving lane. If Lanex exceeds 80%, a lane drifting has occurred. The microphone was used to detect the switching on of the blinking indicator by tracking the sound generated when a driver turns on the blinking indicator. Therefore, if a turn or lane change is made without switching on the blinking indicator, then an involuntary lane change has occurred. As explained by Bergasa et al. [20], actions such as acceleration, braking and turning are highly correlated to the data collected by the accelerometer. Therefore, by simply checking for violations of the thresholds set by the authors [20], one can detect the critical events (sudden acceleration, braking, and turning). However, as measurements from the accelerometer were noisy, the collected data had to be cleaned using a Kalman filter. The GPS was used to estimate the vehicle speed and road curvature, and both values were used to estimate the centripetal acceleration of the vehicle due to the road. This value was used to decouple the lateral acceleration due to the road curvature from the one caused by wrong driver movements. Through this system, the authors [20] reported a precision value of 0.74 to 0.93. As the details of the false negatives are not shown, the detection accuracy cannot be calculated.
There are several shortcomings to this application. The application cannot detect abnormal driving behavior that occurs at speeds below 50km/h, as the application will only be activated when the vehicle exceeds the speed of 50km/h. Using only the sound of switching on the blinking indicator is also unreliable, as there is a possibility of misdetection for safe drivers. Although the authors [20] mentioned that detection of such sound is out of their scope, the main challenge of such sound detection is that it has to be carried out in a noisy environment if the radio is turned on or when a passenger is talking. Threshold values were also simply taken from [87], as Paefgen et al. [87] also used an iPhone in their studies. It is unclear if the same threshold can be used for another type/brand of smartphone, as the threshold used cannot be considered an industrial standard [88] since it varies for different drivers and phones. As the system relies on image processing techniques to detect lane markings, it will be a challenge for the system to detect fading lane markings. In addition, the error rate will likely increase in the event of challenging light conditions due to rainy days or at night. Moreover, the position of the phone must be fixed throughout the journey.
Jiang et al. [89] also proposed a system utilizing a camera that analyzes driver behavior at an intersection or junction
(at a stop sign and traffic light). This system was proposed to identify drivers who dash across red traffic lights as well as those making unsafe turns. Measurements from the accelerometer, gyroscope and GPS were fused with the image captured by the rear camera of a smartphone to form a vision-based driving event detection system. Prior to the drive, the phone was first mounted with its coordinate system aligned with the vehicle coordinate system by aligning the z-axis of the smartphone with the moving direction (y-axis) of the vehicle. The rear camera of the smartphone must be facing the road without any obstruction. The system must also be equipped with a prior map containing location information of the traffic lights. The light or sign detector using the smartphone rear camera will only be activated when the distance between the car and the traffic light or sign is within 30m, and this distance is estimated using the Haversine formula. To extract the shape of the traffic light or stop sign from the captured image, Jiang et al. [89] applied the HSV color model. To take weather into consideration, different color ranges are used for a stop sign and traffic light during different weather conditions. To differentiate between a stop sign and a traffic light, the detected shape is compared with the shape of the traffic light and stop sign. As explained by the authors [89], the traffic light has the shape of a circle while the stop sign has an octagon shape. In order to increase the processing speed, noisy images caused by vibrations were filtered based on the threshold value set on the x-axis of the accelerometer. The filtered image was then downsampled to a smaller size by using a Gaussian pyramid. Finally, to detect unsafe turns, the authors [89] proposed to first detect a turn by tracking the measurement from the gyroscope x-axis; lane markings are then detected using the Canny algorithm and Hough transform. A turn is detected if the threshold of 0.5 rad/s is crossed, and a drifting event is then detected if the midpoint of the lane marker crosses the centreline of the lane marker detection window.
The precision value of such a system to detect stop sign, red light, stop sign driving event, red light driving event and lane drifting event was found to be approximately 0.8. However, the overall detection accuracy was not reported.
As the system is a vision-based system, detection error will increase in the event of challenging light conditions due to rainy days or at night. Colored roadside objects such as vehicles and buildings also add to the false positive detection of red lights and stop signs. On the other hand, fading lane markers will affect the detection accuracy of the system. The system is also not robust against roads with a turn lane. If a driver drifts into the turn lane and makes a turn at an intersection without stopping when the traffic light is green, such an action is detected as a lane drifting event.
In addition, the yellow light is not considered in their work, which may result in several false positives. In the event of a yellow light, driving behavior at intersections becomes complicated. While drivers should stop the vehicle in advance, it may not always be the best choice in every situation. If the vehicle is close to the stop line at high speed, then it is much safer to pass the intersection in this situation [90].
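The 30m activation rule that Jiang et al. [89] base on the Haversine formula can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions and not the authors' code: the function names `haversine_m` and `detector_should_run`, the mean Earth radius constant and the example coordinates are all hypothetical choices made here.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed constant)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def detector_should_run(vehicle, landmark, radius_m: float = 30.0) -> bool:
    """Return True when the vehicle is within radius_m of a mapped light or sign.

    vehicle and landmark are (latitude, longitude) tuples in degrees; the
    landmark would come from the prior map described in the text.
    """
    return haversine_m(*vehicle, *landmark) <= radius_m


# Hypothetical coordinates: the landmark lies about 22 m north of the vehicle.
vehicle_pos = (1.3000, 103.8000)
traffic_light = (1.3002, 103.8000)
near = detector_should_run(vehicle_pos, traffic_light)
```

Gating the camera-based detector on a cheap distance check of this kind keeps image processing switched off for most of the trip, which is consistent with the stated aim of reducing the processing load on the smartphone.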
Furthermore, if the prior map installed on the smartphone is not updated with the latest information on stop signs and traffic lights, then the system will not be activated when approaching an unmarked intersection. The threshold used to detect a turn may not be a value that can be used universally. Finally, the position of the smartphone must be fixed throughout the journey.
The methodology of Saleh et al. [91] also included the use of the camera, but they utilized a Long Short Term Memory (LSTM) for classification of driving behavior. The motivation was to eliminate the manual handcrafting of features, as LSTM can exploit the inherent dependency and correlation between different sensor data captured at each time step of real driving sessions. For their methodology, Saleh et al. [91] proposed to use measurements collected by the accelerometer, GPS and the rear camera of the smartphone. As data were collected with different sampling rates, the data with the lowest sampling rate were up-sampled using a Finite Impulse Response (FIR) filter to match the highest frequency used for data collection. This was followed by z-score normalization to produce data of mean 0 with standard deviation 1. A rolling window of size 64 over all features with an overlap of 50% was then applied to obtain 9499 window samples, of which 70% were used for training and the rest for validation. Although the methodology was not tested in a real-life application, it was applied on the UAH-DriveSet dataset, which was introduced for driving behavior analysis, and the authors [91] reported an F1 score of 0.91. It is unclear if training of the model can be carried out in the smartphone environment. Otherwise, the model must be trained outside a smartphone environment and the network parameters transferred to the phone, which may be inconvenient. As hyperparameters are virtually independent and should be optimized separately [92], it is also unclear if model training can be carried out automatically for a new driver. In addition, a large amount of training data was also used for such a system.
H. Sound Methodology
Finally, Xu et al. [93] proposed an interesting system for early recognition of inattentive driving using a smartphone microphone. The idea is the same as [58], which is to utilize the Doppler shift of audio signals caused by human action. As explained by the authors [93], such an application is suitable for identifying the four representative inattentive driving behaviors (fetching forward, picking up drops, turning back and eating and drinking) because they have unique Doppler profiles. Data for different inattentive driving behaviors were collected by using a smartphone as a transmitter and receiver device.
Audio signals were transmitted at 20 kHz to filter out background noise and the effects of people. Collected signals were then transformed into the frequency domain using Sliding window Overlap Fourier Transform (SOFT), and features were extracted using PCA. These features were used to train a binary SVM for every pair of inattentive driving events. A voting mechanism was then provided to score each identified event. The number of votes will determine the category of driving
behavior. If the number of votes does not satisfy the condition of the number of classifiers minus 1, then the event is classified as an other driving event and requires further processing. Such driving events are considered to belong to inattentive driving but are hard-to-detect, atypical inattentive driving events. Therefore, Xu et al. [93] proposed to use the Self-Organizing Map to separate such events from normal driving. For early recognition of inattentive driving, Xu et al. [93] suggested using the degree of completion of the inattentive behavior at time t, given by the following equation.
$$A = \frac{t - t_{0}}{t_{1} - t_{0}} = \frac{\tau}{T} \tag{6}$$
where t1 is the end of the action and t0 is the start of the action based on historical data. A group of gradient model forest was set up for early detection of events. To take into account false alarms caused by road bumps and car stops, Xu et al. [93] also included the use of an accelerometer to detect road bumps and car stops. In the online phase, signals were collected and transformed into Doppler profiles in a similar fashion. Based on the trained classifier, the system identifies whether the collected signals correspond to inattentive driving behavior. The proposed system achieved an accuracy of 94%.
Although the system is comprehensive, much work is required for the training and tuning of several different algorithms, which include SVM, SOM and gradient model forest. Secondly, the degree of completion of inattentive driving behavior is only based on 8 drivers, which may not be comprehensive enough because different drivers have different reaction times in different scenarios. Furthermore, while the authors consider atypical inattentive driving events, they did not study the effect of atypical attentive driving events, for example, an attentive driver chewing gum or a passenger who is eating. Therefore, such events may cause a false alarm by the system.
I. Summary
Section IV presents the discussion on different methodologies proposed by different authors. The summary of the proposed methodologies and their limitations can be found in Tables 5 and 6. Tables 5 and 6 contain information such as the smartphone sensor used, pre/post-processing procedure, algorithm used, features extraction and calculation, drowsiness detection accuracy and the limitations. In the pre/post-processing column, a rough idea of how the processing is done is provided, and in the algorithm used column, how the algorithm is used is also indicated, for example: Median Filter for noise removal. Features extracted and calculated are all features listed in the authors' paper. They are features that are used in the system and not just features for the final classifier.
In short, driving behavior classification can be broadly classified into two categories: 1) the classification of individual driving events and 2) the detection of aggressive driving events over a long window and extending detection to an overall driver risk assessment [29], [30]. However, the key question is, how can the classification of individual driving events or overall driver risk assessment bring value-added features to the driver? Therefore, if a system is classifying individual
TABLE V
DETAILS ON PROPOSED METHODOLOGIES FOR SMARTPHONE-BASED DRIVING EVALUATION
TABLE VI
LIMITATIONS ON DIFFERENT PROPOSED METHODOLOGIES
driving events, it may be more worthwhile to consider how a driver can be warned of his dangerous behavior and how such information can be disseminated to fellow road users or the authorities. Whereas, for a system that provides overall driver risk assessment, it may be more attractive for the driver if recommendations can be given on how to improve their driving [29].
Among the systems proposed, different systems have different requirements. Some systems have an active and a passive mode, thus requiring the user to manually activate the evaluation process [65]. Other systems can automatically start the evaluation process when the smartphone is located on the dashboard and automatically stop it when the location of the smartphone changes [76]. Some systems may require the smartphone to be mounted in a fixed position [59], [65]–[68] while others do not but require an additional reorientation process [86].
V. CHALLENGES
Based on the methodologies proposed earlier, different types of challenges faced when implementing smartphone driver behavior analysis are consolidated and discussed in detail in this section.
A. Visual Detection Methods for Drowsiness Detection
As discussed in Section III-A, drowsiness is commonly detected through facial expression by using the smartphone camera. However, there are several limitations for such methodology as seen in Table 7. Firstly, such methodology is easily affected by poor illumination. As such, several authors [48], [31], [94], [95] had proposed the use of near InfraRed (IR) illumination to provide illumination for image capturing under poor lighting conditions. The use of IR illumination is a simple and effective approach for pupil detection based on a differential infrared lighting scheme. The high contrast between the pupils and the rest of the face can significantly improve the eye-tracking robustness and accuracy [96]. Moreover, such implementation can help in the detection of eye state even if the driver is wearing sunglasses [96], [97]. Clinical research has also found that near IR does not induce neurodegeneration [98]. Therefore, the use of such light is safe for drivers without any long-term detrimental effects.
However, there are several limiting factors for this implementation. For such implementation to be effective, lighting conditions must be relatively stable, and drivers must be close to the light source [94], [96]. In order to overcome the influence of light on the image, Zhang et al. [97] proposed active IR light of 850 nm combined with an 850 nm narrow-band filter to get the images under the infrared illumination. However, cameras in a smartphone may not produce quality images in the IR spectrum. In addition, an 850 nm emitter that has a red glow can be distracting to drivers because the human eye is more sensitive to this wavelength [99].
The second limitation of visual methodology for drowsiness detection is possible false positives or false negatives in several scenarios. The use of PERCLOS may fail in the event when the driver sleeps with his eyes open [15] and also in the event when the driver is in a state of daze. As such, PERCLOS can produce a false negative for these two conditions. Also, false positives can be produced in cases where drivers close their eyes for some time in an attempt to moisten their eyes [100]. In addition, tracking of mouth contours to detect yawns may produce several false positives due to the driver talking or singing [15].
Finally, visual detection methodology cannot work if it cannot detect what it is set out to detect (i.e., failure to capture an image of the driver’s eye due to a long fringe or other accessories, or in the event the driver tends to keep moving or shifting his position). Unfortunately, there has not yet been a solution for the case where the driver’s eye is covered by his fringe. To mitigate the effects of the driver’s movement, several cameras or smartphones may be required [100]. However, that would suggest an increase in hardware cost. In addition, the mounting of several cameras or smartphones at eye level would obstruct the driver’s vision.
B. Abnormal Driving Detection Methods
On the other hand, a non-visual approach that utilizes the smartphone’s sensors to detect abnormal driving behavior is not without drawbacks. As mentioned earlier, driving behavior classification can be broadly classified into two
TABLE VII
COMPARISON BETWEEN CLASSIFICATION OF INDIVIDUAL DRIVING EVENTS AND OVERALL DRIVING RISK ASSESSMENT
categories, 1) the classification of individual driving events and 2) the detection of aggressive driving events over a long window and extending detection to an overall driver risk assessment [29], [30].
The former is dedicated to identifying specific aggressive maneuvers and would first require the detection of a maneuver, followed by classifying the aggressiveness of the maneuver. Although Table 5 indicates that such implementation has high accuracy, it is not a trivial task. The first critical issue to address is the number of events to consider. As seen in Table 5, different authors considered various types of maneuver. In such cases, authors focus only on several specific maneuvers without considering all possible maneuvers. Moreover, considering only a specific maneuver without understanding its variants can cause an incomplete evaluation [75]. It may not be possible to consider every single type of maneuver, leading to the question of how many events need to be considered before the set is exhaustive. Finally, such implementations are targeted towards the quality of maneuvers instead of the quality of overall driving.
On the other hand, an overall driver risk assessment does not require a large amount of labeled data, as showcased in [70]. As the driving style of each driver is unique, a unique set of thresholds may be required to gauge the level of abnormality in their driving. Thus, the first question would be how personalized labeled data can be collected for supervised methodology in a safe and trustworthy way. Subsequently, as the classification of driving style is done after collecting a period of the signal, it may be too late by the time abnormal driving is detected [101]. In addition, driving style can be comparable if there is a reliable scoring system [102], [103] and if the test is conducted repeatedly using the same vehicle type and road situation [102].
In some overall driver risk assessment systems, a driver is required to complete a questionnaire in order to evaluate the driving risk. However, it may induce some form of bias. As seen in Amadoa et al. [104], when questionnaires are completed before the actual drive, there is a vague reference point problem since drivers do not have an objective input related to their own performance.
Assessing drivers in terms of the expected performance on a future task might be more related to their confidence in their general driving skill, rather than the specific behavioral output. In fact, a lower correspondence between subjective ratings and actual performance and self-enhancement bias as found in previous studies [104] could be seen. If the questionnaire is done after the drive, drivers rely on their memory, which can be inaccurate due to impression management and self-deception [33]. Thus, it is important to consider the issues related to each type of implementation, which are summarized in Table 7.
The next limitation of using smartphone sensors for abnormal driving behavior analysis is the varying level of accuracy due to different hardware specifications on different phones. A homogeneous precision and sampling rate of the sensor data may not be guaranteed in every existing device. A compromise has to be reached to accommodate most of the existing hardware [25]. Furthermore, IMUs in smartphones are considered to be the lowest grade of sensors among the commercial sensors [30], [105] and can exhibit significant bias, scale factor, misalignment, and a high level of noise [28], [29], [106]. Likewise, vibrations from the vehicle, road condition, and environment can also contribute to noisy sensor readings [107]. Additionally, measurements of a magnetometer can be corrupted easily by electromechanical components or can exhibit sharp variation due to sudden changes in phone orientation [108], [109]. Furthermore, measurements provided by a magnetometer are relative to the Magnetic North, not to the Geographic North Pole. The difference between the two is the magnetic declination, which can vary with time and place [108]. Therefore, there is a need to preprocess the data before any further analysis. An easy and perhaps the most common way
(as seen in Table 6) to remove noise is by adopting a low-pass, high-pass, median or bandpass filter.
The use of GPS also presents several obstacles. Firstly, the estimation of speed using GPS can be reliable and accurate, but a significant amount of processing is required for driving analytics [28]. Moreover, the accuracy of GPS location data is inconsistent as it is affected by the current location of the vehicle, such as in a tunnel, basement, forest, or among high buildings, as well as by incorrect placement of the phone within the vehicle [110], [111]. In addition, the sampling rate of the smartphone-based GPS is only about 1 Hz, which is too slow to detect and classify certain maneuvers accurately [110] as compared to the IMU, which has sampling rates of 20 Hz to 300 Hz [30]. Subsequently, GPS has high energy consumption. The average power drain by a GPS while sampling at 1 Hz is approximately 370 mW, whereas the average power drain of sampling the gyroscope, accelerometer, and magnetometer at 100 Hz is only around 60 mW [110]. As such, the phone must be charged constantly throughout the journey.
In addition, the use of GPS stirs privacy issues. Although logistics fleet administrators can track how their fleet is being used and how their drivers behave so that potential risk and operational cost can be reduced [24], the drivers may feel micro-managed and monitored during their entire work hours. In addition, insurance companies also face challenges trying to convince users to use usage-based insurance due to the tracking issue [75].
This is because an individual’s daily whereabouts are still a private matter and tracking of the vehicle through GPS is a privacy violation [75]. Furthermore, tracking the location of an individual over a period can reveal his frequent destinations. In a worst-case scenario, such information can be used to predict the future destinations of an individual [112].
In another study [113], it was found that if a company explicitly explains to the customer that fair information practice will be followed, the customer is more likely to disclose their personal information. As long as the user of a system is not unwillingly identified, monitored or tracked by the system and the right to reject being tracked in public or semi-public spaces is given, it should be fine. The company should also ensure that such information can only be shared (with user consent) with the relevant entities for a specific time to avoid tracking [112].
Subsequently, most of the methodologies and applications require the driver to mount their smartphone in a fixed position or to dynamically reorient its axes by real-time computation of the Euler angles [111]. This is because measurements cannot be used directly without knowing the position of the phone inside the vehicle. Therefore, an alignment strategy is needed to align the smartphone’s coordinate system with the vehicle coordinate system [107]. Even if the device is perfectly fixed to a phone mount, there are potential errors since the device itself might not be able to determine if the vehicle is on a flat surface or is tilted due to the slope of a road [114]. Further complicating the alignment issue is that the position of smartphones may change daily or even during the journey due to user interaction and vibration [25], [30], [88]. Moreover, some of the positions where the phone is mounted pose safety concerns. In one of the literature, authors [115] mount the smartphone directly to the steering wheel. While the system may detect steering actions with higher accuracy, the smartphone directly interferes with the deployment of the airbag. When the airbag is deployed, it may fling the device towards the driver at high speed, causing grievous injuries.
An abnormal driving behavior detection system may not be useful, or may even be impractical, if it is designed to operate only in one particular position. If the smartphone is not in a fixed position, small orientation changes can happen due to the vehicle moving over road slopes or speed bumps, leading to measurement errors. While dynamic reorientation strategies can rectify such issues and are easy to calculate, they produce sensor data with significant errors due to the angles evolving. Therefore, to implement an efficient reorientation strategy, it is of utmost importance to consider an efficient approach to assess when and how to correct the orientation of the smartphone with respect to the vehicle’s coordinate system [28]. Thus, if the device changes its position for some unknown reason, the specific time interval of position change can be detected and the new position of the phone can be recognized before proceeding to detect subsequent events [28].
Another issue to consider is the type of feature extraction to use for driver behavior analysis. While statistical features are easy to calculate and extract, they are usually handcrafted and have to be carefully examined to determine the level of importance towards the end goal. This may lead to a large feature dimension, excessive training time, low prediction accuracy for the model, and poor real-time performance [116].
On the other hand, unsupervised feature extraction methodologies such as Principal Component Analysis (PCA) and Independent Component Analysis (ICA) can automatically extract features and filter out any redundant sensor information [117]. PCA performs data transformation to find principal components that can best explain the data and achieves good results when the input data follow a Gaussian distribution and the essential hidden features are orthogonal to each other in the vector space [117]. Whereas, ICA can extract independent hidden features from multivariate signals by assuming observed signals are generated via linear transformations from source signals [117]. However, both methods may not be able to extract the representative features from driving behavior data by linear transformations as vehicle dynamics and human driving behavior involve nonlinear properties [116], [117].
On the other hand, wavelet transform is a powerful tool for nonstationary signal analysis for representing various aspects of the signal, such as trends, discontinuities, and repeated patterns [118]. The wavelet transformation can convert the signal into components of different frequencies while preserving information related to the time domain [115]. However, given that wavelet analysis provides a signal representation of higher precision at low frequencies and less at high frequencies [119], it can result in a loss of information if the high-frequency signals carry information about the driver behavior or the signature of a road anomaly [120].
On the other hand, the current state of the art, Deep Neural Network (DNN), not only omits the process of handcrafting features but also increases the recognition accuracy [121]. In simple terms, a DNN can be described as a NN with a larger

number of hidden layers where each layer is parametrized by a set of weights and biases. The shallower layers can extract low-level features to represent more specific driving behaviors while deeper layers can extract higher-level features to represent more abstract driving behavior by fusing various low-level features [117]. However, one major shortcoming of DNN is the requirement of a large amount of data for training, as insufficient data reduces the network’s generalization ability, leading to over-fitting [121]. Thus, it may be worthwhile to consider the solution provided in [78] to tackle the issue of overfitting. Finally, for most of the methodologies proposed, it is unknown if training and updating of the algorithm can be done in a smartphone environment.
VI. POSSIBLE SOLUTIONS
In this section, some possible solutions are provided and discussed which may help to tackle some of the challenges mentioned in the earlier section. This includes the inclusion of a context-aware system, mobile crowdsensing and active steering control to improve driving safety.
A. Context-Aware Driving Behavior Detection System
A driver’s behavior can easily be influenced by any of the factors listed in Table 1. However, the effects of environmental conditions, as well as car condition, on driver behavior were usually excluded in the decision-making process even though they may play a big part in the driver’s behavior [80]. The difference in traffic condition can be used as one example. The difference in behavior can easily be spotted when driving on a normal road and driving on a congested road, where a driver tends to brake more often.
A possible framework to consider would be the context-aware system. Context can be formally defined as the information used to characterize the situation of entities (person, place or object) that are considered relevant to the interaction between a user and an application, including the user and the application themselves [122]. Thus, in the Intelligent Transportation System (ITS) domain, a driving context can be considered as a set of facts or circumstances that surround a driving activity. Examples include, but are not restricted to, surrounding objects, weather, traffic and demographic information, the vehicle’s states, and the driver’s psychological and physical states [123]. A context-aware system is particularly desirable if it can react specifically to the current location, time and other environmental factors and adapt its behavior according to the changing circumstances [122].
In the ITS domain, the context-aware system usually consists of three phases in its architecture, which are 1) Sensing phase, 2) Reasoning phase and 3) Acting phase [124], [125]. In the sensing phase, the system collects different information through multiple sensors (low-level contextual information). The reasoning phase is responsible for performing reasoning on low-level contextual information to deduce the behavior of a driver (high-level contextual information). This phase may include calculating corrective actions for other road users. In the acting phase, the system is responsible for giving alarms and, if corrective actions are calculated, such information is also disseminated to other road users [124].
It was demonstrated in [80] that, by considering the car’s sensitivity, evaluation of the driver’s style can be improved. This shows that driving context is useful in explaining driving behavior and can improve the generalizability and reliability of existing driving behavior models [126]. However, the key limiting factor is the acquisition of low-level contextual information during the sensing phase. A large number of sensors have to be considered in order to retrieve a large amount of context information, and a smartphone is unlikely to have the capability to perform such a task. Secondly, identifying the relevant context is a challenge as the type and number could vary significantly with respect to the situation and driver. Thus, it can become an intractable problem [126].
B. Mobile Crowdsensing for Information Dissemination
As discussed in the earlier section, one of the factors influencing a driver’s behavior is the environmental conditions. By considering such a factor, the accuracy of driver behavior analysis can increase. One alternative solution other than the context-aware system is through a paradigm known as mobile crowdsensing.
Mobile crowdsensing is a new paradigm that utilizes a large number of smartphones to collect and analyze data from the environment through their sensors to benefit users through widespread information dissemination [127]–[130]. The key difference between context-aware systems and mobile crowdsensing is their primary objective. The former is focused on the retrieval of low-level contextual information to deduce high-level contextual information to assist the driver. The latter is focused on analyzing collected data and sharing the analytical results with other road users. The typical mobile crowdsensing architecture can be seen in Fig. 5.
Fig. 5. Typical Mobile Crowdsensing Architecture [131].
In the case of driver behavior analysis, mobile crowdsensing can be beneficial by sharing road and traffic conditions [67], [129], [131]. As discussed in the earlier section, Singh et al. [67] utilized mobile crowdsensing to confirm road congestion. As described by the authors [67], congestion on the road can be detected when multiple smartphones detect a braking event in the same geographical area. Such information can be used to distinguish braking due to congestion from sudden braking due to harsh driving, thereby providing higher analytical precision.
Similarly, the sharing of road surface information can also increase the accuracy of driver behavior analysis. One common methodology to detect aggressive driving is through analyzing data collected from a smartphone’s accelerometer.
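The congestion check attributed to Singh et al. [67] above (several phones reporting braking in the same geographical area) can be illustrated with a minimal sketch. It is not the authors' implementation: the grid size, the time window, the minimum number of devices, and all names (BrakeEvent, bucket, congested_buckets, label_brake) are assumptions made only for illustration.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BrakeEvent:
    device_id: str    # anonymized phone identifier (hypothetical field)
    lat: float        # latitude in degrees
    lon: float        # longitude in degrees
    timestamp: float  # seconds since an arbitrary epoch


def bucket(event, cell_deg=0.001, window_s=60.0):
    """Map an event to a (grid cell, time window) key.

    cell_deg and window_s are assumed values, not taken from [67].
    """
    cell = (math.floor(event.lat / cell_deg), math.floor(event.lon / cell_deg))
    return cell, math.floor(event.timestamp / window_s)


def congested_buckets(events, min_devices=3, cell_deg=0.001, window_s=60.0):
    """Return the buckets in which at least min_devices distinct phones braked."""
    devices = defaultdict(set)
    for event in events:
        devices[bucket(event, cell_deg, window_s)].add(event.device_id)
    return {key for key, ids in devices.items() if len(ids) >= min_devices}


def label_brake(event, congested, cell_deg=0.001, window_s=60.0):
    """Label one braking event using the crowdsensed congestion buckets."""
    if bucket(event, cell_deg, window_s) in congested:
        return "congestion"
    return "possible harsh braking"


# Three phones brake close together in space and time; a fourth brakes elsewhere.
events = [
    BrakeEvent("a", 1.30005, 103.80005, 10.0),
    BrakeEvent("b", 1.30006, 103.80004, 20.0),
    BrakeEvent("c", 1.30004, 103.80006, 30.0),
    BrakeEvent("d", 1.35005, 103.90005, 15.0),
]
congested = congested_buckets(events)
labels = [label_brake(e, congested) for e in events]
print(labels)
```

Counting distinct device identifiers rather than raw events keeps a single phone that brakes repeatedly from being mistaken for congestion. A fixed grid is the simplest choice here; events near a cell border may be split across cells, which a real system would need to handle.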
Fig. 6. Benchmark results for different android devices running DNN [135].
However, speed bumps and potholes can affect the analysis result. Thus, any driver who detects such an anomaly can send this information together with its corresponding location to a server or cloud platform, and the next driver can download such information into the abnormal driving detection system. As this concept involves a large number of participants, road surface information can be frequently updated, resulting in a more accurate road surface information database [131]. Using this database, any driver behavior analysis system can take the road anomalies into account and easily exclude behavioral analysis during the period when the car is driving past the road anomalies.
Mobile crowdsensing also provides benefits in other areas. Firstly, the traveling route can be dynamically updated using this information so that the best trip routes based on road quality can be recommended [132]. Secondly, mobile crowdsensing can increase the efficiency of real-time driving recommendation through traffic information dissemination. Similar to the concept of sharing road surface information, the corresponding location of traffic lights can be shared. The system can then know in advance how far ahead the traffic light is and provide a gentle reminder for the driver to slow the car down. Finally, mobile crowdsensing can also promote road safety. The primary objective of a behavior analysis system is to detect any abnormal driving behavior. Therefore, if the information of a detected aggressive driver can be shared among other road users, they can take precautions in advance and reduce the possibility of collision.
While such a concept brings great benefits, there are also several drawbacks. The number of participants of a mobile crowdsensing application may vary as it is based on voluntary participation and contribution, which cannot be controlled or planned [130], [133]. In addition, the required cost and time to implement crowdsensing-based applications may be a limiting factor [71].
C. Active Steering Control for Improved Driving Safety
Secondly, if a smartphone can make accurate classification of abnormal driving behavior, then it may be appropriate that such diagnostic is integrated into the ADAS and allows steering control of the vehicle by the smartphone. Although some implementations [47], [48], [51] provide passive feedback for the drivers upon detecting traces of driver drowsiness, the driver may choose to ignore the warning given by the passive system [7]. Therefore, an active system where the smartphone is given steering control may be a better alternative. Upon detection of abnormal driving behavior, the smartphone can take the current dynamics and kinematics model of the vehicle into consideration to control the lateral motion of a vehicle [134]–[139]. Also, a trajectory can be computed, and the steering control would then perform the maneuvering action based on the planned trajectory as closely as possible [140], [141]. Such implementation has the potential to reduce the number of accidents. However, drivers may need to be educated in how such a system works, as haptic guidance can easily cause the driver’s annoyance [7] and, in a study carried out by Katzourakis et al. [142], drivers may fail to understand the functionality of a lane departure assistance system, leading to inadvertent driver reactions.
D. Mobile/Cloud-Based Training and Updating of Algorithms
As mentioned earlier, the two most challenging questions for smartphone-based driver behavior analysis are “can the algorithms be trained and updated in a smartphone environment?” and “If no, is there any possible solution?” In this
subsection, the discussion of the solution towards these two questions will be presented.
Although most of the proposed systems did not mention if training and updating of an algorithm can be done in a smartphone, in general, the modern smartphone is capable of such a task. One good piece of evidence for such a claim is Apple’s iPhone; a simple example would be the predictive keyboard, which demonstrates on-device training and updating. While it can be argued that such an application is trivial or uses a simple model, as published in Apple’s machine learning journal [143], it is also possible to run an on-device Deep Neural Network (DNN) for face recognition. In a test carried out by Ignatov et al. [144], a Samsung S3 that was manufactured more than 5 years ago is capable of running a deep learning solution, although at a much slower rate as compared to a newer phone. Figure 6 shows a portion of the benchmark performance of different Android devices running DNN. The reader is recommended to refer to [144] for more details.
Therefore, it can be concluded that the smartphone is capable of running a machine learning algorithm, leading to the question of how one can load the model into a smartphone. Fortunately, there are solutions catering to this purpose, such as TensorFlow Lite from Google [145] and Core ML from Apple [146]. As shown in Fig. 7, such a solution can be easily adopted.
Fig. 7. Framework of TensorFlow Lite [145].
However, there are still several issues that can restrict an algorithm being deployed into a smartphone. Firstly, the model may take up a large storage space and require significant computational time on the GPU and/or CPU while sharing resources with other running applications [143]. A high-complexity model may even require the full computation power of a smartphone and require other applications to be closed. In addition, it may be possible that the temperature of the device increases significantly due to running a computationally intensive application. Finally, a single application needs to be developed on different platforms (iOS, Android) to cater for different groups of users.
Thus, if mobile training and updating of the algorithm are not feasible, the next option would be to utilize cloud computing. Cloud computing is a new way of deploying computing technology where dynamically scalable and virtualized resources are provided on demand [147], [148]. Cloud resources are fully dedicated to a specific task and typically use high-end GPUs with a large amount of memory available, thus removing any constraints relating to computational resources [143]. Once the model is defined, the user can simply request to train or update the model by uploading the new data through an application and download the model when the training is done, which brings great convenience. However, one factor to consider in regard to such a solution is the cost. Such a solution may be expensive depending on the space and resources used.
VII. CONCLUSION
Data collected from the smartphone are a rich source of information for analyzing driver behavior. Numerous methodologies proposed by different authors to detect different kinds of abnormal driving were reviewed. While smartphone solutions have several advantages as compared to the telematics boxes, there exist several challenges that must be taken into consideration for an accurate driver behavior classification.
One of the key challenges of using vision-based methodology is the varying light condition that can greatly affect the detection accuracy. The other key consideration for the sensor-based methodology is how the noise and external factors such as road conditions or vehicle electrical components can be eliminated. It is imperative that such issues be taken into consideration in future works. In addition, future works should also address the issue of mounting the smartphone in a fixed and specific position. Eliminating such a requirement will allow noise caused by a sudden orientation change of the smartphone to be eliminated. Furthermore, it brings greater flexibility and convenience for the driver.
In this paper, some possible solutions were discussed that include the integration of smartphone driver behavior analysis together with the concepts of context-awareness, crowdsensing, and steering control by smartphones. In addition, mobile or cloud-based training and updating of algorithms were also discussed. Although the solutions discussed may not be perfect, they have the potential to alleviate some of the issues faced. Therefore, there is definitely room for further improvements for a smartphone-based behavior analysis system.
ACKNOWLEDGMENT
The authors would like to thank the reviewers for their constructive comments which greatly improve the quality of this manuscript.
REFERENCES
[1] H. Y. Guo, Y. Ji, T. Qu, and H. Chen, “Understanding and modeling the human driver behavior based on MPC,” in Proc. 7th IFAC Symp. Adv. Autom. Control, Tokyo, Japan, Sep. 2013, pp. 133–138.
[2] A. B. Ellison, S. P. Greaves, and M. C. J. Bliemer, “Driver behaviour profiles for road safety analysis,” Accident Anal. Prevention, vol. 76, pp. 118–132, Mar. 2015.
[3] H. S. Kim, Y. S. Hwang, D. S. Yoon, W. G. Choi, and C. H. Park, “Driver workload characteristics analysis using EEG data from an urban road,” IEEE Trans. Intell. Transp. Syst., vol. 15, no. 4, pp. 1844–1849, Aug. 2014.
[4] Y. Liao, M. Wang, L. Duan, and F. Chen, “Cross-regional driver–vehicle interaction design: An interview study on driving risk perceptions, decisions, and ADAS function preferences,” IET Intell. Transp. Syst., vol. 12, no. 8, pp. 801–808, May 2018.
[5] B. Higgs and M. Abbas, “Segmentation and clustering of car-following behavior: Recognition of driving patterns,” IEEE Trans. Intell. Transp. Syst., vol. 16, no. 1, pp. 81–90, Feb. 2015.
[6] C. Miyajima et al., “Driver modeling based on driving behavior and its evaluation in driver identification,” Proc. IEEE, vol. 95, no. 2, pp. 427–437, Feb. 2007.
[7] C. M. Martinez, M. Heucke, B. Gao, D. Cao, and F.-Y. Wang, “Driving style recognition for intelligent vehicle control and advanced driver assistance: A survey,” IEEE Trans. Intell. Transp. Syst., vol. 19, no. 3, pp. 666–676, Mar. 2018.
[8] B. Shi, L. Xu, and W. Meng, “Applying a WNN-HMM based driver model in human driver simulation: Method and test,” IEEE Trans. Intell. Transp. Syst., vol. 19, no. 11, pp. 3431–3438, Nov. 2018.
[9] B. Zhu, J. Zhao, S. Yan, and W. Den, “Personalized lane-change assistance system with driver behavior identification,” IEEE Trans. Veh. Technol., vol. 67, no. 11, pp. 10293–10306, Nov. 2018.
[10] A. Bender, G. Agamennoni, J. R. Ward, S. Worrall, and E. M. Nebot, “An unsupervised approach for inferring driver behavior from naturalistic driving data,” IEEE Trans. Intell. Transp. Syst., vol. 16, no. 6, pp. 3325–3336, Dec. 2015.
[11] J. Xu, J. Liu, X. Sun, K. Zhang, W. Qu, and Y. Ge, “The relationship between driving skill and driving behavior: Psychometric adaptation of the driver skill inventory in China,” Accident Anal. Prevention, vol. 120, pp. 92–100, Nov. 2018.
[12] P. Stanojević, T. Lajunen, D. Jovanović, P. Sârbescu, and S. Kostadinov, “The driver behaviour questionnaire in South-East Europe countries: Bulgaria, Romania and Serbia,” Transp. Res. F, Traffic Psychol. Behav., vol. 53, pp. 24–33, Feb. 2018.
[13] M. Møller and S. Haustein, “Keep on cruising: Changes in lifestyle and driving style among male drivers between the age of 18 and 23,” Transp. Res. F, Traffic Psychol. Behav., vol. 20, pp. 59–69, Sep. 2013.
[14] W. Chu, C. Wu, C. Atombo, H. Zhang, and T. Özkan, “Traffic climate, driver behaviour, and accidents involvement in China,” Accident Anal. Prevention, vol. 122, pp. 119–126, Jan. 2019.
[15] S. Kaplan, M. A. Guvensan, A. G. Yavuz, and Y. Karalurt, “Driver behavior analysis for safe driving: A survey,” IEEE Trans. Intell. Transp. Syst., vol. 16, no. 6, pp. 3017–3032, Dec. 2015.
[16] J. Hu, L. Xu, X. He, and W. Meng, “Abnormal driving detection based on normalized driving behavior,” IEEE Trans. Veh. Technol., vol. 66, no. 8, pp. 6645–6652, Aug. 2017.
[17] A. Tawari, S. Martin, and M. M. Trivedi, “Continuous head movement estimator for driver assistance: Issues, algorithms, and on-road evaluations,” IEEE Trans. Intell. Transp. Syst., vol. 15, no. 2, pp. 818–830, Apr. 2014.
[18] A. Sahayadhas, K. Sundaraj, and M. Murugappan, “Detecting driver drowsiness based on sensors: A review,” Sensors, vol. 12, no. 12, pp. 16937–16953, Dec. 2012.
[19] G. A. M. Meiring and H. C. Myburgh, “A review of intelligent driving style analysis systems and related artificial intelligence algorithms,” Sensors, vol. 15, no. 12, pp. 30653–30682, Dec. 2015.
[20] L. M. Bergasa, D. Almería, J. Almazán, J. J. Yebes, and R. Arroyo, “DriveSafe: An app for alerting inattentive drivers and scoring driving behaviors,” in Proc. IEEE Intell. Veh. Symp., Dearborn, MI, USA, Jun. 2014, pp. 240–245.
[21] J. Yu, Z. Chen, Y. Zhu, Y. Chen, L. Kong, and M. Li, “Fine-grained abnormal driving behaviors detection and identification with smartphones,” IEEE Trans. Mobile Comput., vol. 16, no. 8, pp. 2198–2212, Aug. 2017.
[22] Y. Saito, M. Itoh, and T. Inagaki, “Driver assistance system with a dual control scheme: Effectiveness of identifying driver drowsiness and preventing lane departure accidents,” IEEE Trans. Human-Mach. Syst., vol. 46, no. 5, pp. 660–671, Oct. 2016.
[23] J.-L. Yin, B.-H. Chen, K.-R. Lai, and Y. Li, “Automatic dangerous driving intensity analysis for advanced driver assistance systems from multimodal driving signals,” IEEE Sensors J., vol. 18, no. 12, pp. 4785–4794, Jun. 2018.
[24] G. Castignani, T. Derrmann, R. Frank, and T. Engel, “Driver behavior profiling using smartphones: A low-cost platform for driver monitoring,” IEEE Intell. Transp. Syst. Mag., vol. 7, no. 1, pp. 91–102, Spring 2015.
[25] G. Castignani, R. Frank, and T. Engel, “Driver behavior profiling using smartphones,” in Proc. 16th Int. IEEE Annu. Conf. Intell. Transp. Syst. (ITSC), The Hague, The Netherlands, Oct. 2013, pp. 552–557.
[26] J. Ferreira, Jr., et al., “Driver behavior profiling: An investigation with different smartphone sensors and machine learning,” PLoS One, vol. 12, no. 4, Apr. 2017, Art. no. e0174959.
[27] P. Händel et al., “Insurance telematics: Opportunities and challenges with the smartphone solution,” IEEE Intell. Transp. Syst. Mag., vol. 6, no. 4, pp. 57–70, Winter 2014.
[28] E. I. Vlahogianni and E. N. Barmpounakis, “Driving analytics using smartphones: Algorithms, comparisons and challenges,” Transp. Res. C, Emerg. Technol., vol. 79, pp. 196–206, Jun. 2017.
[29] J. Engelbrecht, M. J. Booysen, G.-J. van Rooyen, and F. J. Bruwer, “Survey of smartphone-based sensing in vehicles for intelligent transportation system applications,” IET Intell. Transp. Syst., vol. 9, no. 10, pp. 924–935, Dec. 2015.
[30] J. Wahlström, I. Skog, and P. Händel, “Smartphone-based vehicle telematics: A ten-year anniversary,” IEEE Trans. Intell. Transp. Syst., vol. 18, no. 10, pp. 2802–2825, Oct. 2017.
[31] B.-G. Lee and W.-Y. Chung, “A smartphone-based driver safety monitoring system using data fusion,” Sensors, vol. 12, no. 12, pp. 17536–17552, Dec. 2012.
[32] R. Araújo, A. Igreja, R. de Castro, and R. E. Araújo, “Driving coach: A smartphone application to evaluate driving efficient patterns,” in Proc. IEEE Intell. Veh. Symp., Alcala de Henares, Spain, Jun. 2012, pp. 1005–1010.
[33] J. H. Hong, B. Margines, and A. K. Dey, “A smartphone-based sensing platform to model aggressive driving behaviors,” in Proc. SIGCHI Conf. Hum. Factors Comput. Syst., Toronto, ON, Canada, Apr. 2014, pp. 4047–4056.
[34] F. Giannotti et al., “Unveiling the complexity of human mobility by querying and mining massive trajectory data,” VLDB J. Int. J. Very Large Data Bases, vol. 20, no. 5, pp. 695–719, 2011.
[35] H. Barbosa et al., “Human mobility: Models and applications,” Phys. Rep., vol. 734, pp. 1–74, Mar. 2018.
[36] L. Pappalardo, F. Simini, S. Rinzivillo, D. Pedreschi, F. Giannotti, and A.-L. Barabási, “Returners and explorers dichotomy in human mobility,” Nature Commun., vol. 6, Sep. 2015, Art. no. 8166.
[37] L. Pappalardo, S. Rinzivillo, Z. Qu, D. Pedreschi, and F. Giannotti, “Understanding the patterns of car travel,” Eur. Phys. J. Special Topics, vol. 215, pp. 61–73, Jan. 2013.
[38] F. Giannotti, M. Nanni, D. Pedreschi, C. Renso, and R. Trasarti, “Mining mobility behavior from trajectory data,” in Proc. Int. Conf. Comput. Sci. Eng., Vancouver, BC, Canada, Aug. 2009, pp. 948–951.
[39] D. Wang, Q. Liu, Z. Xiao, J. Chen, Y. Huang, and W. Chen, “Understanding travel behavior of private cars via trajectory big data analysis in urban environments,” in Proc. IEEE 15th Int. Conf. Dependable, Autonomic Secure Comput., 15th Int. Conf. Pervasive Intell. Comput., 3rd Int. Conf. Big Data Intell. Comput. Cyber Sci. Technol. Congr. (DASC/PiCom/DataCom/CyberSciTech), Orlando, FL, USA, Nov. 2017, pp. 917–924.
[40] Y. Chen and C. Shen, “Performance analysis of smartphone-sensor behavior for human activity recognition,” IEEE Access, vol. 5, pp. 3095–3110, 2017.
[41] T. J. Matarazzo et al., “Crowdsensing framework for monitoring bridge vibrations using moving smartphones,” Proc. IEEE, vol. 106, no. 4, pp. 577–593, Apr. 2018.
[42] Y. Lu, A. Misra, W. Sun, and H. Wu, “Smartphone sensing meets transport data: A collaborative framework for transportation service analytics,” IEEE Trans. Mobile Comput., vol. 17, no. 4, pp. 945–960, Apr. 2018.
[43] D. Kelly, K. Curran, and B. Caulfield, “Automatic prediction of health status using smartphone-derived behavior profiles,” IEEE J. Biomed. Health Informat., vol. 21, no. 6, pp. 1750–1760, Nov. 2017.
[44] S. Wan, Y. Liang, Y. Zhang, and M. Guizani, “Deep multi-layer perceptron classifier for behavior analysis to estimate Parkinson’s disease severity using smartphones,” IEEE Access, vol. 6, pp. 36825–36833, 2018.
[45] D. Sommer and M. Golz, “Evaluation of PERCLOS based current fatigue monitoring technologies,” in Proc. 32nd Annu. Int. Conf. IEEE EMBS, Buenos Aires, Argentina, Aug./Sep. 2010, pp. 4456–4459.
[46] B. Akrout and W. Mahdi, “A blinking measurement method for driver drowsiness detection,” in Proc. 8th Int. Conf. Comput. Recognit. Syst. (CORES), Miłków, Poland, May 2013, pp. 651–660.
[47] L. Xu, S. Li, K. Bian, T. Zhao, and W. Yan, “Sober-drive: A smartphone-assisted drowsy driving detection system,” in Proc. Int. Conf. Comput. Netw. Commun. (ICNC), Honolulu, HI, USA, Feb. 2014, pp. 398–402.
[48] A. Dasgupta, D. Rahman, and A. Routray, “A smartphone-based drowsiness detection and warning system for automotive drivers,” IEEE Trans. Intell. Transp. Syst., to be published.
[49] T. Hayashi, S. Watanabe, T. Toda, T. Hori, J. Le Roux, and K. Takeda, “Duration-controlled LSTM for polyphonic sound event detection,” IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 25, no. 11, pp. 2059–2070, Nov. 2017.
[50] G. Lafay, M. Lagrange, M. Rossignol, E. Benetos, and A. Roebel, “A morphological model for simulating acoustic scenes and its application to sound event detection,” IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 24, no. 10, pp. 1854–1864, Oct. 2016.
[51] K. Chang, B.-H. Oh, and K.-S. Hong, “An implementation of smartphone-based driver assistance system using front and rear camera,” in Proc. Int. Conf. Consum. Electron., Las Vegas, NV, USA, Jan. 2014, pp. 280–281.
[52] F. De Ponte Müller, “Survey on ranging sensors and cooperative techniques for relative positioning of vehicles,” Sensors, vol. 17, no. 2, p. 127, Jan. 2017.
[53] D.-Y. Chen, G.-R. Chen, and Y.-W. Wang, “Real-time dynamic vehicle detection on resource-limited mobile platform,” IET Comput. Vis., vol. 7, no. 2, pp. 81–89, Apr. 2013.
[54] Y. Qiao, K. Zeng, L. Xu, and X. Yin, “A smartphone-based driver fatigue detection using fusion of multiple real-time facial features,” in Proc. 13th IEEE Annu. Consum. Commun. Netw. Conf. (CCNC), Las Vegas, NV, USA, Jan. 2016, pp. 1–6.
[55] Z. Li, G. Sun, F. Zhang, L. Jia, K. Zheng, and D. Zhao, “Smartphone-based fatigue detection system using progressive locating method,” IET Intell. Transp. Syst., vol. 10, no. 3, pp. 148–156, Apr. 2016.
[56] Y. Ji, S. Wang, Y. Lu, J. Wei, and Y. Zhao, “Eye and mouth state detection algorithm based on contour feature extraction,” J. Electron. Imag., vol. 27, no. 5, Feb. 2018, Art. no. 051205.
[57] C. Zhang, X. Wu, X. Zheng, and S. Yu, “Driver drowsiness detection using multi-channel second order blind identifications,” IEEE Access, vol. 7, pp. 11829–11843, Feb. 2019.
[58] Y. Xie, F. Li, Y. Wu, S. Yang, and Y. Wang, “D3-guard: Acoustic-based drowsy driving detection using smartphones,” in Proc. IEEE Int. Conf. Comput. Commun., Paris, France, Apr./May 2019, pp. 1225–1233.
[59] M. Fazeen, B. Gozick, R. Dantu, M. Bhukhiya, and M. C. González, “Safe driving using mobile phones,” IEEE Trans. Intell. Transp. Syst., vol. 13, no. 3, pp. 1462–1468, Sep. 2012.
[60] Z. Ouyang, J. Niu, and M. Guizani, “Improved vehicle steering pattern recognition by using selected sensor data,” IEEE Trans. Mobile Comput., vol. 17, no. 6, pp. 1383–1396, Jun. 2018.
[61] Z. Chen, J. Yu, Y. Zhu, Y. Chen, and M. Li, “D3: Abnormal driving behaviors detection and identification using smartphone sensors,” in Proc. 12th Annu. IEEE Int. Conf. Sens., Commun., Netw. (SECON), Seattle, WA, USA, Jun. 2015, pp. 524–532.
[62] Q. Wang, Y. Gu, J. Liu, and S. Kamijo, “DeepSpeedometer: Vehicle speed estimation from accelerometer and gyroscope using LSTM model,” in Proc. IEEE 20th Int. Conf. Intell. Transp. Syst. (ITSC) Workshop, Yokohama, Japan, Oct. 2017, pp. 214–219.
[63] Y. Gu, Q. Wang, and S. Kamijo, “Intelligent driving data recorder in smartphone using deep neural network-based speedometer and scene understanding,” IEEE Sensors J., vol. 19, no. 1, pp. 287–296, Jan. 2019.
[64] A. Alasaadi and T. Nadeem, “UniCoor: A smartphone unified coordinate system for ITS applications,” in Proc. IEEE 13th Int. Conf. Mobile Ad Hoc Sensor Syst., Brasilia, Brazil, Oct. 2016, pp. 290–298.
[65] D. A. Johnson and M. M. Trivedi, “Driving style recognition using a smartphone as a sensor platform,” in Proc. 14th Int. IEEE Conf. Intell. Transp. Syst., Washington, DC, USA, Oct. 2011, pp. 1609–1615.
[66] H. Eren, S. Makinist, E. Akin, and A. Yilmaz, “Estimating driving behavior by a smartphone,” in Proc. Intell. Veh. Symp., Alcala de Henares, Spain, Jun. 2012, pp. 234–239.
[67] G. Singh, D. Bansal, and S. Sofat, “A smartphone based technique to monitor driving behavior using DTW and crowdsensing,” Pervasive Mobile Comput., vol. 40, pp. 56–70, Sep. 2017.
[68] A. H. Ali, A. Atia, and M. Sami, “Driving events recognition using smartphone sensors,” Int. J. Ambient Comput. Intell., vol. 8, no. 3, pp. 22–37, Jul. 2017.
[69] Z. Ouyang, J. Niu, Y. Liu, and J. Rodrigues, “Multiwave: A novel vehicle steering pattern detection method based on smartphones,” in Proc. IEEE ICC Ad-hoc Sensor Netw. Symp., Kuala Lumpur, Malaysia, May 2016, pp. 1–7.
[70] G. Castignani, T. Derrmann, R. Frank, and T. Engel, “Smartphone-based adaptive driving maneuver detection: A large-scale evaluation study,” IEEE Intell. Transp. Syst., vol. 18, no. 9, pp. 2330–2339, Sep. 2017.
[71] D. G. Costa, A. Damasceno, and I. Silva, “CitySpeed: A crowdsensing-based integrated platform for general-purpose monitoring of vehicular speeds in smart cities,” Smart Cities, vol. 2, no. 1, pp. 46–65, Feb. 2019.
[72] P. Chaovalit, C. Saiprasert, and T. Pholprasit, “A method for driving event detection using SAX on smartphone sensors,” in Proc. 13th Int. Conf. ITS Telecommun., Tampere, Finland, Nov. 2013, pp. 450–455.
[73] J. Lin, E. Keogh, S. Lonardi, and B. Chiu, “A symbolic representation of time series, with implications for streaming algorithms,” in Proc. 8th ACM SIGMOD Workshop Res. Issues Data Mining Knowl. Discovery, San Diego, CA, USA, Jun. 2003, pp. 2–11.
[74] J. Xie, A. R. Hilal, and D. Kulić, “Driving maneuver classification: A comparison of feature extraction methods,” IEEE Sensors J., vol. 18, no. 12, pp. 4777–4784, Jun. 2018.
[75] H. R. Eftekhari and M. Ghatee, “Hybrid of discrete wavelet transform and adaptive neuro fuzzy inference system for overall driving behavior recognition,” Transp. Res. F, Traffic Psychol. Behav., vol. 58, pp. 782–796, Oct. 2018.
[76] H. R. Eftekhari and M. Ghatee, “A similarity-based neuro-fuzzy modeling for driving behavior recognition applying fusion of smartphone sensors,” J. Intell. Transp. Syst., vol. 23, no. 1, pp. 72–83, Jan. 2019.
[77] K. B. Ahmed, B. Goel, P. Bharti, S. Chellappan, and M. Bouhorma, “Leveraging smartphone sensors to detect distracted driving activities,” IEEE Trans. Intell. Transp. Syst., vol. 20, no. 9, pp. 3303–3312, Sep. 2019.
[78] M. M. Bejani and M. Ghatee, “Convolutional neural network with adaptive regularization to classify driving styles on smartphones,” IEEE Trans. Intell. Transp. Syst., to be published.
[79] E. Carvalho et al., “Exploiting the use of recurrent neural networks for driver behavior profiling,” in Proc. Int. Joint Conf. Neural Netw. (IJCNN), Anchorage, AK, USA, May 2017, pp. 3016–3021.
[80] M. M. Bejani and M. Ghatee, “A context aware system for driving style evaluation by an ensemble learning on smartphone sensors data,” Transp. Res. C, Emerg. Technol., vol. 89, pp. 303–320, Apr. 2018.
[81] Q. Xue, K. Wang, J. J. Lu, and Y. Liu, “Rapid driving style recognition in car-following using machine learning and vehicle trajectory data,” J. Adv. Transp., vol. 2019, Jan. 2019, Art. no. 9085238.
[82] J. Xie, A. R. Hilal, and D. Kulic, “Driver distraction recognition based on smartphone sensor data,” in Proc. IEEE Int. Conf. Syst., Man, Cybern., Miyazaki, Japan, Oct. 2018, pp. 801–806.
[83] J. Guo, Y. Liu, L. Zhang, and Y. Wang, “Driving behaviour style study with a hybrid deep learning framework based on GPS data,” Sustainability, vol. 10, no. 7, p. 2351, Jul. 2018.
[84] S. Daptardar, V. Lakshminarayanan, S. Reddy, S. Nair, S. Sahoo, and P. Sinha, “Hidden Markov model based driving event detection and driver profiling from mobile inertial sensor data,” in Proc. SENSORS, Busan, South Korea, Nov. 2015, pp. 1–4.
[85] A. Bhatt, V. Dave, Y. Panchamia, and P. Thakre, “Analyzing behavioral attributes of drivers and implementing safe driving model,” in Proc. IEEE Int. Conf. Veh. Electron. Saf. (ICVES), Vienna, Austria, Jun. 2017, pp. 228–232.
[86] Z. Yang and J. H. L. Hansen, “Unsupervised driving performance assessment using free-positioned smartphones in vehicles,” in Proc. 19th Int. IEEE Conf. Intell. Transp. Syst., Rio de Janeiro, Brazil, Nov. 2016, pp. 1598–1603.
[87] J. Paefgen, F. Kehr, Y. Zhai, and F. Michahelles, “Driving behavior analysis with smartphones: Insights from a controlled field study,” in Proc. 11th Int. Conf. Mobile Ubiquitous Multimedia, London, U.K., Dec. 2014, Art. no. 36.
[88] J. Wahlström, I. Skog, P. Händel, B. Bradley, S. Madden, and H. Balakrishnan, “Smartphone placement within vehicles,” IEEE Trans. Intell. Transp. Syst., to be published.
[89] L. Jiang, X. Chen, and W. He, “Safecam: Analyzing intersection-related driver behaviors using multi-sensor smartphones,” in Proc. IEEE Int. Conf. Pervasive Comput. Commun., Sydney, NSW, Australia, Mar. 2016, pp. 1–9.
[90] Q. Wang, Y. Liu, J. Liu, Y. Gu, and S. Kamijo, “Critical areas detection and vehicle speed estimation system towards intersection-related driving behavior analysis,” in Proc. IEEE Int. Conf. Consum. Electron. (ICCE), Las Vegas, NV, USA, Jan. 2018, pp. 1–6.
[91] K. Saleh, M. Hossny, and S. Nahavandi, “Driving behavior classification based on sensor data fusion using LSTM recurrent neural networks,” in Proc. IEEE 20th Int. Conf. Intell. Transp. Syst. (ITSC), Yokohama, Japan, Oct. 2017, pp. 1–6.
[92] K. Greff, R. K. Srivastava, J. Koutnìk, B. R. Steunebrink, and J. Schmidhuber, “LSTM: A search space odyssey,” IEEE Trans. Neural Netw. Learn. Syst., vol. 28, no. 10, pp. 2222–2232, Oct. 2017.
[93] X. Xu, J. Yu, Y. Chen, Y. Zhu, S. Qian, and M. Li, “Leveraging audio signals for early recognition of inattentive driving with smartphones,” IEEE Trans. Mobile Comput., vol. 17, no. 7, pp. 1553–1567, Jul. 2018.
[94] M. J. Flores, J. M. Armingol, and A. De La Escalera, “Driver drowsiness detection system under infrared illumination for an intelligent vehicle,” IET Intell. Transp. Syst., vol. 5, no. 4, pp. 241–251, Dec. 2011.
[95] L. M. Bergasa, J. Nuevo, M. A. Sotelo, and M. Vazquez, “Real-time system for monitoring driver vigilance,” in Proc. IEEE Intell. Veh. Symp., Parma, Italy, Jun. 2004, pp. 78–83.
[96] X. Liu, F. Xu, and K. Fujimura, “Real-time eye detection and tracking for driver observation under various light conditions,” in Proc. Intell. Veh. Symp., Versailles, France, Jun. 2002, pp. 344–351.
[97] F. Zhang, J. Su, L. Geng, and Z. Xiao, “Driver fatigue detection based on eye state recognition,” in Proc. Int. Conf. Mach. Vis. Inform. Technol. (CMVIT), Singapore, Feb. 2017, pp. 105–110.
[98] S. Romeo et al., “Fluorescent light induces neurodegeneration in the rodent nigrostriatal system but near infrared LED light does not,” Brain Res., vol. 1662, pp. 87–101, Mar. 201.
[99] Spectral Sensitivity of the Human Eye. Accessed: Sep. 13, 2019. [Online]. Available: https://light-measurement.com/spectral-sensitivity-of-eye/
[100] N. A. Stanton, A. Hedge, K. Brookhuis, E. Salas, and H. W. Hendrick, Handbook of Human Factors and Ergonomics Methods. Bristol, PA, USA: Taylor & Francis, 2004, pp. 257–258.
[101] B. Mandal, L. Li, G. S. Wang, and J. Lin, “Towards detection of bus driver fatigue based on robust visual analysis of eye state,” IEEE Trans. Intell. Transp. Syst., vol. 18, no. 3, pp. 545–557, Mar. 2017.
[102] B. Shi et al., “Evaluating driving styles by normalizing driving behavior based on personalized driver modeling,” IEEE Trans. Syst., Man, Cybern., Syst., vol. 45, no. 12, pp. 1502–1508, Dec. 2015.
[103] G. Castignani, R. Frank, and T. Engel, “An evaluation study of driver profiling fuzzy algorithms using smartphones,” in Proc. 21st IEEE Int. Conf. Netw. Protocols (ICNP), Goettingen, Germany, Oct. 2013, pp. 1–6.
[104] S. Amado, E. Arıkan, G. Kaça, M. Koyuncu, and B. N. Turkan, “How accurately do drivers evaluate their own driving behavior? An on-road observational study,” Accident Anal. Prevention, vol. 63, pp. 65–73, Feb. 2014.
[105] S. Kanarachos, S.-R. G. Christopoulos, and A. Chroneos, “Smartphones as an integrated platform for monitoring driver behaviour: The role of sensor fusion and connectivity,” Transp. Res. C, Emerg. Technol., vol. 95, pp. 867–882, Oct. 2018.
[106] C. Bo et al., “Detecting driver’s smartphone usage via nonintrusively sensing driving dynamics,” IEEE Internet Things J., vol. 4, no. 2, pp. 340–350, Apr. 2017.
[107] Y. Wang et al., “Determining driver phone use by exploiting smartphone integrated sensors,” IEEE Trans. Mobile Comput., vol. 15, no. 8, pp. 1965–1981, Aug. 2016.
[108] J. Almazán, L. M. Bergasa, J. J. Yebes, R. Barea, and R. Arroyo, “Full auto-calibration of a smartphone on board a vehicle using IMU and GPS embedded sensors,” in Proc. IEEE Intell. Veh. Symp., Gold Coast, QLD, Australia, Jun. 2013, pp. 1374–1380.
[109] P. Nguyen, H. Nguyen, D. Nguyen, T. N. Dinh, H. M. La, and T. Vu, “ParkSense: Automatic parking positioning by leveraging in-vehicle magnetic field variation,” IEEE Access, vol. 5, pp. 25021–25033, Nov. 2017.
[110] F. J. Bruwer and M. J. Booysen, “Comparison of GPS and MEMS support for smartphone-based driver behavior monitoring,” in Proc. IEEE Symp. Series Comput. Intell., Cape Town, South Africa, Dec. 2015, pp. 434–441.
[111] S.-R. G. Christopoulos, S. Kanarachos, and A. Chroneos, “Learning driver braking behavior using smartphones, neural networks and the sliding correlation coefficient: Road anomaly case study,” IEEE Intell. Transp. Syst., vol. 20, no. 1, pp. 65–74, Jan. 2019.
[112] T. A. Butt, R. Iqbal, K. Salah, M. Aloqaily, and Y. Jararweh, “Privacy management in social Internet of vehicles: Review, challenges and blockchain based solutions,” IEEE Access, vol. 7, pp. 79694–79713, Jun. 2019.
[113] A. Rohunen, J. Markkula, M. Heikkila, and J. Heikkila, “Open traffic data for future service innovation: Addressing the privacy challenges of driving data,” J. Theor. Appl. Electron. Commerce Res., vol. 9, no. 3, pp. 71–89, Sep. 2014.
[114] L. Kang and S. Banerjee, “Practical driving analytics with smartphone sensors,” in Proc. IEEE Veh. Network. Conf. (VNC), Torino, Italy, Nov. 2017, pp. 303–310.
[115] R. Lotfi and M. Ghatee, “Smartphone based driving style classification using features made by discrete wavelet transform,” Mar. 2018, arXiv:1803.06213. [Online]. Available: https://arxiv.org/abs/1803.06213
[116] J. Chen, Z. C. Wu, and J. Zhang, “Driver identification based on hidden feature extraction by using adaptive nonnegativity-constrained autoencoder,” Appl. Soft Comput., vol. 74, pp. 1–9, Jan. 2019.
[117] H. Liu, T. Taniguchi, Y. Tanaka, K. Takenaka, and T. Bando, “Visualization of driving behavior based on hidden feature extraction by using deep learning,” IEEE Trans. Intell. Transp. Syst., vol. 18, no. 9, pp. 2477–2489, Sep. 2017.
[118] C. Zhang, H. Wang, and R. Fu, “Automated detection of driver fatigue based on entropy and complexity measures,” IEEE Trans. Intell. Transp. Syst., vol. 15, no. 1, pp. 168–177, Feb. 2014.
[119] M. Guerrieri, R. Mauro, G. Parla, and T. Tollazzi, “Analysis of kinematic parameters and driver behavior at turbo roundabouts,” J. Transp. Eng., A, Syst., vol. 144, no. 6, pp. 1–12, Mar. 2018.
[120] A. S. El-Wakeel, A. Noureldin, H. S. Hassanein, and N. Zorba, “Utilization of wavelet packet sensor de-noising for accurate positioning in intelligent road services,” in Proc. 14th Int. Wireless Commun. Mobile Comput. Conf. (IWCMC), Limassol, Cyprus, Jun. 2018, pp. 1231–1236.
[121] Y. Zhang, J. Li, Y. Guo, C. Xu, J. Bao, and Y. Song, “Vehicle driving behavior recognition based on multi-view convolutional neural network with joint data augmentation,” IEEE Trans. Veh. Technol., vol. 68, no. 5, pp. 4223–4234, May 2019.
[122] M. Baldauf, S. Dustdar, and F. Rosenberg, “A survey on context-aware systems,” Int. J. Ad Hoc Ubiquitous Comput., vol. 2, no. 4, pp. 263–277, 2007.
[123] P. Angkititrakul, C. Miyajima, and K. Takeda, “Impact of driving context on stochastic driver-behavior model: Quantitative analysis of car following task,” in Proc. IEEE Int. Conf. Veh. Electron. Saf., Istanbul, Turkey, Jul. 2012, pp. 163–168.
[124] S. Al-Sultan, A. H. Al-Bayatti, and H. Zedan, “Context-aware driver behavior detection system in intelligent transportation systems,” IEEE Trans. Veh. Technol., vol. 62, no. 9, pp. 4264–4275, Nov. 2013.
[125] D. Böhmländer, T. Dirndorfer, A. H. Al-Bayatti, and T. Brandmeier, “Context-aware system for pre-triggering irreversible vehicle safety actuators,” Accident Anal. Prevention, vol. 103, pp. 72–84, Jun. 2017.
[126] A. Rakotonirainy and F. D. Maire, “Context-aware driving behavioural model,” in Proc. Int. Tech. Conf. Enhanced Saf. Vehicles (ESV), Washington DC, USA, Jun. 2005, pp. 1–6.
[127] C. Ma, Q. Zhu, S. Wu, and B. Liu, “Representation learning from time labelled heterogeneous data for mobile crowdsensing,” Mobile Inf. Syst., vol. 2016, Aug. 2016, Art. no. 2097243.
[128] W. Zamora, C. T. Calafate, J.-C. Cano, and P. Manzoni, “A survey on smartphone-based crowdsensing solutions,” Mobile Inf. Syst., vol. 2016, Oct. 2016, Art. no. 9681842.
[129] G. Singh, D. Bansal, S. Sofat, and N. Aggarwal, “Smart patrolling: An efficient road surface monitoring using smartphone sensors and crowdsourcing,” Pervasive Mobile Comput., vol. 40, pp. 71–88, Sep. 2017.
[130] G. Merlino, S. Arkoulis, S. Distefano, C. Papagianni, A. Puliafito, and S. Papavassiliou, “Mobile crowdsensing as a service: A platform for applications on top of sensing clouds,” Future Gener. Comput. Syst., vol. 56, pp. 623–639, Mar. 2016.
[131] X. Li and D. W. Goldberg, “Toward a mobile crowdsensing system for road surface assessment,” Comput., Environ. Urban Syst., vol. 69, pp. 51–62, May 2018.
[132] A. S. El-Wakeel, A. Noureldin, H. S. Hassanein, and N. Zorba, “iDriveSense: Dynamic route planning involving roads quality information,” in Proc. IEEE Global Commun. Conf., Abu Dhabi, United Arab Emirates, Dec. 2018, pp. 1–6.
[133] F. Montori, P. P. Jayaraman, A. Yavari, A. Hassani, and D. Georgakopoulos, “The curse of sensing: Survey of techniques and challenges to cope with sparse and dense data in mobile crowd sensing for Internet of Things,” Pervasive Mobile Comput., vol. 49, pp. 111–125, Sep. 2018.
[134] H. M. Fahmy, M. A. A. El Ghany, and G. Baumann, “Vehicle risk assessment and control for lane-keeping and collision avoidance at low-speed and high-speed scenarios,” IEEE Trans. Veh. Technol., vol. 67, no. 6, pp. 4806–4818, Jun. 2018.
[135] C. M. Kang, S.-H. Lee, and C. C. Chung, “Multirate lane-keeping system with kinematic vehicle model,” IEEE Trans. Veh. Technol., vol. 67, no. 10, pp. 9211–9222, Oct. 2018.
[136] S. Malan, M. Milanese, P. Borodani, and A. Gallione, “Lateral control of autonomous electric cars for relocation of public urban mobility fleet,” IEEE Trans. Control Syst. Technol., vol. 15, no. 3, pp. 590–598, May 2007.
[137] M. Martinez-García, Y. Zhang, and T. Gordon, “Modeling lane keeping by a hybrid open-closed-loop pulse control scheme,” IEEE Trans. Ind. Informat., vol. 12, no. 6, pp. 2256–2265, Dec. 2016.
[138] W. Kim, Y. S. Son, and C. C. Chung, “Torque-overlay-based robust steering wheel angle control of electrical power steering for a lane-keeping system of automated vehicles,” IEEE Trans. Veh. Technol., vol. 65, no. 6, pp. 4379–4392, Jun. 2016.
[139] S.-J. Wu, H.-H. Chiang, J.-W. Perng, C.-J. Chen, B.-F. Wu, and T.-T. Lee, “The heterogeneous systems integration design and implementation for lane keeping on a vehicle,” IEEE Trans. Intell. Transp. Syst., vol. 9, no. 2, pp. 246–263, Jun. 2008.
[140] J. Ji, A. Khajepour, W. W. Melek, and Y. Huang, “Path planning and tracking for vehicle collision avoidance based on model predictive control with multiconstraints,” IEEE Trans. Ultrason. Eng., vol. 66, no. 2, pp. 952–964, Feb. 2017.
[141] P. Petrov and F. Nashashibi, “Modeling and nonlinear adaptive control for autonomous vehicle overtaking,” IEEE Trans. Intell. Transp. Syst., vol. 15, no. 4, pp. 1643–1656, Aug. 2014.
[142] D. I. Katzourakis, J. C. F. de Winter, M. Alirezaei, M. Corno, and R. Happee, “Road-departure prevention in an emergency obstacle avoidance situation,” IEEE Trans. Syst., Man, Cybern., Syst., vol. 44, no. 5, pp. 621–629, May 2014.
[143] Apple Machine Learning Journal. Accessed: Sep. 13, 2019. [Online]. Available: https://machinelearning.apple.com/
[144] A. Ignatov et al., “AI benchmark: Running deep neural networks on Android smartphones,” Oct. 2018, arXiv:1810.01109. [Online]. Available: https://arxiv.org/abs/1810.01109
[145] Deploy Machine Learning Models on Mobile and IoT Devices. Accessed: Sep. 13, 2019. [Online]. Available: https://www.tensorflow.org/lite
[146] Core Machine Learning (ML). Accessed: Sep. 13, 2019. [Online]. Available: https://developer.apple.com/documentation/coreml
[147] K. Chine, “Learning math and statistics on the cloud, towards an EC2-based Google docs-like portal for teaching/learning collaboratively with R and scilab,” in Proc. 10th IEEE Int. Conf. Adv. Learn. Technol., Sousse, Tunisia, Jul. 2010, pp. 752–753.
[148] E. Hormozi, H. Hormozi, M. K. Akbari, and M. S. Javan, “Using of machine learning into cloud environment (a survey): Managing and scheduling of resources in cloud systems,” in Proc. 7th Int. Conf. P2P, Parallel, Grid, Cloud Internet Comput., Victoria, BC, Canada, Nov. 2012, pp. 363–368.

Cheng Siong Chin (M’01–SM’09) received the B.Eng. degree in mechanical and production engineering from Nanyang Technological University (NTU) in 2000, the M.Sc. degree in advanced control and systems engineering from The University of Manchester in 2001, and the Ph.D. degree in applied control engineering from the Research Robotics Centre, NTU, in 2008. He is currently an Associate Professor with Newcastle University in Singapore, Singapore. His research interests include the design and simulation of complex systems for land and underwater vehicles in uncertain environment. He was the general chair and the technical review committee member for various international conferences. He served as a Lead Guest Editor for the Journal of Advanced Transportation (Special Issue on Intelligent Autonomous Transport Systems Design and Simulation).

Hao Chen (M’13) received the B.S. degree in chemical engineering from Sichuan University, China, in 1998, and the Ph.D. degree in computer science from the Huazhong University of Science and Technology, China, in 2005. He is currently a Professor with the College of Computer Science and Electronic Engineering, Hunan University, China. He has published more than 60 articles in journals and conferences, such as the IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, the IEEE TRANSACTIONS ON COMPUTERS, IPDPS, IWQoS, HiPC, and ICPP. His current research interests include parallel and distributed computing, operating systems, cloud computing, and systems security. He is a member of the ACM.

Xionghu Zhong (M’11) received the B.Eng. and M.Sc. degrees from Northwestern Polytechnical University, China, in 2003 and 2006, respectively,
Teck Kai Chan received the B.Eng. and M.Phil. and the Ph.D. degree from the Institute for Digi-
degrees from Newcastle University upon Tyne tal Communications, The University of Edinburgh,
in 2015 and 2018, respectively. He is currently in 2010. He was a Research Fellow with the School
pursuing the Ph.D. degree under the Singapore of Computer Engineering and a Senior Research
Industrial Postgraduate Programme with Visenti Pte. Fellow with the School of Electrical and Elec-
Ltd., and Newcastle University upon Tyne. He was tronic Engineering, Nanyang Technological Univer-
a Data Engineer with Visenti Pte Ltd., where he sity, Singapore. He was with Xylem Inc., as a Data
was involved in the investigation of advanced infor- Scientist from 2017 to 2018. He is currently a
mation fusion methods to coordinate the deployed Professor with the College of Computer Science and Electronic Engineering,
large-scale heterogeneous sensor network and the Hunan University, China. His research interests include statistical signal
development of models to achieve a reliable and processing, target localization and tracking, nonparametric Bayesian modeling
robust leak detection and localization performance. His current research and machine learning methods, and their applications to distant speech
interests include water leakage detection and localization algorithms, signal enhancement and recognition, V2X communications, and water distribution
processing, machine learning, and handling of imbalanced dataset. network monitoring.

