Arnab Barua, Mobyen Uddin Ahmed, Shaibal Barua and Shahina Begum
Abstract: Multimodal machine learning is a critical aspect in the development and advancement of AI systems. However, it encounters significant challenges while working with multimodal data, where one of the major issues is dealing with unlabelled multimodal data, which can hinder effective analysis. To address this challenge, this paper proposes a multimodal reasoning approach adopting second-order learning, incorporating grounding alignment and semi-supervised learning methods. The proposed approach is illustrated using unlabelled vehicular telemetry data. During the process, features were extracted from unlabelled telemetry data using an autoencoder and then clustered and aligned with true labels of neurophysiological data to create labelled and unlabelled datasets. In the semi-supervised approach, the Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) algorithms are applied to the labelled dataset, achieving a test accuracy of over 97%. These algorithms are then used to predict labels for the unlabelled dataset, which is later added to the labelled dataset to retrain the model. With the additional prior labelled data, both algorithms achieved a 99% test accuracy. Confidence in predictions for unlabelled data was validated using counting samples based on the prediction score and Bayesian probability. RF and XGBoost scored 91.26% and 97.87% in counting samples and 98.67% and 99.77% in Bayesian probability, respectively.
Barua, A., Ahmed, M., Barua, S., Begum, S. and Giorgi, A.
Second-Order Learning with Grounding Alignment: A Multimodal Reasoning Approach to Handle Unlabelled Data.
DOI: 10.5220/0012466500003636
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 16th International Conference on Agents and Artificial Intelligence (ICAART 2024) - Volume 2, pages 561-572
ISBN: 978-989-758-680-4; ISSN: 2184-433X
Proceedings Copyright © 2024 by SCITEPRESS – Science and Technology Publications, Lda.
neurophysiological data analysis. The nature of vehicular telemetry data is complex, with vehicles continuously generating large volumes of it. It is a topic of great importance, as reported in various literature (Winlaw et al., 2019; Alhamdan and Jilani, 2019).

The aim of this paper is to provide a solution for handling unlabelled multimodal data through multimodal reasoning. The proposed approach of multimodal reasoning has two phases. In the first phase, 'first-order learning', key features are extracted from the telemetry data and then clustered into distinct groups. In the subsequent phase, 'second-order learning', the clustered data is aligned with external labels from neurophysiological data analysis. This enables the use of a semi-supervised learning approach to classify unlabelled vehicular telemetry data. An autoencoder extracts features using unsupervised learning (Bank et al., 2023), and k-means clustering (Na et al., 2010) divides them into groups. Labels are aligned using a supervised alignment approach, and Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) are used for classification and labelling in the semi-supervised approach.

This study has made several contributions to unlabelled data handling:

• Multimodal Reasoning: The synergy of alignment and semi-supervised learning enhances multimodal reasoning and provides a reliable solution for analysing unlabelled data.

• Knowledge Representation: Autoencoders are used for knowledge representation, capturing essential information from unlabelled data and presenting it in a compressed, latent form.

• Supervised Alignment and Semi-Supervised Prediction: Supervised alignment of unlabelled data with true labels from different data helps in identifying and categorising similar patterns in the unlabelled data. This enables a semi-supervised method, which enhances the model's ability to classify and understand the unlabelled dataset more accurately.

• Confidence Assessment for Multimodal Reasoning: Two validation strategies, i.e., Bayesian probability analysis and counting the frequency of samples based on model prediction scores, were implemented to ensure accurate predictions for unlabelled samples.

• Cross-Domain Applicability: Enhancing vehicular data analysis through multimodal reasoning provides a solution for managing unlabelled vehicular telemetry data. Additionally, this solution provides a template for addressing similar challenges in various fields, including healthcare, environmental monitoring, finance, and manufacturing.

The paper is organised as follows: Section 2 provides a summary of works on multimodal reasoning, its approaches, and vehicular telemetry data. Section 3 describes the details of the applied methodology and the materials used. Section 4 presents the results with figures. Finally, Section 5 discusses the performed analysis and concludes the paper.

2 RELATED WORKS

Several notable studies have recently highlighted the advances made in multimodal reasoning. In (Zheng et al., 2023), the authors propose the DDCoT prompting technique, which combines visual recognition with critical thinking prompts to improve the reasoning abilities and explainability of language models in multimodal contexts. The authors in (Lu et al., 2022) show that chain-of-thought (CoT) prompting significantly improves the performance of large language models in both few-shot and fine-tuning settings, underscoring the potential of explanations in enhancing AI reasoning capabilities. In (Zhu et al., 2022), multimodal reasoning is achieved through reverse-hyperplane projection on Specific Disease Knowledge Graphs (SDKGs) using structure, category, and description embeddings. A semi-supervised study on multimodal reasoning is explored in (Liang et al., 2023); it involves quantifying interactions between labelled unimodal and unlabelled multimodal data.

Vehicular telemetry data is a valuable source of information that can be used to analyse driver behaviour, ensure identification, and improve safety on roads. Several articles, such as (Cassias and Kun, 2007; Kirushanth and Kabaso, 2018; Gupta et al., 2023; Rahman et al., 2020), have highlighted the importance of analysing telemetry data for driver identification, behaviour analysis, and road safety. However, labelling vehicular telemetry data for specific tasks like driver identification and behaviour prediction is challenging for researchers due to varied driving patterns, diverse driving conditions, and traffic conditions, as discussed in (Singh and Kathuria, 2021; Respati et al., 2018; Tselentis and Papadimitriou, 2023).

Different methods exist to solve this problem, and one way is to annotate the labels manually. In (Aboah et al., 2023), video data is used to label telemetry data with class numbers assigned by an expert annotator. In (Taylor et al., 2016), an expert annotator is used. In (Wang et al., 2017), parameters are clustered and re-
ages the model to learn a sparse representation of the input data in the latent space. Early stopping criteria are applied to the validation loss, with a patience of 10 and a min_delta of 0.0001. Seven features were derived from the encoder part, as the output shape of the last encoder layer is 7, and the number of samples remains the same as the input.

Table 1: Summary of the autoencoder model.

Layers             | Output Shape | Param
input (InputLayer) | (None, 19)   | 0
dense (Dense)      | (None, 12)   | 240
dense_2 (Dense)    | (None, 7)    | 91
dense_3 (Dense)    | (None, 12)   | 96
dense_4 (Dense)    | (None, 19)   | 247
Total params: 674
Trainable params: 674

Clustering. Since there is no ground truth available for the collected telemetry data, the encoded data is segregated using a clustering approach. The k-means clustering algorithm is used in this research paper, preferred over others due to its simplicity, convenience, and efficiency, especially when dealing with a large dataset (Hu et al., 2023; Na et al., 2010). Here, the number of clusters selected for developing the k-means is 3, determined using the elbow method. Apart from the number of clusters, the other hyperparameters are set to their default values: max_iter is 300, n_init is 10, and init is k-means++. Results after applying the k-means clustering algorithm are discussed in Section 4.

3.3 Second Order Learning

Supervised alignment, sample selection, and semi-supervised learning are the three steps of second-order learning. Details for each step are provided below.

Supervised Alignment. The clustering algorithm produced a good result, with distinctive separation between data points and only a few overlaps. However, without ground truth, it is impossible to identify the meaning of each cluster. To overcome this challenge, the study employed the multimodal alignment approach, which looks for relationships between instances from two or more modalities (Baltrušaitis et al., 2018). Specifically, the supervised alignment technique was used, where data are aligned with labels from different sources to guide the alignment process (Huang et al., 2023).

As vehicular telemetry data was being collected, neurophysiological data was also captured simultaneously. Experts in the field evaluated this data and assigned labels to each minute. These labels were generated using the mind drowsiness and eye blink rate indices. The process of assigning these labels can be found in (Di Flumeri et al., 2022; Di Flumeri et al., 2016). Binary values were used to label the data, with 0 indicating high and 1 indicating low mental fatigue. Before aligning the labels with the encoded features, the minutes that the experts did not label for mind drowsiness and eye blink rate are labelled as 2. Afterwards, alignment is performed, fusing the encoded features and labels. This helps establish a relationship between vehicular telemetry data and neurophysiological data.

Sample Selection. After the alignment process, a similarity check is performed between the cluster labels and the labels of drowsiness and eye-blink rate. Out of 85975 encoded samples, the labels of 37429 samples are correctly matched with mind drowsiness, while the labels of 36679 samples are correctly matched with eye-blink rate. Among the 37429 samples matched with mind drowsiness, 2898 samples fall into labels 0 and 1, whereas the rest are labelled as 2. Similarly, among the 36679 samples matched with the eye-blink rate, 2341 samples belong to labels 0 and 1, and the rest belong to label 2. Therefore, the 2898 labelled samples of mind drowsiness and the 2341 labelled samples of eye-blink rate were merged, and after dropping duplicates, a total of 5055 encoded samples were used for further processing.

Semi-Supervised Learning. The dataset, comprising 5055 encoded samples related to mind drowsiness and eye blink rate with labels 0 and 1, can be considered a labelled dataset and is suitable for binary classification. However, there is a concern regarding the samples labelled as 2, because they do not necessarily indicate low or high levels of mental fatigue. Specifically, 34531 out of 37429 such samples are related to mind drowsiness, and 34338 out of 36679 are related to eye blink rate. To address this issue, a semi-supervised learning approach is employed.

In machine learning, a common challenge is dealing with large amounts of unlabelled data, and one way to address this is through semi-supervised learning, which combines labelled and unlabelled data to build a good classifier (Zhu, 2005). The self-training approach has been employed in this study from the range of semi-supervised techniques. In this approach, a supervised classifier is first trained on a small amount of labelled data. Then, the trained model is used to predict the labels for the unlabelled data.
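The alignment and sample-selection steps of Section 3.3 can be sketched in a few lines of NumPy. This is a minimal illustration based on our reading of the text, not the authors' released code; the array names and the toy values are hypothetical.

```python
import numpy as np

def align_and_select(encoded, cluster_labels, expert_labels):
    """Supervised alignment followed by sample selection.

    Keeps samples whose cluster id agrees with the per-minute expert
    label (the "similarity check") and whose label is defined (0 or 1),
    then drops duplicate rows, as in the paper's merge step.
    """
    # similarity check: cluster label must match the expert label
    matched = cluster_labels == expert_labels
    # defined labels only (0 = high, 1 = low mental fatigue; 2 = unlabelled minute)
    defined = np.isin(expert_labels, [0, 1])
    keep = matched & defined
    # fuse encoded features with their labels, then drop duplicates
    rows = np.column_stack([encoded[keep], expert_labels[keep]])
    return np.unique(rows, axis=0)

# Toy example: 6 encoded samples with 2 features (real features are 7-dimensional)
enc = np.array([[0.1, 0.2], [0.1, 0.2], [0.9, 0.8],
                [0.5, 0.5], [0.3, 0.4], [0.7, 0.6]])
clusters = np.array([0, 0, 1, 2, 2, 1])
labels = np.array([0, 0, 1, 2, 0, 2])  # 2 = minute not labelled by the expert

out = align_and_select(enc, clusters, labels)
```

On real data, `encoded` would hold the autoencoder features and `expert_labels` the per-minute annotations, with 2 marking unlabelled minutes.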
The most confident predictions are then added to the labelled data to re-train the model. To implement the self-training approach, the labelled dataset of 5055 encoded samples was utilised and classified using the RF and XGBoost algorithms. The results show an accuracy of 98% and 97% using RF and XGBoost, respectively. The default hyperparameters used to build the RF and XGBoost are presented in Table 2.

Table 2: Hyperparameters used in RF and XGBoost for classifying.

Classifier Models | Hyperparameter Details
Random Forest     | n_estimators: 100; criterion: gini; min_samples_split: 2; min_samples_leaf: 1
XGBoost           | n_estimators: 100; learning_rate: 0.3; eval_metric: logloss; booster: gbtree

The samples labelled 2 were used to create an unlabelled dataset. Rather than including all of them, the focus was on the 34531 samples related to mind drowsiness out of 37429 total samples. This resulted in a final unlabelled dataset of 34531 samples. The dataset was used to test both trained models. Based on the probability of prediction, the unlabelled samples were labelled according to their class. The labelled samples were then merged with the prior labelled dataset, which consisted of 5055 samples, to retrain the model. The retrained models produced excellent test accuracy results of over 98%.

3.4 Validation

The probability prediction score for both classes is analysed to determine whether an undefined sample should be relabelled as 0, indicating high, or 1, indicating low mental fatigue. This analysis helps determine the confidence of the model. In this paper, two approaches were used to validate the model's confidence percentage.

In the first approach, to determine the confidence of a model, the prediction probability distribution of any class on unlabelled samples is first selected. Next, two threshold values are defined by analysing the probability score. These threshold values are then used to create a condition, and the number of samples that satisfy it is counted. The percentage of samples that satisfy the condition can be considered the confidence of the model. For instance, suppose there are two threshold values, 0.7 and 0.4, both chosen by analysing the prediction probability score. The condition is to count the number of samples that have a probability score greater than 0.7 or less than 0.4. The total number of samples that satisfy this condition is used to calculate the percentage. Algorithm 1 presents the first approach in pseudo-code.

Algorithm 1: Evaluating model confidence on unlabelled data using the model's probability score.

    Data: unlabelledData - unlabelled dataset
    Result: percentage of samples meeting the confidence criteria
    Assume: the model is already trained on the labelled dataset
    probabilities ← Predictions(unlabelledData)
    determine thresholdLow, thresholdHigh from probabilities
    satisfyingSamples ← 0
    foreach probability in probabilities do
        if probability ≤ thresholdLow or probability ≥ thresholdHigh then
            satisfyingSamples ← satisfyingSamples + 1
        end
    end
    return (satisfyingSamples / length(unlabelledData)) × 100

The second approach used to validate the model's confidence is Bayesian probability, a systematic method for updating beliefs based on new evidence (Meyniel et al., 2015). It calculates the posterior probability using Bayes' theorem. Since there are two classes, 0 and 1, the posterior probability for a sample is calculated for both classes. It is then compared with the prediction probability score. The confidence is calculated by counting, for each sample, the agreement between the prediction probability and the posterior probability. Algorithm 2 presents the pseudo-code for the second approach.

4 RESULTS

4.1 Results from First-Order Learning

Prior to inserting the data for feature extraction, Min-Max Normalization is performed to ensure that all signals are scaled to a similar range, resulting in better performance. Following this, the normalised data is passed through the autoencoder for feature extraction. The autoencoder training process had a total of 300 epochs, but it stopped at epoch 46 due to the early stopping criteria being met.
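The early-stopping rule used for the autoencoder (patience of 10 epochs, minimum improvement of 0.0001 on the validation loss) can be written as a small stand-alone monitor. This is a sketch of the criterion itself, with hypothetical function and variable names, not the authors' training code:

```python
def early_stop_epoch(val_losses, patience=10, min_delta=0.0001):
    """Return the 1-based epoch at which training stops, or None.

    Training halts once the validation loss has not improved by more
    than `min_delta` for `patience` consecutive epochs, mirroring the
    patience=10 / min_delta=0.0001 setting described in the paper.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if best - loss > min_delta:  # counts as an improvement
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None  # ran all epochs without triggering early stopping

# Example: loss improves for 3 epochs, then stagnates for 10
stop = early_stop_epoch([1.0, 0.5, 0.4] + [0.4] * 10)
```

In the paper's run, the same logic fires at epoch 46 after stagnation from epoch 37 onwards.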
The validation loss did not show any improvement greater than 0.0001 over the last 10 epochs, from epoch 37 to 46. This satisfied the conditions for early stopping with patience = 10 and min_delta = 0.0001. Therefore, based on the trend of the training and validation loss and the early stopping criteria, the model effectively converged by epoch 46. Figure 3 displays the training and validation loss per epoch.

Algorithm 2: Evaluating model confidence on unlabelled data using Bayesian probability.

    Data: unlabelledData - unlabelled dataset
    Result: percentage of consistent samples
    Assume: the model is already trained on the labelled dataset
    probabilities ← Predictions(unlabelledData)
    posteriors ← BayesianPosterior(probabilities, Prior)
    consistent ← 0
    foreach sample in unlabelledData do
        extract probClass0, probClass1 from probabilities of sample
        extract posteriorClass0, posteriorClass1 from posteriors of sample
        if (probClass0 > probClass1 and posteriorClass0 > posteriorClass1) or
           (probClass0 < probClass1 and posteriorClass0 < posteriorClass1) then
            consistent ← consistent + 1
        end
    end
    return (consistent / length(unlabelledData)) × 100

The correlation between the extracted features was analysed, and Figure 4 shows the correlation matrix of the extracted features. From the figure, it can be observed that there is no evidence of strong linear dependency between any pair of features.

Figure 4: Correlation matrix of encoded features.

The extracted dataset cannot be labelled as there is no ground truth available. So, to extract meaning, the dataset was clustered using the k-means algorithm with k equal to 3. The resulting clusters were visualised using t-SNE, a popular tool for displaying data in a two-dimensional scatter plot. Figure 5 shows the 2D t-SNE visualisation of the clustering result. From the figure, it is evident that all three clusters are well separated from each other. Cluster 2 has the highest number of data points, while the maximum number of data points overlaps between clusters 0 and 1.
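The first-order clustering step can be reproduced in outline with scikit-learn. The synthetic blobs below merely stand in for the non-public 7-dimensional encoded features; the hyperparameters mirror those stated in the methodology (k = 3 via the elbow method, max_iter = 300, n_init = 10, init = k-means++).

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Stand-in for the 7-dimensional encoded features: three well-separated blobs
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(100, 7)) for c in (0.0, 3.0, 6.0)])

# Elbow method: inspect inertia for candidate numbers of clusters
inertias = {k: KMeans(n_clusters=k, init="k-means++", n_init=10,
                      max_iter=300, random_state=0).fit(X).inertia_
            for k in range(1, 7)}

# Final model with k = 3, matching the paper's elbow-method choice
labels = KMeans(n_clusters=3, init="k-means++", n_init=10,
                max_iter=300, random_state=0).fit_predict(X)
```

On real encoded features, the elbow would be read off a plot of `inertias`, and the resulting `labels` would feed the supervised alignment of Section 3.3.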
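Algorithms 1 and 2 can be rendered as short NumPy functions. The uniform prior of 0.5 in the Bayesian check is an assumption for illustration, as the paper does not state the prior used; the threshold defaults follow the 0.7/0.4 example in Section 3.4.

```python
import numpy as np

def threshold_confidence(probs, low=0.4, high=0.7):
    """Algorithm 1: share of samples with a decisive probability
    score (<= low or >= high), returned as a percentage."""
    probs = np.asarray(probs, dtype=float)
    decisive = (probs <= low) | (probs >= high)
    return decisive.mean() * 100

def bayesian_consistency(probs, prior=0.5):
    """Algorithm 2: share of samples where the Bayesian posterior
    agrees with the predicted class, returned as a percentage."""
    p1 = np.asarray(probs, dtype=float)  # model's probability for class 1
    p0 = 1.0 - p1
    # posterior via Bayes' theorem for the two-class case
    post0 = p0 * prior / (p0 * prior + p1 * (1.0 - prior))
    post1 = 1.0 - post0
    agree = ((p0 > p1) & (post0 > post1)) | ((p0 < p1) & (post0 < post1))
    return agree.mean() * 100
```

With a 0.5 prior the posterior reduces to the prediction probability itself, so the Bayesian check becomes informative when a non-uniform, data-derived prior is supplied.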
To solve this problem, a supervised alignment approach was applied between the vehicle telemetry data and the true labels from the neurophysiological data analysis. The main reason for aligning these two is that both data are collected simultaneously during the experiment of each participant. The contribution of the supervised alignment lies in its ability to transfer knowledge from neurophysiological data to vehicular telemetry data, thereby enhancing the learning process. The process of the alignment is described in subsection 3.3 of section 3. After aligning the encoded features and labels, it becomes apparent that there is an overlap between each class. Figure 8 displays the Gaussian distribution of samples of feature six into three classes, where the labels of the mind drowsiness data were aligned with the encoded features. From the figure, it is clear that samples labelled as 2, which are undefined, are situated between the samples labelled as 0 and 1 (0 means high mental fatigue, and 1 means low mental fatigue).

Figure 8: Example of overlapping of three classes considering one feature (Feature no 7).

The observation in Figure 8 suggests that the classification of the aligned dataset may not effectively assign meaning to undefined samples labelled as 2. The semi-supervised approach was chosen to address the problem because it makes optimal use of the limited labelled data for initial training, and then further improves the model's performance and accuracy by incorporating a larger pool of unlabelled data. This offers a more comprehensive learning approach compared to other methods like active learning or transfer learning. Prior to the semi-supervised approach, a sample selection process was carried out (as described in subsection 3.3 of section 3), in which 5055 samples labelled as 0 and 1 (high and low mental fatigue) were chosen. At the beginning of the semi-supervised approach, the RF and XGBoost algorithms were used to classify the labelled samples, achieving over 96% accuracy on test data. The unlabelled dataset of 34531 samples was labelled based on the prediction probability score obtained from testing it on the trained RF and XGBoost. The newly labelled dataset was combined with the previously labelled one, and binary classification was performed, where both RF and XGBoost achieved over 99% accuracy on test data. More information about the results can be found in Table 3, and the confusion matrices for the test data for both models can be found in Figure 6. The contribution of semi-supervised learning is based on the satisfactory results it has achieved, which can be observed in two significant ways. Firstly, it validates the reliability and effectiveness of the RF and XGBoost models in accurately classifying data. Secondly, it highlights the potential of semi-supervised techniques in efficiently utilising a combination of labelled and unlabelled data.

Validating the model's confidence in its predictions is crucial. The validation procedure is explained in subsection 3.4 of section 3. Two approaches are taken to validate the confidence. The first approach focuses on validating how accurately the models can predict the class of unlabelled samples by using thresholds to evaluate the decisiveness of the model's predictions. It is an effective way to measure the model's performance on unlabelled data, using probability scores as a metric of confidence. The second approach utilises Bayesian probability to validate the model's predictions by comparing the initial prediction probabilities with posterior probabilities. This technique provides a more nuanced perspective of the model's confidence in its classifications, and it quantifies the level of certainty across the dataset. Algorithms 1 and 2 present the pseudo-code for the first and second approaches, respectively.

The multimodal reasoning approach used in this research to handle unlabelled data has shown great promise for cross-domain applications. It is a highly efficient approach that involves combining and analysing data from various sources, and a versatile strategy that can be adapted to different industries. For example, in healthcare, it can be used to integrate different patient data, while in environmental studies, it can help to integrate diverse ecological data sets. The principles of multimodal reasoning remain applicable and effective in various fields. The capacity to extract valuable insights from heterogeneous and unlabelled data sources has immense implications, furnishing a sturdy foundation for multiple industries that confront comparable issues of data integration and analysis. This method establishes a standard template for future investigation and applications, highlighting the possibilities of multimodal reasoning to promote innovative solutions in diverse fields where data is abundant but often not unambigu-
REFERENCES

Islam, M. R., Barua, S., Ahmed, M. U., Begum, S., Aricò, P., Borghini, G., and Di Flumeri, G. (2020). A novel mutual information based feature set for drivers' mental workload evaluation using machine learning. Brain Sciences, 10(8):551.

Kirushanth, S. and Kabaso, B. (2018). Telematics and road safety. In 2018 2nd International Conference on Telematics and Future Generation Networks (TAFGEN), pages 103–108. IEEE.

Li, P. and Vu, Q. D. (2015). A simple method for identifying parameter correlations in partially observed linear dynamic models. BMC Systems Biology, 9(1):1–14.

Li, Z., Chen, L., Peng, J., and Wu, Y. (2017). Automatic detection of driver fatigue using driving operation information for transportation safety. Sensors, 17(6):1212.

Liang, P. P., Ling, C. K., Cheng, Y., Obolenskiy, A., Liu, Y., Pandey, R., Wilf, A., Morency, L.-P., and Salakhutdinov, R. (2023). Multimodal learning without labeled multimodal data: Guarantees and applications. arXiv preprint arXiv:2306.04539.

Lu, P., Mishra, S., Xia, T., Qiu, L., Chang, K.-W., Zhu, S.-C., Tafjord, O., Clark, P., and Kalyan, A. (2022). Learn to explain: Multimodal reasoning via thought chains for science question answering. Advances in Neural Information Processing Systems, 35:2507–2521.

Meyniel, F., Sigman, M., and Mainen, Z. F. (2015). Confidence as Bayesian probability: From neural origins to behavior. Neuron, 88(1):78–92.

Na, S., Xumin, L., and Yong, G. (2010). Research on k-means clustering algorithm: An improved k-means clustering algorithm. In 2010 Third International Symposium on Intelligent Information Technology and Security Informatics, pages 63–67. IEEE.

Papadelis, C., Chen, Z., Kourtidou-Papadeli, C., Bamidis, P. D., Chouvarda, I., Bekiaris, E., and Maglaveras, N. (2007). Monitoring sleepiness with on-board electrophysiological recordings for preventing sleep-deprived traffic accidents. Clinical Neurophysiology, 118(9):1906–1922.

Rahman, H., Ahmed, M. U., Barua, S., and Begum, S. (2020). Non-contact-based driver's cognitive load classification using physiological and vehicular parameters. Biomedical Signal Processing and Control, 55:101634.

Respati, S., Bhaskar, A., and Chung, E. (2018). Traffic data characterisation: Review and challenges. Transportation Research Procedia, 34:131–138.

Rumelhart, D. E., Hinton, G. E., Williams, R. J., et al. (1985). Learning internal representations by error propagation.

Siami, M., Naderpour, M., and Lu, J. (2020). A mobile telematics pattern recognition framework for driving behavior extraction. IEEE Transactions on Intelligent Transportation Systems, 22(3):1459–1472.

Singh, H. and Kathuria, A. (2021). Analyzing driver behavior under naturalistic driving conditions: A review. Accident Analysis & Prevention, 150:105908.

Taylor, P., Griffiths, N., Bhalerao, A., Anand, S., Popham, T., Xu, Z., and Gelencser, A. (2016). Data mining for vehicle telemetry. Applied Artificial Intelligence, 30(3):233–256.

Thiffault, P. and Bergeron, J. (2003). Monotony of road environment and driver fatigue: A simulator study. Accident Analysis & Prevention, 35(3):381–391.

Tselentis, D. I. and Papadimitriou, E. (2023). Driver profile and driving pattern recognition for road safety assessment: Main challenges and future directions. IEEE Open Journal of Intelligent Transportation Systems.

Vasudevan, K., Das, A. P., Sandhya, B., and Subith, P. (2017). Driver drowsiness monitoring by learning vehicle telemetry data. In 2017 10th International Conference on Human System Interactions (HSI), pages 270–276. IEEE.

Wang, K., Yang, J., Li, Z., Liu, Y., Xue, J., and Liu, H. (2022). Naturalistic driving scenario recognition with multimodal data. In 2022 23rd IEEE International Conference on Mobile Data Management (MDM), pages 476–481. IEEE.

Wang, W., Xi, J., Chong, A., and Li, L. (2017). Driving style classification using a semisupervised support vector machine. IEEE Transactions on Human-Machine Systems, 47(5):650–660.

Winlaw, M., Steiner, S. H., MacKay, R. J., and Hilal, A. R. (2019). Using telematics data to find risky driver behaviour. Accident Analysis & Prevention, 131:131–136.

Zheng, G., Yang, B., Tang, J., Zhou, H.-Y., and Yang, S. (2023). DDCoT: Duty-distinct chain-of-thought prompting for multimodal reasoning in language models. arXiv preprint arXiv:2310.16436.

Zhu, C., Yang, Z., Xia, X., Li, N., Zhong, F., and Liu, L. (2022). Multimodal reasoning based on knowledge graph embedding for specific diseases. Bioinformatics, 38(8):2235–2245.

Zhu, X. J. (2005). Semi-supervised learning literature survey.