Engineering Applications of Artificial Intelligence 119 (2023) 105722


A new one-dimensional testosterone pattern-based EEG sentence classification method
Tugce Keles a, Arif Metehan Yildiz a, Prabal Datta Barua b,c, Sengul Dogan a,∗, Mehmet Baygin d, Turker Tuncer a, Caner Feyzi Demir e, Edward J. Ciaccio f, U. Rajendra Acharya g,h,i

a Department of Digital Forensics Engineering, College of Technology, Firat University, Elazig, Turkey
b School of Business (Information System), University of Southern Queensland, Toowoomba, QLD 4350, Australia
c Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia
d Department of Computer Engineering, Faculty of Engineering, Ardahan University, Ardahan, Turkey
e Department of Neurology, Firat University Hospital, Firat University, 23119 Elazig, Turkey
f Department of Medicine, Columbia University Irving Medical Center, New York, NY 10032, USA
g Ngee Ann Polytechnic, Department of Electronics and Computer Engineering, 599489, Singapore
h Department of Biomedical Engineering, School of Science and Technology, SUSS University, Singapore
i Department of Biomedical Informatics and Medical Engineering, Asia University, Taichung, Taiwan

ARTICLE INFO

Keywords:
Testosterone pattern
EEG sentence classification
Hand-modeled learning
Iterative majority voting
Self-organized model
Machine learning

ABSTRACT

Electroencephalography (EEG) signals are crucial data to understand brain activities. Thus, many papers have been proposed about EEG signals. In particular, machine learning techniques have been used/presented to extract information from EEG signals. However, there are limited works on sentence classification using this data. To fill this gap, we propose an automated EEG signal classification model. In this model, we have presented a new molecular-based feature extractor, which utilizes a graph of the testosterone molecular structure. The proposed testosterone graph-based pattern is a nature-inspired pattern. The motivation is to show the feature extraction capability of chemical-based graphs. Hence, we presented a hand-modeled EEG classification architecture. Our architecture uses wavelet packet decomposition (WPD) to generate wavelet bands to extract low- and high-level features. The statistical feature extraction function has been used to generate statistical features, and our proposed testosterone pattern (TesPat) generates textural features. A feature selector (neighborhood component analysis) has been used to choose the most informative features. Channel-wise results have been calculated by deploying a shallow classifier (k nearest neighbors). Majority voting has been conducted to create general results, and our proposed model selects the best predicted labels vector. Our proposed model attained a classification accuracy of >97% with 10-fold cross-validation (CV) and >91% with leave-one-subject-out (LOSO) CV. Our high classification results demonstrate that our presented system is an accurate and robust sentence classification model. The novelty of this work is the development of an accurate testosterone-based learning model using three EEG sentence datasets.

1. Introduction

The electroencephalogram (EEG) is a data type that reflects the brain's electrical activity (Shah et al., 2022). It has been utilized for many years to understand and decode the human brain (Altaheri et al., 2021; Wang and Ji, 2022). EEG is generally used to diagnose neurological disease (Cao et al., 2022; Lima et al., 2022). The development of brain–computer interfaces has made EEG signals applicable for non-medical purposes (Värbu et al., 2022). Some of the usage areas of EEG signals include emotion recognition (Li et al., 2022), fatigue detection (Subasi et al., 2022), and mental performance detection (Kim et al., 2021).

Nowadays, brain-to-text is one of the hot topics frequently studied in the literature (Panachakel and Ramakrishnan, 2021). The main purpose of this topic is to predict imagined letters, words, or symbols by decoding brain imagery (Saminu et al., 2021). In one of these studies, imagined words were classified using a convolutional neural network (CNN) (Datta and Boulgouris, 2021). The study selected words in two groups (5 nouns and 5 verbs), and EEG signals were collected from the participants. Thereafter, the spectrograms of these signals were

∗ Corresponding author.
E-mail addresses: 201144107@firat.edu.tr (T. Keles), 211144202@firat.edu.tr (A.M. Yildiz), prabal.barua@usq.edu.au (P.D. Barua), sdogan@firat.edu.tr
(S. Dogan), mehmetbaygin@ardahan.edu.tr (M. Baygin), turkertuncer@firat.edu.tr (T. Tuncer), cfdemir@firat.edu.tr (C.F. Demir), ciaccio@columbia.edu
(E.J. Ciaccio), aru@np.edu.sg (U.R. Acharya).

https://doi.org/10.1016/j.engappai.2022.105722
Received 29 July 2022; Received in revised form 14 November 2022; Accepted 6 December 2022
Available online 21 December 2022
0952-1976/© 2022 Elsevier Ltd. All rights reserved.


extracted and classified with CNN. The proposed method reached 84.6% accuracy in distinguishing grammatical classes (binary classification). Vorontsova et al. (2021) created a dataset containing eight different Russian words and pseudo-words and classified this dataset using CNN and a recurrent neural network (RNN). The authors achieved an 84.5% classification accuracy in this study. García-Salinas et al. (2019) collected and classified 5 Spanish words from 27 native Spanish-speaking subjects. An average accuracy value of 68.9% was obtained with the classification process. Pawar and Dhage (2020) collected six subjects' EEG signals for four different words. Kernel-based extreme machine learning was used to classify imagined words, and a 49.77% classification accuracy was achieved for multiclass classification. Cooney et al. (2020) used a custom-designed CNN to classify imagined speech. This study performed vowel and word classification, and EEG signals were collected from 15 subjects. The maximum accuracy values achieved for the imagined word and vowel are 30.36% and 33.20%, respectively. Dash et al. (2022) proposed a multivariate fast and adaptive empirical mode decomposition-based method to classify imagined words. The dataset used in the study includes EEG signals of 6 commands from 15 subjects. The proposed method provided a 21.53% accuracy for multiclass classification. Bakhshali et al. (2022) classified 4 vowels from 8 subjects. The proposed method uses connectivity matrices and a support vector machine (SVM). In this investigation, the hold-out validation method was used, and an average classification accuracy of 81.1% was calculated from eight subjects. Kamble et al. (2022) proposed an EEG signal classification method including decomposition, statistical feature extraction (Kuncan et al., 2019), Kruskal–Wallis feature selection (Ali Khan et al., 2014), and classification. This paper achieved a maximum classification accuracy of 89.6% for binary classification and 61.1% for multiclass classification.

As can be seen from the literature studies, researchers generally perform word/letter/syllable classification using EEG signals. In addition, it is seen from these studies that the methods in the literature demonstrate a low classification accuracy in brain-to-text. While implementing this study, some research gaps identified in the literature became evident and are given below:

• Studies in the literature generally focus on word/letter/syllable classification.
• The studies conducted in the past have used a limited number of word/letter combinations for classification. In contrast, we classify EEG sentences in this work.
• The methods proposed in the literature, which use CNN-based models, have high computational complexity.
• Various CNN-based EEG classification models exist in the literature since CNNs are good solutions for obtaining high classification accuracies. However, deep models (especially CNNs) have exponential time complexities. Therefore, we proposed a graph-based feature engineering model to solve the computational complexity problem. We selected and utilized a nature-based graph, the common and well-known molecular structure of testosterone. Our main motivation is to investigate the use of chemical graph-based patterns for the classification of EEG sentences.
• State-of-the-art methods have reported low classification accuracy.

The one-dimensional testosterone pattern-based EEG sentence classification method proposed in this study fills these research gaps. The details of the proposed method are presented in the subsections.

1.1. Motivation and our model

In the literature, deep learning and machine learning methods (Ostad-Ali-Askari et al., 2017; Ostad-Ali-Askari and Shayan, 2021; Deng et al., 2020) are employed in many different fields, such as engineering (Ostad-Ali-Askari et al., 2017; Zhao et al., 2022; Song et al., 2022; Deng et al., 2022) and biomedical applications (Kaplan et al., 2022; Dogan et al., 2022). The main purpose of these methods is to provide a helpful system to improve expertise. For example, in biomedical applications, different systems are presented using the basic parameters doctors use, such as EEG, ECG, and EMG.

EEG signals have generally been used to diagnose brain-related disorders and for emotion recognition (Xu et al., 2022; Houssein et al., 2022; Balasubramanian et al., 2022). In this work, three new datasets containing EEG signals have been employed per their association with sentences. Therefore, this problem is termed EEG sentence classification. A graph-based advanced signal classification model was proposed to solve the problem since graph-based learning models outperform other models. The aim is to develop a special graph-based feature extraction function. Therefore, we used a graph of the testosterone hormone to create a graph-based feature extraction function termed TesPat. We found this graph in PubChem (2022). Most machine learning models have aimed to detect hormones/molecules using these graphs. However, the graphs have individual shapes and can be utilized as components for machine learning. The main motivation of the proposed TesPat is to investigate the feature extraction ability of the chemical depiction of the testosterone hormone. This feature extraction function (TesPat) has been tested on the EEG sentence classification problem. This model is one of the initial models of a new machine learning methodology named chemical graphs-based machine learning. Since there are millions of graphs in PubChem, we proposed a new deep-learning model using these graphs.

This research presents a handcrafted learning model to classify EEG signals accurately. A highly accurate model must extract distinctive features at high and low levels, but handcrafted models extract features at low levels. Therefore, a multileveled feature extraction methodology should be used to generate distinctive features, and we chose the WPD process to do this. The handcrafted features have been divided into two branches: textural features and statistical features. We have used TesPat for textural feature extraction and a statistical feature extraction function (created using statistical moments) to generate features in both the space and frequency (using wavelet subbands) domains. The generated features were merged, and the final feature vector was then created. The most informative features have been selected by deploying neighborhood component analysis (NCA), and these features were classified by deploying a kNN classifier with 10-fold cross-validation. Finally, IMV was applied, and the majority vote results were computed. We used three datasets with 14 channels. Thus, 26 (=14 + 12) predicted vectors were created for each dataset. In the last layer, the most accurate predicted vector was selected.

1.2. Novelties and contributions

The new contributions of this research are:

Novelties:

• A new graph-based textural feature extraction function has been proposed using the testosterone chemical shape.
• Three EEG sentence datasets have been used in this work.
• To the best of our knowledge, it is the most accurate and robust EEG sentence classification model.

Contributions:

• We used EEG signals of sentences belonging to the English and Turkish languages. Furthermore, three EEG datasets have been collected to validate the results. These datasets were collected by deploying two dataset collection strategies: (i) demonstration-based collection and (ii) listening-based collection.
• We have presented a graph-based signal classification model. Our team has focused on feature engineering and proposed graph-based feature generators. We aim to propose novel graph-based feature extractors and investigate the classification performances


Fig. 1. Proposed testosterone pattern.

of these feature extractors with various datasets. Thus, we have presented a signal classification architecture. The channel-wise results have been demonstrated, and voting results have been obtained using our proposed model. In this model, we used TesPat (the recommended testosterone graph-based pattern) as a feature extractor. The TesPat is a component of a new feature engineering methodology, and this methodology is called graphical depictions of chemical structures based feature extraction. The nature-inspired feature extraction methodology has been used in this work to propose TesPat. Moreover, we compared our results to other models, and the comparative results suggest that TesPat has a distinctive feature generation ability. Thus, our proposal reached >97% classification accuracy on three datasets.

2. Testosterone pattern

We have presented a new graph-based feature extraction model that uses the chemical depiction of the testosterone hormone and proposes a local graph structure (LGS). In addition, we used a directed graph to extract features. The advantages of LGSs are given below.

– These functions (LGSs) have low execution time (according to big O notation, the time complexity of these functions is equal to 𝑂(𝑛)).
– These models can generate discriminative features.
– LGSs are simple models, and the implementation of these models is straightforward.
– They (LGSs) generate fixed-size features. The lengths of the generated features are not dependent on signal length.

LGSs have several advantages, but modeling an LGS is difficult since modeling some graphs is difficult. Furthermore, there are millions of unique chemical depictions in nature. These graphs are valuable for machine learning and computational methods. We hypothesize that we can efficiently model machine learning components using these patterns. To show the feasibility of our hypothesis, we proposed a new feature extraction function and used the testosterone pattern.

We utilized a testosterone-inspired directed graph to generate features, as demonstrated in Fig. 1. Fig. 1 depicts the pattern of the proposed TesPat, a 2D pattern. The directed edges demonstrate the used values for feature extraction, which are enumerated from 1 to 24. The input parameters of the signum function (a simple comparison function) were selected to extract features efficiently. For instance, the start point of the first directed edge is 𝑚(6, 1) – 𝑚 is the created matrix, 6 is the row position, and 1 is the column position – and the final point of the first directed edge is 𝑚(5, 2). To calculate the bit of the first edge, the 𝜑(𝑚(6, 1), 𝑚(5, 2)) formula has been used. Herein, 𝜑(⋅, ⋅) represents the signum function. The mathematical definition of the signum function is given below.

𝜑(𝑐, 𝑑) = { 0, 𝑐 − 𝑑 < 0;  1, 𝑐 − 𝑑 ≥ 0 }   (1)

where 𝑐 and 𝑑 are the start point and end point.

In this research, one-dimensional EEG signals were utilized. Hence, a one-dimensional version of this pattern is described in this section. The pseudocode of the presented TesPat is given in Algorithm 1.

A numerical example for the presented TesPat has been given to clarify the feature extraction process of this model. We applied overlapping block division with a block length of 54. By using these blocks, matrices of dimension 6 × 9 were constructed. An example matrix is demonstrated below.

𝑀 =
⎡ 1 2 3 0 2 6 7 8 9 ⎤
⎢ 0 9 1 2 3 4 6 7 1 ⎥
⎢ 1 4 2 2 1 0 4 1 7 ⎥
⎢ 2 3 4 8 9 0 1 5 6 ⎥
⎢ 5 3 7 1 0 1 3 5 1 ⎥
⎣ 1 4 0 1 3 8 8 9 0 ⎦

We used this matrix, the testosterone pattern (see Fig. 1), and the signum function, generating 24 bits. These bits, or binary features, have been categorized into three groups, given below.

𝜑((1, 3), (3, 3), (3, 2), (2, 8), (8, 1), (1, 0), (0, 3), (8, 2)) = (01101101)₂
𝜑((8, 1), (1, 0), (0, 1), (1, 3), (3, 1), (1, 3), (3, 6), (6, 6)) = (11001001)₂


𝜑((6, 4), (4, 0), (6, 7), (6, 8), (8, 1), (1, 1), (1, 4), (1, 9)) = (11001101)₂

By applying binary-to-decimal transformation/conversion, the values of the map signals have been created.

𝑚𝑎𝑝1(𝑗) = (01101101)₂ → 109
𝑚𝑎𝑝2(𝑗) = (11001001)₂ → 201
𝑚𝑎𝑝3(𝑗) = (11001101)₂ → 205

This example demonstrated our proposed TesPat transformation. The presented TesPat transformation generated three feature map signals. This example depicted that three different features have been generated using the TesPat transformation. Moreover, histograms of these map signals have been utilized as feature vectors. An example (numerical samples with TesPat transformation) is schematized in Fig. 2.
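The worked example above can be reproduced with a short sketch. This is a minimal illustration, not the authors' MATLAB implementation: the comparison follows Eq. (1), the value pairs for the first bit group are read off the example matrix (the full 24-edge layout, i.e., which cells are compared, is defined by Fig. 1), and the toy map signals hold only the three example values.

```python
def signum(c, d):
    """Eq. (1): 0 if c - d < 0, otherwise 1."""
    return 0 if c - d < 0 else 1

def bits_to_decimal(bits):
    """Binary-to-decimal conversion, most significant bit first."""
    value = 0
    for bit in bits:
        value = 2 * value + bit
    return value

# Value pairs traversed by the first group of eight directed edges in the
# example above; e.g. the first edge runs from m(6,1)=1 to m(5,2)=3.
group1 = [(1, 3), (3, 3), (3, 2), (2, 8), (8, 1), (1, 0), (0, 3), (8, 2)]
bits1 = [signum(c, d) for c, d in group1]
print(bits1)                   # [0, 1, 1, 0, 1, 1, 0, 1] -> (01101101)2
print(bits_to_decimal(bits1))  # 109

# The three 8-bit groups become one value per map signal:
groups = ["01101101", "11001001", "11001101"]
print([bits_to_decimal([int(b) for b in g]) for g in groups])  # [109, 201, 205]

# Histograms of the 8-bit map signals are the textural features:
# three 256-bin histograms -> 3 x 256 = 768 features per input signal.
def histogram256(map_signal):
    hist = [0] * 256
    for value in map_signal:
        hist[value] += 1
    return hist

features = histogram256([109]) + histogram256([201]) + histogram256([205])
print(len(features))           # 768
```

On a real signal, each overlapping 54-sample block contributes one value to each map signal, so the histograms accumulate over all blocks before being concatenated.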


Fig. 2. Numerical example of the proposed TesPat. Blue values define block values, and red values represent the edge numbers.

3. Proposed model

Our main motivation is to detect sentences using three datasets. These datasets were collected using the Turkish and English languages. Each dataset contains 20 sentences. We presented a hand-modeled (feature engineering) classification model to get high classification accuracy and low computational complexity. The proposed classification model is graph-based. We used a natural graph, which belongs to testosterone (the used graph shows its molecular structure). The presented TesPat is a handcrafted feature extraction function. Moreover, it generates textural features. Thus, TesPat extracts features at a low level. The structure of deep learning models has been mimicked to propose a good feature extraction method, and we proposed a multileveled feature extraction function. The low- and high-level features have been extracted using a multileveled/multilayered feature extraction method. To create various levels, we used wavelet packet decomposition (WPD). By using WPD, both low-pass and high-pass filter coefficients have been extracted. Furthermore, we used a statistical feature extraction function to generate statistical features since our proposed TesPat extracts textural features (it is an LBP-like feature extraction function). By deploying our feature extraction methodology:

– Both statistical and textural features have been extracted.
– Features are extracted in both the space (using the raw EEG signal) and frequency (using wavelet coefficients) domains.
– High- and low-level features have been extracted.

31 feature vectors (1 raw EEG signal + 30 wavelet bands have been utilized as the input of the used feature generation functions) have been generated using our proposed feature extraction method based on WPD and TesPat. These feature vectors have been merged to create a merged/final feature vector. Then, NCA feature selection was applied to the final feature vector to select the most informative 404 features for each channel. The used datasets have 14 channels, and in this work, we calculated the result of each channel. The kNN classifier (a shallow distance-based classifier) is used to create channel-wise results. Moreover, we used kNN to show the high classifiability of the extracted features. As a result, 14 predicted vectors have been generated in the classification phase from 14 channels. Majority voting was used to calculate the best classification result. From iterative hard majority voting (IHMV), 12 voted results were computed, and the best results were selected from the generated 26 (=14 + 12) results by deploying the greedy algorithm. In this respect, we proposed a self-organized model since it can select the best result among the 26 generated results. The graphical demonstration of our proposed model is shown in Fig. 3.

These vectors have been utilized in an iterative majority voting algorithm, and voted labels have been generated. Our architecture is self-organized and selects the most accurate prediction vector per the accuracy rates. In this work, 30 wavelet bands have been generated using wavelet packet decomposition (WPD). In the feature extraction phase,


Fig. 3. Outline of our proposed architecture. Here, the used datasets have 14 channels; hence, 14 predicted labels vectors have been generated.

statistics (this function generates 40 features) and the proposed TesPat (it generates 768 features) have been applied to each one-dimensional signal. Thus, each feature vector's length (F) is 808 (=768 + 40). In the concatenation phase, the generated 31 feature vectors have been merged to form a feature vector of length 25,048 (=808 × 31). In the feature selection phase, NCA has been applied, and the top 404 features from the 25,048 features are fed as input to the kNN classifier. Twelve voted results were generated using the IHMV function. The most accurate predicted vector out of the generated 26 predicted vectors was chosen using the greedy algorithm.

As shown in Fig. 3, the proposed model is a self-organized architecture. The proposed model has selected the best classification result from the generated 26 results.

The detailed explanation of our proposed TesPat-based model is given below.
Step 1: Create wavelet coefficients by deploying WPD.
Step 2: Extract features from the wavelet coefficients and raw EEG signal using the proposed TesPat and statistical feature extractors.
Step 3: Merge the created features and create the general/concatenated feature vector.
Step 4: Choose the most informative features by deploying NCA.
Step 5: Apply kNN to the selected features for channel-wise prediction vector generation.
Step 6: Create voted prediction vectors by applying IHMV.
Step 7: Select the best prediction vector as the result.
More details and an expanded version of these steps are given below.

3.1. Feature extraction layer

We have proposed a feature engineering model. The most important phase of this research is the feature extraction component. We have used two feature extraction functions: the statistical feature generation function and the TesPat (a textural feature extraction function). A multileveled feature generation method has been created using wavelet packet decomposition (WPD) (Ting et al., 2008). Using WPD, 30 wavelet bands (2⁴⁺¹ − 2 = 30 bands) have been created, and features are extracted from these bands. Moreover, we used raw EEG signals to extract features in the spatial domain. In this aspect, the model generates multilevel features in both the spatial and frequency domains. The steps of the feature extraction model are:


Step 1: Apply WPD to the EEG signal with four levels. Herein, the symlet6 filter has been used.

𝑤 = 𝜓(𝑆, 4, ′𝑠𝑦𝑚6′)   (2)

where 𝑤 defines the wavelet subbands, and 𝜓() is the WPD transformation function. This function receives three parameters: the signal (𝑆), the number of levels (for this work, we have used a four-leveled WPD), and the mother wavelet function (we have used the symlet6 (𝑠𝑦𝑚6) mother wavelet function). 30 (=2⁴⁺¹ − 2) wavelet subbands have been generated using this transformation.

Step 2: Extract features by deploying statistical feature generation and the proposed TesPat. We have used entropies (Renyi, Shannon, Log, Sure, Threshold, Norm), maximum, minimum, variance, skewness, kurtosis, mean absolute deviation, root mean square, range, etc., to extract 20 features (Kuncan et al., 2019). These features have been applied to the input signal and the input signal's absolute value. Thus, this function generates 40 (=20 × 2) features. Moreover, the proposed TesPat has been implemented to extract textural features. These two feature extraction functions have been applied to the raw EEG signal and the 30 wavelet subbands. Thus, 31 feature vectors have been extracted, and the length of each feature vector is 808 (=40 + 768). The feature creation process is defined below.

Step 3: Create feature vectors by deploying the proposed TesPat and statistical features.

𝐹𝑘 = 𝜙(𝑠𝑡(𝑤𝑘), 𝑇𝑒𝑠𝑃𝑎𝑡(𝑤𝑘)), 𝑘 ∈ {1, 2, …, 30}   (3)
𝐹31 = 𝜙(𝑠𝑡(𝑆), 𝑇𝑒𝑠𝑃𝑎𝑡(𝑆))   (4)

Herein, 𝜙() is the concatenation function, and 𝑠𝑡() defines the statistical feature generation function.

Step 4: Concatenate the generated 31 feature vectors to obtain the last feature vector with a length of 25,048 (=808 × 31).

𝑋(𝑝 + 808 × (𝑡 − 1)) = 𝐹𝑡(𝑝), 𝑝 ∈ {1, 2, …, 808}, 𝑡 ∈ {1, 2, …, 31}   (5)

where 𝑋 is the last generated feature vector from each channel.

3.2. Feature selection layer

In the feature selection layer, the most informative 404 features have been selected from the generated 25,048 features. We have used NCA (Goldberger et al., 2004), a feature selector, since it is a simple (feature selection version of the kNN) and effective implementation. Moreover, we have used kNN as the classifier. To attain high classification performance, we have used NCA since NCA and kNN dovetail effectively.

Step 5: Select the most informative 404 features by applying NCA to the generated 25,048 features.

3.3. Classification layer

In this layer, channel-wise results have been generated by feeding the selected 404 features. In this phase, we have used kNN as a shallow classifier (Peterson, 2009). The hyperparameters of kNN are:
k: 4,
Distance: L1-norm (Manhattan),
Voting: Squared inverse,
Validation: 10-fold cross-validation.

Step 6: Use kNN to generate channel-wise results.
Step 7: Apply Steps 1–6 14 times to generate 14 predicted vectors since the used datasets have 14 channels.

3.4. Iterative majority voting layer

This layer is used to generate voted results. In this layer, we have used the IHMV algorithm, which was proposed in 2021 (Dogan et al., 2021). The objective of this algorithm is to use the most salient channels to obtain the best classification results. The steps of the voting layer are given below.

Step 8: Sort the predicted vectors in accordance with the calculated classification accuracy.

𝑎𝑐𝑐𝑘 = 𝜓(𝑐𝑘, 𝑦), 𝑘 ∈ {1, 2, …, 14}   (6)
𝑖𝑑 = 𝜔(𝑐, 𝑎𝑐𝑐)   (7)

Herein, 𝜓() is the accuracy computing function; 𝜔() defines the sorting function, which sorts the predicted vectors of the channels (𝑐) per the accuracy values (𝑎𝑐𝑐) in descending order; 𝑦 represents the actual labels; 𝑎𝑐𝑐 are the accuracies; and 𝑖𝑑 holds the sorted indices of the qualified prediction vectors.

Step 9: Create a loop from 3 to 14.
Step 10: Apply the mode function to obtain majority-voted results iteratively. Herein, 12 (=14 − 3 + 1) voted vectors have been created.

𝑣ℎ−2 = 𝜇(𝑐𝑖𝑑(1), 𝑐𝑖𝑑(2), …, 𝑐𝑖𝑑(ℎ)), ℎ ∈ {3, 4, …, 14}   (8)

where 𝜇() represents the mode function, and 𝑣 defines the voted vectors. In this step, 12 (see ℎ in Eq. (8)) voted vectors have been calculated. To clarify this model (iterative majority voting), we have shown a graphical outline in Fig. 4.

Fig. 4. Schematic overview of the used IHMV algorithm.

3.5. Selection of the final result layer

The last layer of our proposed feature engineering architecture selects the best-predicted vector. Our model creates 26 (=14 channel-wise + 12 voted) predicted vectors. Here, the greedy algorithm has been used. The steps of this layer are:
Step 11: Calculate the classification accuracies of all predicted vectors.
Step 12: Choose the best-predicted vector demonstrated by the proposed model.

4. Results

4.1. Experimental setup

Our proposed model utilizes a hand-modeled feature engineering process. Thus, the time complexity is low, and there is no need to


Table 1
Summary of various steps involved in our proposed TesPat-based model.

Layer | Method | Parameters | Output
Feature extraction | WPD | Wavelet filter: sym6, Level: 4 | 30 wavelet bands have been generated.
 | TesPat | Kernel: signum function, Block size: 54 | 768 features have been generated from each input.
 | Statistical extractor | 20 moments have been used. | 40 features have been extracted from each signal.
Merging | Concatenation function | – | 25,048 (=808 × 31) features have been extracted.
Feature selection | NCA | Weight-based feature selection | 404 features have been selected.
Classification | kNN | 1NN with L1-norm distance | 14 predicted vectors have been created for 14 channels.
Majority voting | IHMV | Mode function and loop-based model, Loop range: [3–14] | 12 voted vectors have been created.
Selection of the final results | Greedy model | Accuracy calculation and the best accurate vector selection | The best-predicted vector has been selected.
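The counts listed in Table 1 can be checked with a compact sketch. This is an illustrative stand-in, not the authors' MATLAB implementation: the subband count follows standard wavelet packet tree arithmetic, the feature-vector length reproduces the table's totals, and the IHMV function mode-votes synthetic prediction vectors as described in Section 3.4.

```python
from collections import Counter

def wpd_band_count(level=4):
    """A wavelet packet tree with the given number of levels holds
    2 + 4 + ... + 2^level nodes, i.e. 2^(level+1) - 2 subbands;
    four levels give the 30 bands of Table 1."""
    return 2 ** (level + 1) - 2

def ihmv(pred_vectors, accuracies):
    """Iterative hard majority voting: sort the channel-wise prediction
    vectors by accuracy (descending), then mode-vote the best h vectors
    for h = 3..14, producing 12 voted vectors."""
    order = sorted(range(len(pred_vectors)), key=lambda k: -accuracies[k])
    voted = []
    for h in range(3, len(pred_vectors) + 1):
        top = [pred_vectors[k] for k in order[:h]]
        voted.append([Counter(col).most_common(1)[0][0] for col in zip(*top)])
    return voted

assert wpd_band_count(4) == 30
# 31 inputs (raw signal + 30 bands) x (768 TesPat + 40 statistical) features:
print((wpd_band_count(4) + 1) * (768 + 40))   # 25048

# 14 synthetic channel-wise prediction vectors -> 12 voted vectors:
channels = [[(c + n) % 20 for n in range(10)] for c in range(14)]
accuracies = [0.9 - 0.01 * c for c in range(14)]
voted = ihmv(channels, accuracies)
print(len(channels) + len(voted))             # 26 candidate vectors
```

The final greedy layer simply computes the accuracy of all 26 candidate vectors and keeps the best one.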

Table 2
Attributes of three collected datasets.

No | Sentence | Database 1 (English) | Database 2 (Turkish with demonstration) | Database 3 (Turkish with listening)
1 | Hello, welcome | 81 | 80 | 80
2 | See you again | 90 | 80 | 80
3 | Bye | 80 | 80 | 80
4 | Never mind | 83 | 80 | 80
5 | Enjoy your meal | 84 | 80 | 80
6 | What have you looking for? | 84 | 82 | 82
7 | Is there any online lecture today? | 85 | 80 | 80
8 | In which department are you a student? | 80 | 79 | 79
9 | What is your job? | 80 | 80 | 80
10 | You can trust me | 82 | 80 | 80
11 | Calm down | 82 | 79 | 79
12 | Get out of here | 87 | 80 | 80
13 | Good luck | 83 | 80 | 80
14 | Let's go | 81 | 80 | 80
15 | Hurry up | 80 | 80 | 80
16 | It is better than nothing | 83 | 80 | 80
17 | You're welcome | 81 | 80 | 80
18 | It is your choice | 81 | 80 | 80
19 | They shamble us; we are shame | 85 | 81 | 81
20 | You know nothing | 80 | 79 | 79
Total | | 1652 | 1600 | 1600

Table 3
Demographic attributes of the participants.

Dataset | Gender | Age | Language | Collection method
Database 1 | 16M, 4F | Min: 20, Max: 27 | English | Demonstration
Database 2 | 16M, 4F | Min: 19, Max: 23 | Turkish | Demonstration
Database 3 | 17M, 3F | Min: 18, Max: 23 | Turkish | Listening
M: Male, F: Female.

Table 4
EEG channels used in this work.

Number | Channel name | Number | Channel name
1 | AF3 | 8 | O2
2 | F7 | 9 | P8
3 | F3 | 10 | T8
4 | FC5 | 11 | FC6
5 | T7 | 12 | F4
6 | P7 | 13 | F8
7 | O1 | 14 | AF4

The EEG signals were recorded with an EMOTIV brain cap, and the frequency rate of this device is 128 Hz. Dataset 1 and Dataset 2 were collected using the demonstration technique, and Dataset 3 was collected using the listening technique. Each dataset was collected from 20 participants. These participants are university students with an age range of 18 to 27 years. In the demonstration technique, sentences were demonstrated to participants, and they went through the shown sentences. Moreover, the participants did not have any known mental illness. The demographic attributes of the participants are tabulated in Table 3. The used brain cap has 14 channels, as listed in Table 4.
use expensive hardware. The MATLAB programming environment (ver.
2021a) was employed to program/implement our proposed model via 4.3. Performance evaluation metrics
functional programming strategies. The created functions are named
main, statistics (to generate statistical features), TesPat (to generate
In the three used datasets, there are 20 sentences/classes. Thus, we
textural features), WPD (Ting et al., 2008), NCA (to select features)
used the following performance metrics: overall accuracy, precision,
(Goldberger et al., 2004), kNN (for classification) (Peterson, 2009),
recall, and F1-score (Powers, 2020). Overall accuracy is the most
IHMV (for generating voted results) (Dogan et al., 2021) and Greedy
used performance evaluation metric for classification. Recall calculates
(to choose the best result) functions.
class-wise accuracies, and the unweighted average values of all re-
The various details of the proposed TesPat-based model are pro-
call are named the unweighted average recall (UAR). It is a reliable
vided in Table 1.
performance metric for unbalanced datasets. Precision demonstrates
4.2. Datasets true detection performance, and the F1-score summarizes recall and
precision since it is the harmonic mean of the recall and precision
In this research, we have used three new EEG sentence classifica- values. Moreover, two validation techniques have been used, and these
tion datasets. Two of them were collected from Turkish volunteers, a techniques are 10-fold CV and leave-one subject out (LOSO) CV. The
dataset from Nigerian volunteers, and a dataset from Nigerian volun- calculated results have been demonstrated below.
teers whose language is English. Thus, two Turkish and one English EEG
sentence datasets were used. These three datasets contain 20 sentences. 4.4. Single channel results
The attributes of these datasets are listed in Table 2 (Barua et al., 2023).
These datasets were collected from students, and each EEG sig- Firstly, channel-wise results obtained in this work is given in this
nal/segment length is 15-s. These EEG signals were collected using an section. We used a brain cap with 14 channels. We have obtained the


Table 5
Channel-wise results (%) obtained using Database 1.
         10-fold CV                      LOSO CV
Channel  Acc    UAP    UAR    F1        Acc    UAP    UAR    F1
1        85.77  85.87  85.75  85.81     83.90  83.92  83.87  83.90
2        86.02  86.16  85.98  86.07     83.84  84.08  83.80  83.94
3        68.58  69.12  68.53  68.82     54.24  55.22  54.22  54.72
4        82.81  83.18  82.81  82.99     78.81  79.38  78.82  79.10
5        68.83  69.49  68.66  69.07     64.23  64.84  64.01  64.42
6        51.88  52.20  51.75  51.97     49.58  50.02  49.42  49.72
7        62.53  62.29  62.43  62.36     57.26  56.84  57.11  56.97
8        84.62  84.99  84.58  84.78     80.75  81.12  80.71  80.91
9        84.62  85.07  84.57  84.82     81.72  82.28  81.67  81.98
10       85.47  85.74  85.46  85.60     83.05  83.41  83.09  83.25
11       85.17  85.47  85.12  85.29     81.48  82.00  81.41  81.70
12       62.05  62.70  62.10  62.40     47.88  49.29  47.88  48.58
13       70.16  71.24  70.11  70.67     59.87  61.08  59.79  60.43
14       88.38  88.61  88.37  88.49     83.84  84.31  83.85  84.08
** Acc: Accuracy, UAR: Unweighted average recall, UAP: Unweighted average precision, F1: F1-score.

Table 7
Channel-wise results (%) obtained using Database 3.
         10-fold CV                      LOSO CV
Channel  Acc    UAP    UAR    F1        Acc    UAP    UAR    F1
1        91.00  91.04  90.98  91.01     73.13  73.57  73.09  73.33
2        93.88  93.94  93.86  93.90     81.88  81.99  81.85  81.92
3        91.44  91.53  91.43  91.48     76.00  76.02  75.96  75.99
4        93.81  93.88  93.81  93.84     78.00  78.39  77.98  78.18
5        93.81  93.89  93.80  93.84     80.25  81.10  80.23  80.66
6        93.81  94.02  93.81  93.91     82.88  83.39  82.86  83.12
7        94.56  94.73  94.55  94.64     85.25  86.06  85.22  85.64
8        94.75  94.89  94.75  94.82     87.75  88.15  87.73  87.94
9        94.50  94.83  94.50  94.66     85.81  85.93  85.80  85.86
10       94.63  94.69  94.62  94.65     83.25  83.45  83.22  83.34
11       95.38  95.40  95.37  95.38     81.88  81.98  81.86  81.92
12       90.88  91.00  90.87  90.94     76.50  76.47  76.48  76.48
13       77.75  78.09  77.72  77.90     40.00  41.96  39.97  40.94
14       93.69  93.83  93.67  93.75     80.06  80.45  80.03  80.24
It can be noted from the table that we have attained a classification accuracy of 87.75% with LOSO CV (channel 8) and 95.38% with 10-fold CV (channel 11).
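The metrics reported in these tables (Acc, UAP, UAR, F1) follow the definitions in Section 4.3 and can be computed from a confusion matrix. The sketch below is an illustrative Python/NumPy rendering (the original implementation is in MATLAB), with F1 taken as the harmonic mean of UAP and UAR as the text describes:

```python
import numpy as np

def classification_metrics(y_true, y_pred, n_classes):
    """Compute Acc, UAP, UAR, and F1 from predicted labels.

    UAR is the unweighted mean of class-wise recalls, UAP the unweighted
    mean of class-wise precisions, and F1 the harmonic mean of UAP and UAR,
    as described in Section 4.3.
    """
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                      # rows: true class, cols: predicted
    acc = np.trace(cm) / cm.sum()
    recall = np.diag(cm) / np.maximum(cm.sum(axis=1), 1)
    precision = np.diag(cm) / np.maximum(cm.sum(axis=0), 1)
    uar, uap = recall.mean(), precision.mean()
    f1 = 2 * uap * uar / (uap + uar)
    return acc, uap, uar, f1
```

The `np.maximum(..., 1)` guard simply avoids division by zero for classes that never occur in a fold.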

Table 6
Channel-wise results (%) obtained using Database 2.
         10-fold CV                      LOSO CV
Channel  Acc    UAP    UAR    F1        Acc    UAP    UAR    F1
1        89.06  89.09  89.07  89.08     70.63  70.40  70.61  70.50
2        91.00  91.31  90.99  91.15     77.00  77.23  76.97  77.10
3        88.06  88.25  88.06  88.16     72.75  73.02  72.74  72.88
4        86.94  87.21  86.93  87.07     66.44  66.57  66.43  66.50
5        90.06  90.35  90.02  90.19     75.69  75.72  75.61  75.67
6        87.88  88.15  87.87  88.01     73.75  74.00  73.72  73.86
7        68.25  69.55  68.19  68.86     40.63  41.62  40.55  41.08
8        91.06  91.30  91.06  91.18     80.94  81.31  80.94  81.13
9        91.63  91.97  91.62  91.80     79.88  80.01  79.84  79.92
10       65.69  65.94  65.67  65.80     34.50  35.74  34.49  35.10
11       77.81  78.42  77.82  78.12     51.63  52.20  51.63  51.91
12       70.06  70.20  70.07  70.14     36.00  36.47  35.98  36.22
13       83.25  83.47  83.29  83.38     61.94  62.84  61.94  62.39
14       91.00  91.23  91.00  91.12     75.13  75.20  75.12  75.16

Table 8
Voted results (%) obtained using Database 1.
     10-fold CV                      LOSO CV
NoC  Acc    UAP    UAR    F1        Acc    UAP    UAR    F1
3    92.43  93.02  92.41  92.71     89.65  90.49  89.62  90.05
4    94.85  95.17  94.85  95.01     92.68  93.14  92.65  92.90
5    95.64  95.96  95.63  95.79     94.01  94.43  93.99  94.21
6    96.61  96.83  96.60  96.71     94.19  94.62  94.16  94.39
7    97.09  97.29  97.08  97.18     95.52  95.97  95.50  95.73
8    97.28  97.47  97.27  97.37     94.73  95.20  94.70  94.95
9    97.09  97.29  97.08  97.19     94.49  94.96  94.46  94.71
10   97.28  97.47  97.27  97.37     94.67  95.12  94.64  94.88
11   97.22  97.45  97.20  97.33     94.92  95.34  94.89  95.11
12   97.52  97.70  97.52  97.61     94.67  95.17  94.64  94.91
13   97.52  97.73  97.51  97.62     94.55  95.01  94.53  94.77
14   97.34  97.59  97.33  97.46     94.73  95.27  94.71  94.99
** NoC: Number of channels.
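The channel-wise prediction vectors that feed the voting tables are produced by selecting 404 features per channel by NCA weights and classifying them with 1NN under the L1 (city-block) distance. A minimal NumPy sketch is below; note that the feature ranking here uses a simple Fisher-ratio score as a stand-in for the learned NCA weights (an assumption made only to keep the sketch self-contained and solver-free):

```python
import numpy as np

def select_top_k(X, y, k):
    """Rank features by a Fisher-style score (a stand-in for NCA weights)
    and keep the k highest-scoring columns, mirroring the paper's
    weight-based selection of 404 features per channel."""
    classes = np.unique(y)
    overall = X.mean(axis=0)
    between = sum((X[y == c].mean(axis=0) - overall) ** 2 for c in classes)
    within = sum(X[y == c].var(axis=0) for c in classes) + 1e-12
    score = between / within
    idx = np.argsort(score)[::-1][:k]
    return np.sort(idx)

def predict_1nn_l1(X_train, y_train, X_test):
    """1-NN classifier with the L1 (city-block) distance, as in Table 1."""
    preds = []
    for x in X_test:
        d = np.abs(X_train - x).sum(axis=1)   # city-block distance to all rows
        preds.append(y_train[np.argmin(d)])
    return np.array(preds)
```

Running this once per channel yields the 14 predicted vectors that the voting stage consumes.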

channel-wise classification results. In Table 5, the results of the English dataset (Database 1) are tabulated. The best results are highlighted in bold font in Table 5.
As can be seen from Table 5, the first and 14th channels yielded the best results for this dataset. We have obtained the highest classification accuracies of 88.38% and 83.84% with 10-fold CV and LOSO CV, respectively.
The channel-wise results obtained using Database 2 and Database 3 (the Turkish datasets) are given in Tables 6 and 7, respectively.
We have obtained the highest accuracy for channels 8 and 9 using Database 2 (Table 6). Our model yielded an accuracy of 91.63% with 10-fold CV (channel 9) and 80.94% with LOSO CV (channel 8). The channel-wise accuracy obtained using Database 3 is given in Table 7.

4.5. Voted results

The proposed model used the iterative majority voting (IMV) algorithm. Thus, 12 voted results were generated using the channel-wise predicted labels and the mode operator. The voted results are tabulated in Tables 8–10 for Databases 1, 2, and 3, respectively.
Using Database 1, our proposed method reached 97.52% and 95.52% classification accuracies with 10-fold and LOSO CVs, respectively.
For Database 2, we have obtained accuracies of 97.63% and 91.81% with 10-fold and LOSO CVs, respectively. The voted results for Database 3 are listed in Table 10.

Table 9
Voted results (%) obtained using Database 2.
     10-fold CV                      LOSO CV
NoC  Acc    UAP    UAR    F1        Acc    UAP    UAR    F1
3    94.75  95.23  94.75  94.99     85.75  86.75  85.73  86.24
4    96.31  96.57  96.31  96.44     88.13  88.52  88.10  88.31
5    97.00  97.22  97.00  97.11     90.38  90.62  90.36  90.49
6    97.63  97.75  97.62  97.69     91.19  91.33  91.17  91.25
7    97.44  97.58  97.43  97.51     91.81  92.07  91.80  91.94
8    97.50  97.63  97.50  97.56     91.38  91.59  91.37  91.48
9    97.31  97.48  97.31  97.40     91.13  91.41  91.12  91.26
10   97.38  97.56  97.37  97.47     91.25  91.46  91.24  91.35
11   97.44  97.63  97.44  97.53     91.19  91.46  91.18  91.32
12   97.44  97.64  97.44  97.54     91.25  91.48  91.24  91.36
13   97.31  97.52  97.31  97.42     91.25  91.50  91.24  91.37
14   97.31  97.52  97.31  97.42     91.00  91.27  90.99  91.13

4.6. Performance of the model

Our proposed TesPat-based self-organized hand-modeled architecture selects the most accurate predicted vectors. Channel-wise and majority-voted results are provided. The best results were obtained from voting. Moreover, we used 10-fold and LOSO CVs to develop the model. The obtained best results are shown in Fig. 5.
Fig. 5 depicts that the most accurate dataset is Database 3, and we reached 98.19% accuracy for this dataset with 10-fold CV. Database 1 reached the best classification accuracy of 95.52% using LOSO CV. Moreover, our TesPat-based model attained over 91% accuracy for all datasets with both LOSO and 10-fold CVs.
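The voting and greedy selection stages can be sketched as follows. This illustrative Python version assumes, as is typical for iterative hard majority voting, that the channel-wise vectors are ranked by accuracy before the mode is taken over the top k channels for each k in the loop range [3, 14]; the original MATLAB implementation may differ in detail:

```python
import numpy as np

def ihmv(pred_matrix, acc_of, lo=3, hi=14):
    """Iterative hard majority voting.

    pred_matrix : (n_channels, n_samples) array of predicted labels.
    acc_of      : function mapping a predicted vector to its accuracy.
    Channels are ranked by accuracy; for each k in [lo, hi] the column-wise
    mode of the k best channels is taken, yielding hi - lo + 1 voted vectors
    (12 for the loop range [3, 14] used in this work).
    """
    order = np.argsort([acc_of(p) for p in pred_matrix])[::-1]
    voted = []
    for k in range(lo, hi + 1):
        top = pred_matrix[order[:k]]
        # column-wise mode: the majority label per sample
        voted.append(np.array([np.bincount(col).argmax() for col in top.T]))
    return voted

def greedy_best(vectors, acc_of):
    """Greedy final stage: keep the voted vector with the highest accuracy."""
    return max(vectors, key=acc_of)
```

With 14 channel-wise vectors, `ihmv` returns 12 voted vectors and `greedy_best` picks the final prediction, matching the counts in Table 1.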


Fig. 5. Performances obtained for the proposed model using different combinations of databases and validation strategies. ** Db: Database.

Table 10
Voted results (%) obtained using Database 3.
     10-fold CV                      LOSO CV
NoC  Acc    UAP    UAR    F1        Acc    UAP    UAR    F1
3    96.75  96.83  96.75  96.79     89.75  90.15  89.73  89.94
4    97.31  97.37  97.31  97.34     90.63  91.03  90.60  90.82
5    97.06  97.24  97.06  97.15     91.06  91.31  91.04  91.18
6    97.44  97.59  97.44  97.51     91.38  91.60  91.36  91.48
7    97.63  97.81  97.63  97.72     91.88  91.99  91.86  91.93
8    98.13  98.18  98.13  98.15     92.00  92.18  91.98  92.08
9    98.00  98.17  98.00  98.08     91.50  91.56  91.48  91.52
10   98.19  98.29  98.19  98.24     91.50  91.58  91.48  91.53
11   98.13  98.26  98.13  98.19     91.25  91.27  91.24  91.25
12   97.94  98.06  97.94  98.00     90.81  90.76  90.80  90.78
13   97.81  97.97  97.81  97.89     90.38  90.36  90.36  90.36
14   97.81  97.93  97.81  97.87     89.69  89.62  89.67  89.65
The best classification result using 10-fold CV has been attained for Database 3 with an accuracy of 98.19%, and 92% using the prediction vector of the LOSO CV.

4.7. Computational complexity

Our proposed TesPat-based model has a linear time complexity since it uses handcrafted simple feature extractors. The time complexity of our proposed model is calculated in this section.
Feature extraction: In the feature extraction layer, we have used WPD, statistics, and TesPat. The time complexity of the used feature extractors is O(n). Herein, n defines the length of the signal. WPD generates sub-bands, and the length of these bands is decreased at each level. Thus, the complexity of the presented feature generation is equal to O(n·t·log n), where t defines the number of bands.
Feature selection: The time complexity of NCA is O(k), where k is the time complexity multiplier of the NCA feature selection. NCA is the main feature selection function. Thus, the computational complexity of this layer is O(k).
Classification: The time burden of the kNN is O(fd). kNN is our main classification method. Thus, the time complexity of this layer is O(fd). Here, f is the number of features and d the number of observations.
Majority voting and selection of the best results: The time burdens of IMV and the Greedy algorithm are O(id) and O(l), respectively. Here, i is the number of iterations, d defines the number of observations, and l represents the number of predicted vectors.
Total: Using the time burden of each layer, the total time complexity is calculated as O(n·t·log n + k + fd + id + l).

5. Discussion

This paper presents a new EEG sentence classification model using a handcrafted feature extraction methodology. Firstly, we proposed a new chemical-based feature extraction model based on the shape of the testosterone molecule; thus, we named it TesPat. By deploying TesPat, a self-organized hand-modeled learning architecture/network has been proposed. This model can extract multilevel features in both the spatial and frequency domains. We used a simple feature selector (NCA) to select each channel’s 404 most informative features. kNN generates channel-wise predicted vectors, which were utilized as input to the IMV. To demonstrate the general classification ability of the TesPat-based self-organized model, we used three EEG sentence datasets, and our paradigm reached 97.52%, 97.63%, and 98.19% accuracies on the first, second, and third databases, respectively, with 10-fold CV. Moreover, LOSO CV was used to validate the results, and our model reached 95.52%, 91.81%, and 92% classification accuracies on Database 1, Database 2, and Database 3, respectively.
We have obtained the highest channel-wise results for the AF3, O2, P8, T8, FC6, and AF4 channels. While the participants thought the sentences,


Fig. 6. The methods used for ablation: (a) TesPat + WPD-based model, (b) Statistics + WPD-based model, (c) EMD + Statistics + TesPat-based model, (d) TesPat-based model, (e) Statistics-based model, (f) TesPat + Statistics-based model.

the frontal lobe was activated. We used two data collection strategies: listening and demonstration. Thus, our data collection activated the occipital (related to the visual cortex) and temporal (related to the auditory cortex) lobes. The P8 channel was also active since it belongs to the parietal lobe, which is responsible for senses and perception.

5.1. Ablation of the proposed model

To show the effectiveness of the proposed model, we used the most accurate channel of the best dataset, the 11th channel of Database 3. Furthermore, we used six methods, and these methods’ graphical results are demonstrated in Fig. 6.
Using these six models, the EEG sentence classification effects of the building blocks have been demonstrated, and a comparative results table has been obtained. The results for the 11th channel of Database 3 are listed in Table 11.

Table 11
Results obtained using various combinations of building blocks of the proposed model.
Method                                 Accuracy (%)
TesPat + WPD-based model               75.25
Statistics + WPD-based model           88.88
EMD + Statistics + TesPat-based model  89.07
TesPat-based model                     70.69
Statistics-based model                 59.44
TesPat + Statistics-based model        85.56
Our proposed model                     95.38

Table 11 demonstrates that our model outperformed the others since it attained a 95.38% classification accuracy on this channel. Moreover, we compared it to the WPD- and EMD-based models. Table 11 highlights that the WPD-based model attained a 6.31% higher classification accuracy than the EMD-based model for this problem. Also, we tested this model with variable levels of the WPD. In this research, we used WPD with four levels. The accuracies using WPD with varying levels are demonstrated in Fig. 7.
These tests were applied on the 11th channel of Database 3 (the most accurate channel per our results). Fig. 7 highlights that the best accuracy was obtained using WPD with four levels. Moreover, all results with WPD are over 92%, demonstrating that WPD positively affects attaining high classification accuracy. Furthermore, Daubechies 4 (db4), Daubechies 5 (db5), Daubechies 6 (db6), symlet 4 (sym4), symlet 5 (sym5), and symlet 6 (sym6) filters have been used for comparison.
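To make the decomposition step concrete, a toy wavelet packet decomposition is sketched below in Python/NumPy with the Haar filter pair (chosen only for brevity; the ablation above favors sym6, and the original work uses MATLAB). Every node is split at each level, so four levels produce 2 + 4 + 8 + 16 = 30 sub-bands to feed the statistical and textural extractors:

```python
import numpy as np

def haar_step(x):
    """One analysis step with the Haar filter pair, downsampled by 2.
    Haar is used here only for brevity; the paper's ablation favors sym6."""
    x = x[: len(x) // 2 * 2].reshape(-1, 2)
    low = (x[:, 0] + x[:, 1]) / np.sqrt(2)    # approximation coefficients
    high = (x[:, 0] - x[:, 1]) / np.sqrt(2)   # detail coefficients
    return low, high

def wpd(signal, levels=4):
    """Full wavelet packet decomposition: every node is split at each level,
    so level L holds 2**L sub-bands. Returns all nodes from all levels,
    which a feature extractor can then traverse band by band."""
    nodes = [np.asarray(signal, dtype=float)]
    all_bands = []
    for _ in range(levels):
        nxt = []
        for band in nodes:
            lo, hi = haar_step(band)
            nxt.extend([lo, hi])
        nodes = nxt
        all_bands.extend(nodes)
    return all_bands
```

Because each analysis step halves the band length, the total work per level stays linear in the signal length, consistent with the O(n·t·log n) bound given in Section 4.7.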


Fig. 7. Classification performance of the proposed model using variable levels of WPD.

According to the test results, the best wavelet filter is sym6. The calculated classification accuracies using these six wavelet filters are demonstrated in Fig. 8.

Fig. 8. Classification accuracies per the used wavelet filters.

According to these ablation tests, the TesPat attained over 70% classification accuracy, and the WPD + TesPat-based model reached 75% accuracy. As can be seen from Table 11, WPD increases the classification performance of the statistical features. Statistical features positively affect the wavelet bands, and TesPat is a good feature extractor for raw EEG signals. Moreover, we used a commonly known decomposition model, i.e., Empirical Mode Decomposition (EMD). EMD yielded an 89.07% classification accuracy, which is lower than that of the WPD-based model (our proposed work). Moreover, the number-of-levels parameter of WPD is critical for our proposal. We selected this parameter as four since it is the most accurate initialization for our problem (see Figs. 7 and 8).

5.2. Comparative results

In this section, state-of-the-art models have been applied to our dataset to obtain comparative results. Herein, we selected popular feature extractors: the Hamsi pattern (Tuncer, 2021), local binary pattern (LBP) (Ojala et al., 2002), local ternary pattern (LTP) (Ruichek et al., 2022; Tan and Triggs, 2010), melamine pattern (Aydemir et al., 2021), DES pattern (DesPat) (Akbal et al., 2022), Collatz pattern (Baygin et al., 2021b), and the homeomorphically irreducible tree pattern (HITPat) (Baygin et al., 2021a). These techniques have been applied to the 11th channel of Database 3 since this channel is our most accurate channel. Each model uses the corresponding pattern + statistics + kNN classifier. The calculated results are demonstrated in Fig. 9.

Fig. 9. Summary of accuracies obtained using various molecular structures.

The comparative results demonstrated that our TesPat-based model reached an 85.56% classification accuracy (by only using TesPat + Statistics + kNN), and our proposed TesPat is the best feature extractor for this dataset. Furthermore, DesPat attained an 84.94% classification accuracy, which is higher than those of the remaining compared methods. The melamine pattern is another chemistry graph-based model, and it yielded an 83.96% classification accuracy.

5.3. Benefits and limitations

The benefits and limitations of our proposed TesPat-based model are discussed below.

Advantages:

• Three EEG sentence datasets were used in this research.
• Two validation techniques – 10-fold CV and LOSO CV – were used to validate the results, and our model reached over 90% classification accuracies with both techniques. Moreover, robust results were calculated by deploying both validations.
• The used datasets were collected using Turkish and English. Thus, we have tested our model on multilingual EEG sentence classification.
• A nature-inspired feature extraction function, named TesPat, was proposed.
• A self-organized handcrafted feature extraction model has been proposed.
• The presented model is simple, and a developer can easily implement it.
• Our proposed TesPat-based model has a linear time complexity since it uses handcrafted simple feature extractors.

Drawbacks:

• We used a limited number of subjects since EEG collection from participants is difficult.


• There are 20 sentences in the three used datasets. The number of sentences can be increased, and longer sentences can be used. To simplify the EEG collection process, we used simple and short sentences.

6. Conclusions

We focused on sentence classification in this research to discover new abilities of EEG signals, and three datasets were used to obtain results. Two of these datasets were collected in Turkish, and one was collected in English. Each dataset was collected from 20 participants and had 20 categories. We proposed a new nature-inspired feature extraction model, the TesPat. We used a directed graph of the testosterone molecule to model TesPat. By deploying TesPat, a self-organized EEG classification architecture was proposed, which attained 97.52%, 97.63%, and 98.19% classification accuracies with 10-fold CV on Database 1, Database 2, and Database 3, respectively. Our proposed model yielded 95.52%, 91.81%, and 92% accuracies with LOSO CV for Databases 1, 2, and 3, respectively. These results and our presented findings show that machine learning can detect the considered sentences using EEG signals. Therefore, EEG sentence classification is an approach that contributes to the brain–computer interface.
The important points of this research are also listed below.
– Our proposed model attained over 91% classification accuracies for the three used datasets with both LOSO CV and 10-fold CV.
– The most accurate channels are AF4, P8, and FC6 for the used Dataset 1, Dataset 2, and Dataset 3, respectively.
– The best results are the voted results for these three datasets. In this aspect, IHMV increases the classification performance of our channel-wise results.
– Our proposed TesPat has a distinctive feature extraction capability (see Fig. 9).
– We used empirical mode decomposition (EMD) and WPD techniques for decomposition. However, WPD performed better than EMD.
– Per the ablation results, our presented model outperforms the others.
In the future, we plan to use explainable artificial intelligence (XAI) for classification using larger datasets and more languages (Loh et al., 2022).

CRediT authorship contribution statement

Tugce Keles: Conceptualization, Methodology, Validation, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review & editing, Visualization. Arif Metehan Yildiz: Conceptualization, Methodology, Validation, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review & editing, Visualization. Prabal Datta Barua: Conceptualization, Methodology, Validation, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review & editing, Visualization. Sengul Dogan: Conceptualization, Methodology, Software, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review & editing. Mehmet Baygin: Conceptualization, Methodology, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review & editing. Turker Tuncer: Conceptualization, Methodology, Software, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review & editing. Caner Feyzi Demir: Conceptualization, Methodology, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing. Edward J. Ciaccio: Conceptualization, Methodology, Formal analysis, Investigation, Writing – original draft, Writing – review & editing. U. Rajendra Acharya: Conceptualization, Methodology, Formal analysis, Investigation, Writing – original draft, Writing – review & editing, Supervision, Project administration.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The authors do not have permission to share data.

Funding

This research is supported by the 121E399 project fund provided by the Scientific and Technological Research Council of Turkey (TUBITAK).

Ethical approval

The study was approved by the local ethical committee, the Ethics Committee of Firat University (2021/11-34).

References

Akbal, E., Barua, P.D., Dogan, S., Tuncer, T., Acharya, U.R., 2022. DesPatNet25: Data encryption standard cipher model for accurate automated construction site monitoring with sound signals. Expert Syst. Appl. 193, 116447.
Ali Khan, S., Hussain, A., Basit, A., Akram, S., 2014. Kruskal–Wallis-based computationally efficient feature selection for face recognition. Sci. World J. 2014.
Altaheri, H., et al., 2021. Deep learning techniques for classification of electroencephalogram (EEG) motor imagery (MI) signals: a review. Neural Comput. Appl. 1–42.
Aydemir, E., Tuncer, T., Dogan, S., Gururajan, R., Acharya, U.R., 2021. Automated major depressive disorder detection using melamine pattern with EEG signals. Appl. Intell. 51 (9), 6449–6466.
Bakhshali, M.A., Khademi, M., Ebrahimi-Moghadam, A., 2022. Investigating the neural correlates of imagined speech: An EEG-based connectivity analysis. Digit. Signal Process. 123, 103435.
Balasubramanian, K., Ramya, K., Gayathri Devi, K., 2022. Optimized adaptive neuro-fuzzy inference system based on hybrid grey wolf-bat algorithm for schizophrenia recognition from EEG signals. Cogn. Neurodyn. 1–19.
Barua, P.D., et al., 2023. Automated EEG sentence classification using novel dynamic-sized binary pattern and multilevel discrete wavelet transform techniques with TSEEG database. Biomed. Signal Process. Control 79, 104055.
Baygin, M., Tuncer, T., Dogan, S., Tan, R.-S., Acharya, U.R., 2021a. Automated arrhythmia detection with homeomorphically irreducible tree technique using more than 10,000 individual subject ECG records. Inform. Sci. 575, 323–337.
Baygin, M., Yaman, O., Tuncer, T., Dogan, S., Barua, P.D., Acharya, U.R., 2021b. Automated accurate schizophrenia detection system using Collatz pattern technique with EEG signals. Biomed. Signal Process. Control 70, 102936.
Cao, J., et al., 2022. Brain functional and effective connectivity based on electroencephalography recordings: A review. Hum. Brain Mapp. 43 (2), 860–879.
Cooney, C., Korik, A., Folli, R., Coyle, D., 2020. Evaluation of hyperparameter optimization in machine and deep learning methods for decoding imagined speech EEG. Sensors 20 (16), 4629.
Dash, S., Tripathy, R.K., Panda, G., Pachori, R.B., 2022. Automated recognition of imagined commands from EEG signals using multivariate fast and adaptive empirical mode decomposition based method. IEEE Sens. Lett. 6 (2), 1–4.
Datta, S., Boulgouris, N.V., 2021. Recognition of grammatical class of imagined words from EEG signals using convolutional neural network. Neurocomputing 465, 301–309.
Deng, W., Xu, J., Gao, X.-Z., Zhao, H., 2020. An enhanced MSIQDE algorithm with novel multiple strategies for global optimization problems. IEEE Trans. Syst. Man Cybern.
Deng, W., et al., 2022. Multi-strategy particle swarm and ant colony hybrid optimization for airport taxiway planning problem. Inform. Sci.
Dogan, A., et al., 2021. PrimePatNet87: Prime pattern and tunable q-factor wavelet transform techniques for automated accurate EEG emotion recognition. Comput. Biol. Med. 138, 104867.
Dogan, S., et al., 2022. Primate brain pattern-based automated Alzheimer’s disease detection model using EEG signals. Cogn. Neurodyn. 1–13.
García-Salinas, J.S., Villaseñor-Pineda, L., Reyes-García, C.A., Torres-García, A.A., 2019. Transfer learning in imagined speech EEG-based BCIs. Biomed. Signal Process. Control 50, 151–157.


Goldberger, J., Hinton, G.E., Roweis, S., Salakhutdinov, R.R., 2004. Neighbourhood components analysis. Adv. Neural Inf. Process. Syst. 17, 513–520.
Houssein, E.H., Hammad, A., Ali, A.A., 2022. Human emotion recognition from EEG-based brain–computer interface using machine learning: a comprehensive review. Neural Comput. Appl. 1–31.
Kamble, A., Ghare, P., Kumar, V., 2022. Machine-learning-enabled adaptive signal decomposition for a brain-computer interface using EEG. Biomed. Signal Process. Control 74, 103526.
Kaplan, E., et al., 2022. Automated BI-RADS classification of lesions using pyramid triple deep feature generator technique on breast ultrasound images. Med. Eng. Phys. 103895.
Kim, K., Duc, N.T., Choi, M., Lee, B., 2021. EEG microstate features according to performance on a mental arithmetic task. Sci. Rep. 11 (1), 1–14.
Kuncan, F., Yılmaz, K., Kuncan, M., 2019. Sensör işaretlerinden cinsiyet tanıma için yerel ikili örüntüler tabanlı yeni yaklaşımlar. Gazi Üniv. Mühendis. Mimarlık Fakültesi Derg. 34 (4), 2173–2186.
Li, X., et al., 2022. EEG based emotion recognition: A tutorial and review. ACM Comput. Surv.
Lima, A.A., Mridha, M.F., Das, S.C., Kabir, M.M., Islam, M.R., Watanobe, Y., 2022. A comprehensive survey on the detection, classification, and challenges of neurological disorders. Biology 11 (3), 469.
Loh, H.W., Ooi, C.P., Seoni, S., Barua, P.D., Molinari, F., Acharya, U.R., 2022. Application of explainable artificial intelligence for healthcare: A systematic review of the last decade (2011–2022). Comput. Methods Programs Biomed. 107161.
Ojala, T., Pietikainen, M., Maenpaa, T., 2002. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 24 (7), 971–987.
Ostad-Ali-Askari, K., Shayan, M., 2021. Subsurface drain spacing in the unsteady conditions by HYDRUS-3D and artificial neural networks. Arab. J. Geosci. 14 (18), 1–14.
Ostad-Ali-Askari, K., Shayannejad, M., Ghorbanizadeh-Kharazi, H., 2017. Artificial neural network for modeling nitrate pollution of groundwater in marginal area of Zayandeh-rood river, Isfahan, Iran. KSCE J. Civ. Eng. 21 (1), 134–140.
Panachakel, J.T., Ramakrishnan, A.G., 2021. Decoding covert speech from EEG-a comprehensive review. Front. Neurosci. 392.
Pawar, D., Dhage, S., 2020. Multiclass covert speech classification using extreme learning machine. Biomed. Eng. Lett. 10 (2), 217–226.
Peterson, L.E., 2009. K-nearest neighbor. Scholarpedia 4 (2), 1883.
Powers, D.M., 2020. Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. arXiv preprint arXiv:2010.16061.
PubChem, 2022. National Library of Medicine, National Center for Biotechnology Information, https://pubchem.ncbi.nlm.nih.gov/ (accessed).
Ruichek, Y., Chetverikov, D., Tarawneh, A.S., 2022. Local ternary pattern based multi-directional guided mixed mask (MDGMM-LTP) for texture and material classification. Expert Syst. Appl. 117646.
Saminu, S., et al., 2021. Electroencephalogram (EEG) based imagined speech decoding and recognition. J. Appl. Mater. Technol. 2 (2), 74–84.
Shah, S.Y., Larijani, H., Gibson, R.M., Liarokapis, D., 2022. Random neural network based epileptic seizure episode detection exploiting electroencephalogram signals. Sensors 22 (7), 2466.
Song, Y., et al., 2022. Dynamic hybrid mechanism-based differential evolution algorithm and its application. Expert Syst. Appl. 118834.
Subasi, A., Saikia, A., Bagedo, K., Singh, A., Hazarika, A., 2022. EEG based driver fatigue detection using FAWT and multiboosting approaches. IEEE Trans. Ind. Inform.
Tan, X., Triggs, B., 2010. Enhanced local texture feature sets for face recognition under difficult lighting conditions. IEEE Trans. Image Process. 19 (6), 1635–1650.
Ting, W., Guo-Zheng, Y., Bang-Hua, Y., Hong, S., 2008. EEG feature extraction based on wavelet packet decomposition for brain computer interface. Measurement 41 (6), 618–625.
Tuncer, T., 2021. A new stable nonlinear textural feature extraction method based EEG signal classification method using substitution box of the Hamsi hash function: Hamsi pattern. Appl. Acoust. 172, 107607.
Värbu, K., Muhammad, N., Muhammad, Y., 2022. Past, present, and future of EEG-based BCI applications. Sensors 22 (9), 3331.
Vorontsova, D., et al., 2021. Silent EEG-speech recognition using convolutional and recurrent neural network with 85% accuracy of 9 words classification. Sensors 21 (20), 6744.
Wang, Z., Ji, H., 2022. Open Vocabulary Electroencephalography-To-Text Decoding and Zero-Shot Sentiment Classification, Vol. 36, fifth ed. pp. 5350–5358.
Xu, L., Chavez-Echeagaray, M.E., Berisha, V., 2022. Unsupervised EEG channel selection based on nonnegative matrix factorization. Biomed. Signal Process. Control 76, 103700.
Zhao, H., et al., 2022. Intelligent diagnosis using continuous wavelet transform and gauss convolutional deep belief network. IEEE Trans. Reliab.

