Syed Umar Amin*, Ghulam Muhammad†, Wadood Abdul†, Mohamed Bencherif†, and Mansour Alsulaiman†

* samin@ksu.edu.sa

*† The authors are with the Department of Computer Engineering, College of Computer and Information Sciences (CCIS), King Saud University, Riyadh 11543, Saudi Arabia. They are also with the Center of Smart Robotics Research, CCIS, King Saud University, Riyadh, Saudi Arabia.
filter size of a CNN have a varying effect on the extracted filters; therefore, the architecture and depth play an important role in extracting robust features [16-22]. Although many studies have used deep learning-based methods for EEG decoding, the improvement in performance over conventional machine learning models has been limited [40-42].

This study proposes a feature fusion model in which deep CNN and shallow CNN features are fused with the help of autoencoders. A shallow CNN and a deep CNN with different filter sizes extract different types of EEG features, which are then fused using autoencoders. The proposed model achieves better performance than the best existing CNN models for EEG decoding. Our results also show that robust CNN feature extraction depends on depth, architecture, and filter size.

The remainder of this paper is organized as follows. Section 2 gives an overview of studies related to EEG classification. Section 3 presents the proposed multi-CNN fusion method, and Section 4 presents the results. We provide a conclusion in Section 5.

2. RELATED STUDY

Many studies employ conventional machine learning techniques for MI classification. The best-known method among them is filter bank common spatial patterns (FBCSP) [4, 5], which achieved the best results for MI decoding. Many researchers have used the support vector machine (SVM) for classification and achieved good results [26-28]. Others have used principal component analysis (PCA) and independent component analysis (ICA) for dimensionality reduction and noise removal [16-17].

More recently, studies have applied deep learning models such as CNNs, DBNs, and autoencoders to achieve good results for MI classification. CNNs have been widely used for feature extraction and MI classification tasks [19-22]. Some studies have used DBNs for temporal features, since research shows that EEG is a time-series signal [24-26]. Others have employed both CNNs and RNNs for spatial and temporal feature extraction [20, 21]. SVM has also been used as a classifier together with a DBN [24]. One study combined handcrafted CSP features with a CNN and fused the handcrafted features with deep features to obtain good results [26]. CNNs and autoencoders have been combined for EEG-based emotion recognition [27]. In one study, the authors transformed the EEG into images and used a CNN for image classification. Another study extracted the mu and beta bands from the EEG signal for MI classification and employed a stacked autoencoder (SAE) with a CNN [28, 29].

All of the above deep learning methods improved classification results. In this study, we improve MI decoding accuracy further by fusing CNNs using autoencoders.

3. MULTI-CNN FEATURE FUSION

The proposed model consists of two CNN models and an autoencoder for feature fusion. The CNN models are a shallow CNN with one convolution layer and a deep CNN with four convolution layers. First, each CNN model is pretrained and then trained on another MI database. When training is done, both CNNs are fused using an autoencoder, and a softmax layer after the autoencoder performs the classification.

Both the shallow and deep CNNs consist of blocks of convolution and pooling layers. The convolution layers learn spatial features, and the pooling layers are used for dimensionality reduction. The literature suggests that for EEG decoding a CNN with one to four convolution layers gives good results, while very deep models do not work well for EEG [21-25]. Many studies have used CNNs with just one or two layers [33-34].

In both CNN models, the first convolution is a logical one that is divided into two convolution steps. Most EEG recordings consist of multiple channels, sometimes up to 128. The split convolution technique can manage multiple channels: the first convolution step is performed through time, and the second is performed spatially across all channels, so the net effect is a convolution through all channels. The EEG data are organized in a 2-D manner, with rows of channels and columns of samples, and the divided first convolution favors this input organization. The convolution step performed across time samples extracts temporal characteristics, while the convolution performed through all channels learns spatial features.

In this study, we use the BCI Competition IV dataset 2a (BCID) [31], on which many recent studies have been tested. The BCI dataset has a limited number of training samples, so the CNN models are pretrained on another dataset, the High Gamma dataset (HGD) [33]. The BCI dataset is a motor imagery challenge dataset used by many researchers. It contains 22-channel recordings from 9 healthy subjects recorded in two sessions, with 288 trials per session, each trial four seconds long. The tasks consist of imagining movement of the right hand, left hand, feet, and tongue [31].

The input EEG signal is cropped with a 2 s sliding window and then given to the CNN models as input. Cropping increases the number of training samples, helps improve classification accuracy, and prevents overfitting on small EEG datasets. The sampling frequency is 256 Hz, so each 4 s trial contains around 1000 samples.

The split convolution is performed through time samples and then through all channels, as shown in Figure 1. After each convolution, we apply a nonlinearity and max-pooling, and at the end we apply the softmax layer. Batch normalization and dropout helped us increase decoding accuracy. We used exponential linear units (ELU) for activation.
Authorized licensed use limited to: Carleton University. Downloaded on June 21,2023 at 11:12:53 UTC from IEEE Xplore. Restrictions apply.
The architecture of the CNNs is given in Table 1. We used the Adam algorithm for optimization, which works well for high-dimensional data such as EEG. After training, the CNNs are optimized jointly.

Table 1. Structure of CNN models

| Shallow CNN                     | Deep CNN                        |
|---------------------------------|---------------------------------|
| Convolution (25×1, 100 filters) | Convolution (10×1, 100 filters) |
| Convolution (1×22, 100 filters) | Convolution (1×22, 100 filters) |
| Max Pool (3×1, stride 3)        | Max Pool (3×1, stride 3)        |
| Fully Connected (1024)          | Convolution (10×1, 100 filters) |
| Softmax (4 classes)             | Max Pool (3×1, stride 3)        |
|                                 | Convolution (10×1, 100 filters) |
|                                 | Max Pool (3×1, stride 3)        |
|                                 | Convolution (10×1, 200 filters) |
|                                 | Max Pool (3×1, stride 3)        |
|                                 | Fully Connected (1024)          |
|                                 | Softmax (4 classes)             |

The fusion is performed using an autoencoder, which is trained separately while the CNN parameters are frozen. We fuse the shallow CNN and the deep CNN using the autoencoder. The CNNs were pretrained on the HGD dataset; then the softmax and dense layers were removed, and the CNNs were fused by concatenating their pooled features. The multi-CNN fusion model is shown in Figure 3. The fusion model is trained in both a subject-specific and a cross-subject manner.

The EEG signal is dynamic, so it changes from subject to subject, and trials also differ within a subject. Hence, we need robust features that can discriminate between subjects and trials. Our study proposes training the autoencoder in a novel way to learn features across different subjects. Autoencoder cross-encoding and pretraining achieved good results in [37]. The concatenated features are given to the autoencoder, which reconstructs the feature set of the same trial in subject-specific training; in the cross-subject encoding technique, the autoencoder instead reconstructs a trial from a different subject. In this way, the fusion model extracts a discriminative feature set for MI decoding.

The cross-subject encoding also helps increase the number of training samples. At the end, a softmax function is used to classify the feature set. The cross-encoding method is illustrated in Figure 5.
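As a sanity check on Table 1, the following arithmetic traces the temporal dimension through the deep CNN. It assumes a 2 s crop (512 samples at 256 Hz) and 'valid' convolutions and pooling; the 1×22 convolution collapses the channel axis and leaves the time axis unchanged, so only the four 10×1 convolutions and 3×1 pools shrink the signal.

```python
def conv_out(n, k):
    """Output length of a 'valid' convolution with kernel size k."""
    return n - k + 1

def pool_out(n, k, s):
    """Output length of a 'valid' max-pool with window k and stride s."""
    return (n - k) // s + 1

# Deep CNN of Table 1 applied along time to a 2 s crop (512 samples,
# assuming 256 Hz); each 10×1 convolution is followed by a 3×1 pool, stride 3
n = 512
for k in (10, 10, 10, 10):
    n = conv_out(n, k)
    n = pool_out(n, 3, 3)
print(n)   # 1
```

The temporal dimension collapses to 1, so the final 200-filter convolution layer yields a 200-dimensional feature vector per crop before the fully connected layer.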
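The fusion and cross-subject encoding described above can be sketched as follows. This is an illustrative sketch rather than the published code: `shallow_feats` and `deep_feats` are random stand-ins for the pooled outputs of the two frozen CNNs (their sizes are hypothetical), and the helper simply shows how reconstruction targets differ between subject-specific training (the trial itself) and cross-subject encoding (a same-class trial from another subject).

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for pooled features from the two frozen CNNs (hypothetical sizes)
n_trials = 6
shallow_feats = rng.standard_normal((n_trials, 128))
deep_feats = rng.standard_normal((n_trials, 256))
subjects = np.array([1, 1, 2, 2, 3, 3])   # which subject produced each trial
labels = np.array([0, 1, 0, 1, 0, 1])     # MI class of each trial

# Fusion: concatenate shallow and deep feature vectors per trial
fused = np.concatenate([shallow_feats, deep_feats], axis=1)

def cross_subject_targets(fused, subjects, labels):
    """For each trial, pick a same-class trial from a *different* subject
    as the autoencoder's reconstruction target."""
    targets = np.empty_like(fused)
    for i in range(len(fused)):
        candidates = np.where((labels == labels[i]) & (subjects != subjects[i]))[0]
        targets[i] = fused[candidates[0]]
    return targets

# Subject-specific training: reconstruct the trial itself
subject_specific_targets = fused.copy()
# Cross-subject encoding: reconstruct a matching trial of another subject
cross_targets = cross_subject_targets(fused, subjects, labels)

print(fused.shape)   # (6, 384)
```

Because every (trial, other-subject trial) pair becomes a training example, cross-subject encoding multiplies the number of reconstruction targets, which is how it enlarges the training set.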
Information Processing Systems, December 2006, pages 153–160.
[30] M. Chen, et al., "Edge-CoCaCo: Toward Joint Optimization of Computation, Caching, and Communication on Edge Cloud," IEEE Wireless Communications, vol. 25, no. 3, pp. 21-27, June 2018.
[31] C. Brunner, R. Leeb, G. Muller-Putz, A. Schlogl and G. Pfurtscheller, "BCI Competition 2008–Graz data set A and B", Institute for Knowledge Discovery (Laboratory of Brain-Computer Interfaces), Graz University of Technology, pages 136–142.
[32] R. T. Canolty, E. Edwards, S. S. Dalal, M. Soltani, S. S. Nagarajan, H. E. Kirsch, M. S. Berger, N. M. Barbaro, and R. T. Knight, "High gamma power is phase-locked to theta oscillations in human neocortex", Science, 313:1626–1628, 2006.
[33] R. T. Schirrmeister, J. T. Springenberg, L. D. J. Fiederer, M. Glasstetter, K. Eggensperger, M. Tangermann, and T. Ball, "Deep learning with convolutional neural networks for EEG decoding and visualization", Hum. Brain Mapp., 38: 5391–5420, 2017.
[34] S. Sakhavi, C. Guan, and Y. Shuicheng, "Learning Temporal Information for Brain-Computer Interface Using Convolutional Neural Networks," IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 11, pp. 5619-5629, Nov. 2018.
[35] Y. R. Tabar and U. Halici, "A novel deep learning approach for classification of EEG motor imagery signals", Journal of Neural Engineering, 14(1):016003, 2016.
[36] S. Stober, "Learning discriminative features from electroencephalography recordings by encoding similarity constraints," 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, 2017, pp. 6175-6179.
[37] S. U. Amin, M. Alsulaiman, G. Muhammad, M. A. Bencherif, and M. S. Hossain, "Multilevel Weighted Feature Fusion Using Convolutional Neural Networks for EEG Motor Imagery Classification," IEEE Access, vol. 7, pp. 18940-18950, 2019.