
Name:

Student ID:
Title:
Human-robot interaction (HRI) is currently gaining momentum, since electromyography (EMG) and electroencephalography (EEG) signals can be used for hand gesture recognition. Herein, EMG is developed for HRI to decode hand movements using an attention-based method. EMG signals are derived from the muscles and are non-stationary in character. Nevertheless, they can be used for hand gesture recognition because muscular contraction represents the motor action of the hands. In practice, some constraints reduce their utility, such as the contact impedance of the sensor, unstable sensor contact on the human skin, and so forth.

To overcome the aforementioned problems, the paper chosen by the volunteer speaker proposes HRI through a low-cost hand robot via surface EMG (sEMG) signals, using an attention-based neural network to recognize hand gestures. The proposed method is divided into three main parts: sensors (collecting the raw dataset), an MCU (peripheral communication), and a computer (processing raw data and making predictions). Four Myo-armband channels (EMG sensors) and an EMG bio-signal feedback device (EBFD) are deployed for data acquisition. A Teensy MCU board connects the equipment, and the computer performs data pre-processing and runs the learning algorithms in Python to produce classification results. The software pipeline of the proposed method is organized into two stages: offline training and online prediction.
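As a rough illustration of the online-prediction side of this pipeline, the sketch below collects one raw window from the MCU over a serial link. The port name, baud rate, and comma-separated packet format are illustrative assumptions, not details given in the paper.

import numpy as np
import serial  # pyserial

ser = serial.Serial("/dev/ttyACM0", baudrate=115200, timeout=1)  # assumed port

def read_segment(n_samples=1000, n_channels=4):
    """Collect one raw window (1000 samples x 4 channels) from the MCU."""
    rows = []
    while len(rows) < n_samples:
        line = ser.readline().decode(errors="ignore").strip()
        values = line.split(",")              # assumed "v1,v2,v3,v4" packet
        try:
            if len(values) == n_channels:
                rows.append([float(v) for v in values])
        except ValueError:
            continue                          # skip malformed lines
    return np.asarray(rows)                   # shape: (n_samples, n_channels)

segment = read_segment()
# this window would then go through pre-processing and the trained classifier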

Each process is explained in detail as follows. Data acquisition covers a frequency range of 0.5-495 Hz, which follows the frequency range of the hand's muscle action, and each EMG segment uses a window size of 1000 samples. The raw data are then fused over a period of 0.25 s. To clean the dataset, a high-pass filter with a 5 Hz cutoff is applied, followed by zero-mean normalization. For the classification process, the short-time Fourier transform (STFT) is applied before the data reach the classifiers.
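As a concrete illustration of this preprocessing chain, the sketch below high-pass filters one 1000-sample segment at 5 Hz, applies zero-mean normalization, and computes an STFT. The sampling rate (1000 Hz), filter order, and STFT window length are illustrative assumptions, since the summary does not state them.

import numpy as np
from scipy import signal

FS = 1000          # assumed sampling rate in Hz (not stated in the summary)
WINDOW_SIZE = 1000 # samples per EMG segment, per the summary
HP_CUTOFF = 5.0    # high-pass cutoff in Hz, per the summary

def preprocess_segment(raw):
    """High-pass filter, zero-mean normalize, and STFT one EMG segment."""
    # 4th-order Butterworth high-pass at 5 Hz to remove baseline drift
    sos = signal.butter(4, HP_CUTOFF, btype="highpass", fs=FS, output="sos")
    filtered = signal.sosfiltfilt(sos, raw)
    # zero-mean normalization: subtract the mean, scale by the std
    normalized = (filtered - filtered.mean()) / (filtered.std() + 1e-8)
    # short-time Fourier transform yields the spectrogram representation
    _, _, spec = signal.stft(normalized, fs=FS, nperseg=128)
    return np.abs(spec)

segment = np.random.randn(WINDOW_SIZE)    # one synthetic single-channel window
spectrogram = preprocess_segment(segment)
print(spectrogram.shape)                  # (65, 17) frequency-by-time bins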

In this study, the author proposes an attention-based model, along with other classifier models as benchmarks for comparison. The fully connected architecture proposed by the author has an input layer of 4 x 8 x 128 or 1 x 4 x 256, a first hidden layer of 2048, a second hidden layer of 1024, a third hidden layer of 512, and an output layer of 5, matching the number of hand motions recognized. The proposed convolutional neural network (CNN) architecture consists of Conv2D with batch normalization and SELU, another SELU, and a linear layer, followed by a flatten operation, a fully connected layer, and a softmax for the classification result. In the attention-based ResNet (AResNet), the input data are fed to a ResNet and then split into Conv2D branches. Here, two kinds of attention modules, spatial and channel, are deployed. With some reshaping, their outputs are summed and then fed to a linear layer to obtain the final result.
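To make the attention mechanism concrete, here is a minimal PyTorch sketch of parallel channel and spatial attention branches whose outputs are summed and passed to a linear layer, as the summary describes. The internal layer sizes, pooling choices, and module structure are illustrative assumptions rather than the paper's exact AResNet design.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Re-weights each feature channel using a pooled global descriptor."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                     # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))       # global average pool -> (B, C)
        return x * w[:, :, None, None]

class SpatialAttention(nn.Module):
    """Re-weights each spatial location using pooled channel statistics."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        pooled = torch.cat([x.mean(1, keepdim=True),
                            x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(pooled))

class AttentionHead(nn.Module):
    """Sums the two attention branches, then classifies with a linear layer."""
    def __init__(self, channels, num_classes=5):
        super().__init__()
        self.channel_att = ChannelAttention(channels)
        self.spatial_att = SpatialAttention()
        self.fc = nn.Linear(channels, num_classes)

    def forward(self, feats):                 # feats: a ResNet feature map
        fused = self.channel_att(feats) + self.spatial_att(feats)
        return self.fc(fused.mean(dim=(2, 3)))  # pool, then 5-way logits

head = AttentionHead(channels=64)
logits = head(torch.randn(2, 64, 8, 8))       # e.g. a small feature map
print(logits.shape)                           # torch.Size([2, 5])

Summing the two branches lets the network emphasize informative EMG channels and time-frequency regions independently before the final classification.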

The training configuration of the proposed method is as follows (a minimal training-loop sketch using these settings appears after the list).

Raw dataset: 2745 images

Split: 80% training, 20% testing

Batch size: 16 samples

Optimizer: Adam

Learning rate: 1e-2 or 1e-3
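Below is a minimal training-loop sketch wired up with the configuration above (Adam, batch size 16, an 80/20 split, and a 1e-3 learning rate). The stand-in dataset tensors, classifier, and epoch count are illustrative assumptions; only the hyperparameters come from the summary.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

# stand-in dataset: 2745 spectrogram "images" with 5 gesture labels (shapes assumed)
data = TensorDataset(torch.randn(2745, 1, 4, 256),
                     torch.randint(0, 5, (2745,)))
n_train = int(0.8 * len(data))                      # 80/20 train/test split
train_set, test_set = random_split(data, [n_train, len(data) - n_train])
train_loader = DataLoader(train_set, batch_size=16, shuffle=True)

model = nn.Sequential(nn.Flatten(),                 # stand-in classifier
                      nn.Linear(1 * 4 * 256, 512),
                      nn.SELU(),
                      nn.Linear(512, 5))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for epoch in range(10):                             # epoch count assumed
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()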

The results show that five hand movements, including neutral, close, open, flexion, and extension, are well classified, with the highest accuracy rates of 97.33% and 94.51% for AResNet on the raw and spectrogram data, respectively.
