
Expert Systems With Applications 241 (2024) 122696

Contents lists available at ScienceDirect

Expert Systems With Applications


journal homepage: www.elsevier.com/locate/eswa

Human Activity Recognition based on Local Linear Embedding and Geodesic Flow Kernel on Grassmann manifolds
Huaijun Wang a,b , Jian Yang a , Changrui Cui a , Pengjia Tu a , Junhuai Li a,b ,∗, Bo Fu a,b , Wei Xiang c
a School of Computer Science and Engineering, Xi’an University of Technology, No. 5 South Jinhua Road, Xi’an, 710048, Shaanxi, China
b Shaanxi Key Laboratory for Network Computing and Security Technology, No. 5 South Jinhua Road, Xi’an, 710048, Shaanxi, China
c School of Computing, Engineering and Mathematical Sciences, La Trobe University, Melbourne 3086, Australia

ARTICLE INFO

Keywords:
Human Activity Recognition
Transfer learning
Locally Linear Embedding
Geodesic Flow Kernel
Grassmann manifolds

ABSTRACT

Human Activity Recognition (HAR) plays a crucial role in various applications (e.g., medical treatment, video surveillance and sports monitoring). Transfer learning is a promising solution to cross-domain identification problems in HAR. However, existing methods usually ignore the negative transfer caused by using the features of each source domain in equal proportions, as well as the distribution difference between the source and target domains. In this paper, an HAR method based on manifold learning is proposed. Firstly, the similarity between the target domain and multiple source domains is calculated using the Multi-Kernel-Maximum Mean Difference (MK-MMD), and the source domain most similar to the target domain is selected as the optimal source domain in the transfer task. Secondly, Locally Linear Embedding (LLE) is leveraged to reduce the dimensionality of both the optimal source domain data and the target domain data to remove redundant information, and the Geodesic Flow Kernel (GFK) is utilized to project the low-dimensional data into the Grassmann manifold space and reduce the distribution difference between the two domains. Finally, the action model trained on the source domain is applied to the target domain. Three public datasets (i.e., PAMAP2, OPPORTUNITY and UCI DSADS) are utilized to validate the effectiveness of the proposed approach. Experimental results demonstrate that the proposed HAR method can predict a large number of unlabeled samples in the target domain while preserving the original data structure.

1. Introduction

Human Activity Recognition (HAR) analyzes various human behaviors based on collected sensor data (e.g., images or videos), and it is a major research direction in the field of mobile and pervasive computing. Meanwhile, HAR has achieved remarkable progress in a variety of applications, including medical care, video surveillance and sports monitoring (Sah & Ghasemzadeh, 2019).

In recent years, with the development and application of sensors, HAR methods based on wearable sensors have received wide attention, and many promising research results have been achieved at home and abroad. Han et al. (2022) proposed a new heterogeneous convolution for human activity recognition tasks and enhanced recognition on human activity datasets collected by wearable inertial sensors. Zhang et al. (2022b) studied the application of embedded intelligent wearable device monitoring in articular cartilage injury and rehabilitation training and proposed a monitoring algorithm. The results show that the use of the embedded intelligent wearable device for phased exercise rehabilitation training can effectively restore joint function. Vergara-Diaz et al. (2021) experimented with 5 wearable sensors on the forearm, lower leg, and waist of a user to collect acceleration data and estimate the severity of limb-specific symptoms in Parkinson’s disease patients. Alfaro and Trejos (2022) proposed a user-independent pose classification method based on sensor fusion technology, which combined EMG data with inertial sensor data and realized a good human–computer interaction system, enabling better control of wearable devices in the process of auxiliary treatment. Qaroush et al. (2021) designed a sign language recognition system by using three-axis acceleration sensors and gyroscope inertial measurement units to realize gesture recognition and data analysis. Khatun et al. (2022) adopted an accelerometer, gyroscope,

The code (and data) in this article has been certified as Reproducible by Code Ocean: (https://codeocean.com/). More information on the Reproducibility
Badge Initiative is available at https://www.elsevier.com/physical-sciences-and-engineering/computer-science/journals.
∗ Corresponding author at: School of Computer Science and Engineering, Xi’an University of Technology, No. 5 South Jinhua Road, Xi’an, 710048, Shaanxi,
China.
E-mail addresses: wanghuaijun@xaut.edu.cn (H. Wang), 2211221129@stu.xaut.edu.cn (J. Yang), 2191221061@stu.xaut.edu.cn (C. Cui),
tupengjia@stu.xaut.edu.cn (P. Tu), lijunhuai@xaut.edu.cn (J. Li), 105222@xaut.edu.cn (B. Fu), w.xiang@latrobe.edu.au (W. Xiang).

https://doi.org/10.1016/j.eswa.2023.122696
Received 17 February 2023; Received in revised form 21 November 2023; Accepted 21 November 2023
Available online 23 November 2023
0957-4174/© 2023 Elsevier Ltd. All rights reserved.

and linear acceleration application based on Android OS to collect raw data for human behavior recognition. Jiang et al. (2022) used wearable devices to collect surface EMG signals and inertial signals and recorded a multi-category dataset of 20 gestures to accurately classify gestures. Alqarni (2021) leveraged wearable sensors to monitor patients’ physical and physiological parameters for medical diagnosis. Kuo et al. (2009) bound an acceleration sensor to the ankle joint to calculate the user’s walking distance, speed and energy consumption.

Traditional machine learning and deep learning require the data not only to follow the same distribution but also to include sufficient labeled samples to train the model. However, data collected in practice always contain a large amount of unlabeled samples, and the same action may be recorded from different users or with sensors worn at different positions. These factors affect the overall distribution of the sensor data and degrade the performance of the model. This is because most deep learning models are designed to solve specific tasks. If the data distribution changes, these models have to be rebuilt, and it is difficult to reconstruct and retrain them under limited computational power and time. Transfer learning, in contrast, can reuse pre-trained networks for custom tasks and transfer knowledge learned from previous tasks.

Ye et al. (2020) proposed an instance-based cross-domain transfer model and trained action classifiers using supervised learning patterns. Alinia et al. (2020) proposed a human action recognition method based on feature transfer, which learned the structural similarity between events in the source domain and the target domain, used the network graph constructed for the two domains, extracted the structural relationship between the core clusters of the two domains, and assigned appropriate labels to the samples of the target domain. Karaman et al. (2021) used the model fine-tuning method of transfer learning on speech datasets to achieve enhanced recognition of Parkinson’s disease. Gong et al. (2012) proposed a feature transfer learning method for human behavior recognition and adopted a kernel-based method for cross-domain transfer. It simulates the transfer between domains by integrating a large number of subspaces that describe the change from the source domain to the target domain, so that the model can learn new shallow features and reduce domain differences. Li et al. (2022) proposed a transfer learning method to train a high-fidelity and subject-independent child activity recognition model using data from the adult domain, which effectively improved the classification accuracy.

Transfer learning can identify the target domain data according to the data information of the source domain. Nevertheless, how to use the domain similarity between the source and target domains to construct the transfer model, and how to improve the recognition performance of cross-domain transfer, remain two major problems for behavior recognition based on transfer learning. The following challenges exist in studying these issues.

We only have the original activity data in the target domain without actual activity labels, and the missing labels will seriously affect the classification effect of the model. The data distribution similarity of each limb part is different, and the criterion for judging similarity cannot rely solely on the symmetric relationship of limb parts. If the features of each source domain are utilized in equal proportion, the difference between the domains cannot be exploited, which may cause negative transfer. Hence, it is necessary to study a method to measure the similarity between the source and target domains. To avoid negative transfer, the source domain most similar to the target will be selected to train the learning model. After selecting an appropriate source domain limb, it is difficult to construct an efficient machine learning model using both the source and target domains due to the distribution differences of different body parts. Many methods are based on the premise that the distributions of the source and target domains are the same. Different distributions will lead to overfitting, which will affect the recognition performance.

To address the above challenges, this paper proposes a cross-domain human behavior recognition method. The major contributions of this work are as follows:

1 To solve the problem of missing labels in the target domain sample data in human behavior recognition, the label prediction of target domain data is realized by domain transfer with the Geodesic Flow Kernel (GFK) framework in the manifold space;
2 In multi-source domain transfer, using the features of every source domain in equal proportion can cause negative transfer. As such, the Multi-Kernel-Maximum Mean Difference (MK-MMD) is used as an unsupervised source domain selection method that measures the similarity of the body movements between the source and target domains. The source domain most similar to the target domain is then chosen for the final transfer task;
3 To address the distribution difference between the data, we propose an unsupervised modeling method that combines Locally Linear Embedding (LLE) with the GFK. LLE is used to extract the common pattern information of the source and target domain data separately, and this common information is mapped to the manifold space. The GFK then performs domain transfer in the manifold space, which reduces the distribution difference between the data and solves the problem of missing labels in the target domain.

2. Related work

2.1. Human behavior recognition based on manifold learning

Manifold learning has advantages in dealing with behavioral data containing nonlinear structures, because it can preserve the internal nonlinear structure of data. Because the high dimensionality of features contained in video, image sequence or sensor data places pressure on computing and storage, researchers use manifold learning to achieve nonlinear mapping of action features from high to low dimensions to obtain effective information and improve the efficiency of computation.

Long et al. (2022) proposed a domain adaptive model based on optimal transport on the Grassmann manifold and designed a simplified model that keeps the necessary adaptive characteristics. Cross-domain recognition experiments on several public datasets show that the model achieves optimal performance in every case. Chen et al. (2022) proposed a deep heterogeneous 3D manifold network for behavior recognition that takes advantage of the Riemannian manifold in describing 3D motion. Using the construction of the graph to guide the backbone network to mine more discriminative nonlinear spatiotemporal features, experiments on several mainstream skeleton datasets achieved considerable results. Isomap depends on the distance measure of multidimensional scaling, and it can enhance spatiotemporal relationships in motion changes. Jenkins and Matarić (2004) extended Isomap to eliminate the ambiguity of the proximal data sample points and make the spatially distal data points correspond, thus revealing the spatio-temporal characteristics in the data structure. Jia and Yeung (2008) proposed a local spatio-temporal discriminant embedding method, which projected image frames of action sequences into the embedding manifold space according to different categories to form a local temporal subspace; this method can improve the classification of actions with similar spatio-temporal shapes. Hongsheng et al. (2015) calculated the dense optical flow field in action images, used Convolutional Neural Networks (CNN) combined with an attention pooling mechanism to capture the region of interest in continuous video frames, and used a Riemannian manifold learning method to calculate the spatial motion variation between feature vectors of different frames. The manifold characteristics of motion changes were obtained, and the motion of the target was modeled from multiple perspectives to realize behavior recognition. Wang et al.


(2014) proposed an action recognition framework based on manifold learning. In the training stage, the LE manifold dimensionality reduction algorithm was used to reduce the dimensionality of high-dimensional depth image features obtained from a Kinect device. In the recognition stage, the nearest neighbor interpolation method was used to map the test data to the low-dimensional manifold space, and the improved Hausdorff distance was used to measure the similarity between the training set and the test set in the low-dimensional space, which effectively improved the recognition accuracy.

2.2. Transdomain human behavior recognition based on transfer learning

Traditional machine learning and deep learning methods require not only that the training and test data have the same distribution but also that a sufficient number of training samples is available. However, it is time-consuming and expensive to obtain sufficient labeled training samples. When using sensors to collect behavioral data, different locations of the bound sensor units and different types of sensor units will cause data distribution differences, which will greatly degrade the performance of the classification model. Transdomain human behavior recognition based on transfer learning can solve this problem well. Transfer learning can discover the data similarity between the source domain and the target domain. The source domain refers to the domain with a large number of labels, and the target domain is the object to be assigned labels. The classification of a target domain with no labels or only a few labels can be improved with the help of the knowledge of the source domain.

Xu et al. (2023) studied the cross-person and cross-location transfer problem in wearable sensor-based human activity recognition and proposed a hybrid model to achieve cross-domain transfer. The model combines a CNN, a deep adaptive network with a gradient regression layer and an adaptive classifier based on the online sequential extreme learning machine (OS-ELM). Experiments show that the model is more efficient than standard CNN and deep transfer learning models, where the accuracy of cross-location and cross-location transfer is improved by 12% and 20%, respectively. Kang et al. (2022) segmented target gestures from the original signals coupled in the process of human dynamic walking and studied a transfer learning method based on distribution adaptation. Gesture recognition is realized by domain transfer between dynamic walking and static standing scenes, and the accuracy of gesture recognition is improved by 15.1%. Lin et al. (2017) constructed subspaces for classes in the source domain and anchor subspaces in the target domain. A cost function is minimized to estimate the overlap and topological consistency between the source domain and the target domain as well as between subspaces, in order to assign corresponding class labels to the anchor subspaces. Then, the joint subspace is constructed by combining the anchor subspace with the atomic space, and the data samples belonging to the same joint subspace are used to train a support vector machine for the recognition of unlabeled data in the target domain. Qin et al. (2019) proposed an adaptive spatial-temporal transfer learning method, which adaptively evaluates the relative importance of the marginal and conditional probability distributions to learn spatial features, captures temporal features through incremental manifold learning, and provides a new solution for multiple source domain selection. Wang et al. (2018) proposed a general framework of Stratified Transfer Learning (STL), which uses inter-class affinity for intra-class knowledge transfer and obtains pseudo-labels of target domains by majority voting; the classification accuracy was improved by 7.68%. Yao et al. (2022) proposed an effective method called Discriminative Manifold Distribution Alignment (DMDA). This method leverages the concept of distribution alignment for domain adaptation and improves the discriminative model by learning geometrical structures in the manifold space. It also addresses the uncertainty introduced by pseudo-labels in the target domain, which enables effective domain adaptation. Kang et al. (2020a) employed a generative adversarial approach to address cross-domain problems and proposed the Generative Adversarial Distribution Matching (GADM) framework. This framework resolves the issue of underperforming generators or discriminators. By enhancing the objective function with a cross-domain difference distance and further minimizing the differences through competition between the generator and discriminator, it effectively reduces the disparities in cross-domain distributions. Kang et al. (2020b) proposed the Enhanced Subspace Distribution Matching (ESDM) method, which leverages label information to enhance the distribution matching between the source and target domains in a shared subspace. During the kernel principal component analysis (PCA) dimensionality reduction process, it reduces the conditional and marginal distribution differences in the shared subspace to improve cross-domain subspace distribution matching. Mutegeki and Han (2019) combined a CNN with a Long Short-Term Memory network (LSTM) and obtained 90.8% accuracy in the experiment on the UCI HAR dataset, which provided ideas for solving the problem of too few training data in the target domain dataset. Khan and Roy (2017) studied the problem of identifying activities across different environments with limited labeled data in the target domain and proposed the TransAct model, which combines an anomaly detection method with clustering to identify unknown activities whose data distribution differs from the source samples; the recognition accuracy reached 81%. Zhao et al. (2010) combined a decision tree with the K-means clustering algorithm to propose TransEMDT (Transfer learning EMbedded Decision Tree) for cross-user behavior recognition and embedded this method in a portable activity reporting system based on smartphones. The system can use unlabeled samples to adapt the activity recognition model and build personalized models for new users. Feuz and Cook (2017) realized real-time and seamless transmission of learned activity information between sensor platforms by building a Personalized activity ECOsystem (PECO). The multi-view transfer learning algorithm proposed in that work was used to realize information transfer between sensor platforms.

3. Method

There are nonlinear structures in the human behavior data collected by sensors, the dimension of the sample features is high, and the redundant information may reduce the recognition accuracy of the classifier. The LLE manifold dimensionality reduction algorithm not only preserves the local characteristics of the original high-dimensional feature sequence while reducing dimensionality, but also reflects the manifold structure of complex data.

The distribution difference between the data in the source and target domains makes it impossible to directly classify the target domain using a classifier trained in the source domain. The GFK transforms data into Grassmann manifolds, builds geodesic flows, and integrates infinitely many continuous subspaces between the source and target domains. Therefore, the GFK learns new feature representations that are robust to intra-domain variations, and it utilizes subspaces to reveal potential differences and commonalities between domains and reduce inter-domain differences.

In this paper, LLE is combined with the GFK algorithm. LLE is leveraged to extract useful features and to reduce redundancy and complexity in the labeled source domain action data and the unlabeled target domain action data. The GFK is then used for transfer learning on the feature data after dimensionality reduction, to obtain the continuous subspaces along the geodesic direction and extract the common information between the source and target domains. The transferred source domain data and labels are utilized to train a random forest model, which then predicts the target domain data after transfer mapping. The proposed approach not only realizes unsupervised cross-domain transfer of the target domain actions, but also improves the generalization ability of the model. The framework of the method is illustrated in Fig. 1.
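The LLE step summarized above (neighbor search, reconstruction weights that sum to one, and an embedding taken from the bottom eigenvectors) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `lle`, the row-wise data layout and the regularization term `reg` are hypothetical choices made here so that the local covariance stays invertible.

```python
import numpy as np

def lle(X, k, d, reg=1e-3):
    """Embed the rows of X (N samples x D features) into d dimensions with LLE.

    reg is a hypothetical regularisation strength (not taken from the paper);
    it keeps the local covariance invertible, e.g. when k exceeds D.
    """
    N = X.shape[0]
    # Step 1: k nearest neighbours by squared Euclidean distance.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq, np.inf)              # a point is not its own neighbour
    nbrs = np.argsort(sq, axis=1)[:, :k]

    # Step 2: reconstruction weights w_i = Q_i^{-1} 1 / (1^T Q_i^{-1} 1).
    W = np.zeros((N, N))
    ones = np.ones(k)
    for i in range(N):
        G = X[nbrs[i]] - X[i]                 # k x D neighbour offsets
        Q = G @ G.T                           # k x k local covariance
        Q += (reg * np.trace(Q) / k + 1e-12) * np.eye(k)
        w = np.linalg.solve(Q, ones)
        W[i, nbrs[i]] = w / w.sum()           # weights in each row sum to 1

    # Step 3: bottom eigenvectors of M = (I - W)^T (I - W).
    I_W = np.eye(N) - W
    M = I_W.T @ I_W
    _, vecs = np.linalg.eigh(M)               # eigenvalues in ascending order
    Y = vecs[:, 1:d + 1] * np.sqrt(N)         # skip the first (constant) eigenvector
    return Y, W
```

Calling `Y, W = lle(X, k=8, d=3)` on an `N x D` array returns the `N x 3` embedding together with the sparse weight matrix; scaling by `sqrt(N)` makes the embedded coordinates satisfy the unit-covariance constraint.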

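The geodesic flow kernel used in this section integrates the projections onto all subspaces along the geodesic between a source and a target subspace. The sketch below is one possible NumPy formulation, offered as an assumption-laden illustration rather than the paper's code: the helper names (`pca_basis`, `_frames`, `geodesic`, `gfk`) are hypothetical, the orthogonal complement is never formed explicitly (the direction orthogonal to the source subspace is recovered from the target basis instead), and the coefficients are the exact integrals over t in [0, 1], which for the first coefficient is one half of the closed form 1 + sin(2θ)/(2θ).

```python
import numpy as np

def pca_basis(Z, d):
    """Orthonormal D x d basis of the top-d principal directions of Z (rows = samples)."""
    Zc = Z - Z.mean(axis=0)
    _, _, Vt = np.linalg.svd(Zc, full_matrices=False)
    return Vt[:d].T

def _frames(Ps, Pt, tol=1e-8):
    """Return A (rotated source basis), H (unit directions leaving the source
    subspace towards the target) and the principal angles theta."""
    U1, cos_t, Vt = np.linalg.svd(Ps.T @ Pt)
    cos_t = np.clip(cos_t, 0.0, 1.0)
    theta = np.arccos(cos_t)
    A = Ps @ U1
    B = Pt @ Vt.T - A * cos_t                 # part of the target basis orthogonal to Ps
    sin_t = np.sin(theta)
    H = np.zeros_like(B)
    ok = sin_t > tol
    H[:, ok] = B[:, ok] / sin_t[ok]           # normalise where the angle is non-zero
    return A, H, theta

def geodesic(Ps, Pt, t):
    """Phi(t): basis of the subspace at position t in [0, 1] on the geodesic."""
    A, H, theta = _frames(Ps, Pt)
    return A * np.cos(t * theta) + H * np.sin(t * theta)

def gfk(Ps, Pt):
    """G = integral over t in [0, 1] of Phi(t) Phi(t)^T, evaluated in closed form."""
    A, H, theta = _frames(Ps, Pt)
    s2 = np.sinc(2 * theta / np.pi)                     # sin(2*theta)/(2*theta), 1 at theta = 0
    l1 = 0.5 * (1 + s2)                                 # integral of cos^2(t*theta)
    l2 = 0.5 * np.sin(theta) * np.sinc(theta / np.pi)   # integral of cos(t*theta)*sin(t*theta)
    l3 = 0.5 * (1 - s2)                                 # integral of sin^2(t*theta)
    return (A * l1) @ A.T + (A * l2) @ H.T + (H * l2) @ A.T + (H * l3) @ H.T
```

With `G = gfk(pca_basis(Zs, d), pca_basis(Zt, d))`, the kernel value between two feature vectors is `x_i @ G @ x_j`. `geodesic(Ps, Pt, 0.0)` spans the source subspace and `geodesic(Ps, Pt, 1.0)` spans the target subspace.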

Fig. 1. Cross-domain transfer human action recognition framework based on manifold learning.
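The first stage of this framework picks the source domain closest to the target using MK-MMD (Section 3.1). A minimal sketch of that selection is given below. It rests on assumptions that are not taken from the paper: Gaussian (RBF) base kernels with the bandwidths in `gammas`, uniform kernel weights `betas` instead of an optimized combination, and the biased empirical estimator. The names `rbf_kernel`, `mk_mmd2` and `select_source` are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix exp(-gamma * ||a - b||^2) between rows of A and B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def mk_mmd2(Xp, Xq, gammas=(0.25, 0.5, 1.0, 2.0), betas=None):
    """Biased empirical squared MK-MMD between samples Xp (m x D) and Xq (n x D).

    gammas and the uniform betas are illustrative assumptions; betas must be
    non-negative and sum to 1 so the combination stays a valid kernel.
    """
    if betas is None:
        betas = np.full(len(gammas), 1.0 / len(gammas))

    def k(A, B):
        return sum(b * rbf_kernel(A, B, g) for b, g in zip(betas, gammas))

    return k(Xp, Xp).mean() + k(Xq, Xq).mean() - 2 * k(Xp, Xq).mean()

def select_source(sources, Xt):
    """Return the index of the source domain with the smallest MK-MMD to Xt."""
    dists = [float(np.sqrt(max(mk_mmd2(Xs, Xt), 0.0))) for Xs in sources]
    return int(np.argmin(dists)), dists
```

Given a list of candidate source arrays and the target array, `select_source(sources, Xt)` returns the index of the most similar source together with all distances, so the remaining stages only see one source domain.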

3.1. Unsupervised source domain selection based on MK-MMD

When sensors are used to monitor human motion, if the data of one body part is missing, the data of other body parts are used to help predict the movement label $y_t$ of that part. In the case of multi-source domain transfer, using the features from each source domain in equal proportions can lead to negative transfer due to unknown domain correlations. Therefore, it is necessary to identify the most strongly correlated source domain to perform the transfer effectively (Zhang et al., 2022a). Suppose that $D_t = \{x_t^j\}_{j=1}^{n_t}$ is the data of the missing body part, and that the data of $M$ of the $C$ body parts can be used as labeled source domains to assist label prediction, denoted by $\{D_S^i\}_{i=1}^{M}$. Define each source domain as $D_S^i = \{x_S^j, y_S^j\}_{j=1}^{n_s^i}$, and use MK-MMD to select the best source domain $D_S(K)$ for the final transfer.

The MMD is a nonparametric method for measuring the difference between two distributions, which has been widely used in many transfer learning methods. MK-MMD is based on the MMD. The feature representations of the source domain and target domain are mapped into the Reproducing Kernel Hilbert Space (RKHS), and then the difference between the mean values of the two types of data is calculated. MK-MMD is used to measure the distance between the source and target domains as follows:

$$ MMD_k^2(p, q) \triangleq \left\| E_p\left[\phi\left(x^p\right)\right] - E_q\left[\phi\left(x^q\right)\right] \right\|_{H_k}^2 \tag{1} $$

where $H_k$ is a Hilbert space defined in the topological space $X$ whose reproducing kernel is $k$. $p$ and $q$ are the distributions of the source and target domains, respectively. $X^S = \{x_i^s\}_{i=1}^{N_s}$ and $X^T = \{x_i^t\}_{i=1}^{N_t}$ are the sample sets in $p$ and $q$, respectively.

MK-MMD combines multiple kernel functions linearly and computes the distance by selecting the optimal kernel. The kernel functions are combined as follows:

$$ K = \left\{ k = \sum_{i=1}^{m} \beta_i k_i : \sum_{i=1}^{m} \beta_i = 1, \ \beta_i \geq 0, \ \forall i \right\} \tag{2} $$

where the hyperparameter $m$ is the number of positive semidefinite kernels, and the constraints on $\beta_i$ guarantee that the combined multi-kernel $k$ remains a valid kernel. The empirical estimate of the MK-MMD is as follows:

$$ MMD\left(D_p, D_q\right) = \left[ \frac{1}{m^2} \sum_{i=1}^{m} \sum_{j=1}^{m} k\left(x_i^p, x_j^p\right) + \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} k\left(x_i^q, x_j^q\right) - \frac{2}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} k\left(x_i^p, x_j^q\right) \right]^{1/2} \tag{3} $$

In Eq. (3), $m$ and $n$ denote the numbers of samples in $D_p$ and $D_q$ (not the number of kernels in Eq. (2)).

In this paper, MK-MMD is used to measure the distance between the body movements of multiple source domains and the target domain, and the body movements of the source domain with the highest similarity to the target domain are selected for transfer. This helps to better solve the problem of missing labels of sample data in the target domain in the next step.

3.2. Cross-domain transfer method based on GFK

With the unsupervised source domain selection method described in the previous section, the body movement in the source domains most suitable for the body part in the target domain can be selected to improve the transfer effect. On this basis, LLE is leveraged to reduce the dimension of the action data, and the redundant features and noise in the original data are eliminated while preserving the original local topology structure. Then the GFK is used to project the action features after dimensionality reduction into the subspace for manifold feature transformation, with the objective of reducing the data offset between the domains and realizing domain transfer. The model after transfer can solve the problem of missing action data labels in the target domain.

3.2.1. Local linear embedding algorithm

LLE is an unsupervised manifold dimensionality reduction algorithm. It assumes that data points satisfy a linear relation in the local space, and learns a compact representation of high-dimensional data while considering the local relationship between samples. The nearest neighbors are selected in accordance with the Euclidean distance in the high-dimensional space for linear reconstruction. Then the structural features of the data in the higher-dimensional space are mapped to a lower-dimensional space. While realizing the low-dimensional embedding, the nearest neighbor relationship between data can be maintained and the whole structure of the nonlinear manifold can be learned. The LLE algorithm is illustrated in Fig. 2.

The LLE algorithm is detailed as follows:

Let $X = \{x_1, x_2, \ldots, x_N\}$ be a set of points in a higher-dimensional space $R^D$, where the number of samples is $N$ and the dimension is $D$.

1. The $k$ nearest neighbors of each data point are selected in the high-dimensional space.

According to the Euclidean distance in Eq. (4), we can compute the set $N_i = \{x_{n_{i1}}, \ldots, x_{n_{ik}}\}$ of neighbors of a point $x_i$.

$$ d_{ij} = \sum_{k=1}^{D} \left| x_{ik} - x_{jk} \right|^2 \tag{4} $$

2. The optimal reconstruction weight of each point is calculated by the nearest neighbor.

Each sample point can be reconstructed by linear weighting using its $k$ neighbors to minimize the sum of the squared errors, and the sum of the weights is 1. For sample point $x_i$, if $x_j$ does not belong to its neighboring sample points, set $w_{ij} = 0$. Otherwise, the constrained least


squares problem is solved according to Eq. (5) to determine the optimal reconstruction weights $\{w_{ij}, j = 1, \ldots, k\}$.

$$ \min \varepsilon(W) = \sum_{i=1}^{N} \left\| x_i - \sum_{j=1}^{k} w_{ij} x_j \right\|^2 \quad s.t. \ \sum_{j=1}^{k} w_{ij} = 1 \tag{5} $$

Define Eq. (6) to represent the local covariance matrix of sample point $x_i$, where $Q_i$ is a real symmetric semidefinite matrix built from $G_i = [x_{n_{i1}} - x_i, \ldots, x_{n_{ik}} - x_i]$. Eq. (7) can be obtained by using $Q_i$ to write Eq. (5) in matrix form, where $\Gamma = [1, \cdots, 1]^T$ and $w_i = (w_{i1}, w_{i2}, \ldots, w_{ik})^T$ represents the local reconstruction weight vector of sample point $x_i$, and Eq. (8) can be obtained by applying the Lagrange multiplier method to Eq. (7).

$$ Q_i = G_i^T G_i = \begin{bmatrix} \langle x_{n_{i1}} - x_i, x_{n_{i1}} - x_i \rangle & \cdots & \langle x_{n_{i1}} - x_i, x_{n_{ik}} - x_i \rangle \\ \vdots & \ddots & \vdots \\ \langle x_{n_{ik}} - x_i, x_{n_{i1}} - x_i \rangle & \cdots & \langle x_{n_{ik}} - x_i, x_{n_{ik}} - x_i \rangle \end{bmatrix} \tag{6} $$

$$ \min \ w_i^T Q_i w_i \quad s.t. \ w_i^T \Gamma = 1 \tag{7} $$

$$ L(w) = \sum_{i=1}^{N} w_i^T Q_i w_i + \lambda \left( w_i^T \Gamma - 1 \right) \tag{8} $$

Taking the derivative with regard to $w$ in Eq. (8),

$$ 2 Q_i w_i + \lambda \Gamma = 0 \tag{9} $$

Solving Eq. (9) gives the weights up to a constant $c$, Eq. (10). Normalizing them so that they sum to 1 yields the optimal reconstruction weights of $x_i$, Eq. (11):

$$ w_i = c \, Q_i^{-1} \Gamma \tag{10} $$

$$ w_i^* = \frac{Q_i^{-1} \Gamma}{\Gamma^T Q_i^{-1} \Gamma} \tag{11} $$

3. Low-dimensional embedding is carried out while preserving the local geometry represented by the reconstructed weights.

After the optimal reconstruction weight is calculated, the low-dimensional embedding that can best preserve the geometric characteristics of the original space is found by minimizing Eq. (12).

$$ \min \sum_{i=1}^{N} \left\| y_i - \sum_{j=1}^{k} w_{ij} y_{n_{ij}} \right\|^2 \quad s.t. \ \sum_{i=1}^{N} y_i = 0, \ \frac{1}{N} \sum_{i=1}^{N} y_i y_i^T = I \tag{12} $$

To find the eigenvector matrix $Y = [y_1, \ldots, y_N]$ under the constraints, a new sparse symmetric positive semidefinite matrix $M = (I - W)^T (I - W)$ is constructed from the matrix $W$, and Eq. (12) is rewritten as

$$ \min \varepsilon(Y) = \sum_{i=1}^{N} \left\| Y I_i - Y W_i \right\|^2 = tr\left( Y M Y^T \right) \tag{13} $$

where $\varepsilon(Y)$ and $I$ are the loss function and the identity matrix, respectively. Also, column $i$ of matrix $I$ and $W$ are $I_i$ and $W_i$, respectively. Combining the Lagrange multiplier formula and the constraint conditions, we can obtain the following

$$ L(Y) = tr\left( Y M Y^T \right) + \lambda \left( Y Y^T - N I \right) \tag{14} $$

Eq. (14) can be solved by eigenvalue decomposition of $2 M Y^T + 2 \lambda Y^T = 0$. The feature vector matrix $Y = (y_2, y_3, \ldots, y_{d+1})^T$, i.e., the eigenvectors of $M$ associated with its second to $(d+1)$-th smallest eigenvalues, is selected as the action data after dimensionality reduction.

Fig. 2. LLE algorithm structure.

3.2.2. Geodesic flow kernel algorithm

The geodesic flow kernel model achieves domain transfer by integrating infinitely many subspaces between the source and target domains, which describe incremental changes in geometric and statistical properties from the source to the target domain. The principle is revealed in Fig. 3.

Fig. 3. GFK algorithm structure diagram.

It is assumed that the action feature data in the source domain after dimensionality reduction by LLE is $Z_S$, and the action feature data in the target domain is $Z_T$. The GFK consists of three main steps:

1. Determine the optimal dimension of the subspace.

Calculate the PCA subspaces of $Z_S$ and $Z_T$, which are $PCA_S$ and $PCA_T$, respectively, and combine $Z_S$ and $Z_T$ into one data set to calculate the subspace $PCA_{S+T}$. If the distribution difference between the action data in the source and target domains is small, the distance between the three subspaces on the Grassmann manifold is small.

$$ D(d) = 0.5 \left[ \sin \alpha_d + \sin \beta_d \right] \tag{15} $$

where $\alpha_d$ and $\beta_d$ represent the angles between $PCA_S$ and $PCA_{S+T}$ and between $PCA_T$ and $PCA_{S+T}$, respectively. $D(d)$ is the total measure of these two angles. The total measure grows with the value of the angles and with the distance between the two domains. Then we use $D(d)$ to determine the best dimension $d$,

$$ d^* = \min \{ d \mid D(d) = 1 \} \tag{16} $$

2. Construct geodesic curves.

Denote by $P_S \in R^{D \times d}$ and $P_T \in R^{D \times d}$ the subspace bases of the action feature data $Z_S$ in the source domain and $Z_T$ in the target domain after dimensionality reduction, respectively, both with orthonormal columns. Also, $R_S \in R^{D \times (D-d)}$ is the orthogonal complement of $P_S$. The geodesic function is as follows:

$$ \Phi(t) = P_S U_1 \Gamma(t) - R_S U_2 \Sigma(t) \tag{17} $$

$U_1 \in R^{d \times d}$ and $U_2 \in R^{(D-d) \times d}$ are orthogonal matrices, both given by Eq. (18)

$$ P_S^T P_T = U_1 \Gamma V^T, \quad R_S^T P_T = -U_2 \Sigma V^T \tag{18} $$

where $\Gamma$ and $\Sigma$ are $d \times d$ diagonal matrices, and $\cos \theta_i$ and $\sin \theta_i$ $(i = 1, 2, \ldots, d)$ are their diagonal elements. $\theta_i$ is the main angle of $P_S$ and $P_T$ and

used to measure the degree of subspace overlap, and $0 \le \theta_1 \le \theta_2 \le \cdots \le \theta_d \le \pi/2$.

3. Calculate the geodesic flow kernel.

The transfer process from $\Phi(0)$ to $\Phi(1)$ represents the transfer process from the source domain action feature data to the target domain action feature data. For two original $D$-dimensional feature vectors $x_i$ and $x_j$, we calculate their continuous projections from $\Phi(0)$ to $\Phi(1)$, and all projections are concatenated into infinite-dimensional feature vectors $z_i^{\infty}$ and $z_j^{\infty}$. The inner product of these two vectors defines the geodesic flow kernel:

\[
\langle z_i^{\infty}, z_j^{\infty} \rangle = \int_0^1 \left(\Phi(t)^{T} x_i\right)^{T} \left(\Phi(t)^{T} x_j\right) dt = x_i^{T} G x_j \tag{19}
\]

where $G \in \mathbb{R}^{D \times D}$ is a positive semi-definite matrix, which can be calculated as follows:

\[
G = \begin{bmatrix} P_S U_1 & R_S U_2 \end{bmatrix}
\begin{bmatrix} \Lambda_1 & \Lambda_2 \\ \Lambda_2 & \Lambda_3 \end{bmatrix}
\begin{bmatrix} U_1^{T} P_S^{T} \\ U_2^{T} R_S^{T} \end{bmatrix} \tag{20}
\]

where $\Lambda_1$, $\Lambda_2$ and $\Lambda_3$ are all diagonal matrices with the following diagonal elements:

\[
\lambda_{1i} = 1 + \frac{\sin(2\theta_i)}{2\theta_i}, \qquad
\lambda_{2i} = \frac{\cos(2\theta_i) - 1}{2\theta_i}, \qquad
\lambda_{3i} = 1 - \frac{\sin(2\theta_i)}{2\theta_i} \tag{21}
\]

Then the data after the transfer of the original feature vector $x$ along the geodesic direction is $v$:

\[
v = \sqrt{G}\, x \tag{22}
\]

According to the above equation, we can obtain the data $V_S$ after projecting the action feature data $Z_S$ in the source domain, and likewise the data $V_T$ after projecting the action feature data $Z_T$ in the target domain. The projected source domain data $V_S$, together with the source domain labels, is used to train the model, which then predicts the labels of the target domain data $V_T$.

3.2.3. Algorithm structure

The process of the LLEGFKRF algorithm is shown in Algorithm 1, and the detailed flowchart of the algorithm implementation can be found in Fig. 4.

Algorithm 1 LLEGFKRF
Input: Labeled source domain action data $X_s$ and labels $Y_s$, unlabeled target domain action data $X_t$, number of nearest neighbors $k$, reduced dimension $d$.
Output: Labels $Y_t$ of target domain action data and action recognition rate.
1: Calculate the nearest neighbors for $X_s$ and $X_t$ separately by Eq. (4).
2: Calculate the optimal reconstruction weights for $X_s$ and $X_t$ separately by Eq. (11).
3: Calculate the feature vector matrices $M_s$ and $M_t$ separately for $X_s$ and $X_t$ by Eq. (14).
4: Construct the geodesic between $M_s$ and $M_t$ using Eq. (17), and calculate the geodesic flow kernel by Eq. (19). Calculate the samples $v_s$ and $v_t$ of $M_s$ and $M_t$, respectively, after projecting them along the geodesic.
5: Train a random forest model using $v_s$ and $Y_s$.
6: Predict $Y_t$ using $v_t$ and the random forest model.

4. Experiments

4.1. Experimental settings

In this paper, three public datasets are used for experiments. All three datasets are collected by inertial sensors fixed on human bodies in different experimental environments, with different subjects and acquisition methods, and the distributions of the datasets are also quite different. The above characteristics are consistent with action recognition under cross-domain transfer learning. The original data underwent preprocessing, in which it was smoothed and denoised using Kalman filtering to remove outliers. The experiments were run on a server, and the software used included PyCharm 2022.1.4 and MATLAB R2021a; details are provided in Table 1. The evaluation index of the experiment is the recognition accuracy, which is calculated as follows:

\[
Acc = \frac{TP + TN}{N} \tag{23}
\]

where $TP$ and $TN$ represent the numbers of positive and negative samples correctly identified, $N$ represents the total number of test samples, and $Acc$ represents the proportion of samples correctly identified.

The datasets used in this paper are shown in Table 2 and explained in detail in the following.

The PAMAP2 (Reiss & Stricker, 2012) dataset uses three inertial measurement units and a heart rate monitor as sensors to collect data, all with a sampling frequency of 100 Hz, and the sensors are placed on three different body parts, i.e., wrist, chest and ankle. A total of 9 volunteers participated in the data collection, including 8 males and 1 female. The dataset records 18 activities, i.e., lying, sitting, standing, walking, running, cycling, Nordic walking, using an iron, using a vacuum cleaner, jumping rope, going up and down stairs, watching TV, working on a computer, driving a car, folding clothes, cleaning the house, and playing soccer.

The OPPORTUNITY (Chavarriaga et al., 2013) dataset includes 22,388 samples of standing, 13,183 samples of walking, 9,427 samples of sitting, and 1,674 samples of lying, which are collected by inertial sensors worn by four experimenters, such as Bluetooth wireless accelerometers, gyroscopes and magnetic sensors. The body parts for data


Fig. 4. Flow chart of LLEGFKRF algorithm implementation.

Table 1
Software and hardware environment used in the experiments.
Software and hardware   Version information                    Application in experiment
Server                  Intel Core i5-11400F CPU @ 2.60 GHz    Hardware environment of the experiments
PyCharm                 PyCharm 2022.1.4                       Software environment for the experiments in the dimensionality reduction part
MATLAB                  MATLAB R2021a                          Software environment for the dataset processing, the GFK part of the algorithm, and the experiments in the classifier part

Table 2
Experimental datasets.
Data set                                  Subjects   Samples      Activities   Site of collection                                                                                    Sensor type
PAMAP2 (Reiss & Stricker, 2012)           9          2,844,868    18           wrist, chest, ankle                                                                                   accelerometer, gyroscope, magnetometer, heart rate detector
OPPORTUNITY (Chavarriaga et al., 2013)    4          701,366      4            back, left upper arm, left forearm, right upper arm, right forearm, left foot, right foot             Bluetooth wireless accelerometer and gyroscope, magnetometer
UCI DSADS (Barshan & Yüksek, 2014)        8          1,140,000    19           torso, left arm, right arm, left leg, right leg                                                       accelerometer, gyroscope, magnetometer


Fig. 5. Data distribution after processing by the comparison algorithms.

collection are on the back, left upper arm, left forearm, right upper arm, right forearm, left foot, and right foot.

The UCI DSADS (Barshan & Yüksek, 2014) dataset is collected from 8 subjects wearing 3-axis accelerometers, gyroscopes and magnetometers, and includes daily and physical activities such as standing, going upstairs, going downstairs, lying, and sitting; each activity includes 7,500 samples. The sensors are attached to the torso, left arm, right arm, left leg, and right leg.

4.2. Comparison of manifold learning dimensionality reduction methods

To analyze the effectiveness of the algorithm based on LLEGFK, PCA projection is performed on the data, and the resulting three-dimensional feature distribution is visualized for inspection. Fig. 5(a) shows the distribution of the original data of PAMAP and DSADS, and there are obvious distribution differences between the two datasets. Fig. 5(b) shows the distribution after dimensionality reduction by LLE, which removes redundant features and reduces their impact on the recognition rate. The distribution of the original data after GFK transfer is shown in Fig. 5(c). After LLE or GFK processing, the distribution difference between the two datasets is reduced. In addition, the distribution after processing by the proposed method is shown in Fig. 5(d): the data distributions of PAMAP and DSADS after processing are almost identical, indicating that the proposed method is able to significantly reduce the distribution difference between the datasets.

In this paper, the optimal parameters $k$ and $d$ were selected using grid search. To provide a visual analysis of parameter variations and compare different dimensionality reduction methods, we set a fixed value of 25 for the nearest neighbor parameter $k$ (where applicable) across the four dimensionality reduction algorithms. The dimensionality parameter $d$ was varied within the range of 3 to 50 for the following experiments. The GFK is used for transfer learning of the action features of the source domain OPPORTUNITY and the action features of the target domain DSADS after dimension reduction. The recognition accuracy of the actions in DSADS is plotted in Fig. 6.

As can be seen from Fig. 6, among the four dimensionality reduction algorithms, the LLE algorithm obtains the highest recognition accuracy, since it preserves the manifold structure inside the data while reducing the dimension.

Next, the influence of the neighbor parameter $k$ on the result is compared. The target dimension $d$ of the three manifold dimensionality reduction algorithms is set to a fixed value of 25, and the value range of $k$ is set to 3–50. Since PCA has no parameter $k$, only the three manifold algorithms are compared, as shown in Fig. 7.

In Fig. 7, the recognition rate of all three algorithms first increases with $k$; after reaching its maximum, the accuracy begins to decline and then stabilizes. The transfer method after dimensionality reduction using LLE achieves the highest recognition rate, followed by ISOMAP and LE.

4.3. Evaluation of unsupervised source domain selection

To verify the unsupervised source domain selection method proposed in this paper, the DSADS dataset is selected for the experiment. This dataset contains five body parts, i.e., the trunk, right arm, left arm, right leg, and left leg. Assuming that one body part is designated as the target domain because it lacks data labels, the remaining four can be selected as source domains.


Table 3
MK-MMD distance between the left arm and other body parts when the left arm is the
target domain.
Source domain Trunk Right arm Left leg Right leg
MK-MMD distance 0.23834 0.15579 0.16127 0.19423

Table 4
The recognition accuracy of other body parts as the source domain when the left arm
is the target domain.
Source domain Trunk Right arm Left leg Right leg
Accuracy of recognition 62.50% 82.83% 79.67% 76.83%

The left arm is randomly selected as the target domain. The source domain with the shortest distance to the target domain, that is, the one with the smallest difference in data distribution, is then selected as the final source domain from the remaining four candidates, namely the trunk, right arm, left leg and right leg. In this experiment, MK-MMD is used to measure the distance between the multiple source domains and the target domain, and the resulting distance is adopted as the benchmark for the next step of transfer.

Fig. 6. Recognition accuracy of four dimensionality reduction algorithms under different target dimensions.

Table 3 shows the MK-MMD distances from the remaining four
source domains to the left arm. It can be seen that the closest limb
part to the left arm is the right arm, so the right arm is selected as the
source domain.
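
The selection step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: MK-MMD is approximated by an equally weighted sum of Gaussian kernels over a fixed bandwidth bank (both the weighting and the bandwidths are hypothetical choices), the biased squared-MMD estimate is used, and the names `mk_mmd2` and `select_source` are introduced here only for illustration. The data are synthetic stand-ins for body-part features.

```python
import numpy as np

def _sq_dists(a, b):
    """Pairwise squared Euclidean distances between rows of a and rows of b."""
    d2 = (a * a).sum(1)[:, None] + (b * b).sum(1)[None, :] - 2.0 * a @ b.T
    return np.maximum(d2, 0.0)  # clip tiny negatives caused by rounding

def mk_mmd2(x, y, bandwidths=(0.5, 1.0, 2.0, 4.0, 8.0)):
    """Biased squared MMD under an equally weighted sum of Gaussian kernels.

    Assumption: equal kernel weights and this bandwidth bank are illustrative,
    not the kernel selection used in the paper.
    """
    dxx, dyy, dxy = _sq_dists(x, x), _sq_dists(y, y), _sq_dists(x, y)
    total = 0.0
    for s in bandwidths:
        g = 1.0 / (2.0 * s * s)
        total += (np.exp(-g * dxx).mean() + np.exp(-g * dyy).mean()
                  - 2.0 * np.exp(-g * dxy).mean())
    return total / len(bandwidths)

def select_source(sources, target):
    """Return the name of the candidate source closest to the target, plus all distances."""
    dists = {name: mk_mmd2(x, target) for name, x in sources.items()}
    return min(dists, key=dists.get), dists

rng = np.random.default_rng(0)
target = rng.normal(0.0, 1.0, size=(200, 5))        # synthetic target-domain features
sources = {
    "near": rng.normal(0.1, 1.0, size=(200, 5)),    # small distribution shift
    "far": rng.normal(2.0, 1.0, size=(200, 5)),     # large distribution shift
}
best, dists = select_source(sources, target)
print(best, dists)
```

The candidate with the smallest distance is returned as the source domain, mirroring the way the right arm is chosen from Table 3.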
After the source domain is selected, the proposed transfer learning algorithm LLE+GFK+RF is used to transfer the source domain to the target domain. To verify the necessity of the source domain selection algorithm, each of the four body parts is used in turn as the source domain for transfer to the left arm, and the obtained recognition accuracy of the target domain is shown in Table 4. As can be seen from the table, the MK-MMD distance between the trunk and the left arm is the largest, and when the trunk is used as the source domain of transfer learning, the recognition rate of the left arm is the lowest. Conversely, the MK-MMD distance between the right arm and the left arm is the smallest, and the recognition rate obtained by using the right arm as the source domain is the highest. Therefore, the multi-source domain selection algorithm can select the closest source domain to the target domain to obtain the best recognition accuracy. The experimental results show that the similarity between the source and target domains is important for cross-domain learning, and it is crucial to find the correct auxiliary domain to perform successful knowledge transfer.
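
The relationship between Tables 3 and 4 can be checked directly from the reported values. The short script below only re-reads those numbers and compares two rankings of the candidate source domains: by increasing MK-MMD distance and by decreasing recognition accuracy. The variable names are illustrative.

```python
# Values copied from Table 3 (MK-MMD distance to the left arm) and Table 4 (accuracy, %).
mk_mmd = {"Trunk": 0.23834, "Right arm": 0.15579, "Left leg": 0.16127, "Right leg": 0.19423}
accuracy = {"Trunk": 62.50, "Right arm": 82.83, "Left leg": 79.67, "Right leg": 76.83}

by_distance = sorted(mk_mmd, key=mk_mmd.get)                    # closest source first
by_accuracy = sorted(accuracy, key=accuracy.get, reverse=True)  # most accurate source first

print(by_distance)
print(by_accuracy)
print(by_distance == by_accuracy)
```

Both rankings come out as right arm, left leg, right leg, trunk, so for this target domain the ordering by distance matches the ordering by accuracy for all four candidates, not only for the best and worst ones.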
Fig. 7. Recognition accuracy of three manifold learning algorithms under different values of $k$.

the proposed algorithm is compared with several classical source do-
main selection techniques, which are shown in Table 5. After selecting
the optimal source domain, the LLEGFK algorithm is employed for
transfer learning of the target domain, and the optimal recognition rate
is selected as the final comparison result.
As shown in Fig. 8, when the same transfer algorithm is used, the proposed source domain selection algorithm outperforms the existing distance measurement methods it is compared with: it selects the source domain most similar to the target domain, which improves the recognition accuracy.
Fig. 8. Recognition accuracy under different distance measurement algorithms when the left arm is the target domain.

Cross Domain Activity Recognition (CDAR) aims to label the activities of one domain using labeled data from another related domain. CDAR has several types, such as Cross-Person, Cross-Device, Cross-Environment, Cross-Location and Cross-Dataset activity recognition (Chavarriaga et al., 2013). Cross-Person Activity Recognition aims to address the differences between individuals. Cross-Device Activity Recognition enables seamless transfer across multiple devices. Cross-Environment Recognition aims to enhance the model's generalization and adaptability to different environments. Cross-Location Activity Recognition specifically refers to situations where activity labels are missing for certain body parts, and utilizes labeled data from similar body parts to obtain labels for those body parts. Cross-Dataset Activity Recognition involves applying an activity recognition model trained on


Table 5
Source domain selection (distance measurement) methods used for comparison.
Method                                          Description
A-distance (Schölkopf et al., 2007)             The correlation of local center changes
Cosine_similarity (Abdel-Basset et al., 2019)   Cosine similarity, suitable for similarity calculation of high-dimensional vectors
Wasserstein_distance (Arjovsky et al., 2017)    Also known as the earth mover's distance; indicates the similarity of two distributions

Table 6
Transfer learning methods used for comparison.
Method                        Description
CORAL (Sun et al., 2016)      Correlation alignment
JDA (Long et al., 2013)       Joint distribution adaptation
BDA (Wang et al., 2017)       Balanced distribution adaptation
TCA (Pan et al., 2010)        Transfer component analysis
STL-SAT (Chen et al., 2019)   Stratified transfer learning - Stratified activity transfer

Table 7
Accuracy of the LLEGFK method and other methods for cross-location activity recognition.
Task                 CORAL    JDA      BDA      STL-SAT   Our method
LeftArm->RightArm    68.43%   67.35%   72.86%   73.86%    85.83%
RightLeg->LeftLeg    72.96%   69.95%   76.35%   76.98%    82.50%
RightArm->Torso      52.56%   51.66%   52.50%   53.82%    57.50%
Average              64.65%   62.99%   67.24%   68.22%    75.28%

one dataset to another dataset, which encompasses differences in data collection methods, environments, and subjects. This paper includes experiments on cross-location (cross-limb position) and cross-dataset activity recognition. We first validate the efficacy of the LLEGFK algorithm on cross-location activity recognition, i.e., using labeled data from similar body parts to label body parts whose activity labels are missing.

Through cross-location comparison experiments with the transfer learning methods shown in Table 6, we verify that the proposed method can better label the unlabeled target domain data; the parameters of the comparison methods are set according to their original papers.

The unsupervised source domain selection method is used to select, from multiple source domains, the one with the smallest distribution difference from the target domain. The different transfer methods are then applied to the same transfer task. The experimental results are shown in Table 7.

4.4. Evaluation of LLEGFK for cross-domain transfer

4.4.1. Evaluation of LLEGFK transfer across datasets

To verify the efficacy of the LLEGFK algorithm on activity recognition across datasets, OPPORTUNITY, DSADS and PAMAP2 are selected for experiments. We use the sliding window technique to split the combined data, with a window length of 5 s and a half-overlapping slide. We employ the feature extraction method proposed by Hu et al. (2018) to extract features in the time domain, frequency domain, and time-frequency domain; 27-dimensional features are extracted from each sensor. Specific feature names and descriptions are listed in Table 8.

For the OPPORTUNITY dataset, five body parts including the back, left upper arm, left forearm, right upper arm and right forearm are selected for feature extraction, and a total of 405 dimensions of features are extracted. A total of 405 dimensions of features of the chest, left arm, right arm, left leg and right leg of the DSADS dataset are extracted. For the PAMAP2 dataset, 243 dimensions of features of the wrist, chest and ankle are extracted. To effectively carry out the cross-domain transfer experiment, the four common movements of lying, sitting, standing and walking in the three datasets are utilized for recognition.

Comparison with traditional machine learning approaches shows that the proposed method can largely solve the problem of missing labels in the target domain data. Meanwhile, comparison with other transfer learning methods demonstrates that the proposed method can significantly reduce the distribution difference between the source and target domains. The experimental results are shown in Table 9. The linear kernel is chosen for the SVM, the penalty factor is set to 1, the kernel function parameter is set to 100, and the maximum depth of the decision tree is set to 50. The transfer learning algorithms Balanced Distribution Adaptation (BDA), Transfer Component Analysis (TCA) and Stratified Activity Transfer (STL-SAT) use the same classifier as this paper, namely a random forest with 30 trees.

Because only the source domain data is used to train the model, and the characteristics of the target domain data are not taken into account, the recognition performance of SVM (Support Vector Machine) and DT (Decision Tree) on the target domain data is poor. BDA, TCA and STL-SAT realize transfer learning by distribution alignment, which may increase the differences between data and degrade the classification performance of the learning model (Anguita et al., 2012). After dimensionality reduction of the feature data, the GFK is used to project the feature data onto the manifold subspace for feature transformation, which avoids data distortion. Experimental results reveal that the proposed method achieves the best recognition performance.

In addition, Fig. 9 shows the recognition confusion matrices of the target domain datasets processed by the proposed transfer learning framework. More specifically, Fig. 9(a) shows the confusion matrix after the OPPORTUNITY dataset is transferred to the unlabeled DSADS target domain. It can be seen that lying is easier to identify correctly than standing, sitting and walking. Fig. 9(b) shows the confusion matrix after the DSADS dataset is transferred to the unlabeled OPPORTUNITY target domain. Compared with walking and lying, the recognition accuracy of sitting and standing is higher.

4.4.2. Evaluation of LLEGFK

To verify that each part of the framework in this paper can improve the recognition accuracy of cross-domain transfer, Random Forest (RF), LLE plus RF, GFK plus RF and the proposed method LLEGFK are respectively used for experiments. More specifically, the number of trees in RF is set to 30. Since LLE plus RF and the proposed method are both affected by parameters $k$ and $d$, the value ranges of $k$ and $d$ are set to 3–50. To ensure the accuracy and credibility of the experimental results, we take the average values of these methods over ten repetitions as the final results, which are displayed in Table 10.

In Table 10, the accuracy of directly using random forest for action recognition on the preprocessed original data is the lowest. This is because the model only learns the knowledge of the source data and can only accurately recognize data distributed like the training data. The data in the target domain is distributed differently from the source domain, so there is a large error in using the source domain data to directly recognize the actions in the target domain.

After using LLE to reduce the dimensionality of the action data in the source and target domains, the recognition accuracy of the random forest classifier is partially improved. The GFK method also improves recognition accuracy to a certain extent. However, the combination of LLE and GFK achieves the best transfer effect across the datasets, since


Fig. 9. Confusion matrix after transfer between datasets.

Table 8
Details of the extracted features.
Serial number Feature Description
1 Mean The average of the sample in the window
2 Std The standard deviation
3 Minimum The minimum value within a window
4 Maximum The maximum value within a window
5 Mode The value with the highest frequency
6 Range The difference between the maximum value and the minimum value in a window
7 Mean crossing rate The number of data exceeding the mean point in a window
8 DC DC component
9–13 Spectrum peak position Top 5 peaks after fast Fourier transform
14–18 Frequency Frequency corresponding to 5 peaks
19 Energy The norm squared
20–23 Four shape features mean, standard deviation, skewness, kurtosis
24–27 Four amplitude features mean, standard deviation, skewness, kurtosis
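
The windowing of Section 4.4.1 and the first time-domain features of Table 8 can be illustrated as follows. This is a sketch under assumptions rather than the feature extractor of Hu et al. (2018): it uses a single-channel synthetic signal sampled at 100 Hz (the rate stated for PAMAP2), 5 s windows with 50% overlap, and only five of the 27 features (mean, standard deviation, minimum, maximum and range). The function names are hypothetical.

```python
import numpy as np

def sliding_windows(signal, fs=100, win_s=5.0, overlap=0.5):
    """Split a 1-D signal into fixed-length windows with fractional overlap."""
    win = int(round(win_s * fs))                      # samples per window (500 at 100 Hz)
    step = max(1, int(round(win * (1.0 - overlap))))  # hop size (250 for half overlap)
    starts = range(0, len(signal) - win + 1, step)
    return np.stack([signal[s:s + win] for s in starts])

def time_features(windows):
    """Per-window mean, std, min, max and range (features 1-4 and 6 of Table 8)."""
    mn, mx = windows.min(axis=1), windows.max(axis=1)
    return np.column_stack([windows.mean(axis=1), windows.std(axis=1), mn, mx, mx - mn])

t = np.arange(1250) / 100.0              # 12.5 s of synthetic data at 100 Hz
signal = np.sin(2 * np.pi * 1.0 * t)     # stand-in for one accelerometer axis
windows = sliding_windows(signal)
features = time_features(windows)
print(windows.shape, features.shape)     # (4, 500) (4, 5)
```

In a full pipeline the same windows would also feed the frequency-domain and shape features of Table 8, and the per-sensor feature vectors would be concatenated across sensors and body parts.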

Table 9
Comparison of recognition accuracy of different methods on cross-dataset transfer tasks.
Task          SVM      DT       BDA      TCA      STL-SAT   LLEGFK
OPP->DSADS    25.00%   25.00%   51.46%   46.91%   55.45%    58.36%
DSADS->OPP    50.92%   51.71%   36.86%   32.68%   53.25%    61.53%
PAMAP->OPP    24.89%   11.97%   18.06%   39.02%   40.10%    52.07%
Average       36.60%   29.56%   35.46%   39.54%   49.60%    57.32%

Table 10
Comparison of action recognition accuracy rates of different algorithms under cross-dataset transfer tasks.
Task          RF       LLERF    GFKRF    LLEGFKRF
OPP->DSADS    25.15%   53.85%   48.07%   58.36%
DSADS->OPP    50.32%   57.91%   50.66%   61.53%
PAMAP->OPP    12.37%   45.83%   27.64%   52.07%
Average       29.28%   52.53%   42.12%   57.32%

LLE can remove redundant information in the process of manifold dimensionality reduction, and GFK can reduce the distribution difference between the two domains and improve the performance of transfer learning.

4.4.3. Time complexity analysis

In this section, we compare the performance of the proposed LLEGFK method with baseline transfer learning methods such as TCA, JDA, BDA, CORAL, and STL-SAT. The DSADS dataset is used as the source domain, and the target domain is OPPORTUNITY for the transfer experiments. Fig. 10 shows the comparison results of LLEGFK with different methods in terms of computational time and accuracy. The figure reveals that JDA and BDA exhibit the highest computational time consumption. The accuracy of JDA reaches 52%, while BDA lags at 37%. The reason is that BDA focuses on balancing class distributions and lacks robust adaptation across different domains. TCA follows with moderate time consumption and an accuracy of 38%. It tends to address linear feature transformations and still faces challenges in handling complex nonlinear distribution differences. STL-SAT and CORAL show lower time consumption, with accuracies of 50% and 53%, respectively. In contrast, our proposed LLEGFK method demonstrates the least computational time consumption and the highest accuracy, reaching 61%.

JDA has a high time complexity $O(Tkm^2 + TCn^2 + Tmn)$ (Long et al., 2013), where $T$, $k$, $m$, $C$ and $n$ denote the numbers of iterations, subspace bases, shared features, classes and examples, respectively. TCA exhibits a time complexity $O(MN^3L^3)$, with $M$ representing the number of nodal trajectories, $N$ denoting the number of charger's candidate locations, and $L$ being the number of charger's energy levels (Duan et al.). CORAL's time complexity is approximately $O(n^3)$. STL-SAT involves feature decomposition (time complexity $O(n^3)$) and subspace transformation, resulting in lower time consumption than TCA but higher than CORAL. The proposed LLEGFK algorithm demonstrates the best performance in terms of computational time. MK-MMD has the lowest time complexity, $O(n)$, and LLE has a time complexity of $O(n^2)$. Due to the dimensionality reduction achieved by LLE, the input data size for GFK becomes smaller, resulting in reduced computational workload and saved time. As a result, LLEGFK exhibits the lowest computational time consumption.
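
To make the dependence of the GFK step on its input dimension concrete, the closed-form kernel of Eqs. (18)-(22) can be sketched in NumPy as below. This is an illustrative sketch, not the authors' MATLAB code: the function names `geodesic_flow_kernel` and `project_along_geodesic` are hypothetical, the inputs are assumed to be orthonormal bases $P_S$ and $P_T$ of size $D \times d$, and $\lambda_{2i}$ is taken as $(\cos 2\theta_i - 1)/(2\theta_i)$. The product $R_S U_2$ is obtained from $(I - P_S P_S^T) P_T V$, so $R_S$ is never formed explicitly.

```python
import numpy as np

def geodesic_flow_kernel(P_s, P_t, eps=1e-12):
    """Closed-form GFK matrix G (D x D) from orthonormal bases P_s, P_t (D x d)."""
    # Eq. (18): P_s^T P_t = U1 diag(cos theta) V^T gives the principal angles.
    U1, cos_theta, Vt = np.linalg.svd(P_s.T @ P_t)
    theta = np.arccos(np.clip(cos_theta, 0.0, 1.0))
    sin_theta = np.sin(theta)

    # R_s U2 diag(sin theta) = -(I - P_s P_s^T) P_t V, so divide column-wise by sin theta.
    M = -(P_t - P_s @ (P_s.T @ P_t)) @ Vt.T
    RU2 = M / np.where(sin_theta > eps, sin_theta, 1.0)  # theta ~ 0 columns stay ~ 0
    PU1 = P_s @ U1

    # Eq. (21); np.sinc(x) = sin(pi x) / (pi x) handles the theta -> 0 limit.
    lam1 = 1.0 + np.sinc(2.0 * theta / np.pi)
    lam2 = -sin_theta * np.sinc(theta / np.pi)   # equals (cos(2 theta) - 1) / (2 theta)
    lam3 = 1.0 - np.sinc(2.0 * theta / np.pi)

    # Eq. (20), expanded block by block.
    return ((PU1 * lam1) @ PU1.T + (PU1 * lam2) @ RU2.T
            + (RU2 * lam2) @ PU1.T + (RU2 * lam3) @ RU2.T)

def project_along_geodesic(X, G):
    """Eq. (22) applied to the rows of X: v = sqrt(G) x."""
    w, Q = np.linalg.eigh((G + G.T) / 2.0)
    sqrt_G = (Q * np.sqrt(np.clip(w, 0.0, None))) @ Q.T
    return X @ sqrt_G   # sqrt_G is symmetric, so each row x maps to sqrt(G) x

rng = np.random.default_rng(0)
D, d = 6, 2
P_s, _ = np.linalg.qr(rng.standard_normal((D, d)))   # stand-in source subspace basis
P_t, _ = np.linalg.qr(rng.standard_normal((D, d)))   # stand-in target subspace basis
G = geodesic_flow_kernel(P_s, P_t)
X = rng.standard_normal((4, D))                      # four synthetic feature vectors
V = project_along_geodesic(X, G)
print(G.shape, V.shape)
```

Every product above involves only $d \times d$, $D \times d$ or $D \times D$ arrays, so the cost of this step is governed by the feature dimension $D$ passed to GFK. In the pipeline described here that input is the LLE-reduced data, which is consistent with the argument that reducing the dimension first keeps the GFK step cheap.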


Fig. 10. Comparison diagram of experimental accuracy and running time.

5. Discussion

The main contribution of this paper is the implementation of a human behavior recognition model that transfers from the source domain data to the target domain, thereby improving the model's accuracy in cross-domain human behavior recognition.

From the perspective of transferring the source domain model, data provided by highly correlated body parts can assist in action recognition in the target domain after transfer, while data from unrelated body parts may lead to negative transfer. In this paper, an unsupervised source domain selection method based on MK-MMD is used, which combines multiple kernel functions and selects the optimal and reasonable kernel to calculate distances. By analyzing the distances between the target domain and multiple source domains, the source domain most relevant to the target domain is selected as the source domain in the final transfer task, effectively reducing the impact of unrelated body movement data during the transfer process. At the same time, data redundancy during the transfer process can affect the effectiveness and efficiency of model transfer. To further reduce the negative impact of data redundancy on transfer, LLE performs manifold dimensionality reduction on the best source domain and target domain data, removing redundant information from the data. LLE can effectively reduce the dimensionality of data while preserving the internal structure of body data, thus improving the efficiency of the model's recognition. GFK projects low-dimensional data from the source and target domains onto the Grassmann manifold space through domain adaptation, reducing the distribution discrepancy between domains and enabling the source domain model to apply to the target domain effectively. It mitigates the influence of distribution differences on the model's performance in traditional transfer processes and addresses the issue of action recognition in the target domain with unlabeled data.

From the experimental results of cross-domain human action recognition classification, the method proposed in this paper demonstrates the best recognition performance in comprehensive evaluation. BDA, TCA, and STL-SAT utilize distribution alignment for transfer learning, which increases the differences between the data and reduces the classification performance of the learning model. In contrast, the approach presented in this paper performs LLE dimensionality reduction on the feature data and then utilizes GFK to project the feature data onto a manifold subspace for feature transformation. This approach avoids data distortion and enhances the accuracy of action recognition after model transfer.

In practical application scenarios, the method utilizes domain similarity between different body parts (source domain) and target parts (target domain) to construct a transfer model, while reducing the distribution differences between the selected source domain body parts and the target domain body parts. This approach enhances the recognition performance of cross-domain transfer. The method can reduce the number of sensors, minimizing the impact of multiple sensors on the user's actual experience and making future sensor-based recognition easier to generalize. The method is also effective in addressing the impact of partial body sensor data loss on recognition, efficiently enabling the construction of high-performance learning models in scenarios with scarce labels and different data distributions. It has significant application potential in many fields such as healthcare, motion monitoring, and rehabilitation training.

6. Conclusion

This paper proposes a manifold learning-based action recognition method. Firstly, an unsupervised source domain selection method based on MK-MMD is investigated. Then, the redundant data is removed using the LLE algorithm, and the source domain data and target domain data are mapped to a manifold subspace for transfer using the GFK. Experimental results demonstrate that this method achieves good performance in cross-dataset transfer and cross-limb position transfer experiments on three datasets of human activities. The method can annotate unlabeled data in the target domain while preserving the original data structure. Therefore, it is believed that the proposed method can achieve significant recognition performance in cross-domain recognition problems where there are differences between the source and target domains, and it has great potential for applications.

Despite achieving satisfactory recognition performance, there are still some limitations to consider. While this method can obtain good recognition results in most cases, there are still challenges when it comes to considering fine-grained local information in the data. Additionally, when applying the model in practical settings, there can be interference caused by factors such as data collection environments and sensors, which needs to be appropriately addressed.

Future research should focus on further optimizing the model to address the aforementioned limitations and consider capturing fine-grained information in the data to make the model more robust. Additionally, in future work, it is important to consider various scenarios that may arise in practical applications, including dealing with missing data, noise, and other challenges. A more comprehensive analysis of the application context is needed, and further exploration should be conducted in cross-domain recognition for other categories.

CRediT authorship contribution statement

Huaijun Wang: Conceptualization, Methodology. Jian Yang: Data curation, Writing – original draft. Changrui Cui: Visualization. Pengjia Tu: Investigation. Junhuai Li: Supervision, Project administration. Bo Fu: Writing – review & editing. Wei Xiang: Writing – review & editing.

Declaration of competing interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Junhuai Li reports financial support was provided by the National Natural Science Foundation of China. Lei Yu reports financial support was provided by Natural Science Foundation of Shaanxi Province of China. Junhuai Li reports financial support was provided by Xi'an Science and Technology Plan Project. Huaijun Wang reports financial support was provided by Key research and development program of Shaanxi Province.


Data availability

Data will be made available on request.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61971347), Doctoral Innovation Foundation of Xi’an University of Technology (No. 252072118), Natural Science Foundation of Shaanxi Province of China (No. 2021JM-344), Xi’an Science and Technology Plan Project (2022JH-RYFW-007), Key research and development program of Shaanxi Province (2022SF-353).
