Human Activity Recognition Based On Local Linear Embed - 2024 - Expert Systems W
Keywords: Human Activity Recognition; Transfer learning; Locally Linear Embedding; Geodesic Flow Kernel; Grassmann manifolds
Human Activity Recognition (HAR) plays a crucial role in various applications (e.g., medical treatment, video surveillance and sports monitoring). Transfer learning is a promising solution to cross-domain identification problems in HAR. However, existing methods usually ignore the negative transfer caused by using the features of each source domain in equal proportions, as well as the distribution difference between the source and target domains. In this paper, an HAR method based on manifold learning is proposed. Firstly, the similarity between the target domain and multiple source domains is calculated using the Multi-Kernel-Maximum Mean Difference (MK-MMD), and the source domain most similar to the target domain is selected as the optimal source domain in the transfer task. Secondly, Locally Linear Embedding (LLE) is leveraged to reduce the dimensionality of both the optimal source domain and the target domain data to remove redundant information, and the Geodesic Flow Kernel (GFK) is utilized to project the low-dimensional data into the Grassmann manifold space and reduce the distribution difference between the two domains. Finally, the action model trained on the source domain is applied to the target domain. Three public datasets (i.e., PAMAP2, OPPORTUNITY and UCI DSADS) are utilized to validate the effectiveness of the proposed approach. Experimental results demonstrate that the proposed HAR method can predict a large number of unlabeled samples in the target domain while preserving the original data structure.
The code (and data) in this article has been certified as Reproducible by Code Ocean: (https://codeocean.com/). More information on the Reproducibility
Badge Initiative is available at https://www.elsevier.com/physical-sciences-and-engineering/computer-science/journals.
∗ Corresponding author at: School of Computer Science and Engineering, Xi’an University of Technology, No. 5 South Jinhua Road, Xi’an, 710048, Shaanxi,
China.
E-mail addresses: wanghuaijun@xaut.edu.cn (H. Wang), 2211221129@stu.xaut.edu.cn (J. Yang), 2191221061@stu.xaut.edu.cn (C. Cui),
tupengjia@stu.xaut.edu.cn (P. Tu), lijunhuai@xaut.edu.cn (J. Li), 105222@xaut.edu.cn (B. Fu), w.xiang@latrobe.edu.au (W. Xiang).
https://doi.org/10.1016/j.eswa.2023.122696
Received 17 February 2023; Received in revised form 21 November 2023; Accepted 21 November 2023
Available online 23 November 2023
0957-4174/© 2023 Elsevier Ltd. All rights reserved.
H. Wang et al. Expert Systems With Applications 241 (2024) 122696
and linear acceleration application based on Android OS to collect raw data for human behavior recognition. Jiang et al. (2022) used wearable devices to collect surface EMG signals and inertial signals and recorded a multi-category dataset of 20 gestures to accurately classify gestures. Alqarni (2021) leveraged wearable sensors to monitor patients' physical and physiological parameters for medical diagnosis. Kuo et al. (2009) bound an acceleration sensor to the ankle joint to calculate the user's walking distance, speed and energy consumption.
Traditional machine learning and deep learning require not only that the data follow the same distribution but also that sufficient labeled data be available to train the model. However, data collected in practice always contain a large amount of unlabeled data, and data for the same action come from different users or from sensors worn at slightly different positions. These factors affect the overall distribution of the sensor data and degrade the performance of the model. This is because most deep learning models are designed to solve specific tasks. If the data distribution changes, these models must be rebuilt, and it is difficult to reconstruct and retrain them under computational power and time constraints. Transfer learning, however, can use pre-trained networks and apply them to custom tasks, as well as transfer knowledge learned from previous tasks.
Ye et al. (2020) proposed an instance-based cross-domain transfer model and trained action classifiers using supervised learning patterns. Alinia et al. (2020) proposed a human action recognition method based on feature transfer, which learned the structural similarity between events in the source domain and the target domain, used the network graph constructed for the two domains, extracted the structural relationship between the core clusters of the two domains, and assigned appropriate labels to the samples of the target domain. Karaman et al. (2021) used the model fine-tuning method of transfer learning to extract speech data sets to achieve enhanced Parkinson's recognition. Gong et al. (2012) proposed a feature transfer learning method for human behavior recognition and adopted a kernel-based method for cross-domain transfer. It simulates the transfer between domains by integrating a large number of subspaces that describe the change from the source domain to the target domain, so that the model can learn new shallow features and reduce domain differences. Li et al. (2022) proposed a transfer learning method to train a high-fidelity and subject-independent child activity recognition model using data from the adult domain, which effectively improved the classification accuracy.
Transfer learning can identify the target domain data according to the data information of the source domain. Nevertheless, how to use the domain similarity between the source and target domains to construct the transfer model, and how to improve the recognition performance of cross-domain transfer, are still two major problems facing transfer learning for behavior recognition. The following challenges exist in studying the above issues.
We only have the original activity data in the target domain without actual activity labels, and the missing labels will seriously affect the classification effect of the model. The data distribution similarity of each limb part is different, and the criterion for judging similarity cannot rely solely on the symmetric relationship of limb parts. If the features of each source domain are utilized in equal proportion, the difference between the domains cannot be exploited, which may cause negative transfer. Hence, it is necessary to study a method to measure the similarity between the source and target domains. To avoid negative transfer, the source domain most similar to the target will be selected to train the learning model. After selecting an appropriate source domain limb, it is difficult to construct an efficient machine learning model using both the source and target domains due to the distribution differences of different body parts. Many methods are based on the premise that the distributions of the source and target domains are the same. Different distributions will lead to overfitting, which will affect the recognition performance.
To address the above challenges, this paper proposes a cross-domain human behavior recognition method. The major contributions of this work are as follows:
1. To solve the problem of missing labels in the target domain sample data in human behavior recognition, the label prediction of target domain data is realized by domain transfer with the Geodesic Flow Kernel (GFK) framework in the manifold space;
2. In multi-source domain transfer, using the features of each source domain in equal proportion can cause negative transfer. As such, the Multi-Kernel-Maximum Mean Difference (MK-MMD) is used as an unsupervised source domain selection method that measures the similarity of body movements between the source and target domains. The source domain most similar to the target domain is then chosen for the final transfer task;
3. To address the distribution differences between data, we propose an unsupervised modeling method that combines Locally Linear Embedding (LLE) with GFK. The LLE algorithm is used to extract the common schema information of the source and target domain data, respectively, and this common schema information is mapped to the manifold space. The GFK is then applied for domain transfer in the manifold space, so that the distribution difference between the data is reduced and the problem of missing labels in the target domain is solved.
2. Related work
2.1. Human behavior recognition based on manifold learning
Manifold learning has advantages in dealing with behavioral data containing nonlinear structures, because it can preserve the internal nonlinear structure of data. Because the high dimensionality of features contained in video, image sequence or sensor data places pressure on computing and storage, researchers use manifold learning to achieve a nonlinear mapping of action features from high to low dimensions to obtain effective information and improve the efficiency of computation.
Long et al. (2022) proposed a domain adaptive model based on optimal transport on the Grassmann manifold and designed a simplified model that keeps the necessary adaptive characteristics. Cross-domain recognition experiments on several public datasets show that the model achieves optimal performance in every case. Chen et al. (2022) proposed a deep heterogeneous 3D manifold network for behavior recognition that takes advantage of the Riemannian manifold in describing 3D motion. By using the construction of the graph to guide the backbone network to mine more discriminative nonlinear spatiotemporal features, experiments on several mainstream skeleton datasets achieved considerable results. Isomap depends on the distance measure of multidimensional scaling, and it can enhance spatiotemporal relationships in motion changes. Jenkins and Matarić (2004) extended ISOMAP to eliminate the ambiguity of the proximal data sample points and make the spatially distal data points correspond, thus revealing the spatio-temporal characteristics in the data structure. Jia and Yeung (2008) proposed a local spatio-temporal discriminant embedding method, which projected image frames of action sequences into the embedding manifold space according to different categories to form a local temporal subspace; this method can improve the classification of actions with similar spatio-temporal shapes. Hongsheng et al. (2015) calculated the dense optical flow field in action images, used Convolutional Neural Networks (CNN) combined with an attention pooling mechanism to capture the region of interest in continuous video frames, and used a Riemannian manifold learning method to calculate the spatial motion variation between feature vectors of different frames. The manifold characteristics of motion changes were obtained, and the motion of the target was modeled from multiple perspectives to realize behavior recognition. Wang et al.
(2014) proposed an action recognition framework based on manifold learning. In the training stage, the Laplacian Eigenmaps (LE) manifold dimensionality reduction algorithm was used to reduce the dimensionality of high-dimensional depth image features obtained from a Kinect device. In the recognition stage, the nearest neighbor interpolation method was used to map the test data to the low-dimensional manifold space, and the improved Hausdorff distance was used to measure the similarity between the training set and the test set in the low-dimensional space, which effectively improved the recognition accuracy.
2.2. Transdomain human behavior recognition based on transfer learning
Traditional machine learning and deep learning methods require not only that the training and test data have the same distribution but also that a sufficient number of training samples be available. However, it is time-consuming and expensive to obtain sufficient labeled training samples. When using sensors to collect behavioral data, different locations of bound sensor units and different types of sensor units will cause data distribution differences, which will greatly degrade the performance of the classification model. Transdomain human behavior recognition based on transfer learning can solve this problem well. Transfer learning can discover the data similarity between the source domain and the target domain. The source domain refers to the domain with a large number of labels, and the target domain is the object to be assigned labels. The classification of a target domain with few or no labels can be improved with the help of the knowledge of the source domain.
Xu et al. (2023) studied the transfer problem of cross-person and cross-location based on wearable sensor-based human activity recognition, and proposed a hybrid model to achieve cross-domain transfer. The model combines CNN, a deep adaptive network with a gradient regression layer and an adaptive classifier based on the Online sequential extreme learning machine (OS-ELM). Experiments show that the model is more efficient than standard CNN and deep transfer learning models, where the accuracy of cross-location and cross-location transfer is improved by 12% and 20%, respectively. Kang et al. (2022) segmented target gestures from the original signals coupled in the process of human dynamic walking and studied a transfer learning method based on distribution adaptation. Gesture recognition is realized by domain transfer between dynamic walking and static standing scenes, and the accuracy of gesture recognition is improved by 15.1%. Lin et al. (2017) constructed subspaces for classes in the source domain and anchor subspaces in the target domain. A cost function is minimized to estimate the overlap and topological consistency between the source domain and the target domain as well as between subspaces, in order to assign corresponding class labels to anchor subspaces. Then, the joint subspace is constructed by combining the anchor subspace with the atomic space, and the data samples belonging to the same joint subspace are used to train a support vector machine for the recognition of unlabeled data in the target domain. Qin et al. (2019) proposed an adaptive spatial-temporal transfer learning method. It adaptively evaluates the relative importance of the marginal and conditional probability distributions to learn the spatial characteristics of different action datasets, captures temporal features through incremental manifold learning, and provides a new solution for multiple source domain selection. Wang et al. (2018) proposed a general framework of Stratified Transfer Learning (STL), which uses inter-class affinity for intra-class knowledge transfer and obtains pseudo-labels of target domains by majority voting; the classification accuracy was improved by 7.68%. Yao et al. (2022) proposed an effective method called Discriminative Manifold Distribution Alignment (DMDA). This method leverages the concept of distribution alignment for domain adaptation and improves the discriminative model by learning geometrical structures in the manifold space. By also addressing the uncertainty introduced by pseudo-labels in the target domain, this approach enables effective domain adaptation. Kang et al. (2020a) employed a generative adversarial approach to address cross-domain problems and proposed the Generative Adversarial Distribution Matching (GADM) framework. This framework resolves the issue of underperforming generators or discriminators. By enhancing the objective function with the inclusion of a cross-domain difference distance and further minimizing the differences through competition between the generator and discriminator, it effectively reduces the disparities in cross-domain distributions. Kang et al. (2020b) proposed the Enhanced Subspace Distribution Matching (ESDM) method, which leverages label information to enhance the distribution matching between the source and target domains in a shared subspace. During the kernel principal component analysis (kernel PCA) dimensionality reduction process, it reduces the differences in the conditional and marginal distributions in the shared subspace to improve cross-domain subspace distribution matching. Mutegeki and Han (2019) combined CNN with a Long Short-Term Memory network (LSTM) and obtained 90.8% accuracy in the experiment on the UCI HAR dataset, which provided ideas for solving the problem of too few training data in the target domain dataset. Khan and Roy (2017) studied the problem of identifying activities across different environments in the case of limited labeled data in the target domain. They proposed the TransAct model, which combines an anomaly detection method with clustering to address the challenge of identifying unknown activities whose data distribution differs from the source samples; the recognition accuracy reached 81%. Zhao et al. (2010) combined a Decision Tree and the K-means clustering algorithm to propose TransEMDT (Transfer learning EMbedded Decision Tree) for cross-user behavior recognition, and embedded this method in a portable activity reporting system based on smartphones. The system can use unlabeled samples to adapt the activity recognition model and build personalized models for new users. Feuz and Cook (2017) realized real-time and seamless transmission of learned activity information between sensor platforms by building a Personalized activity ECOsystem (PECO). The multi-view transfer learning algorithm proposed in that work was used to realize information transfer between sensor platforms.
3. Method
There are nonlinear structures in the human behavior data collected by sensors, the dimension of the sample features is high, and the redundant information may reduce the recognition accuracy of the classifier. The LLE manifold dimensionality reduction algorithm not only preserves the local characteristics of the original high-dimensional feature sequence while reducing dimensionality, but also reflects the manifold structure of complex data.
The distribution difference between the data in the source and target domains makes it impossible to directly classify the target domain using a classifier trained in the source domain. The GFK transforms data into Grassmann manifolds, builds geodesic flows, and integrates infinitely many continuous subspaces between the source and target domains. Therefore, the GFK learns new feature representations that are robust to intra-domain variations, and uses subspaces to reveal potential differences and commonalities between domains and to reduce inter-domain differences.
In this paper, LLE is combined with the GFK algorithm. LLE is leveraged to extract useful features and to reduce redundancy and complexity in the labeled source domain action data and the unlabeled target domain action data. The GFK is then used for transfer learning on the feature data after dimensionality reduction, to obtain the continuous subspaces along the geodesic direction and to extract the common information between the source and target domains. The transferred source domain data and labels are utilized to train a random forest model, which then predicts the target domain data after transfer mapping. The proposed approach not only realizes unsupervised cross-domain transfer of the target domain actions, but also improves the generalization ability of the model. The framework of the method is illustrated in Fig. 1.
Fig. 1. Cross-domain transfer human action recognition framework based on manifold learning.
3.1. Unsupervised source domain selection based on MK-MMD
When sensors are used to monitor human motion, if the data of one body part is missing, the data of other body parts are used to help predict its movement label $y_t$. In the case of multi-source domain transfer, using the features from each source domain in equal proportions can lead to negative transfer due to unknown domain correlations. Therefore, it is necessary to identify the most strongly correlated source domain to perform the transfer effectively (Zhang et al., 2022a). Suppose that $D_t = \{x_t^j\}_{j=1}^{n_t}$ is the data of the lost body part, and that the data of $M$ of the $C$ body parts can be used as labeled source domains to assist label prediction, denoted by $\{D_S^i\}_{i=1}^{M}$. Define each source domain as $D_S^i = \{x_S^j, y_S^j\}_{j=1}^{n_s^i}$, and use MK-MMD to select the best source domain $D_S(K)$ for the final transfer.
The MMD is a nonparametric method for measuring the difference between two distributions, and it has been widely used in transfer learning methods. MK-MMD is based on the MMD. The feature representations of the source domain and target domain are mapped into the Reproducing Kernel Hilbert Space (RKHS), and then the difference between the mean values of the two types of data is calculated. MK-MMD is used to measure the distance between the source and target domains as follows:
$$MMD_k^2(p, q) \triangleq \left\| E_p\left[\phi\left(x^p\right)\right] - E_q\left[\phi\left(x^q\right)\right] \right\|_{H_k}^2 \tag{1}$$
where $H_k$ is a Hilbert space defined on the topological space $X$ whose reproducing kernel is $k$; $p$ and $q$ are the distributions of the target and source domains, respectively; and $X^S = \{x_i^s\}_{i=1}^{N_s}$ and $X^T = \{x_i^t\}_{i=1}^{N_t}$ are the sample sets in $p$ and $q$, respectively.
MK-MMD combines multiple kernel functions linearly and computes the distance with the optimal combined kernel. The kernel functions are combined as follows:
$$K = \left\{ k = \sum_{i=1}^{m} \beta_i k_i : \sum_{i=1}^{m} \beta_i = 1,\ \beta_i \geq 0,\ \forall i \right\} \tag{2}$$
where the hyperparameter $m$ is the number of positive semidefinite kernels, and the constraints on $\beta_i$ guarantee a valid multi-kernel $k$. The empirical estimate of MK-MMD is as follows:
$$MMD\left(D_p, D_q\right) = \left[ \frac{1}{m^2} \sum_{i=1}^{m} \sum_{j=1}^{m} k\left(x_i^p, x_j^p\right) + \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} k\left(x_i^q, x_j^q\right) - \frac{2}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} k\left(x_i^p, x_j^q\right) \right]^{1/2} \tag{3}$$
In this paper, MK-MMD is used to measure the distance between the body movements of multiple source domains and the target domain, and the body movements of the source domain with the highest similarity to the target domain are selected for transfer. This helps to better solve the problem of missing labels of sample data in the target domain in the next step.
3.2. Cross-domain transfer method based on GFK
With the unsupervised source domain selection method described in the previous section, the movement most suitable for the body part in the target domain can be selected among multiple body movements in the source domain to improve the transfer effect. On this basis, LLE is leveraged to reduce the dimension of the action data, so that the redundant features and noise in the original data are eliminated while the original local topology is preserved. Then the GFK is used to project the action features after dimensionality reduction into the subspace for manifold feature transformation, with the objective of reducing the data offset between the domains and realizing domain transfer. The model after transfer can solve the problem of missing action data labels in the target domain.
3.2.1. Local linear embedding algorithm
LLE is an unsupervised manifold dimensionality reduction algorithm. It assumes that data points satisfy a linear relation in their local neighborhood, and learns a compact representation of high-dimensional data while considering the local relationship between samples. The nearest neighbors are selected in accordance with the Euclidean distance in the high-dimensional space for linear reconstruction. Then the structural features of the data in the higher dimensional space are mapped to a lower dimensional space. While realizing the low-dimensional embedding, the nearest neighbor relationship between data can be maintained and the whole structure of the nonlinear manifold can be learned. The LLE algorithm is illustrated in Fig. 2.
The LLE algorithm is detailed as follows:
Let $X = \{x_1, x_2, \dots, x_N\}$ be a set of points in a higher dimensional space $R^D$, where the number of samples is $N$ and the dimension is $D$.
1. The $k$ nearest neighbors of each data point are selected in the high-dimensional space.
According to the Euclidean distance in Eq. (4), we can compute the set $N_i = \{x_{n_{i1}}, \dots, x_{n_{ik}}\}$ of neighbors of a point $x_i$.
$$d_{ij} = \sum_{k=1}^{D} \left| x_{ik} - x_{jk} \right|^2 \tag{4}$$
2. The optimal reconstruction weight of each point is calculated from its nearest neighbors.
Each sample point can be reconstructed by linear weighting of its $k$ neighbors so as to minimize the sum of the squared errors, with the weights summing to 1. For sample point $x_i$, if $x_j$ does not belong to its neighboring sample points, set $w_{ij} = 0$. Otherwise, the weights are obtained by solving the constrained least-squares problem in Eq. (5).
Fig. 2. LLE algorithm structure.
$$\min \varepsilon(W) = \sum_{i=1}^{N} \left\| x_i - \sum_{j=1}^{k} w_{ij} x_j \right\|^2 \quad s.t. \quad \sum_{j=1}^{k} w_{ij} = 1 \tag{5}$$
Eq. (6) defines the local covariance matrix $Q_i$ of sample point $x_i$, a real symmetric positive semidefinite matrix built from $G_i = \left[x_{n_{i1}} - x_i, \dots, x_{n_{ik}} - x_i\right]$. Using $Q_i$, Eq. (5) can be written in matrix form as Eq. (7), where $\Gamma = [1, \cdots, 1]^T$ and $w_i = \left(w_{i1}, w_{i2}, \dots, w_{ik}\right)^T$ is the local reconstruction weight vector of sample point $x_i$. Eq. (8) is obtained by applying the Lagrange multiplier method to Eq. (7).
$$Q_i = G_i^T G_i = \begin{bmatrix} \langle x_{n_{i1}} - x_i, x_{n_{i1}} - x_i \rangle & \cdots & \langle x_{n_{i1}} - x_i, x_{n_{ik}} - x_i \rangle \\ \vdots & \ddots & \vdots \\ \langle x_{n_{ik}} - x_i, x_{n_{i1}} - x_i \rangle & \cdots & \langle x_{n_{ik}} - x_i, x_{n_{ik}} - x_i \rangle \end{bmatrix} \tag{6}$$
$$\min w_i^T Q_i w_i \quad s.t. \quad w_i^T \Gamma = 1 \tag{7}$$
$$L(w) = \sum_{i=1}^{N} w_i^T Q_i w_i + \lambda \left( w_i^T \Gamma - 1 \right) \tag{8}$$
Taking the derivative with respect to $w$ in Eq. (8) and setting it to zero gives
$$2 Q_i w_i + \lambda \Gamma = 0 \tag{9}$$
Solving Eq. (9) gives Eq. (10), where $c$ is a constant. Normalizing the weight coefficients so that they sum to 1 finally yields the optimal reconstruction weights of $x_i$ in Eq. (11):
$$w_i = c\, Q_i^{-1} \Gamma \tag{10}$$
$$w_i^* = \frac{Q_i^{-1} \Gamma}{\Gamma^T Q_i^{-1} \Gamma} \tag{11}$$
3. Low-dimensional embedding is carried out while preserving the local geometry represented by the reconstruction weights.
$$\min \sum_{i=1}^{N} \left\| y_i - \sum_{j=1}^{k} w_{ij} y_{n_{ij}} \right\|^2 \quad s.t. \quad \sum_{i=1}^{N} y_i = 0, \quad \frac{1}{N} \sum_{i=1}^{N} y_i y_i^T = I \tag{12}$$
To find the eigenvector matrix $Y = \left[y_1, \dots, y_N\right]$ under these constraints, a new sparse symmetric positive semidefinite matrix $M = (I - W)^T (I - W)$ is constructed from the matrix $W$, and Eq. (12) is rewritten as
$$\min \varepsilon(Y) = \sum_{i=1}^{N} \left\| Y I_i - Y W_i \right\|^2 = \mathrm{tr}\left( Y M Y^T \right) \tag{13}$$
where $\varepsilon(Y)$ and $I$ are the loss function and the identity matrix, respectively, and $I_i$ and $W_i$ are column $i$ of the matrices $I$ and $W$, respectively. Combining the Lagrange multiplier method with the constraint conditions reduces Eq. (13) to an eigenvalue problem for $M$.
3.2.2. Geodesic flow kernel algorithm
The geodesic flow kernel model achieves domain transfer by integrating infinitely many subspaces between the source and target domains, which describe incremental changes in geometric and statistical properties from the source to the target domain. The principle is shown in Fig. 3.
It is assumed that the action feature data in the source domain after dimensionality reduction by LLE is $Z_S$, and the action feature data in the target domain is $Z_T$. The GFK consists of three main steps:
1. Determine the optimal dimension of the subspace.
Calculate the PCA subspaces of $Z_S$ and $Z_T$, denoted $PCA_S$ and $PCA_T$, respectively, and combine $Z_S$ and $Z_T$ into one data set to calculate the subspace $PCA_{S+T}$. If the distribution difference between the action data in the source and target domains is small, the distance between the three subspaces on the Grassmann manifold is small.
$$D(d) = 0.5 \left[ \sin \alpha_d + \sin \beta_d \right] \tag{15}$$
where $\alpha_d$ and $\beta_d$ represent the angles between $PCA_S$ and $PCA_{S+T}$ and between $PCA_T$ and $PCA_{S+T}$, respectively. $D(d)$ is the total measure of these two angles. The total measure grows with the value of the angles and with the distance between the two domains. We then use $D(d)$ to determine the best dimension $d$,
$$d^* = \min \{ d \mid D(d) = 1 \} \tag{16}$$
2. Construct geodesic curves.
Denote by $P_S \in R^{D \times d}$ and $P_T \in R^{D \times d}$ the subspace bases of the action feature data $Z_S$ in the source domain and $Z_T$ in the target domain after dimensionality reduction, respectively; both are orthogonal matrices. Also, $R_S \in R^{D \times (D-d)}$ is the orthogonal complement of $P_S$. The geodesic function is as follows:
$$\Phi(t) = P_S U_1 \Gamma(t) - R_S U_2 \Sigma(t) \tag{17}$$
$U_1 \in R^{d \times d}$ and $U_2 \in R^{(D-d) \times d}$ are orthogonal matrices, both given by Eq. (18):
$$P_S^T P_T = U_1 \Gamma V^T, \quad R_S^T P_T = -U_2 \Sigma V^T \tag{18}$$
where $\Gamma$ and $\Sigma$ are $d \times d$ diagonal matrices whose diagonal elements are $\cos \theta_i$ and $\sin \theta_i$ ($i = 1, 2, \dots, d$), respectively, and $\theta_i$ is the principal angle between $P_S$ and $P_T$.
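The subspace quantities used by the GFK step in Section 3.2.2 can be illustrated with a short NumPy sketch. It computes PCA bases, the principal angles whose cosines are the singular values of $P_S^T P_T$ (the diagonal of $\Gamma$ in Eq. (18)), and the measure $D(d)$ of Eq. (15). The function names are hypothetical, and taking $\alpha_d$ and $\beta_d$ to be the $d$-th (largest) principal angle is an assumption about the text; the sketch does not build the full geodesic flow kernel.

```python
import numpy as np


def pca_basis(Z, d):
    """Orthonormal D x d basis of the top-d PCA subspace of Z (n_samples x D)."""
    Zc = Z - Z.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Zc, full_matrices=False)
    return Vt[:d].T


def principal_angles(P_S, P_T):
    """Principal angles (ascending, in radians) between span(P_S) and span(P_T)."""
    # Singular values of P_S^T P_T are cos(theta_i), cf. Eq. (18).
    cosines = np.linalg.svd(P_S.T @ P_T, compute_uv=False)
    return np.arccos(np.clip(cosines, 0.0, 1.0))


def subspace_disagreement(Z_S, Z_T, d):
    """D(d) of Eq. (15); assumes alpha_d, beta_d are the d-th principal angles."""
    P_S = pca_basis(Z_S, d)
    P_T = pca_basis(Z_T, d)
    P_ST = pca_basis(np.vstack([Z_S, Z_T]), d)
    alpha_d = principal_angles(P_S, P_ST)[-1]
    beta_d = principal_angles(P_T, P_ST)[-1]
    return 0.5 * (np.sin(alpha_d) + np.sin(beta_d))


# Toy check in R^3: two planes that share the first axis and are
# tilted against each other by phi around it.
phi = 0.3
P_S = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
P_T = np.array([[1.0, 0.0], [0.0, np.cos(phi)], [0.0, np.sin(phi)]])
angles = principal_angles(P_S, P_T)
print(angles)
```

In the toy case the recovered angles are approximately 0 and `phi`, as expected for two planes sharing one direction. `subspace_disagreement` stays close to 0 when the two domains span the same subspace and approaches 1 as the $d$-th directions become orthogonal.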
Table 1
Introduction of the software and hardware environment used in the experiment.
Software and hardware | Version information | Application in experiment
Server | Equipped with Intel Core i5-11400F CPU and 2.60 GHz RAM | The hardware environment of the experiment
Pycharm | Pycharm 2022.1.4 | Software environment for the experiments in the dimensionality reduction part
MATLAB | MATLAB R2021a | The software environment for the dataset processing, the GFK part of the algorithm, and the experiments in the classifier part
Table 2
Experimental datasets.
Data set | Number of subjects | Number of samples | Number of activities | Site of collection | Sensor type
PAMAP2 (Reiss & Stricker, 2012) | 9 | 2 844 868 | 18 | wrist, chest, ankle | accelerometer, gyroscope, magnetometer, heart rate detector
OPPORTUNITY (Chavarriaga et al., 2013) | 4 | 701 366 | 4 | back, left upper arm, left forearm, right upper arm, right forearm, left foot, right foot | Bluetooth wireless accelerometer and gyroscope, magnetometer
UCI DSADS (Barshan & Yüksek, 2014) | 8 | 1 140 000 | 19 | torso, left arm, right arm, left leg, right leg | accelerometer, gyroscope, magnetometer
collection are on the back, left upper arm, left forearm, right upper arm, right forearm, left foot, and right foot.
The UCI DSADS (Barshan & Yüksek, 2014) dataset was collected from 8 subjects wearing 3-axis accelerometer, gyroscope and magnetometer sensors. It includes daily and physical activities such as standing, going upstairs, going downstairs, lying, sitting, etc., and each activity includes 7,500 samples. The sensors are attached to the torso, left arm, right arm, left leg, and right leg, respectively.
4.2. Comparison of manifold learning dimensionality reduction methods
To analyze the effectiveness of the LLE-GFK based algorithm, PCA projection is performed on the data, and the distribution of the three-dimensional features is visualized. Fig. 5(a) shows the distribution of the original data of PAMAP2 and DSADS, and there are obvious distribution differences between the two datasets. Fig. 5(b) shows the distribution after dimensionality reduction by LLE, which removes redundant features and reduces their impact on the recognition rate. The distribution of the original data after GFK transfer is shown in Fig. 5(c). After LLE or GFK processing, the distribution difference between the two datasets is reduced. In addition, the distribution after processing by the proposed method is shown in Fig. 5(d): the data distributions of PAMAP2 and DSADS after processing are almost identical, indicating that the proposed method is able to significantly reduce the distribution difference between the datasets.
In this paper, the optimal parameters $k$ and $d$ were selected using grid search. To provide a visual analysis of parameter variations and compare different dimensionality reduction methods, we set a fixed value of 25 for the nearest neighbor parameter $k$ for all four dimensionality reduction algorithms. The dimensionality parameter $d$ was varied within the range of 3 to 50 for the following experiments. The GFK is used for transfer learning between the action features of the source domain OPPORTUNITY and the action features of the target domain DSADS after dimension reduction. The recognition accuracy of the actions in DSADS is plotted in Fig. 6.
As can be seen from Fig. 6, among the four dimensionality reduction algorithms, the LLE algorithm obtains the highest recognition accuracy, since it preserves the manifold structure inside the data while reducing the dimension.
Next, the influence of the neighbor parameter $k$ on the result is compared. The target dimension $d$ of the three dimensionality reduction algorithms is set to a fixed value of 25, and the value range of $k$ is set to 3–50. Since PCA has no parameter $k$, only the three manifold algorithms are compared, as shown in Fig. 7.
In Fig. 7, the recognition rate of all dimensionality reduction algorithms initially increases with $k$. After reaching a maximum within a certain range, the accuracy begins to decline and then tends to be stable. The transfer method after dimensionality reduction using LLE achieves the highest recognition rate, followed by ISOMAP and LE.
4.3. Evaluate unsupervised source domain selection
To verify the unsupervised source domain selection method proposed in this paper, the DSADS is selected for the experiment. This dataset contains five limb parts, i.e., the trunk, right arm, left arm, right leg, and left leg. Assuming that one limb is designated as the target domain, whose data labels are absent, the remaining four limbs can be selected as the source domain.
H. Wang et al. Expert Systems With Applications 241 (2024) 122696
Table 3
MK-MMD distance between the left arm and other body parts when the left arm is the target domain.
Source domain   | Trunk   | Right arm | Left leg | Right leg
MK-MMD distance | 0.23834 | 0.15579   | 0.16127  | 0.19423
Table 4
The recognition accuracy of other body parts as the source domain when the left arm is the target domain.
Source domain           | Trunk  | Right arm | Left leg | Right leg
Accuracy of recognition | 62.50% | 82.83%    | 79.67%   | 76.83%
Fig. 6. Recognition accuracy of four dimensionality reduction algorithms under different target dimensions.
The left arm is randomly selected as the target domain. The source domain with the shortest distance from the target domain, that is, the one with the smallest difference in data distribution, is then selected as the final source domain from the remaining four candidates, namely trunk, right arm, left leg and right leg. In this experiment, MK-MMD is used to measure the distance between multiple different source and target domains, and the final distance is adopted as the benchmark for the next step of transfer.
Table 3 shows the MK-MMD distances from the remaining four
source domains to the left arm. It can be seen that the closest limb
part to the left arm is the right arm, so the right arm is selected as the
source domain.
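The selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a biased estimate of the squared MMD under an average of Gaussian kernels, and the bandwidth list `gammas`, the function names `mk_mmd` and `select_source`, and the synthetic data are all hypothetical choices made for the example.

```python
import numpy as np

def _sq_dists(a, b):
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    d = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * (a @ b.T)
    return np.maximum(d, 0.0)  # clip tiny negatives caused by rounding

def mk_mmd(x, y, gammas=(0.01, 0.1, 1.0, 10.0)):
    """Biased estimate of squared MMD under an average of Gaussian kernels.

    The bandwidths in `gammas` are an assumption for illustration only.
    """
    def kernel(a, b):
        d = _sq_dists(a, b)
        return sum(np.exp(-g * d) for g in gammas) / len(gammas)
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()

def select_source(sources, target, **kwargs):
    """Return the name of the candidate source closest to the target, plus all distances."""
    dists = {name: mk_mmd(x, target, **kwargs) for name, x in sources.items()}
    best = min(dists, key=dists.get)
    return best, dists

# Synthetic stand-ins for per-limb feature matrices (rows = windows, columns = features).
rng = np.random.default_rng(0)
target = rng.normal(0.0, 1.0, size=(200, 5))
sources = {
    "near": rng.normal(0.2, 1.0, size=(200, 5)),  # small distribution shift
    "far": rng.normal(2.0, 1.0, size=(200, 5)),   # large distribution shift
}
best, dists = select_source(sources, target)
print(best, dists)
```

With real data, `sources` would hold one feature matrix per candidate limb and `target` the unlabeled feature matrix of the target limb; no labels are needed for the selection itself.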
After the source domain is selected, the proposed transfer learning algorithm LLE+GFK+RF is used to transfer the source domain to the target domain. To verify the necessity of the source domain selection algorithm, each of the four remaining limbs is used in turn as the source domain for transfer to the left arm, and the resulting recognition accuracy on the target domain is shown in Table 4. As can be seen from the table, the MK-MMD distance between the trunk and the left arm is the largest, and when the trunk is used as the source domain of transfer learning, the recognition rate of the left arm is the lowest. Conversely, the MK-MMD distance between the right arm and the left arm is the smallest, and the recognition rate obtained by using the right arm as the source domain is the highest. Therefore, the multi-source domain selection algorithm can select the source domain closest to the target domain to obtain the best recognition accuracy. The experimental results show that the similarity between the source and target domains is important for cross-domain learning, and it is crucial to find the correct auxiliary domain to perform successful knowledge transfer.
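The relationship between distance and accuracy can be checked directly from the values reported in Tables 3 and 4. The short sketch below ranks the four candidate source domains by MK-MMD distance and by recognition accuracy; the variable names are illustrative, and with only four points this is a consistency check on this one experiment rather than evidence of a general law.

```python
# Values copied from Table 3 (MK-MMD distance) and Table 4 (accuracy, %).
mmd = {"Trunk": 0.23834, "Right arm": 0.15579, "Left leg": 0.16127, "Right leg": 0.19423}
acc = {"Trunk": 62.50, "Right arm": 82.83, "Left leg": 79.67, "Right leg": 76.83}

by_distance = sorted(mmd, key=mmd.get)                # closest source first
by_accuracy = sorted(acc, key=acc.get, reverse=True)  # most accurate source first

print(by_distance)
print(by_accuracy)
print(by_distance == by_accuracy)  # True: the two rankings coincide
```

Both rankings are right arm, left leg, right leg, trunk, so in this experiment a smaller MK-MMD distance always corresponds to a higher recognition accuracy.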
Fig. 7. Recognition accuracy of three manifold learning algorithms under different values of 𝑘.
To verify the superiority of the source domain selection algorithm, the proposed algorithm is compared with several classical source domain selection techniques, which are shown in Table 5. After selecting the optimal source domain, the LLEGFK algorithm is employed for transfer learning of the target domain, and the optimal recognition rate is selected as the final comparison result.
In Fig. 8, the source domain selection algorithm is superior to
some existing distance measurement methods when the same transfer
algorithm is used, and it can select the optimal source domain with the
highest similarity to the target domain, which improves the recognition
accuracy.
Fig. 8. Recognition accuracy under different distance measurement algorithms when the left arm is the target domain.
Cross Domain Activity Recognition (CDAR) aims to label the activities of one domain using labeled data from another related domain. CDAR has several types, such as Cross-Person, Cross-Device, Cross-Environment, Cross-Location, Cross-Dataset activity recognition, etc. (Chavarriaga et al., 2013). Cross-Person Activity Recognition aims to address the differences between individuals. Cross-Device Activity Recognition enables seamless transfer across multiple devices. Cross-Environment Recognition aims to enhance the model’s generalization and adaptability to different environments. Cross-Location Activity Recognition specifically refers to situations where activity labels are missing for certain body parts, and utilizes labeled data from similar body parts to obtain labels for those body parts. Cross-Dataset Activity Recognition involves applying an activity recognition model trained on
Table 5
Recognition accuracy of other body parts as the source domain when the left arm is the target domain of data collection used.
Method                                       | Method description
A-distance (Schölkopf et al., 2007)          | The correlation of local center changes
Cosine_similarity (Abdel-Basset et al., 2019) | Cosine similarity, suitable for similarity calculation of high-dimensional vectors
Wasserstein_distance (Arjovsky et al., 2017) | Also known as the earth mover's (bulldozer) distance; indicates the similarity of two distributions
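Two of the baseline measures in Table 5 are simple enough to sketch with NumPy alone. The paper does not state how these measures are applied to feature matrices, so the code below makes its own assumptions: the 1-D Wasserstein-1 distance is computed between equal-size samples by comparing sorted values, and `mean_feature_wasserstein` (a hypothetical helper) averages that distance over feature columns. The A-distance is omitted because it requires training a domain classifier.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1 = same direction, 0 = orthogonal)."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def wasserstein_1d(x, y):
    """Wasserstein-1 distance between two 1-D samples of equal size.

    For equal-size empirical distributions, this is the mean absolute
    difference between the sorted samples.
    """
    x, y = np.sort(np.asarray(x, dtype=float)), np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample counts")
    return float(np.mean(np.abs(x - y)))

def mean_feature_wasserstein(a, b):
    """Assumed aggregation: average the per-feature 1-D distance over all columns."""
    return float(np.mean([wasserstein_1d(a[:, j], b[:, j]) for j in range(a.shape[1])]))

print(cosine_similarity([1.0, 0.0], [1.0, 1.0]))
print(wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))
```

Note that cosine similarity is a similarity (larger means closer), whereas the Wasserstein distance is a distance (smaller means closer), so a selection rule built on them must pick the maximum and the minimum respectively.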
Table 8
Details of the extracted features.
Serial number | Feature                 | Description
1             | Mean                    | The average of the sample in the window
2             | Std                     | The standard deviation
3             | Minimum                 | The minimum value within a window
4             | Maximum                 | The maximum value within a window
5             | Mode                    | The value with the highest frequency
6             | Range                   | The difference between the maximum value and the minimum value in a window
7             | Mean crossing rate      | The number of data exceeding the mean point in a window
8             | DC                      | DC component
9–13          | Spectrum peak position  | Top 5 peaks after fast Fourier transform
14–18         | Frequency               | Frequency corresponding to 5 peaks
19            | Energy                  | The norm squared
20–23         | Four shape features     | Mean, standard deviation, skewness, kurtosis
24–27         | Four amplitude features | Mean, standard deviation, skewness, kurtosis
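A minimal sketch of features 1–19 for a single one-channel window is given below. Several definitions in Table 8 are open to interpretation, so the following are assumptions rather than the authors' definitions: the mean crossing feature counts sign changes of the mean-centered signal, the DC component is the magnitude of the zero-frequency FFT bin, the spectral peaks are the `n_peaks` largest magnitudes of the one-sided spectrum excluding DC, and the mode is taken over raw sample values without binning. The function name `window_features` and the sampling rate argument `fs` are hypothetical. The shape and amplitude features (20–27) are not covered.

```python
import numpy as np

def window_features(x, fs, n_peaks=5):
    """Compute time- and frequency-domain features for one signal window.

    x  : 1-D array of samples in the window (needs more than 2 * n_peaks samples)
    fs : sampling rate in Hz (used only to convert FFT bins to frequencies)
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean()

    # Mode: most frequent raw value (real-valued signals may need binning first).
    values, counts = np.unique(x, return_counts=True)

    # Mean crossings: number of sign changes of the mean-centered signal.
    signs = np.sign(x - mean)
    signs = signs[signs != 0]
    crossings = int(np.sum(signs[1:] != signs[:-1]))

    # One-sided magnitude spectrum and the matching frequency axis.
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    top = np.argsort(spectrum[1:])[::-1][:n_peaks] + 1  # skip the DC bin

    return {
        "mean": mean,
        "std": x.std(),
        "min": x.min(),
        "max": x.max(),
        "mode": values[np.argmax(counts)],
        "range": x.max() - x.min(),
        "mean_crossings": crossings,
        "dc": spectrum[0],
        "peak_magnitudes": spectrum[top],
        "peak_frequencies": freqs[top],
        "energy": float(np.sum(x ** 2)),
    }

# Example: a 5 Hz sine with offset 1, sampled at 100 Hz for one second.
t = np.arange(100) / 100.0
features = window_features(np.sin(2 * np.pi * 5 * t) + 1.0, fs=100.0)
print(features["mean"], features["peak_frequencies"][0], features["energy"])
```

In a multi-sensor setting, the same function would be applied to each axis of each sensor and the results concatenated into one feature vector per window.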
6. Conclusion
Data availability
Data will be made available on request.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 61971347), Doctoral Innovation Foundation of Xi’an University of Technology (No. 252072118), Natural Science Foundation of Shaanxi Province of China (No. 2021JM-344), Xi’an Science and Technology Plan Project (2022JH-RYFW-007), Key research and development program of Shaanxi Province (2022SF-353).
References
Abdel-Basset, M., Mai, M., Elhoseny, M., Le, H. S., & Zaied, E. (2019). Cosine similarity measures of bipolar neutrosophic set for diagnosis of bipolar disorder diseases. Artificial Intelligence in Medicine, 101, Article 101735.
Alfaro, J. G. C., & Trejos, A. L. (2022). User-independent hand gesture recognition classification models using sensor fusion. Sensors, 22, Article 1321.
Alinia, P., Mirzadeh, I., & Ghasemzadeh, H. (2020). ActiLabel: A combinatorial transfer learning framework for activity recognition. arXiv preprint arXiv:2003.07415.
Alqarni, M. A. (2021). Error-less data fusion for posture detection using smart healthcare systems and wearable sensors for patient monitoring. Personal and Ubiquitous Computing, 1–12.
Anguita, D., Ghio, A., Oneto, L., Parra, X., & Reyes-Ortiz, J. L. (2012). Human activity recognition on smartphones using a multiclass hardware-friendly support vector machine. In International workshop on ambient assisted living (pp. 216–223).
Arjovsky, M., Chintala, S., & Bottou, L. (2017). Wasserstein generative adversarial networks. In International conference on machine learning (pp. 214–223).
Barshan, B., & Yüksek, M. (2014). Recognizing daily and sports activities in two open source machine learning environments using body-worn sensor units. The Computer Journal, 57, 1649–1667.
Chavarriaga, R., Sagha, H., Calatroni, A., Digumarti, S. T., Tröster, G., Millán, J. D. R., & Roggen, D. (2013). The opportunity challenge: A benchmark database for on-body sensor-based activity recognition. Pattern Recognition Letters, 34, 2033–2042.
Chen, Y., Wang, J., Huang, M., & Yu, H. (2019). Cross-position activity recognition with stratified transfer learning. Pervasive and Mobile Computing, 57, 1–13.
Chen, J. H., Zhang, L., Jin, Z. H., Zhao, C., & Wang, Q. C. (2022). 3D deep heterogeneous manifold network for behavior recognition. Security and Communication Networks, 2022, Article 3064804.
Duan, S., Zhong, L., & Lin, F. Simplified-TCA: A simplified TCA algorithm for charging scenarios. In 2021 4th international conference on advanced electronic materials, computers and software engineering (pp. 417–421). IEEE.
Feuz, K. D., & Cook, D. J. (2017). Collegial activity learning between heterogeneous sensors. Knowledge and Information Systems, 53, 337–364.
Gong, B., Yuan, S., Fei, S., & Grauman, K. (2012). Geodesic flow kernel for unsupervised domain adaptation. In 2012 IEEE conference on computer vision and pattern recognition (pp. 2066–2073).
Han, C. L., Zhang, L., Tang, Y., Huang, W. B., Min, F. H., & He, J. (2022). Human activity recognition using wearable sensors by heterogeneous convolutional neural networks. Expert Systems with Applications, 198, Article 116764.
Hongsheng, Cheng, Jian, Zhu, Ce, Liu, Haijun, Wang, & Feng (2015). Silhouette analysis for human action recognition based on supervised temporal t-SNE and incremental learning. IEEE Transactions on Image Processing, 24, 3203–3217.
Hu, L., Chen, Y., Wang, J., Hu, C., & Jiang, X. (2018). OKRELM: online kernelized and regularized extreme learning machine for wearable-based activity recognition. International Journal of Machine Learning and Cybernetics, 9, 1577–1590.
Jenkins, O. C., & Matarić, M. J. (2004). A spatio-temporal extension to Isomap nonlinear dimension reduction. In Proceedings of the twenty-first international conference on machine learning (p. 56).
Jia, K., & Yeung, D. Y. (2008). Human action recognition using local spatio-temporal discriminant embedding. In 2008 IEEE conference on computer vision and pattern recognition (pp. 1–8).
Jiang, Y. J., Song, L., Zhang, J. M., Song, Y., & Yan, M. (2022). Multi-category gesture recognition modeling based on sEMG and IMU signals. Sensors, 22, Article 5855.
Kang, P. Q., Li, J. X., Fan, B. F., Jiang, S., & Shull, P. B. (2022). Wrist-worn hand gesture recognition while walking via transfer learning. IEEE Journal of Biomedical and Health Informatics, 26, 952–961.
Kang, Q., Yao, S., Zhou, M., Zhang, K., & Abusorrah, A. (2020). Effective visual domain adaptation via generative adversarial distribution matching. IEEE Transactions on Neural Networks and Learning Systems, 32, 3919–3929.
Kang, Q., Yao, S., Zhou, M., Zhang, K., & Abusorrah, A. (2020). Enhanced subspace distribution matching for fast visual domain adaptation. IEEE Transactions on Computational Social Systems, 7, 1047–1057.
Karaman, O., Cakin, H., Alhudhaif, A., & Polat, K. (2021). Robust automated Parkinson disease detection based on voice signals with transfer learning. Expert Systems with Applications, 178, Article 115013.
Khan, M., & Roy, N. (2017). TransAct: Transfer learning enabled activity recognition. In 2017 IEEE international conference on pervasive computing and communications workshops (pp. 545–550).
Khatun, M. A., Abu Yousuf, M., Ahmed, S., Uddin, M. Z., Alyami, S. A., Al-Ashhab, S., Akhdar, H. F., Khan, A., Azad, A., & Moni, M. A. (2022). Deep CNN-LSTM with self-attention model for human activity recognition using wearable sensor. IEEE Journal of Translational Engineering in Health and Medicine, 10, 1–16.
Kuo, Y. L., Culhane, K. M., Thomason, P., Tirosh, O., & Baker, R. (2009). Measuring distance walked and step count in children with cerebral palsy: An evaluation of two portable activity monitors. Gait & Posture, 29, 304–310.
Li, J., Kang, P., Tan, T., & Shull, P. B. (2022). Transfer learning improves accelerometer-based child activity recognition via subject-independent adult-domain adaption. IEEE Journal of Biomedical and Health Informatics, 26, 2086–2095.
Lin, Y. W., Chen, J., Cao, Y., Zhou, Y. J., Zhang, L. F., Tang, Y. Y., & Wang, S. (2017). Cross-domain recognition by identifying joint subspaces of source domain and target domain. IEEE Transactions on Cybernetics, 47, 1090–1101.
Long, T. H., Sun, Y. F., Gao, J. B., Hu, Y. L., & Yin, B. C. (2022). Domain adaptation as optimal transport on Grassmann manifolds. IEEE Transactions on Neural Networks and Learning Systems.
Long, M., Wang, J., Ding, G., Sun, J., & Yu, P. S. (2013). Transfer feature learning with joint distribution adaptation. In Proceedings of the 2013 IEEE international conference on computer vision (pp. 2200–2207).
Mutegeki, R., & Han, D. S. (2019). Feature-representation transfer learning for human activity recognition. In The 10th international conference on ICT convergence (pp. 18–20).
Pan, S. J., Tsang, I. W., Kwok, J. T., & Yang, Q. (2010). Domain adaptation via transfer component analysis. IEEE Transactions on Neural Networks, 22, 199–210.
Qaroush, A., Yassin, S., Al-Nubani, A., & Alqam, A. (2021). Smart, comfortable wearable system for recognizing Arabic Sign Language in real-time using IMUs and features-based fusion. Expert Systems with Applications, 184, Article 115448.
Qin, X., Chen, Y., Wang, J., & Yu, C. (2019). Cross-dataset activity recognition via adaptive spatial-temporal transfer learning. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 3, 1–25.
Reiss, A., & Stricker, D. (2012). Introducing a new benchmarked dataset for activity monitoring. In International symposium on wearable computers (pp. 108–109).
Sah, R. K., & Ghasemzadeh, H. (2019). Adar: Adversarial activity recognition in wearables. In 2019 IEEE/ACM international conference on computer-aided design (pp. 1–8).
Schölkopf, B., Platt, J., & Hofmann, T. (2007). Analysis of representations for domain adaptation. In Advances in neural information processing systems 19: proceedings of the 2006 conference (pp. 137–144).
Sun, B., Feng, J., & Saenko, K. (2016). Return of frustratingly easy domain adaptation. In Proceedings of the AAAI conference on artificial intelligence, vol. 30 (pp. 2058–2065).
Vergara-Diaz, G., Daneault, J. F., Parisi, F., Admati, C., Alfonso, C., Bertoli, M., Bonizzoni, E., Carvalho, G. F., Costante, G., Fabara, E. E., Fixler, N., Golabchi, F. N., Growdon, J., Sapienza, S., Snyder, P., Shpigelman, S., Sudarsky, L., Daeschler, M., Bataille, L., ... Bonato, P. (2021). Limb and trunk accelerometer data collected with wearable sensors from subjects with Parkinson’s disease. Scientific Data, 8, 1–12.
Wang, J., Chen, Y., Hao, S., Feng, W., & Shen, Z. (2017). Balanced distribution adaptation for transfer learning. In IEEE international conference on data mining (pp. 1129–1134).
Wang, J., Chen, Y., Hu, L., Peng, X., & Yu, P. S. (2018). Stratified transfer learning for cross-domain activity recognition. In 2018 IEEE international conference on pervasive computing and communications (pp. 1–10). IEEE.
Wang, X., Wo, B., Guan, Q., & Chen, B. (2014). Human action recognition based on manifold learning. Journal of Image and Graphics, 19, 914–923.
Xu, Q. S., Wei, X. F., Bai, R. X., Li, S. M., & Meng, Z. (2023). Integration of deep adaptation transfer learning and online sequential extreme learning machine for cross-person and cross-position activity recognition. Expert Systems with Applications, 212, Article 118807.
Yao, S., Kang, Q., Zhou, M., Rawa, M. J., & Albeshri, A. (2022). Discriminative manifold distribution alignment for domain adaptation. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 53, 1183–1197.
Ye, J., Li, X., Zhang, X., Zhang, Q., & Chen, W. (2020). Deep learning-based human activity real-time recognition for pedestrian navigation. Sensors, 20, Article 2574.
Zhang, W., Deng, L., Zhang, L., & Wu, D. (2022). A survey on negative transfer. IEEE/CAA Journal of Automatica Sinica, 10, 305–329.
Zhang, Y., Sun, W. Y., & Chen, J. (2022). Application of embedded smart wearable device monitoring in joint cartilage injury and rehabilitation training. Journal of Healthcare Engineering, 2022, Article 4420870.
Zhao, Z., Chen, Y., Liu, J., & Liu, M. (2010). Cross-mobile ELM based activity recognition. International Journal of Engineering and Industries, 1, 30–38.