
Accepted Manuscript

A Robust human activity recognition system using smartphone sensors and deep learning

Mohammed Mehedi Hassan, Md. Zia Uddin, Amr Mohamed, Ahmad Almogren

PII: S0167-739X(17)31735-1
DOI: https://doi.org/10.1016/j.future.2017.11.029
Reference: FUTURE 3821

To appear in: Future Generation Computer Systems

Received date: 31 July 2017
Revised date: 30 October 2017
Accepted date: 14 November 2017

Please cite this article as: M.M. Hassan, M.Z. Uddin, A. Mohamed, A. Almogren, A Robust human
activity recognition system using smartphone sensors and deep learning, Future Generation
Computer Systems (2017), https://doi.org/10.1016/j.future.2017.11.029

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to
our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form.
Please note that during the production process errors may be discovered which could affect the
content, and all legal disclaimers that apply to the journal pertain.
A Robust Human Activity Recognition System
Using Smartphone Sensors and Deep Learning

Mohammed Mehedi Hassan¹, Md. Zia Uddin², Amr Mohamed³, Ahmad Almogren¹
¹ College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
  Email: mmhassan@ksu.edu.sa, ahalmogren@ksu.edu.sa
² Dept. of Informatics, University of Oslo, Norway
  Email: mdzu@ifi.uio.no
³ Department of Electrical Engineering, University of British Columbia, Canada
  Email: amrm@ece.ubc.ca

Abstract
In the last few decades, human activity recognition has attracted considerable attention from researchers in pattern recognition and human-computer interaction because of its prominent applications, such as smart home health care. For instance, an activity recognition system can be adopted in a smart home health care system to improve the rehabilitation process of patients. There are various ways of using different sensors for human activity recognition in a smartly controlled environment. Among them, physical human activity recognition through wearable sensors provides valuable information about an individual's degree of functional ability and lifestyle. In this paper, we present a smartphone inertial sensor-based approach for human activity recognition. Efficient features, including the mean, median, and autoregressive coefficients, are first extracted from the raw data. The features are further processed by kernel principal component analysis (KPCA) and linear discriminant analysis (LDA) to make them more robust. Finally, the features are used to train a Deep Belief Network (DBN) for activity recognition. The proposed approach was compared with traditional activity recognition approaches based on the multiclass Support Vector Machine (SVM) and the Artificial Neural Network (ANN), and it outperformed both.

Keywords— Activity Recognition, Sensors, Smartphones, Deep Belief Network.

1. Introduction
Human Activity Recognition (HAR) has become a prominent research field owing to its remarkable contributions to ubiquitous computing [1–3]. Researchers use HAR systems as a medium to obtain information about people's behavior [4]. The information is commonly gathered from the signals of sensors such as ambient and wearable sensors. The signal data are then processed through machine learning algorithms to recognize the underlying events. Hence, HAR systems can be applied in many practical applications in smart environments, such as smart home healthcare systems. For example, a smart HAR system can continuously observe patients for health diagnosis and medication [5], or it can be applied to automated surveillance of public places to predict crimes before they happen [6].
In the last few decades, many HAR systems have been surveyed [7]–[9], with authors focusing on several activities in distinct application domains [10,11]. For instance, the activities can include walking, running, cooking, and exercising. Regarding their duration and complexity, activities can be categorized into three key groups: short activities, simple activities, and complex activities. Short activities have a very short duration, such as the transition from sitting to standing. The second kind comprises basic activities such as walking and reading [12]. The final kind consists of progressions of basic activities combined with interactions with other objects and individuals, such as attending a party or an official meeting [13]. In this paper, we focus on recognizing basic activities.
HAR has been actively explored using distinct kinds of ambient and wearable sensors [1]. Some instances of such sensors include motion, proximity, microphone, and video sensors. Most recent ambient sensor-based HAR research has mainly focused on video cameras, as cameras make it easy to capture images of the surrounding environment. Video sensors have also been combined with other prominent sensors in novel ubiquitous applications [14], [15]. Although video sensors have been very popular for basic activity recognition, they face many difficulties when privacy issues arise. On the contrary, wearable sensors such as inertial sensors can overcome such privacy issues and hence deserve more focus for activity recognition in smart homes [16].
Many past HAR systems used accelerometers to recognize a wide range of daily activities such as standing, walking, sitting, running, and lying [17]–[23]. In [20], the authors explored accelerometer data to identify repetitive activities such as grinding, filling, drilling, and sanding. In [21]–[23], the authors addressed fall detection and prevention for elderly people in smart environments. The majority of the aforementioned systems adopted multiple accelerometers fixed at different places on the human body [17]–[21]. However, this approach is not applicable for observing long-term activities in daily life because of the many body-mounted sensors and cable connections it requires. Some studies explored the data of a single accelerometer placed at the sternum or waist [22], [23]. These works reported substantial recognition results for basic daily activities such as running, walking, and lying. However, they could not show good accuracy for some complex situations such as transitional activities (e.g., sit-to-stand, lie-to-stand, and stand-to-sit).

Thus, among the different sensors used in activity recognition, the accelerometer is the most commonly utilized sensor for capturing human body motion [8]. The sensor can be deployed in two ways: first, in a multi-sensor package such as triaxial accelerometers or Body Sensor Networks (BSNs); second, in combination with other sensors such as gyroscopes, temperature sensors, and heart rate sensors [24]. Bao and Intille [12] proposed one of the earliest HAR systems, recognizing 20 activities of daily living using five wearable biaxial accelerometers and well-known machine learning classifiers. They achieved reasonably good classification accuracy, reaching up to 84%, considering the number of activities involved. One evident drawback was the number and location of the body sensors used, which made the system highly obtrusive. Gyroscopes have also been employed for HAR and have been shown to improve recognition performance when used in combination with accelerometers [25], [26].
In the case of wearable sensors for activity recognition, the smartphone is an attractive alternative because of the diversity of sensors it supports. Embedded sensors such as accelerometers and gyroscopes, together with on-device processing and wireless communication capabilities, have made smartphones a very useful tool for activity monitoring in smart homes [27], [28]. Besides, smartphones are ubiquitous and require almost no static infrastructure to operate. This advantage makes them more practically applicable than other ambient multi-modal sensors in smart homes. As recent smartphones contain inertial sensors (e.g., gyroscopes and accelerometers), they are appropriate sensing resources for obtaining human motion information for HAR [29], [30].
Recently, smartphones have attracted many activity recognition researchers because they have fast processing capability and are easily deployable [31]-[34]. For instance, in [31], the authors used wirelessly connected smartphones to collect a user's data from a chest unit composed of an accelerometer and vital sign sensors. The data were later processed and analyzed using different machine learning algorithms. In [32], the authors developed a HAR system to recognize five transportation activities, where data from smartphone inertial sensors were used with a mixture-of-experts model for classification. In [33], the authors proposed an offline HAR system using a smartphone with a built-in triaxial accelerometer; the phone was kept in the pocket during the experiments. In [34], the authors used a smartphone mounted on the waist to collect inertial sensor data for activity recognition, with a Support Vector Machine (SVM) for activity modeling. In [35], a smartphone was used to recognize six different activities in real-time. In [36], the authors proposed a real-time motion recognition system using a smartphone's accelerometer. Similarly, the authors in [37] used a smartphone with an embedded accelerometer to recognize four different activities in real-time.
As the dimensionality of the features from different sensors becomes very high in activity recognition, Principal Component Analysis (PCA) can be applied [37]. PCA is a linear approach for finding the directions of maximum variation; thus, PCA is adopted in this work to reduce the dimensionality of high-dimensional features. Recently, deep learning techniques have received a great deal of attention from pattern recognition and artificial intelligence researchers [38]-[40]. Although deep learning is more efficient than typical neural networks, it has two major disadvantages: it is prone to overfitting, and it is often very time-consuming. The Deep Belief Network (DBN) is one of the robust deep learning tools; it uses Restricted Boltzmann Machines (RBMs) during training [39]. Hence, the DBN is a good candidate for modeling the activity recognition system.
In this work, a novel smartphone-based approach is proposed for HAR using efficient features and a DBN. The rest of the paper is organized as follows. Section 2 explains the proposed method, including the feature extraction process from smartphone inertial sensor signals. Section 3 then illustrates the modeling of different activities through deep learning. Furthermore, Section 4 presents the experimental results using different approaches. Finally, Section 5 concludes the work with some remarks.

Fig. 1. Flowchart of the proposed physical activity recognition system.

2. Proposed Method

Fig. 1 shows the basic flowchart of the proposed system, which consists of three main parts: sensing, feature extraction, and recognition. The first part, sensing, collects sensor data as input to the HAR system. For this study, two prominent smartphone sensors were selected for data collection: triaxial accelerometers and gyroscopes. The sensors provide measurements at frequencies between 0 Hz and 15 Hz. The second major part is feature extraction. It starts by removing noise and isolating relevant signals, such as separating gravity from the triaxial acceleration. After removing noise, it performs statistical analysis on fixed-size sliding windows over the time-sequential inertial sensor signals to generate robust features. The third key part of the system is modeling activities from the features via deep learning, where a DBN is adopted.

2.1 Signal Processing

Triaxial angular velocity and linear acceleration signals are obtained from the smartphone gyroscope and accelerometer. The sampling rate of the raw signals is 50 Hz for both sensors. These signals are preprocessed to reduce noise using two filters: a median filter and a low-pass Butterworth filter with a 20 Hz cutoff frequency. Another low-pass Butterworth filter is applied to the acceleration signal, which contains both gravitational and body motion components, to separate the body acceleration from the gravity information. Gravitational forces are assumed to have only low-frequency components, and 0.3 Hz was found to be an optimal corner frequency for obtaining a constant gravity signal. In addition to the body acceleration signals in the time and frequency domains and the gravitational acceleration in the time domain, more signals were derived from them: the body angular speed, body angular speed magnitude, body angular acceleration, body angular acceleration magnitude, body acceleration magnitude, body acceleration jerk, body acceleration jerk magnitude, and gravity acceleration magnitude. The signals are then sampled in fixed-width sliding windows of 2.56 s with 50% overlap between consecutive windows.
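As a concrete illustration of this preprocessing chain, the following Python sketch applies a median filter, a 20 Hz low-pass Butterworth filter, the 0.3 Hz gravity separation, and the 2.56 s windowing with 50% overlap. The filter orders and the median kernel size are illustrative assumptions, not values reported in the paper.

```python
# Minimal preprocessing sketch, assuming 1-D NumPy arrays sampled at 50 Hz.
import numpy as np
from scipy.signal import butter, filtfilt, medfilt

FS = 50.0  # sampling rate (Hz)

def denoise(signal):
    """Median filter followed by a 20 Hz low-pass Butterworth filter."""
    smoothed = medfilt(signal, kernel_size=3)
    b, a = butter(N=3, Wn=20.0 / (FS / 2.0), btype="low")
    return filtfilt(b, a, smoothed)

def split_gravity(acceleration):
    """Separate gravity (below ~0.3 Hz) from body acceleration."""
    b, a = butter(N=3, Wn=0.3 / (FS / 2.0), btype="low")
    gravity = filtfilt(b, a, acceleration)
    return gravity, acceleration - gravity

def sliding_windows(signal, length=int(2.56 * FS), overlap=0.5):
    """Yield fixed-width windows (2.56 s = 128 samples) with 50% overlap."""
    step = int(length * (1.0 - overlap))
    for start in range(0, len(signal) - length + 1, step):
        yield signal[start:start + length]
```

The derived signals listed above (jerk, magnitudes, angular acceleration) can then be computed from the outputs of these functions, e.g., jerk as the sample-to-sample difference of the body acceleration.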

2.2 Feature Extraction

Robust features are obtained using different kinds of signal processing and feature extraction methods. In total, 561 informative features are extracted, following previous work on inertial sensors for human activity recognition [1]. The mean $\bar{w}$ of a window $w$ with $N$ samples is determined as

$\bar{w} = \frac{1}{N}\sum_{i=1}^{N} w_i$.     (1)
The standard deviation of a sliding window can be determined as

$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (w_i - \bar{w})^2}$.     (2)

The median absolute deviation of a sliding window is determined as

$d = \mathrm{median}\left(\left|w_i - \mathrm{median}(w)\right|\right)$.     (3)

The highest value in a fixed-length sliding window is obtained as

$m = \max(w)$.     (4)

The lowest value in a fixed-length sliding window is determined as

$n = \min(w)$.     (5)

The frequency skewness of a sliding window is obtained as

$s = E\left[\left(\frac{f - \bar{f}}{\sigma}\right)^{3}\right]$.     (6)

The frequency kurtosis of a sliding window is obtained as

$K = \frac{E\left[(f - \bar{f})^{4}\right]}{\left(E\left[(f - \bar{f})^{2}\right]\right)^{2}}$.     (7)

The maximum frequency in a sliding window is obtained as

$a = \max(f_w)$.     (8)

The average energy in a sliding window is determined as

$e = \frac{1}{N}\sum_{i=1}^{N} w_i^{2}$.     (9)

The signal magnitude area (SMA) feature over three consecutive windows is determined as

$S = \frac{1}{3}\sum_{i=1}^{3}\sum_{j=1}^{N} |w_{ij}|$.     (10)
The entropy feature of a sliding window is determined as

$t = -\sum_{i=1}^{N} c_i \log(c_i)$,     (11)

$c_i = \frac{w_i}{\sum_{j=1}^{N} w_j}$.     (12)

The interquartile range of a window is determined from the quartiles as

$Q = Q_3(w) - Q_1(w)$.     (13)

The autoregression (AR) coefficients of a window can be determined from

$w(t) = \sum_{i=1}^{P} \alpha(i)\, w(t-i) + \varepsilon(t)$,     (14)

where $w(t)$ is the time-series signal, $\alpha$ represents the AR coefficients, $\varepsilon(t)$ is the noise term, and $P$ is the order of the filter. The Pearson correlation coefficient of two windows $w_1$ and $w_2$ is determined as

$P = \frac{C_{12}}{\sqrt{C_{11} C_{22}}}$,     (15)

$C = \mathrm{Cov}(w_1, w_2)$.     (16)

Then, the frequency-signal weighted average is calculated as

$A = \frac{\sum_{j=1}^{N} j f_j}{\sum_{i=1}^{N} f_i}$.     (17)

The spectral energy of a frequency band $[x, y]$ is determined as

$S_{xy} = \frac{1}{y - x + 1}\sum_{i=x}^{y} f_i^{2}$.     (18)
Then, the angle between the mean of three consecutive windows and a reference vector $v$ can be obtained as

$F = \tan^{-1}\left(\left\|[\bar{w}_1, \bar{w}_2, \bar{w}_3] \times v\right\|,\ [\bar{w}_1, \bar{w}_2, \bar{w}_3] \cdot v\right)$.     (19)

Fig. 2 shows the mean of the features for four different activities. In the figure, the mean of the features of one activity differs from the mean of the features of the other activities. Hence, the aforementioned features are used to represent the different activities from the smartphone inertial sensors' data.

Fig. 2. Mean of the normalized features for four sample physical activities.
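To make the feature definitions concrete, the sketch below computes a handful of the window features from Eqs. (1)-(14) for a single one-dimensional window. It is a simplified illustration rather than the authors' full 561-feature pipeline, and the least-squares AR estimation is an assumed implementation choice.

```python
# Sketch of selected window features; w is a 1-D NumPy array for one window.
import numpy as np

def window_features(w, ar_order=4):
    feats = {
        "mean": np.mean(w),                                  # Eq. (1)
        "std": np.std(w),                                    # Eq. (2)
        "mad": np.median(np.abs(w - np.median(w))),          # Eq. (3)
        "max": np.max(w),                                    # Eq. (4)
        "min": np.min(w),                                    # Eq. (5)
        "energy": np.mean(w ** 2),                           # Eq. (9)
        "iqr": np.percentile(w, 75) - np.percentile(w, 25),  # Eq. (13)
    }
    # Entropy over normalized magnitudes, Eqs. (11)-(12); small epsilon
    # guards against log(0).
    c = np.abs(w) / np.sum(np.abs(w))
    feats["entropy"] = -np.sum(c * np.log(c + 1e-12))
    # AR coefficients, Eq. (14), estimated by least squares on lagged
    # samples: column i holds w[t - i - 1] for t = ar_order .. len(w) - 1.
    X = np.column_stack([w[ar_order - i - 1: len(w) - i - 1]
                         for i in range(ar_order)])
    y = w[ar_order:]
    feats["ar"], *_ = np.linalg.lstsq(X, y, rcond=None)
    return feats
```

Applying such a function to every sliding window of every derived signal, in both the time and frequency domains, is what yields the high-dimensional feature vectors whose dimensionality is then reduced in the next step.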

2.3 Dimension Reduction


The next step after feature extraction is dimension reduction using kernel PCA (KPCA) [38]. In KPCA, a statistical kernel is applied to the input features, followed by typical PCA. Given the robust features $F$, the covariance matrix of the kernelized features can be defined as

$Y = \frac{1}{q}\sum_{i=1}^{q} \bar{\Phi}(F_i)\,\bar{\Phi}(F_i)^{T}$,     (20)

$\bar{\Phi}(F_i) = \Phi(F_i) - \bar{\Phi}$,     (21)

$\bar{\Phi} = \frac{1}{q}\sum_{i=1}^{q} \Phi(F_i)$,     (22)

where $q$ represents the total number of feature segments used for training and $\Phi$ is a Gaussian kernel mapping. The principal components can then be found by solving the following eigenvalue decomposition problem:

$\lambda E = Q E$,     (23)

$Q_{ij} = \bar{\Phi}(F_i) \cdot \bar{\Phi}(F_j)$,     (24)

where $E$ represents the principal components, $\lambda$ the corresponding eigenvalues, and $Q$ the centered kernel matrix. The KPCA feature vector for a signal segment can then be represented as

$K = F E_m^{T}$,     (25)

where $m$ represents the number of top principal components retained. Fig. 3 shows 100 eigenvalues for 100 principal components, where the values after the 20th component are almost zero. Nevertheless, 100 components are retained throughout this work, as the components beyond them are negligible.

Fig. 3. One hundred eigenvalues after applying KPCA to the features of the training samples.
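A minimal sketch of this dimension-reduction step is given below using scikit-learn's KernelPCA with an RBF (Gaussian) kernel. The kernel width `gamma` is an illustrative guess, and the random matrices merely stand in for the real 561-dimensional feature matrices.

```python
# KPCA sketch, assuming placeholder data in place of the real features.
import numpy as np
from sklearn.decomposition import KernelPCA

F_train = np.random.rand(7767, 561)  # stand-in for training features
F_test = np.random.rand(3162, 561)   # stand-in for testing features

kpca = KernelPCA(n_components=100, kernel="rbf", gamma=1e-3)
K_train = kpca.fit_transform(F_train)  # fit on the training features only
K_test = kpca.transform(F_test)        # project test features, Eq. (25)
```

Fitting on the training set alone and only projecting the test set keeps the evaluation free of information leakage from the test data.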

3. Activity Modeling in the Proposed Work

Modeling activities through a Deep Belief Network (DBN) has two basic parts: pre-training and fine-tuning. The pre-training phase is based on the Restricted Boltzmann Machine (RBM) [39]. Once the network is pre-trained, the weights of the network are adjusted by a fine-tuning algorithm. The RBM is useful for unsupervised learning. As shown in Fig. 4, two hidden layers are used in this work. The RBMs are used to initialize the network following a greedy layer-wise training methodology. Once the weights of the RBM in the first hidden layer are trained, its outputs are used as inputs to the second hidden layer. Similarly, the weights of the RBM of the second hidden layer are trained, and its outputs are used as inputs to the output layer. Fig. 4 shows the basic architecture of pre-training and fine-tuning in a typical DBN with input layer I, n hidden layers H, and output layer O.

Fig. 4. Structure of the DBN used in this work, with 100 neurons in the input layer, 60 in hidden layer 1, 20 in hidden layer 2, and 12 in the output layer.

To update the weight matrix, the Contrastive Divergence algorithm is used. First, the binary state of the first hidden layer $H_1$ is computed as

$H_1 = \begin{cases} 1, & f(r + I G^{T}) > t \\ 0, & \text{otherwise}, \end{cases}$     (26)

$f(v) = \frac{1}{1 + e^{-v}}$,     (27)

where $r$ is the bias vector of the hidden layer, $G$ is the initial weight matrix, and $t$ is a threshold learned along with the weight matrix based on the sigmoid function $f$. Then, the binary state of the input layer is reconstructed as $I_{recon}$ from the binary state of the hidden layer $H_1$ as

$I_{recon} = \begin{cases} 1, & f(b + H_1 G) > t \\ 0, & \text{otherwise}, \end{cases}$     (28)

where $b$ is the bias vector of the input layer. Afterward, the hidden layer is reconstructed as $H_{recon}$ from $I_{recon}$ as

$H_{recon} = f(r + I_{recon} G^{T})$.     (29)

Then, the weight update is computed as

$\Delta G = \frac{H_1 I}{\beta} - \frac{H_{recon} I_{recon}}{\beta}$,     (30)

where $\beta$ is the batch size. Once pre-training is done, a conventional backpropagation algorithm is run to adjust all parameters; this step is called fine-tuning. Fig. 5 shows a sample convergence plot of the DBN using the proposed features, indicating that the training error becomes almost zero as the number of epochs approaches 1000.

Fig. 5. Convergence of DBN using 1000 epochs.
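The following NumPy sketch implements one Contrastive Divergence (CD-1) step corresponding to Eqs. (26)-(30), assuming binary units and a mini-batch of inputs. The learning rate and the stochastic sampling of binary states are standard choices, not settings reported in the paper.

```python
# CD-1 sketch: I has shape (batch, visible); G has shape (hidden, visible);
# r and b are the hidden- and visible-layer bias vectors.
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))                      # Eq. (27)

def cd1_step(I, G, r, b, lr=0.1, rng=None):
    rng = np.random.default_rng(0) if rng is None else rng
    beta = I.shape[0]                                     # batch size
    # Positive phase: sample the binary hidden states, Eq. (26).
    p_h1 = sigmoid(r + I @ G.T)
    H1 = (p_h1 > rng.random(p_h1.shape)).astype(float)
    # Negative phase: reconstruct the visible layer, Eq. (28).
    p_v = sigmoid(b + H1 @ G)
    I_recon = (p_v > rng.random(p_v.shape)).astype(float)
    # Reconstruct the hidden layer from the reconstruction, Eq. (29).
    H_recon = sigmoid(r + I_recon @ G.T)
    # Contrastive-divergence weight update, Eq. (30), scaled by lr.
    dG = (H1.T @ I) / beta - (H_recon.T @ I_recon) / beta
    return G + lr * dG
```

Running such steps layer by layer (first hidden layer, then the second) reproduces the greedy layer-wise pre-training described above, before backpropagation fine-tunes all the weights.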

4. Experiments and Results

For the experiments, a publicly available database was used [41]. The database consists of twelve activities: Standing, Sitting, Lying Down, Walking, Walking-upstairs, Walking-downstairs, Stand-to-Sit, Sit-to-Stand, Sit-to-Lie, Lie-to-Sit, Stand-to-Lie, and Lie-to-Stand. A total of 7767 and 3162 events were used for training and testing, respectively. Each event consists of 561 basic features. It should be noted that in this database, the numbers of training and testing samples are not evenly distributed across activities: some activities contain a large number of samples, whereas others have very few.

We started the experiments with a traditional Artificial Neural Network (ANN). We ran a typical ANN algorithm several times and obtained a mean recognition rate of 65.31% at best; the corresponding overall accuracy was 89.06%. The ANN-based experimental results are shown in Table 1. We then applied multiclass Support Vector Machines (SVMs), which yielded a mean recognition rate of 82.02% at best. The SVM-based experimental results are reported in Table 2. Finally, we applied the proposed approach, which provided the highest mean recognition rate, 89.61%. Thus, the proposed approach showed its superiority over the others. Table 3 shows the experimental results using the proposed approach.

As the number of samples differs between testing activities, a poor mean recognition rate does not necessarily indicate poor accuracy. For instance, the mean recognition rate of the ANN-based approach was 65.31%, but its accuracy was 89.06%, as 2816 of the 3162 samples were correctly classified. The accuracy obtained via SVM was 94.12%, with 2976 samples correctly classified. Similarly, the accuracy of the proposed deep learning-based approach was 95.85%, with 3031 samples correctly classified. Table 4 reports the accuracy and error rates of the different HAR approaches, where the proposed one shows the highest overall accuracy and the lowest overall error.
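The distinction between the two metrics can be made precise with a short sketch: overall accuracy weights every test sample equally, whereas the mean recognition rate averages per-class recall, so classes with few samples (such as the transitions) can drag it down without greatly affecting accuracy. Here `y_true` and `y_pred` are assumed integer label arrays for the 3162 test events.

```python
# Overall accuracy vs. mean recognition rate (macro-averaged recall).
import numpy as np

def overall_accuracy(y_true, y_pred):
    return np.mean(y_true == y_pred)

def mean_recognition_rate(y_true, y_pred):
    classes = np.unique(y_true)
    per_class = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return np.mean(per_class)
```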

Table 1. HAR experimental results using the traditional ANN-based approach.

Activity               Recognition Rate (%)
Standing               94.56
Sitting                90.87
Lying Down             85.71
Walking                83.27
Walking-upstairs       94.96
Walking-downstairs     97.80
Stand-to-Sit           34.78
Sit-to-Stand           0.00
Sit-to-Lie             56.25
Lie-to-Sit             76.00
Stand-to-Lie           51.02
Lie-to-Stand           18.52
Mean                   65.31
Table 2. HAR experimental results using the traditional SVM-based approach.

Activity               Recognition Rate (%)
Standing               99.60
Sitting                93.84
Lying Down             97.14
Walking                87.20
Walking-upstairs       97.30
Walking-downstairs     98.90
Stand-to-Sit           73.91
Sit-to-Stand           90.00
Sit-to-Lie             50.00
Lie-to-Sit             64.00
Stand-to-Lie           69.39
Lie-to-Stand           62.96
Mean                   82.02

Table 3. HAR experimental results using the proposed DBN-based approach.

Activity               Recognition Rate (%)
Standing               99.60
Sitting                95.97
Lying Down             96.67
Walking                93.50
Walking-upstairs       97.12
Walking-downstairs     99.45
Stand-to-Sit           82.61
Sit-to-Stand           90.00
Sit-to-Lie             81.25
Lie-to-Sit             72.00
Stand-to-Lie           85.71
Lie-to-Stand           81.48
Mean                   89.61
Table 4. Accuracy and error rate using different HAR approaches.

Approach   Total Testing   Rightly Classified   Wrongly Classified   Overall        Overall
           Samples         Samples              Samples              Accuracy (%)   Error (%)
ANN        3162            2816                 346                  89.06          10.94
SVM        3162            2976                 186                  94.12          5.88
DBN        3162            3031                 131                  95.85          4.14

5. Conclusion
The main purpose of this work was to develop a robust human activity recognition system based on smartphone sensor data. Smartphones are a very feasible platform for activity recognition, as they are among the most widely used devices in people's daily lives, not only for communicating with each other but also for a very wide range of applications, including healthcare. Thus, a novel approach has been proposed here for activity recognition using smartphone inertial sensors, namely accelerometers and gyroscopes. From the sensor signals, multiple robust features were extracted, followed by KPCA for dimension reduction. Furthermore, the robust features were combined with a deep learning technique, the Deep Belief Network (DBN), for activity training and recognition. The proposed method was compared with traditional multiclass SVM and ANN approaches, over which it showed its superiority. The system was evaluated on twelve different physical activities, obtaining a mean recognition rate of 89.61% and an overall accuracy of 95.85%. In contrast, the other traditional approaches achieved a mean recognition rate of 82.02% and an overall accuracy of 94.12% at best. Besides, the system has shown its ability to distinguish between basic transitional and non-transitional activities. In the future, we plan to focus on more robust features and learning for more efficient recognition of complex activities in real-time environments.

Acknowledgement

The authors would like to extend their sincere appreciation to the Deanship of Scientific Research at King Saud University for funding this research through research group project no. RGP-281. The second author contributed equally with the first author to this work.

References
[1] Y. Chen and C. Shen, "Performance Analysis of Smartphone-Sensor Behavior for Human Activity
Recognition," IEEE Access, (2017) 5: 3095-3110.
[2] M. Cornacchia, K. Ozcan, Y. Zheng and S. Velipasalar, A Survey on Activity Detection and Classification
Using Wearable Sensors, IEEE Sensors, (2017), 17(2): 386-403.
[3] A. Campbell, T. Choudhury, From Smart to Cognitive Phones, IEEE Pervasive Computing 11 (3) (2012) 7–11.

[4] B.P. Clarkson, 2002. Life patterns: structure from wearable sensors (Ph.D. thesis), Massachusetts Institute of
Technology.
[5] A. Avci, S. Bosch, M. Marin-Perianu, R. Marin-Perianu, P. Havinga, Activity recognition using inertial sensing
for healthcare, wellbeing and sports applications: a survey, in: International Conference on Architecture of
Computing Systems, 2010.
[6] W. Lin, M.-T. Sun, R. Poovandran, Z. Zhang, Human activity recognition for video surveillance, in: IEEE
International Symposium on Circuits and Systems, 2008.
[7] O. Lara, M. Labrador, A survey on human activity recognition using wearable sensors, IEEE Commun. Surv.
Tutor. 1 (2012) 1–18.
[8] A. Mannini, A.M. Sabatini, Machine learning methods for classifying human physical activity from on-body
accelerometers, Sensors, (2010) 10:1154–1175.
[9] R. Poppe, A survey on vision-based human action recognition, Image Vis. Comput. (2010) 28:976–990.
[10] B. Nham, K. Siangliulue, S. Yeung, Predicting mode of transport from iphone accelerometer data, Technical
Report, Stanford University, 2008.
[11] E. Tapia, S. Intille, K. Larson, Activity recognition in the home using simple and ubiquitous sensors, in:
Pervasive Computing, 2004.
[12] L. Bao, S. Intille, Activity recognition from user-annotated acceleration data, in: Pervasive Computing, 2004.
[13] J. Aggarwal, M.S. Ryoo, Human activity analysis: a review, ACM Comput. Surv., (2011) 43(3):1–16.
[14] S.K. Tasoulis, N. Doukas, V.P. Plagianakos, I. Maglogiannis, Statistical data mining of streaming motion data
for activity and fall recognition in assistive environments, Neurocomputing 107 (2013) 87–96.
[15] A. Behera, D. Hogg, A. Cohn, Egocentric activity monitoring and recovery, in: Asian Conference on Computer
Vision, 2013.
[16] D. Townsend, F. Knoefel, R. Goubran, Privacy versus autonomy: a tradeoff model for smart home monitoring
technologies, in: 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology
Society, EMBC, 2011.
[17] AM Khan, YK Lee, SY Lee, TS Kim, A triaxial accelerometer-based physical-activity recognition via
augmented-signal features and a hierarchical recognizer, IEEE transactions on information technology in
biomedicine, (2010) 14(5):1166-1172.
[18] U. Maurer, A. Smailagic, D. Siewiorek, and M. Deisher. Activity recognition and monitoring using multiple
sensors on different body positions. in Proc. Int. Workshop Wearable Implantable Body Sens. Netw., (2006)
113–116.
[19] N. Kern, B. Schiele, H. Junker, P. Lukowicz, and G. Troster, Wearable sensing to annotate meeting recordings, Pers. Ubiquitous Comput., (2003) 7:263–274.
[20] D. Minnen, T. Starner, J. Ward, P. Lukowicz, and G. Troester, Recognizing and discovering human actions
from on-body sensor data, in Proc. IEEE Int. Conf. Multimedia Expo,(2005) 1545–1548.
[21] D. Giansanti, V. Macellari, and G. Maccioni, New neural network classifier of fall-risk based on the Mahalanobis distance and kinematic parameters assessed by a wearable device, Physiol. Meas., 29:11–19.

[22] M.R. Narayanan, M.E. Scalzi, S.J. Redmond, S.R. Lord, B.G. Celler, and N. H. Lovell, A wearable triaxial
accelerometry system for longitudinal assessment of falls risk, in Proc. 30th Annu. IEEE Int. Conf. Eng. Med.
Biol. Soc., (2008) 2840–2843.
[23] M. Marschollek, K. Wolf, M. Gietzelt, G. Nemitz, H. M. Z. Schwabedissen, and R. Haux, Assessing elderly
persons’ fall risk using spectral analysis on accelerometric data—A clinical evaluation study, in Proc. 30th
Annu. IEEE Int. Conf. Eng. Med. Biol. Soc., (2008) 3682–3685.
[24] G.-Z. Yang, M. Yacoub, Body Sensor Networks, Springer, London, 2006.
[25] W. Wu, S. Dasgupta, E.E. Ramirez, C. Peterson, G.J. Norman, Classification accuracies of physical activities
using smartphone motion sensors, J. Med. Internet Res. (2012) 14:105–130.
[26] D. Anguita, A. Ghio, L. Oneto, X. Parra, J.-L. Reyes-Ortiz, Training computationally efficient smartphone-
based human activity recognition models, in: International Conference on Artificial Neural Networks, 2013.
[27] A. Ghio, L. Oneto, Byte the bullet: learning on real-world computing architectures, in: European Symposium on
Artificial Neural Networks, Computational Intelligence and Machine Learning ESANN, 2014.
[28] D. Anguita, A. Ghio, L. Oneto, X. Parra, J.-L. Reyes-Ortiz, A public domain dataset for human activity
recognition using smartphones, in: European Symposium on Artificial Neural Networks, Computational
Intelligence and Machine Learning ESANN, 2013.
[29] A. M. Khan, Y.-K. Lee, S. Lee, T.-S. Kim, Human activity recognition via an accelerometer-enabled-
smartphone using kernel discriminant analysis, in: IEEE International Conference on Future Information
Technology, 2010.
[30] D. Roggen, K. Förster, A. Calatroni, T. Holleczek, Y. Fang, G. Tröster, P. Lukowicz, G. Pirkl, D. Bannach, K.
Kunze, A. Ferscha, C. Holzmann, A. Riener, R. Chavarriaga, J. del R. Millán, Opportunity: towards
opportunistic activity and context recognition systems, in: IEEE Workshop on Autononomic and Opportunistic
Communications, 2009.
[31] O.D. Lara, A.J. Pérez, M.A. Labrador, J.D. Posada, Centinela: a human activity recognition system based on
acceleration and vital sign data, Pervasive Mob. Comput. (2012) 8:717–729.
[32] Y.S. Lee, S.B. Cho, Activity recognition with android phone using mixture-of-experts co-trained with labeled and unlabeled data, Neurocomputing (2014) 126:106–115.
[33] J.R. Kwapisz, G.M. Weiss, S.A. Moore, Activity recognition using cell phone accelerometers, SIGKDD Explor.
Newsl. (2011) 12:74–82.
[34] D. Anguita, A. Ghio, L. Oneto, X. Parra, J.L. Reyes-Ortiz, Human activity recognition on smartphones using a
multiclass hardware-friendly support vector machine, in: Ambient Assisted Living and Home Care, 2012.
[35] T. Brezmes, J. Gorricho, J. Cotrina, Activity recognition from accelerometer data on a mobile phone, Distrib.
Comput. Artif. Intell. Bioinform. Soft Comput. Ambient Assist. Living (2009) 5518:796–799.
[36] D. Fuentes, L. Gonzalez-Abril, C. Angulo, J. Ortega, Online motion recognition using an accelerometer in a
mobile device, Expert Syst. Appl. (2012) 39:2461–2465.
[37] M. Kose, O.D. Incel, C. Ersoy, Online human activity recognition on smart phones, in: Workshop on Mobile
Sensing: From Smartphones and Wearables to Big Data, 2012.

[38] H. M. Ebied, Feature extraction using PCA and Kernel-PCA for face recognition. 8th International Conference
on Informatics and Systems (INFOS), (2017) 72-77.
[39] GE Hinton, S. Osindero, Y-W. Teh, A fast learning algorithm for deep belief nets. Neural Computation. 2006
Jul; 18(7): 1527-1554.
[40] M. Z. Uddin, M. M. Hassan, A. Almogren, M. Zuair, G. Fortino, J. Torresen, A facial expression recognition
system using robust face features from depth videos and deep learning, Computers & Electrical Engineering,
2017, http://dx.doi.org/10.1016/j.compeleceng.2017.04.019.
[41] M. Lichman. UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of
California, School of Information and Computer Science (2013)

Authors Biography

Mohammad Mehedi Hassan is currently an Associate Professor in the Information Systems Department of the College of Computer and Information Sciences (CCIS), King Saud University (KSU), Riyadh, Kingdom of Saudi Arabia. He received his Ph.D. degree in Computer Engineering from Kyung Hee University, South Korea in February 2011. He received the Best Paper Award from the CloudComp conference in China in 2014. He also received the Excellence in Research Award from CCIS, KSU in both 2015 and 2016. He has published over 100 research papers in
journals and conferences of international repute. He has served as chair and Technical Program Committee member in numerous international conferences/workshops such as IEEE HPCC, ACM BodyNets, IEEE ICME, IEEE ScalCom, ACM Multimedia, ICA3PP, IEEE ICC, TPMC, and IDCS. He has also served as guest editor of
several international ISI-indexed journals such as IEEE IoT, FGCS, etc. His research areas of interest are cloud
federation, multimedia cloud, sensor-cloud, Internet of things, Big data, mobile cloud, cloud security, IPTV, sensor
network, 5G network, social network, publish/subscribe system and recommender system. He is a member of IEEE.

Md. Zia Uddin received his Ph.D. in Biomedical Engineering in February 2011. He is currently working as a post-doctoral research fellow at the Dept. of Informatics, University of Oslo, Norway. Dr. Zia's research is mainly focused on computer vision, image processing, artificial intelligence, and pattern recognition. He has more than 60 research publications, including international journals, conferences, and book chapters.

Amr Mohamed received his M.S. and Ph.D. in electrical and computer engineering from the University of British
Columbia, Vancouver, Canada, in 2001, and 2006 respectively. He has over 20 years of experience in wireless
networking research and industrial systems development. He holds 3 awards from IBM Canada for his achievements
and leadership, and 3 best paper awards, the latest from the IEEE/IFIP International Conference on New Technologies,
Mobility, and Security (NTMS) 2015 in Paris. His research interests include networking and MAC layer techniques
mainly in wireless networks. Dr. Amr Mohamed has authored or coauthored over 120 refereed journal and
conference papers, textbook, and book chapters in reputed international journals, and conferences. He has served as
a technical program committee (TPC) co-chair for workshops in IEEE WCNC'16. He has served as a co-chair for
technical symposia of international conferences, including Globecom'16, Crowncom'15, AICCSA'14, IEEE
WLN'11, and IEEE ICT'10. He has served on the organization committee of many other international conferences as
a TPC member, including the IEEE ICC, GLOBECOM, WCNC, LCN and PIMRC, and a technical reviewer for
many international IEEE, ACM, Elsevier, Springer, and Wiley journals.

Ahmad Almogren received his PhD degree in computer science from Southern Methodist University, Dallas, Texas, USA in 2002. Previously, he worked as an assistant professor of computer science and a member of the
scientific council at Riyadh College of Technology. He also served as the dean of the college of computer and
information sciences and the head of the council of academic accreditation at Al Yamamah University. Presently, he
works as an Associate Professor and the vice dean for the development and quality at the college of computer and
information sciences at King Saud University in Saudi Arabia. He has served as a guest editor for several computer
journals. His research areas of interest include mobile and pervasive computing, computer security, sensor and
cognitive network, and data consistency.

Authors Photo

Mohammed Mehedi Hassan

Md. Zia Uddin

Amr Mohamed

Ahmad Almogren
Highlights
* A smartphone inertial sensors-based approach for human activity recognition
* Uses a deep learning-based solution for successful activity recognition
* The proposed approach was compared with traditional activity recognition approaches, which it outperformed
