PII: S2210-6707(20)30790-3
DOI: https://doi.org/10.1016/j.scs.2020.102572
Reference: SCS 102572
This is a PDF file of an article that has undergone enhancements after acceptance, such as
the addition of a cover page and metadata, and formatting for readability, but it is not yet the
definitive version of record. This version will undergo additional copyediting, typesetting and
review before it is published in its final form, but we are providing this version to give early
visibility of the article. Please note that, during the production process, errors may be
discovered which could affect the content, and all legal disclaimers that apply to the journal
pertain.
Abstract
The Internet of Things (IoT) provides smart solutions for future urban communities that
deliver key benefits with minimal human intervention. A smart home offers the capabilities
needed to support a resident's healthcare-related, social, and emotional needs efficiently
and sustainably. In particular, it provides an opportunity to assess the functional health
ability of the elderly or individuals with cognitive impairment in performing daily life
activities. This work proposes an approach named Cognitive Assessment of Smart Home Resident
(CA-SHR) to measure the ability of smart home residents in executing simple to complex
activities of daily living using pre-defined scores assigned by a neuropsychologist.
CA-SHR also measures the quality of tasks performed by the participants using supervised
classification. Furthermore, CA-SHR provides a temporal feature analysis
health solutions such as supporting cognitively impaired individuals by assessing
their daily life routine, activity recognition, emotion analysis, and depression
estimation [7]. With the increase in chronic diseases among the elderly, IoT-based smart
home solutions are receiving growing attention [8, 9, 10].
About 66% of the population will be living in urban areas by 2050, while the number of
urban areas with 10 million residents or more is expanding at a similar pace [11]. A
community-driven plan is embraced in building smart cities so that assets such as smart
workplaces and smart homes can be shared viably and intelligently. The smart home is
considered to be the best setting for assisting smart cities to increase personal
informative details when security is appropriately executed [12]. Smart cities consist of
wider scale connectivity, information
retrieval, information sharing, security, privacy, integrity, and human-computer in-
early detection of dementia patients [13]. Activities of Daily Living (ADL) can help
to detect the early onset of dementia [14]. Early detection can help doctors and
clinicians to support individuals who are progressing towards the critical stages of
dementia [15, 16, 17]. Similarly, the authors of [18, 19] also discuss the early detection
of cognitive assessment.
Figure 2 shows the flow diagram of the proposed CA-SHR, which focuses on assessing the
cognitive health condition of people identified with dementia. CA-SHR categorizes
individuals into three categories: healthy, MCI, and dementia. A healthy individual can
perform daily life activities efficiently and independently, an individual with MCI may
not be able to perform daily life activities independently and efficiently, and an
individual with dementia needs support to perform daily activities. The workflow starts
with collecting the dataset and then preprocessing it for better prediction from the
classification model. The next step extracts the feature set that can potentially
determine ADL from the sensing dataset. Feature selection is then performed to acquire a
feature set that represents the overall dataset. Next, data balancing is performed to
improve the representation of the class that has fewer instances. Finally, the classifier
is trained to predict the cognitive health status of a person performing ADL. A detailed
explanation of the proposed approach is given in Section 3.
This paper makes the following key contributions:
• Proposes a new approach named CA-SHR for early identification and automated
cognitive health assessment of healthy, MCI, and dementia individuals by applying
machine learning algorithms over sensor stream data collected from simple daily
life activities (SADLs) and complex instrumental activities of daily living (IADLs)
in a smart home.
• Provides automated measurement of the quality of the tasks performed by the
participants using supervised classification.
• Provides temporal feature analysis to estimate whether temporal features help to
detect impaired individuals effectively during single-tasking as well as multitasking.
CA-SHR first extracts features from the instances of simple and complex interwoven
activities and then selects the discriminative features through Principal Component
Analysis (PCA). CA-SHR improves the representation of cognitive classes by improving
the representation of minority classes. For complex interwoven activities, we use the
significant features and remove the duration feature from the feature set, since it does
not provide any useful temporal information. Finally, the classification performance is
improved by using the ensemble
life tasks. Section 4 presents the experimental setup, including the datasets, the
parameters defined for the state-of-the-art techniques, observations, and results with
comparisons of the proposed method. Section 5 discusses the major aspects and motivation
of the research along with general assessment results. Section 6 summarizes the paper and
gives future recommendations.
2. Related Work
This section reviews related studies from the literature. Since this study focuses on
health assessment and quantification of daily life tasks using a smart home, the relevant
studies in these areas are explained below.
A smart home is equipped with different sensors, such as temperature, motion, heat, and
light sensors, and can be used as a remote environment controlled by personal devices
like smartphones and computers [21, 22, 23, 24]. These sensors are made intelligent
enough to reason about and make decisions on the smart home environment settings [25].
With a diverse network of sensors and advances in intelligent techniques, the smart home
offers supportive and responsive services that deliver a range of social, healthcare,
independent living, security, safety, and environmental sustainability benefits [2, 3].
The story of the smart home started in the late 1960s, when computer hobbyists began
installing computers at home; one of the well-known home computers, named ‘ECHO-IV’,
was made for family accounting, stock estimation, and temperature control [26]. When
personal computers reached the mass market in the late 1970s [27], home automation became
a Do-It-Yourself (DIY) project for hobbyists. Remote control was accomplished by decoding
Dual-Tone Multi-Frequency (DTMF) signals over phone lines, at a time when household
Internet service was not yet generally available [28]. Research on the smart home has been
advancing; however, actual adoption is still low. The authors in [29] wrote in 1992 that
the growth of the smart home market was “practically around the bend” following 10 years
of research and implementation. We have not yet seen significant smart home adoption, and
high cost, troublesome installation, and intimidating operation remain the principal
obstacles to turning the hype into reality.
Initially, dedicated wearable motion sensors were used for activity recognition [30].
These sensors were useful to some extent, but carrying them is inconvenient. There has
been a shift towards smart devices because of the availability of valuable, precise, and
accurate sensors embedded in these devices. These include high-quality sensors such as
GPS, light sensor, near-field sensor, accelerometer, gyroscope, microphone, and
magnetometer [31]. The authors in [32] used wearable watches for activity recognition.
Activity recognition is an important step in cognitive health assessment: it uses the
sensor events generated while individuals perform their daily life activities in a smart
home. The pattern of performing an activity differs among healthy, MCI, and dementia
individuals, which helps to identify which activity is being performed in a smart home.
Real-time activity recognition was a challenge in the early phases, and many researchers
proposed different methods for it. In the early stages, the authors in [33] ported their
earlier offline activity recognition algorithm to online activity recognition on a PDA
(Personal Digital Assistant). They used a motion band for activity recognition. Motion
sensors are widely used in different studies [33, 34, 35, 36, 37, 38, 39, 40]. The authors
in [34] used a motion sensor and a heart rate monitor for activity recognition. Similar
work was done by [36], who used motion sensors and applied a fuzzy function-based
(FBF-based) classifier constructed from the results of a linear discriminant analysis
(LDA). They used LDA to obtain a highly correlated feature subset.
The authors in [41] focus on activity recognition in smart homes. They built a framework
for designing and implementing smart home applications, the Cloud-Assisted Smart Home
Environment (CASE) architecture, and used environmental and mobile sensors for activity
recognition. The authors in [13] used a smart home environment for activity recognition.
They used a passive infrared (PIR) sensor to monitor presence and a Hall effect sensor to
identify whether a door is open or closed. The authors in [40] performed activity
recognition by clustering activities with the K-pattern clustering algorithm and then
making activity decisions with a temporal-based artificial neural network. The authors in
[42] used three artificial neural network algorithms for activity recognition: Quick
Propagation, Levenberg Marquardt, and Batch Back-Propagation. They compared their results
on the Massachusetts Institute of Technology (MIT) smart home dataset and showed that
Levenberg Marquardt works better for activity recognition purposes. The authors in [43]
used motion sensors placed in a building. They built an activity recognition algorithm
that predicts which activity is being performed and, based on this classification,
increased the energy efficiency of the building. Firstly, we discuss offline activity
recognition methods that use classification methods to predict activity.
sensor selection, and environment setup. The studies in [25, 45] show how activity
recognition helps to assess health condition. They showed that values calculated from a
sensor are mapped to a tag that indicates which activity or motion is being executed. In
a smart home, data is mostly collected from sensors such as motion sensors, heat sensors,
and environmental sensors. Some activities, such as taking medication, preparing a meal,
or doing dishes, are distinctive because they are performed by interacting with unique
objects; recognizing them requires a tag on the object with which the person is
interacting. Researchers [46, 47] used radio-frequency identification (RFID), pressure,
and binary sensors to tag each activity for activity recognition. The authors in [48] also
used RFID for indoor localization. They compared this technique with other existing
techniques in terms of robustness, affordability, and reliability, and found that
increasing accuracy also increases the cost of the system.
Some researchers used unsupervised learning models for activity assessment [49, 50, 51].
Regression is a statistical method for describing the relationship between a dependent
variable and one or more independent variables. Linear regression is a vital machine
learning tool that builds a linear relationship between the dependent and independent
variable sets to predict uncertainties. Few authors have used regression, because most
researchers focused on predicting whether a patient is healthy or not, which is
insufficient: for the early detection of a disease, we need a model that indicates that a
person is moving towards the disease. The authors in [49] used a multivariate linear
regression model for predicting sitting, standing, and walking, using acceleration data
obtained from an accelerometer. The authors in [50] used hidden Markov models in a
regression context. They collected accelerometer data from different body parts, gathered
unlabeled data, and used temporal acceleration data to accurately de-
smart home sensors. They recruited 179 volunteers, of whom 145 were healthy, 2 had
dementia, and 32 had MCI. They worked on a complex Day Out Task (DOT) because there is
considerable variability in how the Day Out Task is performed. They compared the results
of their method with direct observations labeled by a trained psychologist. They also
compared the results produced by a machine learning algorithm with the diagnoses made by
a clinic. Many studies objected that these observations might be biased or that the
collected data might not suit in real-life
of Daily Living (IADLs) i.e., food preparation, medication, using the telephone,
and doing housework either complete or incomplete.
The authors in [17] proposed a study to automate cognitive health assessment using machine
learning algorithms. They used supervised learning algorithms to classify cognitive
health. First, they set the ground truth using lab-based assessment tests. Data was
collected from sensors embedded in the smart home. They showed that their algorithm can
correctly differentiate between a cognitively healthy person and a person with an
impairment. Similar work was done by [16], who developed machine learning methods to
classify the quality of activities performed in smart homes. They extracted features from
the data of sensors placed in the smart home environment. Similar work was done by [15],
who monitored the daily life activities of inhabitants living in a smart home to help
clinicians better understand and diagnose people.
Labiba et al. [54] presented a study on the long-term observation and assessment of daily
routine activities. First, they recognized the activities of smart home residents, and
then they monitored the changes in daily routine activities. They used a probabilistic
neural network to label newly detected activities. They also applied the K-means
clustering algorithm to separate routine activities from unique and unusual activities. In
another study, Labiba et al. [55] addressed some issues that might arise while assessing
the cognitive health of a smart home resident. They showed that a small amount of training
data and an imbalanced number of activity instances might result in over-fitting, which
leads to inconsistent activity recognition. They proposed an activity recognition approach
that uses a probabilistic method and a distance minimization method to recognize user
activity, and combined the outputs of both methods through a Support Vector Machine (SVM).
They showed that this method makes the recognition more reliable and consistent.
There exist limitations in existing work on activity assessment. The authors in [14] and
[17] analyzed the classification of cognitively healthy individuals and cognitively
impaired individuals. In real-world data, there exists a problem with the collection of
examples. In most cases, the number of positive examples is greater than the number of
negative examples, which results in the class imbalance problem. As
Table 1: Detailed Analysis of Activity Assessment Approaches and Dataset used in Related Work
I. Key: HMM - Hidden Markov Model, SVM - Support Vector Machine, GMM - Gaussian Mixture
Model, RFID - Radio Frequency Identification, SCCRF - Skip Chain Conditional Random Field,
EP - Emerging Pattern, ES - Embedded Assessment, CRF - Conditional Random Fields

Group    Name      [25]  [14]  [53]  [45]  [56]  [57]  [41]  [48]  [58]  [17]  [16]  [15]
Method   HMM       X     -     -     -     X     -     -     -     -     -     -     -
         NB        X     -     -     -     -     -     -     -     -     -     -     X
         J48       -     -     -     -     -     -     -     -     -     X     -     X
         SMO       -     -     -     -     -     -     -     -     -     -     -     X
         LR        -     -     -     -     -     -     -     -     -     X     -     -
         MLP       -     -     -     -     -     -     -     -     -     X     -     X
         CRF       X     -     -     -     X     -     -     -     -     -     -     -
         SVM       -     X     -     -     -     -     -     -     -     -     -     -
         GMM       -     -     X     -     -     -     -     -     -     -     -     -
         RFID      -     -     -     -     -     -     -     X     -     -     X     -
         ES        -     -     -     X     -     -     -     -     -     -     -     -
         SCCRF     -     -     -     -     X     -     -     -     -     -     -     -
         EP        -     -     -     -     X     -     -     -     -     -     -     -
         INSigHt   -     -     -     -     -     X     -     -     -     -     -     -
         CASE      -     -     -     -     -     -     X     -     -     -     -     -
         Psmart    -     -     -     -     -     -     -     -     X     -     -     -
Dataset  Self      -     X     X     X     -     X     X     X     X     X     X     X
         CASAS     X     -     -     -     -     -     -     -     -     -     -     -
         Publicly  -     -     -     -     X     -     -     -     -     -     -     -
Table 2: Detailed Analysis of Activity Assessment Targets and Evaluation Measures used in
Related Work II. The Target section shows the age or health status of individuals, and the
Evaluation Measure section shows the metrics with which previous work was evaluated.

Group          Name         [25]  [14]  [53]  [45]  [56]  [57]  [41]  [48]  [58]  [17]  [16]  [15]
Target         Mid age      -     X     -     -     -     -     -     -     -     -     -     -
               Old Age      X     X     X     X     -     -     -     -     -     X     X     -
               All Above    -     -     -     -     X     X     X     X     X     -     -     X
               MCI          -     X     -     -     -     -     -     -     -     -     -     -
               dementia     X     X     -     -     -     -     -     -     -     -     X     -
Eval. Measure  Accuracy     X     X     -     -     X     X     -     X     X     -     -     X
               Precision    -     -     -     -     -     -     -     -     -     X     -     -
               Recall       -     -     -     -     -     -     -     -     -     X     -     -
               F-Score      -     -     -     -     -     -     -     -     -     -     X     -
               G-Mean       -     -     -     -     -     -     -     -     -     -     X     -
               KL Diver.    -     -     X     -     -     -     -     -     -     -     -     -
               Correlation  -     -     -     X     -     -     -     -     -     -     -     X
3. Proposed Approach
The proposed approach is divided into four major steps, i.e., feature extraction, feature
selection, data balancing, and classification of subjects into healthy or cognitively
impaired individuals. We focus on providing an optimal and sustainable solution to detect
dementia individuals at their stages. In the initial step, the feature sets that can
potentially determine ADL are extracted from the sensing dataset. Next, feature selection
is performed to acquire a feature subset that correlates with cognitive health. An optimum
feature selection approach is crucial for detecting dementia individuals at their stages.
Data balancing is performed next to increase the number of instances in the class that has
fewer samples. In the final step, the classifier is trained to predict the cognitive
health status of a person performing ADL. Figure 3 summarizes our approach.
• Sensor Count: Total number of times a particular sensor is used during an
activity.
• Sensor Events: Total number of unique sensor events in an activity.
• Activity Completeness: A boolean feature that represents whether the par-
ticipant was able to complete the activity.
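The three features listed above can be sketched in a few lines of Python. This is only an illustration: the event format (a list of (sensor_id, state) pairs), the function name, and the reading of a "unique sensor event" as a distinct (sensor, state) pair are assumptions and are not taken from the dataset description.

```python
from collections import Counter

def extract_basic_features(events, completed):
    """events: list of (sensor_id, state) pairs for one activity (assumed format)."""
    # Sensor Count: how many times each sensor fired during the activity.
    sensor_count = Counter(sensor_id for sensor_id, _ in events)
    return {
        "sensor_count": dict(sensor_count),
        # Sensor Events: distinct (sensor, state) pairs (assumed interpretation).
        "sensor_events": len(set(events)),
        # Activity Completeness: boolean flag supplied by the annotation.
        "activity_completeness": bool(completed),
    }

features = extract_basic_features(
    [("M01", "ON"), ("M01", "OFF"), ("D02", "OPEN"), ("M01", "ON")],
    completed=True,
)
```

The sensor identifiers M01 and D02 are hypothetical placeholders.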
Similarly, for each complex interwoven activity instance, in addition to the features
described above, the following features are extracted:
• Activity Interruption: Total number of activity interruptions. An interruption
occurs when one activity is interrupted by another activity.
• Activity Sequence: Vector that represents the sequence in which the activities
are performed.
• Parallelism Vector: Parallelism is the ability to multitask activities and is
represented as a P-index. To measure parallelism, Run Length Encoding (RLE) is used,
which counts the number of activities in progress at a particular time. The High
Activity-Level Run Measure (HALRM) and the Low Activity-Level Run Measure (LALRM)
determine the level of task parallelism and are calculated using Equations 1 and 2:
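\[ HALRM = \sum_{i=1}^{M} \sum_{j=1}^{N} P(i, j) \times i \times j \tag{1} \]
\[ LALRM = \sum_{i=1}^{M} \sum_{j=1}^{N} \frac{P(i, j) \times j}{i} \tag{2} \]
The P-index is the ratio of HALRM to LALRM, as shown in Equation 3.
\[ P\text{-}index = \frac{HALRM}{LALRM} \tag{3} \]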
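A minimal sketch of Equations 1 to 3 follows. The extracted text does not define i, j, M, N, or P(i, j), so the code assumes that i is the activity level (the number of activities in progress), j is the length of a run at that level, and P(i, j) is the number of such runs found by run-length encoding. Because the P-index is a ratio, normalizing P(i, j) to a probability would give the same value. Idle periods (level 0) are skipped, since Equation 2 divides by i.

```python
from collections import Counter
from itertools import groupby

def run_length_matrix(levels):
    """Run-length encode a sequence of activity levels into counts P(i, j)."""
    return Counter((level, len(list(run))) for level, run in groupby(levels))

def p_index(levels):
    """P-index = HALRM / LALRM for a sequence of activity levels (assumed input)."""
    runs = run_length_matrix(levels)
    halrm = sum(count * i * j for (i, j), count in runs.items() if i > 0)
    lalrm = sum(count * j / i for (i, j), count in runs.items() if i > 0)
    return halrm / lalrm if lalrm else 0.0

# One activity for two time steps, two activities for three, then one again.
multitasking = p_index([1, 1, 2, 2, 2, 1])
single_tasking = p_index([1, 1, 1])
```

Under these assumptions, a participant who never runs activities in parallel gets a P-index of 1, and the value grows as longer runs happen at higher activity levels.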
For a fair comparison, our feature set for both simple and complex activities
is the same as the one proposed by [14] and [17] respectively.
3.2. Feature Selection
Feature selection is the process of reducing the number of features when developing a
predictive model. The proposed approach CA-SHR uses two feature selection methods to find
features that are highly correlated with cognitive health, and we compare their
performance: Information Gain (IG) and Principal Component Analysis (PCA). Feature
selection techniques reduce overfitting, improve accuracy, and reduce training time. The
two techniques are of different nature and tend to provide a good feature subset in the
presence of noise and highly redundant data. PCA reduces the feature space based on the
variance and covariance structure through linear combinations, while IG calculates the
gain of each feature with respect to the target variable and selects the features that
have the least entropy (impurity). Details of each are given below:
Information Gain is an entropy-based feature selection technique that measures the
similarity and correspondence of a feature f_i to a particular class c_i. After
determining the entropy, the information gain of a feature can be computed as
IG(f_i) = H(c_i) − H(c_i | f_i), where H(c_i | f_i) represents the entropy of c_i
conditioned on f_i.
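The information gain formula above can be illustrated with a short sketch for discrete features. The function names and the toy labels are hypothetical; continuous sensor features would first need to be discretized, which is not shown here.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy H(c) of a list of class labels, in bits."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """IG(f) = H(c) - H(c | f) for one discrete feature."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature_values):
        subset = [c for f, c in zip(feature_values, labels) if f == value]
        conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional

labels = ["healthy", "healthy", "dementia", "dementia"]
ig_informative = information_gain([1, 1, 0, 0], labels)    # feature separates the classes
ig_uninformative = information_gain([1, 0, 1, 0], labels)  # feature carries no information
```

Ranking features by this value and keeping the top ones is one common way to use IG for selection; the cut-off is a design choice that the text above does not specify.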
PCA [59] is a method that reduces the dimensions of a dataset by transforming the data
matrix such that the variance of the original data can still be represented by the
reduced set. Equation 4 gives the covariance on which the PCA computation is based:
\[ \mathrm{cov}(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{x})(Y_i - \bar{y}) \tag{4} \]
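As a rough illustration of how Equation 4 feeds into PCA, the sketch below builds the sample covariance matrix of two hypothetical feature columns and approximates the first principal component with power iteration. Power iteration is used here only to keep the example dependency-free; it is an assumption of this sketch, not the procedure described in [59], and it presumes a non-zero covariance matrix.

```python
from math import sqrt

def covariance(xs, ys):
    """Sample covariance of two equally long columns (Equation 4)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def covariance_matrix(columns):
    return [[covariance(a, b) for b in columns] for a in columns]

def first_component(matrix, iterations=100):
    """Dominant eigenvector of the covariance matrix via power iteration."""
    vector = [1.0] * len(matrix)
    for _ in range(iterations):
        product = [sum(m * v for m, v in zip(row, vector)) for row in matrix]
        norm = sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
    return vector

feature_a = [1.0, 2.0, 3.0]   # hypothetical feature columns
feature_b = [2.0, 4.0, 6.0]
cov = covariance_matrix([feature_a, feature_b])
pc1 = first_component(cov)
```

Projecting each instance onto the leading components then yields the reduced feature set.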
multiplied by a random number in the range 0 to 1 and added to the original instance
to create a new synthetic instance that is distinct from the original data point. Thus
the decision boundary of the minority class is generalized.
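The interpolation step described above can be sketched as follows. The beginning of this passage is missing from the extracted text, so the sketch assumes that the quantity being scaled is the difference between a minority-class instance and one of its nearest neighbours, as in SMOTE-style oversampling; the neighbour search itself is omitted and the feature values are hypothetical.

```python
import random

def synthesize(instance, neighbour, rng):
    """Create one synthetic instance on the segment between instance and neighbour."""
    gap = rng.random()  # random number in [0, 1)
    return [x + gap * (n - x) for x, n in zip(instance, neighbour)]

rng = random.Random(0)          # seeded for reproducibility
minority = [2.0, 10.0]          # hypothetical minority-class instance
neighbour = [4.0, 6.0]          # hypothetical nearest neighbour of the same class
synthetic = synthesize(minority, neighbour, rng)
```

Repeating this for randomly chosen minority instances and neighbours increases the size of the minority class without duplicating existing points.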
3.4. Classification Models
This sub-section presents an overview of the algorithms used in this work.
3.4.1. Decision Tree (J48)
The decision tree is built using information gain and entropy [61, 62]. It is a
rule-based approach in which decisions are made on the basis of entropy and information
gain. Algorithm 1 provides the pseudocode of this algorithm. Equation 5 is used to
calculate the entropy.
\[ \text{Entropy} = H(T) = I_E(k_1, k_2, \ldots, k_J) = -\sum_{i=1}^{J} k_i \log_2 k_i \tag{5} \]
where k_i is the probability of class i and H(T) is the entropy, which measures the
degree of “impurity”. The information gain of a feature F_i is calculated using Equation 6.
\[ IG(F_i) = H(C) - H(C \mid F_i) \tag{6} \]
\[ p(C_k \mid x) = \frac{p(C_k)\, p(x \mid C_k)}{p(x)} \tag{7} \]
Given a discrete vector x, p(C_k | x) represents the posterior probability of the class
(target) given the predictor (attributes). p(C_k) is the prior probability of the class.
p(x | C_k) is the likelihood, i.e., the probability of the predictor given the class.
p(x) is the prior probability of the predictor.
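A small worked example of Equation 7 is sketched below. The priors and per-feature likelihoods are made-up numbers chosen for easy arithmetic, and the independence of features given the class (the "naive" assumption) is stated explicitly in the helper function.

```python
def naive_likelihood(per_feature_probs):
    """p(x | C_k) under the naive assumption of independent features."""
    result = 1.0
    for p in per_feature_probs:
        result *= p
    return result

def posterior(priors, likelihoods):
    """Bayes' rule: p(C_k | x) = p(C_k) * p(x | C_k) / p(x)."""
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    evidence = sum(joint.values())  # p(x)
    return {c: value / evidence for c, value in joint.items()}

priors = {"healthy": 0.8, "dementia": 0.2}          # hypothetical priors
likelihoods = {
    "healthy": naive_likelihood([0.5, 0.2]),        # hypothetical p(x_k | healthy)
    "dementia": naive_likelihood([0.75, 0.8]),      # hypothetical p(x_k | dementia)
}
post = posterior(priors, likelihoods)
predicted = max(post, key=post.get)
```

Here the evidence is 0.8 × 0.1 + 0.2 × 0.6 = 0.2, so the posterior is 0.4 for healthy and 0.6 for dementia, even though healthy has the larger prior.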
Algorithm 1 Pseudocode for J48 Algorithm
Input: a dataset D
Tree = {}
if D is pure or another stopping threshold is met then
    return Tree
end if
for all attributes a in D do
    compute the impurity function obtained by splitting on a
end for
a_best = the best attribute according to the criterion above
Tree = a decision node that tests a_best at the root
D_v = the subsets of D induced by the values of a_best
for all D_v do
    Tree_v = J48(D_v)
    append Tree_v to the matching branch of Tree
end for
return Tree
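The recursion in Algorithm 1 can be sketched in Python as below. This is a simplified, ID3-style illustration that splits on plain information gain over discrete attributes; J48 itself (the Weka implementation of C4.5) additionally uses gain ratio, handles continuous attributes, and prunes the tree, none of which is shown. The attribute names and rows are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def best_attribute(rows, labels, attributes):
    """Pick the attribute with the highest information gain."""
    def gain(attr):
        remainder = 0.0
        for value in set(row[attr] for row in rows):
            subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        return entropy(labels) - remainder
    return max(attributes, key=gain)

def build_tree(rows, labels, attributes):
    # Stop when the node is pure or no attributes remain; return the majority label.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    attr = best_attribute(rows, labels, attributes)
    remaining = [a for a in attributes if a != attr]
    branches = {}
    for value in set(row[attr] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[attr] == value]
        branches[value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining
        )
    return (attr, branches)

def predict(tree, row):
    """Follow branches until a leaf label is reached (unseen values raise KeyError)."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[row[attr]]
    return tree

rows = [
    {"completed": "yes", "interruptions": "low"},
    {"completed": "yes", "interruptions": "high"},
    {"completed": "no", "interruptions": "low"},
    {"completed": "no", "interruptions": "high"},
]
labels = ["healthy", "healthy", "dementia", "dementia"]
tree = build_tree(rows, labels, ["completed", "interruptions"])
```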
3.4.3. Sequential Minimal Optimization (SMO)
SMO, because of less complexity, Support Vector Machines(SVMs) are trained
subject to:
\[ 0 \le \alpha_i \le C \ \text{ for } i = 1, 2, \ldots, n, \qquad \sum_{i=1}^{n} y_i \alpha_i = 0 \tag{8} \]
sensitivity (the degree of mistakes allowed) when separating classes [61]. The
sensitivity was set to 1.0 so that SMO becomes highly sensitive in the classification
process [64].
Algorithm 2 Naïve Bayes Working Algorithm
M_c = 0, M_kc = 0
for i = 1 : M do
    c = l_i    // class label of the i-th example
    M_c := M_c + 1
    for k = 1 : D do
        if x_ik = 1 then
            M_kc := M_kc + 1
        end if
    end for
end for
π̂_c = M_c / M,  θ̂_kc = M_kc / M_c
3.2.1: If above equation does not change result, randomly choose unbound ex-
ample
3.2.2: If that fails, randomly choose example. If that fails, rechoose ec1
network is composed of an input layer, one or more hidden layers, and an output layer
[61], as shown in Figure 4. The number of nodes in the input layer is equal to the total
number of features in the dataset, whereas the number of nodes in the output layer depends
on the labels. Generally, one hidden layer is sufficient to map the inputs to outputs
within reasonable accuracy. While mapping the inputs to outputs, a transfer function or
activation function is used. The activation function is a mathematical representation
(usually of sigmoid shape) of the relationship between the input and output nodes.
Algorithm 4 provides the pseudocode of this algorithm. Given an input node x_i, the output
of the hidden node h_j is given by Equation 9:
\[ h_j = \phi_1\left(\sum_{i=1}^{n} w_{ij} x_i + \theta_j\right) \tag{9} \]
\[ \text{output} = \phi_2\left(\sum_{j=1}^{n} w_{jk} h_j + \theta_k\right) \tag{10} \]
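A forward pass through one hidden layer, following Equations 9 and 10, can be sketched as below. The sigmoid is assumed for both activation functions φ1 and φ2 (the text only says the function is "usually of sigmoid shape"), and the weights and biases are placeholder values rather than trained parameters.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def layer(inputs, weights, biases):
    """weights[j][i] connects input i to node j; biases[j] plays the role of theta_j."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(x, w_hidden, b_hidden, w_out, b_out):
    hidden = layer(x, w_hidden, b_hidden)   # Equation 9
    return layer(hidden, w_out, b_out)      # Equation 10

# Two inputs, two hidden nodes, one output node; all-zero parameters as a sanity check.
zero_output = forward([1.0, 2.0], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], [[0.0, 0.0]], [0.0])
other_output = forward([1.0, 2.0], [[0.5, -0.25], [1.0, 1.0]], [0.0, -1.0], [[2.0, -1.0]], [0.1])
```

With all parameters at zero every node outputs sigmoid(0) = 0.5, which makes a convenient check that the wiring is correct before training.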
Algorithm 4 Pseudocode for MLP
for k = 1 to m do
    Find the output y_i = f_i
end for
for each neuron i with i = n + 1 to NN do
    Use individual X_i to connect with i
    Get vector Z from weight W_i for neuron i
    Get the synaptic weights s for neuron i from W_i
    Get the bias b for neuron i from weight W_i
    Get the transfer function index t for neuron i from W_i
    Calculate the output of neuron i as y_i = f_t(Σ_{j=1}^{i} s_j · z_j · y_i + b)
end for
for k = NN − m to NN do
    Calculate the NN output with y_{i − (NN − m + 1) + 1} = y_i
end for
accurately observed in the respective last iteration. AdaBoost is utilized to minimize
the exponential loss, as defined below in Equation 11.
\[ F_T(x) = \sum_{t=1}^{T} f_t(x) \tag{11} \]
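The additive model in Equation 11 can be illustrated with a compact AdaBoost sketch in which each f_t is a weighted decision stump. The choice of one-dimensional threshold stumps as weak learners, the clamping of the error to avoid division by zero, and the toy data are all assumptions of this sketch; the text does not say which weak learner CA-SHR boosts.

```python
from math import exp, log

def train_adaboost(xs, ys, rounds):
    """xs: 1-D feature values, ys: labels in {-1, +1}. Returns [(alpha, threshold, sign)]."""
    n = len(xs)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        best = None
        # Weak learner: the threshold stump with the lowest weighted error.
        for threshold in sorted(set(xs)):
            for sign in (+1, -1):
                preds = [sign if x >= threshold else -sign for x in xs]
                error = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
                if best is None or error < best[0]:
                    best = (error, threshold, sign, preds)
        error, threshold, sign, preds = best
        error = min(max(error, 1e-10), 1 - 1e-10)   # keep the logarithm finite
        alpha = 0.5 * log((1 - error) / error)
        model.append((alpha, threshold, sign))
        # Re-weight: misclassified examples gain weight, then normalize.
        weights = [w * exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def predict(model, x):
    """Sign of the weighted sum of stump votes, i.e. sign(F_T(x))."""
    score = sum(a * (s if x >= t else -s) for a, t, s in model)
    return 1 if score >= 0 else -1

xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]   # hypothetical feature values
ys = [-1, -1, -1, 1, 1, 1]
model = train_adaboost(xs, ys, rounds=3)
```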
Algorithm 5 AdaBoost Working Algorithm
Given: (a_1, b_1), ..., (a_n, b_n), where a_i ∈ A and b_i ∈ {−1, +1}
Initialize: F_1(i) = 1/n for i = 1, ..., n
for j = 1, ..., J do
    Train a weak classifier using distribution F_j
    Get a weak hypothesis h_j : A → {−1, +1}
    Goal: pick h_j with low weighted error ε_j = Pr_{i∼F_j}[h_j(a_i) ≠ b_i]
    Choose α_j = (1/2) ln((1 − ε_j) / ε_j)
    Update for i = 1, ..., n: F_{j+1}(i) = F_j(i) exp(−α_j b_i h_j(a_i)) / Z_j,
        where Z_j is the normalization factor
end for
Output the final hypothesis: H(a) = sign(Σ_{j=1}^{J} α_j h_j(a))

the individuals i.e., healthy, mild cognitive disorder, or suffering from dementia.
Details of our setup and results are described below.
4.1. Dataset
We use the publicly available dataset Cognitive Assessment Activity (Kyoto) from the
Centre for Advanced Studies in Adaptive Systems (CASAS) [65]
(http://casas.wsu.edu/datasets/assessmentdata.zip). To the best of our knowledge, this is
the only dataset that can be used for automated cognitive health assessment of healthy,
MCI, and dementia individuals using sensor stream data collected from simple daily life
activities (SADLs) and complex instrumental activities of daily living (IADLs) in a smart
home. The dataset contains passive and automatic sensing data collected from 400
participants. The participants' interaction with the smart home is recorded with the help
of binary, digital, and analog sensors, including motion, door, light, temperature, and
burner sensors. The dataset contains instances of both simple and complex daily life
activities. Simple activities are defined as those that are performed in the daily routine
and are not interwoven with other activities (e.g., doing dishes, medicine intake), while
complex activities represent interwoven activities (e.g., examining a bus schedule,
planning a picnic). The dataset contains 24 daily life activities, of which the first 8
represent the simple daily life tasks [17] and the last 8 represent the complex
interwoven daily life tasks [14]. Activities 9 to 16 are also daily life activities but
have no labels. Since our focus is on classification using ground truth information,
these activities are not considered in the current study.
Let H = {H_1, H_2, H_3} represent the health status of individuals, i.e., healthy,
dementia, or MCI. Let A = {A_1, A_2, ..., A_x} be the set of ADL, and let
S = {s_1, s_2, ..., s_n} represent the sensors. For each activity A_i, the dataset
contains the total number of times a sensor s_j is triggered for a particular individual.
Thus each tuple T_i in our dataset can be represented as
T_i = (t_{s_1}, t_{s_2}, ..., t_{s_n}, A_i, H_i), where H_i is the response variable.
Tables 3 and 4 summarize the main features of the dataset used in this work.
Table 3: Characteristics of the Dataset [17].
Dataset                   [17]
Number of Participants    79
Mean Age                  66
Total healthy             65
Total dementia            14
Total Activities          8
Activities                “Sweeping the kitchen and dusting the living room.”
                          “Obtaining medicine containers and a weekly medicine dispenser,
                          filling the dispenser according to the directions.”

Table 4: Characteristics of the Dataset [14].
Dataset                   [14]
Number of Participants    179
Total healthy             145
Total MCI                 32
Total dementia            2
Total Activities          8
Complex Interwoven        Magazine: “Choose a magazine from the coffee table to read on
Activities                the bus ride.”
                          Heating pad: “Microwave for 3 minutes a heating pad located in
                          the kitchen cupboard to take on the bus.”
                          Medication: “Right before leaving, take motion sickness medicine
                          found in the kitchen cabinet.”
                          Bus map: “Plan a bus route using a provided map, determine the
                          time that will be needed for the trip and calculate when to leave
                          the house to make the bus.”
                          Change: “Gather correct change for the bus.”
                          Recipe: “Find a recipe for spaghetti sauce in the recipe book and
                          collect ingredients to make the sauce with your friend.”
                          Picnic basket: “Pack all of the items in a picnic basket located
                          in the closet.”

CASAS [65]. We use Accuracy, Precision, Recall, F-score, and the area under the ROC curve
(AUC) as evaluation measures. Accuracy is the overall proportion of correctly classified
instances and is given as (TP + TN) / (TP + TN + FP + FN), where TP (true positives) is
the number of instances predicted positive (healthy) that are actually positive (healthy),
TN (true negatives) is the number of instances predicted negative (dementia) that are
actually negative (dementia), FP (false positives) is the number of instances predicted
positive (healthy) that are actually negative (dementia), and FN (false negatives) is the
number of instances predicted negative (dementia) that are actually positive (healthy).
Precision and Recall are calculated as TP / (TP + FP) and TP / (TP + FN), respectively.
F-score is the harmonic mean of Precision and Recall, i.e.,
2 × Precision × Recall / (Precision + Recall). AUC measures how well the classes can be
distinguished. Next, we explain the results of applying our CA-SHR on the Simple
Activities Dataset and the Complex Activities Dataset.
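The evaluation measures defined above can be computed directly from the four confusion-matrix counts. The sketch below uses made-up counts, with healthy as the positive class as in the text; AUC is omitted because it needs ranked scores rather than counts.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }

# Hypothetical counts: 8 healthy found, 5 dementia found, 2 false alarms, 1 miss.
metrics = classification_metrics(tp=8, tn=5, fp=2, fn=1)
```

With imbalanced classes, accuracy can stay high while the minority-class F-score is poor, which is why the per-class F-score is reported alongside accuracy in the tables that follow.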
selection methods when compared using Accuracy, AUC, Recall, and F-score.
Table 5, Figure 5, and Figure 6 show the results of the Voting algorithm for healthy
and dementia individuals using simple daily life activities. We achieve a 30% higher
F-score for dementia individuals using NB, 26% higher using J48, 28% higher using NB,
and 32% higher using SMO, and the highest gain of 34% in the F-score of dementia
individuals using Ensemble AdaBoost. CA-SHR attains the highest accuracy of 95% using
Ensemble AdaBoost, as shown in Table 5 and Figure 6.
Table 5: Comparison of Voting classifier by combining the prediction from individual classifier for
simple daily life activities [17]. Key: CA-SHR - Cognitive Assessment of Smart Home Resident,
NB - Naive Bayes Classifier, J48 - Decision Tree J48, SMO - Sequential Minimal Optimization,
MLP - Multilayer Perceptron.

Algorithm          AUC     F-score (Class-CH)   F-score (Class-Dem)   Accuracy (%)
NB                 0.73    0.93                 0.60                  88.63
CA-SHR NB          0.913   0.909                0.917                 91.31
J48                0.80    0.91                 0.64                  86.07
CA-SHR J48         0.912   0.91                 0.92                  91.20
MLP                0.77    0.91                 0.62                  86.07
CA-SHR MLP         0.914   0.91                 0.92                  91.40
SMO                0.73    0.93                 0.60                  88.60
CA-SHR SMO         0.91    0.91                 0.92                  91.19
CA-SHR Adaboost    0.98    0.95                 0.95                  95.0
Figure 5: F-score comparison of the Voting algorithm by combining the prediction from individual
classifier
Figure 6: Accuracy Comparison of the Voting Algorithm by Combining the Prediction
from Individual Classifier
Figure 7: Confusion Matrix of the Classification of Healthy and Dementia Individuals using En-
semble Adaboost
Table 6, Figure 8, and Figure 9 show the results of the Voting algorithm for
healthy and dementia individuals using simple life activities when the
classification results from each algorithm are averaged. For dementia
individuals, we achieve a 29% higher F-score using NB, a 30% higher F-score
using J48, a 31% higher F-score using NB, a 32% higher F-score using SMO, and
the highest gain, 35%, using Ensemble Adaboost. The model reaches its highest
Accuracy of 96.02% using Ensemble Adaboost, as shown in Table 6 and Figure 9.
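Averaging, as opposed to counting votes, can be sketched as taking the mean of the per-classifier probabilities for one class and thresholding it. This is a minimal illustration under that assumption. The function name, the 0.5 threshold and the example probabilities are hypothetical, and the averaging actually performed in CA-SHR may be defined differently.

```python
def average_vote(prob_dementia, threshold=0.5):
    """Average per-classifier probabilities of the 'dementia' class.

    Returns the predicted label and the mean probability. The threshold
    is an assumed default, not a value taken from the paper.
    """
    mean_p = sum(prob_dementia) / len(prob_dementia)
    label = "dementia" if mean_p >= threshold else "healthy"
    return label, mean_p


# Hypothetical probabilities from four classifiers for one participant.
label, mean_p = average_vote([0.9, 0.4, 0.7, 0.6])
print(label, round(mean_p, 2))
```

Unlike a majority vote, averaging lets one confident classifier outweigh several weakly dissenting ones.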
Table 6: Comparison of Voting classifier by averaging results from individual activity for simple
daily life activities [17]. Key: CA-SHR - Cognitive Assessment of Smart Home Resident, NB -
Naive Bayes Classifier, SMO - Sequential Minimal Optimization, MLP - Multilayer Perceptron.
Algorithm          AUC      F-score                 Accuracy (%)
                            Class-CH    Class-Dem
NB                 0.73     0.93        0.64        88.60
CA-SHR NB          0.94     0.94        0.94        94.37
J48                0.80     0.91        0.64        86.07
CA-SHR J48         0.94     0.94        0.95        94.31
MLP                0.86     0.93        0.64        83.54
CA-SHR MLP         0.94     0.94        0.94        94.13
SMO                0.73     0.93        0.60        88.60
CA-SHR SMO         0.94     0.95        0.95        94.86
Figure 8: F-score comparison of the Voting algorithm by Averaging the results from
individual classifier.
Figure 9: Accuracy Comparison of the Voting Algorithm by Averaging the Results from Individual
Classifier
4.4. Complex Interwoven Activities Dataset
For the Complex Interwoven Activities Dataset shown in Table 4, we use PCA to
select a subset of features that may represent the key features in the dataset.
Table 7 and Figure 10 show the classification performance for MCI individuals
and healthy individuals. CA-SHR achieves a 28% higher F-score for dementia
individuals using SMO and a 32% higher F-score for dementia and healthy
individuals using Bagged SMO.
Table 7: Comparison of the machine learning classifiers for MCI and healthy using complex in-
terwoven activities. Key: PCA - Principal Component Analysis, IG - Information Gain, CA-SHR
- Cognitive Assessment of Smart Home Resident, NB - Naive Bayes Classifier, SMO - Sequential
Minimal Optimization.
Algorithm                           F-score             AUC
                                    healthy    MCI
PCA + Cost Sensitive SMO            0.37       0.78     0.62
CA-SHR SMO                          0.92       0.85     0.88
Under sampling + Bagged SMO         0.34       0.42     0.60
CA-SHR OverSampling+Bagged SMO      0.90       0.91     0.91
The graphical representation of the confusion matrix in Figure 11 depicts the
percentage of accurately classified health status relative to inaccurately
classified health status using the Bagged SMO classifier. 12.4% of ‘healthy’
instances are misclassified as ‘MCI’ and 3.4% of ‘MCI’ instances are
misclassified as ‘healthy’. Overall, Bagged SMO provides promising performance.
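The percentages quoted from a confusion matrix are row-normalized: each count is divided by the number of instances of the actual class. The sketch below shows that calculation. The function name and the toy labels are hypothetical and do not reproduce the paper's data.

```python
from collections import defaultdict


def misclassification_rates(actual, predicted):
    """For each actual class, return the percentage assigned to each wrong class."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    rates = {}
    for a, row in counts.items():
        total = sum(row.values())
        # Keep only off-diagonal cells, expressed as a share of the actual class.
        rates[a] = {p: 100.0 * n / total for p, n in row.items() if p != a}
    return rates


# Hypothetical labels: 8 healthy and 4 MCI participants.
actual = ["healthy"] * 8 + ["MCI"] * 4
predicted = ["healthy"] * 7 + ["MCI"] + ["MCI"] * 3 + ["healthy"]
print(misclassification_rates(actual, predicted))
```

With these toy labels, 12.5% of the healthy instances are assigned to MCI and 25% of the MCI instances are assigned to healthy.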
Figure 11: Confusion Matrix of the Classification of Interwoven Activities of MCI
and Healthy Individuals using Bagged SMO
Table 8: Comparison of the Machine Learning Classifiers for Dementia and Healthy
using Complex Interwoven Activities. Key: CA-SHR - Cognitive Assessment of Smart
Home Resident, SMO - Sequential Minimal Optimization.
healthy Dem.
Missing Values+ SMO           0.93    0.99    0.94
CA-SHR Missing Values+ SMO    0.98    1.00    0.98
Figure 12: F-score Comparison of the Ensemble Voting Averaging Algorithm.
sified as ‘dementia’ and 1.0% instances of ‘dementia’ are not classified correctly
as ‘healthy’.
Figure 13: Confusion Matrix of the Classification of Dementia and Healthy Individuals of Inter-
woven Activities using SMO
4.5. Supervised Classification of Task Quality
Task quality is a measure of how well a task is performed. We use machine
learning classifiers for task quality classification. The observation-based
Accuracy score was selected for comparison with the automated scores because
this ground truth is provided by the neuropsychologist. We split the
quality-evaluated results into two major sub-classes using uniform-frequency
binning. Table 9 and Figure 14 show the task quality classification evaluation
and the comparison with [14]. Neuropsychologists assigned an observation-based
Accuracy score to the tasks. The best result is an Accuracy of 88% using the
voting algorithm.
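Uniform-frequency (equal-frequency) binning into two classes amounts to splitting the scores at the median rank, so that both classes contain about the same number of tasks. The sketch below is a minimal illustration. The function name and the example scores are hypothetical, and the assumption that a higher score means better quality is ours and may not match the neuropsychologist's scale.

```python
def equal_frequency_two_bins(scores):
    """Label each score 'Bad' or 'Good' so both classes are about equal in size.

    Assumption: higher scores indicate better task quality. Tied scores
    near the cut point may fall into different bins.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    half = len(scores) // 2
    labels = [""] * len(scores)
    for rank, index in enumerate(order):
        labels[index] = "Bad" if rank < half else "Good"
    return labels


# Hypothetical observation-based scores for four tasks.
print(equal_frequency_two_bins([3, 9, 5, 7]))
```

Because the cut point follows the data rather than a fixed threshold, the two resulting classes stay balanced, which simplifies the subsequent binary classification.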
Table 9: Comparison of the machine learning classifiers for the classification of
Task Quality scores. Key: CA-SHR - Cognitive Assessment of Smart Home Resident,
NB - Naive Bayes Classifier, SVM - Support Vector Machine, MLP - Multilayer
Perceptron.
Good Bad
SMO                 80.45    0.84    0.76     0.85
CA-SHR Vote SMO     87       0.86    0.869    0.92
MLP                 79.33    0.82    0.74     0.85
CA-SHR Vote MLP     88.1     0.88    0.88     0.93
NB                  82.13    0.85    0.78     0.88
Figure 15: Confusion Matrix of the Classification of Task Quality as Good or Bad
of all Individuals using MLP
centage of accurately classified task quality status relative to inaccurately
classified task quality using the MLP classifier. 11.1% of ‘Good’ instances are
misclassified as ‘Bad’ and 12.7% of ‘Bad’ instances are misclassified as ‘Good’.
4.6. Duration Analysis
We carried out a comprehensive analysis of the features of each dataset. For the
simple daily activities, performed by individuals belonging to two classes
(cognitively healthy and dementia), the duration feature helped to improve the
performance of the model. Adding the duration feature improves the overall
Accuracy by ∼1.5% when all the activities are classified using voting. Our
classifier also shows improved results when cognitively healthy and dementia
individuals are classified using each activity alone. For the complex interwoven
activities, we observed that the temporal feature does not help to classify
individuals: rather than improving the identification performance, it decreases
it. The interwoven dataset encourages multitasking. For example, a participant
may start a kitchen activity and, at the same time, start writing a card. In
this case, the total duration of each activity is longer than it would be if the
activity were performed separately, which is why the duration is not useful for
interwoven activities. We removed this feature from the interwoven activities
dataset for better performance. For task quality assessment, duration gives
useful information about the time spent on a particular task by individuals of
each class. It provides distinctive information about each task executed.
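The effect of multitasking on duration can be illustrated by comparing the elapsed span of an activity (first start to last end) with the time actually spent on it. The sketch below assumes a hypothetical log of (activity, start, end) segments in seconds. It is not the feature extraction used in CA-SHR, and the names and values are illustrative only.

```python
def span_and_active_time(segments):
    """Per activity, return the elapsed span and the summed active time.

    `segments` is a hypothetical log of (activity, start_s, end_s) intervals.
    """
    bounds = {}
    for activity, start, end in segments:
        first, last, active = bounds.get(activity, (start, end, 0))
        bounds[activity] = (min(first, start), max(last, end), active + (end - start))
    return {
        activity: {"span": last - first, "active": active}
        for activity, (first, last, active) in bounds.items()
    }


# Hypothetical interleaving: cooking is paused while a card is written.
log = [("cook", 0, 60), ("card", 60, 150), ("cook", 150, 200)]
print(span_and_active_time(log))
```

In this toy log the cooking activity spans 200 seconds although only 110 seconds are spent on it, so a duration measured from start to end reflects the interleaving rather than the participant's ability.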
5. Discussion
The analysis of CA-SHR on both the simple and complex activities of daily life
shows improved performance. For the simple activities dataset, the F-score and
AUC of the minority class were very low in the state of the art. In our work, we
obtain better results in both cases and a clear identification of the minority
class (dementia). To improve the classification performance on interwoven
complex activities, we analyze the performance of feature selection and class
imbalance techniques. We found that CA-SHR performs better than earlier methods.
We obtained about 50% higher Accuracy than the state-of-the-art studies, as
shown in Table 7. Our results are 40% better, as shown in Table 8. We show that
the duration feature has a clear effect on the performance of the machine
learning model. We analyzed the effect of this feature for the simple life
activities dataset as well as the complex interwoven activities dataset. We
found that it increases the performance of the model when individuals perform
simple daily life activities, as shown in Tables 5 and 6, and decreases the
performance when it is used to classify individuals using complex interwoven
activities, as depicted in Tables 7, 8 and 9.
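As an illustration of a class imbalance technique, the sketch below balances a dataset by randomly duplicating minority-class samples. It is a simplified stand-in chosen for brevity and is not claimed to be the oversampling method used in CA-SHR. The function name, the seed and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Draw extra copies with replacement from the class's own samples.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_samples.extend(group + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels


# Hypothetical one-feature dataset with four healthy and one MCI participant.
X = [[1.0], [1.2], [0.9], [1.1], [3.0]]
y = ["healthy"] * 4 + ["MCI"]
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Oversampling should be applied to the training portion only, so that duplicated samples do not leak into the data used for evaluation.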
6. Conclusion and Future Work
Smart homes are extensively used to recognize and analyze activities. Our goal
is to assess the quality of the activities performed in a smart home to promote
sustainability. This work focused on improving the classification performance of
Simple Daily Life Activities as well as Complex Interwoven Activities.
Furthermore, we cross-verified the automatic evaluation of task quality against
direct observation scores assigned by a specialized neuropsychologist. We
achieved the best Accuracy of 96.02% and 99.6% in comparison with existing
approaches. We also analyzed which activities were difficult for individuals
with dementia. Furthermore, we provided a temporal feature analysis to estimate
whether temporal features help to detect impaired individuals effectively. This
work is an important step toward analyzing the functional health of individuals
in their home environment. It improved the reliability of the assessment
approach so that it could be applied in the real world. In the future, we intend
to build an automated framework that assesses human activities in real time.
This framework will automatically assign scores to the activities performed in
the smart home in real time.
References
[1] H. K. Mukendi, M. Adonis, Smart homes and sustainable cities: The design
of a low-cost solution for comprehensive home automation, in: Sustainable
Cities-Authenticity, Ambition and Dream, IntechOpen, 2018.
Internet of Things (iThings) and IEEE Green Computing and Communica-
tions (GreenCom) and IEEE Cyber, Physical and Social Computing (CP-
SCom) and IEEE Smart Data (SmartData), 2019, pp. 929–936.
[6] A. O. Michalec, E. Hayes, J. Longhurst, Building smart cities, the just way.
a critical review of “smart” and “just” initiatives in bristol, uk, Sustainable
Cities and Society 47 (2019) 101510.
environments for older people, Frontiers in psychology 7 (2016) 1329.
Hash polynomial two factor decision tree using iot for smart health care
scheduling, Expert Systems with Applications 141 (2020) 112924.
[9] F. Liu, Research on smart care system for elder sojourners, in: International
Conference on Human-Computer Interaction, Springer, 2020, pp. 113–127.
[10] M. Allen, et al., Big healthcare data analytics in internet of medical things,
American Journal of Medical Research 7 (1) (2020) 48–54.
[11] World’s population increasingly urban with more than half living in urban
areas,
http://www.un.org/en/development/desa/news/population/world-urbanization-prospects-2014.html,
accessed: 2020-04-12 (2020).
[16] P. N. Dawadi, D. J. Cook, M. Schmitter-Edgecombe, C. Parsey, Automated
assessment of cognitive health using smart home technologies, Technology
and health care 21 (4) (2013) 323–343.
vastava, Hybrid genetic algorithm and a fuzzy logic classifier for heart
disease diagnosis, Evolutionary Intelligence (2019) 1–12.
[19] G. T. Reddy, N. Khare, P. K. R. M. Sweta Bhattacharya, Saurabh Singh,
G. Srivastava, Deep neural networks to predict diabetic retinopathy. j ambient
intell human comput (2020)., International Journal of Intelligent Engineering
and Systems (2020).
source data using ambient application, in: 2019 22nd International Multi-
topic Conference (INMIC), IEEE, 2019, pp. 1–6.
[26] K. Gotkin, When computers were amateur, IEEE Annals of the History of
Computing 36 (2) (2014) 4–14.
[27] J. Abbate, Getting small: a short history of the personal computer, Proceed-
ings of the IEEE 87 (9) (1999) 1695–1698.
[28] B. Koyuncu, Pc remote control of appliances by using telephone lines, IEEE
Transactions on Consumer Electronics 41 (1) (1995) 201–209.
[29] J. J. Greichen, Value based home automation for todays’ market, IEEE
Transactions on Consumer Electronics 38 (3) (1992) XXXIV–XXXVIII.
[30] O. D. Lara, M. A. Labrador, A survey on human activity recognition using
wearable sensors, IEEE communications surveys & tutorials 15 (3) (2012)
1192–1209.
[31] M. Shoaib, S. Bosch, O. D. Incel, H. Scholten, P. J. Havinga, A survey of on-
line activity recognition using mobile phones, Sensors 15 (1) (2015) 2059–
2085.
[32] U. Maurer, A. Smailagic, D. P. Siewiorek, M. Deisher, Activity recognition
and monitoring using multiple sensors on different body positions, in: In-
ternational Workshop on Wearable and Implantable Body Sensor Networks
of the IEEE Engineering in Medicine and Biology Society, IEEE, 2008, pp.
4451–4454.
[34] E. M. Tapia, S. S. Intille, W. Haskell, K. Larson, J. Wright, A. King,
R. Friedman, Real-time recognition of physical activities and their intensities
using wireless accelerometers and a heart rate monitor, in: 2007 11th IEEE
international symposium on wearable computers, IEEE, 2007, pp. 37–40.
[37] D. Riboni, C. Bettini, Cosar: hybrid reasoning for context-aware activity
recognition, Personal and Ubiquitous Computing 15 (3) (2011) 271–289.
on Wearable Computers (ISWC) 2010, IEEE, 2010, pp. 1–8.
pattern clustering applied to temporal ann algorithm, Sensors 15 (5) (2015)
11953–11971.
[41] F. Cicirelli, G. Fortino, A. Giordano, A. Guerrieri, G. Spezzano, A. Vinci,
On the design of smart homes: A framework for activity recognition in home
environment, Journal of medical systems 40 (9) (2016) 200.
[42] H. D. Mehr, H. Polat, A. Cetin, Resident activity recognition in smart homes
by using artificial neural networks, in: 2016 4th international istanbul smart
grid congress and fair (ICSG), IEEE, 2016, pp. 1–5.
[47] M. Philipose, K. P. Fishkin, M. Perkowitz, D. J. Patterson, D. Fox, H. Kautz,
D. Hahnel, Inferring activities from interactions with objects, IEEE perva-
sive computing 3 (4) (2004) 50–57.
regression based activity recognition and classification, in: International
Conference on Information Communication and Embedded Systems (ICI-
CES2014), IEEE, 2014, pp. 1–6.
[50] D. Trabelsi, S. Mohammed, F. Chamroukhi, L. Oukhellou, Y. Amirat, An
unsupervised approach for automatic activity recognition based on hidden
markov model regression, IEEE Transactions on automation science and
engineering 10 (3) (2013) 829–835.
[56] E. Kim, S. Helal, D. Cook, Human activity recognition and pattern discovery,
IEEE pervasive computing 9 (1) (2009) 48–53.
74–82.
rics and intelligent laboratory systems 2 (1-3) (1987) 37–52.
minority over-sampling technique, Journal of artificial intelligence research
16 (2002) 321–357.
[63] I. Rish, et al., An empirical study of the naive bayes classifier, in: IJCAI
Explanation-of-SMO-Parameters-td21768.html, accessed:
2020-10-02 (2020).
[66] J. Platt, Sequential minimal optimization: A fast algorithm for training sup-
port vector machines (1998).
[67] M. Graczyk, T. Lasota, B. Trawiński, K. Trawiński, Comparison of bagging,
boosting and stacking ensembles applied to real estate appraisal, in: Asian
conference on intelligent information and database systems, Springer, 2010,
pp. 340–350.