ClusterNN: A Hybrid Classification Approach to Mobile
Activity Recognition
Sulaimon Bashir, Daniel Doolan, Andrei Petrovski
Robert Gordon University, Aberdeen, UK
s.a.bashir@rgu.ac.uk, d.c.doolan@rgu.ac.uk, a.petrovski@rgu.ac.uk
ABSTRACT
Mobile activity recognition from sensor data relies on supervised learning algorithms, and many algorithms have been proposed for this task. One such algorithm is the K-nearest neighbour (KNN) algorithm. However, because KNN is an instance-based algorithm, its use in mobile activity recognition has been limited to the evaluation stage: for KNN to work well, all training instances must be kept in memory for similarity measurement against each test instance, which is prohibitive in a mobile environment. We therefore propose an unsupervised learning step that reduces the training set to a proportion of its original size. The approach applies clustering to the dataset to obtain a set of micro-clusters, from which cluster centre, minimum and maximum characteristics are extracted for similarity measurement. These reduced representative sets are used to classify new instances with a nearest neighbour step on the mobile phone. Experimental evaluation of the proposed approach on a real mobile activity recognition dataset shows improved results over the basic KNN algorithm.

Categories and Subject Descriptors
H.5.2 [User/Machine Systems]; I.5 [Pattern Recognition]: Metrics: Percentage Accuracy

General Terms
Algorithms, Experimentation, Performance

Keywords
Activity Recognition, KNN, Smartphones, ClusterNN

1. INTRODUCTION
Mobile activity recognition has become a hot topic in recent years because of its usefulness in a wide range of applications. For example, the Google Android platform now includes an API that enables developers to use pre-built models to recognise user activities. The activities recognised include in-vehicle, on-bicycle, walking, still and running [4]. This facilitates many context-aware applications that let a device react to user activity: a device can be configured to increase the screen font size while the user is walking, so that the screen remains easy to read, or to switch itself to silent mode when the user is driving. Activity recognition is also useful for fitness and health monitoring [8], social networking [10] and commercial applications such as activity-based advertising [11].

Activity recognition is a classification task in which labelled data are used to train a classification algorithm to induce a model that recognises new unlabelled data. There are two approaches to model induction in mobile activity recognition [14]. The first, offline training, collects sample data from subjects who perform the designated activities; the collected data is used to induce a model on a remote system, off the mobile device, and the induced model is later deployed into the application for recognition. The second, online training, induces the model directly on the device using the user's self-annotated data. The advantage of the second approach is that it facilitates online and incremental learning, allowing the model to adapt to changes in the environment.

Many studies have evaluated different algorithms in both online and offline modes, and many of them report KNN to give good accuracy [3][9][13]. Despite this performance, KNN is not used for online recognition on mobile phones, because its instance-based classification requires a large amount of data to be kept in memory. This cost is prohibitive given the resource constraints and real-time response requirements of a mobile device running many applications at once. Hence the need to make KNN amenable to online recognition of activities.

To make KNN amenable to online recognition in mobile activity recognition, we propose a preliminary step that reduces the training instances to a desired percentage that maintains a good representation of the training set and ensures good KNN accuracy in an online activity recognition setting. Evaluation of the proposed framework shows that it performs better than the basic nearest neighbour algorithm.
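The memory cost described above comes from KNN having to retain every training instance for distance comparison at classification time. For reference, that instance-based scheme can be sketched in a few lines of plain Python (the names and toy data are illustrative, not code from this paper):

```python
import math
from collections import Counter

def euclidean(a, b):
    # straight-line distance between two equal-length feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(train, new_x, k=1):
    """train: list of (feature_vector, label) pairs, ALL of which must be
    held in memory -- the cost that makes basic KNN impractical on-device."""
    nearest = sorted(train, key=lambda pair: euclidean(pair[0], new_x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [([0.1, 0.2], "still"), ([0.9, 1.1], "walking"), ([1.0, 0.9], "walking")]
print(knn_classify(train, [0.95, 1.0], k=3))  # prints "walking"
```

The reduction step proposed in this paper keeps this classification rule but shrinks the `train` list that must live in memory.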
The rest of this paper is organised as follows: Section 2 presents related work on using KNN and other algorithms for activity recognition. Section 3 describes the methodology. Section 4 presents the results and their discussion. Section 5 gives the conclusion, and future work is highlighted in the last section of the paper.

2. RELATED WORK
Activity recognition using different sensor modalities and algorithms has been studied extensively. A number of machine learning approaches to activity recognition were reviewed in [12]; a more recent review focusing on mobile phone based activity recognition is presented in [14], identifying many systems that rely on smartphone sensors. A comparative study of classifier algorithms from the Weka machine learning tool [5] was performed in [2] using smartphone accelerometer data. Data collected with the phone in a shirt pocket was used to compare the accuracies of IBK, Naive Bayes, Rotation Forest, VFI, DTNB and LMT, while data collected with the phone held in the palm was used to compare SMO, NNge, ClassificationViaRegression, FT, VFI, IBK and Naive Bayes. Of all the algorithms tested, IBK and IB1 gave the best accuracy on the palm data and VFI the lowest. The KNN algorithm was not run directly on the mobile phone for activity recognition, which illustrates the impracticability of using KNN directly for online activity recognition.

Similarly, the authors of [9][13] have shown the superior accuracy of KNN for mobile activity recognition in offline evaluation scenarios. Kose et al. [6] proposed an improved KNN algorithm for online activity recognition. Their dataset consists of four features (average, minimum, maximum and standard deviation) extracted from the raw readings within each time window of each activity. The algorithm selects k values from the minimum, maximum and average features of each activity's data, together with the per-class standard deviation; the reduced values and their corresponding class tags are employed during the recognition phase. The main drawback of this approach is its feature dependence: it cannot be applied to a dataset whose feature characteristics differ from those it was designed for. Our approach does not have this limitation, as it is applicable to any feature set. Abdallah et al. [1] proposed a cluster-based classification algorithm that clusters the entire dataset into K clusters, K being the number of activities. The clusters are post-processed by removing instances of other classes mixed into each majority cluster, and the cluster centroids are then calculated; four measures computed from each cluster are used to classify new data. However, this algorithm does not treat new activity data as individual instances; instead, a sizeable number of instances are collected, clustered, and the resulting clusters compared to the training clusters using Euclidean distance, density, gravitational force and within-cluster standard deviation. This approach does not segment one activity from another, and the time needed to collect enough samples for meaningful clustering is too high for an online recognition system that must give immediate, real-time feedback on the recognised activity.

3. METHODOLOGY
Our proposed approach to making KNN amenable to online activity recognition employs a data reduction strategy that shrinks the initial training set to a more compact set suitable for in-memory use during online recognition. As shown in Algorithm 1, the method takes the training data and the desired percentage of data to retain as input and produces the Model Data (MD): the set of cluster centroids, minima and maxima obtained after clustering the dataset. In this algorithm, all data samples belonging to each class class_i are clustered (Algorithm 1, line 3) by applying a clustering technique to the data. Possible clustering algorithms include k-Means, DBScan, EM and a host of others; for simplicity, we employ Bisecting K-Means in the present work. After the clustering step, the cluster centres obtained are given the label of the current class. In addition, we extract the minimum and maximum data points from each cluster returned for the current class (Algorithm 1, lines 4-5). The number of clusters created per class is proportional to the number of samples in the class and the retention percentage given to the algorithm. This step is repeated for each class in the data. Finally, the set of characteristic features (centroids, minimum list and maximum list obtained from the clusters of each class) and their associated labels is returned. These represent the Model Data (MD) to be deployed for online recognition on a mobile phone. The key feature of the model is that it is more compact and has lower memory and time overheads than ordinary KNN. Moreover, the reduced compact set, including the centroids, can be adapted to an evolving sensory stream as unanticipated changes occur in the input data distribution.

Algorithm 1: Offline Training
Input: Cn, the number of classes in the dataset; Kn, the percentage of data to retain in each class as examples that serve as cluster centroids
Data: D = {(x_i, y_i)}, x_i in R^n, y_i in R^1, the set of training examples
Result: MD, the centroid features
1  foreach class_i in Cn do
2      data_i = Data[class_i]   // retrieve all samples in class i
3      centroids_i, clusterAssignment = Clustering(data_i, Kn)
4      minimumList.append(min(Data[clusterAssignment]))
5      maximumList.append(max(Data[clusterAssignment]))
6      MD = [centroids, minimumList, maximumList, class_i]
7  end
8  return MD

During the online phase, a new instance is classified by passing it and the MD to the Nearest-Neighbour routine.
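The offline reduction and the online nearest-neighbour step it feeds can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: plain k-means stands in for the bisecting k-means the paper uses, and all identifiers (`offline_training`, `classify`, the toy data) are invented for the sketch.

```python
import math
import random
from collections import Counter

def euclidean(a, b):
    # Euclidean distance between two equal-length feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20, seed=0):
    # Plain k-means, used here for brevity where the paper uses bisecting k-means.
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: euclidean(p, centroids[c]))
            clusters[i].append(p)
        # recompute each centroid as the mean of its members (keep it if empty)
        centroids = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

def offline_training(data, retain=0.5):
    """Algorithm 1 sketch: per class, cluster the samples and keep the
    centroid, per-cluster minimum and per-cluster maximum as Model Data."""
    md = []  # list of ({characteristic_name: vector}, label)
    by_class = {}
    for x, y in data:
        by_class.setdefault(y, []).append(x)
    for label, points in by_class.items():
        # number of clusters proportional to class size and retention percentage
        k = max(1, int(len(points) * retain))
        centroids, clusters = kmeans(points, k)
        for centre, members in zip(centroids, clusters):
            if not members:
                continue
            lo = [min(col) for col in zip(*members)]
            hi = [max(col) for col in zip(*members)]
            md.append(({"centroid": centre, "minimum": lo, "maximum": hi}, label))
    return md

def classify(md, new_x, k=1):
    """Algorithm 2 sketch: nearest neighbour per characteristic, then a
    majority vote across the characteristics' predictions."""
    votes = Counter()
    for name in ("centroid", "minimum", "maximum"):
        pool = [(chars[name], label) for chars, label in md]
        nearest = sorted(pool, key=lambda p: euclidean(p[0], new_x))[:k]
        votes[Counter(l for _, l in nearest).most_common(1)[0][0]] += 1
    return votes.most_common(1)[0][0]

data = [([0.1, 0.2], "still"), ([0.12, 0.18], "still"),
        ([0.9, 1.1], "walking"), ([1.0, 0.95], "walking")]
md = offline_training(data, retain=0.5)
print(classify(md, [0.95, 1.0], k=1))  # prints "walking"
```

With `retain=0.5` one cluster summary is kept per two training samples, so classifying with the centroid characteristic alone halves the in-memory reference set relative to basic KNN, which is the reduction the method targets.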
The Nearest-Neighbour routine employs Euclidean distance to compute the K nearest neighbours of the new instance and assigns it the majority label of those K points (Algorithm 2). Since the MD contains more than one cluster characteristic, each is considered separately and majority voting is performed on the outcome of each comparison; the final class given to the new instance is the majority label returned across them. We also evaluate the accuracy of using each cluster characteristic individually for the classification decision.

Algorithm 2: Online Classification
Input: x_new, a new unlabelled instance
Data: MD, the compressed training set with characteristic features
Result: y_new, the predicted label
1  foreach clusterCharacteristics_i in MD do
2      Prediction_i = NearestNeighbour(clusterCharacteristics_i, x_new, K)
3  end
4  y_new = argmax_c(Prediction_1 ... Prediction_C)
5  return y_new

3.1 Experiments
In this section we describe the experiments conducted to evaluate the applicability and accuracy of the ClusterNN algorithm described above. The dataset used in the experiments was the WISDM dataset, released to the public for smartphone-based activity recognition evaluation. The WISDM activity recognition dataset [7] is obtained from mobile phone accelerometers. The data were collected from 32 users performing six designated activities: walking, jogging, ascending stairs, descending stairs, sitting and standing. Each data sample is represented by 43 features, obtained by transforming 200 raw samples recorded from the tri-axial accelerometer of a mobile phone; each batch of 200 samples covers a 10 second window at a sampling frequency of 20 Hz. The features are basic statistical features described in [7]. The total number of samples in the dataset and their distribution across the six activities is shown in Table 1. In carrying out the experiments we followed the hold-out evaluation strategy: the entire dataset is divided into a training set and a test set in an 80/20 split, with the split kept proportionate to the number of instances of each class. The same configuration is used to evaluate both ClusterNN and the benchmark algorithm.

4. RESULTS AND DISCUSSION
The accuracy of the proposed approach to classification for mobile activity recognition is presented here. Table 2 summarises the results for the different measures employed in the nearest neighbour step to classify new instances, using the three characteristics extracted from all the clusters obtained from the data samples. As the table indicates, the centroid characteristic gives the overall best accuracy when the number of nearest neighbours is set to 1. The minimum characteristic is second best, followed by the maximum characteristic. When all the characteristics are combined and a majority voting scheme selects the final label after each characteristic has voted, the resulting accuracy exceeds that of both the minimum and maximum characteristics, but not that of the centroid characteristic. This validates the assumption that the centroids are representative of the global dataset. Moreover, the accuracy of basic KNN is lower than that of the centroid characteristic, which justifies using the centroid characteristic with a reduced dataset over KNN with the full dataset. This best accuracy is obtained when the percentage of data retained is set to 50%; we observed that retaining more than 50% of the data gives no further significant accuracy gain.

Table 2: Accuracy of the ClusterNN Algorithm with Different Cluster Characteristics Compared with the KNN Algorithm

K   Combined  Centroid  Maximum  Minimum  KNN
1   79.80     81.46     77.31    77.95    80.90
2   78.23     79.70     77.40    77.58    80.99
3   77.58     79.43     76.11    77.68    79.89
4   77.68     79.24     77.03    77.86    80.07
5   77.58     78.51     76.38    77.49    79.98

The accuracies obtained using the centroid characteristic and the combined characteristics with between 10% and 50% of the data retained are shown in Tables 3 and 4 respectively. The best accuracy is obtained with 50% data retention and K = 1. Using this small sample with the nearest neighbour set to 1 incurs minimal resource overhead during online recognition compared to running the basic KNN algorithm on all the training data.

Table 3: Accuracy of the Centroid Characteristic with Varying Percentage of Data Retained

K   10%    20%    30%    40%    50%
1   76.85  79.80  80.07  80.54  81.46
2   76.01  77.77  78.78  79.80  79.70
3   75.83  78.32  78.78  78.32  79.43
4   76.20  77.31  77.95  78.60  79.24
5   75.37  76.57  77.58  78.04  78.51

Table 4: Accuracy of the Combined Characteristics with Varying Percentage of Data Retained

K   10%    20%    30%    40%    50%
1   73.71  76.85  78.32  79.61  79.80
2   69.74  72.79  76.01  78.14  78.23
3   70.20  73.80  75.74  76.66  77.58
4   70.39  73.43  75.55  76.66  77.68
5   69.46  72.88  75.18  76.38  77.58

Figure 1 shows the relative accuracy of each characteristic with a varying number of nearest neighbours at 50% data retention.

Figure 1: Accuracy of Using Different Cluster Characteristics and the KNN Algorithm

As the figure shows, the accuracy of each measure decreases as K increases; the best accuracy for KNN is obtained at K = 2 and decreases thereafter. The generally low accuracy (below 90%) in all the experiments can be attributed to the nature of the dataset: it contains data from 32 users who vary considerably in how they perform the designated activities, producing large variations between the training and testing data. Nevertheless, the performance of the centroid characteristic is good given that it uses a reduced dataset for online recognition, whereas KNN requires the entire training set to achieve good performance.

5. CONCLUSION
In this paper we identified the drawback of using KNN for online activity recognition on a mobile phone: the need to keep the full training set available at recognition time. We addressed it by proposing a hybrid classification approach based on the concepts of clustering and nearest neighbour. The approach employs the bisecting k-Means algorithm to cluster the training instances into k clusters per class, with the number of clusters per class computed proportionally to the number of samples in that class to keep the instances in each cluster balanced. Evaluation shows that the approach performs better than basic KNN on a realistic mobile activity recognition dataset.

6. FUTURE WORK
The results show that our approach outperforms KNN. Because we tested with a hold-out approach, the reported accuracy reflects performance on entirely unseen data. In future work we will seek to improve the algorithm by employing other clustering techniques, and will consider online stream clustering methods to explore performing the data reduction preprocessing step online. We will also investigate the resource usage of our algorithm on the mobile phone and compare it with KNN to show the benefit of the reduced dataset for resource consumption.

7. REFERENCES
[1] Z. S. Abdallah, M. M. Gaber, B. Srinivasan, and S. Krishnaswamy. CBARS: Cluster based classification for activity recognition systems. In Advanced Machine Learning Technologies and Applications, pages 82-91. Springer, 2012.
[2] M. A. Ayu, S. A. Ismail, A. F. A. Matin, and T. Mantoro. A comparison study of classifier algorithms for mobile-phones accelerometer based activity recognition. Procedia Engineering, 41:224-229, 2012.
[3] S. A. Bashir, D. C. Doolan, and A. Petrovski. The impact of feature vector length on activity recognition accuracy on mobile phone. Lecture Notes in Engineering and Computer Science: Proceedings of The World Congress on Engineering 2015, 1-3 July, 2015, London, U.K., 1:332-337, 2015.
[4] Google. https://developers.google.com/.../activityrecognitionapi, Accessed: 10th August 2015.
[5] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten. The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter, 11(1):10-18, 2009.
[6] M. Kose, O. D. Incel, and C. Ersoy. Online human activity recognition on smart phones. In Workshop on Mobile Sensing: From Smartphones and Wearables to Big Data, pages 11-15, 2012.
[7] J. R. Kwapisz, G. M. Weiss, and S. A. Moore. Activity recognition using cell phone accelerometers. ACM SIGKDD Explorations Newsletter, 12(2):74-82, 2011.
[8] N. D. Lane, M. Mohammod, M. Lin, X. Yang, H. Lu, S. Ali, A. Doryab, E. Berke, T. Choudhury, and A. Campbell. BeWell: A smartphone application to monitor, model and promote wellbeing. In 5th International ICST Conference on Pervasive Computing Technologies for Healthcare, pages 23-26, 2011.
[9] S. L. Lau and K. David. Movement recognition using the accelerometer in smartphones. In Future Network and Mobile Summit, 2010, pages 1-9, June 2010.
[10] E. Miluzzo, N. D. Lane, K. Fodor, R. Peterson, H. Lu, M. Musolesi, S. B. Eisenman, X. Zheng, and A. T. Campbell. Sensing meets mobile social networks: the design, implementation and evaluation of the CenceMe application. In Proceedings of the 6th ACM Conference on Embedded Network Sensor Systems, pages 337-350. ACM, 2008.
[11] K. Partridge and B. Begole. Activity-based advertising: techniques and challenges. In Proceedings of the Workshop on Pervasive Advertising, 2009.
[12] S. J. Preece, J. Y. Goulermas, L. P. Kenney, and D. Howard. A comparison of feature extraction methods for the classification of dynamic activities from accelerometer data. IEEE Transactions on Biomedical Engineering, 56(3):871-879, 2009.
[13] Z. Prekopcsák, S. Soha, T. Henk, and C. Gáspár-Papanek. Activity recognition for personal time management. Springer, 2009.
[14] M. Shoaib, S. Bosch, O. D. Incel, H. Scholten, and P. J. Havinga. A survey of online activity recognition using mobile phones. Sensors, 15(1):2059-2085, 2015.