
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2016.2582723, IEEE Internet of Things Journal

Gender Classification of Walkers Via Underfloor Accelerometer Measurements

Dustin Bales, Pablo Tarazaga, Mary Kasarda, Dhruv Batra, A.G. Woolard, J.D. Poston, V.V.N.S. Malladi

ABSTRACT – The ability to classify the gender of occupants in a building has far-reaching applications including security and retail sales. The authors demonstrate the success of machine learning techniques for gender classification. High-sensitivity accelerometers mounted non-invasively beneath an actual building floor provide the input for these machine learning methods. While other approaches using gait measurements, such as vision systems and wearable sensors, provide the potential for gender classification, they each face limitations. These limitations include an invasion of privacy, occupant compliance, required line of sight, and/or high sensor density. Underfloor mounted accelerometers overcome these limitations. The authors utilize the highly-instrumented Goodwin Hall smart building on the Virginia Tech campus to measure vibrations of the walking surface caused by walkers. In this study, the gait of fifteen individual walkers was recorded as they, alone, walked down the instrumented hallway. Fourteen accelerometers, mounted underneath the walking surface, recorded walking trials with the placement of the sensors unknown to the walker. This work studies Bagged Decision Trees, Boosted Decision Trees, Support Vector Machines (SVMs), and Neural Networks as the machine learning techniques for their ability to classify gender. A ten-fold-cross-validation method is used to comment on the validity of the algorithm's ability to generalize to new walkers. This work demonstrates that a gender classification accuracy of 88% is achievable using the underfloor vibration data from the Virginia Tech Goodwin Hall by using Decision Tree approaches.

Index Terms – Application Platform, Gait Measurement, Goodwin Hall, Smart Buildings, Vibration
_____________________________________________
The authors wish to acknowledge the support as well as the collaborative efforts provided by our sponsors, VTI Instruments, PCB Piezotronics, Inc.; Dytran Instruments, Inc.; and Oregano Systems. The authors are particularly appreciative of the support provided by the College of Engineering at Virginia Tech through Dean Richard Benson and Associate Dean Ed Nelson as well as VT Capital Project Manager, Todd Shelton. The work was conducted through the support of the Virginia Tech Smart Infrastructure Laboratory and its members. Submitted for review on the 2nd of February 2016.
Dustin Bales is in the Mechanical Engineering Department at the Virginia Polytechnic Institute and State University, Blacksburg, VA 24061 USA (email: dbbales@vt.edu). All authors are affiliated with Virginia Tech.
Pablo Tarazaga, Mary Kasarda, A.G. Woolard, J.D. Poston, V.V.N.S. Malladi are also in the Mechanical Engineering Department (respective emails: patarazag@vt.edu, maryk@vt.edu, awool012@vt.edu, poston@vt.edu, sriram@vt.edu).
Dhruv Batra is in the Electrical and Computer Engineering Department (email: dbatra@vt.edu).
Copyright (c) 2012 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org.
2327-4662 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
IoT – 0902 - 2016

I. INTRODUCTION

Building occupant classification from non-invasive gait measurement has the potential to be of prime importance in numerous applications such as improving sales in a retail space, advancing security and threat detection, and improving emergency response. There are various methods of gait measurement (e.g. computer vision, wearable accelerometry, etc.), and many have been used to classify a variety of characteristics of building occupants [1-8]. The novelty of this work includes the use of non-invasive underfloor mounted accelerometers as a method of measuring gait and the application of machine learning techniques to classify the gender of each walker.

Gait, the manner in which humans walk, consists of the interaction between hundreds of muscles and joints in the body [9]. The patterns associated with this movement are similar in humans, but not identical [9]. These differences in gait lead to the assumption that gender can be inferred from its analysis. Nixon and Carter [10] support this idea by suggesting that gender classification is achievable by observation with cameras in conjunction with computer vision techniques due to differences in gender behavior such as larger hip swing for women and shoulder swing for men.

One approach to organizing existing gait measurement methods in the literature is to broadly label an approach as either "visual based" or "non-visual based."

Visual based methods involve some form of computer vision to measure the movement of a walker. These measurements are broken down further into active and passive methods. The active method, known as Structure from Motion, tracks specific points on the body to characterize its motion [11]. Passive methods are not interested in the underlying physical structure of a body's motion and treat each frame of the video as a unique pose made by a walker [12]. Visual based methods have limitations associated with video recording and computer vision processing (e.g., line of sight requirement, large data sets, the ability of an individual to mask their identity, privacy concerns, etc.).

Non-visual methods achieve gait measurement through a variety of techniques including step loggers, accelerometer instrumented vests, mobile force plates, instrumented shoes, walker mounted accelerometers, sound measurement, and sensitive floors [5, 8, 13-21]. These technologies have various limitations that restrict wide adoption. For example, they may require the active participation of a walker (e.g., wearing a sensor), may have significant potential for the violation of privacy (e.g., recording of conversations), and/or may require a system pre-installation that is very expensive and/or difficult. Another subcategory under non-visual methods is ground measurement. Pan et al. [1, 22] and Sabatier and Ekimov [23, 24] measure the vertical velocity (using geophones) or the vertical acceleration (using accelerometers) of the ground caused by footsteps. These works mount sensors on the walking surface where they can become a trip hazard, and each sensor could be destroyed or evaded because of its obvious location. The underfloor mounted accelerometers in this work have locations that are unknown to building occupants. Installing underfloor accelerometers in new buildings, or retrofitting them in existing ones, can be relatively easy if floors are accessible from below.

Much of the data produced in these previous works has been used for the classification of different individual characteristics, from walker identification to diagnosing neurodegenerative diseases [4, 7]. The algorithms used ranged from simple thresholding of a single term to complex non-linear learning strategies. Yun [8] contains a list of gait classification examples. Additionally, Table 1 contains an extended list of gait measurement classification works not included in Yun [8].

The efforts most similar to our work in gender classification are those of DeLoney [18] and Li et al. [19]. Both use acoustic measurements of walkers, with gender classification accuracies that were not reported (NR) [18] and 75% [19], respectively. Our work uses machine learning techniques to interpret the vibrational data measured by the novel underfloor mounted accelerometers in an operational building (Goodwin Hall) to classify the gender of a walker.

Table 1. Gait classification overview of previous studies. The reference, the gait measurement method, what characteristics were classified, accuracy, algorithm, and notes.

| Source | Gait Measurement Method | Classification Goal | Accuracy Achieved (%) | Machine Learning Algorithm | Special Notes |
| --- | --- | --- | --- | --- | --- |
| [1] | Walking Surface Mounted Geophones | Footsteps / Occupancy | 99.55 / 85 | Decision Trees | N/A |
| [3] | Video Surveillance | Individual Identification | 93 | k-nearest neighbors | Leave-one-out-cross-validation |
| [4] | Video Surveillance | Human Recognition | 95 | k-nearest neighbors | Generalized Symmetry Operator |
| [2] | Pressure Sensors | Individual Identification | 80 | Decision Trees | N/A |
| [5] | Walker Mounted Accelerometers | Walking upstairs / downstairs / level ground | 97.72 / 93.18 / 93.93 | Decision Trees | Discrete Dyadic Wavelet Transformation |
| [6] | Video Surveillance | Individual Identification | 90.40 - 99.07 | Self-organizing-maps (SOM) | N/A |
| [7] | Database / Video Surveillance | Various Neurodegenerative Diseases | 79.04 - 93.96 | Support Vector Machines | 10-fold cross-validation |
| [18] | Microphone Recording | Gender / Individual Identification | NR / 60 | Support Vector Machines | Modulation Features |
| [19] | Microphone Recording | Gender | 75 | Discriminant Analysis | Principal Component Analysis |

II. EXPERIMENTAL SETUP

The 5-story, 155,000 sq. foot Goodwin Hall on the Virginia Tech campus has been instrumented with 212 accelerometers, with more in various stages of installation and commissioning. All accelerometers are mounted to structural members of the building via steel mounting beams as shown in Fig. 1d. The instrumentation of this operational building creates the ideal "real-world" test bed for walker classification. These measurements will encounter complications possibly not present in a simplified, controlled laboratory system. The nature of the building (i.e. classrooms, laboratories, auditoriums, and staff offices) allows for the developed technologies to translate directly into many other types of buildings. Various initial works describe the studies completed in the building [25-30].

For this study, data was collected in a single hallway with each participant walking alone. Fifteen individuals (eight males and seven females) walked in this study. Walkers were asked to state whether they considered their shoes hard or soft soled. Three females and two males were asked to walk in both hard and soft soled shoes while the rest walked in either hard or soft soled shoes. This strategy was used to reduce bias in walker characteristics while accounting for some differences in footwear, although previous studies have shown varied footwear to have only a small effect on gait measurements [19, 31]. In summary, thirteen soft sole and seven hard sole walkers were tested. Each of the 20 walkers completed six trials, totaling 120 walking trials. Participants walked in both directions of the hallway: Eastward (East to West) and Westward (West to East) for three trials each.

Fig. 1 shows the fourth-floor hallway where the study was completed. Only fourteen of the 212 PCB 393B04 accelerometers, with nominal 1000 mV/g sensitivity and sampling at 51,200 Hz, were used. The relatively high sampling rate allows for high fidelity time signal resolution while filtering can reduce resonant effects of each sensor in post-processing.

Fig. 1. The hallway used for testing: (a) shows a floor plan of the 4th floor depicting the location of the sensors mounted under the floor, (b) the blue dashed area shows the section used in this study, (c) shows the view from the east side of the building facing westward, and (d) shows a tri-axis mount with three accelerometers. Building north is labeled. Only a subset of the sensors in (b) were used. Only vertical accelerometers were used in this study.

The time taken to complete a trial was dependent on the walking speed of the individual, as researchers gave no instructions about appropriate walking speed. These times ranged from approximately 20 seconds to 35 seconds for individuals to traverse the hallway. The recorded data was processed by removing the DC offset of each accelerometer, giving a zero initial value for each signal, and applying a low pass filter. A Butterworth filter of order three with a 500 Hz cutoff frequency was used to capture all the expected frequency content of the normal force for measurement in the vertical direction [17].

Special care was taken to ensure the only walking recorded during the experiment was that of the study participants. Future work will study methods that accommodate multiple walkers walking along the hallway.

III. FEATURE EXPLORATION

A feature set is the extracted data used as inputs to the machine learning algorithms. One of the aims of this work is to understand what makes a useful feature type. There is little a priori knowledge about which walking features will yield the highest gender classification rate, necessitating the definition of multiple features. The dimensionality of the raw data for a single walking trial is $\mathbb{R}^{n \times m}$, where $n$ is nominally the number of samples required to complete a walking trial and $m$ is the number of accelerometers used in the study ($m = 14$). A typical walking signal is shown in Fig. 2 for reference.

Fig. 2. A typical walking signal from one sensor. This particular signal is the first trial of Female A wearing hard soled shoes. The peaks of the signal represent footsteps. The inlaid plot shows a portion of the same signal. [Plot axes: Acceleration (g) versus Time (s).]

Due to its large dimensionality, the raw data is not an ideal set of features for a machine learning algorithm, because large feature sets lead to high computation time and low classification accuracy due to overfitting [32]. To reduce the dimensionality of the data, multiple feature sets were developed: Sensor Averaged Step (SAS), Multiple Sensor Step (MSS), and Single Sensor Step (SSS). Each of the three feature types was constructed in the time domain and then transformed to the frequency domain using a fast Fourier transform (FFT). A description of the construction of each feature set follows, and Table 2 gives a pictorial summary of the construction of each feature type, based on the schematic walking signal shown in Fig. 3.

Fig. 3. Representation of initial data in a single walking trial. Each column represents a single sensor and a spike in the signal represents a footstep.

The first feature set examined, the SAS feature set, is constructed by selecting the $x$ most prevalent steps (based on amplitude) recorded by a single sensor during the entirety of a single walking trial. These $x$ steps were maximum value aligned and averaged. The most appropriate number of steps to average is unknown a priori, and therefore $x$ = 10, 5, and 3 are chosen to construct three SAS feature sets (denoted SAS10, SAS5, and SAS3). The SAS feature sets were then transformed to the frequency domain, using an FFT, to study the influence of domain type on feature representation.

The second feature set examined, MSS, is constructed by taking the most prevalent step (based on amplitude) recorded by accelerometer 4 and then recording the same footstep in time as measured by accelerometers 2 and 6. These three steps are concatenated so that the signal contains the three steps in the specified order (2, 4, 6 for Westward trials and 6, 4, 2 for Eastward trials). Fig. 4 details a representative layout of sensors 2, 4, and 6. The MSS feature set was constructed after recognizing that the averaging of SAS may adversely affect the machine learning capabilities, since averaging causes the loss of high-frequency content and spatial information.

The third feature set examined, SSS, is constructed by taking the most prevalent single step recorded by sensor 1 (see Fig. 4). This feature set is used as a control to determine the ability of a single accelerometer and single step to classify gender. A single footstep in all time feature sets was defined as the 61.0 ms before a footstep peak and the 152.6 ms after the peak, as shown in the inlaid plot of Fig. 2. The features were developed with the MATLAB 'peakfinder' command, which identifies the footsteps that are then peak aligned and averaged.

Fig. 4. Illustration of sensor arrangement. The spatial relation between sensor 1, 2, 4, and 6 in the testing hallway. These sensors are used in the construction of the MSS and SSS data types. [Schematic labels: sensor 1 and, in a row, sensors 6, 4, 2.]

In summary, there are ten sets of data: SAS10, SAS5, SAS3, MSS, and SSS, each in the time domain (denoted time) and in the frequency domain (denoted frequency).
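The construction of the three time-domain feature types described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the function names (`extract_step`, `most_prominent`, `sas_feature`, `mss_feature`, `sss_feature`) are hypothetical, footstep peak indices are assumed to have been found already (the paper uses the MATLAB 'peakfinder' command for that), the window lengths `pre` and `post` are given in samples, and the MSS steps are assumed to share the time window of the reference sensor's peak.

```python
from statistics import fmean


def extract_step(signal, peak, pre, post):
    """Return the samples from `pre` before to `post` after the peak index."""
    if peak - pre < 0 or peak + post >= len(signal):
        raise ValueError("step window runs past the end of the signal")
    return signal[peak - pre : peak + post + 1]


def most_prominent(signal, peaks, count):
    """Pick the `count` peak indices with the largest amplitude."""
    return sorted(peaks, key=lambda p: signal[p], reverse=True)[:count]


def sas_feature(signals, peaks, sensors, x, pre, post):
    """Sensor Averaged Step: average x peak-aligned steps per sensor, then concatenate."""
    feature = []
    for s in sensors:
        chosen = most_prominent(signals[s], peaks[s], x)
        steps = [extract_step(signals[s], p, pre, post) for p in chosen]
        # zip(*steps) walks the aligned steps sample by sample
        feature.extend(fmean(samples) for samples in zip(*steps))
    return feature


def mss_feature(signals, peaks, ref, order, pre, post):
    """Multiple Sensor Step: one footstep seen by several sensors, concatenated in `order`."""
    peak = most_prominent(signals[ref], peaks[ref], 1)[0]
    feature = []
    for s in order:  # e.g. (2, 4, 6) or (6, 4, 2) depending on walking direction
        feature.extend(extract_step(signals[s], peak, pre, post))
    return feature


def sss_feature(signals, peaks, sensor, pre, post):
    """Single Sensor Step: the single most prominent step of one sensor."""
    peak = most_prominent(signals[sensor], peaks[sensor], 1)[0]
    return extract_step(signals[sensor], peak, pre, post)
```

Each step window holds `pre + post + 1` samples, so an SAS feature has that length times the number of sensors, while an MSS feature has three windows and an SSS feature has one.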

Table 2. Illustration of the feature type construction. The steps required to develop the three feature types (SAS, MSS, and SSS) in the time domain are illustrated schematically for ease of understanding.

| Feature Type | Step 1 | Step 1 Representation | Step 2 | Step 2 Representation |
| --- | --- | --- | --- | --- |
| SAS | Averaging of $x$ steps of each sensor | One averaged step per sensor (Sensor 1, Sensor 2, Sensor 3, ..., Sensor 14). Dimension: $\mathbb{R}^{7001 \times 14}$ | Concatenation | Averaged steps of Sensor 1, Sensor 2, Sensor 3, ..., Sensor 14 placed end to end. Dimension: $\mathbb{R}^{98014 \times 1}$ |
| MSS | Most prominent step recorded by Sensor 4 | The same step from Sensor 2, Sensor 4, and Sensor 6. Dimension: $\mathbb{R}^{7001 \times 3}$ | Concatenation* | Steps of Sensor 2, Sensor 4, and Sensor 6 placed end to end. Dimension: $\mathbb{R}^{21003 \times 1}$ |
| SSS | Single most prominent step recorded from only Sensor 1 | Step from Sensor 1. Dimension: $\mathbb{R}^{7001 \times 1}$ | | |

* Based on walking direction
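As a quick check of the dimensions listed in Table 2, concatenating per-sensor step windows of 7001 samples each gives the following feature lengths (the arithmetic is the only thing asserted here; the symbol $L$ for feature length is introduced for illustration):

```latex
\begin{align*}
L_{\mathrm{SAS}} &= 7001 \times 14 = 98014 \\
L_{\mathrm{MSS}} &= 7001 \times 3 = 21003 \\
L_{\mathrm{SSS}} &= 7001 \times 1 = 7001
\end{align*}
```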
IV. MACHINE LEARNING TECHNIQUES

This work uses supervised machine learning to classify gender. The machine learning algorithms used in this work require three steps: training, validation, and testing. The training step develops a generalized predictive model, the validation step sets or updates 'hyper-parameters' or model design choices, and the test step determines the ability of the algorithm to classify new walker measurements. The inputs used in this work are the ten previously discussed feature sets and the gender labels for each walker. The machine learning algorithms develop models that make predictions on new walker measurements. The error (i.e. the percentage of gender misclassifications) of each step of machine learning is tracked, and only test errors are reported throughout this work.

To reduce the effect of feature initialization (i.e. which trials of the walking features are sorted into each of the three steps above), this work uses a ten-fold-cross-validation method. This method randomly partitions feature sets into ten nearly equal parts, known as folds. A single fold is held out, and the remaining nine folds are used in the training and validation steps. The developed model is used to classify the held out tenth fold, and the test error is recorded. Then, a different fold is held out, with this process repeating until all folds have been held out once. The averaged test error represents the error that can be expected on a never before seen walker [33].

An aim of this work is to understand the effect of the type of machine learning algorithm on the final gender classification abilities. The machine learning techniques used in this work are Bagged Decision Trees, Boosted Decision Trees, Support Vector Machines (SVMs), and Neural Networks (NNs). Each algorithm type is briefly discussed.

A) Decision Trees

Decision trees involve the splitting of feature sets at a node until there is a reasonably high probability of a specific classification. A node represents the splitting point for a single feature, and a branch represents the flow of features to further nodes. The training and validation steps of these algorithms determine which features are the most important to split on and the appropriate depth of a tree (i.e. how many nodes are necessary for an acceptable level of classification). To test the classification ability of the tree, a new walker's features are evaluated at each node until classification occurs at a leaf. Further reading on decision trees is found in Quinlan [34].

Choosing the most appropriate features to split on is difficult a priori and leads to the desire to combine many decision trees through ensemble learning methods: bagged decision trees and boosted decision trees. Bagged and boosted machine learning algorithms are examples of ensemble methods that combine multiple learning algorithms to solve complex problems. These two examples manipulate the features used in training to generate multiple models [35]. For background before further discussion, bias is error due to assumptions in the model, and variance is error due to the model's sensitivity to small changes in the training dataset.

Bagging machine learning algorithms aim to reduce the variance associated with high variance, low bias algorithms [35, 36]. The process of bagging takes the original training dataset and constructs multiple datasets by sampling instances (with replacement) from the original dataset. It can be shown that each newly created "bagged" dataset contains approximately 63% of the unique samples of the original dataset [36]. A decision tree is developed using the labels and features of this bagged dataset. This process repeats for a specific number of bags, and the test errors of these models are averaged for a representative test error for the entire system.

Boosted machine learning algorithms try to reduce the bias associated with high bias, low variance systems [37]. In the process of boosted learning, every simple model (i.e. a decision tree) is weighted based on its ability to correctly classify. The final model is a weighted combination of the results of the simple models. This work uses the AdaBoost method [35].

B) Neural Networks

Neural Networks (NNs) are machine learning algorithms that, at the highest level, map features to hidden neurons and then to an output label through a series of non-linear functions (e.g., a sigmoid) [38]. Each feature contributes a weight to a neuron in the hidden layer. This work uses a linear mapping of features to neurons in a hidden layer. A single hidden layer was used in this work as more layers can add unnecessary computational complexity [39]. All of the weights at a single neuron then pass through a non-linear function, and the outputs are weighted and combined into the output layer. A final, chosen non-linear function generates a classification label for gender in this work.

C) Support Vector Machines

These algorithms 'learn' a hyperplane between the classes (i.e. female and male), and in particular, they learn the hyperplane that maximizes the margin between the two classes. Often associated with SVMs is the use of what is called the kernel trick. The kernel trick allows for the replacement of inner products (i.e. measures of similarity) with kernel functions [40]. Kernels map the feature sets into a higher dimensional space via a chosen non-linear mapping. The non-linear mapping (e.g., polynomial) is chosen a priori to develop an optimal hyperplane that separates the classes [41, 42]. Based on their wide use and high rates of success, SVMs with Radial Basis Function (RBF) kernels and with polynomial kernels of order 1-25 are used in this work to understand how the type of SVM affects classification rate.

The machine learning algorithms were developed using the MATLAB commands 'fitensemble' (bagged and boosted decision trees), 'patternnet' (neural networks), and 'fitcsvm' (SVMs). The ten-fold-cross-validation was completed either internally to the machine learning command or through the 'KFold' command. Each of the four learning algorithms was combined with the ten feature types to examine the effectiveness of each combination in classifying gender.

V. RESULTS

The ten previously discussed feature types (SAS10, SAS5, SAS3, MSS, SSS, in both time and frequency) were all used as input with each of the four machine learning algorithms. This combination allowed for the answering of 1) which feature type is most informative, 2) which machine learning algorithm is most effective, and 3) whether the time or frequency domain houses the most important classification information. Across all combinations of feature set type and machine learning algorithm, accuracy ranged from 42% to 95%.

Fig. 5 shows the classification error results for all combinations of feature types and machine learning algorithms in the study. There are two broad columns representing the time and frequency domains, and these columns are further subdivided by feature type. The dependent axis is the average error for each combination of algorithm and feature type.

To better determine the significance of the results presented in Fig. 5, a statistical test is applied: specifically, a one-way analysis of variance (ANOVA) and a Tukey test at a 5% significance level. The Tukey test examines all possible pairs of means (in this work, average classification error) to find pairs that are statistically different from each other to evaluate performance.

A) Comparison of Feature Type Selection

The statistical testing reveals that the SAS5 and SAS10 feature types are statistically indistinguishable, although the mean error of the SAS5 type is the lowest reported, at 22.9% and 33.8% in the time and frequency domains, respectively. All feature type errors were developed while considering all four machine learning algorithm types. This method is intended to ensure robustness regardless of machine learning type.

B) Comparison of Machine Learning Types

We find that Bagged and Boosted Decision Trees as well as SVMs are all effective machine learning algorithms for gender classification in our dataset. Neural Networks perform statistically worse than the other algorithms. When examining the mean error of the machine learning algorithms, Bagged Decision Trees produced a classification error rate of 21.5% and 27.8% across all time and frequency domain feature types, respectively. The statistical analysis averages across feature types in order to identify machine learning algorithms that are robust regardless of which feature type is used.

C) Comparison of Time Frequency Domain

As shown in Fig. 5, the time domain data input gives a lower misclassification rate than the frequency domain. The overall accuracy of the time domain is 70% while that of the frequency domain is 64%. This result may be partly due to the relative richness of input data in the time domain versus the frequency domain. Generally, dynamicists "simplify" complex time series data by transforming into the frequency domain to understand underlying phenomena. However, a larger number of peaks in the time domain are represented only by a dominating peak in the frequency domain. This means that the time domain has a larger amount of potentially interesting (non-zero) features than the frequency domain. This hypothesis is supported by Ekimov and Sabatier [17], who show that the energy in the frequency domain of gait is bounded by 500 Hz for the normal force interaction between the foot and the ground. The frequency domain in this study contains zero amplitude higher frequency terms. These zero values may only add noise to the machine learning algorithm. The sensitivity of specific machine learning algorithms to the features used may also explain some of the differences between the two domains.

D) Machine Learning – Feature Combination

The single best performing feature type and machine learning combination was the boosted decision trees using SAS5 time as features, with an accuracy of 88.3% in the classification of the gender of an individual walker. Similar results may be possible using either the bagged decision tree or SVM as the learning algorithm and either SAS5 or SAS10 features, as these produce statistically indistinguishable results.
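The ten-fold cross-validation procedure described in Section IV, which underlies every error reported in these results, can be sketched as follows. This is a minimal pure-Python illustration, not the MATLAB workflow used in the paper: `k_fold_indices` and `cross_validated_error` are hypothetical helper names, and the threshold classifier is only a stand-in for the decision tree, SVM, and neural network models, chosen so that the example stays self-contained.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Randomly partition range(n_samples) into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_error(features, labels, train, predict, k=10, seed=0):
    """Average held-out misclassification rate over k folds."""
    errors = []
    for held_out in k_fold_indices(len(features), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(features)) if i not in held]
        model = train([features[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        wrong = sum(predict(model, features[i]) != labels[i] for i in held_out)
        errors.append(wrong / len(held_out))
    return sum(errors) / k


# Stand-in classifier (an assumption for illustration only): a threshold placed
# halfway between the two class means of a one-dimensional feature.
def train_threshold(xs, ys):
    def mean(values):
        return sum(values) / len(values)
    m0 = mean([x for x, y in zip(xs, ys) if y == 0])
    m1 = mean([x for x, y in zip(xs, ys) if y == 1])
    return (m0 + m1) / 2, m1 > m0


def predict_threshold(model, x):
    threshold, one_is_high = model
    return int((x > threshold) == one_is_high)
```

With 120 trials and `k = 10`, each fold holds 12 trials, so every model is trained on 108 trials and tested on the 12 it has not seen.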

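The figure of approximately 63% unique samples per bagged dataset, quoted in Section IV, follows from sampling with replacement: a given instance is missed by all $N$ draws with probability $(1 - 1/N)^N$, which approaches $1/e$ as $N$ grows. The sketch below checks this numerically; the helper names are hypothetical, and the training-set size of 108 is an assumption (nine of ten folds of the 120 trials).

```python
import math
import random


def expected_unique_fraction(n):
    """Expected fraction of distinct originals in a bootstrap sample of size n."""
    return 1.0 - (1.0 - 1.0 / n) ** n


def bootstrap_sample(n, rng):
    """Draw n indices from range(n) with replacement."""
    return [rng.randrange(n) for _ in range(n)]


rng = random.Random(0)
n_train = 108  # assumption: nine of ten folds of 120 walking trials
sample = bootstrap_sample(n_train, rng)
observed_fraction = len(set(sample)) / n_train
limit = 1.0 - math.exp(-1.0)  # large-n limit, about 0.632

print(f"expected: {expected_unique_fraction(n_train):.3f}")
print(f"observed in one bag: {observed_fraction:.3f}")
print(f"limit 1 - 1/e: {limit:.3f}")
```

A single bag will scatter around the expected value, which is why the comparison is made against the expectation rather than an exact count.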
Fig. 5. Average error result of all feature types and machine learning algorithms.

E) Notable Observations

Additional observations beyond the primary test goals were made regarding the success of the "control" SSS feature type, poor NN performance, and parameter studies of SVMs and NNs.

The investigation of the control feature set, SSS, reveals that gender classification is possible with little gait information, using data from only a single footstep and a single accelerometer. The boosted decision trees produced the highest accuracy, 80%, with this SSS feature set. This result implies that a high density of gender classification information lies within a single footstep. If higher classification rates are desired, then multiple footsteps should be recorded, as shown by the better performance of the SAS. The addition of 4 more steps (SAS5) only increases accuracy by 8%, although there is four times as much data and, therefore, an increase of computation time and storage.

The low accuracies produced by NNs are not surprising due to the number of weights that must be learned for this type of algorithm. This particular form of the algorithm must learn $nd + n$ weights, where $n$ is the number of neurons and $d$ is the dimensionality of the feature set. It is hypothesized that the large number of features and the desire to keep the developed neural network single layered have led to the relatively poor results. A more complex NN or successfully selecting lower dimensionality features is predicted to improve algorithm performance [32]. This is beyond the scope of the current work.

The optimum polynomial degree for the SVMs (24) and the optimum number of neurons in the neural networks (2) were determined by iterative search. There is relatively little change in classification accuracy as the degree of the polynomial increases, as shown in Fig. 6, suggesting robustness to overfitting. The number of neurons has little effect on the accuracy of classification, leaving little confidence in robust classification with the single-layer NN scheme chosen for the study.

Fig. 6. SVM and NN parameter study. The independent axis is either the order of the SVM polynomial or the number of neurons in the neural network, each plotted against the percent error for gender classification. [Plot title: Parameter Study; axes: Percent Error versus SVM Polynomial Order / NN Number of Neurons; legend: SVM, Neural Networks.]

VI. CONCLUSIONS

Using data from underfloor mounted accelerometers as input feature sets for machine learning techniques shows promising results for classifying the gender of an individual walker. This novel mounting places the accelerometers out of sight, allowing for an additional level of both privacy and tamper resistance not achievable with the visual systems or microphones used in current gait measurement techniques. The primary results of this work are that a Bagged or Boosted Decision Tree and Support Vector Machines produce statistically equivalent gender classification results and seem robust to the type of feature set used as an input. All of the feature types constructed in the time domain produce better gender classification results than their frequency domain counterparts; the time domain provides more useful features for successful gender classification than the frequency domain. The single best classifier created was a Bagged Decision Tree using the SAS5 time feature set (this feature set is created by peak aligning the five most prevalent steps recorded by each sensor and

averaging) with an accuracy of 88% in gender classification.
Ongoing studies have shown that robust features are capable of accommodating multiple floor types (e.g., concrete and hardwood). Future work will address the robustness of the proposed feature types for classification on different floor types.

REFERENCES
[1] S. Pan, A. Bonde, J. Jing, L. Zhang, P. Zhang, and H. Y. Noh, "BOES: Building Occupancy Estimation System using sparse ambient vibration monitoring," 2014, p. NA.
[2] L. Middleton, A. A. Buss, A. Bazin, and M. S. Nixon, "A floor sensor system for gait recognition," in Fourth IEEE Workshop on Automatic Identification Advanced Technologies, 2005, pp. 171-176.
[3] C. BenAbdelkader, R. Cutler, H. Nanda, and L. Davis, "EigenGait: Motion-Based Recognition of People Using Image Self-Similarity," in Audio- and Video-Based Biometric Person Authentication, J. Bigun and F. Smeraldi, Eds., ed: Springer Berlin Heidelberg, 2001, pp. 284-294.
[4] J. B. Hayfron-Acquah, M. S. Nixon, and J. N. Carter, "Recognising human and animal movement by symmetry," in International Conference on Image Processing. Proceedings, 2001, pp. 290-293.
[5] M. N. Nyan, F. E. H. Tay, K. H. W. Seah, and Y. Y. Sitoh, "Classification of gait patterns in the time–frequency domain," Journal of Biomechanics, vol. 39, pp. 2647-2656, 2006.
[6] Y.-C. Lin, B.-S. Yang, Y.-T. Lin, and Y.-T. Yang, "Human Recognition Based on Kinematics and Kinetics of Gait," Journal of Medical and Biological Engineering, vol. 31, pp. 225-263, 2010.
[7] M. Yang, H. Zheng, H. Wang, and S. McClean, "Feature selection and construction for the discrimination of neurodegenerative diseases based on gait analysis," in 3rd International Conference on Pervasive Computing Technologies for Healthcare, 2009. PervasiveHealth 2009, 2009, pp. 1-7.
[8] J. Yun, "User Identification Using Gait Patterns on UbiFloorII," Sensors (Basel, Switzerland), vol. 11, pp. 2611-2639, 2011.
[9] C. BenAbdelkader, R. Cutler, and L. Davis, "Stride and cadence as a biometric in automatic person identification and verification," in Fifth IEEE International Conference on Automatic Face and Gesture Recognition. Proceedings, 2002, pp. 372-377.
[10] M. S. Nixon and J. N. Carter, "Automatic Recognition by Gait," Proceedings of the IEEE, vol. 94, pp. 2013-2024, 2006.
[11] D. D. Hoffman and B. E. Flinchbaugh, "The interpretation of biological motion," Biological Cybernetics, vol. 42, pp. 195-204, 1982.
[12] C. Cedras and M. Shah, "A survey of motion analysis from moving light displays," in 1994 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1994. Proceedings CVPR '94, 1994, pp. 214-221.
[13] T. Liu, Y. Inoue, K. Shibata, and K. Shiojima, "A Mobile Force Plate and Three-Dimensional Motion Analysis System for Three-Dimensional Gait Assessment," IEEE Sensors Journal, vol. 12, pp. 1461-1467, 2012.
[14] M. Sousa, A. Techmer, A. Steinhage, C. Lauterbach, and P. Lukowicz, "Human Tracking and Identification using a Sensitive Floor and Wearable Accelerometers," IEEE International Conference on Pervasive Computing and Communications (PerCom), pp. 166-171, 2013.
[15] J. Sinclair, S. J. Hobbs, L. Protheroe, C. J. Edmundson, and A. Greenhalgh, "Determination of gait events using an externally mounted shank accelerometer," Journal of Applied Biomechanics, vol. 29, pp. 118-122, Feb. 2013.
[16] A. Godfrey, R. Conway, D. Meagher, and G. ÓLaighin, "Direct measurement of human movement by accelerometry," Medical Engineering & Physics, vol. 30, pp. 1364-1386, 2008.
[17] A. Ekimov and J. M. Sabatier, "Vibration and sound signatures of human footsteps in buildings," The Journal of the Acoustical Society of America, vol. 120, pp. 762-768, 2006.
[18] C. DeLoney, "Person Identification and Gender Recognition from Footstep Sound using Modulation Analysis," p. NA, 2008.
[19] X. Li, R. J. Logan, and R. E. Pastore, "Perception of acoustic source characteristics: Walking sounds," The Journal of the Acoustical Society of America, vol. 90, pp. 3036-3049, 1991.
[20] G. Qian, J. Zhang, and A. Kidané, "People Identification Using Floor Pressure Sensing and Analysis," IEEE Sensors Journal, vol. 10, pp. 1447-1460, 2010.
[21] M. Kohle and D. Merkl, "Identification of gait patterns with self-organizing maps based on ground reaction force," 1996, pp. 24-26.
[22] S. Pan, N. Wang, Y. Qian, I. Velibeyoglu, H. Y. Noh, and P. Zhang, "Indoor Person Identification Through Footstep Induced Structural Vibration," presented at the Proceedings of the 16th International Workshop on Mobile Computing Systems and Applications, New York, NY, USA, 2015.
[23] J. M. Sabatier and A. E. Ekimov, "A Review of Human Signatures in Urban Environments Using Seismic and Acoustic Methods," in 2008 IEEE Conference on Technologies for Homeland Security, 2008, pp. 215-220.
[24] J. M. Sabatier and A. E. Ekimov, "Range limitation for seismic footstep detection," 2008, pp. 69630V-69630V-12.
[25] J. M. Hamilton, B. S. Joyce, M. E. Kasarda, and P. A. Tarazaga, "Characterization of Human Motion Through Floor Vibration," in IMAC XXXII, Orlando, FL, February 3-6, 2014.
[26] B. S. Joyce and P. A. Tarazaga, "Estimating Noise Spectra for Data from an Instrumented Building," in IMAC XXXIV A Conference and Exposition on Structural Dynamics, Orlando, Florida, January 25-28, 2016.
[27] M. E. Kasarda, P. A. Tarazaga, M. Embree, S. Gugercin, A. Woolard, B. S. Joyce, et al., "Detection and Identification of Firearms Upon Discharge Using Floor-Based Accelerometers," in IMAC XXXIV A Conference and Exposition on Structural Dynamics, Orlando, Florida, January 25-28, 2016.
[28] J. D. Poston, R. M. Buehrer, V. S. Malladi, A. G. Woolard, and P. A. Tarazaga, "Vibration Sensing in Smart Buildings," in International Conference on Indoor Positioning and Indoor Navigation (IPIN), Banff, Alberta, Canada, 13-16 October, 2015.
[29] J. D. Poston, J. Schloemann, R. M. Buehrer, V. S. Malladi, V. S. Woolard, and P. A. Tarazaga, "Towards Indoor Localization of Pedestrians via Smart Building Vibration Sensing," in International Conference on Localization and GNSS, June 22-24, 2015.
[30] J. Schloemann, V. V. N. S. Malladi, A. G. Woolard, J. M. Hamilton, R. M. Buehrer, and P. A. Tarazaga, "Vibration Event Localization in an Instrumented Building," in International Modal Analysis Conference (IMAC) XXXIII, Orlando, Florida, February 2-5, 2015.
[31] R. J. Orr and G. D. Abowd, "The Smart Floor: A Mechanism for Natural User Identification and Tracking," 2000, pp. 275–276.
[32] I. Czarnowski and P. Jȩdrzejowicz, "Data Reduction Algorithm for Machine Learning and Data Mining," in New Frontiers in Applied Artificial Intelligence, N. T. Nguyen, L. Borzemski, A. Grzech, and M. Ali, Eds., ed: Springer Berlin Heidelberg, 2008, pp. 276-285.
[33] B. Parmanto, P. W. Munro, and H. R. Doyle, "Reducing Variance of Committee Prediction with Resampling Techniques," Connection Science, vol. 8, pp. 405-426, 1996.
[34] J. R. Quinlan, "Induction of Decision Trees," Machine Learning, vol. 1, pp. 81-106, 1986.
[35] T. G. Dietterich, "Ensemble Methods in Machine Learning," in Multiple Classifier Systems, ed: Springer Berlin Heidelberg, 2000, pp. 1-15.
[36] L. Breiman, "Bagging Predictors," Machine Learning, vol. 24, pp. 123-140, 1996.
[37] R. E. Schapire, "The strength of weak learnability," Machine Learning, vol. 5, pp. 197-227, 1990.
[38] S. Haykin, A comprehensive foundation: Neural Networks, 2004.
[39] G. Cybenko, "Approximation by superpositions of a sigmoidal function," Mathematics of Control, Signals and Systems, vol. 2, pp. 303-314, 1989.
[40] K. P. Murphy, Machine Learning: A Probabilistic Perspective: MIT Press, 2012.
[41] B. E. Boser, I. M. Guyon, and V. N. Vapnik, "A Training Algorithm for Optimal Margin Classifiers," 1992, pp. 144–152.
[42] V. N. Vapnik, The Nature of Statistical Learning Theory. New York, NY: Springer New York, 2000.
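
As a supplementary illustration of the weight count quoted under Notable Observations, the sketch below evaluates the 𝑛𝑑 + 𝑛 expression for a single-hidden-layer network. Reading 𝑛𝑑 as input-to-hidden weights and 𝑛 as hidden-to-output weights (one output unit, no bias terms) is an assumption made here, since the text states only the total. The function name and the feature dimensionalities are hypothetical and are not taken from the study.

```python
def single_layer_weight_count(n_neurons: int, n_features: int) -> int:
    """Return n*d + n, the number of weights quoted in the text.

    Assumed decomposition (not stated in the source): n*d input-to-hidden
    weights plus n hidden-to-output weights, with one output and no biases.
    """
    if n_neurons < 1 or n_features < 1:
        raise ValueError("n_neurons and n_features must be positive integers")
    input_to_hidden = n_neurons * n_features  # the n*d term
    hidden_to_output = n_neurons              # the n term
    return input_to_hidden + hidden_to_output


if __name__ == "__main__":
    # Hypothetical feature dimensionalities, chosen only for illustration.
    for d in (100, 1_000, 10_000):
        print(f"n=2, d={d}: {single_layer_weight_count(2, d)} weights")
```

With the two neurons reported as optimal, the count is dominated by the feature dimensionality 𝑑, which is in line with the suggestion that lower-dimensionality features may improve NN performance.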
Biographies
Dustin Bales received his B.S. from the University of Wyoming in 2014, and his M.S. from Virginia Tech in 2016, both in engineering.

His research includes gait analysis and characteristic classification through advanced computational methods. He currently serves as Chief Business Officer for GAiTE LLC, a company progressing the Internet of Things.

Pablo A. Tarazaga received his B.S. degree from the University of Puerto Rico at Mayaguez in 2002, and his M.S. and Ph.D. degrees from Virginia Tech in 2004 and 2009, respectively, all in Mechanical Engineering.

He is currently an Assistant Professor in the Department of Mechanical Engineering at Virginia Tech where he directs the Virginia Tech Smart Infrastructure Laboratory and the Vibrations, Adaptive Structures and Testing Laboratory. He is the recipient of the 2015 Air Force Office of Scientific Research Young Investigator Award.

Mary Kasarda received her B.S., M.S., and PhD from the University of Virginia in 1984, 1988, and 1997, respectively. She worked as a mechanical engineer at Ingersoll-Rand, Du Pont, and Rotor Bearing Dynamics between her academic degrees. She is currently an Associate Professor in the Department of Mechanical Engineering at Virginia Tech. Her research interests include Smart Building Technologies, Machinery Health Monitoring, and Engineering Education. She is an NSF CAREER Award winner and a Fellow of the American Society of Mechanical Engineers (ASME).

Dhruv Batra is an Assistant Professor at the Bradley Department of Electrical and Computer Engineering at Virginia Tech, where he leads the VT Machine Learning & Perception group. His research interests lie at the intersection of machine learning, computer vision, and AI, with a focus on developing intelligent systems that are able to concisely summarize their beliefs about the world, integrate information and beliefs across different sub-components or `modules' of AI (vision, language, reasoning) to extract a holistic view of the world, and explain why they believe what they believe.

Prof. Batra is a recipient of Carnegie Mellon Dean's Fellowship (2007), two Google Faculty Research Awards (2013, 2015), Virginia Tech Teacher of the Week (2013), Army Research Office (ARO) Young Investigator Program (YIP) award (2014), the National Science Foundation (NSF) CAREER award (2014), and Virginia Tech CoE Outstanding New Assistant Professor award (2015). Research from his lab has been featured in Bloomberg Business, The Boston Globe, MIT Technology Review, Newsweek, WVTF Radio IQ, and a number of popular press magazines and newspapers.

A.G. Woolard is a PhD candidate in Mechanical Engineering at Virginia Tech. His research area is wave propagation in elastic solids for the purpose of tracking people in instrumented buildings through footstep induced floor vibrations.

J. D. Poston is currently a Ph.D. student with research interests in novel indoor localization techniques and their privacy implications. Also, he is interested in the application of machine learning and signal processing techniques to cyber-physical systems.

Previously, he worked in industry in the fields of telecommunications and defense. He received a B.S. degree from Virginia Tech and an M.S. degree from George Washington University. He is a Senior Member of the IEEE.

V. V. N. S. Malladi received the B.Tech degree in Mining Machinery engineering from Indian Institute of Technology (or ISM) – Dhanbad, India, in 2011, and M.S. and PhD degrees in Mechanical Engineering from Virginia Polytechnic Institute and State University, Blacksburg, in 2013 and 2016, respectively. He is currently a postdoctoral researcher working on modal testing, waves, and gait analysis.
