
2009 IEEE/ASME International Conference on Advanced Intelligent Mechatronics

Suntec Convention and Exhibition Center


Singapore, July 14-17, 2009

Pattern Recognition-based Real-time End Point Detection Specialized for Accelerometer Signal
Jong Gwan Lim, Sang-Youn Kim, and Dong-Soo Kwon, Member, IEEE

Abstract: End point detection is proposed for motion detection by acceleration. Apart from the conventional methods based on energy feature normalization in automatic speech recognition and heuristic threshold-based algorithms, supervised learning in pattern recognition is proposed to discriminate a motion state from a non-motion state. Prior to full algorithm development, this paper mainly studies the feasibility of the approach and the feature selection for the research objectives. As feature candidates for data representation, we first chose the absolute value of acceleration and of its 1st and 2nd derivatives, based on the correlation coefficient. Using them, we formed feature vectors and then transformed the 2D or 3D feature vectors into variant vectors with Principal Component Analysis (PCA) and Fisher's Linear Discriminant (FLD). The sequence of the absolute 1st derivatives with incremental order is also considered as a feature vector. For these feature vectors, an artificial neural network has been designed to investigate and analyze the feasibility of the proposed algorithm. As a result, it is observed that the vectors, except for the FLD-transformed ones, do not show significant differences, and that the sequence of the absolute 1st derivatives records comparatively reliable and stable recognition rates regardless of subject.

I. INTRODUCTION

THE Inertial Measurement Unit (IMU) used in inertial navigation systems in manned/unmanned aerial vehicles has recently been applied widely in motion detection [1]-[6]. While most applications were equipped with only accelerometers in the beginning, they have gradually evolved into devices with both accelerometers and gyroscopes, or 6-axis IMUs [5]. Since a gyroscope and a 6-axis IMU, however, are disadvantageous in cost and size compared to accelerometers, 3-axis accelerometers are becoming popular again and are regarded as an essential option for the context awareness of mobile devices these days [6]. Small-sized and cost-effective accelerometers are manufactured with MEMS processes mainly by Freescale, Analog Devices, and STMicroelectronics. We deal with the MMA7261Q made by Freescale in this paper [7].

Manuscript received January 31, 2009. This work was supported by the IT R&D program of MKE/IITA [2008-F039-01, Development of Mediated Interface Technology for HRI].
Jong Gwan Lim is a PhD candidate in the Department of Mechanical Engineering, KAIST, Daejeon, Korea (e-mail: limjg@robot.kaist.ac.kr).
Sang-Youn Kim is leading the Interaction lab in Korea University of Technology and Education, Cheonan, Chungnam, Korea (e-mail: sykim@kut.ac.kr).
Dong-Soo Kwon is now with the Department of Mechanical Engineering, KAIST, Daejeon, Korea (corresponding author to provide phone: +82-42-350-3042; fax: +82-42-350-3210; e-mail: kwonds@kaist.ac.kr).

978-1-4244-2853-3/09/$25.00 ©2009 IEEE


A. Accelerometer Signal Characteristics
Though the spectral features of the signal deviate slightly with each gesture or motion made in 3D space, roughly speaking, the significant band is 0~20 Hz, which differs greatly from speech, which is mainly composed of 0~16 kHz signals. The more detailed signal characteristics described in [1], [3] commonly indicate that the change in the deflection of the vertical gravity and the thermal bias drift are critical error sources.
B. The Significance of End Point Detection (EPD)
Fig. 1 illustrates the general motion detection procedure using accelerometers. It consists of pre-processing, end point detection, feature extraction, and recognition applied to the raw data; this paper covers end point detection in depth.
Basically, 6-DOF information is required for trajectory estimation in 3D space, and the absence of 3-axis gyroscopes makes it hard to solve for the three missing parameters analytically. Therefore, trajectory estimation resorts to an iterative optimization approach or a pattern recognition approach. Furthermore, because trajectory estimation is at a standstill at the moment, pattern recognition approaches applied to the raw acceleration signal have recently been attracting interest for motion detection due to their tolerance to noise [8], [9].

Fig. 1. General pattern recognition procedure. The case including trajectory estimation is represented as "Quantitative" and the case excluding it is depicted as "Qualitative".

EPD is the method of discriminating the significant signal from the non-significant one for processing, and it is essential in time-series processing in general, such as in Automatic Speech Recognition (ASR) [10]. In terms of motion, it means finding out whether motion is active during the interval of interest.


In spite of many studies on EPD in ASR, it is hard to apply them to acceleration processing without modification because of drift error and differences in temporal/spectral characteristics. These differences in spectral features and error sources are good motivation for an EPD study dedicated to accelerometer signals.
II. PROPOSED METHOD
A. Previous approach and analysis
Since there has been no independent publication on acceleration-specialized EPD, a few methods are excerpted from various papers for detailed study (six methods in total were found during the literature survey, and three have already been evaluated quantitatively in the previous work [11]).
One of the most straightforward methods found in the literature is to let users indicate the starting point and ending point of motion directly with the help of additional buttons [3], [9], [12]. Although this is a very powerful solution, additional button manipulation may deteriorate intuitiveness by increasing the user's cognitive load, so we concentrate on automatic EPD rather than manual EPD.
Generally, automatic EPD for acceleration is performed by calculating the signal energy level and determining an appropriate threshold. This is nearly identical to energy feature normalization in ASR, though the same performance reliability is not guaranteed due to the aforementioned signal characteristics. Fig. 2 illustrates such a situation with one acceleration signal and its energy.
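The energy-and-threshold scheme described above can be sketched as follows. This is a minimal illustration rather than any of the cited implementations: the window length, the threshold value, the synthetic signal, and the function names `short_time_energy` and `energy_threshold_epd` are all assumptions made for the example.

```python
import numpy as np

def short_time_energy(signal, window=5):
    """Mean squared amplitude over a centered sliding window (zero-padded edges)."""
    x = np.asarray(signal, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(x ** 2, kernel, mode="same")

def energy_threshold_epd(signal, threshold, window=5):
    """Boolean motion mask: True where the windowed energy exceeds the threshold."""
    return short_time_energy(signal, window) > threshold

# Synthetic signal (assumed): rest, a burst of motion, then rest with a DC offset
# such as the one left by a change in the gravity component.
rest = np.zeros(20)
motion = np.sin(np.linspace(0.0, 4.0 * np.pi, 40))
offset_rest = np.full(20, 0.6)
signal = np.concatenate([rest, motion, offset_rest])

mask = energy_threshold_epd(signal, threshold=0.1)
```

With this synthetic signal the final offset segment is flagged as motion even though nothing is moving, which is the kind of failure a plain energy threshold is prone to on accelerometer data.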

Fig. 2. Acceleration and its energy

We can intuitively see in the top subfigure that a certain motion has been made between samples 13 and 57. However, the bottom subfigure shows that it is difficult to discriminate the motion period from the non-motion period by simply applying a threshold to the energy level. The valley (samples 32~42) and the DC bias offset (samples 58~80) are crucial barriers to thresholding; they are associated, respectively, with the low-frequency content of the acceleration spectrum and with the change in the deflection of gravity. Consequently, the energy calculation should be deliberately modified. The envelope detector composed of a cascade of a half/full rectifier and an RC low-pass filter is a good example of a modified energy calculation, where the detected envelope is treated as a sort of energy [3].
Since EPD performance depends closely on how the modified energy is calculated with minimal information loss while removing error sources, the main difference between acceleration-specialized EPD methods usually lies there as well [3], [5], [6], [11], [13]. The various modified energy calculations can commonly be decomposed into three stages: DC component removal, rectification, and signal smoothing. The actual energy calculation is done in the rectification stage. The DC bias offset and the valley in Fig. 2 are removed in the DC component removal stage and the signal smoothing stage, respectively. Detailed techniques in each method are given in Table I.
TABLE I
PREVIOUS APPROACHES

Approach       DC removal        Rectification        Signal smoothing
[3]            H filtering(1)    full rectifier       L filtering(2)
[13]           H filtering(1)    squaring             summation over axes
[6]            -                 piecewise variance   maximum choice between axes
EPS(3) [11]    -                 piecewise variance   APCA(4)

(1) high-pass filtering
(2) low-pass filtering
(3) Extreme Point Sampling
(4) Adaptive Piecewise Constant Approximation
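The three stages (DC component removal, rectification, signal smoothing) can be illustrated with the sketch below. It does not reproduce any specific method from Table I: subtracting a moving average stands in for high-pass filtering, a second moving average stands in for low-pass filtering, and the window lengths and names (`moving_average`, `modified_energy`) are assumptions chosen for the example.

```python
import numpy as np

def moving_average(x, window):
    """Centered moving average with edge padding (window is assumed odd)."""
    pad = window // 2
    padded = np.pad(np.asarray(x, dtype=float), pad, mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")

def modified_energy(signal, dc_window=31, smooth_window=9):
    """Three-stage modified energy: DC removal, rectification, smoothing."""
    x = np.asarray(signal, dtype=float)
    ac = x - moving_average(x, dc_window)            # stage 1: crude high-pass
    rectified = np.abs(ac)                           # stage 2: full-wave rectifier
    return moving_average(rectified, smooth_window)  # stage 3: crude low-pass

# Synthetic signal (assumed): constant bias of 0.6 with a 20-sample oscillation.
signal = np.full(100, 0.6)
signal[40:60] += np.sin(2.0 * np.pi * np.arange(20) / 5.0)

energy = modified_energy(signal)
motion_mask = energy > 0.2  # the threshold is still a hand-picked parameter
```

Even in this small form the sketch carries three tunable parameters (two window lengths and a threshold), and each smoothing stage spreads the detected interval beyond the true motion.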

The drawbacks of the previous approaches are summarized as follows. Firstly, some of them depend on the filter characteristics [3], [13]. Since there is no ideal filter, repeated filtering affects EPD performance dramatically by distorting the signal and introducing time delay. Secondly, all of them are heuristic in that they need several parameters, including a threshold; the filtering-based approaches above, in particular, add considerations related to filter selection (time delay, cutoff frequency, filter type, etc.). Trade-offs among these considerations accordingly make it harder to find the optimal condition for the best performance. Lastly, each method performs inconsistently across applications, because each was developed for a different purpose in its own application; EPD performance is not guaranteed when a method is used outside its original application [6], [11]. That is, a method developed for large-scale motion, such as with the Nintendo Wii, does not suit small-motion applications like handwriting recognition, and vice versa.
B. Research objectives
In this paper, we pursue an acceleration-specialized EPD that meets the following requirements. First of all, EPD performance should be reliable enough to distinguish the motion period from the non-motion period accurately. In addition to performance reliability, rapid response to motion is so critical that the algorithm itself should introduce low latency. Given that EPD is just one part of the whole application and most applications operate together with communications, time delay should be suppressed as much as possible for real-time processing. At the same time, heuristics such as threshold-based approaches should be avoided in order to accomplish completely automatic EPD. Finally, the proposed method should be multipurpose, or easily adaptable to optimal performance according to the motion scale of the application.
To meet these four requirements, we propose to discriminate the motion state at every sample by using pattern recognition techniques for EPD. If features robust to error are extracted properly, instead of relying on the various techniques of modified energy calculation, time delay is minimized by avoiding filtering in pre-processing and feature extraction. The other strengths come from supervised learning in pattern recognition. Given that a recognizer needs training data in supervised learning, different training data yield different optimal points, which means that this method is applicable to various purposes. Learning can also be understood as the process of optimizing every parameter used in the proposed algorithm, so that a threshold is determined automatically through supervised learning.
III. MOTION STATE DISCRIMINATION
A. Feature Vector Selection
The problem we first face in pattern recognition is a proper data representation for a given raw acceleration signal. It is crucial because valid feature selection strongly affects recognition performance and reduces data processing time by representing data compactly [14]. To represent the acceleration signal, the raw acceleration signal is processed by a median filter and a low-pass filter (10th-order FIR, 5 Hz) in pre-processing, and then we select several candidates as elements of the feature vector through extensive scrutiny: acceleration $a_t$ and its absolute value $|a_t|$, the 1st derivative of acceleration $\dot{a}_t$ and its absolute value $|\dot{a}_t|$, the 2nd derivative of acceleration $\ddot{a}_t$ and its absolute value $|\ddot{a}_t|$, etc. The Correlation Coefficients (CC) between the candidates and the manually captured EPD are calculated, and the result is presented in Table II. (With df = 1680 and p < .01, a CC of at least 0.0629 is accepted as indicating that they are highly correlated.)

TABLE II
CORRELATION COEFFICIENTS (CC)

                     |CC|
Candidate            1        2        3        4        Mean
$a_t$                0.0488   0.0035   0.0585   0.0719   0.0457
$|a_t|$              0.5202   0.4612   0.4565   0.4528   0.4727
$\dot{a}_t$          0.0004   0.0069   0.0069   0.0028   0.0043
$|\dot{a}_t|$        0.5203   0.5469   0.4275   0.4743   0.4923
$\ddot{a}_t$         0.0008   0.0008   0.0004   0.0007   0.0023
$|\ddot{a}_t|$       0.5302   0.4980   0.3688   0.4386   0.4589
(unlabeled)          0.4152   0.3917   0.2944   0.3400   0.3603

According to Table II, it is clear that the absolute values of each candidate are highly correlated with the manual EPD result. Using them, we have constructed various feature vectors such as $[|a_t|, |\dot{a}_t|]$, $[|a_t|, |\dot{a}_t|, |\ddot{a}_t|]$, etc. Additionally, Principal Component Analysis (PCA) and Fisher's Linear Discriminant (FLD) analysis have been conducted on $[|a_t|, |\dot{a}_t|]$ and $[|a_t|, |\dot{a}_t|, |\ddot{a}_t|]$. PCA and FLD are normally used for feature vector dimension reduction, but we use them only for rotating the axes to line up with the directions of highest variance, because the feature vectors have already been chosen by their correlation [14]. Through much trial and error, it is observed that the sequence of $|\dot{a}_t|$ has more significance than the feature vectors made by combining the candidates, because the connectivity between values at successive samples seems to be associated with the motion state. Based on this observation, we also choose a certain period of the sequence of acceleration as a feature vector.

B. Neural Network Design
As a recognizer for EPD, we have chosen an Artificial Neural Network (ANN) because we do not know which of the typical conventional algorithms works well in this case. In addition, it is generally accepted that ANNs can learn any learnable function and easily handle nonlinearities [15].
For the given feature vectors, we have designed basic ANNs with three layers using the MathWorks MATLAB toolbox, because three layers suffice to implement any arbitrary function [16]. Since the proper number of nodes in the hidden layer is unclear, we repeat training and simulation of the networks, adding one hidden node at a time, with the number of nodes in the input and output layers fixed (one node for the output layer and as many nodes as the feature dimension for the input layer). The tangent-sigmoid function and the log-sigmoid function are chosen as the transfer functions for the hidden layer and the output layer, respectively.
For the sequence of $|\dot{a}_t|$, a focused Time Lagged Feedforward Network (TLFN) has been proposed. One example is given in Fig. 3. The TLFN stores p-order inputs in a sort of delay filter and uses them as the input to a multilayer perceptron. If an input is offered at time t, the feature vector can be represented as in (1) and can also be understood as the state of the nonlinear filter at time t. In our case p was determined as 7.

$$\mathbf{x}[t] = [\,x[t],\; x[t-1],\; \ldots,\; x[t-p]\,] \qquad (1)$$

Fig. 3. A focused Time Lagged Feedforward Network (TLFN)

In training, every feature is normalized to the range from 0 to 1. The Levenberg-Marquardt algorithm has been chosen for learning, and the parameters are set as follows: learning rate = 0.05, maximum epochs = 300, minimum error = 0.001. A typical result, with no time delay and high performance reliability, is shown in Fig. 4.
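The data path of this section, from the absolute-derivative candidates through the tapped delay line of (1) to a three-layer network with a tangent-sigmoid hidden layer and a log-sigmoid output, can be sketched in NumPy as below. Several parts are assumptions made for illustration: derivatives are approximated by backward differences, the function names are hypothetical, and the weights are random placeholders, since the Levenberg-Marquardt training performed in MATLAB is not reproduced here.

```python
import numpy as np

def abs_derivative_features(accel):
    """Return |a|, |1st difference|, |2nd difference| for every sample."""
    a = np.asarray(accel, dtype=float)
    d1 = np.diff(a, prepend=a[0])    # backward difference as a 1st-derivative proxy
    d2 = np.diff(d1, prepend=d1[0])  # backward difference of d1 as a 2nd-derivative proxy
    return np.abs(a), np.abs(d1), np.abs(d2)

def delay_embed(x, p):
    """Row for time t holds [x[t], x[t-1], ..., x[t-p]] as in (1); first row is t = p."""
    x = np.asarray(x, dtype=float)
    return np.stack([x[p - k: len(x) - k] for k in range(p + 1)], axis=1)

def min_max_normalize(features):
    """Scale every column to the range [0, 1]; constant columns map to 0."""
    lo, hi = features.min(axis=0), features.max(axis=0)
    return (features - lo) / np.where(hi > lo, hi - lo, 1.0)

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Three-layer network: tanh hidden layer, logistic output in (0, 1)."""
    hidden = np.tanh(inputs @ w_hidden + b_hidden)
    return 1.0 / (1.0 + np.exp(-(hidden @ w_out + b_out)))

# Toy acceleration trace (assumed values).
accel = np.array([0.0, 0.0, 0.1, 0.5, -0.4, 0.3, 0.0, 0.0, 0.0, 0.0])
_, abs_d1, _ = abs_derivative_features(accel)
X = min_max_normalize(delay_embed(abs_d1, p=7))  # 8 inputs per sample for p = 7

rng = np.random.default_rng(0)
w_h, b_h = rng.normal(size=(8, 1)), np.zeros(1)  # one hidden node, untrained
w_o, b_o = rng.normal(size=(1, 1)), np.zeros(1)
scores = forward(X, w_h, b_h, w_o, b_o)          # one value in (0, 1) per sample
```

Because the delay line only looks backward, each output depends on the current and previous p samples, so no future samples are needed to classify time t.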

Fig. 4. Motion state recognition result. It is not yet rounded but clearly shows that the values at each sample can be categorized into motion state or non-motion state without latency.

IV. EXPERIMENT AND RESULT
A. Test
In order to verify the proposed method, 4 subjects (2 male and 2 female) have been recruited. In total, 112 acceleration recordings and the corresponding manual EPDs are collected while the subjects make the handwriting motions given in Fig. 5 twice. Only the acceleration on the X and Z axes is considered for testing. The collected manual EPD data are used as the target data in training and for result comparison. The collected data have been divided into 2 sets: a training set and a validation/test set. For training, 56 recordings are used, totaling 8671 samples. For validation and test, the other 56 recordings are used, consisting of 8939 samples. The validation set is used to decide when to stop adding nodes to the hidden layer. User-dependent and user-independent tests have been conducted respectively, and the recognition rates are calculated on a per-sample basis.

Fig. 5. Handwritings. The handwriting gestures were first proposed in [5], and we use them in honor of their contribution to the related research.

B. Test Results
The incremental node number at the hidden layer shows evident trends in Fig. 6. As the node number increases, the recognition rates decrease for all feature vectors. For the sequence of $|\dot{a}_t|$, this phenomenon gets clearer as the order of time delay increases. From this observation, it is concluded that one or two nodes are enough for the hidden layer and that more nodes may cause overfitting. Consequently, the recognition rates in Table III and Fig. 7 are compared with one node in the hidden layer.

Fig. 6. Node number increase at the hidden layer. The PCA-transformed 2D vector (top), the 1st-order delayed $|\dot{a}_t|$ (middle), and the 7th-order delayed $|\dot{a}_t|$ (bottom) are entered as features.

TABLE III
STATE RECOGNITION RATES

           Feature
Subject    2D(1)    3D(2)    FLD2D    FLD3D    PCA2D    PCA3D    $|\dot{a}_t|$(3)
1(M)       0.9605   0.9536   0.9605   0.9461   0.9605   0.9536   0.9524
2(F)       0.9389   0.9485   0.9389   0.9243   0.9389   0.9489   0.9478
3(F)       0.9332   0.9298   0.9332   0.9184   0.9332   0.9298   0.9408
4(M)       0.9552   0.9569   0.9552   0.9569   0.9552   0.9569   0.9560
All        0.9390   0.9412   0.9012   0.9116   0.9390   0.9412   0.9408
Mean       0.9454   0.9460   0.9378   0.9315   0.9454   0.9461   0.9476
Std        0.0118   0.0108   0.0233   0.0192   0.0118   0.0108   0.0068

(1) $[|a_t|, |\dot{a}_t|]$
(2) $[|a_t|, |\dot{a}_t|, |\ddot{a}_t|]$
(3) $|\dot{a}_t|$, 7th order

For evaluation, we discretize the output values, which range from 0 to 1, by rounding them to the nearest whole number. The user-dependent case shows better recognition rates than the user-independent one, although the difference varies by feature. It is especially notable that subjects 1 and 4, who are male, show better results than the female subjects 2 and 3. In fact, since everybody has different muscular strength, trembling, and habits, it is not easy to apply an identical criterion to various subjects, and Table III indicates that the gender gap reflects this aspect. Of the features, while the FLD-transformed ones show lower and inconsistent rates depending on the subject, the PCA-transformed ones record higher and more even rates regardless of subject. On average, the 7th-order sequence of $|\dot{a}_t|$ shows evenly stable results with the highest rate and the lowest standard deviation. The order has been determined by simulation, and it will be discussed in the next section.
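The sample-based evaluation used for Table III (round each network output to 0 or 1, then compare with the manual EPD label at the same sample) can be sketched as follows. The function names and the toy values are assumptions for illustration; a count of contiguous motion segments is included as a companion measure, since a per-sample rate by itself does not show whether a motion stays in one piece.

```python
import numpy as np

def round_states(outputs):
    """Round outputs in [0, 1] to the nearest integer (np.rint sends exactly 0.5 to 0)."""
    return np.rint(np.asarray(outputs, dtype=float)).astype(int)

def recognition_rate(outputs, manual_epd):
    """Fraction of samples whose rounded state equals the manual EPD label."""
    labels = np.asarray(manual_epd, dtype=int)
    return float(np.mean(round_states(outputs) == labels))

def count_motion_segments(states):
    """Number of contiguous runs of 1s, counted via 0 -> 1 transitions."""
    padded = np.concatenate([[0], np.asarray(states, dtype=int)])
    return int(np.sum(np.diff(padded) == 1))

# Toy example (assumed values): one sample inside the motion is missed.
outputs = [0.1, 0.8, 0.6, 0.4, 0.9]
manual_epd = [0, 1, 1, 1, 1]

rate = recognition_rate(outputs, manual_epd)
predicted_segments = count_motion_segments(round_states(outputs))
true_segments = count_motion_segments(manual_epd)
```

In this toy case four of five samples are correct, yet the single missed sample turns one labeled motion into two predicted segments.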

Fig. 7. Recognition comparison

V. DISCUSSION AND FURTHER WORK
On the whole, we have concluded that pattern recognition-based EPD meets the research objectives in that the approach offers quick response, easy optimization with fewer parameters, and versatile use. However, Fig. 8 raises a very important point about reliable EPD performance. A low error rate alone clearly does not guarantee complete EPD reliability, because false negatives are more fatal than false positives in the EPD case. At the bottom of Fig. 8, though the false negative at the 60th sample is simply counted as one wrongly detected sample, it is particularly significant because it separates one continuous motion into two pieces. Therefore, another evaluation is required that checks both end points and the motion connectivity. In this evaluation, the performance of other approaches will also be compared, to check whether a false negative like the one in Fig. 8 occurs only in the proposed method.

Fig. 8. Fatal false negative. $|\dot{a}_t|$ is displayed at the top and the state recognition results by ANN are shown at the bottom.

It is a positive aspect of the given result that the recognition rates vary with the value of p in (1), because it means that individual motion features can be encoded by the sequence of $|\dot{a}_t|$ and its order. Fig. 9 illustrates that the gender gap yields different critical values for determining the optimal order of $|\dot{a}_t|$. Female subjects still show little difference, while the male subjects' rates decline dramatically at the 8th order. Notably, it makes sense to understand the gender gap as a sort of individual gap.

Fig. 9. The optimal sequence order in $|\dot{a}_t|$

We expect the number of parameters in the proposed approach to be reduced as much as possible. If the sequence of $|\dot{a}_t|$ is chosen as the feature for data representation, it is consequently evident that we fail to achieve this objective, because the sequence of $|\dot{a}_t|$ is specified by one additional parameter, the order. If the use of a sliding window cannot be avoided (the order of the sequence is, as a result, the same as the size of the sliding window), it would be preferable to choose the maximum value in each window, as proposed in [11]. As in Fig. 10, when 10 samples are taken as the window size, CC increases rapidly to 0.7176. Considering that the largest CC in Table II is 0.5469, we may conclude that piecewise constant approximation by the maximum value in the moving window might be one of the best feature candidates.
If this reasoning is correct, then, as with Extreme Point Sampling, which adaptively determines the sliding window size in [11], our next study objective will naturally be to find the optimal window size that reflects personal motion characteristics, within the framework of pattern recognition-based EPD. In addition, a more detailed analysis of the relationship between data representation and recognition rates, and of the influence of other recognizers such as radial basis function networks or support vector machines, will be reported in a subsequent study.
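The maximum-in-window feature discussed above, and its comparison with the unwindowed feature by correlation coefficient, can be sketched as below. The exact windowing of [11] is not restated in this paper, so the trailing (causal) window, the function names, and the toy data are assumptions; the numbers produced here are unrelated to the reported 0.7176.

```python
import numpy as np

def sliding_max(x, window):
    """Trailing-window maximum: out[t] = max(x[t-window+1], ..., x[t])."""
    x = np.asarray(x, dtype=float)
    # Repeat the first sample so the output has the same length as the input.
    padded = np.concatenate([np.full(window - 1, x[0]), x])
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    return windows.max(axis=1)

def correlation(feature, labels):
    """Pearson correlation coefficient between a feature and the 0/1 labels."""
    return float(np.corrcoef(feature, labels)[0, 1])

# Toy data (assumed): the feature dips to zero in the middle of a labeled motion.
feature = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0])
manual_epd = np.array([0, 0, 1, 1, 1, 0, 0, 0])

windowed = sliding_max(feature, window=2)
cc_raw = correlation(feature, manual_epd)
cc_windowed = correlation(windowed, manual_epd)
```

The windowed maximum fills the dip inside the motion at the cost of extending the active region by up to `window - 1` samples after the motion ends, which is why the window size is the parameter left to optimize.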


Fig. 10. Maximum value selection in sliding window. The size of the
sliding window is 10.

REFERENCES
[1] J. R. Huddle, "Trends in inertial systems technology for high accuracy AUV navigation," in Proc. Workshop Autonomous Underwater Vehicles, Aug. 1998, pp. 63-73.
[2] N. M. Herbst and J. H. Morrissey, "Signature verification method and apparatus," U.S. Patent 3 983 535, Sep. 28, 1976.
[3] Robert Baron and Rejean Plamondon, "Acceleration Measurement with an Instrumented Pen for Signature Verification and Handwriting Analysis," IEEE Transactions on Instrumentation and Measurement, Vol. 38, No. 6, Dec. 1989, pp. 1132-1138.
[4] B. Milner, "Handwriting recognition using acceleration-based motion detection," IEE Colloquium on Document Image Processing and Multimedia, 1999.
[5] W. C. Bang, et al., "Self-contained spatial input device for wearable computers," in Proc. 7th IEEE Int. Symp. Wearable Computers, White Plains, NY, Oct. 2003, pp. 26-34.
[6] Eun-Seok Choi, et al., "Beatbox music phone: gesture-based interactive mobile phone using a tri-axis accelerometer," in Conf. Rec. IEEE International Conference on Industrial Technology 2005, Dec. 2005, pp. 97-102.
[7] "1.5g-6g Three Axis Low-g Micromachined Accelerometer: Freescale Semiconductor Technical Data," Freescale Semiconductor. Available: http://www.freescale.com/files/sensors/doc/fact_sheet/MMA7260QFS.pdf
[8] Zhuxin Dong, et al., "IMU-Based Handwriting Recognition Calibration by Optical Tracking," IEEE International Conference on Robotics and Biomimetics, 2007.
[9] Sung-Do Choi, Alexander S. Lee, and Soo-Young Lee, "On-Line Handwritten Character Recognition with 3D Accelerometer," in Proc. 2006 IEEE Int. Conf. Information Acquisition, Shandong, China, Aug. 2006, pp. 845-850.
[10] Lawrence Rabiner and Biing-Hwang Juang, Fundamentals of Speech Recognition, Korea, Prentice Hall, 1998, pp. 143-149.
[11] Jong Gwan Lim, Young-il Sohn, and Dong-soo Kwon, "Real-time Accelerometer Signal Processing of End Point Detection and Feature Extraction for Motion Detection," 10th IFAC/IFIP/IFORS/IEA Symposium on Analysis, Design, and Evaluation of Human-Machine Systems, Seoul, Korea, Sep. 2007.
[12] Jong Gwan Lim, Farrokh Sharifi, and Dong-soo Kwon, "Fast and Reliable Camera-tracked Laser Pointer System Designed for Audience," 5th Int. Conf. on Ubiquitous Robots and Ambient Intelligence, Seoul, Nov. 2008.
[13] Seokhee Jeon, et al., "Motion-Recognizing Game Controller with Tactile Feedback," KHCI, Korea, 2008.
[14] Ethem Alpaydin, "Introduction to Machine Learning," Cambridge, The MIT Press, 2004, pp. 106-131.
[15] C. G. Windsor and A. H. Harker, "Multi-variate financial index prediction - a neural network study," in Proc. Int. Neural Network Conf., Paris, France, 1990.
[16] Richard O. Duda, Peter E. Hart, and David G. Stork, "Pattern Classification," John Wiley & Sons, Inc., 2001, pp. 317-318.
