Techniques of Acoustic Feature Extraction For Detection and Classification of Ground Vehicles
Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 3, Issue 2, February 2013)
Abstract: Vehicles may be recognized from the sound they make when moving, i.e. from their acoustic signature. These sounds come from various sources, including rotating parts, vibrations in the engine, friction between the tires and the pavement, wind effects, gears and fans. Similar vehicles working in comparable conditions have similar acoustic signatures that can be used for recognition. Characteristic patterns may be extracted from the Fourier description of the signature and used for recognition. Classification of ground vehicles based on acoustic signals can be employed effectively in battlefield surveillance, traffic control and many other applications. The classification performance depends on the selection of signal features that determine the separation of different signal classes. This paper compares various available techniques of acoustic feature extraction for detection and classification of ground vehicles. Finally, we present an overview of the methods discussed and their success rates in tabular form, along with the classifier used.

Keywords: Feature Extraction, Energy Index, FFT, Classifier

A general assumption is that similar vehicle types produce similar sounds. The techniques for feature extraction and classification reviewed in this paper do not focus on identifying similar vehicles travelling at diverse speeds and at various distances from the recording device; instead, their goal is to detect and classify different vehicles that belong to a particular class.

A vehicle detection system consists of three main components: event detection, feature extraction and classification. Event detection keeps the required computational resources low: a simple, fast algorithm is used to find an acoustic event, and then a more complex, time-consuming algorithm is applied for an in-depth analysis of the observed event. In ref [1], an energy index is used to determine whether a vehicle is in range. This is done by defining a threshold and a time window. If the energy index of the observed signal is greater than the threshold within the window, a vehicle is present; otherwise the event is discarded. The energy index E[i] is computed in ref [1] using the following equation:

$$E[i] = \sum_{k} \log\left(X_i[k]\right)$$
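The threshold-and-window test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of ref [1]: the source does not define X_i[k], so each frame is taken here to be a list of positive values (for example spectral magnitudes), and "greater than the threshold within the window" is read as "above the threshold for `window` consecutive frames". The names `energy_index`, `detect_events`, `eps`, `threshold` and `window` are hypothetical.

```python
import math

def energy_index(frame_values, eps=1e-12):
    """E[i] = sum over k of log(X_i[k]) for one frame.

    eps is an assumed guard that avoids log(0); it is not part of the source formula.
    """
    return sum(math.log(max(x, eps)) for x in frame_values)

def detect_events(frames, threshold, window):
    """Return one flag per frame.

    A flag is True once the energy index has stayed above `threshold`
    for `window` consecutive frames (an assumed reading of the rule).
    """
    flags = []
    run = 0  # number of consecutive frames above the threshold
    for frame in frames:
        run = run + 1 if energy_index(frame) > threshold else 0
        flags.append(run >= window)
    return flags

quiet = [1.0, 1.0, 1.0, 1.0]      # energy index 0
loud = [10.0, 10.0, 10.0, 10.0]   # energy index 4 * ln(10)
frames = [quiet, loud, loud, loud, quiet]
print(detect_events(frames, threshold=5.0, window=2))
```

Only frames flagged True would be handed to the more expensive feature extraction and classification stages, which is how event detection keeps the computational load low.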
International Journal of Emerging Technology and Advanced Engineering
Care should be taken when choosing the techniques for feature extraction and classification, because successful detection of ground vehicles depends on the acoustic signatures generated from the extracted characteristic features. Many techniques for feature extraction and classification are now available to improve the efficiency of detection.

[Flowchart: Start, then find energy index E[i] (time function).]
Energy envelope consideration for feature extraction: Many characteristic features can be extracted from the energy envelope of the signal. In ref [9], 30 characteristic features are derived from the energy envelope, such as the mean energy and location of peaks within the analysis window, the number of zero crossings in each window, the maximum value of the averaged energy (avE), the approximate location of the maximum value of E/avE within each analysis window, and the number of windows having E/avE > 1.

Time Encoded Signal Processing and Recognition (TESPAR): In this method, the main parameters are the total number of zero crossings, the wave shape between two zero crossings and the time taken by the acoustic wave to reach the next zero crossing. The characteristic features of the signal are generated from these parameters. This method is adopted experimentally in ref [10] to generate features from vehicle acoustic and seismic signals.

2.2) Feature extraction in frequency domain:

The spectral characteristics of signatures vary significantly among target classes. In the spectral domain, the acoustic waveforms generated by a vehicle appear as narrow-band harmonic components. The information provided by these components is used to construct the characteristic features for a particular vehicle. Characteristic features are generated from the low-frequency components of the acoustic signal, because most of the sound produced by a vehicle comes from parts that rotate and reciprocate at low frequencies, mainly below 600 Hz. Another reason for using low-frequency bands is that they are the strongest and suffer the least attenuation [14, 15]. Frequency-domain feature generation methods such as the Fast Fourier Transform (FFT) [12, 13, 14, 15] and Power Spectral Density (PSD) [12] are commonly used in vehicle detection and classification. Here the overall shape of the frequency spectrum is used to construct the feature vector.

Feature generation using FFT: Let Y be the sound recorded to construct the feature vector. Y is divided into multiple sound frames $Y_i$, where $Y_i$ is the basic unit of analysis. The first pre-processing step removes the DC bias, which may be introduced by the device during sampling, by making each $Y_i$ zero-mean:

$$y_{i,j} = y_{i,j}^{(\mathrm{old})} - \frac{1}{n}\sum_{k=1}^{n} y_{i,k}$$

The zero-mean sound is then converted into the frequency domain by applying the FFT. The magnitude of the frequency spectrum, which is used to generate the feature vector, can be written as

$$Y_i^{\mathrm{fft}} = \mathrm{FFT}(Y_i)$$

Variation in the distance to the vehicle and in microphone sensitivity can cause amplitude variation in the frequency spectrum. To reduce it, the frequency spectrum is normalized:

$$y_{i,j}^{\mathrm{fft}} = \frac{x_{i,j}^{\mathrm{fft}}}{\sum_{k=1}^{n} x_{i,k}^{\mathrm{fft}}}$$

where 0 < i < (N+1). Each vehicle has a unique $Y_i^{\mathrm{fft}}$ that characterizes its sound. By averaging the N spectra $Y_i^{\mathrm{fft}}$, a stable characteristic of the vehicle, $Y_\mu$ with components $y_{\mu,j}$, can be obtained:

$$y_{\mu,j} = \sum_{i=1}^{N} \frac{x_{i,j}^{\mathrm{fft}}}{N}$$

To highlight the overall shape of the spectrum and to filter out the short-term fluctuation of $y_{\mu,j}$, a moving average filter is used:

$$y_{\mu,j} = \sum_{k} y_{\mu,k}, \qquad (j+1) < k < (j+w+1)$$

Thus the acoustic signature of the vehicle is given by $Y_\mu = [y_{\mu,1}, y_{\mu,2}, \ldots, y_{\mu,(n/2)-w}]$, where w is the window size.

In ref [12], four feature vectors are generated by applying the Linear Fast Fourier Transform (LFFT), Multidimensional Fast Fourier Transform (MFFT), Linear Power Spectral Density (LPSD) and Multidimensional Power Spectral Density (MPSD) to the time series data of the recorded signals. Two types of spectral features are proposed in ref [11]: non-parametric FFT-based PSD estimates, and parametric PSD estimates using autoregressive (AR) modelling of the time series. AR modelling is explored primarily to improve the statistical reliability of the PSD estimates, but because it requires more parameters it is difficult to apply experimentally, so ref [11] reports results based only on non-parametric PSD features. Additional time-averaging over shorter FFT segments may be used to reduce the variance of the PSD estimates, at the cost of smoothing the spectral features.

Harmonics can also be used to extract a feature vector [16, 17]. Harmonics are the peaks present in the spectral-domain representation of a signal. The relation between the amplitude and phase of these harmonics is used to form the feature vector, known as the Harmonic Line Association (HLA) feature vector.
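The FFT-based feature generation steps of this section (DC removal, magnitude spectrum, normalization, averaging over frames, moving-average smoothing) can be sketched with NumPy as follows. This is a minimal illustration under assumptions, not the code of the cited works: the function name `fft_signature` is hypothetical, the averaging is applied to the normalized spectra, the smoothing divides by w (the source formula shows a plain sum), and the result is trimmed to (n/2) - w values to match the signature length stated in the text.

```python
import numpy as np

def fft_signature(frames, w):
    """Compute a smoothed, averaged magnitude-spectrum signature.

    frames: array of shape (N, n), i.e. N sound frames of n samples each.
    w: moving-average window size (assumed to be smaller than n/2).
    """
    frames = np.asarray(frames, dtype=float)
    n = frames.shape[1]

    # Step 1: remove the DC bias of every frame (zero mean).
    centered = frames - frames.mean(axis=1, keepdims=True)

    # Step 2: magnitude of the frequency spectrum of every frame.
    spectra = np.abs(np.fft.fft(centered, axis=1))

    # Step 3: normalize each spectrum by the sum of its bins.
    totals = spectra.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # assumed guard for silent frames
    normalized = spectra / totals

    # Step 4: average the N normalized spectra bin by bin.
    mean_spectrum = normalized.mean(axis=0)

    # Step 5: smooth the first n/2 bins with a length-w moving average
    # and keep (n/2) - w values, the signature length given in the text.
    half = mean_spectrum[: n // 2]
    smoothed = np.convolve(half, np.ones(w) / w, mode="valid")
    return smoothed[: n // 2 - w]

t = np.arange(64)
tone = np.sin(2 * np.pi * 4 * t / 64)            # energy in FFT bin 4
frames = np.stack([tone, 2.0 * tone, 0.5 * tone])  # same source, different levels
print(fft_signature(frames, w=4).shape)
```

Because of the normalization step, the three frames above contribute identical spectra even though their amplitudes differ, which is the purpose of that step in the text.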
In ref [16], the magnitudes of the 2nd to 12th harmonic frequency components are taken as the HLA feature vector, whereas ref [17] proposes a fusion approach in which a combination of two sets of features, emphasizing different aspects of an acoustic signature, is used in the classification phase. The first set consists of a number of harmonic components, which account for the engine noise; the second set is formed by grouping the key frequency components and describes minor components of the acoustic signature, such as the tire friction noise of a vehicle. Harmonic feature extraction (the first set) is in common use [18, 19]. The main concern is the selection of key frequency components, which is done using a mutual information (MI) based method. Mutual information is a metric based on the statistical dependence between two random variables [20]. A feature vector made from key frequency components contributes most of the discriminatory information. After these two feature sets are constructed, a feature-level fusion process is applied to obtain a final feature vector that combines the effect of both sets. The dimensionality is kept the same as that of the first feature space by replacing the (less important) higher-order harmonics with the same number of key frequency components. Key frequency components are selected to be unrelated to the fundamental frequency, to obtain a better feature vector after fusion. Fusing the two feature vectors provides a more complete description of a vehicle's acoustic signature than techniques that use a single feature set.

2.3) Feature extraction in time-frequency domain:

The techniques used to extract features in the time-frequency domain are the Short Time Fourier Transform (STFT) and the Wavelet Transform (WT) [21]. Ref [22] compares probability-based classifiers that are trained with Bayesian subspace principal components of the STFT of the vehicle's acoustic signature. In ref [1], the short-time FFT is used to transform overlapped, Hamming-windowed acoustic blocks into a feature vector. The wavelet transform also provides multi-resolution time-frequency analysis [23]. It is the projection of a signal onto the wavelet. A wavelet is a family of functions $\psi_{ab}(t)$ derived from a base function $\psi(t)$ by translation and dilation:

$$\psi_{ab}(t) = \frac{1}{|a|}\,\psi\!\left(\frac{t-b}{a}\right)$$

where a is the scale parameter, b is the translation (shift) parameter and $\psi(t)$ is the wavelet base function. The transform is called the Continuous Wavelet Transform (CWT) when the values of a and b are continuous, and the Discrete Wavelet Transform (DWT) when they are discrete.

Wavelet-based acoustic detection: This transform is computationally efficient. The main idea of this technique is to extract a set of features from the class of signals emitted by a certain vehicle. The features are obtained by calculating the energies in the blocks of wavelet packet coefficients of the signal, each of which is related to a certain frequency band. The wavelet packet transform of a signal yields different partitions of the frequency domain. Because of the time-variance property of the multiscale wavelet packet decomposition, whole blocks of wavelet packet coefficients are used instead of individual coefficients and waveforms. The collection of block energies can be regarded as an averaged Fourier spectrum of the signal, which provides an improved representation of signals. One method for utilizing the wavelet packet coefficients is a random search for a near-optimal footprint (RSNOFP) of a class of signals [24]. This is very close to the idea of compressed sensing [3, 25, 26]. For more reliable detection, three different versions of RSNOFP that validate each other can be implemented. To generate the feature vectors, a set of signals with known membership is used. These signals are sliced into overlapping fragments of length L, and each fragment is subjected to the wavelet packet transform. The wavelet packet transform bridges the time-domain and frequency-domain representations of a signal. The coefficients from the upper levels (finer scales) correspond to basic waveforms that are narrow in the time domain but occupy wide frequency bands, while the coefficients from the deeper levels correspond to waveforms that are narrow in the frequency domain but extend over a long range in the time domain [24]. The energy of each block is calculated, and then the three versions of RSNOFP are applied. As a result, each fragment is represented by three different vectors of length l, whose components are the characteristic features of the fragment.

Nowadays a broad variety of orthogonal and bi-orthogonal filters are available that can generate wavelet packet coefficients.
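The idea of slicing a signal into overlapping fragments and using the energies of wavelet packet coefficient blocks as features can be illustrated with a short NumPy sketch. Several things here are assumptions rather than the method of ref [24]: the Haar filter is swapped in for simplicity (any orthogonal filter would serve the same purpose), the blocks are returned in the natural tree order rather than sorted by frequency, the RSNOFP selection step is not implemented, and the names `overlapping_fragments`, `haar_wavelet_packet` and `block_energies` are hypothetical.

```python
import numpy as np

def overlapping_fragments(signal, length, step):
    """Slice a signal into fragments of `length` samples, one every `step` samples."""
    signal = np.asarray(signal, dtype=float)
    return [signal[s:s + length] for s in range(0, len(signal) - length + 1, step)]

def haar_wavelet_packet(fragment, level):
    """Return the 2**level coefficient blocks at depth `level` of a Haar packet tree.

    Every block is split again at each level, which is what distinguishes
    a wavelet packet tree from the ordinary wavelet transform.
    """
    fragment = np.asarray(fragment, dtype=float)
    if len(fragment) % (2 ** level) != 0:
        raise ValueError("fragment length must be divisible by 2**level")
    blocks = [fragment]
    for _ in range(level):
        next_blocks = []
        for block in blocks:
            even, odd = block[0::2], block[1::2]
            next_blocks.append((even + odd) / np.sqrt(2.0))  # low-pass half
            next_blocks.append((even - odd) / np.sqrt(2.0))  # high-pass half
        blocks = next_blocks
    return blocks

def block_energies(fragment, level):
    """Feature vector: the energy (sum of squares) of each coefficient block."""
    return np.array([np.sum(b ** 2) for b in haar_wavelet_packet(fragment, level)])

rng = np.random.default_rng(0)
signal = rng.standard_normal(256)
features = [block_energies(f, level=3) for f in overlapping_fragments(signal, 64, 32)]
print(len(features), features[0].shape)
```

Since the Haar filter pair is orthonormal, the block energies of a fragment add up to the energy of the fragment itself, so the feature vector describes how that energy is distributed over the packet blocks.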
In [27] researchers used 8th order spline wavelet packets and 5 wavelet packet with 10 vanishing moments

III. TECHNIQUES FOR CLASSIFICATION PHASE

This is the final phase of the vehicle detection process, in which the class and type of a vehicle are determined with the help of a classifier. A classifier provides the functions or rules used to divide the feature space into regions, where each region belongs to a particular class. Classifiers can be categorized as parametric or non-parametric, based on the knowledge of the signal distribution parameters. A parametric classifier is one that can be represented in closed form, i.e. some assumptions are made about the probability density function (pdf) of each class, whereas in non-parametric classifiers no assumptions are made about the density function. The Bayesian classifier [22, 28], Support Vector Machine (SVM), Gaussian Mixture Model (GMM) and Hidden Markov Model (HMM) [29, 30] are some commonly used parametric classifiers in vehicle classification based on acoustic signatures. The HMM is a sequential pattern model, while the GMM is a static pattern model. Examples of non-parametric classifiers are K-nearest neighbour (KNN) [5, 31], Artificial Neural Network (ANN) [6, 14], Decision Tree [32] and Fuzzy Logic Rule-Based Classifiers. Among these, KNN is rarely used because of its large memory requirement and increased computational complexity [33].

(2) $p(x/C_i)$: the class-specific probability, i.e. the extent to which x belongs to $C_i$, assumed to be normally distributed.

The value of the second parameter can be determined as [28]:

$$p(x/C_i) = \frac{1}{(2\pi)^{n/2}\,|\sigma_i|^{1/2}} \exp\left[-\frac{1}{2}(x-\mu_i)'\,\sigma_i^{-1}(x-\mu_i)\right]$$

where $\mu_i$ is the mean vector and $\sigma_i$ is the covariance matrix of dimension n*n.

If $C_{est}$ is the estimated class at the output of the classifier, then by the optimum decision rule, set $C_{est} = C_i$ if

$$x \in C_i: \quad p(x/C_i)\,p(C_i) > p(x/C_j)\,p(C_j), \quad \text{for all } j \neq i$$

For simpler computation, the authors of ref [28] consider a monotone natural logarithmic function $H_i(x)$ instead of $p(x/C_i)p(C_i)$. With $H_i(x) = \ln[p(x/C_i)\,p(C_i)]$, the new classification rule is

$$x \in C_i: \quad H_i(x) > H_j(x), \quad \text{for all } j \neq i$$

The value of $H_i(x)$ can be calculated easily once $p(x/C_i)$ is known. To compute this term, the mean vector and covariance matrix are given as

$$\mu_i = \frac{1}{m}\sum_{j=1}^{m} x_j \qquad \text{and} \qquad \sigma_i = \sum_{j=1}^{m} x_j x_j' - \mu_i\mu_i'$$

The Bayesian classifier requires a large training set; otherwise it will not be able to classify vehicles correctly. This requirement makes it hard to compute the discriminant functions.

3.2) Hidden Markov Model (HMM) Based Classifier:
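Before the HMM-based classifier is considered, the Bayesian decision rule given just before this heading, which picks the class with the largest $H_i(x) = \ln[p(x/C_i)\,p(C_i)]$ under normally distributed class densities, can be illustrated with a short NumPy sketch. It is a hedged example, not the procedure of ref [28]: the function names `fit_class`, `log_discriminant` and `classify` and the toy data are hypothetical, and the covariance is estimated with an explicit 1/m factor on the sum of outer products, which is an assumption about the intended formula.

```python
import numpy as np

def fit_class(samples):
    """Estimate the mean vector and covariance matrix of one class."""
    samples = np.asarray(samples, dtype=float)
    m = len(samples)
    mu = samples.mean(axis=0)
    # Assumed form: (1/m) * sum_j x_j x_j' - mu mu'
    sigma = samples.T @ samples / m - np.outer(mu, mu)
    return mu, sigma

def log_discriminant(x, mu, sigma, prior):
    """H_i(x) = ln[p(x/C_i) p(C_i)] for a Gaussian class density."""
    x = np.asarray(x, dtype=float)
    n = len(mu)
    diff = x - mu
    _, logdet = np.linalg.slogdet(sigma)        # ln|sigma_i|
    quad = diff @ np.linalg.solve(sigma, diff)  # (x-mu)' sigma^-1 (x-mu)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad) + np.log(prior)

def classify(x, models):
    """models maps a class label to (mu, sigma, prior); return the label with the largest H_i(x)."""
    return max(models, key=lambda label: log_discriminant(x, *models[label]))

# Toy two-dimensional feature vectors for two hypothetical vehicle classes.
class_a = [[0, 0], [1, 0], [0, 1], [1, 1]]
class_b = [[5, 5], [6, 5], [5, 6], [6, 6]]
models = {
    "A": (*fit_class(class_a), 0.5),
    "B": (*fit_class(class_b), 0.5),
}
print(classify([0.4, 0.6], models), classify([5.2, 5.9], models))
```

Working with the logarithm avoids evaluating the exponential and the normalizing constant directly, which is the computational simplification the text attributes to ref [28].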
The following table gives a summary and comparison in terms of acoustic feature extraction, classification techniques and efficiency of detection.

Table 1

S. No. | Feature extraction method       | Classifier                           | Number of classes | Classification rate
-------|---------------------------------|--------------------------------------|-------------------|----------------------
1      | FFT, DWT, STFT, PCA             | KNN, Bayesian classifier             | 4                 | 85%-88%
2      | STFT, PCA                       | ANN                                  | 3                 | ---
3      | Energy Envelope, PCA            | ANN                                  | 5                 | 97%
4      | FFT, PSD                        | KNN, Bayesian classifier             | 2                 | 78%-97%
5      | Harmonic Line Association (HLA) | ANN                                  | 18                | 88%
6      | FFT, WT                         | Bayesian classifier (MPP), KNN       | 3                 | 95.5%
7      | MFCC, FFT                       | Bayesian classifier                  | 2 / 4 / 5         | 97.95 / 92.24 / 78.67
8      | WPT                             | CART                                 | ---               | ---
9      | Cepstral Coefficient            | Hidden Markov Model (HMM)            | 9                 | 96%
10     | PSD                             | KNN, Bayesian classifier (ML), SVM   | 2                 | Up to 97%

REFERENCES

[1] Andreas Klausner, Stefan Erb, Allan Tengg, Bernhard Rinner, “DSP based Acoustic Vehicle Classification for Multi-Sensor Real-Time Traffic Surveillance.” 15th European Signal Processing Conference (EUSIPCO 2007), Poznan, Poland, September 3-7, 2007.
[2] Y. Tsaig, D. L. Donoho, I. Drori and J. L. Starck, “Sparse solution of underdetermined linear equations by stagewise orthogonal matching pursuit.” Technical Report No. 2006-2, Department of Statistics, Stanford University, April 2006.
[3] D. L. Donoho, “Compressed sensing.” IEEE Transactions on Information Theory, 52(4), April 2006.
[4] J. Romberg, E. Candes and T. Tao, “Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information.” IEEE Transactions on Information Theory, 52(2), February 2006.
[5] X. Wang, H. Qi, “Acoustic target classification using distributed sensor arrays.” In Proc. IEEE ICASSP, 4:4186–4189, 2002.
[6] H. Wu, M. Siegel, P. Khosla, “Vehicle sound signature recognition by frequency vector principal component analysis.” IEEE Transactions on Instrumentation and Measurement, 48(5):1005–1009, 1999.
[7] A. Averbuch, N. Rabin, A. Schclar, V. Zheludev, “Dimensionality reduction for detection of moving vehicles.”
[8] S. Erb, “Classification of vehicles based on acoustic features.” Master’s thesis, reviewer: Univ.-Prof. Dipl.-Ing. Dr. Bernhard Rinner, 2007.
[9] S. Somkiat, “Neural fuzzy techniques in vehicle acoustic signal classification.” Ph.D. dissertation, chair: Vanlandingham, Hugh F., 1997.
[10] G. P. Mazarakis and J. N. Avaritsiotis, “Vehicle classification in sensor networks using time-domain signal processing and neural networks.” Microprocess. Microsyst., 31(6):381–392, 2007.
[11] D. Li, K. D. Wong, Y. H. Hu, and A. M. Sayeed, “Detection, classification and tracking of targets in distributed sensor networks.” In IEEE Signal Processing Magazine, 2002.
[12] M. Baljeet, N. Ioanis, H. Janelle, “Distributed classification of acoustic targets in wireless audio-sensor networks.” Computer Networks, 52(13):2582–2593, 2008.
[13] S. S. Yang, Y. G. Kim, H. Choi, “Vehicle identification using discrete spectrums in wireless sensor networks.” Journal of Networks, 3(4):51–63, 2008.
[14] G. Succi, T. Pedersen, R. Gampert and G. Prado, “Acoustic target tracking and target identification-recent results.” In Proceedings of the SPIE - The International Society for Optical Engineering, 3713:10–21, 1999.
[15] A. Aljaafreh, L. Dong, “An evaluation of feature extraction methods for vehicle classification based on acoustic signals.” International Conference in Networking, Sensing and Control (ICNSC), 2010.
[16] A. Y. Nooralahiyan, H. R. Kirby, D. McKeown, “Vehicle classification by acoustic signature.” Mathematical and Computer Modelling, 27:9–11, 1998.
[17] B. Guo, M. Nixon, and T. Damarla, “Acoustic information fusion for ground vehicle classification.” In Information Fusion, 11th International Conference, 2008.
[18] D. Lake, “Harmonic phase coupling for battlefield acoustic target identification.” In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 1998.
[19] D. Lake, “Tracking fundamental frequency for synchronous mechanical diagnostic signal processing.” In Proceedings of 9th IEEE Signal Processing Workshop on Statistical Signal and Array Processing, 1998.
[20] R. Battiti, “Using mutual information for selecting features in supervised neural net learning.” IEEE Transactions on Neural Networks, 5(4):537–550, July 1994.
[21] Y. Sun and H. Qi, “Dynamic target classification in wireless sensor networks.” In Pattern Recognition, ICPR, 19th International Conference, 2008.
[22] E. M. Munich, “Bayesian subspace methods for acoustic signature recognition of vehicles.” In Proceedings of the European Signal Processing Conference (EUSIPCO-04), Vienna, Austria, Sept. 2004.
[23] H.-l. Wang, W. Yang, W.-d. Zhang, and Y. Jun, “Feature extraction of acoustic signal based on wavelet analysis.” In ICESS Symposia ’08: Proceedings of the 2008 International Conference on Embedded Software and Systems Symposia. Washington, DC, USA: IEEE Computer Society, 2008.
[24] Amir Averbuch, Valery A. Zheludev, Neta Rabin, Alon Schclar, “Wavelet-based acoustic detection of moving vehicles.” Multidim Syst Sign Process, Springer Science+Business Media LLC, 2008.
[25] E. Candes, J. Romberg, T. Tao, “Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information.” IEEE Transactions on Information Theory, 52(2), 489–509 (2006).
[26] D. Donoho, Y. Tsaig, “Extensions of compressed sensing.” Signal Processing, 86(3), 533–548 (2006).
[27] A. Z. Averbuch, E. Hulata, V. A. Zheludev, I. Kozlov, “A wavelet packet algorithm for classification and detection of moving vehicles.” Multidimensional Systems and Signal Processing, 12(1), 9–31 (2001).
[28] M. Friedman and A. Kandel, “Introduction to pattern recognition: statistical, structural, neural, and fuzzy logic approaches.” World Scientific, 1999.
[29] W. J. Roberts, H. W. Sabrin, Y. Ephraim, “Ground vehicle classification using hidden markov models.” In Atlantic Coast Technologies Inc., Silver Spring MD, 2001.
[30] Ahmad Aljaafreh, Liang Dong, “Hidden Markov Model Based Classification Approach for Multiple Dynamic Vehicles in Wireless Sensor Networks.”
[31] D. Li, K. Wong, Y. H. Hu, and A. Sayeed, “Detection, classification, and tracking of targets.” IEEE Signal Processing Magazine, 19(2):17–29, 2002.
[32] H. Xiao, Q. Yuan, X. Liu, and Y. Wen, “Advanced Intelligent Computing Theories and Application, with Aspects of Theoretical and Methodological Issue.” Springer Berlin/Heidelberg, 2007.
[33] Ahmad Aljaafreh, Ala Al-Fuqaha, “Multi-Target Classification Using Acoustic Signatures in Wireless Sensor Networks: A survey.” Signal Processing-An International Journal (SPIJ), Volume (4): Issue (4).
[34] B. Moghaddam, A. Pentland, “Probabilistic visual learning for object detection.” In International Conference on Computer Vision, pages 786–793, 1995.
[35] G. D. Forney, “Exponential error bounds for erasure, list, and decision feedback schemes.” IEEE Trans. on Information Theory, vol. 14, no. 2, pp. 206–220, Mar. 1968.
[36] A. V. Oppenheim, R. W. Schafer, “Discrete-Time Signal Processing.” Prentice Hall, Englewood Cliffs, NJ, 1989.
[37] M. Wälchli, T. Braun, “Event classification and filtering of false alarms in wireless sensor networks.” Parallel and Distributed Processing with Applications, International Symposium on, 0:757–764, 2008.
[38] R. Mgaya, S. Zein-Sabatto, A. Shirkhodaie, W. Chen, “Vehicle identifications using acoustic sensing.” In SoutheastCon, 2007 Proceedings. IEEE, 2007.
[39] H. Qi, X. Tao, L. H. Tao, “Vehicle classification in wireless sensor networks based on rough neural network.” In ACST’06: Proceedings of the 2nd IASTED International Conference on Advances in Computer Science and Technology. Anaheim, CA, USA: ACTA Press, 2006.