Autonomous Classification of Intra- and Interspecific Bee Species Using Acoustic Signals in Real Time
David Ireland
School of Information Technology and Electrical Engineering
University of Queensland
Brisbane, Australia 4072
Abstract—This paper pertains to the development of a real-time classification system for the discrimination of intra- and interspecific bee species using the K-nearest neighbor and probabilistic neural network classification algorithms. The intended applications for this system are autonomous surveillance of invasive bee species and monitoring tools for entomologists. The system was developed on a low-cost platform and showed at least 80% classification accuracy on two colonies of the same bee species and 100% accuracy in the classification of four distinct bee species.
I. INTRODUCTION
With the rapid decline of insect pollinators there is an
increasing demand for tools that provide autonomous tracking
of the movements and activities of pollinating insects. This
paper focuses on the initial development of a system for the detection and classification of bees: an essential insect in the production of the global food supply. In a cost-effective and portable platform, our system aims to:
1) Provide a tool for entomologists to study the behavioral traits of foraging bees; for example, to determine which bee species favor pollinating particular agricultural crops.
2) Provide an autonomous surveillance system for invasive bee species that present a potential hazard to existing ecosystems. Australia, for example, considers the Asian bee (Apis cerana) and the bumble bee (Bombus terrestris) invasive.
Given the enormous diversity of insects, autonomous detection is a widely researched field. Insect classification methods can usually be placed into two broad categories: acoustic and imaging methods. A perusal of the literature shows acoustic methods are mainly used for field measurements, while imaging approaches are conducted in a laboratory environment, usually on deceased specimens. Examples using acoustic methods include detection systems for insects in grain silos [1], [2], classification of mosquitoes in [3], [4], and of aphids in [5]. Identifying
crickets based on their sounds can be found in [6] and [7].
Examples of insect classification using imaging systems can
be found in [8] for the identification of aquatic insects and [9]
for the identification of aphids.
The method proposed in this paper relies solely on acoustic
signals emitted by the insects during flight. The novelty of
this paper is the discrimination of bee species using acoustic signals, which is absent from the literature. Moreover, the emphasis on a cost-effective field detection/classification system operating in real time is a major feature of this paper.
II. ACOUSTIC INSECT DETECTION
Insect classification by acoustic signals emitted during flight
is not a new technique. The method relies on the phenomenon that the acoustic signal emitted by an insect in flight has a fundamental frequency approximately equal to the wing-beat frequency of the insect [11]. Further spectral analysis also reveals a harmonic series in which the dominant frequencies are often not the fundamental frequency [11]. Figures 2 and 3 give the spectrograms of two distinct bee species, Apis mellifera and Amegilla cingulata. Both waveforms have similar fundamental frequencies of approximately 220Hz; however, in the latter example the fundamental frequency is not the dominant frequency, and more power lies in the higher harmonics than for the Apis mellifera species.
A statistical analysis in [10] has shown that the wing-beat frequency (and thus the produced fundamental frequency) is inversely proportional to the wing area of the insect. Given the extensive variation in insect anatomy, the wing-beat frequency and the associated harmonic series form a feature set that can be extracted electronically. This work was inspired by Moore et al. [3], [4], [5], who pioneered insect discrimination using harmonic sets.
Figure 1 provides a flowchart of our proposed field system.
The system records continuously; after some duration, the fundamental frequency f_o is determined. If f_o is determined to be in a region of interest, a feature vector is constructed from the audio sample and subsequently classified and logged. Sections III and IV discuss the feature vector extraction and the classification algorithms used in this instance.
III. FEATURE VECTOR GENERATION
Given the harmonic nature of the signal emitted by insects
in flight, the cepstrum method was used in determining the
fundamental frequency. This method involves first finding the
cepstrum power using:
C(q) = \left| \mathcal{F}\left\{ \log\left( \left| \mathcal{F}\{y(n)\} \right|^{2} \right) \right\} \right|^{2}    (1)
Fig. 1. An overview flowchart of the proposed detection/classification system: record audio sample; compute f_o; if f_o ∈ [f_o^min, f_o^max], extract the feature vector, classify it and log the event; otherwise discard the audio sample.
Fig. 2. Spectrogram of an acoustic signal emitted by a European honey bee Apis mellifera during flight.
where y(n) is the sampled waveform, \mathcal{F}(\cdot) denotes the Fourier transform, and q is the unit of the cepstrum power, referred to as the quefrency. Subsequently f_o is found by:

f_o = \frac{f_s}{\operatorname{argmax}_q C(q)}    (2)
where f_s is the sampling frequency. In order to scale f_o for the classification algorithm, we propose the following normalisation function:

f_o^{*} = \frac{f_o - f_o^{\min}}{f_o^{\max} - f_o^{\min}}    (3)

where f_o^{\min} and f_o^{\max} are the minimum and maximum possible values of f_o.
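As an illustration, the following Python sketch estimates f_o from a 1-second buffer using equations (1)-(3). It is a minimal reconstruction, not the author's C++ code; the small logarithm offset and the restriction of the quefrency search to the band of interest are assumptions.

```python
import numpy as np

def estimate_fo(y, fs, fo_min=150.0, fo_max=300.0):
    """Estimate the wing-beat fundamental f_o of a sampled waveform y."""
    log_power = np.log(np.abs(np.fft.fft(y)) ** 2 + 1e-12)  # offset avoids log(0)
    cepstrum = np.abs(np.fft.fft(log_power)) ** 2           # C(q), equation (1)
    # f_o = fs / q (equation (2)), so only search q in [fs/fo_max, fs/fo_min].
    q_lo, q_hi = int(fs / fo_max), int(fs / fo_min)
    q_star = q_lo + int(np.argmax(cepstrum[q_lo:q_hi + 1]))
    fo = fs / q_star
    fo_norm = (fo - fo_min) / (fo_max - fo_min)             # equation (3)
    return fo, fo_norm
```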
The next step in creating the feature vector is to compute the relative power at multiples of the estimated f_o. This is achieved by summing the power spectral density G_y(f) over a ±5% band around each harmonic of interest. Using a sampling frequency f_s of 44.1kHz and a fast Fourier transform length of f_s, we have for each multiple n, after some simplifications:
Fig. 3. Spectrogram of an acoustic signal emitted by an Australian blue-banded bee Amegilla cingulata during flight.
h_n = \sum_{f=\lfloor 19 n f_o / 20 \rfloor}^{\lceil 21 n f_o / 20 \rceil} G_y(f) \quad \forall n = 1, \ldots, N_h    (4)
where N_h is the number of multiples considered; this is an arbitrary value and depends on the bee species to be detected. The functions \lfloor\cdot\rfloor and \lceil\cdot\rceil denote the floor and ceiling functions respectively. The h_n values are further normalised using:

h_n^{*} = \frac{h_n}{\max\{h_1, h_2, \ldots, h_{N_h}\}}    (5)
Finally, the feature vector is defined as:

\mathbf{x} = \left[ f_o^{*}, h_1^{*}, h_2^{*}, \ldots, h_{N_h}^{*} \right]    (6)
For future reference we denote the length of \mathbf{x} as N_x, where N_x = N_h + 1.
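A sketch of the feature extraction of equations (4)-(6) follows. It assumes G_y is a one-sided PSD indexed in 1Hz bins (the FFT length equals f_s, as stated above); the function name and the default N_h = 11 (giving N_x = 12, as in experiment 2) are illustrative choices.

```python
import numpy as np

def feature_vector(Gy, fo, fo_norm, Nh=11):
    """Build x from a one-sided PSD Gy with 1 Hz bins (FFT length = fs)."""
    h = np.empty(Nh)
    for n in range(1, Nh + 1):
        lo = int(np.floor(19 * n * fo / 20))   # lower band edge, equation (4)
        hi = int(np.ceil(21 * n * fo / 20))    # upper band edge
        h[n - 1] = Gy[lo:hi + 1].sum()         # power in the +/-5% band
    h /= h.max()                               # equation (5)
    return np.concatenate(([fo_norm], h))      # equation (6): N_x = N_h + 1
```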
IV. CLASSIFICATION ALGORITHMS
For convenience we define the classification of the feature vector \mathbf{x} as the function:

D(\mathbf{x}) \in \{1, 2, \ldots, N_{\mathrm{class}}\}    (7)

where N_{\mathrm{class}} is the number of classes (or bee species) considered.
A. K-Nearest Neighbor Method
The K-nearest neighbor method (kNN) is a widely used
classification method. Given an unknown sample, the kNN
method finds the K nearest objects (training data) typically
using the Euclidean distance as a metric. Subsequently, the
sample is classified based on a majority vote of the K objects.
For example, if:

\{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \ldots, \mathbf{x}_K\}    (8)

denote the K nearest feature vectors to the unknown feature vector, determined by some distance metric, then the newly assigned class is determined by:

k = \mathcal{M}\{ D(\mathbf{x}_1), D(\mathbf{x}_2), \ldots, D(\mathbf{x}_K) \}    (9)

where \mathcal{M}(\cdot) computes the mode of the set of classes. The Euclidean distance metric was used in this paper for all uses of the kNN algorithm.
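The rule of equations (8)-(9) is compact in code. The sketch below is illustrative; the function and variable names are assumptions, not the author's implementation.

```python
import numpy as np
from collections import Counter

def knn_classify(x, X_train, y_train, K=5):
    """Assign x the modal class of its K nearest training vectors."""
    d = np.linalg.norm(X_train - x, axis=1)     # Euclidean distances
    nearest = np.argsort(d)[:K]                 # indices of the K closest vectors
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]           # mode of the K labels, equation (9)
```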
B. Probabilistic Neural Network
Probabilistic neural networks (pNNs) are a practical means of implementing Bayesian classification techniques. If an object is to be classified into one of two classes denoted i and j, then class i is chosen according to the Bayes optimal decision rule:

h_i c_i f_i(\mathbf{x}) > h_j c_j f_j(\mathbf{x})    (10)

where c_i denotes the loss associated with misclassifying \mathbf{x}, h_i is the prior probability of occurrence of the ith class, and f_i(\mathbf{x}) is the posterior probability density function (PDF) for the ith class. In practice f_i(\mathbf{x}) is usually not known and must be estimated using Parzen's method. This involves taking an average sum of a suitably chosen kernel over each observation in the training data [12].
The Gaussian function is a common choice for the kernel as it is well behaved and easily computed [12]. After some simplification, the estimated PDF for a particular class k with N_k training observations becomes:

f_k(\mathbf{x}) = \frac{1}{N_k} \sum_{i=1}^{N_k} \exp\left( -\frac{\| \mathbf{x} - \mathbf{x}_{ki} \|^{2}}{\sigma^{2}} \right)    (11)
where \mathbf{x}_{ki} is the ith example of the training data for class k, and σ is a scaling parameter that controls the area of influence of the kernel. There is no rigorous mathematical method to determine an optimal σ; however, the author has found a simple first-order optimisation approach such as the gradient descent method [13] quite efficient in determining a suitable σ for the training set prior to the system being placed online.

Assuming the misclassification losses and prior probabilities of occurrence are constant, the class belonging to the feature vector is determined by:

D(\mathbf{x}) = \operatorname{argmax}_n \{ f_1, f_2, \ldots, f_n, \ldots, f_{N_{\mathrm{class}}} \}    (12)
V. EXPERIMENT SETUP
A. Hardware
An algorithm to perform the operation given in figure 1
was programmed on a FriendlyARM mini2440 platform [14].
This platform features a 400MHz Samsung ARM9 processor with on-board circuitry for sound recording and a USB interface for data storage. The platform is capable of running both Linux and Windows based operating systems. It was powered by a 12V lead acid battery. The cost of the platform is approximately $90AUD.
The developed classification software was written in C++
and provided continuous recording using dual threads and
dual alternating buffers. Two threads were initially created; these will be referred to as the recording thread and the classification thread. The recording thread continuously placed audio samples into an available buffer while the classification thread waited for a buffer to become full (1 second of recording time). Once a buffer was full, the recording thread redirected the audio samples into the second buffer while the classification thread computed the f_o of the waveform stored in the full buffer and subsequently classified the waveform if the right condition was met, i.e. f_o ∈ [f_o^min, f_o^max]. Continuous recording was maintained provided the f_o computation and classification stages required no more than 1 second of computation time. The freely available FFTW subroutine library [15] was used to compute the PSD. This library is considered to be among the most efficient freely available libraries for computing the fast Fourier transform; benchmarks performed on a variety of platforms show that FFTW's performance is typically superior to that of other publicly available FFT software, and is even competitive with vendor-tuned codes [15]. Figure 4 provides a photo of the classification system being tested on a colony of Apis mellifera honey bees.
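The double-buffering scheme can be sketched as follows. This is a conceptual Python illustration of the C++ design described above; read_samples and process are hypothetical callbacks standing in for the platform's audio capture and the f_o/classification stage.

```python
import queue
import threading

FS = 44100                             # sampling frequency (Hz); one buffer = 1 s
full_buffers = queue.Queue(maxsize=1)  # at most one full buffer waits while the
                                       # other is being filled (dual buffering)

def recording_thread(read_samples):
    while True:
        buf = read_samples(FS)         # block until 1 second of audio is captured
        full_buffers.put(buf)          # hand it over and keep recording

def classification_thread(process):
    while True:
        buf = full_buffers.get()       # wait for the next full buffer
        process(buf)                   # compute f_o; classify and log if in range

# Example wiring (capture/processing functions are platform-specific):
# threading.Thread(target=recording_thread, args=(capture_fn,), daemon=True).start()
# threading.Thread(target=classification_thread, args=(classify_fn,), daemon=True).start()
```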
Fig. 4. Photo of the classification system being tested on a colony of Apis
mellifera honey bees.
B. Classification Performance Criteria
The performance of the algorithm was determined by the proportion of successful classifications that occurred during testing. We define the function:

g_i = \begin{cases} 1 & \text{if } D(\mathbf{x}_i) = k \\ 0 & \text{if } D(\mathbf{x}_i) \neq k \end{cases}
where \mathbf{x}_i is the ith testing sample and k its true class. The overall accuracy, which represents the proportion of successful classifications, is given as:

\epsilon = \frac{1}{N_{\mathrm{test}}} \sum_{i=1}^{N_{\mathrm{test}}} g_i    (13)

where N_{\mathrm{test}} is the number of testing samples.
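Equation (13) amounts to a one-line computation; a small sketch with illustrative names:

```python
import numpy as np

def accuracy(classify, X_test, y_test):
    """Fraction of test samples assigned their true class (equation (13))."""
    g = [1 if classify(x) == k else 0 for x, k in zip(X_test, y_test)]
    return np.mean(g)
```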
VI. EXPERIMENT 1: COLONY CLASSIFICATION
The first study presented in this paper is on the efficacy of the classification system in discriminating between two colonies of European honey bees (Apis mellifera), with an arbitrary size of training data. The system was given a total of N_train training samples, with a 50% distribution of training samples for each colony. Each training sample was audibly checked to ensure it contained an acoustic signal produced by a bee and had an f_o ∈ [f_o^min, f_o^max], where f_o^min = 200Hz and f_o^max = 250Hz. The system was stopped after it had classified 100 bees. This was repeated 5 times, with the classification accuracy defined in equation 13 evaluated at each instance. It had been observed a priori that the harmonics emitted by the Apis mellifera bee have negligible amplitude past the 3rd harmonic; therefore N_x was set to 4.
The results of this experiment are given in table I, where µ denotes the mean classification accuracy over the five runs. Evidently, even with a minimal training size of 2, the system is able to obtain, on average, at least 61% and 54% accuracy using the pNN and kNN algorithms respectively. There was an observable increase in classification accuracy as the number of training samples increased: on average, 78% and 72% accuracy was obtained for the pNN and kNN algorithms respectively. The pNN algorithm is seen to be the more accurate algorithm.
TABLE I
PERCENTAGE OF SUCCESSFUL CLASSIFICATIONS DETERMINED BY EQUATION 13 FOR THE PNN AND KNN ALGORITHMS AS A FUNCTION OF TRAINING SIZE N_train, FOR WHEN N_x = 4.

N_train  Alg.          Run 1  Run 2  Run 3  Run 4  Run 5  µ
2        pNN           67%    38%    59%    73%    69%    61%
2        kNN (k = 1)   45%    24%    59%    72%    69%    54%
10       pNN           65%    72%    70%    80%    71%    72%
10       kNN (k = 5)   64%    72%    67%    73%    68%    69%
20       pNN           79%    75%    76%    86%    73%    76%
20       kNN (k = 5)   68%    77%    72%    73%    63%    71%
40       pNN           71%    78%    77%    73%    74%    75%
40       kNN (k = 5)   63%    74%    67%    69%    73%    69%
100      pNN           79%    78%    78%    76%    80%    78%
100      kNN (k = 5)   68%    77%    76%    70%    73%    73%
VII. EXPERIMENT 2: INTERSPECIFIC BEE
CLASSIFICATION
The second experiment presented in this paper is on the efficacy of the classification system in classifying four different bee species. The species include the Asian honeybee
TABLE II
TABLE OF THE WING-BEAT FREQUENCY ESTIMATIONS FOR FOUR DIFFERENT BEE SPECIES. ESTIMATIONS DONE IN THE PRESENT STUDY WERE AVERAGE VALUES FROM THE TRAINING SET.

Species             f_o     Citation
Apis cerana         265Hz   Present study
Apis cerana         306Hz   [18]
Amegilla cingulata  229Hz   Present study
Apis mellifera      225Hz   Present study
Apis mellifera      240Hz   [16]
Apis mellifera      197Hz   [17]
Bombus terrestris   175Hz   Present study
Bombus terrestris   156Hz   [16]
Bombus terrestris   130Hz   [17]
(Apis cerana), the native Australian blue banded bee Amegilla
cingulata, the European honey bee (Apis mellifera) and the
bumble bee (Bombus terrestris). In Australia, Apis cerana and Bombus terrestris are prohibited species, and audio recordings of these insects in flight are therefore very difficult to obtain. As such, the author obtained permission to use audio recordings taken by amateur entomologists in Japan for the Apis cerana bee and in South America for the Bombus terrestris bee. From these audio recordings and further recordings made locally, a training set was constructed that contained five 1-second audio samples of each bee species under consideration.
The centroids of the training set for each species are given in figure 5. There is evidently a large variation in wing-beat frequency and in the distribution of power across the harmonics. To provide some evidence for the veracity of this figure, the average recorded f_o (measured wing-beat frequency) for each species was compared to values cited in the literature, shown in table II. The values are generally consistent with those previously cited for all species except Amegilla cingulata, for which no literature value could be found.
Due to the small number of training and testing samples, the
classification system was tested offline. A testing sample was
removed from the training set and applied to the classification
algorithm. Table III provides the results for N_x = 1, i.e. when only the wing-beat frequency is used in the classification algorithms, and for N_x = 12. As seen, both algorithms performed the same and were able to provide 88% accuracy when using only the f_o as the classification feature. However, when given the complete feature vector (N_x = 12), both algorithms achieved 100% classification accuracy. It would also appear that the kNN in this instance preferred a low value of k. Given figure 5, these results are not surprising, as there is a significant difference between the feature vectors of the different bee species. It is also apparent that the wing-beat frequency alone can be a reasonable feature for interspecific bee discrimination.
Fig. 5. Centroids of the training samples for the four bee species: Apis cerana, Amegilla cingulata, Apis mellifera and Bombus terrestris (normalised harmonic amplitudes h_n for n = 1, ..., 12 per species). f_o^min = 150Hz and f_o^max = 300Hz.
TABLE III
PERCENTAGE OF SUCCESSFUL CLASSIFICATIONS DETERMINED BY EQUATION 13 FOR THE PNN AND KNN ALGORITHMS FOR WHEN N_x = 1 AND N_x = 12.

Algorithm     N_x = 1   N_x = 12
pNN           88%       100%
kNN (k = 1)   88%       100%
kNN (k = 2)   88%       100%
kNN (k = 3)   71%       82%
kNN (k = 4)   82%       88%
kNN (k = 5)   65%       76%
VIII. CONCLUSION
This paper has presented a system for the surveillance and classification of bees in real time using the acoustics emitted by the insects during flight. The intended purpose of this system is the surveillance of invasive bee species and the provision of tools for tracking bee behavior in new entomology studies. Extraction of a feature vector from the sampled acoustic waveform was described, followed by two classification algorithms implemented on a low-cost prototype platform.
The first experiment pertained to the intraspecific classification of two colonies of Apis mellifera. An average classification accuracy of 79% was obtained using a probabilistic neural network. The second experiment pertained to the interspecific classification of four distinct bee species; 100% classification accuracy was obtained using both the probabilistic neural network and k-nearest neighbor methods. This shows that intraspecific classification is possible and achieves reasonable accuracy with the proposed algorithms. The results of interspecific classification were very promising, albeit with a limited training and testing set.
Future work on the proposed system includes the inclusion of more bee species in the training set and the addition of wireless connectivity for event notification. Subsequently, the system is expected to be deployed over a wider area and operated for long periods of time.
ACKNOWLEDGMENT
The author would like to thank Yu’s apiaries for the use of
their beehives and the amateur and professional entomologists
who donated their audio recordings of various insects. The
author acknowledges the technical assistance given by Dr.
Konstanty Bialkowski of the University of Queensland.
REFERENCES
[1] K.M. Coggins and J. Principe, Detection and classification of insect sounds in a grain silo using a neural network, Neural Networks Proceedings, 1998 IEEE World Congress on Computational Intelligence, vol. 3, pp. 1760-1765, 4-9 May 1998
[2] F. Fleurat-Lessard, B. Tomasini, L. Kostine and B. Fuzeau, Acoustic
detection and automatic identification of insect stages activity in grain
bulks by noise spectra processing through classification algorithms,
Proceedings of the 9th International Working Conference on Stored
Product Protection, 15 - 18th October 2006, Campinas, Sao Paulo, Brazil.
[3] A. Moore, J.R. Miller, B.E. Tabashnik and S.H. Gage, Automated identification of flying insects by analysis of wingbeat frequencies, J. Econ. Entomol. 79: 1703-1706, 1986
[4] A. Moore, Artificial neural network trained to identify mosquitoes in flight, Journal of Insect Behavior, Vol. 4, No. 3, 1991
[5] A. Moore and R.H. Miller, Automated identification of optically sensed aphid (Homoptera: Aphidae) wing waveforms, Annals of the Entomological Society of America, 95(1):1-8, 2002
[6] I. Potamitis, T. Ganchev and N. Fakotakis, Automatic acoustic identi-
fication of insects inspired by the speaker recognition paradigm, IN-
TERSPEECH 2006 - ICSLP, 9th International Conference on Spoken
Language Processing Pittsburgh, PA, USA September 17-21, 2006
[7] E.D. Chesmore, Application of time domain signal coding and artificial neural networks to passive acoustical identification of animals, Applied Acoustics 62 (2001) 1359-1374
[8] M. J. Sarpola, R. K. Paasch, E. N. Mortensen, T. G. Dietterich, D. A.
Lytle, A. R. Moldenke and L. G. Shapiro, An aquatic insect imaging
system to automate insect classification, Transactions of the American
Society of Agricultural and Biological Engineers, 51(6): 2217-2225. 2008
[9] R. Kumar, V. Martin and S. Moisan, Robust insect classification applied
to real time greenhouse infestation monitoring, IEEE ICPR workshop on
Visual Observation and Analysis of Animal and Insect Behavior, Istanbul,
2010
[10] M. Deakin, Formulae for insect wingbeat frequency, Journal of Insect Science, 10(96):1-9, 2010
[11] R. Dudley, The Biomechanics of Insect Flight, Princeton University Press, Oxfordshire, United Kingdom
[12] T. Masters, Practical Neural Network Recipes in C++, Academic Press, 1st edition, 1993
[13] J. A. Snyman, Practical mathematical optimization: An introduction to
basic optimization theory and classical and new gradient-based algo-
rithms. Springer Publishing. 2005
[14] FriendlyARM. [Online]. Available: http://www.friendlyarm.net [Ac-
cessed: April 12th, 2011].
[15] FFTW, [Online]. Available: http://www.fftw.org/ [Accessed: April 12th,
2011].
[16] O. Sotavalta, The essential factor regulating the wing stroke frequency of insects in wing mutilation and loading experiments and in experiments at subatmospheric pressure, Ann. Zool. Soc. 'Vanamo' 15, 1-67
[17] D. N. Byrne, Relationship between wing loading, wingbeat frequency
and body mass in Homopterous insects, Journal of Experimental Biology,
135, 9-23, 1988
[18] N.P. Goyal and A.S. Atwal, Wingbeat frequency of A. indica indica F and A. mellifera L., Journal of Apicultural Research, 16:47-48, 1977
