IEEE TRANSACTIONS ON COMPUTERS, VOL. 64, NO. 11, NOVEMBER 2015

Evolutionary Computing and Particle Filtering: A Hardware-Based Motion Estimation System

Alfonso Rodríguez, Student Member, IEEE, and Félix Moreno, Member, IEEE
Abstract: Particle filters are a powerful estimation tool, especially when dealing with non-linear, non-Gaussian systems. However, traditional approaches present several limitations that significantly reduce their performance. Evolutionary algorithms, and more specifically their optimization capabilities, may be used to overcome these particle-filtering weaknesses. This paper presents a novel FPGA-based particle filter that takes advantage of evolutionary computation in order to estimate motion patterns. The evolutionary algorithm, which has been included inside the resampling stage, mitigates the well-known sample impoverishment phenomenon, which is very common in particle-filtering systems. In addition, a hybrid mutation technique that uses two different mutation operators, each with a specific purpose, is proposed in order to enhance the estimation results and make the system more robust. Moreover, implementing the proposed Evolutionary Particle Filter as a hardware accelerator has led to faster processing times than different software implementations of the same algorithm.

Index Terms: Embedded systems, evolutionary computing, FPGAs, particle filtering
The authors are with the Centro de Electrónica Industrial, ETSI Industriales, Universidad Politécnica de Madrid, Madrid 28006, Spain. E-mail: {alfonso.rodriguezm, felix.moreno}@upm.es.
Manuscript received 10 Mar. 2014; revised 10 Dec. 2014; accepted 21 Jan. 2015. Date of publication 5 Feb. 2015; date of current version 9 Oct. 2015.
Recommended for acceptance by J. D. Bruguera.
Digital Object Identifier no. 10.1109/TC.2015.2401015

1 INTRODUCTION

State estimation and prediction are major concerns in the field of real-world system analysis. A large number of studies on these two topics can be found throughout the literature, ranging from basic filtering approaches to more developed alternatives that make use of complex mathematical concepts.

In 1960, a new filtering algorithm was proposed: the Kalman Filter [1]. This algorithm estimates unknown states from noisy measurements in two stages: prediction and correction. In the former, the system predicts new states, which are then corrected in the latter using the measurements. The results obtained with this two-stage approach are more precise than those obtained from pure observation. The algorithm has been thoroughly analyzed since it was proposed, and additional resources can be found in the literature in this regard [2].

The limitations of the Kalman Filter when dealing with systems that are not linear and Gaussian revealed that a different alternative had to be developed in order to cope with more complex estimations. Monte-Carlo simulations, as well as new mathematical models such as Hidden Markov Models (also used in Kalman filtering), provided a new framework in which new algorithms, the so-called particle filters, were developed. An extensive report on these topics can be found in [3]. The most representative implementation of a particle filter appears in [4], where a new algorithm, the Bootstrap Filter, is presented. This filter is usually thought of as the reference particle filter.

This work addresses two different enhancements over the basic particle filter algorithm: on the one hand, a fast hardware-based implementation intended for embedded computations; on the other hand, error mitigation through the addition of evolutionary capabilities within the particle-filtering algorithm. Hardware-based architectures can boost particle filter performance, since parallel processing capabilities are able to speed up many repetitive tasks. Some approaches have been developed in the literature. For example, in [5] a hardware implementation of the Bootstrap Filter is presented. The authors use VHDL (Very high speed integrated circuit Hardware Description Language) and a Xilinx Virtex-2 FPGA (Field Programmable Gate Array) in order to obtain real-time properties in an object tracking application. In [6], the same particle-filtering algorithm is implemented in a Xilinx Virtex-5 FPGA. Parallel processing capabilities are used in order to process data concurrently, since every stage performs its computation while the others are working with other valid data, i.e. in pipeline fashion. This leads to a significant decrease in terms of resource consumption. However, there are not only hardware approaches, but also mixed software/hardware implementations. For instance, in [7] a SoPC (System on Programmable Chip) model with hardware accelerators is proposed, targeting an Altera Cyclone II FPGA. An embedded processor is in charge of computing weight values, whereas the particle update is carried out in hardware, since it is a highly repetitive process. The authors also introduce some characteristic elements from evolutionary computation, such as tournament selection algorithms, in the resampling stage. Another FPGA-based implementation of a particle filter can be found in [8]. In this particular example, the algorithm is enhanced using color histogram optimizations. A Xilinx Virtex-5 FPGA performs all weight and histogram calculations, taking advantage of parallel processing capabilities.
However, in the last few years there has been an increase in the number of works related to high-performance particle filters, which use huge numbers of particles in order to solve complex high-dimensional problems. These specific problems require massive amounts of computational resources. Hence, some approaches have started to use many-core architectures in order to cope with that restriction. Parallel computing has proved itself a feasible alternative to traditional approaches, as shown in [9], where the authors establish a comparison between a classic CPU (Central Processing Unit) and a GPU (Graphics Processing Unit). Their results show that the more parallel the approach is, the faster the processing can be done. However, there are some limitations, since each parallel block resamples from a subpopulation and not from the whole population. Different parallel implementations using GPGPUs (General Purpose GPUs) and multi-core CPUs also appear in [10]. The authors provide a detailed report on sensitivity analysis when scaling system parameters such as the number of particles per filter, the number of sub-filters and even the state dimensions. The authors have also extended their research to real-time control applications of distributed computing approaches to particle filtering, as in [11].

Particle filtering and evolutionary algorithms, especially genetic algorithms, share many conceptual similarities. These similarities have been studied and documented for a long time. For instance, in [12] a modified particle filter is presented. The author uses genetic operators such as crossover and mutation to implement the prediction stage of the filter. The connection between particle filtering (Monte-Carlo simulations) and Bayesian inference and their application to evolutionary environments has also been studied. In [13], the foundations for Bayesian evolutionary computation are presented. The idea is to let Bayes rule guide the evolutionary process, since it is reasonable to think that the most probable solution is the best one. However, this is not state-of-the-art research.

In the last few years, the main interest of evolutionary computation applied to particle filtering has been focused on mitigating sample impoverishment phenomena, which are due to suboptimal resampling strategies. Evolutionary algorithms, as stated in previous sections, use genetic operators in order to transfer good genes and to introduce genetic diversity. This last feature is the key to understanding why evolutionary computing is suitable for solving sample impoverishment (i.e. diversity loss) problems. Hence, in [14] an evolutionary approach is introduced in order to mitigate those effects. The authors propose introducing genetic operators right after performing the importance sampling stage and immediately before the resampling stage. The resampling stage, as opposed to the basic Bootstrap Filter [4], performs an adaptive strategy, in which a parameter (the effective number of particles) is considered in order to decide whether to perform resampling or not. In that paper, it is also shown that the mutation operator provides better dynamic response when the state jumps abruptly. Other research lines use parallel distributed filters, i.e. with several particle subpopulations evolving at the same time. In [15], the authors use these subpopulations to perform genetic operations and, from time to time, migrate the best individuals from one subpopulation to the others so that the best genetic traits can be shared. This also leads to an improvement in global optimum search. Moreover, the concept of a genetic resampling stage is introduced in that paper for the first time. Nevertheless, the paper provides only simulation results.

More complex examples can also be found in the literature: a hybrid evolutionary particle filter is presented in [16]. This hybrid approach tries to take advantage of both genetic algorithms (to maintain particle diversity) and particle swarm optimization (to optimize the final particle distribution). Furthermore, the algorithm presented has parallel features, thus reducing computation times. The strategy presented divides the population into two groups, and then performs the specific operations that are required: in one group, a genetic algorithm; in the other, particle swarm optimization. Before the next time step, a migration operation is performed (to share genetic information, as in previous examples).

Real-world applications include object tracking, as in [17], where another evolutionary approach is presented in order to deal with sample impoverishment. In this case, the evolutionary resampling stage may or may not take part in the estimation loop. The decision is made based on the aforementioned parameter, i.e. the effective number of particles. The algorithm presented provides good results, but it is not accurate in tracking sequences with occlusion. Recent studies with differential evolution have been conducted in order to significantly reduce sample sizes [18]. The differential evolution algorithm divides the particles into three different groups: the first group undergoes crossover; the second group undergoes mutation; the last group is not modified. One important remark about this work is that the evolutionary stage takes place at the very beginning of the process, and it is immediately followed by the importance sampling stage.

In conclusion, evolutionary computation has found a vast field of application in which optimization is not the main concern. The characteristic properties of the genetic operators make evolutionary algorithms a potential tool in particle filtering, since they are able to reduce both problems to almost negligible values: particle degeneracy (avoided by performing resampling) and sample impoverishment (avoided by introducing genetic diversity in the particle population).

The rest of this paper is organized as follows: first, the mathematical concepts regarding particle filtering are thoroughly covered in Section 2. In Section 3, the evolutionary algorithm that has been implemented in the resampling stage is discussed. The proposed system architecture is presented in Section 4. Section 5 includes experimental results, followed by the conclusions, presented in Section 6.

2 PARTICLE FILTERING

Particle filters are based upon Monte-Carlo methods, which are a set of computational algorithms that obtain numerical results by using repetitive random sampling, and Bayesian inference, which uses the well-known Bayes rule to compute the posterior distribution from the prior distribution and the likelihood distribution. Bayes rule, which is also referred to as Bayes theorem, is often expressed as (1).
$$P(a \mid b) = \frac{P(b \mid a)\,P(a)}{P(b)}, \tag{1}$$

where $P(a \mid b)$ represents the posterior distribution, i.e. the probability of $a$ when $b$ is known to be a particular value, $P(a)$ is the prior distribution, $P(b \mid a)$ is the likelihood distribution, i.e. the probability of obtaining $b$ given $a$, and $P(b)$ is the marginal likelihood, which is independent of the hypothesis that is being tested (i.e. changes in the hypothesis do not affect the marginal likelihood). Bayesian inference is nothing but an updating process in which not only previous knowledge (prior distribution) but also evidence (likelihood distribution) are taken into account in order to draw a conclusion.

All equations, expressions and conclusions in this section are based upon the contents of [19].

2.1 Perfect Monte-Carlo Sampling

Hereafter, let us assume that real-world systems can be expressed as Markovian, non-linear, non-Gaussian state-space models. A Markov model is a stochastic model (i.e. a system which shows random behavior) in which the Markov property is satisfied, i.e. future states depend only upon the present state and not upon the previous history of the system. The Markov property is also referred to as the memoryless property.

$$p(x_t \mid x_{t-1}), \tag{2}$$

$$p(y_t \mid x_t), \tag{3}$$

Eq. (2) represents the hidden states transition model, or process model, whereas (3) represents the observation or measurement model. These models are called Hidden Markov Models (HMMs) because the state sequence is unknown, i.e. not all states are observable. The hidden states at time $t$ are expressed as $x_t$, whereas the observations at time $t$ are expressed as $y_t$. With these definitions, the posterior distribution is calculated using Bayes theorem as in (4).

$$p(x_{0:t} \mid y_{1:t}) = \frac{p(y_{1:t} \mid x_{0:t})\,p(x_{0:t})}{\int p(y_{1:t} \mid x_{0:t})\,p(x_{0:t})\,dx_{0:t}}, \tag{4}$$

where $p(x_{0:t} \mid y_{1:t})$ is the posterior distribution of the states $x_{0:t}$ having observed the measurements $y_{1:t}$. The subscripts represent the time steps in which the distribution is defined, e.g. $p(x_{0:t} \mid y_{1:t})$ is the posterior distribution of the hidden states from time $0$ to time $t$ when the observations are known from time $1$ to time $t$.

2.2 Importance Sampling

Perfect Monte-Carlo sampling is unfeasible in most real-world situations, since it is impossible to sample from the posterior distribution. In those cases, another sampling strategy called Importance Sampling (IS) is used. With this approach, the distribution from which the samples are drawn is not the actual posterior distribution, but an arbitrary user-defined distribution called proposal distribution, importance function, or importance sampling distribution. Therefore, each drawn sample has to be weighted in order to make both posterior and importance sampling distributions match. Hence, IS can be expressed in terms of Equations (5) and (6).

$$\pi(x_{0:t} \mid y_{1:t}), \tag{5}$$

$$w(x_{0:t}) = \frac{p(x_{0:t} \mid y_{1:t})}{\pi(x_{0:t} \mid y_{1:t})}, \tag{6}$$

Eq. (5) is the importance sampling distribution and (6) gives the importance weights. The posterior distribution is then approximated by a finite set of particles using the expression in (7).

$$p(x_{0:t} \mid y_{1:t}) \approx \hat{P}_N(dx_{0:t} \mid y_{1:t}) = \sum_{i=1}^{N} \tilde{w}_t^i\,\delta_{x_{0:t}^i}(dx_{0:t}), \tag{7}$$

where $\hat{P}_N(dx_{0:t} \mid y_{1:t})$ is the approximated posterior distribution, $\delta_{x_{0:t}^i}(dx_{0:t})$ is the Dirac delta function located at $x_{0:t}^i$, $N$ is the number of particles and $\tilde{w}_t^i$ is the normalized weight for particle $i$ up to time $t$, which is obtained from the particle weight $w(x_{0:t}^i)$ using (8).

$$\tilde{w}_t^i = \frac{w(x_{0:t}^i)}{\sum_{j=1}^{N} w(x_{0:t}^j)}. \tag{8}$$

2.3 Sequential Importance Sampling

One of the most important disadvantages of the aforementioned IS strategy is that it is computationally intensive, since the posterior distribution is calculated from scratch at each time step. Sequential Importance Sampling (SIS) performs the same computations but in a recursive manner. The importance sampling distribution and the weight update equation are now expressed as (9) and (10).

$$\pi(x_{0:t} \mid y_{1:t}) = \pi(x_{0:t-1} \mid y_{1:t-1})\,\pi(x_t \mid x_{0:t-1}, y_{1:t}) \tag{9}$$

$$w(x_t) \propto w(x_{t-1})\,\frac{p(y_t \mid x_t)\,p(x_t \mid x_{t-1})}{\pi(x_t \mid x_{0:t-1}, y_{1:t})}. \tag{10}$$

Notice that these expressions provide a simple methodology for online filtering, since particle weights at any current time step are obtained from the value at the previous time step. However, these equations are still very complex, thus leading to performance problems if real-time requirements are present as design constraints. A simple approach can be adopted in order to overcome these limitations. By selecting the prior distribution as the importance function, the equations are reduced significantly to (11) and (12).

$$\pi(x_{0:t} \mid y_{1:t}) = p(x_{0:t}) = p(x_{0:t-1})\,p(x_t \mid x_{t-1}), \tag{11}$$

$$w(x_t) \propto w(x_{t-1})\,p(y_t \mid x_t). \tag{12}$$

Therefore, the transition model $p(x_t \mid x_{t-1})$, i.e. the process model, is used in the sampling stage, and the observation model $p(y_t \mid x_t)$, i.e. the measurement model, in the weight update stage.

Basic particle-filtering implementations using SIS show biased populations, where only one particle is significant (i.e. its weight is one, whereas the rest of the population is negligible) after a certain finite execution time, as can be seen in Fig. 1. This is the so-called particle degeneracy phenomenon, and it might lead to large errors in the estimations, since the posterior distribution is represented by only
one effective particle. The main reason for this phenomenon to appear is that the variance of the importance weights can only increase over time, as has been shown in [20].

Fig. 1. Particle degeneracy phenomenon. Notice that after a finite estimation time, the particle population is biased, since only one individual has a non-zero weight. There is no diversity in the population, and thus posterior distributions are not accurate.

2.4 Resampling

The particle degeneracy phenomenon can be mitigated provided that an additional resampling stage is included in the basic particle filter. The resampling stage is located after the sampling and weight calculation processes, and generates a new particle population according to the current particle weights. This new population can then be homogenized by making all particle weights equal. This is the normal procedure in the Bootstrap Filter [4], whose algorithm is shown in Table 1.

TABLE 1
Bootstrap Filter Algorithm

 0: Initialize particle population $x_0^i$ and $k = 1$
 1: while (1)
 2:   for (i = 0; i < PARTICLES; i++)
 3:     Importance sampling $x_k^i \sim p(x_k^i \mid x_{k-1}^i)$
 4:     Compute weights $w(x_k^i) = p(y_k \mid x_k^i)$
 5:     Normalize weights $\tilde{w}_k^i = w(x_k^i) / \sum_{j=1}^{PARTICLES} w(x_k^j)$
 6:   end for
 7:   for (i = 0; i < PARTICLES; i++)
 8:     Resample according to weight $x_k^i \propto \tilde{w}_k^i$
 9:   end for
10:   k++
11: end while

The Bootstrap Filter algorithm presents some similarities with the Kalman Filter. There is a prediction stage, in which each particle is updated using the prior distribution (importance sampling). The update is carried out afterwards, first computing the normalized weights and then performing resampling in order to reject those individuals whose weight is smaller. Notice that there is a resampling operation at each time step $k$.

However, suboptimal resampling algorithms tend to generate populations where no diversity exists, for particles with higher weights are statistically selected many times [21], which means that most of the particles are identical (see Fig. 2). Therefore, the posterior distribution is inaccurate, and estimation performance is significantly reduced. This phenomenon is known as sample impoverishment, since the population is reduced to a set of particles (usually a small number) with the same state, not providing additional information.

Fig. 2. Sample impoverishment phenomenon. Suboptimal resampling strategies generate inaccurate posterior distributions. Results show large variations in weight and particle distributions before (left) and after (right) resampling operations. In the images, actual measurements appear in black, estimated states in green and particle distributions in red.

Particle degeneracy and sample impoverishment are the most important problems in particle-filtering systems, and have to be mitigated, if not eliminated.

2.5 Further Discussion

Even though HMMs are very widely used throughout the literature, there are other, more general mathematical models in which particle filtering is still feasible. That is the case of Pairwise and Triplet Markov Models.

Pairwise Markov Models (PMMs) can be built upon the assumption that the pair $(x_t, y_t)$ is Markovian, whereas Triplet Markov Models (TMMs) consider the triplet $(x_t, r_t, y_t)$, where $r_t$ is an additional auxiliary process, to be Markovian. In [22], the authors prove that TMMs are more general than PMMs and PMMs more general than HMMs. Furthermore, the authors show that classical particle filtering approaches can be extended to both PMMs and TMMs.

Examples of particle filters implemented using these more general models as mathematical foundation can be found in the literature [23]. However, since the objective of this work is to show the benefits of both evolutionary computing and hardware acceleration in particle filtering, the classical HMM approach is sufficient.

3 EVOLUTIONARY RESAMPLING

The sample impoverishment phenomenon in particle filtering appears, as mentioned in the previous section, when the posterior distribution is represented by a particle set in which there is no diversity. Since one of the key features of genetic operators is introducing genetic diversity, it seems reasonable to combine evolutionary
computation and particle filtering in order to improve the posterior distribution precision, and thus, the overall filtering performance.

It is for this reason that the resampling stage of the proposed Evolutionary Particle Filter features a genetic algorithm. Each chromosome encodes the particle state, as shown in (13).

$$\mathbf{x}_k^i = \begin{bmatrix} x_k^i \\ y_k^i \\ v_{x_k}^i \\ v_{y_k}^i \end{bmatrix}. \tag{13}$$

$x_k^i$, $y_k^i$ are the position values in both x-axis and y-axis, and $v_{x_k}^i$, $v_{y_k}^i$ are the velocity values referred to the same axes. The subscripts $k$ represent the current time step, whereas the superscripts $i$ represent the particle identifier.

3.1 Genetic Operators

Let us now focus on the genetic operators: crossover and mutation. Crossover not only increases population diversity, but also helps to improve population fitness, for children that inherit the best characteristics, i.e. genes, from the parents are more likely to pass from one generation to the next one. Since all state variables are represented using real numbers, arithmetic crossover can be applied in order to combine genetic traits from both parents. Crossover operations are therefore expressed in terms of the equations in (14).

$$\begin{cases} \mathbf{x}_k^a = \alpha\,\mathbf{x}_k^i + (1 - \alpha)\,\mathbf{x}_k^j, \\ \mathbf{x}_k^b = \alpha\,\mathbf{x}_k^j + (1 - \alpha)\,\mathbf{x}_k^i, \end{cases} \tag{14}$$

$\mathbf{x}_k^a$, $\mathbf{x}_k^b$ are the offspring particles obtained from the parents $\mathbf{x}_k^i$ and $\mathbf{x}_k^j$, and $\alpha \sim U(0, 1)$. The superscripts are the individual identifiers ($a$, $b$ for the children; $i$, $j$ for the parents).

Mutation, on the other hand, is used to introduce random variations within the particle population, which also increases population diversity. In this work, two different mutation proposals have been defined: local search mutation and random placement mutation. The former mutates the parent and generates a child in a close region around the state of the parent, whereas the latter generates a valid random particle in the state space, i.e. within its limits. These two possible mutation operations provide more flexibility. Local search mutation is defined as in (15).

$$\mathbf{x}_k^c = \mathbf{x}_k^i + \boldsymbol{\delta}_k, \tag{15}$$

$\mathbf{x}_k^c$ is the particle generated after mutation, $\mathbf{x}_k^i$ are the state variables of the parent $i$, and $\boldsymbol{\delta}_k$ is an array where each component is defined by $\delta_k^i \sim N(\mu_i, \sigma_i)$. Random placement mutation is based upon (16).

$$\mathbf{x}_k^c = \mathbf{x}_{min} + \beta\,(\mathbf{x}_{max} - \mathbf{x}_{min}), \tag{16}$$

$\mathbf{x}_k^c$ is, again, the particle generated after mutation, $\mathbf{x}_{min}$ and $\mathbf{x}_{max}$ are the minimum and maximum allowable values for the state variables, i.e. the state space boundaries, and $\beta \sim U(0, 1)$.

Why have these genetic operators, and not others, been selected? Arithmetic crossover generates children by averaging, with some weighting process in between, the parents' states. Therefore, if the parents are in a close region around the maximum fitness point, the children will likely have better fitness values. Otherwise, they will be discarded in future generations. Hence, the posterior distribution will be more precise. The proposed mutation strategies cover two different and opposed situations. On the one hand, local search mutation is useful when the particle population is close to the optimal value, since mutation generates another particle in the close neighborhood of the parent. This leads to an increase in the high-fitness population. On the other hand, random placement mutation provides robustness in those time steps in which the tracking is lost, i.e. the posterior distribution is inaccurate or cannot be properly estimated.

3.2 Fitness Function

Given the fact that particle filtering and evolutionary computing have common features, such as being population-based algorithms, some parallels can be established (at least to some extent). For instance, weights can also be thought of as fitness values. By making this simple assumption, the design is simplified significantly, since only one computation (weight) has to be performed instead of two (fitness value and weight). In this particular case, particle weights (i.e. fitness values) are calculated using a multivariate normal probability density function, using two dimensions and assuming that the two random variables are uncorrelated. Therefore, these values can be obtained using (17).

$$f(x, y) = \frac{1}{2 \pi \sigma_x \sigma_y} \exp\left(-\frac{1}{2}\left[\frac{(x - \mu_x)^2}{\sigma_x^2} + \frac{(y - \mu_y)^2}{\sigma_y^2}\right]\right), \tag{17}$$

$\sigma_x$ and $\sigma_y$ are the standard deviation values of the random variables $x$ and $y$, which are the actual measurements in both x-axis and y-axis, whereas $\mu_x$ and $\mu_y$ represent the mean values, which are the current particle x-axis and y-axis estimated measurements (calculated from the particle state variables). The fitness function is then expressed as (18).

$$f(\mathbf{z}_k, \hat{\mathbf{z}}_k^i) = \frac{1}{2 \pi \sigma_x \sigma_y} \exp\left(-\frac{1}{2}\left[\frac{(z_{x_k} - \hat{z}_{x_k}^i)^2}{\sigma_x^2} + \frac{(z_{y_k} - \hat{z}_{y_k}^i)^2}{\sigma_y^2}\right]\right). \tag{18}$$

Therefore, each particle weight/fitness is the value, at the measured positions, of a two-dimensional normal probability density function whose mean values are that particle's estimated positions.

3.3 Selection

In traditional implementations of evolutionary algorithms, parent selection usually follows stochastic processes, whereas survivor selection is based upon deterministic processes. In this work, however, both parents and survivors are selected using Stochastic Universal Sampling (SUS), since having only one selection algorithm is more suitable for hardware implementations with limited resources. This
algorithm performs fitness proportionate selection in a roulette-wheel fashion. However, as opposed to the basic roulette-wheel implementation, it is an unbiased process in which any individual may be selected, even those with the lowest fitness values in the population. The SUS algorithm is shown in Fig. 3.

Fig. 3. Stochastic Universal Sampling. After sorting the particles according to their fitness values ($f_i$) and computing the cumulative fitness function, individuals are selected by drawing a uniform random number $r$ and dividing the whole cumulative fitness range into as many equal divisions as individuals to be selected ($n_{sel}$).

4 SYSTEM ARCHITECTURE

A novel hardware-based Evolutionary Particle Filter, the so-called HW-EPF, is presented in this work. The algorithm (see Table 2) has been implemented in an FPGA using VHDL. The HW-EPF is used as a hardware accelerator by the main processor of a SoPC, since it provides better performance than software in fast and repetitive sequential processes, e.g. particle update or particle sorting. The block diagram of the SoPC can be seen in Fig. 4.

Fig. 4. SoPC internal architecture. The main processor in the system (in this case a Xilinx MicroBlaze soft-core) uses the HW-EPF as a hardware accelerator. The HW-EPF provides three different interfaces: a register-based control port via PLB (Processor Local Bus), a memory controller port and some interrupt signals.

The proposed implementation takes advantage of the distributed resources inside the FPGA. For instance, particle states are stored in the internal RAM (Random Access Memory), and the fitness function is stored in internal LUTs (Look-Up Tables). In addition, some operations have been multiplexed in order to maximize resource sharing, i.e. to minimize resource consumption, which also provides significant benefits (e.g. less area overhead). Hence, a tradeoff between resource utilization and hardware acceleration is made, which is a common practice in embedded system design.

Limited-precision fixed-point arithmetic is inherent to most hardware designs, since computations require fewer resources and are carried out faster. Some problems may arise when using limited-precision data types, e.g. overflow and underflow, and the results might not be accurate enough. However, these factors have no relevant influence on the proposed architecture, as will be shown in forthcoming sections.

4.1 Motion Model Equations

Assuming uniform motion patterns in both axes, the importance sampling process, i.e. the particle updating process, is performed using the system of linear equations shown in (19) and (20).

$$\mathbf{x}_k^i = A\,\mathbf{x}_{k-1}^i + \mathbf{w}_{k-1} \tag{19}$$

$$A = \begin{bmatrix} 1 & 0 & T & 0 \\ 0 & 1 & 0 & T \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \tag{20}$$

TABLE 2
HW-EPF Algorithm

 0: Initialize particle population $\mathbf{x}_0^i$ and $k = 1$
 1: while (1)
 2:   for (i = 0; i < PARTICLES; i++)
 3:     Importance sampling $\mathbf{x}_k^i \sim p(\mathbf{x}_k^i \mid \mathbf{x}_{k-1}^i)$
 4:     Compute weights $w(\mathbf{x}_k^i) = p(y_k \mid \mathbf{x}_k^i)$
 5:     Normalize weights $\tilde{w}_k^i = w(\mathbf{x}_k^i) / \sum_{j=1}^{PARTICLES} w(\mathbf{x}_k^j)$
 6:     for (gen = 0; gen < GENERATIONS; gen++)
 7:       Parent selection (SUS)
 8:       Draw $r \sim U(0, 1)$. Arithmetic crossover if $r < p_{cross}$
 9:       Draw $r \sim U(0, 1)$. Mutation if $r < p_{mut}$
            Local search mutation if $r \geq p_{mut}\,r_{mut}$
            Random placement mutation if $r < p_{mut}\,r_{mut}$
10:       Survivor selection (SUS)
11:     end for
12:   end for
13:   k++
14: end while

The HW-EPF algorithm modifies the basic resampling stage and includes a genetic algorithm (steps 6 to 11) in order to take advantage of the optimization capabilities of evolutionary computation. In addition, more than one mutation operator has been included, in order to further improve estimation performance.
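As a rough software illustration of steps 8 and 9 of Table 2 (not the VHDL design described in this paper), the following Python sketch applies the arithmetic crossover of (14) and the two mutation operators of (15) and (16) to four-component particles. The probabilities `P_CROSS`, `P_MUT` and `R_MUT`, the state-space limits, the spread `SIGMA`, the zero mean of the local perturbation, and the choice of mutating each individual independently are assumptions made only for this example. The sketch also assumes that local search mutation is chosen whenever the draw is below `P_MUT` but not below `P_MUT * R_MUT`.

```python
import random

# Illustrative values only; none of them are taken from the paper.
P_CROSS, P_MUT, R_MUT = 0.8, 0.2, 0.5
X_MIN = [0.0, 0.0, -5.0, -5.0]       # assumed lower state-space limits
X_MAX = [640.0, 480.0, 5.0, 5.0]     # assumed upper state-space limits
SIGMA = [2.0, 2.0, 0.5, 0.5]         # assumed local-search spread per component


def arithmetic_crossover(p_i, p_j, alpha):
    """Eq. (14): two children as complementary weighted averages of the parents."""
    child_a = [alpha * a + (1.0 - alpha) * b for a, b in zip(p_i, p_j)]
    child_b = [alpha * b + (1.0 - alpha) * a for a, b in zip(p_i, p_j)]
    return child_a, child_b


def local_search_mutation(parent, rng):
    """Eq. (15): perturb each component with (assumed zero-mean) Gaussian noise."""
    return [x + rng.gauss(0.0, s) for x, s in zip(parent, SIGMA)]


def random_placement_mutation(rng):
    """Eq. (16): place the particle inside the state-space limits.

    A single beta is drawn, as (16) is written; drawing one beta per
    component would be another plausible reading.
    """
    beta = rng.random()
    return [lo + beta * (hi - lo) for lo, hi in zip(X_MIN, X_MAX)]


def genetic_step(p_i, p_j, rng):
    """Steps 8-9 of Table 2 for one pair of parents; returns two individuals."""
    a, b = list(p_i), list(p_j)
    if rng.random() < P_CROSS:                    # step 8: crossover gate
        a, b = arithmetic_crossover(a, b, rng.random())
    out = []
    for ind in (a, b):
        r = rng.random()                          # step 9: mutation gate
        if r < P_MUT * R_MUT:
            ind = random_placement_mutation(rng)
        elif r < P_MUT:
            ind = local_search_mutation(ind, rng)
        out.append(ind)
    return out


rng = random.Random(1)
parents = ([100.0, 50.0, 1.0, 0.0], [110.0, 70.0, 0.0, 1.0])
print(genetic_step(*parents, rng))
```

Because the two children of (14) use complementary weights, their component-wise sum equals that of the parents, which gives a quick sanity check for any implementation of the crossover unit.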
In (19), $\mathbf{w}_{k-1}$ is an array where each component is defined by $w_{k-1}^i \sim N(\mu_i, \sigma_i)$, i.e. an array of normally distributed random numbers or white noise. The coefficient $T$ in matrix $A$ of (20) represents the sampling time, i.e. the theoretical interval between two consecutive time steps, and has been included in order to weight the influence of the particle velocities in the estimation results. Notice that this parameter is not related to the system clock frequency at all.
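The update in (19) and (20) can be illustrated in software as follows. This is a floating-point Python sketch, not the fixed-point VHDL datapath of the HW-EPF; the function name `propagate`, the value of the sampling time `T`, and the zero-mean noise with the standard deviations in `SIGMA` are assumptions chosen only for the example.

```python
import random

T = 1.0                               # assumed sampling time between time steps
SIGMA = (1.0, 1.0, 0.1, 0.1)          # assumed noise standard deviations

# Transition matrix A of Eq. (20) for the state [x, y, vx, vy].
A = (
    (1.0, 0.0, T, 0.0),
    (0.0, 1.0, 0.0, T),
    (0.0, 0.0, 1.0, 0.0),
    (0.0, 0.0, 0.0, 1.0),
)


def propagate(state, rng, sigma=SIGMA):
    """Eq. (19): x_k = A x_{k-1} + w_{k-1}, computed one state variable at a time."""
    new_state = []
    for row, s in zip(A, sigma):
        deterministic = sum(a * x for a, x in zip(row, state))
        new_state.append(deterministic + rng.gauss(0.0, s))
    return new_state


rng = random.Random(42)
particles = [[10.0, 20.0, 1.0, -0.5] for _ in range(5)]
particles = [propagate(p, rng) for p in particles]
for p in particles:
    print([round(v, 3) for v in p])
```

With the noise switched off (all standard deviations set to zero), each position simply advances by its velocity multiplied by `T`, while the velocities stay unchanged, which is the uniform motion assumed by the model.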
In order to reduce the overall number of multipliers, each particle uses four clock cycles (one clock cycle per state variable) to be fully updated. By multiplexing the input signals of the arithmetic operators (adders and multipliers), and thus providing resource sharing mechanisms, resource consumption remains balanced.
With the updated states, the system provides an estimated measurement. This value is computed using expressions (21) and (22).

$$\hat{z}_k^i = \begin{bmatrix} \hat{z}_{x,k}^i \\ \hat{z}_{y,k}^i \end{bmatrix} = H x_k^i \qquad (21)$$

$$H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \qquad (22)$$

$\hat{z}_{x,k}^i$ and $\hat{z}_{y,k}^i$ are the estimated measured values of the state variables of particle $i$ at time $k$ along the x-axis and y-axis, respectively.

Since the estimated measurements are the position state variables, the architecture has been simplified, avoiding the use of additional resources.

4.2 Genetic Operators
Genetic operations are performed on each parent if a uniform random number is below the established thresholds, i.e. the crossover and mutation probabilities. The crossover unit follows Equation (14), whereas the mutation unit follows Equations (15) and (16). As far as the hardware implementation of these modules is concerned, they share the same architecture as the one described in the previous subsection. The arithmetic units have their inputs multiplexed in order to favor resource sharing, at the expense of adding a fixed latency, in terms of clock cycles, to the system.

4.3 Fitness Evaluation
Implementing Equation (18) is not feasible in hardware, due to the limitations of fixed-point arithmetic. In addition, complex mathematical functions, such as exponentials, are highly time-consuming. Therefore, they should not be used when optimizing system performance, even if a floating-point unit is available in the design. In order to overcome this disadvantage, the proposed implementation replaces complex computations with indexing operations in a LUT. The values of Equation (18) are evaluated at a small set of reference points at synthesis time. The result of this discretization process is shown in Fig. 5. Then, whenever the system needs to compute a value, the LUT is accessed at run-time and the fitness is obtained by zero-order interpolation. With this alternative, the system shows improved estimation times without affecting estimation precision.

Fig. 5. Fitness evaluation LUT. The normal probability density function is computed at synthesis time. Taking advantage of symmetry properties, only one quadrant has been mapped into the LUT, thus achieving a reduction in the number of internal resources of the FPGA.

4.4 Estimation
At the end of each estimation step, the current state is obtained from the whole particle population using expression (23).

$$\hat{\mathbf{x}}_k = \begin{bmatrix} \hat{x}_k \\ \hat{y}_k \\ \hat{v}_{x,k} \\ \hat{v}_{y,k} \end{bmatrix} = \sum_{i=1}^{N} \tilde{w}_k^i \, x_k^i \qquad (23)$$

$\hat{\mathbf{x}}_k$ is the estimated state at time $k$, $x_k^i$ are the state variables of particle $i$ at time $k$, and $\tilde{w}_k^i$ are the normalized weights for each particle, computed as in (24).

$$\tilde{w}_k^i = \frac{w(x_k^i)}{\sum_{j=1}^{N} w(x_k^j)} = \frac{f(z_k, \hat{z}_k^i)}{\sum_{j=1}^{N} f(z_k, \hat{z}_k^j)} \qquad (24)$$

4.5 Random Number Generator
Particle filters are always built upon stochastic processes. Furthermore, evolutionary algorithms also require random numbers in order to decide whether to apply a specific genetic operator, or which individuals are chosen in the selection stages. Hence, random number generation is considered a main task within the HW-EPF.

Taking a closer look at the proposed algorithms, it is clear that two different types of random numbers have to be generated: on the one hand, uniformly distributed random numbers, e.g. the crossover and mutation probability thresholds, $\alpha$ in (14) and $\beta$ in (16); on the other hand, normally distributed random numbers, e.g. $w_{k-1}$ in the process model Equation (19), or $\delta_k$ in the local search mutation Equation (15).

Uniform distributions are relatively easy to obtain in hardware systems using an LFSR (Linear Feedback Shift Register). These devices generate a stream of uniformly distributed pseudorandom numbers. Since motion estimation is not a highly demanding application, pseudorandom features are enough.

Normal distributions, on the other hand, are not that simple to obtain in hardware systems. Some implementation
alternatives can be found in the literature [24], but in this work, a new strategy has been used. Using a large LFSR, twelve uniform random numbers are drawn at the same time. Then, these numbers are added, and the resulting number, as a consequence of the central limit theorem, is assumed to follow a normal distribution. This assumption is valid, as can be seen in the distribution histograms in Fig. 6. This approach provides an accurate implementation without excessively increasing resource consumption.

Fig. 6. Random Number Generator histograms. Uniformly distributed random numbers (left) and normally distributed random numbers (right) are used in different stages in the HW-EPF.

5 EXPERIMENTAL RESULTS
In this section, different aspects of the implemented design are put to the test. This work addresses algorithm validation, overall system functionality validation through Hardware In the Loop (HIL) co-simulation, and a sensitivity analysis of the most important parameters in the system. It also covers performance comparisons between different implementations of the same algorithm, and a summary report on resource consumption and timing rates. All the tests in which an actual physical device is needed have been carried out using the XUP-V5 development board, which features a Xilinx Virtex-5 FPGA.

5.1 Evolutionary Resampling Algorithm Validation
A first set of tests has been designed in order to validate the evolutionary algorithm in the resampling stage of the particle filter. Since these tests deal with the algorithm itself and not with the implementation, they have been carried out in MATLAB, not taking into account actual implementation considerations, such as fixed-point data precision or data overflow effects within the mathematical operations. The most used equations in particle-filtering performance validation tests are those from the univariate non-stationary growth model, with the addition of a quadratic measurement model. The combination of both equations provides a highly non-linear system, thus making it suitable for testing non-linear estimation capabilities. Hence, the validation equations can be expressed as (25) and (26).

$$x_k = x_{k-1} + \frac{12 \, x_{k-1}}{1 + x_{k-1}^2} + 7 \cos\big(1.2\,(k-1)\big) + w_{k-1} \qquad (25)$$

$$z_k = \frac{x_k^2}{20} + v_k \qquad (26)$$

$w_{k-1} \sim N(0, \sigma_x)$ is the so-called process noise, whose standard deviation is represented by $\sigma_x$, and $v_k \sim N(0, \sigma_z)$ is the so-called measurement noise, whose standard deviation is represented by $\sigma_z$. Both noise distributions have zero mean. Notice that the equations represent a one-dimensional system, since the aim of these tests is to validate the functionality of the evolutionary resampling stage, and not the target application itself.

Fig. 7. HW-EPF vs. Bootstrap filter. In normal operation mode, both filters provide accurate estimations and are able to track the object. The image shows the real trajectory, as well as both filter estimations.

Fig. 8. HW-EPF vs. Bootstrap filter: overall estimation error. Although the tracking is lost at certain points in both filters, the HW-EPF provides estimations with less mean error. Moreover, lost-tracking situations appear more often in the Bootstrap Filter.

Fig. 9. HW-EPF vs. Bootstrap filter: mutation benefits. Mutation generates individuals in high-fitness areas of the state space, thus recovering from lost-tracking situations, as opposed to the Bootstrap Filter.

Fig. 10. Sample impoverishment mitigation in the HW-EPF. As opposed to the Bootstrap Filter, the HW-EPF assures particle diversity due to its evolutionary properties, thus providing accurate posterior distributions. These results are better if the generation count is kept at a minimum value. Notice that the population moves towards the maximum fitness point.

Results from a comparison between the standard Bootstrap Filter and the proposed HW-EPF show that, under normal operation, both filters provide accurate estimations (see Fig. 7). However, the HW-EPF slightly outperforms the Bootstrap Filter in terms of tracking performance (see estimation errors in Fig. 8). Moreover, if changes are introduced
in the trajectory so as to simulate an inaccurate modeling of the transition process, the HW-EPF is able to recover from lost-tracking states, whereas the standard Bootstrap Filter loses the tracking, never to recover from that stray state. This specific condition is shown in Fig. 9.

Another important feature of the proposed resampling stage is that, as opposed to the resampling strategy adopted in the Bootstrap Filter, it does not generate sample impoverishment phenomena (at least to some extent) in the system (see Fig. 10). Furthermore, if the number of generations the evolutionary algorithm runs is not large, particle diversity is kept and thus sample impoverishment phenomena are mitigated. However, evolutionary processes in which the maximum generation limit is not set to be small might show no mitigation at all.

5.2 HIL Functional Validation
Hardware-in-the-loop co-simulation is a useful testing methodology in which the designed system is connected to a real-world environment, usually the one in which it has been designed to work. The environment is in charge of providing all necessary stimuli to the design under test and, therefore, it does not have to be modeled. Environment modeling is a highly time-consuming task in system testing or validation processes. Hence, HIL co-simulation is able to speed up the testing phase in every design process. It is for this reason that this testing methodology has been used in order to validate the correct functionality of the HW-EPF, i.e. its motion estimation capabilities.

Particle filters are commonly used as the main estimation engine in many different applications. These applications include, but are not limited to, visual tracking [25], [26], object detection [27], image segmentation [28], contour detection [29], video stabilization [30], and even point set registration [31], i.e. finding a spatial transformation that aligns two given point sets. Moreover, particle filters have also been used as powerful estimation tools in other applications, such as video coding/decoding [32]. The proposed architecture can be used in almost any application scenario that requires estimations of both position and velocity of a moving object. The modular implementation ensures that the system can switch applications by simply changing the preprocessing stage. Therefore, the proposed HW-EPF constitutes a generic, application-independent motion estimation system. However, a visual tracking application has been selected as a proof of concept.

The HIL co-simulation has been set up using both Simulink and Xilinx System Generator toolboxes. Images are acquired through a webcam and then preprocessed inside the Simulink model. The target application is to track red objects within the visual space. Hence, the preprocessing stage detects those red objects, selects the biggest among them and extracts the x-axis and y-axis coordinates of its center of mass. These two values are the actual measured values that are sent to the evaluation board through an Ethernet point-to-point connection. The HW-EPF then estimates the position of the center of mass and sends back those estimated coordinates, which are then printed over the original image.

The obtained results prove that the HW-EPF provides accurate tracking (i.e. estimated values), thus validating the proposed system architecture and implementation. In addition, and since normal operation times allow real-time computation (as will be shown in the forthcoming subsections), it is possible to track objects using not only static images but also live-feed video (Fig. 11).

5.3 Sensitivity Analysis
Complex designs such as the HW-EPF require a large number of system parameters so that they can be flexible and, to some extent, reconfigurable. Therefore, it seems reasonable to analyze whether these parameters have some impact on system performance or not. Furthermore, it also seems relevant to detect on which parameters the system depends most strongly. The complete list of system parameters, as well as their default values, can be seen in Table 3. The analysis is focused on two different, yet significant, performance variables: on the one hand, estimation error, i.e. the difference between the estimation and the actual value; on the other hand, estimation time, i.e. the time the system needs to generate a valid estimation from a valid measurement.
Fig. 11. HIL co-simulation results. The functional validation of the proposed architecture is carried out using a red object tracking proof-of-concept demonstrator. Results show that the HW-EPF performs estimations in real time (assuming video inputs of 30 fps).

The estimation error is evaluated as the Euclidean distance (or 2-norm distance) between the estimation and measurement points, which is shown in (27).

$$e_k = \sqrt{(z_{x,k} - \hat{x}_k)^2 + (z_{y,k} - \hat{y}_k)^2} \qquad (27)$$

$z_{x,k}$ and $z_{y,k}$ are the measurements, whereas $\hat{x}_k$ and $\hat{y}_k$ are the estimations.

System parameters in this work can be divided into two different groups: model parameters (e.g. process noise or measurement noise standard deviations) and evolutionary algorithm parameters. The results that appear in this subsection deal with the latter.

It seems clear that population size has a huge impact on estimation performance (see Fig. 12). However, the larger the population is, the slower the estimations are and the larger the resource consumption rate is. Increasing the number of parents also leads to a very significant increase in hardware resources (more internal RAM memory is needed in both cases) and estimation times, but the estimation error is not improved (see Fig. 13). Therefore, the number of particles has to be determined by the tradeoff between accuracy, speed and area, whereas the number of parents is determined by the minimum amount that assures particle diversity and a sufficient number of mutation and crossover operations.

When the number of generations is increased, the estimation error suffers the same variation (see Fig. 14), due to sample impoverishment phenomena. Furthermore, an increase in the generation limit produces longer estimation times. Hence, the number of generations has to be kept under a certain threshold to still benefit from the evolutionary resampling stage.

TABLE 3
System Parameters: Reference Values

Parameter          Description                          Value
$\sigma_{pos}$     Position standard deviation          32
$\sigma_{vel}$     Velocity standard deviation          0.01
$\sigma_{meas}$    Measurement standard deviation       10
$T$                Theoretical sampling time            0.033
#LUT Values        Fitness LUT values per dimension     32
#Particles         Population size                      200
#Parents           Parents number                       10
#Generations       Generations number                   2
$p_{cross}$        Crossover probability                0.6
$p_{mut}$          Mutation probability                 0.1
$r_{mut}$          Mutation ratio                       0.4
$\sigma_{mut}$     Local search standard deviation      6.0

Fig. 12. Estimation error vs. population size boxplot. Estimation error can be significantly reduced if the population size is increased. Small particle populations generate bad results, as can be seen in the example of 20 particles, whereas large populations increase resource utilization.

Fig. 13. Estimation error vs. parent number boxplot. Increasing the number of parents increases system performance, but only slightly. Since parent number has a deep impact on resource consumption, the fewer individuals are selected as parents, the smaller the resulting implementation is.

Fig. 14. Estimation error vs. generation number boxplot. Whenever the generation limit is increased, there is an increase in the estimation error, since the resulting posterior distribution suffers more sample impoverishment effects.

Crossover and mutation probabilities, generally speaking, provide better results, in terms of estimation error, when their values are close to one. An increase in crossover probability always leads to smaller estimation errors. However, increases in mutation probability have different outcomes depending on the mutation operator that is dominant. When the mutation ratio favors random placement, the estimation error is increased (see Fig. 16); when local search mutation is favored, the estimation error decreases (see Fig. 15). This reveals that, although random placement mutation is absolutely necessary in order to deal with lost-tracking situations, it must be kept at minimum values to avoid affecting system performance.

The effect that crossover and mutation probabilities have on estimation time is the opposite of that on estimation error. Larger probabilities mean more offspring particles and, therefore, more sorting operations that sharply increase estimation times (see Fig. 17). Hence, another tradeoff between estimation performance and execution time determines which values to select for the crossover and mutation probabilities.

Fig. 15. Estimation error vs. genetic operators with more local search mutation than random placement mutation. Local search mutation reduces overall estimation errors. Crossover also helps reducing estimation errors, but its effects are more relevant.

Fig. 16. Estimation error vs. genetic operators with less local search mutation than random placement mutation. Random placement mutation, although necessary in order to recover from lost-tracking events, does not help reduce estimation errors as local search mutation does.

Fig. 17. Estimation time vs. genetic operators. The children count affects the time spent in each estimation. There is a bottleneck in the fitness-sorting algorithm, and its negative effects are sharper when the crossover probability is increased (two children appear, instead of only one as in mutation).

5.4 HW vs. SW Performance Comparison
The proposed architecture targets embedded systems. The usage of hardware resources instead of software routines provides a significant improvement in terms of performance and execution time. Take for instance the fitness evaluation module, which replaces a complex mathematical equation (online computation) by a much faster indexing process (LUT addressing).

When the same algorithm is implemented on different embedded platforms, it is clear that the hardware approach outperforms those that try to take advantage of software resources. Software approaches, no matter what data precision the computing core uses (fixed-point or floating-point operations), present a bottleneck in the fitness evaluation unit. Complex mathematical operations, such as exponential functions, reduce the performance of embedded systems. The proposed hardware accelerator performs estimations in less time than the other alternatives, independently of the number of particles that is being used. This can be seen in both Fig. 18 (minimum estimation times) and Fig. 19 (maximum estimation times).
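The LUT-based fitness evaluation mentioned above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the synthesized module: the covered error range (`MAX_ERR`, four standard deviations), the unnormalized Gaussian shape, the saturation at the last table entry and the name `fitness` are hypothetical. The table size and the measurement standard deviation are the reference values of Table 3.

```python
import math

SIGMA_MEAS = 10.0              # measurement standard deviation (Table 3)
LUT_SIZE = 32                  # fitness LUT values per dimension (Table 3)
MAX_ERR = 4.0 * SIGMA_MEAS     # assumed range of absolute errors covered by the table
STEP = MAX_ERR / LUT_SIZE

# Built once, playing the role of the synthesis-time computation. Only the
# non-negative quadrant is stored, since the Gaussian is symmetric in both axes.
LUT = [[math.exp(-0.5 * ((i * STEP) ** 2 + (j * STEP) ** 2) / SIGMA_MEAS ** 2)
        for j in range(LUT_SIZE)]
       for i in range(LUT_SIZE)]


def fitness(z, z_hat):
    """Fitness of an estimated measurement z_hat given the measurement z.

    Zero-order interpolation: each absolute error is truncated to a table
    index, and errors beyond MAX_ERR saturate at the last entry.
    """
    i = min(int(abs(z[0] - z_hat[0]) / STEP), LUT_SIZE - 1)
    j = min(int(abs(z[1] - z_hat[1]) / STEP), LUT_SIZE - 1)
    return LUT[i][j]
```

The run-time cost per particle is two subtractions, two index computations and one memory read, with no exponential evaluated online. The missing normalization constant cancels when the weights are normalized.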
TABLE 4
Resource Consumption

Resource           Utilization (%)
Slice Registers    12
Slice LUTs         18
Block RAM          6
DSPs               26

Resource consumption report on the global hardware accelerator architecture. The module has been implemented in a Xilinx Virtex-5 FPGA (5vlx110tff1136-1). Notice that there is a large number of DSP processing elements used, which may generate large area overheads in other platforms.

Fig. 18. Minimum estimation times comparison. Both crossover and mutation probabilities are set to zero and thus no child is generated. This is not a real situation, since there is no evolutionary influence on the estimations. However, the HW-EPF is still much faster than the other two approaches.

5.5 Resource Consumption and Timing Results
This last subsection shows the resource utilization report (Table 4) and the timing results (Table 5), i.e. the maximum frequency in the design. Notice that the implemented HW-EPF uses a large number of DSP units inside the FPGA, even though the architecture has been optimized by means of resource sharing techniques. This can be considered one of the main weaknesses of the design. In addition, the maximum allowable frequency is not that high when compared with those of the individual modules that constitute the whole design. This is mainly due to long routes inside the FPGA. However, this maximum frequency value still provides good results when dealing with real-time video processing.

TABLE 5
Timing Report

Module                      Maximum frequency (MHz)
Random Number Generator     135.080
Process model               95.716
Crossover unit              113.097
Mutation unit               99.885
Divider                     227.376
HW-EPF                      63.914

Timing results for each of the individual modules that constitute the HW-EPF peripheral, as well as those from the final architecture. The maximum operating frequency of the integrated design is reduced due to large interconnecting routes between modules. As in the resource consumption report, the selected device is a Xilinx Virtex-5 FPGA (5vlx110tff1136-1).

Fig. 19. Maximum estimation times comparison. Both crossover and mutation probabilities are set to one and thus all possible children are generated. Estimation times are much shorter in the HW-EPF, independently of the population size.

6 CONCLUSIONS
A novel particle-filtering architecture, the HW-EPF, has been designed. Its performance analysis reveals that it not only provides accurate motion state estimations, but also outperforms other algorithms, e.g. the Bootstrap Filter. Common particle-filtering problems, i.e. particle degeneracy and sample impoverishment, are mitigated with the proposed algorithm, providing both accurate and realistic posterior distributions.

The HW-EPF modular architecture provides flexibility and reconfiguration capabilities to the embedded system in which it is used. As a hardware accelerator, it speeds up estimation throughput. This acceleration has been verified by comparing different implementations of the same algorithm (only hardware, software with fixed-point precision, and software with floating-point precision) over the same evaluation platform, and comes from the advantages that hardware processing has when dealing with repetitive operations.

The sensitivity analysis shows that those system parameters that increase particle number, e.g. population size and parent size (the more parents are selected, the more offspring is generated), have to remain under certain limits, in order to avoid excessively large implementations, i.e. with large area overhead. Moreover, mutation and crossover probability thresholds have to be selected taking into account the tradeoff between precision and execution time: higher values show, generally speaking, more accurate results, but it takes longer to obtain the estimations. In addition, the maximum number of generations has to be small, in order to mitigate sample impoverishment phenomena and reduce estimation times.

All things considered, the HW-EPF has proved to be an outstanding filter, as well as a robust and powerful estimation tool. A proof-of-concept implementation using HIL co-simulation has been made in order to validate system functionality.

ACKNOWLEDGMENTS
The authors would like to thank the Spanish Ministry of Education, Culture and Sport for its support under the FPU grant program.

REFERENCES
[1] R. E. Kalman, "A new approach to linear filtering and prediction problems," ASME J. Basic Eng., vol. 82, no. Series D, pp. 35–45, 1960.
[2] G. Welch and G. Bishop, "An introduction to the Kalman filter," Chapel Hill, NC, USA, Tech. Rep. 95-041, 1995.
[3] A. Doucet and A. M. Johansen, "A tutorial on particle filtering and smoothing: Fifteen years later," in Oxford Handbook of Nonlinear Filtering, pp. 656–704, 2011.
[4] N. Gordon, D. Salmond, and A. Smith, "Novel approach to nonlinear/non-Gaussian Bayesian state estimation," IEEE Proc. F, Radar and Signal Process., vol. 140, no. 2, pp. 107–113, Apr. 1993.
[5] J. U. Cho, S.-H. Jin, X. D. Pham, J. W. Jeon, J.-E. Byun, and H. Kang, "A real-time object tracking system using a particle filter," in Proc. Intell. Robot. Syst. IEEE/RSJ Int. Conf., Oct. 2006, pp. 2822–2827.
[6] H. El-Halym, I. Mahmoud, and S. E.-D. Habib, "Efficient hardware architecture for particle filter based object tracking," in Proc. Image Process. 17th IEEE Int. Conf., Sept. 2010, pp. 4497–4500.
[7] S.-A. Li, C.-C. Hsu, W.-L. Lin, and J.-P. Wang, "Hardware/software co-design of particle filter and its application in object tracking," in Proc. Syst. Sci. Eng. Int. Conf., Jun. 2011, pp. 87–91.
[8] S. Agrawal, P. Engineer, R. Velmurugan, and S. Patkar, "FPGA implementation of particle filter based object tracking in video," in Proc. Electron. Syst. Des. Int. Symp., Dec. 2012, pp. 82–86.
[9] V. Jilkov, J. Wu, and H. Chen, "Performance comparison of GPU-accelerated particle flow and particle filters," in Proc. Inf. Fusion 16th Int. Conf., Jul. 2013, pp. 1095–1102.
[10] M. Chitchian, A. van Amesfoort, A. Simonetto, T. Keviczky, and H. Sips, "Adapting particle filter algorithms to many-core architectures," in Proc. Parallel Distributed Process. IEEE 27th Int. Symp., May 2013, pp. 427–438.
[11] M. Chitchian, A. Simonetto, A. van Amesfoort, and T. Keviczky, "Distributed computation particle filters on GPU architectures for real-time control applications," IEEE Trans. Control Syst. Technol., vol. 21, no. 6, pp. 2224–2238, Nov. 2013.
[12] T. Higuchi, "Monte Carlo filter using the genetic algorithm operators," J. Statist. Comput. Simul., vol. 59, no. 1, pp. 1–23, 1997.
[13] B.-T. Zhang, "A Bayesian framework for evolutionary computation," in Proc. Evolutionary Computation Congr., 1999, vol. 1, pp. 728.
[14] S. Park, J. P. Hwang, E. Kim, and H.-J. Kang, "A new evolutionary particle filter for the prevention of sample impoverishment," IEEE Trans. Evolutionary Computat., vol. 13, no. 4, pp. 801–809, Aug. 2009.
[15] C. Li, Q. Honglei, and X. Juhong, "Distributed genetic resampling particle filter," in Proc. Adv. Comput. Theory Eng. 3rd Int. Conf., vol. 2, Aug. 2010, pp. V2-32–V2-37.
[16] J. Zhang, T.-S. Pan, and J.-S. Pan, "A parallel hybrid evolutionary particle filter for nonlinear state estimation," in Proc. Robot, Vis. Signal Process. First Int. Conf., Nov. 2011, pp. 308–312.
[17] W. L. Khong, W. Y. Kow, Y. K. Chin, M. Y. Choong, and K. Teo, "Enhancement of particle filter resampling in vehicle tracking via genetic algorithm," in Proc. Comput. Model. Simul. Sixth UKSim/AMSS Eur. Symp., Nov. 2012, pp. 243–248.
[18] C. Nyirarugira and T. Y. Kim, "Adaptive evolutional strategy of particle filter for real time object tracking," in Proc. Consumer Electron. IEEE Int. Conf., Jan. 2013, pp. 35–36.
[19] A. Doucet, N. De Freitas, and N. Gordon, Eds., Sequential Monte Carlo Methods in Practice. Springer, 2001.
[20] A. Doucet, S. Godsill, and C. Andrieu, "On sequential Monte Carlo sampling methods for Bayesian filtering," Statist. Comput., vol. 10, no. 3, pp. 197–208, 2000.
[21] M. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, "A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking," IEEE Trans. Signal Process., vol. 50, no. 2, pp. 174–188, Feb. 2002.
[22] F. Desbouvries and W. Pieczynski, "Particle filtering in pairwise and triplet Markov chains," in Proc. IEEE EURASIP Workshop Nonlinear Signal Image Process., Grado-Gorizia, 2003, pp. 8–11.
[23] N. Abbassi, D. Benboudjema, S. Derrode, and W. Pieczynski, "Optimal filter approximations in conditionally Gaussian pairwise Markov switching models," IEEE Trans. Autom. Control, vol. PP, no. 99, pp. 1–1, 2014, DOI: 10.1109/TAC.2014.2340591.
[24] D.-U. Lee, J. Villasenor, W. Luk, and P. Leong, "A hardware Gaussian noise generator using the Box-Muller method and its error analysis," IEEE Trans. Comput., vol. 55, no. 6, pp. 659–671, Jun. 2006.
[25] Y. Zheng, Z. Shi, R. Lu, S. Hong, and X. Shen, "An efficient data-driven particle PHD filter for multitarget tracking," IEEE Trans. Ind. Inform., vol. 9, no. 4, pp. 2318–2326, Nov. 2013.
[26] J. Kwon, H. S. Lee, F. Park, and K. M. Lee, "A geometric particle filter for template-based visual tracking," IEEE Trans. Pattern Anal. Mach. Intell., vol. 36, no. 4, pp. 625–643, Apr. 2014.
[27] G. Gualdi, A. Prati, and R. Cucchiara, "Multistage particle windows for fast and accurate object detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 8, pp. 1589–1604, Aug. 2012.
[28] D. Varas and F. Marques, "Region-based particle filter for video object segmentation," in Proc. IEEE Comput. Vis. Pattern Recognit. Conf., Jun. 2014, pp. 3470–3477.
[29] N. Widynski and M. Mignotte, "A multiscale particle filter framework for contour detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 36, no. 10, pp. 1922–1935, Oct. 2014.
[30] J. Yang, D. Schonfeld, and M. Mohamed, "Robust video stabilization based on particle filter tracking of projected camera motion," IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 7, pp. 945–954, Jul. 2009.
[31] R. Sandhu, S. Dambreville, and A. Tannenbaum, "Point set registration via particle filtering and stochastic dynamics," IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 8, pp. 1459–1473, Aug. 2010.
[32] S. Wang, L. Cui, L. Stankovic, V. Stankovic, and S. Cheng, "Adaptive correlation estimation with particle filtering for distributed video coding," IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 5, pp. 649–658, May 2012.

Alfonso Rodríguez received the BSc degree in industrial engineering and the MSc degree in industrial electronics from the Universidad Politécnica de Madrid (UPM), Madrid, Spain, in 2012 and 2014, respectively. He is currently a full-time researcher and working toward the PhD degree in industrial electronics at Centro de Electrónica Industrial, UPM. His current research interests include artificial intelligence, embedded systems, high performance computing, and reconfigurable computing.

Félix Moreno received the MSc and PhD degrees in telecommunication engineering from Universidad Politécnica de Madrid (UPM), in 1986 and 1993, respectively. Currently, he is an associate professor of electronics at UPM. His research interests are focused on evolvable hardware, high-performance reconfigurable and adaptive systems, hardware embedded intelligent architectures, and digital signal processing systems. He is a member of the IEEE.
