You are on page 1of 18

1638 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO.

4, APRIL 2012

Dirichlet Process Mixtures for Density Estimation in


Dynamic Nonlinear Modeling: Application to GPS
Positioning in Urban Canyons
Asma Rabaoui, Nicolas Viandier, Emmanuel Duflos, Juliette Marais, and
Philippe Vanheeghe, Senior Member, IEEE

AbstractIn global positioning systems (GPS), classical localiza- have penetrated the transport field through a variety of appli-
tion algorithms assume, when the signal is received from the satel- cations such as monitoring of containers, fleet management, or
lite in line-of-sight (LOS) environment, that the pseudorange error car navigation. These applications do not necessarily request
distribution is Gaussian. Such assumption is in some way very re-
strictive since a random error in the pseudorange measure with an high availability, integrity, and accuracy of the positioning
unknown distribution form is always induced in constrained en- system. However, for safety applications such as guidance of
vironments especially in urban canyons due to multipath/masking autonomous vehicles, performances require to be much more
effects. In order to ensure high accuracy positioning, a good es- stringent.
timation of the observation error in these cases is required. To Unfortunately, most of all these transport applications are
address this, an attractive flexible Bayesian nonparametric noise
model based on Dirichlet process mixtures (DPM) is introduced. mainly used in dense urban environments. Even with modern
Since the considered positioning problem involves elements of non- GPS receivers, high positioning accuracy is only achieved in
Gaussianity and nonlinearity and besides, it should be processed LOS conditions when the signal is received directly without
on-line, the suitability of the proposed modeling scheme in a joint any reflection. Since the signals delivered by sensors are easily
state/parameter estimation problem is handled by an efficient Rao- blurred because of hard external conditions in urban canyons,
Blackwellized particle filter (RBPF). Our approach is illustrated
on a data analysis task dealing with joint estimation of vehicles they may deliver totally erroneous measurements, especially
positions and pseudorange errors in a global navigation satellite when the signal reaches the receiver after interaction with sev-
system (GNSS)-based localization context where the GPS informa- eral objects/obstacles. In urban areas, some obstacles (buildings,
tion may be inaccurate because of hard reception conditions. trees, etc.) can be modeled using a simulator and thus some
Index TermsDensity estimation, global navigation satellite of the errors can be predicted. However, some other obstacles
system (GNSS), global positioning systems (GPS), nonpara- (cars, pedestrians, etc.) can appear suddenly and can induce a
metric Bayesian methods, pseudorange errors, Rao-Blackwellized random error in the measurements. Therefore, in order to en-
particle filter (RBPF), sequential Monte Carlo methods, urban sure high accuracy positioning, a good estimation of the obser-
canyon.
vation error in these cases is required. As a consequence, the
incoming signal is most of the time the sum of direct signals
and several delayed replica called non-LOS signals. However,
I. INTRODUCTION
non-LOS signals which are signals received after reflections on
the surrounding obstacles, frequently occur in dense environ-
ments. In some of these environments considered as very con-
A global positioning system (GPS) is a radio navigation
system that relies on radio-frequency signals emitted by
a constellation of satellites. Consequently, it can be easily dis-
strained, it can happen that no direct signals can reach the re-
ceiver which leads to severe accuracy degradation due to the
induced high propagation time delays.
torted by the measuring system and propagation impairments
In this paper, we address a very challenging issue related to
(atmospheric layers, multipath, diffraction, and mask phe-
the positioning degradation accuracy problem when we do not
nomena). Today, global navigation satellite systems (GNSS)
receive the direct signal but just reflected replica of it. In this
case, the receiver tracks only reflected signals. Such phenomena
make the observation error distribution to become a noncentered
Manuscript received May 15, 2011; revised October 04, 2011; accepted De-
cember 02, 2011. Date of publication December 21, 2011; date of current ver- Gaussian distribution because of additional errors induced by
sion March 06, 2012. The associate editor coordinating the review of this man- the propagation time delays of reflected signals. As a conse-
uscript and approving it for publication was Dr. Lawrence Carin. quence, classical localization methods assuming that observa-
A. Rabaoui is with the IMS, LAPS CNRS UMR 5218, Bordeaux, France
(e-mail: asma.rabaoui@u-bordeaux1.fr). tion noises are zero mean Gaussian are not efficient anymore
N. Viandier and J. Marais are with the IFSTTAR, LEOST, Villeneuve dAscq, and can severely impair positioning accuracy.
France.
E. Duflos and P. Vanheeghe are with the LAGIS FRE CNRS 3303, Ecole
Centrale de Lille, Villeneuve dAscq, France. They are also with Team-Project
A. Previous Work
SequeL, INRIA Lille Nord-Europe, Villeneuve dAscq, France.
Literature focusing on techniques for localization perfor-
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org. mance enhancement in constrained environments is abundant.
Digital Object Identifier 10.1109/TSP.2011.2180901 Most previous work such as [1] and [2] proposed to improve

1053-587X/$26.00 2011 IEEE


RABAOUI et al.: DIRICHLET PROCESS MIXTURES FOR DENSITY ESTIMATION 1639

the GNSS performances by adding other sensors (odometer, B. The Motivating Problem
inertial measurement unit, etc.). However it seems that these
multisensors-based methods show some drawbacks and in In this paper, the state evolution equation governing the po-
particular the high system cost and complexity. For that reason, sition of a land vehicle is a dynamic nonlinear model and the
we propose here solutions allowing to improve positioning measurement equation (observation model) is nonlinear as well
accuracy where the sensor available is the GPS-based sensor. with additive observation errors. Within this setting, a joint esti-
However, related to this question of positioning accuracy in mation problem of hidden states and noise probability densities
hard conditions (for instance in urban canyons), there have been will be considered. Often, the observations which are the only
very few approaches proposed for improving the precision of GPS measurements arrive sequentially in time and one is inter-
the GPS-based framework in the presence of multipath effects. ested in performing the joint estimation on-line.
In recent work, some solutions have been proposed for this Note that in a dynamic nonlinear model with additive ob-
question. Many involve model adaptation-based framework. In servation errors, it is usual to assume that errors are normally
[2][4], the proposed approaches allow to adapt the error model distributed or can be approximated by a finite Gaussian mix-
in the filtering process to the reception condition of each satel- ture. This can cause problems when there are for example out-
lite signal. This modeling permits to handle the overall recep- lying errors leading to many modes in the density distribution
tion process with switches between some preselected measure- form whose variability over time induces poor inference about
ment models. The switching process permits to define three re- the model parameters. As we are considering in this paper the
ception states (LOS, non-LOS, and no reception). The principal case where multipath propagation affects severely GPS signals
idea of such modeling scheme is to define an indicator variable qualities, it is our intention therefore to model the observation
which governs the behavior of the statistical model and allows errors using a highly flexible family of density functions. It is
for switching from one model to another. An interesting way important to notice that mixture modeling is considered as a
of defining the statistical structure of this indicator variable has successful and widely used density estimation method, capable
been developed in [2], [5]. of representing the phenomena that underlie many real-world
Switching state-space models, also called jump state-space datasets. However, the main difficulty in mixture analysis is
models, have been widely studied in the literature. In [6], [7] how to choose the number of mixture components. Model se-
a jump Markov linear system (JMLS) has been proposed for lection methods in general treat the number of components as
the case where the considered state-space model is conditionally an unknown constant and set its value based on the observed
linear and Gaussian. The estimation problem was handled using data. Such approaches lack flexibility, since in practice we often
finite mixture modeling [interacting multiple model (IMM) [8], need to model the possibility that new observations come from
[9] and generalized pseudo-Bayes (GPB) [6] algorithms]. For as yet unseen components. This is a challenging research issue
nonlinear and non-Gaussian cases, efficient sequential Monte and all the related questions motivate the general framework we
Carlo algorithms have been applied [10][13]. are going to develop in this paper. To the best of our knowledge,
In some recent studies, as in [3], the proposed particle filter on-line estimation on a nonlinear dynamic system is a topic that
algorithm with observation switching models shows acceptable has not been sufficiently addressed in navigation literature.
results in terms of accuracy and stability in the context of GPS In this paper, we propose to use a flexible density modeling
positioning. The filtering algorithm takes into account two ob- based on a Bayesian nonparametric approach involving an
servation noise models depending on the reception state of each infinite mixture model. The Dirichlet process mixture (DPM)
satellite. In the direct path, a Gaussian probability distribution model, studied in Bayesian nonparametrics [15][17], is the
is used. Whereas, in the non-LOS case, a Gaussian mixture most used approach for nonparametric density estimation.
(GM) model is used. In this setting, the proposed approach It is based upon the construction of probability measures on
for error estimation uses the expectation-maximization (EM) the space of distribution functions. A rich literature exists on
algorithm [14] which shows some limits especially related to its theoretical properties, and it has been used in a variety of
slow convergence of the EM algorithm and the nonability of problems [18][25]. Despite this activity, only a few work
the density estimation model to capture the right shape of the has been conducted on how to tune the hyperparameters of
very changing distribution of observation errors. In fact, the nonparametric mixture models. Most of the time, in real-world
non-LOS case is modeled by a finite Gaussian mixture and the application contexts, the observed data can help in defining
number of Gaussian components was fixed. Consequently, the a prior for some of the hyperparameters in the considered
mixture model were not able to represent the true measurement Bayesian model. Computing every unknown hyperparameter
error along the mobile trajectory. with pure sampling is not the most effective approach in all
In this paper, in order to limit costs, we have chosen to work cases since this can be practically difficult and time consuming.
only with GNSS signals, no additive sensors will be used. For Our objective is to estimate jointly the hidden states
a better accuracy, we have proposed a new statistical filtering and the error density parameters that
method based on a better definition (and use) of the observation pertain to the observation vectors . The error density
noise for each satellite signal. In order to ensure high accuracy parameters are the latent variables to be estimated using a non-
positioning a good estimation of the observation error in such parametric Bayesian approach. For on-line Bayesian filtering,
cases was required. the particle filter permits to compute the posterior distribution
recursively over time. In this case, we have to
take into consideration the fact that computing everything
1640 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 4, APRIL 2012

with pure sampling can be computationally expensive. One context and the interesting validation schemes. The efficiency
solution consists in considering the augmented hidden state of the proposed approaches is demonstrated by applying a vali-
vector . By the partition of the state-space into dation step involving both simulated and real GNSS signals.
two subspaces drawn by and (for all ), improved The remainder of this paper is organized as follows. Section II
formulations of the particle filter can be applied. Rao-Black- presents the observation noise modeling. Section III is dedicated
wellization [26], [27] is one way of improving the efficiency to the Bayesian modeling of the problem. Section IV is about
of the particle filter. In the case where it is possible to evaluate the nonparametric density estimation problem. We show how to
some of the filtering equations analytically and the others with tune the DPM hyperparameters in a flexible way for a better fit-
Monte Carlo sampling, the use of Rao-Blackwellization leads ting of the data distribution shape. Section V discusses the use
to estimators with less variance than what could be obtained of an efficient particle filter to perform optimal estimation. In
with pure Monte Carlo sampling [28]. The idea is to partition Section VI the efficiency of the proposed approaches is demon-
the state vector so that one component of the partition is a strated by conducting validation experiments involving simu-
conditionally nonlinear Gaussian state-space model; for this lated and real GNSS signals. Section VII concludes this paper
component one can work out the solution analytically and with a summary and discussion.
use for instance the extended Kalman filter (EKF) to compute
. The particle filter is then used only for the
nonlinear non-Gaussian portion of the state-space to compute II. PSEUDORANGE ERROR MODELING
. In this way, we can say that the hard part of the
problem will be reduced into the computation of the posterior In multisensors-based systems, each sensor transmits a signal
distribution . (an information) to the receiver (or antenna). In this paper, the
GNSS constellation [29] is considered as a sensor network,
C. Contributions and Paper Organization and consequently each GNSS satellite is considered as a sensor
This paper considers the problem of density estimation from [30]. These sensors work according to three different operation
a Bayesian nonparametric viewpoint, using a hierarchical mix- modes which are:
ture model. A typical choice is to assume the unknown density normal mode (LOS case), when the information delivered
as a mixture of a parametric family with a discrete random prob- by the sensor can be considered as accurate;
ability as a mixing distribution. This random probability is in- degraded mode (non-LOS case), when the information de-
duced by the Dirichlet process (DP) and the hierarchical model livered by the sensor is not accurate due to noisy data;
is the DPM model. By using DPM, it is important to mention failure mode (no reception case), when the sensor provides
that no matter what additive error distributions are involved we no information.
are confident that our family of densities will be able to capture The navigation signal transmitted by each satellite includes a
the right shape and hence statistical inference for the parame- precise time at which the signal was transmitted. The distance
ters of interest will be improved and reliable. A drawback of or range from a receiver to each satellite may be determined
using a model based on the DP is that it is infinite dimensional using this time of transmission which is included in each navi-
and therefore inference will be complicated. However, recent gation signal. By noting the time at which the incoming signal
innovations in sampling algorithms within infinite dimensional was received at the antenna, a propagation time delay can be
frameworks have made considerable progress in recent years to calculated. This time delay when multiplied by the speed of
such an extent that it is now possible to perform exact inference propagation of the signal yields a pseudorange from the trans-
without the need to set up arbitrary approximations. mitting satellite to the receiver. The range is called a pseudor-
The flexibility of DPM models grows when we assume one ange because the receiver clock may not be precisely synchro-
more step in the hierarchy, i.e., when the DPM hyperparame- nized to GPS time and because propagation through the atmos-
ters are random. Our aim using this family of densities is first to phere introduces delays into the navigation signal propagation
be able to capture the right shape of the noise probability den- times. These result, respectively, in a clock bias and an atmo-
sity functions (pdfs) and then to improve the statistical inference spheric bias. Clock biases may be as large as several millisec-
for the parameters of interest. Therefore, we will show how to onds. Using information from at least four satellites, the posi-
tune the DPM hyperparameters in a flexible way. Some hyper- tion of a receiver with respect to the center of the Earth can be
parameters will be estimated as part of the Gibbs sampler, and determined using triangulation techniques [29], [31]. The pseu-
some others will be chosen carefully in a data-adaptive way for dorange measure deduced from the signal propagation time is
a better fitting of the data distribution shape. expressed as follows:
To sum up, this paper proposes several contributions. The first
is concerned with the modeling of the observation noises using (1)
DPMs. Then, we focus on the suitability of this family of den-
sities in the estimation problem handled by an improved par- where
ticle filter called Rao-Blackwellized particle filter (RBPF). The is the pseudorange between the satellite and the re-
use of Rao-Blackwellization permits to improve the efficiency ceiver at time ;
of the filtering process since we do not need to compute every- is the true satellite-receiver distance;
thing with pure sampling which reduces significantly the com- is the three position coordinates of the
putational cost. An another contribution is about the application vehicle in the reference system of coordinates East, North,
RABAOUI et al.: DIRICHLET PROCESS MIXTURES FOR DENSITY ESTIMATION 1641

Up (ENU)1 and is a vector containing estimating the hidden variables (states) when the related mea-
the position coordinates of the satellite ; surements (observations) involve outliers is a challenging issue.
is the receiver clock offset with respect to the GPS In this paper, we focus on a particular joint state/parameter es-
reference time2 and is the celerity (the timation problem wherein data should be processed on-line.
speed of light); In this section we are interested in designing a dynamic model
is the satellite clock offset; for the parameters to be estimated. Our ultimate objective is to
and are, respectively, the ionospheric and the tropo- estimate accurately the state of a moving vehicle. The vehicle
spheric errors; state is defined by the state vector containing vehicle positions
is the receiver noise considered as a white Gaussian and velocities . Since we are
random variable with variance in general considered to interested in estimating jointly the hidden state
be equal to unity; and the pseudorange noise density, we will consider an aug-
is the error due to the signal reflections in case of mul- mented hidden state vector containing both hidden state param-
tipath propagations. eters and observation noise parameters. In this setting, the obser-
The atmospheric propagation errors and the satellites clock vation error is assumed to be distributed according to an un-
bias can be modeled by incorporating some correction blocks known probability density function whose parameter vector
in the receiver [29]. Consequently, after corrections, (1) can be denoted has to be estimated and stands for the vector
rewritten as containing all the parameters to be estimated. Note that our main
concern in the handled problem is the estimation of nonlinear
(2) models with unknown statistic noises.
It follows that the pseudorange error can be defined in function In the following, a state-space approach providing a general
of two different error sources as framework for describing the dynamic state estimation problem
at hand is provided. Here, we will use the state-space formu-
lation to introduce the filtering algorithms. Statistical Bayesian
filtering aims to compute the posterior probability density func-
where is the sum of the error induced by reflections between tion of a state vector from sequentially obtained sensor
the satellite and the receiver and the receiver noise. According measurements (pseudoranges) . In our context, it relies
to the reception conditions, can switch between different ob- on the following model:
servation models as following:
(4)
(3) (5)

In the case when a signal is received in LOS, the pseudo- (6)


range error distribution is considered white-Gaussian. In the
where f is the state transition function, h is the observation func-
other case, the pseudorange error distribution is unknown. In
tion, they are assumed to be nonlinear, and are, respec-
constrained environments, the density form can change abruptly
tively, the state and the measurement noise vectors. Note that in
or evolve slowly during a long period of time and this depending
this paper, we will be interested in estimating and not . We
on the obstacle nature. Moreover, several moving obstacles (ve-
will assume that is distributed according to a known density
hicles, pedestrians, etc.) can induce random errors. Therefore,
function with fixed parameters.3
to accurately estimate the pseudorange error density, a flexible
We are considering a real-world data analysis task which in-
model is proposed in this paper. It should be noted that our aim
volves estimating unknown quantities from some given obser-
is to estimate at each time the position , the velocity and
vations. In most of such applications, prior knowledge about
the observation noise density from the set of collected mea-
the phenomenon being modeled is available. This knowledge
surements where is the number of visible
allows us to formulate Bayesian models, that is prior distribu-
satellites at time .
tions for the unknown quantities and likelihood functions re-
III. BAYESIAN MODELING lating these quantities to the observations. Within this setting,
all inference on the unknown quantities are based on the poste-
In the previous section, a model for the observations (pseu-
rior distribution obtained from the Bayes theorem. Often, the
doranges) has been introduced in the context of GPS-based po-
observations arrive sequentially in time and one is interested in
sitioning. These observations may be corrupted by errors with
performing inference on-line. It is therefore necessary to update
an unknown distribution. In most multisensors-based systems,
the posterior distribution as data become available. It is worth
1This Cartesian coordinate system is far more intuitive and practical than noting that computational simplicity in the form of not having
ECEF (Earth-Centered Earth-Fixed) or Geodetic coordinates [29]. The local to store all the data might be an additional motivating factor for
ENU coordinates are formed from a plane tangent to the Earths surface fixed
to a specific location and hence it is sometimes known as a Local Tangent or sequential methods.
local geodetic plane. By convention the east axis is labeled , the north and In this paper, we will focus on sequential Monte Carlo (SMC)
the up . integration methods [26] which are a set of simulation-based
2GPS has a reference time called GPS time maintained on the earth at the
US naval observatory. The satellite clocks, though very accurate, are different 3Note that the proposed density estimation framework to tackle the observa-
pieces of equipment and so are not, in general, exactly synchronized with each tion error an also be applied to the state error . However, since our main
other. Thus, they have different offsets with respect to GPS time. motivation is multipath errors estimation, in this paper we will focus only on .
1642 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 4, APRIL 2012

techniques that provide a convenient approach to computing the multimodal due to hard reception conditions especially in urban
posterior distributions in our nonlinear and non-Gaussian case canyons. In this paper, we address the problem of optimal state
and lead most of the time to accurate filtering results. estimation when the pdfs of the noise sequences are unknown
In this paper, the inferential problem is to estimate sequen- and need to be estimated on-line from the data.
tially the augmented hidden states driving the observations Although the desired probability density is unknown, we
along with the starting value. At the same time, our focus is to assume that it has a known functional form.4 Lets see how the
model the additive observation errors using a highly flexible Bayesian approach can be used to obtain the desired density .
family of density functions based on an infinite mixture model. The basic assumptions are summarized as follows:
In the following, in order to introduce the statistical Bayesian 1) The form of the density is assumed to be known, but
filtering for estimating the hidden states, we first review some the value of the parameter vector is not known exactly.
theoretical aspects about nonlinear dynamic state-space mod- 2) Our initial knowledge about is assumed to be contained
eling. Then, we will provide some typical ways dealing with in a prior density .
density estimation of errors via mixture-based models. 3) the rest of our knowledge about is contained in a set of
samples drawn independently according to the
A. Nonlinear Dynamic State-Space Model and Statistical unknown probability density .
Bayesian Filtering The basic problem is to compute the posterior density ,
The unobserved signals are modeled as a because this is a necessary step in computing . There
Markov Process of initial distribution and transition have been several efficient sampling algorithms in [32][34]
equation provided by the motion model (4). The which enable to sample from . However, these algo-
observations , are assumed to be conditionally rithms cannot be applied to our case since the noise sequence
independent given the process and the marginal is not observed directly.
distribution is deduced from (5). To sum up, the model Generally, in density estimation problems, it is the value of
is described by the mixture likelihood at individual points that is of interest,
rather than the membership of the components, which is im-
portant in clustering or discriminant analysis. In this paper, a
Bayesian nonparametric framework based on a mixture model
for density estimation will be introduced. Later, we will study
the inference procedure when the nonparametric component is
We denote by and , applied to the additive observation errors . More
respectively, the states and the observations up to time . Our precisely, we will show that by using a Dirichlet process prior
aim is to estimate recursively in time the posterior distribution [15], [16], [35], we can derive a Bayesian nonparametric mix-
and its associated features (including the marginal ture model. Before we discuss the Dirichlet process prior and the
distribution , known as the filtering distribution). The nonparametric Bayes analysis, let us first consider some typical
posterior distribution can be calculated in two steps theoreti- ways to deal with a mixture model in general.
cally: prediction and update. In the prediction step, we integrate
the state distribution from the previous state using the system C. Bayesian Analysis of Mixtures for Density Estimation
model in (4)(6).
1) General Framework: Many papers, including [15], have
The update step modifies the prediction distribution making
shown that mixtures of normal distributions provide a simple
use of the latest observation. Given the initial distribution
and effective basis for nonparametric Bayesian density estima-
, transition distribution , and the likelihood
tion. Such an approach is very flexible and Markov chain Monte
distribution , the objective of the filtering is to esti-
Carlo methods make it easy to sample from the posterior. To
mate the state , given the observations up to time .
better understand Gaussian mixtures, we will first consider these
From Bayesian perspective, when an observation becomes
models for a fixed value of components, and later explore the
available at time , we can obtain the posterior distribution
properties in the limit where . The finite Gaussian mix-
via the Bayes rule. The expression of this distri-
ture model with components may be written as
bution requires the evaluation of complex high-dimensional
integrals. Since we are in a nonlinear case, the multidimen-
sional integrals are intractable and some approximate methods (7)
must be used. More details about the used filtering approach
will be given in Section V.
where , is the number of com-
B. Bayesian Computations Objectives ponents, the s are the mixture weights which must be non-
negative and sum to one, the s denote the vector of all un-
We will use the model formulated in (4)(6) to estimate the
known parameters associated with th component, and is the
hidden state given the observations . It is a very common
choice to assume that the noise pdfs are Gaussian, with known 4Note that the pdf of the unknown density is defined using a nonparametric ex-

parameters, as this enables the use of Kalman filtering. So far pression with an infinite-dimensional parameter space. In fact, when we cannot
assume that the density is characterised by a set of parameters, we must resort to
we have explained that the Gaussian assumption is inappro- nonparametric methods of density estimation; that is, there is no formal struc-
priate since the observation noise distribution are likely to be ture for the density prescribed.
RABAOUI et al.: DIRICHLET PROCESS MIXTURES FOR DENSITY ESTIMATION 1643

pdf of a (normalized) Gaussian with specified mean and vari-


ance. Note that the functional form of each component is usually
assumed to be known. In some cases, the number of components
may not be known, and is estimated from the data. In this case,
Bayesian nonparametric mixture modeling should be applied.
2) Nonparametric Bayesian Analysis of Mixture Distribu-
tions: Suppose data are distributed as ,
independently over . The mixture structure simply imposes
the constraint that, for some positive integer , there exist
distinct numbers such that, for each
, for some . One way of gen-
erating such a mixture is to use a DP for the prior distribution
of . An important result of the DP is that the
probability of for is positive. Such Bayesian
analyses have been of limited use due to the computational dif-
ficulties. However, these difficulties can now be overcome using
Gibbs sampling techniques [15]. This permits the extension of
standard Bayesian parametric analyses to Bayesian nonpara-
metric analyses, by linking the Gibbs sampling computations
for Dirichlet process priors to those for the standard models.

IV. NONPARAMETRIC DENSITY ESTIMATION OF PSEUDORANGE


Fig. 1. The pseudorange errors can be approximated by a finite Gaussian mix-
ERRORS USING DPM ture model where the number of components in the mixture should be specified
In [36] it was shown that an important characteristic of pseu- in advance and the choice of the analysis window length should be done care-
fully by considering data evolution aspects.
dorange errors is the nonstationarity. In the context of dynamic
models, the assumption of stationarity is wrong. In fact, when
reflections occur; and by considering a long period of observa- the main motivation behind this choice is the application con-
tion time, the pseudorange error distribution can be easily ap- text.
proximated by a finite Gaussian mixture model (Fig. 1). How-
ever, by considering a short period of time, pseudorange errors A. Density Formulation
can be modeled by only one Gaussian distribution whose pa- In mixture models the Dirichlet Process prior is used to
rameters (mean and variance) evolve over time. This problem specify latent patterns of heterogeneity, particularly when the
can be limited by the use of DPM. In this section, DPM models distribution of latent parameters is thought to be clustered.
will be used to generate an infinite Gaussian mixture where the However, neither the number nor the location of clusters needs
Dirichlet process prior is used to specify latent patterns of het- to be specified a priori. Therefore, DPM models are considered
erogeneity, particularly when the distribution of latent parame- very attractive because DPMs allow for uncertainty in the
ters is thought to be multimodal. It will be shown through this choice of parametric forms and in the number of mixing com-
paper that this method is flexible and well suited for on-line ap- ponents. However, analyses based on DPMs are sensitive to the
plications. choice of the DP parameters; the need to treat these parameters
DPM models [16] have become increasingly popular for carefully was first discussed in [15].
modeling when conventional parametric models would impose Let consider a pdf and a set of vectors statistically
unreasonably stiff constraints on the distributional assump- distributed according to , for . We
tions. Recall that in the finite mixture model, each data point will consider a nonparametric model allowing to estimate as
is drawn from one of fixed distributions. To allow the number follows:
of mixture components to grow with the data, we move to the
general mixture model setting. This is best understood with the (8)
hierarchical graphical model in [15].
In traditional application of mixture models, namely, the es-
timation of a (univariate) density function, the problem of den- where is called the latent variable (also called cluster),
sity estimation arises when data are considered to be sampled is the mixed pdf and is the mixing distribution. Under
from a certain distribution, but this distribution is itself assumed a Bayesian framework, it is assumed that is a random proba-
unknown. Nonparametric Bayesian methods are specially well bility measure (RPM) distributed according to a prior distribu-
suited for such kind of problem. References related to this issue tion which is DP prior.
can be found, among others in [17], [37][39]. The focus of our
work will be on mixtures of normal distributions which means B. Dirichlet Process Prior
that the unknown parameter is such that , where In [35] the Dirichlet Process has been introduced as a prob-
and are, respectively, the mean and the variance of normal ability measure on probability measures. It is parameterized by
distributions in the mixture. As it will be explained in this paper, a base distribution on a measurable space and a positive
1644 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 4, APRIL 2012

scaling parameter . Lets suppose now that we draw a random such that . Here, the
measure from a DP, and independently draw random vari- s give the model hyperparameters, while s indicates
ables from : their corresponding labels, such that . From the Polya
urn probabilities in (12), we can claim that the draws for the
(9) indicator variables are obtained according to the following
sampling scheme:
where . Marginalizing out the random mea-
sure , the joint distribution of follows a Polya
urn scheme [40]. In another representation of the DP called the (13)
stick-breaking scheme, is introduced explicitly as an infinite
sum of atomic measures [41]. The realizations of a DP are ex-
where , the number of s
pressed as follows:
which equal . For computational reasons, we assume for
a conjugate normal inverse Wishart base distribution as it will
(10) be detailed later in Section IV-F.

D. Estimation of DP Hyperparameters
with , and
where denotes the Beta distribution and denotes the Dirac In a hierarchical model, a hyperprior can be placed on the pa-
delta measure located in . The underlying random measure rameters of the DP by simply putting prior distributions on the
is then discrete with probability one. Using (8), it comes that scaling (precision) parameter and the parameters of the base
the following flexible prior model is adopted for the unknown distribution . The data is then used to calculate the posterior
distribution distribution. By this, we can say that another layer is added in
the hierarchical model. Some authors have chosen a fixed ,
(11) such as a normal distribution with large variance relative to the
data as in [19] or a , where and are stipulated without
further discussion as in [43]. Other authors have used a para-
More details about other important properties of DPs can be metric family of priors for , such as uniform-inverse Wishart
found for instance in [16], [40][42]. [44] or normal-inverse Gamma [44], [45], with fixed hyperprior
C. Dirichlet Process Mixture Model distributions. The hyperprior distributions are usually chosen by
sensitivity analysis, or else they are chosen to be diffuse with re-
The core of the DPM model can basically be thought of as spect to the observed data.
a simple Bayes model given by the likelihood In this paper, priors on the hyperparameters will be speci-
and prior , with added uncertainty about the prior fied using a data-adaptive way without applying a pure sampling
distribution approach. Most of the time, in real-world application contexts,
the observed data can help in defining a prior for some of the
hyperparameters in the considered model. Clearly, computing
every unknown hyperparameter with pure sampling is not the
most effective approach in all cases since this can be practically
Through this hierarchy, a marginalization step is often used in difficult and time consuming.
actual implementations through the so-called Polya urn repre- Concerning the scaling parameter which is an important
sentation [40]. By integrating over through the Polya urn rep- parameter to be tuned (or chosen) carefully, one may want to
resentation, the joint distribution of may be put a prior on and then use the data to calculate a posterior
factored into a product of successive conditional distributions distribution. To understand what values of to use, the rela-
of the following form: tionship between and the expected number of clusters in the
data might be considered. In [46], the author shows how, with
respect to a flexible class of prior distributions for parameter ,
the posterior may be represented in a simple conditional form
(12) that is easily simulated. As a result, for a better fitting of DPM
where is the hyperparameter vector. This fac- models, inference about this parameter may be developed with
torization implies that has discrete, though infinite support. classical Gibbs sampling algorithms [15].
This implies that a random draw from either equals one of the
previous draws or is drawn independently from the base proba- E. Prior and Posterior Distributions for
bility measure . We have mentioned in this paper that a key feature of the
Note that may be represented in an equiv- Dirichlet is its discreteness, which in our context implies that
alent way by its latent variables and cluster locations . the pairs , concentrate on a set of some
Therefore, we introduce in the following the notion of a distinct pairs. Let be the number of distinct values of
class label assigned to each hyperparameter . We set . To sample the precision parameter , we first deter-
and we denote by the set of mine the prior distribution for , the number of normal compo-
values taken by the labels. The location variables are denoted nents in the mixture (clusters). At each stage of the simulation
RABAOUI et al.: DIRICHLET PROCESS MIXTURES FOR DENSITY ESTIMATION 1645

analysis, a specific value of is simulated from the posterior for


(together with sampled values of the means and variances of
the normal components) which also depends critically on this
hyperparameter . In [15] and [46], it was shown how, based
on a specific but flexible family of prior distributions for , the
parameter vector may be augmented to allow for simulation
of the full joint posterior now including .
In this paper, building on [46] and in the same manner as in
[47], a Gamma prior for will be used. Suppose ,
a Gamma prior with shape and scale . In this case,
may be expressed as a mixture of two Gamma poste-
riors, and the conditional distribution of the mixing parameter Fig. 2. DPM hierarchical model adaptation using the indicator of the signal
given and is a simple Beta. reception state.

(14)
where is the mean of the Normal distribution, is a scale
with weights defined by parameter, and are, respectively, the inverse scale matrix
and the degree of freedom for the inverse Wishart distribution
(see [16] for more details about these distributions). In the
(15)
sequel, we denote by the base distribution
hyperparameter vector. The same base distribution should
where , , and . More not be applied in all navigation situations (LOS, non-LOS, and
details about this sampling can be found in [46]. It is important no reception). Here, we propose to adapt the priors on the
to mention that should be sampled at each Gibbs iteration parameters at each time . More precisely, we propose to use an
stage in the simulation process that allow to generate the additional parameter defined at each time for each satellite
according to (12). The current sampled values of and permit . This parameter adds one more step in the DPM hierarchical
to compute a new value of . In [42] the expected number of model as shown in Fig. 2.
components sampled from a Dirichlet process is given by The parameter is an indicator of the propagation state for
each satellite at each time . It will be used to update the param-
eters of . In practice, when a LOS reception state is detected
the hyperparameters of the base distribution will be sam-
pled from the normal-scaled inverse Wishart distribution whose
parameters , , , and should be chosen carefully. How-
ever, when a non-LOS reception state is detected, it will be in-
teresting to adapt the distribution of the hyperparameters as fol-
(16) lows:

Thus, although with probability 1 as , the ( )


number of components increases approximately logarithmically (18)
with the number of observations.

F. The Parameters
with , , , and .
DP mixtures have received a considerable interest in the The additive term is the estimation of the pseudorange
Bayesian nonparametric literature. However, little has been error computed at each time as
written on choosing a centering distribution for these
models. Building on a previous work [47], the latent sources of (19)
heterogeneity to be specified through the parameters
are considered using a data-adaptive way. is a normal dis- where is the corrected pseudorange measurement and
tribution with unknown mean and covariance . The couple is the predicted position and
are chosen to be distributed according to a normal-scaled is the predicted receiver clock bias computed at
inverse Wishart distribution, denoted as time .

(17) V. RBPF FOR DYNAMIC BAYESIAN MODELS


such that The main concern of this section is to estimate , the hyper-
parameter vector , the latent variables and state variables
and this conditional on the observations . is the hy-
perparameter vector containing the precision hyperparameter
1646 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 4, APRIL 2012

and the hyperparameters of the base distribution . In weights and removing those with small weights. The variance
practice, only the state variable is of interest. We have seen introduced by the resampling procedure can be reduced by
in Section IV-C that through the Polya urn representation, we proper choice of the resampling method [26].
integrate out analytically from the posterior. The parameters
and remain and the inference is based upon B. Improving the Particle Filtering by Rao-Blackwellization
where is the augmented hidden state vector. A method to increase the efficiency of sampling techniques
Note that the on-line estimation of is necessary for a good is to reduce the size of the state space by marginalizing out if
estimation of which implies a better fitting of the DPM possible some of the variables analytically; this is called Rao-
model and hence the estimation accuracy of the state vector Blackwellization [28]. In the following, this approach will be
could be enhanced. introduced in order to improve filtering. As shown previously,
In Section V-A, a filtering approach for estimating sequen- we can compute the posterior distribution on-line by applying
tially the augmented hidden state using a particle filter (PF) Bayes rule sequentially. However, in general, the required in-
is provided. In this setting, a first solution based on computing tegrals cannot be computed in closed form. Particle filtering
everything with pure sampling is proposed. Then, to improve therefore approximates the posterior using sequential impor-
the efficiency of this filtering approach, some of the variables tance sampling. Now, if we partition the state-space into two
will be marginalized out and an efficient Rao-Blackwellized PF subspaces, drawn by and (for all ). Then, by the
is provided in Section V-B. chain rule of probability, we can write

A. Particle Filtering
Unscented Kalman filters (UKF) and EKF which are widely (22)
used as suboptimal solution in case of white Gaussian systems
where . We will need to compute the posterior
are not suitable for treating the models which exhibit noncen-
distribution using the conditional distributions
tered and non-Gaussian distributions. Except in a few simple
and . Recall that details for
cases, there is no closed-form solution to this problem. It is
the computation of these conditional distributions were given in
therefore necessary to adopt numerical techniques in order to
Sections IV-D, -E, and -F. Based on these details, note that we
compute reasonable approximations. SMC methods (particle fil-
can sample from the posterior distribution
ters) are powerful tools that allow us to accomplish this goal.
by simulating a Markov chain that has the posterior as its
This filtering method is able to cope with noises of any distribu-
equilibrium distribution. The simplest such methods are based
tion and provides a convenient and attractive approach to com-
on Gibbs sampling. Then, the hyperparameters distribution
puting the posterior distributions.
can be included in the Markov chain simulation.
The PF approximates the posterior probability density func-
On the other hand, if we can update
tion via the discrete weighted sum
analytically and efficiently, then we only need to sample
using the PF. Since we are now sampling in a
(20) smaller space, in general we will need far fewer particles to
reach the same accuracy as standard PF. This is the key idea
The individual weights , are computed by applying the prin- behind RBPF.
ciple of importance sampling. Since it is difficult or impos- The following is a generalization of the RBPF to DPM-based
sible to directly sample , particles are simulated framework as presented in [26] and applied in [5], [48]. At time
according to a proposal distribution , we use the following empirical distribution to approximate
through a set of particles

Then, to correct for the discrepancy between the proposal and (23)
the target distribution, importance weights will be assigned to
the generated particles with

(24)
where By using an EKF or an UKF, for each particle we can com-
pute recursively the terms and . For more
details about how to build the sampling algorithm refer to [2],
[5], and [48]. In RBPF, each particle maintains not just a sample
(21) from but also a parametric representation of
According to (21), the set of particles is updated and reweighted the distribution (the parametric represen-
using a recursive version of importance sampling. The most tation in our case are a mean vector and a covariance matrix).
likely particles yield high importance weights. An additional For our tracking application, the samples are updated as in
resampling procedure is used for replacing particles with large standard PF, and then the distributions are updated using an
RABAOUI et al.: DIRICHLET PROCESS MIXTURES FOR DENSITY ESTIMATION 1647

EKF, conditional on . The overall algorithm is shown in Al- nonparametric modeling. We know that in real applications,
gorithm 1. Therefore, the optimal importance distribution can the sources of this heterogeneity cannot be completely known.
be written as . We will prove through some experiments that the Dirichlet
Without deriving approximations of this importance distribu- process prior may provide a satisfactory model in case where
tion, no efficient importance distributions can be used directly the sources of heterogeneity in the observation errors density
since the associated importance weights will be computationally are unobserved. The DP prior is robust to errors in model spec-
intractable. In this paper, for the sake of simplicity, the impor- ification and allows heterogeneity in the patterns distributions
tance distribution is chosen to be equal to the Polya urn distri- to be specified in a data-adaptive way.
bution as in (12). This approximation yields For example, the predictive distribution of densities based on
to the simplification of the weight expression (21). Finally, the our Dirichlet process prior can be multimodal (non-LOS case)
minimum mean squared error (MMSE) estimate and posterior owing to its natural potential for identifying clusters. In contrast,
covariance matrix of are given by standard parametric prior distributions of densities, such as the
normal distribution, cannot capture multimodality because of
(25) their shape restrictions. But, if the actual distribution of density
can be approximated by a unimodal distribution (LOS case), the
Dirichlet process prior also adapts to this situation. In fact, the
(26) parametric model based on finite Gaussian mixtures is a limiting
case of our modeling scheme when applying Dirichlet process
Note that in this subsection and in Algorithm 1 the estimation prior with . Therefore, we can say that the Dirichlet
of the hyperparameter is not explicitly shown, more details process prior is a natural extension of fully parametric models.
about this will be discussed later in Section IV-A. In our application of on-line pseudorange error density es-
timation, the latent variables are the mean and the variance
Algorithm 1: Computing the posterior distribution of each Gaussian distribution included in the infinite mixture,
using a Rao-Blackwellized Particle Filter i.e., , for . To compute a vehicle position
at each step of the filtering process, particles are
At time , computed. The problem with GNSS applications is that signal
propagation can be considered in different reception modes. In
Step 0: initialization
LOS reception, the pseudorange error noise follows a zero-mean
For , sample and set
Gaussian distribution with a standard deviation inferior or
equal to 1 (actually it depends on the used receiver). In non-LOS

reception, the pseudorange error can be higher than 100 m. In
At time , the proposed DPM model, the will be estimated by sampling
as detailed in Section IV-D. The same base distribution will
Step 1: Importance Sampling step
not be applied in all navigation situations. Here, we propose to
For
adapt and each time using the proposed approaches in
Sample
Section IV-D.
Sample
Evaluate the importance weights

Step 2: Selection step B. About Computational Complexity


Multiply/Discard particles with
The computational complexity for particle filters is indepen-
respect to high/low normalized importance weights
dent of the dimension of the state-space: it grows linearly in the
to obtain equally weighted particles.
number of particles, with the error decreasing as the square root
Step 3: Apply an EKF to compute of the number of particles . We can say that it is evidently of
Compute at each iteration, i.e., at each time (see Algorithm 1).
For further discussions on the relation between dimensionality
and Monte Carlo error the reader is referred to [49]. In Fig. 3,
we depicted the evolution of the state estimation error when the
number of particles grows. As can be seen, a reasonable number
of particles (about 500) can be sufficient to lead to a good esti-
mation accuracy.
In sequential filtering, we know that the complexity of
VI. APPLICATIONS TO GNSS POSITIONING IN URBAN CANYON the posterior distribution increases when the dimension of the
hidden state vector is large. However, by applying a Rao-Black-
A. Error Density Modeling Using DPM wellization in the proposed scheme, we reduce significantly
For density estimation, the heterogeneity in the distribution the computational cost since we do not need to compute all the
of the latent parameters is a problematic issue in Bayesian parameters with pure sampling, but just some of them.
1648 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 4, APRIL 2012

each time by applying a Gibbs sampler, its computational com-


plexity at each iteration depends only on the dimension of the
vector containing the couple of parameters . Note that the
adaptation of the precision parameter and the base distribution
parameters (as shown in Algorithms 2 and 3) is simply con-
ducted at each iteration of the Gibbs sampler. However, com-
paring algorithms exclusively in terms of complexity may be
misleading, since a scheme with low complexity may in fact be
useless because it mixes very badly; that means we get stuck
in one point of the parameter space. In particular, this remark
applies when comparing the DPM model with a finite mixture
Fig. 3. Evolution of the positioning error when the number of particles used in model (e.g., GM) and, hence, the use of the DPM model instead
the RBPF-based algorithm grows.
of a finite mixture model to estimate the error density was com-
pletely justified in this paper.

C. Experiments With Simulated Data


Algorithm 2: Tuning the hyperparameters of the DP
The software Ergospace developed by the company Er-
For gospace 5 simulates the propagation of electromagnetic signals
For ( : Number of visible satellites at in 3D realistic constrained environments. It uses the Ray
time ) Tracing method to deterministically calculate direct paths as
For each iteration of the Gibbs sampler well as multipaths between transmitters and a receiver. This
( ), do software describes and analyzes with a good precision the
Sample according to Algorithm 3 performances of a satellite navigation system. It is a powerful
If LOS reception, set the following parameters of simulation tool that can replace long-term and expensive
measurement campaigns. In this work, very representative 3D
models of two cities in France (Rouen and Toulouse) have
been used for the simulations. They have been produced by
the Urbanism Departments of both cities for their proper use.
If non-LOS reception, set the following parameters For all the simulation modes (static and dynamic), Ergospace
of : outputs the characteristics of all the direct and reflected paths,
the pseudorange between the satellites and the receiver, the
evolution of geometrical and performance indicators, and the
visible satellites. In addition to the possible severe conditions
(no direct visibility, multipath, and no visibility), the motion of
the user in this environment may produce fast variation of the
Algorithm 3: Sampling the signal configuration and has also to be taken into account.
Two typical geographical areas have been chosen for the sim-
For each iteration of the Gibbs sampler ( ), do ulations with Ergospace:
Set the parameter to some predetermined value a trajectory containing both residential and suburban envi-
Consider the number of measurements provided by ronments (Trajectory 1); the 3D model and the simulated
the satellite at time trajectory are displayed on Figs. 4 and 5.
Compute the number of components in the mixture a city center area (urban center) in Toulouse (Trajectory 2);
(clusters) at time denoted , the simulated trajectory is displayed on Fig. 13.
Set a prior for : 1) Rouen City (Trajectory 1): In this section, simulations
Compute a conditional distribution of the mixing have been performed with a 3D scene of the town of Rouen. The
parameter denoted using the Beta distribution as simulated route corresponds to an existing bus line. In these sim-
following ulations, the mobile ran at a constant speed of 30 km/h, during
Use to compute weights as in (15) 10 min and is covering a distance of 8236 meters. GNSS signals
is a mixture of two gamma posteriors from the GPS constellation can be received with a maximum of
computed as in (14) 3 reflections. Since, in most urban transport applications, local-
ization should be provided with a good accuracy, in our exper-
iments we have fixed the accuracy threshold to 3 m as in [50].
It may be instructive to compare the computational complexity
Note that in the considered trajectory neighborhood, there are
of the proposed scheme (RBPF with adapted parameters) with
large open zones, which are well covered by GNSS satellites,
that of a corresponding standard PF implementation. One way
but in the narrowest streets, lack of visibility occurs up to 50%
to achieve this is to quantify the additive complexity induced by
of the time.
the parameter estimation using DPM. However, since the DPM-
based estimation step consists in computing at 5An overview of the software Ergospace is in http://www.ergospace.fr/.
RABAOUI et al.: DIRICHLET PROCESS MIXTURES FOR DENSITY ESTIMATION 1649

TABLE I
COMPARISON OF POSITIONING ERRORS USING DPM AND GM (TRAJECTORY 1)

Fig. 4. An example illustrating the vehicle trajectory on a 3D realistic con-


strained environment (city of Rouen, France) using the Ergospace software sim-
ulator.

Fig. 6. Localization accuracy obtained (using trajectory 1) by applying var-


ious filtering algorithms: EKF, PF (with GM model), and the proposed approach
based on RBPF (with DPM model).

In order to highlight the effectiveness of the proposed


positioning framework, we have conducted preliminary ex-
periments using EKF, PF, and RBPF algorithms. Using these
methods, localization performances were compared along
the same trajectory during the same period of time. In the
EKF-based algorithm, pseudorange error distributions are
assumed to be centered Gaussian. In the PF algorithm, we con-
sidered a finite Gaussian mixture (GM) with two components.
About 500 particles are used in the above algorithms. Accuracy
Fig. 5. The reference trajectory of a vehicle in the city of Rouen: Positions are
has been calculated with reference to an exact position given
drawn in the local East North Up (ENU) Cartesian coordinate system. The unit by the simulator (Fig. 5).
of the vehicle coordinates is the meter (m). The use of RBPF algorithm is fully justified by the results
presented in Figs. 6, 7, and 8, as it yields consistently lower
error rates and a high localization accuracy. It appears that using
Here, we will present some results that pertain to the DPM models within a RBPF can improve the accuracy and the
choosing of the DPM model for density estimation. Moreover, availability of the localization even in very constrained environ-
we designed experiments involving some comparison with ments. Fig. 8 shows the evolution over time of the positioning
some commonly used estimation and filtering methods to show error. We see clearly that the proposed algorithm has proven
the robustness, flexibility, and merits of our approach. Note to be more accurate than the other algorithms. Besides, another
that in the following, the DPM simulations are based on the key point permitting to conclude that those methods are more
implementation of the DP mixture model described in the suitable than the others is the algorithm stability when abrupt
previous subsection, applied to the case of conjugate models propagation changes occur during mobile dynamic evolution
with normal structure. The precision parameter is calibrated (Fig. 7). Fig. 7 shows zooms into Fig. 6 which illustrates the be-
automatically following the procedure proposed in Algorithm havior of the proposed algorithm with comparison to the other
3. The parameters of the centering distribution are simply approaches. These results bring evidence regarding the perfor-
adapted to the data characteristics as shown in Algorithm 2. mances of the proposed approach, the proposed density estima-
We set the initial arbitrary value of the precision parameter tion algorithm and parameter adaptation schemes have shown
, and the values of the hyperparameter vector are: interesting results for the proposed tracking application. The ref-
, , , , , , and erence route is composed of a set of bends with different curva-
. ture radii. Note that there are consecutive bend segments with
1650 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 4, APRIL 2012

Fig. 7. Three zooms into Fig. 6: zoom (a) (top right), zoom (b) (top left), and zoom (c) (bottom).

different curvatures. The top left graph in Fig. 7 depicts the po-
sitions values when the vehicle dynamic changes suddenly. In
this segment of the route, there is a first turning on the left on
a long radius bend and a second turning on the right. In these
situations, the simulated positions using the proposed approach
are the nearest ones to the actual positions throughout the en-
tire segment. This demonstrates that the proposed positioning
approach still robust when abrupt changes occur in the dynamic
of the vehicle. The same remarks are also applicable for the top
right graph of Fig. 7. The bottom graph in Fig. 7 shows the be-
havior of the tested tracking algorithms at the beginning of the
trajectory. At the beginning, the vehicle run about 200 m near
obstacles (high building and trees) and being not completely in
the middle of the road. The proposed approach behaves well
Fig. 8. Evolution of the positioning error for different tracking approaches:
compared to the other approaches since it does not need a long EKF, PF (GM is used for the density estimation of observation errors) and the
initialization period of time. proposed approach based on RBPF (DPM is used for density estimation of the
To model the observation errors density, both GM and DPM- errors).
based models have been used in a sequential Monte Carlo fil-
tering approach. Table I shows the localization performances for
the two models in terms of mean positioning error. The average In fact, a random draw from either equals one
error between the actual and estimated trajectories drops below of the previous draws or is drawn independently from the base
3 m. We have considered three scenarios, each scenario corre- probability measure . The parameter obviously plays an
sponds to a different satellite configuration at a specific time of important role in the distribution of . In (18), note that the prob-
the day. ability that the new draw of the parameter differs from all pre-
In most applications of DPM models, the precision parameter viously drawn parameter values is proportional to ; therefore,
is not known and must be well chosen. In this paper, some higher values of lead to a higher probability of many unique
empirical tests were conducted to understand the influence of values (i.e., a higher number of classes relative to the sample
values on the estimation performance. In [16] for example, size ). Conversely, lower values of lead to a greater chance
it was shown that in problems where is unknown, estimation of clustered distributions with fewer unique values in .
results can be extremely sensitive to the choice of prior assumed In preliminary experiments, we considered a range of values
for . in order to explore the extend to which our results are sensitive
RABAOUI et al.: DIRICHLET PROCESS MIXTURES FOR DENSITY ESTIMATION 1651

Fig. 9. Estimation of the pseudorange errors using DPM models with fixed and
estimated values of .

to different choices of . While the overall shape of the distri-


bution remains the same across values of , higher values of
appear to produce an increased number of localized features and
even indicate additional modes for a subset of the observations.
The estimated means and variances of the model parameters are,
however, quantitatively very similar. Furthermore, we find little Fig. 10. The posterior distribution of the number of clusters is drawn. The
gain from using large values of . In our DPM model, we used clusters number evolves approximately logarithmically as new data come in.
a Gamma distribution as a prior for the . The major advantage
of this choice is that with this prior the conditional distributions
are easy to sample by applying the approach in Section IV-E.
The resulting posterior distribution over based on a Gamma
prior shown in Fig. 11 is considerably variable. Fig. 10 illus-
trates the posterior distribution of . The posterior clearly in-
dicates that , which suggests that the distribution of
the latent variables is clustered. Practically, in most situations
a considerable uncertainty exists about the probable value of
the number of model clusters . In our context, nor the prior
mean neither the prior variance of the are known. As shown
in (16), the number of clusters increases approximately loga-
rithmically with the number of observations. This is illustrated
in Fig. 10 which shows how the prior over the number of com- Fig. 11. Posterior distribution over the dispersion parameter when a Gamma
ponents changes as a function of . Fig. 12 shows clusters and prior is used.
theirs locations for two satellites (signals from satellite 18 are
mostly received in LOS and those of satellite 13 are both in LOS
and non-LOS). It seems to be extremely hard to have such in- simulations is shown. Note that the considered environment is
formations on data in a real-world application. Therefore, the an urban environment characterized by severe masking angles
parameters and of the Gamma prior have to be adjusted and the presence of a great number of obstacles (smaller streets
carefully. For example, we noticed that small values of and and higher buildings). The masking problem results in the lack
lead to nearly similar values of the probability density, which of visible satellites, whereas obstacles lead to a high level of
means a lack of variability in the distribution of . Finally, in multipath with measurement errors at receiver level. In these
Fig. 9 we show that estimating the leads to a good position- conditions, tracking is possible when at least 4 transmitters are
ning accuracy compared to the result obtained when we use a visible from the receiver.
fixed . Figs. 14 and 15 show the behavior of the proposed algorithm
2) Toulouse City (Trajectory 2): To capture different dy- with comparison to the other approaches. These results illustrate
namics, the vehicle was driven on a second trajectory different the performances of the proposed approaches in dense environ-
from the one used in Section VI-C-1. In this part, a second sce- ments and even when the dynamic of the vehicle change sud-
nario was considered using the same simulator with the same denly. The same remarks as previously concerning the robust-
settings. In this scenario, we intentionally choose to drive inside ness of the proposed RBPF are applicable here. Fig. 16 shows
the city of Toulouse (south of France). Many simulations have the localization performances for different algorithms in terms
been conducted to illustrate the performance of the proposed po- of mean error. The error between the actual and estimated tra-
sitioning framework. In Fig. 13, the trajectory used in Ergospace jectories is obtained by averaging the results from three different
1652 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 4, APRIL 2012

Fig. 12. Locations of the clusters in the DPM model for different satellites (satellites 18, 1, and 13) and their corresponding reception states. (a) Temporal location
of the clusters (s). (b) Time diagram illustrating the reception states (LOS, NLOS, Blocked) of satellites during simulation.

TABLE II
COMPARISON OF POSITIONING ERRORS USING DPM, GM, AND SAFEDRIVE (2D ERRORS)

Fig. 13. The reference trajectory in the city of Toulouse: positions are drawn
in the local East North Up (ENU) Cartesian coordinate system. The unit of the
vehicle coordinates is the meter (m). The speed of the mobile is about 30 km/h Fig. 14. The reference trajectory of a vehicle in the city of Toulouse: Positions
and the traveled distance is 4333 meters. are drawn in the local East North Up (ENU) Cartesian coordinate system. The
unit of the vehicle coordinates is the meter (m).

scenarios. As in Table I, we considered again three satellite con-


figurations corresponding to different times of the day. accuracy. A modern RTK receiver can offer greater than 99.9%
reliability and provide up to centimeter-level accuracy. Then,
D. Experiments With Real Data based on this reference trajectory, the positions delivered by
Besides experiments using simulated data, we have con- a second receiver (Safedrive), a low-cost one, are collected to
ducted additional experiments using real-world data, in the be used to conduct tracking experiments. The antennas of both
city of Belfort (in France). We have considered a trajectory receivers have been placed on a the roof of a van at about 2 m
containing urban, residential and suburban environments. above the ground.
The reference trajectory (Figs. 17 and 18) was obtained In Fig. 19, as can be seen, the satellite visibility is low in some
using a modern real time kinematic (RTK) receiver. This is segments of the trajectory. This implies that the pseudorange de-
a high-quality GPS receiver in terms of reliability, speed and livered by Safedrive is clearly corrupted by errors in some areas.
RABAOUI et al.: DIRICHLET PROCESS MIXTURES FOR DENSITY ESTIMATION 1653

Fig. 15. Three zooms into Fig. 14: zoom (a) (top right), zoom (b) (top left), and zoom (c) (bottom).

Fig. 16. Evolution of the positioning error (in trajectory 2) for different
tracking approaches: EKF, PF (GM is used for the density estimation of
observation errors) and the proposed approach based on RBPF (DPM is used
for density estimation of the errors).

In order to highlight the performance of the proposed frame-


work, we have reported in Table II positioning errors obtained
Fig. 17. An overview of the vehicle trajectory (red line) on a real environment
when using PF (GM) and RBPF (DPM with adapted parame- (city of Belfort in France).
ters)-based algorithms. The assumptions concerning the model
parameters and the algorithms setting are the same as in the pre- that permits robust adaptive inference to be carried out on data
vious experiments. with some unknown specifications. The DP prior is extremely
flexible, it allows to model a huge variety of distributional
VII. CONCLUSION forms. In the considered application, its effectiveness was
Our paper deals with the problem of improving localization proved, since in our case little is known about priors to be
accuracy in dense environments when signals delivered by the considered for the errors distribution. We illustrated through
GNSS system are received most of the time after many reflec- experiments that a good choice of the DPM hyperparameters
tions. The errors induced in the pseudorange measurements leads to a better density estimation. Since the proposed density
have an unknown density form. Thus, in this paper we have estimation solution was strongly motivated by the complexity
discussed a Bayesian nonparametric model based on a DPM of the considered data, we proved that the proposed approaches
1654 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 4, APRIL 2012

[8] H. Blom and Y. Bar-Shalom, The interacting multiple model algo-


rithm for systems with Markovian switching coefficients, IEEE Trans.
Autom. Contr., vol. 33, no. 8, pp. 780783, 1988.
[9] E. Mazor, A. Averbuch, Y. Bar-Shalom, and J. Dayan, Interacting
multiple model methods in target tracking: A survey, IEEE Trans.
Aerosp. Electron. Syst., vol. 34, no. 1, pp. 103123, 1998.
[10] C. Andrieu, M. Davy, and A. Doucet, Efficient particle filtering for
jump Markov systems. Application to time-varying autoregressions,
IEEE Trans. Signal Process., vol. 51, no. 7, pp. 17621770, 2003.
[11] A. Doucet and B. Ristic, Recursive state estimation for multiple
switching models with unknown transition probabilities, IEEE Trans.
Autom. Contr., vol. 38, no. 3, pp. 10981104, 2002.
[12] V. Jilkov and X. R Li, Online Bayesian estimation of transition prob-
abilities for Markovian jump systems, IEEE Trans. Signal Process.,
Fig. 18. The reference trajectory of a vehicle in the city of Belfort: Positions vol. 52, no. 6, pp. 16201630, 2004.
are drawn in the local East North Up (ENU) Cartesian coordinate system. The [13] A. Doucet, N. Gordon, and V. Krishnamurthy, Particle filters for
unit of the vehicle coordinates is the meter (m). state estimation of jump Markov linear systems, IEEE Trans. Signal
Process., vol. 49, no. 3, pp. 613624, 2001.
[14] A. P. Dempster, N. M. Laird, and D. B. Ru, Maximum likelihood from
incomplete data via the EM algorithm, J. Royal Statist. Soc. Ser. B
(Methodological), vol. 39, no. 1, pp. 138, 1977.
[15] M. D. Escobar and M. West, Bayesian density estimation and infer-
ence using mixtures, J. Amer. Statist. Assoc., vol. 90, pp. 577588,
1995.
[16] R. M. Neal, Markov chain sampling methods for Dirichlet process
mixture models, J. Computat. Graph. Statist., vol. 9, pp. 249265,
2000.
[17] P. Muller and F. A. Quintana, Nonparametric Bayesian Data Anal-
ysis, Statistical Science, vol. 19, no. 1, pp. 95119, 2004, Inst. of Math.
Stat..
[18] A. E. Gelfand, A. Kottas, and S. N. MacEachern, Bayesian non-
parametric spatial modeling with Dirichlet process mixing, J. Amer.
Statist. Assoc., vol. 100, no. 471, pp. 10211035, 2005.
[19] H. Ishwaran and L. F. James, Approximate Dirichlet process com-
puting in finite normal mixtures: Smoothing and prior information, J.
Computat. Graph. Statist., vol. 11, pp. 508532, 2002.
Fig. 19. Satellite visibility when using the Safedrive receiver. [20] R. M. Dorazio, On selecting a prior for the precision parameter
of Dirichlet process mixture models, J. Statist. Plann. Infer., pp.
33843390, 2009.
outperform standard sequential methods, especially in a non- [21] Y. Xue, X. Liao, L. Carin, and B. Krishnapuram, Multi-task learning
for classification with Dirichlet process priors, J. Mach. Learn. Res.,
linear dynamic context in which joint estimation of states and vol. 8, pp. 3563, 2007.
observation noise distributions is needed. [22] J. Paisley and L. Carin, Hidden Markov models with stick-breaking
priors, IEEE Trans. Signal Process., vol. 57, pp. 39053917, 2009.
Many interesting questions remains to be addressed. One of [23] L. Ren, D. Dunson, S. Lindroth, and L. Carin, Dynamic nonparametric
these is whether nonstationarity can be involved in the non- Bayesian models for analysis of music, J. Amer. Statist. Assoc., vol.
105, pp. 458472, 2010.
parametric model by considering time-varying Dirichlet process [24] I. Pruteanu-Malinici, L. Ren, J. Paisely, E. Wang, and L. Carin,
mixtures. At each time step it can be interesting to consider that Dynamic hierarchical Dirichlet process for modeling topics in
the unknown distribution follows a DPM model. Some inter- time-stamped documents, IEEE Trans. Pattern Anal. Mach. Intell.,
2010.
esting and simple approaches, such as utilized in [51], need to be [25] L. Ren, D. Dunson, and L. Carin, The dynamic hierarchical Dirichlet
investigated. A second is to develop more accurate and efficient process, in Proc. Int. Conf. Mach. Learn. (ICML), 2008.
[26] A. Doucet, N. de Freitas, and N. Gordon, Sequential Monte Carlo
approximations of the importance distribution Methods in Practice. New York: Springer, 2001.
in the proposed RBPF. [27] B. Ristic, S. Arulampalam, and N. Gordon, Beyond the Kalman
Filter. Norwood, MA: Artech House, 2004.
[28] G. Casella and C. P. Robert, Rao-Blackwellization of sampling
REFERENCES schemes, BiometriKa, vol. 83, no. 1, pp. 8194, 1996.
[29] E. D. Kaplan and C. J. Hegarty, Understanding GPS: Principles and
[1] A. Giremus, J.-Y. Tourneret, and V. Calmettes, A particle filter Applications. Norwood, MA: Artech House, 2005.
approach for joint detection/estimation of multipath effects on GPS [30] A. Rabaoui, N. Viandier, J. Marais, and E. Duflos, Studies on DPM for
measurements, IEEE Trans. Signal Process., vol. 55, no. 4, pp. the density estimation of pseudo-range noises and evaluations on real
12751285, 2007. data, in Proc. IEEE/ION Position Location Navigat. Symp., 2010, pp.
[2] F. Caron, M. Davy, E. Duflos, and P. Vanheeghe, Particle filtering for 377382.
multisensor data fusion with switching observation models. Applica- [31] B. H. Wellenhof, H. Lichtenegger, and E. Wasle, GNSSGlobal Nav-
tion to land vehicle positioning, IEEE Trans. Signal Process., vol. 55, igation Satellite Systems: GPS, GLONASS, Galileo & More. New
no. 6, pp. 27032719, 2006. York: Springer-Verlag, 2007.
[3] F. D. Nahimana, E. Duflos, and J. Marais, Reception state estimation [32] C. P. Robert and G. Casella, Monte Carlo Statistical Methods, 2nd
of GNSS satellites in urban environment using particle filtering, in ed. New York: Springer-Verlag, 2004.
Proc. Int. Conf. Inf. Fusion, Cologne, Germany, Jun. 2008, pp. 3033. [33] J. S. Liu, Monte Carlo Methods in Scientific Computing. New York:
[4] N. Viandier, D. F. Nahimana, J. Marais, and E. Duflos, GNSS perfor- Springer-Verlag, 2001.
mance enhancement in urban environment based on pseudo-range error [34] A. Doucet and X. Wang, Monte Carlo methods for signal processing:
model, in Proc. IEEE/ION Position, Location and Navigat. Symp., A review in the statistical signal processing context, IEEE Signal
2008, pp. 377382. Process. Mag., vol. 22, no. 6, 2005.
[5] F. Caron, M. Davy, A. Doucet, E. Duflos, and P. Vanheeghe, Bayesian [35] T. S. Ferguson, A Bayesian analysis of some nonparametric prob-
inference for linear dynamic models with Dirichlet process mixtures, lems, Ann. Statist., vol. 1, pp. 209230, 1973.
IEEE Trans. Signal Process., vol. 56, no. 1, pp. 7184, 2008. [36] N. Viandier, A. Rabaoui, J. Marais, and E. Duflos, GNSS pseudorange
[6] Y. Bar-Shalom and X. R. Li, Multitarget-Multisensor Tracking: Prin- error density tracking using Dirichlet process mixture, in Proc. 13th
ciples and Techniques. New York: YBS, 1995. Int. Conf. Inf. Fusion, 2010.
[7] J. Tugnait, Adaptive estimation and identification for discrete systems [37] E. Fox, E. Sudderth, and A. Willsky, Hierarchical Dirichlet processes
with Markov jump parameters, IEEE Trans. Autom. Contr., vol. 27, for tracking maneouvering targets, in Proc. Int. Conf. Inf. Fusion,
no. 5, pp. 10541065, 1982. 2007.
RABAOUI et al.: DIRICHLET PROCESS MIXTURES FOR DENSITY ESTIMATION 1655

[38] Y. W. Teh and M. I. Jordan, Hierarchical Bayesian nonparametric Emmanuel Duflos was in the Institut Suprieur
models with applications, in Bayesian Nonparametrics: Principles dElectronique du Nord, Lille, France, from 1987 to
and Practice. Cambridge, U.K.: Cambridge Univ. Press, 2010. 1991, as an undergraduate. He then spent one year at
[39] T. L. Griffiths and Z. Ghahramani, Infinite latent feature models and the University of Paris XI, Orsay, France, to prepare
the Indian buffet process, Adv. Neural Inf. Process. Syst. (NIPS), 2006. a Diplme dEtude Approdondie (DEA) in automatic
[40] D. Blackwell and J. MacQueen, Ferguson distributions via Polya urn control and signal processing. From 1992 to 1995,
schemes, Ann. Statist., vol. 1, no. 2, pp. 353355, 1973.
[41] J. Stthuraman, A constructive definition of Dirichlet priors, Statistica he was with the University of Toulon, where he
Sinica, vol. 4, pp. 639650, 1994. received the Ph.D. degree in signal processing, in
[42] C. E. Antoniak, Mixtures of Dirichlet processes with applications September 1995. In December 2002, he received the
to Bayesian nonparametric problems, Ann. Statist., vol. 2, pp. Habilitation a Diriger des Recherches (HDR), from
11521174, 1974. the Universit des Sciences et Technologies de Lille.
[43] J. S. Liu, Nonparametric hierarchical Bayes via sequential imputa- From September 1995 to August 2003, he was a teacher at the Institut Su-
tions, Ann. Statist., vol. 24, pp. 911930, 1996. prieur dElectronique du Nord where, from September 1999 to August 2003,
[44] S. N. MacEachern and P. Muller, Estimating mixture of Dirichlet he was the Head of the Signal, System and Communication Department. In
process models, J. Computat. Graph. Statist., vol. 7, pp. 223238, September 1999, he joined the Laboratoire dAutomatique et dInformatique
1998. Industrielle de Lille (Laboratory LAIL) as a researcher. The LAIL became the
[45] C. E. Rasmussenan and Z. Ghahramani, The infinite Gaussian mixture Laboratoire dAutomatique, Gnie Informatique et Signal (LAGIS) in January
model, Adv. Neural Inf. Process. Syst. (NIPS), pp. 294300, 2000.
[46] M. West, Hyperparameter estimation in Dirichlet process mixture 2006. In September 2003, he was a Full Professor with the Ecole Centrale de
models Duke Univ., NC, 1992, Tech. Rep.. Lille, where he is the Deputy Director and the Director of Research. He is also
[47] A. Rabaoui, N. Viandier, J. Marais, and E. Duflos, On selecting the the head of the Signal and Image team of LAGIS. In April 2006, he participated
hyperparameters of the DPM models for the density estimation of ob- in the creation of the INRIA project team SequeL devoted sequential learning.
servations errors, in Proc. ICASSP, 2011, pp. 40924095. SequeL was hosted by the INRIA Lille Nord Europe. His main research con-
[48] F. Caron, M. Davy, A. Doucet, E. Duflos, and P. Vanheeghe, Bayesian cerns are Bayesian nonparametric methods for signal processing, multisensor
inference for dynamic models with Dirichlet process mixtures, in probability hypothesis density (PHD) filter for sensor management, particle fil-
Proc. 9th Int. Conf. Inf. Fusion, 2006, pp. 18. tering. and MCMC methods.
[49] A. Belloni and V. Chernozhukov, On the computational complexity Dr. Duflos was a member of many international committees in the field of
of MCMC-based estimators in large samples, Ann. Statist., vol. 37, signal processing and data fusion. He is a member of the Board of Directors at
no. 4, pp. 20112055, 2009. the Ecole Centrale de Lille.
[50] N. Viandier, J. Marais, A. Prestail, and E. De Verdalle, Positioning
urban buses: GNSS performances, in Proc. ITST, 2008, pp. 5155.
[51] F. Caron, M. Davy, and A. Doucet, Generalized Polya urn for time-
varying Dirichlet process mixtures, in Proc. 23rd Conf. Uncertainty
in Artif. Intell. (UAI), 2007. Juliette Marais received the Engineering degree
from the Institut Suprieur dElectronique du Nord
Asma Rabaoui received the engineering degree in (ISEN), Lille, France, in 1998, the Masters degree
2004 and the Ph.D. degree in electrical engineering from the University of Lille in 1999, and the Ph.D.
with emphasis in signal processing in 2008 from degree from the French National Institute for Trans-
The National Engineering School of Tunis [in col- port and Safety Research (INRETS, now IFSTTAR)
laboration with SyNeR team in LAGIS (Laboratoire in 2002.
dAutomatique, Gnie Informatique et Signal in Since 2002, she has been with the French Institute
Lille, France)]. of Science and Technology for transport, devel-
Her research has explored a variety of topics opment, and networks (IFSTTAR) where she is a
mainly in statistical audio signal processing, more researcher. Her research interests focus on satel-
specifically analysis and characterization of im- lite-based navigation systems including performance evaluation, in particular,
pulsive sounds, wavelets, signal classification for transport applications, accuracy enhancement, signal propagation modeling,
surveillance applications using hidden Markov models (HMMs) and support and error estimation.
vector machines (SVMs). She is currently a postdoctoral researcher with the
Laboratoire de lIntgration du Matriau au Systme (IMS) working with the
Signal team, since October 2010. Her current research project (supported by
the TOTAL group) involves applying Markov chain Monte Carlo (MCMC) Philippe Vanheeghe (M92SM97) was born in
methods and sequential Monte Carlo for data estimation in multisensors-based France on July 20, 1956. He received the M.S.
systems. She is also interested in Bayesian approaches to model selection. degree in data processing, the Diploma of Advanced
Prior to that, she was working with INRETS (in collaboration with LAGIS) Study in data processing, the Ph.D. degree, and the
on improving GNSS localization precision in urban canyons using sequential Habilitation Diriger des Recherches (HDR) from the
estimation of signals and nonparametrical Bayesian analysis to deal with University of Lille, France, in 1981, 1982, 1984, and
nonlinear dynamic models. She is applying Dirichlet processes and Gaussian 1996, respectively.
processes to a range of problems including mobile tracking and audio-based He joined the Institut Suprieur dElectronique du
systems. Nord (a private engineering school) as an Assistant
Professor and became a Full Professor with the Ecole
Centrale de Lille in 1999. His research activities are
focused on statistical information processing in the multisensor systems. Partic-
Nicolas Viandier was born in Boulogne sur Mer, ularly in the development of new data analysis algorithms for inference and clus-
France, on November 1, 1982. He received the tering based on nonparametric Bayesian modeling implemented using particle
Research Masters degree in numerical engineering filters and Monte Carlo Markov chains methods, these algorithms can be used
in 2007 from the Universit du Littoral te dOpale, to solve a problem of joint state parameter estimation problem, in this case the
Calais, and the Ph.D. degree in automatic, signal utilization of a Bayesian nonparametric noise model based on Dirichlet process
processing and industrial computing from the Ecole mixture has been used. Research activities are developed within the SequeL
Centrale, Lille, France, in 2011. (acronym for Sequential Learning) of the INRIA Lille-Nord Europe Research
He is currently an engineer of the French MoD. His Center.
Ph.D. degree thesis deals with the modelling and use Dr. Vanheeghe is involved in the Technical Program Committees of con-
of GNSS pseudorange errors in the navigation solu- ferences (recently, FUSION 2011, SPIE 2012) and in review activities for
tion to improve the GNSS accuracy with application the IEEE journals (IEEE TRANSACTIONS ON SIGNAL PROCESSING and IEEE
to the transportation field. TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS).