Dynamical systems are often modeled using a state-space framework, where the phenomenon of interest is viewed as the unknown state of the system, and noisy measurements of the state are available to an observer. The state-space model is often assumed to be linear:

    xt+1 = Ft xt + Gt ut    (1)
    yt = Ht xt + vt    (2)

where t = 0, 1, 2, ... is the time index, xt ∈ R^m is the state vector, yt ∈ R^k is the measurement vector, and ut ∈ R^p and vt ∈ R^q are independent noise vectors. In these equations, Ft, Gt, and Ht are known matrices. The second-order statistics of the noise vectors Ut, Vt and the initial state X0 are known. The well-known Kalman filter recursively produces the so-called linear MMSE estimator¹ of Xt and the linear MMSE predictor of Xt+1 given all the data Y0:t = (Y0, Y1, ..., Yt) available at time t. If the noise sequences and the initial state are Gaussian, these estimates are also MMSE estimates.

In many problems, the noise is non-Gaussian and/or the linear model (1)-(2) is invalid. The performance of the Kalman filter can be disastrous in such cases; for example, the filter may fail to track its target. The extended Kalman filter can cope with some nonlinear models by linearizing the state equation around the predicted state [1]. However, this filter may still perform poorly if the linearization is crude and/or the noise statistics are strongly non-Gaussian. This has motivated the development of algorithms for nonlinear state-space models in which optimal MMSE estimators and predictors are produced by a nonlinear recursive filter [2]. The state-space model is as follows:

    xt+1 = ft(xt, ut)    (3)
    yt = ht(xt, vt)    (4)

where the mappings ft : R^m × R^p → R^m and ht : R^m × R^q → R^k are assumed to be known, as are the pdf's pUt(ut) and pVt(vt) of the noise vectors and the pdf pX0(x0) of the initial state. The noise sequences are also temporally independent.

Estimation problem: estimate Xt given Y0:t. We denote this estimator by X̂t|t.
Prediction problem: predict Xt+1 given Y0:t. We denote this predictor by X̂t+1|t.

In Sec. 1 we derive a recursive Bayesian filter for this nonlinear dynamical system. This solution requires numerical evaluation of integrals, which is done using a sequential importance sampling method (the particle filter, aka the bootstrap filter) [2], as shown in Sec. 2. Nowadays particle filtering methods are widely applied to problems arising in communications, image and video processing, and statistics [3, 4, 5, 6].
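Before turning to the nonlinear setting, it may help to see the recursion that the Kalman filter implements for the linear model (1)-(2). The sketch below is a minimal NumPy illustration, not from the notes; the covariance names Q and R (for the noise vectors ut and vt) and the scalar random-walk demo are our own choices.

```python
import numpy as np

def kalman_step(x_pred, P_pred, y, F, G, H, Q, R):
    """One update/predict cycle for the linear model (1)-(2):
    x_{t+1} = F x_t + G u_t,  y_t = H x_t + v_t,
    with cov(u_t) = Q and cov(v_t) = R (our naming)."""
    # Update: fold the new measurement y_t into the predicted state
    S = H @ P_pred @ H.T + R                  # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)       # Kalman gain
    x_filt = x_pred + K @ (y - H @ x_pred)    # linear MMSE estimate of X_t
    P_filt = P_pred - K @ H @ P_pred
    # Predict: propagate through the state equation to predict X_{t+1}
    x_next = F @ x_filt
    P_next = F @ P_filt @ F.T + G @ Q @ G.T
    return x_filt, P_filt, x_next, P_next

# Scalar random-walk demo: F = G = H = 1, var(u_t) = 0.01, var(v_t) = 1
rng = np.random.default_rng(0)
F = G = H = np.eye(1)
Q, R = 0.01 * np.eye(1), np.eye(1)
x_true = np.zeros(1)
x_pred, P_pred = np.zeros(1), np.eye(1)
err_filt, err_raw = [], []
for t in range(300):
    x_true = F @ x_true + G @ (0.1 * rng.standard_normal(1))
    y = H @ x_true + rng.standard_normal(1)
    x_filt, P_filt, x_pred, P_pred = kalman_step(x_pred, P_pred, y, F, G, H, Q, R)
    err_filt.append(((x_filt - x_true) ** 2).item())
    err_raw.append(((y - x_true) ** 2).item())

mse_filt, mse_raw = np.mean(err_filt), np.mean(err_raw)
```

On this example the filtered MSE falls well below the raw measurement MSE, in line with the steady-state solution of the associated Riccati equation.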

¹That is, the best linear filter in the MMSE sense.

1 Bayesian Recursive Filtering

The Bayesian approach requires a cost function C(x̂t, xt) and the posterior pdf's p(xt|y0:t) and p(xt+1|y0:t). For MMSE estimation, we have C(x̂t, xt) = ‖x̂t − xt‖². The optimal estimator and predictor are given by the conditional means

    X̂t|t = ∫_{R^m} xt p(xt|Y0:t) dxt    (5)
    X̂t+1|t = ∫_{R^m} xt+1 p(xt+1|Y0:t) dxt+1    (6)

The posterior pdf's can in principle be evaluated recursively using the following two-step procedure.

Step 1: Prediction. We can express p(xt+1|y0:t) in terms of p(xt|y0:t):

    p(xt+1|y0:t) = ∫_{R^m} p(xt+1|xt, y0:t) p(xt|y0:t) dxt
                 = ∫_{R^m} p(xt+1|xt) p(xt|y0:t) dxt
                 = ∫_{R^m} [ ∫_{R^p} 1{xt+1 = ft(xt, ut)} p(ut) dut ] p(xt|y0:t) dxt

where in the second line we have used the fact that Y0:t → Xt → Xt+1 forms a Markov chain.

Step 2: Update. Next we express p(xt|y0:t) in terms of p(xt|y0:t−1):

    p(xt|y0:t) = p(xt|yt, y0:t−1)
               = p(yt|xt, y0:t−1) p(xt|y0:t−1) / p(yt|y0:t−1)
               = p(yt|xt, y0:t−1) p(xt|y0:t−1) / ∫_{R^m} p(yt|xt, y0:t−1) p(xt|y0:t−1) dxt
               = p(yt|xt) p(xt|y0:t−1) / ∫_{R^m} p(yt|xt) p(xt|y0:t−1) dxt
               = [ ∫_{R^q} 1{yt = ht(xt, vt)} p(vt) dvt ] p(xt|y0:t−1) / ∫_{R^m} [ ∫_{R^q} 1{yt = ht(xt, vt)} p(vt) dvt ] p(xt|y0:t−1) dxt

where the second line follows from Bayes' rule, and the fourth line from the fact that Y0:t−1 → Xt → Yt forms a Markov chain.

As discussed in the previous lecture, evaluation of such integrals (one for each xt!) using deterministic quadrature methods is generally computationally infeasible due to the curse of dimensionality. The modern approach to recursive MMSE filtering is therefore to use stochastic integration methods.
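To build intuition, the two-step recursion can be carried out by brute force on a grid in one dimension. The sketch below uses a hypothetical linear-Gaussian model of our own choosing, so the transition density p(xt+1|xt) and likelihood p(yt|xt) replace the indicator-function integrals directly:

```python
import numpy as np

# Hypothetical 1-D model for illustration (not from the notes):
#   x_{t+1} = 0.5 x_t + u_t,   y_t = x_t + v_t,   u_t, v_t iid N(0, 1)
grid = np.linspace(-10.0, 10.0, 1001)
dx = grid[1] - grid[0]
gauss = lambda z: np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)

prior = gauss(grid)                 # p(x_0), the "prior" before any data
rng = np.random.default_rng(1)
x_true, means = 0.0, []
for t in range(20):
    # simulate the true system and the measurement y_t
    y = x_true + rng.standard_normal()
    # Step 2 (update): p(x_t|y_{0:t}) ∝ p(y_t|x_t) p(x_t|y_{0:t-1})
    post = gauss(y - grid) * prior
    post /= post.sum() * dx                     # normalize the posterior
    means.append((grid * post).sum() * dx)      # conditional mean (5) by quadrature
    # Step 1 (prediction): p(x_{t+1}|y_{0:t}) = ∫ p(x_{t+1}|x_t) p(x_t|y_{0:t}) dx_t
    trans = gauss(grid[:, None] - 0.5 * grid[None, :])   # p(x'|x) on the grid
    prior = trans @ post * dx
    x_true = 0.5 * x_true + rng.standard_normal()

norm_check = post.sum() * dx
```

Each prediction step costs O(n²) operations for an n-point grid; with an m-dimensional state the grid has n^m points, which is exactly the curse of dimensionality noted above.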

2 Particle Filter

The particle filter, also called bootstrap filter, is a sequential Monte Carlo method which outputs a stochastic approximation to X̂t|t and X̂t+1|t in (5) and (6).

The first idea is to approximate the conditional-mean estimator and predictor of (5) and (6) with sample averages:

    X̃t|t = (1/N) Σ_{i=1}^{N} Xt(i)    (7)
    X̃t+1|t = (1/N) Σ_{i=1}^{N} X*t+1(i)    (8)

where the N samples Xt(i), 1 ≤ i ≤ N, are drawn iid from the posterior pdf p(xt|Y0:t), and similarly, the N samples X*t+1(i), 1 ≤ i ≤ N, are drawn iid from the posterior pdf p(xt+1|Y0:t).

The second idea (to avoid computation of the integrals giving the posterior pdf's from which the samples are drawn) is to reuse the samples Xt(i), 1 ≤ i ≤ N, to generate X*t+1(i), 1 ≤ i ≤ N, and Xt+1(i), 1 ≤ i ≤ N. Hence the algorithm propagates a set of particles over time. This is done as follows.

Step 1: Prediction. Assume we have N iid samples Xt(i), 1 ≤ i ≤ N, drawn from p(xt|Y0:t), and N iid noise samples Ut(i), 1 ≤ i ≤ N, drawn from p(ut). Define

    X*t+1(i) = ft(Xt(i), Ut(i)),    1 ≤ i ≤ N.

By construction, these samples are iid with marginal distribution p(xt+1|Y0:t).

Step 2: Update. Upon receiving a new measurement yt+1, evaluate the importance weights

    qi = p(yt+1|X*t+1(i)) / Σ_{j=1}^{N} p(yt+1|X*t+1(j)),    1 ≤ i ≤ N.

Resample N times from the set {X*t+1(i)} with respective probabilities {qi}, obtaining updated samples {Xt+1(j)} with probabilities Pr[Xt+1(j) = X*t+1(i)] = qi, 1 ≤ i, j ≤ N. By the weighted bootstrap theorem (see appendix), the distribution of the resampled {Xt+1(j)} converges to the desired posterior

    p(xt+1|y0:t+1) ∝ p(yt+1|xt+1) p(xt+1|y0:t)

where the posterior p(xt+1|y0:t+1), the "likelihood function" p(yt+1|xt+1), and the "prior" p(xt+1|y0:t) play the role of h(·), l(·), and g(·) in (9).
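The two steps above translate almost line for line into code. The sketch below is our own minimal NumPy implementation for a hypothetical scalar nonlinear model (the particular ft, ht, noise pdf's, and particle count are illustrative choices, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 2000                                        # number of particles

# Hypothetical nonlinear model (any f_t, h_t as in (3)-(4) would do):
#   x_{t+1} = 0.5 x_t + 5 sin(x_t) + u_t,   y_t = x_t + v_t,   u_t, v_t iid N(0,1)
f = lambda x, u: 0.5 * x + 5.0 * np.sin(x) + u
lik = lambda y, x: np.exp(-0.5 * (y - x) ** 2)  # p(y|x) up to a constant

particles = rng.standard_normal(N)              # iid draws from p(x_0) = N(0, 1)
x_true = 0.0
truths, estimates = [], []
for t in range(50):
    # simulate the true system and the new measurement
    x_true = f(x_true, rng.standard_normal())
    y = x_true + rng.standard_normal()
    # Step 1 (prediction): X*_{t+1}(i) = f_t(X_t(i), U_t(i))
    star = f(particles, rng.standard_normal(N))
    # Step 2 (update): importance weights q_i ∝ p(y_{t+1} | X*_{t+1}(i))
    q = lik(y, star)
    q /= q.sum()
    # resample N times with probabilities {q_i} (the weighted bootstrap)
    particles = rng.choice(star, size=N, p=q)
    truths.append(x_true)
    estimates.append(particles.mean())          # sample-average estimate as in (7)

rmse = np.sqrt(np.mean((np.array(estimates) - np.array(truths)) ** 2))
```

Note that the filter never evaluates the posterior integrals of Sec. 1; the particle cloud itself carries the posterior from one time step to the next.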

Choice of N. The number of particles should be "large enough" so that the stochastic approximations to conditional expectations are accurate, but how large should N be? The biggest problem is the potential weak overlap of the "likelihood" p(yt+1|xt+1) and the "prior" p(xt+1|y0:t), which can cause the same samples to be reused many times, leading to an effective number of samples that is much smaller than N. This problem becomes more significant in high dimensions. Methods have been proposed in the statistics literature that alleviate this problem to some extent.

A Weighted Bootstrap

Given an arbitrary positive function l(x) over R^m and a set of samples {X(i), 1 ≤ i ≤ N} drawn iid from a pdf g(x), we would like to produce iid samples {Z(i), 1 ≤ i ≤ N} from the pdf

    h(x) = g(x) l(x) / ∫ g(x) l(x) dx.    (9)

Define the importance weights

    qi = l(X(i)) / Σ_{j=1}^{N} l(X(j)),    1 ≤ i ≤ N,

and the empirical pdf

    h̃N(x) = Σ_{i=1}^{N} qi δ(x − X(i))

where δ(·) is the Dirac impulse. The weighted bootstrap resamples the set {X(i), 1 ≤ i ≤ N} with respective probabilities {qi}, i.e., for each resample Z we have Pr[Z = X(i)] = qi, 1 ≤ i ≤ N.

Proposition: Z converges in distribution to h.

Proof. For notational simplicity we give the proof only in the 1-D case (m = 1). We have

    Pr[Z ≤ z] = Σ_{i=1}^{N} qi 1{X(i) ≤ z}
              = [(1/N) Σ_{i=1}^{N} l(X(i)) 1{X(i) ≤ z}] / [(1/N) Σ_{i=1}^{N} l(X(i))]
              = N(z) / N(∞).

Since the X(i) are drawn iid from g(x), it follows from the weak law of large numbers that N(z) above converges in probability to

    Eg[l(X) 1{X ≤ z}] = ∫_{−∞}^{z} g(x) l(x) dx.

Likewise, the denominator N(∞) converges in probability to ∫_{−∞}^{∞} g(x) l(x) dx. Hence

    lim_{N→∞} Pr[Z ≤ z] = ∫_{−∞}^{z} g(x) l(x) dx / ∫_{−∞}^{∞} g(x) l(x) dx = ∫_{−∞}^{z} h(x) dx,

which proves the claim. □

References

[1] B. D. O. Anderson and J. B. Moore, Optimal Filtering, Prentice-Hall, NJ, 1979.

[2] N. J. Gordon, D. J. Salmond, and A. F. M. Smith, "Novel Approach to Nonlinear and Non-Gaussian Bayesian State Estimation," IEE Proceedings-F, Vol. 140, No. 2, pp. 107-113, Apr. 1993.

[3] J. S. Liu and R. Chen, "Sequential Monte Carlo methods for Dynamical Systems," J. Amer. Stat. Assoc., Vol. 93, pp. 1032-1044, Sep. 1998.

[4] A. Doucet, J. F. G. de Freitas, and N. J. Gordon, Eds., Sequential Monte Carlo Methods in Practice, Springer, NY, 2001.

[5] M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, "A Tutorial on Particle Filters for Online Nonlinear/Non-Gaussian Bayesian Tracking," IEEE Trans. Signal Processing, Vol. 50, No. 2, pp. 174-188, Feb. 2002.

[6] P. M. Djuric, J. H. Kotecha, J. Zhang, Y. Huang, T. Ghirmai, M. F. Bugallo, and J. Miguez, "Applications of particle filtering to selected problems in communications: A review and new developments," IEEE Signal Processing Magazine, Vol. 20, No. 5, pp. 19-38, Sep. 2003.

[7] A. F. M. Smith and A. E. Gelfand, "Bayesian Statistics Without Tears: A Sampling-Resampling Perspective," The American Statistician, Vol. 46, pp. 84-88, May 1992.
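The weighted-bootstrap proposition of the appendix is easy to check numerically. In the sketch below (an example of our own construction), g is a standard normal and l(x) = exp(−(x − 2)²/2), so by Gaussian conjugacy the target h in (9) is N(1, 1/2); the resampled set should match that mean and variance:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 200_000

X = rng.standard_normal(N)              # X(i) drawn iid from g = N(0, 1)
l = np.exp(-0.5 * (X - 2.0) ** 2)       # positive weighting function l(x)
q = l / l.sum()                         # importance weights q_i as in the appendix
Z = rng.choice(X, size=N, p=q)          # weighted bootstrap: Pr[Z = X(i)] = q_i

# g(x) l(x) ∝ exp(-(x-1)^2), so h = N(1, 1/2): mean 1, variance 1/2
mean_Z, var_Z = Z.mean(), Z.var()
```

With N this large the resampled mean and variance land close to the N(1, 1/2) values; shrinking N, or moving the peak of l far into the tail of g, shows the weight-degeneracy problem discussed under "Choice of N" above.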
