Inference and Estimation in Probabilistic Time-Series Models

1.1 Time-series

1.1.1 Inference and Estimation

1.2 Markov models

1.2.1 Discrete state Markov models

1.2.2 Autoregressive (AR) models

1.3 Latent Markov Models

1.3.1 Discrete state latent Markov model

1.3.2 Continuous state latent Markov models

1.4 Inference in latent Markov models

1.4.1 Filtering $p(x_t|y_{1:t})$

1.4.2 Smoothing $p(x_{1:T}|y_{1:T})$

1.4.3 Prediction $p(y_{t+1}|y_{1:t})$

1.4.4 Interpolation

1.4.5 Most likely joint path

1.4.6 Inference in Linear Dynamical Systems

1.4.7 Non-linear latent Markov Models

1.5 Deterministic approximate inference

1.5.1 Variational Bayes

1.5.2 Assumed density filtering in latent Markov models

1.5.3 Expectation propagation

1.6 Simulation-Based Inference

1.6.1 Markov Chain Monte Carlo

1.6.2 Sequential Monte Carlo

1.6.3 Sequential Importance Sampling and Particle Filtering

1.7 Multi-Object Tracking and the PHD Filter

1.7.1 Poisson Point Processes

1.8 Discussion and Summary

Monte Carlo

Adaptive Markov Chain Monte Carlo: Theory and Methods

2.1 Introduction

2.2 Adaptive MCMC Algorithms

2.2.1 Internal adaptive algorithms

2.2.2 External adaptive algorithms

2.3 Convergence of the marginal distribution

2.3.1 Main result

2.4 Strong law of large numbers

2.5 Convergence of the Equi-Energy sampler

2.5.1 Convergence of the marginal

2.6 Conclusion

2.7 Proofs

2.7.1 Proofs for Sections 2.3 and 2.4

2.7.2 Proofs for Section 2.5

Recent Developments in Auxiliary Particle Filtering

3.1 Background

3.1.1 State-Space Models

3.1.2 Particle Filtering

3.1.3 Sequential Importance Resampling

3.1.4 Auxiliary Particle Filters

3.2 Interpretation and Implementation

3.2.1 The APF as SIR

3.2.2 Implications for Implementation

3.2.3 Other Interpretations and Developments

3.3 Applications and Extensions

3.3.1 Marginal Particle Filters

3.3.2 Sequential Monte Carlo Samplers

3.3.3 The Probability Hypothesis Density Filter

3.4 Further Stratifying the APF

3.4.1 Reduction in Conditional Variance

3.4.2 Application to Switching State-Space Models

3.5 Conclusions

Monte Carlo probabilistic inference for diffusion processes: a methodological framework

4.1 Summary

4.2 Introduction

4.3 Random weight continuous-discrete particle filtering

4.4 Transition density representation for a class of diffusions

4.5 Exact simulation of diffusions

4.6 Exact simulation of killed Brownian motion

4.7 Unbiased estimation of the transition density using series expansions

4.7.2 Unbiased truncation of infinite series

4.7.4 Simulation from probability measures on unions of spaces

4.7.5 Monte Carlo for integral equations

4.7.6 Illustrative example: the CIR density

4.8 Discussion and directions

Two problems with variational expectation maximisation for time-series models

5.1 Introduction

5.2 The Variational Approach

5.2.1 A motivating example

5.2.2 Chapter Organisation

5.3 Compactness of variational approximations

5.3.1 Approximating mixtures of Gaussians with a single Gaussian

5.3.2 Approximating a correlated Gaussian with a factored Gaussian

5.3.3 Variational approximations do not propagate uncertainty

5.4 Variational Approximations are Biased

5.4.1 Deriving the learning algorithms

5.4.2 General properties of the bounds: A sanity check

5.4.3 Learning the dynamical parameter, λ

5.4.5 Learning the magnitude and direction of one emission weight

5.4.6 Characterising the space of solutions

5.4.7 Simultaneous learning of pairs of parameters

5.4.8 Discussion of the scope of the results

5.5 Conclusion

Approximate inference for continuous-time Markov processes

6.1 Introduction

6.2 Partly observed diffusion processes

6.3 Hidden Markov characterisation

6.3.1 Example

6.4 The Variational Approximation

6.4.1 The variational approximation in Machine Learning

6.4.2 The variational approximation for Markov processes

6.4.3 The variational problem revisited

6.5 The Gaussian Variational Approximation

6.8 Discussion and outlook

Expectation propagation and generalized EP methods for inference in switching Kalman filter models

7.1 Introduction

7.2 Notation and problem description

7.3 Assumed density filtering

7.3.1 Local approximations

7.3.2 The sum-product algorithm

7.4 Expectation propagation

7.4.1 Backward pass

7.4.2 Iteration

7.4.3 Supportiveness

7.5 Free energy minimization

7.6 Generalized expectation propagation

7.7 Alternative backward passes

7.7.1 Approximated backward messages

7.7.2 Partial smoothing

7.8 Experiments

7.8.1 Comparisons with exact posteriors

7.8.2 Comparisons with Gibbs sampling

7.8.3 Effect of Larger Outer Clusters

7.9 Discussion

7.A Operations on Conditional Gaussian potentials

7.B Proof of Theorem 2

Approximate inference in switching linear dynamical systems using Gaussian mixtures

8.1 Introduction

8.2 The Switching LDS

8.2.1 Exact inference is computationally intractable

8.3 Gaussian Sum Filtering

8.3.1 Continuous filtering

8.3.2 Discrete filtering

8.3.3 The likelihood $p(v_{1:T})$

8.3.4 Collapsing Gaussians

8.3.5 Relation to other methods

8.4 Gaussian Sum Smoothing

8.7 Summary

Changepoint Models

Analysis of Changepoint Models

9.1 Introduction

9.1.1 Model and Notation

9.1.2 Example: Piecewise Linear Regression

9.2 Single Changepoint Models

9.2.1 Likelihood-ratio based approach

9.2.2 Penalised likelihood approaches

9.2.3 Bayesian Methods

9.3 Multiple Changepoint Models

9.3.1 Binary Segmentation

9.3.2 Segment Neighbourhood Search

9.3.3 Minimum Description Length

9.3.4 Bayesian Methods

9.4 Comparison of Methods

9.4.1 Single Changepoint Model

9.4.2 Multiple Changepoint Model

9.5 Conclusion

Multi-Object Models

Approximate likelihood estimation of static parameters in multi-target tracking models

10.1 Introduction

10.2 The Multi-target Model

10.3 A Review of the PHD Filter

10.3.1 Inference for Partially Observed Poisson Processes

10.3.2 The PHD Filter

10.4 Approximating the Marginal Likelihood

10.5 SMC approximation of the PHD filter and its gradient

10.6 Parameter Estimation

10.6.1 Pointwise Gradient Approximation

10.6.2 Simultaneous Perturbation Stochastic Approximation (SPSA)

10.7 Simulation Study

10.7.1 Model

10.7.2 Pointwise Gradient Approximation

10.7.3 SPSA

10.8 Conclusion

Sequential Inference for Dynamically Evolving Groups of Objects

11.1 Introduction

11.2 MCMC-Particles Algorithm

11.2.1 Sequential MCMC

11.2.2 Outline of Algorithm

11.2.3 Toy Example

11.2.4 Inference Algorithm

11.2.5 Comparison with SIR Particle Filter

11.2.6 Comparison with Other Algorithms

11.3 Group Tracking

11.4 Ground Target Tracking

11.4.1 Basic Group Dynamical Model

11.4.2 Bayesian Model for Group Target Tracking

11.4.3 State Dependent Group Structure Transition Model

11.4.4 Observation Model

11.4.5 Simulation Results

11.5 Group Stock Selection

11.5.1 Group Stock Mean Reversion Model

11.5.2 State Independent Group Structure Model

11.5.3 Bayesian Model for Group Structure Analysis

11.5.4 Simulation Results

11.6 Conclusions

11.7 Appendix: Base Group Representation

Non-commutative harmonic analysis in multi-object tracking

12.1 Introduction

12.1.1 Related work

12.2 Harmonic analysis on finite groups

12.2.1 The symmetric group

12.3 Band-limited approximations

12.4 A Hidden Markov Model in Fourier space

12.4.1 A random walk on $S_n$

12.4.2 Relabeling invariance

12.4.3 Walks generated by transpositions

12.8 Acknowledgments

Physiological monitoring with factorial switching linear dynamical systems

13.1 Introduction

13.2 Model

13.3 Novel conditions

13.3.1 The X-factor

13.4 Parameter estimation

13.4.1 Learning normal dynamics: heart rate

13.4.2 Learning artifactual dynamics: blood sampling

13.4.3 Learning physiological dynamics: bradycardia

13.4.4 Learning the novelty threshold

13.4.5 Learning the factorial model

13.5 Inference

13.5.1 Filtering

13.5.2 Smoothing

13.5.3 Handling missing observations

13.5.4 Constraining switch transitions

13.6 Experiments

13.6.1 Evaluation of known factors

13.6.2 Inference of novel dynamics

13.7 Summary

Non-parametric Models

Markov chain Monte Carlo algorithms for Gaussian processes

14.1 Introduction

14.2 Gaussian process models

14.3 Non-Gaussian likelihoods and deterministic methods

14.4 Sampling algorithms for Gaussian Process models

14.4.1 Gibbs sampling and independent Metropolis-Hastings

14.4.2 Sampling using local regions

14.4.3 Sampling using control variables

14.4.4 Selection of the control variables

14.4.5 Sampling the hyperparameters

14.5 Related work and other sampling schemes

14.6 Demonstration on regression and classification

14.7 Transcriptional regulation

14.8 Dealing with large datasets

14.9 Discussion

Nonparametric Hidden Markov Models

15.1 Introduction

15.2 From HMMs to Bayesian HMMs

15.4.2 The Beam Sampler

15.5 Example: Unsupervised Part-of-Speech Tagging

15.6 Beyond the iHMM

15.6.1 The Input-Output iHMM

15.6.2 The Sticky and Block-Diagonal iHMM

15.6.3 iHMM with Pitman-Yor Base Distribution

15.7 Conclusions

Bayesian Gaussian process models for multi-sensor time-series prediction

16.1 Introduction

16.2 The Information Processing Problem

16.3 Gaussian Processes

16.3.1 Covariance Functions

16.3.2 Marginalisation

16.3.3 Censored Observations

16.3.4 Efficient Implementation

16.3.5 Active Data Selection

16.4 Trial Implementation

16.4.1 Bramblemet

16.4.2 Wannengrat

16.5 Empirical Evaluation

16.5.1 Regression and Prediction

16.5.2 Censored Observations

16.5.3 Active Data Selection

16.6 Computation Time

16.7 Related Work

16.8 Conclusions

16.A Appendix

16.A.1 Cholesky Factor Update

16.A.2 Data Term Update

16.A.3 Cholesky Factor Downdate

Agent-Based Models

Optimal control theory and the linear Bellman equation

17.1 Introduction

17.2 Discrete time control

17.3 Continuous time control

17.3.1 The HJB equation

17.3.2 Example: Mass on a spring

17.3.3 Pontryagin minimum principle

17.3.4 Mass on a spring revisited

17.3.5 Comments

17.4 Stochastic optimal control

17.4.1 Stochastic differential equations

17.4.2 Stochastic optimal control theory

17.4.3 Linear quadratic control

17.4.4 Example of LQ control

17.5 Learning

17.5.1 Inference and control

17.5.2 Certainty equivalence

17.6 Path integral control

17.6.1 Path integral control

17.6.2 The diffusion process as a path integral

17.7 Approximate inference methods for control

17.7.1 MC sampling

17.7.2 The variational method

17.8 Discussion

Expectation maximisation methods for solving (PO)MDPs and optimal control problems

18.1 Markov Decision Processes and likelihood maximization

18.2.1 Single variable case

18.2.2 Explicit E-step algorithms

18.2.3 Structured DBN case

18.3 Application to MDPs

18.3.1 Expectation-Maximization with a tabular policy

18.3.2 Relation to Policy Iteration and Value Iteration

18.3.3 Discrete maze examples

18.3.4 Stochastic optimal control

18.4 Application to POMDPs

18.4.1 POMDP experiments

18.5 Conclusion

18.5.1 Follow-up and related work

18.A Remarks

18.B Pruning computations