You are on page 1of 38

Bayesian Multi-object Tracking:

Probability Hypothesis Density Filter & Beyond

Ba-Ngu Vo http://ba-ngu.vo-au.com

Dept. Electrical & Computer Engineering


Curtin University
Perth, Australia

ISPL
Intelligent Sensing & Perception Lab
https://ispl-research.com/ IEEE Signal Processing Society, 2022
Introduction

N. Wiener (1894-1964) A. Kolmogorov (1903-1987) R. Kalman (1930-2016)

1940’s: Wiener filter - Pioneering work by Wiener, Kolmogorov

1950’s-1960’s: Kalman filter - Bode & Shannon, Zadeh & Ragazzini, Levinson,
Swerling, Stratonovich, Kalman ... Schmidt’s Implementation – Apollo program

1990’s--: Particle Filter for non-linear filtering - Gordon, Salmond & Smith

1990’s--: Multi-Object Filter - Pioneering work by Mahler


Probability Hypothesis Density (PHD), Cardinalized PHD filters …

Labeled Random Finite Sets: Generalized Labeled Multi-Bernoulli filters


Outline

State Estimation
Multi-object State Space Model
PHD filter
Beyond PHD filtering
Labelled Random Finite Set
Generalized Labelled Multi-Bernoulli Filter
Some Demonstrations
State Estimation
State Space Model of a Dynamical System
observation space ℤ
zk

zk-1
observation vector
state space 𝕏
State Estimation
state transition
xk System ID
xk-1 state vector Control

State equation xk = F(xk-1, uk-1, nk) fk|k-1(xk| xk-1, uk-1)

Markov transition density


control noise
Observation likelihood

Observation equation zk = Y(xk, uk, vk) gk(zk| xk, uk-1)


State Estimation

𝑥0 𝑥1 𝑥2 𝑥3 𝑥4 𝑥𝑘
State space

state trajectory = history of states


x0:k = [x0 ,…, xk]

0 1 2 3 4 … k time

State Estimation: estimate state trajectory


Online Operation: need fixed complexity per time step to be useful in practice
Filtering: x0,…, xk, suitable for online – Kalman & particle filters
Smoothing: x0:k , not suitable for online – Kalman & particle smoothers
S. Sarkka, Bayesian filtering and smoothing, Cambridge University Press, 2013
State Estimation
gk(zk| xk) fk|k-1(xk| xk-1) p0:k-1(x0:k-1 |z1:k-1)

smoothing-while-filtering  gk(zk| xk) fk|k-1(xk| xk-1)p0:k-1(x0:k-1|z1:k-1)dx0:k


Bayes smoother

 p0:k-1(x0:k-1 |z1:k-1) p0:k(x0:k| z1:k) 

Complexity per time step increases with time - not suitable for online
Smooth over a fix-length moving window - fixed complexity per time step

 pk-1(xk-1| z1:k-1) fk|k-1(xk| xk-1) dxk-1 gk(zk| xk) pk|k-1(xk| z1:k-1)

 gk(zk| xk)pk-1(xk-1| z1:k-1)dxk


Bayes filter
prediction data-update
 pk-1(xk-1 |z1:k-1) pk|k-1(xk| z1:k-1) pk(xk| z1:k) 

Fixed complexity per time step - suitable for online, widely used
Multi-Object System
What if the state & observation are not vectors but finite sets?

Particles Molecules Cells

People Orbital Debris Astronomy


Multi-Object System
Multi-object state: Finite set of vectors (aka point pattern)

observation space ℤ
observations of the
states

state space 𝕏

state transition

Xk
3 vectors
5 vectors
Xk-1

Fundamental difference from classical dynamical system:


▪ Random time-varying number of states and measurements
▪ False negatives, false positives, association uncertainty
▪ Much more challenging!
Multi-Object System
Multi-object system: Finite-set-valued dynamical system

observations multi-object observation


ℤ F(ℤ)

Z
states multi-object state
𝕏 F(𝕏)
XX
Multi-object Tracking: State Estimation on the space F(𝕏) of finite sets of 𝕏

Theoretical considerations
▪ Can’t treat a (random) finite set as if it were a (random) vector
▪ PDF of random finite set is not the same as PDF of random vector
▪ Mahler’s Finite Set Statistics (FISST) provides a suitable notion of density & integration

R. Mahler, Statistical Multisource-Multitarget Information Fusion, Artech House, 2007.


R. Mahler, Advances in Statistical Multisource-Multitarget Information Fusion, Artech House, 2014.
Multi-Object System
Simple Multi-Object State Transition Model

motion State transition


equation
x x’
death Xk = Sk|k-1(Xk-1)Bk|k-1(Xk-1)k

x
X’
spawn
fk|k-1(Xk |Xk-1 )
x
X’ State transition
creation kernel

Transition for each element x of a given multi-object state Xk-1

R. Mahler, Statistical Multisource-Multitarget Information Fusion, ArtechHouse, 2007.


R. Mahler, Advances in Statistical Multisource-Multitarget Information Fusion, ArtechHouse, 2014.
Multi-Object State Space Model
Simple Multi-Object Observation Model

Observation equation
likelihood

x z
Zk = Qk(Xk) Kk

misdetection

x
gk(Zk|Xk)
clutter
 Observation likelihood

state space observation space

Observation for each element x of a given multi-object state Xk

R. Mahler, Statistical Multisource-Multitarget Information Fusion, ArtechHouse, 2007.


R. Mahler, Advances in Statistical Multisource-Multitarget Information Fusion, ArtechHouse, 2014.
Probability Hypothesis Density Filter
Multi-Object Estimation: Multi-object Bayes filter
prediction
 pk-1(Xk-1|Z1:k-1) pk|k-1(Xk|Z1:k-1) data-update pk(Xk|Z1:k) 

fast growing sum

f k |k −1 ( X k | X ) pk −1|k −1 ( X | Z1:k −1 ) X g k ( Z k | X k ) pk |k −1 ( X k | Z1:k −1 )


g k ( Z k | X ) pk |k −1 ( X | Z1:k −1 ) X

Computationally expensive (exponential complexity) … approximation needed!

state of system: single-object first-moment filter


Single-object random vector Bayes filter (e.g. a-b-g filter)

state of system: multi-object first-moment filter


Multi-object random set Bayes filter (“PHD” filter)
Probability Hypothesis Density Filter
PHD = instantaneous expected no. objects per unit (hyper) volume

PHD (intensity function) of an RFS


PHD peaks: highest expected No. points per unit volume,
most likely locations of objects
v(x0) = density/concentration of
expected No. objects at x0

PHD estimator: find “best” cardinality estimate n*,


locate n* highest peaks.

Sv(x)dx = expected No. objects in S

x0 state space 𝕏
S
Probability Hypothesis Density Filter
PHD filter: Propagates PHD forward in time

state space

vk Estimated no. objects = PHD mass


vk-1
Estimated states = location of PHD peaks

PHD PHD
 vk-1(xk-1|Z1:k-1) prediction
vk|k-1(xk|Z1:k-1) update
vk(xk|Z1:k) 

R. Mahler, “Multitarget Bayes filtering via first-order multitarget moments,” IEEE Trans. AES, 39(4):1152–1178, 2003

▪ Approximate by a Poisson RFS density that minimizes KL-div from multi-object filtering density

▪ SMC & Gaussian mixture implementations (linear in no. detections)


Vo et. al., "SMC methods for Bayesian Multi-target filtering with RFSs," IEEE Trans. AES, 41(4): 1224-1245, 2005
Sidenbladh, "Multi-target particle filtering for the probability hypothesis density," Int. Conf. Information Fusion, 2003.
Zajic & Mahler, “Particle-systems implementation of the PHD multitarget tracking filter,” SPIE 5096:291-299, 2003
Vo & Ma, "The Gaussian mixture PHD Filter," IEEE Trans. SP, 54(11):4091-4104, 2006
Probability Hypothesis Density Filter
Cardinalized PHD filter: Propagates PHD & cardinality distribution

 k-1(n|Z1:k-1) cardinality


prediction
k|k-1(n|Z1:k-1) cardinality
update
k(n|Z1:k) 

PHD PHD
 vk-1(xk-1|Z1:k-1) prediction vk|k-1(xk|Z1:k-1) update
vk(xk|Z1:k) 

R. Mahler, “PHD filters of higher order in target number,” IEEE Trans. AES,43(4):1523–1543, 2007

k-1 k Estimated no. objects = argmax k


Estimated states = location of PHD peaks

Better cardinality estimates

vk-1 vk

▪ Approximate by an IID cluster density that minimizes KL-div from multi-object filtering density
▪ Gaussian mixture implementation (cubic in no. detections)
Vo et. al. “Analytic implementations of the Cardinalized PHD Filter," IEEE Trans. SP, 55(7-II): 3553-3567, 2007
Beyond PHD Filtering
Recall: Multi-object Bayes filter
prediction
 pk-1(Xk-1|Z1:k-1) pk|k-1(Xk|Z1:k-1) data-update pk(Xk|Z1:k) 

fast growing sum

f k |k −1 ( X k | X ) pk −1|k −1 ( X | Z1:k −1 ) X g k ( Z k | X k ) pk |k −1 ( X k | Z1:k −1 )


g k ( Z k | X ) pk |k −1 ( X | Z1:k −1 ) X

Computationally expensive! (exponential complexity)

2 Key Considerations for Multi-object Tracking (MOT)

Implementations Trajectory Estimation


PHD/CPHD: parametric approximations PHD/CPHD: multi-object localization filters
Approximation by truncating filtering density? How to estimate trajectories?
Beyond PHD Filtering
Truncation of Multi-object Filtering Density

▪ No guarantee that the truncated sum is a function of sets

e.g. measurement ={z1}, prior = p({x1, x2}), detection probability: PD(x1) = 0.9, PD(x2) = 0.1.

𝑝({𝑥1 , 𝑥2 }|{𝑧1 }) ∝ 92𝑔(𝑧1 |𝑥1 )𝑝({𝑥1 , 𝑥2 }) + 𝑔(𝑧1 |𝑥2 )𝑝({𝑥1 , 𝑥2 }) + 9𝜅(𝑧1 )𝑝({𝑥1 , 𝑥2 })
filtering density x1 detected & x2 missed x1 missed & x2 detected x1 & x2 missed

Suppose we truncate the last 2 terms: 𝑓(𝑥1 , 𝑥2 |{𝑧1 }) ≜ 𝑔(𝑧1 |𝑥1 )𝑝({𝑥1 , 𝑥2 })
𝑓(𝑎, 𝑏|{𝑧1 }) = 𝑔(𝑧1 |𝑎)𝑝({𝑎, 𝑏}) ≠ 𝑓(𝑏, 𝑎|{𝑧1 }) = 𝑔(𝑧1 |𝑏)𝑝({𝑎, 𝑏})

f is not symmetric in its arguments, i.e. not a function of the set {a,b}

▪ Even if the truncated sum is a function of sets, what’s the approximation error?
Beyond PHD Filtering
𝑋0 𝑋1 𝑋2 𝑋3 𝑋4 𝑋𝑘
State space

multi-object state history

𝑋0:𝑘
?

0 1 2 3 4 … k Unlike single-object system


Trajectory ≠ State History
State space

multi-object trajectory

trajectories
0 1 2 3 4 … k
Labeled Random Finite Set
multi-object trajectory = history of labeled multi-object states
X1 X4 Xk
State space

X 0:k

labeled
trajectories

0 1 2 3 4 … k time

multi-object state X = {x1 xj  xn} Label


distinct from other objects
labeled state vector (xj , lj) fixed for the life of the object

labeled multi-object state X = {x1 xj  xn} multi-object state augmented with distinct labels

Goodman et. al. Mathematics of Data Fusion. probability density of point processes, 1997
Mahler, Statistical Multisource-Multitarget Information Fusion, Artech House, 2007
Labeled Random Finite Set
multi-object trajectory = history of labeled multi-object states
X1 X4 Xk
State space

X 0:k

labeled
trajectories

0 1 2 3 4 … k time

Provides multi-object trajectory estimate (even from a single scan)


▪ filtering with labeled multi-object states: X0,…, Xk
▪ smoothing with labeled multi-object states: X0:k
Provides lineage information for modelling/estimation with spawning, division/splitting
Bryant, et. al. "A GLMB Filter with Object Spawning," IEEE Trans. SP, 66(23):6177-6189, 2018
Nguyen, et. al. "Tracking Cells and their Lineages via Labeled RFSs," IEEE Trans. SP, 69:6177-6189, 2021
Labeled Random Finite Set

Closure under truncation?

Standard multi-object likelihood


𝐗
(𝜃(ℒ(⋅))
𝑔𝑘 (𝑍|𝐗) ∝ ෍ Ψ𝑍,𝑘 (⋅) single-object measurement likelihood
𝜃
 PD ,k (x) g k ( z j | x)
set of labels of X  , j0
Y{ z1 ,..., zm },k (x) = 
( j)
 k ( z j )
1 − P (x), j=0
association map 𝜃: ℒ(𝐗)) → {0,1, . . . , |𝑍|}  D ,k
1-1 when positive:  (i ) =  (i' )  0  i = i'
at most 1 measurement index per label detection probability clutter intensity

▪ Each term of the multi-object likelihood is symmetric


▪ Truncated labeled posterior/filtering density is a function of sets

Labeled RFS - 2 birds with one stone: provides trajectories & closure under truncation
Vo & Vo "Labeled RFSs and Multi-Object Conjugate Priors," IEEE Trans. SP, 61(13): 3460-3475, 2013
Labeled Random Finite Set
What about set of unlabeled trajectories?
𝑘
Trajectory space 𝕋 = ⨄𝑡=1 ℕ × 𝕏𝑡 , unlabeled multi-object trajectory = finite subset of 𝕋
e.g. a
3 trajectory
distinct elements from the trajectory space
1 1 1 k = 1, 3 objects, multi-object state = 1.0 Physically nonsensical:
,
1.0 1.0 , 1.0 counts different segments of 1 trajectory, as separate trajectories
2.0 2.0 k = 2, 2 objects, multi-object state = 2.0
inconsistent cardinalities with multi-object state history
3.0 k = 3, 1 object, multi-object state = 3.0

Theoretical Flaw: multi-object state contains repeated points ⇒ no multi-object density


D. J. Daley and D. Vere-Jones, An introduction to the theory of point processes, Springer, 1988.
Goodman et. al., Mathematics of Data Fusion, Kluwer Academic Publisher, 1997, Prop. 19, pp. 159.

Numerically Infeasible:
▪ Need to estimate trajectories over entire duration, complexity per time step explodes with time
▪ Can’t estimate trajectories over shorter window – no means to connect trajectories from different windows
▪ Necessary to truncate posterior, but no guarantee truncated “density” is a function of sets
▪ PHD approximation? complexity per time step still grows with time – won’t work in real applications.
Labeled Random Finite Set
Recall: PHD = instantaneous expected no. objects per unit (hyper) volume
PHD (intensity function) of an RFS
PHD peaks: highest expected No. points per unit volume,
most likely locations of objects

PHD estimator: find “best” cardinality estimate n*;


locate n* highest peaks.

state space 𝕏
𝑘
▪ “PHD” on trajectory space 𝕋 = ⨄𝑡=1 ℕ × 𝕏𝑡 has different units of measurements for different 𝑘
▪ How can we locate the n* highest peaks if they have different units?
Generalized Labeled Multi-Bernoulli (GLMB) Filter
GLMB filter: Multi-object Analogue of Kalman Smoother/Filter
Closure under Bayes recursion - Analytic solutions to:
▪ Multi-object Bayes Posterior Recursion (Smoothing-while-filtering)

estimated labeled multi-object state history X0:k


Vo & Vo “A Multi-Scan Labeled RFS Model for Multi-object State Estimation,” IEEE Trans. SP, 67(19):4948-4963, 2019

▪ Multi-object Bayes Filtering Recursion

estimated labeled multi-object states X0,…,Xk forms a set of tracks


Vo & Vo "Labeled RFSs and Multi-Object Conjugate Priors," IEEE Trans. SP, 61(13): 3460-3475, 2013
Generalized Labeled Multi-Bernoulli (GLMB) Filter

estimated trajectories on a 64km by 32km area


M. Beard et. al. "A Solution for Large-scale Multi-object Tracking," IEEE Trans. SP, 68:2754–2769, 2020.
Generalized Labeled Multi-Bernoulli (GLMB) Filter

Total no. objects tracked is not indicative of problem size


▪ easy to track arbitrarily large no. in scenarios with 1-2 objects per frame
Problem size is better reflected in the no. objects/observations per frame

Scenario
▪ surveillance region: 64km by 36km
▪ objects can appear anywhere, appearing rate ~ 200-3000/frame (unknown to filter)
▪ mean clutter density 200 km-2 (460800 per frame) uniform,
▪ detection probability 0.88
▪ linear Gaussian state & observation noise sigma of 0.2ms-2 & 5m

M. Beard et. al. "A Solution for Large-scale Multi-object Tracking," IEEE Trans. SP, 68:2754–2769, 2020.
Generalized Labeled Multi-Bernoulli (GLMB) Filter

▪ Over 1 million objects per frame


▪ Peak cardinality: 1,217,531 objects per frame (at time 700)
▪ Peak object density: 520 km-2
▪ Duration 1000 instances: ~ 1 billion data points

▪ OSPA(2): metric for sets of tracks,


▪ Generalization of OSPA that accounts for:
• Localization & Cardinality error;
• Track fragmentation;
• ID switches
Beard et. al. "A Solution for Large-scale MOT," IEEE Trans. SP, 68:2754–2769, 2020.
Generalized Labeled Multi-Bernoulli (GLMB) Filter
GLMB density - multi-object analogue of exponential mixture:

labels of X  p  ( x)
xX
( )

distinct label indicator associations history multi-object exponential

▪ Weight Normalization

▪ Cardinality Distribution & PHD

B.-T. Vo, et. al. "Labeled Random Finite Sets and Multi-Object Conjugate Priors," IEEE Trans. SP, 61(13): 3460-3475, 2013.
Generalized Labeled Multi-Bernoulli (GLMB) Filter
Labeled Multi-Object Density Approximations

Truncation of GLMB
▪ Any truncation of a GLMB is a GLMB – closure under truncation
▪ L1-norm of truncation error can be computed analytically
▪ Minimum L1-norm truncation error: truncate smallest weights
Vo et. al. "Labeled RFS and the Bayes Multi-Target Tracking Filter," IEEE Trans. SP, 62(24):6554-6567, 2014

Approximation of GLMB by a 1-term GLMB (or LMB) with


▪ Same PHD & same Cardinality distribution
Reuter et. al. "The labelled multi-Bernoulli filter," IEEE Trans. SP, 62(12):3246-3260, 2014

Approximation of labeled multi-object density by a GLMB with:


▪ Same PHD & same Cardinality distribution
▪ Minimal Kullback-Leibler divergence
Papi et. al., “GLMB approximation of Multi-object densities,” IEEE Trans. SP, 63(20):5487-5497, 2015
Generalized Labeled Multi-Bernoulli (GLMB) Filter
GLMB Filtering/Posterior sum grows in no. terms: requires reduction of terms
Truncate terms with smallest weights ⇒ minimum L1-norm truncation error
Implementation: How to truncate without exhaustive enumeration of the terms?

Filter implementation:
▪ K-shortest path prediction & ranked assignment update (cubic in no. detections)
▪ Gibbs sampling + Joint Prediction & Update (linear in no. detections)
Vo et. al. "Labeled RFS and the Bayes Multi-Target Tracking Filter," IEEE Trans. SP, 62(24):6554-6567, 2014.
Vo et. al. “An Efficient Implementation of the GLMB Filter,” IEEE Trans. SP, 65(8):1975-1987, 2017.

Multi-sensor filter implementation:


▪ Multi-dimensional assignment (NP-Hard): Gibbs sampling (linear in total no. of detections)
Vo et. al. “Multi-sensor multi-object tracking with the GLMB filter,” IEEE Trans. SP, 67 (23):5952-5967, 2019.

Smoother implementation:
▪ Multi-dimensional assignment (NP-Hard): Gibbs sampling (linear in total no. of time steps)
▪ On-line: smooth over fixed-length window and link trajectories with the same labels
Vo & Vo “A Multi-Scan Labeled RFS Model for Multi-object State Estimation,” IEEE Trans. SP, 67(19):4948-4963, 2019
Demonstration: Multi-object Smoothing
Cell-microscopy – posterior multi-scan statistics
▪ Duration 1000mins, sampling period D=10mins
▪ Clutter 0.3/scan, Scenario:
▪ Detection probability 0.33 (very low) ▪ min 001: 4 cells appear, live for 100mins
▪ Observation noise sigma 0.3mm ▪ min 200: 4 cells appear, live for 200mins
▪ Dynamic noise sigma 0.01mm/D2 ▪ min 500: 4 cells appear, live for 400mins

Statistics on births/deaths, cell-life, migration pattern

Vo & Vo “A Multi-Scan Labeled RFS Model for Multi-object State Estimation,” IEEE Trans. SP, 67(19):4948-4963, 2019
Demonstration: Multi-Sensor Fusion
3D (state + extent) tracking from multiple camera views
▪ 4 Kinect sensors placed high up and facing inwards in each corner
▪ YOLO detector produces 2D detections of bounding boxes in pixel space
▪ Detections are noisy and subject to false positives/negatives

Ong et. al. “A Bayesian Filter for Multi-view 3D Multi-object Tracking with Occlusion Handling,” IEEE Trans. PAMI, 2020.
Demonstration: Multi-Sensor Fusion

Tracking output from fused and filtered detections (colours denote estimated identities)

Ong et. al. “A Bayesian Filter for Multi-view 3D Multi-object Tracking with Occlusion Handling,” IEEE Trans. PAMI, 2020.
Demonstration: Multi-Sensor Fusion
Tracking multiple jumping/falling people in 3-D

Ong et. al. “A Bayesian Filter for Multi-view 3D Multi-object Tracking with Occlusion Handling,” IEEE Trans. PAMI, 2020.
Demonstration: Multi-Sensor Fusion
Provides tracks in 3D instead of ground-plane tracks as in existing methods
No retraining when reconfiguring cameras

CLEAR MOT scores and OSPA(2) 3D errors with 3D GIoU base-distance


Detectors (monocular): Faster-RCNN(VGG16) and YOLOv3
GLMB Filters: Standard Multi-Sensor (MS), Multi-View with Occlusion Model (MV)
Asterisk e.g. MV-GLMB-OC* indicates tracking while reconfiguring cameras

Ong et. al. “A Bayesian Filter for Multi-view 3D Multi-object Tracking with Occlusion Handling,” IEEE Trans. PAMI, 2020.
Demonstration: Autonomous Driving
Autonomous Driving: SLAM + multi-object filtering

Prototype system with E-Class Mercedes-Benz

H. Deusch et. al. “The Labeled Multi-Bernoulli SLAM filter,” IEEE Signal Proc. Letters, 22(10), 2015.
Demonstration: Autonomous Driving
Autonomous Driving: SLAM + multi-object filtering

4 patents (Bosch, BMW, Mitsubishi): autonomous driving /driver assistance system


Conclusions

Multi-Object Systems Theory

Filtering/Estimation Control System ID

Many emerging results Largely unexplored

Many interesting problems in


Artificial intelligence, machine learning, data mining
Communications, Astro-dynamics (Space Debris)
Bio-medical research: cell microscopy, brain imaging …

Preprints/latest works: http://ba-ngu.vo-au.com/publications.html


Matlab code: http://ba-tuong.vo-au.com/codes.html

Thank You!

You might also like