Kalman

The Kalman Filter
COMP 486
Continuous States
●
HMMs only allow us to reason about discrete-state
systems:
– Rainy or Sunny?
– Urn #1 or Urn #2?
●
We often need to work with continuous states:
– Robot position.
– Temperature and Humidity.
– etc.
●
One possibility is discretization.
– Forces us to trade off the number of states against the accuracy
of the model.
Kalman Filter
●
The Kalman filter allows us to solve the following
problem:
– Continuous state, discrete time.
– We have a model of the state transition.
– We have a model of how observations relate to the underlying
state.
– Predict the state given a sequence of observations.
●
We will not talk about learning for Kalman filters.
Kalman Filter Assumptions
●
State dynamics are linear – the current state is a linear
function of the previous state.
●
Noise in the state dynamics is normally distributed.
●
The observation process is linear – observations are a
linear function of the state.
●
The observation noise is normally distributed.
Linear State Dynamics
●
A simple example: Radioactive Decay.
– x is our one-dimensional state variable: the amount of decaying
material.
– The following equation describes an exponentially decreasing
amount of radioactive material (0 < r < 1).
$x_t = r\,x_{t-1}$
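To make the decay recursion concrete, here is a minimal sketch that iterates the equation. The function name `decay`, the starting amount, and the value r = 0.5 are illustrative assumptions, not values from the slides.

```python
def decay(x0, r, steps):
    """Iterate x_t = r * x_{t-1}; return [x_0, x_1, ..., x_steps]."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1])
    return xs

# Assumed values: 100 units of material, halving at every time step.
amounts = decay(x0=100.0, r=0.5, steps=4)
print(amounts)  # [100.0, 50.0, 25.0, 12.5, 6.25]
```

Each value is a fixed fraction of the previous one, which is what makes the dynamics linear in the state.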
Another Example
●
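Motion of a particle moving in one dimension at a
constant velocity.
●
x is now a two-dimensional vector: $x_t = \begin{bmatrix} p_t \\ v_t \end{bmatrix}$
●
Our update equation is: $x_t = A x_{t-1}$, where $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$
●
In other words:
$\begin{bmatrix} p_t \\ v_t \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} p_{t-1} \\ v_{t-1} \end{bmatrix}$
Or
$p_t = p_{t-1} + v_{t-1}$
$v_t = v_{t-1}$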
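The constant-velocity update can be sketched in a few lines of plain Python. The helper `mat_vec` and the starting state (position 0, velocity 2) are assumptions made for illustration only.

```python
def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Transition matrix for the constant-velocity model: p_t = p_{t-1} + v_{t-1}, v_t = v_{t-1}.
A = [[1.0, 1.0],
     [0.0, 1.0]]

x = [0.0, 2.0]  # assumed initial state: position 0, velocity 2
trajectory = [x]
for _ in range(3):
    x = mat_vec(A, x)
    trajectory.append(x)

print(trajectory)  # [[0.0, 2.0], [2.0, 2.0], [4.0, 2.0], [6.0, 2.0]]
```

The position advances by the velocity at every step while the velocity itself never changes, matching the two scalar equations.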
Dynamical Systems and Differential Equations
●
There is a huge range of interesting questions here:
– Given a process, how do we specify the correct transition
matrix?
– Given a transition matrix, what can we say about the state
dynamics?
– What if we have a process that has fundamentally nonlinear
dynamics?
●
Take these classes:
●
Math 270 “NonLinear Dynamics and Chaos”
●
Math 260 “Differential Equations/Numerical Methods”
Process Noise
●
Previous examples assumed that our transition model
perfectly captures the true state dynamics.
●
In the real world, our model will never be perfect:
– For example, moving objects are influenced by friction, air
resistance, etc.
●
We account for this by adding a noise term to our update
equation:
$x_t = A x_{t-1} + w$
●
Where $w \sim N(0, Q)$. (This is common shorthand for “w
is drawn from the normal distribution with mean 0 and
covariance matrix Q”.)
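A noisy transition can be simulated as a rough sketch like the one below. It assumes a diagonal Q, so each state dimension gets independent Gaussian noise; the function name `step`, the standard deviations in `q_std`, and the seed are all illustrative choices rather than anything given in the slides.

```python
import random

def step(A, x, q_std, rng):
    """One noisy transition x_t = A x_{t-1} + w.

    Assumes Q is diagonal: q_std[i] is the standard deviation
    (square root of Q[i][i]) of the noise on dimension i.
    """
    mean = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return [m + rng.gauss(0.0, s) for m, s in zip(mean, q_std)]

rng = random.Random(0)           # fixed seed so runs are repeatable
A = [[1.0, 1.0], [0.0, 1.0]]     # constant-velocity transition matrix
q_std = [0.1, 0.05]              # assumed noise levels
x = [0.0, 2.0]                   # assumed initial position and velocity
states = [x]
for _ in range(5):
    x = step(A, x, q_std, rng)
    states.append(x)

for s in states:
    print(s)
```

With `q_std` set to zeros the function reduces to the noise-free update, which is a quick way to check it.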
Measurements (Observations)
●
As in an HMM, we do not have access to the true state.
●
We get observations z that are linearly related to the true
state x:
$z_t = H x_t$
●
The dimensionality of z may be different from that of x.
●
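For example, in our 1D motion case, assume that
velocity cannot be directly observed: $H = \begin{bmatrix} 1 & 0 \end{bmatrix}$
$z_t = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} p_t \\ v_t \end{bmatrix}$, so $z_t = p_t$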
Noisy Measurements
●
Once again, our measurements don't perfectly reflect
the true system state.
●
We account for noisy measurements as follows:
$z_t = H x_t + v$
●
Where $v \sim N(0, R)$
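The measurement model can be sketched the same way as the dynamics. This example assumes a diagonal R (independent noise on each measurement dimension); `measure`, `r_std`, and the numeric values are hypothetical. Note that the noise term v here is unrelated to the velocity component of the state.

```python
import random

def measure(H, x, r_std, rng):
    """Noisy observation z_t = H x_t + v.

    Assumes R is diagonal: r_std[i] is the standard deviation
    of the noise on measurement dimension i.
    """
    clean = [sum(h * xi for h, xi in zip(row, x)) for row in H]
    return [c + rng.gauss(0.0, s) for c, s in zip(clean, r_std)]

rng = random.Random(1)   # fixed seed so runs are repeatable
H = [[1.0, 0.0]]         # observe position only; velocity is hidden
x = [3.0, 2.0]           # assumed true state: position 3, velocity 2
z = measure(H, x, r_std=[0.5], rng=rng)
print(z)                 # a one-element list: position plus noise
```

The observation has one dimension while the state has two, illustrating that z and x need not have the same size.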
Putting it all Together
●
State dynamics:
$x_t = A x_{t-1} + w$, $w \sim N(0, Q)$
●
Measurement model:
$z_t = H x_t + v$, $v \sim N(0, R)$
●
Complete parameterization:
– A – process transition matrix.
– Q – process noise covariance.
– H – Measurement matrix.
– R – Measurement noise covariance.
An Aside: Control
●
Note that the Welch & Bishop tutorial has the following:
$x_t = A x_{t-1} + B u_{t-1} + w$
$w \sim N(0, Q)$
●
u is a control signal.
●
This raises other interesting questions that we will
ignore.
Filtering Steps
●
The system starts in some true (unobservable) state $x_1$.
●
We make an initial guess at the state, $\hat{x}_1^-$.
– The “^” indicates that this is an estimate.
– The “−” in the superscript indicates that this is an a priori estimate:
no observation has yet been made.
●
** We update our estimate to $\hat{x}_t$ based on the observation $z_t$.
– This is the a posteriori estimate: after an observation has been made.
●
The state is updated according to the dynamics, resulting in $x_{t+1}$.
●
We propagate our estimate according to the state dynamics,
resulting in $\hat{x}_{t+1}^-$.
●
Return to **.
Kalman Filter Derivation
●
This is the tricky part:
●
** We update our estimate to $\hat{x}_t$ based on the observation $z_t$.
●
We want our estimate to be unbiased and to have the least
possible variance.
– Unbiased: the expected difference between our estimate
and the state should be 0.
– Minimum variance: as little uncertainty as possible in our
estimate.
●
(We won't show the whole derivation. Just the gist.)
●
We want an update rule of the following form:
$\hat{x}_t = \hat{x}_t^- + K (z_t - H \hat{x}_t^-)$
Here $\hat{x}_t^-$ is the predicted state, $z_t$ is the observation, and $H \hat{x}_t^-$ is the predicted observation.
●
$(z_t - H \hat{x}_t^-)$ is called the residual. It represents the
difference between what we saw, and what we expected
to see.
– If the residual is 0: we are happy with our estimate.
– If the residual is large: we change our estimate to reduce it.
●
K is the Kalman gain. It determines how we trade off
measurements against model predictions.
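A tiny scalar sketch shows how the gain moves the estimate between the prediction and the measurement. The function name `correct` and all of the numbers are illustrative assumptions; H is taken to be 1 so that the state is observed directly.

```python
def correct(x_prior, z, K, H=1.0):
    """Scalar update: a priori estimate plus gain times residual."""
    residual = z - H * x_prior
    return x_prior + K * residual

# Assumed values: we predicted 10, the sensor read 12.
print(correct(10.0, 12.0, K=0.0))   # 10.0  (ignore the measurement)
print(correct(10.0, 12.0, K=0.25))  # 10.5  (move a quarter of the way)
print(correct(10.0, 12.0, K=1.0))   # 12.0  (trust the measurement fully)
```

A gain of 0 keeps the model prediction, a gain of 1 adopts the measurement, and values in between blend the two.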
●
How do we select K to minimize variance?
●
Start with some definitions:
– a priori estimate error: $e_t^- \equiv x_t - \hat{x}_t^-$
– a posteriori estimate error: $e_t \equiv x_t - \hat{x}_t$
– a priori estimate error covariance: $P_t^- = E[e_t^- (e_t^-)^T]$
– a posteriori estimate error covariance: $P_t = E[e_t e_t^T]$
●
The goal is to find a K that minimizes the a posteriori
estimate error covariance:
$P_t = E[e_t e_t^T] = E[(x_t - \hat{x}_t)(x_t - \hat{x}_t)^T]$
●
First step: substitute $\hat{x}_t^- + K (z_t - H \hat{x}_t^-)$ for $\hat{x}_t$ above.
●
Then take the derivative of the trace of the resulting
expectation with respect to K.
– The trace is the sum of the diagonal elements. In this case,
that means we are minimizing the sum of the variances of the
different dimensions.
●
Set the derivative to 0 and solve for K.
●
This is what we end up with: $K_t = P_t^- H^T (H P_t^- H^T + R)^{-1}$
●
Reminders:
$P_t^-$ = a priori estimate error covariance.
$R$ = measurement noise covariance.
●
When R is large, K is small: we tend to ignore the
sensors if they are unreliable.
●
When $P_t^-$ is small, K is small: we tend to ignore the
sensors if our a priori estimate is precise.
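For readers who want the intermediate steps, the derivation can be sketched as follows. This is an outline rather than a full proof; it assumes that the a priori error $e_t^-$ and the measurement noise $v$ are uncorrelated, and it uses the standard matrix-calculus identities $\partial\,\mathrm{tr}(KB)/\partial K = B^T$ and $\partial\,\mathrm{tr}(KSK^T)/\partial K = 2KS$ for symmetric $S$.

```latex
\begin{align*}
e_t &= x_t - \hat{x}_t^- - K\,(z_t - H\hat{x}_t^-)
     = (I - KH)\,e_t^- - K\,v
     && \text{using } z_t = H x_t + v \\
P_t &= (I - KH)\,P_t^-\,(I - KH)^T + K R K^T \\
\operatorname{tr}(P_t) &= \operatorname{tr}(P_t^-)
     - 2\operatorname{tr}(K H P_t^-)
     + \operatorname{tr}\!\big(K\,(H P_t^- H^T + R)\,K^T\big) \\
\frac{\partial \operatorname{tr}(P_t)}{\partial K}
    &= -2\,P_t^- H^T + 2\,K\,(H P_t^- H^T + R) = 0 \\
K &= P_t^- H^T\,(H P_t^- H^T + R)^{-1}
\end{align*}
```

The second trace term relies on $P_t^-$ being symmetric, which holds for any covariance matrix.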
Kalman Filter in a Nutshell
●
If we already have a reliable estimate:
– ignore our sensors.
●
If we have reliable sensors:
– ignore our estimate.
●
The gain factor gives the optimal tradeoff between these
two extremes.
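The two extremes can be checked numerically in the scalar case, where the gain reduces to a simple ratio. The function name `kalman_gain` and the variances below are assumptions chosen for illustration, with H = 1.

```python
def kalman_gain(P_prior, R, H=1.0):
    """Scalar Kalman gain: K = P^- H / (H P^- H + R)."""
    return P_prior * H / (H * P_prior * H + R)

print(kalman_gain(P_prior=1.0, R=1.0))    # 0.5: equal trust in both
print(kalman_gain(P_prior=1.0, R=99.0))   # noisy sensor: gain near 0
print(kalman_gain(P_prior=0.01, R=1.0))   # precise estimate: gain near 0
print(kalman_gain(P_prior=1.0, R=0.0))    # 1.0: perfect sensor, ignore the estimate
```

As R grows or the a priori variance shrinks, the gain falls toward 0; as R shrinks toward 0, the gain rises toward 1.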
The Whole Algorithm
PREDICT
project the state ahead:
$\hat{x}_t^- = A \hat{x}_{t-1}$
(the noise w has zero mean, so it does not appear in the prediction)
project the error covariance ahead:
$P_t^- = A P_{t-1} A^T + Q$
CORRECT
compute the Kalman gain:
$K_t = P_t^- H^T (H P_t^- H^T + R)^{-1}$
update estimate with the measurement:
$\hat{x}_t = \hat{x}_t^- + K_t (z_t - H \hat{x}_t^-)$
update the error covariance:
$P_t = (I - K_t H) P_t^-$
From Welch and Bishop
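The predict and correct steps can be put together in a short pure-Python sketch for the constant-velocity example. To avoid a general matrix inverse, it assumes a scalar measurement, so H is a single row `h` and R is a single variance `r`. All names and numeric values (Q, r, the initial guess, the readings) are illustrative assumptions, and the readings are noise-free to keep the result easy to check.

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def predict(x_post, P_post, A, Q):
    """PREDICT: x^- = A x,  P^- = A P A^T + Q."""
    n = len(x_post)
    x_prior = [sum(A[i][j] * x_post[j] for j in range(n)) for i in range(n)]
    APAt = mat_mul(mat_mul(A, P_post), transpose(A))
    P_prior = [[APAt[i][j] + Q[i][j] for j in range(n)] for i in range(n)]
    return x_prior, P_prior

def correct(x_prior, P_prior, z, h, r):
    """CORRECT for a scalar measurement z = h . x + v, with Var(v) = r."""
    n = len(x_prior)
    PHt = [sum(P_prior[i][j] * h[j] for j in range(n)) for i in range(n)]  # P^- H^T
    s = sum(h[i] * PHt[i] for i in range(n)) + r                           # H P^- H^T + R
    K = [p / s for p in PHt]                                               # Kalman gain
    residual = z - sum(hi * xi for hi, xi in zip(h, x_prior))
    x_post = [xi + ki * residual for xi, ki in zip(x_prior, K)]
    hP = [sum(h[k] * P_prior[k][j] for k in range(n)) for j in range(n)]   # H P^-
    P_post = [[P_prior[i][j] - K[i] * hP[j] for j in range(n)]             # (I - K H) P^-
              for i in range(n)]
    return x_post, P_post

A = [[1.0, 1.0], [0.0, 1.0]]          # constant-velocity dynamics
Q = [[1e-4, 0.0], [0.0, 1e-4]]        # assumed small process noise
h = [1.0, 0.0]                        # only position is observed
r = 1.0                               # assumed measurement variance

x_prior = [0.0, 0.0]                  # initial a priori guess (deliberately wrong)
P_prior = [[10.0, 0.0], [0.0, 10.0]]  # large initial uncertainty

for t in range(1, 21):
    z = 2.0 * t                       # position of a particle moving at velocity 2
    x_post, P_post = correct(x_prior, P_prior, z, h, r)
    x_prior, P_prior = predict(x_post, P_post, A, Q)

print(x_post)  # position estimate near 40, velocity estimate near 2
```

Even though velocity is never measured, the filter recovers it from the sequence of positions, because the covariance propagation links the two state components.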
An Illustration...
Nice Properties
●
Optimal (unbiased, minimum variance) when the linear and Gaussian assumptions hold.
●
Efficient. (Takes advantage of temporal independence
assumptions)
●
Provides both a state estimate and a measure of the
uncertainty of that estimate.
The Extended Kalman Filter
●
An extension of the algorithm that handles nonlinear
state dynamics.
●
Not necessarily optimal.
