Mark J. L. Orr*
Advanced Robotics Research Ltd. and SD-Scicon UK Ltd.
February 15, 1993

* Currently a visiting worker at the Department of Artificial Intelligence, Edinburgh University.
Abstract
This document is an introduction to Kalman filtering and associated
techniques, starting at the level of, roughly, Mickey Mouse and the Girl's
Blouse and ending up at the level of, approximately, Robocop II.
1 Introduction
When people tell you something about a thing you want to know about, and
you already know a little bit about that thing, but you're not sure about what
you know, and they're not too sure either about what they're telling you, then
that's when you need ... a Kalman filter.

The Kalman filter, developed as recently as 1960, is a tool for estimating the
true state of affairs from unreliable information. It's a numerical tool, so both
the state of affairs, which henceforth we'll call the state, and the information,
henceforth the observations, must be representable numerically. In addition,
the unreliability of the observations and any prior knowledge about the state
must be modelled as normal (i.e. gaussian) probability distributions. Such
distributions require only mean vectors and variance-covariance matrices for
their complete specification.
The Kalman filter is widely used in engineering, control, navigation and
communications. Some examples:

- estimating the parameters of an ellipse [7]
- rotation estimation in computer vision [5]
- target tracking [2]
- world modelling and sensor fusion in robotics [1, 3]
- satellite navigation
- econometrics
In its most general form, the Kalman filter can be applied to time-varying,
controlled, non-linear systems. We will look at the non-linear extensions but omit
the control and time-varying aspects. Consequently there will be no discussion
of state transition as there would be in more general treatments (e.g. [2]).
Most applications in robotics (so far) have been restricted to the time-invariant,
uncontrolled case.
In section 2 we look at representing uncertainty with probability distributions.
In section 3 we further define the terms state and observation and introduce
the measurement equation and recursive estimation. In the next section
we present the basic Kalman filter equations, and then, in the section following,
extensions to cope with non-linear systems. We present the associated test, the
Mahalanobis distance test, in section 6, and finally there is a simple illustrative
example in section 7.

The notation we use below is similar to that of Ayache and Faugeras [1], one of the
best references for Kalman filtering in a robotics context. Scalars are represented
with normal type letters, vectors with emboldened lower case letters and
matrices with emboldened upper case letters. By convention, vectors are single
column matrices unless explicitly transposed, as in:
\[ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{bmatrix}, \qquad \mathbf{x}^t = [x_1, x_2, \ldots, x_k] \]
We use the hat symbol to distinguish true from estimated values: x is the true
value while x̂ is an estimate.
For those of you who generally find understanding equations hard, I would
like to give some words of encouragement. The Kalman filter is a numerical tool
built on algebraic analysis. To understand how it works, even at the most basic
level, entails understanding certain equations. That cannot be avoided. However,
though these equations may look complicated, with all sorts of syntactic
confetti, I believe that most people with a basic knowledge of mathematics (and
this must surely include almost all ARRL engineers) can understand every one
of them. It may take a little patience, reading an equation or its explanatory
text more than once, but that patience is rewarded in the end. If you are
like me, understanding something mathematical which was once unintelligible
gives a peculiar sense of satisfaction (see my forthcoming work: The Orgasm in
Mathematics).
The covariance matrix Λx is the expected value of (x − x̂)(x − x̂)^t, just as x̂ is
the expected, or most likely, value of x:

\[ E[\mathbf{x}] = \hat{\mathbf{x}} \]
\[ E[(\mathbf{x} - \hat{\mathbf{x}})(\mathbf{x} - \hat{\mathbf{x}})^t] = \Lambda_x \]

Each diagonal entry of Λx, λii, is the variance of the corresponding component
xi, while the off-diagonal entries, λij, are the covariances between different
components.
The probability distribution for the vector random variable is:
\[ p(\mathbf{x}) = |2\pi\Lambda_x|^{-\frac{1}{2}} \, e^{-\frac{1}{2} (\mathbf{x} - \hat{\mathbf{x}})^t \Lambda_x^{-1} (\mathbf{x} - \hat{\mathbf{x}})} \]
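For the numerically inclined, here is a minimal sketch in Python (using numpy; the function and variable names are my own invention) of how this density can be evaluated:

```python
import numpy as np

def gaussian_pdf(x, x_hat, cov):
    """Evaluate p(x) = |2*pi*Cov|^(-1/2) * exp(-0.5 (x - x_hat)^t Cov^-1 (x - x_hat))."""
    diff = x - x_hat
    norm = np.linalg.det(2.0 * np.pi * cov) ** -0.5
    return norm * np.exp(-0.5 * diff @ np.linalg.solve(cov, diff))

# Example: a 2-d variable with mean (1, 2) and independent variances 0.1 and 0.4.
x_hat = np.array([1.0, 2.0])
cov = np.diag([0.1, 0.4])
print(gaussian_pdf(np.array([1.0, 2.0]), x_hat, cov))  # density at the peak
```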
\[ \hat{\mathbf{a}}_i = \hat{\mathbf{a}}_{i-1} + K_i (\hat{\mathbf{x}}_i - M_i \hat{\mathbf{a}}_{i-1}) \]
\[ S_i = (I - K_i M_i) S_{i-1} \]

or equivalently

\[ S_i^{-1} = S_{i-1}^{-1} + M_i^t \Lambda_i^{-1} M_i \]

One can see that the previously estimated mean â_{i−1} is corrected by an amount
proportional to the current error (x̂_i − M_i â_{i−1}), called the innovation. The
proportionality factor, K_i, called the Kalman gain, is a matrix with n
rows and m columns given by:

\[ K_i = S_{i-1} M_i^t (\Lambda_i + M_i S_{i-1} M_i^t)^{-1} \]
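To make the structure of these equations concrete, here is a minimal sketch of a single update step in Python/numpy (again, all names are my own; the three equations are exactly those given above):

```python
import numpy as np

def kalman_update(a_hat, S, x_hat, Lam, M):
    """One linear Kalman filter step for the measurement equation x = M a + noise.

    a_hat : (n,)   prior mean of the state a
    S     : (n, n) prior covariance of the state
    x_hat : (m,)   noisy measurement
    Lam   : (m, m) measurement noise covariance
    M     : (m, n) measurement matrix
    """
    # Kalman gain: K = S M^t (Lam + M S M^t)^-1
    K = S @ M.T @ np.linalg.inv(Lam + M @ S @ M.T)
    # Correct the mean by the gain times the innovation (x_hat - M a_hat).
    a_new = a_hat + K @ (x_hat - M @ a_hat)
    # Shrink the covariance: S = (I - K M) S
    S_new = (np.eye(len(a_hat)) - K @ M) @ S
    return a_new, S_new
```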
These equations are more intuitive than they might at first seem. As an
illustration, consider the very simple case where both the state and observation
are scalars and the measurement equation is simply x = a. In this case the
measurement matrix collapses to the scalar value M = 1 and, denoting the
measurement and state variances by λ_i and σ²_i respectively, the gain is also a
scalar:

\[ K_i = \frac{\sigma^2_{i-1}}{\lambda_i + \sigma^2_{i-1}} \]

so that the update equations become:

\[ \hat{a}_i = \hat{a}_{i-1} + \frac{\sigma^2_{i-1}}{\lambda_i + \sigma^2_{i-1}} (\hat{x}_i - \hat{a}_{i-1}) \]
\[ \sigma^2_i = \frac{\lambda_i \, \sigma^2_{i-1}}{\lambda_i + \sigma^2_{i-1}} \]

Notice how if σ²_{i−1} → ∞ (no prior information) then â_i → x̂_i and σ²_i → λ_i, so the
estimate and its uncertainty are determined entirely by the new observation.
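This limiting behaviour is easy to check numerically; a small sketch (the values are chosen arbitrarily):

```python
# Scalar Kalman update: with a huge prior variance the estimate is
# pulled almost all the way to the new measurement.
a_prev, var_prev = 0.0, 1e12   # vague prior
x_meas, lam = 5.0, 0.25        # measurement and its variance

K = var_prev / (lam + var_prev)
a_new = a_prev + K * (x_meas - a_prev)
var_new = lam * var_prev / (lam + var_prev)
print(a_new, var_new)  # ~5.0 and ~0.25: the observation dominates
```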
\[ \frac{\partial f_i}{\partial \mathbf{a}} = [\nabla_{\mathbf{a}} f^t]^t \]

are evaluated at (x̂_i, â_{i−1}). This equation can be rewritten in the (linear)
form:

\[ \mathbf{y}_i = M_i \mathbf{a} + \mathbf{u}_i \]

where

\[ \mathbf{y}_i = -f_i(\hat{\mathbf{x}}_i, \hat{\mathbf{a}}_{i-1}) + \frac{\partial f_i}{\partial \mathbf{a}} \hat{\mathbf{a}}_{i-1} \]
\[ M_i = \frac{\partial f_i}{\partial \mathbf{a}} \]
\[ \mathbf{u}_i = \frac{\partial f_i}{\partial \mathbf{x}_i} (\mathbf{x}_i - \hat{\mathbf{x}}_i) \]
This is now a linear measurement equation where y_i is the measurement, M_i
the linear transformation and u_i the random measurement noise. Both y_i
and M_i are readily computed from the actual measurement x̂_i, the estimate
â_{i−1}, and the function f_i and its first derivative. The second order statistics of
u_i are easily derived:

\[ E[\mathbf{u}_i] = 0 \]
\[ W_i \equiv E[\mathbf{u}_i \mathbf{u}_i^t] = \frac{\partial f_i}{\partial \mathbf{x}_i} \Lambda_i \frac{\partial f_i}{\partial \mathbf{x}_i}^t \]
We can now write down a new set of equations which take into account the
linearisation process and which constitute the extended Kalman filter:

\[ \hat{\mathbf{a}}_i = \hat{\mathbf{a}}_{i-1} + K_i (\mathbf{y}_i - M_i \hat{\mathbf{a}}_{i-1}) = \hat{\mathbf{a}}_{i-1} - K_i f_i(\hat{\mathbf{x}}_i, \hat{\mathbf{a}}_{i-1}) \]
\[ S_i = (I - K_i M_i) S_{i-1} \]

or equivalently

\[ S_i^{-1} = S_{i-1}^{-1} + M_i^t W_i^{-1} M_i \]

and where

\[ K_i = S_{i-1} M_i^t (W_i + M_i S_{i-1} M_i^t)^{-1} \]
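Here is a minimal sketch of one extended update step, under the same conventions as the linear sketch earlier; the caller supplies f and its two Jacobians, and all names are my own:

```python
import numpy as np

def ekf_update(a_hat, S, x_hat, Lam, f, dfdx, dfda):
    """One extended Kalman filter step for the implicit measurement f(x, a) = 0.

    f(x, a)    : (m,)      residual of the measurement equation
    dfdx(x, a) : (m, m_x)  Jacobian of f with respect to x
    dfda(x, a) : (m, n)    Jacobian of f with respect to a
    """
    Fx = dfdx(x_hat, a_hat)
    M = dfda(x_hat, a_hat)
    # Noise covariance of the linearised measurement: W = Fx Lam Fx^t
    W = Fx @ Lam @ Fx.T
    # Gain: K = S M^t (W + M S M^t)^-1
    K = S @ M.T @ np.linalg.inv(W + M @ S @ M.T)
    # Mean update in its compact form: a <- a - K f(x_hat, a_hat)
    a_new = a_hat - K @ f(x_hat, a_hat)
    S_new = (np.eye(len(a_hat)) - K @ M) @ S
    return a_new, S_new
```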
If the estimate â_{i−1} around which the Taylor expansion is performed is too
far from the correct parameter a, the approximation which led above to a linear
measurement equation is not very good, and the optimal solution of the linear
system may differ significantly from the true one. A method to reduce the effect
of these approximation errors is to apply the iterated extended Kalman filter.
This consists of applying the update equation for the mean, â_i = â_{i−1} − K_i f_i,
for as long as â_i − â_{i−1} is large enough, computing at each iteration new values of
K_i, y_i and M_i obtained from a re-linearisation of f_i about the new estimate â_i.
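A sketch of that loop, reusing the hypothetical ekf_update conventions from above; the convergence tolerance and iteration cap are my own choices, and y_i and M_i are rebuilt from the re-linearisation at each pass:

```python
import numpy as np

def iekf_update(a_hat, S, x_hat, Lam, f, dfdx, dfda, tol=1e-8, max_iter=20):
    """Iterated EKF step: re-linearise about each new estimate until the step is small."""
    a_i = a_hat  # linearisation point, initially the prior estimate
    for _ in range(max_iter):
        Fx = dfdx(x_hat, a_i)
        M = dfda(x_hat, a_i)
        W = Fx @ Lam @ Fx.T
        K = S @ M.T @ np.linalg.inv(W + M @ S @ M.T)
        # Update from the prior a_hat, with y_i and M_i re-linearised about a_i;
        # at the first pass (a_i = a_hat) this reduces to a_hat - K f.
        a_next = a_hat - K @ (f(x_hat, a_i) + M @ (a_hat - a_i))
        if np.linalg.norm(a_next - a_i) < tol:
            a_i = a_next
            break
        a_i = a_next
    S_new = (np.eye(len(a_hat)) - K @ M) @ S
    return a_i, S_new
```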
Ayache and Faugeras [1] say: "In general, after a few iterations â_i is so
close to a that the linearisation error is almost zero, yielding an almost optimal
filter". I have to say I have found that this isn't always true. My experience
with problems requiring more than one observation to completely constrain the
solution is that if the initial guess, â_0, is very different from the true value of
a then quite often the true value is never approached. Consequently it is often
necessary to find some reasonable initial guess before starting the filter, perhaps
by waiting until enough observations are available to constrain the solution.
6 Mahalanobis Distance
A basic problem in tracking, data fusion and computer vision is how to associate
a given observation with a given model and its estimate. In most images there is
data about a number of different features of the environment; which data goes
with which feature? The basic, and still most common, method for solving this
problem is the Mahalanobis distance test [3], also called the nearest-neighbour
standard filter, the validation gate, or the normalised innovation.

This technique relies on calculating a normalised distance between observation
and state which can be used to gauge whether the observation plausibly
associates with the state. In the case of a non-linear system we have, at step
i, an estimate â_{i−1} and an attached covariance S_{i−1} for the parameter a. We also
have a noisy measurement (x̂_i, Λ_i) of x_i, and we want to test the plausibility
of this measurement with respect to the equation f_i(x_i, a) = 0.
If we consider again a first order expansion of f_i(x_i, a) about (x̂_i, â_{i−1}),
then since (x̂_i − x_i) and (â_{i−1} − a) are independent gaussian processes, we see that
so too (up to the linear approximation) is f(x̂_i, â_{i−1}), whose mean and covariance are:

\[ E[f(\hat{\mathbf{x}}_i, \hat{\mathbf{a}}_{i-1})] = 0 \]
\[ Q_i \equiv E[f(\hat{\mathbf{x}}_i, \hat{\mathbf{a}}_{i-1}) \, f(\hat{\mathbf{x}}_i, \hat{\mathbf{a}}_{i-1})^t] = \frac{\partial f_i}{\partial \mathbf{x}_i} \Lambda_i \frac{\partial f_i}{\partial \mathbf{x}_i}^t + \frac{\partial f_i}{\partial \mathbf{a}} S_{i-1} \frac{\partial f_i}{\partial \mathbf{a}}^t \]

The Mahalanobis distance is then the normalised scalar

\[ d_i = f(\hat{\mathbf{x}}_i, \hat{\mathbf{a}}_{i-1})^t \, Q_i^{-1} \, f(\hat{\mathbf{x}}_i, \hat{\mathbf{a}}_{i-1}) \]

which, to the same approximation, has a χ² distribution whose number of degrees
of freedom equals the dimension of f, so the observation is accepted as plausibly
associating with the state if d_i falls below a threshold taken from χ² tables.
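A sketch of the test in the same Python/numpy style as before; the threshold shown is the standard 95% point of χ² with one degree of freedom, appropriate when f is scalar:

```python
import numpy as np

def mahalanobis_ok(a_hat, S, x_hat, Lam, f, dfdx, dfda, threshold=3.84):
    """Accept the measurement if f^t Q^-1 f is below a chi-squared threshold.

    threshold=3.84 is the 95% point of chi-squared with 1 degree of freedom;
    use the m-degrees-of-freedom value when f has dimension m.
    """
    Fx = dfdx(x_hat, a_hat)
    M = dfda(x_hat, a_hat)
    # Covariance of the linearised residual: Q = Fx Lam Fx^t + M S M^t
    Q = Fx @ Lam @ Fx.T + M @ S @ M.T
    r = f(x_hat, a_hat)
    d = r @ np.linalg.solve(Q, r)  # normalised (squared) distance
    return d < threshold
```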
7 Example
In this section we go through a fairly simple example to illustrate the techniques
discussed above. For a more complicated problem, namely the estimation of
rotations, see [4].
Consider the equation of a circle in the (p, q) plane centred on (α, β) with
radius ρ:

\[ (p - \alpha)^2 + (q - \beta)^2 = \rho^2 \]

Suppose we have an instance of such a circle with the true (but unknown)
parameters:

\[ \begin{bmatrix} \alpha \\ \beta \\ \rho \end{bmatrix} = \begin{bmatrix} 3.0 \\ 3.0 \\ 2.0 \end{bmatrix} \]
upon which (our imperfect sensors tell us) lie the points:

\[ \hat{\mathbf{x}}_1 = \begin{bmatrix} 3.02 \\ 1.01 \end{bmatrix}, \qquad \hat{\mathbf{x}}_2 = \begin{bmatrix} 1.01 \\ 2.98 \end{bmatrix}, \qquad \hat{\mathbf{x}}_3 = \begin{bmatrix} 2.99 \\ 4.97 \end{bmatrix} \]

all with the same error:

\[ \Lambda_i = \begin{bmatrix} 0.0001 & 0.0 \\ 0.0 & 0.0004 \end{bmatrix} \]

and that the prior information is:

\[ \hat{\mathbf{a}}_0 = \begin{bmatrix} 3.5 \\ 2.5 \\ 1.5 \end{bmatrix}, \qquad S_0 = \begin{bmatrix} 1.0 & 0.0 & 0.0 \\ 0.0 & 1.0 & 0.0 \\ 0.0 & 0.0 & 1.0 \end{bmatrix} \]
What then is the best estimate for the parameters of the circle? If we make
the parameters of the circle the state (a = [α β ρ]^t) and make the sensed points
the observations (x_i = [p_i q_i]^t) we can use an extended Kalman filter. The
measurement equation is:

\[ f_i(\mathbf{x}_i, \mathbf{a}) = (p_i - \alpha)^2 + (q_i - \beta)^2 - \rho^2 = 0 \]

and its derivatives are:

\[ \frac{\partial f_i}{\partial \mathbf{x}_i} = \begin{bmatrix} \frac{\partial f_i}{\partial p_i} & \frac{\partial f_i}{\partial q_i} \end{bmatrix} = [\, 2(p_i - \alpha) \;\; 2(q_i - \beta) \,] \]
\[ \frac{\partial f_i}{\partial \mathbf{a}} = \begin{bmatrix} \frac{\partial f_i}{\partial \alpha} & \frac{\partial f_i}{\partial \beta} & \frac{\partial f_i}{\partial \rho} \end{bmatrix} = [\, 2(\alpha - p_i) \;\; 2(\beta - q_i) \;\; -2\rho \,] \]
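In code, the circle's measurement equation and its two Jacobians might look like this (a sketch matching the conventions of the hypothetical ekf_update above):

```python
import numpy as np

def f_circle(x, a):
    """Implicit circle equation: (p - alpha)^2 + (q - beta)^2 - rho^2 = 0."""
    p, q = x
    alpha, beta, rho = a
    return np.array([(p - alpha) ** 2 + (q - beta) ** 2 - rho ** 2])

def dfdx_circle(x, a):
    """Jacobian with respect to the observation (p, q)."""
    p, q = x
    alpha, beta, rho = a
    return np.array([[2.0 * (p - alpha), 2.0 * (q - beta)]])

def dfda_circle(x, a):
    """Jacobian with respect to the parameters (alpha, beta, rho)."""
    p, q = x
    alpha, beta, rho = a
    return np.array([[2.0 * (alpha - p), 2.0 * (beta - q), -2.0 * rho]])
```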
These expressions for the measurement equation and its derivatives, the
points x̂_i, i = 1, 2, 3, and their covariance Λ_i, and the prior state information
(â_0, S_0) were all fed to an iterated extended Kalman filter. The following results
were obtained (each row gives the estimate â_i and its covariance S_i after
filtering the i-th observation):

i = 0:  â_0^t = [3.5, 2.5, 1.5]
        S_0 = [ 1.0     0.0     0.0    ]
              [ 0.0     1.0     0.0    ]
              [ 0.0     0.0     1.0    ]

i = 1:  â_1^t = [2.90, 2.63, 1.86]
        S_1 = [ 2.9e-1  1.5e-1  4.3e-1 ]
              [ 1.5e-1  9.7e-1  8.9e-2 ]
              [ 4.3e-1  8.9e-2  7.4e-1 ]

i = 2:  â_2^t = [2.85, 2.75, 1.75]
        S_2 = [ 2.3e-1  3.2e-1  2.7e-1 ]
              [ 3.2e-1  4.5e-1  3.8e-1 ]
              [ 2.7e-1  3.8e-1  3.2e-1 ]

i = 3:  â_3^t = [3.03, 3.00, 1.95]
        S_3 = [ 2.0e-4  6.1e-5  1.2e-4 ]
              [ 6.1e-5  2.1e-4  3.3e-5 ]
              [ 1.2e-4  3.3e-5  2.0e-4 ]
The first thing to notice is that â_3 is much closer to the true value a = [3 3 2]^t
than â_0, and that the final covariance is small, indicating that the filter is
confident about this solution. Notice also how the covariance remains relatively high
until after the third observation has been filtered, a consequence of the fact
that it takes at least three points to fix a circle.
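For completeness, here is a sketch wiring the pieces above together; since it relies on the hypothetical iekf_update and circle functions from the earlier sketches, take it as an illustration of the procedure rather than a reproduction of the exact numbers in the table:

```python
import numpy as np

a_hat = np.array([3.5, 2.5, 1.5])    # prior mean a_0
S = np.eye(3)                        # prior covariance S_0
Lam = np.diag([0.0001, 0.0004])      # measurement covariance
points = [np.array([3.02, 1.01]),
          np.array([1.01, 2.98]),
          np.array([2.99, 4.97])]

# Filter the three noisy points through the iterated extended Kalman filter.
for i, x_hat in enumerate(points, start=1):
    a_hat, S = iekf_update(a_hat, S, x_hat, Lam,
                           f_circle, dfdx_circle, dfda_circle)
    print(f"i = {i}: a_hat = {a_hat}")
# The final estimate should approach the true parameters [3, 3, 2].
```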
References
[1] N. Ayache and O.D. Faugeras. Maintaining representations of the environment of a mobile robot. In Robotics Research 4, pages 337–350. MIT Press, USA, 1988.

[2] Y. Bar-Shalom and T.E. Fortmann. Tracking and Data Association. Academic Press, UK, 1988.

[3] H.F. Durrant-Whyte. Multi-sensor data fusion for (semi-)autonomous robots. Technical review document, Advanced Robotics Research Ltd., University Road, Salford M5 4PP, UK, 1990.

[4] M.J.L. Orr. On estimating rotations. Working Paper 233, Department of Artificial Intelligence, Edinburgh University, 1992.

[5] M.J.L. Orr, R.B. Fisher, and J. Hallam. Uncertain reasoning: Intervals versus probabilities. In British Machine Vision Conference, pages 351–354. Springer-Verlag, 1991.

[6] G. Pegman. Personal communication, 1990.

[7] J. Porrill. Fitting ellipses and predicting confidence envelopes using a bias corrected Kalman filter. Image and Vision Computing, 8(1):37–41, 1990.