Digital Object Identifier 10.1109/MSP.2012.2203621
Date of publication: 20 August 2012

This article provides a simple and intuitive derivation of the Kalman filter, with the aim of teaching this useful tool to students from disciplines that do not require a strong mathematical background. The most complicated level of mathematics required to understand this derivation is the ability to multiply two Gaussian functions together and reduce the result to a compact form.

The Kalman filter is over 50 years old but is still one of the most important and common data fusion algorithms in use today. Named after Rudolf E. Kálmán, the great success of the Kalman filter is due to its small computational requirement, elegant recursive properties, and its status as the optimal estimator for one-dimensional linear systems with Gaussian error statistics [1]. Typical uses of the Kalman filter include smoothing noisy data and providing estimates of parameters of interest. Applications include global positioning system receivers, phase-locked loops in radio equipment, smoothing the output from laptop trackpads, and many more.

From a theoretical standpoint, the Kalman filter is an algorithm permitting exact inference in a linear dynamical system, which is a Bayesian model similar to a hidden Markov model but where the state space of the latent variables is continuous and where all latent and observed variables have a Gaussian distribution (often a multivariate Gaussian distribution). The aim of this lecture note is to permit people who find this description confusing or terrifying to understand the basis of the Kalman filter via a simple and intuitive derivation.

RELEVANCE
The Kalman filter [2] (and its variants such as the extended Kalman filter [3] and unscented Kalman filter [4]) is one of the most celebrated and popular data fusion algorithms in the field of information processing. The most famous early use of the Kalman filter was in the Apollo navigation computer that took Neil Armstrong to the moon, and (most importantly) brought him back. Today, Kalman filters are at work in every satellite navigation device, every smart phone, and many computer games.

The Kalman filter is typically derived using vector algebra as a minimum mean squared estimator [5], an approach suitable for students confident in mathematics but not one that is easy to grasp for students in disciplines that do not require strong mathematics. The Kalman filter is derived here from first principles considering a simple physical example exploiting a key property of the Gaussian distribution, specifically the property that the product of two Gaussian distributions is another Gaussian distribution.

PREREQUISITES
This article is not designed to be a thorough tutorial for a brand-new student to the Kalman filter, in the interests of being concise, but instead aims to provide tutors with a simple method of teaching the concepts of the Kalman filter to students who are not strong mathematicians. The reader is expected to be familiar with vector notation and terminology associated with Kalman filtering such as the state vector and covariance matrix. This article is aimed at those who need to teach the Kalman filter to others in a simple and intuitive manner, or for those who already have some experience with the Kalman filter but may not fully understand its foundations. This article is not intended to be a thorough and standalone education tool for the complete novice, as that would require a chapter, rather than a few pages, to convey.

PROBLEM STATEMENT
The Kalman filter model assumes that the state of a system at a time $t$ evolved from the prior state at time $t-1$ according to the equation

$$x_t = F_t x_{t-1} + B_t u_t + w_t, \tag{1}$$

where
■ $x_t$ is the state vector containing the terms of interest for the system (e.g., position, velocity, heading) at time $t$
■ $u_t$ is the vector containing any control inputs (steering angle, throttle setting, braking force)
■ $F_t$ is the state transition matrix which applies the effect of each system state parameter at time $t-1$ on the system state at time $t$ (e.g., the position and velocity at time $t-1$ both affect the position at time $t$)
■ $B_t$ is the control input matrix which applies the effect of each
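As an illustration of (1), the following minimal sketch advances a two-element state (position and velocity) by one time step. The constant-acceleration forms chosen for `F` and `B`, the time step `dt`, the numerical values, and the helper name `step_state` are all assumptions made for this example and are not taken from the article.

```python
def step_state(x, F, B, u, w):
    """Evaluate x_t = F x_{t-1} + B u_t + w_t using plain Python lists."""
    n = len(x)
    return [
        sum(F[i][j] * x[j] for j in range(n))          # state transition term
        + sum(B[i][k] * u[k] for k in range(len(u)))   # control input term
        + w[i]                                         # process noise term
        for i in range(n)
    ]


dt = 1.0                          # assumed time step in seconds
F = [[1.0, dt],                   # position picks up velocity * dt
     [0.0, 1.0]]                  # velocity carries over unchanged
B = [[0.5 * dt ** 2],             # assumed effect of acceleration on position
     [dt]]                        # assumed effect of acceleration on velocity

x_prev = [0.0, 10.0]              # position 0 m, velocity 10 m/s (made-up values)
u = [2.0]                         # commanded acceleration of 2 m/s^2 (made-up value)
w = [0.0, 0.0]                    # process noise set to zero for clarity

x_next = step_state(x_prev, F, B, u, w)
print(x_next)                     # [11.0, 12.0]
```

In this sketch both the old position and the old velocity contribute to the new position through the first row of `F`, which is the behavior described for the state transition matrix.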
[FIG2] The initial knowledge of the system at time t = 0. The red Gaussian distribution represents the pdf providing the initial
confidence in the estimate of the position of the train. The arrow pointing to the right represents the known initial velocity of
the train.
$$
\begin{aligned}
x_t - \hat{x}_{t|t-1} &= F(x_{t-1} - \hat{x}_{t-1|t-1}) + w_t \\
\Rightarrow P_{t|t-1} &= E[(F(x_{t-1} - \hat{x}_{t-1|t-1}) + w_t)(F(x_{t-1} - \hat{x}_{t-1|t-1}) + w_t)^T] \\
&= F E[(x_{t-1} - \hat{x}_{t-1|t-1})(x_{t-1} - \hat{x}_{t-1|t-1})^T] F^T \\
&\quad + F E[(x_{t-1} - \hat{x}_{t-1|t-1}) w_t^T] + E[w_t (x_{t-1} - \hat{x}_{t-1|t-1})^T] F^T \\
&\quad + E[w_t w_t^T].
\end{aligned}
$$

Noting that the state estimation errors and process noise are uncorrelated

$$
\begin{aligned}
E[(x_{t-1} - \hat{x}_{t-1|t-1}) w_t^T] &= E[w_t (x_{t-1} - \hat{x}_{t-1|t-1})^T] = 0 \\
\Rightarrow P_{t|t-1} &= F E[(x_{t-1} - \hat{x}_{t-1|t-1})(x_{t-1} - \hat{x}_{t-1|t-1})^T] F^T + E[w_t w_t^T] \\
\Rightarrow P_{t|t-1} &= F P_{t-1|t-1} F^T + Q_t.
\end{aligned}
$$

The measurement update equations are given by

$$\hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t (z_t - H_t \hat{x}_{t|t-1}) \tag{5}$$

$$P_{t|t} = P_{t|t-1} - K_t H_t P_{t|t-1}, \tag{6}$$

where

$$K_t = P_{t|t-1} H_t^T (H_t P_{t|t-1} H_t^T + R_t)^{-1}. \tag{7}$$

In the remainder of this article, we will derive the measurement update equations [(5)–(7)] from first principles.

SOLUTIONS
The Kalman filter will be derived here by considering a simple one-dimensional tracking problem, specifically that of a train moving along a railway line. At every measurement epoch we wish to know the best possible estimate of the location of the train (or more precisely, the location of the radio antenna mounted on the train roof). Information is available from two sources: 1) predictions based on the last known position and velocity of the train and 2) measurements from a radio ranging system deployed at the track side. The information from the predictions and measurements is combined to provide the best possible estimate of the location of the train. The system is shown graphically in Figure 1.

The initial state of the system (at time t = 0 s) is known to a reasonable accuracy, as shown in Figure 2. The location of the train is given by a Gaussian pdf. At the next time epoch (t = 1 s), we can estimate the new position of the train, based on known limitations such as its position and velocity at t = 0, its maximum possible acceleration and deceleration, etc. In practice, we may have some knowledge of the control inputs on the brake or accelerator by the driver. In any case, we have a prediction of the new position of the train, represented in Figure 3 by a new Gaussian pdf with a new mean and variance. Mathematically this step is represented by (1). The variance has increased [see (2)], representing our reduced certainty in the accuracy of our position estimate compared to t = 0, due to the uncertainty associated with any process noise from accelerations or decelerations undertaken from time t = 0 to time t = 1.
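The covariance prediction and the measurement update equations (5)–(7) can be sketched in a few lines of code. This is a minimal illustration rather than a reference implementation: it assumes a two-element state (position, velocity), a scalar position measurement so that the matrix inverse in (7) reduces to a division, and made-up numerical values. The helper names (`matmul`, `predict_covariance`, `update`) are hypothetical.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def predict_covariance(P, F, Q):
    """Covariance prediction: P_pred = F P F^T + Q."""
    FPFt = matmul(matmul(F, P), transpose(F))
    return [[a + q for a, q in zip(ra, rq)] for ra, rq in zip(FPFt, Q)]


def update(x_pred, P_pred, z, H, R):
    """Measurement update (5)-(7) for a scalar measurement z.

    x_pred is an n x 1 column, H is a 1 x n row, R is a scalar variance.
    """
    PHt = matmul(P_pred, transpose(H))               # n x 1
    S = matmul(H, PHt)[0][0] + R                     # H P H^T + R, a scalar here
    K = [[row[0] / S] for row in PHt]                # Kalman gain, eq. (7)
    innovation = z - matmul(H, x_pred)[0][0]         # z - H x_pred
    x_new = [[xi[0] + ki[0] * innovation] for xi, ki in zip(x_pred, K)]  # eq. (5)
    KHP = matmul(matmul(K, H), P_pred)
    P_new = [[p - d for p, d in zip(rp, rd)] for rp, rd in zip(P_pred, KHP)]  # eq. (6)
    return x_new, P_new, K


# Made-up example values.
F = [[1.0, 1.0], [0.0, 1.0]]
P = [[1.0, 0.0], [0.0, 1.0]]
Q = [[1.0, 0.0], [0.0, 1.0]]
H = [[1.0, 0.0]]                  # only position is measured
R = 1.0

P_pred = predict_covariance(P, F, Q)                 # [[3.0, 1.0], [1.0, 2.0]]
x_new, P_new, K = update([[10.0], [1.0]], P_pred, 12.0, H, R)
print(K)       # [[0.75], [0.25]]
print(x_new)   # [[11.5], [1.5]]
print(P_new)   # [[0.75, 0.25], [0.25, 1.75]]
```

The position variance grows from 1.0 to 3.0 in the prediction step and shrinks to 0.75 after the measurement is fused, which mirrors the widening and narrowing of the pdfs described in the text.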
[FIG3] Here, the prediction of the location of the train at time t = 1 and the level of uncertainty in that prediction is shown. The
confidence in the knowledge of the position of the train has decreased, as we are not certain if the train has undergone any
accelerations or decelerations in the intervening period from t = 0 to t = 1.
[FIG4] Shows the measurement of the location of the train at time t = 1 and the level of uncertainty in that noisy measurement,
represented by the blue Gaussian pdf. The combined knowledge of this system is provided by multiplying these two pdfs
together.
[FIG5] Shows the new pdf (green) generated by multiplying the pdfs associated with the prediction and measurement of the
train’s location at time t = 1. This new pdf provides the best estimate of the location of the train, by fusing the data from the
prediction and the measurement.
At t = 1, we also make a measurement of the location of the train using the radio positioning system, and this is represented by the blue Gaussian pdf in Figure 4. The best estimate we can make of the location of the train is provided by combining our knowledge from the prediction and the measurement. This is achieved by multiplying the two corresponding pdfs together. This is represented by the green pdf in Figure 5.

A key property of the Gaussian function is exploited at this point: the product of two Gaussian functions is another Gaussian function. This is critical as it permits an endless number of Gaussian pdfs to be multiplied over time, but the resulting function does not increase in complexity or

The measurement pdf represented by the blue Gaussian function in Figure 4 is given by

$$y_2(r; \mu_2, \sigma_2) \triangleq \frac{1}{\sqrt{2\pi\sigma_2^2}} \, e^{-\frac{(r - \mu_2)^2}{2\sigma_2^2}}. \tag{9}$$

The information provided by these two pdfs is fused by multiplying the two together, i.e., considering the prediction and the measurement together (see Figure 5). The new pdf representing the fusion of the

The quadratic terms in this new function can be expanded and then the whole expression rewritten in Gaussian form

$$y_{fused}(r; \mu_{fused}, \sigma_{fused}) = \frac{1}{\sqrt{2\pi\sigma_{fused}^2}} \, e^{-\frac{(r - \mu_{fused})^2}{2\sigma_{fused}^2}}, \tag{11}$$

where

$$\mu_{fused} = \frac{\mu_1 \sigma_2^2 + \mu_2 \sigma_1^2}{\sigma_1^2 + \sigma_2^2} = \mu_1 + \frac{\sigma_1^2 (\mu_2 - \mu_1)}{\sigma_1^2 + \sigma_2^2} \tag{12}$$

and

$$\sigma_{fused}^2 = \frac{\sigma_1^2 \sigma_2^2}{\sigma_1^2 + \sigma_2^2} = \sigma_1^2 - \frac{\sigma_1^4}{\sigma_1^2 + \sigma_2^2}.$$
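The closed-form fused mean and variance can be checked numerically. The sketch below is an illustration under stated assumptions: it takes $\mu_1, \sigma_1^2$ to be the mean and variance of the prediction pdf and $\mu_2, \sigma_2^2$ those of the measurement pdf, uses made-up values, and normalizes the pointwise product on a grid before computing its moments (the raw product of two pdfs is a scaled Gaussian). The function names are hypothetical.

```python
import math


def gaussian_pdf(r, mu, var):
    """Gaussian pdf with mean mu and variance var, evaluated at r."""
    return math.exp(-((r - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fuse(mu1, var1, mu2, var2):
    """Closed-form fused mean, as in (12), and fused variance."""
    mu_fused = mu1 + var1 * (mu2 - mu1) / (var1 + var2)
    var_fused = var1 - var1 ** 2 / (var1 + var2)
    return mu_fused, var_fused


def fuse_numerically(mu1, var1, mu2, var2, lo, hi, n=20001):
    """Multiply the two pdfs on a grid, normalize, and measure mean and variance."""
    step = (hi - lo) / (n - 1)
    grid = [lo + i * step for i in range(n)]
    product = [gaussian_pdf(r, mu1, var1) * gaussian_pdf(r, mu2, var2) for r in grid]
    area = sum(product) * step
    mean = sum(r * p for r, p in zip(grid, product)) * step / area
    var = sum((r - mean) ** 2 * p for r, p in zip(grid, product)) * step / area
    return mean, var


# Made-up values: prediction at 10 m (variance 4), measurement at 12 m (variance 1).
mu_f, var_f = fuse(10.0, 4.0, 12.0, 1.0)
mu_n, var_n = fuse_numerically(10.0, 4.0, 12.0, 1.0, lo=-10.0, hi=30.0)
print(mu_f, var_f)   # approximately 11.6 and 0.8
print(mu_n, var_n)   # numerical moments, expected to agree closely
```

The fused estimate lies between the prediction and the measurement, closer to the measurement because its variance is smaller, and the fused variance is smaller than either input variance.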
important to note however that in reality a function is usually required to map predictions and measurements into the same domain. In a more realistic extension to our example, the position of the train will be predicted directly as a new distance along the railway line in units of meters, but the time of flight measurements are recorded in units of seconds. To allow the prediction and measurement pdfs to be multiplied together, one must be converted into the domain of the other, and it is standard practice to map the predictions into the measurement domain via the transformation matrix $H_t$.

$$\mu_{fused} = \mu_1 + \left( \frac{\sigma_1^2 / c}{(\sigma_1 / c)^2 + \sigma_2^2} \right) \cdot \left( \mu_2 - \frac{\mu_1}{c} \right). \tag{16}$$

Substituting $H = 1/c$ and $K = (H\sigma_1^2) / (H^2 \sigma_1^2 + \sigma_2^2)$ results in

$$\mu_{fused} = \mu_1 + K \cdot (\mu_2 - H\mu_1). \tag{17}$$

Similarly the fused variance estimate becomes

$$
\begin{aligned}
\frac{\sigma_{fused}^2}{c^2} &= \left( \frac{\sigma_1}{c} \right)^2 - \frac{(\sigma_1 / c)^4}{(\sigma_1 / c)^2 + \sigma_2^2} \\
\Rightarrow \sigma_{fused}^2 &= \sigma_1^2 - \left( \frac{\sigma_1^2 / c}{(\sigma_1 / c)^2 + \sigma_2^2} \right) \frac{\sigma_1^2}{c} \\
&= \sigma_1^2 - K H \sigma_1^2.
\end{aligned} \tag{18}
$$

$K_t = P_{t|t-1} H_t^T (H_t P_{t|t-1} H_t^T + R_t)^{-1}$: the Kalman gain.

It is now easy to see how the standard Kalman filter equations relate to (17) and (18) derived above:

$$
\begin{aligned}
\mu_{fused} &= \mu_1 + \left( \frac{H\sigma_1^2}{H^2 \sigma_1^2 + \sigma_2^2} \right) \cdot (\mu_2 - H\mu_1) \\
\rightarrow \hat{x}_{t|t} &= \hat{x}_{t|t-1} + K_t (z_t - H_t \hat{x}_{t|t-1})
\end{aligned}
$$

$$
\begin{aligned}
\sigma_{fused}^2 &= \sigma_1^2 - \left( \frac{H\sigma_1^2}{H^2 \sigma_1^2 + \sigma_2^2} \right) H\sigma_1^2 \\
\rightarrow P_{t|t} &= P_{t|t-1} - K_t H_t P_{t|t-1}.
\end{aligned}
$$