
Introduction to Kalman Filter

Azadeh Hadadi


UBFC, Centre universitaire Condorcet
VIBOT program: Master in computer vision
Le Creusot, France

Abstract—This report presents the discrete Kalman filtering process in simple language and explains how the Kalman gain and the error covariance matrix are derived, knowing the process and measurement models. Several variants of the Kalman filter have already been presented in the literature, among the best known being the Extended Kalman Filter. Among them, the Kalman-Bucy extension was developed for continuous time and does not follow the standard process set up for the discrete Kalman filter. This model will be discussed as an exceptional case, and two problems will be solved based on the theory presented for the Kalman-Bucy filter.

Index Terms—Kalman Filter, Kalman Gain, Error Covariance Matrix, Kalman-Bucy Filter.
I. EXERCISE 3: SYNTHETIC KALMAN FILTER AND EQUATIONS

Kalman filtering [12] is an algorithm that provides estimates of some unknown variables given the measurements observed over time. An evident example is a train moving on a rail track: we want to estimate the train position knowing the engine force and receiving, from time to time, position measurements, while not knowing the force generated by wind or any other unknown random process.

In simple language, the Kalman filter is built on the two equations in (1); the first is called the process model and the second the measurement model:

\[ x_k = A x_{k-1} + B u_k + w_{k-1}, \qquad z_k = H x_k + v_k \tag{1} \]

where x_k, x_{k-1} and z_k are the signal values at times k and k-1 and the measurement at time k, respectively. Matrix A defines the gain of the system; B and H are the actuator transfer function and the sensor model, respectively. u_k is the system input at time k and is usually taken as zero, which does not affect the generality of the solution. The Kalman filter is a stochastic process in which any unknown quantity, such as noise or a random event, is modeled as Gaussian noise, denoted w_{k-1} and v_k in the equations.
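Equation (1) can be used directly to generate synthetic data for experiments like the ones below. Here is a minimal sketch in Python; the two-state example, its matrices, and the noise covariances are assumptions chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)            # assumed seed
A = np.array([[1.0, 1.0], [0.0, 1.0]])     # assumed position/velocity model
B = np.array([[0.5], [1.0]])
H = np.array([[1.0, 0.0]])                 # only the position is measured
Q = 1e-3 * np.eye(2)                       # assumed process noise covariance
R = 0.25 * np.eye(1)                       # assumed measurement noise covariance

x = np.zeros(2)
states, measurements = [], []
for k in range(100):                       # simulate the two equations of (1)
    u = np.array([0.1])                    # assumed constant input
    w = rng.multivariate_normal(np.zeros(2), Q)
    v = rng.multivariate_normal(np.zeros(1), R)
    x = A @ x + B @ u + w                  # process equation
    z = H @ x + v                          # measurement equation
    states.append(x)
    measurements.append(z)
```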
A. Kalman filtering process

After fitting a model into the Kalman filter (defining A, B and H), the next step is to determine the necessary parameters and initial conditions. The Kalman filter has two distinct sets of equations:

1) Prediction (time update):

\[ \hat{x}_k^- = A \hat{x}_{k-1} + B u_k, \qquad P_k^- = A P_{k-1} A^\top + Q \tag{2} \]

2) Correction (measurement update):

\[ K_k = P_k^- H^\top \left( H P_k^- H^\top + R \right)^{-1}, \qquad \hat{x}_k = \hat{x}_k^- + K_k \left( z_k - H \hat{x}_k^- \right), \qquad P_k = (I - K_k H) P_k^- \tag{3} \]

In equations (2) and (3), x̂_k^- is the prior estimate, which, in a way, is the rough estimate before the measurement-update correction. Q and R are the covariance matrices (Q = E[ww^T], R = E[vv^T]) of the Gaussian noises w_k and v_k, respectively, and P_k^- is the prior error covariance. The most important items in (1)-(3) are the Kalman gain K_k and the error covariance P_k; their formulation may differ depending on how equation (1) is defined. We will derive these equations later in more detail. Both are computed and updated at each step k.
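To make the prediction-correction cycle concrete, the following minimal sketch implements equations (2) and (3) with NumPy. The variable names mirror the notation above; the split into two functions is an illustrative choice, not part of the original text.

```python
import numpy as np

def predict(x, P, A, B, u, Q):
    """Time update, equation (2): propagate the state and error covariance."""
    x_prior = A @ x + B @ u
    P_prior = A @ P @ A.T + Q
    return x_prior, P_prior

def correct(x_prior, P_prior, z, H, R):
    """Measurement update, equation (3): gain, state and covariance correction."""
    S = H @ P_prior @ H.T + R                      # innovation covariance
    K = P_prior @ H.T @ np.linalg.inv(S)           # Kalman gain K_k
    x = x_prior + K @ (z - H @ x_prior)            # corrected state estimate
    P = (np.eye(len(x_prior)) - K @ H) @ P_prior   # corrected error covariance
    return x, P, K
```

Calling predict followed by correct once per step k reproduces the cycle described above.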
B. Initialization

To start the process, we need an estimate of x_0 as well as P_0, K, R and Q. Either we find the initial conditions or we assume some initial values for these parameters. R is rather simple to find because, in general, we are quite sure about the noise in the environment; finding Q is not so obvious. Usually, if we do not know a parameter, we assume either zero or one (or we make a clever guess knowing the physics of the system). For instance, at k = 0 we may take x_0 = 0 and P_0 = 1. We do not choose P_0 = 0, because this would mean there is no noise in the environment, and that assumption would force all subsequent estimates of x_k to be zero (remaining at the initial state). So we choose some non-zero value for P_0.

C. What does the Kalman filter really do?

Unlike the many complicated equations that appear in the literature, the Kalman filter performs a fairly simple function. If we consider the simple case A = B = H = 1, equation (3) simplifies to (4):

\[ K_k = P_k^- \left( P_k^- + R \right)^{-1}, \qquad \hat{x}_k = \hat{x}_k^- + K_k \left( z_k - \hat{x}_k^- \right), \qquad P_k = (I - K_k) P_k^- \tag{4} \]

With a little manipulation of the second equation in (4), we reach (5):

\[ \hat{x}_k = K_k z_k + (1 - K_k)\, \hat{x}_k^- \tag{5} \]

Simply put, the second equation in (4) says that when the error between the measurement z_k and the predicted value x̂_k^- is high, the model is generating a bad value at this step (the predicted value produced by the first equation in (2)). Therefore, of the two values, z_k should be selected, and consequently the Kalman gain K_k should increase (K_k ≈ 1.0, for instance 0.95) to give a higher weight to the measurement (as seen in equation (5)), which in turn means that (1 - K_k) decreases drastically ((1 - K_k) ≈ 0.0, for instance 0.05) to suppress the effect of the predicted value on the state estimate, and the other way around. As seen in (4), the Kalman process regulates K_k by adjusting the predicted error covariance P_k^-: when the prediction error is low, P_k^- and K_k tend to zero, while as the error grows, P_k^- increases rapidly and K_k tends to 1. In this way, the Kalman filter continually arbitrates between the two presented values, i.e., predicted and measured, and that is the reason why this process is called a filter.
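As an illustration of this gain behavior, here is a minimal scalar simulation of the case A = B = H = 1 (with u = 0), estimating a constant from noisy measurements. The true value, the noise intensities and the random seed are assumptions chosen only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)        # assumed seed, for reproducibility
x_true, R, Q = 1.0, 0.1, 1e-5         # assumed true state and noise levels
x_hat, P = 0.0, 1.0                   # initialization as in Section I-B

for k in range(50):
    z = x_true + rng.normal(0.0, np.sqrt(R))   # noisy measurement, eq. (1)
    x_prior, P_prior = x_hat, P + Q            # time update, eq. (2) with A = 1
    K = P_prior / (P_prior + R)                # scalar gain, eq. (4)
    x_hat = K * z + (1.0 - K) * x_prior        # weighted blend, eq. (5)
    P = (1.0 - K) * P_prior

print(f"estimate {x_hat:.3f}, gain {K:.3f}, covariance {P:.5f}")
```

Starting from the deliberately large P_0 = 1, the gain begins near 1 (trusting the measurements) and decays as P_k shrinks, exactly the regulation described above.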
We have talked a lot about P_k and K_k, but how do we derive equations (2) and (3), and notably P_k and K_k, for a given process and measurement model? This is discussed in detail in the following section.

II. KALMAN GAIN DERIVATION: K_k AND P_k

The error covariance P_k is defined by (6) [17]:

\[ P_k = \operatorname{cov}(x_k - \hat{x}_k) \tag{6} \]

Substituting the definition of x̂_k into (6), as shown before in (4), yields (7) and (8):

\[ P_k = \operatorname{cov}\big( x_k - (\hat{x}_k^- + K_k \tilde{y}_k) \big) \tag{7} \]

\[ P_k = \operatorname{cov}\big( x_k - (\hat{x}_k^- + K_k (z_k - H \hat{x}_k^-)) \big) \tag{8} \]

Substituting z_k into (8) leads to

\[ P_k = \operatorname{cov}\big( x_k - (\hat{x}_k^- + K_k (H x_k + v_k - H \hat{x}_k^-)) \big) \tag{9} \]

and by collecting the error vectors we get

\[ P_{k|k} = \operatorname{cov}\big( (I - K_k H)(x_k - \hat{x}_k^-) - K_k v_k \big) \tag{10} \]

Since the measurement error v_k is uncorrelated with the other terms, this becomes

\[ P_k = \operatorname{cov}\big( (I - K_k H)(x_k - \hat{x}_k^-) \big) + \operatorname{cov}(K_k v_k) \tag{11} \]

By the properties of vector covariance, (11) is written in the form of (12):

\[ P_k = (I - K_k H)\, \operatorname{cov}(x_k - \hat{x}_k^-)\, (I - K_k H)^\top + K_k\, \operatorname{cov}(v_k)\, K_k^\top \tag{12} \]

Using the definition of the predicted covariance P_k^- (a variant of (6)) and the definition of R in (12) yields (13):

\[ P_k = (I - K_k H)\, P_k^- (I - K_k H)^\top + K_k R K_k^\top \tag{13} \]

Equation (13) is an essential equation; from now on, the Kalman gain and other parameters will be extracted from it.

A. Derivation of K_k [17]

As discussed above, the goal of the Kalman filter is to minimize the a posteriori state estimation error, defined as (14):

\[ x_k - \hat{x}_k \tag{14} \]

We seek to minimize the mean-square error, defined as the expected value of the squared magnitude of this vector, E[‖x_k - x̂_k‖²]. This is equivalent to minimizing the trace of the a posteriori estimate covariance matrix P_k [16] [1] [13]:

\[ P_k = P_k^- - K_k H P_k^- - P_k^- H^\top K_k^\top + K_k \left( H P_k^- H^\top + R \right) K_k^\top = P_k^- - K_k H P_k^- - P_k^- H^\top K_k^\top + K_k S_k K_k^\top \tag{15} \]

where S_k = H P_k^- H^\top + R. The trace is minimized by taking the matrix derivative of (15) with respect to the gain matrix and setting it to zero. Using the gradient matrix rules and the symmetry of the matrices yields (16):

\[ \frac{\partial \operatorname{tr}(P_k)}{\partial K_k} = -2 \left( H P_k^- \right)^\top + 2 K_k S_k = 0 \tag{16} \]

Solving this for K_k yields the Kalman gain:

\[ K_k S_k = \left( H P_k^- \right)^\top = P_k^- H^\top \;\Rightarrow\; K_k = P_k^- H^\top S_k^{-1} \tag{17} \]

B. Derivation of P_k

The above gain, known as the optimal Kalman gain, is used to calculate the error covariance matrix P_k. Substituting (17) into (13) and manipulating the resulting equation yields (18), which expresses the error covariance matrix in terms of the optimal Kalman gain:

\[ P_k = (I - K_k H) P_k^- \tag{18} \]
k (18)
C. Variant of the Kalman filter: Kalman-Bucy or continuous Kalman

Several variants of the Kalman filter have been presented in the literature since the Kalman filter was developed, such as the Extended Kalman filter [7], the Unscented Kalman filter [10], the Hybrid Kalman filter, the Rauch-Tung-Striebel smoother [3], the Modified Bryson-Frazier smoother [10], the Minimum-variance smoother [6] and Frequency-weighted Kalman filters [7]. One of the important variants is known as the Kalman-Bucy filter [14].

The Kalman-Bucy filter (named after Richard Snowden Bucy) is a continuous-time version of the Kalman filter [5] [9]. It is based on the state-space model, and the process and measurement models are then described by (19):

\[ \frac{d}{dt} x(t) = A(t)\, x(t) + B(t)\, u(t) + w(t), \qquad z(t) = H(t)\, x(t) + v(t) \tag{19} \]

where Q(t) and R(t) represent the intensities of the two white noise terms w(t) and v(t), respectively. If we follow the steps detailed above (see sections II-A and II-B), the state estimate and error covariance shown in (20) are obtained:

\[ \frac{d}{dt} \hat{x}(t) = A(t)\, \hat{x}(t) + B(t)\, u(t) + K(t) \big( z(t) - H(t)\, \hat{x}(t) \big) \]
\[ \frac{d}{dt} P(t) = A(t)\, P(t) + P(t)\, A^\top(t) + Q(t) - K(t)\, R(t)\, K^\top(t) \tag{20} \]

where the Kalman gain is determined by (21):

\[ K(t) = P(t)\, H^\top(t)\, R^{-1}(t) \tag{21} \]

Note that in expression (21) for K(t), the covariance of the observation noise R(t) at the same time represents the covariance of the prediction error ỹ(t) = z(t) - H(t)x̂(t) [2]; unlike in the discrete Kalman filter model, these covariances coincide only in the continuous-time case [11]. The distinction between the prediction and update steps of discrete-time Kalman filtering does not exist in continuous time. The second differential equation, for the covariance, is an example of a Riccati equation [4]. More detail will be elaborated in section III.
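To see the covariance equation in (20)-(21) behave as a Riccati equation, a short forward-Euler integration suffices. The scalar system below is an assumption for illustration (it matches Exercise 1 with Q = R = 1, i.e. β = 1); P(t) settles at the positive root of the corresponding algebraic Riccati equation.

```python
import numpy as np

# Assumed scalar illustration: A = H = 1, Q = R = 1.
A, H, Q, R = 1.0, 1.0, 1.0, 1.0
P, dt = 0.0, 1e-3                        # P(0) = 0, small Euler step

for _ in range(20000):                   # integrate dP/dt over 20 time units
    K = P * H / R                        # continuous gain, eq. (21)
    dP = 2.0 * A * P + Q - K * R * K     # Riccati right-hand side, eq. (20)
    P += dt * dP

# The stationary value solves 2P + Q - P^2/R = 0, i.e. P = 1 + sqrt(2).
print(P, 1.0 + np.sqrt(2.0))             # both approximately 2.4142
```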
III. EXERCISE 1: UNSTABLE FIRST ORDER SYSTEM

A. Problem 1

Consider the unstable first-order system

\[ \dot{x}(t) = x(t) + u(t) + w_1(t), \qquad y(t) = x(t) + w_2(t) \tag{22} \]

The uncorrelated noise signals w_i(t) are white with intensities R_i. We want to investigate how the optimal Kalman filter depends on the noise parameters.

a. Show that the Kalman filter gain depends only on the ratio β = R_1/R_2.
b. Find the observer error dynamics, i.e., the dynamics of the estimation error x̃(t) = x(t) - x̂(t).
c. How do the error dynamics depend on the ratio β = R_1/R_2? Interpret the result for large β (process noise much larger than measurement noise) and for small β (measurement noise much larger than process noise).

B. Answer

a. As seen in section II-C, for this case the Kalman gain and error covariance are determined by (20) and (21). With the problem parameters A = C = 1, (20) reduces to (23):

\[ \frac{d}{dt} \hat{x}(t) = \hat{x}(t) + u(t) + K(t) \big( z(t) - \hat{x}(t) \big), \qquad \frac{d}{dt} P(t) = 2 P(t) + Q(t) - K(t)\, R(t)\, K^\top(t), \qquad K(t) = P(t)\, R^{-1}(t) \tag{23} \]

Substituting the third equation into the second in (23) and setting it to zero, as we did in section II-B [8], yields (24); for simplicity, the parameter t is omitted:

\[ 2P + Q - \frac{P^2}{R} = 0 \tag{24} \]

Substituting Q and R from the problem definition into (24) yields (25):

\[ 2P + R_1 - \frac{P^2}{R_2} = 0 \tag{25} \]

which has the positive solution P = R_2 + R_2 \sqrt{1 + R_1/R_2}. Thus, the Kalman filter gain is given by (26):

\[ K = \frac{1}{R_2} P = 1 + \sqrt{1 + \frac{R_1}{R_2}} = 1 + \sqrt{1 + \beta} \tag{26} \]

b. Substituting K and P into (23), the Kalman filter error dynamics are given by (27):

\[ \dot{\tilde{x}}(t) = (A - KC)\, \tilde{x}(t) + w_1(t) - K w_2(t) = -\sqrt{1 + \beta}\, \tilde{x}(t) + w_1(t) - \big( 1 + \sqrt{1 + \beta} \big)\, w_2(t) \tag{27} \]

c. From control theory [15] it is well known that for a given system a transfer function can be defined (as a fraction of output over input). The instantaneous response of the system is affected more by the poles (in the denominator of the transfer function) than by the zeros (in the numerator). The Kalman filter pole lies at -\sqrt{1 + β}. We can see that if β → ∞, the pole of the Kalman filter tends to -∞; hence the estimation error dynamics are fast, and the Kalman filter very much trusts the measurements. On the other hand, if β → 0, the Kalman filter pole tends to -1, that is, as fast as the process pole. In that case, the filter trusts the process model much more than the measurements.
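A quick numeric sweep over a few assumed values of β confirms that the gain (26) and the closed-loop pole in (27) depend only on the ratio R_1/R_2:

```python
import numpy as np

R2 = 1.0                                  # assumed; the results depend only on beta
for beta in (0.01, 1.0, 100.0):           # assumed sample ratios R1/R2
    R1 = beta * R2
    P = R2 + R2 * np.sqrt(1.0 + R1 / R2)  # positive root of eq. (25)
    K = P / R2                            # gain, eq. (26)
    pole = 1.0 - K                        # A - KC with A = C = 1, eq. (27)
    print(f"beta={beta:7.2f}  K={K:6.3f}  pole={pole:7.3f}")

# Small beta: pole near -1 (the filter trusts the model);
# large beta: pole far in the left half-plane (the filter trusts the data).
```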
IV. EXERCISE 2: KALMAN FILTER FOR SECOND ORDER SYSTEM

A. Problem

A Kalman filter should be designed for the second-order system

\[ \dot{x}(t) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} x(t) + \begin{pmatrix} 1 \\ 0 \end{pmatrix} u(t) + w_1(t), \qquad y(t) = \begin{pmatrix} 1 & 0 \end{pmatrix} x(t) + w_2(t) \tag{28} \]

where w_1 and w_2 are uncorrelated white noise processes with intensities R_1 = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} and R_2 = 1, respectively.

a. Calculate the minimum observer error covariance P and the optimal Kalman filter gain K.
b. Write down the resulting filter equations for x̂_1 and x̂_2.

B. Answer

This problem is indeed a numerical example of what has been discussed in section III; to this end, all parametric results, such as the Kalman gain and the error covariance, are replicated here.

a. With the problem parameters

\[ A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad C = \begin{pmatrix} 1 & 0 \end{pmatrix}, \quad R_1 = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}, \quad R_2 = 1, \quad P = \begin{pmatrix} p_1 & p_2 \\ p_2 & p_3 \end{pmatrix} \tag{29} \]

the stationary Riccati equation from (23), A P + P A^\top + R_1 - P C^\top R_2^{-1} C P = 0, leads to (30):

\[ -p_1^2 + 2 p_2 + 3 = 0, \qquad p_1 + p_3 - p_1 p_2 = 0, \qquad -p_2^2 + 2 p_2 + 3 = 0 \tag{30} \]

The positive solution of (30) is p_1 = p_2 = 3, p_3 = 6. The optimal P and K are thus determined by (31):

\[ P = \begin{pmatrix} 3 & 3 \\ 3 & 6 \end{pmatrix}, \qquad K = P C^\top = \begin{pmatrix} 3 \\ 3 \end{pmatrix} \tag{31} \]

b. As seen in section III, the Kalman filter is given by the differential equation dx̂/dt = (A - KC)x̂ + Bu + Ky. Inserting the problem data and the optimal K obtained in (31) gives (32):

\[ \frac{d\hat{x}_1}{dt} = -3 \hat{x}_1 + \hat{x}_2 + u + 3y, \qquad \frac{d\hat{x}_2}{dt} = -2 \hat{x}_1 + 3y \tag{32} \]
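These hand computations can be cross-checked with SciPy's continuous-time algebraic Riccati solver, applied through the standard filtering duality (passing A^T and C^T in place of A and B). This sketch is a verification aid, not part of the original derivation:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [1.0, 0.0]])
C = np.array([[1.0, 0.0]])
R1 = 3.0 * np.eye(2)                 # process noise intensity
R2 = np.array([[1.0]])               # measurement noise intensity

# Filtering ARE  A P + P A^T + R1 - P C^T R2^{-1} C P = 0,
# solved as the dual control ARE with arguments (A^T, C^T, R1, R2).
P = solve_continuous_are(A.T, C.T, R1, R2)
K = P @ C.T @ np.linalg.inv(R2)      # gain as in eq. (21)

print(np.round(P, 6))                # expected [[3, 3], [3, 6]], matching (31)
print(np.round(K, 6))                # expected [[3], [3]]
```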

REFERENCES
[1] Brian D. O. Anderson and John B. Moore. Optimal filtering. Englewood Cliffs, NJ: Prentice-Hall, 1979.
[2] Harry Asada. Lecture Notes No. 7, Continuous Kalman Filter, chapter
System Identification, Estimation, and Learning, pages 1–7. MIT
University press, March 1, 2006.
[3] Gerald J Bierman. Factorization methods for discrete sequential esti-
mation. Courier Corporation, 2006.
[4] Dario A Bini, Bruno Iannazzo, and Beatrice Meini. Numerical solution
of algebraic Riccati equations. SIAM, 2011.
[5] Richard S Bucy and Peter D Joseph. Filtering for stochastic processes
with applications to guidance, volume 326. American Mathematical
Soc., 2005.
[6] Garry A Einicke. Optimal and robust noncausal filter formulations. IEEE
Transactions on Signal Processing, 54(3):1069–1077, 2006.
[7] Garry A Einicke. Iterative frequency-weighted filtering and smoothing
procedures. IEEE Signal Processing Letters, 21(12):1467–1470, 2014.
[8] F. Haugen. State Estimation with Kalman Filter. Kompendium for Kyb,
June 2012.
[9] Andrew H Jazwinski. Stochastic processes and filtering theory. Courier
Corporation, 2007.
[10] Simon J Julier and Jeffrey K Uhlmann. New extension of the kalman
filter to nonlinear systems. In Signal processing, sensor fusion, and tar-
get recognition VI, volume 3068, pages 182–193. International Society
for Optics and Photonics, 1997.
[11] Thomas Kailath. An innovations approach to least-squares estimation–
part i: Linear filtering in additive white noise. IEEE transactions on
automatic control, 13(6):646–655, 1968.
[12] Tony Lacy. Tutorial in the Kalman Filter, chapter 12, pages 133–140. MIT Press, June 2012.
[13] Jingyang Lu and Ruixin Niu. False information injection attack on
dynamic state estimation in multi-sensor systems. In 17th International
Conference on Information Fusion (FUSION), pages 1–8. IEEE, 2014.
[14] Sanjoy K Mitter and Nigel J Newton. Information and entropy flow in
the kalman–bucy filter. Journal of Statistical Physics, 118(1-2):145–176,
2005.
[15] Emanuel Todorov. Optimal control theory. Bayesian brain: probabilistic
approaches to neural coding, pages 269–298, 2006.
[16] Jean Walrand and Antonis Dimakis. Random processes in systems: lecture notes (Department of Electrical Engineering and Computer Sciences). University of California, Berkeley, CA 94720, 2006.
[17] Wikipedia. Kalman filter, June 2020.
