
Automatica, Vol. 7, pp. 465-479. Pergamon Press, 1971. Printed in Great Britain.

Recursive Bayesian Estimation Using Gaussian Sums*


Estimation récurrente bayésienne utilisant les sommes gaussiennes
Rekursive Bayes-Schätzung unter Benutzung Gauss'scher Summen
Рекурсивная байесовская оценка с помощью гауссовских сумм

H. W. SORENSON† and D. L. ALSPACH‡

The approximation of the probability density p(x_k|Z_k) of the state of a noisy dynamic
system conditioned on available measurement data using a convex combination of
gaussian densities is proposed as a practical means for accomplishing non-linear filtering.

Summary -- The Bayesian recursion relations which describe the behavior of the a posteriori probability density function of the state of a time-discrete stochastic system conditioned on available measurement data cannot generally be solved in closed form when the system is either non-linear or non-gaussian. In this paper a density approximation involving convex combinations of gaussian density functions is introduced and proposed as a meaningful way of circumventing the difficulties encountered in evaluating these relations and in using the resulting densities to determine specific estimation policies. It is seen that as the number of terms in the gaussian sum increases without bound, the approximation converges uniformly to any density function in a large class. Further, any finite sum is itself a valid density function, unlike many other approximations that have been investigated. The problem of determining the a posteriori density and minimum variance estimates for linear systems with non-gaussian noise is treated using the gaussian sum approximation. This problem is considered because it can be dealt with in a relatively straightforward manner using the approximation but still contains most of the difficulties that one encounters in considering non-linear systems, since the a posteriori density is non-gaussian. After discussing the general problem from the point of view of applying gaussian sums, a numerical example is presented in which the actual statistics of the a posteriori density are compared with the values predicted by the gaussian sum and by the Kalman filter approximations.

I. INTRODUCTION

THE PROBLEM of estimating the state of a non-linear stochastic system from noisy measurement data is considered. This problem has been the subject of considerable research interest during the past few years, and JAZWINSKI [1] gives a thorough discussion of the subject. Although a great deal has been published on the subject, the basic objective of obtaining a solution that can be implemented in a straightforward manner for specific applications has not been satisfactorily realized. This is manifested by the fact that the Kalman filter equations [2, 3], which are valid only for linear, gaussian systems, continue to be widely used for non-linear, non-gaussian systems. Of course continued application has resulted in the development of ad hoc techniques [e.g. Refs. 4, 5] that have improved the performance of the Kalman filter and which give it some of the characteristics of non-linear filters.

Central to the non-linear estimation and stochastic control problems is the determination of the probability density function of the state conditioned on the available measurement data. If this a posteriori density function were known, an estimate of the state for any performance criterion could be determined. Unfortunately, although the manner in which the density evolves with time and additional measurement data can be described in terms of differential, or difference, equations [6-8], these relations are generally very difficult to solve either in closed form or numerically, so that it is usually impossible to determine the a posteriori density for specific applications. Because of this difficulty it is natural to investigate the possibility of approximating the density with some tractable form. It is to this approximation problem that this discussion is directed.

1.1. The general problem

The approximation that is discussed below is introduced as a means of dealing with the following system and filtering problem. Suppose that the state x evolves according to

    x_{k+1} = f_k(x_k, w_k)   (1)

* Received 24 August 1970; revised 18 December 1970. The original version of this paper was not presented at any IFAC meeting. It was recommended for publication in revised form by associate editor L. Meier.
† Department of the Aerospace and Mechanical Engineering Sciences, University of California at San Diego, La Jolla, California 92037, U.S.A.
‡ Department of Electrical Engineering, Colorado State University, Fort Collins, Colorado, U.S.A.


and that the behavior of the state is observed imperfectly through the measurement data

    z_k = h_k(x_k, v_k).   (2)

The w_k and v_k represent white noise sequences and are assumed to be mutually independent.

The basic problem that is considered is that of estimating the state x_k from the measurement data* Z_k for each k, that is, the filtering problem. Generally, one attempts to determine a "best" estimate by choosing the estimate to extremize some performance criterion. For example, the estimate could be selected to minimize the mean-square error. Regardless of the performance criterion, given the a posteriori density function p(x_k|Z_k), any type of estimate can be determined. Thus, the estimation problem can first be approached as the problem of determining the a posteriori density. This is generally referred to as the Bayesian approach [9].

1.2. The Bayesian approach

As has been demonstrated by the great interest in the Kalman filter, it is frequently desirable to determine the estimates recursively. That is, an estimate of the current state is updated as a function of a previous estimate and the most recent or new measurement data. In the Bayesian case, the a posteriori density can be determined recursively according to the following relations.

    p(x_k|Z_k) = p(x_k|Z_{k-1}) p(z_k|x_k) / p(z_k|Z_{k-1})   (3)

    p(x_k|Z_{k-1}) = ∫ p(x_{k-1}|Z_{k-1}) p(x_k|x_{k-1}) dx_{k-1}   (4)

where the normalizing constant p(z_k|Z_{k-1}) in equation (3) is given by

    p(z_k|Z_{k-1}) = ∫ p(x_k|Z_{k-1}) p(z_k|x_k) dx_k.   (5)

The initial density p(x_0|Z_0) is given by

    p(x_0|Z_0) = p(z_0|x_0) p(x_0) / p(z_0).   (6)

The density p(z_k|x_k) in equation (3) is determined by the a priori measurement noise density p(v_k) and the measurement equation (2). Similarly, the p(x_k|x_{k-1}) in equation (4) is determined by p(w_{k-1}) and equation (1). Knowledge of these densities and p(x_0) determines the p(x_k|Z_k) for all k. However, it is generally impossible to accomplish the integration indicated in equation (4) in closed form, so that the density cannot actually be determined for most applications. The principal exception occurs when the plant and measurement equations are linear and the initial state and the noise sequences are gaussian. Then, equations (3)-(6) can be evaluated and the a posteriori density p(x_k|Z_k) is gaussian for all k. The mean and covariances for this system are known as the Kalman filter equations.

When the system is non-linear and/or the a priori distributions are non-gaussian, two problems are encountered. First, the integration in equation (4) cannot be accomplished in closed form and, second, the moments are not easily obtained from equation (3). These problems lead to the investigation of density approximations for which the required operations can be accomplished in a straightforward manner. A particularly promising approximation is considered here. Emphasis in this discussion is on the approximation itself rather than the non-linear filtering problem, but the latter has been stated because it provides the motivation for considering an approximation. The approximation of probability density functions is discussed in section 2. The use of the gaussian sum approximation for a linear system with non-gaussian a priori density functions for the initial state and for the plant and measurement noise sequences is discussed in section 3.

* The upper case letter Z_k denotes the set (z_0, z_1, ..., z_k).
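For a scalar system the recursion (3)-(6) can be carried out numerically by representing each density on a grid of points and replacing the integrals by sums. The following sketch illustrates this; it is not from the paper, and the system, noise variances, grid limits and measurement values are illustrative assumptions.

```python
import numpy as np

# Hypothetical scalar system used only for illustration:
#   x_{k+1} = 0.9 x_k + w_k,   z_k = x_k + v_k
# with gaussian noises.  The point-mass (grid) filter below evaluates
# equations (3)-(5) by brute-force numerical integration.

def gauss(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

x = np.linspace(-10.0, 10.0, 1001)        # grid for the state
dx = x[1] - x[0]
prior = gauss(x, 0.0, 4.0)                # p(x_0), an assumption

def predict(post):
    # Equation (4): p(x_k|Z_{k-1}) = integral of p(x_{k-1}|Z_{k-1}) p(x_k|x_{k-1})
    trans = gauss(x[:, None], 0.9 * x[None, :], 1.0)   # p(x_k | x_{k-1})
    return trans @ post * dx

def update(pred, z):
    # Equations (3) and (5): multiply by the likelihood and renormalize
    lik = gauss(z, x, 0.5)                # p(z_k | x_k)
    unnorm = pred * lik
    return unnorm / (np.sum(unnorm) * dx)

post = update(prior, z=1.0)               # p(x_0|z_0), equation (6)
post = update(predict(post), z=1.3)       # one full cycle of (3)-(4)
print("posterior mean:", np.sum(x * post) * dx)
```

The matrix-vector product in `predict` is exactly the Riemann-sum version of the integral in (4); the cost grows with the square of the grid size, which is the practical difficulty the gaussian sum approximation is meant to avoid.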

suggested briefly by AOKI [11]. More recently, CAMERON [12] and LO [13] have assumed that the a priori density function for the initial state of linear systems with gaussian noise sequences has the form of a gaussian sum.

2.1. Theoretical foundations of the approximation

Consider a probability density function p which is assumed to have the following properties.

1. p is defined and continuous at all but a finite number of locations
2. ∫_{-∞}^{∞} p(x) dx = 1
3. p(x) ≥ 0 for all x

It is convenient although not necessary to consider only scalar random variables. The generalization to the vector case is not difficult but complicates the presentation and, it is felt, unnecessarily detracts from the basic ideas.

The problem of approximating p can be conveniently considered within the context of delta families of positive type [14]. Basically, these are families of functions which converge to a delta, or impulse, function as a parameter characterizing the family converges to a limit value. More precisely, let {δ_λ} be a family of functions on (-∞, ∞) which are integrable over every interval. This is called a delta family of positive type if the following conditions are satisfied.

(i) ∫_{-a}^{a} δ_λ(x) dx tends to one as λ tends to some limit value λ_0 for some a.
(ii) For every constant γ > 0, no matter how small, δ_λ tends to zero uniformly for γ ≤ |x| < ∞ as λ tends to λ_0.
(iii) δ_λ(x) ≥ 0 for all x and λ.

Using the delta families, the following result can be used for the approximation of a density function p.

Theorem 2.1. The sequence p_λ(x) which is formed by the convolution of δ_λ and p

    p_λ(x) = ∫_{-∞}^{∞} δ_λ(x - u) p(u) du   (7)

converges uniformly to p(x) on every interior subinterval of (-∞, ∞).

For a proof of this result, see KOREVAAR [14]. When p has a finite number of discontinuities, the theorem is still valid except at the points of discontinuity. It should be noted that essentially the same result is given by Theorem 2.1 in FELLER [15]. If {δ_λ} is required to satisfy the condition that

    ∫_{-∞}^{∞} δ_λ(x) dx = 1

it follows from equation (7) that p_λ is a probability density function for all λ.

It is basically the presence of the gaussian weighting function that has made the Edgeworth expansion attractive for use in the Bayesian recursion relations. The operations defined by equations (3)-(6) are simplified when the a priori densities are gaussian or closely related to the gaussian. Bearing this in mind, the following delta family is a natural choice for density approximations. Let

    δ_λ(x) ≜ N_λ(x) = (2πλ²)^{-1/2} exp[-x²/2λ²].   (8)

It is shown without difficulty that N_λ(x) forms a delta family of positive type as λ → 0. That is, as the variance tends to zero, the gaussian density tends to the delta function.

Using equations (7) and (8), the density approximation p_λ is written as

    p_λ(x) = ∫_{-∞}^{∞} p(u) N_λ(x - u) du.   (9)

It is this form that provides the basis for the gaussian sum approximation that is the subject of this discussion.

While equation (9) is an interesting result, it does not immediately provide the approximation that can be used for specific application. However, it is clear that p(u) N_λ(x - u) is integrable on (-∞, ∞) and is at least piecewise continuous. Thus, (9) can itself be approximated on any finite interval by a Riemann sum. In particular, consider an approximation of p_λ over some bounded interval (a, b) given by

    p_{n,λ}(x) = (1/k) Σ_{i=1}^{n} p(x_i) N_λ(x - x_i)[ξ_i - ξ_{i-1}]   (10)

where the interval (a, b) is divided into n subintervals by selecting points ξ_i such that

    a = ξ_0 < ξ_1 < ... < ξ_n = b.

In each subinterval, choose the point x_i such that

    p(x_i)[ξ_i - ξ_{i-1}] = ∫_{ξ_{i-1}}^{ξ_i} p(x) dx

which is possible by the mean-value theorem. The constant k is a normalizing constant equal to

    k = ∫_a^b p(x) dx



and insures that p_{n,λ} is a density function. Clearly, for (b - a) sufficiently large, k can be made arbitrarily close to 1. Note that it follows that

    (1/k) Σ_{i=1}^{n} p(x_i)[ξ_i - ξ_{i-1}] = 1   (11)

so that p_{n,λ} essentially is a convex combination of gaussian density functions N_λ. It is basically this form that will be used in all future discussion and which is referred to hereafter as the gaussian sum approximation. It is important to recognize that the p_{n,λ} that are formed in this manner are valid probability density functions for all n, λ.

2.2. Implementation of the gaussian sum approximation

The preceding discussion has indicated that a probability density function p that has a finite number of discontinuities can be approximated arbitrarily closely, outside of a region of arbitrarily small measure around each point of discontinuity, by the gaussian sum p_{n,λ} as defined in equation (10). These asymptotic properties are certainly necessary for the continued investigation of the gaussian sum. However, for practical purposes, it is desirable, in fact imperative, that p can be approximated to within an acceptable accuracy by a relatively small number of terms of the series. This requirement furnishes an additional facet to the problem that is considered in this section.

For the subsequent discussion, it is convenient to write the gaussian sum approximation as

    p_n(x) = Σ_{i=1}^{n} α_i N_{σ_i}(x - μ_i)   (12)

where

    Σ_{i=1}^{n} α_i = 1;  α_i ≥ 0 for all i.

The relation of equation (12) to equation (10) is obvious by inspection. Unlike equation (10), in which the variance λ² is common to every term of the gaussian sum, it has been assumed that the variance σ_i² can vary from one term to another. This has been done to obtain greater flexibility for approximations using a finite number of terms. Certainly, as the number of terms increases, it is necessary to require that the σ_i tend to become equal and vanish.

The problem of choosing the parameters α_i, μ_i, σ_i to obtain the "best" approximation p_n to some density function p can be considered. To define this more precisely, consider the L_k norm. The distance between p and p_n can be defined as

    ||p - p_n||_k = [∫_{-∞}^{∞} |p(x) - p_n(x)|^k dx]^{1/k}.   (13)

Thus, one can attempt to choose α_i, μ_i, σ_i (i = 1, 2, ..., n) so that the distance ||p - p_n|| is minimized. As the number of terms n increases and as the variance decreases to zero, the distance must vanish. However, for finite n and nonzero variance, it is reasonable to attempt to minimize the distance in a manner such as this. In doing this, the stochastic estimation problem has been recast at this point as a deterministic curve fitting problem.

There are other norms and problem formulations that could be considered. In many problems, it may be desirable to cause the approximation to match some of the moments, for example, the mean and variance, of the true density exactly. If this were a requirement, then one could consider the moments as constraints on the minimization problem and proceed appropriately. For example, if the mean associated with p is μ, then the constraint that p_n have mean value μ would be

    μ = Σ_{i=1}^{n} α_i μ_i.   (14)

Thus, equation (14) would be considered in addition to the constraints on the α_i stated after equation (12).

Studies related to the problem of approximating p with a small number of terms have been conducted for a large number of density functions. These investigations have indicated, not surprisingly, that densities which have discontinuities generally are more difficult to approximate than are continuous functions. The results for two density functions, the uniform and the gamma, are discussed below. The uniform density is discontinuous and is of interest from that point of view. The gamma function is nonzero only for positive values of x, so it is an example of a nonsymmetric density that extends over a semi-infinite range.

Consider the following uniform density function

    p(x) = 1/4  for -2 < x ≤ 2
         = 0    elsewhere.   (15)

This distribution has a mean value of zero and variance of 1.333.

Two different methods of fitting equation (15) have been considered. First, consider an approximation that is suggested directly by (10) and referred to subsequently as a Theorem Fit. The parameters of the approximation are chosen in the following general manner.

(1) Select the mean value μ_i of each gaussian so that the densities are equally spaced on (-2, 2). By an appropriate location of the densities the mean value constraint (14) can be satisfied immediately.

(2) The weighting factors α_i are set equal to 1/n so that Σ_{i=1}^{n} α_i = 1.

(3) The variance σ² of each gaussian is the same and is selected so that the L_1 distance between p and p_n is minimized.

This approximation procedure requires only a one-dimensional search to determine σ².

To investigate the accuracy and the convergence of the approximation, the number of terms in the sum was varied. Figure 1(a) shows the approximation when 6, 10, 20 and 49 terms were included. It is interesting to observe that the approximation retains the general character of the uniform density even for the six-term case. As should be expected, the largest errors appear in the vicinity of the discontinuities at ±2. These approximations exhibit an apparent oscillation about the true value that is not visually satisfying. This oscillation can be eliminated by using a slightly larger value for the variance of the gaussian terms, as is depicted in Fig. 1(b).

The second and fourth moments and the L_1 error are listed in Table 1 for these two sets of approximations. The individual terms were located symmetrically about zero so that the mean value of the gaussian sum agrees with that of the uniform density in all cases. Note that for the best fit the error in the variance is only 1.25 per cent when 20 terms are used. As should be expected, higher order

[Figure]
FIG. 1. Gaussian sum approximations of uniform density function.
(a) Best theorem fit -- 6, 10, 20 and 49 term approximations.
(b) Smoothed theorem fit -- 6, 10, 20 and 49 term approximations.
(c) L_2 search fit comparison.
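The three-step theorem fit above can be sketched directly. The code below is an illustrative reconstruction, not the authors' program; the grid resolution and the search range for σ are assumptions.

```python
import numpy as np

# Three-step "theorem fit" for the uniform density of equation (15):
# equally spaced means on (-2, 2), weights 1/n, and a one-dimensional
# search for the common variance minimizing the L1 distance.

def uniform_p(x):
    return np.where((x > -2.0) & (x <= 2.0), 0.25, 0.0)

def gauss(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def fit(n):
    means = np.linspace(-2.0, 2.0, n + 2)[1:-1]   # step 1: equally spaced, symmetric
    alpha = np.full(n, 1.0 / n)                   # step 2: equal weights
    grid = np.linspace(-5.0, 5.0, 4001)
    dx = grid[1] - grid[0]

    def l1_error(sigma):                          # step 3: L1 distance for trial sigma
        pn = alpha @ gauss(grid[None, :], means[:, None], sigma ** 2)
        return np.sum(np.abs(pn - uniform_p(grid))) * dx

    sigmas = np.linspace(0.05, 1.0, 200)          # crude one-dimensional search
    best = min(sigmas, key=l1_error)
    return means, alpha, float(best), float(l1_error(best))

means, alpha, sigma, err = fit(10)
print("best sigma:", sigma, " L1 error:", err)
```

Because the means are placed symmetrically about zero, the mean-value constraint (14) is satisfied automatically, as the text notes for Table 1.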

moments converge more slowly, since errors farthest away from the mean assume more importance. For example, the fourth central moment has an error of 5 per cent for the 20-term approximation. The errors in the moments of the smoothed fit are aggravated only slightly, although the L_1 error increases in a nontrivial manner.

TABLE 1. UNIFORM DENSITY APPROXIMATION

                          Variance   Fourth central moment   L_1 error
  True values             1.333      3.200                   --
  Best theorem fit
    6 terms               1.581      5.320                   0.2199
    10 terms              1.417      3.884                   0.1271
    20 terms              1.354      3.363                   0.0623
    49 terms              1.336      3.226                   0.0272
  Smoothed theorem fit
    6 terms               1.690      6.387                   0.2444
    10 terms              1.456      4.224                   0.1426
    20 terms              1.363      3.442                   0.0701
    49 terms              1.338      3.238                   0.0280
  L_2 search fit
    6 terms               1.419      3.626                   0.0968

As an alternative approach, the parameters α_i, μ_i, σ_i² were chosen to minimize the L_2 distance; this is hereafter referred to as an L_2 search fit. These results are summarized in Fig. 1(c) for n = 6. Included for comparison in this figure are the 10-term theorem fits from Figs. 1(a) and 1(b). Because of the number of terms involved, obtaining a search fit is significantly more difficult than obtaining a theorem fit for the same number of terms. Thus, the theorem fit may be more desirable from a practical standpoint. The moments and L_1 error for the search fit are also included in Table 1. These values are slightly better than the theorem fit involving ten terms. It is interesting in Fig. 1(c) to note the "spikes" that have appeared at the points of discontinuity. This appears to be analogous to the Gibbs phenomenon of Fourier series.

The second example that is discussed here is the gamma density function. It is defined as

    p(x) = 0              for x < 0
         = (x³/6) e^{-x}  for x ≥ 0.   (16)

The distribution has a mean value of 4 and second, third, and fourth central moments of 4, 8 and 72, respectively.

First, consider a theorem fit of this density in which the mean values are distributed uniformly on (0, 10). For the uniform density, the uniform placement was natural; for the gamma density it is not as appropriate. For example, for n = 6 or 10, it is seen in Fig. 2(a) that the approximating density is not as good, at least visually, as one might hope. The first four central moments are listed in Table 2. Clearly, the higher order moments contain large errors and even the mean value is incorrect, in contrast with the uniform density.

TABLE 2. GAMMA DENSITY APPROXIMATION

                            Mean   Variance   Third central   Fourth central   L_1 error
                                              moment          moment
  True values               4      4          8               72               --
  Theorem fit
    6 terms in (0, 10)      3.94   4.345      5.496           60.78            0.119
    10 terms in (0, 10)     3.94   3.861      4.941           47.84            0.053
    20 terms in (0, 10)     3.93   3.611      4.653           41.23            0.023
    20 terms in (0, 12)     4.00   4.206      7.477           71.41            0.042
  L_2 search fit
    1 term in (0, 10)       3.51   3.510      0               10.53            0.203
    2 terms in (0, 10)      3.82   3.427      3.04            34.96            0.078
    3 terms in (0, 10)      3.91   3.632      4.711           43.39            0.036
    4 terms in (0, 10)      3.95   3.744      5.682           49.57            0.018
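The moments reported in Tables 1 and 2 follow in closed form from the parameters of equation (12), using the standard central moments of a gaussian. A sketch (not code from the paper):

```python
import numpy as np

# Closed-form central moments of a gaussian sum
#   p_n(x) = sum_i alpha_i N_{sigma_i}(x - mu_i),
# the quantities compared against the true values in Tables 1 and 2.

def mixture_moments(alpha, mu, s2):
    """Mean and second to fourth central moments of a gaussian mixture."""
    alpha, mu, s2 = map(np.asarray, (alpha, mu, s2))
    m = np.sum(alpha * mu)                         # mixture mean, cf. (14)
    d = mu - m
    var = np.sum(alpha * (s2 + d ** 2))
    m3 = np.sum(alpha * (d ** 3 + 3.0 * d * s2))
    m4 = np.sum(alpha * (d ** 4 + 6.0 * d ** 2 * s2 + 3.0 * s2 ** 2))
    return float(m), float(var), float(m3), float(m4)

# A single gaussian term recovers (mu, sigma^2, 0, 3 sigma^4):
print(mixture_moments([1.0], [2.0], [4.0]))        # -> (2.0, 4.0, 0.0, 48.0)
```

Expressions of this kind make the moment constraints discussed after equation (14) easy to impose or to check once a candidate fit is in hand.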

[Figure]
FIG. 2. Gaussian sum approximations of gamma density function.
(a) Best theorem fit -- 6 and 10 term approximations.
(b) Twenty term approximations over different intervals, (0, 10) and (0, 12).
(c) L_2 search fit -- 3 and 4 term approximations.

Two different 20-term approximations are depicted in Fig. 2(b). In one, the mean values of the gaussian terms are selected in the interval (0, 10), whereas the second approximation is distributed in (0, 12). Note in the first case the gaussian sum tends to zero much more rapidly than the gamma function for x > 10. Thus, to improve the approximation and the moments it is necessary to increase the interval over which terms are placed, and the second curve indicates the influence of this change. The results of these two cases are included in Fig. 2 and indicate that increasing the interval over which the approximation is valid has significantly improved the moments.

The theorem fit provides a very simple method for obtaining an approximation. However, it is clear that better results could be obtained by choosing at least the mean values of the individual gaussian terms more carefully. Consider now some L_2 search fits. In Fig. 2(c), the search fits for three

and four terms are depicted and the moments are listed in Table 2. Clearly, values of the mean and variance appear to be converging to the true values of 4. Note also that the 4-term approximation is considerably better than the 10-term theorem fit. Thus, the search technique, while more difficult to carry out, points out the desirability of judicious placement of the gaussian terms in order to obtain the most suitable approximation.

3. LINEAR SYSTEMS WITH NONGAUSSIAN NOISE

It is envisioned that the gaussian sum approximation will be very useful in dealing with non-linear stochastic systems. However, many of the properties and concomitant difficulties of the approximation are exhibited by considering linear systems which are influenced by nongaussian noise. As is well known, the a posteriori density p(x_k|Z_k) is gaussian for all k when the system is linear and the initial state and plant and measurement noise sequences are gaussian. The mean and variance of the conditional density are described by the Kalman filter equations. When nongaussian distributions are ascribed to the initial state and/or noise sequences, p(x_k|Z_k) is no longer gaussian and it is generally impossible to determine p(x_k|Z_k) in a closed form. Furthermore, in the linear, gaussian problem, the conditional mean, i.e. the minimum variance estimate, is a linear function of the measurement data and the conditional variance is independent of the measurement data. These characteristics are generally not true for a system which is either non-linear or nongaussian.

For the following discussion, consider a scalar system whose state evolves according to

    x_k = Φ_{k,k-1} x_{k-1} + w_{k-1}   (17)

and whose behavior is observed through measurement data z_k described by

    z_k = H_k x_k + v_k.   (18)

Suppose that the density function describing the initial state has the form

    p(x_0) = Σ_{i=1}^{l_0} α_{i0} N_{σ_{i0}}(x_0 - μ_{i0}).   (19)

Assume that the plant and measurement noise sequences (i.e. {w_k} and {v_k}) are mutually independent, white noise sequences with density functions represented by

    p(w_k) = Σ_{i=1}^{s_k} β_{ik} N_{q_{ik}}(w_k - ω_{ik})   (20)

    p(v_k) = Σ_{i=1}^{m_k} γ_{ik} N_{r_{ik}}(v_k - ν_{ik}).   (21)

There are a variety of ways in which the gaussian sum approximation could be introduced. For example, it is natural to proceed in the manner that is to be discussed here, in which the a priori distributions are represented by gaussian sums as in equations (19), (20) and (21). This approach has the advantage that the approximation can be determined off-line and then used directly in the Bayesian recursion relations. An alternative approach would be to perform the approximation in more of an on-line procedure. Instead of approximating the a priori densities, one could deal with p(x_k|Z_k) in equation (3) and the integrand in (4) and derive approximations at each stage. This would be more direct but has the disadvantage that considerable computation may be required during the processing of data. Discussion of the implementation of this approach will not be attempted in this paper.

3.1. Determination of the a posteriori distributions

Suppose that the a priori density functions are given by equations (19), (20) and (21). In using these representations in the Bayesian recursion relations, it is useful to note the following properties of gaussian density functions.

Scholium 3.1. For a ≠ 0,

    N_σ(x - ay) = (1/a) N_{σ/a}(y - x/a).   (22)

Scholium 3.2.

    N_{σ_i}(x - μ_i) N_{σ_j}(x - μ_j) = N_{(σ_i² + σ_j²)^{1/2}}(μ_i - μ_j) N_{σ_ij}(x - μ_ij)   (23)

where

    μ_ij = (μ_i σ_j² + μ_j σ_i²)/(σ_i² + σ_j²)

    σ_ij² = σ_i² σ_j²/(σ_i² + σ_j²).

The proofs of (22) and (23) are omitted.

Armed with these two results, it is a simple matter to prove that the following descriptions of the filtering and prediction densities are true.

Theorem 3.1. Suppose that p(x_k|Z_{k-1}) is described by

    p(x_k|Z_{k-1}) = Σ_{i=1}^{l_k} α_{ik} N_{σ'_{ik}}(x_k - μ'_{ik}).   (24)

Then, p(x_k|Z_k) is given by

    p(x_k|Z_k) = Σ_{i=1}^{l_k} Σ_{j=1}^{m_k} c_{ij} N_{σ_{ij}}(x_k - ε_{ij})   (25)
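Scholium 3.2 can be checked numerically; the sketch below (with illustrative parameter values) compares the two sides of equation (23) pointwise.

```python
import numpy as np

# Numerical check of the product formula (23) used throughout section 3:
# the product of two gaussian densities in x is a gaussian in x scaled by
# a gaussian in the difference of the two means.

def gauss(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def product_sides(mi, s2i, mj, s2j, x):
    lhs = gauss(x, mi, s2i) * gauss(x, mj, s2j)
    # Combined mean and variance from equation (23)
    mij = (mi * s2j + mj * s2i) / (s2i + s2j)
    s2ij = s2i * s2j / (s2i + s2j)
    rhs = gauss(mi - mj, 0.0, s2i + s2j) * gauss(x, mij, s2ij)
    return lhs, rhs

x = np.linspace(-5.0, 5.0, 101)
lhs, rhs = product_sides(1.0, 0.5, -0.5, 2.0, x)
print("max discrepancy:", np.max(np.abs(lhs - rhs)))   # near machine precision
```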

where

    c_{ij} = α_{ik} γ_{jk} N_{λ_{ij}}(z_k - H_k μ'_{ik} - ν_{jk}) / [Σ_{l=1}^{l_k} Σ_{m=1}^{m_k} α_{lk} γ_{mk} N_{λ_{lm}}(z_k - H_k μ'_{lk} - ν_{mk})]

    λ_{ij}² = H_k² σ'_{ik}² + r_{jk}²

    ε_{ij} = μ'_{ik} + [σ'_{ik}² H_k/(σ'_{ik}² H_k² + r_{jk}²)][z_k - ν_{jk} - H_k μ'_{ik}]

    σ_{ij}² = σ'_{ik}² - σ'_{ik}⁴ H_k²/(σ'_{ik}² H_k² + r_{jk}²).

It is obvious that the c_{ij} > 0 and that

    Σ_{i=1}^{l_k} Σ_{j=1}^{m_k} c_{ij} = 1.

Thus, equation (25) is a gaussian sum, and for convenience one can rewrite it as

    p(x_k|Z_k) = Σ_{i=1}^{n_k} a_{ik} N_{ρ_{ik}}(x_k - μ_{ik})   (26)

where n_k = (l_k)(m_k) and the a_{ik}, ρ_{ik} and μ_{ik} are formed in an obvious fashion from the c_{ij}, σ_{ij} and ε_{ij}.

The proof of the theorem is straightforward. From the definition of the measurement relation (18) and the measurement noise density (21), one sees that

    p(z_k|x_k) = Σ_{j=1}^{m_k} γ_{jk} N_{r_{jk}}(z_k - H_k x_k - ν_{jk}).

Using this and (24) in (3) and applying the two scholiums, one obtains (25).

The prediction density is determined from (4) and leads to the following result.

Theorem 3.2. Assume that p(x_k|Z_k) is given by (26). Then for the linear system (17) and plant noise (20) the prediction density p(x_{k+1}|Z_k) is

    p(x_{k+1}|Z_k) = Σ_{i=1}^{n_k} Σ_{j=1}^{s_k} a_{ik} β_{jk} N_{λ_{ij}}(x_{k+1} - Φ_{k+1,k} μ_{ik} - ω_{jk})   (27)

where

    λ_{ij}² = Φ_{k+1,k}² ρ_{ik}² + q_{jk}².

It is convenient to redefine terms so that (27) can be written as

    p(x_{k+1}|Z_k) = Σ_{i=1}^{l_{k+1}} α_{i(k+1)} N_{σ'_{i(k+1)}}(x_{k+1} - μ'_{i(k+1)}).   (28)

Clearly, the definition of p(x_0) has the form (28), as does the p(x_k|Z_{k-1}) assumed in Theorem 3.1. Thus, it follows that the gaussian sums repeat themselves from one stage to the next and that (26) and (28) can be regarded as the general forms for an arbitrary stage. Thus, the gaussian sum can almost be regarded as a reproducing density [16]. It is important, however, to note that the number of terms in the gaussian sum increases at each stage, so that the density is not described by a fixed number of parameters. The density would be truly reproducing if only the initial state were non-gaussian; for example, see Ref. [12]. It is clear in this case that the number of terms in the gaussian sum remains equal to the number used to define p(x_0), so that p(x_k|Z_k) is described by a fixed number of parameters.

There are several aspects that require comment at this point. First, if the gaussian sums for the a priori densities all contain only one term, that is, they are gaussian, the Kalman filter equations and the gaussian a posteriori density are obtained. In fact the ε_{ij} and σ_{ij}² in (25) and the means and variances λ_{ij}² in (27) each represent the Kalman filter equations for the ijth density combination. Thus, the gaussian sum in a manner of speaking describes a combination of Kalman filters operating in concert. To examine this further, consider the first and second moments associated with the prediction and filtering densities.

Theorem 3.3.

    E[x_k|Z_k] ≜ x̂_{k/k} = Σ_{i=1}^{l_k} Σ_{j=1}^{m_k} c_{ij} ε_{ij}   (29)

    E[(x_k - x̂_{k/k})²|Z_k] ≜ p_{k/k}² = Σ_{i=1}^{l_k} Σ_{j=1}^{m_k} c_{ij}[σ_{ij}² + (x̂_{k/k} - ε_{ij})²]   (30)

    E[x_{k+1}|Z_k] ≜ x̂_{k+1/k} = Φ_{k+1,k} x̂_{k/k} + E[w_k]   (31)

    E[(x_{k+1} - x̂_{k+1/k})²|Z_k] ≜ p_{k+1/k}² = Φ_{k+1,k}² p_{k/k}² + E{[w_k - E(w_k)]²}   (32)

where

    E[w_k] = Σ_{i=1}^{s_k} β_{ik} ω_{ik}

    E{[w_k - E(w_k)]²} = Σ_{i=1}^{s_k} β_{ik}(q_{ik}² + ω_{ik}²) - E²(w_k).

In equation (29), the mean value x̂_{k/k} is formed as the convex combination of the mean values ε_{ij} of the individual terms, or Kalman filters, of the gaussian sum. It is important to recognize that the c_{ij}, as is apparent from equation (25), depend upon the measurement data. Thus, the conditional mean is a non-linear function of the current measurement data.
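Theorems 3.1-3.3 amount to running one scalar Kalman filter per pair of prior and noise terms and reweighting each pair by the likelihood of z_k. A compact sketch for the scalar system (17)-(18) follows; the two-term prior and noise parameters at the bottom are illustrative assumptions, not values from the paper.

```python
import numpy as np

# One cycle of the gaussian sum filter of Theorems 3.1-3.3.  Arrays are
# indexed so that entry (i, j) is the ij-th prior/noise term combination.

def gauss(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def update(alpha, mu, s2, gam, nu, r2, z, H):
    """Theorem 3.1: p(x_k|Z_k) from p(x_k|Z_{k-1}) and measurement z."""
    A, G = np.meshgrid(alpha, gam, indexing="ij")
    M, N = np.meshgrid(mu, nu, indexing="ij")
    S2, R2 = np.meshgrid(s2, r2, indexing="ij")
    lam2 = H * H * S2 + R2                  # lambda_ij^2
    c = A * G * gauss(z, H * M + N, lam2)   # unnormalized c_ij
    c = c / c.sum()                         # weights of equation (25)
    gain = S2 * H / lam2                    # per-pair Kalman gain
    eps = M + gain * (z - N - H * M)        # posterior means eps_ij
    sig2 = S2 - S2 ** 2 * H * H / lam2      # posterior variances sigma_ij^2
    return c.ravel(), eps.ravel(), sig2.ravel()

def predict(a, mu, s2, beta, omega, q2, phi):
    """Theorem 3.2: p(x_{k+1}|Z_k) for x_{k+1} = phi x_k + w_k."""
    A, B = np.meshgrid(a, beta, indexing="ij")
    M, W = np.meshgrid(mu, omega, indexing="ij")
    S2, Q2 = np.meshgrid(s2, q2, indexing="ij")
    return (A * B).ravel(), (phi * M + W).ravel(), (phi ** 2 * S2 + Q2).ravel()

def moments(a, mu, s2):
    """Theorem 3.3: conditional mean and variance of the gaussian sum."""
    m = float(np.sum(a * mu))
    return m, float(np.sum(a * (s2 + (mu - m) ** 2)))

# Two-term prior, two-term measurement noise, one-term plant noise
a, mu, s2 = update([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0],
                   [0.7, 0.3], [0.0, 2.0], [0.5, 0.5], z=0.8, H=1.0)
print("filtered mean/var :", moments(a, mu, s2))
a, mu, s2 = predict(a, mu, s2, [1.0], [0.0], [1.0], phi=0.9)
print("predicted mean/var:", moments(a, mu, s2))
```

Note how the weights `c` depend on z through the likelihood, so the resulting conditional mean is a non-linear function of the measurement, exactly as observed after equation (29).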

The conditional variance P_{k/k}^2 described by (30) is more than a convex combination of the variances of the individual terms because of the presence of the term (x̂_{k/k} - \epsilon_{ij})^2. This shows that the variance is increased by the presence of terms whose mean values differ significantly from the conditional mean x̂_{k/k}. The influence of these terms is tempered by the weighting factor c_{ij}. Note also that the conditional variance (in contrast to the linear Kalman filter) is a function of the measurement data because of the c_{ij} and the (x̂_{k/k} - \epsilon_{ij}).

The mean and variance of the prediction density are described in an obvious manner. If the gaussian sum is an approximation to the true noise density, these relations suggest the desirability of matching the first two moments exactly in order to obtain an accurate description of the conditional mean x̂_{k+1/k} and variance P_{k+1/k}.

As discussed earlier, it is convenient to assign the same variance to all terms of the gaussian sum. Thus, if the initial state and the measurement and noise sequences are identically distributed, it is reasonable to consider the variances for all terms to be identical and to determine the consequences of this assumption. Note from Scholium 3.2 that if

\sigma_i^2 = \sigma_j^2 = \sigma^2

then

\sigma_{ij}^2 = \sigma^2/2 for all i, j.

Thus, the variance remains the same for all terms in the gaussian sum when p(x_k|Z_k) is formed and the concentration as described by the variance becomes greater. Furthermore, from Scholium 3.2 it follows that the mean value is given by

\epsilon_{ij} = (a_i + a_j)/2

so the new mean value is the average of the previous means. This suggests the possibility that the mean values of some terms, since they are the average of two other terms, may become equal, or almost equal. If two terms of the sum have equal means and variances, they could be combined by adding their respective weighting factors. This would reduce the total number of terms in the gaussian sum.

The c_{ij} in (25) are essentially determined by a gaussian density. Thus, if z_k - \mu_{ij} becomes very large, then the c_{ij} may become sufficiently small that the entire term is negligible. If terms could be neglected, then the total number of terms in the gaussian sum could be reduced.

The prediction and filtering densities are represented at each stage by a gaussian sum. However, it has been seen that the sums have the characteristic that the number of terms increases at each stage as the product of the number of terms in the two constituent sums from which the densities are formed. This fact could seriously reduce the practicality of this approximation if there were no alleviating circumstances. The discussion above regarding the diminishing of the weighting factors and the combining of terms with nearly equal moments has introduced the mechanisms which significantly reduce the apparent ill effects caused by the increase in the number of terms in the sum. It is an observed fact that the mechanisms whereby terms can be neglected or combined are indeed operative and in fact can sometimes permit the number of terms in the series to be reduced by a substantial amount.

Since weighting factors for individual terms do not vanish identically, nor do the mean and variance of two or more gaussian densities become identical, it is necessary to establish criteria by which one can determine when terms are negligible or are approximately the same. This is accomplished by defining numerical thresholds which are prescribed to maintain the numerical error less than acceptable limits.

Consider the effects on the L1 error of neglecting terms with small weighting factors. Suppose that the density is

p(x) = \sum_{i=1}^{n} \alpha_i N_\sigma(x - a_i)   (33)

and such that \alpha_1, \alpha_2, …, \alpha_{m-1} (m < n) are less than some positive number \delta_1. Note that the variance has been assumed to be the same for each term. Consider replacing p by p_A where

p_A(x) = [1 / \sum_{i=m}^{n} \alpha_i] \sum_{i=m}^{n} \alpha_i N_\sigma(x - a_i).   (34)

The following bound is determined without difficulty.

Theorem 3.4.

\int_{-\infty}^{\infty} |p(x) - p_A(x)| dx \le 2 \sum_{i=1}^{m-1} \alpha_i   (35)

< 2(m - 1)\delta_1.   (36)

The L1 error caused by neglecting (m - 1) terms, each of which is less than \delta_1, is seen in (35) to be less than twice the sum of the neglected terms. Thus, the threshold \delta_1 can be selected by using (36) or (35) to keep the increased L1 error within acceptable limits.

Consider the situation in which the absolute value of the difference of the mean values of two terms is small. In particular, suppose that a_1 and a_2 are approximately the same and consider the L1 error that results if the p(x) given in (33) is replaced by

p_A(x) = \sum_{i=3}^{n} \alpha_i N_\sigma(x - a_i) + (\alpha_1 + \alpha_2) N_\sigma(x - ā)   (37)

where

ā = (\alpha_1 a_1 + \alpha_2 a_2)/(\alpha_1 + \alpha_2).

Using (37), one can prove the following bound. For a detailed proof of this and other results, see Ref. [17].

Theorem 3.5.

\int_{-\infty}^{\infty} |p(x) - p_A(x)| dx \le 4 \alpha_1 \alpha_2 M |a_2 - a_1| / (\alpha_1 + \alpha_2).   (38)

Thus terms can be combined if the right-hand side of (38) is less than some positive number \delta_2 which represents the allowable L1 error. The M in (38) is the maximum value of N_\sigma and is given by

M = 1/(\sigma \sqrt{2\pi}).

Observe that as the variance \sigma decreases, the distance between two terms to be combined must also decrease in order to retain the same error bound.

3.2. A numerical example

In this section the results presented in section 3.1 are applied to a specific example. To make (17) and (18) more specific, suppose that the system is

x_k = x_{k-1} + w_{k-1}   (39)
z_k = x_k + v_k   (40)

where the x_0, w_k, v_k (k = 0, 1, …) are assumed to be uniformly distributed on (-2, 2) as defined by (15).

The problem that is considered represents something of a worst case for the approximation because the initial state and the noise sequences are assumed to be uniformly distributed with the density discussed in section 2.2. As discussed there, the discontinuities at ±2 make it difficult to fit this density and necessitate the use of many terms in the gaussian sum. The specific approximation used here contains 10 terms and is shown in Fig. 1(b). It is apparent that this approximation has nontrivial errors in the neighborhood of the discontinuities but nonetheless retains the basic character of the uniform distribution.

The performance of the gaussian sum approximation for this example is described below by comparing the conditional mean and variance provided by the approximation with that predicted by the Kalman filter and with the statistics obtained by considering the true uniform distribution. The latter have been determined for this example after a nontrivial amount of numerical computation. In addition, the true a posteriori density p(x_k|Z_k) has been computed and is compared with that obtained using the approximation.

The Kalman error variance is independent of the measurement sequence and can cause misleading filter response. For example, for some measurement sequences perfect knowledge of the state is possible, that is, the variance is zero, but the Kalman variance still predicts a large uncertainty. For example, suppose that the measurements at each stage are equal to

z_k = 2k + 4, k = 0, 1, …

Then, the minimum mean-square estimate for the state based on the uniform distribution is

x̂_{k/k} = 2k + 2, k = 0, 1, …

and the variance of this estimate is

P_{k/k}^2 = 0 for all k.

Thus, for this measurement realization the minimum mean-square estimate is error-free. Since the Kalman variance is independent of the measurements, it is necessarily a poor approximation of the actual conditional variance. The square root of the Kalman variance \sigma_KAL and the error in the best linear estimate, together with the square root of the variance \sigma_GS and the error in the estimate of the state for the ten-term gaussian sum, are shown for this measurement realization in Fig. 3(a). This shows that a considerable improvement in both the mean and variance is provided by the gaussian sum when compared with the results provided by the Kalman filter.

FIG. 3. Gaussian sum and Kalman filters compared with best non-linear filter. (a) Perfect knowledge example. (b) Random noise example.

The measurement sequence described above is highly improbable. A more representative case is depicted in Fig. 3(b), in which the plant and measurement noise sequences were chosen using a random number generator for the uniform distribution prescribed above. This figure again shows that the gaussian sum approximation to the true variance is considerably improved over the Kalman estimate and, in fact, agrees very closely with the true standard deviation \sigma_TRUE of the a posteriori density. However, the Kalman variance is representative enough that the error in the estimate of the mean is not particularly different from that provided by the gaussian sum.

The a posteriori density function for the third and fourteenth stages is shown in Figs. 4(a) and 4(b). In these figures, the actual density, the gaussian approximation provided by the Kalman filter and the gaussian sum approximation are all included. At stage 3, the true variance is smaller than the Kalman variance and the Kalman mean has a significant error, whereas at stage 14 the true variance is larger than that predicted by the Kalman filter. Note that the error in the a priori density approximation at the discontinuities is still evident in the a posteriori approximation, but that the general character of the density is reproduced by the gaussian sum.

FIG. 4. Influence of \delta_1, \delta_2 on a posteriori density. (Curves: true probability density; Kalman approximation; gaussian sum approximation.)
(a) Third stage: \delta_1 = \delta_2 = 0.001.
(b) Fourteenth stage: \delta_1 = \delta_2 = 0.001.
(c) Third stage: \delta_1 = 0.001, \delta_2 = 0.005 and \delta_1 = 0.005, \delta_2 = 0.001.
(d) Fourteenth stage: \delta_1 = 0.005, \delta_2 = 0.001 and \delta_1 = 0.001, \delta_2 = 0.005.

In Figs. 4(a) and 4(b) the approximations at each stage are based on terms being eliminated when their weighting factors are less than \delta_1 = 0.001 and combined when the difference in mean values causes the L1 error to be less than \delta_2 = 0.001. The effect of changing these parameters, \delta_1 and \delta_2, can be seen in Figs. 4(c) and 4(d). In one case \delta_1 is increased from 0.001 to 0.005 while keeping \delta_2 equal to 0.001. Alternatively, \delta_2 is increased to 0.005 while the value of \delta_1 is maintained equal to 0.001. It is apparent in this example that \delta_2 has the larger effect on the density approximation. The increase in this parameter can be seen to introduce a "ripple" into the density function and indicates that the individual terms have become too widely separated to provide a smooth approximation. It is interesting that the effect appears to be cumulative, as the ripple is not apparent after three stages but is quite marked at the 14th stage.

The effect of improving the accuracy of the density approximation by including more terms in the a priori representations and by retaining more terms for the a posteriori representation can be seen in Fig. 5. Twenty terms are included in the a priori densities and \delta_1 and \delta_2 are reduced to 0.0001. The figure presents the actual a posteriori density, the gaussian sum approximation, and the gaussian approximation provided by the Kalman filter equations. Comparison of the results for stages 3 and 14 with Fig. 4 indicates the improvements in the approximation that have occurred.

4. CONCLUSIONS

The approximation of density functions by a sum of gaussian densities has been discussed as a reasonable framework within which estimation policies for non-linear and/or nongaussian stochastic systems can be established. It has been shown that a probability density function can be approximated arbitrarily closely, except at discontinuities, by such a gaussian sum. In contrast with the Edgeworth or Gram-Charlier expansions that have been investigated earlier, this approximation has the advantage of converging to a broader class of density functions. Furthermore, any finite sum of these terms is itself a valid density function.

The gaussian sum approximation is a departure from more classical approximation techniques because the sum is restricted to be positive for all possible values of the independent variable. As a result, the series is not orthogonalizable, so the manner in which parameters appearing in the sum are chosen is not obvious. Two numerical procedures are discussed in which certain parameters are chosen to satisfy constraints or are somewhat arbitrarily selected, and others are chosen to minimize the approximation error.

It is anticipated that the gaussian sum approximation will find its greatest application in developing estimation policies for non-linear stochastic systems. However, many of the characteristics exhibited by non-linear systems, and some of the difficulties in using the gaussian sum in these cases, are exhibited by treating linear systems with nongaussian noise
FIG. 5. A posteriori density for sixteen stages. (Curves: true probability density; Kalman approximation; gaussian sum approximation.)

sources. Of course, when the noise is entirely gaussian the problem degenerates and the familiar Kalman filter equations are obtained as the exact solution of the problem.

The linear, nongaussian estimation problem is discussed and it is shown that, if the a priori density functions are represented as gaussian sums, then the number of terms required to describe the a posteriori density is equal to the product of the number of terms of the a priori densities used to form it. The apparent disadvantage is seen to cause little difficulty, however, because the moments of many individual terms converge to common values, which allows them to be combined. Further, the weighting factors associated with many other terms become very small and permit those terms to be neglected without introducing significant error to the approximation.

Numerical results for a specific system are presented which provide a demonstration of some of the effects discussed in the text. These results confirm dramatically that the gaussian sum approximation can provide considerable insight into problems that hitherto have been intractable to analysis.

REFERENCES

[1] A. H. JAZWINSKI: Stochastic Processes and Filtering Theory. Academic Press, New York (1970).
[2] R. E. KALMAN: A new approach to linear filtering and prediction problems. J. bas. Engng 82D, 35-45 (1960).
[3] H. W. SORENSON: Advances in Control Systems, Vol. 3, Ch. 5. Academic Press, New York (1966).
[4] R. COSAERT and E. GOTTZEIN: A decoupled shifting memory filter method for radio tracking of space vehicles. 18th International Astronautical Congress, Belgrade, Yugoslavia (1967).
[5] A. JAZWINSKI: Adaptive filtering. Automatica 5, 475-485 (1969).
[6] Y. C. HO and R. C. K. LEE: A Bayesian approach to problems in stochastic estimation and control. IEEE Trans. Aut. Control 9, 333-339 (1964).
[7] H. J. KUSHNER: On the differential equations satisfied by conditional probability densities of Markov processes. SIAM J. Control 2, 106-119 (1964).
[8] J. R. FISHER and E. B. STEAR: Optimal non-linear filtering for independent increment processes, Parts I and II. IEEE Trans. Inform. Theory 3, 558-578 (1967).
[9] M. AOKI: Optimization of Stochastic Systems, Topics in Discrete-Time Systems. Academic Press, New York (1967).
[10] H. W. SORENSON and A. R. STUBBERUD: Nonlinear filtering by approximation of the a posteriori density. Int. J. Control 8, 33-51 (1968).
[11] M. AOKI: Optimal Bayesian and min-max control of a class of stochastic and adaptive dynamic systems. Proceedings IFAC Symposium on Systems Engineering for Control System Design, Tokyo, pp. 77-84 (1965).
[12] A. V. CAMERON: Control and estimation of linear systems with nongaussian a priori distributions. Proceedings of the Third Annual Conference on Circuit and System Science (1969).
[13] J. T. LO: Finite dimensional sensor orbits and optimal nonlinear filtering. University of Southern California, Report USCAE 114 (August 1969).
[14] J. KOREVAAR: Mathematical Methods, Vol. 1, pp. 330-333. Academic Press, New York (1968).
[15] W. FELLER: An Introduction to Probability Theory and Its Applications, Vol. II, p. 249. John Wiley, New York (1966).
[16] J. D. SPRAGINS: Reproducing distributions for machine learning. Stanford Electronics Laboratories, Technical Report No. 6103-7 (November 1963).
[17] D. L. ALSPACH: A Bayesian approximation technique for estimation and control of time-discrete stochastic systems. Ph.D. Dissertation, University of California, San Diego (1970).