
Appendix A: Matrix Inversion Lemma and the Schur Complement

Below are some useful properties of block matrices and how they are used in recursive filtering. First we look at the problem of computing the inverse of a matrix in terms of its sub-matrices. This derivation leads to the matrix inversion lemma, and makes use of an operation called the Schur complement, which for normal distributions in canonical form turns out to be equivalent to marginalization. The matrix inversion lemma and marginalization via the Schur complement are useful for understanding recursive linear estimation theory. For instance, deriving the Kalman filter from Bayes' rule is greatly simplified by reference to the matrix inversion lemma, and the time step in state estimation is an application of marginalization.

Let us say $M$ is a large square matrix
$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$
that we wish to invert, and we know that $A$ and $D$ are square and invertible. The first thing is to notice the two following simple matrix multiplications that allow us to triangularize $M$. First, the following left multiplication creates an upper-right triangular system,
$$\begin{bmatrix} I & 0 \\ -CA^{-1} & I \end{bmatrix} \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} A & B \\ 0 & \Delta_A \end{bmatrix},$$
and the following right multiplication creates a lower-left triangular system,
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} I & -A^{-1}B \\ 0 & I \end{bmatrix} = \begin{bmatrix} A & 0 \\ C & \Delta_A \end{bmatrix}.$$
The term $\Delta_A = D - CA^{-1}B$ is called the Schur complement of $A$ in $M$. Similarly, we can complement $D$ instead of $A$,
$$\begin{bmatrix} I & -BD^{-1} \\ 0 & I \end{bmatrix} \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} \Delta_D & 0 \\ C & D \end{bmatrix},$$
and
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} I & 0 \\ -D^{-1}C & I \end{bmatrix} = \begin{bmatrix} \Delta_D & B \\ 0 & D \end{bmatrix},$$
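The triangularization step can be checked numerically. The following is a minimal NumPy sketch (the block names $A, B, C, D$ follow the text; the particular random 2×2 blocks, shifted by $2I$ so they are safely invertible, are an illustrative assumption):

```python
import numpy as np

# Build a block matrix M = [[A, B], [C, D]] with invertible A and D.
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2)) + 2 * np.eye(2)
B = rng.standard_normal((2, 2))
C = rng.standard_normal((2, 2))
D = rng.standard_normal((2, 2)) + 2 * np.eye(2)
M = np.block([[A, B], [C, D]])

I, Z = np.eye(2), np.zeros((2, 2))
Ainv = np.linalg.inv(A)

# Left multiplication triangularizes M: the lower-left block becomes zero,
# and the lower-right block becomes the Schur complement of A in M.
L = np.block([[I, Z], [-C @ Ainv, I]])
T = L @ M
schur_A = D - C @ Ainv @ B
assert np.allclose(T[2:, :2], 0)        # lower-left block is zero
assert np.allclose(T[2:, 2:], schur_A)  # lower-right block is Delta_A
```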

where $\Delta_D = A - BD^{-1}C$ is the Schur complement of $D$ in $M$. Combining the above gives two different ways to block diagonalize $M$:
$$\begin{bmatrix} I & 0 \\ -CA^{-1} & I \end{bmatrix} \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} I & -A^{-1}B \\ 0 & I \end{bmatrix} = \begin{bmatrix} A & 0 \\ 0 & \Delta_A \end{bmatrix},$$
and
$$\begin{bmatrix} I & -BD^{-1} \\ 0 & I \end{bmatrix} \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} I & 0 \\ -D^{-1}C & I \end{bmatrix} = \begin{bmatrix} \Delta_D & 0 \\ 0 & D \end{bmatrix}.$$
Using these we can re-express the original matrix $M$ in terms of a lower-left block triangular component, a block diagonal component, and an upper-right block triangular component. That is,
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} I & 0 \\ CA^{-1} & I \end{bmatrix} \begin{bmatrix} A & 0 \\ 0 & \Delta_A \end{bmatrix} \begin{bmatrix} I & A^{-1}B \\ 0 & I \end{bmatrix} = \begin{bmatrix} I & BD^{-1} \\ 0 & I \end{bmatrix} \begin{bmatrix} \Delta_D & 0 \\ 0 & D \end{bmatrix} \begin{bmatrix} I & 0 \\ D^{-1}C & I \end{bmatrix},$$
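Both triangular-diagonal-triangular factorizations can be verified numerically. A NumPy sketch (block names as in the text; the random, diagonally shifted 2×2 blocks are an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 2)) + 2 * np.eye(2)
B = rng.standard_normal((2, 2))
C = rng.standard_normal((2, 2))
D = rng.standard_normal((2, 2)) + 2 * np.eye(2)
M = np.block([[A, B], [C, D]])

I, Z = np.eye(2), np.zeros((2, 2))
Ainv, Dinv = np.linalg.inv(A), np.linalg.inv(D)
schur_A = D - C @ Ainv @ B  # Schur complement of A in M
schur_D = A - B @ Dinv @ C  # Schur complement of D in M

# Lower-triangular * block-diagonal * upper-triangular reconstructions of M.
M1 = (np.block([[I, Z], [C @ Ainv, I]])
      @ np.block([[A, Z], [Z, schur_A]])
      @ np.block([[I, Ainv @ B], [Z, I]]))
M2 = (np.block([[I, B @ Dinv], [Z, I]])
      @ np.block([[schur_D, Z], [Z, D]])
      @ np.block([[I, Z], [Dinv @ C, I]]))
assert np.allclose(M1, M)
assert np.allclose(M2, M)
```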

which greatly simplifies computing the inverse, since the middle term is block diagonal. For instance,
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} = \begin{bmatrix} I & -A^{-1}B \\ 0 & I \end{bmatrix} \begin{bmatrix} A^{-1} & 0 \\ 0 & \Delta_A^{-1} \end{bmatrix} \begin{bmatrix} I & 0 \\ -CA^{-1} & I \end{bmatrix} = \begin{bmatrix} A^{-1} + A^{-1}B\Delta_A^{-1}CA^{-1} & -A^{-1}B\Delta_A^{-1} \\ -\Delta_A^{-1}CA^{-1} & \Delta_A^{-1} \end{bmatrix}, \tag{1}$$
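Formula (1) can be verified against a direct numerical inverse. A NumPy sketch (random invertible blocks as an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((2, 2)) + 2 * np.eye(2)
B = rng.standard_normal((2, 2))
C = rng.standard_normal((2, 2))
D = rng.standard_normal((2, 2)) + 2 * np.eye(2)
M = np.block([[A, B], [C, D]])

Ainv = np.linalg.inv(A)
sAinv = np.linalg.inv(D - C @ Ainv @ B)  # inverse of the Schur complement of A

# Blockwise inverse of M via the Schur complement of A, as in (1).
Minv = np.block([
    [Ainv + Ainv @ B @ sAinv @ C @ Ainv, -Ainv @ B @ sAinv],
    [-sAinv @ C @ Ainv,                   sAinv],
])
assert np.allclose(Minv, np.linalg.inv(M))
```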

and equivalently,
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} = \begin{bmatrix} I & 0 \\ -D^{-1}C & I \end{bmatrix} \begin{bmatrix} \Delta_D^{-1} & 0 \\ 0 & D^{-1} \end{bmatrix} \begin{bmatrix} I & -BD^{-1} \\ 0 & I \end{bmatrix} = \begin{bmatrix} \Delta_D^{-1} & -\Delta_D^{-1}BD^{-1} \\ -D^{-1}C\Delta_D^{-1} & D^{-1} + D^{-1}C\Delta_D^{-1}BD^{-1} \end{bmatrix}. \tag{2}$$
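The symmetric $D$-side formula (2) can be checked the same way. A NumPy sketch (random invertible blocks as an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((2, 2)) + 2 * np.eye(2)
B = rng.standard_normal((2, 2))
C = rng.standard_normal((2, 2))
D = rng.standard_normal((2, 2)) + 2 * np.eye(2)
M = np.block([[A, B], [C, D]])

Dinv = np.linalg.inv(D)
sDinv = np.linalg.inv(A - B @ Dinv @ C)  # inverse of the Schur complement of D

# Blockwise inverse of M via the Schur complement of D, as in (2).
Minv = np.block([
    [sDinv,             -sDinv @ B @ Dinv],
    [-Dinv @ C @ sDinv,  Dinv + Dinv @ C @ sDinv @ B @ Dinv],
])
assert np.allclose(Minv, np.linalg.inv(M))
```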

Equating various terms of (1) and (2) yields the different forms of the matrix inversion lemma, one of which is
$$(A - BD^{-1}C)^{-1} = A^{-1} + A^{-1}B(D - CA^{-1}B)^{-1}CA^{-1}.$$
This lemma is one of the primary tricks used to manipulate systems of equations that appear in least-squares methods. For instance, it is used to move between the Gauss-Newton method and the Kalman filter. The Schur complement, when applied to an inverse covariance or information matrix, is equivalent to marginalizing a normal distribution.
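Both the lemma and the marginalization claim can be checked numerically. A NumPy sketch (the random blocks and the particular 2+2 partition into $x_1$/$x_2$ are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
# Lemma: (A - B D^{-1} C)^{-1} = A^{-1} + A^{-1} B (D - C A^{-1} B)^{-1} C A^{-1}.
A = rng.standard_normal((2, 2)) + 2 * np.eye(2)
B = rng.standard_normal((2, 2))
C = rng.standard_normal((2, 2))
D = rng.standard_normal((2, 2)) + 2 * np.eye(2)
Ainv, Dinv = np.linalg.inv(A), np.linalg.inv(D)
lhs = np.linalg.inv(A - B @ Dinv @ C)
rhs = Ainv + Ainv @ B @ np.linalg.inv(D - C @ Ainv @ B) @ C @ Ainv
assert np.allclose(lhs, rhs)

# Marginalization: for a Gaussian with covariance S partitioned over (x1, x2),
# the information matrix of the marginal over x1 equals the Schur complement
# of the x2 block in the full information matrix L = S^{-1}.
G = rng.standard_normal((4, 4))
S = G @ G.T + 4 * np.eye(4)  # a positive-definite covariance matrix
L = np.linalg.inv(S)
marg_info = L[:2, :2] - L[:2, 2:] @ np.linalg.inv(L[2:, 2:]) @ L[2:, :2]
assert np.allclose(marg_info, np.linalg.inv(S[:2, :2]))
```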
