
LECTURE 8-9: THE BAKER-CAMPBELL-HAUSDORFF FORMULA

1. Taylor’s expansion on Lie groups


As we have seen,
[X, Y ] = adX(Y ).
So if G is an abelian group, then c(g) : G → G is the identity map for all g ∈ G.
As a consequence, ad(X) ≡ 0. It follows that the Lie algebra of an abelian Lie group
is also abelian, i.e. [X, Y ] = 0 for all X, Y ∈ g. Conversely, one can prove (using
proposition 3.1 in lecture 6) that if G is connected and g is abelian, then G is also
abelian. In other words, the Lie bracket operation on g measures the non-commutativity
of the multiplication operation on G. In what follows we would like to characterize
this quantitatively, that is, to find the difference between
exp(X) exp(Y ) and exp(X + Y ) for a general Lie group.
Let G be a Lie group and X ∈ g a left invariant vector field on G. Then

\[
(Xf)(a) = X_a f = (dL_a X_e) f = X_e (f \circ L_a) = \left.\frac{d}{dt}\right|_{t=0} f(a \exp(tX))
\]
for any f ∈ C ∞ (G) and any a ∈ G. More generally, for any t ∈ R,

\[
(Xf)(a\exp(tX)) = \left.\frac{d}{ds}\right|_{s=0} f(a\exp(tX)\exp(sX)) = \left.\frac{d}{ds}\right|_{s=0} f(a\exp((t+s)X)) = \frac{d}{dt} f(a\exp(tX)).
\]
Using this and induction, one can see that for any k ≥ 0,
\[
(X^k f)(a\exp(tX)) = \frac{d^k}{dt^k}\bigl(f(a\exp(tX))\bigr).
\]
In particular,
\[
(X^k f)(a) = \left.\frac{d^k}{dt^k}\right|_{t=0} f(a\exp(tX)).
\]
The formulae above can be generalized to the multi-variable case. In fact, if
X1 , · · · , Xk ∈ g, then

\[
(X_1 X_2 f)(a) = \left.\frac{d}{dt_1}\right|_{t_1=0} (X_2 f)(a\exp(t_1X_1)) = \left.\frac{d}{dt_1}\right|_{t_1=0}\left.\frac{d}{dt_2}\right|_{t_2=0} f(a\exp(t_1X_1)\exp(t_2X_2)),
\]

and in general
\[
(X_1\cdots X_k f)(a) = \left.\frac{\partial^k}{\partial t_1\cdots\partial t_k}\right|_{t_1=\cdots=t_k=0} f(a\exp(t_1X_1)\cdots\exp(t_kX_k)).
\]
As a consequence, we get the following Taylor’s expansion formula


Proposition 1.1. If f is smooth on G, then for small |t|,


\[
f(\exp(tX_1)\cdots\exp(tX_n)) = f(e) + t\sum_i X_i f(e) + \frac{t^2}{2}\Bigl(\sum_i X_i^2 f(e) + 2\sum_{i<j} X_iX_j f(e)\Bigr) + O(t^3).
\]

We remark that the previous formulae hold for vector-valued functions as well. Our
main result in this section is
Theorem 1.2. Let n ≥ 1 and X1 , · · · , Xn ∈ g. Then for |t| sufficiently small,
\[
\exp(tX_1)\cdots\exp(tX_n) = \exp\Bigl(t\sum_{1\le i\le n} X_i + \frac{t^2}{2}\sum_{1\le i<j\le n}[X_i,X_j] + O(t^3)\Bigr).
\]

Proof. We apply proposition 1.1 to the inverse of the exponential map near e, i.e. the
map f defined by
f (exp(tX)) = tX
for t small enough. Then obviously, f (e) = 0. For any X ∈ g,

\[
(Xf)(e) = \left.\frac{d}{dt}\right|_{t=0} f(\exp(tX)) = \left.\frac{d}{dt}\right|_{t=0}(tX) = X
\]
and for any n > 1,
\[
(X^n f)(e) = \left.\frac{d^n}{dt^n}\right|_{t=0} f(\exp(tX)) = \left.\frac{d^n}{dt^n}\right|_{t=0}(tX) = 0.
\]
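Notice that
\[
\sum_i X_i^2 + 2\sum_{i<j} X_iX_j = (X_1+\cdots+X_n)^2 + \sum_{i<j}[X_i,X_j].
\]
Since X1 + · · · + Xn ∈ g, the previous computation gives ((X1 + · · · + Xn )2 f )(e) = 0,
so proposition 1.1 yields
\[
f(\exp(tX_1)\cdots\exp(tX_n)) = t\sum_i X_i + \frac{t^2}{2}\sum_{i<j}[X_i,X_j] + O(t^3).
\]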

On the other hand, by the definition of f ,


exp(tX1 ) · · · exp(tXn ) = exp(f (exp(tX1 ) · · · exp(tXn ))).
This completes the proof. □

In particular, we see
\[
\exp(tX)\exp(tY) = \exp\Bigl(tX + tY + \frac{t^2}{2}[X,Y] + O(|t|^3)\Bigr)
\]
for |t| small. So [X, Y ] dominates the difference between exp(X) exp(Y ) and exp(X +
Y ), and thus dominates the non-commutativity of the group multiplication.
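The second-order statement can be checked numerically for matrices. The sketch below is only an illustration under assumptions that are not part of the notes: it works in a matrix Lie group, takes two hand-picked 2 × 2 matrices X and Y, and uses hypothetical helpers `expm` and `logm_near_identity` built from truncated power series. If the remainder is genuinely of order t³, halving t should shrink it by a factor close to 8.

```python
import numpy as np

def expm(A, terms=40):
    """Matrix exponential via a truncated Taylor series (fine for small matrices)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ A / n
        result = result + term
    return result

def logm_near_identity(M, terms=60):
    """log(M) = sum_{k>=1} (-1)^(k+1) (M - I)^k / k, valid when ||M - I|| < 1."""
    A = M - np.eye(M.shape[0])
    result = np.zeros_like(A)
    power = np.eye(M.shape[0])
    for k in range(1, terms):
        power = power @ A
        result = result + (-1) ** (k + 1) * power / k
    return result

def bracket(A, B):
    """Commutator [A, B] = AB - BA, the Lie bracket of a matrix Lie algebra."""
    return A @ B - B @ A

# Illustrative choice of X, Y (an assumption, not taken from the notes).
X = np.array([[0.0, 1.0], [0.0, 0.0]])
Y = np.array([[0.0, 0.0], [1.0, 0.0]])

def remainder(t):
    """Norm of log(exp(tX) exp(tY)) minus its second-order approximation."""
    Z = logm_near_identity(expm(t * X) @ expm(t * Y))
    second_order = t * X + t * Y + 0.5 * t**2 * bracket(X, Y)
    return np.linalg.norm(Z - second_order)

# Halving t should divide an O(t^3) remainder by roughly 2^3 = 8.
ratio = remainder(0.02) / remainder(0.01)
print(ratio)
```

A ratio near 8 is consistent with a cubic remainder; it does not by itself identify the cubic term.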

2. The Baker-Campbell-Hausdorff Formula


Now the question is: What are the higher order terms in O(t³) above? For simplicity
we will denote by log the inverse of exp near 0 ∈ g. Let
µ(X, Y ) = log(exp(X) exp(Y ))
for X, Y close to 0 ∈ g. We have seen above that
\[
\mu(X,Y) = X + Y + \frac{1}{2}[X,Y] + O(|X|^3, |Y|^3)
\]
for |X|, |Y | small. A remarkable fact about the remainder terms is that they involve
only Lie brackets! In other words, we have
Theorem 2.1 (The Baker-Campbell-Hausdorff formula (existence)). For X and Y
small, we have
\[
\mu(X,Y) = X + Y + \sum_{m\ge 2} P_m(X,Y),
\]
where Pm (X, Y ) is a Lie polynomial of order m, i.e. Pm (X, Y ) is a combination of
nested commutators in X, Y that involves m − 1 Lie brackets.

Although in applications the above existence result is sufficient, we will prove the
following explicit formula:
Theorem 2.2 (Dynkin’s formula). For X and Y small,
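\[
\mu(X,Y) = X + Y + \sum_{k=1}^{\infty} \frac{(-1)^k}{k+1} \sum \frac{(-1)^{\sum_i (l_i+m_i)}}{l_1+\cdots+l_k+1}\,
\frac{(\mathrm{ad}Y)^{l_1}}{l_1!}\circ\frac{(\mathrm{ad}X)^{m_1}}{m_1!}\circ\cdots\circ\frac{(\mathrm{ad}Y)^{l_k}}{l_k!}\circ\frac{(\mathrm{ad}X)^{m_k}}{m_k!}\,(Y),
\]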

where the second summation is over l1 , · · · , lk , m1 , · · · , mk ≥ 0, li + mi > 0.

As a consequence, we can write down Pm (X, Y ) for m small. Of course,


\[
P_2(X,Y) = \frac{1}{2}[X,Y].
\]
The next term P3 (X, Y ) comes from the following terms in Dynkin’s formula:
(1) k = 1, l1 = 1, m1 = 1 ⟹ −(1/2)(1/2) [Y, [X, Y ]];
(2) k = 1, l1 = 0, m1 = 2 ⟹ −(1/2)(1/1)(1/2) [X, [X, Y ]];
(3) k = 2, l1 = 1, m1 = 0, l2 = 0, m2 = 1 ⟹ (1/3)(1/2) [Y, [X, Y ]];
(4) k = 2, l1 = 0, m1 = 1, l2 = 0, m2 = 1 ⟹ (1/3)(1/1) [X, [X, Y ]].
It follows that
\[
P_3(X,Y) = \frac{1}{12}\bigl([X,[X,Y]] - [Y,[X,Y]]\bigr).
\]
Similarly one can calculate the next term and get
\[
P_4(X,Y) = \frac{1}{24}[X,[Y,[Y,X]]].
\]
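The low-order terms P2, P3, P4 can be tested exactly on a nilpotent example. The sketch below assumes a setting not used in the notes: strictly upper triangular 5 × 5 matrices, where any product of five such matrices vanishes. Every Lie polynomial Pm with m ≥ 5 is then zero, so µ(X, Y ) should equal X + Y + P2 + P3 + P4 up to rounding. The helpers `exp_nilpotent` and `log_unipotent` and the particular X, Y are illustrative choices.

```python
import numpy as np

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    return A @ B - B @ A

def exp_nilpotent(N):
    """exp(N) for nilpotent N: the Taylor series stops once N^k = 0."""
    result, term = np.eye(len(N)), np.eye(len(N))
    for k in range(1, len(N)):
        term = term @ N / k
        result = result + term
    return result

def log_unipotent(M):
    """log(M) for unipotent M: the series in (M - I) is a finite sum."""
    A = M - np.eye(len(M))
    result, power = np.zeros_like(A), np.eye(len(M))
    for k in range(1, len(M)):
        power = power @ A
        result = result + (-1) ** (k + 1) * power / k
    return result

# Illustrative strictly upper triangular 5 x 5 matrices (an assumption).
X = np.diag([1.0, 2.0, 1.0, 3.0], k=1)
Y = np.diag([2.0, 1.0, 1.0, 1.0], k=1)

P2 = bracket(X, Y) / 2
P3 = (bracket(X, bracket(X, Y)) - bracket(Y, bracket(X, Y))) / 12
P4 = bracket(X, bracket(Y, bracket(Y, X))) / 24

mu = log_unipotent(exp_nilpotent(X) @ exp_nilpotent(Y))
bch4 = X + Y + P2 + P3 + P4
print(np.max(np.abs(mu - bch4)))  # expected to be at rounding level
```

For this choice P4 is nonzero (its only nonzero entry sits in the top right corner), so the comparison does exercise the fourth-order term.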

To prove Dynkin’s formula, we will need the following formula that computes
the differential of the exponential map at an arbitrary point.
Lemma 2.3. For each X ∈ g,
(d exp)X = (dLexp X )e ◦ φ(adX),
where φ is the function

\[
\phi(z) = \frac{1 - e^{-z}}{z} = \sum_{m=0}^{\infty} \frac{(-1)^m}{(m+1)!}\, z^m.
\]

Proof of Dynkin’s formula.


Write
Z(t) = log(exp(X) exp(tY )).
Applying lemma 2.3, we get
\[
\frac{d}{dt}\exp Z(t) = dL_{\exp X}\,\frac{d}{dt}\exp(tY) = dL_{\exp X}\, dL_{\exp tY}\,\phi(\mathrm{ad}(tY))(Y) = dL_{\exp Z(t)}(Y),
\]
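where we used the fact φ(ad(tY ))(Y ) = Y , which holds because ad(tY )(Y ) = t[Y, Y ] = 0,
so only the constant term of the series for φ contributes. On the other hand, by using
lemma 2.3 directly,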
\[
\frac{d}{dt}\exp Z(t) = dL_{\exp Z(t)}\,\phi(\mathrm{ad}Z(t))\Bigl(\frac{dZ}{dt}\Bigr).
\]
It follows that
\[
\frac{dZ}{dt} = \frac{\mathrm{ad}Z(t)}{I - \exp(-\mathrm{ad}Z(t))}(Y) = \sum_{k\ge 0} \frac{1}{k+1}\bigl(I - \exp(-\mathrm{ad}Z(t))\bigr)^k (Y).
\]

Notice that by the naturality of exp and by the definition of ad and Ad,
exp(−adZ(t)) = Ad exp(−Z(t)) = Ad(exp(−tY ) exp(−X))
= Ad(exp(−tY )) ◦ Ad(exp(−X))
= exp(−tad(Y )) ◦ exp(−ad(X)).
Thus
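\[
\begin{aligned}
\frac{dZ}{dt} &= \sum_{k\ge 0} \frac{\bigl(I - \exp(-t\,\mathrm{ad}Y)\circ\exp(-\mathrm{ad}X)\bigr)^k}{k+1}(Y) \\
&= \sum_{k\ge 0} \frac{(-1)^k}{k+1} \sum_{\substack{l_1,\cdots,l_k,m_1,\cdots,m_k\ge 0 \\ l_i+m_i>0}} t^{|l|}(-1)^{|l|+|m|}\,
\frac{(\mathrm{ad}Y)^{l_1}}{l_1!}\frac{(\mathrm{ad}X)^{m_1}}{m_1!}\cdots\frac{(\mathrm{ad}Y)^{l_k}}{l_k!}\frac{(\mathrm{ad}X)^{m_k}}{m_k!}\,Y,
\end{aligned}
\]
with |l| = l1 + · · · + lk and |m| = m1 + · · · + mk ,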

where in the last step we used the fact that adX ∈ End(g) is an element in a linear
Lie group, and thus the exponential map is exactly the matrix exponential. Now
Dynkin’s formula follows from termwise integration over t from 0 to 1, using Z(0) = X
and Z(1) = µ(X, Y ): integrating t^{|l|} produces the factor 1/(l1 + · · · + lk + 1). □
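The differential equation for Z(t) used in this proof can also be integrated numerically instead of termwise. The following sketch assumes a matrix group, so that exp(−adZ) acts on a matrix W as e^{−Z} W e^{Z}, and it swaps the termwise integration of the proof for a classical Runge-Kutta step; the helper names, the step count and the small matrices X, Y are all illustrative assumptions. It compares Z(1) with log(exp(X) exp(Y )).

```python
import numpy as np

def expm(A, terms=40):
    """Matrix exponential via a truncated Taylor series (fine for small matrices)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ A / n
        result = result + term
    return result

def logm_near_identity(M, terms=60):
    """log(M) = sum_{k>=1} (-1)^(k+1) (M - I)^k / k, valid when ||M - I|| < 1."""
    A = M - np.eye(M.shape[0])
    result = np.zeros_like(A)
    power = np.eye(M.shape[0])
    for k in range(1, terms):
        power = power @ A
        result = result + (-1) ** (k + 1) * power / k
    return result

def rhs(Z, Y, terms=40):
    """sum_{k>=0} (I - exp(-ad Z))^k (Y) / (k+1), with exp(-ad Z) W = e^{-Z} W e^{Z}."""
    E_minus, E_plus = expm(-Z), expm(Z)
    total = np.zeros_like(Y)
    W = Y.copy()  # holds (I - exp(-ad Z))^k (Y)
    for k in range(terms):
        total = total + W / (k + 1)
        W = W - E_minus @ W @ E_plus
    return total

def integrate(X, Y, steps=100):
    """Classical RK4 for dZ/dt = rhs(Z, Y) on [0, 1] with Z(0) = X."""
    Z, h = X.copy(), 1.0 / steps
    for _ in range(steps):
        k1 = rhs(Z, Y)
        k2 = rhs(Z + 0.5 * h * k1, Y)
        k3 = rhs(Z + 0.5 * h * k2, Y)
        k4 = rhs(Z + h * k3, Y)
        Z = Z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return Z

# Small illustrative matrices so that both series converge quickly (an assumption).
X = 0.1 * np.array([[0.0, 1.0], [0.0, 0.0]])
Y = 0.1 * np.array([[0.0, 0.0], [1.0, 0.0]])

Z_ode = integrate(X, Y)
Z_log = logm_near_identity(expm(X) @ expm(Y))
error = np.linalg.norm(Z_ode - Z_log)
print(error)
```

The series in `rhs` only converges when I − exp(−adZ) has norm below 1, which is why X and Y are kept small here.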

3. The Derivative of the Exponential Map


Finally we prove lemma 2.3. We first show
Lemma 3.1. Let γ1 (t), γ2 (t) be smooth curves on G, and let γ(t) = γ1 (t)γ2 (t), then
γ̇(t) = dLγ1 (t) (γ̇2 (t)) + dRγ2 (t) (γ̇1 (t)).

Proof. Notice that γ(t) = µ(γ1 (t), γ2 (t)), where µ is the multiplication operation on
G. So the formula above follows from the following formula we have proven,
\[
d\mu_{a,b}(X_a, Y_b) = (dL_a)_b(Y_b) + (dR_b)_a(X_a).
\]


More generally, by using induction one can easily see that if γ1 (t), · · · , γm (t) are
smooth curves on G and γ(t) = γ1 (t) · · · γm (t), then
\[
\dot\gamma(t) = \sum_{k=1}^{m} dL_{\gamma_1(t)}\cdots dL_{\gamma_{k-1}(t)}\, dR_{\gamma_m(t)}\cdots dR_{\gamma_{k+1}(t)}\bigl(\dot\gamma_k(t)\bigr).
\]
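For a matrix group the maps dL and dR are just left and right matrix multiplication, so the k-th summand reads γ1 · · · γk−1 γ̇k γk+1 · · · γm. The sketch below checks this for m = 3 against a central finite difference. The curves exp(tA), exp(tB), exp(tC), the matrices A, B, C and the helper `expm` are illustrative assumptions, not taken from the notes.

```python
import numpy as np

def expm(A, terms=40):
    """Matrix exponential via a truncated Taylor series (fine for small matrices)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ A / n
        result = result + term
    return result

# Three illustrative curves gamma_i(t) = exp(t * M_i) in GL(2, R).
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.5, 0.0], [0.2, -0.5]])
C = np.array([[0.0, 0.3], [0.0, 0.0]])

def gamma(t):
    """The product curve gamma(t) = gamma_1(t) gamma_2(t) gamma_3(t)."""
    return expm(t * A) @ expm(t * B) @ expm(t * C)

t, h = 0.3, 1e-5
g1, g2, g3 = expm(t * A), expm(t * B), expm(t * C)
# Velocities of the factors: d/dt exp(tA) = A exp(tA), and similarly for B, C.
v1, v2, v3 = A @ g1, B @ g2, C @ g3

# Product rule: each summand differentiates exactly one factor.
formula = v1 @ g2 @ g3 + g1 @ v2 @ g3 + g1 @ g2 @ v3
finite_difference = (gamma(t + h) - gamma(t - h)) / (2 * h)
error = np.linalg.norm(formula - finite_difference)
print(error)
```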

Now we are ready to prove lemma 2.3. For simplicity we will denote

\[
\nu(X,Y) := \left.\frac{d}{dt}\right|_{t=0} \exp(X + tY) = (d\exp)_X(Y).
\]
Obviously ν(X, Y ) is linear in Y for each fixed X, and lemma 2.3 follows from
Lemma 3.2. For any X, Y ∈ g,

\[
\left.\frac{d}{dt}\right|_{t=0} \exp(X + tY) = (dL_{\exp X})_e \circ \phi(\mathrm{ad}X)(Y).
\]

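Proof. We notice that for any positive integer m,
\[
\begin{aligned}
\nu(X,Y) &= \left.\frac{d}{dt}\right|_{t=0} \Bigl(\exp\Bigl(\frac{X}{m} + t\frac{Y}{m}\Bigr)\Bigr)^m \\
&= \sum_{k=0}^{m-1} (dL_{\exp\frac{X}{m}})^{m-k-1}\,(dR_{\exp\frac{X}{m}})^{k}\,\nu\Bigl(\frac{X}{m},\frac{Y}{m}\Bigr) \\
&= \frac{1}{m}\,(dL_{\exp\frac{X}{m}})^{m-1} \sum_{k=0}^{m-1} (dL_{\exp\frac{X}{m}})^{-k}\,(dR_{\exp\frac{X}{m}})^{k}\,\nu\Bigl(\frac{X}{m},Y\Bigr).
\end{aligned}
\]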

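Recall that the differential of the conjugation map c(a) = L_a R_{a^{-1}} is Ad, so we get
\[
(dL_{\exp\frac{X}{m}})^{-k}\,(dR_{\exp\frac{X}{m}})^{k} = \Bigl(dc\bigl(\exp(-\tfrac{X}{m})\bigr)\Bigr)^{k} = \Bigl(\mathrm{Ad}\bigl(\exp\tfrac{-X}{m}\bigr)\Bigr)^{k} = \Bigl(\exp\bigl(-\tfrac{\mathrm{ad}X}{m}\bigr)\Bigr)^{k}.
\]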

So we get, for every positive integer m,
\[
\nu(X,Y) = (dL_{\exp\frac{X}{m}})^{m-1}\,\frac{1}{m}\sum_{k=0}^{m-1}\Bigl(\exp\bigl(-\tfrac{\mathrm{ad}X}{m}\bigr)\Bigr)^{k}\,\nu\Bigl(\frac{X}{m},Y\Bigr).
\]
Now the result follows since as m → ∞,
\[
(dL_{\exp\frac{X}{m}})^{m-1} = dL_{\exp\frac{(m-1)X}{m}} \to dL_{\exp X},
\]
\[
\nu\Bigl(\frac{X}{m},Y\Bigr) \to \nu(0,Y) = (d\exp)_0(Y) = Y,
\]
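and, since adX ∈ End(g) is a matrix,
\[
\begin{aligned}
\frac{1}{m}\sum_{k=0}^{m-1}\Bigl(\exp\bigl(-\tfrac{\mathrm{ad}X}{m}\bigr)\Bigr)^{k}
&= \frac{1}{m}\sum_{k=0}^{m-1}\exp\Bigl(-\frac{k}{m}\,\mathrm{ad}X\Bigr) \\
&= \frac{1}{m}\sum_{k=0}^{m-1}\sum_{n=0}^{\infty}\frac{1}{n!}\Bigl(-\frac{k}{m}\,\mathrm{ad}X\Bigr)^{n} \\
&= \sum_{n=0}^{\infty}\Bigl[\frac{1}{m}\sum_{k=0}^{m-1}\Bigl(\frac{k}{m}\Bigr)^{n}\Bigr]\frac{(-1)^n}{n!}\,(\mathrm{ad}X)^n \\
&\to \sum_{n=0}^{\infty}\Bigl(\int_0^1 x^n\,dx\Bigr)\frac{(-1)^n}{n!}\,(\mathrm{ad}X)^n \\
&= \sum_{n=0}^{\infty}\frac{(-1)^n}{(n+1)!}\,(\mathrm{ad}X)^n.
\end{aligned}
\]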
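The resulting formula for the differential of exp can be sanity-checked numerically in a matrix group, where (dL_{exp X})_e is left multiplication by e^X and adX(Y ) = XY − Y X. The sketch below is a minimal check under those assumptions; `expm`, `dexp_formula` and the matrices X, Y are hypothetical illustrative choices. It compares the series Σ (−1)^n (adX)^n (Y )/(n + 1)! against a central finite difference of t ↦ exp(X + tY ).

```python
import numpy as np

def expm(A, terms=40):
    """Matrix exponential via a truncated Taylor series (fine for small matrices)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ A / n
        result = result + term
    return result

def dexp_formula(X, Y, terms=30):
    """exp(X) @ phi(ad X)(Y) with phi(z) = sum_n (-1)^n z^n / (n+1)!."""
    total = np.zeros_like(Y)
    term = Y.copy()  # holds (-1)^n (ad X)^n (Y) / (n+1)!
    for n in range(terms):
        total = total + term
        term = -(X @ term - term @ X) / (n + 2)
    return expm(X) @ total

# Illustrative matrices (an assumption, not taken from the notes).
X = np.array([[0.0, 1.0], [-2.0, 0.5]])
Y = np.array([[1.0, 0.3], [0.0, -1.0]])

h = 1e-5
finite_difference = (expm(X + h * Y) - expm(X - h * Y)) / (2 * h)
error = np.linalg.norm(dexp_formula(X, Y) - finite_difference)
print(error)
```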

Finally we give several applications. Given the derivative of the exponential map
exp at an arbitrary point, we are ready to answer the following question: at which
points X is the map exp singular, i.e. (d exp)X is not invertible? Since
(d exp)X = (dLexp X )e ◦ φ(adX)
and (dLexp X )e is always invertible, we see that (d exp)X is not invertible if and only if
the matrix φ(adX) ∈ End(g) is not invertible, i.e. 0 is an eigenvalue of φ(adX).
Since all eigenvalues of φ(adX) are of the form φ(λ) = (1 − e^{−λ})/λ, where λ is an
eigenvalue of adX ∈ End(g) (with φ(0) = 1), and φ(λ) = 0 exactly when e^{−λ} = 1 with
λ ≠ 0, we conclude
Corollary 3.3. The singular points of the exponential map exp : g → G are precisely
those X ∈ g such that adX ∈ End(g) has an eigenvalue of the form 2πik, with k ∈
Z \ {0}.
As an example, we see that if G is an abelian Lie group, then exp is non-singular
everywhere. More generally, if g is nilpotent, then every adX is nilpotent, so its only
eigenvalue is 0 and exp : g → G is non-singular everywhere.
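The corollary can be illustrated on so(3). The sketch below relies on the standard identification of so(3) with R³ under the cross product, so that ad(v) is the matrix w ↦ v × w; this identification, the function names and the two sample vectors are assumptions made for illustration and are not part of the notes. It evaluates φ on the eigenvalues of adX for a rotation generator of length 2π (eigenvalues 0, ±2πi) and of length π (eigenvalues 0, ±πi).

```python
import numpy as np

def ad_so3(v):
    """Matrix of ad(v) on so(3), identified with R^3: ad(v) w = v x w."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def phi(lam):
    """phi(lam) = (1 - e^{-lam}) / lam, extended by phi(0) = 1."""
    return 1.0 if abs(lam) < 1e-12 else -np.expm1(-lam) / lam

def smallest_phi_eigenvalue(v):
    """Smallest |phi(lam)| over the eigenvalues lam of ad(v)."""
    return min(abs(phi(lam)) for lam in np.linalg.eigvals(ad_so3(v)))

# Length 2*pi: ad X has eigenvalues +-2*pi*i, so phi(ad X) should be singular.
singular = smallest_phi_eigenvalue((0.0, 0.0, 2 * np.pi))
# Length pi: eigenvalues +-pi*i, so phi(ad X) should stay invertible.
regular = smallest_phi_eigenvalue((0.0, 0.0, np.pi))
print(singular, regular)
```

The first value vanishing up to rounding matches the fact that rotations by the angle 2π about any axis all equal the identity, so exp cannot be a local diffeomorphism there.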
