
Group Theory

A Physicist’s Primer
Q. Ho-Kim
c Q. Ho-Kim (Ho Kim Quang). Group Theory: A Physicist’s Primer.
2014–19 ♠ 191115
Group theory. Symmetries in physics. Finite groups. Lie groups. Lie alge-
bras. Structures of groups. Representations of groups. Representations of
Lie algebras. Groups in particle physics.

Cover illustration: The Palace of the Alhambra in Granada contains a treasure trove of
mosaics, tiles, and other ornaments of wonderful symmetric designs, as exemplified by
this intricate interlacing pattern in the Hall of the Two Sisters shown on the front cover.
(Source: gwydir.demon.co.uk/knots/islamic. Courtesy of Jo Edkins).
Contents

Preface v

1. Group Structure 1–32


1.1 Definition and Basic Properties . . . . . . . . . . . . . . . . . . . . . 1
1.2 Subgroups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Cosets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.4 Normality and Conjugation . . . . . . . . . . . . . . . . . . . . . . . 15
1.5 Quotient and Product Groups . . . . . . . . . . . . . . . . . . . . . . 17
1.6 The Symmetric Group . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.7 Classification of Finite Groups . . . . . . . . . . . . . . . . . . . . . . 28
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

2. Group Representations 33–72


2.1 Linear Vector Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.2 Definition and Constructions . . . . . . . . . . . . . . . . . . . . . . 38
2.3 Simple Representation . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.4 Unitary Representation . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.5 Schur’s Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.6 Matrices and Characters . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.7 Tensor-Product Representation . . . . . . . . . . . . . . . . . . . . . 63
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

3. Lie Groups and Lie Algebras 73–122


3.1 Lie Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3.2 Global Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.3 Local Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3.4 Lie Algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.5 Back to Lie Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
3.6 Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

4. sl (2, C ) and Associated Lie Groups 123–154


4.1 sl (2, C )’s Basic Properties . . . . . . . . . . . . . . . . . . . . . . . . 123
4.2 Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124


4.3 Tensorial Representations . . . . . . . . . . . . . . . . . . . . . . . . 128


4.4 Parameters of Rotation . . . . . . . . . . . . . . . . . . . . . . . . . . 130
4.5 Rotation Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
4.6 Direct-Product Representations . . . . . . . . . . . . . . . . . . . . . 145
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152

5. Lie Algebra sl (3, C ) 155–176


5.1 Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
5.2 Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175

6. Simple Lie Algebras: Structure 177–214


6.1 Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
6.2 Roots and Root Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . 179
6.3 String of Roots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
6.4 System of Roots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
6.5 Cartan Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
6.6 Dynkin Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
6.7 Classification of Simple Algebras . . . . . . . . . . . . . . . . . . . . 201
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214

7. Simple Lie Algebras: Representations 215–260


7.1 General Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
7.2 Irreducible Representations . . . . . . . . . . . . . . . . . . . . . . . 222
7.3 Dimension of Representation . . . . . . . . . . . . . . . . . . . . . . 239
7.4 Lie Groups in Particle Physics . . . . . . . . . . . . . . . . . . . . . . 250
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
Figures 7.11–13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258

A. The Symmetric Group and Tensor Representations 261–282

B. Clifford Algebras and Spin Representations 283–290

C. Manifolds 291–304

References 305–306

Index 307–312

Preface
This book is based on the notes for lectures I have given for several years to
advanced-undergraduate and graduate physics students. I aim to explain the
basic elements of group theory from the ground level up to the point where the
student knows enough to feel comfortable with the group applications in physics
literature. I have no pretense of mathematical rigor whatsoever: the necessary
definitions are given and the main theorems stated as clearly and as precisely as
I can, but they are justified (if at all) by plausibility arguments rather than by for-
mal proofs; and abstract concepts are in general illustrated by concrete examples.
The first two chapters contain—besides a summary of the general facts of elementary algebra (set, map, morphism) and linear algebra (vector, matrix, operator) that one may have learned in prerequisite mathematics or physics courses—
a detailed description of the structures and representations of finite groups over
real or complex (C) fields. I also state here the result on the classification of fi-
nite simple groups, a recent outstanding achievement in mathematics. Chapter 3
brings a general description of the continuous groups, which of course include
many of the groups familiar to any physics student and bearing the illustrious
names of Galileo, Lorentz, and Poincaré. This prepares the student on a super-
ficial level for the study of a special but very important category of groups, the
compact simple Lie groups and associated algebras, which becomes the focus of
the remaining part of the book. The next two chapters are devoted to sl (2, C ) (the
angular-momentum algebra) and sl (3, C ) (the eightfold-way algebra); they ease
in many key concepts and properties also applicable in the more general context.
In the final two chapters, I present the classical material (due to S. Lie, É. Cartan,
W. Killing, H. Weyl, E. Dynkin, and others) on the structures and representations
of semisimple Lie algebras, and discuss briefly some applications of Lie groups
in the theory of the fundamental particles and fields in physics. The book ends
with three appendices, one on the use of the symmetric group to understand the
irreducible representations of the special unitary Lie groups; the second on the
application of Clifford algebra to construct half-integer spin representations of
the special orthogonal Lie groups; and the third on manifolds. Finally, some of
the sources I have used in preparing my lectures and writing this book are listed
in the References.

Québec and Toronto, January 2014. Q. Ho-Kim


Chapter 1

Group Structure

1.1 Definition and Basic Properties


1.2 Subgroups
1.3 Cosets
1.4 Normality and Conjugation
1.5 Quotient and Product Groups
1.6 The Symmetric Group
1.7 Classification of Finite Groups

In this chapter we give the basics of group theory with emphasis on the finite-
order groups: definitions and a few of the most important properties of group
structure. Information on a group can be gained from an analysis of the rela-
tionships among its elements (multiplication table), or its subsets (subgroups,
cosets, equivalence classes). The various ways in which a group divides (quo-
tient groups), or acts on sets or groups (map, conjugation, multiplication) give us
a deeper understanding about its structure. To illustrate the concepts discussed,
we will use simple examples from point groups and permutation groups. Not all
groups are known or understood, but mathematicians now understand very well
a special but important category of groups, called the finite simple groups; we
briefly discuss their classification towards the end of the chapter.

1.1 Definition and Basic Properties


Working from the specific to the general, from the concrete to the abstract, genera-
tions of mathematicians, beginning with Évariste Galois in 1829, have abstracted
from fields as diverse as geometry, numbers, and algebraic equations their few
basic common properties to create the powerful concept of ‘group’ rich in im-
plications in mathematics and other disciplines, such as physics and chemistry.
Refer to [Ham] Chapter 1 and [Hu] Chapter II for more details.


Definition 1.1 (Group). A nonempty set of elements G = { a, b, c, . . . } together with
a binary operation · is called a group, written ( G, · ), if the following four axioms are
satisfied:
1. The binary operation · associates any pair of elements a, b ∈ G with a well-defined
product a · b that is an element of G. The set G is then said to be closed under this
operation.
2. The operation · is associative, i.e. a · ( b · c ) = ( a · b ) · c for all a, b, c ∈ G.
3. There is an element e in G with the property a · e = a and e · a = a for all a ∈ G.
4. For each a ∈ G, there is an element a′ ∈ G such that a · a′ = e and a′ · a = e.
An ordered pair of any two elements a, b is written ( a, b ). The Cartesian
product of sets A and B is the set A × B consisting of all ordered pairs ( a, b )
with a ∈ A and b ∈ B. The operation described in axiom 1 is a function (or map,
or mapping) from G × G to G, or in terms of elements, ( a, b ) ↦ c for a, b, c ∈ G.
Under the group binary operation, the image a · b of the pair ( a, b ) is denoted
by ab, called the product of a and b in the multiplicative notation, or by a + b, called
the sum of a and b in the additive notation. To be specific we generally use the
multiplicative notation, but switch to the additive notation when needed. Similarly,
we refer to the group ( G, · ) simply as G when there is no cause for confusion.
Associativity of the group operation allows us to write abc for either a ( bc)
or ( ab )c. Generalizing to a1, a2, . . . , an ∈ G, with n > 3, we may write their
product as a1 a2 · · · an provided they are in an order in which partial products are
meaningful. In particular, a product of n factors of a is written an , standing for
aa · · · a (n times).
It follows from axiom 3 that if any e, f ∈ G are such that ea = ae = a and
f a = a f = a for all a ∈ G, then e = f , so an element having that property is
unique, and we refer to it as the identity of G.
Given a group G, its identity e, and arbitrary elements a, b, c of G, suppose we
have ab = ba = e and ac = ca = e, then b = c, that is, for every element a of G its
inverse element is unique, and is denoted a−1. In particular, e−1 = e.
In addition, for each a ∈ G, ( a−1)−1 = a; and for any elements a, b ∈ G,
( ab )−1 = b−1 a−1. For a, b ∈ G the equations ax = b and ya = b have unique
solutions in G, which are x = a−1 b and y = ba−1 respectively.
A group ( G, ) is said to be commutative or abelian (for Niels Abel) if a · b =
b · a for all a, b ∈ G. The number of elements in G is called the size, or order of
G, denoted | G |. If | G | is finite, G is said to be finite; if it is infinite, G is said to
be infinite. This chapter addresses itself primarily to finite, and occasionally to
countably infinite groups.
Most groups have numbers, matrices, or operators as elements subject to some
composition law. To verify that they indeed are groups, one must verify the four
axioms of the definition. We now give some examples.
EXAMPLE 1: Define the following multiplicative finite groups of respective orders
2, 4, and 8: R 2 = { e, − e}, C4 = { e, − e, i, − i }, and Q8 = {± e, ±i, ± j, ±k }, where
in each case e is the identity element; in case C4, i2 = − e; and finally in Q8 ,
i2 = j 2 = k 2 = − e, ij = − ji = k, jk = − kj = i, and ki = − ik = j.

EXAMPLE 2: The integers { 0, ± 1, ±2, . . . } = Z subject to ordinary addition as the


binary operation form a group. Closure is satisfied: the sum of two integers is
an integer; addition is associative; 0 is the additive identity; and the inverse of
integer m is − m. Since addition is commutative, (Z, +) is abelian and infinite.
For similar reasons the rational numbers Q and the real numbers R are infinite
abelian groups under ordinary addition, denoted Q + and R + , respectively.
EXAMPLE 3: Nonzero real numbers form an infinite abelian group under ordinary
multiplication, called R × . The product of two nonzero real numbers is a real
number (closure); multiplication is associative; 1 is the multiplicative identity;
each nonzero real x has an inverse, 1/x. Similarly, the nonzero rational numbers
under ordinary multiplication and the nonzero complex numbers under complex
multiplication also form infinite abelian groups, respectively Q × and C× .
EXAMPLE 4: For any positive integer n, the set of all n × n invertible matrices (i.e.
having a nonzero determinant) with entries in a (finite or infinite) field F
forms a group under matrix multiplication. The product of two invertible matri-
ces is invertible, since ( AB )−1 = B−1 A−1, which implies closure; matrix multi-
plication is associative; the identity element is the identity matrix diag [1, 1, . . . , 1],
which we now call In , with 1 in every diagonal entry, and 0 everywhere else; and
finally for each invertible matrix there exists an inverse, by definition. This is the
general linear group over F, denoted GL( n, F ). It is non-abelian for all n > 1;
for n = 1, GL(1, F ) = F× is abelian.
EXAMPLE 5: Additional conditions may be imposed on the matrices, leading to
smaller groups (subgroups of GL( n, F ), see p. 6). For example, consider for a
fixed n the subset of all matrices A over F of GL( n, F ) with determinant precisely
equal to 1 (such matrices are said to be unimodular), with matrix multiplication
as the binary operation. The product of two unimodular matrices is a unimod-
ular matrix, since det( AB ) = det A det B; the inverse unimodular matrix has
det A−1 = 1/ det A = 1; and the other two axioms are also satisfied. Hence the
set of unimodular matrices of order n forms a group, which is non-abelian, ex-
cept for the trivial case n = 1; one calls it the special linear group over F, denoted
SL( n, F ). In turn, this group contains other subgroups consisting of unimodular
matrices satisfying additional requirements.
EXAMPLE 6: Call N ∗ the set of all positive integers. Let N n = { 0, 1, 2, . . . , n − 1} ,
with n ∈ N ∗ , equipped with addition modulo n as the binary operation. (For
example, with n = 12, the set N 12 consists of 0, 1, . . . , 11, with 12 ≡ 0, and we
count up to 12, and start back again at 1, as in clock arithmetic.) We verify that
(N n , + mod n ) satisfies the four axioms of the group definition. (1) Let a, b ∈ N n ;
then a + b mod n = a + b if a + b < n, and a + b mod n = a + b − n if a + b ≥ n.
Hence closure. (2) On one side of the associative equation, ( a + b mod n ) + c mod n
= a + b + c − kn for some nonnegative integer k chosen so that the result belongs
to N n . On the other side, a + ( b + c mod n ) mod n = a + b + c − k′ n for some
appropriate nonnegative integer k′. As the numbers on both sides are in N n , one
must have k = k′. (3) Zero is the additive identity. (4) The inverse of any a ∈ N n is n − a
since a + ( n − a ) mod n = 0. Hence (N n, + mod n ) is a finite abelian additive
group of order n. It is called the group of integers modulo n, denoted by Zn . 
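The verification carried out in Example 6 can be mechanized. The following Python sketch (our own illustration; the function name is invented) brute-force checks the four axioms of Definition 1.1 for a finite set with a given binary operation, and applies it to (Z12, + mod 12):

```python
def is_group(elements, op):
    """Brute-force check of the four group axioms of Definition 1.1."""
    elems = list(elements)
    # Axiom 1: closure
    if any(op(a, b) not in elems for a in elems for b in elems):
        return False
    # Axiom 2: associativity
    if any(op(a, op(b, c)) != op(op(a, b), c)
           for a in elems for b in elems for c in elems):
        return False
    # Axiom 3: a two-sided identity element exists
    ids = [e for e in elems if all(op(e, a) == a == op(a, e) for a in elems)]
    if not ids:
        return False
    e = ids[0]
    # Axiom 4: every element has a two-sided inverse
    return all(any(op(a, b) == e == op(b, a) for b in elems) for a in elems)

n = 12
print(is_group(range(n), lambda a, b: (a + b) % n))      # True
print(is_group(range(1, n), lambda a, b: (a * b) % n))   # False: not closed
```

The second call fails closure (e.g. 2 · 6 ≡ 0 mod 12 is not in the set), showing why the nonzero residues mod 12 do not form a multiplicative group.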

Multiplication Table

The essence of a finite group of low order can be captured in a table, called mul-
tiplication table (or group table; first used by Arthur Cayley around 1854), which
enumerates all its elements and gives their pairwise products. The elements are
listed in the first row and the first column (normally in the same order) and
the product of the element labeling row a by the element labeling column b is
recorded at the intersection of row a and column b, like this:

. . . b
.
.
a ab

We now give all the distinct groups of order less than 5.


| G | = 1: There is only one group, the trivial group, consisting of one element
e, with product ee = e. It is abelian, and denoted C1 or ⟨e⟩.
| G | = 2: A group of order 2 must have two distinct elements, one of which is
the identity e. The other element a must be distinct from e, such that aa = e. This
two-element group, denoted C2, is the only one existing at order 2. The group
R 2 = { e, − e} in Example 1 is equivalent to C2.
| G | = 3: Again there is only one group of this order, formed by the elements
e, a, a2, with a such that a3 = e; it is denoted C3. Just as for C1 and C2, it is abelian
with the following multiplication table:

C3 e a a2
e e a a2
a a a2 e
a2 a2 e a

The groups C1, C2, C3 have a very simple structure: every group element is a
power of a special element called the group generator. This can be generalized
to arbitrary order n: The set consisting of the elements a, a2, . . . , an−1 , and an = e
under multiplication forms an abelian group, called the cyclic group of order n
and denoted Cn . Its elements can be visualized as lying at the vertices of a regular
n-gon on the unit circle, thus forming a cycle. It can be realized for example
with the generator exp (iθn ) representing a counterclockwise rotation in a plane
about the origin by an angle equal to θn = 2π/n; every other group element is a
discrete rotation by an angle equal to a multiple of θn .
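This rotational realization can be made concrete. The sketch below (our own, not from the text) builds Cn as the n-th roots of unity exp(2πik/n) under complex multiplication, and checks the cyclic relation and closure up to floating-point rounding:

```python
import cmath

def cyclic_group(n):
    """C_n realized as the n-th roots of unity, i.e. plane rotations by 2*pi*k/n."""
    theta = 2 * cmath.pi / n
    return [cmath.exp(1j * k * theta) for k in range(n)]

C6 = cyclic_group(6)
g = C6[1]                        # the generator exp(i*theta_6)
assert abs(g**6 - 1) < 1e-12     # g^n returns to the identity 1
# Closure (up to rounding): the product z*g is again an element of C6.
for z in C6:
    assert min(abs(z * g - w) for w in C6) < 1e-12
```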
| G | = 4: Call e, a, b, c the four elements. Starting with a as a generator, we may
have either b = a2, or b is a new generator. In the first case, with b = a2, we must
have c = a3 and a4 = e to close the group. In the second case, with b a generator,
we must necessarily have c = ab. Either case leads to a possible group.
The first case yields the cyclic group C4, with b = a2, c = a3, a4 = e, equivalent
to the multiplicative group {± e, ± i } defined in Example 1, as seen by identifying

a with i. Its multiplication table is

C4    e   a   b   c
 e    e   a   b   c         b = a2
 a    a   b   c   e         c = a3 = ab        (1.1)
 b    b   c   e   a         e = a4 = b2
 c    c   e   a   b

The only other order-4 group is again an abelian group, the Klein four-group,
denoted by V, with the following multiplication table

V     e   a   b   c
 e    e   a   b   c         a = bc
 a    a   e   c   b         b = ac             (1.2)
 b    b   c   e   a         c = ab
 c    c   b   a   e         e = a2 = b2 = c2

We see here another way of specifying a group, namely by its generators and
relations: thus V is completely defined by two generators a and b that satisfy the
relations a2 = e, b2 = e, and ba = a−1 b. In this form it is called D2 , so that D2 =
⟨ a, b : a2 = b2 = ( ab)2 = e ⟩. So V is equivalent to D2, an abstract group which can
be realized by all the transformations that leave a rectangle invariant, such that
when some or all of its edges swap places the rectangle appears unchanged (we
call transformations with this property symmetry transformations). One may
take for the symmetry transformations on the rectangle: (i) doing nothing, (ii)
reflection about the median line perpendicular to two edges, (iii) reflection about
the other median line, (iv) counterclockwise rotation of the figure about the center
through an angle of π. This symmetry group is called the group of symmetries of
the rectangle, isomorphic to D2; depending on the specific transformations chosen,
it may carry different names, e.g. C2v, or C2h (the molecules H2O and SF4 have
such a symmetry). Note that the center of the rectangle is left fixed under any
symmetry operation described. The symmetry groups of finite bodies that leave
at least one point of the body fixed are called ‘point groups’. They include rotations
through a definite angle about some axis and mirror reflections in a plane. On the
other hand, the term ‘space group’ refers to a group of an infinite system (e.g. an
infinite crystal lattice) which includes translation. See [Ham] Chapter 1.
COMMENTS. Let a, b, c be elements of a group G. If ab = ac then b = c; in other
words, if b ≠ c then ab ≠ ac. Similarly with right multiplication of b and c by a.
Each and every element of G appears once and only once in each row and each
column of a multiplication table for the group G. Therefore if Jn is a row of the
n = | G | elements of G arranged in some order, then aJn is just a row of the same
elements rearranged in a different order: left multiplication by a acts like a
permutation of the elements. Similar remarks hold for right multiplication Jn a.
All the elements of G are thus displayed in each row and each column, each time
in a different order.
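The rearrangement property just stated says that every group multiplication table is a Latin square: each row and each column is a permutation of the group elements. A small Python check (our own sketch, with invented names) makes this concrete for C3 realized as Z3:

```python
def is_latin_square(elements, op):
    """True if each row and each column of the table is a permutation of G."""
    elems, s = list(elements), set(elements)
    rows_ok = all({op(a, b) for b in elems} == s for a in elems)
    cols_ok = all({op(a, b) for a in elems} == s for b in elems)
    return rows_ok and cols_ok

# C3 realized as Z3 under addition mod 3:
print(is_latin_square(range(3), lambda a, b: (a + b) % 3))   # True
# Multiplication mod 3 on {0, 1, 2} is not a group: the row of 0 is constant.
print(is_latin_square(range(3), lambda a, b: (a * b) % 3))   # False
```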

Fields, Finite Fields


Let F be a nonempty set that includes the zero element, 0F ; and F× = F \ { 0F } its
subset of nonzero elements. Then the field F is the set F equipped with two binary
operations (addition + and multiplication · ) on F such that (1) ( F, +) is an abelian
additive group; (2) ( F×, · ) is an abelian multiplicative group; (3) multiplication is
distributive over addition; and (4) the multiplicative identity 1F ∈ F is not equal
to the additive identity 0F ∈ F.
The number of elements in F, denoted | F |, is the order of F. The real numbers
R, the complex numbers C, and the rational numbers Q, each equipped with
ordinary addition and multiplication, are examples of fields of infinite order. But
Z, the set of all integers 0, ± 1, ± 2, . . . , with familiar addition and multiplication,
is not a field because axiom (2) is not satisfied: no integer other than ± 1 has a
multiplicative inverse in Z.
However, Z can serve as a base for building prototypical examples of fields of
finite orders. Since a field requires the existence of nonempty groups ( F, +) and
( F ×, ) we must have | F | > 1. So consider | F | = 2, and define the equivalence
classes 0 = { . . . , − 2, 0, 2, 4, . . . } and 1 = { . . . , − 1, 1, 3, 5, . . . } . Then the set of
these two classes Z2 = { 0, 1} equipped with addition and multiplication modulo
2 is a field of order 2. Next, consider another example, Z10 = { 0, 1, . . . , 9} to-
gether with the operations addition and multiplication modulo 10. One can check
that the only field axiom Z10 fails to satisfy is the existence of a multiplicative
inverse for every nonzero element: only 1, 3, 7, and 9 have multiplicative
inverses. So Z10 is not a field. The question is, which
Zn , if any (apart from the case n = 2), are fields? The answer is the following:
• Zn = { 0, 1, . . . , n − 1} equipped with addition and multiplication modulo n is
a field if and only if n is a prime number.
As all other axioms of the definition are satisfied by Zn for any integer n > 1
(as properties inherited from Z), we only need to show that, in the case n = p
is a prime number, every nonzero element a of Z p has a multiplicative inverse.
This is so because the p multiples of a (modulo p), { a · 0, a · 1, a · 2, . . . , a · ( p − 1)} ,
must all be distinct (if a · x = a · y mod p then p divides a ( x − y ), forcing x = y),
and hence must exhaust all p elements of Z p . In particular one of them must be 1,
i.e. there is an x such that a · x = 1 (modulo p). This element x is just the
multiplicative inverse of a.
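The argument can be checked directly. The sketch below (ours, not the author's) finds the inverse of a in Zp exactly as in the proof: list the p multiples of a, confirm they are distinct, and locate the one equal to 1.

```python
def inverse_mod(a, p):
    """Inverse of a in Z_p (p prime), found exactly as in the text's argument."""
    multiples = [(a * x) % p for x in range(p)]
    assert len(set(multiples)) == p     # the p multiples are all distinct
    return multiples.index(1)           # the x with a*x = 1 (mod p)

p = 7
assert all((a * inverse_mod(a, p)) % p == 1 for a in range(1, p))
print([inverse_mod(a, p) for a in range(1, p)])   # [1, 4, 5, 2, 3, 6]
```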
Are there other fields? Yes, there are, as given here:
• Let p be a prime number, and m a positive integer. Then there is (up to isomor-
phism) exactly one field Fq having q = pm elements.
The statement is true, although it is not obvious. We might be tempted to
guess that the field Fq ought to be Zq , but it is not so, because as seen above, Zq
is not a field unless q = p is a prime number, i.e. unless m = 1. This amounts to
saying that for m > 1 the rules of operation for addition and multiplication on F
ought to be modified. Consider the simplest example, with q = 4 = 22. We can
readily see that Z4 = { 0, 1, 2, 3} (+ ,  mod 4) is not a field because 2 ∈ Z4 has
no multiplicative inverse, so Z4 6 = F4 . But we still may have a set of 4 elements,
F4 = { 0, 1, b, c }, satisfying two binary operations consistent with all the axioms

of a field, as shown in the following tables.

+   0   1   b   c             ·   0   1   b   c
0   0   1   b   c             0   0   0   0   0
1   1   0   c   b             1   0   1   b   c
b   b   c   0   1             b   0   b   c   1
c   c   b   1   0             c   0   c   1   b
    Addition in F4                Multiplication in F4

From the addition table, we note that ( F4, +) is isomorphic to the Klein group
V ≅ Z2 ⊕ Z2 , and from the multiplication table that ( F4×, · ) is isomorphic to the
cyclic group C3. It turns out to be true in general ([Hu] p. 279) that for q = pm
with any prime p the additive group ( Fq, +) is isomorphic to Z p ⊕ · · · ⊕ Z p (m
summands), and the multiplicative group ( Fq×, · ) is cyclic, of order q − 1.
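The two F4 tables can also be checked mechanically. The following Python sketch (our own; encoding the table rows as strings is an arbitrary choice) verifies the field axioms by brute force:

```python
E = ['0', '1', 'b', 'c']
# Rows of the two tables above, in the order 0, 1, b, c.
add_rows = ['01bc', '10cb', 'bc01', 'cb10']
mul_rows = ['0000', '01bc', '0bc1', '0c1b']
add = {(E[i], E[j]): add_rows[i][j] for i in range(4) for j in range(4)}
mul = {(E[i], E[j]): mul_rows[i][j] for i in range(4) for j in range(4)}

# Commutativity and identities:
assert all(add[a, b] == add[b, a] and mul[a, b] == mul[b, a]
           for a in E for b in E)
assert all(add['0', a] == a and mul['1', a] == a for a in E)
# Distributivity: a*(b + c) = a*b + a*c for all triples.
assert all(mul[a, add[b, c]] == add[mul[a, b], mul[a, c]]
           for a in E for b in E for c in E)
# Every nonzero element has a multiplicative inverse.
assert all(any(mul[a, x] == '1' for x in E) for a in E if a != '0')
print("F4 tables satisfy the field axioms")
```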
Matrix groups over finite fields. As an example of applications to group theory,
we consider the general linear group GL( n, Fq ) ≡ GL( n, q ) with q = pm , p prime,
and integer m ≥ 1. Since the field over which it is defined is finite, GL( n, q ) must
have finitely many elements. How many elements are there?
If n = 1, GL(1, q ) ≅ Fq× = ( Fq×, · ), which has q − 1 elements.
If n = 2, any matrix M ∈ GL(2, q ) has rows [ a, b ] and [ c, d ], with a, b, c, d ∈ Fq ,
such that ad ≠ bc for M to be invertible, i.e. for det M ≠ 0. We know that the
determinant of a matrix is nonzero if and only if the rows of the matrix are linearly
independent (Chap. 2 Sec. 2.1 #10). The first row [ a, b ] can be anything other than
[0, 0], so there are q2 − 1 possibilities for [ a, b ]. The second row [ c, d ] must be
linearly independent from the first, i.e. can be anything other than a multiple of
the first (must not be x[ a, b ] where x ∈ Fq ). So there are q2 − q possibilities for the
second row. Thus, finally, there are ( q2 − 1)( q2 − q ) matrices in GL(2, q ).
These arguments can be generalized to GL( n, q ) for any positive integer n.
Let M ∈ GL( n, q ). The first row of M can be anything other than the zero row,
so there are ( qn − 1) possibilities. The second row must be linearly independent
from the first, i.e. can be anything other than a multiple of the first, so there
are ( qn − q ) possibilities for the second row. Generalizing, the ith row must be
linearly independent from all the first ( i − 1) rows, i.e. it cannot be a linear combi-
nation of the first ( i − 1) rows with combination coefficients in Fq . Since there are
qi−1 linear combinations of the first ( i − 1) rows, we have ( qn − qi−1 ) possibilities
for the ith row. So, in the end, there are ( qn − 1)( qn − q ) · · · ( qn − qn−1 ) matrices
of dimension n × n whose rows are mutually linearly independent. Thus, the
group GL( n, q ) has ∏_{i=0}^{n−1} ( qn − qi ) elements.
The special linear group SL( n, F ) is defined to be a subgroup of GL( n, F ), such
that SL( n, F ) = { M ∈ GL( n, F )| det M = 1} . The det function maps GL( n, F ) to
F× , which is a surjective homomorphism. The kernel of this homomorphism
is precisely SL( n, F ), from which it follows that GL( n, F )/SL( n, F ) ≅ F× (see
the next two sections, 1.4–1.5; also the First Isomorphism Theorem in [Hu] p. 44). So
|SL( n, F )| = |GL( n, F )| / | F× |, which leads to the order of the finite special linear
group: |SL( n, Fq )| = [ ∏_{i=0}^{n−1} ( qn − qi ) ] / ( q − 1).
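Both counting formulas are easy to test numerically. The sketch below (our own illustration) computes |GL(n, q)| and |SL(n, q)| from the formulas, and confirms |GL(2, 2)| = 6 by enumerating all 2 × 2 matrices over Z2:

```python
from itertools import product

def gl_order(n, q):
    """|GL(n, q)| = (q^n - 1)(q^n - q) ... (q^n - q^(n-1))."""
    result = 1
    for i in range(n):
        result *= q**n - q**i
    return result

def sl_order(n, q):
    """|SL(n, q)| = |GL(n, q)| / (q - 1)."""
    return gl_order(n, q) // (q - 1)

# Brute-force check for GL(2, 2): count 2x2 matrices over Z2 with ad - bc != 0.
count = sum(1 for a, b, c, d in product(range(2), repeat=4)
            if (a * d - b * c) % 2 != 0)
assert count == gl_order(2, 2) == 6
print(gl_order(2, 3), sl_order(2, 3))   # 48 24
```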

1.2 Subgroups
A group can also be analyzed in terms of subsets of elements; well-designed
subsets can sharpen the differences and similarities between the group elements
and yield useful information about the group itself. In physics, this corresponds
to introducing a boundary condition or an interaction in a physical system that
leaves the system with a reduced symmetry. We will consider two such types of
subsets: subgroups, to be discussed in this section, and equivalence classes, to be
studied later in the chapter. The notion of set is of course used in other ways too.
For example, if X is the set of generators of a group G, then one may refer to G by
⟨X⟩ to emphasize the special role of X. Thus, ⟨a⟩ = { ak | k ∈ Z } is the signature
of a cyclic group of some order generated by a.
Let G be a group and H a nonempty subset of G. If for every a, b ∈ H we
have ab ∈ H, then H is said to be closed under multiplication in G.
Definition 1.2 (Subgroup). A nonempty subset H of a group G is said to be a subgroup
of G if it is closed under the product in G, and is itself a group under the product in G.
This is denoted by H < G.
When the multiplication table for a group G is available, its subgroups can be
identified by simple inspection. Otherwise, a test based on the following result
can be used: A nonempty subset H of a group G is a subgroup if and only if ab−1 ∈ H
for all a, b ∈ H. To prove this proposition, remark that the identity e of G, being
unique, is also the identity of H, and the inverse of a ∈ H is a−1 in G. The direct
proposition is obvious. For the converse, we assume ab−1 ∈ H for all a, b ∈ H.
By this assumption we have e = aa−1 ∈ H for any a ∈ H, and b−1 = eb−1 ∈ H
for any element b ∈ H. Then as we now know that b−1 ∈ H for every b ∈ H, we
have ab = a ( b−1)−1 ∈ H, which means H is closed under the product in G. The
product in H is associative as it is in the group G. Hence H is a subgroup.
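The one-step test ab−1 ∈ H is easy to automate for small groups. Here is a Python sketch of our own for the additive group Zn, where ab−1 becomes a − b mod n:

```python
def is_subgroup(H, n):
    """One-step subgroup test for H inside (Z_n, + mod n)."""
    # In additive notation a*b^-1 is a - b; check (a - b) mod n stays in H.
    return bool(H) and all((a - b) % n in H for a in H for b in H)

print(is_subgroup({0, 3, 6, 9}, 12))   # True: the cyclic subgroup generated by 3
print(is_subgroup({0, 1, 2}, 12))      # False: 1 - 2 = 11 (mod 12) is not in H
```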
Every group G has at least two subgroups: ⟨e⟩ and G itself. They are trivial
subgroups. Any other subgroup that may exist is said to be non-trivial or proper.
EXAMPLE 7: If for an element a of a group G, the set { a, a2, . . . , an−1 , an = e } (where
n is the smallest positive integer for which an = e) forms a subgroup (the inverse
of ak being an−k , with 0 ≤ k ≤ n), then this subgroup is called a cyclic subgroup
of G, and denoted ⟨a⟩. The order of element a equals the size of the subgroup ⟨a⟩.
EXAMPLE 8: Given a group G, the set { c ∈ G | cx = xc for all x ∈ G } is an abelian
subgroup of G, called the center (Zentrum) of G and denoted C ( G ) (or Z( G )). On
the other hand, CG ( x) = { c ∈ G | cx = xc } is called the centralizer of x ∈ G. Two
simple examples are: Z(GL( n, F )) = { a · In | a ∈ F× } for the general linear group;
and Z(SL( n, F )) = { a · In | a ∈ F× , an = 1} for the special linear group.
EXAMPLE 9: For a fixed positive integer n, consider the group G = GL( n, F )
of invertible n × n matrices over the field F. We now verify that the subset H
of invertible matrices with determinant equal to one under matrix multiplica-
tion is a group, and hence a subgroup of G. Let A, B ∈ H, then det ( AB−1) =
det A/ det B = 1, and so AB−1 ∈ H. This suffices for H to be a subgroup of G;
we write this as SL( n, F ) < GL( n, F ). More on these groups in Chapter 6.

EXAMPLE 10: As another example, consider the subset H of orthogonal matrices


A over a field F, defined by the condition AT A = I, or equivalently AT = A−1,
where AT is the transpose of A, i.e. with elements ( AT )ij = A ji . Since det AT =
det A, we have det( AT A ) = (det A )2 = 1, which says that det A = ± 1 for all
orthogonal matrices A. Now for any A, B ∈ H, we know that AT = A−1 and
BT = B−1, and so ( AB−1)T ( AB−1) = ( B−1)T AT AB−1 = ( B−1)T B−1 = ( BBT )−1 = I.
It follows that AB−1 ∈ H and H < G. The set of orthogonal matrices forms a
group, called the orthogonal group O ( n, F ), another subgroup of GL( n, F ).
E XAMPLE 11: Let A be a complex matrix and A† its complex adjoint (defined
by ( A† )ij = A∗ji , with ∗ denoting complex conjugation); then A is said to be a
unitary matrix if it satisfies A† A = I, or A† = A−1. The set H of all such matrices
forms a group, because if A, B ∈ H then AB−1 is also unitary: ( AB−1)† ( AB−1) =
( B−1)† A† AB−1 = ( B−1)† B−1 = ( BB† )−1 = I, and so is in H. The set of unitary matrices is
the unitary group U( n, C ), a subgroup of GL( n, C ). 
The special linear, unitary, and orthogonal groups are the more familiar ex-
amples of classical groups. When the field F over which they are defined is real
(R) or complex (C), they are just the classical Lie groups (cf. Chapter 6). When the
underlying field F is finite (having a finite number q of elements), they are called
the classical groups of Lie type, and denoted SL( n + 1, q ) ≡ A n ( q ), SO (2n + 1, q ) ≡
Bn ( q ), Sp( n, q ) ≡ Cn ( q ), and SO(2n, q ) ≡ Dn ( q ).

Homomorphism
In analyzing group structure, it is important to relate different groups and to
compare them. Here two notions are helpful: maps that preserve group structure,
and distinctiveness of groups.
Definition 1.3 (Homomorphism, Isomorphism). For given groups G and G0 , a func-
tion f : G → G0 is a homomorphism if f ( ab) = f ( a) f (b) for all a, b ∈ G. If f is a
one-to-one correspondence, f is called an isomorphism, and G and G0 are then said to
be isomorphic (denoted G ∼ = G0 ). An isomorphism of a group with itself is called an
automorphism.
The expression ‘ f is a one-to-one correspondence’ (or a bijection) means that
for f : G → G0 there exist both a map g : G0 → G such that g ◦ f is the identity map
on G and a map h : G0 → G such that f ◦ h is the identity map on G0 , and that g = h. This map
g is called a two-sided inverse of f . The two-sided inverse of an isomorphism f
is unique and denoted f −1. In terms of group structure, isomorphic groups are
considered the same, whereas non-isomorphic groups are distinct.
Let f : G → G0 be a homomorphism of groups, and let e be the identity of G and
e0 the identity of G0 ; then f ( e) = e0 and f ( a−1) = f ( a)−1 for any a ∈ G. We
can see this as follows: First, f ( a) = f ( ae) = f ( a) f (e) and so f ( a )−1 f ( a) = f ( e)
and e0 = f ( e) follows. Secondly, e0 = f ( e) = f ( aa−1) = f ( a) f ( a−1), from which
f ( a )−1e0 = f ( a−1), and the final result: f ( a)−1 = f ( a−1).
Definition 1.4 (Kernel, Image). Let f : G → G0 be a homomorphism of groups, and e0
be the identity of G0 . The kernel of f is the set Ker f = { a ∈ G | f ( a) = e0 ∈ G0 } . If S
is a subset of G, then the image of S is f ( S) = { b ∈ G0 | b = f ( a ) for some a ∈ S } ; and
f ( G ) is called the image of f and denoted Im f .
It is clear that Ker f is a subgroup of G: for any a, b ∈ Ker f ,
one has ab−1 ∈ G, for which f ( ab−1) = f ( a ) f (b−1) = f ( a ) f (b)−1 = e0 ( e0 )−1 = e0 ,
and so ab−1 ∈ Ker f . By the criterion of Example 9, this suffices for Ker f to be a subgroup of G.
E XAMPLE 12: Given any two groups G and G0 , the map f : G → G0 defined by
a 7 → e0 ∈ G0 for all a ∈ G is a homomorphism, with Ker f = G.
E XAMPLE 13: If G is an abelian group, the map a 7 → a−1 for all a ∈ G is an
isomorphism (in fact, an automorphism, because f : G → G), with Ker f = h e i.
The map a 7 → a2 is likewise a homomorphism of G to itself, but its kernel is the
set of all elements satisfying a2 = e, which need not reduce to h e i.
E XAMPLE 14: Let G = GL( n, F ) and A any element of G, and consider different
mappings of G. First, let f : G → F× and f ( A) = det A; this is a homomorphism,
and as 1 is the identity element of F× , the kernel of f = det is SL( n, F ).
Second, consider the map A 7 → AT A of G into itself; it is not a homomorphism,
but the set of matrices it sends to the identity, { A ∈ G | AT A = I } , is precisely
the subgroup O ( n, F ). Finally, restricting G to GL( n, C ), the condition A† A = I
singles out the subgroup U( n, C ). 
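The kernel of det can be exhibited by brute force over a small finite field. The sketch below (our own Python illustration, with names of our choosing) enumerates GL(2, F3) and picks out its subgroup SL(2, F3) as the kernel of the determinant:

```python
from itertools import product

p = 3                      # the finite field F_3 = {0, 1, 2}

def det2(m):
    """Determinant mod p of a 2x2 matrix stored as a flat tuple (a, b, c, d)."""
    a, b, c, d = m
    return (a * d - b * c) % p

GL = [m for m in product(range(p), repeat=4) if det2(m) != 0]
SL = [m for m in GL if det2(m) == 1]    # kernel of det : GL(2,3) -> F3x
# |GL(2,3)| = (3^2 - 1)(3^2 - 3) = 48, and |SL(2,3)| = 48 / |F3x| = 24
```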
Cyclic groups. A multiplicative cyclic group C = h a i is one that can be generated
by a single element a, so that all its elements are expressible as ak , where k ∈ Z. If
there is no nonzero k for which ak = e, then the elements of C are all distinct, and
C is a countably infinite cyclic group, isomorphic to the additive group Z under
the map ak 7 → k. If C is finite, of order n, i.e. if n is the smallest positive integer
such that an = e, then C consists of the n distinct elements a, a2, . . . , an−1 , an = e.
It is isomorphic to the additive group of integers modulo n under the map ak 7 → k
(mod n ). Thus, the structure of every cyclic group is completely known and can
be concisely described as follows:
Every infinite cyclic group is isomorphic to the additive group Z, and every finite cyclic
group of order n is isomorphic to the additive group Zn , which we write as Cn ∼ = Zn .

1.3 Cosets
Cosets are sets of equivalent elements in a sense to be defined; we will see
that this concept leads to the important result that, for finite groups, the number-
theoretic properties of their size strongly constrain their structure.
Let A be a set of elements a, b, c, . . . ; let R be a subset of the Cartesian product
A × A with the following properties (i) ( a, a) ∈ R for all a ∈ A; (ii) If ( a, b ) ∈ R,
then ( b, a ) ∈ R; and (iii) If ( a, b ) ∈ R and ( b, c ) ∈ R, then ( a, c ) ∈ R. These
three properties are referred to as reflexivity, symmetry, and transitivity. If they
are satisfied by R, then R is said to be an equivalence relation on A. If ( a, b ) ∈ R,
then we say that a is equivalent to b under R, and write a ∼ b or a ≡ b. For any
a ∈ A, the equivalence class of a under R is the set of all those elements of A that
are equivalent to a under R (denoted ā, or [ a ]), that is ā = { x ∈ A | x ∼ a } . Any
representative b ∈ ā defines in turn an equivalence class which coincides with ā,
i.e. b̄ = ā. The particular equivalence relation that interests us now is defined as
follows.
Definition 1.5 (Congruence modulo H). Let H be a subgroup of a group G, and a, b
any elements of G. a is right-congruent to b modulo H, denoted a ≡ r b mod H, if
ab−1 ∈ H. a is left-congruent to b modulo H, denoted a ≡ ` b mod H, if a−1 b ∈ H.
With G and H < G as defined above, right and left congruence modulo H
coincide if G is abelian: suppose ab−1 ∈ H, then as ( ab−1)−1 = ba−1 = a−1 b and
( ab−1)−1 ∈ H, we have a−1 b ∈ H. But for a typical nonabelian group they do not
coincide, and when they do, we have a very interesting situation, as we shall see
later. To keep our discussion simple, we consider just left congruence, but keep
in mind that analogous results also apply to right congruence.
Left congruence modulo H < G is an equivalence relation on G. The equiv-
alence class of an element a ∈ G under left congruence modulo H is the set
of all those elements x of G that are left-congruent to a modulo H, precisely
ā` = { x ∈ G | x ≡ ` a mod H } . If x ≡ ` a mod H then x−1 a ∈ H. Let x−1 a =
h−1, or x = ah, then ā` = { ah| for all h ∈ H } = aH. Similarly, we also have
ār = { x ∈ G | x ≡ r a mod H } = { ha| for all h ∈ H } = Ha. Note that in general
Ha 6 = aH.
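In concrete terms (a Python sketch of ours, with permutations stored as tuples of images), taking G = S3 and the order-2 subgroup H generated by a transposition exhibits aH 6 = Ha directly:

```python
def compose(p, q):
    """Product of permutations: apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

e, t = (0, 1, 2), (1, 0, 2)          # identity and a transposition
H = [e, t]                           # an order-2 subgroup of S3
a = (1, 2, 0)                        # a 3-cycle, not in H
left = {compose(a, h) for h in H}    # the left coset aH
right = {compose(h, a) for h in H}   # the right coset Ha
# left != right: in a nonabelian group, aH and Ha differ in general
```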
C OMMENTS . (a) With the notation introduced in this section, and in particular
ā = a + h m i = a + mZ for the equivalence class containing a under congruence
mod m in Z, we have Zm = { 0̄, 1̄, . . . , m − 1} , a less ambiguous notation than the
customary { 0, 1, . . . , m − 1} (as used in p. 3).
(b) Congruence modulo H < G in G is a generalization of the more familiar
concept of congruence mod m in additive group Z. The correspondence is shown
in the following table.
congruence modulo m in Z congruence modulo H in G
(additive notation) (multiplicative notation)
Z, h m i = { mk | k ∈ Z } = mZ G, H < G
a ≡ b mod m ⇒ a − b ∈ h m i a ≡ ` b mod H ⇒ a−1 b ∈ H
ā = { x ∈ Z | x ≡ a mod m } ā` = { x ∈ G | x ≡ ` a mod H }
= a + h m i = a + mZ = aH
Z = 0̄ ∪ 1̄ ∪ . . . ∪ m − 1 G = H ∪ aH ∪ bH ∪ . . . 
As Ha = H and aH = H for every a ∈ H, where H < G, we now restrict to
elements a of G that are not in H, and introduce
Definition 1.6 (Cosets). Let H be a subgroup of a group G and g an element of G that
is not in H. Then the set gH, with the same product law as for G, is called a left coset of
H in G, and Hg is called a right coset of H in G.
The cosets have a geometrical interpretation: Just as we may view a + h m i as
h m i shifted by a, so too we may think of gH (resp. Hg) as a set obtained by left
(right) translating the subgroup H by g. H is a subgroup, but its cosets are not
subgroups, because for one thing they do not contain the identity element. Each
coset has the same number of elements as H, but has no elements in common
with it (because otherwise we would have ghi = h j , or equivalently g = h j ( hi )−1 ∈
H, contrary to the assumption that g 6 ∈ H). In other words, H and gH are disjoint;
similarly for H and Hg.
Two left cosets (or two right cosets) of a subgroup H either coincide completely, or
else have no elements in common. Let gH and g0 H be two left cosets of H. If there
are no hi , h j ∈ H such that ghi = g0 h j, then gH and g0 H are disjoint, having no
common elements. If on the contrary there are two elements hi , h j ∈ H such that
ghi = g0 h j , then g−1 g0 = hi ( h j )−1 is in H, which implies g−1 g0 H = H, or
g0 H = gH.
Now, let a group G of order | G | contain a subgroup H of order | H |. If H equals
G, then obviously | G | = | H |. If not, let g2 be an element of G not in H, and form
the coset g2 H. We have two sets, each consisting of | H | distinct elements, namely
H and g2 H, which are both contained in G. If together they exhaust G, we have
G = H ∪ g2 H and | G | = 2| H |. If not, we repeat the process, picking some element
g3 of G that is neither in H nor in g2 H. The coset g3 H yields | H | new elements of
G that are neither in H nor in g2 H. We repeat until we recover all the elements of
G, obtaining G as the union of a finite number of mutually disjoint sets, every
one of which contains | H | elements:

G = H ∪ g2 H ∪ g3 H ∪ . . . ∪ gk H . (1.3)

Thus, the order of G is | G | = k | H |, i.e. an integral multiple k of the size of H.


This number k is a positive integer or ∞, called the index of H in G, and denoted
by k = [ G : H ]. The index [ G : H ] is the number of distinct left cosets of H in
G (including H itself); it is the number of left translates of H needed to recover G
completely. Formally, we have
Theorem 1.1 (Lagrange). If H is a subgroup of a group G, then the orders of G and H
are related by | G | = k | H |, where k = [ G : H ].

The theorem has two immediate corollaries for finite groups:


(1) If G is finite, the order of any subgroup of G divides | G |.
(2) If G is finite, the order of any element of G divides | G |.
(See it this way: Let H = h a i be the cyclic subgroup generated by any a ∈ G
of order n = | H |. By Theorem 1.1, we know that | H | divides | G |.)
Lagrange’s theorem is of basic importance in the study of finite groups. It has
many applications, of which some of the simplest are:
(1) Any group of prime order is cyclic. A group G of prime order p may have
only elements of order 1 or p. If g is an element other than the identity, it has
order p, which means h gi has size p. It follows that G = h gi.
(2) If the size of a group G is given by the product of two primes, then any proper
subgroup that G may have is cyclic. Let | G | = pq, where p and q are primes. A proper
subgroup will have the size equal to either p or q, and so is cyclic in either case.
(3) A proper subgroup of a finite group has order at most half the order of the group.
If H is a proper subgroup of a finite group G, then since [ G : H ] ≥ 2, we have
| G | ≥ 2| H |, or | H | ≤ | G | /2.
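Lagrange's theorem is easy to confirm by machine on a small case; the Python sketch below (our illustration, not part of the text) decomposes S3, of order 6, into the left cosets of an order-2 subgroup:

```python
from itertools import permutations

def compose(p, q):
    """Product of permutations: apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))     # the group S3, |G| = 6
H = [(0, 1, 2), (1, 0, 2)]           # a subgroup of order 2
cosets = {frozenset(compose(g, h) for h in H) for g in G}
# [G : H] = 3 distinct left cosets, pairwise disjoint, so |G| = 3 * |H|
```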
C OMMENTS .
(a) We have considered left cosets, but we would have obtained the same rela-
tion | G | = [ G : H ] · | H | with right cosets; i.e. there are as many left cosets as right
cosets of any subgroup in a group. Right cosets are in general not the same as left
cosets; only their numbers are equal. In the special case of [ G : H ] = 2, there are
just one left coset and one right coset, and they are the same.
(b) Even when G and H are infinite groups, [ G : H ] may still be finite. This
occurs for example with G = Z, H = h m i, where m ∈ N ∗ , then [ G : H ] = m, so
that we have the disjoint covering Z = 0̄ ∪ 1̄ ∪ · · · ∪ ( m − 1).
(c) The relation | H | ≤ (1/2)| G| can be used for a rapid computer search for
elements of a finite group that do not belong to a subgroup H of G.
(d) A given subgroup of a finite group and its distinct left (or right) cosets
together make up the whole group. Since a group contains in general several
subgroups, it may be partitioned in several different ways, each giving a coset
decomposition of the group.
(e) We can reduce the full symmetry a physical system may have to a sub-
group by imposing a constraint on the system. For example, we can place a rod
through the corners of a square, reducing the possible symmetry transformations,
thus forming a subgroup H. By moving the rod to a new position via a symmetry
operation g, we obtain a new set of operations gH. If we repeat enough times
using different g we can recover all the symmetry transformations the square
would have in the absence of constraints. The sets of operations gH that we need
to recover all the symmetries of the system are just the cosets. 
E XAMPLE 15: Groups of order 6. We already have characterized all groups of
orders 1–4, and have learned that all groups of order 5 (or any prime) are cyclic.
Let us now find all distinct groups of order 6. Each of the elements of such a
group, call it G, must have an order equal to one of the divisors of 6 (i.e. 1, or 2,
or 3, or 6).
Assume first that G has an element a of order 6; then G = h a i is a cyclic group.
Any other possible structure can only have elements of orders 1, 2, 3. So assume
next that G has an element a of order 3, giving the cyclic subgroup h a : a3 =
e i. If G also contains another distinct element b, then it contains the six distinct
elements of the set X = { e, a, a2, b, ba, ba2} , with b of order 2 or 3.
Now, if b3 = e, then b2 must be one of the elements of X; but each such case
would lead to contradictions of the assumptions, and so b cannot have order 3.
Let b2 = e and examine the product ab, which must be one of the six listed
elements. It cannot be e, a, a2, or b, because that would lead to contradictions. Is
ab = ba? If it were so, then ( ab )2 = a2 b2 = a2 6 = e and ( ab )3 = a3 b3 = b 6 = e,
while ( ab )6 = a6 b6 = e; the element ab would then have order 6, contrary to
assumption. So we are left with the last possibility:
ab = ba2. This implies ( ab )2 = abab = ab2 a2 = a3 = e. Element ab is of order
2, which does not contradict any assumptions, and we obtain a second group of
order 6, which is not isomorphic to h a i. It is the lowest-order nonabelian group
that exists. In conclusion, there are just two distinct abstract groups of order 6:
the cyclic group C6 and a dihedral group D3 = h a, b i defined by the relations
a3 = b2 = e, ab = ba−1. (This turns out to be a special case of a general result,
which states that ‘Every group of order 2p, where p is an odd prime, is either the cyclic
group C2p or the dihedral group D p .’ See [Hu] p. 97.)
By taking all possible products of its elements we can establish its multiplica-
tion table, as shown below (note that bam = a−m b, m = 1, 2, 3):

h a, bi e b ba ba2 a a2
e e b ba ba2 a a2
b b e a a2 ba ba2
ba ba a2 e a ba2 b (1.4)
ba2 ba2 a a2 e b ba
a a ba2 b ba a2 e
a2 a2 ba ba2 b e a
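A table such as (1.4) can be checked mechanically by realizing the generators concretely; below (a Python sketch of ours, with a composition convention of our choosing) a and b are taken as the permutations of Example 17, and the defining relations are verified:

```python
def compose(p, q):
    """Product of permutations: apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

e = (0, 1, 2)
a = (1, 2, 0)        # realizes the rotation a as a 3-cycle
b = (1, 0, 2)        # realizes the reflection b as a transposition

a2 = compose(a, a)
ab = compose(a, b)
ba2 = compose(b, a2)
# defining relations: a^3 = b^2 = e and ab = ba^2, i.e. b a^m = a^{-m} b
```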

This nonabelian group h a, b i can be realized by the symmetry transformations
of an equilateral triangle, which may be the identity transformation, the reflec-
tions about the medians passing through the vertices, and the counterclockwise
rotations in the triangle plane about its centroid through the angles of 2π/3 and
4π/3. The six symmetry transformations described form a group of symmetries of
the equilateral triangle, isomorphic to D3. Examples (in three dimensions) of this
kind of symmetry are found in molecules with trigonal pyramidal geometry, such
as NH3 (ammonia) and PCl3 (phosphorus trichloride).
E XAMPLE 16: Subgroups and cosets in C6 and D3 . The cyclic group C6 = h a i,
with a6 = e, has two proper subgroups, H1 = { e, a3} and H2 = { e, a2, a4 } . As
a, a2, a4, a5 6 ∈ H1, we form the left cosets of H1 in C6: aH1 = { a, a4} , a2 H1 =
{ a2, a5 } , a4 H1 = aH1, and a5 H1 = a2 H1. The left cosets of H2 with a, a3, a5 6 ∈ H2
are: aH2 = { a, a3, a5 } = a3 H2 = a5 H2. Since we are dealing with abelian groups
and subgroups, left and right cosets coincide, and so C6 admits two decomposi-
tions: C6 = H1 ∪ aH1 ∪ a2 H1 and C6 = H2 ∪ aH2.
The nonabelian group D3 = h a, b i with a3 = b2 = ( ab )2 = e has three order-2
(abelian) subgroups, K 1 = { e, b }, K 2 = { e, ba}, K 3 = { e, ba2} ; and one order-3
(abelian) subgroup, H = { e, a, a2} . The left cosets of K 1 in D3 are aK 1 = ba2 K 1 =
{ a, ba2} and a2 K 1 = baK 1 = { ba, a2} ; and the corresponding right cosets are
K 1 a = K 1ba = { a, ba} and K 1 a2 = K 1 ba2 = { a2, ba2} , which differ from the
corresponding left cosets. This produces two distinct coset-decompositions of
the group: D3 = K 1 ∪ aK 1 ∪ a2 K 1 and D3 = K 1 ∪ K 1 a ∪ K 1 a2.
As for the subgroup H = { e, a, a2} , which does not contain b, ba, ba2 ∈ D3 ,
we calculate the left coset bH = { b, ba, ba2} , and realize that not only bH = Hb,
but also gH = Hg for every g ∈ D3 (with gH = bH for every g 6 ∈ H), leading to the unique partitioning of D3 :
D3 = H ∪ bH.
E XAMPLE 17: Group S3. If the vertices of an equilateral triangle (Fig.1.1) are la-
beled counterclockwise by 1,2,3, then the reflections interchange any two labels
leaving the third one unchanged; thus, b 7 → (12), ba 7 → (23), and ba2 7 → (31),
where for example (12) means the interchange of 1 and 2, leaving 3 unchanged.
Counterclockwise rotations by 2π/3 and 4π/3 have the effects of cyclic permu-
tations of the three labels, thus a 7 → (123) and a2 7 → (321). The resulting group,
isomorphic to D3, is called the symmetric group S3, which is of order 3! = 6. This is a
simple example of the general result (Cayley’s theorem) that any finite group can
be realized as a subgroup of a symmetric group. 
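Cayley's construction itself can be sketched in a few lines: each element g of a finite group G defines the permutation x 7→ gx of the set G, and these permutations form a subgroup of the symmetric group on |G| letters isomorphic to G. A minimal Python illustration (our choice of example, with G = Z4 under addition):

```python
def compose(p, q):
    """Product of permutations: apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

n = 4
G = range(n)                                          # Z4 under addition mod 4
perm = {g: tuple((g + x) % n for x in G) for g in G}  # left translations x -> g + x
# the map g -> perm[g] is injective and preserves products:
# perm[(g + h) % n] equals compose(perm[g], perm[h]) for all g, h
```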
Figure 1.1: An equilateral triangle.

1.4 Normality and Conjugation


The condition for a subgroup of a group to be a normal divisor, or normal, is
related to the notion of conjugation, another equivalence relation. Both concepts
play an important role in determining the structure of finite groups.

Definition 1.7 (Normal subgroup). A subgroup N of a group G is said to be normal
in G (or a normal subgroup of G) if it satisfies any one of the following (equivalent) conditions:
(i) Every left coset of N in G is a right coset of N in G;
(ii) aN = Na for every a ∈ G;
(iii) aNa−1 ⊂ N for all a ∈ G;
(iv) aNa−1 = N for all a ∈ G.

We shall write N / G to mean N is a normal subgroup of G. We now prove the
equivalence of the four conditions, with N < G as defined.
( i) ⇒ ( ii): Assume for any a ∈ G, there exists some b ∈ G such that aN = Nb.
Then since a ∈ aN ∩ Na = Nb ∩ Na, the two right cosets Na and Nb are not
disjoint and so must be equal. Hence aN = Na for every a ∈ G.
( ii) ⇒ ( iii): Assuming (ii) holds, for any h ∈ N we have aha−1 = h0 ∈ N, or
aNa−1 ⊂ N for every a ∈ G. This holds also for a−1 ∈ G, i.e. a−1 Na ⊂ N.
( iii) ⇒ ( iv ) Now we see that for every h ∈ N, h = aa−1haa−1 = a ( a−1ha ) a−1 ∈
aNa−1, which implies N ⊂ aNa−1. The two conclusions taken together show that
N = aNa−1 for every a ∈ G.
( iv ) ⇒ ( i): Let a ∈ G but a 6 ∈ N, and some h ∈ N, so that b = ah is in G but
not in N. Then Nb = aNa−1b = aNh = aN. 
E XAMPLE 18: Every group G has at least two normal subgroups, namely, h e i and
G itself. Every subgroup of a cyclic group is normal. Every subgroup of Q8 is a
normal subgroup, although the group is non-abelian. Any subgroup of index 2
of any group is normal. The subgroup { e, a, a2} of D3 = h a, b i has index 2, so is
normal, but its order-2 subgroups { e, b }, { e, ba} and { e, ba2} are not. 
The action of a ∈ G on N as used in ( iii) is called conjugation by a, and the
element axa−1 is called a conjugate of x. Generally, an element y of a group
G is said to be conjugate to x ∈ G if there is some element g ∈ G such that
gxg−1 = y. This is an equivalence relation in G since it is reflexive, symmetric, and
transitive (see p. 10). The equivalence classes of G under conjugation are called
the conjugacy classes of G; the conjugacy class (or orbit) of an element x ∈ G is
the set x̄ = { gxg−1 | g ∈ G } , in which each element is counted just once.
The number of elements in x̄ is called its size, | x̄|.
Conjugacy classes have simple properties (which follow from the definition
and the general properties of equivalence classes), namely,
(1) In every group G the identity is a conjugacy class, ē, by itself.
(2) No conjugacy class other than ē is a subgroup.
(3) Every element of G belongs to some conjugacy class.
(4) All elements in a conjugacy class have the same order (( gxg−1)n = e iff
xn = e). The converse is not true in general: different elements having the same
order may be found in different conjugacy classes.
(5) The conjugacy classes of a group G are either disjoint or equal. So they
provide a disjoint covering of the group: G = ∪ ḡ (the union over distinct classes), so that | G | = ∑ḡ | ḡ| (cf.
| G | = [ G : H ]| H | for cosets of H < G). An important result (which we do not
prove here) is that in a finite group G each conjugacy class has size dividing | G |. The
reason for this is that, while ḡ 6 = ē is not a subgroup of G, its size is the index
[ G : CG ( g)] (where CG ( g) = { x ∈ G | xg = gx} is the centralizer of g ∈ G), which
divides | G |.
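These properties can be confirmed computationally; the Python sketch below (ours, not from the text) computes the conjugacy classes of S3 (isomorphic to D3) and checks that the class sizes 1, 2, 3 each divide | G | = 6 and sum to it:

```python
from itertools import permutations

def compose(p, q):
    """Product of permutations: apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

G = list(permutations(range(3)))     # S3, of order 6
classes = {frozenset(compose(g, compose(x, inverse(g))) for g in G)
           for x in G}
sizes = sorted(len(c) for c in classes)   # the class equation of S3
```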
E XAMPLE 19: In an abelian group G, each element forms a class by itself. No two
elements are conjugate.
E XAMPLE 20: The dihedral group D3 (h a, b i with a3 = b2 = ( ab)2 = e) has its six
elements distributed among three conjugacy classes: ē = { e }, ā = { a, a2} , and
b̄ = { b, ba, ba2} , of sizes 1, 2, 3, partitioning G according to D3 = ē ∪ ā ∪ b̄, so that
6 = 1 + 2 + 3. All elements of the same order are grouped together in the same
conjugacy class (an exception to the general situation). The center is trivial, h e i.
E XAMPLE 21: In the quaternion group Q8 = {± 1, ± i, ± j, ± k } there are five conju-
gacy classes: { 1} , {− 1}, { i, − i }, { j, − j }, { k, − k }, with 3 classes sharing the 6 order-4
elements (e.g. i, − i, etc.). The center of Q8 is the set of elements in the conjugacy
classes of size 1, i.e. Z( Q8 ) = { 1, − 1}. 
Conjugation can be extended to subgroups. Thus, if H is a subgroup of a
group G, then for any g ∈ G the set gHg−1 = { ghg−1 : h ∈ H } is also a subgroup
of G, called a conjugate subgroup to H. Therefore by Definition 1.7 a subgroup
N < G is normal if and only if it is equal to all its conjugates: N = gNg−1 for all
g ∈ G. For this reason a normal subgroup is also called an invariant subgroup
of G. If x ∈ N, then all gxg−1 are in N (and conversely), which is equivalent to
saying that N < G is normal if and only if it contains elements of G in complete classes.
E XAMPLE 22: Every subgroup of an abelian group is normal, being conjugate
only to itself.
E XAMPLE 23: In the group D3, the order-3 subgroup { e, a, a2} contains the com-
plete classes ē and ā and so is normal; whereas all of its order-2 subgroups { e, b },
{ e, ba } and { e, ba2} only contain some elements of b̄, and so are not normal. 
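The criterion that a normal subgroup is made of complete classes is also easy to test by machine; in the Python sketch below (ours, with D3 realized as S3), the order-3 rotation subgroup passes the conjugation test while an order-2 reflection subgroup fails it:

```python
from itertools import permutations

def compose(p, q):
    """Product of permutations: apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def is_normal(G, H):
    """Check that g H g^{-1} is contained in H for every g in G."""
    Hs = set(H)
    return all(compose(g, compose(h, inverse(g))) in Hs
               for g in G for h in H)

G = list(permutations(range(3)))            # D3 realized as S3
H = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]       # {e, a, a^2}: complete classes
K = [(0, 1, 2), (1, 0, 2)]                  # {e, b}: only part of a class
```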
The presence or absence of non-trivial normal subgroups is a key property in
characterizing two important related categories of groups:
Definition 1.8. A group is said to be simple if it has no proper normal subgroups. A
group is said to be semisimple if none of its normal subgroups are abelian.

E XAMPLE 24: The only abelian simple groups are the cyclic groups of prime order
(having no proper subgroups at all). Cyclic groups with a non-prime order are
neither simple nor semisimple (all their subgroups are normal and abelian). The
nonabelian group D3 has a normal abelian subgroup, and so is neither simple
nor semisimple. Nonabelian simple groups of small order are rare; for example,
there are only two nonabelian simple groups of order less than 200, namely, the
alternating group A5, of order 60, and a subgroup of the symmetric group S7, of
order 168. See p. 28. 
C OMMENTS . A symmetry group may contain different types of transformations
(rotations, reflections) and so there has to be some way of reflecting these simi-
larities and differences in its description: this is the reason for having the concept
of conjugacy. In D3 = h a, b i, for example, the elements a and a2 regarded as
planar rotations through different angles about the center of the triangle are ge-
ometrically similar to each other, but quite different from reflections across the
median lines represented by b, ba, and ba2; in no way can a reflection be made
into a pure rotation, but reflections across different lines are essentially similar, as
are planar rotations about a point through different angles – just different points
of view. Group theory makes this distinction between the two sets precise by
putting them in distinct classes, ā and b̄. In addition, many properties are pre-
served under the group transformations; this will be automatically the case if
those properties derive from elements of a normal subgroup. Thus, the angular
momentum associated with the ā class is conserved under the full group D3.

1.5 Quotient and Product Groups


In this section, we discuss how to build new groups from old ones by division
(quotient group) or multiplication (direct product group).
Quotient Group
Let N be a normal subgroup of a group G, and S = { g1 N, g2 N, . . . , gk N } the
complete collection of cosets of N in G (with gi ∈ G, g1 = e and k = [ G : N ]). As
a set of sets, S has the following properties:
(i) ( gi N )( gj N ) = ( gi gj ) NN = ( gi gj ) N (because gN = Ng, NN = N);
(ii) ( gi N )( gj N gm N ) = ( gi N )( gjgm N ) = ( gi gj N )( gm N ) = ( gi N gj N )( gm N );
(iii) ( gi N ) N = gi N and N ( gi N ) = gi N;
(iv) ( gi N )( gN ) = N and ( gN )( gi N ) = N for g = ( gi )−1.
Those are the four characteristic group properties: the set S of elements gi N is
closed under the product of cosets; the product is associative; the identity element
of S is the normal subgroup N; and there is an inverse ( gi )−1 N in S for each element
gi N of S. So, S is a group under the product of cosets.
If N / G is a normal subgroup of a group G, then the set of all (left) cosets of N in G
is a group of order [ G : N ] under the binary operation ( gi N )( gj N ) = ( gi gj ) N,
with eN = N the identity element. It is called the quotient group or factor group,
written G/N and read “quotient of G by N”, or “G modulo N”.
Relationships between homomorphisms and quotient groups give us further
information about the structure of the latter.
Let f : G → G0 be a homomorphism between groups, and e and e0 the respective
identities of G and G0 . We now show that the kernel of f , defined by
Ker f = { a ∈ G | f ( a) = e0 } = K, is a normal subgroup of G.
(i) For any a, b ∈ K, f ( ab) = f ( a) f (b ) = e0 e0 = e0 , and so ab ∈ K (closure of K).
Next, since f ( e) = e0 (as established earlier for any homomorphism), the identity
e belongs to K (identity of K). Finally, for any a ∈ K, we have the inverse a−1 ∈ G,
and e0 = f ( e) = f ( aa−1) = f ( a) f (a−1) = e0 f ( a−1). Hence f ( a−1) = e0 , which
says that a−1 ∈ K (existence of inverse). These are the conditions for K < G.
(ii) For any a ∈ K and any g ∈ G, we have f ( gag−1) = f ( g) f (a ) f ( g−1) =
f ( g)e0 f ( g)−1 = e0 , which means that gag−1 ∈ K for any a ∈ K, i.e. gK g−1 ⊂ K
for all g ∈ G. This is a sufficient condition for the subgroup K of G to be normal,
which we write K / G.
Thus, with the kernel K of f : G → G0 being a normal subgroup of G, the
set of all cosets of K, including K itself, is a group G/K under multiplication
( gK )(hK ) = ghK. Furthermore, the quotient group G/K is isomorphic to the
image Im f , hence to G0 itself whenever f is surjective.
In the homomorphism between groups f : G → G0 , all the elements ai, i =
1, 2, . . . , k of K are sent to e0 . For some g ∈ G, the k elements ga1 , ga2 , . . . , gak of
G are mapped to f ( gai) = f ( g) f (ai) = f ( g)e0 = f ( g), with 1 ≤ i ≤ k. That is,
if f sends the k elements of K to e0 , it likewise sends k elements of G (the coset
gK) to the single element f ( g) of G0 . In particular, if b ∈ gK then b = ga with a ∈ K, and f (b ) = f ( ga) =
f ( g)e0 = f ( g), and so f has the same effect on every element of a coset, and it
follows we can define the map f¯ : G/K → G0 by the function f¯( gK ) = f ( g) for
all g ∈ G. Now, since f¯( gK hK ) = f¯( ghK ) = f ( gh) = f ( g) f (h) = f¯( gK ) f¯( hK ),
group multiplication is preserved and f¯ is a homomorphism.
If gK = hK, then f¯( gK ) = f¯( hK ). Conversely, suppose f¯( gK ) = f¯( hK );
then f¯( h−1 gK ) = f¯( h−1K ) f¯( gK ) = f¯( hK )−1 f¯( gK ) = e0 , so that h−1 g ∈ K =
Ker f , or h−1 gK = K. Hence gK = hK, the mapping is one-to-one, and f¯ is an
isomorphism of G/K onto Im f .
In summary: Let f : G → G0 be a homomorphism of groups. Then the kernel
of f , called K = Ker f , is a normal subgroup of G, and the quotient group G/K of G
by K can be formed. The map f¯ : G/K → G0 uniquely defined by f¯( gK ) = f ( g)
for all g ∈ G is an isomorphism onto the image of f , so that G/K ∼ = Im f ; in
particular, when f is surjective, the groups G/K and G0 are isomorphic, or
symbolically G/K ∼ = G0 .
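This can be watched at work on a small example (a Python sketch of ours): the parity map sign : S3 → { 1, − 1} is a surjective homomorphism whose kernel is the alternating group A3, and S3/A3 has order [ S3 : A3 ] = 2, matching | { 1, − 1} |:

```python
from itertools import permutations

def sign(p):
    """Parity of a permutation: +1 (even) or -1 (odd); a homomorphism to {1,-1}."""
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:   # count inversions
                s = -s
    return s

G = list(permutations(range(3)))        # S3, of order 6
K = [p for p in G if sign(p) == 1]      # Ker(sign) = A3, normal in S3
# |G/K| = [G : K] = 6/3 = 2, the order of Im(sign) = {1, -1}
```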
C OMMENTS .
(a) If G is a simple group, then it has only h e i and G as normal subgroups, and
G/ h e i ∼ = G and G/G ∼ = h e i are its only quotient groups.
(b) Quotient groups are important for at least two reasons. One, from any
group G, one constructs a new model of group G/H. Two, assuming a finite
group G, one has two groups, H < G and G/H, both related to G and smaller in
size than G, a useful fact in proofs by induction on the size of the group.
E XAMPLE 25: If m > 1 is a fixed integer and a ∈ Z, then the equivalence class
of a under congruence modulo m, which is ā = a + h m i, is a coset of h m i in Z,
and so as sets, Zm = Z/ h m i. Furthermore, as ā + b̄ = a + b (the class of a + b, for a, b ∈ Z), the group
operations coincide. (Zm is sometimes written Z/mZ, or Z/m.)
E XAMPLE 26: The cyclic group C4 = h a; a4 = e i contains the normal abelian
subgroup H = { e, a2} . Its quotient group C4/H consists of H and M = aH =
{ a, a3} , with the coset products among its elements given by H H = H, H M =
H ( aH ) = ( aH ) H, or H M = M = MH, and MM = H. Define G0 = h c; c2 = e0 i
and the mapping f¯ : H 7 → e0 , aH 7 → c. We see that C4/H is isomorphic to G0 ∼ = C2 .
E XAMPLE 27: The abelian Klein group V = h a, b; a2 = b2 = e, ab = ba i has,
among others, the two subgroups H = { e, a } and K = { e, b} , both abelian, cyclic, and normal.
Two quotient groups can be formed, V/H = { H, bH } and V/K = { K, aK }, both
isomorphic to C2. Note that C4 and V are not isomorphic in spite of their similar
decompositions (C4 has a single proper subgroup, while V has three).
E XAMPLE 28: The cyclic group C6 = h a; a6 = e i has two subgroups, H = { e, a3}
and K = { e, a2, a4 } , both abelian and normal. To them correspond two quotient
groups, C6 /H = { H, aH, a2 H } and C6/K = { K, aK } together with coset mul-
tiplication rules. Define the order-3 cyclic group C3 = { b, b2, b3 = e0 } and the
mapping: H 7 → e0 , aH 7 → b, and a2 H 7 → b2 . This is an isomorphism relating
C6/H with C3. Similarly for C6/K, the mapping f¯ : K 7 → e0 and f¯ : aK 7 → c (where
c2 = e0 ) shows that C6/K ∼ = C 2.
E XAMPLE 29: The nonabelian group D3 = h a, b; a3 = b2 = e, bab−1 = a−1 i has
three order-2 subgroups K i and one order-3 subgroup H, of which only the latter
H = { e, a, a2} is normal (and abelian). Together with the coset bH = { b, ba, ba2} ,
it produces the quotient group D3 /H = { H, bH } , with coset product rules. The
mappings H 7 → e0 and bH 7 → c (with c2 = e0 ) define an isomorphism of groups
D3 /H and C2. Hence we have D3 /H ∼ = C2 .
E XAMPLE 30: Z( Q8 ) = { 1, − 1} is a normal subgroup in Q8 . The factor group
Q8 /Z( Q8 ) is an abelian group isomorphic to V. It consists of the elements given
by the sets {±1}, {±i}, {±j}, and {±k}. □
Direct and Semi-direct Products
These products of groups are important for constructing new, larger groups from
old ones, and for describing certain groups in terms of special subgroups. We
begin with two groups, and indicate later how to generalize to several factors.
Definition 1.9 (External direct product). Let H and K be any two groups, and Ḡ =
H × K be the cartesian product of H and K, with elements (h, k), together with the binary
operation (h, k) · (h′, k′) = (hh′, kk′) for h, h′ ∈ H and k, k′ ∈ K. Then Ḡ is a group,
called the external direct product of H and K.
Explicitly, Ḡ = H × K = {( h, k) : h ∈ H, k ∈ K } . Clearly it is closed under
the binary operation defined as described, and this operation is associative (since
operations defined for H and K are). The identity of Ḡ is ( e H , eK ), where e H and
eK are the identity elements of H and K. The inverse of any element ( h, k ) is
(h⁻¹, k⁻¹). The group has order |Ḡ| = |H||K|. It is abelian if both factors are
abelian. Note that there are three operations in play: one in H, one in K, and one in
H × K. In particular, if the operations in both H and K are written additively, then the
operation in H × K is written additively as well, and we write H ⊕ K in place of
H × K and call it a direct sum.
E XAMPLE 31: Consider the direct product of the cyclic group C2 with itself, so
that C2 × C2 consists of the elements (e, e′), (a, e′), (e, b), (a, b), where a² = e and
b² = e′. If we let (e, e′) ↦ ε, (a, e′) ↦ r, (e, b) ↦ s, and (a, b) ↦ rs, we see that
C2 × C2 is isomorphic to the Klein group V = h r, s i. We know that it is distinct
from C4. In the additive notation, we have Z2 ⊕ Z2 , again isomorphic to V.
E XAMPLE 32: Consider the cyclic groups C2 = h a i and C3 = h b i. We can build
the direct product group C2 × C3 with the elements ( e, e0 ), ( a, b ), ( e, b2), ( a, e0),
( e, b ), ( a, b2). It is isomorphic to C6 = h r i under mapping ( e, e0) 7 → ε, ( a, b ) 7 → r,
etc. When written additively, we have correspondingly the direct-sum group
Z2 ⊕ Z3 consisting of the elements (0, 0), (1, 1), (0, 2), (1, 0), (0, 1), (1, 2). It is
isomorphic to Z6 , the group of integers modulo 6, under mapping (0, 0) 7 → 0,
(1, 1) ↦ 1, etc. Note that the only common divisor of the orders of C2 and
C3 is 1, which is why C2 × C3 is cyclic. In the previous example the orders of the
factors have both 1 and 2 as common divisors, so C2 × C2 is not cyclic. In general, if H and K are
finite cyclic groups, then H × K is cyclic if and only if gcd(|H|, |K|) = 1.
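The gcd criterion just stated can be tested by brute force. The sketch below (illustrative; the helper name `is_cyclic_sum` is ours) checks whether Zm1 ⊕ Zm2 contains an element of full order, i.e. whether it is cyclic.

```python
from itertools import product
from math import gcd

def is_cyclic_sum(m1, m2):
    """True if Z_m1 + Z_m2 (componentwise addition) has a generator."""
    n = m1 * m2
    for a, b in product(range(m1), range(m2)):
        x, k = (a, b), 1
        while x != (0, 0):                        # step until back at (0, 0)
            x = ((x[0] + a) % m1, (x[1] + b) % m2)
            k += 1
        if k == n:                                # element of full order found
            return True
    return False

assert is_cyclic_sum(2, 3)        # gcd(2,3) = 1: Z2 + Z3 is cyclic (= Z6)
assert not is_cyclic_sum(2, 2)    # gcd(2,2) = 2: Z2 + Z2 is V, not cyclic
# The general criterion: cyclic iff the gcd of the orders is 1
assert all(is_cyclic_sum(m1, m2) == (gcd(m1, m2) == 1)
           for m1 in range(1, 7) for m2 in range(1, 7))
```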
EXAMPLE 33: Take the cyclic group C2 = ⟨c : c² = e⟩ and the dihedral group
D3 = ⟨a, b : a³ = b² = e, bab⁻¹ = a⁻¹⟩. Then we can build a group of order 12, the direct
product of D3 and C2, consisting of the elements (e, e′), (a, e′), (a, c), (b, c),
etc. It is not abelian because D3 is not. □
C OMMENTS . The external direct product group Ḡ = H × K does not have H and
K as subgroups, but does contain H̄ and K̄ as subgroups, which are isomorphic
copies of H and K in the following sense

H̄ = {( h, eK), h ∈ H } and K̄ = {( e H, k ), k ∈ K } .

It can be seen that H̄ and K̄ are normal subgroups of Ḡ, because for any g = ( h, k ),
we can show that g H̄g−1 ⊆ H̄ and gK̄g−1 ⊆ K̄. In addition, Ḡ = H̄ K̄ and
H̄ ∩ K̄ = { e } , where e = ( e H , eK ).

Definition 1.10 (Internal direct product). A group G is said to be the internal direct
product of the groups H and K if the following conditions are satisfied: (i) Both H and
K are normal subgroups of G; (ii) H ∩ K = { e }, where e is the identity of G; and (iii)
G = HK = { hk : h ∈ H, k ∈ K } .

In fact, the internal direct product can be defined in three equivalent ways:
Assuming a group G contains, possibly among others, the subgroups H1, H2 such
that G = H1 H2, then G is said to be an internal direct product of H1 and H2 if one
of the following set of conditions is satisfied:
(1) H1 and H2 are both normal in G, and H1 ∩ H2 = { e }.
(2) H1 and H2 are both normal in G, and any element g of G is uniquely
expressible as g = h1 h2, where hi ∈ Hi for i = 1, 2.
(3) h1 h2 = h2 h1 for any hi ∈ Hi with i = 1, 2, and any element g of G is
uniquely expressible as the product g = h 1 h2.
First, let us have a Lemma: If G is a group with subgroups H1 and H2, such that
G = H1 H2 and H1 ∩ H2 = { e }, then every element g of G may be written uniquely in
the form g = h1h2, where hi ∈ Hi, i = 1, 2. Indeed, assume there are hi ∈ Hi and
ki ∈ Hi, with i = 1, 2, such that g = h1h2 and g = k1k2. It follows that h1⁻¹k1 = h2k2⁻¹.
Both sides are group elements in H1 ∩ H2 = {e}, and so h1 = k1 and h2 = k2.
We can see the equivalence of the conditions (1)–(3) as follows:
• (1)→(2): Since both H1 and H2 are normal subgroups of the group G, we must
have h1(h2h1⁻¹h2⁻¹) ∈ H1 and (h1h2h1⁻¹)h2⁻¹ ∈ H2, and so h1h2h1⁻¹h2⁻¹ must be in
H1 ∩ H2 = {e}. It follows from the normality of H1 and H2 and the observation
H1 ∩ H2 = {e} that h1h2 = h2h1 for hi ∈ Hi with i = 1, 2. Together with the
Lemma, we conclude that for a given element g of G, there is only one choice of
hi ∈ Hi such that g = h1h2, up to the order of the factors.
• (2)→(1): Let g be an element in H1 ∩ H2. Since g is in H1, it is of the form
g = h1e2 for some h1 ∈ H1, and since it is in H2 it is of the form g = e1h2 with
h2 ∈ H2. The uniqueness of the factorization of g means h1 = e1 and h2 = e2.
Hence g = e1e2 = e, or H1 ∩ H2 = {e}.
• (1) →(3): In (1) →(2), we have proved that the normality of H1, H2 and
H1 ∩ H2 = { e } imply H1 H2 = H2 H1, and that g = h1h2 is unique up to the order
of the factors. These are the required conditions (3).
• (3)→(1): We know from (2)→(1) that the uniqueness of the factorization of g
implies H1 ∩ H2 = {e}. Now we prove that the normality of H1 and H2 follows from
the commutation condition h1h2 = h2h1. Let g = h1h2 be an arbitrary element of G. Then for any k1 ∈ H1:

    gk1g⁻¹ = h1h2k1h2⁻¹h1⁻¹ = h1k1h1⁻¹ ∈ H1,

where in the last step we have used h2k1 = k1h2. So we have gk1g⁻¹ ∈ H1 for
every k1 ∈ H1. Similarly, gk2g⁻¹ ∈ H2 for every k2 ∈ H2. So both H1 and H2 are
normal subgroups of G. □
COMMENTS. When we say that a group G is an internal direct product of Hi with
i = 1, 2, both H1 and H2 are, by definition, subgroups of G, and G is isomorphic
to the external direct product Ḡ = H1 × H2; but Ḡ does not contain the Hi themselves, only
isomorphic copies of them. Because of this isomorphism, the distinction between
the two cases is often implicitly understood, and the adjectives ‘internal’ and ‘external’
are often omitted. This is one reason for the custom of writing H1 × H2 to
indicate indifferently the internal or the external direct product of H1 and H2.
EXAMPLE 34: A ready-made example is provided by the external direct product H × K of any two
groups. It contains two normal subgroups H̄ and K̄, which have only the identity
in common and are such that H × K = H̄K̄. So, by definition, this group is the
internal direct product of H̄ and K̄.
E XAMPLE 35: The Klein group V has two order-two cyclic subgroups H1 = { e, a }
and H2 = { e, b } with ab = ba, and each of its four elements can be written as
the product of one element of H1 and one of H2. Therefore V can be described
as the internal direct product of H1 and H2, which we write as V ∼ = C2 × C2 (or
V∼ = Z2 ⊕ Z2 in the additive notation).
EXAMPLE 36: The cyclic group C6, with the elements a, a², a³, a⁴, a⁵, e = a⁶, has
two proper nontrivial subgroups, namely H1 = {e, a³} and H2 = {e, a², a⁴}. Both are normal in
C6 (because this group is abelian) and have e as the only common element. So
C6 can be regarded as the internal direct product of H1 and H2 (we can also check that
each element of C6 is a unique product of the type h1h2, that is, e = ee, a = a³a⁴,
a² = ea², a³ = a³e, a⁴ = ea⁴, a⁵ = a³a²). Since H1 ≅ C2 and H2 ≅ C3, we write
C6 ≅ C2 × C3 (or Z2 ⊕ Z3 in the additive notation).
Both H1 and H2 are normal subgroups of C6. The cosets of H1 in C6 are a²H1 =
{a², a⁵} and a⁴H1 = {a, a⁴}, and we can resolve C6 into the sum of H1 and its
cosets. Mapping H1 ↦ e′, a²H1 ↦ b, and a⁴H1 ↦ b², with b³ = e′, shows
that the factor group C6/H1 is isomorphic to H2 ≅ C3, which allows us to write
C6 ≅ C2 × (C6/C2). Similarly, the factor group C6/H2, formed from H2 and its
coset aH2, is isomorphic to H1 ≅ C2, which implies the resolution of C6 in terms
of C3 and the factor group C6/C3, giving us C6 ≅ C3 × (C6/C3). In general, if G
is the internal direct product G = H1H2 of H1 and H2, then H1 ≅ G/H2 and
H2 ≅ G/H1. □
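The three defining conditions of the internal direct product can be verified concretely for this example. The Python sketch below (our illustration) models C6 additively as Z6, with H1 = {0, 3} and H2 = {0, 2, 4}.

```python
# C6 as Z6; H1 = {e, a^3} -> {0, 3}, H2 = {e, a^2, a^4} -> {0, 2, 4}
H1, H2 = {0, 3}, {0, 2, 4}

# (i) normality is automatic here, since Z6 is abelian
# (ii) trivial intersection:
assert H1 & H2 == {0}
# (iii) G = H1 H2, with a unique factorization g = h1 + h2 for every g:
factorizations = {g: [(h1, h2) for h1 in H1 for h2 in H2
                      if (h1 + h2) % 6 == g] for g in range(6)}
assert all(len(f) == 1 for f in factorizations.values())
print(factorizations[5])   # [(3, 2)]  i.e.  a^5 = a^3 . a^2, as in the text
```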
Of course products of groups may have more than two factors; correspond-
ingly, we have a general definition:
Definition 1.11 (General direct product). (1) If H1, H2, . . . , Hn are arbitrary groups,
the external direct product group of H1, H2, . . . , Hn is the cartesian product H1 ×
H2 × · · · × Hn obeying component-wise multiplication. (2) If a group G contains normal
subgroups H1, H2, . . . , Hn such that G = H1 H2 · · · Hn , and for each g ∈ G there are
unique elements hi ∈ Hi such that g = h1h2 · · · hn (following group composition rule),
then G is said to be the internal direct product group of H1, H2, . . . , Hn .
Certain groups are product groups but do not satisfy all the conditions of
direct products. For example, D3 has a normal subgroup H = {e, a, a²}, and its
quotient group D3 /H is isomorphic to any one of its order-2 subgroups K i. So D3
is not the direct product of H and K i , because the elements of H do not commute
with those of K i. To handle cases such as this, we define the following kind of
product:
Definition 1.12 (Semi-direct product). A group G with subgroups H and K is said to
be the internal semi-direct product of H and K if (i) G = HK; (ii) H is normal in G;
and (iii) H ∩ K = {e}. It is then denoted by G = H ⋊ K.
A very important example of a product of this kind is the semi-direct product
of Cn and C2, which produces the dihedral group Dn , with n ≥ 3.
Dihedral groups. A dihedral group Dn, for an integer n ≥ 3, is a nonabelian
group defined by the presentation ⟨a, b : aⁿ = b² = (ab)² = e⟩. For the purpose of
visualization, we will use the model of a regular n-gon centered at the origin, with
one vertex on the x-axis, so that Dn = h a, b i is its group of isometries, with a the
counterclockwise rotation by 2π/n radians, and b the reflection across x̂. Then
Figure 1.2: Symmetries of the dihedral groups D3, D4 in an equilateral triangle
and a square. Dotted lines indicate the reflection lines.

Dn has n rotations e, a, a2, . . . , an−1 , and n reflections b, ab, a2b, . . . , an−1 b. Note
that an = b2 = ( ab )2 = e implies: (1) ab = ba−1, ba = a−1 b; (2) ak b = ba−k ,
bak = a−k b. So that (3) ( ak b )2 = e, showing that ak b (k = 0, . . . , n − 1) are indeed
reflections through the different reflection lines; and (4) a j ( ak b ) = ( ak b ) a− j. If n is
odd, there is a reflection for each vertex, the reflection line connecting this vertex
to the midpoint of the opposite side. If n is even, we have a reflection across
the line connecting each pair of opposite vertices, and a reflection across the line
connecting the midpoints of opposite sides (see Fig. 1.2 for n = 3, 4). The number
of reflection elements remains n whether even or odd.
Let ⟨a⟩ = {e, a, a², . . . , aⁿ⁻¹} ≅ Cn and ⟨b⟩ = {e, b} ≅ C2; then Dn = ⟨a⟩⟨b⟩.
The subgroup (of rotations) ⟨a⟩ is normal in Dn, but ⟨b⟩ is not because, for example,
aba⁻¹ = ba⁻² is not in ⟨b⟩. Symbolically, Dn ≅ Cn ⋊ C2, for n ≥ 3.
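The normality claims can be checked by realizing Dn concretely as permutations of the n vertices. The sketch below (illustrative; the modeling choices are ours) does this for n = 5: the rotation subgroup is closed under conjugation, while {e, b} is not.

```python
def compose(p, q):
    """(p o q)(i) = p(q(i)): q acts first, then p."""
    return tuple(p[q[i]] for i in range(len(p)))

n = 5
e = tuple(range(n))
a = tuple((i + 1) % n for i in range(n))     # rotation by 2*pi/n
b = tuple((-i) % n for i in range(n))        # reflection across a fixed axis

rotations = [e]
for _ in range(n - 1):
    rotations.append(compose(a, rotations[-1]))
reflections = [compose(r, b) for r in rotations]
G = rotations + reflections                  # the 2n elements of Dn
inv = {g: next(h for h in G if compose(g, h) == e) for g in G}

# <a> (the rotation subgroup) is normal: conjugates of rotations are rotations
assert all(compose(compose(g, r), inv[g]) in rotations
           for g in G for r in rotations)
# <b> = {e, b} is NOT normal: some conjugate of b escapes {e, b}
assert any(compose(compose(g, b), inv[g]) not in (e, b) for g in G)
```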
The structure of Dn can be analyzed further, with the help of the commutation
relations given above, by identifying its conjugacy classes. We have the following:
(1) Odd n ≥ 3: there are (n + 3)/2 conjugacy classes: {e}; {a^{±1}}, . . . , {a^{±(n−1)/2}};
{aⁱb : 0 ≤ i ≤ n − 1}. (2) Even n = 2m ≥ 4: there are n/2 + 3 conjugacy classes:
{e}, {aᵐ}; {a^{±1}}, . . . , {a^{±(m−1)}}; {a²ⁱb : 0 ≤ i ≤ m − 1}, {a²ⁱ⁺¹b : 0 ≤ i ≤ m − 1}.
Any group is the disjoint union of its conjugacy classes, and the order of the
group is the sum of its class sizes. This gives the partitions: 2n = 1 + 2 + · · · +
2 + n (the summands 2’s occur ( n − 1) /2 times) for odd n; and 2n = 1 + 1 +
2 + · · · + 2 + n/2 + n/2 (the summands 2’s occur m − 1 times) for even n. In the
second case, besides the identity, there is another class of size 1 containing only
a^{n/2}, a 180-degree rotation. In addition, the reflections split into two classes: one
consisting of those (a²ⁱb) whose fixed lines connect opposite vertices, and the other
of those (a²ⁱ⁺¹b) whose fixed lines connect midpoints of opposite sides. Finally, there are
differences in the group center and the centralizers depending on the parity of n.
Thus, recalling that the center is made up of the elements of the size-1 conjugacy
classes, we see that the center of Dn with n ≥ 3 is trivial for odd n, and { e, an/2 }
for even n. As for the centralizers, we can see, for example, CG ( b ) = { e, b } for
odd n, and CG ( b ) = { e, an/2, b, an/2 b } for even n. This difference shows up in the
unequal sizes of the reflection conjugacy classes in odd and in even n.
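The class counts (n + 3)/2 for odd n and n/2 + 3 for even n can be confirmed by direct enumeration. The Python sketch below (our illustration; helper names are ours) builds Dn as vertex permutations and counts its conjugacy classes.

```python
def compose(p, q):
    return tuple(p[q[i]] for i in range(len(p)))

def dihedral(n):
    """The 2n elements of Dn as permutations of the n vertices."""
    e = tuple(range(n))
    a = tuple((i + 1) % n for i in range(n))
    b = tuple((-i) % n for i in range(n))
    rot = [e]
    for _ in range(n - 1):
        rot.append(compose(a, rot[-1]))
    return rot + [compose(r, b) for r in rot]

def num_classes(G):
    e = G[0]
    inv = {g: next(h for h in G if compose(g, h) == e) for g in G}
    seen, count = set(), 0
    for x in G:
        if x in seen:
            continue
        seen |= {compose(compose(g, x), inv[g]) for g in G}  # class of x
        count += 1
    return count

for n in range(3, 10):
    expected = (n + 3) // 2 if n % 2 else n // 2 + 3
    assert num_classes(dihedral(n)) == expected
```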
1.6 The Symmetric Group
The symmetric group plays a very important role in the theory of finite groups,
in physics (e.g. in systems of identical particles), in graphic arts, and in music.
We give here the basics on this group; other topics will be found in Appendix
A, especially its applications in the representation theory of Lie groups. A useful
reference for this section is Chapter 7 in [Ham].
Let X be a nonempty set, and S(X) the set of all permutations of X, that is,
bijective mappings f : X → X. Equipped with the composition f · g of two permutations
as its binary operation, S(X) is a group, called the group of permutations of the set X. If X = In,
where In = {1, 2, . . . , n}, then S(X) is called the symmetric group on n objects and
denoted Sn. Each element p of Sn (a permutation of degree n) maps {1, 2, . . . , n}
to {i1, i2, . . . , in}, such that 1 ↦ i1, 2 ↦ i2, . . . , n ↦ in, where ir ∈ In; the
inverse operation p⁻¹ is the mapping i1 ↦ 1, . . . , in ↦ n. This is written as
   
    p = \begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix} , \qquad
    p^{-1} = \begin{pmatrix} i_1 & i_2 & \cdots & i_n \\ 1 & 2 & \cdots & n \end{pmatrix} .   (1.5)

The identity in Sn is

    e = \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix} .   (1.6)
Two successive permutations of I n are equivalent to a permutation of I n , which
defines the composition rule. Thus permutations of degree n obey all the group
axioms, and so Sn is indeed a group. The number of all possible permutations of
I n is n!, and so the symmetric group of degree n, Sn , has order | Sn | = n!.
Rather than the above descriptive notation, we will use a more compact notation,
which can be better explained via an example. Consider a permutation of degree 8:

    \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 2 & 3 & 1 & 7 & 6 & 5 & 4 & 8 \end{pmatrix} .   (1.7)
Here 1, 2, 3 transform among themselves; this permutation, mapping 1 ↦ 2,
2 ↦ 3, and 3 ↦ 1, shall be denoted by (1 2 3). Next, the permutations 4 ↦ 7
and 7 ↦ 4 are to be denoted by (4 7), and similarly for (5 6). Finally 8 remains
unchanged, which we shall write as (8). So, we have the equivalent notations:
    \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 2 & 3 & 1 & 7 & 6 & 5 & 4 & 8 \end{pmatrix} \equiv (1\,2\,3)(4\,7)(5\,6)(8) .   (1.8)
A permutation of the kind (i1 i2 . . . iℓ), with ℓ ≤ n, which maps i1 ↦ i2, i2 ↦ i3,
. . . , iℓ ↦ i1, is called a cyclic permutation, or a cycle, of length ℓ, or an ℓ-cycle. A
2-cycle, such as (i1 i2), is called a transposition.
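Resolving a permutation into disjoint cycles is itself a simple algorithm: start from the smallest unvisited symbol and follow images until the cycle closes. A Python sketch (our illustration, not part of the text):

```python
def cycles(perm):
    """Disjoint-cycle decomposition of a permutation given as {x: image}."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cyc, x = [], start
        while x not in seen:          # follow images until the cycle closes
            seen.add(x)
            cyc.append(x)
            x = perm[x]
        result.append(tuple(cyc))
    return result

# The degree-8 permutation of Eq. (1.7):
p = {1: 2, 2: 3, 3: 1, 4: 7, 5: 6, 6: 5, 7: 4, 8: 8}
print(cycles(p))   # [(1, 2, 3), (4, 7), (5, 6), (8,)] -- matches Eq. (1.8)
```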
C OMMENTS .
(a) For simplicity, one may omit 1-cycles, replacing (1 2 3)(4 7)(5 6)(8) for ex-
ample with (1 2 3)(4 7)(5 6). By the same token, the identity is written as (1)
rather than (1)(2) . . . ( n ). However, when using this compact notation, one must
keep in mind the set I n on which the permutations are applied.
(b) In a given cycle, the symbols are read from left to right, starting at any
point in the chain; for example, the cyclic permutations 1 → 2 → 3 → 1 can be
represented by any one of the equivalent cycles: (1 2 3), (2 3 1), or (3 1 2).
(c) A cycle represents an operation on a subset of I n . Two cycles commute
when they have no elements in common (and are said to be disjoint), for example
(1 2 3)(4 7) = (4 7)(1 2 3). Otherwise the order of application of the cycles (as op-
erators) is important, with the operators going from right to left, i.e. the rightmost
cycle acting first, then the next left on the result, and so on down the line, exactly
as in the usual convention in physics for products of operators, ab · x = a · ( b · x).
Example: (1i)(1j )(1i ) = ( ij ), but (1i)(1i )(1j ) = (1j )(i), both acting on { 1, i, j } .
(d) In this notation, the inverse of a k-cycle is (1 2 . . . k)⁻¹ = (k . . . 2 1), so that

    (1 2 . . . k)(k . . . 2 1) = (1)(2) · · · (k) ≡ (1) .   (1.9)
A transposition (a 2-cycle) is its own inverse: (1 2)⁻¹ = (2 1) = (1 2), or equivalently
(1 2)² = (1)(2); in general, (1 2 . . . k)ᵏ = (1)(2) . . . (k) = (1).
(e) Any k-cycle can be written as a product of k − 1 transpositions:

    (1 2 . . . k) = (1 k) . . . (1 3)(1 2) .   (1.10)
Definition 1.13 (Even, odd permutations). A permutation p ∈ Sn is said to be even
(or odd) if p can be written as a product of an even (or odd) number of transpositions.
Although there are many ways of writing a permutation p ∈ Sn, with n ≥ 2,
as a product of transpositions, the parity of the number of factors in any such
decomposition is always the same, and so a permutation is unambiguously either even or odd. The
sign of a permutation p ∈ Sn, denoted sgn p, is +1 if p is even, or −1 if p is odd.
Permutation (1 2 . . . k ) is even if k − 1 is even, or odd if k − 1 is odd; for exam-
ple, (123) = (13)(12) is even, and (123)(45) odd. The permutations p and qpq−1
have the same sign for any p, q ∈ Sn .
The elements p ∈ Sn divide into two disjoint sets, one consisting of even per-
mutations (sgn p = + 1), the other of odd permutations (sgn p = − 1). The even
permutations form a group, called the alternating group An , because a product
of two such elements is an even permutation; on the other hand, as products of
two odd permutations are even permutations, the odd permutations do not form
a group (being in fact the coset of An in Sn ).
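The sign function and the closure of the even permutations can be checked by enumeration. The sketch below (illustrative) computes sgn p by counting inversions, one of several equivalent definitions, and confirms the group properties of An for n = 4.

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation tuple p (p maps i to p[i]), via inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                     if p[i] > p[j])
    return 1 if inversions % 2 == 0 else -1

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(p)))

n = 4
Sn = list(permutations(range(n)))
An = {p for p in Sn if sign(p) == 1}

assert len(An) == len(Sn) // 2                             # |A_n| = n!/2
assert all(compose(p, q) in An for p in An for q in An)    # A_n is a group
# sgn is a homomorphism onto {+1, -1}, with kernel A_n:
assert all(sign(compose(p, q)) == sign(p) * sign(q) for p in Sn for q in Sn)
```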
It is a general result that
(i) The alternating subgroup An of a symmetric group Sn is a normal subgroup
(An ◁ Sn) having index 2 and order n!/2;
(ii) An is simple if and only if n ≠ 4.
To see that (i) is indeed true, we define the multiplicative cyclic group C =
{+ 1, −1} and the map f : Sn → C by p ∈ Sn 7 → sgn p. This map is a homo-
morphism between groups, with Ker f = An . Hence, by the statement given on
page 18, An is normal in Sn and Sn /An ∼ = C. This implies the index [ Sn : An ] = 2
and | An | = | Sn | /2 = n!/2. As for (ii) we see that it is true for n = 2, 3, 4 in
the explicit group analyses given below. It is also true for n ≥ 5, the proof of
which is not trivial (it can be found in mathematical textbooks). In particular the
simplicity (indivisibility) of A5 is the reason for the non-solvability of the quintic
equation by radicals, as was discovered by Galois.
EXAMPLE 37: Group S2 is of order 2! = 2. Its two elements are (1) and (12), with
(12)² = (1). It is isomorphic to the cyclic group C2 = {e, a}, and has no nontrivial
(normal) subgroups. Its alternating subgroup is A2 = {(1)}.
E XAMPLE 38: Group S3 has 6 elements, the six possible distinct permutations of
three objects, i.e., (1), (12), (23), (31), (123), (132). We already know that S3 is
isomorphic to D3 , so we have the following multiplication table:
    (1)    (12)   (23)   (31)   (123)  (321)
    (12)   (1)    (123)  (321)  (23)   (31)
    (23)   (321)  (1)    (123)  (31)   (12)
    (31)   (123)  (321)  (1)    (12)   (23)              (1.11)
    (123)  (31)   (12)   (23)   (321)  (1)
    (321)  (23)   (31)   (12)   (1)    (123)
Its alternating subgroup, A3 = {(1), (123), (321)}, is a normal subgroup, isomorphic
to the simple cyclic group of order 3: A3 ≅ C3.
EXAMPLE 39: Group S4 has order 24, whose divisors are 24, 12, 8, 6, 4, 3, 2, and 1.
The order of any subgroup must be one of these divisors. The only subgroup of order
12 is the alternating subgroup A4 of index 2; it is normal, but not simple since it
contains in turn an invariant subgroup, {(1), (12)(34), (13)(24), (14)(23)}. There
are three subgroups of order 8 all isomorphic to the dihedral group D4; four sub-
groups of order 6 all isomorphic to S3; at order 4, three isomorphic to C4 and four
to V; four subgroups C3 at order 3; and finally nine at order 2.
Group S4 is partitioned into 5 conjugacy classes corresponding to the 5 partitions
of 4, so that each class exclusively contains permutations of the same cycle
type (ℓ1, ℓ2, . . . ), where ℓi is the length of the i-th cycle. Their cycle types, representatives,
and sizes are: {(1, 1, 1, 1) | (1) | 1}, {(2, 1, 1) | (12) | 6}, {(2, 2) | (12)(34) | 3},
{(3, 1) | (123) | 8}, and {(4) | (1234) | 6}.
A normal subgroup must be a union of conjugacy classes of elements and
must contain the identity (1), and so we can see that the only possible nontrivial proper
normal subgroups of S4 have orders 4 (= 1 + 3) and 12 (= 1 + 3 + 8). The order-4
normal subgroup is the union of the class of (1) and the class of (12)(34), which can be
seen to be isomorphic to the Klein group V; the order-12 normal subgroup, the union of the
classes of (1), (12)(34), and (123), is none other than A4. It follows that one can
construct two quotient groups from S4, namely S4/A4 ≅ C2 and S4/V ≅ S3. □
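The class sizes 1, 6, 3, 8, 6 and the normal subgroups of S4 found above can be recomputed by brute force over unions of conjugacy classes. The Python sketch below is our illustration only.

```python
from itertools import permutations, combinations

def compose(p, q):
    return tuple(p[q[i]] for i in range(4))

def inverse(p):
    q = [0] * 4
    for i, v in enumerate(p):
        q[v] = i
    return tuple(q)

S4 = list(permutations(range(4)))
e = tuple(range(4))

# Conjugacy classes of S4
classes, seen = [], set()
for x in S4:
    if x in seen:
        continue
    cls = frozenset(compose(compose(g, x), inverse(g)) for g in S4)
    classes.append(cls)
    seen |= cls
assert sorted(len(c) for c in classes) == [1, 3, 6, 6, 8]

# Normal subgroups = unions of classes (with identity) closed under products
def is_subgroup(s):
    return e in s and all(compose(a, b) in s for a in s for b in s)

orders = {len(u) for r in range(1, 6) for combo in combinations(classes, r)
          for u in [frozenset().union(*combo)] if is_subgroup(u)}
assert orders == {1, 4, 12, 24}     # {e}, V, A4, and S4 itself
```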
Symmetric groups are important in group theory because they can be sys-
tematically analyzed and because they contain all the possible structures of finite
groups, as stated in the following theorem.
Theorem 1.2 (Cayley). Every finite group G is isomorphic to a subgroup of the sym-
metric group Sn , with n = | G |.
This theorem can be understood as follows. Let G act on itself by multiplication,
by which we mean g : x ↦ gx for all x ∈ G with some given g ∈ G. Hence
the action of g has the effect of permuting the elements of the underlying
set, and so induces a map p : G → S(G), g ↦ p_g, defined by p_g(x) = gx for all
x ∈ G. The permutation p_g thus defined can be seen to satisfy the properties:
(i) p_{gg′} = p_g p_{g′}, because on the one hand gg′x = p_{gg′}(x), and on the other
hand gg′x = g p_{g′}(x) = p_g(p_{g′}(x)) = p_g p_{g′}(x).
(ii) ex = x = p_e(x), hence p_e = (1) is the identity permutation.
(iii) x = p_{gg⁻¹}(x) = p_g p_{g⁻¹}(x) = p_e(x), which shows that (p_g)⁻¹ = p_{g⁻¹}.
This shows that, given a group G of order n, each permutation p g is uniquely
associated with a g ∈ G; and { p g| all g ∈ G } is a subgroup of Sn isomorphic to G.
E XAMPLE 40: C3 and S3. The cyclic group C3, with its three elements a, a2 = b,
a3 = e, is specified by the multiplication table shown on the left-hand side of
Eq. (1.12). The table on the right-hand side shows the permutations in group S3
to which the elements of C3 are mapped: e 7 → pe = (1), a 7 → pa = (123), and
b 7 → pb = (132). By the usual rules of permutation multiplication, we can check:
p2a = pb , p2b = pa , pa pb = pe , in agreement with a2 = b, b2 = a, ab = e. We have
thus shown that C3 is isomorphic to A3, the alternating subgroup of S3.
C3 e a b 1 2 3
e e a b pe 1 2 3
a a b e −→ pa 2 3 1 (1.12)
b b e a pb 3 1 2
E XAMPLE 41: D2 and S4. We show on the left hand side of (1.13) the multipli-
cation table in Eq.(1.2) of the dihedral group D2 , and on the right hand side the
permutations to which the elements of D2 are mapped.
D2 e a b c 1 2 3 4
e e a b c pe 1 2 3 4
a a e c b −→ pa 2 1 4 3 (1.13)
b b c e a pb 3 4 1 2
c c b a e pc 4 3 2 1
where pe = (1), pa = (12)(34), pb = (13)(24), pc = (14)(23). Following the
usual rule of permutation products, we can show that

    pa pb = (12)(34)(13)(24):  1 ↦ 3 ↦ 4,  2 ↦ 4 ↦ 3,  3 ↦ 1 ↦ 2,  4 ↦ 2 ↦ 1,

that is,

    pa pb = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix} = (14)(23) ,   (1.14)

and so pa pb = pc. Similarly, we have pb pc = pa and pc pa = pb. In addition,
pa² = ((12)(34))² = (12)(34)(12)(34) = (12)(12)(34)(34) = (12)²(34)² = (1), and
similarly pb² = (1) and pc² = (1). So the set of permutations {pe, pa, pb, pc}
forms a subgroup of S4 isomorphic to D2 ≅ V. □
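Cayley's construction in Examples 40 and 41 can be automated: given a multiplication table, the row of g is the permutation p_g. An illustrative Python sketch for the D2 table of Eq. (1.13); the dictionary encoding of the table is ours.

```python
# Multiplication table of D2 (the Klein group), as in Eq. (1.13)
elems = ['e', 'a', 'b', 'c']
table = {('e','e'):'e', ('e','a'):'a', ('e','b'):'b', ('e','c'):'c',
         ('a','e'):'a', ('a','a'):'e', ('a','b'):'c', ('a','c'):'b',
         ('b','e'):'b', ('b','a'):'c', ('b','b'):'e', ('b','c'):'a',
         ('c','e'):'c', ('c','a'):'b', ('c','b'):'a', ('c','c'):'e'}

def p(g):
    """Cayley permutation p_g : x -> g.x, as a tuple over indices 0..3."""
    return tuple(elems.index(table[(g, x)]) for x in elems)

def compose(s, t):
    return tuple(s[t[i]] for i in range(len(s)))

# p is an injective homomorphism: p_{gh} = p_g p_h, so {p_g} < S4, and it
# copies the structure of D2
assert all(p(table[(g, h)]) == compose(p(g), p(h))
           for g in elems for h in elems)
assert len({p(g) for g in elems}) == 4
print(p('a'), p('b'), p('c'))  # (1, 0, 3, 2) (2, 3, 0, 1) (3, 2, 1, 0)
```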
C OMMENTS . (a) Cayley’s theorem identifies the structure of any group G of or-
der n with that of an order-n subgroup R of the group of permutations Sn of
degree n. From the examples given above, we note the following properties of
R < Sn which also hold in all generality: (i) Except for the identity p e , the ele-
ments pi ∈ R leave no symbols unchanged. (ii) When resolved into independent
cycles, all the cycles in each element pi must have the same length: pi must be
a product of 1-cycles, or of 2-cycles, etc. (They reflect the characteristic prop-
erties of the multiplication table.) These very special permutations are said to
be ‘regular’. Subgroups formed only of regular permutations are called regular
permutation subgroups.
(b) Cayley’s theorem implies that the number of distinct (non isomorphic)
groups of a given order is finite, because every one of them is isomorphic to some
subgroup of Sn , and as Sn is finite, it contains only a finite number of subgroups.
(c) We may use the theorem to determine group structures. Finding groups of
order n is equivalent to finding regular subgroups of the same order in Sn , and
we can do this by exploiting the fact that regular permutations, when resolved
into independent cycles, must have all cycles equal in length ℓ, which means that
ℓ must be a divisor of n. And so:
• If n is a prime number, the cycle lengths can only be 1 or n. So the regu-
lar subgroup can only contain the cyclic permutation (12 . . . n ) and its powers.
Therefore, a group of prime order has to be cyclic.
• If n is not a prime, the problem is to find the regular subgroups in Sn con-
taining permutations resolved into cycles whose lengths divide n. As an exam-
ple, take n = 4, which has divisors 4, 2, 1. First consider a degree-4 permutation
(1234), and take its successive powers. The permutations so obtained, (1234),
(13)(24), (1432), and (1234)4 = (1)(2)(3)(4), are the elements of a regular sub-
group of S4 isomorphic to C4. Starting with another 4-cycle leads to no new
structure. Next, take the products of two disjoint 2-cycles, i.e. (12)(34), (13)(24), and (14)(23), each
of which has a square equal to (1)(2)(3)(4). Together with the identity, they form another regular subgroup
of S4, which is isomorphic to D2. This exhausts all possibilities. In conclusion,
there are just two distinct groups of order 4, namely C4 ∼
= Z4 and D2 ∼ = Z2 ⊕ Z2 .
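The search for regular subgroups just described can be mechanized. The sketch below (our illustration; the helper `cycle_lengths` is a name of ours, not from the text) recovers the two order-4 structures inside S4.

```python
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(4))

def cycle_lengths(p):
    """Lengths of the disjoint cycles of a permutation tuple over 0..3."""
    seen, lengths = set(), []
    for i in range(4):
        if i in seen:
            continue
        n, x = 0, i
        while x not in seen:
            seen.add(x)
            x = p[x]
            n += 1
        lengths.append(n)
    return lengths

e = tuple(range(4))
# Regular permutations: all cycles of equal length
regular = {p for p in permutations(range(4)) if len(set(cycle_lengths(p))) == 1}

# Powers of a 4-cycle give a regular subgroup isomorphic to C4:
c = (1, 2, 3, 0)                    # the cycle (1 2 3 4), on symbols 0..3
C4, x = {e}, c
while x != e:
    C4.add(x)
    x = compose(c, x)
assert len(C4) == 4 and C4 <= regular

# The three products of two 2-cycles, plus the identity, give D2 (= V):
V = {p for p in regular if sorted(cycle_lengths(p)) == [2, 2]} | {e}
assert len(V) == 4 and all(compose(a, b) in V for a in V for b in V)
```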

1.7 Classification of Finite Groups
Lagrange’s theorem tells us that in a finite group G any subgroup has an order
that divides the group order | G |. Its converse (namely, ‘To each positive integer
m dividing | G |, there corresponds in G a subgroup of order m’) is false in general.
The smallest group that shows that the converse is not true is A4, which is of
order 12. While A4 has subgroups of sizes 1, 2, 3, 4, and 12, it has no subgroups of
size 6 (see below). But the converse holds true in a few special cases (e.g. abelian
groups, dihedral groups, groups having a prime-power size). The relevant results
are the following.
Theorem 1.3. In any finite abelian group G, there is a subgroup of order m for
every positive integer m that divides |G|.
For example, the cyclic group C6 (order 6) contains the subgroups ⟨a²⟩ of order
3, and ⟨a³⟩ of order 2. This theorem need not hold for nonabelian groups. It does
happen to hold for D3 (order 6, with divisors 1, 2, 3, 6), which has proper subgroups of
orders 2 (⟨b⟩) and 3 (⟨a⟩); but A4 (order 12, with divisors 1, 2, 3, 4, 6, 12) has no subgroup
of order 6, although it has subgroups C2 ≅ {(1), (12)(34)} (order 2), C3 ≅ {(1), (123), (132)}
(order 3), as well as order-4 subgroups of the type V ≅ {(1), (12)(34), (13)(24), (14)(23)}.
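The claim that A4 has no subgroup of order 6 can be verified exhaustively, since A4 has only 2¹² subsets. An illustrative Python sketch (not part of the text):

```python
from itertools import permutations, combinations

def compose(p, q):
    return tuple(p[q[i]] for i in range(4))

def sign(p):
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4)
                     if p[i] > p[j])
    return 1 if inversions % 2 == 0 else -1

e = tuple(range(4))
A4 = [p for p in permutations(range(4)) if sign(p) == 1]

def is_subgroup(s):
    return e in s and all(compose(a, b) in s for a in s for b in s)

# Enumerate every nonempty subset of A4 and record the subgroup orders
orders = sorted({len(s) for r in range(1, 13)
                 for s in map(frozenset, combinations(A4, r))
                 if is_subgroup(s)})
print(orders)   # [1, 2, 3, 4, 12] -- every divisor of 12 occurs except 6
```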
Theorem 1.4 (Cauchy). If G is a group whose order is divisible by a prime p, then G
contains an element (and hence a cyclic subgroup) of order p.
Thus, D3 (order 6, divisible by the primes 2 and 3) has elements of order 2 and elements
of order 3. All the groups of order 8 (e.g. C8, D4, Q8) have at least one element of order 2.
The following theorem extends Cauchy’s theorem and guarantees the exis-
tence of subgroups in groups of prime-power order.
Theorem 1.5 (Sylow). If G is a group whose order is pⁿm, with n ≥ 1, p prime, and
p, m relatively prime, then G contains a subgroup of order pᵏ for each 1 ≤ k ≤ n.
So, for example, the order of D4 may be written as 2³ · 1, and so D4 must have
nontrivial subgroups of orders 2 and 4, which is verified. The symmetric group
S4 has order 2³ · 3, and so must contain subgroups of orders 2, 4, 8, and 3, which is
also verified.
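These Sylow (and Cauchy) subgroups of S4 can be exhibited by closing a small generating set under composition. The sketch below is our illustration; the helper name `generated` is ours.

```python
def compose(p, q):
    return tuple(p[q[i]] for i in range(4))

def generated(gens):
    """Subgroup of S4 generated by the given permutations (closure)."""
    G = {tuple(range(4))} | set(gens)
    while True:
        new = {compose(a, b) for a in G for b in G} - G
        if not new:
            return G
        G |= new

# |S4| = 2^3 . 3: subgroups of orders 2, 4, 8 (Sylow) and 3 (Cauchy)
assert len(generated([(1, 0, 2, 3)])) == 2                  # <(12)>
assert len(generated([(1, 2, 3, 0)])) == 4                  # <(1234)>, a C4
assert len(generated([(1, 2, 3, 0), (2, 1, 0, 3)])) == 8    # a D4
assert len(generated([(1, 2, 0, 3)])) == 3                  # <(123)>, a C3
```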
At this point, it is useful to recall a few facts seen in previous sections: (1)
Every cyclic group of order n is isomorphic to Zn . (2) Every group with a prime
order p is cyclic, isomorphic to Z p . (3) Every group of order 2p, where p is an
odd prime, is either the cyclic group Z2p or the dihedral group D p . (4) There are
five distinct groups of order 8, namely Z8 , Z4 ⊕ Z2 , Z2 ⊕ Z2 ⊕ Z2 , D4, and Q8
(see Problem 1.7). So we know, up to isomorphism, the structure of all the groups
with order smaller than 12, as shown in the accompanying table.
Table 1.1: Groups with order less than 12.

    Order   Abelian group                    Non-abelian group       Reference
    1       ⟨e⟩                                                      p. 4
    2       C2                                                       p. 4
    3       C3                                                       p. 4
    4       C4, C2 × C2                                              pp. 5, 20
    5       C5                                                       p. 12
    6       C3 × C2                          D3 = C3 ⋊ C2            p. 13
    7       C7                                                       p. 12
    8       C8, C4 × C2, C2 × C2 × C2        D4 = C4 ⋊ C2, Q8        Problem 1.7
    9       C9, C3 × C3                                              p. 29
    10      C10                              D5 = C5 ⋊ C2            p. 13
    11      C11                                                      p. 12
These results are among the early steps towards a classification of all the finite
groups. This task is far from being completed, but mathematicians have made
a very significant step in identifying all the finite simple groups. This is an out-
standing achievement because finite simple groups are the basic building blocks
for constructing any finite group by a sequence of operations. (For example,
C2, C2 ↦ V; C2, C3 ↦ S3; and V, S3 ↦ S4).
By 1900, it was known, from Évariste Galois, Sophus Lie, and others, that the
list of simple groups included the cyclic groups of prime order, the alternating
groups, and 13 families of Lie groups over finite fields, plus five simple groups
found by Émile Léonard Mathieu in 1861–1873, and referred to as sporadic, be-
cause they did not fit into any of the above infinite series. Further progress was
made by William Burnside:
Theorem 1.6 (Burnside). If the order of a group may be written as pᵃqᵇ, where p, q are
prime and a, b nonnegative integers, then the group is solvable.
In other words, each non-abelian finite simple group must have an order
divisible by at least 3 distinct primes. This considerably limits the number of
possibilities: the only non-abelian simple groups with at most 200 symmetries
are A5 (with 60 = 2² · 3 · 5 symmetries) and PSL(2, 7) (a subgroup of S7 of order
168 = 2³ · 3 · 7). Burnside went even further, making (in 1911) the conjecture that
every odd-order group is solvable or, put differently, that any non-abelian
simple group has to have an even order. This conjecture was finally proved (in
1963) by Walter Feit and John Thompson.
Advances then came quickly and steadily in a concerted program, and the
proof of the classification theorem of finite simple groups, which runs to several
thousand pages, was systematically revised, organized, and completed
in 2004 (by Daniel Gorenstein, Michael Aschbacher, Richard Lyons, and Stephen
D. Smith, et al. [As], [Go], [GLS]). The proof of the classification theorem for
finite simple groups is more complex, requiring more effort, than that for the
compact Lie groups (which was essentially completed by 1950), because there exist
many sporadic groups, and no one has yet found a simple uniform geometric
or graphic description (similar to the Dynkin diagrams [Dy1], [Dy2] for compact
Lie groups) that applies to all finite simple groups.
Theorem 1.7 (Classification). Each finite simple group is isomorphic to one of the fol-
lowing groups:
(1) a cyclic group of prime order,
(2) an alternating group of degree at least five,
(3) a group of Lie type,
(4) a sporadic group.
Not all finite groups of Lie type are simple, but they will become simple upon
slight modifications such as by quotienting out their centers, which then yields
projective linear groups. For example, SL(n, q) is not simple in general (n is a
positive integer and q is a positive power of a prime number), but the projective
special linear group PSL(n, q) = SL(n, q)/Z(SL(n, q)) is simple (unless (n, q) =
(2, 2) or (2, 3)). There are 16 such infinite families of type (3), including both
the classical Lie groups (the projective special linear, projective symplectic, or
orthogonal groups over a finite field) and the exceptional Chevalley groups; the
twisted Chevalley groups; the Steinberg groups; the Suzuki groups; and the Ree
and Tits groups.
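As a quick check of the orders quoted above, one can compute |PSL(2, q)| from the standard formulas |SL(2, q)| = q(q^2 − 1) and |Z(SL(2, q))| = gcd(2, q − 1); a short sketch (the formulas are standard, though not derived in this chapter):

```python
from math import gcd

def order_sl2(q):
    # |SL(2, q)| = q (q^2 - 1) over the field with q elements
    return q * (q * q - 1)

def order_psl2(q):
    # PSL(2, q) = SL(2, q) / Z(SL(2, q)); the center has order gcd(2, q - 1)
    return order_sl2(q) // gcd(2, q - 1)

print(order_psl2(7))   # the simple group of order 168 mentioned above
print(order_psl2(5))   # PSL(2, 5), isomorphic to A5, of order 60
```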
The sporadic groups are the 26 (and only 26) individual groups that do not
fit into any of the 18 countably infinite families in (1)–(3); they arise usually as
groups of automorphisms of multi-dimensional geometrical configurations. The
smallest sporadic group is the Mathieu group M11, of order 7920, and the largest
is the Fischer–Griess Monster group M, of order ≈ 8 × 10^53, which contains
all but 6 of the other sporadic groups. The Monster group (conjectured
by Bernd Fischer and Robert Griess in 1973, and constructed by Griess in 1982)
may be regarded as the automorphism group of a 196 883-dimensional commu-
tative non-associative algebra, called the Griess algebra, also the first nontrivial
tier of the infinite-dimensional vertex operator algebra. It is the most intriguing
of all finite simple groups, with many apparent but still unexplained connections
to modular functions. (For example, the first nontrivial coefficient of a certain
modular function is equal to the sum of the dimensions of the smallest nontrivial
irreducible representations of M: 196 884 = 1 + 196 883. And the pattern contin-
ues on with 21 493 760 = 1 + 196 883 + 21 296 876, relating the second coefficient
of the modular function and the next highest dimension for the Monster group).
The search for an explanation of this completely unexpected relationship between
sporadic groups and modular functions is an object of the moonshine theory, a the-
ory with rich implications in number theory, algebra, and geometry, as well as
wide-ranging ramifications into physics, in particular, quantum field theory and
string theory.
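The two numerical coincidences cited above are easy to verify directly; in this sketch the modular-function coefficients and the Monster representation dimensions are simply those quoted in the text:

```python
# Dimensions of the smallest irreducible representations of the Monster (from the text)
monster_dims = [1, 196_883, 21_296_876]

# First two nontrivial coefficients of the relevant modular function
j_coeffs = [196_884, 21_493_760]

print(j_coeffs[0] == monster_dims[0] + monster_dims[1])                    # True
print(j_coeffs[1] == monster_dims[0] + monster_dims[1] + monster_dims[2])  # True
```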

Problems
1.1 Show that the set of all rotations R_ẑ(α) around the z axis by an angle α, acting
on the coordinates (x, y), for all values of α, forms a (continuous) group.
1.2 Prove that a group is abelian if every one of its elements other than the
identity has order 2.
1.3 Find the structure of the groups of order 4, 6, 8, and 10 with elements all of
order 2.
1.4 Prove that the intersection of two subgroups of a group G is a subgroup of
G.
1.5 Use Cayley’s theorem to find all distinct groups of order 6.
1.6 Let G be a group that contains a cyclic subgroup H = ⟨a⟩ of order p, such that
p = |G|/2. Find the cosets of H in G, and identify the structure of G for |G| =
4, 6, 8.
1.7 Show that there are five distinct groups of order 8, namely, the cyclic group
C8, the group of symmetries of the square D4 , the quaternion group Q8, and two
product groups, C4 × C2 and D2 × C2.
1.8 Enumerate the conjugacy classes, subgroups and normal subgroups, as well
as the factor groups, of the symmetric groups S2, S3 and S4.
1.9 The dihedral group D4 is the group of the symmetry transformations of a
square, generated by a rotation b through an angle π about a diagonal, and a
counterclockwise rotation a through an angle π/2 about the axis perpendicular
to the square plane and passing through its center. The generators satisfy the
relations a^4 = e, b^2 = e, ba = a^{-1}b.
(a) Give the multiplication table.
(b) Identify the subgroup of S8 isomorphic to D4.
(c) Enumerate all the conjugacy classes.
(d) Enumerate all the subgroups, identifying the normal subgroups, the cosets
of all subgroups, and the factor groups associated with the normal subgroups.
1.10 Let Dn be the dihedral group generated by a of order n and b of order 2.
Show that the cyclic subgroup ⟨a⟩ is normal in Dn.
1.11 The quaternion group Q8 is generated by two order-4 elements a and b, such
that (b a^m)^2 = a^2, with m = 0, 1, 2, 3.
(a) Give the multiplication table.
(b) The name of the group arises from the fact that its elements can be repre-
sented by four objects, such as 1, x, y, z (and their negatives), or by four
independent 2 × 2 matrices, such as the identity matrix I and the standard Pauli
matrices. Making use of the relations σ_i σ_j = δ_ij I + i ε_ijk σ_k, where I is
the 2 × 2 identity matrix, establish the correspondence between the three
representations.
(c) Identify the conjugacy classes and subgroups of Q8. For each normal sub-
group, give the factor group of Q8 .
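As a numerical aid for Problem 1.11(b), one can check the relation (b a^m)^2 = a^2 in a Pauli-matrix realization; the identification a = iσ₃, b = iσ₁ below is one possible choice, not the only one:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
s1 = np.array([[0, 1], [1, 0]], dtype=complex)    # sigma_1
s3 = np.array([[1, 0], [0, -1]], dtype=complex)   # sigma_3

a = 1j * s3   # one possible order-4 generator
b = 1j * s1   # a second order-4 generator

# a^4 = e, and the defining relation (b a^m)^2 = a^2 for m = 0, 1, 2, 3
print(np.allclose(np.linalg.matrix_power(a, 4), I2))   # True
for m in range(4):
    bam = b @ np.linalg.matrix_power(a, m)
    print(np.allclose(bam @ bam, a @ a))               # True for each m
```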

Q. Ho-Kim. Group Theory: A Physicist’s Primer.


Chapter 2

Group Representations

2.1 Linear Vector Space


2.2 Definition and Constructions
2.3 Simple Representations
2.4 Unitary Representations
2.5 Schur’s Lemma
2.6 Matrices and Characters
2.7 Tensor-Product Representation

This chapter begins with a review of the basic concepts and properties of linear
vector spaces essential to a formulation of representation theory and continues
on with a discussion of the different ways in which a group acts on sets or vec-
tor spaces. Groups are abstract: they are symbols and rules. Representations of
groups are concrete: they deal with vectors, operators, and matrices. The aims of
representation theory are to define, construct, and classify all the distinct repre-
sentations of groups. If a system (a differential equation or a physical object, for
example) exhibits some symmetry, then all associated symmetry transformations
form a group, and a deeper understanding of these symmetries gives us better
insight into the system. Mathematicians use representations to study group prop-
erties not readily accessible through the abstract groups themselves, to explore
new structures, or to apply group theory in other contexts. Physicists usually
encounter groups by
way of their representations and find them in quantum-mechanical applications.
We restrict ourselves in this chapter to finite groups, paving the way for infinite
groups in later chapters.

2.1 Linear Vector Space


In this section, we review the basic ideas concerning linear vector spaces, which
can be found in greater detail in [Fa] and in [Hu], Chapter VII.
1. A linear (vector) space V is a collection of elements, called vectors, that is closed
under the addition of any two elements and the scalar multiplication of any element


by a scalar number (element of a field F). If F is real (R), the vector space is said
to be real; if F is complex (C), we have a complex vector space.
2. Given ψ, φ ∈ V, they are said to be linearly dependent (for nonzero vectors,
parallel to each other) if αψ + βφ = 0 for some scalars α and β not both zero.
If there are no such scalars, the
vectors are linearly independent. A vector space V is n-dimensional if there exist
sets containing n linearly independent vectors but no sets with a greater number
of linearly independent vectors. If n is infinite, V is said to be infinite-dimensional.
3. We assume from here on finite-dimensional spaces, V = F^n with F = R or C. In
such a V any set of n linearly independent vectors u1, u2, . . . , un spans V, in
the sense that any vector ψ in V may be written as a linear combination
c1 u1 + c2 u2 + · · · + cn un, with ci ∈ F. The set {ui} is then said to be
complete, or to form a basis of V; the numbers ci, called the components of ψ
on {ui}, are uniquely determined once the basis is fixed.
4. Let V and V′ be two linear spaces. Suppose there is a map f that assigns to
every vector x in V a vector x′ in V′, i.e. f : V → V′; then we say that this
map defines an operator A from V to V′, such that x′ = Ax. We then write
A : V → V′, and consider the two statements, on f and A, interchangeable.
An operator A : V → V′ is said to be linear if A(αψ + βφ) = αAψ + βAφ, or
antilinear if A(αψ + βφ) = α*Aψ + β*Aφ, for every ψ, φ ∈ V and α, β ∈ C (α* is
the complex conjugate of α). If V′ = F^1, i.e. x′ is an F-scalar, one usually
writes x′ = f(x), rather than x′ = Ax, and calls f a functional, rather than an
operator.
5. The sum of the linear operators A : V → V and B : V → V is defined by
(A + B)ψ := Aψ + Bψ, and the product of A and a scalar α by (αA)ψ := α(Aψ),
for every vector ψ ∈ V. So the set of all linear operators in V also forms a
linear vector space. It is of dimension n^2 if V is n-dimensional.
6. Let A be a linear operator such that the relation Aψ = 0 for ψ ∈ V holds only
if ψ = 0. Then A is said to be invertible, and its inverse, denoted A−1 , obeys the
relations AA−1φ = φ and A−1 Aψ = ψ for every φ and ψ in V .
7. Let V be a linear inner-product space (cf. #11) of dimension n, and let ui,
with i = 1, . . . , n, be a complete basis. Then we may write any vector ψ in V
as a linear superposition, and any linear operator A : V → V as

   ψ = ∑_{i=1}^n ui ci ,    A uj = ∑_{i=1}^n ui aij .    (2.1)

This system of equations can be solved to give ci and aij. We may regard ci as
the components of a column vector c, and aij as the entries of an n × n matrix A.
These data, ci and aij, suffice to define ψ and A once the basis vectors ui are
specified. We may refer to c and A as the representations of ψ and A in the basis u.
(Once a basis is fixed, ‘matrix’ and ‘operator’ become practically the same.) All
algebraic operations on abstract vectors and operators are preserved in matrix
representations; to each operation on vectors and operators corresponds a similar
matrix operation. For example, to ψ′ = Aψ corresponds c′ = Ac, and to C = AB
corresponds C = AB. Relations in matrix representations, such as c′ = Ac and
C = AB, are basis-specific.
8. Change of basis. It is sometimes useful or simpler to work in a different basis.
Let u1, u2, . . . , un and v1, v2, . . . , vn be two bases of V. The change of basis
is a similarity transformation realized via an n × n invertible matrix S, with
its elements defined by va = ∑_m um Sma. This matrix gives the correspondence,
S : Π_V → Π_U, between the representations of vectors and matrices in the two
bases:

   c(u) = S c(v)     or   c(v) = S^{-1} c(u) ,
   A(u) S = S A(v)   or   A(v) = S^{-1} A(u) S .    (2.2)

The relation c(v) = S^{-1} c(u) is the characteristic transformation property of a
column vector under S, while A(v) = S^{-1} A(u) S guarantees that A(u) c(u) is also
a column vector, i.e. A(v) c(v) = S^{-1} A(u) c(u). Comparing va = ∑_m um Sma with
c(v) = S^{-1} c(u), we note that the transformations between the basis vectors and
the transformations between the components are in inverse relation. The commuta-
tion relation S Π_V = Π_U S contained in (2.2) justifies the term intertwining
operator sometimes applied to S. The intertwining of Π_V and Π_U by S is
illustrated here:

              S
        V ---------> U
        |            |
   Π_V  |            |  Π_U
        v            v
        V ---------> U
              S

More generally, two n × n matrices A and B over F are said to be similar if there
exists an invertible matrix M such that B = M A M^{-1}.
9. Let A = [aij] be an n × n matrix, and Āij the (n − 1) × (n − 1) matrix
obtained by deleting row i and column j from A. Then the determinant of A,
det A ≡ |A|, is given by |A| = ∑_{j=1}^n (−1)^{i+j} aij |Āij| (for each
i = 1, . . . , n). Let B = [bij] be the n × n matrix with elements
bij = (−1)^{i+j} |Āij| (called the cofactor of aij); then, if |A| ≠ 0, the
inverse of A exists and is given by A^{-1} = B^T / |A|.
10. Let {(xa1, . . . , xan) | a = 1, . . . , n} be a set of n vectors specified by
their components in some appropriate basis, and let M be the n × n matrix formed
from the vectors {xa}, one vector on each row. Then the vectors x1, x2, . . . , xn
are linearly independent if and only if the determinant of M is nonzero.
11. We now begin to study properties that depend on a quadratic quantity, the
bilinear form B(ψ, φ). We shall assume V = C^n; most of the results to be pre-
sented also hold for real vectors upon restriction to R^n, with exceptions that
will be noted. The scalar product B(ψ, φ) of two vectors ψ and φ in a linear
vector space V is a complex number, denoted by (φ, ψ), such that (i) (ψ, ψ) ≥ 0,
with (ψ, ψ) = 0 if and only if ψ = 0; (ii) (φ, ψ) = (ψ, φ)*, where * denotes
complex conjugation; and (iii) (φ, αψ + βχ) = α(φ, ψ) + β(φ, χ), where α, β ∈ C.
A product with property (i) is referred to as a (positive-definite) inner product.
Any vector space endowed with such a scalar product is called an inner-product
space. More technically, it is a pre-Hilbert or Euclidean space; if also complete,
it is called a Hilbert space; see [Fa].
12. Dirac notation. It is convenient to adopt the Dirac notation, in which a scalar
product B(φ, ψ) is written as ⟨φ|ψ⟩; one calls |ψ⟩ a ket vector and ⟨φ| a bra
vector. In this notation the conjugation relation becomes ⟨φ|ψ⟩ = ⟨ψ|φ⟩*,
which implies that under complex conjugation a ket becomes a bra, and vice
versa, (|φ⟩)* = ⟨φ|, (⟨ψ|)* = |ψ⟩, and the relative order of the two vectors in
the product is reversed. The vectors |ψ⟩ and ⟨ψ| are said to be dual to each
other, and belong respectively to the space V and its dual, denoted by V*.
13. One calls ⟨ψ|ψ⟩ = ‖ψ‖^2 the squared length, or squared norm, of ψ (when no
confusion can arise, we write ψ for |ψ⟩). When ‖ψ‖ = 1, the vector ψ is said to
be normalized (to one), or of norm one. Any nonzero vector can be normalized by
dividing it by its own length. Two vectors φ and ψ whose scalar product vanishes
are orthogonal to each other, which we indicate by φ ⊥ ψ.
14. A basis is said to be orthonormal if the basis vectors are mutually orthogonal
and all normalized to 1.
15. Gram–Schmidt Orthogonalization. From any two nonparallel vectors we can
produce two orthogonal vectors by subtracting from one its component parallel
to the other. More generally, from n linearly independent vectors φi , the Gram–
Schmidt orthogonalization process produces n mutually orthogonal vectors ψi, each
of which, when divided by its own length, is duly normalized.
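As a concrete sketch (numpy-based, with hypothetical input vectors), the classical Gram–Schmidt recursion subtracts from each φi its projections on the previously built ψj and then normalizes:

```python
import numpy as np

def gram_schmidt(phis):
    """Return an orthonormal family built from linearly independent vectors."""
    psis = []
    for phi in phis:
        v = phi.astype(complex)
        # subtract the components of v along the already-built directions
        for psi in psis:
            v = v - np.vdot(psi, v) * psi
        psis.append(v / np.linalg.norm(v))   # normalize to unit length
    return np.array(psis)

# three linearly independent (hypothetical) vectors in C^3
basis = gram_schmidt([np.array([1.0, 1.0, 0.0]),
                      np.array([1.0, 0.0, 1.0]),
                      np.array([0.0, 1.0, 1.0])])
print(np.allclose(basis @ basis.conj().T, np.eye(3)))   # True: orthonormal
```

Note that `np.vdot` conjugates its first argument, matching the inner-product convention (iii) of #11.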
16. Conjugations. To any linear operator A corresponds another linear operator,
called its adjoint and denoted by A†, defined by ⟨ψ|A†φ⟩ := ⟨Aψ|φ⟩, or equiva-
lently ⟨ψ|A†φ⟩ := ⟨φ|Aψ⟩*. The operator A is said to be self-adjoint (or
Hermitian) when it is equal to its adjoint, A† = A. (Space R^n: If A is a real
operator, A* = A, its adjoint is equal to its transpose, A† = A^T, defined by
(A^T)ij = Aji, and so a self-adjoint (Hermitian) operator is symmetric, A = A^T.
Note that when referring to operators or matrices, ‘self-adjoint’ and ‘symmetric’
mean the same thing in the real case, but two different things in the complex
case.)
17. An operator P is called a projection operator, or simply, projector, if it is both
idempotent (P2 = P) and Hermitian (P† = P). (In a real space, the condition is
‘idempotent and symmetric’.)
18. A linear operator U in an inner-product space V is unitary if it satisfies any
one of the conditions:
(i) U has an inverse and ⟨Uψ, Uφ⟩ = ⟨ψ, φ⟩ for every ψ and φ in V;
(ii) U has an inverse such that U^{-1} = U†;
(iii) U U† = U† U = I.
Unitarity has an important consequence, namely, that it preserves the lengths,
angles, and scalar products of the transformed vectors. (Space R^n: If U is a real
operator, U U† = U† U = I reduces to U U^T = U^T U = I, and U is said to be
orthogonal. A real orthogonal operator leaves invariant the lengths, angles, and
scalar products of vectors of R^n.)
19. A subset W of a linear vector space V closed under vector addition and scalar
multiplication is called a (linear) subspace of the vector space V .
20. Consider a linear operator A : V → V . A subspace L of V is said to be invariant
or stable under A (restricted to L) if Ax ∈ L for every vector x ∈ L.
21. Given a subspace L of V, all the vectors orthogonal to L form a subspace of V
called the orthogonal complement of L in V, denoted L⊥. The sum of
the mutually orthogonal subspaces is called an orthogonal sum.
22. A sum U + W of vector spaces is said to be direct and denoted U ⊕ W if and
only if U ∩ W = 0. (The symbol ⊕ in U ⊕ W is a reminder of this condition.) The
direct sum is commutative and associative: U ⊕ W = W ⊕ U ; (V ⊕ U ) ⊕ W =
V ⊕ (U ⊕ W ); and the dimension dim (U ⊕ W ) = dim U + dim W . An orthogonal
sum is a special example of a direct sum when there is an inner product.
23. An important issue is to know when a map f can be represented by a diag-
onal matrix. Let A be an n × n matrix over F. Then the determinant |λ I_n − A|,
where λ ∈ F and I_n is the identity matrix of order n, takes values in F and is a
monic polynomial of degree n in λ. This polynomial p_A = |λ I_n − A| is called the
characteristic polynomial of the matrix A. If f is the associated map f : V → V
of the finite-dimensional vector space V = F^n, then the characteristic polynomial
of the map f, denoted p_f, is defined to be p_A.
24. Eigenvector, eigenvalue. A nonzero vector v ∈ V is an eigenvector of A (as
described above) if Av = λv for some λ ∈ F. An element λ ∈ F is an eigenvalue
of A if Av = λv for some nonzero v ∈ V. Then the set {v ∈ V | Av = λv} is a
nonzero subspace of V, called the eigenspace or characteristic space of λ. (For
F = C, p_A = 0 has at least one solution λ ∈ C, to which corresponds an eigenvector
v ∈ C^n. But if the base field is real, a real operator may not have any real
eigenvalues.) With these definitions, we have the following important results.
25. Let A : V → V be a linear transformation of a finite-dimensional vector space
V over a field F. Then
(a) The eigenvalues of A are the roots in F of the characteristic polynomial p_A.
(b) A can be written as a diagonal matrix D relative to some ordered basis of V if
and only if the eigenvectors of A span V. In this case the diagonal entries of D
are the eigenvalues of A, and each eigenvalue λ appears on the diagonal a number
of times equal to the dimension (over F) of the characteristic space of λ.
26. Similar matrices have the same characteristic polynomial. That is, if
B = M A M^{-1}, then p_B = |x I_n − B| = |x I_n − M A M^{-1}| = |M (x I_n − A) M^{-1}| = p_A,
because |AB| = |A||B|. However, the converse is not true in general: two matrices
of the same size with the same characteristic polynomial are not necessarily simi-
lar. But a special case breaking the general rule is worth mentioning: two matri-
ces of order n that are diagonalizable to the same diagonal matrix, and therefore
similar to it, are similar to each other.
27. A category of operators that satisfy the conditions of #25 are the ‘normal’
operators. An operator A is said to be normal if it commutes with its adjoint:
AA† = A† A (which reduces to AAT = AT A for a real operator). Then, we have
a key result of linear algebra: A necessary and sufficient condition for an operator in
Fn to admit n orthonormal eigenvectors is that it is normal. Two important examples
are the self-adjoint operators and the unitary operators (or symmetric operators
and orthogonal operators in the real case). A necessary and sufficient condition
for two normal operators in Fn to admit n common orthonormal eigenvectors is
that they commute.
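A small numerical illustration of this key result (a sketch with a hypothetical self-adjoint, hence normal, matrix): numpy's `eigh` returns the orthonormal eigenvectors promised by the statement above:

```python
import numpy as np

# A hypothetical Hermitian (hence normal) matrix: A equals its adjoint
A = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])
print(np.allclose(A, A.conj().T))                    # True: self-adjoint
print(np.allclose(A @ A.conj().T, A.conj().T @ A))   # True: normal

w, U = np.linalg.eigh(A)                             # eigenvalues, eigenvectors
print(np.allclose(U.conj().T @ U, np.eye(2)))        # True: orthonormal eigenvectors
print(np.allclose(U.conj().T @ A @ U, np.diag(w)))   # True: diagonalized
```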

2.2 Definition and Constructions


We introduce here the basic terminology of the representation theory of groups
and show various ways of constructing representations of finite groups.
Definition 2.1 (Representation). A linear representation ( π, V ) of a group G in a
vector space V over a field F is a homomorphism π : G → GL(V ) of G to the group of
invertible linear maps t : V → V over F. The map π is then said to give V the structure
of a group module over F, written F [ G ]-module.
The representation ( π, V ) is said to have degree n, or to be n-dimensional, if
V has dimension n; it is said to be finite-dimensional if n is finite. The term ‘linear
representation’ refers to the fact that the invertible operators t to which correspond
the elements of G are linear. The field F over which V is defined consists of real
numbers R or complex numbers C. As we are concerned exclusively with linear
representations of finite degree in a vector space V over R or C we will refer
to them simply as ‘representations’, or π, or V . For the same reason, we use
sometimes the same symbol, such as g, to represent either a group element g or
its representation π ( g). In this chapter, we shall restrict ourselves mostly to finite
groups.
The homomorphism π : G → GL(V ) between groups means that any element
g of G is mapped to an invertible operator t ∈ GL (V ), now identified with π ( g),
such that the group rules for G are preserved. If e is the identity element and g, h
are arbitrary elements of G, and I the identity operator in V , then we have

   π(g) π(h) = π(gh) ,          (2.3)
   π(g^{-1}) = (π(g))^{-1} ,    (2.4)
   π(e) = I .                   (2.5)
Homomorphism also implies that two or more elements of G may be mapped onto the
same element of GL(V). When the map reduces to a one-to-one mapping, i.e. an
isomorphism, the representation is said to be faithful.
Let { u1, u2, . . . , un } be a basis for an n-dimensional vector space V . For each
invertible linear operator π ( g) representing g ∈ G, we may write
   π(g) ui = ∑_{j=1}^n uj Dji(g) ,    (2.6)

so as to define an n × n matrix D( g) in the basis { ui } corresponding to π ( g).


A representation of G in V can be specified by giving the D( g)’s for all g ∈ G
satisfying the conditions for matrices analogous to (2.3)–(2.5). More precisely, a
matrix representation of G in V is a homomorphism π : G → GL( n, F ) of the
group G to the group GL( n, F ) of nonsingular n × n matrices over F. Alternatively,
(2.6) may be regarded as a linear action of G on V : let a group G act on some
finite set X by left multiplication, such that the map G × X → X is defined by
( g, x) 7 → gx, with g ∈ G and x ∈ X, then we can construct representations in
different ways depending on our choice for X, as follows.
(1) If X is a vector space, then the transforms x′ = gx, with x ∈ X, produce a
representation called the defining representation, of degree equal to the dimen-
sion of X. In components, x′ = π(g)x reads x′_i = ∑_j Dij(g) xj, where
x = ∑_i ui xi.
(2) If X is a coordinate space, all the complex-valued functions ψ : X → C con-
stitute a complex vector space L. Calculating the transformations on this function
space is another way of finding the representations of G (over C).
Suppose we have a symmetry transformation in the configuration space X
corresponding to some element r ∈ G, taking a point x to a point x′ = rx. We can
associate r, which operates on coordinates, with a linear operator π(r), which
acts in function space: ψ ↦ ψ′ = π(r)ψ. As the transformed function ψ′ at the
transformed point x′ = rx has a value equal to the value of the original function
ψ at the original point x, we have
   ψ′(x′) ≡ [π(r)ψ](rx) = ψ(x) ,      (2.7)
or   [π(r)ψ](x) = ψ(r^{-1}x) .        (2.8)
If, following r, we apply a second transformation s of the same group,
x′ ↦ x″ = sx′, then we have for the function ψ″(x″)

   [π(s)ψ′](sx′) = ψ′(x′) = ψ(x) ,
   (π(s)[π(r)ψ])(srx) = ψ(x) .

Now, to x″ = srx we associate [π(sr)ψ](srx) = ψ(x). Comparing the two equa-
tions obtained above, we infer that (π(s)[π(r)ψ])(srx) = [π(sr)ψ](srx), or, since
ψ is an arbitrary function,

   π(s) π(r) = π(sr) ,    (2.9)
so that the representative operators in space L obey the same multiplication law
as the group elements, as they should.
The above analysis suggests a procedure for constructing representations of G
in a sequence of steps: (i) choose a set of n linearly independent functions εi(x)
as a basis in L; (ii) calculate εi(g^{-1}x) for each g ∈ G; (iii) express the
results as linear combinations of the εi(x). The coefficients of εi in the sums
are just the elements of the matrices D(g), according to
π(g) εi(x) = ∑_j εj(x) Dji(g), which give the n-dimensional representation of G
in the subspace of L spanned by the εi.
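The three steps above can be mechanized for a small sketch example (hypothetical choices: the group of plane rotations by multiples of π/2, with basis functions ε1(x, y) = x and ε2(x, y) = y); the matrix D(g) is extracted by sampling the functions at enough points to pin down the expansion:

```python
import numpy as np

# Hypothetical example group: rotations of the plane by theta
def rotate(theta, p):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * p[0] - s * p[1], s * p[0] + c * p[1]])

# Step (i): basis functions eps_1(x, y) = x, eps_2(x, y) = y of L
basis = [lambda p: p[0], lambda p: p[1]]

# Sample points: enough to determine a linear function uniquely
pts = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

def rep_matrix(theta):
    # Steps (ii)-(iii): evaluate eps_i(g^{-1} x) and expand on the eps_j(x).
    E = np.array([[eps(p) for p in pts] for eps in basis])
    Eg = np.array([[eps(rotate(-theta, p)) for p in pts] for eps in basis])
    # Eg = D^T E, so solve E^T X = Eg^T for X = D
    return np.linalg.solve(E.T, Eg.T)

D = rep_matrix(np.pi / 2)
print(np.allclose(D, [[0, -1], [1, 0]]))   # True: the rotation matrix by pi/2
```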
(3) Let X be a finite set, and V a vector space with basis {εx ; x ∈ X}, so that
we may write any f ∈ V as f(y) = ∑x ax εx(y). G acts on V by

   π(g) ∑x ax εx = ∑x ax εgx .

This is the case if we identify εx(y) with the characteristic function δx,y,
equal to 1 if x = y and to 0 if x ≠ y.

From (2.8) it follows that π(g) εx(y) = εx(g^{-1}y) = εgx(y), which gives us

   π(g) εx = εgx = ∑y εy δy,gx .    (2.10)
The coefficients of εy in the y-sum are just the elements of a matrix in the repre-
sentation associated with the homomorphism of G into the permutation group of
X, called the permutational representation. Such a representation is relevant to
every finite group, each finite group being isomorphic to a subgroup of some
symmetric group (e.g. D4 < S4). Matrices in this representation can be obtained
from the permutation operators p_g described in the previous chapter. For Sn, X
may be a set of n letters or symbols.
(4) Let X be the group G itself, so that G × G → G, meaning that the left action
of an element (operator) g ∈ G on another element (considered now a basis vector
in the group space) yields a vector in the same space. Then (2.6) may be re-written
in the form

   g gj = ∑_k gk Dkj(g) ,    g, gj, gk ∈ G ,    (2.11)

where the matrix D(g) is defined as

   Dkj(g) = 1 if g gj = gk , and 0 otherwise;    k, j = 1, 2, . . . , |G|.    (2.12)

For g = e, Dkj(e) = δkj; and for g ≠ e, all entries of D(g) are 0, except a single
off-diagonal entry equal to 1 in each row and each column. For any complex-valued
function φ on the group space, we have [gi φ](gj) = φ(gi^{-1} gj). The associated
permutation representation, defined by the action (g, x) ↦ gx of G on itself, is
called the regular representation of the group G. Its degree is equal to the
order |G| of the group; it is quite unique in that it contains all the basic
units (inequivalent irreducible representations) of the group representations,
as we shall see. A similar construction plays a key role in the study of Lie
groups.
EXAMPLE 1: To begin, we take the familiar example of rotations in a plane E^2
about the origin, forming a continuous group G = {Rθ | 0 ≤ θ < 2π}. Let e1, e2
be the orthogonal unit vectors in E^2. Under Rθ the basis vectors transform as

   e′1 = π(Rθ) e1 = e1 cos θ + e2 sin θ = ej Dj1(θ) ,
   e′2 = π(Rθ) e2 = e1 (−sin θ) + e2 cos θ = ej Dj2(θ) ,

so that the matrix representation of G in this basis, according to (2.6), is

          [ D11  D12 ]   [ cos θ   −sin θ ]
   D(θ) = [ D21  D22 ] = [ sin θ    cos θ ] .

π(Rθ) preserves orthonormality (in particular, ei · ej = e′i · e′j = δij), and
sends any vector V = ei Vi in E^2 to V′ = π(Rθ)V, with components in the basis
e1, e2 given by V′j = Dji(θ) Vi, or explicitly,

   V ↦ V′ = ei V′i :   V′1 = V1 cos θ − V2 sin θ ,
                       V′2 = V1 sin θ + V2 cos θ .    (2.13)

The components of the transformed vector in the transformed basis are the same
as the components of the old vector in the old basis: e′i · V′ = ei · V, which is
the same as (2.8). The representation we have is orthogonal, D^T D = 1
((D^T)ij = Dji), as expected from the orthogonality of the basis in which it is
written. □
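A quick numerical check (a sketch) that θ ↦ D(θ) is indeed a homomorphism, D(θ1) D(θ2) = D(θ1 + θ2), and that each D(θ) is orthogonal:

```python
import numpy as np

def D(theta):
    # the 2x2 rotation matrix of Example 1
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

t1, t2 = 0.7, 1.9   # arbitrary sample angles
print(np.allclose(D(t1) @ D(t2), D(t1 + t2)))    # True: homomorphism property
print(np.allclose(D(t1).T @ D(t1), np.eye(2)))   # True: D is orthogonal
```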
Our first important task is to find all one-dimensional representations of any
given group. We will use the notation E ≡ D( e ) and A ≡ D( a ) for a group
element a. To begin, note two general results:
(1) Recall that a, b ∈ G are said to be conjugate if a = g b g^{-1} for some
element g ∈ G. If π is a one-dimensional representation, then
π(a) = π(g) π(b) π(g^{-1}) = π(g) π(b) π(g)^{-1}. As the 1 × 1 matrices commute,
π(g) π(b) = π(b) π(g), and we have π(a) = π(b). It follows that a one-dimensional
representation is constant on each conjugacy class of G.
(2) For any group G, a one-dimensional (1D) representation is a homomorphism
π : G → GL(1, C), assigning a nonzero complex number to each g ∈ G. In
particular, when all g ↦ 1, the representatives are the 1 × 1 matrices
D(g) = [1] for all g ∈ G, and the one-dimensional representation is said to be
trivial.
EXAMPLE 2: Cyclic groups Cn. We will assume F = C in the following. For C2 =
⟨a : a^2 = e⟩, D(a^2) ≡ A^2 = 1 and A = ±1, which gives us two 1D representations:
(α1): E = 1, A = 1; and (α2): E = 1, A = −1.
For C3 = ⟨a : a^3 = e⟩, we have D(a^3) ≡ A^3 = 1 and three cube roots of 1 (1, ω,
and ω^2), leading to three (complex) 1D representations:

         E     A     A^2
   α1 :  1     1     1
   α2 :  1     ω     ω^2
   α3 :  1     ω^2   ω

In general, a cyclic group Cn = ⟨a : a^n = e⟩ admits n one-dimensional repre-
sentations, corresponding to the n nth roots of 1: ω, ω^2, . . . , ω^n = 1.
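These n representations can be generated and checked mechanically (a sketch): each candidate πk(a^m) = ω^{km}, with ω = e^{2πi/n}, is verified to respect the group law πk(a^m) πk(a^{m′}) = πk(a^{m+m′}):

```python
import cmath

def one_dim_reps(n):
    # the n one-dimensional representations of C_n = <a : a^n = e>
    omega = cmath.exp(2j * cmath.pi / n)
    return [[omega ** (k * m) for m in range(n)] for k in range(n)]

for rep in one_dim_reps(3):
    # homomorphism check: rep[m] * rep[m'] == rep[(m + m') mod n]
    ok = all(abs(rep[m] * rep[mp] - rep[(m + mp) % 3]) < 1e-12
             for m in range(3) for mp in range(3))
    print(ok)   # True for each of the three representations
```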
EXAMPLE 3: Symmetric groups. The group S3 (acting on the set of three symbols
{1, 2, 3}) has three conjugacy classes: {1}, {(123), (132)}, {(23), (31), (12)}.
Let r = (123) and s = (23), so that r^3 = e, s^2 = e, and rs = sr^2. The 1D
representations R ≡ D(r) and S ≡ D(s) satisfy R^3 = 1 (so R = 1, ω, or ω^2) and
S^2 = 1 (so S = ±1). But rs = sr^2 implies RS = SR^2. For either S = ±1, one has
R = R^2, which is satisfied only for R = 1. (As we have mentioned before, these
are also the values for the other elements of the respective classes.) Hence, S3
has two 1D representations: the trivial α1: R = 1, S = 1, and the alternating
α2: R = 1, S = −1. (In other words, in the alternating representation, g ↦ 1 if
g is even, and g ↦ −1 if g is odd.)
The symmetric group Sn admits in general two 1D representations: the trivial
representation and the alternating representation (which maps even permutations
to 1 and odd permutations to −1). This follows from two basic properties of Sn:
(a) any p ∈ Sn may be (non-uniquely) expressed as a product of transpositions,
p = τ1 · · · τr; (b) each transposition is its own inverse. As any transposition
τ satisfies τ^2 = e, its 1D representatives are ±1. Then either all τ ↦ 1, so
that all p ↦ 1 (trivial 1D), or all τ ↦ −1, and p ↦ (−1)^r (alternating 1D).
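The alternating representation is just the sign homomorphism; a short sketch computes sgn(p) by counting inversions and checks the homomorphism property sgn(pq) = sgn(p) sgn(q) on all of S3:

```python
from itertools import permutations

def sign(p):
    # parity of a permutation given in one-line notation (tuple of 0..n-1)
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def compose(p, q):
    # (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(p)))

s3 = list(permutations(range(3)))
print(all(sign(compose(p, q)) == sign(p) * sign(q) for p in s3 for q in s3))  # True
```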
EXAMPLE 4: Dihedral groups. The symmetry group of the rectangle, D2 = ⟨r, s :
r^2 = s^2 = e, rs = sr⟩, has four 1D representations: π(r) = ±1 and π(s) = ±1.

         e     r     s     rs
   α1 :  1     1     1     1
   α2 :  1     1    −1    −1
   α3 :  1    −1     1    −1
   α4 :  1    −1    −1     1

In D3 = ⟨r, s : r^3 = s^2 = e, rs = sr^2⟩, the symmetry group of the equilateral
triangle, the defining rules for the generators require that the one-dimensional
representations R = D(r) and S = D(s) satisfy R = 1 and S = ±1, and so there
are only two of them (cf. S3 ≅ D3).
In D4 = ⟨r, s : r^4 = s^2 = e, rs = sr^3⟩ the rules for the generators imply
R = ±1 and S = ±1; and D4 admits four one-dimensional representations.
[Figure 2.1: Planar geometric shapes having the symmetries of the dihedral groups
D2, D3, D4: a rectangle, an equilateral triangle, and a square, with dotted lines
indicating the reflection lines.]

EXAMPLE 5: The defining representations of D2, D3, D4 are those produced
by the rotations and the reflections that leave the rectangle, the equilateral tri-
angle, and the square unchanged, all in the Euclidean plane E². They are all
two-dimensional, faithful, and real. Let x, y be the Cartesian coordinates in R²,
and let the generators r, s of D2 act as isometries of the plane R², sending the rect-
angle into itself, such that r : (x, y) ↦ (−x, −y) and s : (x, y) ↦ (−x, y). Then
the matrices
$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \quad
\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$
representing e, r, s, rs respectively, form the defining representation of D2.
In D3, r may be taken to be a rotation through 2π/3 about the center of an
equilateral triangle, mapping x ↦ (−x − √3 y)/2 and y ↦ (√3 x − y)/2; and s
a reflection across the y axis, mapping x ↦ −x and y ↦ y. Then the defining
representation of D3 is specified by
$$\pi(r) = \frac{1}{2}\begin{pmatrix} -1 & -\sqrt{3} \\ \sqrt{3} & -1 \end{pmatrix}, \quad
\pi(s) = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}.$$

In D4 we may take r to be the rotation about O through π/2, sending ver-
tex A to B, and s the reflection across AC, sending vertex B to D. The defining
representation of D4 is specified by
$$\pi(r) = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad
\pi(s) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Other representatives of the group are obtained by matrix multiplication. For
example, r² sends A to C, and sr² is the reflection across the diagonal BD, sending
A to C:
$$\pi(r^2) = \pi(r)^2 = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \quad
\pi(sr^2) = \pi(s)\pi(r)^2 = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}.$$
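The defining relations and the products above are easy to confirm numerically. The sketch below (illustrative, not part of the text) checks that the 2×2 matrices satisfy r⁴ = s² = e and rs = sr³:

```python
# Check the defining representation of D4: r^4 = s^2 = e, rs = sr^3,
# plus the matrices of r^2 and sr^2 quoted in the example.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

R = [[0, -1], [1, 0]]    # pi(r): rotation through pi/2
S = [[0, 1], [1, 0]]     # pi(s): reflection across the diagonal AC
I = [[1, 0], [0, 1]]

R2 = matmul(R, R)
R3 = matmul(R2, R)
assert matmul(R2, R2) == I                    # r^4 = e
assert matmul(S, S) == I                      # s^2 = e
assert matmul(R, S) == matmul(S, R3)          # rs = sr^3
assert R2 == [[-1, 0], [0, -1]]               # pi(r^2): sends A to C
assert matmul(S, R2) == [[0, -1], [-1, 0]]    # pi(sr^2) = pi(s)pi(r)^2
```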

EXAMPLE 6: Regular representation. The group table for C3 = ⟨a : a³ = e⟩, with
the elements labeled 1 = e, 2 = a, 3 = a², is

  C3    1  2  3
  e     1  2  3
  a     2  3  1
  a²    3  1  2

From this table, we obtain its regular representation (in the basis {e, a, a²})
$$D(e) = \begin{pmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{pmatrix}, \quad
D(a) = \begin{pmatrix} 0&0&1 \\ 1&0&0 \\ 0&1&0 \end{pmatrix}, \quad
D(a^2) = \begin{pmatrix} 0&1&0 \\ 0&0&1 \\ 1&0&0 \end{pmatrix}.$$

EXAMPLE 7: Permutational representation. Let X be a set of three symbols {1, 2, 3}
and let us apply the formula π(g)εx = ∑y εy Dyx(g), with Dyx(g) = δ_{y,gx} and
x, y ∈ X, to calculate the permutational representation of the symmetric group
S3. For example, the transposition g = (12) produces the nonzero entries (y, gx) =
(1, g·2), (2, g·1), (3, g·3) for the matrix D[(12)] (this permutation leaves only
'3' fixed). Similar calculations give D[(123)] and D[(132)] (which leave no symbols
fixed):
$$D[(12)] = \begin{pmatrix} 0&1&0 \\ 1&0&0 \\ 0&0&1 \end{pmatrix}, \quad
D[(123)] = \begin{pmatrix} 0&0&1 \\ 1&0&0 \\ 0&1&0 \end{pmatrix}, \quad
D[(132)] = \begin{pmatrix} 0&1&0 \\ 0&0&1 \\ 1&0&0 \end{pmatrix}.$$

Note that D[(123)] and D[(132)] are equal to the matrices D(a) and D(a²) of C3
in its regular representation. This is because, as discussed in Chapter 1, C3 is
isomorphic to a subgroup of S3 under the mapping a ↦ (123) and a² ↦ (132).
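The rule Dyx(g) = δ_{y,gx} translates directly into code. The sketch below (illustrative, not part of the text; the function name `perm_matrix` is ours) builds the permutation matrices of S3:

```python
# Build the permutational representation D_{yx}(g) = delta_{y, gx} of S_3.

def perm_matrix(g):
    """g is a dict x -> g(x) on symbols 1..n; entry D[y][x] = 1 iff y = g(x)."""
    n = len(g)
    return [[1 if g[x] == y else 0 for x in range(1, n + 1)]
            for y in range(1, n + 1)]

t12  = {1: 2, 2: 1, 3: 3}    # the transposition (12)
c123 = {1: 2, 2: 3, 3: 1}    # the 3-cycle (123)
c132 = {1: 3, 2: 1, 3: 2}    # the 3-cycle (132)

assert perm_matrix(t12)  == [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
assert perm_matrix(c123) == [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
assert perm_matrix(c132) == [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
```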
EXAMPLE 8: D2 in function space. Suppose now that π(D2) acts on a function
space L, which we first assume to be spanned by linear functions of the form
f(x, y) = ax + by, where a and b are independent real constants. We choose the
basis to be e1 = x and e2 = y. The elements of the group transform e1, e2 in
exactly the same way as in the defining representation and, therefore, the matrix
representation of D2 in this basis is given by the same 2 × 2 matrices.

Assume now that L is spanned by the vectors e1 = 1, e2 = xy, e3 = x² and
e4 = y², so that any vector in L is of the form f = c1 + c2 xy + c3 x² + c4 y². In this
basis the group elements are represented by 4 × 4 matrices. For instance, r ↦ I4,
since all four basis vectors are even under (x, y) ↦ (−x, −y), while s : (x, y) ↦ (−x, y)
reverses the sign of xy only:
$$s \mapsto \begin{pmatrix} 1&0&0&0 \\ 0&-1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{pmatrix}.$$
Here we have a four-dimensional representation of D2.

2.3 Simple Representations


A central concern of representation theory is to find all the distinct represen-
tations of a group. In general, a group can have many possible representations
of a given degree, not all of which are fundamentally different, and we need a way
to tell whether they are or are not. In addition, representation matrices may look
complicated, yet may actually be expressible, in a sense, as sums of smaller matrices.
It turns out that one can define basic units of representation of a given group in
terms of which arbitrary representations may be constructed.
Definition 2.2 (Equivalent representations). Two representations ( π, V ) and ( ρ, W )
of a group G over a field F are said to be equivalent if they are isomorphic under G.
This means there is a linear map s : V → W that preserves the action of G, so that
s ( gψ) = g( sψ ) for all g ∈ G, ψ ∈ V .
In matrix form, given the maps π : g 7 → A ( g) and ρ : g 7 → B ( g), we say the
matrix representations A and B are equivalent if there is a nonsingular matrix S,
corresponding to the linear map s : V → W , independent of the group elements
such that B ( g) = SA ( g)S−1 for each g ∈ G. As V and W are isomorphic spaces,
they must have equal dimensions, and A and B are matrices of the same size.
The relation B ( g) = SA ( g)S−1 means that A and B are similar (A ∼ B), that is,
related by a similarity transformation (see Sec. 2.1 #8): they are just matrices writ-
ten in bases related by S. The two representations contain the same information
about the group and are effectively the same, just being viewed from different
points of view.
Recall (Sec. 2.1 #26) that two similar matrices have the same characteristic
equation, |SAS⁻¹ − λI| = |A − λI| = 0, and hence the same eigenvalues.
This condition is necessary but not sufficient in general. For example,
$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad
\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$
have the same eigenvalues 1, 1, but are not similar. However, if it is true that
A(g) ∼ B(g) for every g ∈ G, so that there exists an invertible matrix S(g) with
B(g) = S(g)A(g)S(g)⁻¹ for every g ∈ G, then we can find a matrix S independent
of g such that B(g) = SA(g)S⁻¹ for every g ∈ G, which means the representations

are equivalent. If on the contrary A(g) and B(g) have different eigenvalues for
some g ∈ G, then the associated representations are certainly inequivalent.
Equivalent representations form equivalence classes, and a representation
belongs to exactly one class. To classify the representations of a group it suffices
to know just one member of each equivalence class, since from this member all
the others can be generated by similarity transformations. Two representations
belonging to different equivalence classes are not only different in appearance but
genuinely distinct. In short, different equivalence classes mean distinct representations.
EXAMPLE 9: The following matrices
$$R = \frac{1}{2}\begin{pmatrix} -1 & -\sqrt{3} \\ \sqrt{3} & -1 \end{pmatrix}, \quad
S = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$$
specify the defining representation of the group D3 over R², and the matrices
$$R' = \frac{1}{2}\begin{pmatrix} -1 & -\sqrt{3} \\ \sqrt{3} & -1 \end{pmatrix}, \quad
S' = \frac{1}{2}\begin{pmatrix} 1 & \sqrt{3} \\ \sqrt{3} & -1 \end{pmatrix}$$
define another representation of D3. But the latter is equivalent to the first, being
related to it through
$$P = \frac{1}{2}\begin{pmatrix} 1 & \sqrt{3} \\ -\sqrt{3} & 1 \end{pmatrix}$$
by R′ = PRP⁻¹ and S′ = PSP⁻¹. 
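The claimed equivalence is easy to confirm numerically. The sketch below (illustrative, not part of the text) checks R′ = PRP⁻¹ and S′ = PSP⁻¹ in floating point; note that P is a rotation, so it commutes with the rotation R, and P⁻¹ = Pᵀ:

```python
# Verify the equivalence of the two D3 representations of Example 9.
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B):
    return all(math.isclose(A[i][j], B[i][j], abs_tol=1e-12)
               for i in range(2) for j in range(2))

s3 = math.sqrt(3.0)
R  = [[-0.5, -s3 / 2], [s3 / 2, -0.5]]   # defining rep: rotation by 2*pi/3
S  = [[-1.0, 0.0], [0.0, 1.0]]           # reflection across the y axis
Rp = [[-0.5, -s3 / 2], [s3 / 2, -0.5]]   # the equivalent rep R', S'
Sp = [[0.5, s3 / 2], [s3 / 2, -0.5]]
P    = [[0.5, s3 / 2], [-s3 / 2, 0.5]]   # rotation through -pi/3
Pinv = [[0.5, -s3 / 2], [s3 / 2, 0.5]]   # P^{-1} = P^T

assert close(matmul(matmul(P, R), Pinv), Rp)   # R' = P R P^{-1}
assert close(matmul(matmul(P, S), Pinv), Sp)   # S' = P S P^{-1}
```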
Some representations of a group may be considered more ‘basic’ than others.
Some are ‘simple’, others ‘semi-simple’.

Definition 2.3 (Simple representation). A representation ( π, V ) of G is said to be


simple (or, equivalently, irreducible) if there is no proper nonzero subspace invariant
under G. Otherwise, it is said to be reducible.

The definition tells us that in a simple representation ( π, V ) any subspace W


of V stable under G (such that gψ ∈ W for all g ∈ G, ψ ∈ W ) must be either 0
or V (see Sec. 2.1 #20). We will use the abbreviations ‘rep(s)’ for representation(s)
and ‘irrep(s)’ for simple or irreducible representation(s).
EXAMPLE 10: All one-dimensional representations are simple (because they have no
proper nonzero subspaces at all, stable or not). Thus, Cn has n 1D irreducible representations.
EXAMPLE 11: D3 = ⟨r, s : r³ = s² = e, rs = sr⁻¹⟩ over C. We have seen before
that D3 has just two 1D representations: α1 : r, s ↦ 1, 1 and α2 : r, s ↦ 1, −1.
We now want to find the other simple representations (β, V) with dim V ≥ 2. In
such a representation (with g standing for β(g)), the eigenvectors of r, defined
by rψ = λψ, have eigenvalues satisfying λ³ = 1, with roots ω, ω², ω³ = 1. The
vector defined by φ = sψ is an eigenvector of r with eigenvalue λ², because
rφ = rsψ = sr²ψ = λ²φ. Thus ψ and φ are linearly independent and span a
subspace W = {ψ, φ} of V stable under D3, as we can verify: rψ = λψ, rφ = λ²φ,
sψ = φ, and sφ = s²ψ = ψ. As (β, V) is simple by assumption, a stable nonzero
subspace must be V itself. So V = W, and dim V = 2, for which we may choose
as basis the independent vectors ψ and φ. Consider the three cases:

(i) λ = ω: The representation matrices are
$$r \mapsto \begin{pmatrix} \omega & 0 \\ 0 & \omega^2 \end{pmatrix}, \quad
s \mapsto \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

(ii) λ = ω²: This representation is equivalent to (i), as seen by interchanging
the basis vectors, ψ ↔ φ, and so we keep only one representation with dim V = 2.
(iii) λ = 1: Here the representation matrices are
$$r \mapsto \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
s \mapsto \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

This says rψ = ψ and rφ = φ, hence rv = v for any v ∈ V: r is represented by the
matrix I in any basis of V. Turning now to s, and noting that s has eigenvectors ξ
with eigenvalues ±1, we see that the 1D space U = ⟨ξ⟩ is stable under D3, since
rξ = ξ and sξ = ±ξ. As V is simple, one must have U = V, which contradicts the
hypothesis dim V = 2; hence λ ≠ 1. In fact, the similarity transformation defined by
$$P = P^{-1} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$
diagonalizes both r and s, and shows that this representation is the direct sum of
two 1D representations. In conclusion, D3 has two irreps of dimension 1, and one
irrep of dimension 2 (2D), as listed below:

  D3 irreps   dim    r                s
  α1 :         1     1                1
  α2 :         1     1               −1
  α3 :         2     diag(ω, ω⁻¹)     antidiag(1, 1)
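As a numerical aside (an illustrative sketch, not part of the text), one can verify that the 2D matrices of α3 satisfy the defining relations of D3:

```python
# Check the 2D irrep alpha_3 of D3: r^3 = s^2 = e and rs = sr^2 = sr^{-1}.
import cmath

w = cmath.exp(2j * cmath.pi / 3)   # omega, a primitive cube root of 1
R = [[w, 0], [0, w ** 2]]          # alpha_3(r); note w^2 = w^{-1}
S = [[0, 1], [1, 0]]               # alpha_3(s)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B):
    return all(abs(A[i][j] - B[i][j]) < 1e-12 for i in range(2) for j in range(2))

I2 = [[1, 0], [0, 1]]
R2 = matmul(R, R)
assert close(matmul(R2, R), I2)             # r^3 = e
assert close(matmul(S, S), I2)              # s^2 = e
assert close(matmul(R, S), matmul(S, R2))   # rs = sr^2
```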

EXAMPLE 12: D4 = ⟨r, s : r⁴ = s² = e, rs = sr⁻¹⟩ over C. We have seen before that
D4 has four 1D representations. Now assume (π, V) simple, with dim V ≥ 2. Let
rψ = λψ, and φ = sψ. Then rφ = λ³φ, so that ψ and φ are eigenvectors of r with
eigenvalues λ and λ³. They span a subspace U = {ψ, φ} ⊆ V, which is stable
under r and s (therefore under D4). As V is simple, one has V = U = {ψ, φ}; that
is, dim V = 2 for any simple representation of degree greater than 1. As r⁴ = 1,
one has λ⁴ = 1, leading to the four roots λ = ±1, ±i.
In the two cases λ = ±1 (λ³ = ±1), one has r ↦ ±I in the basis ψ, φ, and
so also in any other basis (reached by a similarity transformation). On the other
hand, as s is diagonalizable in V, and r is diagonal in any basis, both r and s can
be simultaneously diagonalized. Therefore, for λ = ±1, the representation is
decomposable, and so not simple. The two remaining cases λ = ±i are equiva-
lent under a swap of the basis vectors, and we need consider only the case λ = i.
Recalling that sφ = ψ and sψ = φ, we have the 2D simple representation
$$r \mapsto \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad
s \mapsto \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

In conclusion, D4 has four irreps of dimension 1, and one irrep of dimension 2.



EXAMPLE 13: In the general case, Dn = ⟨r, s : rⁿ = s² = e, rs = sr⁻¹⟩ with n ≥ 3,
let (π, V) be an irrep π : Dn → GL(dim V, C). (For r, s, . . . , read π(r), π(s), . . . .)
• dim V = 1: Here rs = sr⁻¹ implies r² = 1. The conditions r² = 1, s² = 1
yield r = ±1, s = ±1, subject to the rule rⁿ = 1, so that we have:
  – n odd: r = +1, s = ±1: two 1D irreps;
  – n even: r = ±1, s = ±1: four 1D irreps.
• dim V ≥ 2: Following the same reasoning as for D3, D4, define rψ = λψ
and φ = sψ. Then, using rs = sr⁻¹, rφ = λⁿ⁻¹φ, so that ψ and φ are eigenvectors
of r with eigenvalues λ and λⁿ⁻¹. Further, with s² = 1, sψ = φ and sφ = ψ. As
ψ and φ are linearly independent, they span a subspace U = {ψ, φ} of V stable
under Dn. Provided U is not reducible, it is a minimal invariant subspace, and
must coincide with V. Noting that rⁿψ = ψ = λⁿψ, so that λⁿ = 1, we write these
results in the {ψ, φ} basis as follows:
$$\pi(r) = \begin{pmatrix} \lambda & 0 \\ 0 & \lambda^{-1} \end{pmatrix}, \quad
\pi(s) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
(\lambda \ \text{is an } n\text{th root of unity}). \qquad (2.14)$$

We will use ω = e^{i2π/n} to denote the first primitive nth root of unity.
• n odd: λ ∈ {1, ω^{±1}, ω^{±2}, . . . , ω^{±m}} with m = (n − 1)/2, where we have
used the identity ω^{m+k} = ω^{−(m−k+1)} with k = 1, 2, . . . , m. The eigenvalue λ = 1
gives a representation that is not simple, being decomposable into a direct sum
of two 1D irreps, while of the n − 1 remaining eigenvalues, half (ω^{−j}) give irre-
ducible representations that are equivalent (upon ψ ↔ φ in the basis) to the other
half (ω^j). And so, for Dn with odd n, there exist (n − 1)/2 non-equivalent 2D
irreducible representations.
• n even: λ ∈ {±1, ω^{±1}, ω^{±2}, . . . , ω^{±m}}, with m = (n − 2)/2, where we have
used the relation ω^{m+k+1} = ω^{−(m−k+1)}, with k = 1, 2, . . . , m. The two representa-
tions corresponding to λ = ±1 are not simple: they are decomposable into direct
sums of 1D irreps. Half of the representations corresponding to the remaining
n − 2 roots are equivalent to the other half. And so for Dn with even n, there
exist (n − 2)/2 non-equivalent 2D irreducible representations. 
Now, suppose we have a representation ( π, V ) of a group G given in matrix
form D ( g) of degree n = dim V . If V contains no nonzero proper subspaces
stable under G, then π is a simple representation and is indecomposable, not
being expressible as a sum of other representations. If, on the contrary, V contains
a subspace V 1 of dimension n1 different from both 0 and n, then we define its
orthogonal complement W such that V = V 1 ⊕ W (see Sec. 2.1 #21), and in an
appropriate basis for V consisting of ei , i = 1, . . . , n, the vectors ei , i = 1, . . . , n1
span V 1. In order for V 1 to be a G-invariant subspace (that is, π ( g) v ∈ V 1 for
every v ∈ V1 and every g ∈ G), all the vectors given by
$$\pi(g)e_i = \sum_{j=1}^{n} e_j D_{ji}(g), \quad (i = 1, \ldots, n_1) \qquad (2.15)$$
must be in V1, which means Dji(g) = 0 for i = 1, . . . , n1 and j = n1 + 1, . . . , n,



and the n × n matrix D assumes the upper block-triangular form
$$D(g) = \begin{pmatrix} D^1(g) & * \\ 0 & * \end{pmatrix}, \qquad (2.16)$$
where ∗ stands for unspecified entries, and the lower-left block consists entirely
of 0's. Then we say that the representation D is reducible, i.e., can be reduced to the
triangular form (2.16). If this holds for all g ∈ G, then the matrices D¹(g) for all g ∈ G
define a subrepresentation of degree n1 for G in the subspace V1 (of dimension
n1) of V invariant under G.
We may now apply the same procedure to the subspace W and may find a
subspace V2 in W invariant under G. Then, with an appropriate choice of basis,
the representation matrix takes the form
$$D(g) = \begin{pmatrix} D^1(g) & 0 & * \\ 0 & D^2(g) & * \\ 0 & 0 & * \end{pmatrix}. \qquad (2.17)$$
We keep applying the same operations until no more reduction can be made;
then D(g) may be written in terms of irreducible matrices:
$$D(g) = \begin{pmatrix} D^1(g) & 0 & \ldots & 0 \\ 0 & D^2(g) & \ldots & 0 \\
\vdots & & \ddots & \vdots \\ 0 & 0 & \ldots & D^p(g) \end{pmatrix}
= \mathrm{diag}\,[D^1(g), D^2(g), \ldots, D^p(g)]. \qquad (2.18)$$

Thus, the matrix D(g) of the representation (π, V) of g ∈ G is reduced ultimately
to a block-diagonal matrix expressible in terms of irreducible matrices Dⁱ(g), so
we may write D(g) = diag[D¹(g), . . . , D^p(g)], where Dⁱ is the matrix of the irrep
πᵢ : g ↦ Dⁱ(g). If this form holds for all g ∈ G, we say that the representa-
tion (π, V) is completely reducible, or semisimple. Then it is expressible as a sum
π = ∑ᵢ πᵢ ≡ ⊕ᵢ πᵢ of irreducible representations πᵢ in a space defined by the
direct sum V = ⊕ᵢVᵢ of spaces Vᵢ (see Sec. 2.1 #21). In short, a semisimple
representation of G is one that is expressible as a direct sum of simple representations.
In the above discussion we wrote V = V1 ⊕ W, where W is the orthogonal
complement of V1; this can be generalized to give an alternative definition of
semisimplicity: V is semisimple if and only if for every stable subspace U of V there
exists at least one complementary stable subspace W (so that V = U ⊕ W).
Some representations present in ⊕ᵢVᵢ may be equivalent to each other and so
should not be counted as distinct. This allows us to replace mutually equivalent
representations by a single equivalence class, and the sum over individual representa-
tions by a sum of inequivalent irreducible representations πᵢ, each term representing
an equivalence class. Assuming there are κ distinct simple representations of G, we
write π = ∑ᵢ^κ aᵢπᵢ (or occasionally ⊕ᵢ^κ aᵢπᵢ), where aᵢ is the multiplicity of the
irreducible representation πᵢ (the size of the corresponding equivalence class). When
expressed in terms of the underlying vector spaces, the decomposition is given by
the direct sum
V = a1 V 1 ⊕ a2 V 2 ⊕ · · · ⊕ aκ V κ , (2.19)

where each term aᵢVᵢ in this sum stands for a direct sum Vᵢ^{⊕aᵢ} = ⊕ⱼVⱼ of aᵢ repre-
sentatives in the equivalence class. The dimension of the semisimple representa-
tion is the dimension of the underlying space, dim π = ∑ᵢ^κ aᵢ dim Vᵢ.
In this section we have explained the meaning of simplicity and semisim-
plicity of representations. These properties are at the center of representation
theory and will be the subject of further study in the next sections. We will see
there that the number of simple representations of a finite group G is finite, that
the degree of each has an upper limit (|G|^{1/2}), and that representations of a finite
group G are semisimple, with a unique decomposition as in (2.19), in which the
simple representations that occur are unique, as are their multiplicities.
EXAMPLE 14: The group S2 has two elements: e = (1)(2) and a = (12). Two ob-
jects can be in either configuration |12⟩ or |21⟩, which are taken to be a basis in
a two-dimensional space. The operator (12) interchanges |12⟩ and |21⟩. In this ba-
sis, the state vectors are the standard column vectors (1, 0)ᵀ and (0, 1)ᵀ, and the
representation of S2 is given by
$$D(e) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
D(a) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Define a new basis, formed by the symmetric combination |+⟩ = |12⟩ + |21⟩
and the antisymmetric combination |−⟩ = |12⟩ − |21⟩, which are the column vectors
(1, 1)ᵀ, (1, −1)ᵀ when written in the standard basis. The group leaves |+⟩ un-
changed and changes |−⟩ by a sign. They form two invariant 1D subspaces, and
together decompose D(S2) into distinct irreps by block diagonalization:
$$D(e) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
D(a) = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
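The change of basis can be carried out explicitly. The sketch below (illustrative, not part of the text) places |+⟩ and |−⟩ as the columns of a matrix T and computes the new matrix T⁻¹D(a)T:

```python
# Block-diagonalize the 2D rep of S_2 by changing to the |+>, |-> basis.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Da   = [[0, 1], [1, 0]]            # D(a) in the basis |12>, |21>
T    = [[1, 1], [1, -1]]           # columns: |+> = |12>+|21>, |-> = |12>-|21>
Tinv = [[0.5, 0.5], [0.5, -0.5]]   # inverse of T

assert matmul(Tinv, matmul(Da, T)) == [[1.0, 0.0], [0.0, -1.0]]  # diag(1, -1)
```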

EXAMPLE 15: In the preceding example the vectors |12⟩ and |21⟩, taken to
form a basis, can be obtained by applying the group elements e = (1)(2)
and a = (12) to the vector |12⟩ (or, for that matter, |21⟩). So there is a one-to-one
correspondence between the basis vectors and the group elements. In the regu-
lar representation, the group elements themselves are used as a basis. Similarly, the
vectors |+⟩ = |12⟩ + |21⟩ and |−⟩ = |12⟩ − |21⟩ can be seen as (e + a)|12⟩ and
(e − a)|12⟩; therefore we can regard e + a and e − a as forming another basis, and
in fact each of them by itself spans an invariant 1D subspace.
EXAMPLE 16: The regular representation πR of the group C3 = ⟨a : a³ = e⟩ is
defined by
$$D(e) = \begin{pmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{pmatrix}, \quad
D(a) = \begin{pmatrix} 0&0&1 \\ 1&0&0 \\ 0&1&0 \end{pmatrix}, \quad
D(a^2) = \begin{pmatrix} 0&1&0 \\ 0&0&1 \\ 1&0&0 \end{pmatrix},$$
in the basis {e, a, a²}, or in terms of the standard column vectors (1,0,0), (0,1,0) and
(0,0,1). These matrices leave the sums 1 + a + a², 1 + ωa + ω²a², 1 + ω²a + ωa² invariant
(up to overall constants). Here ω is a primitive cubic root of 1, which satisfies

the identity 1 + ω + ω 2 = 0 (the sum of all nth roots of 1 is zero). Therefore,


these linear combinations of group elements define a new basis for the regular
representation, in which each sum by itself spans a one-dimensional invariant
subspace. The similarity transformation defined by the complex matrix S given
below together with its inverse,
   
$$S = \begin{pmatrix} 1&1&1 \\ 1&\omega&\omega^2 \\ 1&\omega^2&\omega \end{pmatrix}, \quad
S^{-1} = \frac{1}{3}\begin{pmatrix} 1&1&1 \\ 1&\omega^2&\omega \\ 1&\omega&\omega^2 \end{pmatrix},$$

relates the two bases and simultaneously diagonalizes the three matrices D(e), D(a),
D(a²), resulting in their fully reduced forms D′ = SDS⁻¹:
$$D'(e) = \begin{pmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{pmatrix}, \quad
D'(a) = \begin{pmatrix} 1&0&0 \\ 0&\omega&0 \\ 0&0&\omega^2 \end{pmatrix}, \quad
D'(a^2) = \begin{pmatrix} 1&0&0 \\ 0&\omega^2&0 \\ 0&0&\omega \end{pmatrix}.$$

This shows the complete decomposition of πR into the irreps of C3: we write
πR = α1 + α2 + α3, where the αᵢ are the three simple one-dimensional representations
of C3. This illustrates the general property that for any finite group the regular
representation contains all the irreducible representations of the group.
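The reduction can be verified numerically. The sketch below (illustrative, not part of the text) checks that the matrix S above indeed gives S D(a) S⁻¹ = diag(1, ω, ω²):

```python
# Fully reduce the regular representation of C3 with the DFT-like matrix S.
import cmath

w = cmath.exp(2j * cmath.pi / 3)

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

Da = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]              # D(a) in the basis {e, a, a^2}
S    = [[1, 1, 1], [1, w, w ** 2], [1, w ** 2, w]]
Sinv = [[c / 3 for c in row] for row in
        [[1, 1, 1], [1, w ** 2, w], [1, w, w ** 2]]]

Dprime = matmul(S, matmul(Da, Sinv))
target = [[1, 0, 0], [0, w, 0], [0, 0, w ** 2]]     # diag(1, omega, omega^2)
assert all(abs(Dprime[i][j] - target[i][j]) < 1e-12
           for i in range(3) for j in range(3))
```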
EXAMPLE 17: We have seen that D3 has three irreducible representations in all:
two 1D representations, α1 : r, s ↦ 1, 1 and α2 : r, s ↦ 1, −1, and
one 2D representation, α3.
Any other 2D representation it may have must be semisimple, and may be con-
structed by summing α1 and α2. There are four possibilities: α1 + α1, α2 + α2,
α1 + α2, and α2 + α1. The last two being equivalent, there are thus, in all, four two-
dimensional reps, of which only α3 is simple.

  D3 reps      dim    r               s                characteristics
  α1 + α1 :    2      I               I                identity
  α1 + α2 :    2      I               diag(1, −1)      semisimple
  α2 + α2 :    2      I               −I               semisimple
  α3 :         2      diag(ω, ω²)     antidiag(1, 1)   simple

where ω³ = 1, I is the 2 × 2 identity matrix, and antidiag(1, 1) denotes the matrix
with 1's on the antidiagonal. 

2.4 Unitary Representations


We recall that in a complex vector space V defined with an inner product B (, ) a
linear operator U is said to be unitary (or orthogonal, if real) with respect to B if (i)
U is invertible, and (ii) B (Uψ, Uφ) = B ( ψ, φ ) for all vectors ψ, φ ∈ V . (Note that

‘inner product’ always implies positive-definiteness.) Unitary operators play a


special role in the algebra of complex vector spaces because of two useful prop-
erties they possess: the lengths and scalar products of vectors on which they act
are preserved (Sec. 2.1 #18), and they are always diagonalizable (Sec. 2.1 #27).
The main purpose of this section is to prove that reducible representations of a
finite group are semisimple. For this it is convenient to go over to a representa-
tion called unitary; here we do not restrict ourselves to complex vectors only, but
rather allow the field F to be R or C. The bilinear form B (, ) is then meant to be
symmetric if F = R, and hermitian if F = C.
Definition 2.4 (Unitary representation). A representation (π, V) of a finite group
G over F is said to be unitary with respect to some inner product { , } in V if { , } is
invariant under G, that is, if {π(g)x, π(g)y} = {x, y} for all g ∈ G and x, y ∈ V.
As (π, V) is not expected in general to be unitary with respect to (wrt) a given
inner product, our aim is to find a G-invariant bilinear form wrt which it is. Let
B( , ) = ⟨ | ⟩ be any (positive-definite) inner product. Then define { , } as the aver-
age of B(x, y) over all elements of G:
$$\{x, y\} \overset{\text{def}}{=} \bar{B}(x, y) = |G|^{-1} \sum_{h \in G}
\langle \pi(h)x \,|\, \pi(h)y \rangle, \quad x, y \in V. \qquad (2.20)$$

{ , } = B̄ satisfies the scalar-product axioms (Sec. 2.1 #11) and is G-invariant, that
is, B̄(gx, gy) = B̄(x, y), as we can see here:
$$\{\pi(g)x, \pi(g)y\} = |G|^{-1} \sum_{h \in G}
\langle \pi(h)\pi(g)x \,|\, \pi(h)\pi(g)y \rangle
= |G|^{-1} \sum_{h' \in G} \langle \pi(h')x \,|\, \pi(h')y \rangle
= \{x, y\}, \qquad (2.21)$$

where h′ = hg covers the same range as h. Thus, averaging any positive-definite


inner product over G produces a new inner product invariant under G.
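The averaging trick of Eq. (2.20) can be demonstrated on a small example. The sketch below (illustrative, not part of the text) takes a non-unitary real representation of C2, obtained by conjugating diag(1, −1) with a shear, encodes the averaged form by the matrix M = |G|⁻¹ ∑ₕ A(h)ᵀA(h), and checks its G-invariance A(g)ᵀMA(g) = M:

```python
# Average the standard inner product over G = C2 = {e, a} to obtain a
# G-invariant inner product for a non-orthogonal representation.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

T    = [[1, 1], [0, 1]]                           # a shear (not orthogonal)
Tinv = [[1, -1], [0, 1]]
Aa = matmul(T, matmul([[1, 0], [0, -1]], Tinv))   # A(a) = [[1, -2], [0, -1]]
assert matmul(Aa, Aa) == [[1, 0], [0, 1]]         # a^2 = e

AtA = matmul(transpose(Aa), Aa)
M = [[([[1, 0], [0, 1]][i][j] + AtA[i][j]) / 2 for j in range(2)]
     for i in range(2)]                           # averaged form, M = [[1,-1],[-1,3]]

assert matmul(transpose(Aa), matmul(M, Aa)) == M  # the new form IS invariant
assert AtA != [[1, 0], [0, 1]]                    # the original one is not
```

The same construction with A(h)†A(h) works verbatim over C, reproducing (2.20).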
Although the inner product { , } might look contrived in the original vector basis,
it is quite natural in another basis of V. Let {eᵢ} be a basis orthonormal wrt ⟨ | ⟩, and
{uᵢ} a basis orthonormal wrt { , }, i.e. ⟨eᵢ|eⱼ⟩ = δᵢⱼ and {uᵢ, uⱼ} = δᵢⱼ (this entails
no loss of generality). Call S the similarity transformation relating the two bases, so
that eᵢ = Suᵢ for i = 1, 2, . . . , dim V. For any x = ∑ᵢ xᵢuᵢ and y = ∑ⱼ yⱼuⱼ in V we have
$$\langle Sx|Sy \rangle = \sum \langle Su_i x_i \,|\, Su_j y_j \rangle
= \sum \langle e_i x_i \,|\, e_j y_j \rangle
= \sum \{u_i x_i, u_j y_j\} = \{x, y\}. \qquad (2.22)$$

Thus, redefining the scalar product so that the representation is unitary is equiv-
alent to changing the basis in the underlying vector space by the transformation
S defined by

h Sx|Syi = | G |−1 ∑ g∈G h π ( g) x|π ( g)yi for all x, y ∈ V . (2.23)



EXAMPLE 18: Let V be the space of complex-valued functions f(x) on a finite set
X, and G a finite group acting by permutation on X. Then it also acts on V by
(2.8). It follows that (f₁, f₂) = ∑_{x∈X} f₁*(x) f₂(x) is an invariant inner product: the
representation in question is unitary. 
The importance of unitary representations lies in the following two results:
Theorem 2.1. A reducible unitary representation is completely reducible.
PROOF: Let π be a reducible unitary representation of G in a vector space V over
R or C with inner product { , }. Let U be a stable subspace of V (if u ∈ U then
π(g)u ∈ U for all g ∈ G), and W its orthogonal complement wrt { , }: if v ∈ W then
{u, v} = 0 for every u ∈ U. By assumption {π(g)x, π(g)y} = {x, y} for all g ∈ G,
x, y ∈ V. We have to prove that W is stable too.
For g ∈ G and y ∈ W we have {π(g⁻¹)x, y} = 0 for all x ∈ U, since π(g⁻¹)x ∈ U;
hence {x, π(g)y} = {π(g)π(g⁻¹)x, π(g)y} = {π(g⁻¹)x, y} = 0 for all x ∈ U. This says
that π(g)y ∈ W if y ∈ W, i.e. W is stable. Hence V = U ⊕ W and π is semisimple.
Note that we have used the alternative definition of semisimplicity: every stable
subspace U ⊂ V must have a stable complement W; using the notion of an invariant
form on V, we take W to be the orthogonal complement of U wrt this form. 
Theorem 2.2. Every representation of a finite group on an inner-product space can be
made equivalent to a unitary representation by a similarity transformation.
PROOF: Given a finite group G and an arbitrary representation π(G) in a space
V with inner product ⟨ , ⟩, we need to find a non-singular operator S such that
ρ(g) ≝ Sπ(g)S⁻¹ is unitary for every g ∈ G. We can choose S to be an operator
satisfying (2.23); such an operator exists, as shown by the construction described
above. We now show that ρ(g) = Sπ(g)S⁻¹ is unitary for this choice of S: For
any x, y ∈ V and any g ∈ G, we have
$$\begin{aligned}
\langle \rho(g)x \,|\, \rho(g)y \rangle
&= \langle S\pi(g)S^{-1}x \,|\, S\pi(g)S^{-1}y \rangle \\
&= |G|^{-1} \sum_{g' \in G} \langle \pi(g')\pi(g)S^{-1}x \,|\, \pi(g')\pi(g)S^{-1}y \rangle \\
&= |G|^{-1} \sum_{g'' \in G} \langle \pi(g'')S^{-1}x \,|\, \pi(g'')S^{-1}y \rangle \\
&= \langle SS^{-1}x \,|\, SS^{-1}y \rangle = \langle x|y \rangle.
\end{aligned} \qquad (2.24)$$

The first step comes from the definition of ρ, the second and fourth equalities fol-
low from (2.23), and the third from permuting the group elements (g″ = g′g). Since
the invariance ⟨ρ(g)x|ρ(g)y⟩ = ⟨x|y⟩ holds for arbitrary x, y ∈ V and g ∈ G, we
conclude that ρ(G) ∼ π(G) is a unitary representation wrt the standard form ⟨ | ⟩. 
In sum, h, i is G-invariant in ρ, whereas { , } is G-invariant in π. Whether one
uses π together with { , } , or ρ together with h, i is a matter of choice. These results
have immediate consequences for arbitrary representations of finite groups:
(i) Every representation of a finite group over F = C (or R) can always be chosen to
be unitary (orthogonal, if F = R). Because if given one that is not, we can
find by Theorem 2.2 a unitary representation equivalent to it by a similarity
transformation.

(ii) Every representation of a finite group over R or C is either irreducible (simple) or


completely reducible (semisimple). This is so because by Theorem 2.2 every
representation of a finite group is equivalent to a unitary representation,
which, if reducible, is semisimple by Theorem 2.1.
EXAMPLE 19: Consider the irreducible representation of D4 over C in the vector space
{x, y} ⊂ C² defined by
$$r \mapsto \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad
s \mapsto \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

The general hermitian form on C² is ax*x + by*y + cx*y + c*y*x, where a, b ∈ R,
c ∈ C. As an example, we take B(x, y) = 2x*x + y*y − ix*y + iy*x. D4 has 8 elements,
namely e, r, r², r³, s, sr, sr², sr³. Writing B(g(x, y)) = B(gx, gy), we have

B(e(x, y)) = B(x, y) = 2x*x + y*y − ix*y + iy*x,
B(r(x, y)) = B(ix, −iy) = 2x*x + y*y + ix*y − iy*x,
B(r²(x, y)) = B(−x, −y) = 2x*x + y*y − ix*y + iy*x,
B(r³(x, y)) = B(−ix, iy) = 2x*x + y*y + ix*y − iy*x,
B(s(x, y)) = B(y, x) = x*x + 2y*y + ix*y − iy*x,
B(sr(x, y)) = B(iy, −ix) = x*x + 2y*y − ix*y + iy*x,
B(sr²(x, y)) = B(−y, −x) = x*x + 2y*y + ix*y − iy*x,
B(sr³(x, y)) = B(−iy, ix) = x*x + 2y*y − ix*y + iy*x.

Adding and averaging over D4, we obtain B̄(x, y) = {x, y} = (3/2)(x*x + y*y), a
manifestly hermitian invariant form in C². This result reflects the general rule that
a simple representation space carries a unique invariant hermitian form, up to a
scalar multiple.
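The averaging in this example can be checked by machine. The sketch below (illustrative, not part of the text) encodes B by the hermitian matrix H = [[2, −i], [i, 1]], so that B = v†Hv for v = (x, y), and averages A(g)†HA(g) over the eight elements of D4, recovering (3/2) times the identity:

```python
# Average the hermitian form of Example 19 over D4: the result is (3/2) I.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

R = [[1j, 0], [0, -1j]]
S = [[0, 1], [1, 0]]

elems, P = [], [[1, 0], [0, 1]]
for _ in range(4):
    elems.append(P)               # e, r, r^2, r^3
    elems.append(matmul(S, P))    # s, sr, sr^2, sr^3
    P = matmul(R, P)

H = [[2, -1j], [1j, 1]]           # B(x, y) = 2x*x + y*y - ix*y + iy*x
avg = [[sum(matmul(dagger(A), matmul(H, A))[i][j] for A in elems) / 8
        for j in range(2)] for i in range(2)]
assert all(abs(avg[i][j] - (1.5 if i == j else 0)) < 1e-12
           for i in range(2) for j in range(2))
```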

2.5 Schur’s Lemma


As we have seen, we can always express reducible representations of a finite
group as direct sums of irreducible representations, and we may then consider
irreducible representations to be the basic building blocks of all representations of
finite groups. Therefore it is important to have some way of identifying them. The
criteria we need are implied in a fundamental theorem concerning the orthogo-
nality of irreducible representations, which we will discuss in the next section. Its
proof relies on Schur's Lemma (a statement about finite groups, but one admit-
ting generalizations to Lie groups and Lie algebras), which we discuss presently.
It has two parts, one concerning two irreducible representations, and the other a
single one.
Theorem 2.3 (Schur). Let ( π, V ) and ( ρ, W ) be irreducible representations of a finite
group G in complex vector spaces V and W .

(a) If ϕ : V → W is a C [ G ] homomorphism (so that ϕπ ( g) = ρ ( g) ϕ for all g ∈ G),


then either ϕ = 0, or ϕ is a C [ G ] isomorphism.
(b) If ϕ : V → V is a C [ G ] homomorphism (so that ϕπ ( g) = π ( g) ϕ for all g ∈ G),
then ϕ is a scalar map (meaning ϕ = λIV where λ ∈ C and IV is the identity map on
vector space V ).

PROOF: (a) follows from the fact that the kernel Ker ϕ and the image Im ϕ
of ϕ are subspaces stable under G. This can be seen as follows. First, Ker ϕ =
{x ∈ V | ϕ(x) = 0} is an invariant subspace of V with respect to π(G), because
given any x ∈ Ker ϕ we have ϕ(π(g)x) = ρ(g)ϕ(x) = ρ(g)0 = 0 for all g ∈ G,
which means π(g)x ∈ Ker ϕ if x ∈ Ker ϕ. As π(G) is irreducible, Ker ϕ must
be either zero (in which case ϕ(x) ≠ 0 for any nonzero x ∈ V), or the whole of V (in
which case ϕ = 0). Secondly, Im ϕ = {y ∈ W | y = ϕ(x) for some x ∈ V}
is a subspace of W stable under ρ(G), because given any y ∈ Im ϕ, the vector
ρ(g)y = ρ(g)ϕ(x) = ϕ(π(g)x), for all g ∈ G, is in Im ϕ. As ρ(G) is an irreducible
representation, we have either Im ϕ = 0 (in which case ϕ = 0), or Im ϕ = W (in
which case ϕ is surjective).
Now, from the fact that Ker ϕ and Im ϕ are G-stable subspaces, we see that
either ϕ is the trivial zero map (Im ϕ = 0, Ker ϕ = V), or an invertible map (Im ϕ =
W, Ker ϕ = 0), i.e. a C[G] isomorphism, which defines the equivalence of the two
simple representations via conjugation, ρ(G) = ϕ π(G) ϕ⁻¹.
(b) As ϕ is a linear map on the complex space V, it has an eigenvalue λ ∈ C
(cf. Sec. 2.1 #24). Let ψ = ϕ − λI_V; then its kernel Ker ψ = {x ∈ V | ψx = 0}
= {x ∈ V | ϕ(x) − λx = 0} must be nonzero. So Ker ψ is a nonzero subrepre-
sentation of G on V. But by assumption V is irreducible, so Ker ψ = V. This
means Ker(ϕ − λI_V) = V, i.e. ϕv − λv = 0 for all v ∈ V, or equivalently ϕ = λI_V.

Part (a) of Theorem 2.3 has no converse, but Part (b) has one, which reads

Theorem 2.4 (Converse of Schur (b)). Let ( π, V ) be a representation of a finite group


G in complex vector spaces V . If every C [ G ] homomorphism V → V is scalar, then
( π, V ) is irreducible.

PROOF: Suppose every C[G] homomorphism V → V is scalar, but V is re-
ducible, decomposable into a direct sum of subrepresentations as in (2.19):
V = U ⊕ W. Then every v ∈ V may be uniquely written v = u + w with u ∈ U
and w ∈ W. Define the projection map P such that Pv = u for all v ∈ V. Then for
every g ∈ G we have Pπ(g)v = P(π(g)u + π(g)w) = π(g)u (U and W being
subrepresentations), and hence Pπ(g)v = π(g)u = π(g)Pv. So P is a C[G] homomor-
phism, but it is not scalar (being the projection onto the proper nonzero subspace
U, it is neither 0 nor I_V), which is a contradiction. Hence, V must be
irreducible. 
Schur’s Lemma can also be re-stated in terms of linear operators or matrices
in some arbitrary given basis. In particular, Theorem 2.3 (b) and its converse,
which relate irreducible representations and scalar transformations, may take the
following if-and-only-if form:
Theorem 2.5 (NSC of Schur-C). Let G be a finite group and V a vector space over C.
Then, a representation ( π, V ) of G is irreducible if and only if every linear transformation
A : V → V that satisfies the commutation relation Aπ ( g) = π ( g) A for all g ∈ G is
scalar.
EXAMPLE 20: Let G be a finite abelian group and (π, V) a representation of G over
C. Given any fixed element a ∈ G, we have π(a)π(g) = π(g)π(a) for all g ∈ G
because G is abelian. Assuming now (π, V) irreducible, Theorem 2.5 tells us that
each π(a) is scalar; letting a range over the entire group, every subspace of V is
then invariant, so V must be one-dimensional. So,
• Every irrep of a finite abelian group over C is one-dimensional.
This means in particular that all irreps of Cn and V ≅ D2 must be one-dimensional.
Indeed, in the examples we have examined, we have found that V ≅ D2 has four 1D
representations and Cn has n 1D representations, and by Theorem 2.5 they must be irreducible.
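This counting can be made concrete. A small numerical sketch (an assumed example, not taken from the text): list the n one-dimensional irreps π_ℓ(a^k) = exp(i2πℓk/n) of Cn, check that each is a homomorphism, and that their characters are pairwise distinct, so the irreps are inequivalent.

```python
import cmath

# Sketch (assumed example): the n one-dimensional irreps of the cyclic group
# Cn = <a; a^n = e> are pi_l(a^k) = exp(2*pi*i*l*k/n).
n = 5

def pi_rep(l, k):
    """Value of the l-th irrep on the group element a^k."""
    return cmath.exp(2j * cmath.pi * l * k / n)

# homomorphism property: pi_l(a^j) pi_l(a^k) = pi_l(a^(j+k))
for l in range(n):
    for j in range(n):
        for k in range(n):
            assert abs(pi_rep(l, j) * pi_rep(l, k) - pi_rep(l, j + k)) < 1e-9

# the n characters are pairwise distinct, so the irreps are inequivalent
chars = [tuple(complex(round(pi_rep(l, k).real, 6), round(pi_rep(l, k).imag, 6))
               for k in range(n)) for l in range(n)]
assert len(set(chars)) == n
```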
C OMMENTS .
(a) Schur’s lemma can be extended to infinite (Lie) groups. It is in particular
useful with Lie algebras in which form it finds many applications in quantum
mechanics.
(b) Theorem 2.3 (a) holds for representation spaces over R too, and so it holds
in general for F = R or C: Let ( π, V ) and ( ρ, W ) be irreducible representations of
a finite group G over a base field F; if ϕ : V → W is an F [ G ] homomorphism, then
either ϕ = 0, or ϕ is an F [ G ] isomorphism. But as it is, Theorem 2.3 (b) does not
hold over R because a linear operator on a real vector space may not have any
real eigenvalues at all (see Section 2.1 #24), and one must insert an extra condi-
tion: If ϕ : V → V is an R [ G ] homomorphism with a real eigenvalue, then ϕ is scalar.
This condition is certainly satisfied if the linear operator of interest is self-adjoint
on an inner-product space over either F = R or C (cf. Section 2.1 #27), and so
Theorem 2.5 can be modified to read
Theorem 2.6 (NSC of Schur-F). Let G be a finite group and V an inner-product space
over F = R or C. Then, a representation ( π, V ) of G is irreducible if and only if every
self-adjoint linear transformation A : V → V that satisfies the relation Aπ ( g) = π ( g) A
for all g ∈ G is scalar.
It is in this form that Schur’s Lemma is most often used in practice.
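As a numerical illustration of this use (an assumed concrete example, not from the text): averaging an arbitrary matrix A over the two-dimensional irrep of D3, realized here by a 120-degree rotation and a reflection, produces a matrix that commutes with the whole representation, and Schur's Lemma then forces it to be scalar.

```python
import numpy as np

# Sketch of the averaging trick (assumed example): project an arbitrary matrix A
# onto the commutant of the 2-dimensional irrep of D3; by Schur's Lemma the
# result must be a multiple of the identity.
c, s = np.cos(2 * np.pi / 3), np.sin(2 * np.pi / 3)
r = np.array([[c, -s], [s, c]])          # rotation by 120 degrees
f = np.array([[1.0, 0.0], [0.0, -1.0]])  # a reflection
group = [np.eye(2), r, r @ r, f, f @ r, f @ r @ r]   # the six elements of D3

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2))
# B = (1/|G|) sum_g pi(g) A pi(g)^{-1} commutes with every pi(h)
B = sum(g @ A @ np.linalg.inv(g) for g in group) / len(group)

for h in group:
    assert np.allclose(h @ B, B @ h)         # B is in the commutant
assert np.allclose(B, B[0, 0] * np.eye(2))   # and therefore scalar
```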

2.6 Matrices and Characters


This section is devoted to a study of matrices and characters of finite representa-
tions of finite groups, with focus on the central role of their orthogonality prop-
erty. An important outcome of this discussion is a complete characterization of
simple representations (criteria for identifying them, their number in a given
group, their sizes, their multiplicities in any representation). See [FH] Chap. 2,
[Ham] Chap. 3, and [Tu] Chap. 3.
Characters. Let ( π, V ) be a finite representation of a finite group G. For each
g ∈ G, π ( g) denotes an invertible linear operator on V represented by a matrix
Dπ ( g) in some chosen basis of order equal to dim V , over base field C (includ-
ing restriction to R when necessary). Although Dπ ( G ) is needed for a complete
description of G on V , or any physical system it could represent, it is an unnec-
essarily unwieldy tool just for identifying simple representations or determining
an arbitrary representation. For this limited purpose, a scalar function encoding
all the information about a representation would suffice, and, in fact, it exists:
Definition 2.5. The character of a finite representation ( π, V ) of a group G over F is
the function χ : G → F defined by the trace of π ( g) on V : χπ ( g) = Tr ( π ( g)) for any
g ∈ G. A simple character is the character of a simple, or irreducible, representation.
Many properties of the character follow from those of the trace. For example,
the cyclicity property Tr ( ABC ) = Tr ( BCA ) implies that χ ( hgh−1) = χ ( g), that is,
χ is constant in each conjugacy class of G; such a function is called a class function.
Another consequence of cyclicity is χ ( g) = Tr (D( g)) = ∑i Dii ( g): χ does not
depend on the basis in which it is calculated. (When there are no ambiguities, the
representation label π in χπ or Dπ is omitted.)
There are two important special cases: (i) χ(e) = dim V; and (ii) χ_1(g) = 1 for
any g ∈ G in the trivial one-dimensional representation (π = 1). We also have

    χ(g⁻¹) = χ(g)    if F = R,
    χ(g⁻¹) = χ(g)*   if F = C.      (2.25)

To show this result, we assume F = C and π unitary (if it is not, we pass to an
equivalent unitary representation). As π(g) is unitary, it has a complete set of
eigenvectors which can serve as a basis for V, so that we may write
χ(g) = ∑_{i=1}^m λ_i, where m = dim V. In a finite group, any g has finite order
and gⁿ = e for some n, so that λ_iⁿ = 1 for i = 1, ..., m, which means |λ_i| = 1,
or λ_i⁻¹ = λ_i* for i = 1, ..., m. It follows that
∑_{i=1}^m λ_i⁻¹ = ∑ λ_i* = (∑ λ_i)*. Hence we have the stated result
χ(g⁻¹) = χ(g)*. On the other hand, if F = R, so that D(g) is a real matrix with
Dⁿ = I, we can first regard D as a complex matrix, with Tr(D⁻¹) = (Tr D)*.
Restriction to R then produces Tr(D⁻¹) = Tr D.
Finally, recalling that the sum representation π + ρ is defined over the direct
sum V π ⊕ V ρ, we have χπ+ρ = χπ + χρ. We shall deal later with semisimple
representations π = ∑α aα α, for which the character is χπ = ∑α aα χα .

Orthogonal Matrices. Let G be a finite group of order |G|, and consider the
irreducible representations (α, V^α) and (β, V^β) over C. For some linear map
B : V^α → V^β, define the map A : V^α → V^β by

    A = ∑_{g∈G} β(g) B α(g⁻¹) .      (2.26)

For each h ∈ G, we calculate

    β(h) A = ∑_g β(h) β(g) B α(g⁻¹) α(h⁻¹) α(h)
           = ∑_g β(hg) B α((hg)⁻¹) α(h) = A α(h) ,      (2.27)

where we have used the fact that hg, with fixed h, ranges over G just as g does.
Applying Schur's Lemma, we have either (i) α ∼ β, in which case β(h)A = Aα(h) for all
h ∈ G implies A = λI with a scalar λ depending on B; or (ii) α ≁ β, in which case
A = 0. (We shall use the symbol δ_{α,β} to mean 1 if α ∼ β, and 0 if α ≁ β.)
In an orthonormal basis of V^α, α(g) ↦ D^α_{ij}(g) with 1 ≤ i, j ≤ d_α = dim V^α, and
the result ∑_g β(g) · B · α(g⁻¹) = λ I δ_{αβ} may be written in matrix form:

    ∑_{g∈G} D^β_{mj}(g) D^α_{iℓ}(g⁻¹) = λ_{ij} δ_{ℓm} δ_{αβ} ,      (2.28)

when B is chosen as a matrix with a single nonzero entry: B_{ji} = 1 for some i, j.
The constant λ_{ij} can be calculated by letting α = β, ℓ = m, and summing over ℓ,
resulting in λ_{ij} = δ_{ij} |G|/d_α. We may now write the equation in the form:

    (d_α/|G|) ∑_{g∈G} D^α_{iℓ}(g⁻¹) D^β_{mj}(g) = δ_{ℓm} δ_{ij} δ_{αβ} ,   1 ≤ i, j, ℓ, m ≤ d_α .      (2.29)

Restricting to the base field C allows us to go to an equivalent unitary
representation, where D^α_{iℓ}(g⁻¹) = D^{α*}_{ℓi}(g), and to write the result as

    (d_α/|G|) ∑_{g∈G} D^{α*}_{ℓi}(g) D^β_{mj}(g) = δ_{ℓm} δ_{ij} δ_{αβ} ,   1 ≤ i, j, ℓ, m ≤ d_α .      (2.30)
We may interpret the √(d_α/|G|) D^α_{ij} in this equation as vectors labeled by
(α, i, j) in a |G|-dimensional vector space; they are mutually orthogonal, and so
also linearly independent. Since the number of linearly independent vectors in a
linear space cannot exceed the dimension of that space (Section 2.1 #3), we have

    ∑_{α=1}^κ d_α² ≤ |G| ,      (2.31)

where κ is the number of inequivalent irreducible representations of the group. We


see that the non-equivalent irreducible representations of a finite group G are finite in
number and in size (κ ≤ | G |, dα ≤ | G |1/2).
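The orthogonality relation (2.30) can be spot-checked numerically. The sketch below (an assumed concrete example) runs over the real, orthogonal 2-dimensional irrep of D3, generated by a 120-degree rotation and a reflection, and verifies (2.30) entry by entry:

```python
import numpy as np

# Entry-by-entry check of (2.30) for the real 2-dimensional irrep of D3
# (assumed concrete matrices; real, so conjugation is trivial).
c, s = np.cos(2 * np.pi / 3), np.sin(2 * np.pi / 3)
r = np.array([[c, -s], [s, c]])
f = np.array([[1.0, 0.0], [0.0, -1.0]])
group = [np.eye(2), r, r @ r, f, f @ r, f @ r @ r]
d, order = 2, len(group)

for i in range(d):
    for j in range(d):
        for l in range(d):
            for m in range(d):
                lhs = (d / order) * sum(np.conj(g[l, i]) * g[m, j] for g in group)
                rhs = float(l == m) * float(i == j)
                assert abs(lhs - rhs) < 1e-12
```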

Orthogonal Characters. A similar relation for the simple characters can be obtained
from (2.30) by letting ℓ = i and m = j and summing over i, j = 1, ..., d_α, with
the result:

    (1/|G|) ∑_{g∈G} χ^{α*}(g) χ^β(g) = δ_{αβ} .      (2.32)

As already mentioned, characters are class functions on G, meaning that if g, g′
belong to the same conjugacy class, then χ(g) = χ(g′). Assume the group G has n_c
conjugacy classes [μ], each of size c_μ, and rewrite the sum over the group
elements in (2.32) as a sum over the conjugacy classes:

    (1/|G|) ∑_{μ=1}^{n_c} c_μ χ^{α*}_μ χ^β_μ = δ_{αβ} ,      (2.33)
where χ^α_μ = χ^α(g) for any representative g ∈ [μ]. (Components of a character χ^α,
of the α representation, are labeled by g or μ, as in χ^α(g) or χ^α_μ.)
An interpretation similar to that of (2.30) can be given: each √(c_μ/|G|) χ^α_μ is
regarded as the μ-component of an n_c-dimensional vector labeled by α; all such
vectors χ^α are mutually orthogonal (and therefore linearly independent) in an
n_c-dimensional space. Since the number of linearly independent vectors cannot
exceed the space dimension, the number of vectors must be bounded from above: The
number of distinct simple representations of G is less than or equal to the number
of conjugacy classes:

κ ≤ nc . (2.34)

This interpretation suggests that we may define the inner product of any two
class functions ξ and η in the class function space of a finite group G as follows:

    ⟨ξ, η⟩ := (1/|G|) ∑_{g∈G} ξ*(g) η(g) = (1/|G|) ∑_{μ=1}^{n_c} c_μ ξ*_μ η_μ .      (2.35)

Then the statement that the simple characters of a finite group are orthonormal
with respect to the inner product (2.35) has a concise expression:

    ⟨χ^α, χ^β⟩ = δ_{αβ} .      (2.36)

COMMENT. If the representation is defined over R rather than C, one replaces
D^{α*}_{ℓi}(g) with D^α_{ℓi}(g), χ*_μ with χ_μ, and ξ*_μ with ξ_μ in the above relations.

Implications of Orthogonality. The above discussion implies several important


facts to be noted:
(1) A representation π is irreducible (simple) if and only if ⟨χ, χ⟩ = 1. This is
evident from the fact that, for a given π = ∑_α a_α α, we have ⟨χ, χ⟩ = ∑_α a_α².
It follows that π is irreducible if and only if all the multiplicities are zero
except one, which equals 1. This is an alternative criterion for the simplicity of
a representation.
(2) Given any semisimple representation π = ∑_α a_α α, with underlying vector
space V = ⊕_α V^{α⊕a_α}, where α = 1, ..., κ labels the distinct irreps of G, its
character is χ = ∑_α a_α χ^α. Conversely, any representation is determined up to
equivalence by its character, since the multiplicities that define π in terms of
the known irreps α of G are the inner products of χ with χ^α, i.e. a_α = ⟨χ^α, χ⟩,
as in (2.35).
(3) Two representations π and ρ are equivalent if and only if their characters are
equal. Assuming π ∼ ρ, there is some linear map P such that ρ(g) = Pπ(g)P⁻¹ for
all g ∈ G. The equality of the characters, χ^ρ(g) = χ^π(g) for all g ∈ G, follows
from the cyclicity property of the trace. Conversely, assuming χ^π = χ^ρ, we
calculate the multiplicities of the simple representations that occur in π:
a^π_α = ⟨χ^α, χ^π⟩ = ⟨χ^α, χ^ρ⟩ = a^ρ_α. As a^π_α = a^ρ_α for all α, we have π ∼ ρ.
Let us turn now to the characters in the regular representation R of G. We recall
that it is defined by the action G × G → G, so that the representation space is
the group itself. Each representation matrix D^R(g) for g ≠ e has zero diagonal
entries, and contains a single nonzero off-diagonal element, equal to 1, in each
row or column, while D^R_{kj}(g) = δ_{kj} for g = e. So the components of the
character χ^R are simply: χ^R(e) = |G| and χ^R(g) = 0 for g ≠ e. This reflects the
fact that e leaves every element fixed, while every g ≠ e moves every element of G.
From (2.35) we can calculate the norm of χ^R:

    ⟨χ^R, χ^R⟩ = |G|⁻¹ ∑_g |χ^R(g)|² = |G|⁻¹ |χ^R(e)|² = |G| .

This result, ⟨χ^R, χ^R⟩ = |G|, indicates that the regular representation is not
simple, except in the trivial case G = ⟨e⟩, and so must be semisimple, with the
character χ^R expressible in terms of the simple characters χ^α of the group:
χ^R = ∑_α a^R_α χ^α. With this expression, the norm now becomes

    ⟨χ^R, χ^R⟩ = ∑_{αβ} a^R_α a^R_β ⟨χ^α, χ^β⟩ = ∑_α (a^R_α)² ,

where we have used the orthonormality of the simple characters. The multiplicity
can now be evaluated:

    a^R_α = ⟨χ^α, χ^R⟩ = |G|⁻¹ ∑_g χ^{α*}(g) χ^R(g) = |G|⁻¹ χ^{α*}(e) χ^R(e) .

With χ^α(e) = d_α and χ^R(e) = |G|, we obtain a^R_α = d_α. This gives us a very
important relation, χ^R = ∑_α d_α χ^α, which tells us that the regular representation
of G contains all of its irreducible representations, each in d_α copies.
Again, as |G| is finite, there is only a finite number of irreducible representations
of G. Altogether the above results lead to a 'counting' relation:

    ∑_{α=1}^κ d_α² = |G| .      (2.37)
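These relations can be checked mechanically from a character table. A sketch using the simple characters of D3 ≅ S3 (as derived in Example 23 below): the multiplicity of each irrep in the regular representation comes out equal to its dimension, and the counting relation (2.37) holds.

```python
# Character check of the regular representation, using the simple characters
# of D3 (classes [e], [s], [r] of sizes 1, 3, 2; table derived in Example 23).
sizes = [1, 3, 2]
order = sum(sizes)                                   # |G| = 6
chi = {1: [1, 1, 1], 2: [1, -1, 1], 3: [2, 0, -1]}   # simple characters
chi_R = [order, 0, 0]                                # chi_R(e) = |G|, 0 elsewhere

# multiplicity of irrep alpha in R equals its dimension d_alpha
for ch in chi.values():
    mult = sum(c * x * y for c, x, y in zip(sizes, ch, chi_R)) / order
    assert mult == ch[0]

# counting relation (2.37): sum of d_alpha^2 = |G|
assert sum(ch[0] ** 2 for ch in chi.values()) == order
```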

Before examining its implications, we want to write down another result to be
discussed shortly. With χ^α(e) = d_α, the equation χ^R = ∑_α d_α χ^α may be
expressed as

    (1/|G|) ∑_α χ^{α*}(e) χ^α(g) = δ_{g,e} .      (2.38)

This reduces to (2.37) for g = e, and to ∑_α χ^{α*}(e) χ^α(g) = 0 for g ≠ e. With
this equation, one may calculate any one character in terms of the others.

Completeness. We have seen before that the √(d_α/|G|) D^α_{ij} may be interpreted
as |G|-component mutually orthogonal vectors. Equation (2.37) states that there
are as many such vectors as there are dimensions: they fill out the entire
|G|-dimensional space. This completeness property of the irreducible representation
matrices finds its expression in the statement

    ∑_{α=1}^κ ∑_{i,j=1}^{d_α} (d_α/|G|) D^α_{ij}(g) D^{α*}_{ij}(g′) = δ_{gg′} .      (2.39)

Together with (2.30), it completely characterizes the simple representations.
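A minimal numerical check of (2.39), for C3, where every irrep is the 1×1 matrix D^ℓ(a^k) = exp(i2πℓk/3) and every d_ℓ = 1 (characters from Example 21):

```python
import cmath

# Completeness relation (2.39) for C3: (1/3) sum_l D^l(a^k) D^l(a^k')* = delta.
n = 3

def D(l, k):
    return cmath.exp(2j * cmath.pi * l * k / n)

for k in range(n):
    for kp in range(n):
        total = sum(D(l, k) * D(l, kp).conjugate() for l in range(n)) / n
        assert abs(total - float(k == kp)) < 1e-12
```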


A similar relation exists for the simple characters, as we now show. Let α be an
irreducible representation of G, and [μ] some conjugacy class of G; then consider
the sum

    A^α_μ = ∑_{h∈[μ]} α(h) .      (2.40)

Recalling that the conjugacy class of an element h ∈ G consists of all distinct
ghg⁻¹ for g ∈ G, we see that A^α_μ is invariant under G: α(g) A^α_μ α(g⁻¹) = A^α_μ
for all g ∈ G, and so by Schur's Lemma, A^α_μ = λ I^α, where I^α is the identity
in the representation space V^α of dimension d_α. The constant λ can be determined
by taking the trace in V^α of both sides of this equation, leading to λ d_α = c_μ χ^α_μ,
where c_μ is the size (number of elements) of the conjugacy class [μ]. In matrix
form, we have

    ∑_{g∈[μ]} D^α(g) = (c_μ/d_α) χ^α_μ I^α .      (2.41)

Returning now to (2.39) (where α again stands for the simple representations), we
sum g over class [μ] and g′ over class [ν] on both sides of that equation. On the
right-hand side, we get c_μ δ_{μν}, and on the left-hand side, using (2.41),

    ∑_{αij} (d_α/|G|) (c_μ/d_α) χ^α_μ δ_{ij} (c_ν/d_α) χ^{α*}_ν δ_{ij} = (c_μ c_ν/|G|) ∑_α χ^α_μ χ^{α*}_ν .

It follows that

    (c_μ/|G|) ∑_{α=1}^κ χ^{α*}_ν χ^α_μ = δ_{μν} .      (2.42)

In the special case [μ] = [e], this equation reduces to (2.37) and (2.38). More
significantly, it shows that the n_c nonzero κ-dimensional vectors √(c_μ/|G|) χ^α_μ
(with vector label μ and component label α) are orthogonal to one another, so n_c ≤ κ.
This inequality, together with (2.34), gives us κ = n_c: the number of simple
representations of G equals the number of its conjugacy classes. To summarize:
When a list of distinct irreducible representations α of a finite group G is complete,
the counting relations (κ = n_c and ∑_{α=1}^κ d_α² = |G|) are satisfied. The simple
characters √(c_μ/|G|) χ^α_μ may be regarded either as α-labeled vectors with
components μ, or μ-labeled vectors with components α. Either way they are mutually
orthogonal, as shown in (2.33) and (2.42).

The Character Table. It is customary to display the simple characters χαµ of a finite
group in the form shown in Table 2.1, which may also serve as a tool for finding
unknown entries, even when the representations Dα are not explicitly known.
The columns correspond to the conjugacy classes, labeled [ µ ] and accompanied
by their sizes cµ . The rows correspond to the irreducible representations Dα , so
that the entry at row α, column µ gives the character χαµ . As the number nc of con-
jugacy classes equals the number κ of simple representations (nc = κ), there are
as many columns as rows. Normally, the first row gives the characters (χ1µ = 1 for
all µ) in the trivial one-dimensional representation, and the first column gives the
characters of element e (χ^α_{[e]} = dim V^α = d_α for all α). The rows are
normalized to 1, and orthogonalized to one another, by (2.33). Similarly, the
columns satisfy the orthonormality condition (2.42). In particular, the
normalization of the entries in the first row requires ∑_μ c_μ = |G|, and
similarly ∑_{α=1}^κ d_α² = |G| for the first column.
Once completed, the table gives a full account of the characters of the group.

Table 2.1: Character Table

G      [1] 1    ...    [μ] c_μ    ...    [n_c] c_{n_c}
D^1      1      ...       1       ...        1
...     ...     ...      ...      ...       ...
D^α     d_α     ...     χ^α_μ     ...     χ^α_{n_c}
...     ...     ...      ...      ...       ...
D^κ     d_κ     ...     χ^κ_μ     ...     χ^κ_{n_c}

EXAMPLE 21: Cyclic groups. In an abelian group G, there are n_c = |G| conjugacy
classes, each of size c_μ = 1, and κ = n_c simple representations, each of dimension
d_α = 1. Matrices and characters coincide and are simply scalars (real or complex
numbers). As the order of any group element is finite, the simple characters of
all elements are roots of unity. It turns out that complex irreps occur in pairs
of complex conjugates: whenever a basis spans an irrep whose characters are complex,
it will also span a second irrep whose characters are the complex conjugates of
those of the first. Now, take the cyclic group Cn = ⟨a; aⁿ = e⟩, with |Cn| = n;
then χ^α(a) is one of the n-th roots of 1, namely exp(i2πℓ/n) with ℓ = 1, ..., n.
Evidently, the relation |Cn| = ∑_α d_α² is satisfied, verifying the result that Cn
has exactly n irreducible representations, all one-dimensional.
In C2, a² = e produces two roots ±1. There are two classes (both with c_μ = 1)
and two simple representations, both real. On the other hand, C3 = ⟨a; a³ = e⟩
has three classes (all of size 1), and three representations, all one-dimensional,
corresponding to the three cube roots of 1 determined by π(a)³ = 1. If we choose
χ^{α2}(a) = ω = exp(i2π/3), we must have χ^{α2}(a²) = ω². Then χ^{α3} is determined
by orthogonality with χ^{α2}. The simple characters of C2 and C3 are shown below:

C2:   e    a
S :   1    1
A :   1   −1

C3:   e    a    a²
α1:   1    1    1
α2:   1    ω    ω²
α3:   1    ω²   ω

EXAMPLE 22: The Klein group V ≅ D2 = ⟨r, s; r² = s² = e, rs = sr⟩ is abelian, but
not cyclic. It has four simple representations, all of dimension one, with
π(r) = ±1 and π(s) = ±1. All χ^α(g) are real, equal to +1 or −1. To establish the
character table, fill in all entries in the first row, and equally all those in
the first column, with 1. Next, choose χ^{α2}(r) = 1. Then row α2 is determined by
orthonormality with row α1, and similarly the second column is determined by
orthonormality with the first column. Calculate the last two entries of rows α3
and α4 again by orthogonality with the first two rows.

D2 : e r s rs D2 : e r s rs
α1 : 1 1 1 1 α1 : 1 1 1 1
α2 : 1 1 . . −→ α2 : 1 1 −1 −1
α3 : 1 . . . α3 : 1 −1 1 −1
α4 : 1 . . . α4 : 1 −1 −1 1

EXAMPLE 23: Dihedral groups. First, consider D3 ≅ S3. It has 6 elements divided
into 3 classes [e], [s], and [r], carrying the (subscript) label [μ] (μ = 1, 2, 3):
[1] = {e} ≅ {(1)},
[2] = {s, sr, sr²} ≅ {(12), (23), (31)}, and
[3] = {r, r²} ≅ {(123), (321)},
of sizes c_μ = 1, 3, 2. As κ = n_c = 3, there are three simple representations,
with dimensions satisfying the equation d₁² + d₂² + d₃² = 6, whose unique solution
is d₁ = d₂ = 1, d₃ = 2. As usual, we know the entries in the first row (1, 1, 1)
and first column (1, 1, 2), and we have to calculate χ^2_2, χ^2_3, χ^3_2, χ^3_3.
Now, as (χ^2_2)² = (χ^2(s))² = 1 (because s² = e), χ^2_2 is real and equal to ±1.
Orthogonality between rows α1 and α2 requires that 3χ^2_2 + 2χ^2_3 = −1, which
yields the unique solution χ^2_2 = −1, χ^2_3 = 1. Entries in the last row are
determined by the orthogonality relations between column [1] and columns [2], [3]:
∑_α d_α χ^α_2 = 1 + χ^2_2 + 2χ^3_2 = 0 and ∑_α d_α χ^α_3 = 1 + χ^2_3 + 2χ^3_3 = 0.
This gives χ^3_2 = 0 and χ^3_3 = −1, which completes the character table for D3.
The second row of the table for D3 can also be found in another way, which proves
useful for larger groups. We know from Chapter 1 that D3 has an invariant subgroup
H = {e, r, r²} with coset M = sH = {s, sr, sr²}. The associated factor group
D3/H = {H, M} is isomorphic with C2, so the characters for representation α2 can
be obtained from the characters of C2 via the homomorphism D3 → D3/H, that is,
e, r, r² ↦ e ∈ C2 and s, sr, sr² ↦ a ∈ C2 (the nontrivial element). It follows
that χ^2_1 = 1, χ^2_3 = 1, and χ^2_2 = −1, in agreement with results already found.
Entries in the last row, found from the orthonormality conditions, complete the table.
We can construct in the same way the character table for D4 (with its conjugacy
classes known). The results for both D3 and D4 are shown below:

D3:   [e]1  [s]3  [r]2
α1:    1     1     1
α2:    1    −1     1
α3:    2     0    −1

D4:   [e]1  [s]2  [rs]2  [r]2  [r²]1
α1:    1     1     1      1     1
α2:    1    −1    −1      1     1
α3:    1     1    −1     −1     1
α4:    1    −1     1     −1     1
α5:    2     0     0      0    −2
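The orthogonality properties of such tables can be verified mechanically. The sketch below checks the rows of the D4 table against (2.33) and its columns against (2.42):

```python
# Mechanical check of the D4 character table: row orthonormality (2.33)
# and column orthogonality (2.42).
sizes = [1, 2, 2, 2, 1]        # sizes of the classes [e], [s], [rs], [r], [r^2]
order = sum(sizes)             # |D4| = 8
table = [
    [1,  1,  1,  1,  1],
    [1, -1, -1,  1,  1],
    [1,  1, -1, -1,  1],
    [1, -1,  1, -1,  1],
    [2,  0,  0,  0, -2],
]

# rows: (1/|G|) sum_mu c_mu chi^a_mu chi^b_mu = delta_ab
for a, ra in enumerate(table):
    for b, rb in enumerate(table):
        ip = sum(c * x * y for c, x, y in zip(sizes, ra, rb)) / order
        assert ip == float(a == b)

# columns: (c_mu/|G|) sum_alpha chi^alpha_nu chi^alpha_mu = delta_munu
for mu in range(len(sizes)):
    for nu in range(len(sizes)):
        ip = sizes[mu] * sum(row[mu] * row[nu] for row in table) / order
        assert ip == float(mu == nu)
```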
From the pattern suggested by these examples, we can write down the character
table for any dihedral group. In fact, we know all about the Dn groups: the
compositions and sizes of their conjugacy classes from Chapter 1 Sec. 1.5, and
their irreducible representations from Example 13, which we now know to be
complete by 'counting' them (κ = n_c and ∑_{α=1}^κ d_α² = |G|). So, we just recall:
• If n is odd, Dn has exactly ( n + 3) /2 irreducible representations, of which 2
are one-dimensional, and ( n − 1) /2 two-dimensional.
• If n is even, Dn has exactly n/2 + 3 irreducible representations, of which 4
are one-dimensional, and ( n − 2) /2 two-dimensional.

EXAMPLE 24: (S3 and the permutation rep) From the simple characters, it is easy to
find the decomposition of any representation with known characters. To illustrate,
take the permutational representation π_P of S3 (Example 7, p. 43), where the
traces of the given matrices yield χ_P(e) = 3, χ_P((12)) = 1, and χ_P((123)) = 0,
which can be expressed as χ_P(g) = ∑_α a_α χ^α(g) for g = e, (12), (123), giving
us 3 equations for the three unknowns, with solution a₁ = 1, a₂ = 0, a₃ = 1. So
the representation π_P of S3 has the decomposition α1 + α3.
Two lessons can be drawn from this example:
(i) If (π, V) is any representation of the symmetric group S3 and the χ_π(g) for
g = e, (12), (123) are known, then π is determined up to isomorphism by its
character χ_π = a₁χ¹ + a₂χ² + a₃χ³, producing π = a₁α1 ⊕ a₂α2 ⊕ a₃α3.
(ii) In general, if π_P is the permutational representation associated with the
action of a group G on a set X, then χ_P(g) for g ∈ G is the number of elements of
X left fixed by g. Given this property, it is easy to find χ_P(g) for any symmetric
group Sn acting on the set X = {1, ..., n}. □
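Lesson (ii) can be tested directly. A sketch with S3 acting on {0, 1, 2} (indices shifted from the text's {1, 2, 3}): counting fixed points gives χ_P, and the character inner products recover the decomposition α1 + α3 of Example 24.

```python
from itertools import permutations

# chi_P(g) = number of fixed points of g, for S3 acting on {0, 1, 2}.
elems = list(permutations(range(3)))

def fixed(p):
    return sum(1 for i, q in enumerate(p) if q == i)

chi_P = [fixed(p) for p in elems]    # 3 on e, 1 on transpositions, 0 on 3-cycles
order = len(elems)

# simple characters of S3 on the classes [e], [transpositions], [3-cycles]
chars = {1: (1, 1, 1), 2: (1, -1, 1), 3: (2, 0, -1)}

def char_val(vals, p):
    f = fixed(p)                     # the class of p is determined by its fixed points
    return vals[0] if f == 3 else (vals[1] if f == 1 else vals[2])

# multiplicities a_alpha = <chi^alpha, chi_P> (all characters here are real)
mult = {a: sum(char_val(v, p) * fp for p, fp in zip(elems, chi_P)) / order
        for a, v in chars.items()}
assert mult == {1: 1.0, 2: 0.0, 3: 1.0}     # pi_P = alpha1 + alpha3
```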

2.7 Tensor-Product Representation


Representations can be added and multiplied. We learned how to add in Sec. 2.3:
given a group G and its simple representations (π_i, V_i) with dim V_i = d_i, the
direct sum ⊕_i π_i = ∑_i π_i gives a representation π of G in the space ⊕_i V_i.
This means, formally, that π represents the action of G given by g(⊕v_i) = ⊕(g v_i),
where g ∈ G and v_i ∈ V_i; or, in the matrix language, a block-diagonal matrix
π = diag[π_1, π_2, ...] of dimension d = ∑_i d_i. In this section we shall learn
how to multiply representations, that is, how to describe the action of groups on
products of vector spaces, taking just two factors for the sake of simplicity.
We shall begin by defining the underlying space, which carries the represen-
tations. Given the linear vector spaces U and V over a base field F, and ui ∈ U
and v j ∈ V arbitrary vectors, we use the symbol ( u ⊗ v )a with a = ( i, j ) to mean
ui ⊗ v j , and define the following vector space:

Definition 2.6 (Tensor product ⊗ of spaces). Let U and V be finite-dimensional
vector spaces over F = R or C. The tensor product U ⊗ V of U and V is the vector
space of finite linear combinations ∑_a c_a (u ⊗ v)_a of bilinear products of
u_i ∈ U and v_j ∈ V, with c_a ∈ F. The product ⊗ is bilinear in the sense that

    (u_1 + c u_2) ⊗ v = (u_1 ⊗ v) + c (u_2 ⊗ v) ,      (2.43)
    u ⊗ (v_1 + c v_2) = (u ⊗ v_1) + c (u ⊗ v_2) .      (2.44)

Suppose that { ui; i ∈ I u } where I u = { 1, 2, . . . , dim (U )} is a basis for U ,


and { v j; j ∈ I v } where I v = { 1, 2, . . . , dim (V )} is a basis for V . Then the set of
vectors { ui ⊗ v j | i ∈ I u , j ∈ I v } is a basis for U ⊗ V , and the dimension of the
tensor-product space is dim (U ⊗ V ) = dim (U ) dim(V ).
Given U , V , U ⊗ V as above, and any two linear operators A : U → U and
B : V → V , there exists a unique linear operator A ⊗ B : U ⊗ V → U ⊗ V such that

( A ⊗ B )(u ⊗ v ) = ( Au ) ⊗ ( Bv), for all u ∈ U , v ∈ V . (2.45)

Furthermore, if A1, A2 are linear operators on U , and B1, B2 linear operators on


V , then we may form a new linear operator on U ⊗ V defined by

( A1 ⊗ B1 )( A2 ⊗ B2 ) = ( A1 A2 ) ⊗ ( B1 B2 ) . (2.46)
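These defining properties are realized concretely by the Kronecker product of matrices; a quick numerical sketch (using numpy's kron as a model of ⊗, with assumed random matrices):

```python
import numpy as np

# The operator A (x) B modeled by the Kronecker product of matrices.
rng = np.random.default_rng(1)
A1, A2 = rng.normal(size=(2, 2)), rng.normal(size=(2, 2))
B1, B2 = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
u, v = rng.normal(size=2), rng.normal(size=3)

# (A (x) B)(u (x) v) = (Au) (x) (Bv), eq. (2.45)
assert np.allclose(np.kron(A1, B1) @ np.kron(u, v), np.kron(A1 @ u, B1 @ v))

# (A1 (x) B1)(A2 (x) B2) = (A1 A2) (x) (B1 B2), eq. (2.46)
assert np.allclose(np.kron(A1, B1) @ np.kron(A2, B2), np.kron(A1 @ A2, B1 @ B2))

# dim(U (x) V) = dim(U) * dim(V)
assert np.kron(A1, B1).shape == (6, 6)
```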

Representations of H × K . We turn now to representations of groups on tensor


spaces, dealing with both a single group and two (or several) groups. As we
may consider the former case as a special situation, we shall begin with the more
general case. In Chapter 1 we defined a direct product group H × K of two groups
H and K as a set of pairs ( h, k ) with h ∈ H and k ∈ K subject to the composition
rule ( h, k )(h0, k 0 ) = ( hh0, kk 0 ) and whose order is the product of the orders of its
factors, | H × K | = | H || K |. We now discuss its representations.

Definition 2.7 (Representation of H × K in U ⊗ V ). Let α be a representation of a


group H in vector space U over F, and β be a representation of a group K in vector space
V over the same F, then the representation α ⊗ β of the direct product group H × K
in the tensor-product space U ⊗ V is the map ( h, k ) ∑( u ⊗ v ) = ∑( hu ⊗ kv ), where
( h, k ) ∈ H × K and ∑( u ⊗ v ) is an arbitrary vector in U ⊗ V .
Suppose that the representation spaces U and V, of respective dimensions d_α and
d_β, come with a definition of inner product, and with orthonormal bases
{u_i; i ∈ I_u} and {v_j; j ∈ I_v}. Then the d_α d_β vectors w_a = (u ⊗ v)_a = u_i ⊗ v_ℓ,
where a = (i, ℓ), i ∈ I_u, ℓ ∈ I_v, form a basis for the tensor-product space
W = U ⊗ V. Any vector x ∈ W can be written as a linear combination of the w_a,
|x⟩ = ∑_a |w_a⟩ x_a, where the F-valued components are given by x_a = ⟨w_a|x⟩, and
the scalar product of any two vectors x, y ∈ W is given by ⟨x|y⟩ = ∑_a x_a y_a if
F = R, or ∑_a x*_a y_a if F = C. Then in the representations (α, U) and (β, V) of
the groups H and K, we have for any elements h ∈ H and k ∈ K

    α(h)|u_j⟩ = ∑_{i∈I_u} |u_i⟩ A^α_{ij}(h) ,   (j ∈ I_u),      (2.47)
    β(k)|v_m⟩ = ∑_{ℓ∈I_v} |v_ℓ⟩ B^β_{ℓm}(k) ,   (m ∈ I_v).      (2.48)
It follows that the action of (h, k) ∈ H × K in the vector space U ⊗ V is:

    (α ⊗ β)(h, k)|w_b⟩ = α(h)|u_j⟩ ⊗ β(k)|v_m⟩
                       = ∑_{iℓ} |u_i⟩ ⊗ |v_ℓ⟩ A^α_{ij}(h) B^β_{ℓm}(k)
                       = ∑_a |w_a⟩ (A^α ⊗ B^β)_{ab} .      (2.49)

Hence, the representation α ⊗ β of H × K in W assumes the form of a tensor-product
matrix of dimension dim(α ⊗ β) = d_α d_β in the given bases:

    D^{α×β}_{iℓ;jm}(h, k) = (A^α(h) ⊗ B^β(k))_{iℓ;jm} = A^α_{ij}(h) B^β_{ℓm}(k) .      (2.50)

The character of (h, k) in α ⊗ β is the trace of D^{α×β}(h, k) in U ⊗ V:

    χ^{α×β}(h, k) = Tr(D^{α×β}(h, k)) = ∑_i A^α_{ii}(h) ∑_ℓ B^β_{ℓℓ}(k) ,

which is
    χ^{α×β}(h, k) = χ^α(h) χ^β(k) ,   h ∈ H, k ∈ K .      (2.51)
That is, the characters in the representation α ⊗ β of the direct-product group
H × K are given by the products of the characters in the representations of the
factors. Their inner products can be calculated from (2.35), yielding the result:

    ⟨χ^{α×β}, χ^{α×β}⟩ = ⟨χ^α, χ^α⟩ ⟨χ^β, χ^β⟩ ,      (2.52)

where ⟨ , ⟩ is the inner product (2.35), taken over H for the first factor and
over K for the second. It is the basis for the following property:
• The representation α ⊗ β of the direct product group H × K over F is irre-
ducible if and only if the representations α and β of H and K over F are both
irreducible. Every irreducible representation of H × K is of this kind.
Recalling that a representation α over F is irreducible if and only if h χα , χα i = 1,
we see that if α and β are both irreducible, then h χα× β, χα× β i = 1, which means
that α ⊗ β is irreducible. Moreover if H has nc conjugacy classes and K has mc
classes, then H × K has nc mc classes, since ( h, k ) ∼ ( h0, k 0 ) implies h ∼ h0 , k ∼ k 0 ,
and vice versa for h, h0 ∈ H and k, k 0 ∈ K. We know that there are as many simple
representations as there are classes, which is nc mc for H × K; and there are pre-
cisely nc mc simple representations α ⊗ β. All possible irreducible representations
of H × K must be of the form α ⊗ β.
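A small consistency check of this statement: the Klein group V ≅ C2 × C2 of Example 22 has exactly the four irreps obtained as products of the two C2 characters S = (1, 1) and A = (1, −1).

```python
from itertools import product

# The four irreps of C2 x C2 are the products of the two C2 characters;
# they reproduce the character table of the Klein group (Example 22).
c2 = [(1, 1), (1, -1)]
klein_rows = sorted(tuple(x * y for x, y in product(a, b))
                    for a in c2 for b in c2)
expected = sorted([(1, 1, 1, 1), (1, 1, -1, -1), (1, -1, 1, -1), (1, -1, -1, 1)])
assert klein_rows == expected
```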
Representations of G in U ⊗ V. In physics we often deal with systems of identical
particles described by representations of some given group G in tensor-product
spaces (for example, space rotations in a two-nucleon system). The problem here
is, in mathematical terms, identical to that of the representations of the product
group G × G restricted to its diagonal subgroup {(g, g) | g ∈ G}, and so may simply
be treated as a special case of that situation, although the carrier spaces could
be physically distinct, with different kinds of degrees of freedom.
Definition 2.8 (Representation of G in U ⊗ V ). Let α, β be representations of a group


G in the vector spaces U and V over F, then the representation α ⊗ β of G in the tensor
product space U ⊗ V is defined by ( g, g) ∑( u ⊗ v) = ∑( gu ⊗ gv), where g is an element
of G and ∑( u ⊗ v ) is an arbitrary vector in U ⊗ V .
Just as before, we introduce the orthonormal bases {u_i} for U, {v_ℓ} for V, and
{w_a = (u ⊗ v)_a = u_i ⊗ v_ℓ} for W = U ⊗ V. Then, if α : g ↦ D^α(g) and
β : g ↦ D^β(g) are representations of G in U and V, the representation α ⊗ β of G
in W is given by α ⊗ β : g ↦ D^{α×β}(g) = D^α(g) ⊗ D^β(g). The notation is
standard: D^α, D^β, and D^{α×β} are matrices of respective dimensions d_α, d_β,
and d_α d_β.
The character of g ∈ G in the representation α ⊗ β is the trace of D^{α×β}(g) in
U ⊗ V: χ^{α×β}(g) = Tr D^{α×β}(g) = ∑_i D^α_{ii}(g) ∑_ℓ D^β_{ℓℓ}(g), or

    χ^{α×β}(g) = χ^α(g) χ^β(g) .      (2.53)

As expected, the characters in the representation α ⊗ β of the group G are products
of the characters in the representations α and β of G. But in contrast to the
previous case, ⟨χ^{α×β}, χ^{α×β}⟩ does not factorize into ⟨χ^α, χ^α⟩ and
⟨χ^β, χ^β⟩, and therefore simplicity of the representations α and β does not
guarantee simplicity of the representation α ⊗ β. The problem we address now is:
given the irreps D^α and D^β, what is the decomposition of D^{α×β} = D^α ⊗ D^β,
and how are the irreps in D^{α×β} related to those of its factors D^α and D^β?
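The character of the product already answers the multiplicity part of this question: a^{α×β}_γ = ⟨χ^γ, χ^α χ^β⟩. A sketch for G = D3 with α = β = α3, using the table of Example 23 (characters here are real, so conjugation is omitted):

```python
# Decomposition of alpha3 (x) alpha3 for D3 from its character table
# (classes [e], [s], [r] of sizes 1, 3, 2).
sizes = [1, 3, 2]
order = 6
chars = {1: [1, 1, 1], 2: [1, -1, 1], 3: [2, 0, -1]}

chi_prod = [x * x for x in chars[3]]          # chi^{3x3} = chi^3 chi^3 = [4, 0, 1]
mult = {a: sum(c * x * y for c, x, y in zip(sizes, ch, chi_prod)) / order
        for a, ch in chars.items()}
assert mult == {1: 1.0, 2: 1.0, 3: 1.0}       # alpha3 (x) alpha3 = a1 + a2 + a3
assert sum(mult[a] * chars[a][0] for a in chars) == 4.0   # dims: 1 + 1 + 2 = 2*2
```

Note that the 2-dimensional irrep tensored with itself is not simple, even though each factor is: the multiplicities show one copy of each irrep of D3.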
To simplify notation, we define

    |u_i⟩ ≡ |αi⟩ ,   i = 1, 2, ..., d_α ,      (2.54)
    |v_ℓ⟩ ≡ |βℓ⟩ ,   ℓ = 1, 2, ..., d_β ,      (2.55)
    |w_a⟩ ≡ |αi⟩|βℓ⟩ = |αi, βℓ⟩ ,   a = (i, ℓ) .      (2.56)

They form the orthonormal bases for the spaces U, V, and W = U ⊗ V over C, and
satisfy the orthogonality and completeness relations:

    ⟨αi, βℓ|αj, βm⟩ = δ_{ij} δ_{ℓm} ,      (2.57)
    ∑_{iℓ} |αi, βℓ⟩⟨αi, βℓ| = I_D ,      (2.58)

where I_D is the identity operator in the space W of dimension D = d_α d_β. In
this basis {|αj, βm⟩}, the representation matrices are given by

    ⟨αi, βℓ|(α ⊗ β)(g)|αj, βm⟩ = D^{α×β}_{iℓ;jm}(g) = D^α_{ij}(g) D^β_{ℓm}(g) .      (2.59)

As in general α ⊗ β is reducible, there is always a basis in which D^{α×β} is
expressible as a direct sum of the form ⊕_γ a^{α×β}_γ D^γ, where a^{α×β}_γ is the
multiplicity of the distinct irreducible representation D^γ; that is, a basis in
which W is decomposed into a direct sum of invariant subspaces W^γ_τ, with
γ = 1, 2, ..., κ labeling the distinct irreducible representations, and
τ = 1, 2, ..., a^{α×β}_γ denoting the different subspaces associated with the
equivalent representations D^γ.
We shall call the new basis vectors |τγk⟩, with k = 1, ..., d_γ; τ = 1, ..., a^{α×β}_γ;
and γ = 1, ..., κ, subject to the orthonormality and completeness relations:

    ⟨τγk|τ′γ′k′⟩ = δ_{ττ′} δ_{γγ′} δ_{kk′} ,      (2.60)
    ∑_{τγk} |τγk⟩⟨τγk| = I_D .      (2.61)

Completeness implies that D = d_α d_β = ∑_{γ=1}^κ a^{α×β}_γ d_γ.
The two orthonormal bases {| αj, βmi} (‘uncoupled’) and {| τγk i} (‘coupled’)
in W are related by a similarity transformation, such that they satisfy the reciprocal relations

|αj, βm⟩ = ∑_{τγk} |τγk⟩⟨τγk|αj, βm⟩ ,   (2.62)

|τγk⟩ = ∑_{jm} |αj, βm⟩⟨αj, βm|τγk⟩ .   (2.63)

Thus, the similarity transformation connecting the two bases is described by a matrix whose entries are the complex-valued quantities ⟨τγk|αj, βm⟩, and its inverse by their complex conjugates ⟨αj, βm|τγk⟩ = ⟨τγk|αj, βm⟩*. These numbers ⟨τγk|αj, βm⟩, characteristic of the group G, are called the Clebsch-Gordan coefficients for G. Being the matrix of a similarity transformation between two orthonormal bases, it is unitary (orthogonal when the coefficients can be chosen real), with entries obeying the orthonormality and completeness relations:

∑_{jm} ⟨τγk|αj, βm⟩⟨αj, βm|τ′γ′k′⟩ = δ_ττ′ δ_γγ′ δ_kk′ ,   (2.64)

∑_{τγk} ⟨αi, βℓ|τγk⟩⟨τγk|αj, βm⟩ = δ_ij δ_ℓm .   (2.65)

Whereas in the ‘uncoupled’ basis {|αi, βℓ⟩} the representation α ⊗ β is defined by (2.59), in the ‘coupled’ basis {|τγk⟩} it is given by

⟨τγk|(α ⊗ β)(g)|τ′γ′k′⟩ = D^γ_{kk′}(g) δ_γγ′ δ_ττ′ .   (2.66)
The matrix elements ⟨αi, βℓ|(α ⊗ β)(g)|αj, βm⟩ and ⟨τγk|(α ⊗ β)(g)|τ′γ′k′⟩ are of course related by the similarity transformation that connects the two bases. If we start from (2.59) and use the completeness of {|τγk⟩}, we obtain

D^α_{ij}(g) D^β_{ℓm}(g) = ⟨αi, βℓ|(α ⊗ β)(g)|αj, βm⟩
 = ∑ ⟨αi, βℓ|τγk⟩⟨τγk|(α ⊗ β)(g)|τ′γ′k′⟩⟨τ′γ′k′|αj, βm⟩
 = ∑_{τγkk′} ⟨αi, βℓ|τγk⟩ · D^γ_{kk′}(g) · ⟨τγk′|αj, βm⟩ .   (2.67)

Alternatively, starting from (2.66) we use the completeness of {|αj, βm⟩} to obtain

D^γ_{kk′}(g) δ_γγ′ δ_ττ′ = ⟨τγk|(α ⊗ β)(g)|τ′γ′k′⟩
 = ∑ ⟨τγk|αi, βℓ⟩ ⟨αi, βℓ|(α ⊗ β)(g)|αj, βm⟩ ⟨αj, βm|τ′γ′k′⟩
 = ∑_{ijℓm} ⟨τγk|αi, βℓ⟩ · D^α_{ij}(g) D^β_{ℓm}(g) · ⟨αj, βm|τ′γ′k′⟩ .   (2.68)

If we set j = i and m = ℓ, sum over i and ℓ on both sides of (2.67), and make use of the orthonormality of the Clebsch-Gordan coefficients, we obtain a relation between the characters χ^{α×β}(g) and χ^γ(g):

χ^α(g) χ^β(g) = ∑_{γ=1}^{κ} ∑_{τ=1}^{a^{α×β}_γ} χ^γ(g) = ∑_{γ=1}^{κ} a^{α×β}_γ χ^γ(g) .   (2.69)

We would have obtained the same result had we started from (2.68). It agrees with (2.53), where we set χ^{α×β} = ∑_γ a^{α×β}_γ χ^γ, as implied by the decomposition D^{α×β} = ⊕_γ a^{α×β}_γ D^γ. The multiplicity coefficient a^{α×β}_γ, which is the number of times the irrep D^γ appears in the decomposition of D^{α×β}, is given by

a^{α×β}_γ = |G|^{−1} ∑_{g∈G} χ^α(g) χ^β(g) χ̄^γ(g) = ⟨χ^γ, χ^{α×β}⟩ ,   (2.70)

with χ̄ = χ* if F = C, and χ̄ = χ if F = R.
To summarize: Assuming the irreducible representations α, β of a group G are known, the representation α ⊗ β of G is given by the tensor product of the representations D^α and D^β, i.e. D^{α×β}(g) = D^α(g) ⊗ D^β(g) for g ∈ G. It is completely reducible and can be decomposed into a sum of the irreducible representations of G by a similarity transformation relating the ‘uncoupled’ basis and the ‘coupled’ basis, determined by a unitary matrix whose elements are the Clebsch-Gordan coupling coefficients specific to G. The representations expressed in the two bases are transformed into each other by the reciprocal relations

D^α_{ij}(g) D^β_{ℓm}(g) = ∑_{τγkk′} ⟨αi, βℓ|τγk⟩ · D^γ_{kk′}(g) · ⟨τγk′|αj, βm⟩ ,   (2.71)

D^γ_{kk′}(g) δ_γγ′ δ_ττ′ = ∑_{ijℓm} ⟨τγk|αi, βℓ⟩ · D^α_{ij}(g) D^β_{ℓm}(g) · ⟨αj, βm|τ′γ′k′⟩ ,   (2.72)

for any element g ∈ G. The decomposition D^{α×β} = ⊕_γ a^{α×β}_γ D^γ is completely determined by the multiplicities of the inequivalent irreducible representations γ through the formula a^{α×β}_γ = ⟨χ^γ, χ^{α×β}⟩.
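The multiplicity formula can be put to work directly on class data. A sketch for D3 (classes [e], [s], [r] of sizes 1, 3, 2, with the standard character table of D3, the same data used in Example 25):

```python
import numpy as np

# D3 class data: class sizes and simple characters on the classes [e], [s], [r]
sizes = np.array([1, 3, 2])
chi = {
    "a1": np.array([1,  1,  1]),
    "a2": np.array([1, -1,  1]),
    "a3": np.array([2,  0, -1]),
}

chi_prod = chi["a3"] * chi["a3"]   # character of a3 (x) a3 = (4, 0, 1)
order = sizes.sum()                # |G| = 6

# a_gamma = <chi^gamma, chi^{a3 x a3}> = |G|^-1 sum_mu c_mu chi^gamma_mu* chi_prod_mu
mult = {k: int(round((sizes * np.conj(v) * chi_prod).sum().real / order))
        for k, v in chi.items()}
print(mult)   # {'a1': 1, 'a2': 1, 'a3': 1}
```

The result reproduces the decomposition α3 ⊗ α3 = α1 ⊕ α2 ⊕ α3 worked out by hand in Example 25.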

E XAMPLE 25: Representation α3 ⊗ α3 of D3 ≅ S3. The simple two-dimensional representation α3 of D3 is given in Example 17, from which we obtain the product representation α3 ⊗ α3 of r and s by tensor multiplication of matrices:

(r):  [ω  0 ]     [ω  0 ]   [ω²  0   0   0 ]   [ω²  0  0  0]
      [0  ω²]  ⊗  [0  ω²] = [0   ω³  0   0 ] = [0   1  0  0] ,
                            [0   0   ω³  0 ]   [0   0  1  0]
                            [0   0   0   ω⁴]   [0   0  0  ω]

(s):  [0  1]     [0  1]   [0  0  0  1]
      [1  0]  ⊗  [1  0] = [0  0  1  0] .
                          [0  1  0  0]
                          [1  0  0  0]

By simply swapping rows and columns (that is, reordering the basis), we obtain:

α3 ⊗ α3 (r) = [1  0  0  0 ]      α3 ⊗ α3 (s) = [0  1  0  0]
              [0  1  0  0 ]                    [1  0  0  0]
              [0  0  ω  0 ]  ,                 [0  0  0  1] .
              [0  0  0  ω²]                    [0  0  1  0]

One recognizes that α3 ⊗ α3 = β ⊕ α3, where β is defined by β(r) and β(s):

β :  r ↦ [1  0]  ,  s ↦ [0  1]  .
         [0  1]         [1  0]

In fact β(s) is equivalent to the diagonal form β : r ↦ diag[1, 1], s ↦ diag[1, −1]. So β = α1 ⊕ α2 is the sum of the two simple one-dimensional representations of D3, which means we have the complete decomposition α3 ⊗ α3 = α1 ⊕ α2 ⊕ α3.
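The row-and-column swap is a permutation change of basis; a numerical sketch (numpy, with the Example-17 matrices) confirming the block structure:

```python
import numpy as np

w = np.exp(2j * np.pi / 3)
r = np.array([[w, 0], [0, w**2]])
s = np.array([[0, 1], [1, 0]])

Rb, Sb = np.kron(r, r), np.kron(s, s)   # uncoupled-basis matrices, Eq. (2.59)

# reorder the product basis |11>,|12>,|21>,|22>  ->  |12>,|21>,|22>,|11>
perm = [1, 2, 3, 0]
P = np.eye(4)[:, perm]                  # permutation (change-of-basis) matrix

R2, S2 = P.T @ Rb @ P, P.T @ Sb @ P     # matrices in the reordered basis
# upper 2x2 block: beta = a1 + a2 ; lower 2x2 block: a3 itself
assert np.allclose(R2[:2, :2], np.eye(2))
assert np.allclose(S2[:2, :2], [[0, 1], [1, 0]])
assert np.allclose(R2[2:, 2:], r) and np.allclose(S2[2:, 2:], s)
assert np.allclose(R2[:2, 2:], 0) and np.allclose(S2[2:, :2], 0)
```

Here the permutation matrix P plays the role of the (in this case real) Clebsch-Gordan matrix of the text.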
Another way to obtain this result is to calculate the multiplicities a_γ in the decomposition α3 ⊗ α3 = ∑_γ a_γ γ, where the sum is over all the simple representations of D3. We will use the formula for a_γ in the form

a_γ = |G|^{−1} ∑_μ c_μ (χ^{α3}_μ)² χ^γ_μ

together with the necessary data found in the character table for D3. We get 6a1 = 1·2²·1 + 3·0²·1 + 2·(−1)²·1 = 6; 6a2 = 1·2²·1 + 3·0²·(−1) + 2·(−1)²·1 = 6; and 6a3 = 1·2²·2 + 3·0·0 + 2·(−1)²·(−1) = 6. Thus a_γ = 1 for all three simple representations, so that α3 ⊗ α3 = α1 ⊕ α2 ⊕ α3, just as we found before.
An even simpler way is to recall that every representation is determined up to equivalence by its character. Thus, referring to the character table for D3, we have χ^{α3⊗α3} = (χ³)² = (4, 0, 1) on the three conjugacy classes; but these values are the same as those given by χ¹ + χ² + χ³. It follows that α3 ⊗ α3 = α1 ⊕ α2 ⊕ α3. Similarly, α1 ⊗ α3 has the character (2, 0, −1), so α1 ⊗ α3 ≅ α3; and finally, α2 ⊗ α3 ≅ α3, as they have the same character (2, 0, −1).
E XAMPLE 26: Products are not confined to just two factors: we may have for example α3^{×n} = α3 ⊗ · · · ⊗ α3 (n factors), where α3 is the simple 2-dimensional representation of S3. What is its decomposition ⊕_γ a_γ γ in simple representations? The answer can be found in character theory. We know that χ³ = (2, 0, −1) on the conjugacy classes. The character of α3^{×n} is (2^n, 0, (−1)^n), equal to the character of the decomposition, a1 χ¹ + a2 χ² + a3 χ³ = (a1 + a2 + 2a3, a1 − a2, a1 + a2 − a3). We find: a1 = a2 = (2^{n−1} + (−1)^n)/3, a3 = (2^n − (−1)^n)/3. □
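A numerical sketch checking these closed forms against the multiplicity formula, with the same D3 class data as in the previous example:

```python
import numpy as np

sizes = np.array([1, 3, 2])                             # class sizes of D3 ~ S3
table = np.array([[1, 1, 1], [1, -1, 1], [2, 0, -1]])   # chi^1, chi^2, chi^3

def mults(n):
    # character of a3^{x n} on the classes, then project on each simple character
    chi_n = np.array([2**n, 0, (-1)**n])
    return [int(round((sizes * row * chi_n).sum() / 6)) for row in table]

for n in range(1, 8):
    a1, a2, a3 = mults(n)
    assert a1 == a2 == (2**(n - 1) + (-1)**n) // 3      # closed forms above
    assert a3 == (2**n - (-1)**n) // 3
```

For n = 2 this reproduces Example 25; for n = 1 it returns (0, 0, 1), i.e. α3 itself.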

Problems
2.1 (Representations in function space) In a two-dimensional Euclidean space,
define a transformation R on the Cartesian coordinates (x, y) by x′ = D11(R) x + D12(R) y and y′ = D21(R) x + D22(R) y. The homogeneous scalar functions of degree 2 in x, y form a linear vector space V³ of dimension 3, invariant under R, spanned by the basis vectors ψ1(x, y) = x², ψ2(x, y) = √2 xy, and ψ3(x, y) = y². Given D(R) as follows, find the matrix representation of R in V³ in each case.
(a) [ −1  0 ]      (b) [ cos φ  −sin φ ]      (c) [ −1/2   −√3/2 ]
    [  0  1 ] ,        [ sin φ   cos φ ] ,        [ √3/2   −1/2  ] .

2.2 (Block-diagonalization) Find the basis that reduces the three-dimensional


representation M( R ) of Problem 2.1 Case (c) to a block-diagonal form.
2.3 (Left/Right regular representations) The regular representation of a finite group G is defined by the left action of group elements (called g, h, q, . . . ):

qg = ∑_h h D_{hg}(q) ,   D_{hg}(q) = 1 if qg = h, and 0 if qg ≠ h

(where we have used group elements to label the representation matrices). But it can also be defined by the right action

gq = ∑_h D̄_{gh}(q) h ,   D̄_{gh}(q) = 1 if gq = h, and 0 if gq ≠ h .

(a) How are D and D̄ related when G is an abelian group?
(b) Find their relationship when G is a finite group.
(c) Find D, D̄, and X for the groups C3 and D3.
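As an illustration of the definition of D above (a sketch only, with C3 written additively as {0, 1, 2} under addition mod 3):

```python
import numpy as np

# cyclic group C3 realized additively
elems = [0, 1, 2]

def mult(a, b):
    return (a + b) % 3

def left_regular(q):
    # D_{hg}(q) = 1 if qg = h, else 0 (left action, as in the problem statement)
    return np.array([[1 if mult(q, g) == h else 0 for g in elems]
                     for h in elems])

D = {q: left_regular(q) for q in elems}

# representation property: D(q1 q2) = D(q1) D(q2)
for q1 in elems:
    for q2 in elems:
        assert np.array_equal(D[mult(q1, q2)], D[q1] @ D[q2])

# character of the regular representation: |G| at the identity, 0 elsewhere
assert [int(np.trace(D[q])) for q in elems] == [3, 0, 0]
```

The right-regular matrices D̄ are built the same way from gq = h; comparing the two for a nonabelian group such as D3 is the substance of parts (a)-(c).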
2.4 (Inner products) In Sec. 2.4 we have defined a product { , } by the relation

def
{ x, y } = | G |−1 ∑ h D ( h ) x | D ( h ) y i, x, y ∈ V ,
h ∈G

where D ( G ) is a representation of a group G of order | G | in a linear space V , and


⟨ | ⟩ is a complex scalar product. Show that { , } satisfies the scalar-product axioms, namely linearity, hermitian symmetry (complex conjugation), and positivity of the norm.
2.5 (D3 two-dimensional representation) Group D3 can be realized in a Euclidean
R 2 space by the symmetry transformations that leave an equilateral triangle in-
variant. Let us label the vertices of such a triangle by 1, 2, 3 counterclockwise.
(a) To begin, choose a Cartesian basis in which the origin coincides with the triangle’s center, the axis ŷ = e2 passes through a vertex, and the axis x̂ = e1 is perpendicular to ŷ in the conventional sense. Find the matrix representation of
the six elements of D3 . Is it unitary, is it reducible?
(b) Now take a new basis, with each of the axes u1 and u2 passing through a
vertex. Find the similarity transformation that takes one basis to the other, Sim =
h ei| um i. Find the equivalent group representation in this new basis. Is it unitary,
is it reducible?
2.6 (Representation of class elements) Let [ µ ] be a nontrivial conjugacy class of a
finite group G. Prove that an irreducible representation of the sum of the elements
in [ µ ] is a multiple of the identity, λI, and determine the constant λ.

2.7 (Group algebra & completeness relation)


(a) Consider a finite group G and define the set C( G ) = { ∑g∈G a g g} with
complex numbers a g . Show that C( G ) is closed under addition and multiplication
of any two members. (A linear vector space closed under a (vector) multiplication
law is called an algebra ).
(b) Let K_μ = ∑_{i=1}^{c_μ} g_i^μ be the sum of all elements of the conjugacy class [μ] of size c_μ ≥ 1, where μ = 1, . . . , n_c. Show that K_μ K_ν = ∑_λ c^λ_{μν} K_λ, where the c^λ_{μν} are non-negative integers. In the following, μ = 1 indicates the class of the identity.
(c) Consider any irreducible representation D^α(G), and define, for any conjugacy class [μ], the sum of matrices D_μ = ∑_{g∈[μ]} D^α(g). Show that D_μ commutes with D^α(h) for every h ∈ G, and therefore (why?) D_μ = λ_μ I. Calculate λ_μ.
(d) Show that χ^α_μ χ^α_ν = ∑_γ N^γ_{μν} χ^α_γ χ^α_1, where χ^α_1 is the character of the identity e, and N^γ_{μν} is a number that depends only on the group, not on its representations. Give an interpretation of this result.
(e) Show that the inverses of the elements of any class [μ] form another complete class [μ′], and prove that the coefficients defined in (b) satisfy c^1_{μν} = c_μ if ν = μ′, and c^1_{μν} = 0 otherwise. From this and (b), show that c_ν ∑_{α=1}^{κ} χ^α_μ χ^α_ν = |G| δ_{νμ′}, where κ is the number of distinct simple representations. If D^α is unitary, then D^α(g^{−1}) = (D^α(g))†, which implies χ^α_{μ′} = (χ^α_μ)*. What does this result mean about χ^α_μ; what does it say about the values of κ (the number of representations) and n_c (the number of classes)?
2.8 (Sums of characters) Show that the sum of the simple characters of the el-
ements of a finite group vanishes in any irreducible representation except the
trivial one-dimensional representation. Show also that the sum over the simple
representations, ∑α dα χαµ with µ 6 = [ e ] and dα being the representation dimen-
sions, also vanishes.
2.9 (D3 representation in V⁴) Let V⁴ be the four-dimensional vector space of the functions of two real variables f(x, y) = c1 x³ + c2 x² y + c3 x y² + c4 y³, with a basis consisting of ψ1 = x³, ψ2 = x² y, ψ3 = x y², and ψ4 = y³. It is assumed that the variables x, y are Cartesian coordinates that transform linearly according to x′_i = D_{ij}(g) x_j (x1 = x, x2 = y), where D(g), with g = e, b, ba, ba², a, a², are the matrices of the two-dimensional representation of the group G = D3 (given in the solution to Problem 2.5).
(a) Find the representation matrices M( G ) in the basis {ψi } of V 4.
(b) Calculate the characters in this representation.
(c) Find the transformation that reduces M( G ) to direct sums of the irreducible
representations of D3.
(d) Find the bases of the fully reduced representation in terms of {ψi}.
2.10 (Character table for D4) Construct the character table for D4 (the group of
symmetries for the square).
2.11 (Character table of T12 ≅ A4) T12 is the group of rotations that leave a regular tetrahedron invariant (this configuration has 4 three-fold axes and 3 two-fold axes, found in CH4 for example). Using the group structure and the orthogonality and completeness relations, construct the character table for T12. Note that T12 is a subgroup of S4, isomorphic to the alternating group A4 (see Problem 1.10).
2.12 (Character table for S4) Find the characters for the symmetric group S4.
2.13 (Characters for S4) Find the permutational representation on 4 letters for S4;
find its decomposition. What is the product α2 × α4 (where α2 is a nontrivial one-
dimensional representation and α4 a three-dimensional simple representation of
S4)? On the basis of these results, show a way to obtain the characters for S4.
2.14 (Matrix representation of product group) Suppose that a finite group G is
the product group of its two subgroups, G = H × K. Using the matrices and
not the characters, show that the direct product Dα ( H ) × D β ( K ) of two simple
representations of H and K is a simple representation of G.



Chapter 3

Lie Groups
and Lie Algebras

3.1 Lie Groups


3.2 Global Properties
3.3 Local Properties
3.4 Lie Algebras
3.5 Back to Lie Groups
3.6 Representations

Up to now we have dealt with groups consisting of a finite number of elements.


In physics, such groups may describe the symmetries of polyatomic molecules
under rotation through a definite angle about some axis, or the symmetries in
systems of identical particles under permutation. But there also exist physical
systems that preserve their shapes or properties under an infinite number of re-
lated symmetry operations, and so are said to possess an infinite symmetry group.
This occurs either because the associated group elements may be infinite in number but discrete (i.e. denumerable, or in one-to-one correspondence with
integers), or because at least some symmetries of the system are continuously infi-
nite (which can be realized with continuously varying parameters). Whereas the
first case may be considered as the limit of some discrete group when its order
goes to infinity, the second is radically different, and more interesting.
Among the continuous groups, those that also have the exacting structure
of differentiable finite-dimensional manifolds — called the Lie groups — are the
best developed and form the subject of this chapter and the remainder of this
book. Sophus Lie (1842–1899), who initiated the theory, was motivated in part
by the desire to classify the symmetries of differential equations in a way similar
to what Évariste Galois (1811–1832) had done for the algebraic equations with
the permutation groups. Lie groups are a complicated subject because, being
of uncountable order, they have no meaningful multiplication tables and, being
defined on manifolds, they have all the complexities of algebra, topology, and


geometry combined. Fortunately, a purely algebraic tool, the Lie algebra (a closely
related finite vector space equipped with a bilinear composition rule), can be used
to study properties of the associated Lie groups and extract much of their struc-
ture while avoiding the inherent topological issues; in fact, in cases of physical
interest it encodes almost all of the geometry/topology of the group itself.
This chapter gives an overview of Lie groups and Lie algebras. First we de-
fine the Lie group and give examples, then discuss its algebraic properties (sub-
groups, cosets, invariant subgroups) and its topological properties (compactness,
connectedness, and simple connectedness). We next show when and how a Lie
group linearizes to produce a Lie algebra and describe the reverse path (the exponential map), and finally introduce general notions of the representation theory of Lie groups and algebras in preparation for later, more focused, studies.

3.1 Lie Groups


Here the reader will find a definition and a few examples of Lie groups. But first,
some notions about the concept of manifold.
MANIFOLDS. An n-dimensional manifold is a topological space which locally re-
sembles the Euclidean space R n (but globally may not). A topological space M is
a nonempty set that contains open subsets (such as open intervals on real num-
ber line R, or open balls in R n ) and that is in addition well-behaved. The second
condition, being locally isomorphic to Euclidean space, means that one can set up
coordinates in R^n for any open set in M, and also inversely. Technically, for any open set O in M (belonging to an atlas) there is an open set V in R^n related to it by mutually inverse continuous maps φ(V) = O and φ^{−1}(O) = V. Since one may need several such
sets O to cover M entirely, one defines several corresponding coordinate systems,
or charts, making up an atlas for the whole manifold. When two sets, say O1
and O2, overlap, the images of the intersection O1 ∩ O2 given by the charts φ1−1
and φ2−1 look different, and a transformation of coordinates (or transition map)
is needed. If the transition functions, defined on open sets of R n and relating
the Euclidean local charts to each other, are differentiable to a fixed degree k of differentiability, one speaks of a (C^k) differentiable manifold, or of a smooth manifold if the transition functions are arbitrarily differentiable. In a differentiable manifold
the tools of infinitesimal analysis used in studies of Euclidean space can be ap-
plied locally, i.e. applied to points in each small neighborhood of the manifold.
In particular, whether a continuous function between differentiable manifolds is
differentiable or not can now be decided by computing its derivatives pointwise
in any of the Euclidean coordinate charts. See Appendix C for more details.
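The chart-and-transition-map language can be made concrete on the circle S¹; a sketch with two overlapping angle charts (the chart domains are chosen here purely for illustration):

```python
import math

# two angle charts covering the unit circle S^1 (a 1-dimensional manifold):
# chart 1: theta in (-pi, pi), excludes the point (-1, 0)
# chart 2: phi   in (0, 2*pi), excludes the point ( 1, 0)
def chart1_inv(theta):
    # phi_1 : (-pi, pi) -> S^1
    return (math.cos(theta), math.sin(theta))

def chart2(point):
    # phi_2^{-1} : S^1 -> (0, 2*pi)
    x, y = point
    phi = math.atan2(y, x)          # lies in (-pi, pi]
    return phi if phi > 0 else phi + 2 * math.pi

# transition map phi_2^{-1} o phi_1 on the overlap: it equals theta or
# theta + 2*pi, i.e. a locally constant shift, hence smooth on each piece
for theta in [0.5, 1.0, 2.5, -0.5, -2.5]:
    trans = chart2(chart1_inv(theta))
    assert math.isclose(trans, theta if theta > 0 else theta + 2 * math.pi)
```

Neither chart alone covers the whole circle, but together they form an atlas, and the transition function is smooth wherever both are defined.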
E XAMPLES of manifolds:
(1) Any open subset O of R n is a smooth manifold of dimension n. A possi-
ble choice for O is R n itself. R n looks like R n not only locally but also globally.
Euclidean space is a non-compact manifold. (A set O is an open set if for each
element x ∈ O there is an open neighborhood U ( x) entirely contained in O.)
(2) R m identified with the set {( x1, . . . , xm , c1, . . . , cn−m )} ⊂ R n (where c1, . . . ,
cn−m are fixed constants) is an m-dimensional manifold in R n . For example, the
unit circle S1 ⊂ R 2 is a one-dimensional manifold with coordinates ( θ, r = 1).

(3) An open set O in R n constrained by smooth map f : O → R k (0 < k < n)


is an (n − k)-dimensional smooth manifold in R n . For example, the unit sphere
S2 in R 3 , defined by the set S2 = {( x, y, z ) ∈ R 3 | x2 + y2 + z2 = 1} , is a two-
dimensional manifold in R 3 determined by the condition x2 + y2 + z2 = 1.
(4) The set of all real n × n matrices A, called M(n, R), is an open subset of the vector space of n × n matrices over R in bijective correspondence with R^{n²} (the coordinates x_i of each point being identified with the n² entries A_{ij}), and so is a smooth manifold of dimension n². Similarly, the set M(n, C) of all complex n × n matrices can be viewed as an open subset of the vector space of n × n matrices over C, and hence is a smooth manifold of dimension 2n². □
Definition 3.1 (Lie group). A Lie group is a set that is at the same time a group and a
smooth manifold of finite dimension, such that the group operations (composition of any
two elements and inversion of every element) are smooth maps.
Being a Lie group, the set G is a manifold, a well-behaved topological space
that everywhere locally resembles a Euclidean space of fixed dimension; and also
is a group meaning that the group operations – multiplication m( α, β ) = αβ ∈ G,
and inversion inv( α ) = α−1 ∈ G for all α, β ∈ G – are valid, and such that the
two structures are compatible, that is, m(α, β) and inv(α) are smooth, i.e. C^∞ differentiable, functions of their arguments. (This double requirement is actually equivalent to the single requirement that the map (α, β) ↦ α β^{−1} be smooth for all α, β ∈ G.) A real (or complex) Lie group is a smooth real (or complex) manifold.
The dimension of a Lie group is defined (over R, or over C) as that of its manifold.
Lie groups often appear in descriptions of symmetries of geometrical objects
or physical systems when they act on a manifold. The action G : V → V of
a Lie group G on a vector space V is the assignment to each α ∈ G a smooth
map ( α, v ) 7 → α · v ∈ V, also written v 0 = f ( α; v), called a transformation function.
When defined in this way, the action of G on V is called a group of transformations,
described by { f } . The transformation function inherits all the group properties of
G; e.g., the associativity property of group elements, written as β · ( αv) = ( β · α ) v
for any v ∈ V, finds its expression in
f ( β; f (α; v)) = f (m( β, α ); v). (3.1)
The function f encodes all information on the group G, and in fact reduces to the
usual group product m : G × G → G when V is identified with G.
E XAMPLE 1: Lie group of dimension 1. The set R × = R \{ 0} (all real numbers except
0) with multiplication as the group operation and 1 as the identity element, is a
one-dimensional Lie group. The set of complex numbers with absolute value 1 is
a group (under multiplication) and a (1-dimensional) manifold, and so is a one-
dimensional Lie group (the circle group), isomorphic to the quotient space R/Z.
E XAMPLE 2: Lie group of dimension 2. The set C× = C \{ 0} of non-zero complex
numbers, with complex multiplication as the group operation is a Lie group of
dimension 2 (over R). Let x = x1 + i x2 ∈ C^×. Then the product xy = x1 y1 − x2 y2 + i(x1 y2 + x2 y1) and the inverse x^{−1} = (x1 − i x2)/(x1² + x2²) are both smooth functions of their entries x_i, y_i ∈ R.

E XAMPLE 3: Euclidean vector space. The (smooth, real) manifold R n is also a group
(with vector addition), and is a Lie group of dimension n. Its topological structure is defined by taking all open balls as open sets, and specifying the metric (distance, norm) d(x, y) = |x − y| for any elements x, y of the group.
E XAMPLE 4: The general linear groups GL(n, R) and GL(n, C). We already know that the set of all real n × n matrices M(n, R) can be thought of as R^{n²} equipped with the standard topology; under matrix multiplication it has no group structure. But the subset consisting of all invertible matrices, G = { A ∈ M(n, R) | det A ≠ 0 }, is a group, with matrix multiplication as the group operation, and an inverse for every element. It is an open subset of the smooth manifold R^{n²} (‘open’ because, given an invertible matrix A, there is a neighborhood U_A of A such that every matrix in U_A is also invertible). So G is an n²-dimensional smooth manifold. Moreover, the matrix multiplication m(A, B) = AB with A, B ∈ M(n, R) is given by a smooth (polynomial) function of their entries and so is a smooth map; its restriction to G × G is clearly smooth. As for matrix inversion inv : G → G, the map A ↦ A^{−1} is also, by Cramer’s formula and the continuity of det, a smooth function of the entries of A. So G = GL(n, R) is a Lie group, of dimension n².
The action of G = GL( n, R ) on V = R n is described by the transformation
functions f : G × V → V given by

f i ( A; x) = ∑ j Aij xj ; i = 1, 2, . . . , n; A ∈ GL( n, R ); x, f ∈ V . (3.2)

By similar arguments the set of invertible n × n complex matrices GL(n, C) is a complex Lie group representing the complex symmetry transformations on a complex vector space V = C^n. It may be regarded as a (closed) subgroup of the general linear group GL(2n, R), and a real Lie group of dimension 2n² (understood over R). Note the special cases GL(1, R) = R^× and GL(1, C) ≅ C^×. □
Most Lie groups we meet can be viewed as open subsets of M( n, F ) for F = R
or C; they can be generated as subgroups of GL( n, F ), and are Lie groups by virtue
of this theorem: A subgroup of a Lie group G that is also a manifold in R N , is a Lie
group. This is so because the multiplication map in a subgroup H of the Lie group
G is the restriction m H of the (matrix) multiplication map mG on G, which is itself
smooth. And so, the subgroup H that is also a manifold in R N with compatible
structure is a Lie group.
When the subgroup H is specified by a system of simultaneous equations on
R N that are equivalent or reducible to k independent conditions expressed by a
set of smooth or continuous functions f : O → R k , it is a manifold in R N of dimen-
sion N − k (0 < k < N). Therefore, the above theorem applies to H. Moreover,
the space of solutions is a closed subspace of R N (‘closed’ under limit operations)
from which follows the corollary: A closed subgroup of a Lie group is a Lie group. A
closed subgroup of the general linear group GL ( n, F ) is called a linear group.
E XAMPLE 5: The special linear groups SL( n, R ) and SL( n, C ) are the subgroups
of GL( n, F ) with matrices of determinant 1:

SL( n, F ) = { A ∈ GL( n, F )| det A = 1} .



SL(n, F) is a subgroup because for any A, B ∈ GL(n, F) with determinant 1, one has det(AB) = det A · det B = 1. It is the level set det^{−1}(1) in R^N (N = n² or 2n²) of the smooth (scalar) map det, and so is a smooth manifold. Therefore, with the condition det A = 1, SL(n, F) is a Lie group of dimension n² − 1 for F = R, and 2n² − 2 for F = C. We may equivalently define SL(n, R) as the group of automorphisms R^n ∋ x ↦ Ax ∈ R^n preserving the volume element in the vector space R^n (since dⁿ(Ax) = (det A) dⁿx). Trivial cases: SL(1, C) = SL(1, R) = {1}.
E XAMPLE 6: The orthogonal group O ( n ) is the group of orthogonal matrices:

O ( n ) = O ( n, R ) = { A ∈ M( n, R )| AT A = I } .

Noting ( AB )T = BT AT and AT = A−1 for any matrices A, B ∈ O ( n ), one verifies


that it indeed has the structure of a group. As we have seen for GL( n, F ), the
group multiplication and inversion operations are smooth maps. Moreover, O ( n )
has a manifold structure for the following reasons. The matrix equation AT A = I
corresponds to a system of n2 equations for the n2 real-valued matrix entries, of
which n ( n + 1) /2 are independent for every A ∈ O ( n ). Therefore, the space of
solutions, with constant dimension given by n2 − n ( n + 1) /2 = n ( n − 1) /2, is a
smooth manifold, and so O ( n ) is a Lie group of dimension n ( n − 1) /2.
Subgroups of GL( n, F ) (call them H), including the orthogonal group, may be
interpreted as groups of transformation. They are in fact groups of isometries
f : V → V in some linear space V equipped with a metric g, such that f : x 7 → Ax
preserves the bilinear form B ( x, y ) ≡ h x, gy i, so that B ( Ax, Ay) = B ( x, y) for all
x, y ∈ V and A ∈ H. This invariance condition implies AT gA = g for F = R, and
A† gA = g for F = C. (Here and in the following, we use the notations: A^T is the transpose of A, so that (A^T)_{ij} = A_{ji}; and A† its hermitian adjoint, so that (A†)_{ij} = A*_{ji}, with * denoting complex conjugation.)
With F = R and g = I, the symmetric positive-definite form on V = R n
defined by B ( x, y ) = h x, y i = ∑i xi yi gives us AT A = I, and hence the orthogonal
group O ( n ) (the full rotation group). This set includes all matrices with det A =
+ 1, or − 1. Matrices with det A = 1 are connected to the identity and form the
special orthogonal group (proper rotations), also of dimension n ( n − 1) /2,

SO( n ) = SO( n, R ) = { A ∈ O ( n )| det A = 1} = O ( n ) ∩ SL( n, R ).

Special cases: O(1) = {1, −1} ≅ S⁰, while SO(1) = {1} is a single point.
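A sketch manufacturing an element of O(n) numerically; QR factorization plays the role of Gram-Schmidt orthonormalization, and the seed is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
# orthonormalize the columns of a random (almost surely invertible) matrix
A = rng.normal(size=(n, n))
Q, _ = np.linalg.qr(A)

assert np.allclose(Q.T @ Q, np.eye(n))          # defining condition of O(n)
assert np.isclose(abs(np.linalg.det(Q)), 1.0)   # hence det Q = +1 or -1

# A^T A = I is a symmetric-matrix condition: only n(n+1)/2 of the n^2
# equations are independent, leaving dim O(n) = n^2 - n(n+1)/2 = n(n-1)/2
print(n * (n - 1) // 2)   # 6 for n = 4
```

If det Q = −1, multiplying one column by −1 produces an element of SO(n).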
E XAMPLE 7: The corresponding complex case is the unitary group U(n):

U(n) = U(n, C) = { A ∈ M(n, C) | A† A = I } .

The total number of (real) coordinates for every A ∈ U ( n ) is N = 2 n2 subjected


to k = n + 2 ( n2 − n ) /2 = n2 independent conditions, so the space of solutions
for A† A = I, and thus the manifold, has dimension N − k = 2 n2 − n2 = n2. The
group U ( n ) is a Lie group of dimension n2.
For any A ∈ U(n) one has |det A| = 1, and so det A = e^{iθ} for some real θ. The set of unitary matrices with determinant one is the special unitary group SU(n).
It is a Lie group of dimension n2 − 1, one less than that of U ( n ). The difference


in dimension of SO ( n ) relative to O ( n ) and of SU( n ) relative to U ( n ) reflects the
difference in the determinant mapping in the real (det O → ± 1) and complex
(det U → S1) cases.
The unitary group may also be viewed as the group of linear isometries in C^n, or of complex linear transformations preserving the hermitian (sesquilinear) form ⟨Ax, Ay⟩ = ⟨x, y⟩ = ∑_i x*_i y_i for all x, y in C^n. Operators that
perform such transformations satisfy the condition A† A = I, or in components,
∑i A∗ij Aik = δjk , so that we have det( A† A ) = det I, hence | det A |2 = 1, or
| det A | = 1 (in words, det A is a complex number of modulus 1). When we
restrict U ( n ) to the subset of volume-preserving elements A so that det A = 1,
we get a subgroup equal to U ( n ) ∩ SL( n, C ), which is the special unitary group
SU( n ). For n = 1, we have U(1) ∼ = S1 (continuous) and SU(1) = { 1} (discrete).
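The same numerical check in the complex case; rescaling one column by det(Q)^{−1} (one convenient choice among many) lands the matrix in SU(n):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
# a random complex matrix, unitarized by QR factorization
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
Q, _ = np.linalg.qr(A)

assert np.allclose(Q.conj().T @ Q, np.eye(n))   # A† A = I : Q is in U(n)
assert np.isclose(abs(np.linalg.det(Q)), 1.0)   # det Q = e^{i theta}

# dividing one column by det(Q) (a phase) preserves unitarity and sets det = 1
S = Q.copy()
S[:, 0] /= np.linalg.det(Q)
assert np.isclose(np.linalg.det(S), 1.0)
assert np.allclose(S.conj().T @ S, np.eye(n))   # S is in SU(n)
```

Note the contrast with the real case: there det Q is one of two discrete values ±1, here it sweeps the whole circle S¹.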
E XAMPLE 8: The symplectic group Sp(n, F) is the isometry group on a space of dimension 2n over F equipped with a metric g that is not positive-definite, for example a skew-symmetric metric (g_{ij} = −g_{ji}). For definiteness one may choose for g the non-degenerate antisymmetric matrix

J = [  0_n   I_n ]      (J^T = −J,  J² = −I_{2n})
    [ −I_n   0_n ]

(J_{i,n+i} = +1 and J_{n+i,i} = −1 for 1 ≤ i ≤ n; all other entries are 0), where


In is the n × n identity matrix, and 0n the n × n zero matrix. Isometry means
invariance of the bilinear form, i.e. BJ ( Ax, Ay) = BJ ( x, y ) for x, y ∈ F2n , so that
Sp( n, F ) can also be described as consisting of the 2n × 2n matrices A that satisfy
AT JA = J:
Sp( n, F ) = { A ∈ M(2n, F )| AT JA = J } .
Arguments similar to those for O(n) show that Sp(n, F) is a Lie group. Its dimension is given by N − k, where N = (2n)² is the number of entries in any A ∈ Sp(n, F), and k = n(2n − 1) is the number of independent conditions among them (A^T JA is, like J, automatically antisymmetric). So the space of solutions (the manifold of the group) has dimension 4n² − n(2n − 1) = n(2n + 1), which is also the dimension of Sp(n, F). From A^T JA = J, one has (det A)² det J = det J, or det A = ±1 for all A ∈ Sp(n, F); and so Sp(n, F) is a subgroup of GL(2n, F).
Arguments (based on properties of skew-symmetric matrices and the Pfaffians)
show that actually det A = + 1 for all A ∈ Sp ( n, F ) (one can easily see this for
n = 1). So Sp ( n, F ) preserves both the bilinear form and the volume element in
a space with skew-symmetric metric, and is in fact a subgroup of SL(2n, F).
Finally, the compact symplectic group, denoted Sp( n ) or USp( n ), is the Lie
group of all the 2n × 2n complex matrices that are simultaneously symplectic and
unitary: Sp( n ) = U(2n ) ∩ Sp( n, C ) = { A ∈ U(2n ) | AT JA = J } . One can verify
that Sp(1, R ) = SL(2, R ), Sp(1, C ) = SL(2, C ), and Sp(1) = SU(2).
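For n = 1 the identity A^T J A = (det A) J, valid for every 2 × 2 matrix, makes the equality Sp(1, R) = SL(2, R) easy to verify numerically; a sketch:

```python
import numpy as np

# the n = 1 symplectic metric
J = np.array([[0.0, 1.0], [-1.0, 0.0]])

rng = np.random.default_rng(2)
for _ in range(100):
    A = rng.normal(size=(2, 2))
    # for any 2x2 matrix, A^T J A = (det A) J,
    # so A is symplectic exactly when det A = 1
    assert np.allclose(A.T @ J @ A, np.linalg.det(A) * J)
```

The identity follows by direct expansion of A^T J A for A = [[a, b], [c, d]], whose only nonzero entries are ±(ad − bc).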
E XAMPLE 9: Generalized orthogonal and Lorentz groups. Let space V = R n+m
(with positive integers n and m) be defined with a symmetric, indefinite metric g, such that it has n positive and m negative eigenvalues. Pick as a basis
for R n+m the corresponding orthonormal eigenvectors, then g is diagonal, that
is, g = diag [1, . . . , 1; − 1, . . . , − 1], with 1 in the first n diagonal entries and − 1
in the last m, and the bilinear form Bg ( x, y ) = h x, gy in,m on R n+m is given by
h x, gy in,m = x1 y1 + · · · + xn yn − xn+1 yn+1 − · · · − xn+m yn+m . The set of all ( n +
m ) × ( n + m ) real matrices A that preserve this bilinear form Bg ( x, y ) is called
the generalized orthogonal group. This form-invariance condition is written as
AT gA = g (which implies det A = ± 1 among other things), so that the general-
ized orthogonal group is defined by O ( n; m) = { A ∈ GL( n + m, R )| AT gA = g} .
The group SO ( n; m ) is the subset of matrices in O ( n; m ) with det A = 1.
The case O (3; 1) is of particular interest in physics, where it is known as the
Lorentz group. If we write ( x, y, z; t ) for any (space-time) vector v in R 3+1 and
the metric matrix g3,1 = diag [1, 1, 1; − 1], then the Lorentz group is the group of
transformations on R^{3+1} that leave the inner product ⟨v, v⟩3,1 = x² + y² + z² − t²
invariant, or equivalently the group of all 4 × 4 real matrices A that satisfy the
condition AT g3,1 A = g3,1 .
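A numerical check (ours; the rapidity value is an arbitrary choice for the example): a pure boost along x satisfies the defining condition of the Lorentz group.

```python
import math

# Sketch (not from the text): a boost of rapidity w along x in O(3; 1)
# preserves the metric g = diag(1, 1, 1, -1): Lambda^T g Lambda = g.
w = 0.7
ch, sh = math.cosh(w), math.sinh(w)
Lam = [[ch, 0.0, 0.0, sh],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [sh, 0.0, 0.0, ch]]
g = [[float(i == j) for j in range(4)] for i in range(4)]
g[3][3] = -1.0

# (Lambda^T g Lambda)_{ij} = sum_{k,l} Lambda_{ki} g_{kl} Lambda_{lj}
prod = [[sum(Lam[k][i] * g[k][l] * Lam[l][j] for k in range(4) for l in range(4))
         for j in range(4)] for i in range(4)]
assert all(abs(prod[i][j] - g[i][j]) < 1e-12 for i in range(4) for j in range(4))
```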
E XAMPLE 10: The Euclidean group E( n ) is the set of bijections of R n that preserve
Euclidean distance. To each element of E( n ) corresponds a map f : R n → R n such
that f ( α; x) = Ax + a where x ∈ R n and α = ( A, a ), with a real invertible n × n
matrix A for a linear map (‘rotation’), and constant a ∈ R n for a non-linear map
(‘space displacement’). (A transformation of the form f ( x) = Ax + a is called
‘affine’.) An α map following a β map (where β = (B, b) and x′ = f(β; x)) is given by

f(α; x′) = f(α; f(β; x)) = A(Bx + b) + a = f(α · β; x),

from which we can deduce both the product m(α, β) = α · β = (AB, Ab + a) and
the inverse inv(α) = (A⁻¹, −A⁻¹a). Hence the bijective maps f form a group acting
on R^n with the given metric topology, the affine group of R^n. Such a map is an
isometry (leaving distances invariant) if and only if AᵀA = I, i.e. iff A ∈ O(n);
the isometries form precisely the Euclidean group E(n). There is a one-to-one correspondence between any
element α = ( A, a ) ∈ E( n ) and the following ( n + 1) × ( n + 1) matrix
 
[ A  a ]
[ 0  1 ]          ( A ∈ O(n), a ∈ R^n ) ,

where 1 is a scalar and 0 is an n-component row vector. It operates on column
vectors of the form (x; 1) with x ∈ R^n. So the Euclidean group E(n) can be viewed
as a subgroup of GL(n + 1, R), and it is then possible to build charts for E(n) out
of any atlas for O(n), showing that E(n) is a manifold; as usual, matrix multiplication
is smooth. As a manifold it is diffeomorphic to O(n) × R^n (R^n being here a Lie
group; as a group, E(n) is the semidirect product of the two), and so has dimension
n(n − 1)/2 + n = n(n + 1)/2. So E(n) is a Lie group of
dimension n(n + 1)/2; in particular, E(3) has dimension 6.
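The (n+1) × (n+1) matrix realization can be tested numerically. The sketch below (ours, not the book's; all values are arbitrary) embeds two elements of E(2) and checks that matrix multiplication reproduces the composition law (A, a)(B, b) = (AB, Ab + a).

```python
import math

# Sketch (ours): represent (A, a) in E(2) by the 3x3 matrix [[A, a], [0, 1]]
# and check that matrix multiplication gives the product (AB, Ab + a).

def embed(A, a):
    return [[A[0][0], A[0][1], a[0]],
            [A[1][0], A[1][1], a[1]],
            [0.0, 0.0, 1.0]]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rot(t):
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

A, a = rot(0.3), [1.0, 2.0]
B, b = rot(1.1), [-0.5, 0.7]

lhs = matmul(embed(A, a), embed(B, b))
AB = matmul(A, B)
Ab_plus_a = [A[0][0] * b[0] + A[0][1] * b[1] + a[0],
             A[1][0] * b[0] + A[1][1] * b[1] + a[1]]
rhs = embed(AB, Ab_plus_a)
assert all(abs(lhs[i][j] - rhs[i][j]) < 1e-12 for i in range(3) for j in range(3))
```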
E XAMPLE 11: The Poincaré group P ( n; 1) is the group of isometries of (Lorentz)
space L = R n+1 equipped with the Lorentz metric g = diag[1, . . . , 1; − 1] and
the bilinear form Bg ( x, y ) = h x, gy i for any x, y ∈ L. An isometry on R n+1 is
represented by the map f : x 7 → x0 specified by x0 = f ( α; x ) = Λx + a with
Λ ∈ O ( n; 1) (‘space-time’ rotation) and a ∈ R n+1 (‘space-time’ displacement).
The Poincaré group is analogous in every respect to the Euclidean group, except for
the metric of the space on which it is applied. So just as for E( n ), each element
( Λ, a ) ∈ P ( n; 1) uniquely corresponds to an ( n + 2) × ( n + 2) matrix of the form
 
[ Λ  a ]
[ 0  1 ]          ( Λ ∈ O(n; 1), a ∈ R^{n+1} ),

where 1 is a scalar and 0 = (0, . . . , 0) is an (n + 1)-component row vector. Arguments
similar to those for E(n) show that P(n; 1) is a manifold diffeomorphic to
O(n; 1) × R^{n+1}, and so is a Lie group of dimension (n + 1)(n + 2)/2. In
particular, the Poincaré group P (3; 1) of ordinary space-time is ten-dimensional.
E XAMPLE 12: The Heisenberg group H( n ) is the subgroup of GL( n, R ) consisting
of matrices of the form In + Tn, where Tn is a strictly upper-triangular real n × n
matrix (note that Tn is nilpotent: Tnⁿ = 0). It is diffeomorphic as a manifold to R^{n(n−1)/2}, and since matrix
multiplication is smooth, it is a Lie group. For n = 3, a typical element of H(3) is
 
[ 1  a  c ]
[ 0  1  b ] = I3 + aX + bY + cZ          ( a, b, c ∈ R; X, Y, Z ∈ {T3} ) .
[ 0  0  1 ]

The matrices X = E12, Y = E23 and Z = E13, where ( Eij )αβ = δiα δjβ , satisfy
Heisenberg’s commutation relations [ X, Y ] = Z, [ X, Z] = 0, [Y, Z] = 0 (hence
the group’s name). H(3) can also be thought of as the subgroup of symmetries of
space of ‘wave functions’ under the transformation f ( x) 7 → e2πi(bx+c) f ( x + a ) for
any square-integrable f . 
C OMMENTS . (a) The classical Lie groups are normally understood to be the groups
GL( n, F ), SL( n, F ), O ( n, F ), SO ( n, F ), SO ( p, q; R ), U ( n ), SU( n ), and Sp( n, F ), with
F = R, or C. (b) Some groups similar in structure to Lie groups do not have all of
the required attributes. Some are totally disconnected groups: the Galois group
of an infinite extension of fields, for example, has an underlying space not locally
isomorphic to R n , and so is not a Lie group. Others have infinite-dimensional
manifolds and for this reason are not Lie groups: gauge groups smoothly map-
ping a manifold to a group, groups of diffeomorphisms with Virasoro algebras, or
loop groups with Kac-Moody algebras are examples of great interest to physics.

3.2 Global Properties


A Lie group, being both a manifold and a group, has properties inherited from
both structures. We discuss in this section topological properties (compactness,
connectedness, and simple connectedness) which have important implications in
Lie algebras and the representation theory.

3.2.1 Compactness
Criteria for compactness of a subspace X take different forms depending on the
space it is in. Thus, a subset X of a topological space is said to be compact if and
only if it is Hausdorff (distinct points on it have disjoint neighborhoods; see Appendix C)
and if every cover of X by open sets contains
a finite subcover (Borel–Lebesgue theorem). On the other hand, if the notion of
distance is available (in metrizable spaces) one may consider sequential compact-
ness (or, equivalently, simply compactness): A subset X of a metric space is said
to be compact if every sequence of elements of X has a subsequence that
converges to an element of X. All this finally reduces to a
simple criterion for Euclidean space by the Bolzano-Weierstrass and Heine-Borel
theorems: A subset of a finite-dimensional Euclidean space R n is compact if and only
if it is closed and bounded. (A reminder: In a metric space, a subset X is said to be
closed if sequences in X that converge do so within X; it is said to be bounded if it
is contained in a ball of finite radius.) Details can be found in [Fa] Ch. 3.
Most Lie groups we deal with are subgroups of GL ( n, F ) and so can be viewed
as subsets of a Euclidean space R^m for some integer m. When such a subgroup is
specified by algebraic conditions expressed by continuous functions, it is a closed subgroup (cf.
p. 76), and so satisfies one of the two conditions for compactness. It remains to
verify just boundedness.
C OMMENTS . Compactness has important implications in the representation
theory of Lie groups. Many results in the representation theory of finite groups
are established by averaging group functions over the given finite group. For a
continuous set X, the notion of averaging is now replaced by a continuous linear
functional, integration with respect to a measure, µ(X) = ∫X dµ. The key result is that if a group
G is compact, there exists a unique measure µ on G (the Haar measure) that is invariant
(meaning ∫G (g f) dµ = ∫G f dµ for every g ∈ G) and normalized to one (meaning ∫G 1 dµ = 1).
With this key fact, one can extend representation theory from finite groups to
compact Lie groups on some space V, for example by designing a well-defined
inner product on V with respect to which one constructs a unitary representation
of G in V. Just as for any finite group, every finite-dimensional, real or complex
representation of a compact group is completely reducible, i.e., expressible as a
direct sum of simple (i.e. irreducible) representations (see Sec. 3.6).
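For SO(2) ≅ U(1) the invariant measure is simply dθ/2π, and the invariance and normalization properties can be illustrated numerically (our sketch; the sample function and group element are arbitrary choices):

```python
import math

# Sketch (ours): on the compact group SO(2), the normalized invariant
# measure is d(theta)/2pi; the group average of a function is unchanged
# by left translation g -> hg, and the measure of the whole group is 1.

N = 10000
def average(f):
    # Riemann sum over a uniform grid on [0, 2pi), normalized
    return sum(f(2 * math.pi * k / N) for k in range(N)) / N

f = lambda t: math.cos(t) ** 2 + 0.3 * math.sin(3 * t)
shift = 1.234  # a fixed group element h, acting by theta -> theta + shift

avg = average(f)
avg_shifted = average(lambda t: f(t + shift))
assert abs(avg - avg_shifted) < 1e-9          # invariance
assert abs(average(lambda t: 1.0) - 1.0) < 1e-12   # normalization
```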
E XAMPLE 13: R n , R × , C× , and Cn are closed but not bounded, hence are not
compact. On the other hand, the (n − 1)-sphere S^{n−1} = { x ∈ R^n : |x| = 1 } is
closed (as the preimage of {1} under the continuous map x ↦ |x|) and bounded
(by |x| = 1), hence is compact.
E XAMPLE 14: We already know that O ( n ) and SO( n ) are closed; they are also
bounded because if A is an orthogonal matrix, then | Aij | ≤ 1 for all 1 ≤ i, j ≤ n.
So O ( n ) and SO ( n ) are compact groups. Similarly, U ( n ) and SU( n ) are also com-
pact. The group Sp ( n ), being a closed subgroup of U ( n ), is necessarily compact.
E XAMPLE 15: We know that GL( n, F ) are not closed (being M( n, F ) minus ele-
ments with det A = 0), and neither are they bounded, and so they are not com-
pact. All other examples of Lie subgroups of GL( n, R ) or GL( n, C ) are closed,
but violate the boundedness condition, and are non-compact. For example, the
n × n matrices Ak = diag[ k, 1/k, 1, . . . , 1] have determinant one for any k and are
unbounded. It follows that the groups SL ( n, R ) and SL ( n, C ) for n > 1 are not
compact. Similarly, the following groups are not bounded, and so are not com-
pact: O ( n, C ), SO( n, C ), Sp ( n, R ), Sp ( n, C ), O ( n, m ), SO( n, m ), E( n ), and P ( n, 1).
3.2.2 Connectedness
A space M is said to be disconnected if it can be partitioned into disjoint non-empty
open sets. M is said to be connected if it is not disconnected. There is a closely related
concept: M is said to be pathwise connected if given any two points x and y in M
we can find a continuous path entirely in M joining x to y, i.e. a continuous map
π ( t) in M, 0 ≤ t ≤ 1, with π (0) = x and π (1) = y. If M is locally Euclidean,
as is the case of all the groups we will study, the two concepts, ‘connected’ and
‘pathwise connected’, may be considered the same.
A disconnected Lie group G can be uniquely partitioned into disjoint sets,
called components, such that any two elements of the same component can be
joined by a continuous path, while two elements of different components cannot.
The identity component, called G0, which contains the identity I, plays a crucial
role. We can verify that G0 is closed under multiplication and inversion as fol-
lows. For any two elements g, h of G0, there exist continuous paths entirely in G0
called α ( t) and β ( t ), 0 ≤ t ≤ 1, with α (0) = β (0) = I, α (1) = g, and β (1) = h.
Then α ( t) β (t) is a continuous path lying in G0 joining I to gh, showing that the
product of two elements of G0 is again in G0 . In addition, α ( t)−1 is a continuous
path from I to g−1 and, as G0 is connected, the inverse of an element of G0 is
again in G0. So the connected component containing the identity is a subgroup of
G. In fact, G0 is a normal Lie subgroup. To see this, we need to show that if g ∈ G0
and h ∈ G0, then ghg−1 ∈ G0 . Let α : [0, 1] → G0, with α (0) = I and α (1) = h.
Then the conjugation of α ( t) by g is a continuous map that joins gα (0) g−1 = I to
gα (1) g−1. As G0 is connected to I, the end point ghg−1 must be in G0. In other
words, G0 is invariant under conjugation. Finally, as G0 is a normal subgroup of G,
the set consisting of G0 and its cosets forms a discrete group, which is the quotient
group G/G0. (One often defines the group of components, called π0, of G such that
its order gives the number of connected components. The group G is connected if
and only if π0 ( G ) is trivial, i.e. π0 ( G ) = { 1} , or | π0| = 1.) In summary,
• Let G be a Lie group, and G0 its connected component containing the identity.
Then G0 is a normal subgroup and a Lie group; the quotient group G/G0 is a 0-
dimensional Lie group isomorphic to a discrete group. A study of Lie groups re-
duces to studies of their connected components and their discrete groups.
Now, the connected component can be built up by taking multiple products
of elements near the identity. Let G be a connected Lie group, and U ⊂ G an open
neighborhood of the identity. Assume that U −1 = U (replacing U with U ∩ U −1
if necessary), and define H = ∪n≥1U n = U ∪ U 2 ∪ U 3 · · · (where U = { gi } ,
U 2 = { gi gj } , etc.). H is a subgroup of G (since U n U m ⊆ U n+m ). H is open in
G since it is the union of open sets. Moreover, if for every g ∈ G but g 6 ∈ H,
then gH 6 ⊂ H (since otherwise if h1, h2 ∈ H are such that gh1 = h2, then g =
h2 h− 1
1 ∈ H, in contradiction). Thus H is the complement of the union of all cosets
not containing H, so it is closed. Since H is a non-empty, open and closed subset
of G, and since G is connected, we have G = H. Therefore,
• If G is a connected Lie group and U an open neighborhood of the identity, then U
generates the whole G. This result shows the important role of U in the local
properties of a connected group.
E XAMPLE 16: A matrix of the form A = [z], where z ∈ C×, is a 1 × 1 invertible
matrix. It is evident that given any two such matrices A and B (two non-zero
complex numbers), we can always find a continuous path joining them and not
passing through zero. Hence, GL(1, C ) is connected. If A and B are any two
invertible complex matrices in GL( n, C ), with n ≥ 1 it can be argued that one
can find a continuous path M ( t ) ∈ GL( n, C ), 0 ≤ t ≤ 1 with M (0) = A and
M (1) = B, such that the complex number det M ( t ) does not pass through zero.
It follows that GL( n, C ) is connected. Similarly, SL ( n, C ) is connected.
E XAMPLE 17: The sphere Sn is connected for each n ≥ 1. It follows from the
connectedness of U(1) ≅ S¹, SO(1) = SU(1) = {I}, and Sp(1) ≅ S³ that the
compact groups U ( n ), SU ( n ), SO ( n ), and Sp ( n ) are also connected for each n ≥ 1.
E XAMPLE 18: R × is not connected, because a real positive point and a real neg-
ative point cannot be joined by a continuous real-valued path unless it passes
through zero, which is excluded from R × . But the set of positive reals and the set
of negative reals are separately connected. By analogy, GL(1, R ) is not connected,
but has two disjoint connected components, namely, GL(1, R )+, the set of positive
real numbers (including the identity I = 1), and GL(1, R )−, the set of negative
numbers. In general, GL( n, R ) cannot be connected, because if A, B ∈ GL ( n, R )
with det A > 0 and det B < 0, any continuous path M ( t ) in GL ( n, R ) must have
the real number det M ( t ) passing through zero, which is excluded from GL( n, R ).
Hence GL( n, R ) is not connected: it has two components, one, GL( n, R )+, con-
taining n × n real matrices of positive determinant and connected to the identity;
and the other, GL(n, R)−, containing n × n real matrices of negative determinant.
(One summarizes this result by writing π0(GL(n, R)) = Z2.)
E XAMPLE 19: O (1) = {+ 1, −1} is disconnected, with two components. SO (1) is
trivially connected. More generally, O ( n ) is not connected because two matrices
A, B ∈ O ( n ) with det A = 1 and det B = − 1 cannot be joined by a continuous
real-valued path M ( t ) ∈ O ( n ) with a determinant not passing through zero. The
component consisting of all matrices of determinant + 1 is just SO( n ), whereas
the component consisting of all matrices of determinant −1 is a coset of SO(n).
(Thus π0(O(n)) = Z2, and the same holds for O(n, C); but π0 is trivial for both
SO(n) and SO(n, C).) However, SO(n; m) is not connected in general.
E XAMPLE 20: The group G of transformations f(α; x) = ±x + a has two components:
H, consisting of the transformations x′ = x + a, which can join x to any point
of R by continuously varying a; and the set of transformations x′ = −x + a, which
cannot reach the identity continuously: they are products of elements of H with
the inversion ι : x′ = −x. Thus G = H + Hι. Similarly, the Euclidean group E(n)
for n ≥ 1 has two disjoint components, while the Poincaré group P(n; 1) has four. 

3.2.3 Simply-Connectedness
A Lie group G is said to be simply-connected if every closed curve in G can be continu-
ously varied so that it contracts to a point in G. To show that G is simply-connected,
one verifies that for every closed loop lying in G defined by α ( t), 0 ≤ t ≤ 1,
α (0) = α (1), there exists a family of continuous curves α ( s, t), 0 ≤ s, t ≤ 1, lying
in G, such that (i) The curve is a closed curve: α ( s, 0) = α ( s, 1) for each value
of s; (ii) The curve α (0, t) coincides with the given loop α ( t ) for all t; (iii) When
s = 1, the curve contracts to a point: α (1, t) = α (1, 0) for all t. (Whenever G is
connected, simply-connectedness informs us on its ‘shape’ and on the presence
or absence of ‘holes’ in space G. This being the case, one finds it useful to intro-
duce the fundamental group π1 of G, such that G is simply-connected if and only if
π1 is trivial, π1 ( G ) = { 1} .)
E XAMPLE 21: In Euclidean space R n , let r ( t) with t ∈ [0, 1] and r (0) = r (1) = 0,
define a loop passing through the origin. Then we can define the continuous map
p ( s, t) = (1 − s)r(t), with 0 ≤ s ≤ 1, that represents a family of closed curves that,
as s varies, deform the loop continuously from r ( t) for s = 0 to a point r = 0 for
s = 1. So, R n is simply-connected.
E XAMPLE 22: In R 2, the points given by eiφ , with real φ, define the circle S1. With
0 ≤ t ≤ 1 consider the following curves in S1: (1) φ = t (1 − t) defines a curve that
can be deformed to a point; (2) φ = 2πt goes around the circle once and cannot
be deformed continuously to a point; (3) φ = 2πmt, with integer m > 0, define
a family of closed curves of m loops that cannot be deformed into one another.
So S¹ is not simply-connected; it is infinitely connected (i.e. π1 = Z). However,
Sn−1 = { x ∈ R n || x| = 1} for any n > 2 is simply-connected (π1 = { 1} ). It
follows that SU (2), a connected Lie group topologically equivalent to S3, is also
simply-connected.
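The number of loops (the winding number, i.e. the class in π1(S¹) = Z) can be read off numerically from a parametrized curve; a sketch (ours, with a simple increment-wrapping scheme that mimics observing only the point on the circle):

```python
import math

# Sketch (ours): the winding number of a closed curve t -> exp(i*phi(t)),
# 0 <= t <= 1, is the total angle swept divided by 2*pi. The loops
# phi = 2*pi*m*t wind m times; phi = t*(1 - t) winds zero times.

def winding(phi, N=1000):
    total = 0.0
    for k in range(N):
        d = phi((k + 1) / N) - phi(k / N)
        # wrap each increment into (-pi, pi], as if only e^{i phi} were visible
        total += math.atan2(math.sin(d), math.cos(d))
    return round(total / (2 * math.pi))

assert winding(lambda t: 2 * math.pi * 3 * t) == 3
assert winding(lambda t: t * (1 - t)) == 0
```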
E XAMPLE 23: Let R+ be the additive group of real numbers. It is non-compact,
simply-connected, and contains the discrete invariant subgroups {na} = {0, ±a,
±2a, . . .} (one for each real a > 0). The circle S¹ is then isomorphic to R+/{na} and multiply
connected. If S1 is associated with the rotation group SO (2), all rotations through
θ + 2πn are identified. On the segment [0, 2π ) of R + , with 2π identified with 0,
SO (2) is single-valued. 
C OMMENTS . The importance of the topic discussed here arises from the following
facts. If the group manifold is simply-connected, every continuous function on
the group is single-valued, and there exists a 1-to-1 correspondence between the
representations of G and those of its Lie algebra (introduced later). On the other
hand, if the group manifold is multiply-connected, then m-valued functions can
exist, and we expect to have m-valued representations. The difficulty in study-
ing multiple-valued representations is resolved by the fact that for any multiply
connected group G, there exists a simply-connected group G̃, called the universal
covering group of G, such that G̃ can be mapped onto G by a homomorphism; i.e., G
is isomorphic to the quotient group G̃/N, where N is an invariant subgroup of G̃.
Every irreducible representation (Definition 3.8) of G, whether single- or multiple-valued,
is a single-valued representation of G̃, and so one needs to study just G̃.
In Table 3.1 we give a summary of the global properties (compactness, con-
nectedness, simply-connectedness) of some of the Lie groups discussed in this
chapter. The real Lie groups are listed first, followed by the complex Lie groups.
In each row we indicate, in order, the name of a group, its compactness (CMP
(Yes or No)), connectedness (CNN), simply-connectedness (SCNN), and finally the
fundamental group π1 (G) for connected groups (Y). The order of the component
group |π0 | (the number of components) is also given; simply-connectedness and
π1 (G) are given whenever the group is connected. In the last row, one finds the
pages where more information relevant to each column can be found. In the trivial
cases, there are possibly exceptions to the displayed general rules: GL(1, R) = R× ,
GL(1, C) = C× , SL(1, R) = SL(1, C) = {1}, and O(1) = SO(1) = SU(1) = {1}.
Note that U(n), SU(n), and Sp(n) are real Lie groups.

Table 3.1: Global properties of some Lie groups discussed in this chapter.

Group G CMP CNN (| π0( G )|) SCNN π1 ( G )


GL( n, R ) N N (2) −
SL ( n, R ) N Y (1) N Z ( n = 2 ) , Z2 ( n > 2 )
O( n ) Y N (2) −
SO ( n ) Y Y (1) N Z ( n = 2 ) , Z2 ( n > 2 )
U( n ) Y Y (1) N Z
SU ( n ) Y Y (1) Y {1}
Sp ( n, R ) N Y (1) N Z
Sp ( n ) Y Y (1) Y {1}
O ( n; 1) N N (4) −
SO ( n; 1) N N (2) −
E( n ) N N (2) −
P ( n; 1) N N (4) −
GL( n, C ) N Y (1) N Z
SL ( n, C ) N Y (1) Y {1}
O ( n, C ) N N (2) −
SO ( n, C ) N Y (1) N Z ( n = 2 ) , Z2 ( n > 2 )
Sp ( n, C ) N Y (1) Y {1}
pp. 76-80 p. 80 p. 82 p. 83 p. 84

3.3 Local Properties


In this section, we use a typical group of continuous transformation to illustrate
some general results of the last section concerning connected Lie groups near the
identity and to introduce further properties, leading in particular to the concepts
of infinitesimal operator and Lie algebra. See also [Gi], [Ham], [Ja], [Sa].
Infinitesimal operators. Let G be a connected Lie group (or, if not connected, its
identity component) of dimension η, and V a real vector space of dimension n.
Then, define the associated group of transformation by the map f : G × V → V
or, equivalently, the function f ( α; x ) ∈ V analytic in both arguments α ∈ G and
x ∈ V, where α = { αµ } ∈ R η and x = { xi } ∈ R n are their coordinates.
An arbitrary function F0(q) of the point q ∈ V takes the form F0(q) = F(x) in
a reference frame S, where q has coordinates xi(q) = xi. In another frame S′,
obtained from S by a symmetry operation α ∈ G, q has coordinates x′i(q) = x′i,
so that F0(q) = F′(x′), with a different functional dependence, F′ ≠ F. We now
analyze this difference. By definition x′ = f(α; x), or inversely x = f(ᾱ; x′) where
ᾱ = α⁻¹, so we have

F′(x′) = F(x) = F[ f(ᾱ; x′)] .   (3.3)
Assuming the identity corresponds to α = 0 (after re-parameterization if necessary),
transformations close to the identity are specified by αµ = δαµ (so that inversely
ᾱµ = −δαµ), with δαµ very small. Analyticity of f in α means that x continuously
shifts to x′ by small steps such that

xj = fj[−δα; x′] ≈ x′j − δαµ ∂fj(β; x′)/∂βµ |β=0 ,   (3.4)

where here and in the following summation over repeated indices is implicit.
Substituting this expression in (3.3) we obtain by expansion

F′(x′) ≈ F(x′) − δαµ ∂fj(β; x′)/∂βµ |β=0 ∂F(x′)/∂x′j .   (3.5)

With the notations



uµj(x) := ∂fj(β; x)/∂βµ |β=0 ,   µ ∈ Nη , j ∈ Nn ,   (3.6)

Xµ(x) := −uµj(x) ∂/∂xj = − ∂fj(β; x)/∂βµ |β=0 ∂/∂xj ,   (3.7)

the change in the form of F by an infinitesimal δα : F → F′ is given by

F′(x) = [1 + δαµ Xµ(x)] F(x) ,   (3.8)

where we have replaced x′ with x as arbitrary coordinates. (When F(x) is just
x, this equation becomes simply x′ = x + δαµ Xµ x.) A finite transformation
α : F → F′ is obtained by repeated applications of the infinitesimal transformations.
The action of G on V is completely determined by the operators Xµ(x) in
V, which are called the generators of the infinitesimal transformations in V, or
simply generators of the Lie group of transformation in V.
The corresponding results for a general connected Lie group G are obtained
by considering the action of G on itself through mapping φ : G × G → G (as
indicated in p. 75 following Eq. (3.1)), so that in analogy with (3.7) we have the
generators of G as differential operators in space G:

Xµ(α) = − ∂φλ(β, α)/∂βµ |β=0 ∂/∂αλ = − Θµλ(α) ∂λ ,   (3.9)

Θµλ(α) := ∂φλ(β, α)/∂βµ |β=0 ,   ∂λ = ∂/∂αλ .   (3.10)
As the map φ : G × G → G may be viewed as a nonsingular change of basis of
G, we must have det Θ(α) ≠ 0. It follows that the η operators Xµ are linearly
independent; they represent vectors tangent to G at the identity. They
span an η-dimensional linear vector space, T1 G, called the tangent space of G at
the identity. In matrix groups, generators can be determined in a similar way.
Thus, if M( α ) is a matrix in the matrix Lie group G acting on vector space V, then
near the identity M(0) = I, we have M( α )v ≈ [I + δαµ Mµ ] v for any v ∈ V (cf.
(3.8)), and Mµ = ( ∂M/∂αµ)|0 span the corresponding tangent space.
E XAMPLE 24: For the Euclidean group G = E(1), the map f : G × R → R is given
by α ◦ x = f(A, a; x) = Ax + a, with A, a real parameters and A ≠ 0. Upon
redefining A = e^{α1} and a = α2, and calling α = (α1, α2), the identity is (0, 0)
and the composition law is φ(β, α) = (β1 + α1, e^{β1} α2 + β2). The generators of the
group of transformation E(1) in R are then:

X1(x) = − ∂f/∂α1 |α=0 ∂/∂x = −x ∂/∂x ,   X2(x) = − ∂f/∂α2 |α=0 ∂/∂x = − ∂/∂x ;

but in the corresponding Lie group they take the form



X1(α) = − ∂φ1/∂β1 |0 ∂/∂α1 − ∂φ2/∂β1 |0 ∂/∂α2 = − ∂/∂α1 − α2 ∂/∂α2 ,

X2(α) = − ∂φ1/∂β2 |0 ∂/∂α1 − ∂φ2/∂β2 |0 ∂/∂α2 = − ∂/∂α2 .

In either case, we obtain X1 X2 − X2 X1 = X2 by differentiation. The vector space


spanned by X1 and X2 together with this special product defines an algebra asso-
ciated with the group E(1).
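The relation X1X2 − X2X1 = X2 can also be checked numerically by finite differences (a sketch of ours; the test function and sample point are arbitrary):

```python
import math

# Sketch (ours): for the E(1) generators X1 = -x d/dx and X2 = -d/dx
# acting on functions of x, check numerically that (X1 X2 - X2 X1) f = X2 f.

h = 1e-4

def d(f):  # central-difference derivative, returned as a new function
    return lambda x: (f(x + h) - f(x - h)) / (2 * h)

def X1(f):
    return lambda x: -x * d(f)(x)

def X2(f):
    return lambda x: -d(f)(x)

f = lambda x: math.sin(2.0 * x) + x ** 3
x0 = 0.8

lhs = X1(X2(f))(x0) - X2(X1(f))(x0)
rhs = X2(f)(x0)
assert abs(lhs - rhs) < 1e-4
```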
E XAMPLE 25: The matrix group isomorphic to E(1) considered above consists of
elements of the form

M(α) = [ e^{α1}  α2 ]
       [ 0       1  ] .

Its generators corresponding to the coordinates α1 and α2 are

M1 = ∂M/∂α1 |0 = [ 1  0 ]       M2 = ∂M/∂α2 |0 = [ 0  1 ]
                 [ 0  0 ] ,                      [ 0  0 ] .

Again, we can check that M1M2 − M2M1 = M2, which implies isomorphism with
the algebra spanned by X1 and X2 defined in the previous example.
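The check M1M2 − M2M1 = M2 takes a few lines in Python (our sketch):

```python
# Sketch (ours): with M1 = [[1,0],[0,0]] and M2 = [[0,1],[0,0]],
# the commutator M1 M2 - M2 M1 equals M2.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

M1 = [[1, 0], [0, 0]]
M2 = [[0, 1], [0, 0]]
C = matmul(M1, M2)          # = M2
D = matmul(M2, M1)          # = 0
comm = [[C[i][j] - D[i][j] for j in range(2)] for i in range(2)]
assert comm == M2
```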
E XAMPLE 26: The rotation group in R 2 is represented by the transformation func-
tions f1(θ; x) = x′1 = x1 cos θ − x2 sin θ and f2(θ; x) = x′2 = x1 sin θ + x2 cos θ. The
generator of rotations
 
X = − [ ∂f1/∂θ ∂/∂x1 + ∂f2/∂θ ∂/∂x2 ]θ=0 = x2 ∂/∂x1 − x1 ∂/∂x2
is proportional to the angular-momentum operator in coordinate space. In matrix
form the group element is given by
   
M(θ) = [ cos θ  −sin θ ] ≈ [ 1  −θ ] = I + θM ,
        [ sin θ   cos θ ]   [ θ   1 ]

where, on the RHS, θ is infinitesimal and M is the generator of the matrix group.
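Exponentiating θM indeed rebuilds the finite rotation, as a short numerical sketch (ours, via the matrix exponential series) confirms:

```python
import math

# Sketch (ours): exp(theta*M) with M = [[0,-1],[1,0]] reproduces the
# finite rotation [[cos, -sin], [sin, cos]].

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(M, terms=40):
    R = [[1.0, 0.0], [0.0, 1.0]]
    P = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        P = matmul(P, M)
        P = [[p / n for p in row] for row in P]   # P = M^n / n!
        R = [[R[i][j] + P[i][j] for j in range(2)] for i in range(2)]
    return R

theta = 0.6
M = [[0.0, -theta], [theta, 0.0]]   # theta * generator
R = expm(M)
assert abs(R[0][0] - math.cos(theta)) < 1e-12
assert abs(R[0][1] + math.sin(theta)) < 1e-12
assert abs(R[1][0] - math.sin(theta)) < 1e-12
```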
E XAMPLE 27: Let F ( x) = x + c, with a constant c.
(a) If x′ = f((0, a); x) = x + a: we have X1 = 0 and X2 = −∂/∂x = −∂x.
From X1, X2, we get F′(x) = (1 − a∂x)(x + c) = x + c − a.
(b) If x′ = f((α, 0); x) = e^α x, then X1 = −x∂x and X2 = 0, and we get
F′(x) = (x + c) + (−αx∂x + ½α² x∂x x∂x + · · · )x = c + x(1 − α + ½α² + · · · ), which
is F′(x) = e^{−α} x + c. This tells us that we can get the full transformation if we
know the local transformation specified by Xµ. 
Commutation relations. As the generators Xµ are derived from the elements of a
connected Lie group G in a neighborhood U of the identity, one may expect that
the vector space spanned by Xµ has a structure arising from the group laws. Let
α, β be arbitrary elements in U, where α may be approximated by

α ≈ I + δαµ Xµ + ½ δαµ Xµ δαν Xν + · · · ,
and similarly for β. We have, up to second order, the products

αβ ≈ 1 + δα·X + δβ·X + (δα·X)(δβ·X) + ½(δα·X)² + ½(δβ·X)² ,

(αβ)⁻¹ ≈ 1 − δα·X − δβ·X + (δβ·X)(δα·X) + ½(δα·X)² + ½(δβ·X)² .

Let γ = αβ ( βα)−1 (also called the group commutator of α, β; it is equal to the


identity only if G is abelian). To second order, it has the form

αβ ( βα)−1 ≈ I + δαµ δβ ν ( Xµ Xν − Xν Xµ ). (3.11)

The combination Xµ Xν − Xν Xµ is called the commutator of Xµ and Xν, and is
denoted [Xµ, Xν]. Since αβ(βα)⁻¹ is an element of G close to the identity, the
infinitesimal term δαµ δβν [Xµ, Xν] must lie in the vector space T1G, and the
commutator of the generators must be a linear combination of the basis vectors Xµ of T1G:

[ Xµ , Xν ] = cλµν Xλ , (3.12)

where the expansion coefficients cλµν , called structure constants (in the given ba-
sis), are constant numbers. If Xµ† = ± Xµ , then both LHS and RHS of (3.12)
are skew-Hermitian, and so if Xµ are Hermitian the c’s are imaginary, but if Xµ
are skew-Hermitian the c’s are real. The properties implicit in the definition of
the commutator (namely, skew symmetry [ A, B ] = −[ B, A], and the Jacobi identity
[[ A, B], C ] + [[ B, C ], A] + [[ C, A], B] = 0 ) imply the identities among the c’s:

cλµν + cλνµ = 0 , (3.13)


cαµν cβαλ + cανλ cβαµ + cαλµ cβαν = 0 .   (3.14)
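The group-commutator relation (3.11) can be illustrated in a concrete matrix group. The sketch below (ours, not the book's) uses the standard so(3) generators, for which [Lx, Ly] = Lz, and checks that the group commutator of A = exp(sLx) and B = exp(sLy) equals I + s²Lz up to O(s³):

```python
# Sketch (ours): A B A^{-1} B^{-1} = I + s^2 [Lx, Ly] + O(s^3) = I + s^2 Lz + O(s^3).

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def expm(M, terms=30):
    R = [[float(i == j) for j in range(3)] for i in range(3)]
    P = [[float(i == j) for j in range(3)] for i in range(3)]
    for n in range(1, terms):
        P = matmul(P, M)
        P = [[p / n for p in row] for row in P]   # P = M^n / n!
        R = [[R[i][j] + P[i][j] for j in range(3)] for i in range(3)]
    return R

Lx = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
Ly = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
Lz = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

s = 1e-3
A = expm([[s * x for x in row] for row in Lx])
B = expm([[s * x for x in row] for row in Ly])
Ainv = expm([[-s * x for x in row] for row in Lx])
Binv = expm([[-s * x for x in row] for row in Ly])
Gcomm = matmul(matmul(A, B), matmul(Ainv, Binv))
for i in range(3):
    for j in range(3):
        expected = (1.0 if i == j else 0.0) + s * s * Lz[i][j]
        assert abs(Gcomm[i][j] - expected) < 1e-8
```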

Given Xµ in terms of the composition function φ, as in (3.9), one can readily prove
the results just stated and obtain explicit expressions of the structure constants in
terms of φ ( β, α ) as follows. As Θ µλ is non-singular, we can define its inverse by
Ψµν ( α ) Θνλ ( α ) = δµλ for arbitrary α. Using Xµ ( α ) = − Θ µλ ( α ) ∂λ in the commu-
tation relation, we have
 
[Xµ, Xν] = [ Θµσ(∂σΘνλ) − Θνσ(∂σΘµλ) ] ∂λ .

Introducing Ψρκ Θκλ = δρλ on the right-hand side we immediately obtain

[Xµ, Xν] = −cκµν Θκλ ∂λ = cκµν Xκ ,   (3.15)

where

cκµν = [ Θνσ(α) ∂Θµρ(α)/∂ασ − Θµσ(α) ∂Θνρ(α)/∂ασ ] Ψρκ(α)   (3.16)
are in fact independent of α, as can be shown by differentiation. So cκστ can be
calculated at any point α and, in particular, at the identity (α = 0), where one
has Θ (0)µν = Ψ (0)µν = δµν . Thus, setting α = 0 in (3.16), one gets the explicit
expression of the structure constants of the group of transformation
cκµν = [ ∂²φκ(β, α)/∂βµ ∂αν − ∂²φκ(β, α)/∂βν ∂αµ ] |α=β=0 .   (3.17)
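Formula (3.17) can be evaluated numerically for the E(1) composition law of Example 24, φ(β, α) = (β1 + α1, e^{β1}α2 + β2); the relevant structure constant should come out as c²₁₂ = 1, matching [X1, X2] = X2. A finite-difference sketch (ours):

```python
import math

# Sketch (ours): structure constant c^2_{12} of E(1) from Eq. (3.17),
# using the composition law phi(beta, alpha) of Example 24.

def phi2(b1, b2, a1, a2):
    return math.exp(b1) * a2 + b2

h = 1e-4
# mixed second partials of phi^2 at beta = alpha = 0, by central differences
d2_b1_a2 = (phi2(h, 0, 0, h) - phi2(h, 0, 0, -h)
            - phi2(-h, 0, 0, h) + phi2(-h, 0, 0, -h)) / (4 * h * h)
d2_b2_a1 = (phi2(0, h, h, 0) - phi2(0, h, -h, 0)
            - phi2(0, -h, h, 0) + phi2(0, -h, -h, 0)) / (4 * h * h)

c2_12 = d2_b1_a2 - d2_b2_a1
assert abs(c2_12 - 1.0) < 1e-6
```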

One-parameter subgroups. In a connected group of transformation G that is also
pathwise connected, any element can be reached from the identity I by a continuous
curve lying in G. As before, given the maps f : G × V → V and φ : G × G →
G, we consider arbitrary transformations on the coordinates α : x ↦ x′ = f(α; x)
and α + dα : x ↦ x′ + dx′ = f(α + dα; x), with α, α + dα ∈ G and dα infinitesimal.
As f(α; x) is continuous in both arguments, the points x′ and x′ + dx′ must
be joined by an infinitesimal transformation δα such that x′ + dx′ = f(δα; x′)
and α + dα = φ(δα; α) (see Fig. 3.1). These two equations give us respectively
dx′i = δαλ uλi(x′) and dαµ = δαλ Θλµ(α). With Xλ(x) as in (3.7), we have (dropping
the prime on x′) the differential

dxi = dαµ Ψµλ(α) uλi(x) = − dαµ Ψµλ(α) Xλ(x) xi .   (3.18)
Any given group element g, specified by coordinates α1, α2, . . . , can be reached
from the identity I by letting the parameters vary along a line αµ t (with real t) in
the direction of the vector αµ and passing through I (where t = 0) and g (where
t = 1). Letting αµ → αµ t for fixed αµ and varying t, the equation for xi ( t ) now
reads
dxi(t)/dt = − αµ Ψµλ(αt) Xλ(x) xi(t) .   (3.19)
Figure 3.1: x and x′ + dx′ can be joined either by α + dα or φ(δα, α).

Writing xi(t) = ϕ(t) xi(0), where ϕ(t) should be viewed as an n × n matrix, we
obtain the first-order total differential equation for ϕ(t):
ϕ′(t) = (d/dt) ϕ(t) = − αµ Ψµλ(αt) Xλ(x(t)) ϕ(t) ,   (3.20)

with the initial conditions (at the identity, where t = 0) given by

ϕ(0) = I ,   ϕ′(0) = − αµ Ψµλ(0) Xλ[x(0)] ϕ(0) = X[x(0)] ,
where we have used Ψµλ (0) = δµλ and defined X = − ∑µ αµ Xµ . With well
defined initial conditions, the differential equation admits the unique solution
ϕX(t) = I + tX + ½ (tX)² + · · · .   (3.21)
ϕX ( t ) is analytic if the series converges. For an arbitrary vector X in the space
tangent to G at the identity, we have constructed an operator taking x(0) to x( t ).
Given the series for ϕX ( t ), it is seen that ϕX (0) = I; that ϕX ( t ) has the inverse
ϕX (− t) for each value of t; and finally ϕX ( s ) ϕX ( t ) = ϕX ( s + t ). Hence, the curve
ϕX ( t ), with t ranging over all R, is a group, called the one-parameter subgroup of G
determined by X in T1G. Conversely, its derivative with respect to t at t = 0 gives us back X.
E XAMPLE 28: Take X = − α1 X1 − α2 X2 , with X1 ( x) = − x∂ x , X2 ( x) = − ∂ x, and
∂ x = ∂/∂x; and arbitrary α1 , α2 . Assuming α1 ≠ 0, we have (with 0 ≤ t ≤ 1)
ϕX ( t ) x = x + ∑n≥1 ( tn/n! ) ( α1 x ∂ x + α2 ∂ x )n x
= x + ∑n≥1 ( tn α1n−1/n! ) ( α1 x + α2 )
= etα1 x + ( α2 /α1 ) ( etα1 − 1 ).
We check that ϕX (0) x = x and ϕ′X (0) x = X · x. We also recognize that ϕX ( t ) is an element of E(1) for all t ∈ R, such that x′ = ϕX ( t ) x = f ( A, a; x ), where A = exp( tα1 ) and a = ( α2 /α1 )[exp( tα1 ) − 1] (cf. Example 24).
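As a concrete numerical check of this closed form, one can sum the series directly and compare it with etα1 x + ( α2 /α1 )( etα1 − 1 ); a minimal Python sketch (the parameter values chosen here are arbitrary):

```python
import math

def phi_series(t, x, a1, a2, nmax=60):
    # phi_X(t) x = x + sum_{n>=1} (t^n/n!) a1^(n-1) (a1*x + a2), truncated at nmax
    s = x
    for n in range(1, nmax + 1):
        s += (t ** n / math.factorial(n)) * a1 ** (n - 1) * (a1 * x + a2)
    return s

def phi_closed(t, x, a1, a2):
    # closed form: e^{t a1} x + (a2/a1)(e^{t a1} - 1)
    return math.exp(t * a1) * x + (a2 / a1) * (math.exp(t * a1) - 1.0)

t, x, a1, a2 = 0.7, 1.3, 0.5, -2.0
print(abs(phi_series(t, x, a1, a2) - phi_closed(t, x, a1, a2)) < 1e-12)  # True
```

The truncated series converges rapidly for moderate tα1, so the two evaluations agree to machine precision.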

3.4 Lie Algebras


An algebra is a vector space with a product rule for vectors. The ordinary alge-
bra of real numbers is a one-dimensional vector space in which the product of
3.4. LIE ALGEBRAS 91

two real numbers, also a real, is commutative and associative. The vector space
spanned by the generators of a Lie group of transformations defined in the preceding section, which comes equipped with a product, is also an algebra, called a Lie algebra, although this product is neither commutative nor associative.

3.4.1 Definitions and Examples


We give here a general definition of Lie algebra and a few examples.
Definition 3.2 (Lie algebra). A Lie algebra is a vector space g equipped with a bilinear
product called the Lie bracket, [, ] : g × g → g, that is skew-symmetric: [ x, x] = 0 for
all x ∈ g, and satisfies the Jacobi identity: [ x, [ y, z ]] + [ y, [ z, x ]] + [ z, [ x, y]] = 0 for all
x, y, z ∈ g. Its dimension is the dimension of the vector space.
Henceforth the same symbol, e.g. g, denotes a vector space and the associ-
ated Lie algebra (the space together with the Lie product on this space). Note
that g, as a Lie algebra, is closed both under the linear operations on its elements
x, y, . . . (x + y ∈ g and ax ∈ g with a a number), and under the bracket operation
([ x, y ] ∈ g). The definition of the bracket operation [, ] by its fundamental proper-
ties (skew symmetry and the Jacobi identity) is general, and includes the familiar
commutator [ x, y ] = xy − yx. When [ x, y ] = 0 for all x, y ∈ g, the Lie algebra g is
said to be abelian. Otherwise, it is said to be non-abelian. In this case, the bracket
is non-associative because of the Jacobi identity: [ x, [ y, z ]] = [[ x, y ], z ] + [[ z, x], y].
E XAMPLE 29: The algebra of real numbers R, with the usual associative product,
becomes an abelian one-dimensional real Lie algebra, called R L , with the bracket
identified with the commutator. Similarly one obtains C L from C.
E XAMPLE 30: The algebra H (created by William R. Hamilton in 1843) of the
quaternions is the set of numbers q = a1 + bi + cj + dk (where a, b, c, d ∈ R,
and 1, i, j, k are the quaternionic units obeying i2 = j 2 = k 2 = ijk = − 1) equipped
with a non-commutative associative product qq0 . It is made into a real Lie algebra,
called H L , by defining the Lie bracket [ q, q0 ] = qq0 − q0 q.
E XAMPLE 31: gl(V ), gl ( n, R ), gl ( n, C ). The algebra of linear operators X : V → V
on a vector space V over the field F = R, or C with the associative multiplication
X · Y ≡ XY can be made into a Lie algebra by defining the operator Lie product
[ X, Y ] = XY − YX. This Lie algebra is called the general Lie algebra of V , and
denoted gl(V ). Taking V = R n and defining the matrix Lie product [ X, Y ] =
XY − YX gives the general linear Lie algebra gl( n, R ) of n × n real matrices. We can
analogously define the complex Lie algebra gl ( n, C ).
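One can verify directly that the commutator [ X, Y ] = XY − YX satisfies the Jacobi identity; a small Python sketch with arbitrary integer 3 × 3 matrices (exact arithmetic, so the check is not affected by rounding):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def bracket(X, Y):
    # Lie bracket as the matrix commutator [X, Y] = XY - YX
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

def madd(*Ms):
    n = len(Ms[0])
    return [[sum(M[i][j] for M in Ms) for j in range(n)] for i in range(n)]

X = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
Y = [[0, 1, 0], [1, 0, 2], [0, 3, 0]]
Z = [[2, 0, 1], [0, 1, 0], [1, 0, 2]]

# Jacobi identity: [X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] = 0
J = madd(bracket(X, bracket(Y, Z)),
         bracket(Y, bracket(Z, X)),
         bracket(Z, bracket(X, Y)))
print(J == [[0, 0, 0], [0, 0, 0], [0, 0, 0]])  # True
```

Skew-symmetry, [ X, X ] = 0, is immediate from the same definition.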
Definition 3.3 (Lie subalgebra). A Lie subalgebra h of a Lie algebra g is a subspace
h ≤ g that is closed under the Lie bracket: [h, h ] ≤ h (meaning [ x, y ] ∈ h for all x, y ∈ h).

E XAMPLE 32: sl (V ), sl ( n, R ), sl ( n, C ). The special linear Lie algebra sl (V ) in vector


space V consists of all linear operators X : V → V of 0 trace; it is a Lie algebra,
subalgebra of gl (V ), inheriting both its linear and bracket operations. In matrix
form, the special linear Lie algebra sl ( n, F ), with F = R, or C, is a Lie subalgebra of
gl( n, F ), consisting of its n × n traceless F-valued matrices.

E XAMPLE 33: o(V ), o( n, R ) and o( n, C ). Let V be a linear vector space with a bi-
linear symmetric inner product h v, wi = v T w defined on V . Then, with the linear
and bracket operations defined as in gl(V ), the orthogonal Lie algebra o(V ) consists
of all linear operators X on V under which the bilinear form h, i is invariant, so
that h Xv, w i + h v, Xwi = 0 for all v, w in V . If X and Y satisfy this invariance
condition, so does [ X, Y ]. In particular, for V = Fn , we may take h v, wi = ∑i vi wi
for any v, w ∈ Fn , so that the invariance relation reads ∑ Xij v j wi + vi Xij w j = 0.
If it is satisfied by X, it is also by cX for any c ∈ F. The orthogonal Lie al-
gebra o( n, F ) consists of the skew-symmetric n × n matrices X = − XT over F:
o( n, F ) = { X ∈ gl( n, F )|XT + X = 0} . Note that the expected subalgebra of
o( n, F ) defined by the additional restriction Tr X = 0 is unnecessary because
XT + X = 0 implies that all diagonal entries are 0; so o( n, F ) and so( n, F ) are
the same space of matrices, meaning o( n, F ) ≡ so ( n, F ).
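A quick check that o( n, F ) is closed under the bracket: the commutator of two skew-symmetric matrices is again skew-symmetric, and hence automatically traceless. A Python sketch with arbitrary integer entries:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def bracket(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

# two skew-symmetric 3x3 matrices (X^T = -X)
X = [[0, 2, -1], [-2, 0, 3], [1, -3, 0]]
Y = [[0, -1, 4], [1, 0, 0], [-4, 0, 0]]

B = bracket(X, Y)
is_skew = all(B[i][j] == -B[j][i] for i in range(3) for j in range(3))
trace = sum(B[i][i] for i in range(3))
print(is_skew, trace)  # True 0
```

This is the numerical counterpart of [ X, Y ]T = YTXT − XTYT = YX − XY = −[ X, Y ].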
C OMMENTS . (a) In general, Lie groups having the same identity component have
the same Lie algebra (see p. 97 for an explanation of this correspondence). For
example, the Lie groups O ( n) and SO ( n), which coincide in neighborhoods of
the identity I, have the same Lie algebra o( n, R ) = so ( n, R ). (b) As we have
seen before, the orthogonal group is defined by the form invariance condition
h Av, Awi = h v, w i, or the matrix relation AT A = I. If A is in the identity compo-
nent, we may write A = I + M + · · · , where | Mij | ≪ 1; the two relations then become h Mv, wi + h v, Mwi = 0 and MT + M = 0.
E XAMPLE 34: u( n ) and su( n ). Let V be a complex vector space endowed with a
positive-definite Hermitian (sesquilinear) inner product h v, w i = v † w on V . The
unitary Lie algebra u(V ) consists of all the linear operators X on V satisfying the
(infinitesimal) invariance (unitarity) property h Xv, w i + h v, Xw i = 0 for all v, w
in V . The Lie algebra u(V ) is defined over R, but not over C (in the sense that if
X ≠ 0 has the invariance property, so does aX if a is real, but not if a is complex).
So u(V ) is a real Lie algebra.
In V = Cn with inner product h v, wi = ∑i v∗i wi (∗ denoting complex conju-
gation), the unitary Lie algebra u( n ) consists of the n × n complex matrices that
satisfy X† + X = 0 (X is then said to be skew-Hermitian or skew-adjoint). The
special unitary Lie algebra su ( n ) consists of elements of u( n ) with Tr X = 0. Note
that su ( n ), just like u( n ), is a real Lie algebra, even though it involves complex
matrices; and also that it is not the same as so ( n, C ).
E XAMPLE 35: sp(V ), sp( n, F ), and sp( n ). Let V be a vector space over F = R
or C equipped with a definition of a non-singular skew-symmetric bilinear form
B ( v, w). Then, the symplectic Lie algebra sp (V ) consists of all linear operators X act-
ing on V that have the infinitesimal invariance property B ( Xv, w) + B(v, Xw) = 0
for all v, w in V . The symplectic Lie algebras sp( n, R ) and sp( n, C ) of R 2n and C2n , in which B ( v, w) = v T Jw, are the sets of 2n × 2n matrices that satisfy XT J + JX = 0
(cf. Example 8). For n = 1, invariance means Tr X = 0, and so sp(1, F ) is identical
with sl (2, F ). The matrices that are simultaneously in sp( n, C ) and u(2n ) form the
compact symplectic Lie algebra, denoted usp( n ) or sp( n ).
E XAMPLE 36: Consider V = R n+m with metric gn,m , so that h v, w iL = vT gn,m w
for any v, w ∈ V . The generalized orthogonal Lie algebra o( n, m; R ) consists of those

operators X that satisfy the invariance relation h Xv, w ig + h v, Xwig = 0, or of


the ( n + m ) × ( n + m ) real matrices X such that XT gn,m + gn,m X = 0. The simplest example is the one-dimensional o(1, 1), with the basis X = [0 1; 1 0] and the metric g1,1 = diag[1, −1]. The case of the Lorentz Lie algebra o(3, 1; R ) is of particular interest in physics; it consists of all the 4 × 4 real matrices X that satisfy the condition XT = − g3,1 X g3,1 , where g3,1 = diag[1, 1, 1, −1].

3.4.2 Properties
We discuss here a number of concepts and properties of Lie algebras, with exam-
ples based on structures of lower dimensions.
Structure constants. In a vector space g of dimension η, a set of η linearly in-
dependent vectors { xµ | µ ∈ N η } may be chosen as a basis, and any vector x in
this linear space is of the form x = ∑µ aµ xµ , where aµ are real for a real space,
and complex for a complex space. The Lie algebra, also denoted g, based on this
vector space can then be defined by a multiplication table for xµ which enumer-
ates all [ xµ , xν ] = ∑λ cλµν xλ , where the coefficients cλµν , called the structure con-
stants of the Lie algebra g (in the chosen basis), obey the relations cλµν = − cλνµ and
cαµν cκαλ + cανλ cκαµ + cαλµ cκαν = 0 derived from the axioms defining the Lie bracket
(see also (3.13) and (3.14)). This multiplication table, or an equivalent list of the
η 3 structure constants (1 ≤ λ, µ, ν ≤ η and taking into account symmetries), com-
pletely defines the Lie algebra and implies many of its properties. But in practice
this approach is rarely followed.
E XAMPLE 37: If a Lie algebra is abelian, its structure constants vanish in any basis:
cλµν = 0 for all indices.
E XAMPLE 38: Given a Lie algebra g (of dimension η) and a Lie subalgebra h (of
dimension ζ) by definition closed under both linear and bracket operations, we
can choose a basis for g, such that xµ , µ = 1, . . . , ζ span h. As h is closed under
[, ], we must have cλµν = 0; µ, ν = 1, . . . , ζ; λ = ζ + 1, . . . , η in that basis.
E XAMPLE 39: Dimension one. Rx and Cx, defined in any basis x of a one-dimen-
sional vector space, are real and complex abelian Lie algebras with the trivial
bracket [ x, x ] = 0. Examples of one-dimensional matrix Lie algebras are so(2) (basis X = [0 −1; 1 0]) and so(1, 1) (basis X = [0 1; 1 0]).
E XAMPLE 40: Dimension two. Let x1 and x2 be a basis of a two-dimensional Lie
algebra g. Then, in general [ x1 , x2 ] = ax1 + bx2 , where a, b are real or complex
scalars. If both a and b vanish, we have an abelian algebra. If not, assume b ≠ 0; then y1 = x1 /b and y2 = ax1 + bx2 satisfy [ y1 , y2 ] = y2 .
distinct two-dimensional real or complex Lie algebras:
( i) [ x1 , x2 ] = 0 ; ( ii) [ x1 , x2 ] = x2 . (3.22)
The first is the direct sum of two one-dimensional algebras, Fx1 ⊕ Fx2 , and the
second is the unique two-dimensional nonabelian algebra over either R, or C. An
example of ( ii) is the affine algebra of the line aff(1) (cf. Example 25). 

Algebraic concepts. Let g and h be Lie algebras. The symbol [g, h ] denotes the
linear span of all [ x, y ] with x ∈ g and y ∈ h. Using this notation, we define a
Lie subalgebra of a Lie algebra g as a subspace, say h, of g that is closed under the
bracket operation, [h, h ] ≤ h, with the same linear and bracket operations as for
g. Several kinds of subalgebras are important enough to be given names:
• A Lie ideal is a Lie subalgebra q of a Lie algebra g such that [g, q] ≤ q, that
is to say, [ x, y ] ∈ q for every x ∈ g and y ∈ q.
• The center of g is defined as Z(g) = { z ∈ g|[ z, x] = 0 for all x ∈ g} . It is
an ideal of g. If g is abelian, Z(g) = g; and conversely Z(g) = g implies
an abelian g. A semisimple Lie algebra has no center: Z(g) = 0; see
Sec. 3.5.2.
• The derived algebra D g of the Lie algebra g is the subspace spanned by all
Lie products: D g = {[ x, y ]|x, y ∈ g} , or in shorthand, D g = [g, g], itself an
algebra and an ideal of g. The Lie algebra g is abelian if and only if D g = 0.
We say g is perfect if D g = g. Every semisimple Lie algebra is perfect (but
the converse is not true).
Just as for Lie groups, it is useful to have a way of distinguishing different Lie
algebras, which is given by the following

Definition 3.4 (Homomorphism, isomorphism). A Lie algebra homomorphism


is a linear map between Lie algebras φ : g → g0 that preserves the bracket operation,
φ ([ x, y]) = [ φ ( x), φ (y)] for all x, y ∈ g. If in addition φ is one-to-one and onto, then
φ is called a Lie algebra isomorphism. An isomorphism of a Lie algebra with itself is
called a Lie algebra automorphism.

If Lie algebras g and g0 are related by a Lie isomorphism φ, we say that g and
g0 are isomorphic, or they are ‘the same up to isomorphism’; and write g ∼ = g0 .
The inverse map φ−1 is then also a Lie algebra isomorphism.
For example, when we said that the nonabelian two-dimensional Lie algebra
is unique, we implied ‘up to isomorphism’. Thus, the two-dimensional Lie al-
gebra defined by [ y1, y2 ] = y1 is isomorphic to the Lie algebra defined in (3.22)
if we let x1 7 → − y2 and x2 7 → − y1. This involves a simple change of the basis in
the underlying vector space. Another kind of isomorphism calls for a less obvious
change of the base field, as we now explain.
If g is a finite-dimensional real Lie algebra, its complex extension is obtained by
tensoring g with C over R, which we write as gC = g ⊗ R C = g ⊕ ig, where i is
the complex unit. This means one forms linear combinations x + iy with x, y ∈ g,
and then defines (with a, b ∈ R) the combinations ( a + ib )( x + iy ) and the brackets [ x + iy, x′ + iy′ ] in the usual way.
The resulting structure gC is a complex Lie algebra, called the complexification
(or complex extension) of the real Lie algebra g. Conversely, we say that g is a
real form of gC (one among several possible real forms associated with a given
complex Lie algebra gC ). A basis for g over R is also one for gC over C.
E XAMPLE 41: Dimension three. It is convenient to classify the algebras of a given
dimension according to the dimensions of their derived algebras. In the present
case, dim g = 3, there are 4 possibilities: dim D g = dim [g, g] = 0, 1, 2, 3.
Let x1 , x2 , x3 be a basis of the real (or complex) space g. If dim D g = 0, g is

abelian, isomorphic to F3 . If dim D g = 1, then when comparing D g with the


center Z(g) there are two possibilities:
( i) Z(g) ≠ [g, g], then g = Z(g) ⊕ g1 :
[ x1 , x2 ] = x2 , [ x3 , x1 ] = 0, [ x3, x2 ] = 0, (3.23)
( ii) Z(g) ≤ [g, g]:
[ x1 , x2 ] = x3 , [ x3 , x1 ] = 0, [ x3, x2 ] = 0. (3.24)
For dim D g = 2, D g is a two-dimensional abelian algebra, and g is a semidi-
rect product of two abelian algebras. There are two families of solutions (with
arbitrary real numbers a, b, c):
( i) [ x1 , x2 ] = 0, [ x3, x1 ] = cx1 − bx2, [ x3, x2 ] = bx1 + cx2; (3.25)
( ii) [ x1 , x2 ] = 0, [ x3, x1 ] = x1 , [ x3, x2 ] = ax2. (3.26)
Finally, for dim D g = dim g = 3, there are two possibilities:
( i) [ x1, x2 ] = x3 , [ x2, x3 ] = x1 , [ x3, x1 ] = x2 ; (3.27)
( ii) [ x1, x2 ] = x3 , [ x2, x3 ] = − x1, [ x3, x1 ] = x2 . (3.28)
The two real Lie algebras (3.27)–(3.28) are distinct; one reason for this is that
( i) has no two-dimensional subalgebras, whereas ( ii) does, which we can show
by defining the real linear combinations e = x1 + x2 , f = − x1 + x2 , h = 2x3 , and
converting ( ii) into an equivalent form
[ h, e ] = 2e, [ h, f ] = − 2 f , [ e, f ] = h. (3.29)
The Lie algebras ( i) and ( ii) are distinct when defined over R; but when defined
over C, ( i) is isomorphic to ( ii) under x1 7 → x1 , x2 7 → − ix2 , and x3 7 → − ix3. So,
let us note: (3.27), (3.28) and (3.29) are equivalent to one another over C; but only
(3.28) and (3.29) are equivalent to each other over R.
E XAMPLE 42: We define [ r, s ] = rs − sr here and in the following examples. An
algebra of the type (3.23) is the direct sum of an algebra h x1, x2 i of the type (3.22)
and an invariant one-dimensional algebra h x3 i (cf. p. 116). An example of (3.24)
is the Heisenberg algebra ([ P, Q ] = E, [ P, E ] = 0, [ Q, E ] = 0), expressible in terms
of strictly upper triangular 3 × 3 matrices.
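The Heisenberg relations are easy to verify with strictly upper triangular representatives; a Python sketch (this particular choice of P, Q, E is the standard one, an assumption since the text does not display the matrices):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def bracket(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

# strictly upper triangular 3x3 realization of the Heisenberg algebra
P = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
Q = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
E = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
O = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

# [P,Q] = E, [P,E] = 0, [Q,E] = 0
print(bracket(P, Q) == E, bracket(P, E) == O, bracket(Q, E) == O)  # True True True
```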
E XAMPLE 43: Let x1 7 → ∂ x = ∂/∂x and x2 7 → ∂y = ∂/∂y (basis for translations
in R 2 ). Now let x3 7 → b ( x∂y − y∂ x) − c ( x∂ x + y∂y ), then we have a Lie algebra
of the type (3.25). On the other hand, if we take x3 7 → − x∂ x − ay∂y, then we
have an algebra of the type (3.26). Both correspond to Euclidean transformations
(rotations so(2) and translations) in the x-y plane.
E XAMPLE 44: o(3) = o(3, R ), and o(3, C ). The orthogonal Lie algebra o(3, F ) is
the space of all skew symmetric F-valued 3 × 3 matrices (M = − MT ). A basis
consists of the matrices R1 , R2 , R3 given by
R1 = [0 0 0; 0 0 −1; 0 1 0], R2 = [0 0 1; 0 0 0; −1 0 0], R3 = [0 −1 0; 1 0 0; 0 0 0].

They satisfy the commutation relations

[ R 1, R 2 ] = R 3, [ R 2, R 3 ] = R 1, [ R 3, R 1 ] = R 2 ;

and define algebras of the type (3.27). It is evident that the complex extension
o(3)C = o(3) + io(3) of the real Lie algebra o(3) is isomorphic to o(3, C ).
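The commutation relations of the basis matrices can be checked directly; a Python sketch using the R1 , R2 , R3 just given (integer arithmetic, so the comparison is exact):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def bracket(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

R1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
R2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
R3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

# [R1,R2] = R3, [R2,R3] = R1, [R3,R1] = R2
print(bracket(R1, R2) == R3, bracket(R2, R3) == R1, bracket(R3, R1) == R2)  # True True True
```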
E XAMPLE 45: su (2). This real Lie algebra consists of the 2 × 2 trace-free skew-Hermitian matrices, expressible in the (canonical) basis given by
S1 = (1/2)[0 −i; −i 0], S2 = (1/2)[0 −1; 1 0], S3 = (1/2)[−i 0; 0 i].

They obey the Lie products [ S1 , S2 ] = S3 , [ S2 , S3 ] = S1 , [ S3 , S1 ] = S2 , which show that su(2) ∼= so(3) (this isomorphism arises from the fact that the Lie groups SU(2) and SO(3) look the same near I, see p. 97 et seq.). Note for reference that these matrices may be written S j = − iσj /2, with j = 1, 2, 3, in terms of the standard Pauli matrices σj normalized such that σi σj = δij + iεijk σk . Explicitly,
σ1 = [0 1; 1 0], σ2 = [0 −i; i 0], σ3 = [1 0; 0 −1]. (3.30)
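These relations can be verified numerically from the Pauli matrices, using complex arithmetic and S j = − iσj /2; a Python sketch:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def bracket(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-12):
    # entrywise comparison, tolerant of float/complex zero representations
    return all(abs(A[i][j] - B[i][j]) < tol
               for i in range(len(A)) for j in range(len(A)))

sigma1 = [[0, 1], [1, 0]]
sigma2 = [[0, -1j], [1j, 0]]
sigma3 = [[1, 0], [0, -1]]
S1, S2, S3 = ([[-0.5j * a for a in row] for row in s]
              for s in (sigma1, sigma2, sigma3))

# [S1,S2] = S3, [S2,S3] = S1, [S3,S1] = S2
print(close(bracket(S1, S2), S3),
      close(bracket(S2, S3), S1),
      close(bracket(S3, S1), S2))  # True True True
```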

E XAMPLE 46: o(2, 1; R ). This generalized orthogonal real Lie algebra consists of all 3 × 3 real matrices L such that LT g2,1 + g2,1 L = 0, with the metric g2,1 = diag[1, 1, −1]. We take as a basis
L1 = [0 0 0; 0 0 1; 0 1 0], L2 = [0 0 1; 0 0 0; 1 0 0], L3 = [0 −1 0; 1 0 0; 0 0 0].

They obey the commutation relations

[ L1 , L2 ] = L3 , [ L 2, L 3 ] = − L 1, [ L3, L1 ] = − L2.

After a change of basis such that L1 ↔ L3 , these relations are of the type (3.28), and, as mentioned above, are equivalent to (3.27) when complexified. Therefore, o(2, 1; R )C ∼= o(3, C ).
E XAMPLE 47: sl (2, R ). This algebra is the space of all 2 × 2 trace-free real matrices;
a convenient basis is
H = [1 0; 0 −1], E = [0 1; 0 0], F = [0 0; 1 0], (3.31)

with the product rules [ E, F ] = H, [ H, E ] = 2E, [ H, F ] = −2F, identical to (3.29): we have an isomorphism of real Lie algebras, sl(2, R ) ∼= o(2, 1; R ).
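The product rules for H, E, F are immediate to check by hand or by machine; a Python sketch:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def bracket(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

H = [[1, 0], [0, -1]]
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]

print(bracket(E, F) == H)                  # True
print(bracket(H, E) == [[0, 2], [0, 0]])   # True, i.e. [H,E] = 2E
print(bracket(H, F) == [[0, 0], [-2, 0]])  # True, i.e. [H,F] = -2F
```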
E XAMPLE 48: sl (2, C ). This complex Lie algebra is the space of all 2 × 2 trace-
free matrices. It is obviously equal to the complex extension sl (2, R )C, for which
the same convenient basis H, E, F may be used. In addition, the complex linear

combinations H = 2iS3 = σ3 , E = iS1 − S2 = (1/2)( σ1 + iσ2 ), and F = iS1 + S2 = (1/2)( σ1 − iσ2 ) show that sl(2, C ) is also isomorphic to su(2) ⊗ C, i.e. the complex extension of the (real) Lie algebra su(2). So we have so(3, C ) ∼= su(2)C ∼= sl(2, C ).
In summary, the three-dimensional non-abelian complex Lie algebra sl(2, C ) has two distinct real forms: (i) the compact real form (with a negative-definite Killing
form) defined by the brackets (3.27) and realized in matrix form so (3, R ) ∼ = su(2);
and (ii) the normal real form (with a maximally non-definite Killing form) defined
by (3.28) or equivalently (3.29), which has the matrix form sl (2, R ) ∼ = o(2, 1; R ).
E XAMPLE 49: If A and B are real n × n matrices, then A + iB ∈ gl ( n, C ). On the
other hand, any complex n × n matrix may be written uniquely as M1 + iM2 ,
where M1, M2 ∈ gl ( n, R ). So we have gl ( n, C ) ∼
= gl ( n, R ) + igl ( n, R ) = gl ( n, R )C.
If A and B are n × n complex skew-Hermitian matrices, then A + iB ∈ gl( n, C );
on the other hand, any n × n complex matrix M may be written uniquely as
M = M1 + iM2, where M1 = ( M − M† ) /2 and M2 = ( M + M† ) /2i are skew-
Hermitian, and so are in u( n ). This means gl( n, C ) ∼= u( n ) + iu( n ) = u( n )C.
Similarly for the other cases. The Lie algebras gl ( n, C ), sl ( n, C ), sp ( n, C ), and
o( n, C ) = so ( n, C ) are complex Lie algebras isomorphic to the complexifications
of real Lie algebras, as summarized here:
sl( n, C ) ∼= sl( n, R )C ; gl( n, C ) ∼= gl( n, R )C ∼= u( n )C ;
o( n, C ) ∼= o( n, R )C ; sp( n, C ) ∼= sp( n, R )C ∼= sp( n )C ;
o( p + q, C ) ∼= o( p, q; R )C .

3.5 Back to Lie Groups


We now resume and generalize our study of the relationship between Lie groups
and Lie algebras begun in Section 3.3 (which dealt mostly with groups of transformations). As already noted, the tangent space to a connected Lie
group G at the identity element I is its Lie algebra g. This means, if we take
a smooth curve γ ( t ) lying entirely in the space G such that γ (0) = I, the tangent vector at t = 0, given by the derivative γ′ (0), is an element of the Lie algebra g of G. Every element of g is the derivative γ′ (0) of some suitably chosen curve. A
special class of such curves represented by the exponential map exp : R → G plays
a key role in this study. We shall mostly restrict ourselves to matrix Lie groups
and matrix Lie algebras (so that the Lie bracket [ X, Y ] is a commutator XY − YX
of matrices), but almost all results have a wider generality.

3.5.1 The Exponential Map


Let M( n, F ) (where F = R, or C) be the space of real or complex n × n matrices. For each matrix X ∈ M( n, F ), the series I + X + X2/2! + · · · converges to a matrix of the same size, and is a continuous function of X denoted eX or exp X:
eX = ∑m≥0 Xm/m! = I + X + X2/2! + · · · . (3.32)

From this definition, one sees that its properties are mostly the same as for ex-
ponentials of numbers, with the differences arising from the non-commutativity
of matrices. Let X, Y ∈ M( n, F ), and eX , eY be defined as in (3.32), then the exp
map has the following properties:

1. e0 = I.
2. If XY = YX, then eX eY = eX+Y . Hence eX e−X = I, or (eX )−1 = e−X .
3. If S is an invertible matrix, then exp( SXS−1) = S exp( X) S−1.
4. (exp X)T = exp ( XT ) (where ()T denotes transposition).
5. If γX ( s ) = esX , then γX ( s ) γX ( t ) = γX ( s + t ) for any s, t ∈ R, or C.
6. In M( n, F ) viewed as a space Fn² , the s-function γX ( s ), with a fixed X, describes a smooth curve with tangent vector at s given by the derivative γ′X ( s ) = X γX ( s ) = γX ( s ) X; in particular, at the identity, it is γ′X (0) = X.
7. det (exp X) = exp(Tr X).
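Properties such as 7 are easy to confirm numerically by truncating the series (3.32); a Python sketch for a 2 × 2 real matrix (the entries are arbitrary):

```python
import math

def mexp(X, nterms=40):
    # exp X = I + X + X^2/2! + ... , truncated after nterms terms
    n = len(X)
    term = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    total = [row[:] for row in term]
    for m in range(1, nterms):
        # term <- term @ X / m, so that term = X^m / m!
        term = [[sum(term[i][k] * X[k][j] for k in range(n)) / m
                 for j in range(n)] for i in range(n)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

X = [[0.3, 1.2], [-0.7, 0.5]]
E = mexp(X)
det_E = E[0][0] * E[1][1] - E[0][1] * E[1][0]
# property 7: det(exp X) = exp(Tr X)
print(abs(det_E - math.exp(0.3 + 0.5)) < 1e-10)  # True
```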
The continuous curve γX ( t ) = exp tX, satisfying γX (0) = I, γX ( t )−1 = γX (− t ), and γX ( s ) γX ( t ) = γX ( s + t ) for all s, t ∈ R and fixed X, is a group, called the one-
parameter subgroup in G determined by X. It maps R → G : t 7 → etX for each
X ∈ M( n, R ). Given these properties, for every X ∈ M( n, R ) we take a real
variable t to form the series ∑m≥0 tm Xm/m!, which is an invertible matrix exp tX for all t, and so lies in GL( n, R ). Thus exp defines a continuous homomorphic map exp : R ∋ t ↦ etX ∈ GL( n, R ), and every continuous homomorphism R →
GL( n, R ) is of this form.
A closed subgroup of GL( n, R ) is a Lie group called a linear group (see p. 76).
Let G ⊂ GL( n, R ) be a linear group, and define the set of real matrices

Lie ( G ) = { X ∈ M( n, R ) | etX ∈ G for all t ∈ R } .

Lie( G ) is a vector subspace of M( n, R ). This means Lie( G ) is closed under the


linear operations: (i) If X ∈ Lie( G ), a ∈ R, then aX ∈ Lie( G ); and (ii) If X, Y ∈
Lie( G ), then X + Y ∈ Lie ( G ), because

etX etY = I + t ( X + Y ) + (1/2) t2 ( X2 + 2XY + Y2 ) + · · · = et(X+Y)+···

(making use of the series (3.32)). Since X = limt→0 (etX − I ) /t = d etX/dt |t=0 , Lie( G ) is the tangent space to G at the identity. The dimension of the linear group G is then defined to be the dimension of the vector space Lie( G ).
We can also calculate the group commutator of the elements eX , eY of G:
etX etY e−tX e−tY = I + t2 [ X, Y ] + · · · = exp ( t2 [ X, Y ] + · · · ) . (3.33)

Here the dots stand for terms of degree three and higher. Given these relations,
it is evident that since eX and eY are in G, so is eX eY e−X e−Y , and therefore [ X, Y ]
is in Lie( G ) (cf. arguments leading to (3.12)). In other words, if we have X, Y ∈
Lie( G ), it follows that [ X, Y ] ∈ Lie( G ). The vector space Lie( G ) is closed under
the Lie bracket, and so by definition is a Lie algebra, Lie ( G ) = g. In particular, if
Lie( G ) = 0, then G is a discrete group, and vice versa.
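Relation (3.33) can be seen numerically: for small t the group commutator deviates from I by t2 [ X, Y ] to leading order. A Python sketch using the nilpotent matrices E and F of sl(2, R ), for which [ E, F ] = H:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def smul(c, X):
    return [[c * a for a in row] for row in X]

def mexp(X, nterms=30):
    # truncated exponential series exp X = sum X^m/m!
    n = len(X)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for m in range(1, nterms):
        term = [[sum(term[i][k] * X[k][j] for k in range(n)) / m
                 for j in range(n)] for i in range(n)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

X = [[0.0, 1.0], [0.0, 0.0]]            # E
Y = [[0.0, 0.0], [1.0, 0.0]]            # F
XY_bracket = [[1.0, 0.0], [0.0, -1.0]]  # [X, Y] = H

t = 1e-3
C = matmul(matmul(mexp(smul(t, X)), mexp(smul(t, Y))),
           matmul(mexp(smul(-t, X)), mexp(smul(-t, Y))))
# (C - I)/t^2 should approximate [X, Y] up to O(t) corrections
approx = [[(C[i][j] - (1.0 if i == j else 0.0)) / t**2 for j in range(2)]
          for i in range(2)]
err = max(abs(approx[i][j] - XY_bracket[i][j]) for i in range(2) for j in range(2))
print(err < 1e-2)  # True
```

The residual error is of order t, in agreement with the dots in (3.33) standing for terms of degree three and higher.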

E XAMPLE 50: Exponential of a diagonalizable matrix. Let X ∈ M( n, F ) (where


F = R, or C), and assume it to be diagonalizable, i.e., expressible as X = SΛS−1,
where Λ is a diagonal matrix, Λ = diag[ λ1, . . . , λn ], and S an invertible matrix.
Then exp Λ = diag[exp λ1 , . . . , exp λn ] since Λm = diag[ λ1m , . . . , λnm ], and the exponential of X is exp X = exp( SΛS−1 ) = S (exp Λ ) S−1 . When X is real, exp X
must be real, even though S and Λ may be complex.
E XAMPLE 51: Exponential of a nilpotent matrix. A matrix M is said to be nilpotent if there is some positive finite integer m such that Mℓ = 0 for all integers ℓ ≥ m. Let X ∈ M( n, F ) be nilpotent, with Xm = 0 and Xm−1 ≠ 0. Then exp X = I + X + · · · + Xm−1/( m − 1 )! can be explicitly calculated.
E XAMPLE 52: Cyclic matrix. If X is a cyclic matrix of order n, such that Xn = I,
then eX = a0 I + a1 X + · · · + an−1 Xn−1 . For example, if X2 = I, then separate
summation of terms of odd and even powers yields etX = I cosh t + X sinh t.
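For instance, with X = [0 1; 1 0] (so that X2 = I), summing the truncated series reproduces I cosh t + X sinh t; a Python sketch:

```python
import math

def smul(c, X):
    return [[c * a for a in row] for row in X]

def mexp(X, nterms=30):
    # truncated exponential series exp X = sum X^m/m!
    n = len(X)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for m in range(1, nterms):
        term = [[sum(term[i][k] * X[k][j] for k in range(n)) / m
                 for j in range(n)] for i in range(n)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

X = [[0.0, 1.0], [1.0, 0.0]]   # X^2 = I
t = 0.9
E = mexp(smul(t, X))
expected = [[math.cosh(t), math.sinh(t)],
            [math.sinh(t), math.cosh(t)]]
err = max(abs(E[i][j] - expected[i][j]) for i in range(2) for j in range(2))
print(err < 1e-10)  # True
```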
E XAMPLE 53: General linear Lie group, algebra. What is the Lie algebra of the Lie
group GL( n, R )? Given a smooth curve γ ( t ) in GL( n, R ), it is real for all real t, and γ′ (0) = X is also real and in M( n, R ). On the other hand, if X is any n × n
real matrix, then etX is real for real t, and invertible; and is in GL( n, R ). So, the
Lie algebra of GL( n, R ) is the space of all n × n real matrices. Similarly, if X is
any n × n complex matrix, γX ( t ) = etX is complex, invertible, and must be in
GL( n, C ): the Lie algebra of GL( n, C ) is the space of all n × n complex matrices.
We thus have for Lie GL( n, F ) the space of all n × n F-valued matrices, that is,
gl( n, F ) = M( n, F ), with F = R, or C.
E XAMPLE 54: Special linear Lie group, algebra. When GL( n, F ) is restricted to uni-
modular elements, we obtain SL( n, F ). If Tr X = 0, then det exp ( tX ) = 1 for all t
(because det exp ( tX ) = exp( tTr X)). On the other hand, if X is any n × n matrix
such that det exp( tX ) = 1 for all t, then exp( tTr X) = 1 for all t, and so Tr X = 0.
Thus, the Lie algebra of the Lie group SL( n, F ) is the space of all n × n F-valued
traceless matrices, denoted sl ( n, F ) with F = R, or C. In formula, it is

Lie SL( n, F ) = sl ( n, F ) = { X | X ∈ gl ( n, F ); Tr X = 0} .

E XAMPLE 55: Orthogonal Lie group, algebra. Let γ ( t) with γ (0) = I be a curve
consisting of orthogonal matrices in O ( n, F ), meaning that γT ( t ) γ ( t ) = I for all t. It follows, upon differentiation with respect to t and setting t = 0, that γ′ (0)T + γ′ (0) = 0, and so X = γ′ (0) is in o( n, F ). On the other hand, take any X ∈ o( n, F ) (so
that XT + X = 0), then the product (exp( tX ))T exp( tX ) = exp ( tX T ) exp( tX ) has
the derivative exp( tXT )( XT + X) exp( tX) identically equal to 0 for all t, which
means that (exp( tX ))T exp ( tX ) is constant for all t, which is I if we let t = 0.
Hence (exp( tX ))T exp( tX ) = I, or exp( tX) lies in O ( n, F ) for all t. In conclusion,

Lie O ( n, F ) = o( n, F ) = { X | X ∈ gl ( n, F ); XT + X = 0} .

The special orthogonal Lie algebra so ( n, F ) is a subalgebra of o( n, F ) that sat-


isfies in addition the condition Tr X = 0. But since X = − XT implies a traceless
X, it is not a new restriction, and so so ( n, F ) = o( n, F ). This can be understood

because the Lie group SO( n, F ) is the connected component of O ( n, F ), so that the
two groups have the same properties near the identity, and therefore the same Lie
algebra. This is an example of an important general result in Lie group theory:
• Given the connected components G0 and G00 of linear groups G and G0 , if G0 =
G00 then Lie( G ) = Lie ( G0 ); and vice versa.
E XAMPLE 56: Unitary Lie group and algebra. What is the Lie algebra of the Lie
group U( n )? A matrix A is unitary if and only if A† = A−1 , and etX is unitary if and only if (etX )† = e−tX , or etX† = e−tX (that (etX )† = etX† can be seen from the power series of exp). For this to hold, it suffices that X† = − X. Conversely, if etX† = e−tX , it is necessary that X† = − X, as seen by differentiating at t = 0.
other words, etX is unitary if only if X is skew-Hermitian, or

Lie U( n ) = u( n ) = { X | X ∈ gl( n, C ); X† + X = 0} .

The special unitary group SU( n ) is a subgroup defined by the additional condi-
tion det etX = 1 for all real t, which holds iff Tr X = 0 in the tangent vector space.
We write this as: Lie SU( n ) = su ( n ) = { X | X ∈ u( n ); Tr X = 0} .
E XAMPLE 57: Symplectic Lie group and algebra. A symplectic Lie group consists
of elements A satisfying the condition AT J A = J. The corresponding relation (etX )T J etX = J holds for all real t iff XT J + J X = 0. And we have

Lie Sp( n, F ) = sp( n, F ) = { X | X ∈ gl (2n, F ); XT J + J X = 0} .

Finally, the Lie algebra of Sp( n ) is sp( n ) = sp( n, C ) ∩ u(2n ).


E XAMPLE 58: Generalized orthogonal Lie group, algebra. An ( n + m ) by ( n + m )
real matrix A is in the generalized orthogonal group O ( n; m ) iff AT gA = g, or
equivalently gAT g = A−1 , where g = gn,m = diag[1, . . . , 1; − 1, . . . , − 1]. If X is
an ( n + m ) × ( n + m ) real matrix, then etX is in O ( n; m ) for every real number t
iff g exp ( tXT ) g = exp ( tgXT g) = exp(− tX ), that is iff gXT g = − X. Thus, the
Lie algebra of O ( n; m ) consists of all the ( n + m ) × ( n + m ) real matrices X that
satisfy gXT g = − X. It is called o( n; m ) :

Lie O ( n; m ) = o( n; m ) = { X | X ∈ gl( n + m, R ); XT g + gX = 0} ,

where g is the non-definite metric gn,m . Since gXT g = − X implies that X is traceless, Tr X = − Tr X = 0 (using the identities Tr ( ABC ) = Tr ( BCA ), Tr XT = Tr X, and g2 = 1), it follows that o( n; m ) is the same as the Lie algebra of SO( n; m ). We write this result as Lie SO( n; m ) = so ( n; m ) = o( n; m ).
SO( n; m ). We write this result as Lie SO( n; m ) = so ( n; m ) = o( n; m ). 
SIMPLE, SEMISIMPLE ALGEBRAS. In parallel with semisimple and simple (Lie)
groups (cf. Chap. 1 Sec. 1.4), we now define an important class of Lie algebras.
• A Lie algebra g is said to be semisimple if it has no non-zero abelian ideal.
Semisimplicity is a useful attribute of g because, in an abelian ideal, all elements commute with each other, behave essentially like numbers, and do not show up in the structure constants; nothing can be learned about the abelian invariant subalgebras from them. An even more basic structure is that of simple algebras, defined as follows:

• A Lie algebra g is simple if it is of dimension greater than 1, and if it has no


nonzero ideal except g itself.
The dimension restriction rules out the trivial case of one-dimensional abelian
Lie algebras. It is equivalent to requiring that g be nonabelian because otherwise
it would have non-trivial ideals. So the stated definition may be replaced with: A
Lie algebra g is simple if it is not abelian and has no nonzero proper ideal. It is also clear
that if g is a simple Lie algebra, then g is equal to its derived algebra, i.e. [g, g] = g
(D g = [g, g] is an ideal in g and cannot be zero, otherwise g would be abelian).
Finally, any simple Lie algebra is necessarily semisimple; so it is a limiting case in
the class of semisimple Lie algebras.
When a Lie group G is the direct product of its subgroups H and H′, i.e. G = H × H′, the elements in the two subgroups commute, and the Lie algebra g = Lie( G ) is given by the direct sum of the Lie algebras h and h′ of H and H′. We write this as g = h ⊕ h′, so that the direct-sum vector space, with typical elements ( X, X′ ), can be regarded as a space defined with the Lie bracket [( X, X′ ), (Y, Y′ )] = ([ X, Y ], [ X′, Y′ ]) (using customary notation for Cartesian products). The summands h and h′ (i.e. all the ( X, 0) and (0, X′ )) are ideals in g, and nullify each other, [h, h′ ] = 0. The importance of simplicity lies in the fact that every semisimple Lie algebra decomposes as a direct sum of simple ideals.
EXAMPLE 59: Dimension one. The only real Lie algebra of dimension 1, g = Rx, is abelian, [x, x] = 0. It is not simple because its dimension is 1; it is not semisimple because it has a non-zero abelian ideal, namely ⟨x⟩.
EXAMPLE 60: Dimension two (Example 40). The two-dimensional abelian algebra is not semisimple. Neither is the nonabelian ⟨X1, X2 : [X1, X2] = X2⟩, because its Lie subalgebra ⟨X2⟩ is both abelian and invariant.
EXAMPLE 61: Dimension three (Example 41). The three-dimensional algebras with dim Dg = 0, 1, 2 are not semisimple. The case with dim Dg = dim g = 3, corresponding to the complex Lie algebra sl(2, C), is simple, as we will now show. Take h, e, f for its basis, as in (3.29). Any ideal q in g is in particular invariant under ad h, which is diagonalizable with the distinct eigenvalues 0, 2, −2 on h, e, f; hence q must be spanned as a vector space by a subset of {h, e, f}. If an ideal q contains h, then [h, e] = 2e ∈ q and [h, f] = −2f ∈ q, and q = sl(2, C). On the other hand, if q contains e, then [e, f] = h ∈ q and [h, f] = −2f ∈ q, and so q = sl(2, C). Finally, if f ∈ q, then q = sl(2, C). Hence, the only nonzero ideal sl(2, C) can have is itself: it has no proper ideals; so sl(2, C) is a simple Lie algebra.
The complete list of simple complex Lie algebras, which is the main aim of structure theory, is sl(n, C) (for n > 1), so(n, C) (for n > 2, n ≠ 4), sp(n, C), plus five exceptional algebras.

3.5.2 Related Properties


Morphism. Every map of Lie groups ϕ : G1 → G2 that preserves the group laws
(called a morphism of Lie groups) defines a corresponding map of Lie algebras
ϕ∗ : g1 → g2 that preserves the algebra laws. Here are two simple examples of this
relationship: (1) The Lie group commutator defines the Lie algebra commutator, as in
(3.33). (2) Every Lie subgroup H of a Lie group G defines a Lie subalgebra Lie( H ) of
Lie( G ).
102 CHAPTER 3. LIE GROUPS & LIE ALGEBRAS

As we recall, a Lie subgroup H in Lie group G is said to be normal if it is


invariant under conjugation, γαγ−1 ∈ H for all α ∈ H and all γ ∈ G. Then,
evidently γαγ−1α−1 ∈ H; from (3.33), we have [g, h] ⊆ h, which means that h is
invariant under the Lie multiplication by g, or that h is an ideal in g. Given Lie
group G and a Lie subgroup H in G, and their respective Lie algebras g = Lie ( G )
and h = Lie( H ), the Lie subalgebra h is an ideal if (and only if) H is a normal
subgroup. With h being an ideal in g, the quotient space g/h consists of elements
given by the cosets X + h; it carries an induced bracket defined by [ X + h, Y + h ] =
[ X, Y ] + h (where X, Y are arbitrary representatives). This quotient space g/h is a
Lie algebra, called the quotient Lie algebra.
EXAMPLE 62: u(2). The Lie algebra u(2) is the real space of 2 × 2 skew-Hermitian complex matrices, with the commutator as the Lie product. For a basis, we take the skew-Hermitian matrices S0 = diag[i, i], and Si, i = 1, 2, 3 as in Example 45: they satisfy the relations [S0, Si] = 0 and [Si, Sj] = eijk Sk. We have also seen before that S1, S2, S3 form a basis in su(2). Now, q = ⟨S0⟩ is an abelian ideal in u(2); and q′ = ⟨S1, S2, S3⟩ is a nonabelian ideal in u(2), because [S0, Si] = 0 ∈ q′ and [Si, Sj] = eijk Sk ∈ q′. Thus, u(2) is neither simple nor semisimple, and (since [S0, Si] = 0) we may write u(2) = u(1) ⊕ su(2).
EXAMPLE 63: u(n). The unitary Lie algebra u(n) is the set of all n × n skew-Hermitian complex matrices. The special unitary Lie algebra su(n) is the set of u(n) matrices with zero trace. The space q spanned by multiples of the n × n diagonal matrix iIn is an ideal in u(n), because if x ∈ q and y is any element of u(n), then [x, y] = 0 ∈ q. On the other hand, the space q′ of all traceless matrices in u(n) is an ideal too, because if x ∈ q′ and y is any element of u(n), then [x, y] = z ∈ q′, since the trace of any commutator necessarily vanishes. In addition, [q, q′] = 0. So, u(n) is neither simple nor semisimple, and every one of its elements can be written as the sum of one element from q = u(1) and one from q′ = su(n). Hence, u(n) is the direct sum of two ideals, one of dimension one and the other simple, u(n) = u(1) ⊕ su(n). 
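The decomposition u(n) = u(1) ⊕ su(n) is easy to illustrate numerically; the sketch below (Python/NumPy, with hypothetical variable names) splits a generic skew-Hermitian matrix into its trace part along iIn and a traceless remainder, and checks the properties claimed above.

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Z = M - M.conj().T                  # generic element of u(n): Z† = −Z

z0 = (np.trace(Z) / n) * np.eye(n)  # trace part, in q = u(1) (multiple of iI_n)
z1 = Z - z0                         # remainder, in q' = su(n)

ok_skew = np.allclose(z1, -z1.conj().T)        # z1 is skew-Hermitian
ok_trace = abs(np.trace(z1)) < 1e-10           # and traceless
ok_comm = np.allclose(z0 @ z1 - z1 @ z0, 0)    # the two ideals nullify each other
print(ok_skew, ok_trace, ok_comm)  # True True True
```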
The Baker-Campbell-Hausdorff Formula. As we have learned, every connected
Lie group G defines a Lie algebra g as the tangent space at the identity, with
γ0 (0) = X ∈ g for each smooth curve γ ( t) ∈ G passing through the identity,
γ (0) = I. The expansion series for eX eY in its complete form is known as the
Baker–Campbell–Hausdorff (BCH) formula. It states that, for X, Y ∈ g, one has

eX eY = eµ(X,Y) , log(eX eY ) = µ ( X, Y ), (3.34)

in which the g-valued function µ ( X, Y ) is given by a series of multiple commu-


tators of X, Y which converges to some Z ∈ g in some neighborhood of (0, 0):
µ(X, Y) = X + Y + (1/2)[X, Y] + (1/12){[X, [X, Y]] + [Y, [Y, X]]} + · · · .   (3.35)
This expression, a series of Lie polynomials consisting of commutators of X, Y,
and their commutators of all degrees in X, Y, is universal (i.e., independent of
g and the choice of X, Y). The existence of the BCH formula is very significant,

because it shows that the group law of a connected Lie group G can be recovered from the Lie bracket in the Lie algebra g (but not necessarily the overall geometry of G) by the mapping µ(X, Y) ↦ exp(X) exp(Y).
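The BCH series can be tested numerically: for matrices of small norm, the truncation (3.35) already reproduces exp(X) exp(Y) far better than the naive guess exp(X + Y). This is a sketch (Python/NumPy; expm here is a simple power-series exponential, adequate for small norms, and brk is an illustrative helper name):

```python
import numpy as np

def expm(A, terms=30):
    """Matrix exponential by its power series (fine for small-norm A)."""
    out, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

def brk(A, B):
    return A @ B - B @ A

X = np.array([[0.0, 0.1], [0.0, 0.0]])
Y = np.array([[0.0, 0.0], [0.1, 0.0]])

# BCH series (3.35), truncated after the double commutators
mu = X + Y + brk(X, Y) / 2 + (brk(X, brk(X, Y)) + brk(Y, brk(Y, X))) / 12

err_bch = np.abs(expm(X) @ expm(Y) - expm(mu)).max()
err_naive = np.abs(expm(X) @ expm(Y) - expm(X + Y)).max()
print(err_bch < 1e-4, err_naive > 1e-3)  # True True: mu beats X + Y by far
```

The residual of the truncated series is of fourth order in the commutators, while dropping the brackets altogether leaves an error of order ½[X, Y].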
From Lie algebra back to Lie group. The reversed path (from Lie algebra to Lie
group) is more complex, and for this reason, we just enumerate here a few of
the most important results concerning connected Lie groups obtainable from Lie
algebras (see [FH], [Sa], [Ja] for details and proofs):
• The product rule in a connected Lie group G admitting Lie algebra g = Lie( G )
can be recovered from the Lie bracket in g via (3.34).
• Any Lie algebra is isomorphic to a subalgebra of gl ( n ) (Ado’s theorem). Hence,
every Lie algebra is equivalent to a matrix Lie algebra, in contrast to the
fact that not all Lie groups are matrix Lie groups.
• Any finite-dimensional Lie algebra is isomorphic to the Lie algebra Lie ( G ) of some
Lie group G (Lie’s third theorem).
• For any finite-dimensional Lie algebra g, there is, up to isomorphism, a unique
connected, simply-connected Lie group G such that Lie( G ) = g. Any other
connected Lie group G0 with Lie ( G0 ) = g must be a quotient group of the
form G/Z for some discrete invariant subgroup Z in G.

EXAMPLE 64: Consider the Lie algebra u(2), with basis S0 and Si (i = 1, 2, 3), just as
in Examples 45 and 62. We calculate the elements of the Lie group G = U (2),
admitting Lie(G) = u(2) as the Lie algebra by exponential mapping. Let X =
θ0 S0 + ∑3i=1 θi Si, with real numbers θ0, θi , be an arbitrary element in u(2), then
the exp-map X ↦ exp X produces an element in U(2):

X ↦ exp(θ0 S0 + ∑i θi Si) = exp(θ0 S0) exp(∑i θi Si),

where we have used [S0, Si] = 0 in the last equality. To transform this into an explicit matrix form, we will need (∑i θi σi)² = θ² I2, with θ² = ∑i θi², as shown here:

(∑i θi σi)² = ∑i,j θi θj σi σj = ∑i,j θi θj (δij I2 + i eijk σk) = (∑i θi²) I2 ,

the eijk term dropping out by antisymmetry.

As Sj = −iσj/2, we have (∑i θi Si)² = −(θ/2)² I2. In the expansion series for the function exp(∑i θi Si), we sum separately all terms of even powers, using (∑i θi Si)^{2k} = (−1)^k (θ/2)^{2k} I2 with k = 0, 1, . . . , and all terms of odd powers, using (∑i θi Si)^{2k+1} = −i n · σ (−1)^k (θ/2)^{2k+1}, with k = 0, 1, . . . and ni = θi/θ. The end result reads
result reads
exp(θ0 S0) exp(∑i θi Si) = exp(iθ0) [ I2 cos(θ/2) − i n · σ sin(θ/2) ]

                         = exp(iθ0) | cos(θ/2) − i n3 sin(θ/2)      −(n2 + i n1) sin(θ/2) |
                                    | (n2 − i n1) sin(θ/2)          cos(θ/2) + i n3 sin(θ/2) | .

(The minus sign in − i n · σ comes from the sign in S j = − iσj /2.) Thus, any ele-
ment of U (2) is specified by four real parameters: one specifying a phase factor
exp(iθ0), and three determining a 2-by-2 unimodular unitary matrix, a result im-
plying that U (2) is expressible as a direct product of U (1) and SU (2).
In the U (2) matrix given above, call A the factor belonging to SU (2), and
re-express its elements in terms of r0 = cos( θ/2) and ri = ni sin( θ/2), with i =
1, 2, 3. If we regard r0, ri as the Cartesian coordinates in a 4-dimensional Euclidean
space, then, as det A = r20 + r21 + r22 + r23 = 1, the group space is the unit sphere
S3, and so SU (2) is both compact and  simply-connected.
Again, let A(n, θ) = exp(∑i θi Si) ∈ SU(2); then A(n, 0) = diag[1, 1] = I2 is the identity, and A(n, ±2π) = −I2 for arbitrary n. Further, we also have A(n, θ + 4π) = A(−n, −(θ + 2π)) = A(n, θ). It follows that in the parameter space the
group elements occupy a sphere of radius θ = 2π, centered at the identity I2,
and with all points on the spherical shell θ = 2π identified with − I2. I2 and − I2
form a discrete invariant subgroup Z = { I2, − I2} of order 2 in SU (2). Hence,
the Lie algebra su(2) gives rise to two distinct Lie groups: SU(2)/{I2} ≡ SU(2) and SU(2)/Z ≅ SO(3). Any pair of elements ±A ∈ SU(2) corresponds to a single element R ∈ SO(3), and the map SU(2) → SO(3) is two-to-one. 
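The two-to-one covering can be made concrete with the standard map R_ij = (1/2) Tr(σi A σj A†), which sends A ∈ SU(2) to a rotation R ∈ SO(3). The sketch below (Python/NumPy, with to_SO3 an illustrative name) checks that the image is a rotation and that ±A project to the same R:

```python
import numpy as np

sig = [np.array([[0, 1], [1, 0]], dtype=complex),       # Pauli matrices
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

def to_SO3(A):
    """R_ij = (1/2) Tr(sigma_i A sigma_j A_dagger), for A in SU(2)."""
    return np.array([[0.5 * np.trace(sig[i] @ A @ sig[j] @ A.conj().T).real
                      for j in range(3)] for i in range(3)])

theta = 0.7
A = np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])  # exp(-i θ σ3/2)

R = to_SO3(A)
is_rotation = np.allclose(R.T @ R, np.eye(3)) and np.isclose(np.linalg.det(R), 1.0)
two_to_one = np.allclose(R, to_SO3(-A))   # A and −A give the same rotation
print(is_rotation, two_to_one)  # True True
```

For this A, a rotation about the z-axis by the angle θ, one finds R[0, 0] = cos θ and R[1, 0] = sin θ, as expected.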
The relationship between some matrix Lie groups and matrix Lie algebras is summarized in Table 3.2.

Table 3.2: Matrix Lie groups and associated Lie algebras.

Group G     Conditions              Lie(G)     Conditions             S  SS  Dimension

GL(n, R)    det A ≠ 0               gl(n, R)   M(n, R)                ∗  ∗   n²
SL(n, R)    det A = 1               sl(n, R)   Tr X = 0               Y  Y   n² − 1
O(n)        AᵀA = I                 so(n)      Xᵀ = −X                Y  Y   n(n − 1)/2
SO(n)       AᵀA = I, det A = 1      so(n)      Xᵀ = −X                Y  Y   n(n − 1)/2
U(n)        B†B = I                 u(n)       Z† = −Z                †  †   n²
SU(n)       B†B = I, det B = 1      su(n)      Z† = −Z, Tr Z = 0      Y  Y   n² − 1
Sp(n, R)    AᵀJA = J                sp(n, R)   XᵀJ = −JX              Y  Y   n(2n + 1)
Sp(n)       BᵀJB = J, B†B = I       sp(n)      ZᵀJ = −JZ, Tr Z = 0    Y  Y   n(2n + 1)
GL(n, C)    det B ≠ 0               gl(n, C)   M(n, C)                ‡  ‡   n²
SL(n, C)    det B = 1               sl(n, C)   Tr Z = 0               Y  Y   n² − 1
O(n, C)     BᵀB = I                 so(n, C)   Zᵀ = −Z                Y  Y   n(n − 1)/2
SO(n, C)    BᵀB = I, det B = 1      so(n, C)   Zᵀ = −Z                Y  Y   n(n − 1)/2
Sp(n, C)    BᵀJB = J                sp(n, C)   ZᵀJ = −JZ              Y  Y   n(2n + 1)

A, X, and I are real square matrices, whereas B and Z are complex square matrices, all of size n × n, except for the symplectic groups/algebras, where they are 2n × 2n, as is the skew-symmetric metric matrix J. The dimensions in the last column are over R (or C) for real (or complex) groups/algebras. S and SS refer to the simplicity and semi-simplicity of the Lie algebra (Y for yes). Exceptional cases: so(2) and so(2, C) are one-dimensional abelian; so(4) and so(4, C) are semisimple, but not simple. Decompositions: ∗ gl(n, R) = RI ⊕ sl(n, R); † u(n) = iRI ⊕ su(n); ‡ gl(n, C) = CI ⊕ sl(n, C).
3.6. REPRESENTATIONS 105

3.6 Representations
We now discuss in general terms the representations of Lie groups and Lie algebras. As these two structures are closely related, the latter being the simpler of the two, it is important to establish how and to what extent representations of Lie algebras can inform us about the representations of Lie groups. In
general, we will assume representations to be complex, but will consider both real
and complex Lie groups and Lie algebras. The scope of our present study of the
representations of Lie groups and Lie algebras is determined by their dimensions:
we will be dealing only with Lie groups, Lie algebras, and their representations
of finite dimensions.

3.6.1 Basic Definitions


What do we mean by representations of a Lie group, or of a Lie algebra?
Definition 3.5 (Representation of a Lie group). A representation ( Π, V ) of a Lie
group G is a vector space V together with a morphism Π : G → GL (V ).
We give G a finite-dimensional representation ( Π, V ) by assigning to every
element α of G a linear operator Π ( α) on a finite-dimensional vector space V with
inner product ⟨·, ·⟩, such that all such operators obey the group axioms, namely, Π(ε) = I (identity) and Π(α · β) = Π(α)Π(β) for any α, β ∈ G, and satisfy a continuity condition: ⟨φ, Π(α)ψ⟩ is a continuous function of the group parameters for arbitrary ψ, φ ∈ V. Then the map Π(α) : V → V, with
α ∈ G, can also be regarded as a linear action of G on V . Similarly, for Lie algebras:
Definition 3.6 (Representation of a Lie algebra). A representation ( π, V ) of a Lie
algebra g is a vector space V together with a morphism π : g → gl(V ).
This means that to each element x of g we assign an operator π(x) : V → V such that π preserves both linearity, π(ax + by) = aπ(x) + bπ(y), and the Lie bracket, π([x, y]) = [π(x), π(y)]. The morphism π : g → gl(V) can be equivalently viewed as a linear action of g on V by the mapping g × V → V. (The term ‘module’
rather than ‘representation’ is sometimes used, identifying ‘module over g’ (or g-
module) with ‘representation of g’, although strictly speaking they are different,
[Ja] p. 14.)
Representation ( Π, V ) (or ( π, V )) is said to be real or complex, depending on
whether representation space V is a real or complex vector space. In general, if
G (or g) is complex, V should be complex; but even if G (or g) is real, one may
equally take a complex space V . In fact, it turns out that complex representations
simplify the theory even for real Lie groups or algebras.
As we are dealing with a finite (d) dimensional vector space, let us pick a finite
basis { ui; i = 1, 2, . . . , d } , and define matrix representations in this basis:

Π(α) uj = uk Dkj(α) ,   α ∈ G,   (3.36)

π(x) uj = uk Dkj(x) ,   x ∈ g   (3.37)

(summing over repeated indices; no significance attached to upper/lower posi-


tions of indices). All the Lie groups and Lie algebras discussed in Sec. 3.1 and
Sec. 3.4.1 come equipped with a natural matrix representation (e.g. GL( n, C ) and
gl( n, C )) acting on some vector space Fn . In matrix language, a complex rep-
resentation of Lie group G is a Lie group homomorphism Π : G → GL ( n, C ),
and a complex representation of Lie algebra g is a Lie algebra homomorphism
π : g → gl ( n, C ). Of course, these matrix Lie groups and matrix Lie algebras may
have representations in any spaces, other than the ‘standard spaces’, as well. To
find and classify these possibilities is among the aims of representation theory.
On the matter of notations, we shall simply use π or V , rather than ( π, V ), to
designate a representation of a Lie algebra. We may also write x to mean π ( x),
and xv to mean π ( x) · v. Similar notations apply to representations of Lie groups.
EXAMPLE 65: The Trivial Representation of a Lie group G is Π : G → GL(1, C), with
Π ( α) = I for all α ∈ G. The trivial representation of any Lie algebra g is the map
π : g → gl (1, C ) such that π ( X ) = 0 for all X ∈ g.
EXAMPLE 66: The Standard Representation is the simplest non-trivial representation; it may serve to define the structure of a group or algebra. For a Lie group G,
it is the map Π ( A ) = A for every A ∈ G; for a Lie algebra g, the map π ( X ) = X
for every X ∈ g. Thus, V = Fn for GL( n, F ), SL( n, F ), O ( n, F ); V = Cn for
U( n ), SU( n ); V = F2n for Sp( n, F ). Analogously for corresponding Lie algebras.
EXAMPLE 67: The affine Lie algebra of the line aff(1) discussed in Examples 24–
25 can be defined by two 2 × 2 real matrices, with ( X1 )11 = 1, ( X2)12 = 1 and
all other entries equal to 0. They obey the commutation relation [ X1, X2 ] = X2 ,
which by itself defines aff(1).
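This bracket takes one line to check in Python/NumPy (a sketch, not part of the text's development):

```python
import numpy as np

X1 = np.array([[1.0, 0.0], [0.0, 0.0]])   # (X1)_{11} = 1, all else 0
X2 = np.array([[0.0, 1.0], [0.0, 0.0]])   # (X2)_{12} = 1, all else 0

holds = np.allclose(X1 @ X2 - X2 @ X1, X2)
print(holds)  # True: [X1, X2] = X2, the defining relation of aff(1)
```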
EXAMPLE 68: The standard representation of SO(3) is given by the set of all rotations acting on R3; it is a real representation. The orthogonal Lie algebra so(3) is defined in (3.27) by the brackets [xi, xj] = xk and cyclic permutations of i, j, k. A basis is given by the following skew-symmetric matrices
     
0 0 0 0 0 1 0 −1 0
R x = 0 0 − 1 , R y =  0 0 0  , R z = 1 0 0 .
0 1 0 −1 0 0 0 0 0
They satisfy the commutation relations [ R x, R y ] = R z and cyclic permutations.
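These brackets are easy to confirm numerically (a Python/NumPy sketch, with brk an illustrative helper name):

```python
import numpy as np

Rx = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=float)
Ry = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], dtype=float)
Rz = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)

def brk(a, b):
    return a @ b - b @ a

ok = (np.allclose(brk(Rx, Ry), Rz) and
      np.allclose(brk(Ry, Rz), Rx) and
      np.allclose(brk(Rz, Rx), Ry))
print(ok)  # True: [Rx, Ry] = Rz and cyclic permutations
```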
EXAMPLE 69: The algebra sl(2, C) = C{h, x+, x−} is defined by the Lie brackets
[ h, x± ] = ± 2x± and [ x+ , x− ] = h. (The substitution h = 2J3, x± = J± will give us
a more familiar form in J±, J3.) Its standard representation V = C2 is:

H = | 1  0 |     X+ = | 0 1 |     X− = | 0 0 |
    | 0 −1 | ,        | 0 0 | ,        | 1 0 | .
Equivalent notations: E = X+, F = X−. Here is a non-standard representation of sl(2):

H = | 2 0  0 |     X+ = | 0 2 0 |     X− = | 0 0 0 |
    | 0 0  0 | ,        | 0 0 1 | ,        | 1 0 0 | . 
    | 0 0 −2 |          | 0 0 0 |          | 0 2 0 |
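Both matrix sets of this example can be checked against the defining brackets of sl(2); here is a sketch in Python/NumPy (is_sl2_rep and brk are illustrative helper names):

```python
import numpy as np

def brk(a, b):
    return a @ b - b @ a

def is_sl2_rep(H, Xp, Xm):
    """Check the defining brackets [H, X+-] = +-2 X+-, [X+, X-] = H."""
    return (np.allclose(brk(H, Xp), 2 * Xp) and
            np.allclose(brk(H, Xm), -2 * Xm) and
            np.allclose(brk(Xp, Xm), H))

# standard representation on C^2
H2, Xp2, Xm2 = (np.array([[1., 0.], [0., -1.]]),
                np.array([[0., 1.], [0., 0.]]),
                np.array([[0., 0.], [1., 0.]]))

# the non-standard 3-dimensional representation quoted above
H3 = np.diag([2., 0., -2.])
Xp3 = np.array([[0., 2., 0.], [0., 0., 1.], [0., 0., 0.]])
Xm3 = np.array([[0., 0., 0.], [1., 0., 0.], [0., 2., 0.]])

print(is_sl2_rep(H2, Xp2, Xm2), is_sl2_rep(H3, Xp3, Xm3))  # True True
```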

3.6.2 Relating Π ( G ) and π (g)


Given a representation of a Lie group, what do we know about representations
of its Lie algebra? Conversely, given a representation of a Lie algebra, is there a
corresponding representation of an associated Lie group? The answer is summa-
rized as follows:
          Π
  G  ────────→  GL(V)
  ↑ exp          ↑ exp
  g  ────────→  gl(V)
          π
Let G be a connected Lie group with Lie algebra g. Then,
• Every Lie-group representation Π : G → GL (V ) defines a unique Lie-algebra
representation π : g → gl (V ) by π ( x) = dΠ (etx) /dt |0 for all x ∈ g.
• Given a Lie algebra g such that its corresponding Lie group G is connected and
simply-connected, there exists for every representation π of g a unique represen-
tation Π of G, acting on the same space, such that Π (ex) = eπ(x) for all x ∈ g.
These are important results: since Lie algebras are finite-dimensional spaces,
they are easier to deal with, and are usually the indicated place to begin when
one wants to compute the representations of Lie groups. For example, su(2) is a
vector space with the basis X, Y, Z satisfying the commutation relations [ X, Y ] =
Z, [Y, Z] = X, and [ Z, X] = Y. Any representation found for su(2) is also a
representation of the connected simply-connected group SU(2).
Even if the Lie group under study, say G, is connected but not simply-connected,
the above result can still be useful because (from Sec. 3.5.2) G may be written
as the quotient group G = G̃/Z, for some connected and simply-connected Lie
group G̃ and a discrete invariant subgroup Z, so that representations of G are
representations of G̃ satisfying Π ( Z ) = I. Example: G = SO (3, R ), G̃ = SU (2).
Another result is that any finite-dimensional complex representation of a real Lie algebra g extends uniquely to a representation of its complexification gC; that is, g and gC have equivalent complex representations. Example: The finite-dimensional complex representations of SU(2), SL(2, C), su(2) and sl(2, C) are all equivalent.
In conclusion, the problem of computing the representations of Lie groups
can be reduced largely to the problem of computing the representations of the
associated Lie algebras.
The adjoint representation. The adjoint action Ad of a Lie group G on itself is the map Adγ · β = γβγ−1, with β, γ ∈ G. Differentiating, Adγ · Y = (d/dt) γ exp(tY) γ−1 |t=0 gives the adjoint action of G on its Lie algebra; at the infinitesimal level this yields a map ad : g → gl(g) with adX · Y = [X, Y], as we explicitly show now. The adjoint action on g = Lie(G) produces

adX · Y = (d/ds)(d/dt) exp(sX) exp(tY) exp(−sX) |s=t=0 ,

which is adX · Y = [ X, Y ]. As claimed, Adγ · β = γβγ−1 implies the ‘infinites-


imal’ action adX· = [ X, ·]. Given this fact, we see that the condition for a Lie

subgroup of a Lie group G to be normal is that it be Ad G-invariant, and for a Lie


subalgebra of a Lie algebra g to be an ideal in g, it be ad g-invariant.
Given a Lie group G with Lie algebra g, the group morphism Ad : G → GL(g), defined by Adα(X) = αXα−1 for α ∈ G and X ∈ g, is a real representation of G on the space g, called the adjoint representation of G (i.e. V = g, Π(G) = AdG). From it follows the ad representation of g: adX(Y) = (d/ds) AdγX(s) |s=0 · Y = [X, Y] (where γX(s) = esX, for X, Y ∈ g), called the adjoint representation of g (i.e. V = g, π(g) = adg). We may view the morphism ad : g → gl(g) as the regular representation of g acting on itself by bracket multiplication, with representation space V = g, which assigns to each X ∈ g the linear operator π(X) = adX. As a homomorphism of Lie algebras, ad preserves the Lie bracket, so that if [X, Y] = Z, then [adX, adY] = adZ must
follow for arbitrary elements X, Y of g. One can verify that it is indeed the case:
[adX, adY] W = adX adY · W − adY adX · W
             = [X, [Y, W]] − [Y, [X, W]] = −[W, [X, Y]]
             = [[X, Y], W] = ad([X, Y]) · W ,
where skew symmetry and the Jacobi identity have both been used. The result,
[ad X, ad Y ] = ad[ X, Y ], may be seen as an expression of the Jacobi identity.
Matrices in the adjoint representation of a Lie algebra can be expressed in
terms of its structure constants (in a given basis). Let { X1, . . . , Xη } be a basis
for an η-dimensional algebra g, satisfying the bracket relations [ Xi , Xj ] = Xk ckij ,
then the adjoint representation ad g is specified by ad Xi , with i = 1, 2, . . . , η. The
ad-representation matrix R of each Xi then follows:

ad Xi · Xj = Xk R kj ( Xi ) = Xk ckij ; (3.38)

so that (adXi )k j = R kj ( Xi ) = ck ij . Thus, the adjoint representation of a Lie algebra


contains exactly the same information as its structure constants, but in a more
convenient (operator or matrix) form, where ordinary matrix arithmetic applies,
with associative operator or matrix products replacing the non-associative Lie
brackets. This is especially true when we deal with multiple commutators. An
important example is the relation [ Xi , [ Xj , Xk ]] = X` (adXi · adXj )` k .
Remark: For a finite-dimensional Lie algebra g such as gl(n, F), the standard representation space is V = Fn, while the adjoint representation space is V = Fη where η = dim g. Similarly
for other classical Lie algebras. The adjoint representation will be used to define
the Killing form, which plays a role in testing semisimplicity of Lie algebras.
EXAMPLE 70: o(3). Generally, one calculates the adjoint-representation matrices
of a Lie algebra either from the structure constants, via R kj ( Xi ) = ckij , or by read-
ing off [ Xi , Xj ] = Xk R kj ( Xi ) or the multiplication table of the Lie algebra (with
the table entries giving R kj ( Xi )). Note in this case n = η = 3.
        Rx     Ry     Rz
Rx      0      Rz     −Ry
Ry      −Rz    0      Rx
Rz      Ry     −Rx    0

=⇒ R(Rx) = Rx ,  R(Ry) = Ry ,  R(Rz) = Rz .

EXAMPLE 71: sl(2, C). We have (rows, columns in the same order X+, H, X−):

        X+      H       X−
X+      0       −2X+    H
H       2X+     0       −2X−
X−      −H      2X−     0

=⇒ adX+ = | 0 −2 0 |    adH = | 2 0  0 |    adX− = |  0 0 0 |
          | 0  0 1 | ,        | 0 0  0 | ,         | −1 0 0 | .
          | 0  0 0 |          | 0 0 −2 |           |  0 2 0 |

In the canonical basis { S1, S2, S3 } , in which [ S1, S2 ] = S3 (cf. Example 45), we
have the non-zero matrix elements: (adS1)23 = − 1, (adS1 )32 = + 1; (adS2)13 =
+ 1, (adS2 )31 = − 1; (adS3 )12 = − 1, (adS3)21 = + 1. All other elements are 0.
Note for reference that (adS1)² = diag[0, −1, −1], (adS2)² = diag[−1, 0, −1], (adS3)² = diag[−1, −1, 0]; and so one finds Tr(adSi · adSj) = −2δij.
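Both facts of this example — that the ad matrices are built from the structure constants and that they preserve the bracket — can be confirmed with a short computation for the canonical su(2) basis (a Python/NumPy sketch):

```python
import numpy as np

# structure constants of su(2) in the canonical basis: [S_i, S_j] = eps_{ijk} S_k
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[j, i, k] = 1.0, -1.0

# (ad S_i)_{kj} = c^k_{ij} = eps_{ijk}
ad = [np.array([[eps[i, j, k] for j in range(3)] for k in range(3)])
      for i in range(3)]

brk = lambda a, b: a @ b - b @ a
preserves = all(np.allclose(brk(ad[i], ad[j]),
                            sum(eps[i, j, k] * ad[k] for k in range(3)))
                for i in range(3) for j in range(3))

# trace relation quoted in the text: Tr(ad S_i ad S_j) = -2 delta_ij
traces = np.array([[np.trace(ad[i] @ ad[j]) for j in range(3)]
                   for i in range(3)])
print(preserves, np.allclose(traces, -2 * np.eye(3)))  # True True
```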

3.6.3 Simple and Semisimple Representations


In order to classify representations, one must have a way to tell them apart: When
can we say two representations are ‘the same’?
Definition 3.7 (Morphism, isomorphism). Let (Π, V) and (Σ, W) be representations of a Lie group G. A linear map φ : V → W is called a morphism of representations (G-morphism) if φ(Π(x)v) = Σ(x)φ(v) for all v ∈ V and x ∈ G. A similar definition holds for morphisms between representations of Lie algebras (g-morphism).
The morphism relation φ ◦ Π ( x ) = Σ ( x) ◦ φ with x ∈ G, illustrated in the
following diagram, is described as the intertwining of ( Π, V ) and ( Σ, W ) by φ:
          φ
  V  ────────→  W
  │ Π(x)         │ Σ(x)
  ↓              ↓
  V  ────────→  W
          φ

From the relation π ( x) = dΠ (etx) /dt |0 for all x ∈ g between the representa-
tions of a Lie group G and its Lie algebra Lie ( G ) = g, one can expect that every
G-morphism is a g-morphism. The converse holds if G is connected and simply-
connected, so that φG (V , W ) = φg (V , W).
An isomorphism of representations is a morphism that is invertible. Two repre-
sentations related by an isomorphism are said to be isomorphic or equivalent to one
another, and can be regarded as the same.
Let V be a representation of a Lie group G (or Lie algebra g). Then a subrepre-
sentation is a vector subspace W ⊂ V invariant under the action: Π ( g)W ⊂ W
for all g ∈ G (respectively, π ( x)W ⊂ W for all x ∈ g). A subrepresentation W of
V is called proper if it is neither { 0} nor V itself.
Definition 3.8. A non-zero representation V is said to be irreducible or simple, iff it
has no proper subrepresentations. Otherwise, it is called reducible.

Again, in other words, the only invariant subspaces of V are 0 and V itself. A subspace V′ ⊂ V is invariant if gV′ ⊂ V′ for all g ∈ G (for a Lie group G), and if xV′ ⊂ V′ for all x ∈ g (for a Lie algebra g).
EXAMPLE 72: The trivial representation of a Lie group G, Π : G → GL(1, C), with Π(α) = I for all α ∈ G, is irreducible since C has no non-trivial subspaces at all. The trivial representation of a Lie algebra g, defined by π : g → gl(1, C) with π(X) = 0 for all X ∈ g, is also irreducible. The standard representations of most classical Lie groups and Lie algebras are irreducible. Thus, for each of the classical Lie groups GL(n, C), SL(n, C), SO(n, C), SU(n), the standard representation V = Cn is irreducible. Analogously, the standard representations V = Cn for gl(n, C), sl(n, C), so(n, C), su(n), u(n), and V = C2n for sp(n, C), are irreducible. 
One can produce smaller representations with subrepresentations, but one
can also build up larger ones. Suppose Π 1, Π 2 are representations of the group
G in vector spaces V 1, V 2, then the representation Π 1 + Π 2 (denoted Π 1 ⊕ Π 2) of
g ∈ G on a vector ( v 1, v2 ) of space V 1 ⊕ V 2 is defined by [ Π 1 ⊕ Π 2 ( g)](v1, v2 ) =
( Π 1( g) v1, Π 2 ( g) v2). It is just as for finite groups, except here one must also verify
that if Π 1 and Π 2 are continuous so is Π 1 + Π 2. As defined, Π 1 ⊕ Π 2 ( G ) is called
the direct sum of representations Π 1, Π 2 of the group G. Cf. Chapter 2, Sec. 2.3.
Similarly, let ( π1, V 1) and ( π2, V 2) be representations of Lie algebra g. Then
their direct sum is also a representation of g, denoted π1 ⊕ π2 acting on V 1 ⊕ V 2,
defined by [ π1 ⊕ π2 ( x)] ( v1, v2 ) = ( π1( x) v1, π2 ( x) v2) for all x ∈ g. Clearly, this
discussion can be generalized to several summands.

Definition 3.9 (Complete reducibility). A finite-dimensional representation V is said


to be completely reducible (or semisimple) if it is isomorphic to a direct sum of irre-
ducible representations: V = ⊕ Vi, with Vi irreducible.
In this decomposition, one usually groups together isomorphic summands,
writing V = ⊕ ai Vi, where the Vi’s are now pairwise non-isomorphic irreducible
representations, and ai non-negative integers, called multiplicities of Vi. Recall
from the last chapter that a finite-dimensional representation of a group G in V is
semisimple if and only if each stable subspace U ⊂ V has at least one stable com-
plementary subspace W ⊂ V , so that V = U ⊕ W . Not every representation is
completely reducible, so, which representations of which Lie groups can achieve
complete reduction?

Definition 3.10 (Unitary representation). (1) A complex representation ( Π, V ) of


a Lie group G is called unitary if there exists a G-invariant inner product, such that
h Π (g)v, Π (g)wi = h v, wi (or equivalently, Π ( g) ∈ U(V ) unitary Lie group) for any
g ∈ G and any v, w ∈ V . (2) A complex representation ( π, V ) of a Lie algebra g is
called unitary if there is a g-invariant inner product: h π ( x)v, wi + h v, π (x)wi = 0 (or
equivalently, π ( x) ∈ u(V ) unitary Lie algebra) for any x ∈ g.

We recall (cf. Sec. 2.4) that every finite-dimensional unitary representation of a


group is completely reducible. The proof by induction does not depend on whether
the group is finite or not, and so it also holds for Lie groups. But in contrast
to finite groups, complete reducibility is a condition more difficult to achieve for

arbitrary representations of Lie groups. In a finite group G, if B is a positive-definite, not necessarily G-invariant, inner product on the space of a representation Π of G, then one may define a new inner product B̄ by averaging B with the group action over all g ∈ G,

B̄(v, w) = (1/|G|) ∑g∈G B(Π(g)v, Π(g)w) ,   (3.39)

which is both positive-definite and G-invariant. It follows that Π is equivalent to


a unitary representation and therefore completely reducible. So every reducible
representation of a finite group is completely reducible. The above arguments
depend critically on the existence of B̄: Now, to have a similar inner product
for a continuous group G, one replaces the sum with some finite integral over G,
and one needs to have a well-defined integration measure, or a continuous linear
functional µ(G) on G. The key fact is that if G is a compact Lie group, there exists such a unique real positive measure (called the Haar measure, cf. [Fa] p. 228), which is invariant under G (i.e. ∫G (h f) dµ = ∫G f dµ for all h ∈ G and f ∈ C∞), and normalizable so that G has volume 1 (i.e. ∫G dµ = 1). Suppose now that Π is a
representation of a compact Lie group in the finite-dimensional vector space V
with any given positive-definite inner product B ( v, w) on V . Then we can define
B̄(v, w) = ∫G B(Π(g)v, Π(g)w) dµ(g) .   (3.40)

This inner product is positive definite, B̄(v, v) > 0 (it is the integral of a positive function against the positive Haar measure), and G-invariant, B̄(Π(h)v, Π(h)w) = B̄(v, w). The representation Π is unitary with
respect to the inner product B̄, and therefore completely reducible:
• Every finite-dimensional representation of a compact Lie group is either irre-
ducible, or completely reducible.
The existence of the Haar measure for compact Lie groups implies that many
properties of the representations of finite groups have parallels in the representa-
tions of compact Lie groups. In particular, characters can be defined, with familiar
properties. Let ui be a basis of a vector space V , and Π ( g) : V → V the linear oper-
ators corresponding to the elements g of a compact real Lie group G. A character
of representation V is defined to be a function on the group:

χV ( g) = TrV Π ( g) = ∑i Π ii ( g). (3.41)

From this definition, it is clear that χV does not depend on the choice of basis in
V, and that χV = 1 for the trivial representation V = C.
Now, define the inner product ⟨f1, f2⟩ = ∫G f1*(g) f2(g) dµ(g) with the Haar
measure µ(g) on G. With respect to this inner product, the characters are orthogonal: if V, W are non-isomorphic complex irreducible representations of G, then ⟨χV, χW⟩ = 0. Suppose V is a complex representation of a compact real Lie group G; then V is irreducible if and only if ⟨χV, χV⟩ = 1. As V is completely reducible, it can be uniquely decomposed as V = ⊕i ai Vi, where the Vi are
pairwise non-isomorphic simple representations, with the multiplicities given by

ai = ⟨χV, χVi⟩. These properties are just the same as for finite groups. So, if we restrict ourselves to compact Lie groups, we can carry over, with modifications (e.g. |G|−1 ∑g ↦ ∫G dµ(g)), the results derived in Chapter 2 for finite groups.
Among the Lie groups already mentioned, O ( n ), SO( n ), U( n ), SU( n ), and
Sp( n ) are compact groups (see Example 14 in Sec. 3.2.1).
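As an illustration of these statements for a specific compact group, the characters of the n-dimensional irreducible representations of SU(2), χn(θ) = sin(nθ/2)/sin(θ/2) on conjugacy classes labeled by θ, can be integrated numerically against the class form of the Haar measure, dµ = (1/π) sin²(θ/2) dθ — both formulas are standard results assumed here, not derived in this text. A Python/NumPy sketch:

```python
import numpy as np

N = 20000
theta = (np.arange(N) + 0.5) * 2 * np.pi / N   # midpoint grid, avoids θ = 0

def chi(n):
    """Character of the n-dimensional irreducible representation of SU(2)."""
    return np.sin(n * theta / 2) / np.sin(theta / 2)

def inner(f, g):
    # (1/π) ∫_0^{2π} f g sin²(θ/2) dθ, by the midpoint rule
    return 2 * np.mean(f * g * np.sin(theta / 2) ** 2)

# orthonormality of characters: <chi_m, chi_n> = delta_mn
print(round(inner(chi(2), chi(2)), 8), round(inner(chi(2), chi(3)), 8))
```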
Tensor Products of Representations. In Chapter 2 we defined products of representations of finite groups. That definition remains valid for Lie groups as well. We recall it now, and extend it to Lie algebras.

Definition 3.11 (One Lie group). If G is a Lie group, and ( Π 1, U ), ( Π 2, V ) are rep-
resentations of G, then the tensor product of Π1 and Π2 is a representation of G acting on U ⊗ V defined by Π(g)(∑ u ⊗ v) = ∑ Π1(g)u ⊗ Π2(g)v for all g ∈ G. The usual notation is Π(g) = Π1 ⊗ Π2(g).

Definition 3.12 (Two Lie groups). Let G and H be Lie groups, (Π1, U) a representation of G, and (Π2, V) a representation of H. Then the tensor product of Π1 and Π2 is
a representation of G × H acting on U ⊗ V defined by

Π(g, h)(u ⊗ v) = Π1(g)u ⊗ Π2(h)v

for all g ∈ G and h ∈ H. The usual notation is Π(g, h) = (Π1 ⊗ Π2)(g, h).

As Π 1 and Π 2 are continuous, one can verify that Π 1 ⊗ Π 2 is continuous.


Furthermore, Π 1 ⊗ Π 2 is irreducible if and only if Π 1 and Π 2 are both irreducible,
and, in addition, every irreducible complex representation of G × H must be of this
form provided that G and H are compact groups.
Now, let us have the corresponding definitions for Lie algebras.

Definition 3.13 (One Lie algebra). Let g be a Lie algebra, and let (π1, U) and (π2, V)
be representations of g. Then the tensor product π1 ⊗ π2 is a representation of g acting
on U ⊗ V defined by (π1 ⊗ π2)(x) = π1(x) ⊗ I + I ⊗ π2(x) for all x ∈ g.

To see this is so, let α(t) = e^{tx} be a one-parameter curve in a Lie group G,
with α(0) = 1 and x = α′(0). Then, as π(x) = dΠ(α(t))/dt|_{t=0}, we have

(π1 ⊗ π2)(x)(u ⊗ v) = d/dt|_{t=0} [Π1(e^{tx})u ⊗ Π2(e^{tx})v].

Using the Leibniz rule, we get π1(x)u ⊗ v + u ⊗ π2(x)v, just as we want.

Definition 3.14 (Two Lie algebras). Let g and h be Lie algebras. If (π1, U) is a representation of g, and (π2, V) a representation of h, then the tensor product of π1 and π2,
denoted π1 ⊗ π2, is a representation of g ⊕ h acting on U ⊗ V, defined by

(π1 ⊗ π2)(x, y) = π1(x) ⊗ I + I ⊗ π2(y)

for all x ∈ g and y ∈ h.


3.6. REPRESENTATIONS 113

Although it is called the ‘tensor product’ of representations, (π1 ⊗ π2)(x, y) is not
the tensor product of the operators π1(x) and π2(y); it is rather a sum of tensor
products. That this is so can be seen by letting it act on an element u ⊗ v of U ⊗ V:

(π1 ⊗ π2)(x, y) · (u ⊗ v) = d/dt|_{t=0} [Π1(e^{tx})u ⊗ Π2(e^{ty})v]
                          = (d/dt|_{t=0} Π1(e^{tx})u) ⊗ v + u ⊗ (d/dt|_{t=0} Π2(e^{ty})v)
                          = π1(x)u ⊗ v + u ⊗ π2(y)v.
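A numerical sketch of this structure (with NumPy/SciPy; the random 2×2 matrices are arbitrary stand-ins for π1(x) and π2(y)): since the two Kronecker terms commute, exponentiating the sum π1(x) ⊗ I + I ⊗ π2(y) reproduces the product representation Π1(e^{tx}) ⊗ Π2(e^{ty}).

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 2))   # stands in for π1(x)
Y = rng.standard_normal((2, 2))   # stands in for π2(y)
I = np.eye(2)

t = 0.7
lhs = expm(t * (np.kron(X, I) + np.kron(I, Y)))  # exp of the 'sum of tensor products'
rhs = np.kron(expm(t * X), expm(t * Y))          # Π1(e^{tX}) ⊗ Π2(e^{tY})
print(np.allclose(lhs, rhs))   # True
```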

3.6.4 Invariant Forms and Semisimple Algebras


An important tool in determining the semisimplicity of Lie algebras is the concept
of symmetric invariant bilinear form. A form B ( x, y ) on a vector space g is said
to be bilinear if B ( x, ay + bz ) = aB ( x, y) + bB ( x, z ) (a, b ∈ R) and similarly for the
first argument; symmetric if B ( x, y ) = B ( y, x); and invariant if

B([z, x], y) + B(x, [z, y]) = 0, (3.42)

where x, y, z ∈ g (cf. the ‘infinitesimal invariance property’ discussed on p. 91).


B is said to be non-degenerate if for every nonzero x ∈ g there exists y ∈ g such
that B ( x, y) 6 = 0 (i.e. if for some x ∈ g B ( x, y ) = 0 for all y ∈ g, then x = 0).
Or, equivalently: If in any basis { X1, . . . , Xn } of g, the determinant of the matrix
[ B ( Xi, Xj )] is not 0, then B is said to be non-degenerate. In the case of a real Lie
algebra, a negative-definite form means B ( x, x) < 0 for every nonzero x ∈ g.
Trace Form. Let g be a Lie algebra, and ( π, V ) a finite-dimensional representation
of g, so that π ( x) ∈ gl(V ) is a linear V -to-V operator. A possible choice for B is
the bilinear trace form:

B ( x, y) = TrV π ( x) π (y) for x, y ∈ g . (3.43)

The trace form is clearly symmetric; that it is also an invariant form can be shown
by recalling that π ( xy ) = π ( x) π (y) when taking the trace of the representations
of both sides of the identity [z, xy ] = [z, x] y + x [z, y ]. The trace form on a real Lie
algebra has all the properties of an inner product except possibly positive defi-
niteness. But the trace form on a complex Lie algebra cannot be an inner product,
since it lacks the sesquilinearity of inner products on complex vector spaces.
The importance of the trace form lies in the fact that if the trace form of a Lie
algebra g in a representation (π, V ) is non-degenerate, then g is decomposable into a
direct sum of the form g = Z(g) ⊕ gss , where gss is semisimple and Z(g) the center
of g (i.e. [ Z(g), g] = 0).
E XAMPLE 73: Here we calculate the trace form of some classical Lie algebras g in
the standard representation (π, V ), making use of properties listed in Table 3.2. We
use notations x ∈ g, π ( x) = X, i = 1, 2, . . . , n. For gl ( n, F ), the standard rep-
resentation is V = Fn , so one has B ( x, y ) = ∑ij Xij Yji , which is non-degenerate.
Since sl ( n, F ) < gl ( n, F ), it follows that B on sl ( n, F ) is also non-degenerate. For

so(n, F) in V = F^n, we have B(x, y) = ∑_ij X_ij Y_ji = −∑_ij X_ij Y_ij, and in particular
B(x, x) = −2∑_{i>j} X_ij², so B is non-degenerate. For the unitary algebra u(n) in V =
C^n, we have B(x, y) = −∑_ij X*_ij Y_ij, and in particular B(x, x) = −∑_ij |X_ij|² ≤ 0,
with equality only for x = 0, which is again non-degenerate. Since su(n) < u(n),
the trace form for su(n) is also negative-definite, and thus non-degenerate.
We next calculate the center Z(g) = {x ∈ g | [x, y] = 0 for all y ∈ g}, leading
to Z(g) = 0 for sl(n, F), so(n, F) (n > 2), su(n), and sp(n, F); and Z(gl(n, F)) =
FI and Z(u(n)) = iRI. So the classical algebras sl(n, F), so(n, F) (n > 2),
su(n), and sp(n, F) are semisimple; while gl(n) and u(n) are not semisimple, but
decomposable into direct sums: gl(n) = FI ⊕ sl(n) and u(n) = iRI ⊕ su(n).
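The u(n) computation in Example 73 is easy to verify numerically for a random skew-Hermitian matrix (an illustrative sketch; the size n = 3 and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
X = A - A.conj().T               # skew-Hermitian (X† = −X), i.e. an element of u(3)

B = np.trace(X @ X)              # trace form B(x, x) in the standard representation
print(B.real, -np.sum(np.abs(X)**2))   # the two numbers agree, and B(x, x) < 0
```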
The Killing Form is a special case of the trace form in which the representation
is taken to be the adjoint representation (π(x) = ad x, V = F^η, where η = dim g):

K ( x, y) = Tr (adx · ady ) . (3.44)

In calculating the trace of the matrix representatives, the matrices (of order η) can
evidently be expressed in any basis. Given a basis {X_i, i = 1, 2, . . . , η} of g, the
Killing form of any pair X_i and X_j is

K(X_i, X_j) = Tr(ad X_i · ad X_j) = ∑_{kℓ} c^k_{iℓ} c^ℓ_{jk}. (3.45)

One can calculate K(X_i, X_j) using either the Lie brackets among the X_i or the
structure constants c^k_{iℓ}. Since the c^k_{ij} are tensors with one contravariant and two
covariant indices, contracting the upper index of each c with a lower index of the
other yields a second-rank covariant tensor, called the Cartan–Killing metric,
denoted κ_ij = K(X_i, X_j). As the structure constants can be made
real for all real or complex Lie algebras, [κ_ij] is real and symmetric, and therefore
can be brought into diagonal form. If one or several of the diagonal elements
turn out to be zero, the metric tensor is singular, making the corresponding bilinear
form K degenerate. So, if g has an abelian ideal, there will be directions in
which ad is zero and K degenerate.
If the Killing form is non-degenerate, the matrix [κ_ij] is non-singular, and an
inverse [κ^ij] exists, such that κ_ij κ^jk = δ_i^k = κ^kj κ_ji. These second-rank tensors,
κ_ij and κ^ij, are symmetric in their respective indices; they are non-singular and
dual, and may be used to raise and lower the indices of tensors. In particular, we can
define structure constants that are completely antisymmetric in their three indices, as
in c_ijk = c^ℓ_{ij} κ_{ℓk}.
Just as any other trace form, the Killing form may be non-degenerate, degen-
erate, or even zero. Following are some examples.
E XAMPLE 74: aff(1). We obtain κ11 = 1, κ12 = 0, κ22 = 0, and det κ = 0, so [ κ ] is
singular. For any X = a1 X1 + a2 X2 and Y = b1 X1 + b2 X2 , we get K ( X, Y ) = a1 b1 .
So for an arbitrary Y, one may have K ( X, Y ) = 0 with a non-zero X = a2 X2 . The
Killing form on aff(1) is degenerate.
E XAMPLE 75: The Heisenberg algebra H (3) = { x, y, z } is defined by the commu-
tation relations [ x, y ] = z, [ x, z ] = 0, [ y, z ] = 0. In the adjoint representation we

have the only non-zero elements (adx)zy = 1 and (ady )zx = − 1, together with
[adz ] = 0. Clearly, the Killing form K ( u, v) = 0 for any u, v ∈ H (3).
E XAMPLE 76: so(3). Recalling that ad R_i = R_i, with R_i the standard representation
as in Example 68, we calculate the Cartan–Killing matrix κ_ij = Tr(R_i R_j) =
−2δ_ij, and det κ = −8, so [κ] is non-singular. Let X = aR_x + bR_y + cR_z and X′ =
a′R_x + b′R_y + c′R_z be any elements of so(3). Then K(X, X′) = −2(aa′ + bb′ + cc′),
which vanishes for arbitrary X′ only if X = 0; so this form is non-degenerate. In
particular K(X, X) = −2(a² + b² + c²), to be compared with the orthogonality
relation ∑ c_i^T c_i, where c_i are the column vectors in X. This result holds in general:
the metric tensor of the orthogonal (real) Lie algebra is negative-definite.
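The numbers in Example 76 can be checked directly with NumPy (a sketch; the three matrices below are the standard so(3) generators, in whose basis ad R_i = R_i):

```python
import numpy as np

# Standard so(3) generators; in this basis the adjoint action is ad R_i = R_i.
R = [np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], float),
     np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], float),
     np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], float)]

kappa = np.array([[np.trace(Ri @ Rj) for Rj in R] for Ri in R])
print(kappa)                  # −2 times the identity: κ_ij = −2δ_ij
print(np.linalg.det(kappa))   # ≈ −8, so [κ] is non-singular
```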
E XAMPLE 77: sl(2, C). With the ad X_α in Example 71, the matrix κ_αβ can be calculated
to give κ_HH = 8, κ_{+−} = κ_{−+} = 4; all other components are zero; it follows that
det[κ_αβ] = −128. For any X = aH + bX_+ + cX_− in sl(2, C), with complex numbers
a, b, c, we have the Killing form K(X, X′) = 8aa′ + 4(bc′ + cb′). So K(X, X′) =
0 for arbitrary X′ only if X = 0. It is non-degenerate. We note further that the
trace form in the standard representation is given by B_HH = 2, B_{+−} = B_{−+} = 1;
all other components are zero, so that B_ad(x, y) = 4B_standard(x, y). This is so because
on a simple Lie algebra (over C) any two symmetric invariant bilinear forms are
proportional to each other. See also the example of sl(n, C) discussed below.
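These numbers follow directly from the brackets [H, X±] = ±2X±, [X+, X−] = H; here is a numeric sketch (with the basis ordered (H, X+, X−), and each column of an ad matrix holding the bracket with the corresponding basis element):

```python
import numpy as np

# Adjoint matrices in the basis (H, X+, X−), built column-by-column from the brackets.
adH  = np.diag([0.0, 2.0, -2.0])                     # [H, X±] = ±2X±
adXp = np.array([[0, 0, 1], [-2, 0, 0], [0, 0, 0]], float)  # [X+, H] = −2X+, [X+, X−] = H
adXm = np.array([[0, -1, 0], [0, 0, 0], [2, 0, 0]], float)  # [X−, H] = 2X−, [X−, X+] = −H

ad = [adH, adXp, adXm]
kappa = np.array([[np.trace(a @ b) for b in ad] for a in ad])
print(kappa)                  # κ_HH = 8, κ_+− = κ_−+ = 4, all else zero
print(np.linalg.det(kappa))   # ≈ −128
```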
E XAMPLE 78: su(2). To go to su(2) (recalling sl = su ⊕ i su), we require X† = −X,
which implies a* = −a and c = −b*. Accordingly, letting a = iα, b = β + iγ, and
c = −β + iγ, with real α, β, γ, we have the Killing form K(X, X) = −8(α² + β² +
γ²), which is negative-definite for arbitrary real α, β, γ. In the basis S_i = −iσ_i/2
we have κ_ij = −2δ_ij, and det[κ_ij] = −8 (cf. Example 71). Since Tr(S_i S_j) = −½δ_ij,
we again have Tr(ad S_i · ad S_j) = 4 Tr(S_i S_j). Also cf. the case of so(3). 
The Killing form is the basis of an effective test (Cartan criterion) for semisim-
plicity of Lie algebras, which has several important implications.
• A Lie algebra is semisimple if and only if its Killing form is non-degenerate.
Arguments leading to this theorem can be found for example in [FH] p. 479, or
[Ja] p. 69. (1) Suppose the Killing form K is nondegenerate. As with any other
form, this implies that g is reductive, i.e. g = Z(g) + gss, where [ Z(g), gss ] = 0. But
with nondegenerate K, the center must be 0; because otherwise, there would be an
x0 ∈ Z(g) so that ad x0 = 0 and K ( x0, y ) = 0 contrary to assumption. Therefore,
Z(g) = 0 and g is semisimple. (2) Suppose Lie algebra g is not semisimple. Then
it has a non-zero abelian ideal a (so that [a, a] = 0 and [a, g] = a). Choose a basis
{ Xi } for g. Take a non-zeroA in a and any X in g, and calculate the Killing form
K ( A, X ) from the formula X, [ A, Xk ] = X` (adA · adX)` k . If Xk ∈ a, then the
LHS
 vanishes,
 [ X, [ A, Xk ]] = 0, because a is an abelian ideal. If Xk 6 ∈ a, then
X, [ A, Xk ] ∈ a, so that the sum on the RHS contains only terms with X` ∈ a,
and (adA · adX)` k = 0 for ` = k. Hence, K ( A, X) = Tr (adA · adX) = 0, and the
Killing form K is degenerate.
E XAMPLE 79: gl(n, F). Let x = cI_n, with c ∈ C and I_n the n × n identity. Clearly,
x ∈ gl(n, C); more precisely, x ∈ Z(gl(n, C)). Since ad x = 0, the Killing form K(x, x)
is zero. On the other hand, we see that the trace form is B(x, x) = Tr(x²) = nc² ≠ 0.
Since Tr x = nc, we verify that 2n Tr(x²) − 2(Tr x)² = 0.

E XAMPLE 80: gl(n, F). Given X ∈ gl and any n × n matrix M, (ad X)² maps
M to X²M + MX² − 2XMX, and the trace of the action of (ad X)² is the Killing
form K(X, X) = 2n Tr(X²) − 2(Tr X)². For sl(n, F), it reduces to 2n Tr(X²); and so
also K(X, X) = 2n Tr(X²) for X ∈ su(n). 
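The formula of Example 80 can be checked numerically by realizing ad X as an n²×n² matrix acting on flattened matrices (a sketch; it uses the Kronecker identity vec(XM − MX) = (X ⊗ I − I ⊗ Xᵀ) vec(M) for row-major vec, and an arbitrary random X):

```python
import numpy as np

n = 3
rng = np.random.default_rng(2)
X = rng.standard_normal((n, n))
I = np.eye(n)

adX = np.kron(X, I) - np.kron(I, X.T)     # matrix of ad_X on gl(n) ≅ R^{n²}
K = np.trace(adX @ adX)                   # Killing form K(X, X) = Tr((ad X)²)
formula = 2 * n * np.trace(X @ X) - 2 * np.trace(X)**2
print(np.isclose(K, formula))   # True
```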
The above theorem has several corollaries:
• A Lie algebra g is semisimple iff it is a direct sum of simple Lie algebras s_i (so
that g = s1 ⊕ s2 ⊕ s3 ⊕ · · · , where [s_i, s_i] = s_i, and [s_i, s_j] = 0 for all i ≠ j).
This decomposition is unique, except for the order of the terms in the sum. We
have seen that Dg = g holds if the Lie algebra g is simple; from the decomposition
into a direct sum of simple Lie algebras, we see that it also holds when g is only
semisimple. If a given Lie algebra is not semisimple, its Cartan metric is singular,
and so it cannot be completely reduced; but it can still sometimes be written as a
direct, or semi-direct, sum of a semisimple Lie algebra and a solvable Lie algebra.
• A real Lie algebra g is semisimple iff gC is semisimple.
In the case of a real Lie algebra, the compactness of its corresponding Lie
group is determined by its Killing form.
• Let g be a semisimple real Lie algebra with negative-definite Killing form. Then
g is a Lie algebra of a compact real Lie group. Conversely, let G be a compact real
Lie group. Then g = Lie( G ) is reductive, and the Killing form of g is negative-
semidefinite; and the Killing form of the semisimple part of g is negative-definite.
In the example of the real Lie algebra su(n), we have K(x, y) = Tr(ad x ad y) =
2nB(x, y), where the trace form B(x, y) is Tr(XY) = ∑ X_ij Y_ji = −∑ X*_ji Y_ji. And so in
particular K(x, x) = −2n ∑ |X_ij|² ≤ 0, with equality holding only for x = 0. It
follows that SU(n) is compact, since Lie(SU(n)) = su(n).
C OMMENTS on direct and semi-direct sums of Lie algebras: To simplify, consider just
two algebras, g and h. Their sum s, formed from the combination of all generators
of the two, is a Lie algebra in the following situations:
(1) If g and h are simple algebras, obeying the conditions [g, g] = g, [h, h] = h,
and [g, h] = 0, then s is a direct sum of g and h, and is a semi-simple Lie algebra.
(2) If g is simple, and h is abelian (so that [g, g] = g, [h, h] = 0), then any
g ∈ g and any h, h′ ∈ h obey the Jacobi identity [h, [g, h′]] = [h′, [g, h]], which
holds if either [g, h] = 0 or [g, h] = h. (2a) If [g, h] = 0, s is again a direct sum
of g and h, but is not semi-simple, having an abelian invariant subalgebra. (2b) If
[g, h] = h (i.e. h is an abelian ideal in s), then s is a semi-direct sum of g and h, and
a Lie algebra that is not semi-simple. Both of these possibilities are of interest in
physics. The case (2a) finds its expression in the Lie algebra of the hypercharge–
isospin group U(1) × SU(2); whereas (2b) is realized, for example, by the
Poincaré algebra, which is the semi-direct sum of space-time displacements and
Lorentz space-time rotations.

3.6.5 Representations in Quantum Mechanics


Quantum mechanics offers many interesting examples and applications of group
and algebra representations. Group elements describe transformations on ob-
jects; algebra elements operate on vectors that can possess physical meaning and

can be associated with observable quantities. For example, the generators of rota-
tion algebra so (3) may be identified with the angular momentum; while those of
the hidden symmetry so(4) in the hydrogen atom, with the angular momentum
and the Runge–Lenz vector. As Emmy Noether showed (1918), symmetry implies
the existence of conservation laws: in any physical system invariant under
a continuous symmetry that is global (acting the same way everywhere and at
all times), there exists an associated conserved ‘charge’, i.e. a time-independent
quantity. Thus, invariance of the laws of physics under spatial displacements
implies conserved momentum; time-translational invariance implies conserved
energy; and invariance under phase changes of the wave functions of charged
particles implies conserved electric charge. As the states of a physical system can
be uniquely identified by the values of its complete set of conserved quantities,
much of the effort in the study of a quantum system is to identify the symmetry
transformations that leave the system invariant (or nearly invariant), and to find
the irreducible representations of these symmetry groups. Symmetry, invariance,
and conservation are closely related concepts.
In quantum physics, a state is represented by a vector in a Hilbert space H .
Given a symmetry group G of the physical system and T a linear operator rep-
resenting any element of G, for every allowed state ψ of the system the vector
Tψ is also an allowed state, and so also is ψ + Tψ, by the superposition princi-
ple. It follows that a basis of an irreducible representation of G can be formed by
appropriate linear combinations ∑T Tψ ; conversely, allowed physical states are
expressible in terms of the basis vectors of irreducible representations. Given this
relationship, the tools of group theory can be applied to uncover properties of
a symmetric system, as for example in rotationally invariant atoms, where energy
spectra, branching rules, and selection rules can be inferred with confidence.
Furthermore, as probabilities of events, rather than vectors themselves, are
measurable, the requirement of conservation of the inner product itself would be
unnecessarily stringent; instead it suffices to have the equality of probabilities for
equivalent events in two reference frames: |⟨φ|ψ⟩|² = |⟨φ′|ψ′⟩|² for all vectors
ψ, φ in H. The conservation of probability implies that, up to an arbitrary
constant phase factor, the mapping ρ : ψ → ψ′ must be either linear and unitary, or
antilinear and antiunitary.
An operator U : ψ → ψ′ = Uψ is said to be linear and unitary if it satisfies the
conditions U|αψ + βφ⟩ = αU|ψ⟩ + βU|φ⟩ and ⟨Uφ|Uψ⟩ = ⟨φ|ψ⟩; and an operator
T : ψ → ψ′ = Tψ is said to be antilinear and antiunitary if it satisfies the conditions
T|αψ + βφ⟩ = α*T|ψ⟩ + β*T|φ⟩ and ⟨Tφ|Tψ⟩ = ⟨φ|ψ⟩* = ⟨ψ|φ⟩.
E XAMPLE 81: An example of an antilinear antiunitary operator is the time rever-
sal operator T : t → − t; it effectively reverses the motion (the linear and angu-
lar momentum) leaving the position unchanged. An operator O transforms as
O′ = TOT⁻¹, so that a matrix element such as ⟨ψ(t)|O|φ(t)⟩ becomes under
time reversal ⟨Tψ(t)|O′|Tφ(t)⟩ = ⟨ψ(−t)|O|φ(−t)⟩*. 
The Lie algebra g ≅ u(H) of a unitary group G ≅ U(H) consists of all the linear
operators on Hilbert space H that are skew-Hermitian (A† = −A). As usual,
to each A ∈ g is associated a one-parameter (unitary) group U_A(t) = exp(tA)
such that A = lim_{t→0} [(U_A(t) − 1)/t]. Physicists generally prefer, by a simple

redefinition (A = ± iH), to deal with hermitian/self-adjoint operators (H † = H)


which may acquire physical meaning as they have real-valued eigenvalues iden-
tifiable with observable physical quantities.
Another liberty is allowed in the treatment of symmetry groups in physics:
Eigenvectors and eigenvalues of operators on H are unchanged when all state
vectors of a physical system are multiplied by an overall phase eiϑ , since vectors
differing by such a phase are physically equivalent (although their relative phases
are still physically significant). This implies that the usual group composition,
U ( g)U (h) = U ( gh ), can be modified to allow a phase factor on the RHS:

U(g)U(h) = e^{iϑ(g,h)} U(gh) (3.46)

for any two elements g, h of a group and some real angle ϑ ( g, h). Accordingly, the
general bracket relations are modified by adding an extra multiple of the identity:
 
[X_i, X_j] = c^k_{ij} X_k + b_{ij} I. (3.47)

If the constants bij cannot be eliminated by redefining the Xi ’s or the phases ϑ’s,
then they must have physical significance and be kept as part of the algebra. The
presence of such irreducible constants (in the algebra) or phases (in the group)
implies the existence of a superselection rule that all the states of a physical system
must obey (which forbids, for example, the superposition of physical states of
different masses).
Non-relativistic quantum physics is invariant under the isometries in R 3 × R 1
— space and time displacements, space rotations, and transformations between
uniformly moving frames (called boost) — which constitute the Galilean group.
Their effects on the position x and time t are: x → x0 = Rx + a + vt, and t → t0 =
t + τ, where R is a rotation, a a space displacement, τ a time displacement, and
v the constant velocity of a moving coordinate frame. Their group representation
in H is given by the operators on H :

Rotation: exp(θ_i R_i);   Space translation: exp(a_i T_i);
Galilean boost: exp(v_i B_i);   Time displacement: exp(τD).

The ten parameters θi , ai, vi , and τ are the (real) parameters of the symmetry
transformations generated by the corresponding skew-Hermitian operators R i,
Ti, Bi , and D. In the physicist’s convention, one would rather take the Hermitian
operators Ji = iR i, Pi = iTi, K i = − iBi, and H = − iD. These operators obey the
Lie product rule, as listed in the following multiplication table (where each entry
at row i and column j shows the value of the bracket [ Xi , Xj ]):

        J_j            K_j            P_j          H
J_i   iε_ijk J_k    iε_ijk K_k    iε_ijk P_k      0
K_i   iε_ijk K_k        0          iδ_ij λI      iP_i
P_i   iε_ijk P_k    −iδ_ij λI         0           0
H         0           −iP_j           0           0

From this table, a few points to note: (1) The operators K_i do not commute
with the time displacement operator H, and so are not conserved; but both J_i
and P_i do commute with H, and so have conserved eigenvalues. (2) There exist
several subalgebras, and an irreducible constant (λ in [P_i, K_j]), both indications
that Galilean symmetry might not sit at the fundamental level of nature's symmetries. To
see how the constant λ arises, let us make in succession a translation x → x + a,
then a boost x → x + vt in the x-direction, resulting in x → x + vt + a. This
corresponds, in terms of operators on Hilbert space, to

exp(ivK_x) exp(−iaP_x) = exp(iλav/2) exp(i(vK_x − aP_x)),

with the presence of a phase factor.
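The appearance of this phase can be mimicked with nilpotent 3×3 matrices (a toy Heisenberg-type realization, not the actual Galilean operators): take U, V with [U, V] central; then by the BCH formula, e^{vU} e^{aV} = e^{(va/2)[U,V]} e^{vU + aV}, i.e. the product picks up exactly a central 'phase' factor.

```python
import numpy as np
from scipy.linalg import expm

# Illustrative nilpotent stand-ins with a central commutator, [U, V] = C.
U = np.zeros((3, 3)); U[0, 1] = 1.0
V = np.zeros((3, 3)); V[1, 2] = 1.0
C = U @ V - V @ U               # = E_13, which commutes with both U and V

v, a = 0.8, 1.3
lhs = expm(v * U) @ expm(a * V)
rhs = expm((v * a / 2) * C) @ expm(v * U + a * V)   # BCH: extra central factor
print(np.allclose(lhs, rhs))   # True
```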
In relativity, space and time are unified as Minkowski spacetime R 3+1 and the
symmetry group of relativistic quantum mechanics is elevated to the Poincaré
group P(3; 1). It is a ten-dimensional non-abelian non-compact Lie group of all
isometries in R 3+1 generated by the spacetime translation four-vector Pµ , and
the rotation–boost antisymmetric tensor M_{µν} (where µ, ν = 1, 2, 3, 0), obeying the
following bracket rules:
i[ Mµν , Mλκ ] = gνλ Mµκ − gµλ Mνκ − gκµ Mλν + gκν Mλµ , (3.48)
i[ Pµ, Mλκ ] = gµλ Pκ − gµκ Pλ , (3.49)
[ Pµ, Pν ] = 0. (3.50)
P(3; 1) is the semi-direct product of the abelian subgroup of translations ⟨Pµ⟩ and the
noncompact nonabelian Lorentz subgroup SO(3; 1), also known as the homogeneous
Lorentz group. The identity component of the latter, denoted SO⁺(3; 1), is
the connected subgroup containing the ordinary spatial rotations M_ij
and the Lorentz boosts M_i0, and can be generated by exponentiation of ⟨M_ij, M_i0⟩.
To see how the Poincaré group compares with the Galilean group, let us define
P_i = P_i, H = P_0, J_i = (M_23, M_31, M_12), and K_i = (M_10, M_20, M_30). Then the 10
generators of the Poincaré algebra have the Lie products listed below:
        J_j            K_j            P_j          H
J_i   iε_ijk J_k    iε_ijk K_k    iε_ijk P_k      0
K_i   iε_ijk K_k    −iε_ijk J_k    iδ_ij H       iP_i
P_i   iε_ijk P_k    −iδ_ij H          0           0
H         0           −iP_j           0           0
H 0 − iPj 0 0
Now, call M the scalar operator giving the total mass of the system, so that the
total energy operator (or Hamiltonian) H is H = M + H 0, where H 0 is the kinetic
plus potential energy (setting the speed of light c = 1). Let v be the typical veloc-
ity and m the typical mass; then, in the limit v ≪ 1, we get H = O(m), J_i = O(1),
P_i = O(v), and K_i = O(v⁻¹), and we recover the Galilean algebra provided we
identify λI with M. This irreducible constant is just a byproduct of the reduction
of Poincaré symmetry to a lower level.
The three non-compact groups just mentioned, the Galilean, Poincaré, and Lorentz groups,
act on infinite-dimensional Hilbert spaces, and admit infinite-dimensional repre-
sentations; and so will not be considered further in the rest of this book.

Problems
3.1 (Rotations, O(2), SO(2)) A rotation R ( θ ) in the plane R 2 about the origin
through an angle θ, is defined by a real matrix R_ij such that

R(θ) e_i = e_j R_ji(θ),   R(θ) = [[cos θ, −sin θ], [sin θ, cos θ]],

where e1, e2 are orthonormal unit vectors.


(a) Show that the set ⟨R(θ)⟩ forms a Lie group: specify its product law, identity
element, and inverse element. What is the range of values of θ?
(b) The group O(2) is defined as the set of 2 × 2 real nonsingular matrices A
obeying the condition AT A = I. What physical situations do the two possibilities,
det A = ± 1 correspond to? Define SO (2) = { A ∈ O (2) : det A = 1} . Show that
there is a 1-to-1 correspondence between rotations in a plane and SO(2) matrices.
3.2 (O(2), SO(2), and U(1)) (a) Find the conjugacy classes of O(2). (b) Prove that
the transformations that leave the magnitude of complex numbers invariant form
a one-parameter Lie group (called U(1)) isomorphic to SO(2).
3.3 (U(2) and SU(2)) (a) Find the general parameterization of the U(2) and SU(2)
matrices. (b) In the complex vector space C², the inner product is defined by
⟨x, y⟩ = ∑_{i=1}^{2} x_i* y_i. Show that the group of linear transformations in C² that
preserve the inner product is the group U(2).
3.4 (O(1;1) and SO(1;1)) Write down the general matrices in O (1; 1) and SO (1; 1).
3.5 (Dimension of orthogonal groups) Find the dimension of the orthogonal Lie
groups O(n, R) and SO(n, R).
3.6 (Dimension of unitary groups) Find the dimension of the unitary Lie groups
U(n, C) and SU(n, C).
3.7 (Dimension of symplectic groups) (a) Show that Sp ( n, F) contains also the
inverse and the transpose of every one of its elements. Find the dimension of
Sp(n, R). (b) Same questions for the compact symplectic group Sp ( n).
3.8 (Product law) A Lie group is defined in terms of the complete set of its generators
X_µ, obeying the brackets [X_µ, X_ν] = c_{µν}^λ X_λ. Find the product law for its
elements exp(α^µ X_µ) and exp(β^ν X_ν) up to the total third order of α, β.
3.9 (Identities with matrix exponentials) (a) Prove e^X e^Y = e^{X+Y} for arbitrary
commuting n × n matrices. (b) Prove exp(SXS⁻¹) = S exp(X) S⁻¹ for arbitrary
X ∈ M(n, F) and any invertible matrix S ∈ M(n, F).
3.10 (Polar decomposition of a matrix) (a) Prove that for any A ∈ SL ( n, R ) there
is a unique decomposition A = RS, where R ∈ SO ( n), and S a symmetric uni-
modular positive definite matrix. (b) Given A ∈ SL (2, R ), determine R, S.
3.11 (Nilpotent matrices) Find the lowest power at which each of the following
matrices vanishes, and compute e^A, e^B, and e^C:
PROBLEMS 121

 
  " # 0 a b c
0 a b
0 a 0 0 d e
A= , B= 0 0 c , C=
0 0 0 0 0 f
0 0 0
0 0 0 0

3.12 (Computing e^X) Compute e^X for the following real matrices X:

[[0, a], [a, 0]],  [[0, −a], [a, 0]],  [[a, b], [0, a]],  [[a, b], [−b, a]],  [[a, 0], [b, a]],  [[1, 1], [0, 0]].

3.13 (Is a given Y equal to e^X with real X?) Are the following (real) matrices of the
form e^X? If so, compute X.

[[1, 0], [0, 0]],  [[0, 1], [1, 0]],  [[a, 0], [0, b]],  [[0, −1], [1, 0]],  [[1, 1], [0, 1]],  [[1, 1], [−1, 1]].

3.14 (Dimension-three algebras) Find the generators of rotation acting on (a) three-dimensional
Euclidean space with metric g = diag[1, 1, 1]; and (b) three-dimensional
Lorentzian space with metric g = diag[1, 1, −1].
3.15 (Exponential of rotation) (a) Referring to Problem 3.1, write the rotation ma-
trix for a very small angle δθ as R( δθ ) = I + δθR: sum the expansion series that
defines exp ( θR ). (b) When the rotation is applied on the Cartesian coordinates
x1 , x2 of a two-dimensional Euclidean space, we can define the corresponding
Lie group of rotations, generated by a differential operator X. Find X and calcu-
late ϕX ( θ ) = exp (− θX ) (the minus sign in exp (− θX ) arises because the operator
is applied to the coordinates, not the basis vectors).
3.16 (Lorentz boost) (a) Referring to the relevant discussion in the chapter, show
that for any one-dimensional Lie group, with the composition function φ(β, α),
one can find a new parameter α′ such that the composition function is additive,
i.e. φ(β′, α′) = β′ + α′ (that is, one-parameter Lie groups are abelian).
(b) In a two-dimensional space with metric g = diag (1, − 1), define the Lorentz
boost L ( v) along the x axis by the matrix
L(v) = [[γ, γv], [γv, γ]],  where 0 ≤ v < 1, γ = 1/√(1 − v²).

(i) Show that L(v₂)L(v₁) = L(φ(v₂, v₁)), with φ(v₂, v₁) to be found. (ii) Find the
new parameter ω(v) such that L̄(ω₂)L̄(ω₁) = L̄(ω₂ + ω₁). (iii) Find the group
generator X such that the exponential map exp(ωX) reproduces L̄(ω).
3.17 (Heisenberg group) Define the sets of matrices:
(" # ) (" # )
1 a b 0 α β
H= 0 1 c ; a, b, c ∈ R ; h= 0 0 γ ; α, β, γ ∈ R .
0 0 1 0 0 0

(a) Show that H is a Lie group, and find its center. (b) Show that for any X ∈ h,
the exponential eX ∈ H, and conversely if X is any matrix such that etX ∈ H, then

X = d(e^{tX})/dt|_{t=0} is in h. Define a basis for h, together with its algebra. (c) Find the
center C (h) of h. Using the BCH formula, calculate exp X for an arbitrary X ∈ h.
3.18 (Euclidean group E(2)) In classical physics, space is assumed to be homo-
geneous and isotropic, so that physical phenomena should not depend on the
specific location or orientation of the physical system. In mathematical terms,
this means a physical Euclidean space, with its symmetries described by the
Euclidean group, which consists of uniform translations and uniform rotations.
Take, as an example, the Euclidean group E (2) defined by the following transfor-
mations of the Cartesian coordinates x1 , x2 in R 2:
x′₁ = x₁ cos θ − x₂ sin θ + α₁;   x′₂ = x₁ sin θ + x₂ cos θ + α₂,
in terms of a displacement distance (− ∞ < αi < ∞; i = 1, 2) and a rotation
angle (0 ≤ θ ≤ 2π). We call A ( θ, α) the operator effecting this transformation:
A(θ, α) x_i = x′_i = R_ij(θ) x_j + α_i.
(a) Verify that A ( θ, α) form a Lie group, and that A ( θ, 0) and A (0, α) are its
subgroups. Calculate the transformation functions as well as the composition
functions of E (2). (b) Calculate the generators of E (2) and their brackets.
3.19 (Linearization of E(2)) Let L be the linear space of (column) vectors of the
form ( x1, x2 , 1)T , where x1 , x2 are the Cartesian coordinates in two-dimensional
Euclidean space. The object of this problem is to show that E(2) is isomorphic
to a matrix Lie group, by showing that E(2) group and algebra elements (Prob-
lem 3.18) are expressible as 3 × 3 matrices in space L, denoted M(3, R ), with
appropriate properties.
(a) Find the generators of E(2) and their brackets as elements of M(3, R ).
(b) Prove that A ( θ, α) = A (0, α) A ( θ, 0). (c) Show that the exponential series
of exp( θX0) and exp( α1 X1 + α2 X2 ) sum to A ( θ, 0) and A (0, α) respectively. (d)
What is A ( θ, 0) A(0, α)? Compare it to A (0, α) A (θ, 0). (e) In question (b), we have
proved that exp ( α1X1 + α2 X2 ) exp( θX0) is identical to the group element A ( θ, α).
But the theory tells us that, in general, it is the map exp( θX0 + α1 X1 + α2 X2 ) that
should reproduce the group elements. How do you explain the difference?
3.20 (Invariant subgroup and factor group of E(2)) With the notations and results
in Problem 3.19, answer the following questions:
(a) Prove the relation e^{θX₀} T(α) e^{−θX₀} = T[R(θ)α]. From this result, discuss the
meaning of the commutation relations between X0 and X1 , X2 . (b) Prove that T =
{ A (0, α)} is the invariant subgroup of E(2), and that the factor group E(2) /T is
isomorphic to SO(2). (c) Is E(2) simple, semi-simple, compact?

Q. Ho-Kim. Group Theory: A Physicist’s Primer.


Chapter 4

sl(2, C) and Associated Groups

4.1 sl(2, C)’s Basic Properties


4.2 Representations
4.3 Tensorial Representations
4.4 Parameters of Rotation
4.5 Rotation Matrices
4.6 Direct-Product Representations

This chapter is devoted to a study of the Lie algebra sl (2, C ) and the Lie groups
SO (3) and SU (2). As sl (2, C ) is isomorphic to so (3)C and su (2)C, studying one
is studying at the same time the others. Further, as sl (2, C ) turns out to be a
Lie subalgebra of all simple and semi-simple Lie algebras, this simplest of all Lie
algebras is a model for and a part of the study of the more complex cases. It also
has importance in its own right, in particular in physics, where it underlies the
concepts of angular momentum, spin, and isospin.
We will discuss first sl (2, C ) and classify all of its representations. We next
consider the groups SO(3) and SU (2), and their representations, and the direct-
product representations, which occur frequently in studies of physical systems.

4.1 sl (2, C )’s Basic Properties


To begin, we gather here a few basic properties of the Lie algebra sl (2, C ), already
seen in the last chapter, to be called up in the following sections.
1. sl (2, C ) is the unique (up to isomorphism) simple complex Lie algebra of dimension three. Every one of its elements x can be written as x = ah + be + c f , with a, b, c ∈ C, in the
basis { h, e, f } defined by the commutation relations

[ h, e ] = 2e, [ h, f ] = − 2 f , [ e, f ] = h. (4.1)


2. sl (2, C ) is a simple Lie algebra, because the only nonzero ideal it has is itself
(having its derived algebra, D g = [g, g], equal to itself, D g = g; cf. Sec. 3.5.2).
Being a simple Lie algebra, sl (2, C ) may be considered a basic building block of
the higher-dimensional Lie algebras.
3. sl (2, C ) is evidently the complex extension of sl (2, R ), and isomorphic to the
complexification su (2)C of the real Lie algebra su (2) (and so also to so(3)C, the
complexification of so(3)), under the correspondence h 7 → 2iS3, e 7 → iS1 − S2,
f 7 → iS1 + S2, where S j = − iσj /2, as defined in Chapter 3, Examples 45 and 48.
The close relationship we see here among these algebras arises from the fact that
Lie algebra is a local theory.
4. Which Lie groups G admit sl (2, C ) as Lie algebra? An answer is the connected
and simply connected group G = SL (2, C ) in a one-to-one correspondence with
sl (2, C ); but it is not the only one: since SL (2, C ) has as its center the subgroup { I, − I } of scalar matrices, one may also have G = SL (2, C )/{ I, − I }.
For a similar reason, the Lie groups having su (2) as Lie algebra include, not
only the connected, simply connected group SU (2) which is in a one-to-one corre-
spondence with su (2), but also the quotient SU (2)/ { I, − I } (since SU (2) contains
the invariant discrete subgroup { I, − I }).
The simple Lie algebras su (2)C and so(3)C are mutually isomorphic, but their associated Lie groups need not be, because whereas SU (2) is simply connected, SO (3) is not. In fact SO (3) ∼= SU (2)/{ I, − I } (see Example 64
in Chapter 3; we will elaborate on this point later on).
5. Given the close relationships among all the simple Lie algebras of dimension
three, on the one hand, and between each of them and its associated Lie groups,
on the other hand, when we study any one of them, for example sl (2, C ) as we
choose to do here, we are in effect studying also all the others.
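The defining relations (4.1) are easy to test concretely. As a sketch (assuming Python with NumPy, which is not part of the text), one can take the familiar 2 × 2 matrices of the standard representation and verify the three brackets:

```python
import numpy as np

# Illustrative check of (4.1), not from the text: 2x2 matrices of the
# standard (defining) representation of sl(2, C).
h = np.array([[1, 0], [0, -1]], dtype=complex)
e = np.array([[0, 1], [0, 0]], dtype=complex)
f = np.array([[0, 0], [1, 0]], dtype=complex)

def bracket(x, y):
    """Commutator [x, y] = xy - yx."""
    return x @ y - y @ x

assert np.allclose(bracket(h, e), 2 * e)   # [h, e] = 2e
assert np.allclose(bracket(h, f), -2 * f)  # [h, f] = -2f
assert np.allclose(bracket(e, f), h)       # [e, f] = h
```

Any other triple of matrices obeying the same brackets would serve equally well; only the relations (4.1) matter.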

4.2 Representations
As one may recall, a representation ( π, V ) of a Lie algebra g is a vector space V together
with a morphism π : g → gl (V ), or equivalently a linear action of g on V by mapping
g × V → V , which assigns to each element x of g an operator π ( x) : V → V ,
preserving both linearity and the product laws. So, if H, E, F are linear operators
on a vector space V obeying the bracket relations

[ H, E ] = 2E, [ H, F ] = − 2F, [ E, F ] = H, (4.2)

then the linear map π : sl (2, C ) → gl (V ) satisfying π ( h) = H, π ( e) = E, and


π ( f ) = F will be a representation of sl (2, C ), in the sense that any element x = ah + be + c f of sl (2, C ) is represented by π ( x) = aH + bE + cF. The object of
this section is to determine all the finite-dimensional complex representations of
sl (2, C ).
The main idea in finding the representations of sl (2, C ) is to realize that H
plays a special role, and to diagonalize it. Since it is a linear operator in a complex

space V , it always has at least one eigenvector. Therefore the equation

Hv = µv, v ∈ V, (4.3)

admits one or several eigenvectors (called weight vectors of V ), which together


compose the eigenspace (or weight space) Vµ of eigenvalue (or weight) µ ∈ C.
From the product laws (4.2), which any π (sl(2, C )) must satisfy, we have

HEv = ( µ + 2) Ev ,
(4.4)
HFv = ( µ − 2) Fv .

So either Ev = 0, or else Ev is an eigenvector for H with µ + 2 as eigenvalue; and


either Fv = 0, or else Fv is an eigenvector for H with eigenvalue µ − 2. Put differently, if Ev ≠ 0, then Ev ∈ Vµ+2 , and if Fv ≠ 0, then Fv ∈ Vµ−2 . For this
reason, E is called a ‘raising operator’, and F a ‘lowering operator’.
If one repeatedly applies E on v, one obtains a sequence of vectors Ev, E2v, . . .
in V , each of which satisfies an equation of the form HEk v = ( µ + 2k ) Ekv. So
either Ek v = 0, or else it is an eigenvector for H of eigenvalue µ + 2k. Since
H acts on a finite-dimensional space, it can only have a finite number of distinct
eigenvalues, and so there must be an integer m ≥ 0 defining a nonzero vector
Em v such that E · Em v = 0. We call that nonzero vector v 0 and its eigenvalue
λ ∈ C; they satisfy the conditions

Hv0 = λv0, Ev0 = 0. (4.5)

Now, define the sequence of vectors v0, v1, . . . by vk = F k v0 = Fvk−1 with k ≥ 0


and v−1 = 0, all of them eigenvectors of H and mutually linearly independent (since no two of them have the same eigenvalue): Hvk = ( λ − 2k ) vk . Again,
because H has only a finite number of eigenvalues, the sequence is finite, ending
with a nonzero vector v n (so that vn+1 = 0). Now, since

HEvk = [ H, E ]vk + ( λ − 2k ) Evk = ( λ − 2( k − 1)) Evk,

Evk has the same eigenvalue as vk−1 , and so must be proportional to it, which
we write as Evk = ck vk−1 . In particular c0 = 0 and c1 = λ (this is so because c1 v0 = Ev1 = EFv0 = ( FE + [ E, F ])v0 = Hv0 = λv0 , using Ev0 = 0).
With the commutation relations (4.2) we do the following calculations:

Evk = EFvk−1 = ( FE + [ E, F ])vk−1
    = FEvk−1 + Hvk−1 = ck−1 Fvk−2 + ( λ − 2( k − 1)) vk−1
    = [ ck−1 + λ − 2( k − 1)] vk−1 ,
which is to be equated to ck vk−1 . This yields ck = ck−1 + λ − 2( k − 1) subject to
the condition c0 = 0, the solution of which is ck = k ( λ − k + 1).
Since vn ≠ 0 and vn+1 = 0, the equation Evn+1 = cn+1 vn = 0 requires cn+1 =
0, which means ( n + 1)(λ − n) = 0. The conclusion that follows, λ = n (of course,
a real number), says that (i) the eigenvalue n − 2k of each v k is an integer; and (ii)
the eigenvalue λ = n of v 0 is closely related to that of vn , i.e. λ − 2n = − n.

The vectors v0, v1, . . . , vn , being eigenvectors to different eigenvalues of H,


are linearly independent and span a subspace W of V . Since Ev k = ck vk−1 and
Fvk = vk+1 , the space is invariant under the action of sl (2, C ). If V is an irreducible
representation (or ‘irrep’), W is equal to V . Thus, we know that an irreducible
representation must be of the form V = { a0v0 + a1 v1 + · · · + an vn | ai ∈ C } .
These considerations motivate the following statement, which contains the
main result of this section:
Theorem 4.1 (Irreducible representation of sl (2, C )). 1. Given any nonnegative integer n, let V (n) be the complex vector space of dimension n + 1, with a basis v0 , v1 , . . . , vn ,
defined by the following action of sl (2, C ) = h h, e, f i on it:
Hvk = ( n − 2k ) vk, k = 0, 1, . . . , n,
Fvk = vk+1 , vn+1 = 0, (4.6)
Evk = [ k (n − k + 1)] vk−1, v−1 = 0,

where π ( h ) = H, π ( e ) = E, and π ( f ) = F. Then V (n) is a simple or irreducible


representation of sl (2, C ), uniquely identified by the maximal eigenvalue n of H, called
the highest weight of the representation (or equivalently by its lowest weight − n).
Further, all the eigenspaces Vµ within V (n) with eigenvalues µ = n, n − 2, . . . ,− n + 2,
− n of H are one-dimensional, and are the summands in the direct sum V (n) = ⊕ Vµ.
2. V (n) and V (m) are non-isomorphic for n ≠ m.
3. Every finite-dimensional irreducible representation of sl (2, C ) is isomorphic to
some such representation V (n) , with n = 0, 1, . . . .
To show that (4.6) defines a representation of sl (2, C ), we check, by computa-
tions, that the relations [ H, E ]v = 2Ev, [ H, F ]v = − 2Fv, and [ E, F ]v = Hv hold
for all vectors v in the space.
To see that this representation is irreducible, one just has to show that every
non-zero invariant subspace of V (n) is in fact equal to V (n). Let W be such a
space. As W is non-zero, there is at least one non-zero element in W, and it can
be written uniquely in the form w = a0 v0 + a1 v1 + · · · + an vn , where at least one
of the ak ’s is non-zero. Let m ≤ n be the largest index for which am ≠ 0, so that
w = a0 v0 + a1 v1 + · · · + am vm . When we apply E · · · E = Em on w, all the terms
vanish, except the last one: Em vm ∝ v0. Since W is invariant, it must contain this
multiple of v0, hence v0 itself. Now, since W is invariant, it contains all F k v0 = vk
for all 0 ≤ k ≤ n. As these are precisely a basis for V (n), we have W = V (n) ,
which is what we wanted to show.
Since the eigenvectors with different eigenvalues are independent, we may
write W = ⊕ Vµ, where Vµ are the (orthogonal) eigenspaces of H. Since W is non-
zero and invariant under the action of H, E, F, it is an invariant subspace of an
irreducible representation V (n). Irreducibility means V (n) = W, and so it follows
V (n) = ⊕ Vµ. When n ≠ m, V (n) and V (m) have different dimensions, and are
necessarily non-isomorphic.
Part 3 of the theorem tells us what irreducible representations look like (as
described before). In fact, it says more: The complete list of the irreducible represen-
tations of sl (2, C ) consists of the representations V (n) , with n = 0, 1, 2, . . . .
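The matrices defined by (4.6) can be generated and checked mechanically. The sketch below (Python with NumPy assumed; not part of the text) builds H, E, F for the first few V (n) and verifies the brackets (4.2):

```python
import numpy as np

def rep_matrices(n):
    """H, E, F of the irrep V(n) in the basis v_0, ..., v_n of (4.6):
    H v_k = (n-2k) v_k,  F v_k = v_{k+1},  E v_k = k(n-k+1) v_{k-1}."""
    d = n + 1
    H = np.diag([n - 2.0 * k for k in range(d)])
    E = np.zeros((d, d))
    F = np.zeros((d, d))
    for k in range(1, d):
        E[k - 1, k] = k * (n - k + 1)   # E sends basis vector k to k-1
        F[k, k - 1] = 1.0               # F sends basis vector k-1 to k
    return H, E, F

for n in range(6):
    H, E, F = rep_matrices(n)
    assert np.allclose(H @ E - E @ H, 2 * E)   # [H, E] = 2E
    assert np.allclose(H @ F - F @ H, -2 * F)  # [H, F] = -2F
    assert np.allclose(E @ F - F @ E, H)       # [E, F] = H
```

For n = 1 this reproduces the standard 2 × 2 matrices, and for n = 2 the 3 × 3 matrices of Example 1 (up to the choice of basis normalization discussed in the Comments).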

C OMMENTS . 1. The spectrum of H has reflection symmetry: if m is an eigenvalue


of H, then so is − m. Any representation of sl (2, C ) in which all the eigenvalues
of H have the same parity (even or odd integers) and occur with multiplicity one
(each one once) is necessarily irreducible.
2. In contrast to H, the matrices E and F in V (n) depend on the normalization
of the basis vectors (although their shapes are fixed). For example, if we take as
basis the vectors uk = vk /k!, then (4.6) is replaced with:
Huk = ( n − 2k )uk, k = 0, 1, . . . , n,
Fuk = ( k + 1) uk+1, un+1 = 0, (4.7)
Euk = ( n − k + 1) uk−1, u−1 = 0.
H = diag[ n, n − 2, . . . , − n ] is unchanged, but the nonzero elements of the matrix
E are given by n, n − 1, . . . , 1 on the diagonal line above the main diagonal; and
those for F are given by 1, 2, . . . , n on the diagonal line below. On the other hand,
if the uk ’s are normalized to one, u†k uk = 1 for all k, the matrix representation of
the ladder operators are similar to those often used in quantum mechanics:
Fuk = √[( k + 1)( n − k )] uk+1 , Euk = √[ k ( n − k + 1)] uk−1 . (4.8)

3. The operator C = e f + f e + (1/2) h2 of sl (2, C ) has the interesting property of


commuting with every element of the algebra and, in particular, with h, e, and
f . In every representation V of sl (2, C ), the operator π ( C ) : V → V commutes
with every element of sl (2, C ). Then, by Schur’s lemma, C acts as a scalar mul-
tiplication in every irreducible representation. On the other hand, if V is not irreducible, it can be decomposed into subrepresentations, each corresponding to an eigenspace of C in V . An operator of this kind is called a Casimir operator.
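As a concrete illustration (a sketch assuming Python with NumPy; not part of the text), one can form π ( C ) = EF + FE + (1/2) H2 in the matrices of (4.6) and check that it is the scalar n ( n + 2) /2 on V (n) , the value obtained by acting on the highest-weight vector v0 :

```python
import numpy as np

n = 5                                      # any irrep V(n)
d = n + 1
H = np.diag([n - 2.0 * k for k in range(d)])
E = np.zeros((d, d))
F = np.zeros((d, d))
for k in range(1, d):
    E[k - 1, k] = k * (n - k + 1)          # E v_k = k(n-k+1) v_{k-1}
    F[k, k - 1] = 1.0                      # F v_k = v_{k+1}

C = E @ F + F @ E + 0.5 * H @ H            # the Casimir pi(C)
# By Schur's lemma C is a scalar on the irrep; acting on v_0 gives the value:
# C v_0 = ([E,F] + 2FE + H^2/2) v_0 = (n + n^2/2) v_0 = n(n+2)/2 v_0.
assert np.allclose(C, 0.5 * n * (n + 2) * np.eye(d))
```

With n = 2j the scalar is 2j ( j + 1), the familiar angular-momentum eigenvalue up to normalization.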
E XAMPLE 1: V (0) = C is the trivial representation, with π ( x) = 0 for every element x.
V (1) is the standard representation given by 2 × 2 matrices:
H = [ 1, 0 ]      E = [ 0, 1 ]      F = [ 0, 0 ]
    [ 0, −1 ],        [ 0, 0 ],         [ 1, 0 ].

V (2) is the adjoint representation, with 3 × 3 matrices obeying (4.7) given by


" # " # " #
2 0 0 0 2 0 0 0 0
H= 0 0 0 , E= 0 0 1 , F= 1 0 0 . 
0 0 −2 0 0 0 0 2 0
As the group SU (2) is a connected and simply-connected Lie group with Lie
algebra su (2), the representations π of su (2) and Π of SU (2) are in one-to-one
correspondence, uniquely related by relations of the type Π (ex) = eπ(x) for all
x ∈ su (2). Further, since SU (2) is a compact group, all its representations are semisim-
ple (i.e. completely reducible), and so also are all representations of su (2). As su (2)
has the unique complex extension su (2)C, its representations π are uniquely re-
lated to those πC of its complexification by the relation πC ( x + iy) = π ( x) + iπ (y)
for all x, y ∈ su (2). Therefore, just as for π, the πC ’s are equally semisimple. Since
sl (2, C ) and su(2)C are isomorphic, their representations are the same, and so are
also semisimple. This is the justification for the following theorem:

Theorem 4.2 (Representation of sl (2, C )). Every finite-dimensional representation of


sl (2, C ) is completely reducible, expressible as a direct sum of irreducible representations:
V = ⊕V (n).
E XAMPLE 2: The tensor product of two representations V (n) and V (m) of sl (2, C )
is completely reducible, and so is a direct sum of the V (k) :

V (n) ⊗ V (m) = V (n+m) ⊕ V (n+m−2) ⊕ · · · ⊕ V (|n−m|) .


This decomposition is called the Clebsch–Gordan series, and is known in quantum
mechanics as the angular-momentum-addition relation. □
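The Clebsch–Gordan series can be recovered by pure weight counting: on V (n) ⊗ V (m) the operator H acts as H ⊗ I + I ⊗ H, so weights add, and one peels off highest weights one at a time. A sketch (Python assumed; not part of the text):

```python
from collections import Counter

def weights(n):
    """Multiset of H-eigenvalues on V(n): n, n-2, ..., -n."""
    return [n - 2 * k for k in range(n + 1)]

def clebsch_gordan(n, m):
    """Recover V(n) (x) V(m) = (+)_k V(k) from weight multiplicities."""
    prod = Counter(a + b for a in weights(n) for b in weights(m))
    ks = []
    while prod:
        k = max(prod)                       # current highest remaining weight
        ks.append(k)
        prod = prod - Counter(weights(k))   # peel off one copy of V(k)
    return ks

assert clebsch_gordan(2, 3) == [5, 3, 1]    # V(5) + V(3) + V(1)
assert clebsch_gordan(4, 2) == [6, 4, 2]    # highest n+m down to |n-m| in steps of 2
```

The returned list runs from n + m down to | n − m | in steps of 2, exactly as in the series above.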

4.3 Tensorial Representations


In the previous section, the irreducible representations of sl (2, C ) were found by
diagonalizing H and identifying all the representation vectors with the help of
E and F. In this section, we describe a complementary approach which does
not specifically use Lie algebra but rather relies on the concept of direct-product
space and tensorial techniques to classify different representations according to
their symmetries under the symmetric group (cf. Chapter 3 and Appendix A).
First, some basic definitions: Let Gm be a general linear group of invertible linear
transformations g acting on an m-dimensional vector space Vm . Given a basis
| ii (with i = 1, 2, . . . , m) on Vm , define the matrix representation of Gm by the
invertible matrices gij in the usual way (summation over repeated indices):
π ( g)| j i = | ii gij . (4.9)

The tensor product space involving n factors Vm×n ≡ Vm ⊗ Vm ⊗ · · · ⊗ Vm ad-


mits as a basis the set of rank-n tensors
|{ in }i ≡ | i1 i ⊗ | i2 i ⊗ · · · ⊗ | in i. (4.10)
The action of Gm on the tensor space Vm×n induces an m^n-dimensional representation D{i}{ j} defined by

π ( g)|{ j n}i = |{ in }i D{i}{j} ( g)


(4.11)
where D{i}{ j} ( g) = gi1 j1 gi2 j2 · · · gin jn .

On the other hand, the representation of a symmetric group Sn on a given tensor


space Vm×n is defined by
π ( p )|{ jn }i = |{ in }i D{i}{ j} ( p ) ,
where D{i}{ j} ( p ) = δ^{i p1}_{ j1} δ^{i p2}_{ j2} · · · δ^{i pn}_{ jn} . (4.12)

As permutations of n objects are a kind of linear transformation, it is to be


expected that the symmetric group Sn consisting of all such transformations must
have algebraic properties shared with the general linear group.

The representations D [ Sn ] of the symmetric group and D [ Gm ] of the linear


group, both defined on the same tensor space Vm×n , are in general reducible. But
whereas Sn is a finite group and so every D [ Sn ] can be written as a direct sum
of irreducible representations, Gm is an infinite group, and so its representations
may or may not be completely reducible. However, studies have shown that
the reduction of the tensor space Vm×n into irreducible invariant subspaces with
respect to Sn also implies a full decomposition of D [ Gm ]. This is the key result we
want to use. Our present problem, which consists of finding all irreps of sl (2, C )
(and so also of the group SL(2, C )), reduces to finding the symmetry classes of Sn in
the tensor space (C2 )×n , or even more simply its totally symmetric subspace for
Sn . This will produce at the same time the sought-after ( n + 1)-dimensional irrep of
the Lie algebra for every non-negative integer n (cf. Appendix A §A.4).
The smallest irreducible representation of sl (2, C ) is the one-dimensional rep-
resentation, V (0) = C. For the two-dimensional representation, we pick the stan-
dard basis ξ = (1, 0) and η = (0, 1) in C2 , so that Hξ = ξ and Hη = − η. Then
Cξ + Cη is just the fundamental representation V (1) we have already met. Note
that the raising and lowering operators satisfy the relations: Eξ = 0, Eη = ξ,
Fξ = η, and Fη = 0.
Next, recalling the general connection with the symmetric group S2 observed
above, we consider the symmetric-square space W = Sym2 C2 , a subspace of
the two-fold tensor space C2 ⊗ C2 , for which we take as basis φ0 = ξ ⊗ ξ, φ1 =
ξ ⊗ η + η ⊗ ξ, and φ2 = 2η ⊗ η (the numerical factors are chosen to conform with
(4.6)). In this tensor space, each operator A is meant to be A ⊗ I + I ⊗ A. Then,
Hφ0 = ( Hξ ) ⊗ ξ + ξ ⊗ Hξ = 2ξ ⊗ ξ = 2φ0,
Hφ1 = ( Hξ ) ⊗ η + ξ ⊗ Hη + ( Hη ) ⊗ ξ + η ⊗ Hξ = 0,
Hφ2 = 2[( Hη ) ⊗ η + η ⊗ Hη ] = − 2(2η ⊗ η ) = − 2φ2.
One can check in a similar way that
Eφ0 = 0, Eφ1 = 2φ0 , Eφ2 = 2φ1,
Fφ0 = φ1, Fφ1 = φ2 , Fφ2 = 0.
The representation W = Cφ0 + Cφ1 + Cφ2 = W2 ⊕ W0 ⊕ W−2 is precisely the 3D
irrep V (2) we identified before. Notations and calculations can be simplified by
regarding the φi ’s as second-degree polynomials in ξ and η (their relative order
in products is not important). So, we write φ0 = ξ 2, φ1 = 2ξη, and φ2 = 2η 2 .
To generalize, define W as the space of the homogeneous polynomials of de-
gree n ≥ 1 in the two symbols ξ and η, and the following operators on W :
H = ξ ∂/∂ξ − η ∂/∂η , E = ξ ∂/∂η , F = η ∂/∂ξ , (4.13)
and check that they satisfy the canonical commutation relations (4.2) of sl (2, C ).
Then take as a basis in W the polynomials φk = ak ξ n−k η k , with k = 0, 1, . . . , n;
and finally produce by simple differentiation of φk the following:
Hφk = ( n − 2k ) φk, Fφk = φk+1 , Eφk = k ( n − k + 1) φk−1, (4.14)

which is exactly of the form shown in (4.6), provided ak = n!/ (n − k ) !. So H


acting on W = Symn C2 has eigenvalues n, n − 2, . . . , − n + 2, − n, each occurring
just once. It follows that the ( n + 1)-dimensional space Symn C2 is irreducible,
and equivalent to V (n):
Theorem 4.3 (V (n) ∼= Symn C2 ). Every irreducible representation of sl (2, C ) is a symmetric power of its fundamental representation V (1) = C2 .
To summarize: To every non-negative integer n corresponds an irreducible repre-
sentation V (n) of sl (2, C ) having n as its highest weight, and n + 1 its dimension; every
irrep of sl (2, C ) is of this kind. V (1) is called the standard representation, and each V (n)
with n ≥ 2 is a symmetric tensor product of V (1), that is, V (n) ∼= Symn C2 . (Why only Symn ’s are involved will be explained in Appendix A, p. 277.)
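The polynomial model (4.13)–(4.14) lends itself to a symbolic check; the sketch below (Python with SymPy assumed; not part of the text) verifies the action on φk = [ n!/ ( n − k ) ! ] ξ^(n−k) η^k for one choice of n:

```python
import sympy as sp

xi, eta = sp.symbols('xi eta')
n = 4  # any non-negative integer works here

# The operators of (4.13), acting on polynomials in xi and eta.
H = lambda p: sp.expand(xi * sp.diff(p, xi) - eta * sp.diff(p, eta))
E = lambda p: sp.expand(xi * sp.diff(p, eta))
F = lambda p: sp.expand(eta * sp.diff(p, xi))

# Basis phi_k = n!/(n-k)! * xi^(n-k) * eta^k, as in the text.
phi = [sp.factorial(n) / sp.factorial(n - k) * xi**(n - k) * eta**k
       for k in range(n + 1)]

for k in range(n + 1):
    assert sp.simplify(H(phi[k]) - (n - 2 * k) * phi[k]) == 0
    fk = phi[k + 1] if k < n else 0
    assert sp.simplify(F(phi[k]) - fk) == 0
    ek = k * (n - k + 1) * phi[k - 1] if k > 0 else 0
    assert sp.simplify(E(phi[k]) - ek) == 0
```

Each assertion is one line of (4.14); together they confirm that Symn C2 carries the irrep V (n) .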

4.4 Parameters of Rotation


SO (3) and SU (2) are groups whose Lie algebras, so (3) and su (2), have sl (2, C ) as their common complexification. They are of interest in physics, one for describing isometries in geometric space, and the other for its role in transformations in spin and charge space. The symmetry
transformations are specified by the parameters characteristic of the group real-
ization. We will describe here two sets of such parameters and their geometrical
significance.

4.4.1 The Group SO(3)


As we have seen in the last chapter, O (3) is the Lie group of transformations R in
R 3 that preserve the bilinear form h x, y i ≡ ∑i xi yi, so that h Rx, Ryi = h x, y i, for
all x, y ∈ R 3 . This is equivalent to the orthogonality condition R T R = I, and so
O (3) is the orthogonal group O (3) = { R ∈ GL(3, R ) : R T R = I } .
The six independent real constraints implied by the condition R T R = I, or explicitly R ik R jk = δij (i, j = 1, 2, 3), mean that of the nine real numbers R ij
only three are independent, making O (3) a three-parameter group. O (3) contains
two disjoint sets of 3 × 3 matrices, one with det R = + 1, and the other with det R = − 1. The set of matrices with determinant equal to one is a subgroup,
called the special orthogonal group SO (3), describing pure rotations in R 3 .
Let us note that (i) ∑j R 2ij = 1, for i = 1, 2, 3, implies R 2ij ≤ 1 for all i, j, and
so O (3) and SO (3) are compact groups; and (ii) the symmetric second-rank tensor
δij and the totally antisymmetric third-rank tensor eijk are both invariant under
rotations, as shown by the conditions R T R = 1 and det R = 1 for any R ∈ SO(3)
when written out explicitly:

R ik R j` δk` = δij ,
(4.15)
R i` R jm R kn e`mn = eijk .

The angle-and-axis parameterization. The simplest way to describe a rotation


in a three-dimensional space is to specify the axis of the rotation n (a unit vector)

and the angle ω through which the (counterclockwise) rotation is made. Since
n can be defined by two angles (the polar and azimuthal angles θ and ϕ), every
rotation, denoted R ( ω, n), is specified by three parameters ω, θ, ϕ with values in
the ranges
0 ≤ ω ≤ π, 0 ≤ θ ≤ π, 0 ≤ ϕ < 2π. (4.16)
The limit ω = 0 corresponds to the group identity element R (0, n) ≡ I. The
values of the parameters in their ranges of definition give a unique assignment of
the group elements, except for ω = π, when we have the redundancy

R ( π, n) = R ( π, −n). (4.17)

From geometric considerations, we also expect that, when ω exceeds its normal
limits, R must satisfy the following global conditions:

R ( ω + 2πm, n) = R (2πm − ω, − n) = R ( ω, n), ( m ∈ Z ). (4.18)

The group parameter space consists of all points inside and on a sphere of
radius π about the origin. To each vector ω = ωn drawn from the origin there
corresponds a (counterclockwise) rotation around the vector ω through an angle
equal to the distance ω from the origin. This assignment is unique, except on the
surface ω = π, where two diametrically opposite points give the same rotation
(cf. (4.17)). Thus SO (3), as a space, is a compact ball of radius π with antipodal boundary points identified.
Rotations about the coordinate axes ei : When n lies along a coordinate axis ei , the
rotation through an angle ψ will be given by one of the matrices:
R (ψ, e1 ) = [ 1, 0, 0 ]
            [ 0, cos ψ, − sin ψ ]        (4.19a)
            [ 0, sin ψ, cos ψ ]

R (ψ, e2 ) = [ cos ψ, 0, sin ψ ]
            [ 0, 1, 0 ]                  (4.19b)
            [ − sin ψ, 0, cos ψ ]

R (ψ, e3 ) = [ cos ψ, − sin ψ, 0 ]
            [ sin ψ, cos ψ, 0 ]          (4.19c)
            [ 0, 0, 1 ]

We have fixed the sense of rotation, and hence the signs of the entries, according
to the right-hand rule.
General rotation ( ω, n): Given a rotation R ( ω, ei ), we can obtain the rotation
R ( ω, n) for any n by rotating ei to n with an operation S, such that n = Sei.
E XAMPLE 3: Using R ( ω, ei ) given in Eq. (4.19), we calculate R ( ϕ, e3 ) R (θ, e2 ), now
called S, and find its inverse S−1:
S = [ cφ cθ , − sφ , cφ sθ ]        S −1 = [ cφ cθ , sφ cθ , − sθ ]
    [ sφ cθ , cφ , sφ sθ ]                 [ − sφ , cφ , 0 ]          (4.20)
    [ − sθ , 0 , cθ ]                      [ cφ sθ , sφ sθ , cθ ]

where cφ = cos φ, sφ = sin φ, cθ = cos θ, and sθ = sin θ. The rotation through ω


about the axis n = S ( θ, ϕ )e3 = ( cφ sθ , sφ sθ , cθ ) is

R ( ω, n) = S ( θ, ϕ ) R (ω, e3 ) S−1( θ, ϕ ) , (4.21)

which is obtained by simple matrix multiplication of known S and R ( ω, e3 ). 
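Equation (4.21) can be tested numerically against Rodrigues’ rotation formula R = I + sin ω K + (1 − cos ω ) K2 , where K is the cross-product matrix of n (a standard result not derived in the text). A sketch, assuming Python with NumPy:

```python
import numpy as np

def Rz(a):  # R(a, e3) of (4.19c)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def Ry(a):  # R(a, e2) of (4.19b)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

theta, phi, omega = 0.7, 1.2, 2.1
S = Rz(phi) @ Ry(theta)                 # S(theta, phi) of (4.20)
n = S @ np.array([0.0, 0.0, 1.0])       # rotation axis n = S e3
R = S @ Rz(omega) @ S.T                 # Eq. (4.21); S^{-1} = S^T

# Independent check via Rodrigues' formula (standard, not from the text):
K = np.array([[0, -n[2], n[1]], [n[2], 0, -n[0]], [-n[1], n[0], 0]])
R_rod = np.eye(3) + np.sin(omega) * K + (1 - np.cos(omega)) * (K @ K)
assert np.allclose(R, R_rod)
assert np.allclose(R @ n, n)            # the axis is left fixed
```

The particular angles above are arbitrary; the assertions hold for any θ, ϕ, ω.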


More generally, given a rotation R ( ω, n) for any n, let n0 be the vector obtained
from n by some rotation S, such that n0 = Sn, then we have

R ( ω, n0 ) = SR (ω, n) S−1. (4.22)

All matrices for rotations through the same angle are related by similarity trans-
formations that rotate the axis of one rotation into the axes of the others. So all
rotations through a given angle ω belong to a conjugacy class of SO (3); they cor-
respond to points on a spherical shell of radius ω in the parameter space.
The Euler angles. A rotation R in three-dimensional space can also be specified
by the relative orientation of two Cartesian systems of coordinates, say, (e1 , e2 , e3)
and (e01 , e02 , e03 ). Three successive (counterclockwise) rotations bring (e1, e2 , e3 ) to
(X, Y, Z), then to (L, M, N), and finally to (e01 , e02 , e03 ), in detail as follows:
(a) Rotate about e3 through α, for any value of α between 0 and 2π. This brings e1 7 → OX, e2 7 → OY, leaving e3 ≡ OZ unchanged.

(b) Rotate about OY through β, with 0 ≤ β ≤ π. The new frame is (OL, OM ≡


OY, ON).
(c) Rotate about e03 ≡ ON through γ, with 0 ≤ γ < 2π. The axes are rotated
as: OL 7 → e01 , OM 7 → e02 , with ON ≡ e03 left unchanged.
Thus, the rotation R : { e1 , e2 , e3 } 7 → { e01 , e02 , e03 } is defined by three angle-
and-axis rotations applied in succession:

R ( α, β, γ) = R ( γ, e03 ) R ( β, OY ) R (α, e3 ). (4.23)

This expression, which depends on moving axes, can be re-written in a more


convenient form:

R (α, β, γ) = R ( α, e3 ) R ( β, e2 ) R (γ, e3 ). (4.24)

In this form, we see that R ( α, β, γ) results from three successive operations: (a)
First, rotate about e3 through γ; (b) Then rotate about e2 through β; (c) Finally
rotate about e3 through α. The angles α, β, γ, called the Euler angles, have the
ranges already specified: 0 ≤ α, γ < 2π and 0 ≤ β ≤ π. This expression is very
convenient because it gives an arbitrary rotation as a product of rotations about
fixed axes, which are given by simple expressions, as in (4.19).
C OMMENTS .
1. The assignment of the Euler angles to rotations is not always unique. Thus,
R ( α, 0, γ) is a function only of the sum α + γ, and R ( α, π, γ) only of the difference
α − γ, as can be seen from (4.24).

2. As the expressions (4.21) and (4.24) for rotations in the two parameterizations
are well defined, we can find the relations between the variables α, β, γ and ω, θ, ϕ
for any given rotation:
ϕ = ( π + α − γ ) /2 ,
tan θ = tan( β/2) / sin[( α + γ ) /2] ,          (4.25)
cos ω = 2 cos²( β/2) cos²[( α + γ ) /2] − 1 .
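The last relation of (4.25) can be spot-checked numerically, using the standard fact that tr R = 1 + 2 cos ω for a rotation through ω (not derived here). A sketch assuming Python with NumPy:

```python
import numpy as np

def Rz(a):  # R(a, e3) of (4.19c)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def Ry(a):  # R(a, e2) of (4.19b)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

alpha, beta, gamma = 0.9, 0.6, 1.4          # arbitrary Euler angles
R = Rz(alpha) @ Ry(beta) @ Rz(gamma)        # Eq. (4.24)

# tr R = 1 + 2 cos(omega) gives the rotation angle of R directly,
# which lets us test the last relation of (4.25):
cos_omega = (np.trace(R) - 1) / 2
formula = 2 * np.cos(beta / 2)**2 * np.cos((alpha + gamma) / 2)**2 - 1
assert np.isclose(cos_omega, formula)
```

Setting β = 0 recovers cos ω = cos ( α + γ ), consistent with Comment 1 above.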
SO (3) near the identity. Take any element R ∈ SO (3), and let it approach the
identity, so that it may take a linearized form, R ≈ I + M, in accord with the
condition det R = 1 with all entries in M infinitesimal. Requiring R T R = I means
that MT = − M, and so M has the general form:
 
0 − θc θb
M =  θc 0 − θa  , (4.26)
− θb θa 0
in terms of three real variables θa , θb , θc . The signs of the matrix elements have
been chosen so that in the appropriate limits they are in accord with the right-
hand rule we have been using. For example, when θb = θc = 0, (4.26) agrees with
R ( ω1, e1 ) − I in (4.19a), for small ω1 provided θa and ω1 are identified. For the
same reason, θb and θc are identified with the angles of rotation about the axes e2
and e3 , i.e. θb ≡ ω2 and θc ≡ ω3.
Therefore, R ∈ SO (3) near the identity can be written as
R = I + ( ω1R1 + ω2 R2 + ω3 R3) , (4.27)
where R 1 = R x, R 2 = R y, and R 3 = R z are the 3 × 3 matrices defined in the
previous chapter, Sec. 3.4. We know that they are a basis for the real Lie algebra
so (3), which upon complexification is isomorphic to Lie algebra sl (2, C ) by the
maps 2iR 1 7 → F + E, 2R 2 7 → F − E, and 2iR 3 7 → H.
The self-adjoint objects Li = iR i are more commonly used in physics, where
they represent the orbital angular momentum:
L1 = [ 0, 0, 0 ]       L2 = [ 0, 0, i ]       L3 = [ 0, −i, 0 ]
     [ 0, 0, −i ]           [ 0, 0, 0 ]            [ i, 0, 0 ]       (4.28)
     [ 0, i, 0 ]            [ −i, 0, 0 ]           [ 0, 0, 0 ]
These matrices can be succinctly written as ( Lk )`m = − iek`m , where ek`m is the
totally anti-symmetric unit tensor of rank 3. They have several properties of in-
terest. First let us rewrite (4.15) as R i` R jm e`ms = eijk R ks , and substitute in it eijk
for i( Li ) jk to obtain
R i` (iLn )`m R jm = i( Lk )ij R kn , (4.29)
which is equivalent to (recalling that R T = R −1)

RL j R −1 = Lk R kj ( j = 1, 2, 3). (4.30)

Comparing this relation with Rej = ek R kj , we see that each L j behaves as a three-
dimensional vector under the adjoint action of R. Now, when R approaches the
identity, (4.30) becomes
( I − iωi Li ) L j ( I + iω` L` ) = Lk ( I − iωi Li )kj . (4.31)
As ( Lk )`m = − iek`m , we have the commutation relations:
[ Li, L j ] = ieijk Lk (with L†i = Li ) . (4.32)
These relations derived in a particular (the standard) representation of so (3)C will
retain their form in any other representation, as well as in abstract algebra.
Conversely, for every element ∑ ωi Li of so (3)C, the exponential map produces the rotation exp ( − i ∑ ωi Li ). In particular, the series for exp(− iωLi)
can be summed, and the matrices (4.19) recovered as exp(− iωL1) = R ( ω, e1 ),
exp(− iωL2) = R ( ω, e2 ), and exp(− iωL3) = R ( ω, e3 ).
E XAMPLE 4: Calculation of exp(− iωL3). Let ψ = − iω and E0 = diag[0, 0, 1].
Note that L23 = I − E0 = diagonal[1, 1, 0] and L33 = L3. The series for exp(− iωL3)
can be separated into odd and even powers, and summed separately:
exp (− iωL3) = E0 + L3 ( ψ + ψ³/3! + · · · ) + L23 ( 1 + ψ²/2! + ψ⁴/4! + · · · )
             = E0 + L3 sinh ψ + L23 cosh ψ
             = E0 − iL3 sin ω + L23 cos ω = R ( ω, e3 ) .
So, one recovers (4.19c). 
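Example 4 is easy to confirm numerically; the sketch below (Python with NumPy assumed; not part of the text) sums the exponential series for exp(− iωL3) and compares it with both R ( ω, e3 ) and the closed form just found:

```python
import numpy as np

L3 = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])

def expm_series(A, terms=40):
    """Matrix exponential by its (rapidly converging) Taylor series."""
    out = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

omega = 1.3
c, s = np.cos(omega), np.sin(omega)
R3 = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=complex)
assert np.allclose(expm_series(-1j * omega * L3), R3)

# The closed form found in Example 4:
E0 = np.diag([0, 0, 1.0])
assert np.allclose(E0 - 1j * L3 * s + (L3 @ L3) * c, R3)
```

The truncated series suffices here because the entries of −iωL3 are of order one.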
Given these results, the simplest way to show that all elements of SO (3) are
of the form exp (− i ∑ ωi Li ) is to start from (4.30), that is from SL3 S−1 = Lk Sk3 ,
where S ( θ, ϕ ) is (cf. Example 3) the orthogonal transformation that rotates e3 to
the direction n specified by its angles θ and ϕ,

n = S ( θ, ϕ )e3 = ej S j3 ( θ, ϕ ) . (4.33)

As n = ej n j , we have n j = S j3, and so SL3S−1 = Lk nk . It follows that

S exp(− iωL3) S−1 = exp [ S (−iωL3) S−1 ] = exp ( − iω ∑ Lk nk ) .
So the rotation through an angle ω about an axis n is given by the following equiva-
lent matrix or exponential form:
R ( ω, n) = S ( θ, ϕ ) R (ω, e3 ) S−1( θ, ϕ ) = exp ( − iω ∑ nk Lk ) . (4.34)
In the same way, the rotation matrix written in Euler angles (as in (4.24)) also has
its equivalent exponential form, a product of three angle-axis rotations:

R ( α, β, γ ) = R (α, e3 ) R ( β, e2 ) R (γ, e3 ) = e−iαL3 e−iβL2 e−iγL3 . (4.35)


Exponential expressions for rotation are not as explicit as their matrix equivalents,
but have the advantage of being concise and convenient for analysis. They prove
to be essential in representation theory.

4.4.2 The Group SU ( 2)


The special unitary group SU (2) is the set of all 2 × 2 complex-valued matrices A
that are unitary (A† A = I) and unimodular (det A = 1). An element A of SU (2)
has the general form
A = [ a, b ]          a, b ∈ C,
    [ − b∗ , a∗ ],    aa∗ + bb∗ = 1.       (4.36)

There are three free parameters, making SU (2) a three-parameter group. The
two complex numbers a and b are also referred to as the Cayley–Klein parameters.
If each element of SU (2) is denoted by ( a, b ), then the identity element of SU (2)
is (1, 0) ≡ I, the inverse of ( a, b ) is ( a∗, − b ), and the product rule, obtained by the
usual matrix multiplication, is ( a, b ) · ( c, d ) = ( ac − bd ∗ , ad + bc∗ ).
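The Cayley–Klein product and inverse rules can be verified by direct matrix multiplication; a sketch (Python with NumPy assumed; not part of the text), with randomly drawn parameter pairs satisfying | a |2 + | b |2 = 1:

```python
import numpy as np

def A(a, b):
    """SU(2) element of (4.36) from its Cayley-Klein parameters (a, b)."""
    return np.array([[a, b], [-np.conj(b), np.conj(a)]])

rng = np.random.default_rng(0)

def unit_pair():
    """Random (a, b) with |a|^2 + |b|^2 = 1."""
    v = rng.normal(size=4)
    v = v / np.linalg.norm(v)
    return complex(v[0], v[1]), complex(v[2], v[3])

(a, b), (c, d) = unit_pair(), unit_pair()
# Product rule (a, b).(c, d) = (ac - bd*, ad + bc*):
prod = A(a * c - b * np.conj(d), a * d + b * np.conj(c))
assert np.allclose(A(a, b) @ A(c, d), prod)
# Inverse of (a, b) is (a*, -b):
assert np.allclose(A(a, b) @ A(np.conj(a), -b), np.eye(2))
```

Both assertions are parameter-independent identities; the random draw only supplies a generic test point.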
When only unitarity is required, the group is enlarged to the unitary group
U (2) consisting of elements of the form
A′ = eiλ [ a, b ]           | a |2 + | b |2 = 1,   λ∗ = λ .      (4.37)
         [ − b∗ , a∗ ],

If a and b are fixed, U (2) is reduced to the group of unimodular complex numbers
z, with | z | = 1, having the structure of U (1). On the other hand, if λ is fixed, at 0
for example, the group is reduced to SU (2). The map f : U (1) × SU (2) → U (2) is a homomorphism that induces the isomorphism (U (1) × SU (2))/Z ∼= U (2), where Z = {(1, I ), (−1, − I )} ∼= Z2 ⊂ U (1) × SU (2) is Ker f .
The condition | a |2 + | b |2 = 1 makes it clear that U (2) and SU (2) are com-
pact groups, because, as spaces, they are equivalent to S3 within C2 ∼ = R4. U (2)
preserves the inner product of any two vectors in C2 on which the group acts; it
follows that the relative angle and relative phase of any two such vectors, and
their norm are all preserved. This invariance does not depend on the value of the
determinant of the group element; but if det A = 1 is also satisfied, the overall phase is the same for all transformed vectors ψ′ = Aψ.
SU ( 2) near the identity. Consider an element A ∈ SU (2) near the identity: A ≈
I − iM, where all entries of the matrix M are small. (Note the explicit imaginary
unit i, and A ≈ I, implementing the condition det A = 1.) Then, the unitarity
condition A† A = I means M† = M, or Mij∗ = Mji in any basis. The general form
of M subject to this requirement and consistent with (4.36) is
\[
M = \frac{1}{2}\begin{pmatrix} \omega_3 & \omega_1 - i\omega_2 \\ \omega_1 + i\omega_2 & -\omega_3 \end{pmatrix}, \tag{4.38}
\]
where ωi are real and infinitesimal, and the overall numerical factor is conven-
tional. It may be written as M = 21 ∑3j=1 ω j σj , where σj are the self-adjoint Pauli
matrices defined by the relations
\[
\sigma_i \sigma_j = \delta_{ij} I + i\epsilon_{ijk}\sigma_k\,, \tag{4.39}
\]
\[
[\sigma_i, \sigma_j] = 2i\epsilon_{ijk}\sigma_k \tag{4.40}
\]
136 CHAPTER 4. sl (2, C ) AND ASSOCIATED LIE GROUPS

(with I = diag[1, 1]). The standard form of the σᵢ's (with a diagonal σ₃) is:
\[
\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{4.41}
\]

The 2 × 2 matrices Jk = 1/2 σk give the standard two-dimensional representa-


tion of the generators of the group SU (2). They obey the bracket relations

[ Ji, Jj ] = i eijk Jk (with Ji† = Ji ) . (4.42)

On the abstract level, the Ji ’s defined by (4.42) are the basis of the Lie algebra
su (2)C, which is manifestly isomorphic with so (3)C defined in (4.32) through the
correspondence Ji 7 → Li .
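The defining relations (4.39) and (4.40) can be checked directly from the matrices (4.41); here is a small pure-Python verification (an illustrative sketch, with helper names of our own choosing):

```python
# Pauli matrices (4.41) and a direct check of sigma_i sigma_j = delta_ij I + i eps_ijk sigma_k
s1 = [[0, 1], [1, 0]]
s2 = [[0, -1j], [1j, 0]]
s3 = [[1, 0], [0, -1]]
sig = [s1, s2, s3]

def mul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def eps(i, j, k):
    """Levi-Civita symbol on the indices 0, 1, 2."""
    return (i - j)*(j - k)*(k - i)//2

for i in range(3):
    for j in range(3):
        lhs = mul(sig[i], sig[j])
        rhs = [[(i == j)*(r == c) + 1j*sum(eps(i, j, k)*sig[k][r][c] for k in range(3))
                for c in range(2)] for r in range(2)]
        assert all(abs(lhs[r][c] - rhs[r][c]) < 1e-12 for r in range(2) for c in range(2))
# the commutator relation (4.40), and hence (4.42) for J_i = sigma_i/2, follows at once
```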
Exponential mapping. For every element X = ∑ ωᵢ Jᵢ of su(2), the map X ↦ exp(−iX)
produces an element of SU(2). In the standard representation, the powers of the Jᵢ's
cycle (in particular, (n·σ)² = I for any unit vector n), and so the exponential has a
closed form, as already seen in Chapter 3. This result can be derived in a differ-
ent way, similar to that of (4.34) making use of the fact that SU (2) transformations
may be regarded as rotations in C² (see also below). By this, we mean that there
exists an operator T in SU(2) whose adjoint action on exp(−iωJ₃) gives exp(−iω n_k J_k),
or equivalently rotates σ₃ to a given direction n, such that

Tσ3 T −1 = σk nk = σ1 n1 + σ2 n2 + σ3 n3 , (4.43)

where the nk ’s are the components of the unit vector n (defined as in Example 3
in terms of its polar and azimuthal angles θ and φ). We find
\[
T = \begin{pmatrix} \cos(\theta/2) & -e^{-i\phi}\sin(\theta/2) \\ e^{i\phi}\sin(\theta/2) & \cos(\theta/2) \end{pmatrix}, \qquad T^{-1} = T^\dagger. \tag{4.44}
\]

As σ₃ is diagonal, given by diag[1, −1], the matrix exp(−iωσ₃/2) is also diagonal,
given by diag[e^{−iω/2}, e^{iω/2}]. Matrix multiplication gives us
\[
T \exp\!\Big(\!-i\,\frac{\omega}{2}\,\sigma_3\Big)\, T^{-1} = \cos\frac{\omega}{2} - i\,\mathbf{n}\cdot\boldsymbol{\sigma}\, \sin\frac{\omega}{2}\,, \tag{4.45}
\]
which is just the 2 × 2 matrix U ( ω, n) found in the last chapter. It gives the stan-
dard representation of any element of SU (2) in the angle and axis parameters.
For reference, note also the following properties:

\[
\begin{aligned}
U(\omega, \mathbf{n})\, U(\omega', \mathbf{n}) &= U(\omega + \omega', \mathbf{n})\,; \\
U(4\pi - \omega, -\mathbf{n}) = -U(2\pi - \omega, -\mathbf{n}) &= U(\omega, \mathbf{n})\,; \\
U(2\pi, \mathbf{n}) = -I\,; \qquad U(4\pi, \mathbf{n}) &= I\,.
\end{aligned} \tag{4.46}
\]
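The properties (4.46) can be confirmed numerically from the closed form U(ω, n) = cos(ω/2) I − i n·σ sin(ω/2). The sketch below (our own check, not from the text) does so for one arbitrary axis:

```python
import math

def U(w, n):
    """U(w, n) = cos(w/2) I - i sin(w/2) n.sigma in the standard 2x2 representation."""
    cw, sw = math.cos(w/2), math.sin(w/2)
    n1, n2, n3 = n
    return [[cw - 1j*sw*n3, -1j*sw*(n1 - 1j*n2)],
            [-1j*sw*(n1 + 1j*n2), cw + 1j*sw*n3]]

def mul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def close(A, B):
    return all(abs(A[i][j] - B[i][j]) < 1e-12 for i in range(2) for j in range(2))

def neg(A):
    return [[-x for x in row] for row in A]

n, mn = (2/3, 2/3, 1/3), (-2/3, -2/3, -1/3)     # a unit axis and its opposite
w, w2 = 0.8, 1.9
assert close(mul(U(w, n), U(w2, n)), U(w + w2, n))   # composition along one axis
assert close(U(4*math.pi - w, mn), U(w, n))          # middle line of (4.46)
assert close(neg(U(2*math.pi - w, mn)), U(w, n))
assert close(U(2*math.pi, n), neg(U(0, n)))          # U(2pi, n) = -I, U(0, n) = I
```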

The general formula for U ∈ SU (2), valid in any representation, reads

U ( ω, n) = exp (− iω n · J ) , (4.47)
4.4. PARAMETERS OF ROTATION 137

with Ji obeying (4.42), and the group parameters having the ranges of values
0 ≤ ω < 2π, 0 ≤ θ ≤ π, 0 ≤ φ < 2π. The equivalent formula in terms of the
Euler angles is analogous to (4.35):

U ( α, β, γ ) = e−iαJ3 e−iβJ2 e−iγJ3 , (4.48)

where 0 ≤ α < 2π, 0 ≤ β ≤ π, 0 ≤ γ < 4π. The two sets of parameters ( ω, θ, φ )


and ( α, β, γ) are related through (4.25), but the ranges of the parameters ω, γ for
SU (2) differ from the ranges of the corresponding parameters for SO(3). They
are determined so that the representation matrix in each case takes each of its
possible values exactly once: with a smaller range, some products of matrices
would correspond to parameters outside the range; with a larger range, some
matrices would be covered more than once. As an aside, the parameters a, b in (4.36),
holding only in the standard (2D) representation, are related to α, β, γ by

a = e−i(α+γ)/2 cos ( β/2) , b = − e−i(α−γ)/2 sin( β/2) . (4.49)

SO (3) versus SU ( 2). We have mentioned before that SU (2) acts on C2 in a way
similar to SO (3) rotating in R 3. To examine this relationship in more detail, let
x1 , x2 , x3 be the Cartesian components of a vector x ∈ R 3 . Under SO (3), they
transform as
xi 7 → x0i = R ij xj , R ij R ik = δjk . (4.50)

To each vector x we associate an element X = σᵢ xᵢ ∈ su(2), which is traceless
(Tr X = 0), hermitian (X† = X), and has determinant det X = −|x|². The adjoint
action of any element T ∈ SU(2) maps X into X′ such that X ↦ X′ = TXT⁻¹.
Note that X′ remains traceless and hermitian, and its determinant is preserved:
det X′ = det X, so that |x′|² = |x|². As X = σᵢ xᵢ, we have on the one hand
X′ = TXT⁻¹ = Tσᵢ T⁻¹ xᵢ; and on the other hand X′ = σᵢ x′ᵢ, and so a unitary
transformation induces an (orthogonal) rotation in R³ related to it by

Tσj T −1 = σi R ij , (4.51)

a result which generalizes (4.43). This equation says that to each SO (3) rotation
R, there correspond two equivalent operations, ± T, in SU (2). Hence, SO (3) is a
two-to-one homomorphic image of SU (2). This implies in particular that not all the
representations of SU (2) are representations of SO (3), but all the representations
of SO (3) must necessarily be present among those of SU (2).
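Relation (4.51) can be inverted: multiplying by σᵢ and taking the trace, with Tr(σᵢσ_k) = 2δ_{ik}, gives R_ij = ½ Tr(σᵢ T σⱼ T†). The following sketch (an illustration of ours) extracts R from a sample T and confirms that R is orthogonal and that ±T give the same rotation:

```python
import math

sig = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]   # Pauli matrices

def mul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dag(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def rot_from(T):
    """R_ij = (1/2) Tr(sigma_i T sigma_j T^dagger), obtained by tracing (4.51)."""
    Td = dag(T)
    R = [[0.0]*3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            M = mul(sig[i], mul(T, mul(sig[j], Td)))
            R[i][j] = (M[0][0] + M[1][1]).real/2
    return R

cw, sw = math.cos(0.4), math.sin(0.4)          # T = U(0.8, n) with n = (2/3, 2/3, 1/3)
T = [[cw - 1j*sw/3, -1j*sw*(2/3 - 2j/3)],
     [-1j*sw*(2/3 + 2j/3), cw + 1j*sw/3]]
R = rot_from(T)
for a in range(3):                              # R is orthogonal: R^T R = I
    for b in range(3):
        assert abs(sum(R[i][a]*R[i][b] for i in range(3)) - (a == b)) < 1e-12
assert abs(sum(R[i][i] for i in range(3)) - (1 + 2*math.cos(0.8))) < 1e-12
negT = [[-x for x in row] for row in T]         # -T induces the very same rotation
assert all(abs(rot_from(negT)[i][j] - R[i][j]) < 1e-12
           for i in range(3) for j in range(3))
```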
To elaborate further, let us work with the angle-axis (ω; n) parameterization.
As we recall, the group SO (3) is compact, and its three-parameter space consists
of all points ωn for which 0 ≤ ω ≤ π, i.e. it is a spherical ball of radius π centered
at the point associated with the identity, and each pair of antipodal points on the
spherical surface (of radius π) are identified with the same group element. On
the other hand, the three-dimensional manifold underlying SU (2) consists of all
points ωn for which 0 ≤ ω ≤ 2π, i.e. it is a spherical ball of radius 2π. The group
operations are in one-to-one correspondence with elements of the group within a

radius 2π from the origin. As U (2π, n) = − U (0, n) = − I, all points on the surface
of the sphere must be identified with the single group element −I.
It is clear that any two points in the parameter sphere of SU (2) (or equiva-
lently two group elements) can be connected by a line lying completely within
the parameter space. So SU (2) is connected. In addition, any closed loop in the
spherical ball beginning and ending (for example) at the center of the sphere can
always be continuously shrunk to a point, whether the path touches the
spherical surface or not. Hence, SU(2) is simply connected. This is to be contrasted
with the situation in SO(3): Here, if a closed path beginning at the origin cuts
the sphere surface (with the intersection identified with the antipodal point) and
returns to the origin, it cannot be continuously deformed to a point at the ori-
gin without breaking the loop, and so SO (3) is not simply connected, it is doubly
connected, but possesses no other multiple-connectedness.
In summary, SO (3) and SU (2) are simple, compact, connected three-parameter Lie
groups. The group SO (3) is doubly connected, but the group SU (2) is simply connected,
and so provides the universal cover for all groups with the Lie algebra defined by the
bracket relations (4.42).

4.5 Rotation Matrices


We calculate here the representation matrices of SU (2) (and so also those of
SO (3)) making use of the results obtained in the previous sections. In this sec-
tion, we shall follow the notations and conventions preferred by physicists. First,
we define and use the notations J3 = H/2, J+ = E, and J− = F in place of
H, E, F. The operators J3, J1 = 1/2( J+ + J− ), and J2 = −i/2( J+ − J− ) are the
Cartesian components of a vector, the spin or angular momentum operator. Accord-
ingly, J3 = H/2 will have eigenvalues m = n/2 − k (as in Sec. 4.2), or spectrum
m = n/2, ( n/2) − 1, . . . , − n/2, either all integers or all half odd integers. The non
negative number j = n/2 is the spin (angular momentum) of the system, and so
a representation with highest weight n is referred to in physics as a representation
of spin j. Finally, the eigenvectors for J3 (which are the vi’s for H in Sec. 4.2) will
be denoted here by the kets | jm i (explicitly showing the spin j), all normalized to
one. Then, (4.6) tells us that
For each non-negative integer or half odd-integer j = 0, 1/2, 1, 3/2, . . . , there exists an
irreducible representation V (2j) of the Lie algebra sl (2, C ) of dimension 2j + 1, having its
space spanned by the orthonormal vectors | jm i satisfying the equations:

\[
\begin{aligned}
J_3\,|jm\rangle &= |jm\rangle\, m\,, \qquad m = -j,\, -j+1,\, \ldots,\, j-1,\, j\,; \\
J_\pm\,|jm\rangle &= |j\; m{\pm}1\rangle\, [(j \mp m)(j \pm m + 1)]^{1/2}\,.
\end{aligned} \tag{4.52}
\]

For reference, note that |jm⟩ = a_{j−m} J_−^{j−m} |jj⟩, where a_k = [(2j − k)!/((2j)! k!)]^{1/2}.
The expression (4.52) involves a specific choice of phase, referred to in physics
literature as the Condon–Shortley convention. One can check that J12 + J22 + J32 is
the Casimir operator, with invariant value j ( j + 1) on V (2j) .
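Equations (4.52) translate directly into matrices. The short Python sketch below (the helper name `su2_irrep` is ours) builds J₃ and J± for a few spins and checks the bracket relation [J₊, J₋] = 2J₃ together with the Casimir value j(j + 1):

```python
import math

def su2_irrep(j):
    """J3, J+, J- on the basis |j m>, m = j, j-1, ..., -j, built from (4.52)."""
    dim = int(2*j) + 1
    ms = [j - k for k in range(dim)]
    J3 = [[ms[r]*(r == c) for c in range(dim)] for r in range(dim)]
    Jp = [[0.0]*dim for _ in range(dim)]
    Jm = [[0.0]*dim for _ in range(dim)]
    for col, m in enumerate(ms):
        if col > 0:                               # J+ |jm> -> |j, m+1>
            Jp[col - 1][col] = math.sqrt((j - m)*(j + m + 1))
        if col < dim - 1:                         # J- |jm> -> |j, m-1>
            Jm[col + 1][col] = math.sqrt((j + m)*(j - m + 1))
    return J3, Jp, Jm

def mul(A, B):
    n = len(A)
    return [[sum(A[i][k]*B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

for j in (0.5, 1, 1.5, 2):
    J3, Jp, Jm = su2_irrep(j)
    dim = int(2*j) + 1
    pm, mp = mul(Jp, Jm), mul(Jm, Jp)
    J3sq = mul(J3, J3)
    for r in range(dim):
        for c in range(dim):
            assert abs(pm[r][c] - mp[r][c] - 2*J3[r][c]) < 1e-12     # [J+, J-] = 2 J3
            casimir = J3sq[r][c] + (pm[r][c] + mp[r][c])/2           # J1^2 + J2^2 + J3^2
            assert abs(casimir - j*(j + 1)*(r == c)) < 1e-12
```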

As for Sym^n C², the chosen basis is similarly re-labeled and normalized:
\[
\psi^j_m = N_m\, \xi^{j+m} \eta^{j-m}, \qquad N_m = \sqrt{\binom{2j}{j+m}} = \sqrt{\frac{(2j)!}{(j+m)!\,(j-m)!}}\,. \tag{4.53}
\]
Written as a tensor of rank n = 2j in ξ and η, it reads:
\[
\psi^j_m = \frac{1}{N_m} \sum_P\, \underbrace{\xi \otimes \cdots \otimes \xi}_{j+m} \otimes \underbrace{\eta \otimes \cdots \otimes \eta}_{j-m}\,, \tag{4.54}
\]
where one sums over all distinct permutations of the 2j factors in the summand (producing N_m² terms). This expression for the ψ^j_m's, as a sum of tensor products, shows that they are orthonormalized vectors in the space (C²)^{⊗n}, so that ψ^{j†}_m ψ^{j′}_{m′} = δ_{jj′} δ_{mm′}.
EXAMPLE 5: We label the rows and the columns in the order m = j, j − 1, . . . , −j,
and use a simplified notation in which Jᵢ stands for π_j[Jᵢ].
j = 1/2: The standard representation. Order of rows and columns: (1/2, −1/2).
\[
J_3 = \begin{pmatrix} 1/2 & 0 \\ 0 & -1/2 \end{pmatrix}, \qquad
J_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
J_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},
\]
\[
J_1 = \frac{1}{2}(J_+ + J_-) = \begin{pmatrix} 0 & 1/2 \\ 1/2 & 0 \end{pmatrix}, \qquad
J_2 = \frac{-i}{2}(J_+ - J_-) = \begin{pmatrix} 0 & -i/2 \\ i/2 & 0 \end{pmatrix}.
\]

EXAMPLE 6: j = 1: The adjoint representation.
Order of rows and columns: (+1, 0, −1). We have J₃ = diag[1, 0, −1], and
\[
J_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad
J_2 = \frac{-i}{\sqrt{2}}\begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}.
\]

Note that both J1 and J3 are real, but J2 = ( J+ − J− ) /2i is pure imaginary, as gen-
erally expected. These matrices are equivalent, up to a similarity transformation,
to the defining matrices (4.28) of SO (3). 
C OMMENTS . The ideas about SU (2) we are presenting here can be nicely illus-
trated by the rotational symmetries of charge space (isospin T) and coordinate
space (spin, angular momentum J). In the first case, nuclear isospin invariance
requires that each nuclear energy level in a light nucleus be part of a multiplet
of energy levels in 2T + 1 nuclei with the same mass number but with (charge)
T3 varying from − T to + T. In the second case, rotational invariance manifests
itself in energy sequences J = 0, 1, 2, 3, . . . observed in any deformed nucleus or
diatomic molecule.

4.5.1 Representation from Exponential Map


The basis for our present calculation of the representation matrices of SU (2) is the
fact that if a Lie group G is connected and simply connected, then every represen-
tation π of Lie algebra g = Lie( G ) gives a unique representation Π of G acting on
the same space, such that Π (ex) = eπ(x) for all x ∈ g. The group SU (2) satisfies
the required conditions. Let R be any element of SU (2), and U [ R ] = Π ( R ) its
representation in space V . In a simplified notation, π ( Ji) 7 → Ji . In terms of the
Euler angles, we have (as already stated before)

U [ R (α, β, γ)] = e−iαJ3 e−iβJ2 e−iγJ3 . (4.55)

Then, the representation matrix D^j for the unitary operator U[R(α, β, γ)] in the
canonical orthonormal basis {|jm⟩} of the space V^(2j) is defined by
\[
U[R(\alpha,\beta,\gamma)]\,|jm\rangle = |jm'\rangle\, D^j_{m'm}[R(\alpha,\beta,\gamma)]\,, \tag{4.56}
\]
\[
D^j_{m'm}[R(\alpha,\beta,\gamma)] = \langle jm'|\,U[R(\alpha,\beta,\gamma)]\,|jm\rangle
= e^{-i\alpha m'}\, d^j_{m'm}(\beta)\, e^{-i\gamma m}\,, \qquad
d^j_{m'm}(\beta) = \langle jm'|\,e^{-i\beta J_2}\,|jm\rangle\,. \tag{4.57}
\]
For simplicity, we will write D^j_{m′m}(α, β, γ) instead of D^j_{m′m}[R(α, β, γ)], and, as
usual, repeated indices are summed. With a purely imaginary J₂ = −i/2 (E − F)
and the Condon–Shortley convention for the |jm⟩'s (cf. (4.52)), all matrix elements
d^j_{m′m}(β) are real-valued. It is our main task to evaluate this object.
Some general facts can be deduced from (4.57): As the Jᵢ's are self-adjoint,
the operator U[R] defined in (4.55) is unitary. Given a unitary U[R] and an
orthonormal basis for V^(2j), the representation matrix D^j[R] is unitary, D^{j†} D^j = I,
so that
\[
D^j(\alpha,\beta,\gamma)^\dagger = D^j(\alpha,\beta,\gamma)^{-1} = D^j(-\gamma,-\beta,-\alpha)\,. \tag{4.58}
\]
Together with the reality of d^j_{m′m}(β) it also follows that d^j_{m′m}(β) = d^j_{mm′}(−β).
Since R(0, n) is the group identity element, D^j_{m′m}[R(0, n)] = δ_{m′m}. In addition,
from (4.57) we have D^j(α = 2π, β = 0, γ = 0) = D^j[R(2π, e₃)], and so
\[
D^j_{m'm}[R(2\pi, \mathbf{e}_3)] = e^{-i2\pi m}\,\delta_{m'm} = (-1)^{2m}\,\delta_{m'm} = (-1)^{2j}\,\delta_{m'm}\,. \tag{4.59}
\]

As an arbitrary R(ω, n) ∈ SU(2) can be obtained from R(ω, e₃) by the relation
R(ω, n) = S R(ω, e₃) S⁻¹ for some appropriate S ∈ SU(2) (cf. (4.34)), the above
result also implies D^j_{m′m}[R(2π, n)] = (−1)^{2j} δ_{m′m} for arbitrary n. For each R(ω, n)
of SU (2), with any value of ω in [0, 2π ], we have an irreducible representation
D j [ R (ω, n)] for SU (2). But not all irreps of SU (2) are also irreps of SO (3) ; for
this to be true, one must have D j [ R (0, n)] = D j [ R (2π, n)], that is (−)2j = 1. Only
those representations of SU (2) with integral j are also representations of SO (3).

As a simple application, we calculate the character of SU(2) in any irreducible
representation j. As all SU(2) or SO(3) rotations by the same angle ω around any
axis belong to the same class, it suffices to calculate the character for a rotation
R(ω, n) with any one axis of rotation. In particular we may take it along e₃, because
R(ω, e₃) has a diagonal representation in the canonical basis:
\[
\chi_j(\omega) = \operatorname{Tr} D^j[R(\omega, \mathbf{e}_3)] = \sum_{m=-j}^{j} e^{-im\omega} = \frac{\sin[(2j+1)\,\omega/2]}{\sin(\omega/2)}\,. \tag{4.60}
\]

The group character χ j ( ω ) carries two labels, j for the representation and ω for
the class. As the trace is basis independent, this result (4.60) holds true in any
basis. The character of the (single-valued) j-representation of SO (3) corresponds
to integral values of j.
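Formula (4.60) is a finite geometric sum and can be spot-checked numerically (an illustrative sketch of ours):

```python
import cmath, math

def chi(j, w):
    """Character: trace of D^j[R(w, e3)] = sum of exp(-i m w) over m = -j, ..., j."""
    return sum(cmath.exp(-1j*(j - k)*w) for k in range(int(2*j) + 1))

for j in (0.5, 1, 1.5, 3):
    for w in (0.3, 1.0, 2.2):
        assert abs(chi(j, w) - math.sin((2*j + 1)*w/2)/math.sin(w/2)) < 1e-12
assert abs(chi(0.5, 1.0) - 2*math.cos(0.5)) < 1e-12     # 2 cos(w/2), cf. Example 7
assert abs(chi(1, 1.0) - (2*math.cos(1.0) + 1)) < 1e-12 # 2 cos w + 1, cf. Example 8
```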
C OMMENTS . By definition, any element R in SU (2) or in SO (3) must be such
that det U [ R ] = 1. We can directly check that this property is satisfied in the
irreducible representation (4.52), and so also in any other representation. Recall-
ing that det Π j (ex ) = exp(Tr π j ( x)) for all x ∈ g, we see that det U [ R ] = 1 iff
Tr π ( x) = 0, a condition clearly satisfied, because π j ( x) = ω1 J1 + ω2 J2 + ω3 J3
and because every Ji is traceless in any π j .
EXAMPLE 7: With the full matrix D^j given by (4.57) in the canonical basis, we just
need to evaluate its non-trivial part, d^j, by summing the series involving powers
of π_j(J₂), except for the case j = 0, which is trivial: d⁰(β) = 1.
j = 1/2: The power series for the exponential e^{−iβJ₂}, with J₂ = σ₂/2 given in
Example 5, sums up to
\[
d^{1/2}(\beta) = e^{-i\beta\sigma_2/2} = \cos(\beta/2) - i\sigma_2 \sin(\beta/2)
= \begin{pmatrix} \cos(\beta/2) & -\sin(\beta/2) \\ \sin(\beta/2) & \cos(\beta/2) \end{pmatrix}.
\]

In particular, one has d 1/2 (2π ) = − I for β = 2π, and d 1/2 (4π ) = I for β = 4π.
The matrix for the rotation through β about e2 is D1/2 [ R ( β, e2 )] = d1/2 ( β ), from
which one can obtain the matrix for the rotation through β about any axis n by
an appropriate transformation: D1/2 [ R ( β, n)] = D [ S ] D [ R (β, e2 )] D [S]−1. Now,
for a full revolution β = 2π, one would expect R (2π, n) = R (0, n), but, instead,
one finds D1/2 [ R (2π, n)] = D [ S ] D [ R(2π, e2 )] D [S]−1 = − I, although one has of
course D1/2 [ R (0, n)] = I: Rotations through one complete revolution (or an odd
number of revolutions) are mapped to − I (rather than I); and two (or an even
number of) revolutions are mapped to I. This is an example of the situations
discussed before (page 140).
The group character of π_{1/2} can be evaluated from D^{1/2}(α, β, γ):
\[
\chi_{1/2} = \cos(\beta/2)\left(e^{-i(\alpha+\gamma)/2} + e^{+i(\alpha+\gamma)/2}\right)
= 2\cos(\beta/2)\cos[(\alpha+\gamma)/2] = 2\cos(\omega/2)\,.
\]

To obtain the result in the last line, we used the equivalence relations between the
two sets of parameters. This agrees with the general formula (4.60) for j = 1/2.
EXAMPLE 8: j = 1: Using the representation of J₂ given in Example 6, with the
cyclic property J₂³ = J₂, we sum the exponential series of e^{−iβJ₂} to get
\[
d^1(\beta) = \begin{pmatrix} c^2 & -\sqrt{2}\,cs & s^2 \\ \sqrt{2}\,cs & c^2 - s^2 & -\sqrt{2}\,cs \\ s^2 & \sqrt{2}\,cs & c^2 \end{pmatrix}
\qquad \Big(c = \cos\frac{\beta}{2},\; s = \sin\frac{\beta}{2}\Big)
\]

(rows and columns in the order + 1, 0, − 1). In particular, one has d 1 (2π ) = I and
D1 [ R (2π, n)] = I, leading to the conclusion that j = 1 corresponds to a faithful
(single-valued) representation of the group SO (3). The group character is
 
\[
\chi_1 = c^2\left(e^{-i(\alpha+\gamma)} + e^{+i(\alpha+\gamma)}\right) + c^2 - s^2
= 4c^2\cos^2\frac{\alpha+\gamma}{2} - (c^2 + s^2) = 2\cos\omega + 1\,,
\]
in accord with the general formula (4.60). We have used the trigonometric iden-
tity cos 2ψ = 2 cos2 ( ψ ) − 1. 
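The matrix d¹(β) of Example 8 can also be obtained by brute force, summing the exponential series of e^{−iβJ₂} term by term; the following sketch (our own check, with a naive series `expm`) confirms the closed form:

```python
import math

r2 = 1/math.sqrt(2)
J2 = [[0, -1j*r2, 0], [1j*r2, 0, -1j*r2], [0, 1j*r2, 0]]   # J2 for j = 1 (Example 6)

def expm(A, terms=40):
    """exp(A) summed term by term; adequate for these small matrices."""
    out = [[complex(i == j) for j in range(3)] for i in range(3)]
    term = [row[:] for row in out]
    for k in range(1, terms):
        term = [[sum(term[i][l]*A[l][j] for l in range(3))/k for j in range(3)]
                for i in range(3)]
        out = [[out[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return out

beta = 1.3
c, s = math.cos(beta/2), math.sin(beta/2)
d1 = [[c*c, -math.sqrt(2)*c*s, s*s],
      [math.sqrt(2)*c*s, c*c - s*s, -math.sqrt(2)*c*s],
      [s*s, math.sqrt(2)*c*s, c*c]]
series = expm([[-1j*beta*J2[i][j] for j in range(3)] for i in range(3)])
assert all(abs(series[i][j] - d1[i][j]) < 1e-10 for i in range(3) for j in range(3))
```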

4.5.2 Representations from Sym^n C^2


When the weight vectors are written as symmetric tensors, it is possible to derive
an explicit expression for D j [ R ], which could not be done in the approach based
on exponential mapping considered in the previous section.
As before, let ξ and η be a basis in C², and ψ^j_m a basis in Sym^n C², as defined
in (4.53). Under a unitary transformation R(a, b), defined in terms of the Cayley–
Klein parameters as in (4.36), the basis in C² transforms to ξ′ = aξ + cη and
η′ = bξ + dη (with c = −b* and d = a*), while ψ^j_m goes to
\[
\psi'_m = N_m\, (\xi')^{j+m} (\eta')^{j-m}
= \sum_{\mu,\nu} \frac{\xi^{2j-\mu-\nu}\,\eta^{\mu+\nu}\,\sqrt{(2j)!\,(j+m)!\,(j-m)!}}{\mu!\,\nu!\,(j+m-\mu)!\,(j-m-\nu)!}\; a^{j+m-\mu}\, b^{j-m-\nu}\, c^{\mu}\, d^{\nu}\,.
\]

Let us replace the index ν by the index m′, such that ν = j − μ − m′. Then we have
μ + ν = j − m′ and 2j − μ − ν = j + m′, and the expression for ψ′_m becomes
\[
\psi'_m = \sum_{m'} \frac{\xi^{j+m'}\,\eta^{j-m'}\,\sqrt{(2j)!}}{\sqrt{(j+m')!\,(j-m')!}}\,
\sum_{\mu}\, a^{j+m-\mu}\, b^{m'-m+\mu}\, c^{\mu}\, d^{j-m'-\mu}\;
\frac{\sqrt{(j+m')!\,(j-m')!\,(j+m)!\,(j-m)!}}{\mu!\,(j-m'-\mu)!\,(j+m-\mu)!\,(m'-m+\mu)!}\,. \tag{4.61}
\]
This equation may be reshaped in the form (cf. (4.11)) ψ′_m = ∑_{m′} ψ^j_{m′} D^j_{m′m}(a, b),
so that D^j(a, b) can be identified with the representation matrix D^j[R(a, b)] of the

element R(a, b) ∈ SU(2) acting on V^(2j), and written in the canonical basis:
\[
D^j_{m'm}(a, b) = \sum_{\mu} \frac{\sqrt{(j+m')!\,(j-m')!\,(j+m)!\,(j-m)!}}{\mu!\,(j-m'-\mu)!\,(j+m-\mu)!\,(m'-m+\mu)!}\;
a^{j+m-\mu}\,(a^*)^{j-m'-\mu}\,(-b^*)^{\mu}\, b^{m'-m+\mu}\,. \tag{4.62}
\]
The sum is taken over all values of μ for which none of the arguments of the
factorials is negative.
Special Cases.
• a ≠ 0, b = 0: then μ = 0 and m′ = m:
\[
D^j_{m'm}(a, 0) = \delta_{m'm}\, a^{j+m} (a^*)^{j-m}\,; \quad \text{in particular,} \quad D^j_{m'm}(\pm 1, 0) = \delta_{m'm}\, (\pm 1)^{2j}\,.
\]
• m′ = j: then μ = 0:
\[
D^j_{jm}(a, b) = \left[\frac{(2j)!}{(j+m)!\,(j-m)!}\right]^{1/2} a^{j+m}\, b^{j-m}\,.
\]
• m = −j: then μ = 0:
\[
D^j_{m',-j}(a, b) = \left[\frac{(2j)!}{(j+m')!\,(j-m')!}\right]^{1/2} (a^*)^{j-m'}\, b^{j+m'}\,.
\]

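Formula (4.62) is straightforward to implement; the sketch below (the function name `D` is our own) evaluates the sum with factorials, then confirms unitarity and the reduction to the defining form (4.36) at j = 1/2:

```python
import cmath
from math import factorial as fact

def D(j, a, b):
    """D^j_{m'm}(a, b) from (4.62); rows and columns ordered m = j, ..., -j."""
    dim = int(2*j) + 1
    ms = [j - k for k in range(dim)]
    out = [[0j]*dim for _ in range(dim)]
    for r, mp in enumerate(ms):
        for col, m in enumerate(ms):
            tot = 0j
            for mu in range(dim):
                e1, e2, e4 = int(j + m) - mu, int(j - mp) - mu, int(mp - m) + mu
                if min(e1, e2, e4) < 0:      # factorial arguments must be >= 0
                    continue
                num = (fact(int(j + mp))*fact(int(j - mp))
                       * fact(int(j + m))*fact(int(j - m)))**0.5
                den = fact(mu)*fact(e2)*fact(e1)*fact(e4)
                tot += (num/den) * a**e1 * a.conjugate()**e2 \
                       * (-b.conjugate())**mu * b**e4
            out[r][col] = tot
    return out

a, b = 0.6*cmath.exp(0.4j), 0.8*cmath.exp(-0.9j)      # |a|^2 + |b|^2 = 1
for j in (0.5, 1, 1.5):
    M = D(j, a, b)
    dim = int(2*j) + 1
    for r in range(dim):                               # unitarity: D^dagger D = I
        for c in range(dim):
            v = sum(M[k][r].conjugate()*M[k][c] for k in range(dim))
            assert abs(v - (r == c)) < 1e-12
M = D(0.5, a, b)                                       # j = 1/2 reproduces (4.36)
assert abs(M[0][0] - a) < 1e-12 and abs(M[0][1] - b) < 1e-12
assert abs(M[1][0] + b.conjugate()) < 1e-12
```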
COMMENTS. Given these results, we can show that the unitary representations
D^j are irreducible. It suffices to show that any matrix commuting with all the
D^j(a, b) is necessarily a multiple of the unit matrix (Schur's lemma). First, if a
matrix M commutes with the matrix D^j(a, 0), it must be diagonal: M_{mm′} = c_m δ_{mm′}.
If M commutes with the general unitary matrix D^j(a, b), then the (jm)-component of
MD and DM yields c_j D^j_{jm} = D^j_{jm} c_m for all allowed m. As we have seen above,
D^j_{jm} ≠ 0, so c_m = c_j for all m. Thus, a matrix that commutes with all matrices of
the representation must be a multiple of the unit matrix, and so the representations
D^j_{m′m}(a, b) are irreducible. Also, as the D^j have different dimensions for different j,
they are not equivalent to one another.
Expression of D^j in terms of Euler angles. With a, b as given in (4.49), the representation matrix can now be expressed in the Euler angles:
\[
D^j_{m'm}(\alpha, \beta, \gamma) = e^{-im'\alpha}\, d^j_{m'm}(\beta)\, e^{-im\gamma}\,,
\]
\[
d^j_{m'm}(\beta) = \sum_{\mu} (-)^{\mu+m'-m}\,
\frac{[(j+m')!\,(j-m')!\,(j+m)!\,(j-m)!]^{1/2}}{\mu!\,(m'-m+\mu)!\,(j-m'-\mu)!\,(j+m-\mu)!}\,
\Big(\cos\frac{\beta}{2}\Big)^{2j+m-m'-2\mu} \Big(\sin\frac{\beta}{2}\Big)^{m'-m+2\mu}. \tag{4.63}
\]

Its transpose is
\[
d^j_{mm'}(\beta) = \sum_{\nu} (-)^{\nu}\,
\frac{[(j+m')!\,(j-m')!\,(j+m)!\,(j-m)!]^{1/2}}{\nu!\,(m'-m+\nu)!\,(j-m'-\nu)!\,(j+m-\nu)!}\,
\Big(\cos\frac{\beta}{2}\Big)^{2j+m-m'-2\nu} \Big(\sin\frac{\beta}{2}\Big)^{m'-m+2\nu}. \tag{4.64}
\]

These formulas are due to Eugene Wigner, and are called the Wigner formula(s).
The matrix d^j obeys the following general identities:
\[
\begin{aligned}
d^j_{m'm}(\beta) &= d^j_{-m,-m'}(\beta) & (4.65) \\
&= (-)^{m'-m}\, d^j_{mm'}(\beta) & (4.66) \\
&= (-)^{m'-m}\, d^j_{m'm}(-\beta) = d^j_{mm'}(-\beta)\,, & (4.67) \\
d^j_{m'm}(2\pi) &= (-)^{2j}\, \delta_{m'm}\,, & (4.68) \\
d^j_{m'm}(\pi) &= (-)^{j-m}\, \delta_{m',-m}\,, & (4.69) \\
d^j_{m'm}(\pi - \beta) &= (-)^{j-m}\, d^j_{-m',m}(\beta) = (-)^{j+m'}\, d^j_{m',-m}(\beta)\,. & (4.70)
\end{aligned}
\]

They follow directly from the Wigner formula in its various equivalent forms.
(4.67) expresses the equivalence of the change β ↦ −β and matrix transposition.
(4.69) follows from (4.64) because [cos(π/2)]^k ≠ 0 only for k = 0, that is,
ν = j + (m − m′)/2; the factorials then require j − m′ − ν = −(m′ + m)/2 ≥ 0 and
j + m − ν = (m′ + m)/2 ≥ 0, hence m′ = −m. Finally, (4.70) follows from
d^j_{m′m}(π − β) = ⟨m′|e^{−i(π−β)J₂}|m⟩ = d^j_{m′m″}(π) d^j_{m″m}(−β),
and the preceding general identities.
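The Wigner formula and the identities (4.65)–(4.69) can be verified together numerically. In the sketch below (our own illustration) the function `d` implements the sum (4.63):

```python
from math import factorial as fact, cos, sin, pi

def d(j, mp, m, beta):
    """Wigner's formula (4.63) for d^j_{m'm}(beta); real for all arguments."""
    tot = 0.0
    for mu in range(int(2*j) + 1):
        k1, k2, k3 = int(mp - m) + mu, int(j - mp) - mu, int(j + m) - mu
        if min(k1, k2, k3) < 0:              # factorial arguments must be >= 0
            continue
        num = (fact(int(j + mp))*fact(int(j - mp))
               * fact(int(j + m))*fact(int(j - m)))**0.5
        tot += (-1)**k1 * num/(fact(mu)*fact(k1)*fact(k2)*fact(k3)) \
               * cos(beta/2)**(k2 + k3) * sin(beta/2)**(int(mp - m) + 2*mu)
    return tot

beta = 0.9
c, s = cos(beta/2), sin(beta/2)
assert abs(d(0.5, 0.5, 0.5, beta) - c) < 1e-12       # matches d^{1/2} of Example 9
assert abs(d(0.5, -0.5, 0.5, beta) - s) < 1e-12
assert abs(d(1, 1, 0, beta) + 2**0.5*c*s) < 1e-12    # d^1_{1,0} = -sqrt(2) c s
for j in (0.5, 1, 1.5):
    ms = [j - k for k in range(int(2*j) + 1)]
    for mp in ms:
        for m in ms:
            v = d(j, mp, m, beta)
            assert abs(v - d(j, -m, -mp, beta)) < 1e-12                   # (4.65)
            assert abs(v - (-1)**int(mp - m)*d(j, m, mp, beta)) < 1e-12   # (4.66)
            assert abs(d(j, mp, m, 2*pi) - (-1)**int(2*j)*(mp == m)) < 1e-9   # (4.68)
            assert abs(d(j, mp, m, pi) - (-1)**int(j - m)*(mp == -m)) < 1e-9  # (4.69)
```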
Special Cases. Let c ≡ cos(β/2) and s ≡ sin(β/2).
(i) m = m′ = 0; then j = ℓ, an integer:
\[
d^{\ell}_{00}(\beta) = \sum_{\mu=0}^{\ell} (-)^{\mu} \left[\frac{\ell!\; c^{\ell-\mu} s^{\mu}}{\mu!\,(\ell-\mu)!}\right]^2. \tag{4.71}
\]

(ii) From the general relations (4.63)–(4.67), we have
\[
d^j_{mj}(\beta) = d^j_{-j,-m}(\beta) = (-)^{j-m}\, d^j_{jm}(\beta) = (-)^{j-m}\, d^j_{-m,-j}(\beta)
= \left[\frac{(2j)!}{(j+m)!\,(j-m)!}\right]^{1/2} c^{j+m}\, s^{j-m}\,. \tag{4.72}
\]

EXAMPLE 9: In the list of matrices given below, the entries of the first and last rows
of a matrix d^j, as well as the entries of its first and last columns, are evaluated with
(4.72). The matrix elements are given, as usual, in the decreasing order of m. If j
is an integer, then d^j_{00} is given by (4.71); the remaining entries are calculated with
(4.63). Notations: c ≡ cos(β/2) and s ≡ sin(β/2). Besides d⁰(β) = 1, we have
(i) j = 1/2:
\[
d^{1/2}(\beta) = \begin{pmatrix} c & -s \\ s & c \end{pmatrix}.
\]
(ii) j = 1:
\[
d^{1}(\beta) = \begin{pmatrix} c^2 & -\sqrt{2}\,cs & s^2 \\ \sqrt{2}\,cs & c^2 - s^2 & -\sqrt{2}\,cs \\ s^2 & \sqrt{2}\,cs & c^2 \end{pmatrix}.
\]
(iii) j = 3/2:
\[
d^{3/2}(\beta) = \begin{pmatrix}
c^3 & -\sqrt{3}\,c^2 s & \sqrt{3}\,c s^2 & -s^3 \\
\sqrt{3}\,c^2 s & c^3 - 2c s^2 & -2c^2 s + s^3 & \sqrt{3}\,c s^2 \\
\sqrt{3}\,c s^2 & 2c^2 s - s^3 & c^3 - 2c s^2 & -\sqrt{3}\,c^2 s \\
s^3 & \sqrt{3}\,c s^2 & \sqrt{3}\,c^2 s & c^3
\end{pmatrix}.
\]

C OMMENTS .
1. There is a simple but important physical consequence which follows directly
from the present discussion. The physics of a system can be described by an
action integral that depends on the state function ψ in the form ψ† O ψ. If a trans-
formation group leaves the action integral invariant, then the generators of the
group produce conserved quantities. Whether the system is invariant under a
direct product group, like U (1) × SU (2), or a group, like U (2) itself, has implica-
tions on the quantum numbers. In either case, the group representation may be
labeled by ( n, j ), with a U (1) (integral) quantum number n, and an SU (2) (inte-
gral or half-integral) quantum number j. If the symmetry group is U (1) × SU (2),
then n and j are independent conserved quantum numbers; but if it is U (2), then
invariance requires that 2j + n = 2Q be an even number, and it is Q that is the
conserved number.
2. The analysis of the Lie algebra su(2) < sl(2, C) shows that its irreducible
representations can be characterized by integral or half-integral values of j. In quantum physics, su(2)
is usually associated with symmetries in coordinate or charge space, then j is
interpreted as the spin (angular momentum), or isospin quantum number. We
know that there exist in nature two kinds of particles: the bosons, which obey
Bose–Einstein statistics and have integral spins, and the fermions, which obey
Fermi–Dirac statistics and have half-integral spin. The isospins of bosons and
fermions may be either integral (π mesons, Σ baryons) or half-integral (K mesons,
the nucleons).

4.6 Direct-Product Representations


Quantum systems are often described in terms of rotationally invariant basis
states of the kind we considered above, but usually with additional complex-
ity. An electron may be represented by bilinear combinations of orbital angular
momentum and intrinsic spin states, or alternatively by states of the electron’s
total angular momentum. An atom of two or more electrons may be described by
products of the individual electron angular momentum states, or equivalently by
eigenstates of the atom’s angular momentum. Thus, constructing direct-product

representations and finding irreducible representations of a direct-product group


are of prime importance in physics. We have already discussed this problem in
general terms in Chapter 2 Sec. 2.7, and Chapter 3 Sec. 3.6. We now apply it to
the direct-product group SU (2) × SU (2). Of course, these considerations could
be generalized to products of more than two factors and, more importantly, to
higher-dimensional groups studied in later chapters.
Consider two spaces V and V 0 (immersed in R 3 where perhaps two non-
interacting particles are evolving) and construct the simple representations of the
group SU (2), or of the associated Lie algebra su (2)C, acting on each space. This
involves defining two sets of angular momentum operators { ji} and { ji0 } , and
finding the eigenvectors | jm i and | j 0 m0 i of the commuting sets of hermitian oper-
ators, j₃ and j₃′, respectively, ultimately leading to the corresponding irreducible
representations D^j and D^{j′} in the spaces V^(2j) ⊂ V and V^(2j′) ⊂ V′.
Combining two non-interacting systems into a single system means in mathe-
matical terms forming the direct product space V ⊗ V′, or, if we confine ourselves
to states of fixed j and j′, its subspace V^(2j) ⊗ V^(2j′) of dimension (2j + 1)(2j′ + 1).
A basis for this space can be formed by the direct products

| jm i ⊗ | j 0 m0 i ≡ | m, m0 i , (4.73)

where m, m0 take all allowed values, and the labels j, j 0 are implicit in the short-
ened form on the RHS. In this product space, a rotation R has a representation
defined by (again with implicit summation over repeated indices)

\[
\Pi(R)\,|m, m'\rangle = |n, n'\rangle\, D^j_{nm}(R)\, D^{j'}_{n'm'}(R)\,. \tag{4.74}
\]
Symbolically, D^{j×j′}_{nn′,mm′} = D^j_{nm} D^{j′}_{n′m′}. The product representation D^{j×j′} is reducible (unless either j or j′ is zero), and one problem is to find its decomposition into irreducible representations (cf. Example 2).
Consider a rotation R (δω, n) around an arbitrary axis n:
\[
D^{j\times j'}[R(\delta\omega, \mathbf{n})] = D^j[R(\delta\omega, \mathbf{n})]\, D^{j'}[R(\delta\omega, \mathbf{n})]\,. \tag{4.75}
\]

Assuming δω infinitesimal, we can expand both sides to the first few terms:
\[
I_{j\times j'} - i\,\delta\omega\, D^{j\times j'}(J_n) = \left[I_j - i\,\delta\omega\, D^j(j_n)\right] \otimes \left[I_{j'} - i\,\delta\omega\, D^{j'}(j'_n)\right], \tag{4.76}
\]

where Ir is the identity matrix and Dr ( Jn ) the matrix representing the operator
π r ( Jn ) in space V (2r) , with Jn = J · n. Equating the first-order terms in δω on
both sides of the equation produces D^{j×j′}(J_n) = D^j(j_n) ⊗ I_{j′} + I_j ⊗ D^{j′}(j′_n), where
on the right-hand side we are reminded that we are dealing with direct (and not
ordinary) products of matrices. This being understood, we write the operators in
0
the product space V (2j) ⊗ V (2j ) in a simplified form: Jk = jk + jk0 , with (k = 1, 2, 3),
rather than in the more precise form: Jk = jk ⊗ I + I ⊗ jk0 . It is important when
using the abbreviation to recall the specific space on which each operator applies.

Each angular momentum, jk or jk0 , satisfies (4.42) in its own space, and commutes
with the other, whereas an angular momentum Jk = jk + jk0 in the product space
obeys its own commutation relations.
A point worth noting, illustrated for SU (2) and su (2) but generalizable, is
that the representations Π of a Lie group G and π of the Lie algebra Lie( G ) = g
in direct-product space are quite different from each other. Thus we have
\[
\Pi(g)\,|m, m'\rangle = |nn'\rangle\, D^{j\times j'}_{nn',mm'}(g) = |nn'\rangle\, D^j_{nm}(g)\, D^{j'}_{n'm'}(g)\,, \tag{4.77}
\]
for any g ∈ G, whereas for any X ∈ g we have, in contrast,
\[
\pi(X)\,|m, m'\rangle = |nn'\rangle\, D^{j\times j'}_{nn',mm'}(X) = |nm'\rangle\, D^j_{nm}(X) + |mn'\rangle\, D^{j'}_{n'm'}(X)\,. \tag{4.78}
\]
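The contrast between (4.77) and (4.78) is worth seeing concretely: for the group, the representation matrices multiply factor by factor, while for the algebra the generators add. A minimal sketch for j = j′ = 1/2 (our own check; `expm` is a naive power series) verifies that exp[−iβ(J₂ ⊗ I + I ⊗ J₂)] equals d^{1/2}(β) ⊗ d^{1/2}(β):

```python
import math

beta = 0.7
c, s = math.cos(beta/2), math.sin(beta/2)
dhalf = [[c, -s], [s, c]]                     # d^{1/2}(beta) = exp(-i beta J2)
J2 = [[0, -0.5j], [0.5j, 0]]
I2 = [[1, 0], [0, 1]]

def kron(A, B):
    """Kronecker product of 2x2 matrices; basis order ++, +-, -+, --."""
    return [[A[i][j]*B[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def expm(A, terms=40):
    """exp(A) for a 4x4 matrix, summed term by term."""
    out = [[complex(i == j) for j in range(4)] for i in range(4)]
    term = [row[:] for row in out]
    for k in range(1, terms):
        term = [[sum(term[i][l]*A[l][j] for l in range(4))/k for j in range(4)]
                for i in range(4)]
        out = [[out[i][j] + term[i][j] for j in range(4)] for i in range(4)]
    return out

A, B = kron(J2, I2), kron(I2, J2)             # the algebra acts as j x I + I x j'
Jtot = [[A[r][c] + B[r][c] for c in range(4)] for r in range(4)]
lhs = expm([[-1j*beta*Jtot[i][j] for j in range(4)] for i in range(4)])
rhs = kron(dhalf, dhalf)                      # the group acts factor by factor
assert all(abs(lhs[i][j] - rhs[i][j]) < 1e-10 for i in range(4) for j in range(4))
```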

Assuming neither j nor j′ equal to zero, the direct-product representation
D^{j×j′} is fully reducible. Its reduction to irreducible representations is equivalent
to finding the irreducible rotationally invariant subspaces of the (2j + 1)(2j′ + 1)-
dimensional product space V^(2j) ⊗ V^(2j′). The hermitian operators j², j′², J², and
J3 form a complete set of commuting operators. Let |( jj 0) J M i be their common
eigenvectors, with eigenvalues:

\[
J^2 \to J(J+1)\,, \quad J_3 \to M\,; \qquad j^2 \to j(j+1)\,, \quad j'^2 \to j'(j'+1)\,.
\]

With J and M taking all (still to be specified) allowed values, the set of vectors
|( jj 0) J M i gives a second basis for the product space, and so each vector is ex-
pressible as a linear combination of the product vectors (4.73). Since the eigen-
values j and j 0 are constant in both bases, we may confine ourselves to just the
0
subspace V (2j) ⊗ V (2j ) with fixed j and j 0 . For this reason, the labels j, j 0 may be
suppressed, as in | J M i ≡ |( jj0) J M i.
For a given J, the 2J + 1 vectors | J M i with M in the range − J ≤ M ≤ J span
an irreducible subspace. Therefore if the vector | J M i for a particular value (for
example, the highest value) of M is known, then all the other vectors in V (2J ) can
be constructed by repeated application of the raising/lowering operators J± on
the known vector.
On the other hand, recalling that M = m + m0 , all the vectors with the same
value of M but different values of J can be constructed from the set of vectors
| mm0i, with different m and m0 added up to M, as we will see in the following
example. Let g( M ) be the degree of degeneracy of M in this space, i.e. the numbers
of vectors | J M i having the same value of M; it is equal to the number of pairs
( m, m0 ) such that M = m + m0 . Knowing g( M ) we can find the allowed values of
J for fixed M, and hence the range of values of J for any given M. This description
should become clearer through the following example.
E XAMPLE 10: Let j = 3 and j 0 = 2, which implies − 3 ≤ m ≤ 3 and − 2 ≤ m0 ≤ 2,
and so − 5 ≤ M ≤ 5. We list in the following table g( M ), M, all pairs ( m, m0 ) with
M = m + m0 , and allowed values of J. It tells us which | mm0i’s are found in | J M i,

and vice versa. For example, | J = 5, M = 4i is a linear combination of states with


( m, m0 ) = (3, 1) and (2, 2); inversely | m = 3, m0 = 1i can be formed from vectors
with ( J, M ) = (5, 4) and (4, 4).

g( M ) M ( m, m0) J
1 5 (3, 2) 5
2 4 (3, 1) (2, 2) 5, 4
3 3 (3, 0) (2, 1) (1, 2) 5, 4, 3
4 2 (3, − 1) (2, 0) (1, 1) (0, 2) 5, 4, 3, 2
5 1 (3, − 2) (2, −1) (1, 0) (0, 1) (−1, 2) 5, 4, 3, 2, 1
5 0 (2, − 2) (1, −1) (0, 0) (− 1, 1) (− 2, 2) 5, 4, 3, 2, 1
5 −1 (1, − 2) (0, −1) (−1, 0) (−2, 1) (−3, 2) 5, 4, 3, 2, 1
4 −2 (0, − 2) (−1, −1) (−2, 0) (−3, 1) 5, 4, 3, 2
3 −3 (− 1, −2) (−2, − 1) (− 3, 0) 5, 4, 3
2 −4 (− 2, −2), (−3, −1) 5, 4
1 −5 (− 3, −2) 5

When M = 0, or m = −m′, the states m = ±3 do not contribute because |m′|
is limited to values less than or equal to 2. It follows that J = 0 (which would
require j = j′) is not allowed, and the number of independent states for this case
is limited to 5. The number ∑_M g(M) = 35 is equal to the number of all |J, M⟩ states
and also equal to the number of all |mm′⟩ states, (2j + 1)(2j′ + 1) = 35. 
The situation can be generalized to any j, j′, leading to the general result
\[
g(M) = \begin{cases} 0 & \text{if } |M| > j + j'\,, \\ j + j' + 1 - |M| & \text{if } |j - j'| \le |M| \le j + j'\,, \\ 2\min(j, j') + 1 & \text{if } 0 \le |M| \le |j - j'|\,. \end{cases} \tag{4.79}
\]
Therefore (as J₃ ↦ M), the range of values of J is |j − j′| ≤ J ≤ j + j′.
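The count (4.79) amounts to enumerating the lattice of pairs (m, m′); a short check (illustrative code of ours):

```python
def g(M, j, jp):
    """Degeneracy: number of pairs (m, m') with m + m' = M."""
    ms = [j - k for k in range(int(2*j) + 1)]
    mps = [jp - k for k in range(int(2*jp) + 1)]
    return sum(1 for m in ms for mp in mps if abs(m + mp - M) < 1e-9)

def g_closed(M, j, jp):
    """Closed form (4.79)."""
    if abs(M) > j + jp:
        return 0
    if abs(M) >= abs(j - jp):
        return int(j + jp + 1 - abs(M))
    return int(2*min(j, jp) + 1)

for j, jp in ((3, 2), (1.5, 1), (2.5, 2.5)):
    Ms = [j + jp - k for k in range(int(2*(j + jp)) + 1)]
    assert all(g(M, j, jp) == g_closed(M, j, jp) for M in Ms)
    # total number of states: sum_M g(M) = (2j+1)(2j'+1)
    assert sum(g(M, j, jp) for M in Ms) == int((2*j + 1)*(2*jp + 1))
assert g(0, 3, 2) == 5 and g(4, 3, 2) == 2    # cf. the table of Example 10
```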


Since for each J there are 2J + 1 states (− J ≤ M ≤ J), the total number of
states is (assuming j ≥ j 0 without loss of generality):

\[
N = \sum_{J=j-j'}^{j+j'} (2J+1) = \frac{1}{2}\big[\{2(j-j') + 1\} + \{2(j+j') + 1\}\big]\,(2j'+1) = (2j+1)(2j'+1)\,.
\]

This state counting provides a consistency check of the results obtained above.
Thus, for fixed j and j′, there are two independent sets of orthonormal vectors:

    |(jj′)J, M⟩ :  |j − j′| ≤ J ≤ j + j′ ,  −J ≤ M ≤ J ;        (4.80)
    |jm, j′m′⟩ :  −j ≤ m ≤ j ,  −j′ ≤ m′ ≤ j′ ;                 (4.81)

each spans the product space V(2j) ⊗ V(2j′) (i.e. each is complete). The two bases
4.6. DIRECT-PRODUCT REPRESENTATIONS 149

can be mutually related by a unitary similarity transformation and its inverse:

    |jm, j′m′⟩ = ∑_{J=|j−j′|}^{j+j′} ∑_{M=−J}^{J} |(jj′)J, M⟩ ⟨JM|jm, j′m′⟩ ,    (4.82)

    |(jj′)J, M⟩ = ∑_{m=−j}^{j} ∑_{m′=−j′}^{j′} |jm, j′m′⟩ ⟨jm, j′m′|JM⟩ ,        (4.83)

where ⟨JM|jm, j′m′⟩ = ⟨jm, j′m′|JM⟩* (complex conjugation). The elements of
the transformation matrix are called the Clebsch–Gordan coefficients for the group
SU(2).
The construction of |J, M⟩ from |jm, j′m′⟩ as described above determines
these coefficients up to a common phase factor for each invariant J subspace. The
commonly accepted Condon–Shortley phase convention specifies that

    ⟨jm, j′m′|JM⟩ : real ;
    ⟨jj, j′ J−j|JJ⟩ : positive for all j, j′, J .        (4.84)

So in this convention, the unitary transformation and its inverse are specified by
the same set of real coefficients: ⟨JM|jm, j′m′⟩ = ⟨jm, j′m′|JM⟩.
Properties of the Clebsch–Gordan coefficients for SU(2).
(i) Angular-momentum selection rule:

    ⟨jm, j′m′|JM⟩ = 0 unless M = m + m′ and |j − j′| ≤ J ≤ j + j′ .

(ii) Orthogonality and completeness:

    ∑_{m=−j}^{j} ∑_{m′=−j′}^{j′} ⟨J′M′|jm, j′m′⟩ ⟨jm, j′m′|JM⟩ = δ_{J′J} δ_{M′M} ,

    ∑_{J=|j−j′|}^{j+j′} ∑_{M=−J}^{J} ⟨jn, j′n′|(jj′)J, M⟩ ⟨JM|jm, j′m′⟩ = δ_{nm} δ_{n′m′} .

(iii) Symmetry relations:

    ⟨jm, j′m′|JM⟩ = (−1)^{j+j′−J} ⟨j′m′, jm|JM⟩
                 = (−1)^{j+j′−J} ⟨j, −m; j′, −m′|J, −M⟩
                 = (−1)^{j−J+m′} ⟨JM; j′, −m′|jm⟩ [(2J + 1)/(2j + 1)]^{1/2} .
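These properties can be tested numerically. The sketch below implements Racah's standard closed formula for the SU(2) Clebsch–Gordan coefficients in the Condon–Shortley convention; the function name cg and the use of exact fractions are our own choices, not the book's:

```python
from fractions import Fraction as F
from math import factorial, sqrt

def _f(x):
    # factorial of a Fraction known to be a non-negative integer
    return factorial(x.numerator)

def cg(j1, m1, j2, m2, J, M):
    """<j1 m1, j2 m2 | J M> from Racah's closed formula (Condon-Shortley
    phases); arguments may be integral or half-integral."""
    j1, m1, j2, m2, J, M = (F(x) for x in (j1, m1, j2, m2, J, M))
    if M != m1 + m2 or not abs(j1 - j2) <= J <= j1 + j2:
        return 0.0
    if (j1 + j2 + J).denominator != 1:      # j1 + j2 + J must be an integer
        return 0.0
    pre = (2*J + 1) * F(_f(j1 + j2 - J) * _f(J + j1 - j2) * _f(J - j1 + j2),
                        _f(j1 + j2 + J + 1))
    pre *= _f(J + M) * _f(J - M) * _f(j1 - m1) * _f(j1 + m1) * _f(j2 - m2) * _f(j2 + m2)
    s = F(0)
    for k in range(int(j1 + j2 - J) + 1):
        args = [F(k), j1 + j2 - J - k, j1 - m1 - k, j2 + m2 - k,
                J - j2 + m1 + k, J - j1 - m2 + k]
        if all(a.denominator == 1 and a >= 0 for a in args):
            d = 1
            for a in args:
                d *= _f(a)
            s += F((-1)**k, d)
    return sqrt(pre) * float(s)

# Selection rule (i) and the coefficients of Example 11:
assert cg(0.5, 0.5, 0.5, 0.5, 1, 0) == 0.0                    # M != m + m'
assert abs(cg(0.5, 0.5, 0.5, -0.5, 1, 0) - sqrt(0.5)) < 1e-12
assert abs(cg(0.5, -0.5, 0.5, 0.5, 0, 0) + sqrt(0.5)) < 1e-12

# Orthogonality (ii), here for j = 1, j' = 1/2:
dot = lambda Ja, Jb: sum(cg(1, m1, 0.5, m2, Ja, 0.5) * cg(1, m1, 0.5, m2, Jb, 0.5)
                         for m1 in (-1, 0, 1) for m2 in (-0.5, 0.5))
assert abs(dot(1.5, 1.5) - 1) < 1e-12 and abs(dot(1.5, 0.5)) < 1e-12
```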

Corresponding to the construction of the invariant irreducible subspaces as
encapsulated in (4.82), we have the reduction of the direct product of rotation
matrices to its irreducible parts in the spin-J representation:

    D^j ⊗ D^{j′} = D^{j+j′} ⊕ D^{j+j′−1} ⊕ · · · ⊕ D^{|j−j′|} ,    (4.85)
where each summand D^J occurs only once. It is the same as the relation given
in Example 2. When the decomposition is written out fully, we have what is
known as the Clebsch–Gordan series:

    D^j_{mn}(R) D^{j′}_{m′n′}(R) = ∑_{J=|j−j′|}^{j+j′} ⟨jm, j′m′|JM⟩ D^J_{MN}(R) ⟨JN|jn, j′n′⟩ .    (4.86)

On the RHS, only terms with M = m + m′ and N = n + n′ may appear in the
sum. Conversely, (4.83) allows us to build from the irreducible representations D^j
and D^{j′} invariant irreducible representations of higher (or lower) dimensions:

    δ_{JJ′} D^J_{MM′}(R) = ∑_{mm′nn′} ⟨JM|jm, j′m′⟩ D^j_{mn}(R) D^{j′}_{m′n′}(R) ⟨jn, j′n′|J′M′⟩ ,    (4.87)

where m, m′, n, n′ must be such that M = m + m′ and M′ = n + n′.


EXAMPLE 11: Coupling of j = 1/2 and j′ = 1/2 under the total angular momentum
J = j + j′. We have two sets of orthonormal vectors:
(i) Eigenkets |mm′⟩ of j3 and j′3: |++⟩, |+−⟩, |−+⟩, |−−⟩, where |+−⟩
stands for m = 1/2, m′ = −1/2, and so on.
(ii) Eigenkets of J² and J3: |11⟩, |10⟩, |1, −1⟩, |00⟩, where |10⟩ stands for J =
1, M = 0, and so forth.
We want to relate the two sets. We may start from |1, −1⟩. Recalling that
M = m + m′, we see that |1, −1⟩ = |−−⟩. Applying J+ = j+ ⊗ I + I ⊗ j′+ to
both sides of the last equation and using (4.52), we get

    J+ |1, −1⟩ = (j+ ⊗ I + I ⊗ j′+) |−−⟩ ,
    √[1(1+1) − (−1)(−1+1)] |10⟩ = √[½(½+1) − (−½)(−½+1)] |+−⟩
                                 + √[½(½+1) − (−½)(−½+1)] |−+⟩ ,
    √2 |10⟩ = |+−⟩ + |−+⟩ .

State |11i = | + +i is obtained either by applying J+ on |10i, or by noting that


| + +i is the only uncoupled ket with m + m0 = 1. To obtain |00i, it suffices to
note that it is orthogonal to |10i. To summarize, the direct product of two doublet
(2) representations of SU (2) decomposes into a sum of a singlet (1) and a triplet
(3) representation: 2 ⊗ 2 = 1 ⊕ 3; and the two sets of kets are related in terms of
the Clebsch-Gordan coefficients given in the following table.

          ++        −+        +−        −−
  11       1         0         0         0
  10       0       √(1/2)    √(1/2)      0
  1,−1     0         0         0         1
  00       0       √(1/2)   −√(1/2)      0
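The ladder-operator step of this example can be reproduced with explicit matrices. In the sketch below (our construction, with ħ = 1 and the product basis ordered |++⟩, |+−⟩, |−+⟩, |−−⟩), J+ applied to |−−⟩ indeed yields |+−⟩ + |−+⟩:

```python
# Spin-1/2 matrices in the basis {|+>, |->}:
jp = [[0, 1], [0, 0]]          # j+ : j+|-> = |+>
j3 = [[0.5, 0], [0, -0.5]]     # j3
I2 = [[1, 0], [0, 1]]

def kron(a, b):
    """Kronecker product of two 2x2 matrices (4x4 result)."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]

# Total J+ and J3 on the product space, J = j (x) I + I (x) j':
Jp = add(kron(jp, I2), kron(I2, jp))
J3 = add(kron(j3, I2), kron(I2, j3))

# Basis order: |++>, |+->, |-+>, |-->.
v = matvec(Jp, [0, 0, 0, 1])           # J+ |1,-1> = J+ |-->
assert v == [0, 1, 1, 0]               # i.e. sqrt(2)|10> = |+-> + |-+>
assert matvec(J3, v) == [0, 0, 0, 0]   # the result has M = 0, as it must
assert matvec(Jp, v) == [2, 0, 0, 0]   # J+ sqrt(2)|10> = 2|++>
```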
The weight diagrams. We introduce here a graphical tool, the weight diagram
of a representation, which gives a graphical distribution of the weights of the
representation in weight space. For SU (2), it is simply a linear plot, but it is more
complicated in higher-dimensional groups.
For example, the irreducible representation V(4) has the weight diagram

    •    •    •    •    •
   −2   −1    0    1    2    m

in which each eigenspace νm (with m = ± 2, ± 1, 0) for J3 is represented by a black


dot. Let us recall two facts discussed earlier in the chapter, which concern the de-
composition of the products and powers into irreducible representations of su(2):
(i) For an irreducible representation, the eigenvalues for J3 occur symmetrically, and
only once. (ii) Knowing the eigenspace decomposition of representations provides
us with the eigenspace decomposition of their tensor products: for example, if
V = ⊕ νm and W = ⊕ νm0 are known, then V ⊗ W = ⊕ νm+m0 with all possible
values of m, m0 .
In Example 11 discussed above, we considered the tensor bi-product of the
standard representation V (1) = C2 , each factor having the eigenvalues ± 1/2. The
eigenvalues for J3 in V (1) ⊗ V (1) are thus 1, 0 (twice) and − 1, distributed into V (2)
and V (0), so that V (1) ⊗ V (1) = V (2) ⊕ V (0), as shown in the following figure:

    •    •    •
         ◦
   −1    0    1

(filled dots: the weights of V(2); open dot: the weight of V(0))

As another example, let us take V (2) ⊗ V (3). The eigenvalues for J3 in V (2) are
0 and ± 1; and in V (3), ± 3/2 and ± 1/2, which means the twelve eigenvalues in
V (2) ⊗ V (3) are ± 5/2, ± 3/2 (twice), and ± 1/2 (three times), as shown:
    •     •     •     •     •     •
          ◦     ◦     ◦     ◦
                ◦     ◦
  −5/2  −3/2  −1/2   1/2   3/2   5/2

(one mark per occurrence of each eigenvalue)

As we know, V (n) ∼ = Symn V (1), the space of homogeneous polynomials of de-


gree n in the basis states of C2 . So the map Sym2 V (1) ⊗ Sym3 V (1) → Sym5 V (1) (a
multiplication of polynomials) identifies the set 5/2, 3/2, 1/2, − 1/2, − 3/2, and − 5/2
(each occurring once) as the components of V(5). The remaining possible eigenvalues ±3/2 (once) and ±1/2 (twice) clearly distribute themselves into a quadruplet
3/2, 1/2, − 1/2, and − 3/2, which identifies it with V (3) ; and a doublet 1/2 and − 1/2
forming a copy of V (1). So we have the decomposition

V (2) ⊗ V (3) = V (5) ⊕ V (3) ⊕ V (1) .

An analysis along the same lines yields the decomposition of a typical tensor
product Syma V (1) ⊗ Symb V (1), or equivalently V (a) ⊗ V (b), according to

V (a) ⊗ V (b) = V (a+b) ⊕ V (a+b−2) ⊕ · · · ⊕ V (|a−b|) .


We can check this equation by calculating the dimensions on both sides and
seeing that they agree. It is remarkable that the multiplicity of every component is one,
and that the weights (with respect to J3) of all the composing representations occupy
every point of the set ½Z.
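The weight-counting argument above is easy to automate. The following sketch (ours; components are labeled by the superscript n of V(n), whose h-eigenvalues, i.e. twice the J3-eigenvalues, are n, n−2, …, −n) peels irreducible components off the multiset of weights:

```python
from collections import Counter

def weights(n):
    """h-eigenvalues (twice J3) of V(n): n, n-2, ..., -n."""
    return [n - 2*k for k in range(n + 1)]

def decompose(a, b):
    """Decompose V(a) (x) V(b) by peeling highest weights; returns the
    labels n of the components V(n), in decreasing order."""
    mult = Counter(x + y for x in weights(a) for y in weights(b))
    comps = []
    while mult:
        top = max(mult)            # highest remaining weight
        comps.append(top)
        for m in weights(top):     # remove one copy of V(top)
            mult[m] -= 1
        mult = +mult               # drop entries whose count reached zero
    return comps

assert decompose(2, 3) == [5, 3, 1]    # the example worked out in the text
assert decompose(1, 1) == [2, 0]       # Example 11: 2 (x) 2 = 3 (+) 1
# dimension check: sum of (n + 1) over components = (a + 1)(b + 1)
assert sum(n + 1 for n in decompose(4, 2)) == 15
```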

Problems
4.1 (Casimir invariant) Let h, e, f be the canonical basis of sl(2, C). Show that
C = ef + fe + ½h² is an invariant of the algebra. Express C in terms of J1, J2, J3,
with h = 2J3, e = J1 + iJ2, and f = J1 − iJ2.
4.2 (Brackets) Let h, e, f be the canonical basis of sl (2, C ). Calculate the brackets
[ h, em ], [ h, f m ], [ f , em], and [ e, f m].
4.3 (Spin-one representation) Let ξ, η ∈ C² transform into ξ′ = aξ + cη, η′ = bξ +
dη under SU(2) (so that d = a*, c = −b*, and aa* + bb* = 1). Let ψ+ = ξ², ψ0 =
√2 ξη, and ψ− = η². Show that the ψμ's transform as a spin-one representation;
find the representation matrix D¹ in terms of the Euler angles.
4.4 (Quaternion groups) A quaternion q is a number that can be written as q =
∑³_{μ=0} xμ qμ, where the xμ are real numbers and the qμ = {q0, qi} (with μ = 0, . . . , 3 and
i = 1, 2, 3) are generalized noncommuting numbers defined such that
q0 q0 = q0, q0 qi = qi q0 = qi, qi qj = −δij q0 + εijk qk; and q0* = q0, qi* = −qi under
complex conjugation.
(a) Let H be a one-dimensional quaternion-valued vector space, with typical
vector η. The set of all non-zero mappings η′ = Qη forms a Lie group, called
GL(1, H). Find its generators and its Lie algebra.
(b) Show that {Q ∈ GL(1, H) | Q*Q = 1} is a Lie group. Find its associated
Lie algebra.
(c) Evaluate the exponential mapping of the above Lie algebras.
4.5 (det R = 1) Show that the invariance of the tensor εijk under rotations is
equivalent to det R = 1 for any orthogonal real 3 × 3 matrix R.
4.6 (Invariant axis) Show that a proper rotation in an odd-dimensional space pos-
sesses an invariant axis.
4.7 (Similarity transformation) Consider the group of pure rotations in R³: {R ∈
M(3, R) | Rᵀ R = I, det R = 1}.
(a) Find the direction of the axis and the angle of rotation in terms of the matrix
elements R ij .
(b) Find the transformation U that diagonalizes R, so that U −1 RU = Λ is a
diagonal matrix.
(c) Find the transformation that gives Λ in a real orthogonal basis.
4.8 (Angle-axis rotation matrix) Write out the 3 × 3 matrix of rotation R ( ω, n)
in terms of the rotation angle ω and the spherical coordinates θ, ϕ of the axis n,
or alternatively in terms of ω and the Cartesian components n1, n2 , n3 of n in an
orthonormal basis { ei } , with n2 = 1.
4.9 (Euler-angles rotation matrix) Write out the matrix of rotation R ( α, β, γ) in a


Cartesian basis in terms of the Euler angles.
4.10 (Relations between parameters) Find the relations between the angle-axis
parameters and the Euler angles for the same rotation.
4.11 (R(α, β, γ) at β = 0, π) Show that R(α, 0, γ) is a function of the sum α + γ only,
and R(α, π, γ) of the difference α − γ only.
4.12 (To rotate a vector) Show that the rotation R(α, n) of a position vector x in
R³ produces the following vector (with α ≡ αn, n² = 1):

    x′ = x + (sin α / α)(α × x) + [(1 − cos α) / α²][α × (α × x)] .

4.13 (Commutation of angular momentum) (a) Show that the sequence of in-
finitesimal rotations by the directed angles α, β, − α, and − β on the position
vector x is equivalent to the rotation by the directed angle β × α on x.
(b) In Hilbert space, a rotation by α is represented by the unitary operator
U(α) = e^A, where A = α · X, with Xi denoting the generators of rotation. Following
the argument used in (a), derive the commutation relations for the Xi (see also
Sec. 3.3, Chapter 3).
4.14 (Exponential mapping) Re-express R(ω, n) = exp(−iω nk Lk) in a non-transcendental
form by summing the power series of the exponential. The Lk are the 3 × 3
matrices of the angular momentum operators in a Cartesian basis.
4.15 (From Cartesian basis to spherical basis) Show that the matrices Jk ≡ D¹(Jk)
in the spin-1 representation (in which D¹(J3) is diagonal) are equivalent by a similarity
transformation to the defining matrices Lk of the algebra so(3) given in the
preceding problem.
4.16 (Non-standard parameterizations) In the text, we emphasized the standard
parameterization ( ω1, ω2, ω3 ) of SU (2). But there are times when it is more ad-
vantageous to use a non-standard parameterization (e.g. to make the exponential
mapping simpler). In SU (2) the use of a parameterization associated with the
operators J± = J1 ± iJ2 has a further advantage when we need to study transi-
tions between states in the same irreducible representation. In this problem, we
examine just such mappings involving the elements J± , J3.
(a) Evaluate Uo ( ω+, ω− , ω3 ) = exp[− i(ω+ J+ + ω− J− + ω3 J3 )] and relate the
parameters ω+ , ω− , ω3 to the standard parameters ω1, ω2, ω3.
(b) Evaluate Ua ( α+, α− , α3 ) = exp[− i(α+ J+ + α− J− )] exp[− iα3 J3 ], and dis-
cuss the characteristics of the parameters.
(c) Repeat with Ub ( β +, β − , β 3 ) = exp [− iβ + J+ ] exp[− iβ 3 J3 ] exp[− iβ − J− ].
4.17 (Matrices in non-standard parameters) Find D^j_{m′m}[U] for the different U
found in Problem 4.16.
4.18 (j = 1/2 rotation matrices) (a) Write down the matrices for the same general
rotation in the Euler angles, R ( α, β, γ), and the angle-axis parameters, R (ω, n),
both in the two-dimensional j = 1/2 angular-momentum basis.
(b) By comparing the two expressions, D1/2 [ R (α, β, γ)] and D1/2 [ R (ω, n)], de-
rive the relationship between the two sets of parameters.
4.19 (Schwinger’s model) J. Schwinger gave a model of the algebra of angular
momentum based on the algebra of two independent harmonic oscillators repre-
sented by ( a+ , a†+ ) and ( a−, a†− ), which satisfy the commutation relations typical
of uncoupled harmonic oscillators:
    [a+, a+†] = 1 ,   [a−, a−†] = 1 ,
    [a+, a−†] = 0 ,   [a−, a+†] = 0 .

(a) Define the number operators N+ = a†+ a+ and N− = a†− a− . Show that
they commute with one another, and thus have common eigenkets | n+ , n− i of
respective eigenvalues n+ , n− where the vacuum ket |0, 0i ≡ |0i is defined by
a+ |0i = 0, and a− |0i = 0. Construct the normalized ket | n+ , n− i from the nor-
malized vacuum state |0, 0i.
(b) Define the operators J+ = a+† a−, J− = a−† a+, J3 = ½(a+† a+ − a−† a−), and
J² = ½(J+J− + J−J+) + J3². Derive their bracket rules.

(c) Evaluate J²|n+, n−⟩, J3|n+, n−⟩, J+|n+, n−⟩, and J−|n+, n−⟩, and interpret
|n+, n−⟩ as a vector |jm⟩ describing a system of spin j in state m.
(d) From the rotation property of the angular momentum kets |jm⟩, obtain the
explicit formula for the rotation functions d^j_{m′m}(β).
4.20 (Weight diagrams) We have studied in this chapter the decomposition of ten-
sor products of the kind Syma V (1) × Symb V (1), where V (1) ∼ = C2 is the standard
(two-dimensional) representation of sl (2, C ). In this problem we want to apply
the same approach to find the decomposition of symmetric powers of V (2), i.e.
Syma V (2). Here, V (n) = Symn V (1) is the (n + 1)-dimensional irreducible repre-
sentation of sl (2, C ). In order to do this, first find the decomposition of Sym2 V (2)
and Sym3 V (2); then generalize to Syma V (2).
4.21 (Reduction of direct product) Calculate the Clebsch–Gordan coefficients for
the direct product of irreducible representations of SO (3): (a) D1/2 × D1 , and (b)
D1 × D1. A good way is to make use of the properties of states of highest weight
and the raising/lowering operators.
4.22 (Rotation of tensor product) Show that T^j_m = ∑_{μν} ⟨ℓμ, sν|jm⟩ X^ℓ_μ Y^s_ν transforms
as a tensor of rank j if X^ℓ_μ and Y^s_ν transform as tensors of rank ℓ and s.

Q. Ho-Kim. Group Theory: A Physicist’s Primer.


Chapter 5

Lie Algebra sl(3, C )

5.1 Structure
5.2 Representations

We will apply the basic ideas and techniques developed in the last chapter for
sl (2, C ) to study sl (3, C ). As we are dealing now with a larger space — an eight-
dimensional sl (3, C ) rather than the three-dimensional sl (2, C ) — we need to
generalize old ideas and introduce new concepts. The basis for our study is an
elementary result from linear algebra, namely, that commuting diagonalizable
matrices are simultaneously diagonalizable. By the end of this chapter, we will
have almost all the tools we need for studying other simple or semi-simple Lie
algebras. The Lie group SU(3), whose complexified Lie algebra is sl(3, C), plays
an important role in physics, first as the (approximate) symmetry of quark
flavor, where it was referred to as the eightfold way, and later as the (exact) symmetry
of quark and gluon color, which gave rise to a gauge theory of strong
interactions called quantum chromodynamics (QCD).

5.1 Structure
After a brief review of some key properties of sl (2, C ), we will take up sl (3, C ),
define a standard basis for it, analyze its structure, and consider its roots and
weights, phrasing the results so as to be generalizable to other Lie algebras.
Basic results on sl (2, C ). To begin, we recall the most important properties of the
complex semi-simple algebra sl (2, C ) which, as we shall see, plays an essential
role in the analysis of sl (3, C ) and other simple Lie algebras. Let h, e, and f be the
canonical basis for sl (2, C ) satisfying the bracket relations

[ h, e ] = 2e, [ h, f ] = − 2 f , [ e, f ] = h . (5.1)

Every element z of g = sl (2, C ) has the form z = ah + be + c f , where a, b, c ∈ C.


Let h be the sub algebra spanned by h, and g − h be the set of all be + c f .

A convenient basis for the standard representation in C2 of sl (2, C ) is


     
1 0 0 1 0 0
H= , E= , F= . (5.2)
0 −1 0 0 1 0

The adjoint action of h on sl(2, C) is defined by adh(z) = [h, z] for any z ∈
sl(2, C). The eigenvalue equation for adh,

adh z = α ( h) z, z ∈ g, (5.3)

has at least one solution, an eigenvector z with eigenvalue α(h). The eigenvalue α(h)
is a complex linear function of h ∈ h, so that it belongs to the dual h∗ of h, the
space of linear functionals sending every k ∈ h to α(k) ∈ C. A nonzero α is called a root of
the algebra sl(2, C) with respect to h, and the element z of g − h found in (5.3) is
called a root vector corresponding to the root α. The set of all such z with the same
α forms the root space corresponding to the root α, denoted gα.
From (5.1) we see that there are three eigenvalues (namely, 0, 2, − 2) for adh,
but only two roots in the algebra, both integers, namely, α = 2 for the vector e, and
α = − 2 for the vector f . As h, e, f are the independent eigenvectors for adh, they
form a basis of g, and we may write g as a direct sum: g = h ⊕ g2 ⊕ g−2 .
In any complex finite-dimensional irreducible representation π defined on a vector
space V , the operator π ( h ) on V is diagonalizable, so that

π ( h )u = µ ( h )u, u ∈ V. (5.4)

µ is a linear function on h called a weight of π. It is associated with one or several


eigenvectors, called weight vectors, which together compose the weight space Vµ
for µ. In the algebra sl(2, C) all Vµ are one-dimensional. If u ∈ Vµ is such a vector,
then π(e)u is either zero or an eigenvector for π(h) with eigenvalue µ + 2.
Similarly, either π(f)u = 0, or else π(f)u is an eigenvector for π(h) with eigenvalue
µ − 2. We have a complete classification of the representations of sl(2, C);
they are either irreducible or completely reducible:
(a) Every finite-dimensional irreducible representation of sl(2, C) is uniquely determined
by an integer n ≥ 0, and is denoted V(n). It is spanned by the (n + 1) π(h)-eigenvectors
v0, v1, . . . , vn corresponding to the unbroken sequence of eigenvalues n, n − 2, . . . , −n.
This means V(n) = Vn ⊕ Vn−2 ⊕ · · · ⊕ V−n.
(b) Irreducible representations of sl(2, C) with the same dimension are equivalent.
(c) Every representation of sl(2, C) is completely reducible to a direct sum ⊕n V(n).
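Statement (a) can be made concrete with explicit matrices. The sketch below (ours; we use the common normalization in which f shifts basis vectors by one and e carries the factor k(n + 1 − k)) verifies the brackets (5.1) on V(4):

```python
def rep(n):
    """Matrices of h, e, f on V(n) in the basis v0, ..., vn, with
    h vk = (n - 2k) vk,  e vk = k(n + 1 - k) v_{k-1},  f vk = v_{k+1}."""
    N = n + 1
    h = [[(n - 2*k) if i == k else 0 for k in range(N)] for i in range(N)]
    e = [[k*(n + 1 - k) if i == k - 1 else 0 for k in range(N)] for i in range(N)]
    f = [[1 if i == k + 1 else 0 for k in range(N)] for i in range(N)]
    return h, e, f

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def brk(a, b):
    """Lie bracket [a, b] = ab - ba."""
    ab, ba = mul(a, b), mul(b, a)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(ab, ba)]

h, e, f = rep(4)
assert brk(h, e) == [[2*x for x in r] for r in e]     # [h, e] = 2e
assert brk(h, f) == [[-2*x for x in r] for r in f]    # [h, f] = -2f
assert brk(e, f) == h                                 # [e, f] = h
```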
Lie group SU ( 3 ) and associated Lie algebras. The matrix Lie group SU (3) is the
eight-dimensional group of 3 × 3 complex-valued matrices that are unitary and
have determinant equal to one. Near the identity element, I, every element, say A,
of SU (3) takes the form A = I + M + · · · , where M denotes an infinitesimal
traceless ( TrM = 0 ) and skew-adjoint ( M† = − M ) 3 × 3 matrix. (In physics,
M is usually written in a basis consisting of eight traceless self-adjoint matrices
λ1, λ2, . . . , λ8, so that one has M = i ∑i ai λi, where ai ∈ R.) The set of all such
matrices M, regardless of the basis being used, forms the associated Lie algebra
su(3). Its complex extension, su(3)C, is equivalent to the matrix Lie algebra
sl(3, C), consisting of the complex-valued 3 × 3 matrices having zero trace.
(Any element M ∈ sl(n, C) may be written uniquely as M = M1 + iM2, where
M1 = (M − M†)/2 and M2 = (M + M†)/2i are traceless and skew-adjoint, and
so must be in su(n).) So su(3)C and sl(3, C) are isomorphic.
The standard representation. In order to define sl(3, C) concretely, we pick a basis
similar to (5.2). There can be at most two independent diagonal matrices because,
if a, b, c are the diagonal entries of a 3 × 3 diagonal matrix, the zero-trace condition,
a + b + c = 0, leaves only two entries undetermined; which two, and with which values,
is a matter of choice. The remaining traceless matrices (with zero entries on the main
diagonal) can be most simply chosen to have just one nonzero element each.
So there are 8 independent matrices, which we choose to be:
   
1 0 0 0 0 0
T1 = 0 −1 0 , T2 = 0 1 0 ,
0 0 0 0 0 −1
     
0 1 0 0 0 0 0 0 1
E1 = 0 0 0 , E2 = 0 0 1 , E3 = 0 0 0 ,
0 0 0 0 0 0 0 0 0
     
0 0 0 0 0 0 0 0 0
F1 = 1 0 0 , F2 = 0 0 0 , F3 = 0 0 0 .
0 0 0 0 1 0 1 0 0

The commutation relations among these matrices and their correspondence


with elements of the algebra, e.g. Ti → ti, Ei → ei, and Fi → f i , determine
the composition laws among the members of the basis { t1, t2, e1, f 1, e2, f 2, e3, f 3 }
for the Lie algebra sl (3, C ). This is shown in Table 5.1. The manipulations can
be simplified if we introduce (3 × 3) matrices Eij (with i, j = 1, 2, 3) with 1 at
the ij-position and 0’s everywhere else, so that ( Eij )rs = δir δjs . For example,
T1 = E11 − E22, T2 = E22 − E33, E1 = E12, F1 = E21, etc.

The multiplication table for sl (3, C).


t1 t2 e1 f1 e2 f2 e3 f3
t1 0 0 2e1 −2 f1 −e2 f2 e3 − f3
t2 0 0 −e1 f1 2e2 −2 f2 e3 − f3
e1 −2e1 e1 0 t1 e3 0 0 − f2
f1 2 f1 − f1 −t1 0 0 − f3 e2 0
e2 e2 −2e2 −e3 0 0 t2 0 f1
f2 − f2 2 f2 0 f3 −t2 0 −e1 0
e3 −e3 −e3 0 −e2 0 e1 0 t1 + t2
f3 f3 f3 f2 0 − f1 0 −t1 − t2 0
Each entry gives [ a, b ] , with a corresponding to the row label
and b to the column label.

Table 5.1: The Lie bracket relations for sl (3, C ).
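The entries of Table 5.1 can be verified mechanically from the matrix units Eij; the following sketch (ours) spot-checks several of the brackets quoted in the text:

```python
def E(i, j):
    """3x3 matrix unit E_ij, with 1 at row i, column j (1-based indices)."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(3)]
            for r in range(3)]

def sub(a, b): return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]
def add(a, b): return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]
def scal(c, m): return [[c * x for x in row] for row in m]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def brk(a, b):
    """Lie bracket [a, b] = ab - ba."""
    return sub(mul(a, b), mul(b, a))

t1, t2 = sub(E(1, 1), E(2, 2)), sub(E(2, 2), E(3, 3))
e1, e2, e3 = E(1, 2), E(2, 3), E(1, 3)
f1, f2, f3 = E(2, 1), E(3, 2), E(3, 1)

# Spot-check entries of Table 5.1:
assert brk(t1, e1) == scal(2, e1)       # [t1, e1] = 2e1
assert brk(t1, f1) == scal(-2, f1)      # [t1, f1] = -2f1
assert brk(e1, f1) == t1                # [e1, f1] = t1
assert brk(e1, e2) == e3                # [e1, e2] = e3
assert brk(e3, f3) == add(t1, t2)       # [e3, f3] = t1 + t2
```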


Elementary properties. From Table 5.1 several properties of g = sl(3, C) are apparent.
First, t1 and t2 commute with each other and, as we have noted above
concerning the matrices T1 and T2, two is the largest number of mutually commuting
independent elements this algebra can have. They define a Cartan subalgebra (denoted h)
of g, the name given to a maximal abelian subalgebra of a semisimple Lie algebra
such that adh is simultaneously diagonalizable for all h ∈ h. The dimension of a Cartan
subalgebra, called the rank ℓ of the Lie algebra (ℓ = 2 in this case), is independent
of the specific choice of its span. (Other possible choices for h can be identified
from the entries of Table 5.1.)
Secondly, the elements t1, e1 , f 1 satisfy the bracket relations

[ t1, e1 ] = 2e1, [ t1, f 1 ] = − 2 f 1, [ e 1, f 1 ] = t 1 . (5.5)

They form a sub algebra of sl (3, C ) isomorphic to sl (2, C ) via the correspondence
of t1, e1, f 1 with the elements h, e, f in (5.1). It is not the only sub algebra isomor-
phic to sl (2, C ) in sl (3, C ), as t2, e2, f 2 also obey similar relations

[ t2, e2 ] = 2e2, [ t2, f 2 ] = − 2 f 2, [ e 2, f 2 ] = t 2 . (5.6)

These two sl (2, C ) sub algebras are referred to as s1 and s2 . Together they generate
the whole algebra g since the remaining two elements, e3 and f 3, are fixed by

[ e1, e2 ] = e3, [ f 1, f 2 ] = − f 3, [ e 3, f 3 ] = t 1 + t 2 . (5.7)

Weights of representations. Just as the quantum states of a physical system are


identified by a maximal set of mutually commuting operators for the system, so
the representations of a Lie algebra g can be unambiguously described by a Cartan
sub Lie algebra (CSA) h of g. The CSA of sl (3, C ) is two-dimensional, spanned by
the chosen basis elements t1, t2. The strategy is to diagonalize the representatives
π(t1) and π(t2) in any π. Since t1 and t2 commute, π(t1) and π(t2) also
commute, and therefore can be simultaneously diagonalized over the complex
numbers (they always possess at least one common eigenvector). If π(t1) and π(t2)
can be diagonalized, then so can π(h), with h = at1 + bt2 for arbitrary a, b ∈ C.
Therefore, in analogy with (5.4), we define
Definition 5.1 (Weights of π[sl(3, C)]). Let (π, V) be a complex linear finite-dimensional
representation of sl(3, C), and h an element of the CSA h. Then µ(h) ∈ C is called
a weight of π relative to h if there exists a vector v ≠ 0 in V such that π(h)v = µ(h)v.
The vector v is called a weight vector corresponding to the weight µ(h), and the space
of all such vectors is the weight space, denoted Vµ, corresponding to the weight µ.
It is often convenient to use a basis-dependent notation which shows a weight
µ of π written as an ordered pair ( m1, m2 ), where m1 = µ ( t1) and m2 = µ ( t2) in
some basis { t1, t2 } of h. The weights µ ( h) are linear functionals of h ∈ h (so that
µ ( at1 + bt2) = am1 + bm2), and hence all µ belong to h∗ , the dual of h.
We shall return to the general (complex-linear finite-dimensional) represen-
tations later on in this chapter, but let us take up for now the eight-dimensional
adjoint representation, which contains a wealth of information on the structure
of the Lie algebra itself. It is useful to note that the matrix adx for any element x
of the algebra can be obtained from Table 5.1.
The adjoint representation: roots. Non-zero weights of the adjoint representa-
tion are called roots. Generalizing Eq. (5.3), we have
Definition 5.2 (Roots of sl (3, C )). Let h be an element of the Cartan sub algebra h of
g = sl (3, C ). A non-zero linear function α ( h ) is called a root of g with respect to h if
there exists a non-zero element z ∈ g such that adh z = α ( h )z. The element z is called
a root vector corresponding to the root α.
Just as for the weights of any other representation, α ( h) is a linear function
of h ∈ h (which means α ∈ h∗ ), and can be specified in terms of an ordered pair
( α1, α2 ), with α1 = α ( t1) and α2 = α ( t2) relative to the basis { t1, t2 } of h.
Simple roots. From Table 5.1, one can establish the complete root system of
sl (3, C ). The product rules given there tell us that ei, f i , with i = 1, 2, 3, are the 6
joint eigenvectors of adt1 and adt2, with the corresponding 6 roots given below:

Root vectors zα Roots α


e1 (2, − 1)
e2 (− 1, 2)
e3 (1, 1) (5.8)
f1 (− 2, 1)
f2 (1, − 2)
f3 (− 1, −1)
The roots have integral values relative to the given basis of h, but only two are
independent of one another, which we may choose as the roots attached to the
root vectors e1 and e2, namely,

α(1) = (2, − 1), α(2) = (− 1, 2). (5.9)

These roots α(1) and α(2) are called the simple roots of the algebra relative to
the given h, forming the set Σ = { α(1), α(2) } ; they have the property that all the
roots of the algebra can be expressed as linear combinations of them, with integral
coefficients which are either all greater than or equal to zero (positive roots), or else
all less than or equal to zero (negative roots). This can be verified:

(2, − 1) = α(1)
(− 1, 2) = α(2)
(1, 1) = α(1) + α(2) ≡ α(3)
(5.10)
(− 2, 1) = − α(1)
(1, − 2) = − α(2)
(− 1, −1) = − α(1) − α(2) = − α(3) .

The set of all the roots (±α(1), ±α(2), ±α(3)) of g is a finite subset of h∗ denoted
by ∆; and its subset of positive roots (which are α(1), α(2), and α(3)) is called ∆+.
(0 ∈ h∗ is not considered a root, i.e. 0 ∉ ∆, although g0 is often used to denote h.)
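The root system (5.8) amounts to the statement that each ei, fi is a joint eigenvector of ad t1 and ad t2; the sketch below (ours) confirms every entry of the table:

```python
def E(i, j):
    """3x3 matrix unit E_ij (1-based indices)."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(3)]
            for r in range(3)]

def sub(a, b): return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]
def scal(c, m): return [[c * x for x in row] for row in m]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def brk(a, b):
    return sub(mul(a, b), mul(b, a))

t1, t2 = sub(E(1, 1), E(2, 2)), sub(E(2, 2), E(3, 3))
vectors = {'e1': E(1, 2), 'e2': E(2, 3), 'e3': E(1, 3),
           'f1': E(2, 1), 'f2': E(3, 2), 'f3': E(3, 1)}
roots = {'e1': (2, -1), 'e2': (-1, 2), 'e3': (1, 1),
         'f1': (-2, 1), 'f2': (1, -2), 'f3': (-1, -1)}

for name, z in vectors.items():
    a1, a2 = roots[name]
    assert brk(t1, z) == scal(a1, z)    # ad t1 . z = alpha(t1) z
    assert brk(t2, z) == scal(a2, z)    # ad t2 . z = alpha(t2) z
```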
From their explicit expressions, we observe that the roots of sl (3, C ) have the
following properties (cf. [Ja] Chap. IV), which hold in general for other semisim-
ple Lie algebras (in particular, sl (2, C )) as well:
(i) If α is a root, the only integral multiples kα of α that are nonzero roots are
α and − α (i.e. ± 2α, ± 3α, . . . are not roots).
(ii) To each root corresponds a unique linearly independent element of g with
that root, so that we have the direct-sum decomposition:
g = h ⊕ (⊕ α∈∆ gα ) . (5.11)
(iii) All the roots are real; they lie in and span a real subspace h∗0 of the general
weight space h∗; more precisely, h∗0 = {α ∈ h∗ | α = a1 α(1) + a2 α(2); ai ∈ R},
where α(1), α(2) are the simple roots of g. Moreover, the rank ℓ of g is defined to be
dim h = dim h∗; and the dimension of g is dim g = dim h + ∑α dim gα = |Σ| + |∆|.
The Killing form on g and h. The Killing form K on g is a bilinear trace form in the
adjoint representation: K ( x, y) = Tr (adx · ady ) for any x, y ∈ g. We will use the
notation ( x : y ) ≡ K ( x, y). As we have already explained in Chapter 3 Sect. 3.6.4,
to calculate the Killing form of any x, y ∈ g, (i) Pick a basis { zi } for g; (ii) Calculate
[ x, zi ] = zk aki and [ y, zi] = zk bki ; (iii) Calculate [ x, [ y, z j ]] = zi aik bkj . The coefficient
of the zi with i = j on the right-hand side, if nonzero, contributes to the trace; (iv)
Repeat for every possible value of j, and obtain ( x : y ) = Tr (adx · ady ) = a jk bkj .
If [ x, [ y, z j ]] does not re-generate z j for any value of j, then ( x : y ) = 0. The Killing
form (x : y) for arbitrary x, y ∈ g can be expressed as a linear combination of the
κij = (zi : zj) in any basis {zi} of g, as is shown using a general property of
representations:

    π(∑i ai zi) π(∑j bj zj) = ∑ij ai bj π(zi) π(zj) .
So, to calculate the Killing form K ( x, y) for any x, y ∈ g it suffices to know the
Killing forms of all the pairs of elements in some basis. The Killing forms for the
basis elements of sl (3, C ) given in Table 5.1 are listed in Table 5.2.
The Killing form on sl (3, C ) has the following properties:
(i) ( h : zα ) = 0 for any h ∈ h and any zα ∈ (g − h ).
(ii) ( zα : z β ) = 0, unless α + β = 0, with zα , z β ∈ (g − h).
(iii) The determinant of the matrix [ κij ] is non-zero: the Killing form on sl (3, C )
is non-degenerate, as is expected from Cartan’s criterion applied to a simple Lie
algebra. This also holds true for the restriction of K ( x, y) to CSA h.
(iv) K ( x, y) = Tr (adx · ady ) = 6 Tr ( xy ) for all x, y ∈ sl (3, C ).
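The recipe above can be programmed directly: build the 8 × 8 matrices ad x from the structure constants and take traces. The following sketch (ours) reproduces several entries of Table 5.2 and checks property (iv) on a basis pair:

```python
def E(i, j):
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(3)]
            for r in range(3)]

def sub(a, b): return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]

def mul(a, b, n=3):
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def brk(a, b):
    return sub(mul(a, b), mul(b, a))

# Basis order: t1, t2, e1, f1, e2, f2, e3, f3.
basis = [sub(E(1, 1), E(2, 2)), sub(E(2, 2), E(3, 3)),
         E(1, 2), E(2, 1), E(2, 3), E(3, 2), E(1, 3), E(3, 1)]

def coords(m):
    """Coordinates of a traceless 3x3 matrix in the basis above; the diagonal
    part diag(d1, d2, d3) expands as d1*t1 + (-d3)*t2."""
    return [m[0][0], -m[2][2], m[0][1], m[1][0], m[1][2], m[2][1], m[0][2], m[2][0]]

def ad(x):
    cols = [coords(brk(x, b)) for b in basis]
    return [[cols[j][i] for j in range(8)] for i in range(8)]   # transpose

def K(x, y):
    """Killing form K(x, y) = Tr(ad x . ad y)."""
    return sum(mul(ad(x), ad(y), 8)[i][i] for i in range(8))

t1, t2, e1, f1 = basis[0], basis[1], basis[2], basis[3]
assert K(t1, t1) == 12 and K(t2, t2) == 12 and K(t1, t2) == -6
assert K(e1, f1) == 6 and K(e1, e1) == 0
# Property (iv): K(x, y) = 6 Tr(xy); check it on the pair (e1, f1):
assert K(e1, f1) == 6 * sum(mul(e1, f1)[i][i] for i in range(3))
```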
Correspondence between h∗ and h. As the Killing form on h is non-degenerate,
we can establish a one-to-one correspondence between h and h∗: Let ϱ(h) be an
element of h∗, a linear functional of h ∈ h. Then there exists a unique element
hϱ ∈ h defined such that the Killing form of hϱ with h is precisely ϱ(h):

    (hϱ : h) ≡ ϱ(h) (by definition).    (5.12)

This formula determines ϱ from any given hϱ. Inversely, given a root ϱ, how
do we find the element hϱ in h? First, consider the simple roots. Let h = at1 + bt2
The Killing forms on sl(3, C).


t1 t2 e1 f1 e2 f2 e3 f3
t1 12 −6 0 0 0 0 0 0
t2 −6 12 0 0 0 0 0 0
e1 0 0 0 6 0 0 0 0
f1 0 0 6 0 0 0 0 0
e2 0 0 0 0 0 6 0 0
f2 0 0 0 0 6 0 0 0
e3 0 0 0 0 0 0 0 6
f3 0 0 0 0 0 0 6 0

Table 5.2: The Killing forms on sl (3, C ). Each entry gives K ( a, b) ≡ ( a : b ), with
row label a and column label b.

be an arbitrary element of h; then from (5.9) the simple roots as linear functions
of a and b are given by
α(1) ( at1 + bt2) = 2a − b, α(2) ( at1 + bt2) = − a + 2b . (5.13)
As the corresponding elements hα(i) live on the CSA h, they must have the form
hα(i) = ci t1 + di t2 and, by definition, must satisfy (5.12):
2a − b = ( hα(1) : at1 + bt2)
= c1 a ( t1 : t1 ) + d1 b ( t2 : t2 ) + ( c1b + d1 a )(t1 : t2)
= (12c1 − 6d1) a + (12d1 − 6c1) b ;
− a + 2b = ( hα(2) : at1 + bt2)
= (12c2 − 6d2) a + (12d2 − 6c2) b ,
where we have used the values of the Killing forms ( t1 : t1 ) = ( t2 : t2 ) = 12 and
(t1 : t2) = −6. The unique solution to these equations for the ci and di, with arbitrary
a, b, is c1 = 1/6, d1 = 0, and c2 = 0, d2 = 1/6. (Note its independence from a, b,
and h.) The result gives the correspondence between two bases of h:
1 1
hα(1) =
t , hα(2) = t2 . (5.14)
6 1 6
The vectors hϱ corresponding to any other root ϱ ∈ ∆ can be obtained by
using linearity, namely, h−α(i) = −hα(i) and hα(1)+α(2) = hα(1) + hα(2). What we
have obtained is a new basis for the CSA h; for example,

    ad hα(1) · e1 = [hα(1), e1] = α(1)(hα(1)) e1 ,
    ad hα(2) · e2 = [hα(2), e2] = α(2)(hα(2)) e2 .    (5.15)

It follows that

    α(1)(hα(1)) = α(2)(hα(2)) = α(3)(hα(3)) = 1/3 ,
    α(1)(hα(3)) = α(2)(hα(3)) = −α(1)(hα(2)) = 1/6 .
And from (5.12) we must have the bilinear forms

( hα(i) : hα(i) ) = 1/3, ( i = 1, 2, 3),


( hα(1) : hα(3) ) = ( hα(2) : hα(3) ) = −( hα(1) : hα(2) ) = 1/6,

which, of course, can also be obtained from (5.14) and Table 5.2.
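As a quick numerical sanity check (a sketch, not part of the text's derivation; the 3 × 3 basis matrices below are the standard choice assumed from Table 5.1), one can compute the Killing form K(a, b) = tr(ad a ad b) directly and recover the entries of Table 5.2 as well as (5.12) for hα(1) = t1/6:

```python
import numpy as np

# Basis of sl(3,C) assumed from Table 5.1:
# t1 = diag(1,-1,0), t2 = diag(0,1,-1), and the elementary e_i, f_i.
def E(i, j):
    m = np.zeros((3, 3)); m[i - 1, j - 1] = 1.0; return m

t1, t2 = np.diag([1., -1., 0.]), np.diag([0., 1., -1.])
basis = [t1, t2, E(1,2), E(2,1), E(2,3), E(3,2), E(1,3), E(3,1)]

flat = np.stack([b.flatten() for b in basis]).T   # 9x8: columns = basis vectors

def ad(x):
    # Matrix of ad_x in the chosen basis; brackets stay inside sl(3),
    # so the least-squares solve is exact.
    cols = [np.linalg.lstsq(flat, (x @ b - b @ x).flatten(), rcond=None)[0]
            for b in basis]
    return np.stack(cols).T

def killing(a, b):                                # K(a,b) = tr(ad_a ad_b)
    return np.trace(ad(a) @ ad(b))

print(killing(t1, t1), killing(t1, t2), killing(E(1,2), E(2,1)))  # ~ 12, -6, 6

# (5.12) with h_alpha(1) = t1/6: (h_alpha(1) : h) = alpha(1)(h) = 2a - b
a, b = 2.0, 5.0
print(killing(t1 / 6, a * t1 + b * t2))           # ~ 2a - b = -1
```

The same routine reproduces every entry of Table 5.2, since all off-diagonal pairings other than (eα : e−α) vanish.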
Inner product on the real root space h∗0 . The Killing form is defined on h. We now
want to have a similar bilinear form on h∗ . Let h∗0 be the subspace of h∗ spanned
by the simple roots. Take any two vectors ϱ, σ of h∗0 and define an inner product of
ϱ, σ via the bilinear form of the corresponding elements hϱ , hσ :

⟨ ϱ, σ ⟩ def= ( hϱ : hσ ) . (5.16)

From the definition (5.12) of ( : ), we see that ⟨ , ⟩ has the property

⟨ ϱ, σ ⟩ = ϱ ( hσ ) = σ ( hϱ ) . (5.17)

⟨ ϱ, σ ⟩ is an inner product (a non-degenerate, symmetric bilinear form) on the real
(root) vector space h∗0 . In particular, ϱ ( hϱ ) = ⟨ ϱ, ϱ ⟩ = ( hϱ : hϱ ). See [Ja] Chap. IV.
Geometric interpretation of the roots. It follows from the above paragraphs that
the inner products among the roots of sl (3, C ) are given by

⟨ α(1) , α(1) ⟩ = ⟨ α(2) , α(2) ⟩ = ⟨ α(3) , α(3) ⟩ = 1/3 , (5.18)
−⟨ α(1) , α(2) ⟩ = ⟨ α(1) , α(3) ⟩ = ⟨ α(2) , α(3) ⟩ = 1/6 . (5.19)

(N.B.: the shorthand notations | α |2 ≡ ⟨ α, α ⟩ and α · β = ⟨ α, β ⟩ are often used.)


These scalar products suggest that the roots of sl (3, C ) can be viewed as vectors
in the real inner-product vector space h∗0 , each of length 1/√3 and separated from
the next adjacent root vector by a 60◦ angle. This interpretation is illustrated by
a root diagram (Fig. 5.1) representing the configuration of the roots of sl (3, C ) in
h∗0 . It is a very useful tool for doing computations and extracting information.
When the algebra is decomposed into s1 = ⟨ h1 , e1 , f 1 ⟩ and s2 = ⟨ h2 , e2 , f 2 ⟩, the
roots α = ( m1 , m2 ) are simultaneously a representation of s1 with weight m1 and
a representation of s2 with weight m2 , as shown in Fig. 5.2 (a)–(b). As e2 and e3
span a subrepresentation under s1 , we have ad f 1 · e3 ↦ e2 ; similarly, as f 2
and f 3 span another subrepresentation, ad e1 · f 3 ↦ f 2 . Considering in the same
way subrepresentations under s2 , we have ad e2 · f 3 ↦ f 1 and ad f 2 · e3 ↦ e1 .
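The numbers in (5.18)–(5.19), and the stated root geometry, can be reproduced with a few lines of arithmetic (a sketch: the Gram matrix below simply encodes the Killing-form values of Table 5.2 restricted to h, and the hα(i) coordinates come from (5.14)):

```python
import math
import numpy as np

# Killing form restricted to h in the basis {t1, t2} (Table 5.2):
K = np.array([[12., -6.], [-6., 12.]])

# Coordinates of h_alpha(i) in {t1, t2}, from (5.14) and linearity.
h = {1: np.array([1/6, 0.]), 2: np.array([0., 1/6]), 3: np.array([1/6, 1/6])}

def ip(i, j):              # <alpha(i), alpha(j)> = (h_alpha(i) : h_alpha(j))
    return h[i] @ K @ h[j]

print(ip(1, 1), ip(2, 2), ip(3, 3))      # all 1/3, eq. (5.18)
print(ip(1, 3), ip(2, 3), -ip(1, 2))     # all 1/6, eq. (5.19)

# Each root has length 1/sqrt(3); adjacent roots are 60 degrees apart:
length = math.sqrt(ip(1, 1))
angle = math.degrees(math.acos(ip(1, 3) / ip(1, 1)))
print(length, angle)                     # ~ 0.577, ~ 60.0
```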
C OMMENTS . In physics it is customary to choose a basis for the CSA with physi-
cal significance in mind, unlike what is being done here, which makes graphical
representations shown here look somewhat unfamiliar to physicists; see [Ge] and
Problem 2 for a physicist-friendly choice.
A New Basis for sl (3, C ). It is clear that the concept of ‘root’ plays a fundamental
role in characterizing a Lie algebra. For this reason, and in order to have a frame-
work that can be generalized to larger Lie algebras, we adopt a new basis for

Figure 5.1: The root system ∆ plotted on sl(3, C)’s root space, showing the six root vectors ±α(1) , ±α(2) , ±α(3) .

Figure 5.2: Root system of sl(3, C) as representation of (a) s1 , and (b) s2 .

sl (3, C ) consisting of any two independent elements h 1, h2 of h and six elements


eα , with α ∈ ∆ (replacing the previous ei and f i), defined to obey the equations

[ h1, h2 ] = 0, and [ h, eα] = α ( h ) eα , (5.20)

where h is an arbitrary element of h. In addition, we need to determine [ eα , e β ],


with α, β ∈ ∆, to be consistent with (5.20). From the Jacobi identity, it follows

[ h, [ eα , e β ]] = [ eα , [ h, e β]] + [[ h, eα], e β ]
= [ eα , β ( h )eβ] + [ α (h)eα, e β ]
= ( α ( h) + β ( h )) [eα, e β ] . (5.21)

The condition that emerges, namely, adh [ eα, e β ] = ( α ( h) + β ( h )) [eα, e β ] for any
element h of h, can be satisfied in one of two ways:
(i) [ eα , e β ] is zero, but α + β ≠ 0;
(ii) [ eα , e β ] is a nonzero root-vector with
(a) α + β ≠ 0, or (b) α + β = 0.
Case (i) describes the situation where α + β is not a root, and so [ eα , e β ] = 0.
In case (iia) α + β is a root in ∆, and [ eα, e β ] must be proportional to eα+ β , with
a multiplying constant, Nαβ , yet to be determined. In other words, [ eα , e β ] is again
an eigenvector for h, with eigenvalue α + β.

Finally, in case (iib), β = − α, and [ eα , e β ] = [ eα , e−α ] is a nonzero element. As


[ h, [ eα , e β ]] = adh [ eα, e β ] = 0, that element must be in h. We have seen on p. 160
that the Killing form has the property that ( eα : e β ) vanishes unless α + β = 0.
Now, as the Killing form is an invariant trace form, i.e. ([ a, b] : c ) = ( a : [ b, c ]), we
have

([ eα, e−α ] : h ) = ( eα : [ e−α , h ])


= α ( h)(eα : e−α ) ,

which becomes, with α ( h ) = ( h α : h ) from the definition (5.12),

([ eα, e−α ] : h ) = ( hα : h )( eα : e−α ).


Since h is an arbitrary element of h, it immediately follows that

[ e α , e −α ] = ( e α : e −α ) h α . (5.22)

Thus, (iia)–(iib) are ad-maps of the form adgα : gβ → gα+ β (including g0 = h ):


the action of adgα preserves the decomposition (5.11), carrying each root space
gβ into another, or to zero. This translation on g can be visualized as in Fig.5.3.
The ‘translation’ may go in any allowed ‘direction’ and ‘sense’ α ∈ ∆ in the two-
dimensional space h0; this is to be contrasted with the one-dimensional geometry
characteristic of sl (2, C ).
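Relation (5.22) can be checked concretely in the defining 3 × 3 matrices (a sketch, not in the text; it uses (eα : e−α) = 6 from Table 5.2 and hα(1) = t1/6 from (5.14)):

```python
import numpy as np

e1 = np.zeros((3, 3)); e1[0, 1] = 1.0     # e_alpha(1) = e1
f1 = e1.T                                  # e_{-alpha(1)} = f1
t1 = np.diag([1., -1., 0.])

lhs = e1 @ f1 - f1 @ e1                    # [e_alpha, e_{-alpha}]
rhs = 6 * (t1 / 6)                         # (e_alpha : e_{-alpha}) h_alpha, eq. (5.22)
print(np.allclose(lhs, rhs))               # True
```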

Figure 5.3: The action of adgα on gβ produces gα+ β , h, or 0.

In summary, the basis { h1, h2, eα } of sl (3, C ) is defined by the relations:

[h1, h2 ] = 0, for any h1, h2 ∈ h; (5.23)


[h, eα ] = α ( h )eα, for any h ∈ h, α ∈ ∆; (5.24)
 
[ eα , e β ] = ( eα : e−α ) hα , if α + β = 0, (5.25)
= Nαβ eα+ β , if α + β ∈ ∆; Nαβ = constant (5.26)
= 0, if α + β ≠ 0 and α + β ∉ ∆ . (5.27)

To check that these relations agree with the multiplication Table 5.1, it suffices
to recall that ( eα : e−α ) = 6 for all α ∈ ∆; choose h 1 = t1 = 6hα(1) and h2 = t2 =

6hα(2) ; and recall the roots α(1) = (2, − 1), α(2) = (− 1, 2), and α(3) = (1, 1) written
in the basis { α (h 1), α (h2)} of h∗ . Taking ei and e−i (with the subscripts i standing
for α(i) ) to be ei and f i respectively, we find the nonzero structure constants:

N1,2 = − N−1,−2 = 1,
N−1,3 = − N1,−3 = 1,
N2,−3 = − N−2,3 = 1 .

We note that these Nα,β ’s satisfy the following identities:

Nα,β = − N−α,− β = − Nβ,α ,


= Nβ,−α− β = N−α− β,α ; (5.28)

which in fact turn out to be true in other Lie algebras as well (cf. Problem 5.4).
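The values of the Nα,β and the sign patterns in (5.28) can be verified by brute force in the defining representation (a sketch, with the matrix conventions of Table 5.1 assumed):

```python
import numpy as np

def E(i, j):
    m = np.zeros((3, 3)); m[i - 1, j - 1] = 1.0; return m

# e_{+-alpha(i)} in the defining representation
e = {1: E(1, 2), -1: E(2, 1),
     2: E(2, 3), -2: E(3, 2),
     3: E(1, 3), -3: E(3, 1)}

def brk(a, b):
    return e[a] @ e[b] - e[b] @ e[a]

# [e_alpha, e_beta] = N_{alpha,beta} e_{alpha+beta}
assert np.allclose(brk(1, 2), e[3])          # N_{1,2}   = +1
assert np.allclose(brk(-1, -2), -e[-3])      # N_{-1,-2} = -1
assert np.allclose(brk(-1, 3), e[2])         # N_{-1,3}  = +1
assert np.allclose(brk(2, -3), e[-1])        # N_{2,-3}  = +1
assert np.allclose(brk(2, 1), -e[3])         # N_{beta,alpha} = -N_{alpha,beta}
print("N's consistent with (5.28)")
```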
Standard normalization. Of the admissible bases, those associated with a fixed
(positive) root α play a special role as they span a three-dimensional subalgebra.
Let us choose them with the following normalization:

ẽα = eα ,
f̃ α = [2/α( hα )( eα : e−α )] e−α , (5.29)
h̃α = [2/α( hα )] hα .

Then (5.24) and (5.25) (with h set to be h̃α ) assume the canonical form (5.1):

[ h̃α , ẽα ] = 2ẽα ,
[ h̃α , f̃ α ] = − 2 f̃ α , (5.30)
[ ẽα , f̃ α ] = h̃α .

Hence ⟨ h̃α , ẽα , f̃ α ⟩ is isomorphic to sl (2, C ). Noting that h̃α (also called a co-root)
differs from hα by a constant, one has from (5.12) α ( h̃α ) = 2.
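For example, with α = α(3) the normalization (5.29) gives h̃α = 6hα(3) = t1 + t2 and f̃α = e−α (since α(hα) = 1/3 and (eα : e−α) = 6), and the canonical relations (5.30) then hold as plain matrix identities (a quick numerical check, not in the text):

```python
import numpy as np

e3 = np.zeros((3, 3)); e3[0, 2] = 1.0        # e~_alpha(3) = e_alpha(3)
f3 = e3.T                                     # f~_alpha(3) = e_{-alpha(3)}
h3 = np.diag([1., 0., -1.])                   # h~_alpha(3) = t1 + t2

assert np.allclose(h3 @ e3 - e3 @ h3, 2 * e3)    # [h~, e~] =  2 e~
assert np.allclose(h3 @ f3 - f3 @ h3, -2 * f3)   # [h~, f~] = -2 f~
assert np.allclose(e3 @ f3 - f3 @ e3, h3)        # [e~, f~] =  h~
print("canonical sl(2,C) triple for alpha(3)")
```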

5.2 Representations
Quick review. Many of the general remarks we made in the last chapter about the
representations of SU (2) and of its associated algebras apply to our present case
as well. In particular, just as the group SU (2) is connected and simply connected,
so too is the group SU (3), and so there is a one-to-one correspondence between
the finite-dimensional representations of the Lie group SU (3) and those of the Lie
algebra su (3). On the other hand, as the complex representations of the real Lie
algebra su (3) are in one-to-one correspondence with the complex representations
of the complexified algebra su (3)C , and as su (3)C ≅ sl (3, C ), there must also be a
one-to-one correspondence between the finite-dimensional complex representations Π of
SU (3) and the finite-dimensional complex representations π of sl (3, C ), which is ex-
pressed by Π (ex) = eπ(x) for all x ∈ sl (3, C ). The representation Π is irreducible

if and only if π is irreducible. (To say ‘π is irreducible’ is to say that V contains no


non-trivial subspace invariant under the algebra; i.e. it is the smallest g-invariant
space.) Since SU (3) is compact, all the finite-dimensional representations of SU (3)
are completely reducible, and so by virtue of the one-to-one correspondence be-
tween Π of SU (3) and π of sl(3, C ) mentioned above, all the representations of
sl (3, C ) are completely reducible provided they are finite-dimensional and defined over
C. For this reason, we will study only the finite-dimensional complex represen-
tations of the algebra sl (3, C ).
The basis we will be using, { h 1, h2, e1, f 1, e2, f 2, e3 , f 3 } , is precisely the same as
defined in Table 5.1 and consistent with the normalization in (5.30) provided we
make the identification hi = ti = h̃α(i) with i = 1, 2, and e j = ẽα(j) , f j = f̃ α(j) with
j = 1, 2, 3. In addition, we will use Hi, Zj , . . . to denote π ( hi), π (z j), . . . .
Weights and weight vectors. The key to understanding the representation theory
of sl (3, C ) lies in a simultaneous diagonalization of π ( h 1) = H1 and π ( h2) = H2
and an identification of all possible representation spaces V . The basic eigenvalue
equation is Hv = µ ( h )v for any h ∈ h and v ∈ V . In particular, for the basis
elements h1 = t1 and h2 = t2, this equation becomes

H1 v = µ ( h1) v, H2 v = µ ( h2) v (5.31)

(and of course v exists over C). For an arbitrary element h = ah 1 + bh2 of h, an


eigenvalue for H, or weight of π, is a linear function of h: µ ( h) = am1 + bm2,
with m1 ≡ µ ( h1) and m2 ≡ µ ( h2). A more explicit notation for v would be v m1 ,m2 .
As s1 ≡ ⟨ H1 , E1 , F1 ⟩ and s2 ≡ ⟨ H2 , E2 , F2 ⟩ are three-dimensional subalgebras
isomorphic to (5.1) (by Hi 7 → h, Ei 7 → e and Fi 7 → f ), we know that the eigenvalues
m1 and m2 are integral numbers in any π (see Sec. 5.1).
Given a joint eigenvector v for H1 and H2, we can generate all other vectors of
the representation π by applying (the ladder operators) Ei , Fi , Ei Fj , . . . on v. For
this purpose, we will find useful the following
Lemma. Let π be a representation of sl (3, C ) on a complex vector space V ; and v ∈ V a
non-zero weight vector with a weight µ. Let π ( h) = H be an element of π (h ), α a root
relative to h, and Zα ∈ gα a corresponding non-zero root vector. Then

HZα v = ( µ ( h ) + α ( h )) Zα v . (5.32)

Either Zα v = 0, or else Zα v is a weight vector with weight µ ( h) + α ( h ).


The proof is simple: In representation π, the bracket relation [ h, zα ] = α ( h)zα
becomes [ H, Zα ] = α ( h )Zα. It follows that

HZα v = ( Zα H + [ H, Zα ]) v
= (µ ( h) + α ( h )) Zα v .

So, the roots act on the weights by translation: If v µ ∈ Vµ , then Zα vµ ∈ Vµ+α ;


Zα : Vµ → Vµ+α .
In particular, take H as H1 or H2, and Zα as any one of Ei, Fi (with i = 1, 2, 3).

Then, for example with Zα = E1, E2, we have from (5.32)

Hi E1 v = ( µ ( hi ) + α(1) ( hi )) E1 v ,
Hi E2 v = ( µ ( hi ) + α(2) ( hi )) E2 v .


We already know that µ ( h 1) = m1 and µ ( h2) = m2 are integers (or zeros) and
the simple roots relative to h1 and h2 are given by ( α(1) ( h1 ), α(1) ( h2 )) = (2, − 1)
and ( α(2) ( h1 ), α(2) ( h2 )) = (− 1, 2). So, with the weight of v given by ( m1 , m2 ),

these equations tell us that E1 v has the weight ( m1 + 2, m2 − 1), and E2 v has the
weight ( m1 − 1, m2 + 2). The ladder operators Ei ’s flip the parity of the weight
vector, from (− 1)m1+m2 for v to (− 1)m1+m2 +1 for Ei v.
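The weight shifts just described can be watched directly in the defining representation (a sketch; H1, H2 are the diagonal matrices t1, t2):

```python
import numpy as np

H1, H2 = np.diag([1., -1., 0.]), np.diag([0., 1., -1.])
E1 = np.zeros((3, 3)); E1[0, 1] = 1.0      # raising operator for alpha(1) = (2,-1)

v = np.array([0., 1., 0.])                 # xi_2, weight (-1, 1)
w = E1 @ v                                 # xi_1
print(H1 @ v, H2 @ v)                      # -v, +v  -> weight (-1, 1)
print(H1 @ w, H2 @ w)                      # +w, 0   -> weight (1, 0) = (-1,1) + (2,-1)
```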
Ordering the weights. Just as for sl (2, C ), the irreducible finite-dimensional rep-
resentations of sl (3, C ) are identified by their highest weights. The additional
complication here is that CSA h is two-dimensional, and so one must introduce
some convention for ordering the weights:

Definition 5.3 (Ordering of the weights; highest weight). Let λ and µ be any two
weights of a representation π of sl (3, C ). Then λ is said to be higher than µ if λ − µ can
be written as a positive root of sl (3, C ): λ − µ ∈ ∆+ . This relationship will be indicated
by λ  µ. A weight λ of π is said to be a highest weight if λ  µ for all other weights
µ of π. Similarly, λ is a lowest weight if µ  λ for all µ of π.

The positive roots of sl (3, C ) relative to the CSA h h 1, h2 i are α(1) = (2, − 1),
α (2)= (− 1, 2), and α(3) = (1, 1). The condition that λ − µ = aα(1) + bα(2) be
positive means a ≥ 0 and b ≥ 0. In particular, α(3) , considered a weight, is higher
than both α(1) and α(2) .
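The ordering test of Definition 5.3 amounts to solving a 2 × 2 linear system for the coefficients a, b (a sketch, in the (µ(h1), µ(h2)) coordinates used throughout):

```python
import numpy as np

# columns: alpha(1) = (2,-1), alpha(2) = (-1,2)
A = np.array([[2., -1.], [-1., 2.]])

def higher(lam, mu):
    # lam is higher than mu iff lam - mu = a*alpha(1) + b*alpha(2), a, b >= 0
    a, b = np.linalg.solve(A, np.array(lam, float) - np.array(mu, float))
    return a >= -1e-12 and b >= -1e-12

print(higher((1, 1), (2, -1)))   # alpha(3) higher than alpha(1): True
print(higher((1, 1), (-1, 2)))   # alpha(3) higher than alpha(2): True
print(higher((1, 0), (-1, 1)))   # eps1 higher than eps2: True
```

Note that the ordering is only partial: for example, neither of α(1), α(2) is higher than the other.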
If µ⋆ is a highest weight of π, then µ⋆ + α is not a weight for any α ∈ ∆+ .
On the other hand, a weight µ of π must be of the form µ⋆ − kα, with k ≥ 0; all
weights lie in a sequence, or string, µ⋆ , µ⋆ − α, . . . , µ⋆ − nα (n ≥ 0), spaced by
some α ∈ ∆+ . Since π is assumed finite-dimensional, the string is finite and n
must be a finite non-negative integer. Several steps are needed that lead us to the
main theorem concerning the simple representations of sl (3, C ):
1. Let us recall our normalization factors for si < g:

[ H, Ei ] = α(i) ( h ) Ei, [ H, Fi ] = − α(i) ( h ) Fi, [ Ei, Fi ] = Hi . (5.33)

Now, consider a finite representation ( π, V ) of g = sl (3, C ), and let v 0 be a non-


zero weight vector of weight µ⋆ (relative to ⟨ h1 , h2 ⟩, as always) defined by

Hi v0 = µ⋆ ( hi ) v0 , Ei v0 = 0, ( i = 1 and 2). (5.34)

For each α(i) , consider the sequence of vectors v0, v1, . . . , vni generated from v0
according to vk = Fi vk−1 = ( Fi )k v0 with k = 1, 2, . . . , ni ; finiteness of the rep-
resentation requires Fi vni = vni +1 = 0. Each vk is an eigenvector of Hi with
eigenvalue µk = µ⋆ ( hi ) − kα(i) ( hi ) where 0 ≤ k ≤ ni .

The vector Ei vk = Ei Fi vk−1 is proportional to vk−1 , and the proportionality
constant, denoted tk−1 ∈ R, can be calculated by applying Ei Fi = Hi + Fi Ei repeatedly,
and noting Ei v0 = 0 in the last step. The result is:
tk = ∑_{s=0}^{k} µs ( hi ) = ( k + 1) [ µ⋆ ( hi ) − k α(i) ( hi ) /2 ] . (5.35)

As Ei vni +1 = tni vni vanishes because vni +1 = 0, it follows from vni ≠ 0 that
tni = 0, which implies ni = 2 µ⋆ ( hi )/α(i) ( hi ). As α(i) ( hi ) = 2 for both i = 1, 2, we
have ni = µ⋆ ( hi ), and so µ⋆ = ( n1 , n2 ) is a pair of non-negative integers. In each
i-direction on h∗ , the vectors v0 , Fi v0 , . . . , ( Fi )ni v0 , generated from v0 and attached
to the eigenvalues ni , ni − 2, . . . , − ni , define an (ni + 1)-dimensional irreducible rep-
resentation of si , a subrepresentation of g.
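The closed form (5.35) and the conclusion tni = 0 exactly at ni = µ⋆(hi) can be checked with exact integer arithmetic (a sketch, using α(i)(hi) = 2):

```python
def t_sum(k, mu):
    # t_k = sum_{s=0}^{k} mu_s(h_i), with mu_s(h_i) = mu*(h_i) - 2s
    return sum(mu - 2 * s for s in range(k + 1))

def t_closed(k, mu):
    # (k+1) [ mu*(h_i) - k alpha(i)(h_i)/2 ] with alpha(i)(h_i) = 2
    return (k + 1) * (mu - k)

for mu in range(8):
    assert all(t_sum(k, mu) == t_closed(k, mu) for k in range(12))
    assert t_closed(mu, mu) == 0      # the string terminates at n_i = mu*(h_i)
print("eq. (5.35) verified")
```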
2. Now let µ be any weight of π, and α any positive root of the algebra, and
p, q be non-negative integers. Let Sα ( µ ): µ + pα, . . . , µ, . . . , µ − qα be a string of
weights such that µ + ( p + 1) α and µ − ( q + 1) α are not weights. (p and q are
then the numbers of times the root α can be added to and subtracted from the
weight µ while still remaining a weight of the representation.) The corresponding
weight vectors will be built by applying ( Eα )k and ( Fα )k on the weight vector v
of weight µ. Taking α to be a simple root α(i) , we identify the highest weight with
µ⋆ ( hi ) = µ ( hi ) + pi α(i) ( hi ), and the corresponding lowest weight with − µ⋆ ( hi ) =
µ ( hi ) − qi α(i) ( hi ). It then follows that

qi + pi = 2 µ⋆ ( hi ) / α(i) ( hi ) = µ⋆ ( hi ) , qi − pi = 2 µ ( hi ) / α(i) ( hi ) = µ ( hi ) , (5.36)

which fixes the numbers pi = [ µ⋆ ( hi ) − µ ( hi )] /2 and qi = [ µ⋆ ( hi ) + µ ( hi )] /2.
These are the constraints on the string of weights based on µ ( hi ) and of length
qi + pi + 1 in the representation admitting µ⋆ ( hi ) as its highest weight:

Weights : µ + pi α(i) , . . . , µ + α(i) , µ, µ − α(i) , . . . , µ − qi α(i)


Weight vectors : ( Ei) pi v, . . . . . . , Ei v, v, Fi v, . . . , ( Fi )qi v .

3. The numbers ni , qi + pi and qi − pi are given by ratios of linear functions of


the same argument, hi , and so will be unchanged if it is multiplied by a constant,
and in particular if h i is replaced everywhere by hα(i) . Thus, for example, qi − pi
is also given by
qi − pi = 2 µ ( hα(i) ) / α(i) ( hα(i) ) = 2 ⟨ µ, α(i) ⟩ / ⟨ α(i) , α(i) ⟩ ,
where scalar products ⟨ , ⟩ on h∗0 are defined via Killing forms as in (5.16).
Then, we are led to the following important theorem which gives the complete
classification of the irreducible representations of sl (3, C ):
Theorem 5.1 (Irreducible representation of sl(3, C )). 1. Every irreducible complex
finite-dimensional representation π of sl (3, C ) on a complex vector space V is the direct

sum of its weight spaces, V = ⊕ Vµ, where each Vµ is a joint eigenspace for π ( h1) and
π ( h2), with h1 and h2 being the chosen basis for CSA h.
2. Every irreducible representation of sl(3, C ) has a unique highest weight, say λ,
of the form λ = ( n1, n2 ) with non-negative integers n1, n2 relative to the chosen CSA
{ h1, h2 } . The representation is then denoted by πn1 ,n2 or V (n1 ,n2 ) .
3. Conversely, every pair of non-negative integers, n1 and n2, determines an irre-
ducible representation πλ admitting λ = ( n1, n2 ) as its highest weight. Two irreducible
representations of sl (3, C ) with the same highest weight are equivalent, that is to say, πλ
is uniquely determined by ( n1, n2 ).

Based on this theorem, the construction of the irreducible finite-dimensional


representation V (λ) with highest weight λ = ( n1, n2 ) proceeds as follows: Given
the canonical operators Hi, Ei , Fi as already defined,
(1) Find a simultaneous eigenvector v of H1 and H2 with eigenvalues n1, n2,
such that E1 v = 0, E2v = 0. This v is the highest-weight vector.
(2) Construct all vectors w = Fi1 Fi2 · · · Fik v with each i j ∈ {1, 2} and k ≥ 1. Each
such vector, if not 0, is a weight vector of π with weight λ − α(i1 ) − · · · − α(ik ) . Only
a finite number of these vectors are non-zero. Together with v, they form a sub-
space W ⊂ V (λ) , which is invariant under H1, H2, E1, E2, E3 , F1, F2, F3.
(3) The smallest W is the irreducible representation space V (λ) itself.
E XAMPLE 1: Representation π1,0 ≅ C3 . Apart from the trivial representation π0,0 ,
the simplest are those having (1, 0) and (0, 1) as the highest weights, obtained
when the algebra acts on C3 . The representatives of the algebra in the standard
(or defining) representation (1, 0) ≅ C3 have already been given (p. 157), written
in the canonical basis:
     
ξ 1 = (1, 0, 0)T , ξ 2 = (0, 1, 0)T , ξ 3 = (0, 0, 1)T .

These vectors are the weight vectors of weights ε 1 = (1, 0), ε 2 = (− 1, 1), and
ε 3 = (0, − 1) relative to the CSA h = ⟨ h1 , h2 ⟩ (read off from the diagonal matrices
T1, T2 on p. 157). Since Ei ξ 1 = 0 for i = 1, 2, 3, ξ 1 is the highest-weight vector,
with weight ε 1 = (1, 0). We can also check that ε 1 − ε 2 = α(1); ε 2 − ε 3 = α(2) ;
ε 1 − ε 3 = α(1) + α(2) (with α(1) = (2, − 1) and α(2) = (− 1, 2)) which shows that
ε 1  ε 2 and ε 1  ε 3, or ε 1 is the highest weight. The standard representation is
indeed the irreducible representation π1,0. Further, since ε 2  ε 3 and ε 1  ε 3, the
lowest weight in π1,0 is ε 3.
Suppose, conversely, that we are asked to construct the representation with
the pair of integers (1, 0) ≡ µ⋆ as its highest weight. The procedure we will follow
is based on our discussion in §1 and §2 (pp. 167–168). We have, by assumption,
( H1 , H2 ) u0 = µ⋆ u0 , and E1 u0 = 0, E2 u0 = 0. The string of weights starting from
µ⋆ spaced by α(i) must end with the weight µ⋆ − ni α(i) . Since n1 = µ⋆ ( h1 ) = 1,
the weight string spaced by α(1) is µ⋆ , µ⋆ − α(1) , corresponding to the sequence
of vectors u0 , F1 u0 (which forms a doublet of s1 ); and so ( F1 )2 u0 = 0 because

Figure 5.4: Finding the weights of the fundamental representations π1,0 and π0,1 of
sl(3, C) through applications of root vectors Fi on the weight space. The weight vectors
ξ 1 , ξ 2 , ξ 3 of π1,0 have the weights ε1 , ε2 , ε3 ; and the weight vectors ξ 3 , ξ 2 , ξ 1 of π0,1 have the
weights −ε3 , −ε2 , −ε1 .

µ⋆ − 2α(1) is not a weight. On the other hand, as n2 = µ⋆ ( h2 ) = 0, the second
sequence consists only of u0 , which is equivalent to saying that µ⋆ − α(2) is not a
weight, or that F2 u0 = 0.
To continue, what strings can be built on v ≡ F1 u0 of weight µ = (− 1, 1)? We
already know that F1 v = ( F1 )2 u0 = 0, and also that E1 v = E1 F1 u0 = µ⋆ ( h1 ) u0 , and
this string leads to no new non-zero vector. Consider now the string containing
µ and spaced by α(2) , namely,

µ + pα(2), . . . , µ, . . . , µ − qα(2) ;
( E2) p v, . . . , v, . . . , ( F2)q v .

As E2 ( F1u0 ) = F1 E2 u0 = 0 and so E2 v = 0, we have p = 0. On the other hand,


q − p = µ ( h2) = 1, which means q = 1. In other words, the α(2) -string is simply
µ = (− 1, 1), µ − α(2) = (0, − 1), corresponding to the vectors F1 u0, F2 F1u0 (a
doublet of s2 ). We also learned that µ − 2α(2) is not a weight, and so ( F2)2 F1u0 = 0.
The next step is to consider the α(i) -strings based on λ = (0, − 1), with its
weight vector w = F2 F1 u0. The α(2) string need not be considered, because F2 w =
0. As for the α(1) -string, λ + pα(1), . . . , λ, . . . , λ − qα(1) , we can see that E1 w = 0
(using various commutation relations and F2 u0 = 0), which implies p = 0. In
addition, q − p = λ ( h1) = 0, which means q = 0. And so the process ends
here, leaving us with the vectors u0, F1 u0, F2 F1 u0, with respective weights (1, 0),
(− 1, 1), and (0, − 1). They can be identified with the vectors ξ 1 , ξ 2 , ξ 3. This
construction of π1,0 is illustrated in the left half of Fig.5.4.
E XAMPLE 2: Representation π0,1 ≅ C3∗ . Let µ⋆ = (0, 1), and define v0 such that
( H1 , H2 ) v0 = µ⋆ v0 , and E1 v0 = 0, E2 v0 = 0. Reasoning in the manner of §1
(p. 167), we see that n1 = µ⋆ ( h1 ) = 0 tells us that µ⋆ − α(1) is not a weight, and
n2 = 1 means µ⋆ − α(2) is a weight, but µ⋆ − 2α(2) is not. Therefore, v0 and F2 v0
are weight vectors in representation π = (0, 1), forming a doublet of s2 .
As we already know that F2 F2 v0 = 0, we need to consider only the α(1) -string
based on µ = µ⋆ − α(2) = (1, − 1), with the vector w = F2 v0 . In this sequence of
vectors, E1 w = E1 F2 v0 = F2 E1 v0 = 0, and so p = 0. On the other hand, q − p =

µ ( h1) = 1, which implies q = 1. It follows that we can build from µ = (1, − 1) the
string consisting only of µ, µ − α(1) , with the corresponding weight vectors F2 v0
and F1 F2v0 (a doublet of s1 ). As in the previous case, we can show that either F1
or F2 applied on F1 F2 v0 leads to no new vectors.
In conclusion, the irreducible representation π0,1 is spanned by v0, F2 v0, and
F1 F2 v0, with weights − ε 3, − ε 2, − ε 1; as shown in the right half of Fig.5.4.
We see that the diagrams for π1,0 and π0,1 are their mutually reflected images.
This is so because the two representations are dual (or conjugate) to each other,
with the lowest weight of one being the negative of the highest weight of the
other. (See Problem 5.9. In the case of sl (2, C ), the weights for any representation
π are symmetric about the origin, and so each π is its own dual.) The two repre-
sentations V (1,0) and V (0,1) are mapped into one another by an automorphism of
sl (3, C ), which sends each element Z to − ZT (the negative of transposed Z). This
(1,0)
relationship is indicated by the notations π1,0 = π0,1 and V = V (0,1). For each
T
Z = π ( z ), there corresponds π ( z ) = − Z in V . In particular, the elements of the
Cartan subalgebra in V̄ are
π̄ ( h1 ) = diag(− 1, 1, 0) , π̄ ( h2 ) = diag(0, − 1, 1) . (5.37)

The weight vectors are ξ 1, ξ 2 , and ξ 3 (with covariant, or lower, labels) carrying
the weights (− 1, 0), (1, − 1), and (0, 1). The highest weight is (0, 1).
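The dual construction z ↦ −zT can be checked in a line or two (a sketch): it reproduces the diagonal matrices (5.37) and hence the covariant weights just listed.

```python
import numpy as np

t1, t2 = np.diag([1., -1., 0.]), np.diag([0., 1., -1.])
pibar = lambda z: -z.T                       # the automorphism z -> -z^T

H1b, H2b = pibar(t1), pibar(t2)
weights = [(int(H1b[k, k]), int(H2b[k, k])) for k in range(3)]
print(weights)    # [(-1, 0), (1, -1), (0, 1)] -- weights of xi_1, xi_2, xi_3
```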
This method of finding the fundamental representations, by repeated applica-
tions of the lowering operators on the highest-weight vector, can be used to find
any other simple representations. Some more examples are given in Figs.5.5–5.6
and Problems 5.7–5.8. Each such representation is reducible under the action of
any subalgebra sα , the decomposition being of the form V = ⊕ Wi , where
the Wi ’s are sα -irreducible representations equivalent to complete α-strings of
weights. Another approach to finding higher-dimensional representations is to
form tensor products of lower-dimensional irreducible representations and de-
compose them (since they are fully reducible) into irreducible subspaces by the
symmetry arguments, as illustrated in the following examples. This type of con-
struction is explained in more detail in Chap. 7 and Appendix A of this book.
E XAMPLE 3: π1,0 ⊗ π1,0 , π1,0 ⊗ π0,1 , and π0,1 ⊗ π0,1 . Degree-two tensor products
of V ≡ V (1,0) and V̄ ≡ V (0,1) provide us with the simplest examples. Recall
that V (1,0) has the weight vectors ξ i with weights ε 1 = (1, 0), ε 2 = (− 1, 1), and
ε 3 = (0, − 1); and V̄ (0,1) has the vectors ξ i (lower labels) with weights − ε 1 , − ε 2 , − ε 3 .
(i) V ⊗ V contains the weights ε i + ε j , which are (2, 0), (0, 1), (1, − 1), (− 2, 2),
(− 1, 0), and (0, − 2), as well as (0, 1), (1, − 1), and (− 1, 0). It is clearly not ir-
reducible. Let ξ i ξ j be any element in V ⊗ V ; then it may be written as ξ i ξ j =
( ξ i ξ j + ξ j ξ i )/2 + ( ξ i ξ j − ξ j ξ i )/2 , and so it splits into a second-rank contravariant
tensor Sij = ξ i ξ j + ξ j ξ i and a covariant vector ηk = ekmn ξ m ξ n . Since Sij contains
the highest-weight vector ξ 1 ξ 1 with weight (2, 0), from which applications of F1
and F2 can produce the remaining five components, or weight vectors, it is just

Figure 5.5: The sextuplet representation π2,0 and its dual π0,2 . The multiplets of the subalgebras s1 and s2 can be identified from the displayed weights.

Figure 5.6: The adjoint (octet) representation π1,1 . The representation space V decomposes into ⊕4s=1 Ws , where the Ws are s1 (or s2 ) irreducible representations.

the smallest invariant subspace containing the highest-weight vector of weight


(2, 0); see Fig.5.5. Hence it is the irreducible representation V (2,0), also called
Sym2 V as a reminder that it is the symmetric part of V ⊗ V . On the other hand,
η3 , η2 , and η1 have respective weights − ε 3, − ε 2, − ε 1, and so are identified with
the irreducible representation V (0,1), also denoted ∧2 V (an exterior product) to
emphasize its antisymmetric construction. Conclusion: π1,0 ⊗ π1,0 = π2,0 ⊕ π0,1.
Similarly, V̄ ⊗ V̄ = ∧2 V̄ ⊕ Sym2 V̄ , where ∧2 V̄ ≅ V and Sym2 V̄ is a representation
spanned by vectors with weights (0, 2), (1, 0), (− 1, 1), (2, − 2), (0, − 1), and
(− 2, 0) (see Fig. 5.5). Conclusion: π0,1 ⊗ π0,1 = π0,2 ⊕ π1,0 .
(ii) V ⊗ V̄ . The nine weights obtained from ε i − ε j are: (1, 1), (− 1, 2), (2, − 1),
(− 2, 1), (1, − 2), (− 1, − 1), and (0, 0) (the last with multiplicity three). This
representation is reducible to a direct sum of irreducible representations. One of them
is the trivial representation π0,0 = C. The remainder is the subspace of V ⊗ V̄ of
traceless matrices, which is just the adjoint representation of sl (3, C ), identified
by its highest weight (1, 1); see Fig. 5.6. A typical element of V ⊗ V̄ may be written
as ξ i ξ j = δ ij S + M ij , where S = (1/3) ξ k ξ k and M ij = ξ i ξ j − δ ij S. So S ≅ π0,0 and
M ij is isomorphic to π1,1 . Conclusion: π1,0 ⊗ π0,1 = π1,1 ⊕ π0,0 .
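The weight bookkeeping of Example 3 is a small counting exercise (a sketch; weights simply add under tensor products):

```python
from collections import Counter

eps = [(1, 0), (-1, 1), (0, -1)]                  # weights of V = V^(1,0)
dual = [(-a, -b) for (a, b) in eps]               # weights of its dual

VV = Counter((a + c, b + d) for (a, b) in eps for (c, d) in eps)
VVbar = Counter((a + c, b + d) for (a, b) in eps for (c, d) in dual)

print(VV[(2, 0)], VV[(0, 1)])   # 1, 2: (0,1) occurs in both Sym^2 V and wedge^2 V
print(VVbar[(0, 0)])            # 3: the singlet plus the two Cartan states of the octet
```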
E XAMPLE 4: Symmetric powers. Higher symmetric powers of V = V (1,0) and of
V̄ = V (0,1) can be identified (see [FH] pp. 180–183). Thus, the symmetric cubes
Sym3 V = π3,0 and Sym3 V̄ = π0,3 have the weight diagrams shown in Fig. 5.7;

Figure 5.7: The representation π3,0 = 10 and its dual π0,3 = 10̄.

they all appear in triangular (rather than hexagonal) patterns, and are evidently
irreducible. This observation has a generalization: the weights of the symmetric
powers Symn V (1,0) and Symn V (0,1) all occur with multiplicity 1, and these powers
are irreducible: Symn V (1,0) = V (n,0) , and Symn V (0,1) = V (0,n) . In turn V (n,0) and V (0,m) are the
foundation for constructing arbitrary irreducible representations V (n,m) . In fact,
V (n,0) ⊗ V (0,m) is completely reducible, decomposable into a direct sum. To sum-
marize:

πn,0 = Symn π1,0, π0,m = Symm π0,1, (5.38)


πn,0 ⊗ π0,m = πn,m ⊕ πn−1,m−1 ⊕ · · · ⊕ πn−m,0 , ( n ≥ m ). (5.39)
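A dimension count confirms that the two sides of (5.39) match (a sketch, using the formula d(n1, n2) = (n1 + 1)(n2 + 1)(n1 + n2 + 2)/2 quoted in the Comments below):

```python
def dim(n1, n2):
    # dimension of the irreducible representation with highest weight (n1, n2)
    return (n1 + 1) * (n2 + 1) * (n1 + n2 + 2) // 2

def sides_match(n, m):                      # eq. (5.39), n >= m
    lhs = dim(n, 0) * dim(0, m)
    rhs = sum(dim(n - k, m - k) for k in range(m + 1))
    return lhs == rhs

print(all(sides_match(n, m) for n in range(7) for m in range(n + 1)))  # True
```

For instance, n = 2, m = 1 gives 6 × 3 = 18 = 15 + 3 on the right-hand side.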

In all examples, we observe that if representations V, W, . . . have highest-


weight vectors v, w, . . . with weights λ, µ, . . . respectively, then v ⊗ w ⊗ · · · is
the highest-weight vector in V ⊗ W ⊗ · · · with weight λ + µ + · · · . Thus, the
highest-weight vector in ⊗ n V (1,0) and in Symn V (1,0) is ξ 1n = ξ 1 ξ 1 · · · ξ 1 with
weight nε 1 = ( n, 0); the highest-weight vector in ⊗ m V (0,1) and in Symm V (0,1) is
ξ 3 m = ξ 3 · · · ξ 3 with weight − mε3 = (0, m ); and finally, the highest-weight vector
in V (n,0) ⊗ V (0,m) is ξ 1n ξ 3 m with weight nε 1 − mε 3 = ( n, m ); this vector is also
the highest-weight vector of the representation V (n,m) . One can see similarly that
the lowest-weight vector in V (n,m) is ξ 3 n ξ 1 m , which has the weight nε 3 − mε 1 =
(− m, − n). More generally, we can regard V (n,m) as the space of irreducible ten-
sors of some definite symmetry T n m with n contravariant indices and m covariant
indices, of the form ξ 1 n1 ξ 2 n2 ξ 3 n3 ξ 1 m1 ξ 2 m2 ξ 3 m3 , such that n = n1 + n2 + n3 and
m = m1 + m2 + m3 with non-negative integers ni , mi .
C OMMENTS .
(a) Dimension of irreducible representation. The dimension of the irreducible
representation of sl (3, C ) with highest weight ( n1, n2 ) is given in terms of these
non-negative integers by the formula (found later in Chapter 7 of this book):
d ( n1, n2 ) = 21 ( n1 + 1)(n2 + 1)( n1 + n2 + 2).
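A quick tabulation of d(n1, n2) (a sketch) reproduces the familiar multiplet sizes:

```python
def d(n1, n2):
    # d(n1, n2) = (n1+1)(n2+1)(n1+n2+2)/2, always an integer
    return (n1 + 1) * (n2 + 1) * (n1 + n2 + 2) // 2

table = {p: d(*p) for p in [(0,0), (1,0), (0,1), (2,0), (1,1), (3,0), (2,2)]}
print(table)
# {(0, 0): 1, (1, 0): 3, (0, 1): 3, (2, 0): 6, (1, 1): 8, (3, 0): 10, (2, 2): 27}
```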
This formula suggests an alternative notation, commonly used by physicists,
in which a representation πλ is specified by its dimension dλ , with the dual of
a representation marked by an overline (or an asterisk). (This notation is conve-
nient but ambiguous, and so must be understood in the context.) For example,
we make the identification π0,0 = 1, π1,0 = 3, π0,1 = 3. The decomposition of the

Figure 5.8: sl(3, C) in particle physics: (a) Quarks and antiquarks in the fundamental representations. (b) Pseudo-scalar mesons as quark-antiquark states in an octet representation. (c) Light baryons as tri-quark states of an octet. (d) Heavy baryons in a decuplet. Here, the Cartan basis consists of the isospin I3 = λ3 /2 and the hypercharge Y = λ8 /√3. Adapted from Elementary Particles and Their Interactions by Ho-Kim Quang and Pham Xuan-Yem, Springer (1998).

degree-two tensor products of the fundamental representations 3 and 3̄ considered above now reads in this notation as follows:

3 ⊗ 3 = 3̄ ⊕ 6,   3̄ ⊗ 3̄ = 3 ⊕ 6̄,   3 ⊗ 3̄ = 1 ⊕ 8.
Note that 6 = Sym2 3 is the symmetric part of 3 ⊗ 3. Take as another example the
order-3 tensor product of 3, which can be reduced in two successive steps to

3 ⊗ 3 ⊗ 3 = (3̄ ⊕ 6) ⊗ 3 = 1 ⊕ 8 ⊕ 8 ⊕ 10.
The singlets in 3 ⊗ 3̄ and 3 ⊗ 3 ⊗ 3, though of equal dimension, are not the same. In the first case, it is a symmetric scalar product of vectors, ξ̄_i ξ^i; in the second, it is an antisymmetric scalar product e_ijk ξ^i ξ^j ξ^k. Similarly, the octets 8 in the two tensor products are completely unrelated.
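Dimension counting provides an easy consistency check on these decompositions (a plain-Python sketch using the dimension formula quoted earlier in this chapter; the helper name is ours):

```python
# Check that dimensions balance in the sl(3,C) tensor-product
# decompositions: 3 x 3 = 3bar + 6, 3 x 3bar = 1 + 8,
# and 3 x 3 x 3 = 1 + 8 + 8 + 10.
def dim_sl3(n1, n2):
    return (n1 + 1) * (n2 + 1) * (n1 + n2 + 2) // 2

d1, d3, d3bar = dim_sl3(0, 0), dim_sl3(1, 0), dim_sl3(0, 1)
d6, d8, d10 = dim_sl3(2, 0), dim_sl3(1, 1), dim_sl3(3, 0)

assert d3 * d3 == d3bar + d6            # 9 = 3 + 6
assert d3 * d3bar == d1 + d8            # 9 = 1 + 8
assert d3 ** 3 == d1 + 2 * d8 + d10     # 27 = 1 + 8 + 8 + 10
```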
(b) Flavor and color symmetries. Most strongly-interacting particles can be re-
garded as being made up of several quarks of three kinds (or flavors) called u, d
and s. (Actually there are more flavors, but these are the lightest ones.) u, d, s provide the states of the fundamental representation 3 in flavor space. Likewise, the antiquarks ū, d̄, s̄ give the corresponding basis vectors for 3̄; see Fig. 5.8(a).
Mesons are quark-antiquark pairs, and so belong to the singlet or the octet of the direct-sum decomposition of 3 ⊗ 3̄, as shown in Fig. 5.8(b). On the other hand,
baryons (heavier particles such as the proton and neutron) are bound states of
three quarks, and therefore must belong to some irreducible representation in the
decomposition of 3 ⊗ 3 ⊗ 3, that is, 1, or either of the 8’s, or 10. It was because
of the widespread presence of the octet representations among the families of
mesons or baryons that Murray Gell-Mann called the quark-flavor SU (3) group
‘the eightfold way’. The experimental discovery of the Ω− particle, which came
after the prediction of its position as the last particle in the baryon 10 representa-
tion, marked a triumph of group theory in physics; see Fig.5.8(c)–(d).
Theoretical considerations (1965) and experimental evidence at high ener-
gies (early 1970’s) required the introduction of a new degree of freedom for the
strongly-interacting particles. The basic assumption is that this new quantum
property, called color, has precisely three possible values and that physics is invari-
ant under color changes. We can then associate with this symmetry a group, called
the color SU (3) group; and with its representations an abstract space, called color
space, consisting of a copy of Hilbert space C3 for each flavor. Saying that physics
is invariant under the action of SU (3) implies that all observed states must be
color-neutral, i.e. they can exist only as color-singlets. So, we cannot observe individual quarks because they transform nontrivially under SU(3). Nor can we ever see a particle made up of two quarks, because 3 ⊗ 3 contains no 1. But both 3 ⊗ 3̄ and 3 ⊗ 3 ⊗ 3, as the smallest possible products that contain 1 in their decomposition into direct sums of irreps, can produce observable particles. In fact, mesons (quark-antiquark systems) can exist only in the symmetric color-singlet state ξ̄_i ξ^i, and baryons (three-quark systems) in the antisymmetric color-singlet state e_ijk ξ^i ξ^j ξ^k, where ξ^i denotes a color basis state, and ξ̄_i an anti-color basis state. Of course, non-observable particles may exist in any other representations of SU(3). Thus, quarks occupy color states of the irrep 3, antiquarks occupy color states of the irrep 3̄, and gluons, which mediate the color exchange between quarks and antiquarks, exist in color-anticolor states of the adjoint representation 8.
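The statement that the baryonic singlet e_ijk ξ^i ξ^j ξ^k is color-neutral rests on the identity e_ijk U_ia U_jb U_kc = (det U) e_abc, with det U = 1 for U ∈ SU(3). A numerical sketch (numpy; the construction of a random SU(3) matrix below is our own choice, not from the text):

```python
import numpy as np

# Verify e_{ijk} U_{ia} U_{jb} U_{kc} = det(U) e_{abc} for a random SU(3) matrix.
rng = np.random.default_rng(0)

# Build a random SU(3) matrix: QR-decompose a random complex matrix,
# fix the diagonal phases (unitary), then divide out a cube root of det.
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Q, R = np.linalg.qr(A)
Q = Q @ np.diag(np.diag(R) / np.abs(np.diag(R)))   # unitary
U = Q / np.linalg.det(Q) ** (1 / 3)                # det U = 1

eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]: eps[i, j, k] = 1.0
for i, j, k in [(0, 2, 1), (2, 1, 0), (1, 0, 2)]: eps[i, j, k] = -1.0

transformed = np.einsum('ijk,ia,jb,kc->abc', eps, U, U, U)
assert np.isclose(np.linalg.det(U), 1.0)
assert np.allclose(transformed, eps)   # singlet unchanged since det U = 1
```

Since det U = 1, the epsilon tensor, and hence the baryon singlet built from it, is invariant.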

Problems
5.1 (Matrices Eij ) The matrices Eij have a single non-zero entry, equal to 1, at row
i, column j, i.e. ( Eij )ab = δia δjb . Rewrite the basic elements of sl (3, C ) in terms of
3 × 3 matrices Eij , and check their commutation relations.
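The commutator these matrices obey is the standard gl(3) one, [E_ij, E_kl] = δ_jk E_il − δ_li E_kj; a direct numerical verification (numpy sketch, ours) may help in setting up the problem:

```python
import numpy as np

# (E_ij)_ab = delta_ia delta_jb; check [E_ij, E_kl] = d_jk E_il - d_li E_kj.
def E(i, j, n=3):
    m = np.zeros((n, n)); m[i, j] = 1.0
    return m

for i in range(3):
    for j in range(3):
        for k in range(3):
            for l in range(3):
                lhs = E(i, j) @ E(k, l) - E(k, l) @ E(i, j)
                rhs = (j == k) * E(i, l) - (l == i) * E(k, j)
                assert np.allclose(lhs, rhs)
```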
5.2 (Spin and hypercharge) In physics, the CSA of su (3)C is often defined with a
different basis: tz = (1/2)t1 and y = (1/3)t1 + (2/3)t2. In the standard representation, Tz = π(tz) = (1/2)λ3 and Y = π(y) = (1/√3)λ8, where the λi are the well-known
3 × 3 Gell-Mann self-adjoint matrices. The remaining elements λi are related to
the elements used in the chapter in the following way: λ1 ± iλ2 = 2t± , λ6 ± iλ7 =
2u± , λ4 ± iλ5 = 2v±, where t+ = e1, t− = f 1, u+ = e2, u− = f 2, v+ = e3, and
v− = f 3.

(a) Calculate Tr ( hi h j ) in the two bases, { t1, t2 } and { tz, y } .


(b) Calculate the roots in the physical basis h = { tz, y } .
(c) Calculate the Killing forms ( y : y ), ( tz : tz ), and ( y : tz ).
(d) Calculate hαi, and then the inner products ⟨αi, αj⟩, with i, j = 1, 2, 3.
5.3 (Casimir operator) Find the quadratic Casimir operator in sl (3, C ), and calcu-
late its invariant value in the irreducible representation of highest weight ( n, m ).
Hint: Use the properties of the Gell-Mann matrices λi.
5.4 The structure constants Nαβ of a simple Lie algebra are defined in the equa-
tion [ eα, e β ] = Nαβ eα+ β if α + β is a non-zero root. Assuming this is the case and
going into a finite representation, prove the symmetry relations: Nαβ = − Nβα =
N− β,−α = Nβ,−α− β = N−α− β,α .
5.5 (Weight vectors) Let s = ⟨h, e, f⟩ be an sl(2, C) subalgebra of sl(3, C), and let
π(h) = H, π(e) = E, and π(f) = F. Let [H, E] = α(h)E, [H, F] = −α(h)F, and [E, F] = H; and finally Hv = µ(h)v. Prove for any non-negative integer k:
(a) [H, E^k] = kα(h)E^k and [H, F^k] = −kα(h)F^k.
(b) H E^k v = [µ(h) + kα(h)] E^k v and H F^k v = [µ(h) − kα(h)] F^k v.
(c) [E, F^k] = kF^(k−1) H − x_k α(h) F^(k−1); find x_k.
5.6 (π2,0 is group invariant) Consider the irreducible representation of highest
weight (2, 0) of sl (3, C ), to be called π2,0. Enumerate its weights and weight vec-
tors, and prove that the representation space is group invariant.
5.7 (Representation of highest weight (2, 0)) Construct the irrep of highest weight
(2, 0) of sl (3, C ) (also called a2 ). Give the matrix elements of the lowering opera-
tors F1 and F2 in a normalized basis of the representation.
5.8 (The adjoint representation (1, 1)) Construct the irrep of highest weight (1, 1)
of sl (3, C ). Find the matrix elements of F1 and F2.
5.9 (Dual representation) Let G be a Lie group, and g its Lie algebra.
(a) Consider the dual space V̄ defined wrt some bilinear form ⟨u, v⟩ for all v ∈ V and u ∈ V̄. Let Π be a representation, Π : G → GL(V), and Π̄ the dual representation, Π̄ : G → GL(V̄). Show that Π̄(g) = Π(g⁻¹)^T for all g ∈ G, where (·)^T indicates transposition.
(b) Let π̄ be the representation of g on V̄ (i.e. dual to π). What is π̄(X) for X ∈ g? Check that π̄ preserves the Lie algebra products. To relate this question to (a), assume that G is connected.
(c) Now take the case of sl(3, C). How are the weights of π and π̄ related? If (n, m) is the highest weight of the irreducible representation π, what is the highest weight of its dual π̄?
5.10 (Mesons and baryons) Let ξ^i correspond to the quarks u, d, s, and ξ̄_i to the anti-quarks ū, d̄, s̄. As in Problem 5.2, the isospin is Tz = (1/2)λ3, and the hypercharge Y = (1/√3)λ8; in addition, the (electric) charge is Q = Tz + (1/2)Y. Given that
mesons are pairs of quark-antiquark, and baryons are made up of three quarks,
find the quark content of the meson and of the baryon octets, and give the values
of Tz, Y, and Q for their members.
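As a starting point for the problem, the diagonal quantum numbers of the quark triplet (u, d, s) follow directly from the matrices quoted above (a numpy sketch, ours):

```python
import numpy as np

# Tz = (1/2) lambda_3, Y = (1/sqrt(3)) lambda_8, Q = Tz + (1/2) Y,
# evaluated on the quark basis vectors u, d, s.
lam3 = np.diag([1.0, -1.0, 0.0])
lam8 = np.diag([1.0, 1.0, -2.0]) / np.sqrt(3.0)

Tz = lam3 / 2
Y = lam8 / np.sqrt(3.0)
Q = Tz + Y / 2

assert np.allclose(np.diag(Tz), [1/2, -1/2, 0])     # isospin of u, d, s
assert np.allclose(np.diag(Y), [1/3, 1/3, -2/3])    # hypercharge
assert np.allclose(np.diag(Q), [2/3, -1/3, -1/3])   # electric charge
```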

Q. Ho-Kim. Group Theory: A Physicist’s Primer.


Chapter 6

Simple Lie Algebras: Structure

6.1 Basics
6.2 Roots and Root Spaces
6.3 String of Roots
6.4 System of Roots
6.5 Cartan Matrix
6.6 Dynkin Diagram
6.7 Classification of Simple Algebras

This chapter generalizes our previous study of the simple Lie algebras sl (2; C )
and sl (3; C ) to all other complex simple or semisimple Lie algebras. After a brief
review of some of the concepts already seen, we shall discuss the basic proper-
ties of the semisimple (and simple, the limiting case) Lie algebras (in particular
their roots and root spaces). The essence of the structure of a semisimple Lie
algebra is encapsulated in a fundamental system of roots, or equivalently a Car-
tan matrix, or a planar diagram, called the Dynkin diagram. This representation
is in fact the tool used to show that there are only a finite number of classes of
simple Lie algebras over the complex numbers, which can thus be identified and
systematically studied. The main point of this chapter is to present this elegant
and important result in mathematics. It was first obtained by Wilhelm Killing
in 1888–1890, and made more rigorous by Élie Joseph Cartan in 1894 (who also
classified semisimple real Lie algebras). Important contributions were later made
by Hermann Weyl, Eugene Dynkin, Claude Chevalley, and Jean-Pierre Serre.

6.1 Basics
To begin, at the risk of being repetitive, we will recall (Chapters 3–5) the defini-
tion of simple and semisimple Lie algebras over the complex numbers, and of their Cartan subalgebras. Next, we will discuss the roots and root spaces of complex
semisimple Lie algebras, with the aim of applying their properties to uncover


their existence and structure. For a more detailed and more rigorous treatment,
the reader is referred to [Ja] and [Sa].
Let g be a complex Lie algebra. A sub Lie algebra h of g is called an ideal in g
if it is invariant with respect to g, that is, if [ h, g] ∈ h for all h ∈ h and all g ∈ g. A
Lie algebra g is simple if dim g ≥ 2 and if it contains no proper ideal. In other words,
it is its own ideal: [g, g] = g. A Lie algebra g is semisimple if it has no nonzero abelian
ideal. A Lie algebra is semisimple if and only if it can be written as a direct sum of
simple Lie algebras. In particular if this sum reduces to a single term, then it is
simple. So, semisimple Lie algebras include every simple Lie algebra, but also
many others.
In a semisimple algebra of dim g ≥ 2, every element xi (called generator) of
a given basis { xi } of g has a nonzero commutator with some other element xj .
The structure constants then carry a lot of information, which is encapsulated in
a criterion for semisimplicity (referred to as the Cartan criterion): A Lie algebra g
is semisimple if and only if its Killing form is non-degenerate. Put another way, to show that g is semisimple it suffices to check that in a basis {xi} of g the matrix [κij] of the Killing form K(xi, xj) = κij is non-singular, i.e. that its determinant is non-zero. That is why throughout this section we need only require semisimplicity, rather than the more restrictive condition of simplicity.
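Cartan's criterion is easy to check on a small example. The sketch below (numpy; the basis ordering and helper names are ours) computes the Killing form of sl(2, C) in the basis (h, e, f) and verifies that it is non-degenerate:

```python
import numpy as np

# Killing form K(x, y) = Tr(ad x . ad y) of sl(2,C) in the basis (h, e, f).
h = np.array([[1, 0], [0, -1]], dtype=complex)
e = np.array([[0, 1], [0, 0]], dtype=complex)
f = np.array([[0, 0], [1, 0]], dtype=complex)
basis = [h, e, f]

def coords(m):
    # coordinates of a traceless 2x2 matrix m = a*h + b*e + c*f
    return np.array([m[0, 0], m[0, 1], m[1, 0]])

def ad(x):
    # matrix of ad x acting on sl(2,C); columns = coords of [x, b]
    return np.column_stack([coords(x @ b - b @ x) for b in basis])

K = np.array([[np.trace(ad(x) @ ad(y)).real for y in basis] for x in basis])
# K = [[8,0,0],[0,0,4],[0,4,0]]; det K = -128, so K is non-degenerate
assert np.isclose(np.linalg.det(K), -128.0)
```

A degenerate Killing form (det = 0) would signal a nonzero solvable ideal; here the nonzero determinant confirms semisimplicity.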
We have seen in previous chapters that sl (2; C ) has a special matrix H = 2J3,
and sl (3; C ) has two commuting matrices T1 and T2 that play a key role in their
study. Now, for a general complex semisimple Lie algebra g, we shall follow the same
strategy and seek among its elements those that can play a similar role. That is,
we want to find an abelian subalgebra h that acts diagonally on one faithful (and
so on any) representation of g and that contains the greatest possible information
about g. Such a set of elements is defined as follows:

Definition 6.1 (Cartan subalgebra). A subspace h of a complex semisimple Lie algebra


g is called a Cartan subalgebra (CSA) if
(i) h is abelian; which means [h, h′] = 0 for all h, h′ ∈ h;
(ii) h is maximal abelian; which means, any element z of the algebra g that satisfies
[ h, z ] = 0 for all h ∈ h must also be in h; and
(iii) ad h : g → g is diagonalizable for all h ∈ h.

The dimension of the Cartan subalgebra (dim h) is called the rank (`) of the
algebra, and so ` = dim h. A semisimple Lie algebra always possesses a Cartan
subalgebra; it is essentially unique in the sense that any two CSA’s are equivalent
by automorphism and so must have the same dimension. For example, in su (2)C,
the CSA could be defined by J3, or J2, or J1; and in sl (3; C ), it could be chosen to
be the pair of matrices T1 and T2, or equivalently Tz (isospin) and Y (hypercharge),
or any other pair of commuting independent matrices in sl (3; C ).
Just as all h in h commute, so do all adh, since ad[h, h′] = [adh, adh′]. Since all
adh’s commute, and each adh is diagonalizable (by construction), the adh’s are
simultaneously diagonalizable (from linear algebra). This leads us naturally to a
study of the simultaneous eigenvalue problem for all the adh’s, with h ∈ h, as a
way of classifying the remaining elements of g.

6.2 Roots and Root Spaces


In this section we will discuss in some detail the aforementioned eigenvalue
problem, the properties of the eigenvalues, or roots, and root spaces of complex
semisimple Lie algebras.

Definition 6.2 (Roots). Let g be a complex semisimple Lie algebra with Cartan subal-
gebra h; let h act on g by the adjoint representation such that

adh z = α ( h) z, h ∈ h, z ∈ g. (6.1)

α ( h) is a complex linear functional of h (so α is in h∗ , the space dual to h). The nonzero
eigenvalues in (6.1) are called the roots of g with respect to the CSA h. If α ≠ 0, the eigenvector z is called a root vector corresponding to the root α, and the space of all such
vectors for a fixed α is the root space to α, called gα .

The set of all the roots, α, β, . . . , will be denoted by ∆, and thus ∆ is a finite
subset of h∗ that does not include { 0} , although h itself is the eigenspace for the
action of adh corresponding to the eigenvalue 0. To include { 0} in our discussion,
it is useful to adopt the notations ∆0 = ∆ ∪ {0} and g0 = h. So the action of adh on g leads to a direct-sum decomposition (the Cartan decomposition):

g = h ⊕ gα ⊕ · · · ⊕ gγ = ⊕_{α∈∆0} gα.    (6.2)

The action of adh preserves each subspace in the direct sum, acting on each term
by scalar multiplication of 0 or α (by (6.1)). In addition, for any zα ∈ gα, zβ ∈ gβ with α, β ∈ ∆0, we have adh [zα, zβ] = [α(h) + β(h)] [zα, zβ], which means that

[gα, gβ] ⊂ gα+β if α + β ∈ ∆0, or [gα, gβ] = 0 if α + β ∉ ∆0.    (6.3)

Thus, adgα sends any gβ to a root space or h, or else to 0, preserving the direct-sum
decomposition: We then say that the Cartan decomposition is adg-invariant.
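For g = sl(3, C), with the CSA spanned by t1 = E11 − E22 and t2 = E22 − E33, the decomposition (6.2) can be seen directly: every off-diagonal E_ij is a simultaneous eigenvector of ad t1 and ad t2, and the eigenvalue pairs are the roots (a numpy sketch; names ours):

```python
import numpy as np

# Root-space decomposition of sl(3,C): read off the eigenvalue pair
# (alpha(t1), alpha(t2)) from [t, E_ij] = alpha(t) E_ij.
def E(i, j):
    m = np.zeros((3, 3)); m[i, j] = 1.0
    return m

t1, t2 = E(0, 0) - E(1, 1), E(1, 1) - E(2, 2)

roots = {}
for i in range(3):
    for j in range(3):
        if i == j:
            continue
        z = E(i, j)
        roots[(i, j)] = ((t1 @ z - z @ t1)[i, j], (t2 @ z - z @ t2)[i, j])

# Six roots, closed under negation, with alpha3 = alpha1 + alpha2.
a1, a2, a3 = roots[(0, 1)], roots[(1, 2)], roots[(0, 2)]
assert len(roots) == 6
assert (a1[0] + a2[0], a1[1] + a2[1]) == a3
assert all((-a, -b) in roots.values() for (a, b) in roots.values())
```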
One can give any space of linear operators a bilinear trace form as inner prod-
uct; for Lie algebras g, a convenient definition is the Killing form: K ( x, y) =
Trg (ad( x) ad( y )) for any x, y ∈ g. As before, we will also use the symbol ( x : y )
for K ( x, y). When two elements x and y of g satisfy ( x : y ) = 0, we say that they
are orthogonal to each other (relative to the Killing form) and write x ⊥ y.
Many properties of the roots of complex Lie algebras are geometric in nature
and arise from the concept of bilinear trace form K. They will be presently dis-
cussed and numbered P1, P2,. . . for reference. Cf. [Ja] Chap. IV.
P1. gα ⊥ gβ relative to the Killing form for any α, β ∈ ∆0 such that α + β ≠ 0.
This can be seen by examining the process of calculating Tr(ad(xα) ad(yβ)) through the computation of [xα, [yβ, z]] for given xα ∈ gα, yβ ∈ gβ, and any z ∈ gν, and identifying the nonzero coefficients of z in the resulting sum as the contributions to Tr(ad(xα) ad(yβ)). As [gα, [gβ, gν]] ⊂ gα+β+ν by (6.3), it is clear that no terms proportional to z can exist unless α + β + ν = ν, that is, unless α + β = 0. So (xα : yβ) = 0 if α + β ≠ 0. Three cases: (i) (h : xα) = 0 with h ∈ h, xα ∈ gα, for every α ∈ ∆ (h ⊥ gα); (ii) (xα : x′α) = 0 for any xα, x′α ∈ gα, α ∈ ∆ (the Killing form is identically 0 on each gα); and (iii) (xα : yβ) = 0 with β ≠ +α and β ≠ −α, and α, β ∈ ∆ (gα ⊥ gβ).
P2. The restriction of K to h is given by

(h : h′) = ∑_{α∈∆} dα α(h) α(h′),   h, h′ ∈ h,   dα = dim gα.    (6.4)

Since adh reduces to a scalar multiplication on gα, we have adh adh′ (zα) = α(h)α(h′) zα for any h, h′ ∈ h and each zα ∈ gα. So we get dim gα α(h)α(h′) from each root space. (It turns out, as we will see, that dim gα = 1 for all α ∈ ∆; then (h : h′) can be simply calculated if all α(h) are known.) 
So far, the results apply to any complex Lie algebra g. If we assume, in addition,
semisimplicity for g (hence a non-degenerate Killing form in g), as we will do from
now on, then more can be said about its roots.
P3. The restriction of the Killing form to h is non-degenerate.
This follows from the non-degeneracy on g, together with the orthogonality
of h to all gα where α ∈ ∆, as illustrated in the following partially filled table:

              h      ⊕α gα
K =   h       ∗      0
      ⊕α gα   0      ∗

P4. If all roots vanish on an element h ∈ h, then h = 0.


Given α(h) = 0 for all α ∈ ∆, we have (h : h′) = 0 for all h′ ∈ h by (6.4); on the other hand, by P1, h is orthogonal to all gα with α ∈ ∆. Hence (h : z) = 0 for all z ∈ g. And so h = 0 by virtue of the non-degeneracy of K on g. 
Here we can see why any CSA h of a semisimple Lie algebra must be abelian. Define k = [h, h′] for any h, h′ ∈ h. Let adh z = α(h)z and adh′ z = α(h′)z. Then, since adk = ad[h, h′] = [adh, adh′], we have α(k) = 0 for every α ∈ ∆. It follows that k = 0, or [h, h′] = 0 for all h, h′ ∈ h, and so h is abelian.
P5. If α is a root, then so is −α, a result succinctly written as ∆ = −∆ (equality of sets).
If α is a root and −α is not a root, then gα ⊥ gβ for every β ∈ ∆. Then, since gα ⊥ h also, we have gα ⊥ g, which contradicts the non-degeneracy assumption. So −α must be a root. That “g−α ≠ 0 follows from the non-degeneracy of K on g” is schematically shown in the following table:

            h    gα   g−α   gβ
      h          0
K =   gα    0    0    ∗     0
      g−α        ∗
      gβ         0

A corollary is that if a z ≠ 0 is in gα there is an x ∈ g−α such that (x : z) ≠ 0. Similarly, if an x ≠ 0 is in g−α there is a z ∈ gα such that (x : z) ≠ 0. This says that gα and g−α are dual relative to the Killing form. 

P6. The number of linearly independent roots of g is equal to its rank ℓ, or dim h.
h∗ is spanned by the (non-zero) roots, hence its dimension is given by the largest number of linearly independent roots. The assertion that dim h∗ = dim h (= ℓ) follows from the isomorphism of dual vector spaces, which says that to each nonzero h ∈ h there corresponds a nonzero α ∈ h∗. Let us assume, on the contrary, that there exists a non-zero h ∈ h such that α(h) = 0 for every root α (so that the subspace of h∗ spanned by the roots has dimensionality less than ℓ). Then (6.4) implies (h : k) = 0 for every k ∈ h with h ≠ 0, which contradicts P4. 
We recall (from Chap. 5, Sec. 5.1) a definition of the inner product on h∗. For every element ϱ of h∗, we can define a unique element hϱ of h such that

(hϱ : k) = ϱ(k), for all k ∈ h.    (6.5)

The mapping h∗ ∋ ϱ ↔ hϱ ∈ h is an isomorphism from h∗ to h, allowed by the non-degeneracy of K|h. If ϱ is a root, hϱ is called a root generator of ϱ.
If ϱ, σ are any two elements of h∗, and hϱ, hσ their corresponding elements in h, then the bilinear form on h∗ is defined by

⟨ϱ, σ⟩ = (hϱ : hσ).    (6.6)

It follows that ⟨ϱ, σ⟩ = ϱ(hσ) = σ(hϱ), and so ⟨·, ·⟩ is a non-degenerate, symmetric bilinear form on h∗. As there exists a unique non-zero root generator hα for each α ∈ ∆, we have (hα : hα) ≠ 0. Hence (hα : hα) = α(hα) ≠ 0, and so ⟨α, α⟩ ≠ 0 for every (non-zero) α ∈ ∆. This property is referred to as ‘the non-isotropy of the roots’ relative to ⟨·, ·⟩.
P7. For any h ∈ h and α ∈ ∆, let xα ∈ gα and x−α ∈ g−α be the root vectors obeying the equations [h, xα] = α(h)xα and [h, x−α] = −α(h)x−α. Then x±α must also satisfy

[xα, x−α] = (xα : x−α) hα,    (6.7)

where hα ∈ h is the root generator corresponding to the root α.
Just as for sl(3, C), the proof runs as follows: By the invariance of K, we have for any h ∈ h: ([xα, x−α] : h) = (xα : [x−α, h]) = α(h)(xα : x−α) (where (xα : x−α) ≠ 0 for a suitable choice of x−α, by the corollary of P5). Since α(h) = (hα : h) by definition, the equation also reads ([xα, x−α] : h) = (hα : h)(xα : x−α). Then (6.7) follows from the non-degeneracy of K on h (applied here to the element [xα, x−α] − (xα : x−α)hα, which is orthogonal to every h ∈ h and must therefore vanish). 
P8. The root space gα has dimension 1 for every root α ∈ ∆, whereas gkα is zero for each
k = 2, 3, . . . .
Given a root α, let x±α and hα be as defined in P7. Consider the subspace w of g spanned by hα, x−α and the gkα (with k = 1, 2, 3, . . . ), that is, w = Chα ⊕ Cx−α ⊕ (⊕k gkα), where the summands have the dimensions 1, 1, dα, d2α, . . . , respectively.
By P7, (6.1), and (6.3), w is invariant under x±α and hα , that is to say, ad x±α
and ad hα send every z ∈ w either to 0 or w. So it is meaningful to calculate the
traces of ad x±α and ad hα over w. As the adjoint action of h α ∈ h acts diagonally
on w, we have the trace of ad h α over w given by
Trw(adhα) = α(hα)[−1 + dα + 2d2α + · · ·],    (6.8)

where dkα = dim gkα ≥ 0. But by (6.7), (xα : x−α) adhα = [adxα, adx−α], and so
Trw(adhα) = 0 since (xα : x−α) ≠ 0 (dual vectors). It follows that

α(hα)[−1 + dα + 2d2α + 3d3α + · · ·] = 0.

Since α(hα) = ⟨α, α⟩ ≠ 0 (non-isotropy of the roots), this can be satisfied only if dα = 1 (so gα = Cxα), and d2α = d3α = · · · = 0. By the reflection symmetry of ∆, we also have d−α = 1 and d−kα = 0 for k = 2, 3, . . . .
In conclusion, for each α in ∆, g±α = Cx±α, and g±kα = 0 if k ≥ 2. If α is a root, the multiple kα is a root only for k = 1 or −1. 
For each root α ∈ ∆, the elements xα ∈ gα , x−α ∈ g−α and hα ∈ [gα , g−α ] of g
defined in P7 obey the bracket relations
[ hα , xα ] = α ( hα) xα ; [ hα, x−α ] = − α (hα ) x−α ; [ xα , x−α ] = ( xα : x−α ) hα. (6.9)
We see that the adjoint action of h α carries each xα and each x−α into itself, and so
hα , xα and x−α generate a three-dimensional sub Lie algebra of g. Let us change
their normalization, and define:

h̃α = (2/α(hα)) hα,   eα = xα,   fα = ξ x−α;   where ξ = 2/[α(hα)(xα : x−α)].    (6.10)
The re-normalized elements obey the bracket relations:
[ h̃α , eα ] = 2eα , [ h̃α, f α ] = − 2 f α, [ eα, f α ] = h̃α . (6.11)
h̃α, eα, and fα form a canonical basis for a three-dimensional subalgebra in g isomorphic to sl(2, C), which we have studied in Chapter 4. It is significant that whereas hα ∈ h (so also h̃α) is uniquely determined from each given α ∈ ∆, with a fixed normalization, α(h̃α) = 2, there is some arbitrariness in the root vectors: eα can be replaced by a eα, and fα by b fα, with a, b ∈ C provided ab = 1, without causing any changes in their bracket relations. For any α ∈ h∗, let α̃ = 2α/⟨α, α⟩; then h̃α is related to α̃ in precisely the same way as hα is related to α. (We call α̃ the coroot and h̃α the coroot generator of the root α.) Thus, we have α̃(h) = (h̃α : h) and ⟨α̃, β̃⟩ = (h̃α : h̃β). The symmetry between roots and co-roots is captured in the identity α̃(hβ) = β(h̃α). Information about the independent roots suffices in practice, so we state the result as follows:
P9. For every independent (nonzero) root α of g, there is a three-dimensional simple sub Lie algebra sα isomorphic to sl(2, C), defined by the canonical basis ⟨h̃α, eα, fα⟩ together with the canonical commutation relations (6.11).
EXAMPLE 1: For the simple Lie algebra g = sl(3, C), one can choose for the CSA h = {t1, t2}. There are six roots: two linearly independent ones, α1, α2, plus four others, α3 = α1 + α2, −α1, −α2, −α3. sl(3, C) admits the decomposition into the direct sum g = h ⊕ ⟨e1⟩ ⊕ ⟨e2⟩ ⊕ ⟨e3⟩ ⊕ ⟨f1⟩ ⊕ ⟨f2⟩ ⊕ ⟨f3⟩. The subalgebras s1 = ⟨h1, e1, f1⟩ and s2 = ⟨h2, e2, f2⟩ associated with the independent roots α1 and α2 are isomorphic to sl(2, C). It turns out that hi = ti are coroot generators, related to the root generators hαi by hi = 6hαi. The non-zero Killing forms are (hi : hi) = 12 with i = 1, 2, (h1 : h2) = −6, and (ej : fj) = 6 with j = 1, 2, 3. We have the inner products on h∗: ⟨αi, αi⟩ = 1/3 for i = 1, 2, and ⟨α1, α2⟩ = −1/6, ⟨α1, α3⟩ = ⟨α2, α3⟩ = 1/6. 
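The Killing-form values quoted in Example 1 can be verified numerically by computing adjoint matrices in an explicit 8-element basis of sl(3, C) (a numpy sketch; the lstsq-based coordinate extraction is our own device):

```python
import numpy as np

# Check (h_i : h_i) = 12, (h_1 : h_2) = -6, (e_j : f_j) = 6 for sl(3,C),
# with K(x, y) = Tr(ad x . ad y) computed directly in an 8-element basis.
def E(i, j):
    m = np.zeros((3, 3)); m[i, j] = 1.0
    return m

h1, h2 = E(0, 0) - E(1, 1), E(1, 1) - E(2, 2)
e1, e2, e3 = E(0, 1), E(1, 2), E(0, 2)
f1, f2, f3 = E(1, 0), E(2, 1), E(2, 0)
basis = [h1, h2, e1, e2, e3, f1, f2, f3]
B = np.column_stack([b.reshape(9) for b in basis])   # 9 x 8

def ad(x):
    # matrix of ad x on sl(3,C); columns = coordinates of [x, b]
    cols = [np.linalg.lstsq(B, (x @ b - b @ x).reshape(9), rcond=None)[0]
            for b in basis]
    return np.column_stack(cols)

def K(x, y):
    return np.trace(ad(x) @ ad(y))

assert np.isclose(K(h1, h1), 12) and np.isclose(K(h2, h2), 12)
assert np.isclose(K(h1, h2), -6)
assert all(np.isclose(K(e, f), 6) for e, f in [(e1, f1), (e2, f2), (e3, f3)])
```

The inner products on h∗ then follow from hαi = hi/6, e.g. ⟨α1, α1⟩ = (h1 : h1)/36 = 1/3.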

6.3 String of Roots


With the general properties of the roots of g now known, we can proceed to calcu-
late the values of the roots relative to some given CSA. We will rely on experience
gained from the calculations done in the last two chapters involving sl (2, C ) and
sl (3, C ), where the three-dimensional sub Lie algebras sα played a significant role.
We begin by recalling a result directly applicable to sα , namely, that every ir-
reducible representation V (n) of sl (2, C ) is uniquely specified by an integer n ≥ 0
and is expressible as a direct sum ⊕ µ Vµ of one-dimensional spaces Vµ , corre-
sponding to the weights (or eigenvalues of π ( h), where h is the canonical basis of
the CSA), of the form µ = n − 2k, forming an unbroken symmetric sequence of
integers n, n − 2, . . . , − n + 2, − n. (Here ± 2 are the two roots of sl (2, C ).) Such a
sequence of numbers may also be written equivalently in the form λ + 2k, where
integer k takes every value in the interval [− q, p] (with non-negative integers p, q)
such that λ = q − p and n = q + p (since λ + 2p = −(λ − 2q ) = n). Hence
V (n) = ⊕ k Vλ+2k , with dim V (n) = q + p + 1 and dim Vλ+2k = 1.
Now consider a complex semisimple Lie algebra g, together with some CSA
h. For any two roots α, β ∈ ∆ of g, call Sα(β), the α-string of β, the uninterrupted sequence of elements β + kα ∈ h∗, with all integers k such that β + kα ∈ ∆0. (Actually, β + kα = 0 can only occur if β = −α.) Often we will also take Sα(β) to
mean the direct sum of root spaces ⊕ k gβ+kα over the same range of k.
Following the same arguments as in P8, we can convince ourselves that the
sum ⊕k gβ+kα (α, β ∈ ∆) is invariant under the adjoint action of sα = ⟨h̃α, eα, fα⟩,
and so must be a representation of sα . In addition, adh̃α acts on each gβ+kα by scalar
multiplication by β ( h̃α ) + kα(h̃α). As α ( h̃α) = 2, the eigenvalue β ( h̃α) + 2k of adh̃α
on each gβ+kα is a root or 0 with multiplicity equal to 1 (since dim gβ+kα = 1 by
P8) provided the integer k takes all values in a range still to be determined. By
the key result on sl (2, C ) recalled above, the string Sα ( β ) on h̃α is an irreducible
representation of the algebra sα , and { β ( h̃α ) + 2k } form an unbroken sequence of
roots, which with respect to (6.11) are all integers of the same parity. It follows
that β ( h̃α ) ∈ Z for any β ∈ ∆.
Let p and q be two non-negative integers defining the limits of k and hence of
the string of roots Sα ( β ) = { β + kα | − q ≤ k ≤ p } , or explicitly,
Sα ( β ) : β − qα, . . ., β − α, β, β + α, . . . , β + pα.
The non-negative integers p and q satisfy two conditions: (i) q + p + 1 is equal to the
length of the string, or the dimension of the irreducible representation (n + 1 in
customary notation) corresponding to the string; and (ii) q − p = β ( h̃α ), because
the distribution of the weights in such a representation is symmetric about the
origin. These findings are summarized in the following capsule:
P10. Let α, β be any two roots of a complex semisimple Lie algebra g relative to a given
CSA h , and h̃α an element of h defined by h̃α ∈ [gα , g−α ] and α ( h̃α) = 2. Then
(1) β ( h̃α ) is an integer.
(2) The set Sα(β) = {β + kα | −q ≤ k ≤ p} in h∗, with non-negative integers p and q subject to the condition q − p = β(h̃α), defines an uninterrupted string of roots of g uniquely associated with an irreducible representation of sα ≅ sl(2, C) (a sub representation of g) of dimension p + q + 1.
The integers β ( h̃α ) are called the Cartan integers, denoted aαβ .
Since − p ≤ q − p ≤ q, the integer β ( h̃α ) lies in the interval [− p, q] (which
also means − β ( h̃α ) lies in the interval [− q, p]). For any roots α, β, the element
β − β ( h̃α ) α belongs to Sα ( β ), and hence is also a root. In fact, if ε = sign( q − p ),
then ε, 2ε, . . . , ε | q − p | all lie in the interval [− p, q], and so β, β − εα, β − 2εα, . . . ,
β − β ( h̃α ) α all lie in Sα ( β ), and so are either roots or 0. In particular,
(i) if p = 0, then β ( h̃α ) = q ≥ 0, and Sα ( β ) consists of β − qα, . . ., β − α, β; and
(ii) if q = 0, then β ( h̃α ) = − p ≤ 0, and Sα ( β ) consists of β, β + α, . . . , β + pα.
P11. For any α, β ∈ ∆, let ε ≡ sign aαβ if aαβ ≠ 0 and ε ≡ 0 if aαβ = 0. Then all the terms in the sequence β, β − εα, . . . , β − aαβ α are again roots, or 0. In particular, assuming α ≠ β, aαβ > 0 if β − α is also a root, and aαβ < 0 if β + α is also a root.

EXAMPLE 2: From Example 1 we have ⟨α1, α2⟩ = −1/6, ⟨α1, α3⟩ = 1/6, and ⟨α2, α3⟩ = 1/6. And so α1 + α2, α1 − α3, and α2 − α3 are roots. 
In P8 we saw that a multiple kα of a root α is a root if k = ± 1. We will see
now that it is also a necessary condition. Suppose both α and β = kα are roots.
Recalling that γ ( h̃γ ) = 2 for any γ ∈ ∆, we have β ( h̃α ) = kα (h̃α ) = 2k, and
β ( h̃ β) = 2 = kα (h̃β), from which α ( h̃ β) = 2/k. By P10 both 2k and 2/k are
integers, and so k = ± 1, ± 2, ± 1/2. But P8 forbids both ± 2 and ± 1/2. Hence,
P12. A multiple kα of a root α is again a root if and only if k = ± 1.
It remains to find the possible values of p and q in a string of roots Sα ( β ) =
{ β + kα; − q ≤ k ≤ p } , or equivalently its length L = p + q + 1. If β is equal
to α or −α, then Sα(β) = {−α, 0, α} and so L = 3. So from now on, assume β ≠ α, −α. Consider the case L = 5 and call the five possible non-zero roots
β − 2α, β − α, β, β + α, β + 2α, with an appropriate relabeling if necessary.
Now, by P12, both (β + 2α) − β = 2α and (β + 2α) + β = 2(β + α) are not roots. Hence Sβ(β + 2α) contains just the one root β + 2α, and the p and q for this string are both zero, which means (β + 2α)(h̃β) = 0, or ⟨β + 2α, β⟩ = 0. Similarly, (β − 2α) − β and (β − 2α) + β are not roots, so that ⟨β − 2α, β⟩ = 0. Putting these two results together, we have ⟨β, β⟩ = 0 for a nonzero root β, contradicting the non-isotropy of non-zero roots. The conclusion is p + q + 1 < 5. Then, as p ≤ 3 and q ≤ 3 and β(h̃α) = q − p, we must have aαβ = β(h̃α) = 0, or ±1, or ±2, or ±3. We
record this key result in the following statement:
P13. Each string of roots contains at most four roots. A Cartan integer, aαβ ≡ β ( h̃α ),
must have one of the values 0, ± 1, ± 2, and ± 3.
We have thus pinned down all possible values of the roots. But to say more
(for example, their relative directions) we need a few more concepts on the CSA’s.

6.4 System of Roots


If we restrict elements (the root vectors) defined a priori in complex spaces (h and h∗) to their real forms, and equip them with positive-definite inner products (as in ordinary real Euclidean vector spaces), then they can be given a geometrical interpretation and other interesting characterizations. See [Ja] Chap. IV, [Sa] Chap. 2.
The Killing form involving pairs of h̃α’s can be calculated from P2 and P8:

(h̃α : h̃β) = ∑_{γ∈∆} γ(h̃α) γ(h̃β),   α, β ∈ ∆.    (6.12)

With γ(h̃α) ∈ Z, this tells us that (h̃α : h̃β) ∈ Z, and so (h̃α : h̃α) = ∑γ γ(h̃α)² ∈ N (i.e. the non-negative integers). From the relationship between h̃α and hα (in particular, (hα : hα)(h̃α : h̃α) = 4 and 2(hα : hβ) = β(h̃α)(hα : hα)), it follows that (hα : hα), and hence also (hα : hβ), are rational numbers for all roots α, β ∈ ∆.
Let h0 be the real subspace of h spanned by the hα’s, where α ∈ ∆, i.e. the set of all elements of the form ∑α∈∆ cα hα with cα ∈ R. If x, y are any two elements in h0 ⊂ h, then (x : y) is a real combination of the rational numbers (hα : hβ), and so is real. Further, (x : x) = ∑γ (γ(x))² ≥ 0 (since
γ ( x) ∈ R) is equal to zero if and only if γ ( x) = 0 for all γ ∈ ∆, that is if and
only if x = 0 by P4. This shows that: The restriction of the Killing form to h0 is real,
positive-definite, and so h0 is a Euclidean space.
The isomorphism h ↔ h∗ defined by the relation ( hα : hβ ) = ⟨α, β⟩ gives
the correspondence between the real vector space h0 and the R-span of ∆, called
∆R ≅ h∗0. As ( hα : hβ ) defined on h0 is a rational number, so is ⟨α, β⟩ defined on
h∗0. Further, as ⟨α, α⟩ = ∑γ ⟨α, γ⟩², and ⟨α, γ⟩ is a rational number for α, γ ∈ h∗0,
we have ⟨α, α⟩ ≥ 0, and it is zero only if ⟨α, γ⟩ = 0 for all γ, that is, if and only if
α = 0, since the roots span h∗. In conclusion:
The R-span of ∆, denoted ∆R ≅ h∗0, is a Euclidean subspace of h∗ with respect to the
symmetric, positive-definite inner product ⟨, ⟩. For any α, β ∈ h∗0, the product ⟨α, β⟩ is a
rational number (in Q), and |α|² = ⟨α, α⟩ is positive for α ≠ 0.
In terms of roots, the Cartan integer may be rewritten as

aαβ = β( h̃α ) = 2β( hα )/α( hα ) = 2⟨α, β⟩/⟨α, α⟩ ,   (6.13)
where α, β ∈ h∗0 and hα ∈ h0. In particular aαα = 2 for all α, but in general
aβα ≠ aαβ; each takes one of the values 0, ±1, ±2, ±3, by P13. As the roots are
vectors in a Euclidean space, ordinary geometrical notions apply: angles between
vectors can be defined and Schwarz’s inequality can be invoked. Let θ be the
angle (0 ≤ θ ≤ π) between the two roots α and β in the Euclidean space h∗0; then

aαβ aβα = 4⟨α, β⟩²/(⟨α, α⟩⟨β, β⟩) = 4 cos² θ ≤ 4 .   (6.14)
The possible values of aαβ a βα are 0, 1, 2, 3, 4. The value 4 means cos θ = ± 1 corre-
sponding to the trivial cases (θ = 0 or β = α, and θ = π or β = − α). The value 0
corresponds to orthogonality of α and β, or equivalently a βα = aαβ = 0, with no

restrictions on the ratio | β | /|α|. For the remaining cases, let us assume first that
a βα ≤ 0, or equivalently θ ≥ π/2. Then of the a βα and aαβ , one is − 1 and the
other is − 1, or − 2, or − 3, corresponding to the angle θ = 2π/3, 3π/4, or 5π/6,
and the ratio aαβ /a βα = | β |2/ | α |2 = 1 or 2 or 3 or their reciprocals. For the case
a βα ≥ 0 (θ ≤ π/2), we remove the signs of the a’s and replace θ with π − θ, which
leads to the angles π/6, π/4 and π/3. The accompanying table and Fig. 6.1 show
the four nontrivial cases, with roots α and β ≠ ±α.

      aαβ    aβα    θαβ            | β |/| α |
       1      3     π/6   (30°)      √3
       1      2     π/4   (45°)      √2
       1      1     π/3   (60°)      1
       0      0     π/2   (90°)      —
      −1     −1     2π/3 (120°)      1
      −1     −2     3π/4 (135°)      √2
      −1     −3     5π/6 (150°)      √3
[Figure 6.1 shows seven pairs of vectors forming the angles 30°, 45°, 60°, 90°, 120°, 135°, and 150°.]
Figure 6.1: The angle θ between any two roots of a semisimple Lie algebra is allowed to
have only limited values, such that cos θ = 0, ±1, or ±√n/2 (n = 1, 2, 3).

P14. aαβ aβα for any roots α, β ≠ ±α has one of the values 0, 1, 2, 3. The angle between
two such roots in h∗0 has one of the values 30°, 45°, 60°, 90°, 120°, 135°, and 150°.
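The content of P14 can be checked numerically. Below is a minimal sketch (ours, not part of the text): for each allowed pair of Cartan integers (aαβ, aβα), the angle is recovered from aαβ aβα = 4 cos² θ, and the length ratio of the longer to the shorter root from the ratio of the two integers. The names `angle_deg` and `length_ratio` are assumptions for illustration.

```python
import math

# Allowed (a_ab, a_ba) pairs for roots beta != +-alpha, per P13 and Eq. (6.14)
pairs = [(1, 3), (1, 2), (1, 1), (0, 0), (-1, -1), (-1, -2), (-1, -3)]

def angle_deg(a, b):
    """theta from a*b = 4 cos^2(theta); a, b share the sign of cos(theta)."""
    cos = math.copysign(math.sqrt(a * b) / 2, a) if a != 0 else 0.0
    return round(math.degrees(math.acos(cos)))

def length_ratio(a, b):
    """Ratio of the longer root to the shorter one; None for orthogonal roots."""
    if a == 0:
        return None
    big, small = max(abs(a), abs(b)), min(abs(a), abs(b))
    return math.sqrt(big / small)

angles = [angle_deg(a, b) for a, b in pairs]
# angles reproduces the list in P14: [30, 45, 60, 90, 120, 135, 150]
```

Running this reproduces the table above row by row.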
Root system. The roots α, β, . . . of a semisimple Lie algebra g with respect to a CSA h have the
following essential properties:
(i) 2⟨β, α⟩/⟨α, α⟩ is an integer (called a Cartan integer, denoted by aαβ );
(ii) β − aαβ α is also a root; and
(iii) α and rα are both roots only for r = ±1.
The notions of roots and root spaces may be carried to an abstract level by
defining, without referring to any group or algebra, a root system in a Euclidean
vector space V with respect to a positive-definite inner-product h, i as a finite non-
empty subset of V whose elements have properties (i)–(iii). Thus, ∆, described
above, is an example of a root system in h∗0 . The rank of a root system is the dimen-
sion of V (equal to the rank of g as defined before, ` = dim h). As examples, we
now determine all root systems of rank one and two.

E XAMPLE 3: Rank one. There is only one rank-one root system, denoted by A1.
It consists of two vectors α and −α. The Cartan integers are aαα = a−α,−α = 2,
a−α,α = aα,−α = −2. It will be identified with the root system of the Lie algebra
a1 = sl(2, C). This is so because this algebra has a one-dimensional CSA spanned
by H, with Lie brackets [H, E±] = ±2E±, implying a pair of roots ±α such that
±α(H) = ±2. Here, the root system lies on h∗0 = RH.

A1 :   ←———•———→   (the two roots ±α on a line)

E XAMPLE 4: Rank two. There are four distinct root-systems of rank two shown in
Fig. 6.2. Their labels are the names of the root systems of the corresponding Lie
algebras in the Cartan–Killing classification. A 1 ⊕ A 1 corresponds to a decom-
posable algebra, as we will see later. The other cases (A 2, B2, G 2) give simple Lie
algebras; they satisfy conditions (i), (ii), (iii) of root systems. There are no other
systems of rank two. 

[Figure 6.2 shows the four rank-two root systems:
A1 ⊕ A1 : α ⊥ β; ratio |β| : |α| arbitrary.
A2 : |β| : |α| = 1; angle between adjacent vectors = π/3.
B2 : |β| : |α| = √2; angle between adjacent vectors = π/4.
G2 : |β| : |α| = √3; angle between adjacent vectors = π/6.]

Figure 6.2: The four root-systems of rank two; α and β are the two simple roots.

In any vector space it is practically useful to have a basis to work with; this is
also the case with the space h∗0 we are dealing with here.
Let (α1, α2, . . . , αℓ) be a basis for h∗ consisting of linearly independent roots (we
already know ℓ = dim h∗ by P6). We want to check that every root β can be written
as β = ∑i qi αi, with qi ∈ Q. If this is true, then ⟨β, αj⟩ = ∑i qi ⟨αi, αj⟩ for any
j = 1, 2, . . . , ℓ, and the unknown qi obey the system of equations

2⟨β, αj⟩/⟨αj, αj⟩ = ∑_{i=1}^{ℓ} ( 2⟨αi, αj⟩/⟨αj, αj⟩ ) qi ,   j = 1, 2, . . . , ℓ.   (6.15)

This is a system with integer coefficients and a non-zero determinant:

det[ 2⟨αi, αj⟩/⟨αj, αj⟩ ] = ( 2^ℓ / ∏j ⟨αj, αj⟩ ) det[ ⟨αi, αj⟩ ] ≠ 0

(det[⟨αi, αj⟩] ≠ 0 because the bilinear form ⟨, ⟩ is non-degenerate, and the αi ’s
are a basis for h∗). Hence (6.15) has a unique solution, which is rational. Thus
the qi ’s are rational numbers, leading to the conclusion that every root β is a linear
combination of the basis vectors αi with rational (and so real) coefficients.
P15. The dimension of h∗0 is given by the rank ` of the algebra g (which is the dimension
of the Cartan subalgebra h), i.e. we have dim h∗0 = dim h = `.
In the real (rational) vector space h∗0, pick a basis given by a set of linearly
independent vectors α1, α2, . . . , αℓ, where ℓ = dim h, arranged in some fixed but
arbitrary order. A root ϱ = ∑i qi αi, with qi ∈ Q, is said to be positive (ϱ > 0) if
the first non-zero coefficient qi is positive. (If the first non-zero coefficient qi is
negative, ϱ is said to be negative.) The set of all positive roots is closed under
addition and under multiplication by positive rationals. If ϱ, σ ∈ h∗0 and ϱ − σ > 0, then
we write ϱ > σ, and h∗0 is ordered, by convention, in this way.
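This lexicographic ordering is easy to implement. A small sketch (ours; the names `root_sign` and `greater` are assumptions for illustration), with rational coefficients handled exactly:

```python
from fractions import Fraction

def root_sign(coeffs):
    """Sign of a vector written in the fixed basis (alpha_1, ..., alpha_l):
    +1 if the first non-zero coefficient is positive, -1 if it is negative,
    0 for the zero vector (which is not a root)."""
    for q in coeffs:
        if q != 0:
            return 1 if q > 0 else -1
    return 0

def greater(rho, sigma):
    """rho > sigma in the induced ordering iff rho - sigma is positive."""
    return root_sign([a - b for a, b in zip(rho, sigma)]) == 1

# e.g. (1/2, -3) is a positive vector, (0, -1) a negative one
```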
The following Lemma on positive roots will be useful later on:
Lemma. If the roots α1, α2, . . . , αk ∈ h∗0, with k ≤ ℓ, are positive and such that
⟨αi, αj⟩ ≤ 0 for all i ≠ j, then the αi ’s are linearly independent over Q.
Suppose that, contrary to the assertion, αk is not independent, so that αk =
∑_{i=1}^{k−1} qi αi = ρ + σ, where ρ = ∑j q′j αj and σ = ∑m q″m αm, with 1 ≤ j, m ≤ (k − 1)
and q′j > 0, q″m ≤ 0. Since αk > 0, ρ ≠ 0. Then, as ⟨αj, αm⟩ ≤ 0 for j ≠ m, we have
⟨ρ, σ⟩ ≥ 0, so ⟨αk, ρ⟩ = ⟨ρ, ρ⟩ + ⟨ρ, σ⟩ > 0. On the other hand, by assumption,
⟨αk, ρ⟩ = ∑j q′j ⟨αk, αj⟩ ≤ 0, a contradiction. Hence all the αi are independent.

Definition 6.3 (Simple roots). Relative to a given ordering in h∗0 , a root α is said to be
simple (or fundamental) if it is positive and cannot be written as the sum (β + γ) of
two positive roots (β and γ).

P16. Let Σ be the set of all simple roots of g relative to a fixed ordering in h∗0 . It has the
following properties:
(a) For any distinct simple roots α, β ∈ Σ, α − β is not a root.
(b) For any distinct simple roots α, β ∈ Σ, we have ⟨α, β⟩ ≤ 0.
(c) The set Σ is a basis for h∗0 .
(d) For every positive root β not in Σ, there exists an element α ∈ Σ such that β − α
is a positive root.

P ROOF. (a) If α − β is a positive root, then α = β + ( α − β ), contrary to the


definition of α. If β − α is a positive root, then β = α + ( β − α), again contradicting
the definition of β.
(b) By (a), β − α is not a root, and so it does not appear in the string Sα(β) (by
P11; the integers p and q are defined in P10). Hence q = 0, and so the Cartan
integer aαβ = 2⟨α, β⟩/⟨α, α⟩ = −p ≤ 0. Since ⟨α, α⟩ > 0, we have ⟨α, β⟩ ≤ 0.
(c) By (b) and the Lemma, Σ is a linearly independent set. Now, if a positive
root β is not simple, it is by definition a sum β = γ + γ′ of two positive roots γ, γ′.
If either γ or γ′ is not simple, it splits into two positive roots, and so on, until all
the terms are simple. So we have γ = ∑α∈Σ kα α and γ′ = ∑α∈Σ k′α α with all kα
and k′α non-negative integers. It follows that we may write β = ∑α∈Σ (kα + k′α) α,
that is, the (arbitrary) positive root β can be expanded in terms of α ∈ Σ, with
non-negative integral coefficients. If β is a negative root, then −β is a positive root,
and a similar argument applies: in this case the expansion coefficients are all
non-positive integers. And so we have proved that Σ is a basis for h∗0: all
α ∈ Σ are independent, and every root β ∈ h∗0 can be expanded in terms of α ∈ Σ
(with all non-positive, or all non-negative, integral coefficients). This means that
Σ contains precisely ℓ = dim h∗0 linearly independent simple roots.
(d) By (c) we write β = ∑α∈Σ kα α with non-negative integers kα. From ⟨β, β⟩ =
∑α kα ⟨β, α⟩ > 0, it follows that there is some ⟨β, α⟩ > 0, and so 2⟨β, α⟩/⟨α, α⟩ > 0.
Then by P11, β − α is a root, and so either β − α or α − β is positive. But the latter
cannot be a positive root, because then α = β + (α − β), contradicting α ∈ Σ. So
β − α > 0, and we may write β = α + (β − α), where α ∈ Σ, as asserted. 
Once an ordering is chosen for h∗0, we identify all the simple roots and take the set of
all such roots, Σ = {α1, α2, . . . , αℓ}, as a basis for h∗0. Then every root β is of the form
β = k1 α1 + k2 α2 + · · · + kℓ αℓ, where the ki are either all non-negative integers or
all non-positive integers. If the k i are non-negative, β is a positive root; if the k i are
non-positive, β is a negative root. Thus, the collection of all roots is divided into
two disjoint sets: one, ∆+ , consisting of positive roots, and the other, ∆− = − ∆+,
consisting of negative roots. So, we can write ∆ = ∆+ ∪ ∆− . Now we introduce
the essential notion of ‘fundamental root-system’:
Definition 6.4 (Fundamental system of roots). A set Σ of ℓ linearly independent
vectors α1, α2, . . . , αℓ of a Euclidean space V of dimension ℓ (i.e. a real vector space with
positive-definite inner product ⟨, ⟩) is called a fundamental system of roots, or basis,
of V if for any two distinct vectors αi and αj in Σ the quantity aij = 2⟨αi, αj⟩/⟨αi, αi⟩ is
a non-positive integer.
As aij (i ≠ j) is a non-positive integer and aij aji = 4 cos² θij by definition, it
follows immediately that aij aji = 0, 1, 2, or 3 (i ≠ j). A system of (simple) roots
Σ = { α1, α2, . . . , α` } of a simple algebra g on V = h∗0 consistent with P13–P14,
as we have previously defined, is the prime example of a fundamental system of
roots (a subsystem of the root system ∆), and so will be identified as such. The
concept of ‘fundamental system of roots’ (FSR) is of crucial importance: it serves
to find, identify, and classify simple Lie algebras, either in its algebraic realization
(the Cartan matrix) or in its graphical representation (the Dynkin diagram).

6.5 Cartan Matrix


We will show here, by examples, that having a fundamental system of roots is
essentially equivalent to having all the information about the structure of the
corresponding Lie algebra g. Starting from an FSR, we can calculate an important
object (the Cartan matrix) associated with g, which in turn determines its roots
(relative to some chosen CSA h), the bracket relations for its three-dimensional sub
Lie algebras si , and, from these, the Lie algebra g itself.
1. Let Σ = { α1, α2, . . . , α` } be a fundamental system of roots (FSR) defined on
a Euclidean space h∗0 , as in Definition 6.4. Then the Cartan integers, given by
aij = 2⟨αi, αj⟩/⟨αi, αi⟩ with i, j = 1, . . . , ℓ, are real numbers independent of the
root normalization and satisfying the conditions (i) aij ≤ 0 if i 6 = j, and (ii) aij a ji =
0, 1, 2, 3. The ` × ` matrix A, with entries aij , is called the Cartan matrix relative
to the basis Σ for an associated Lie algebra. It has diagonal entries aii = 2 for all
i = 1, . . . , `, and off-diagonal entries taking one of the values 0, − 1, − 2, − 3. Since
0 ≤ aij a ji ≤ 3, either both off-diagonal elements aij and a ji are zero, or else one is
− 1 and the other − 1, or − 2, or − 3. Its determinant is a nonzero multiple of the
determinant of the matrix whose entries are given by h αi, α j i (by P15). Hence the
Cartan matrix is non-singular.
There is just one rank-one FSR (single simple root), denoted by A 1, and so its
Cartan matrix is simply A = [2]. There are 4 rank-two FSR’s (two simple roots), as
described in Fig. 6.2, for which we now calculate the Cartan matrix.
E XAMPLE 5: A1 ⊕ A1. Its FSR is Σ = {α1, α2} in which ⟨α1, α2⟩ = 0. The Cartan
matrix is A = diag[2, 2], corresponding to a direct sum of two Lie algebras A1.
E XAMPLE 6: A2. The FSR is Σ = {α1, α2} in which |α1| = |α2|, and ⟨α1, α2⟩ =
|α1|² cos 120° = −|α1|²/2. Then a12 = −1 and a21 = −1, which gives
   
      ( a11  a12 )     (  2  −1 )
A  =  ( a21  a22 )  =  ( −1   2 ) .
In sl(3, C) (cf. Example 1), the simple roots relative to the CSA h = (h1, h2)
are given by α1 = (2, −1) and α2 = (−1, 2), so that |α1|² = |α2|² = 1/3 and
⟨α1, α2⟩ = −1/6. From the definition of aij, we obtain a12 = a21 = −1, leading
to the same Cartan matrix as above. So we identify sl(3, C) with the Lie algebra
corresponding to the FSR A2: Lie(A2) = sl(3, C).
E XAMPLE 7: B2. The FSR is Σ = {α1, α2}, with |α1|² = 2|α2|² (α1 chosen to be the
long root) and ⟨α1, α2⟩ = −|α2|² (cos 135° = −1/√2). It follows that a12 = −1,
a21 = −2, leading to the Cartan matrix (identifiable with that of b2 = so(5, C))
 
      (  2  −1 )
A  =  ( −2   2 ) .

E XAMPLE 8: G2. The FSR is Σ = {α1, α2}, with |α2|² = 3|α1|² (α2 chosen to be the
long root) and ⟨α1, α2⟩ = −|α2|²/2 (cos 150° = −√3/2). The integers a12 = −3,
a21 = −1 are entries in the Cartan matrix (to be identified later with that of g2)
 
      (  2  −3 )
A  =  ( −1   2 ) .  

C OMMENTS . Having a Cartan matrix is equivalent to having an FSR, and so the
reverse path can also be taken. Thus, given a Cartan matrix, one may calculate all
the products aij aji and ratios aij/aji. Since aij aji = 4 cos² θij and aij/aji = |αj|²/|αi|²,
where θij is the angle between the vectors αi and αj, this information leads to a
reconstruction of the FSR. 
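The passage from an FSR to its Cartan matrix is a one-line computation. A minimal sketch (ours): the Euclidean realizations of the simple roots below are standard choices, not taken from the text, and the function name `cartan_matrix` is an assumption.

```python
def cartan_matrix(simple_roots):
    """a_ij = 2 <alpha_i, alpha_j> / <alpha_i, alpha_i> for Euclidean row vectors.
    For a genuine root system the division is exact, so integer division is safe."""
    dot = lambda u, v: sum(x * y for x, y in zip(u, v))
    return [[2 * dot(ai, aj) // dot(ai, ai) for aj in simple_roots]
            for ai in simple_roots]

# Standard Euclidean realizations (assumed, for illustration):
a2 = cartan_matrix([(1, -1, 0), (0, 1, -1)])      # A2 in R^3
b2 = cartan_matrix([(1, -1), (0, 1)])             # B2 in R^2, alpha1 long
g2 = cartan_matrix([(1, -1, 0), (-2, 1, 1)])      # G2 in R^3, alpha2 long
```

The outputs reproduce the matrices of Examples 6–8: a2 = [[2, −1], [−1, 2]], b2 = [[2, −1], [−2, 2]], g2 = [[2, −3], [−1, 2]].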
2. We now show how to determine all the roots (i.e. the set ∆) of the Lie algebra g
specified by its fundamental root system, or its Cartan matrix.
Assume that, given Σ = (α1, α2, . . . , αℓ), we have calculated the corresponding
Cartan matrix A. We need to show that the integers ki for which β = ∑i ki αi
is a root can be determined from A. Since ∆− consists of the negatives of the
roots in ∆+, we need to calculate only the positive roots β = ∑i ki αi, where the ki
are non-negative integers; we call the sum ∑i ki the level of a (positive) root β.
Level-one roots are the simple roots αi ∈ Σ . We now assume that we know all
the positive roots up to level n and want to find the roots of level ( n + 1).
Given a positive root β of level n = ∑i k i , a positive root of level n + 1 is of the
form β + α j for some α j ∈ Σ . If β = α j , clearly β + α j = 2α j is not a root. So we
may assume that there is a ki ≠ 0 with i ≠ j in β = ∑i ki αi. Consider the string of
roots Sαj(β) = {β + kαj | −q ≤ k ≤ p}, where p and q are non-negative integers
obeying the relation

q − p = β( h̃αj ) = ∑i aji ki .

As we assume all the roots up to level n are known, we know the number q.
Then we can calculate the number p = q − ∑i a ji k i , and hence roots of levels
n + 1, . . . , n + p. There are new (positive) roots if and only if p > 0; they are
β + α j , . . . , β + pα j. The following examples illustrate the procedure.
E XAMPLE 9: Lie (A 2) = sl (3, C ). We have aii = 2, a12 = a21 = − 1. The roots of
level one are the simple roots α1 and α2. Consider the strings Sα2 ( α1) and Sα1 ( α2).
As α1 − α2 is not a root, q = 0 in both cases. For β = α1, the only non-zero
coefficient is k 1 = 1, and so in Sα2 ( α1), we have p = q − a21 k 1 = −(−1) · 1 = 1,
which means that Sα2 ( α1) consists of α1 and α1 + α2. We obtain similarly p = 1
for Sα1 ( α2), and so this string consists of the roots α2 and α2 + α1. As 2α1 and 2α2
are not roots, the sum α1 + α2 is the only root at level two.
To examine level three, consider the string Sα1 ( γ ), where γ = α1 + α2. Since
γ − α1 = α2 is a positive root, but not γ − 2α1, we know that q = 1. We calculate
p = q − ( a21k 1 + a22 k 2 ) = 1 − (− 1 + 2) = 0. We can do a similar calculation for
Sα2 ( γ ). The result shows that there are no roots at level three or higher.
. Conclusion: The roots of sl (3, C ) are ± α1, ± α2, ±( α1 + α2 ).

E XAMPLE 10: Lie(G2) = g2. Calling α2 the long root (|α2|² = 3|α1|²), the Cartan
integers for g2 are a12 = −3, a21 = −1. With α1 and α2 known as the level-one
positive roots, we consider the strings of roots based on them to obtain the roots
at the next level. Since α1 − α2 is not a root, q = 0 for both strings Sαj(β). For the
string Sα2(α1), we have β = α1 (so k1 = 1), hence p = 0 − a21 k1 = 1, and

Sα2(α1) :  α1,  α1 + α2 .

For Sα1(α2), we have p = 0 − a12 k2 = 3. It follows that

S α1 ( α 2 ) : α2, α2 + α1, α2 + 2α1, α2 + 3α1.


Since neither 2α1 nor 2α2 is a root, we have just one positive root at level two. The
string Sα2 ( α1) also indicates that α1 + 2α2 and α1 + 3α2 are not in ∆+ . Thus, we
have the following positive roots up to level 4:
Level 2: α2 + α1;
Level 3: α2 + 2α1;
Level 4: α2 + 3α1.
To find the roots at the next level, we start with β = 3α1 + α2 (k 1 = 3, k 2 = 1)
and build strings on this β with α1, and α2.
For the string Sα1 ( β ), we know that β − α1 = 2α1 + α2, β − 2α1 = α1 + α2, and
β − 3α1 = α2 are all roots, but β − 4α1 = α2 − α1 ∉ ∆. Hence q = 3. The integer p
is given by p = q − ( a11k 1 + a12k 2 ) = 3 − (2 · 3 − 3 · 1) = 0. So (3α1 + α2 ) + α1 =
4α1 + α2 is not a root.
As for the string Sα2(β), the difference β − α2 = 3α1 ∉ ∆, and so q = 0. The
calculation p = q − ( a21k 1 + a22 k 2 ) = 0 − (− 1 · 3 + 2 · 1) = 1 produces the string
Sα2 ( β ): 3α1 + α2, 3α1 + 2α2. All this shows that the only positive root at Level 5 is
3α1 + 2α2, which is the maximal root, since no roots exist at higher levels.
We see in Fig. 6.3 that the construction of the positive roots, starting from the
knowledge of the two simple roots α1 and α2, follows a definite path in h∗0 :
α 3 = α 1 + α 2 → α 4 = α 3 + α 1 → α 5 = α 4 + α 1 → α 5 + α 2.

[Figure 6.3 shows the roots of g2 drawn in the plane h∗0, with the path α1 → α1 + α2 →
2α1 + α2 → 3α1 + α2 → 3α1 + 2α2 traced through the positive roots.]
Figure 6.3: Path in h∗0 tracking the positive roots of g2. The coordinates of the simple roots
are given by the Cartan matrix column entries: α1 = (2, −1) and α2 = (−3, 2).

. Conclusion: There are 12 roots in g2 , namely,


± α1, ± α2, ±( α1 + α2), ±(2α1 + α2 ), ±(3α1 + α2 ), and ±(3α1 + 2α2). 
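The level-by-level procedure used in Examples 9 and 10 is entirely mechanical, so it can be automated. A sketch (ours; the function name `positive_roots` is an assumption), following the recipe p = q − ∑i aji ki:

```python
def positive_roots(A):
    """All positive roots of the algebra with Cartan matrix A, each root encoded
    by its coefficients (k_1, ..., k_l) over the simple roots; built level by level."""
    l = len(A)
    simple = [tuple(int(i == j) for j in range(l)) for i in range(l)]
    roots, level = set(simple), list(simple)
    while level:
        nxt = []
        for beta in level:
            for j in range(l):
                # q = how far beta - alpha_j, beta - 2 alpha_j, ... stay in the root set
                q, down = 0, list(beta)
                while True:
                    down[j] -= 1
                    if tuple(down) in roots:
                        q += 1
                    else:
                        break
                # string relation: q - p = sum_i a_ji k_i
                p = q - sum(A[j][i] * beta[i] for i in range(l))
                if p > 0:
                    up = list(beta)
                    up[j] += 1
                    t = tuple(up)
                    if t not in roots:
                        roots.add(t)
                        nxt.append(t)
        level = nxt
    return roots

a2_roots = positive_roots([[2, -1], [-1, 2]])    # the 3 positive roots of sl(3, C)
g2_roots = positive_roots([[2, -3], [-1, 2]])    # the 6 positive roots of g2
```

For A2 this returns {(1,0), (0,1), (1,1)}; for G2 it returns six roots, the maximal one being (3,2), i.e. 3α1 + 2α2, in agreement with the conclusions above.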
3. Once all the roots of a Lie algebra are known, one can compute the Lie bracket
relations for it, and thereby completely determine the Lie algebra. With the set ∆
calculated and a (FSR) subset chosen, Σ = { α1, α2, . . . , α` } ⊂ h∗0 , we may define

hi = h̃αi , e i = e αi , f i = f αi ; ( i = 1, 2, . . . , `). (6.16)



hi, ei , f i obey, just as in (6.11), the Lie brackets [ h i, ei ] = 2ei , [ hi, f i ] = − 2 f i, and
[ ei, f i ] = hi . But we also have for i 6 = j: [ ei, f j ] = 0 since αi − α j is not a root; further
[ hi, e j ] = [ h̃αi , eα j ] = α j ( h̃αi ) eα j = aij e j ; and finally [ hi, f j ] = − α j( h̃αi ) f j = − aij f j .
(Note: no summation over repeated indices here and in the following equations.)
Thus, we have for i, j = 1, 2, . . . , ℓ:

[ hi , hj ] = 0 ,
[ hi , ej ] = aij ej ,
[ hi , fj ] = − aij fj ,      (6.17)
[ ei , fj ] = δij hi .

The set {h1, h2, . . . , hℓ} spans the CSA h. The 3ℓ elements ei, fi, hi do not span
the whole Lie algebra g, but generate g in the following sense: We can write
each positive root β as a sum of simple roots in the form β = αi1 + αi2 + · · · +
αik such that each partial sum αi1 + αi2 + · · · + αim, 1 ≤ m ≤ k, is a root. In
such a representation, let us define eβ = [eik, [eik−1, . . . , [ei2, ei1] . . . ]] and fβ =
[fik, [fik−1, . . . , [fi2, fi1] . . . ]] for each positive root β. Then the set

{ hi (1 ≤ i ≤ `); e β , f β ; with β ∈ ∆+ }

forms a basis for g (dim g = ` + 2| ∆+ | = ` + | ∆ |). The multiplication table for the
elements of g in this basis is rational, and is completely determined by the Cartan matrix.
The following example shows how this is done.
E XAMPLE 11: Rank-two Lie algebra sl (3, C ) has the set of positive roots ∆+ con-
sisting of α1, α2 and α3 = α1 + α2. Take the FSR Σ = { α1, α2 } , and the CSA
h = { h1 = h̃α1 , h2 = h̃α2 } . To ± α1 correspond the root vectors e1 and f 1; and
to ± α2, the root vectors e2 and f 2. Finally, we associate with ± α3 the vectors
e3 = [e1, e2 ] and f 3 = − [ f 1, f 2 ] (conventional sign, an unimportant detail). So a
complete basis for the algebra consists of h 1, h2, e1, f 1, e2, f 2, e3, f 3. The six ele-
ments hi , ei , f i with i = 1, 2 obey (6.17) with the Cartan integers a11 = a22 = 2 and
a12 = a21 = − 1. We just need to consider relations involving e3 = [e1, e2 ] and
f 3 = − [ f 1, f 2 ]. For hi = h1, or h2, we have

[hi , e3 ] = ( ai1 + ai2 ) e3 = α3 ( hi ) e3 = e3,


[hi , f 3 ] = −( ai1 + ai2 ) f 3 = − α3( hi ) f 3 = − f 3.

Since ±( αi + α3 ) are not roots for i = 1, 2, we have [ ei, e3 ] = 0 and [ f i, f 3 ] = 0; and


we just need to calculate the following commutators

[e1, f 3 ] = a12 f 2 = − f 2,
[e2, f 3 ] = − a21 f 1 = f 1,
[ f 1, e3 ] = − a12e2 = e2,
[ f 2, e3 ] = a21 e1 = − e1.

Finally, using these relations, we get [e3, f 3 ] = −( a21h1 + a12h2 ) = h1 + h2. The
results of these calculations agree with Table 5.1 of the last chapter. Together with
Example 9, this example illustrates that a set of simple roots (FSR) determines all
the roots of a Lie algebra, and so defines the Lie algebra itself. 
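The bracket relations of Example 11 can be double-checked in the defining representation of sl(3, C), where the generators are realized by 3×3 elementary matrices Eab. The realization below (e1 = E12, e2 = E23, etc.) is the standard one, but the helper names are ours; this is a verification sketch, not part of the text.

```python
def E(i, j):
    """3x3 elementary matrix: a single 1 in row i, column j (0-indexed)."""
    return [[int((r, c) == (i, j)) for c in range(3)] for r in range(3)]

def mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def add(a, b):
    return [[a[r][c] + b[r][c] for c in range(3)] for r in range(3)]

def neg(a):
    return [[-x for x in row] for row in a]

def com(a, b):
    """Lie bracket [a, b] = ab - ba."""
    return add(mul(a, b), neg(mul(b, a)))

h1, h2 = add(E(0, 0), neg(E(1, 1))), add(E(1, 1), neg(E(2, 2)))
e1, e2, f1, f2 = E(0, 1), E(1, 2), E(1, 0), E(2, 1)
e3 = com(e1, e2)         # root vector for alpha1 + alpha2
f3 = neg(com(f1, f2))    # the conventional sign used in the text
```

With these matrices, the relations [e3, f3] = h1 + h2, [e1, f3] = −f2, [e2, f3] = f1, [f1, e3] = e2, [f2, e3] = −e1, and [hi, e3] = e3 all hold identically.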

6.6 Dynkin Diagram


To identify and classify the simple Lie algebras over C, there exists a remarkable
graphical tool, due to Eugene Dynkin [Dy1], [Dy2], which we now describe.
Suppose we are given, for a certain Lie algebra, a fundamental system of roots
(FSR) Σ = { α1, α2, . . . , α` } defined in a Euclidean space h∗0 , or equivalently a Car-
tan matrix A = [ aij ], both defined relative to some unspecified CSA h. To Σ, or A,
we associate a graph, called a Dynkin diagram, in the following manner: Mark
` points (or vertices or nodes) labeled by α1, α2, . . . , α` , one point to each root; and
connect αi to α j by a number of lines (or edges) equal to aij a ji (= 0, 1, 2, 3). If aij = 0,
the vertices αi and α j are not connected. If there are two or three edges, the roots
are not of equal lengths, and one indicates the shorter root by a solid circle, and
the longer one by an open circle. If Σ is given, and so the simple roots are com-
pletely known, one attaches to each point αi the value of the norm square h αi, αi i
(called weight). If only A is available, one knows the simple roots only up to a
common factor, then 1 is assigned to the short roots, and 2 or 3 to the long roots.
The rank-one FSR in Example 3 has the simplest, one-point, diagram: ○ (A1).
The 4 rank-two FSRs in Example 4 correspond to the next simplest, two-point,
diagrams shown in Fig. 6.4.

no lines      (A1 ⊕ A1)    ○    ○      if θ = π/2

single line   (A2)         ○———○       if θ = 2π/3

double line   (B2)         ●═══○       if θ = 3π/4

triple line   (G2)         ●≡≡≡○       if θ = 5π/6

Figure 6.4: Basic Dynkin links. Solid circles (●) denote the shorter roots.

For each FSR = {α1, α2} (Example 4), or associated Cartan matrix (Sect. 6.5),
draw a two-point diagram according to the following information:
A1 ⊕ A1 : a12 = 0 ↦ unconnected vertices of equal weights.
A2 : a12 a21 = 1 ↦ vertices of equal weights connected by a line.
B2 : a12 a21 = 2 ↦ vertices connected by two lines; |α1|² = 2|α2|².
G2 : a12 a21 = 3 ↦ vertices connected by three lines; |α2|² = 3|α1|².
Conversely, a (connected) Dynkin diagram determines the associated FSR, or
the Cartan matrix. From a given diagram we obtain for two vertices i, j the num-
ber of connecting lines n, which is equal to aij a ji . If n = 0, it means aij = a ji = 0. If

n = 1, we must have aij = aji = −1. If n > 1, the two roots have unequal lengths.
Supposing αj is the long root (○), the relation aij/aji = |αj|²/|αi|² gives us the
Cartan integers aji = −1 and aij = −n. Applying this argument to every part of
the diagram, we obtain the corresponding Cartan matrix. If the weight |αi|² of
every vertex is also known, then one can calculate ⟨αi, αj⟩ = −(√n/2)|αi||αj| for
i ≠ j, n = 1, 2, 3. In this way, the Dynkin diagram determines the associated FSR
(up to similarity transformations in h∗0 ). So, to find all the FSRs one just has to
draw all the ‘allowed’ Dynkin diagrams.
E XAMPLE 12: b3 . Suppose we are given the following Cartan matrix (for a Lie
algebra called b3 ), what is the corresponding Dynkin diagram?

 
          (  2  −2   0 )              1    2    2
A[b3]  =  ( −1   2  −1 )    ⟺      ●════○————○
          (  0  −1   2 )              α1   α2   α3

The product a12 a21 = 2 and the ratio a12 /a21 = 2 tell us that the roots α1 and
α2 are connected by two lines and that | α2| > | α1|. The product a23 a32 = 1 and
the ratio a23 /a32 = 1 imply that α2 and α3 are of equal lengths and joined by one
line. Finally, a13 a31 = 0 means that α1 and α3 are not directly joined.
One can also trace the reversed path from the diagram back to the matrix.
From the diagram, we know that of the three roots, α1 is the shortest, whereas
| α2| = | α3|. Furthermore the numbers of lines yield 2 = a12 a21, 1 = a23 a32 and
0 = a13 a31, from which one deduces the Cartan integers a12 = − 2, a21 = − 1,
a23 = a32 = − 1, and a13 = a31 = 0, and hence the Cartan matrix A [b3 ]. 
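The Cartan-matrix-to-diagram direction of Example 12 is also easy to automate. A sketch (ours; `dynkin_links` is an assumed name): for each pair of vertices we extract the number of connecting lines aij aji and the squared-length ratio aij/aji = |αj|²/|αi|².

```python
def dynkin_links(A):
    """For each pair i < j of vertices, return (n, ratio) where n = a_ij * a_ji is
    the number of lines and ratio = a_ij / a_ji = |alpha_j|^2 / |alpha_i|^2
    (None when the vertices are not joined)."""
    l = len(A)
    out = {}
    for i in range(l):
        for j in range(i + 1, l):
            n = A[i][j] * A[j][i]
            out[(i, j)] = (n, A[i][j] / A[j][i] if n else None)
    return out

b3 = [[2, -2, 0], [-1, 2, -1], [0, -1, 2]]
links = dynkin_links(b3)
# (0,1): double link with |alpha_2|^2 = 2 |alpha_1|^2;
# (1,2): single link, equal lengths; (0,2): not joined
```

This reproduces the reading of the b3 diagram given above.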
In the classification problem, we are concerned with the basic building blocks,
which for systems of roots means the indecomposable fundamental systems of
roots. An FSR Σ = {α1, . . . , αℓ} is said to be indecomposable if it is impossible
to partition it into two non-overlapping subsets Σ 0 and Σ 00 such that h αi, α j i = 0,
or aij = 0, for every αi ∈ Σ 0, α j ∈ Σ 00 , which means the associated Cartan matrix
cannot be written in a block-diagonal form. On the Dynkin diagram, the equiv-
alent condition is connectedness, under which there exists a sequence αi1 , . . . , αi`
such that every two consecutive points are joined by at least one edge; i.e. there
are no separate pieces, or subdiagrams. Finally, the indecomposability of an FSR
implies and is implied by the simplicity of the corresponding Lie algebra. See
[Ja] pp. 127–128. And so, the problem of finding all the simple Lie algebras is the
same as that of finding all the indecomposable FSR’s on a Euclidean space, in turn
equivalent to finding all the connected Dynkin diagrams. Formulated in this way,
the problem is essentially geometric, if one temporarily ignores the weights of the
vertices.
The defining conditions for FSRs in Definition 6.4 — (i) h α, αi > 0, (ii) aij ≤ 0
for i 6 = j, and (iii) aij a ji = 0, 1, 2, 3 for i 6 = j — severely limit the number of such
systems and the associated Dynkin diagrams (which are then said to be allowed).
Those conditions, translated into geometric terms, imply the following general
rules for the Dynkin diagrams (see [Ja], [Sa]):

• Rule 1. An allowed diagram has more vertices than links, and cannot contain
closed polygons.
• Rule 2. In an allowed diagram, no more than three lines can leave a vertex.
• Rule 3. If a diagram Γ containing a simple A 2 chain is allowed, the diagram Γ0
obtained from it by shrinking the A 2 chain to a point is also allowed. Conversely,
if Γ is not allowed, neither is Γ0 .
C OMMENTS . A link is what connects two vertices; a link may have one, two, or
three lines (called, respectively, A 2, B2 and G 2 links; see Fig. 6.4). Rule 1 is needed
to prove Rules 2 and 3. A simple A 2 chain referred to in Rule 3 is a diagram (or
any part of it) in which each vertex is connected to the next by an A 2 link (i.e. a
single line). These rules will be examined in Problem 6.3. 
We will now apply these rules to find the allowed connected diagrams.
(1) A simple A 2 chain of any length (as in Fig. 6.6(a)) does not break any rules,
and so yields an allowed connected diagram.
(2) The basic G2 link shown in Fig. 6.4 cannot be part of any larger graph,
because any additional line leaving either vertex would violate Rule 2. The only
allowed configuration containing a G2 link is the G2 diagram itself:

●≡≡≡○

(3) By Rules 2 and 3 the graphs on the left-hand side of Fig. 6.5, all with four
lines leaving a vertex, are not allowed, and so neither are the corresponding ex-
panded graphs on the right-hand side, with a simple A 2 chain inserted, either as
stand-alone diagrams or as parts of larger diagrams. This means that a connected
allowed diagram cannot contain more than one B2 link (as in (a)), or more than
one branch (as in (b)), or simultaneously a B2 link and a branch (as in (c)).

[Figure 6.5 shows three pairs of graphs: (a) a chain carrying two double-line (B2)
links; (b) a chain with a branch point at each end; (c) a chain with one double link
and one branch point. In each pair, the left graph, which has four lines leaving one
vertex, is obtained from the right graph by shrinking its intermediate simple A2
chain to a point.]

Figure 6.5: Non-allowed configurations. Each graph in the left column is obtained by
shrinking to a point the intermediate one-line linked chain of the corresponding graph in
the right column.

(4) At this point, we have reduced the possible candidates to four basic con-
figurations shown in Fig. 6.6. An allowed Dynkin diagram must have one of the
following configurations:

(a) A chain of one-line links of any length,


(b) A triple line joining two vertices,
(c) One-line chain containing a single double-line insertion, and
(d) One-line chain with a one-line chain branching.


(a)  ○———○———○ ··· ○———○

(b)  ●≡≡≡○

(c)  α1    α2         αp−1   αp     βq    βq−1         β2    β1
     ○———○——— ··· ———○══════○———○——— ··· ———○———○

(d)  a one-line chain with vertices α1, α2, . . . , αp−1, joined at a central vertex δ
     to a second one-line chain with vertices βq−1, . . . , β2, β1 and to a one-line
     branch with vertices γ1, γ2, . . . , γr−1

Figure 6.6: Allowed general configurations. Diagrams (a) and (b) pass all possible tests;
but diagrams (c) and (d) are still subjected to further restrictions, as described in the text,
where we will make use of the vertex labels displayed here.

Configuration Fig. 6.6 (a) is subject to no other restrictions, and any number of
linked vertices is allowed, which can be checked by displaying the corresponding
non-singular Cartan matrix. Configuration Fig. 6.6 (b) is the only one allowed
with a G2 link, as already discussed.
Configurations Fig. 6.6 (c) and Fig. 6.6 (d) cannot contain any other double-line
or branching because of the restrictions discussed in (3) p. 196. Whether there are
further restrictions or not must be checked (e.g., by verifying the non-singularity
of the Cartan determinant, or the positivity of the inner product; see Problem 6.4).
It turns out, in fact (cf. Comments p. 198), that the following rules hold:
(i) From Fig. 6.6 (c): the configuration in Fig. 6.7 (c.1) is not allowed, but the
configurations in Fig. 6.7 (c.2) (a chain with an unlimited number of A2 links) and in
Fig. 6.7 (c.3) (a finite chain) are allowed.
(ii) From Fig. 6.6 (d): the configurations in Figs. 6.8 (d.1)–(d.2) are not allowed,
even as subdiagrams, but the configurations in Fig. 6.8 (d.3) (an unlimited chain) and
in Figs. 6.8 (d.4)–(d.6) (finite chains) are allowed.
In summary, an allowed indecomposable FSR must correspond to a connected
Dynkin diagram that is either a member of one of the three unlimited chains in
Figs. 6.6 (a), 6.7 (c.2), and 6.8 (d.3); or one of the five configurations in Figs. 6.6 (b),
198 CHAPTER 6. SIMPLE LIE ALGEBRAS: STRUCTURE

(c.1)   o----o====o----o----o      (not allowed)

(c.2)   o====o----o---- ··· ----o

(c.3)   o----o====o----o

Figure 6.7: Chains with one B2 link: (c.1) is not allowed; (c.2)–(c.3) are.

6.7 (c.3), and 6.8 (d.4)–(d.6). The systems that contain just simple A2 links (also
said to be simply laced) are shown in Figs. 6.6 (a) and 6.8 (d.3)–(d.6). The fact that for
every connected allowed Dynkin diagram there exists a corresponding FSR, and
hence a simple complex Lie algebra, will be proved by explicit construction, as
we will see in the next section.
EXAMPLE 13: Consider Fig. 6.7 (c.1) as an example, in which one assigns relative
weights 1, 1, 2, 2, 2 to the vertices in order from left to right. The matrix with
entries given by the Cartan integers calculated from the diagram is

      [  2  -1   0   0   0 ]
      [ -1   2  -2   0   0 ]
      [  0  -1   2  -1   0 ]
      [  0   0  -1   2  -1 ]
      [  0   0   0  -1   2 ]

Replacing the first row r1 by the combination r1 + 2(r2) + 3(r3) + 2(r4) + r5 produces
a row with all entries equal to zero. Hence the matrix is singular, and the associated
Dynkin diagram is not allowed.
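The row-reduction argument in this example is easy to confirm numerically. The following short Python sketch (illustrative only, assuming NumPy is available) reproduces the vanishing row combination and the vanishing determinant:

```python
import numpy as np

# Cartan-integer matrix of the forbidden chain in Fig. 6.7 (c.1),
# with relative vertex weights 1, 1, 2, 2, 2.
M = np.array([[ 2, -1,  0,  0,  0],
              [-1,  2, -2,  0,  0],
              [ 0, -1,  2, -1,  0],
              [ 0,  0, -1,  2, -1],
              [ 0,  0,  0, -1,  2]])

# The combination r1 + 2 r2 + 3 r3 + 2 r4 + r5 vanishes identically,
combo = M[0] + 2 * M[1] + 3 * M[2] + 2 * M[3] + M[4]

# so the matrix is singular and the diagram is not allowed.
det = round(np.linalg.det(M))
```

Running the sketch confirms both the zero row combination and det = 0.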
COMMENTS. We now sketch the arguments (found in [Ja] pp. 132–133), based on the
general properties of the simple roots, that lead to the results given in Figs. 6.7–6.8.
In Fig. 6.6 the nodes are labeled by the simple roots ui, all normed to one,
⟨ui, ui⟩ = 1, and spanning, in each case, a Euclidean space h∗0. With i ≠ j, one has
2⟨ui, uj⟩ = −1, −√2, −√3 for A2, B2, G2 links, respectively.
Consider first the case of Fig. 6.6 (c), and define φ = ∑_{i=1}^{p} i αi and
ψ = ∑_{j=1}^{q} j βj, with p, q ≥ 1, which have norms |φ|² = p(p + 1)/2 and
|ψ|² = q(q + 1)/2, and inner product ⟨φ, ψ⟩² = p²q²/2. By Schwarz's inequality
(|⟨φ, ψ⟩|² < |φ|²|ψ|² when φ ∦ ψ), we have 2pq < (p + 1)(q + 1), or equivalently
(p − 1)(q − 1) < 2. Hence the only possible positive integers p, q are: (i) p = 1 with
q arbitrary (or q = 1, p arbitrary), and (ii) p = 2, q = 2. They are represented
respectively by the diagrams in Fig. 6.7 (c.2) and (c.3), which are allowed. But
Fig. 6.7 (c.1) is not allowed, because it has p = 2, q = 3.
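The counting that follows from (p − 1)(q − 1) < 2 can be confirmed by brute-force enumeration; a small Python sketch (the bound Q_MAX is an arbitrary cutoff standing in for "arbitrary"):

```python
# Enumerate the pairs (p, q) of positive integers allowed by the Schwarz
# bound (p - 1)(q - 1) < 2; Q_MAX is an arbitrary finite search window.
Q_MAX = 10
allowed = [(p, q) for p in range(1, Q_MAX + 1)
                  for q in range(1, Q_MAX + 1)
                  if (p - 1) * (q - 1) < 2]
```

Every solution has p = 1, or q = 1, or (p, q) = (2, 2), exactly the cases (c.2) and (c.3).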

                  o
                  |
                  o
                  |
(d.1)   o----o----o----o----o                      (not allowed)

                                 o
                                 |
(d.2)   o----o----o----o----o----o----o----o       (not allowed)

                        o
                       /
(d.3)   o----o-- ··· --o
                       \
                        o

                  o
                  |
(d.4)   o----o----o----o----o

                       o
                       |
(d.5)   o----o----o----o----o----o

                            o
                            |
(d.6)   o----o----o----o----o----o----o

Figure 6.8: Configurations with one branching: (d.1)–(d.2) are not allowed; (d.3)–(d.6),
in which one of the branches has length one, are allowed.

For the case of Fig. 6.6 (d), define φ = ∑_{i=1}^{p−1} i αi, ψ = ∑_{j=1}^{q−1} j βj,
and χ = ∑_{k=1}^{r−1} k γk, where p, q, r ≥ 2. The vectors φ, ψ, χ lie in mutually
orthogonal subspaces of h∗0, which do not contain the (normed) simple root δ. This
root, on the other hand, is linked to αp−1, βq−1, and γr−1. If θ1, θ2, θ3 are the
angles between δ and the vectors φ, ψ, and χ, then, since δ also has a nonzero
component orthogonal to all three subspaces, we have the inequality relation
cos²θ1 + cos²θ2 + cos²θ3 < 1. Now, noting 2⟨ui, ui+1⟩ = −1 for A2 links, we get
simultaneous conditions on p, q, r:

cos²θ1 = ⟨φ, δ⟩²/⟨φ, φ⟩ = (p − 1)/2p,
cos²θ2 = ⟨ψ, δ⟩²/⟨ψ, ψ⟩ = (q − 1)/2q,
cos²θ3 = ⟨χ, δ⟩²/⟨χ, χ⟩ = (r − 1)/2r.
This leads to the restriction p −1 + q−1 + r−1 > 1. As p, q, r are interchangeable,
we may assume p ≥ q ≥ r (≥ 2). It follows that r = 2 is the only possible value
of r, and, finally, there remain four configurations: (i) r = q = 2, with p arbitrary,
and (ii) r = 2, q = 3, with p = 3, 4, 5. They are represented resp. by the diagrams
in Figs. 6.8 (d.3), (d.4), (d.5), and (d.6). The diagrams in Figs. 6.8 (d.1) and (d.2),
which have r = q = p = 3 and r = 2, q = 3, p = 6, are disallowed.
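The restriction p⁻¹ + q⁻¹ + r⁻¹ > 1 with p ≥ q ≥ r ≥ 2 can likewise be enumerated directly; the sketch below (exact rational arithmetic; P_MAX is an arbitrary cutoff standing in for "p arbitrary") recovers exactly the D-type family and the three exceptional cases:

```python
from fractions import Fraction

# Triples p >= q >= r >= 2 with 1/p + 1/q + 1/r > 1, in exact arithmetic.
P_MAX = 12
triples = [(p, q, r)
           for r in range(2, P_MAX + 1)
           for q in range(r, P_MAX + 1)
           for p in range(q, P_MAX + 1)
           if Fraction(1, p) + Fraction(1, q) + Fraction(1, r) > 1]

# Solutions split into the series q = r = 2 with p arbitrary (diagrams d.3)
# and the three exceptional triples (3,3,2), (4,3,2), (5,3,2) (d.4)-(d.6).
d_series = [t for t in triples if t[1:] == (2, 2)]
exceptional = [t for t in triples if t[1:] != (2, 2)]
```

The forbidden diagrams (d.1) and (d.2), with (p, q, r) = (3, 3, 3) and (6, 3, 2), fall outside the solution set.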

The above discussion leads us to conclude that an indecomposable allowed


fundamental system of roots (with its associated Dynkin diagram) must be one
of those listed in the following theorem:
Theorem 6.1 (Dynkin diagrams). The only indecomposable fundamental systems (and
associated connected Dynkin diagrams) are the four classical series A ` , B` , C` , and D` ,
plus the five exceptional systems G 2, F4, E6, E7, and E8:
Name    Dynkin diagram                          Rank

A`      o----o---- ··· ----o----o               ` = 1, 2, 3, . . .

B`      o----o---- ··· ----o====*               ` = 2, 3, 4, . . .

C`      *----*---- ··· ----*====o               ` = 3, 4, 5, . . .

                            o
                           /
D`      o----o---- ··· ---o                     ` = 4, 5, 6, . . .
                           \
                            o

G2      *≡≡≡≡o                                  ` = 2

F4      *----*====o----o                        ` = 4

                  o
                  |
E6      o----o----o----o----o                   ` = 6

                  o
                  |
E7      o----o----o----o----o----o              ` = 7

                  o
                  |
E8      o----o----o----o----o----o----o         ` = 8

(Notation: an open circle o marks a long root, or a root in a simply laced system; a
filled circle * marks a short root; single, double, and triple links are drawn as
----, ====, and ≡≡≡≡.)

The restrictions on the values of the parameter ` in the unlimited series are
intended to avoid double counting, or to remove non-allowed diagrams. A 1, B1,
and C1 correspond to the same single-vertex diagram. C2 and B2 are represented
by the same double-line two-point diagram. As for the D` series, each diagram
must end with a branching, and so D1 would be empty, while D2 is a non-allowed
disjoint two-point diagram, representing A 1 ⊕ A 1. Finally, D3 is identical to A 3
(a single-edge three-point chain). So we end up with a complete list of distinct
indecomposable allowed fundamental systems of roots and their associated connected
Dynkin diagrams.
The symmetries observed in some diagrams are manifestations of automorphisms
(self-equivalences) of the associated Lie algebras, with implications for their
representations. Thus, A` with ` ≥ 2 is symmetric under the reflection of nodes
α1+j ↔ α`−j. E6 has a similar symmetry. The system
D` is invariant if the two short branches are interchanged, and the diagram D4
has an even higher symmetry as it remains invariant under the full symmetric

group S3 acting on the three end points. In contrast, the diagrams B` , C` , and the
exceptional systems (apart from E6) do not have any such symmetry.
By assigning a weight to every vertex in the Dynkin diagrams, we can write
down the corresponding Cartan matrices, as discussed before. Since this corre-
spondence is not affected by multiplying all the roots of a given FSR by a constant,
it suffices to normalize one (short) root to 1 and assign appropriate weights to all
other vertices relative to that vertex. Thus, the vertices in A ` , D` , E6, E7, and E8
all carry the same weight. In B` , C` , and F4 the short roots have weights equal to
1 and the long roots have weights equal to 2 . G 2 is the only system with a long
root of weight 3 relative to its weight-one short root.

6.7 Classification of Lie Algebras


To each fundamental system of roots, or its associated Dynkin diagram, corre-
sponds a unique complex simple Lie algebra. Formally, we have
Theorem 6.2 (Simple Lie algebras). All complex simple Lie algebras are determined
by the fundamental systems of roots (or associated Dynkin diagrams) listed in Theorem
6.1, in accordance with the following table:

FSR Lie algebra Rank Dimension


A` a` = sl (` + 1, C ) ` = 1, 2, . . . `(` + 2)
B` b` = so(2` + 1, C ) ` = 2, 3, . . . `(2` + 1)
C` c` = sp(`, C ) ` = 3, 4, . . . `(2` + 1)
D` d` = so (2`, C ) ` = 4, 5, . . . `(2` − 1)
G2 g2 2 14
F4 f4 4 52
E6 e6 6 78
E7 e7 7 133
E8 e8 8 248

The algebras a` , b` , c` , and d` corresponding to the fundamental root systems


A `, B` , C` , and D` are the classical Lie algebras, whereas g2 , f4, e6 , e7 , and e8 are
the five exceptional Lie algebras. The above list is complete: it gives all the com-
plex simple Lie algebras that exist. We should recall (from Chapter 3, Sect. 3.4–5)
that: (i) o(n, C) = so(n, C); (ii) sl(n, C) ≅ su(n)^C; and (iii) sp(`, C) = sp(`)^C. The
dimension (size) of a simple Lie algebra is given by the sum of the rank and the
number of roots, dim g = ` + |∆|; for the classical series, it can also
The restrictions on the values of ` for the classical Lie algebras b` , c` , and d`
are intended, just as for the corresponding Dynkin diagrams B` , C` , D` , to avoid
double counting. The identification of the diagrams A 1 = B1 = C1 corresponds
to the isomorphisms sl (2, C ) ∼ = so (3, C ) ∼ = sp(1, C ); and C2 = B2 corresponds to
sp(2, C ) ∼= so ( 5, C ) . In the D ` series, D 2 consists of disjoint vertices corresponding
to the isomorphism so(4, C ) ∼ = sl ( 2, C ) ⊕ sl ( 2, C ), which is semisimple, but not
simple; and finally, D3 = A 3 reflects the isomorphism so (6, C ) ∼ = sl (4, C ).

Having now the list of all allowed root systems, we proceed to show there
exist corresponding simple Lie algebras by actually constructing them. We shall
adopt the following notations: the ei are the canonical basis vectors of the space V;
the ε i form the corresponding basis of the dual space V ∗ , such that ε i (ej ) = δij .
We also use ai, bi , . . . for real or
complex numbers. Every classical Lie algebra is a sub Lie algebra of the general
linear Lie algebra gl ( n, C ). The space of all (n × n)-matrices admits as a basis
the set of n2 matrices [ Eij ] (with i, j = 1, 2, . . . n), which are ( n × n )-matrices with
1 at the ij-position and 0’s everywhere else; so elements of the matrix [ Eij ] are
( Eij )rs = δir δjs . The [ Eij ]’s obey the product rule Eij Ekm = Eim δjk , or equivalently,

[ Eij , Ekm ] = Eim δjk − Ekj δim . (6.18)

Elements of a classical complex Lie algebra are matrices expressible in terms of


Eij over the complex field C and subject to conditions specific to that algebra.
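The product rule and the bracket (6.18) can be verified directly on the matrix units; a short Python sketch (illustrative, assuming NumPy):

```python
import numpy as np

# Check [E_ij, E_km] = delta_jk E_im - delta_im E_kj for all matrix units
# of a small example order.
n = 4

def E(i, j):
    """Matrix unit: 1 at position (i, j), zero elsewhere (0-based indices)."""
    m = np.zeros((n, n))
    m[i, j] = 1.0
    return m

ok = all(
    np.array_equal(E(i, j) @ E(k, m) - E(k, m) @ E(i, j),
                   (j == k) * E(i, m) - (i == m) * E(k, j))
    for i in range(n) for j in range(n)
    for k in range(n) for m in range(n))
```

All n⁴ index combinations satisfy the identity.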
1. Special linear algebra sl(` + 1, C).
sl(n, C) is a Lie algebra of rank ` = n − 1. Its elements are complex traceless
(n × n) matrices, which may be written as complex linear combinations of the
Eij’s, so that sl(n, C) = { ∑i,j bij Eij | ∑ni=1 bii = 0, bij ∈ C }.
The Cartan subalgebra is h = { H = ∑i ai Eii | ∑i ai = 0 }, i.e. the space of
all diagonal matrices H = diag[a1, a2, . . . , an] on Cn subject to ∑i ai = 0. In
particular, h admits a basis consisting of the hi = Eii − Ei+1,i+1 with 1 ≤ i ≤ `. Its
dual is h∗ = C{ ε 1, ε 2, . . . , ε n } (mod ∑n1 ε i), where ε i(ej) = δij; Eii sends ei to itself
and kills ej if j ≠ i. From the commutation relation (6.18) among the Eij’s we have

[ H, Eij ] = ( ai − a j ) Eij ( i 6 = j ). (6.19)

So, the `(` + 1) matrices Eij with i ≠ j are the root vectors, with roots relative to
H given by the linear functions αij(H) = ai − aj (or αij = ε i − ε j). A basis for the
Lie algebra sl(n, C) = h ⊕ (⊕α∈∆ gα) may then consist of h1, . . . , h` and the Eij
with i ≠ j.

The real CSA h0 is the restriction of h to R. An ordering in h∗0 can be assigned


through some arbitrarily chosen H0 ∈ h with a1 > a2 > · · · > an . The 1/2 `(` + 1)
positive roots are αij with 1 ≤ i < j ≤ n. In particular, the roots α1 = α12,
α2 = α23, . . . , α` = α`,`+1 are special in that every root αij with 1 ≤ i < j ≤ n may
be written αij = αi + αi+1 + · · · + α j−1. Every root αij has the form ∑µ mµ αµ where
the mµ ’s are either all non-negative, or all non-positive integers. In particular,
∑µ αµ = ε 1 − ε `+1 is the maximal root. So the set Σ = { αµ | µ ∈ [1, `]} is the FSR
for sl (` + 1, C ) relative to CSA h, and may serve as a basis in h∗0 .
The Killing form of any two H, H′ ∈ h can be calculated via (6.4):

(H : H′) = ∑i,j αij(H) αij(H′) = ∑i,j (ai − aj)(a′i − a′j) = 2n ∑ni=1 ai a′i ,     (6.20)

where we have used ∑ni ai = 0 and ∑ni a′i = 0. Note (i) (H : H′) = 2n Tr(H H′);
and (ii) the Pythagorean form of the norm (H : H) = 2n ∑i ai².

The fundamental-root generator Hαµ corresponding to the simple root αµ ∈ Σ
is defined as usual via the Killing form: (Hαµ : H) = αµ(H). Given the simple
root αµ(H) = aµ − aµ+1 and the Killing form (Hαµ : H) = 2n ∑ni ai ci, where the ci
are the coordinates of Hαµ ∈ h0, we have a system of equations for the unknowns
ci (namely, aµ − aµ+1 = 2n ∑i ai ci), which yields cµ = −cµ+1 = 1/2n and ci = 0
for i ≠ µ, µ + 1. Hence

Hαµ = (1/2n)(Eµµ − Eµ+1,µ+1) = (1/2n)(eµ − eµ+1) .     (6.21)
The inner product ⟨ , ⟩ on h∗0 can be calculated for any pair of simple roots
αµ, αν ∈ Σ with the help of αν = ε ν − ε ν+1 and ε i(ej) = δij. We obtain

⟨αµ, αν⟩ = αν(Hαµ) = (1/2n)(2δµν − δµ,ν+1 − δµ,ν−1)  for µ, ν = 1, . . . , `.   (6.22)

Normalizing Hαµ such that hµ ≡ h̃αµ = 2Hαµ/⟨αµ, αµ⟩, we get hµ = eµ − eµ+1,
and hence the Cartan integers aµν = αν(hµ), which are aµµ = 2, aµν = −1 if
µ = ν ± 1, and aµν = 0 otherwise. They are the entries to the Cartan matrix:

                       [  2  -1   0              ]
                       [ -1   2  -1              ]
                       [  0  -1   2              ]
A[sl(` + 1, C)]  =     [          ...            ]     (6.23)
                       [           2  -1   0     ]
                       [          -1   2  -1     ]
                       [           0  -1   2     ]

It is represented by a Dynkin diagram (with the same weight for all vertices),
which turns out to be identical to the diagram A` given in Theorem 6.1:

A`     o----o----o-- ··· --o----o
       α1   α2   α3        α`−1 α`

The structure of the algebra can be completely determined from the ` subalgebras
sµ = ⟨hµ, eµ, fµ⟩, where hµ = eµ − eµ+1, eµ = Eµ,µ+1, and fµ = Eµ+1,µ
for µ = 1, 2, . . . , `, associated with the simple roots αµ = ε µ − ε µ+1 composing the
fundamental system Σ = { αµ; µ = 1, 2, . . . , ` }. In conclusion, the complex matrix
Lie algebra sl(` + 1, C) is equivalent to the complex simple Lie algebra, now called a`,
represented by the Dynkin diagram A`.
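Identity (6.20) can be spot-checked numerically on random traceless diagonal matrices; a Python sketch (illustrative, assuming NumPy):

```python
import numpy as np

# Check Eq. (6.20): for traceless diagonal H, H' in sl(n, C), the sum of
# alpha_ij(H) alpha_ij(H') over all roots equals 2n Tr(H H').
rng = np.random.default_rng(0)
n = 5

a = rng.normal(size=n); a -= a.mean()   # coordinates of H,  with sum 0
b = rng.normal(size=n); b -= b.mean()   # coordinates of H', with sum 0

killing = sum((a[i] - a[j]) * (b[i] - b[j])
              for i in range(n) for j in range(n) if i != j)
trace_form = 2 * n * float(a @ b)       # 2n Tr(H H') for diagonal H, H'
```

The two quantities agree to numerical precision.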
2. Symplectic Lie algebra sp(`, C).
The matrix Lie algebra sp(`, C) consists of complex matrices M of order n = 2`
that satisfy the condition MT J + J M = 0, where M and J are similarly partitioned
into (` × `) blocks, as follows:

        [ M1  M2 ]               [  0   I` ]
  M  =  [ M3  M4 ]   and   J  =  [ -I`  0  ] ,

where I` is the identity matrix of order `. The condition on M then becomes

M2 = M2T , M3 = M3T , M4 = − M1T . (6.24)

Accordingly, one may choose for sp(`, C) a basis consisting of the following
`(2` + 1) real matrices (with i, j = 1, 2, . . . , `):

E1ij = Eij − Ej+`,i+` ,
E2ij = Ei,j+` + Ej,i+` ,   i ≤ j,
E3ij = Ei+`,j + Ej+`,i ,   i ≤ j.

For the CSA h, one chooses as basis the set of ` diagonal matrices

ei = Eii1 = Eii − Ei+`,i+` , i = 1, 2, . . . , `. (6.25)

Then, for an arbitrary element H = ∑i ai ei = diag[a1, . . . , a`, −a1, . . . , −a`] of h (a
diagonal matrix of order n = 2`), one obtains with the help of (6.18):

[H, E1ij] = (ai − aj) E1ij ,    i ≠ j    (6.26)
[H, E2ij] = (ai + aj) E2ij ,    i ≤ j    (6.27)
[H, E3ij] = −(ai + aj) E3ij ,   i ≤ j    (6.28)

This says that E1ij (i ≠ j), E2ij (i ≤ j), and E3ij (i ≤ j) are the root vectors of sp(`, C)
with respect to h, corresponding to the root functions

α1ij(H) = ai − aj ,   α2ij(H) = ai + aj ,   α3ij(H) = −(ai + aj),    (6.29)

with the indicated restrictions on the ranges of the indices. They define the set ∆.
Restriction ai ∈ R defines the set ∆R ⊂ h∗0 . Once some ordering in h∗0 is chosen,
we may specify the fundamental root system Σ and the set ∆+ of positive roots:

Σ: α i = ε i − ε i+ 1 ( i < `); α` = 2ε ` ;
∆+ : εi − ε j , εi + ε j ( i < j ≤ `); 2ε i (1 ≤ i ≤ `).

The maximal root is 2(α1 + · · · + α`−1) + α` = 2ε 1. Restricted to h0, the Killing
form for any pair H, H′ ∈ h0, again, has a Pythagorean form:

(H : H′) = ∑α∈∆ α(H) α(H′)
         = ∑i≠j (ai − aj)(a′i − a′j) + 2 ∑i≤j (ai + aj)(a′i + a′j)
         = 4(` + 1) ∑`i=1 ai a′i = (n + 2) Tr(H H′) .     (6.30)

The fundamental root generator Hαi, defined by (Hαi : H) = αi(H), can be
calculated by taking the linear functions αi(H) in Σ for the RHS and the bilinear
form (Hαi : H) = 4(` + 1) ∑k ak ck for the LHS, and solving the resulting equations
for the ci, to obtain

Hαi = (ei − ei+1)/[4(` + 1)] ,   (i < `)     (6.31)
Hα` = e`/[2(` + 1)] .     (6.32)

The inner products on h∗0 for the simple roots can now be calculated, with
⟨αi, αj⟩ = αi(Hαj), leading to the results

⟨αi, αj⟩ = (2δij − δi,j+1 − δi,j−1)/[4(` + 1)] ,   (i, j < `)     (6.33)
⟨αi, α`⟩ = −δi,`−1/[2(` + 1)] ,     (6.34)
⟨α`, α`⟩ = 1/(` + 1) .     (6.35)

We see that α` is the long root, since |α`|² = 2|αi|² = 1/(` + 1) for every i < `.
Then follow h̃αi = 2Hαi/⟨αi, αi⟩ = ei − ei+1 (with i < `) and h̃α` = e`, and the
Cartan integers aij = 2⟨αi, αj⟩/⟨αi, αi⟩, with the off-diagonal entries given by

aij = −δi,j+1 − δi,j−1 ,   (i, j < `);
ai` = −2δi,`−1 ,   a`i = −δi,`−1 .

These integers are the entries of the Cartan matrix:


                       [  2  -1   0              ]
                       [ -1   2  -1              ]
                       [  0  -1   2              ]
A[sp(`, C)]  =         [          ...            ]     (6.36)
                       [           2  -1   0     ]
                       [          -1   2  -2     ]
                       [           0  -1   2     ]

which just shows that sp(`, C) is the Lie algebra c` associated with the Dynkin diagram

        1    1         1     1     2
C`      *----*-- ··· --*-----*=====o
        α1   α2        α`−2  α`−1  α`
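The Cartan matrix (6.36) can be recomputed from the simple roots in ε-coordinates; since the Cartan integers are insensitive to the overall normalization of the inner product, the ordinary Euclidean product suffices. A Python sketch for the concrete case ` = 4 (illustrative, assuming NumPy):

```python
import numpy as np

# Cartan integers a_ij = 2 <alpha_i, alpha_j> / <alpha_i, alpha_i> for the
# simple roots of sp(l, C): alpha_i = e_i - e_{i+1} (i < l), alpha_l = 2 e_l.
l = 4
roots = []
for i in range(l - 1):
    v = np.zeros(l); v[i], v[i + 1] = 1.0, -1.0
    roots.append(v)
long_root = np.zeros(l); long_root[-1] = 2.0
roots.append(long_root)

cartan = [[round(2 * float(ri @ rj) / float(ri @ ri)) for rj in roots]
          for ri in roots]
```

The result reproduces the asymmetric corner a_{`−1,`} = −2, a_{`,`−1} = −1 of (6.36).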

3. Orthogonal Lie algebras.


In general, an orthogonal Lie algebra so( n, C ), of dimension n ( n − 1) /2, con-
sists of all skew-symmetric matrices, MT = − M, of order n over C. The difficulty
in finding its structure lies in the fact that the diagonal elements of M all vanish
in the defining base vector space V = Cn , which makes the construction of a CSA

along the lines we have followed problematic. To avoid this difficulty, we go over
to a different basis for V via a unitary transformation, giving us a new set of ma-
trices obeying a different condition, but the algebra they represent is equivalent
in every way to the original one.
Given M ∈ so ( n, C ), let U be a unitary matrix on Cn , and define N = U † MU.
Then the condition on M, namely MT + M = 0, becomes for N:

N T U T U + U T U N = 0. (6.37)

(Note the similarity with MT J + J M = 0 of the symplectic algebra.) To get an


explicit expression for U, we separate the cases of even and odd n.
Even-order orthogonal Lie algebra so(2`, C). When n = 2` is even, we choose

         1   [ iI`   -iI` ]              [ 0   I` ]
  U  =  ---  [ -I`   -I`  ] ,    UTU  =  [ I`  0  ] ,     (6.38)
        √2

where I` is the identity matrix of order `. With N partitioned in the same way,

        [ N1  N2 ]
  N  =  [ N3  N4 ] ,     (6.39)

the condition (6.37) tells us that the diagonal blocks are the negative transposes
of each other (N4T = −N1), and that the off-diagonal blocks are skew-symmetric
(N2T = −N2, N3T = −N3). There appear similarities and differences with the
symplectic case already seen, which we can exploit to define an appropriate basis
consisting of the following `(2` − 1) matrices (with i, j = 1, 2, . . . , `):

E1ij = Eij − Ej+`,i+` ,   i ≠ j,
E2ij = Ei,j+` − Ej,i+` ,   i < j,
E3ij = Ei+`,j − Ej+`,i ,   i < j,
ei = Eii − Ei+`,i+` ,   i = 1, 2, . . . , `.

(Note that while N and the E’s are matrices of order 2`, the labels of E1, E2, E3, and
e run only from 1 to `.) The diagonal matrices ei (with i = 1, 2, . . . , `) are chosen
as a basis for the Cartan subalgebra of the algebra, h = { H | H = ∑i ci ei }, and of
its restriction to the real field, h0 = { ∑i ai ei | ai ∈ R }. Relative to this CSA, we
have the root vectors and corresponding roots:

E1ij :   α1ij = ε i − ε j ,    i ≠ j,
E2ij :   α2ij = ε i + ε j ,    i < j,
E3ij :   α3ij = −(ε i + ε j),   i < j.

(Note that ±2ε i are not roots.) Altogether the α1ij, α2ij, α3ij form the set of roots ∆,
with its restriction to reals denoted by ∆R ⊂ h∗0 . The fundamental system Σ and the

set of positive roots ∆+ are

Σ: α i = ε i − ε i+ 1 ( i < `), α` = ε `−1 + ε ` ;


∆+ : εi − ε j , εi + ε j, ( i < j ).

The maximal root is α1 + 2( α2 + · · · + α`−2 ) + α`−1 + α` = ε 1 + ε 2.


The calculation of the Killing form on h proceeds exactly as in the previous
cases, leading us to

(H : H′) = ∑`i,j=1 [(ai − aj)(a′i − a′j) + (ai + aj)(a′i + a′j)] − ∑`i=1 4ai a′i
         = 4(` − 1) ∑`i=1 ai a′i
         = (n − 2) Tr(H H′) .

The fundamental-root generators Hαi = ∑k ck ek can be found from

4(` − 1) ∑k ak ck = αi(H) = { ai − ai+1 ,  1 ≤ i ≤ ` − 1 ;   a`−1 + a` ,  i = ` },

leading us to the results

Hαi = (ei − ei+1)/[4(` − 1)] ,   (i < `),     (6.40)
Hα` = (e`−1 + e`)/[4(` − 1)] .     (6.41)
From these elements on h0, we readily obtain the inner products on h∗0:

⟨αi, αj⟩ = (2δij − δi,j+1 − δi,j−1)/[4(` − 1)] ,   (i, j < `)     (6.42)
⟨α`−2, α`⟩ = −1/[4(` − 1)] ,   ⟨α`−1, α`⟩ = 0,     (6.43)
⟨α`, α`⟩ = 2/[4(` − 1)] .     (6.44)

Note that all simple roots have the same length, |αi|² = 1/[2(` − 1)], and the
fundamental-coroot generators are h̃αi = ei − ei+1 (i < `) and h̃α` = e`−1 + e`.
So, the structure of so(2`, C) can be encoded in the Cartan matrix

                       [  2  -1   0                ]
                       [ -1   2  -1                ]
                       [  0  -1   2                ]
A[so(2`, C)]  =        [          ...              ]     (6.45)
                       [           2  -1   0   0   ]
                       [          -1   2  -1  -1   ]
                       [           0  -1   2   0   ]
                       [           0  -1   0   2   ]

In conclusion, the even-order orthogonal Lie algebra so(2`, C) is equivalent to the com-
plex simple Lie algebra, now called d`, associated with the Dynkin diagram D`:

                              o α`−1
                             /
D`      o----o-- ··· --o----o
        α1   α2       α`−3   \
                      α`−2    o α`

EXAMPLE 14: For ` = 1, i.e. so(2, C), this formalism gives us

        [ a   0 ]             1   [  i  -i ]
  N  =  [ 0  -a ] ,   U  =   ---  [ -1  -1 ] ,
                             √2

where a ∈ C, which leads to

                      [ 0  -1 ]
  M  =  U N U†  =  ia [ 1   0 ] .

(As already noted, so(2, C) ≅ C is not a semisimple Lie algebra.)
Odd-order orthogonal Lie algebra so(2` + 1, C). Now, the matrices N of order
n = 2` + 1 are partitioned into submatrices in the following way:

        [ B    C1   C2 ]
  N  =  [ D1   N1   N2 ] ,     (6.46)
        [ D2   N3   N4 ]

where B is a complex number, the Ci are 1 × ` matrices, the Di are ` × 1 matrices,
and the Ni are ` × ` matrices, all with complex entries. Taking into account the extra
dimension in this case, we choose for the unitary transformation the matrix U:

         1   [ √2    0     0   ]            [ 1   0   0  ]
  U  =  ---  [ 0    iI`  -iI`  ] ,   UTU =  [ 0   0   I` ] ,     (6.47)
        √2   [ 0   -I`   -I`   ]            [ 0   I`  0  ]

The condition (6.37) is now re-expressed as: B = 0, D1 = −C2T, and D2 = −C1T,
together with the (unchanged) conditions on the submatrices Ni: N2T = −N2,
N3T = −N3, and N4T = −N1.
In analogy with the even-order case, we can choose the diagonal matrices

ei = Ei+1,i+1 − Ei+`+1,i+`+1 ,   i = 1, 2, . . . , `,     (6.48)

to form a basis for the CSA h (with dual basis ε i). Any vector H ∈ h may be
written as H = ∑i ai ei = diag[0, a1, . . . , a`, −a1, . . . , −a`]. Relative to this CSA,
the root elements and corresponding roots are

Ei+1,j+1 − Ej+`+1,i+`+1 :   ε i − ε j ,     i ≠ j,
Ei+1,j+`+1 − Ej+1,i+`+1 :   ε i + ε j ,     i < j,
Ei+`+1,j+1 − Ej+`+1,i+1 :   −(ε i + ε j),   i < j,
E1,i+1 − Ei+`+1,1 :         −ε i ,          i = 1, 2, . . . , `,
Ei+1,1 − E1,i+`+1 :         ε i ,           i = 1, 2, . . . , `.
The set of positive roots ∆+ and the fundamental system Σ are
∆+ : εi , εi ± ε j ( i < j );
Σ: αi = ε i − ε i+1 ( i < `); α` = ε` .

The maximal root is α1 + 2(α2 + · · · + α`) = ε 1 + ε 2. Given these data, we now
can calculate the Killing form:

(H : H′) = 4(` − 1) ∑i ai a′i + 2 ∑i ai a′i = 2(2` − 1) ∑`i=1 ai a′i .

Here, as in all other classical algebras, we have a Pythagorean norm: ∑i ai². Also,
note that (H : H′) = (n − 2) Tr(H H′), exactly the same formula as for even order.
The root generators corresponding to the simple roots are

Hαi = (ei − ei+1)/[2(2` − 1)] ,   (i < `)     (6.49)
Hα` = e`/[2(2` − 1)] .     (6.50)

The inner products on h∗0 can then be calculated as before, leading to the results:

⟨αi, αj⟩ = (2δij − δi,j+1 − δi,j−1)/[2(2` − 1)] ,   (i, j ≤ ` − 1)     (6.51)
⟨αi, α`⟩ = −δi,`−1/[2(2` − 1)] ,   (i < `)     (6.52)
⟨α`, α`⟩ = 1/[2(2` − 1)] .     (6.53)

So α` is the short root, |α`|² = (1/2)|αi|² for every i < `; and the fundamental
co-root generators are h̃αi = ei − ei+1 (with i < `) and h̃α` = 2e`.
The Cartan matrix and associated Dynkin diagram for so(2` + 1, C) are:

                       [  2  -1   0                ]
                       [ -1   2  -1                ]
                       [  0  -1   2                ]
A[so(2` + 1, C)]  =    [          ...              ]     (6.54)
                       [           2  -1   0   0   ]
                       [          -1   2  -1   0   ]
                       [           0  -1   2  -1   ]
                       [           0   0  -2   2   ]

        2    2         2     2     1
B`      o----o-- ··· --o-----o=====*
        α1   α2        α`−2  α`−1  α`

In conclusion, the odd-order orthogonal Lie algebra so(2` + 1, C ) is the same as the com-
plex simple Lie algebra b` associated with the Dynkin diagram B` .
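The positive-root counts underlying the B, C, and D series, and the dimension formula dim g = ` + |∆| = ` + 2|∆+|, can be tabulated in a few lines (an illustrative sketch):

```python
# Positive roots of the B, C, D series in epsilon coordinates:
# e_i - e_j and e_i + e_j (i < j) are common to all three; B adds the
# short roots e_i, C adds the long roots 2 e_i, D adds nothing.
def n_pos(series, l):
    common = l * (l - 1)                       # both families of pairs
    extra = {'B': l, 'C': l, 'D': 0}[series]
    return common + extra

def dim(series, l):
    return l + 2 * n_pos(series, l)            # rank + number of roots
```

This reproduces dim b` = dim c` = `(2` + 1) and dim d` = `(2` − 1), and (for Problem 6.1) gives b2 its 4 positive roots.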
C OMMENTS . Once the sets ∆ and Σ of a Lie algebra g are known, one can fol-
low the steps described in Example 9 to find the Cartan integers, bypassing the
computation of inner products. To illustrate, we consider two examples. In the
case of A ` , we see that αi + αi+1 (1 ≤ i ≤ ` − 1) are roots (being equal to αi,i+2 ),

but αi + 2αi+1 are not, and neither are αi + α j for j ≥ i + 2. This means that for
two adjacent roots in Σ we have q = 0 and p = 1 (as in P10-P11) and the Cartan
integer ai,i+1 is − 1, and that the non-adjacent fundamental roots are orthogonal
to each other and the corresponding Cartan integers are 0. As a second exam-
ple, consider C` : For i = 1, 2, . . . , ` − 1, αi + αi+1 are roots, but αi + 2αi+1 are not;
so one has q = 0, p = 1, which implies ai+1,i = − 1. On the other hand, both
α` + α`−1 = ε ` + ε `−1 and α` + 2α`−1 = 2ε ` are roots, but not α` + 3α`−1, hence
a`−1,` = − 2. 
4. Exceptional Lie algebra g2 .
Let ei be the standard basis in C3 or R 3 , and ε i the corresponding dual basis.
Then { H = ∑3i aiei | a1 + a2 + a3 = 0} defines a two-dimensional CSA h (if ai
are complex) or h0 (if ai are real). The dual basis h∗ is similarly subject to the
restriction ε 1 + ε 2 + ε 3 = 0. The root system ∆ consists of twelve elements, namely,
± ε i and ±( εi − ε j), with i, j = 1, 2, 3; and so we have the dimension of the algebra:
dimg2 = dimh + | ∆ | = 14. The subset of positive roots ∆+ consists of ε 1, ε 2, − ε 3,
ε 1 − ε 2, ε 1 − ε 3, and ε 2 − ε 3.
We choose the simple roots to be α1 = ε 2 and α2 = ε 1 − ε 2, so that all roots
may be written in the form n1 α1 + n2 α2, where n1, n2 are both non-negative or
non-positive integers. In particular, the maximal root is ε 1 − ε 3 = 3α1 + 2α2.
The Killing form is given by (H : H′) = 8 ∑3i=1 ai a′i, where H, H′ ∈ h0. From
this, one obtains the root generators Hα defined by (Hα : H) = α(H), and the
inner product ⟨ , ⟩ on h∗0, as follows.
Fundamental root generators: Hα1 = −(1/24)e1 + (1/12)e2 − (1/24)e3 and
Hα2 = (1/8)e1 − (1/8)e2.
Inner products in h∗0: ⟨α1, α1⟩ = α1(Hα1) = 1/12, ⟨α2, α2⟩ = α2(Hα2) = 3/12,
⟨α1, α2⟩ = α1(Hα2) = −1/8. Thus, α2 is the long root: |α2|² = 3|α1|².
Fundamental co-root generators:
h1 = −e1 + 2e2 − e3 and h2 = e1 − e2, where hi = h̃αi = 2|αi|^−2 Hαi.
Cartan integers: a12 = α2(h̃α1) = −3 and a21 = α1(h̃α2) = −1.
Cartan matrix and associated Dynkin diagram:

            [  2  -3 ]             1        3
A[g2]  =    [ -1   2 ]             *≡≡≡≡≡≡≡≡o
                                   α1       α2
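The g2 data can be recomputed with exact rational arithmetic, representing each ε i by the projection of a unit vector onto the plane ε 1 + ε 2 + ε 3 = 0 (an illustrative sketch; only ratios of inner products matter, so any overall scale works):

```python
from fractions import Fraction as F

def eps(i):
    """Projection of the i-th unit vector (0-based) onto sum(eps) = 0."""
    return tuple(F(1) - F(1, 3) if k == i else -F(1, 3) for k in range(3))

def add(u, v, s=1):
    return tuple(x + s * y for x, y in zip(u, v))

def scale(u, c):
    return tuple(c * x for x in u)

def ip(u, v):
    return sum(x * y for x, y in zip(u, v))

a1 = eps(1)                    # alpha_1 = eps_2
a2 = add(eps(0), eps(1), -1)   # alpha_2 = eps_1 - eps_2

a12 = 2 * ip(a1, a2) / ip(a1, a1)   # Cartan integer a_12
a21 = 2 * ip(a2, a1) / ip(a2, a2)   # Cartan integer a_21
ratio = ip(a2, a2) / ip(a1, a1)     # |alpha_2|^2 / |alpha_1|^2

# Maximal root 3 alpha_1 + 2 alpha_2, claimed equal to eps_1 - eps_3.
maximal = add(scale(a1, 3), scale(a2, 2))
```

The sketch recovers a12 = −3, a21 = −1, the length ratio 3, and the maximal root ε 1 − ε 3.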

5. Exceptional Lie algebra f4 .


CSA: h = C4 and h0 = R 4 . The roots of f4 are given by:

Σ: α1 = 1/2( ε 1 − ε 2 − ε 3 − ε 4), α2 = ε 4, α3 = ε 3 − ε 4, α4 = ε 2 − ε 3.
∆+ : ε i, ε i + ε j, ε i − ε j (for i < j ), 1/2( ε 1 ± ε 2 ± ε 3 ± ε 4) .

There are 24 positive roots, of which ε 1 + ε 2 = 2α1 + 4α2 + 3α3 + 2α4 is the maxi-
mal root. Killing form on h0: ( H : H 0 ) = 18 ∑i ai a0i .

Fundamental root generators: Hα1 = (1/36)(e1 − e2 − e3 − e4), Hα2 = (1/18)e4,
Hα3 = (1/18)(e3 − e4), Hα4 = (1/18)(e2 − e3).
Inner products on h∗0: ⟨α1, α1⟩ = ⟨α2, α2⟩ = 1/18, ⟨α3, α3⟩ = ⟨α4, α4⟩ = 1/9,
⟨α1, α2⟩ = −1/36, ⟨α2, α3⟩ = ⟨α3, α4⟩ = −1/18, showing that α3 and α4 are the
long roots.
Fundamental co-roots:
h̃α1 = e1 − e2 − e3 − e4 , h̃α2 = 2e4 , h̃α3 = e3 − e4 , h̃α4 = e2 − e3 .
The Cartan integers, obtained from the formula aij = α j ( h̃αi ), are displayed in
the matrix form and represented by a Dynkin diagram:

            [  2  -1   0   0 ]            1    1    2    2
A[f4]  =    [ -1   2  -2   0 ]            *----*====o----o
            [  0  -1   2  -1 ]            α1   α2   α3   α4
            [  0   0  -1   2 ]
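The f4 Cartan matrix and the maximal-root expansion can be checked the same way from the simple roots listed above (an illustrative sketch in exact arithmetic, using the ordinary Euclidean product on R⁴, which suffices for Cartan integers):

```python
from fractions import Fraction as F

# Simple roots of f4 in the epsilon basis of R^4, as listed in the text.
a1 = (F(1, 2), F(-1, 2), F(-1, 2), F(-1, 2))
a2 = (0, 0, 0, 1)
a3 = (0, 0, 1, -1)
a4 = (0, 1, -1, 0)
simple = [a1, a2, a3, a4]

ip = lambda u, v: sum(x * y for x, y in zip(u, v))
cartan = [[int(2 * ip(u, v) / ip(u, u)) for v in simple] for u in simple]

# Maximal root 2 a1 + 4 a2 + 3 a3 + 2 a4, claimed equal to eps_1 + eps_2.
coeffs = (2, 4, 3, 2)
maximal = tuple(sum(c * r[k] for c, r in zip(coeffs, simple))
                for k in range(4))
```

The computation reproduces the matrix A[f4] above and the maximal root ε 1 + ε 2.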

6. Exceptional Lie algebra e6 .


The three exceptional algebras e6 , e7 , and e8 have roots of equal norms and
correspond to Dynkin diagrams with one branch point. Explicit expressions for
their data depend on where the branch leaves the trunk. There are two possible
choices, related by permutations of the simple-root labeling and rearrangement of
rows and columns of the Cartan matrices. Our choice has the advantage that the
three algebras are simply related, so that e7 and e6 can be considered reductions
of e8 with the last one vertex or the last two vertices removed.
The main properties of e6 are as follows.
CSA: h = C6, h0 = R6. Elements: H = ∑6i=1 ai ei.
Σ (` = 6): α1 = (1/2)(ε 1 − ε 2 − ε 3 − ε 4 − ε 5) + (√3/2)ε 6, α2 = ε 1 + ε 2,
α3 = ε 2 − ε 1, α4 = ε 3 − ε 2, α5 = ε 4 − ε 3, α6 = ε 5 − ε 4.
∆+ (dim = 36): (1/2)(±ε 1 ± ε 2 ± ε 3 ± ε 4 ± ε 5 + √3 ε 6) (with an even number of
minus signs), { ε i + ε j }i<j≤5, { ε i − ε j }j<i≤5.
Killing form on h: (H : H′) = 24 ∑6i=1 ai a′i.
Fundamental root generators: Hα1 = (1/48)(e1 − e2 − e3 − e4 − e5) + (√3/48)e6,
Hα2 = (1/24)(e1 + e2), Hαi = (1/24)(ei−1 − ei−2) with i = 3, 4, 5, 6.
Inner products on h∗0 can be calculated from ⟨αi, αj⟩ = αj(Hαi), with the known
functions αi(H) and Hαi. The results are:
⟨αi, αi⟩ = 1/12 (i = 1, 2, . . . , 6);
⟨αi, αj⟩ = −1/24 for (i, j) = (1, 3), (2, 4), (3, 4), (4, 5), (5, 6).
Fundamental coroot generators: h̃α1 = (1/2)(e1 − e2 − e3 − e4 − e5) + (√3/2)e6,
h̃α2 = e1 + e2, h̃αi = ei−1 − ei−2 for i = 3, 4, 5, 6.
The Cartan integers can be calculated either from the known inner products
of the roots on h∗0, or from the known functions αj(h̃αi), yielding aii = 2
(i = 1, 2, . . . , 6) and aij = aji = −1 for (i, j) = (1, 3), (2, 4), (3, 4), (4, 5), (5, 6).
The results are shown in the form of a Cartan matrix, with its associated Dynkin
diagram E6 in Fig. 6.9.

 
2 0 −1 0 0 0

 0 2 0 −1 0 0 

 −1 0 2 −1 0 0 
A [ e6 ] =  

 0 −1 −1 2 −1 0 

 0 0 0 −1 2 −1 
0 0 0 0 −1 2

               1
               o α2
               |
     1    1    1    1    1
E6:  o----o----o----o----o
     α1   α3   α4   α5   α6

               1
               o α2
               |
     1    1    1    1    1    1
E7:  o----o----o----o----o----o
     α1   α3   α4   α5   α6   α7

               1
               o α2
               |
     1    1    1    1    1    1    1
E8:  o----o----o----o----o----o----o
     α1   α3   α4   α5   α6   α7   α8

Figure 6.9: Dynkin diagrams for the systems E6 , E7 , and E8 . Note that, with this choice of
basis (where the branching occurs at the third node), omitting the last node of E8 produces
E7 , and omitting the last node of E7 produces E6 .

7. Exceptional Lie algebra e7.
CSA: h = C7, h0 = R7. Elements: H = ∑7i=1 ai ei.
Σ (` = 7): α1 = (1/2)(ε 1 − ε 2 − · · · − ε 6) + (1/√2)ε 7, α2 = ε 1 + ε 2,
αi = ε i−1 − ε i−2 (i = 3, 4, . . . , 7).
∆+ (dimension = 63): (1/2)(±ε 1 ± ε 2 ± · · · ± ε 6 + √2 ε 7) (odd number of minus
signs), ε i + ε j (i < j ≤ 6); ε i − ε j (j < i ≤ 6); √2 ε 7.
Killing form on h0: (H : H′) = 36 ∑7i=1 ai a′i.
Inner products on h∗0: ⟨αi, αi⟩ = 1/18 (i = 1, 2, . . . , 7); ⟨αi, αj⟩ = −1/36 for
(i, j) = (1, 3), (2, 4), (3, 4), (4, 5), (5, 6), (6, 7).
Fundamental coroot generators: h̃α1 = (1/2)(e1 − e2 − · · · − e6) + (1/√2)e7,
h̃α2 = e1 + e2, h̃αi = ei−1 − ei−2 with i = 3, 4, 5, 6, 7.

The Lie algebra e7 we have constructed is represented by the Dynkin diagram


E7 in Fig. 6.9 with the Cartan matrix obtained from the matrix in (6.55) by omitting
its last column and last row.
8. Exceptional Lie algebra e8.
CSA: h = { H = ∑8i=1 ai ei | ai ∈ C }. For h0, ai ∈ R.
Σ (` = 8): α1 = (1/2)(ε 1 − ε 2 − · · · − ε 7 + ε 8), α2 = ε 1 + ε 2,
αi = ε i−1 − ε i−2 (i = 3, 4, . . . , 8).
∆+ (dim = 120): (1/2)(±ε 1 ± ε 2 ± · · · ± ε 7 + ε 8) (even number of minus signs),
ε i + ε j (i < j ≤ 8); ε i − ε j (j < i ≤ 8).
Killing form on h0: (H : H′) = 60 ∑8i=1 ai a′i.
Inner products on h∗0: ⟨αi, αi⟩ = 1/30 (i = 1, 2, . . . , 8); ⟨αi, αj⟩ = −1/60 for
(i, j) = (1, 3), (2, 4), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8).
Fundamental coroot generators: h̃α1 = (1/2)(e1 − e2 − · · · − e7 + e8),
h̃α2 = e1 + e2, h̃αi = ei−1 − ei−2 with i = 3, 4, . . . , 8.
The Cartan matrix for e8, corresponding to the Dynkin diagram E8 in Fig. 6.9,
is given below.

A[e8 ] =
    ⎡  2   0  −1   0   0   0   0   0 ⎤
    ⎢  0   2   0  −1   0   0   0   0 ⎥
    ⎢ −1   0   2  −1   0   0   0   0 ⎥
    ⎢  0  −1  −1   2  −1   0   0   0 ⎥
    ⎢  0   0   0  −1   2  −1   0   0 ⎥        (6.55)
    ⎢  0   0   0   0  −1   2  −1   0 ⎥
    ⎢  0   0   0   0   0  −1   2  −1 ⎥
    ⎣  0   0   0   0   0   0  −1   2 ⎦
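The nesting of the E-series diagrams can be checked numerically. The sketch below (an illustration, not part of the text; the helper names are ours) builds A[e8] as 2I minus the adjacency matrix of the diagram in Fig. 6.9, and verifies the standard determinants: deleting the last row and column once and twice yields the E7 and E6 Cartan matrices, with determinants 1, 2, 3 for E8, E7, E6.

```python
# Verify det A[E8] = 1 and the nested E7, E6 principal submatrices (dets 2 and 3).
# Pure-Python integer determinant via fraction-free (Bareiss) elimination.

def det_int(M):
    A = [row[:] for row in M]
    n, sign, prev = len(A), 1, 1
    for k in range(n - 1):
        if A[k][k] == 0:                      # pivot swap if needed
            for i in range(k + 1, n):
                if A[i][k] != 0:
                    A[k], A[i] = A[i], A[k]
                    sign = -sign
                    break
            else:
                return 0
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                A[i][j] = (A[i][j] * A[k][k] - A[i][k] * A[k][j]) // prev
        prev = A[k][k]
    return sign * A[-1][-1]

# A[E8] = 2*I - adjacency: chain 1-3-4-5-6-7-8 with node 2 attached to node 4,
# as in the Dynkin diagram of Fig. 6.9.
edges = [(1, 3), (2, 4), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8)]
A = [[2 if i == j else 0 for j in range(8)] for i in range(8)]
for (i, j) in edges:
    A[i - 1][j - 1] = A[j - 1][i - 1] = -1

dets = [det_int([row[:k] for row in A[:k]]) for k in (8, 7, 6)]
print(dets)   # → [1, 2, 3]  (E8, E7, E6)
```

The determinants equal the orders of the centers of the simply connected groups, which is one quick way to tell the three algebras apart.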

To each FSR there corresponds exactly one simple complex Lie algebra (and
so a unique simply connected complex Lie group) of some complex dimension, but there
are in general several real forms of the same real dimension. For example, to E8
are associated a complex Lie algebra of complex dimension 248 and three real
forms of real dimension 248. The Lie algebra e8 contains as subalgebras all the
exceptional Lie algebras as well as several classical Lie algebras.
A classical Lie group G acting on a vector space V may generally be viewed as
a group of automorphisms of V preserving a bilinear form B ( x, y ), where x, y ∈ V.
For the Lie algebra Lie ( G ), the equivalent statement refers to infinitesimal in-
variance of the form, e.g. ( x, dy ) + ( dx, y ) = 0 on R n for O ( n ).
Killing (1889) showed that G 2 could be realized as a group of local transforma-
tions on R 5 , and Cartan (1894) proved that similar groups of transformations on
R 15, R 16, R 27 , and R 29 exist for F4, E6, E7 and E8, respectively. Alternatively, the
exceptional groups can be regarded as isometry groups of a noncommutative,
nonassociative, normed division algebra over the real numbers, called the octo-
nion algebra (see [Ba], [CS]). The octonions (discovered independently by John
Graves (1843) and Arthur Cayley (1845)) form an 8-dimensional algebra with basis
1, e1 , . . . , e7 , in which each unit other than 1 squares to −1; the algebra is both
noncommutative and nonassociative.
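Both failures of the usual multiplication laws are easy to exhibit concretely. The minimal sketch below (not from the text; one of several equivalent sign conventions) builds the octonions by the Cayley–Dickson doubling of the quaternions, (a, b)(c, d) = (ac − d̄b, da + bc̄), with e1, e2, e3 the quaternion units and e4, . . . , e7 the doubled units:

```python
# Octonions via Cayley-Dickson doubling of the quaternions.
# A quaternion is a 4-tuple (w, x, y, z); an octonion is a pair of quaternions.

def qmul(p, q):
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

def qconj(q):
    w, x, y, z = q
    return (w, -x, -y, -z)

def qadd(p, q):
    return tuple(a + b for a, b in zip(p, q))

def qneg(q):
    return tuple(-a for a in q)

def omul(A, B):
    a, b = A
    c, d = B
    # Cayley-Dickson: (a, b)(c, d) = (a c - conj(d) b, d a + b conj(c))
    return (qadd(qmul(a, c), qneg(qmul(qconj(d), b))),
            qadd(qmul(d, a), qmul(b, qconj(c))))

def basis(i):
    # e0 = 1; e1..e3 = i, j, k; e4..e7 = the doubled units
    v = [0.0]*8
    v[i] = 1.0
    return (tuple(v[:4]), tuple(v[4:]))

e = [basis(i) for i in range(8)]
e1, e2, e4 = e[1], e[2], e[4]

# Each imaginary unit squares to -1:
assert omul(e1, e1) == ((-1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0))
# Noncommutative: e1 e2 = -(e2 e1)
assert omul(e1, e2) != omul(e2, e1)
# Nonassociative: (e1 e2) e4 != e1 (e2 e4)
assert omul(omul(e1, e2), e4) != omul(e1, omul(e2, e4))
print("octonion checks passed")
```

In this convention (e1 e2) e4 = e7 while e1 (e2 e4) = −e7, so the associator is nonzero; restricting to the first quaternion factor recovers the associative quaternions.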

Problems
6.1 (Find ∆ for b2 ) Find all the roots of the b2 Lie algebra.
6.2 (Find ∆+ for d4 ) We have seen in Sect. 6.5 that the set of roots of a simple Lie
algebra g can be found from its given fundamental system Σ and Cartan matrix
A. Use this method to find all positive roots of the d4 Lie algebra.
6.3 (Rules for Dynkin diagrams) Identification of the allowed Dynkin diagrams
(those that are consistent with the definition of a fundamental system of roots
(FSR)) is based on a set of general rules derived from the definition of FSR of
semisimple Lie algebras. Prove:
(a) Rule 1: A Dynkin diagram contains more vertices than joined pairs; it
cannot contain closed polygons. (A closed polygon, or loop, is a set of points
connected sequentially, with the last point also connected to the first.)
(b) Rule 2: The maximum number of lines issuing from a vertex is 3.
(c) Rule 3: If an allowed Dynkin diagram Γ contains an A2 simple chain as a
subdiagram, then the diagram obtained from Γ by replacing the simple chain by
a single vertex is also an allowed Dynkin diagram.
6.4 (Non-allowed diagrams) Show that the configurations of Fig. 6.7(c.1) and Fig. 6.8(d.1)–
(d.2) are not allowed, by showing that the bilinear form ⟨·, ·⟩ defined on the respec-
tive spaces h∗0 is not positive definite.
6.5 (Generators of g2 ) Given a 2 × 2 Cartan matrix with entries a11 = a22 = 2,
a12 = − 3, a21 = − 1, identify the generators of the corresponding Lie algebra.
6.6 (Multiplication table for g2 ) With the same definitions as in Problem 6.5, estab-
lish the multiplication table for the generators of the Lie algebra g2 .
6.7 (Positivity of norms in the classical series) Check that vectors in the root
spaces associated with diagrams A ` , B` , C` , and D` have positive definite norm
squares, as expected of Euclidean spaces.
6.8 (Positivity of norms in the exceptional algebras) Same question as in the pre-
ceding problem for the five exceptional algebras.
6.9 (Maximal roots) Find the maximal roots of e6, e7 , and e8.
6.10 (Data on g2 ) The rank-two Lie algebra g2 is defined by the simple roots α1 and
α2 , related by | α2 |2 = 3| α1 |2 and separated by an angle of 150◦ . Introduce the
basis ei (i = 1, 2, 3) for h0 and the dual basis ε i (i = 1, 2, 3) for h∗0 obeying ε i ( e j ) =
δij and ε 1 + ε 2 + ε 3 = 0. Express the fundamental coroot generators h i and the
fundamental roots and weights αi and ω i , where i = 1, 2 and ω i ( h j ) = δij , in the
appropriate basis.



Chapter 7

Lie Algebras: Representations

7.1 General Properties


7.2 Irreducible Representations
7.3 Dimension of Representations
7.4 Lie Groups in Particle Physics

In this chapter we study the complex finite-dimensional representations (at times


shortened to ‘representations’) of complex semisimple Lie algebras based on their
structure described in the last chapter. In fact, most of the relevant definitions,
theorems, and techniques about representations were already introduced and ap-
plied to a2 = sl(3, C ) in Chapter 5; they will now be extended to all complex
semisimple Lie algebras. This will be followed by a derivation of two important
results, the Freudenthal multiplicity formula and the Weyl character formula for
the irreducible representations of semisimple Lie algebras. Our main references
for this chapter are [FH], [Ja], and [Sa].

7.1 General Properties


Let g be a complex semisimple Lie algebra of rank `, and h a Cartan sub Lie alge-
bra (CSA) of g, that is, a maximally abelian diagonalizable subalgebra of g . In the
Cartan–Weyl analysis, g is decomposed into a direct sum of h and the root spaces
of g, g = h ⊕ (⊕ α gα ), with h acting on each root space gα by scalar multiplication
by a linear function α ∈ h∗ . The summation ⊕ α is over all roots α, which to-
gether form the complete root system ∆ of g, of which ∆+ is the subset of positive
roots with respect to some ordering in h∗ . Of these positive roots, one chooses
` linearly independent roots αi, called simple, or fundamental, roots, to form a ba-
sis Σ = { α1, α2, . . . , α` } for h∗ , called the fundamental system of roots (FSR). For
any root α ∈ ∆, one can define via the Killing form a (unique) root generator
hα ∈ h. In particular, each fundamental-root generator h αi (or the normalized co-

215
216 CHAPTER 7. LIE ALGEBRAS: REPRESENTATIONS

def
root hi = h̃αi ) is uniquely associated to a simple root αi ∈ Σ . The roots lie in a
real subspace h∗0 of h∗ .
As in Chapters 3–5, we call π : g → gl(V ) a finite-dimensional representation
of g on the complex vector space V , such that each element z in g is mapped to a
linear operator π ( z ) ≡ Z in V .
As a Cartan sub Lie algebra is abelian, one may define joint eigenvectors v
of all the operators H ≡ π ( h ) for h in the given CSA h. Such a vector v ∈ V ,
if nonzero, is called a weight vector, and the corresponding eigenvalue µ ( h ) for
π ( h) is a linear function of h on h, and so is an element of h∗ , called the weight of
v: π ( h)v = µ ( h ) v, or Hv = µ ( h)v . For a given µ ∈ h∗ , the weight space Vµ (which
may be zero) is a subspace of V consisting of 0 and all weight vectors of weight µ.
When Vµ 6 = 0, µ is said to be a weight in the representation π. Further, the dimension
of Vµ is called the multiplicity of µ. Thus V admits a direct-sum decomposition
V = ⊕ µ Vµ , where h acts diagonally on each Vµ by scalar multiplication by µ. The
weight spectrum of the adjoint representation of g consists of 0 (of multiplicity `)
and non-zero weights (each of multiplicity 1), called the roots of g.
Let us now select a basis for h. We know that generally in semisimple Lie al-
gebras, to each α ∈ ∆+ correspond one-dimensional dual root spaces gα and g−α ,
such that the direct sum sα = gα ⊕ g−α ⊕ [gα , g−α ] determines a three-dimensional
subalgebra of g isomorphic to sl (2, C ). And so, in particular, for every (simple)
root αi in the fundamental system Σ that spans h∗ , we have one such subalgebra
of g: si ≡ sαi with i = 1, 2, . . . , `. Now in each subalgebra si , we can pick a basis
consisting of the elements ei ≡ eαi ∈ gαi , f i ≡ f αi ∈ g−αi , and hi ≡ [ei , f i ], normal-
ized such that αi ( hi ) = 2. With this standard normalization, h i, ei , f i satisfy the
canonical commutation relations for sl (2, C ).
We can now choose as a basis for the CSA those ` elements h 1, h2, . . . , h` in h
that obey the two conditions h i = [ei , f i ] and αi ( hi ) = 2. (In fact, they are pre-
cisely the fundamental co-root generators h̃αi .) As far as g itself is concerned, the
remaining (dim g − `) elements needed to complete a basis for the whole algebra
may be chosen to be either (i) ei, f i (i = 1, 2, . . . , `) together with their distinct
multiple commutators; or (ii) eα ∈ gα , f α ∈ g−α for all α ∈ ∆+ . We will have
occasions to use one or the other type of basis in our discussion.
This choice of basis { hi} for h has important implications. From our analysis
of the sl (2, C ) representations, we know that each π ( h i) is diagonalizable on the
restriction of π to si , with integral eigenvalues. As all π ( h i)’s commute, they are
simultaneously diagonalizable. The joint eigenvectors v are finite in number, and span
V . Each of the corresponding eigenvalues, µ, is a linear function of hi , and so lives
in h∗ , or, more precisely, in h∗0 because µ ( hi ) ∈ Z. It will soon become apparent
that V has the structure of a direct sum, V = ⊕ a Wa , where the Wa are sub-
representations of π that are irreducible under a given si = { hi , ei , f i } .
Just as for sl (3, C ), there is a Lemma concerning weight spaces of any semisim-
ple Lie algebra g, which says that If α is any root and zα the corresponding root vector,
then π ( zα ) ≡ Zα maps weight space Vµ into weight space Vµ+α . In other words, if
v is a weight vector of representation π with weight µ, then Zα v, if not zero,
is again a weight vector of π with weight µ + α, as is clear from the equation
H Zα v = ( µ + α )(h) Zαv.

This Lemma indicates that from a given weight vector of a representation π,


other weight vectors of π can be generated. So given a root α ∈ ∆+ and the
corresponding subalgebra sα = Ceα ⊕ C f α ⊕ C h̃α , repeated actions of π ( eα) ≡ Eα
and π ( f α ) ≡ Fα on a weight vector v of weight µ in representation π produce
either zero or weight vectors of weights all of the form µ + kα, where k ∈ Z. As π
is by assumption a finite representation, this sequence of weights must be finite.
(Recall h̃α = [ eα , f α ] and α ( h̃α ) = 2.) Thus, we have the parallel sequences

Weights : µ + pα, . . . , µ + α, µ, µ − α, . . . , µ − qα ; (7.1)


Weight vectors : ( Eα ) p v, . . . , Eα v, v, Fα v, . . . , ( Fα )q v ; (7.2)

where p and q are non-negative integers depending on α and µ, such that µ +


( p + 1) α and µ − ( q + 1) α are not weights. One calls either { µ + kα; − q ≤ k ≤ p }
as in (7.1), or ⊕ k Vµ+kα as in (7.2), an α-string of weights containing µ, and denotes
it by Sα ( µ ).
When the weights are calculated at h̃α , the weight sequence gives

µ ( h̃α ) + pα ( h̃α ), . . . , µ ( h̃α ), . . . , µ ( h̃α ) − qα ( h̃α ) ,

which are the weights of a complete irreducible representation of sα — a set of


integers symmetric about 0, of the same parity, and of multiplicity equal to one.
Since α ( h̃α ) = 2, it follows that µ ( h̃α ) = q − p is an integer. As the set of the
corresponding weight vectors are invariant under the action of CEα ⊕ CFα ⊕ CHα ,
the string of weights Sα ( µ ) defines a sub representation in π (g) invariant under
sα , which we refer to as an sα -irreducible representation (or sα -multiplet) of g. And
so, representation V reduces to a direct sum of such sα -irreducible representations Wa ,
which we write as V = ⊕ a Wa , with Wa = W[µ] = ⊕ k Vµ+kα for some α ∈ ∆+ .
As we recall, if m is a weight of an irreducible representation ϕ of sl (2, C ),
then so is − m, that is, the set of weights of ϕ is invariant under reflection about
the origin. A similar symmetry exists for a general Lie algebra g.
For any root α of a Lie algebra g and any element µ of h∗0 , the mapping

wα : µ 7 → µ0 , where µ0 = µ − (2⟨µ, α⟩/⟨α, α⟩) α = µ − µ ( h̃α ) α , (7.3)

is a reflection of µ with respect to the hyperplane in h∗0 perpendicular to the root


α. It is a linear transformation that sends α into − α, and leaves fixed every vector
in the hyperplane orthogonal to α (i.e. wα · µ = µ iff ⟨µ, α⟩ = 0).
The transformations wα : h∗0 → h∗0 just described are called the Weyl reflections
with respect to α. When α ranges over all of ∆, they generate a group of isometries of h∗0 called
the Weyl group W of the algebra g. Since there is a finite number of roots, the wα are
finite in number and W is a finite group. The elements of W divide h∗0 into distinct
regions, called the Weyl chambers, such that every point in one chamber is sent
into some point in another chamber by some element w ∈ W (unless the points
lie on a common edge, in which case two w’s are involved).
Under the Weyl group W, the elements of h∗0 have the following properties:

(a) If µ is a weight, µ0 belongs to the string Sα ( µ ) since − q ≤ p − q ≤ p and


µ ( h̃α) = q − p for any root α. It follows that if µ is a weight of π, then so also is µ0 =
wα · µ (just as are all the terms µ + kα, for integers k in the interval [−(q − p ), 0]
if p ≤ q, or in [0, p − q ] if p ≥ q). In fact, every term in the string Sα ( µ ) is sent
to another term by wα , because wα · ( µ + kα ) = µ − ( q − p + k ) α. In particular,
the two end-point weights of the string are conjugates: wα · ( µ + pα ) = µ − qα.
In a given representation, wα maps weights to weights. Thus, from a weight of a
representation π, one can find all other weights of π via Weyl group W.
(b) The multiplicity is preserved under wα . As Sα ( µ ) corresponds to a Wa in the
decomposition V = ⊕ a Wa , we can say that µ0 is a weight in every Wa that has
µ as a weight. Being weights of an irreducible sα representation, they both have
multiplicity of one. Hence the dimension of the weight space Vµ is given by the
number of Wa ’s having µ as a weight, and so dim Vµ = dim Vµ0 , or nµ = nµ0 .
Besides those general characteristics, the following properties also hold in
Weyl reflections performed relative to simple roots αi ∈ Σ:
(c) Call wi ≡ wαi the Weyl reflection associated to the simple roots αi, with i =
1, 2, . . . , `. (i) If αi , α j ∈ Σ, then wi α j = α j − α j ( h̃αi ) αi = α j − αi aij (no summation)
where aij are the Cartan integers. (ii) If α is a positive root different from αi , then wi α
is also a positive root. In other words, αi is the only positive root that wi transforms
into a negative one, implying det wi = − 1.
(d) As αi ’s span h∗0 , wi ’s generate the Weyl group W. This means any w ∈ W may
be written as a product of wi’s. The number of factors in such a product, call it
r ( w), is equal to the number of positive roots that w sends to negative ones. As
det wi = − 1 for each αi , we have det w = (−)r(w) .
E XAMPLE 1: Lie algebra a2 has three positive roots: the two simple roots α1, α2,
and α3 = α1 + α2 . As the two simple roots form a basis for h∗0 , we start by consid-
ering the reflections with respect to them. Let w1 = wα1 and w2 = wα2 be the Weyl reflections
on the hyperplanes perpendicular to α1 and α2. From the definition of the Weyl
reflection, we have w1 α1 = − α1 and w2 α2 = − α2, and w1 α2 = α1 + α2 = α3 and
w2α1 = α3. The mapping α3 7 → − α3 is w3 = wα3 = w1 w2 w1. It is clear that wi
permute the roots. To construct the multiplication table, we begin with w0 (the
identity), w1, and w2. We will see that we need w4 = w1 w2 and w5 = w2 w1 as
additional elements, and to complete the table we also need w3 = w1 w5 = w2 w4:

W w0 w1 w2 w3 w4 w5
w0 w0 w1 w2 w3 w4 w5
w1 w1 w0 w4 w5 w2 w3
w2 w2 w5 w0 w4 w3 w1
w3 w3 w4 w5 w0 w1 w2
w4 w4 w3 w1 w2 w5 w0
w5 w5 w2 w3 w1 w0 w4

Element w0 is of order 1; w1, w2 and w3 are of order 2; w4 and w5 of order 3. Thus


the Weyl group for a2 has six elements, and is isomorphic to the dihedral group
D3 ∼= S3. We also have det w j = − 1 for j = 1, 2, 3, and det w j = + 1 for j = 0, 4, 5.
See also Fig.7.1 and Problem 7.1. 
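The six elements and their composition law can also be generated mechanically. The sketch below (an illustration, not the text's construction; coordinates and helper names are ours) realizes w1 and w2 as 2 × 2 reflection matrices for simple roots at 120°, closes them under multiplication, and recovers the counts stated above: six elements, three with determinant −1 (the reflections w1, w2, w3) and three with determinant +1 (w0, w4, w5):

```python
import math

# Reflection in the hyperplane orthogonal to a root a (in 2D):
#   x -> x - 2 (x . a)/(a . a) a
def reflection_matrix(a):
    ax, ay = a
    n = ax*ax + ay*ay
    return [[1 - 2*ax*ax/n, -2*ax*ay/n],
            [-2*ax*ay/n, 1 - 2*ay*ay/n]]

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def key(A):  # hashable key, rounded to absorb float noise
    return tuple(round(x, 9) for row in A for x in row)

# Simple roots of a2, separated by 120 degrees
alpha1 = (1.0, 0.0)
alpha2 = (-0.5, math.sqrt(3)/2)
w1 = reflection_matrix(alpha1)
w2 = reflection_matrix(alpha2)

# Close {w1, w2} under multiplication (the identity appears as w1 w1)
group = {key(w1): w1, key(w2): w2}
changed = True
while changed:
    changed = False
    for A in list(group.values()):
        for B in list(group.values()):
            C = matmul(A, B)
            if key(C) not in group:
                group[key(C)] = C
                changed = True

dets = [round(A[0][0]*A[1][1] - A[0][1]*A[1][0]) for A in group.values()]
print(len(group), sorted(dets))   # → 6 [-1, -1, -1, 1, 1, 1]
```

The order-6 group with three reflections and three rotations is exactly D3 ≅ S3, in agreement with the multiplication table.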

[Figure: the plane h∗0 divided into six Weyl chambers by the lines ⟨µ, α1 ⟩ = 0 and
⟨µ, α2 ⟩ = 0; the chambers are labeled w0 , . . . , w5 .]

Figure 7.1: The Weyl chambers of h∗0 for a2 are labeled by the operators wi that
take the fundamental Weyl chamber w0 to the Weyl chamber wi . Also indicated
are the simple roots α1, α2, and the fundamental weights ω 1, ω 2.

A lattice in a vector space is an additive subgroup generated by some basis of


the space. The subgroup of h∗0 consisting of the µ ∈ h∗0 for which µ ( hi) are integers
is called the weight lattice and denoted Λ. The subgroup of h∗0 generated by ∆
is called the root lattice, denoted Λ R . Besides the root lattice Λ R, the lattice Λ
contains another important sublattice: the lattice of dominant weights, denoted
Λd , consisting of all elements with non-negative integral components. A dominant
weight µ is an integral linear function on h0 that satisfies ⟨µ, α⟩ ≥ 0 for all α ∈ ∆+ .
It is said to be strongly dominant if ⟨µ, α⟩ > 0 for all α ∈ ∆+ .
The fundamental weights of g, denoted ω i, are the ` independent dominant
weights that are dual to the fundamental co-root generators h i = h̃αi normalized
such that αi ( hi ) = 2, according to

ω i ( h j ) = δij , ( i, j = 1, 2, . . . , `). (7.4)

As an alternative to the simple roots αi, one may pick the fundamental weights
ω i to form a basis for h∗0 , so that Λ = { ∑i mi ω i | mi ∈ Z } . It is called the Dynkin
basis (each ω i is uniquely associated to a node αi in the Dynkin diagram). The
condition ω i ( h j) = 0 for i 6 = j means that the ω i’s are the first weights on the
edges of a cone called the fundamental Weyl chamber on h∗0 . Thus, one may describe
Λd as the intersection of Λ and the closed fundamental Weyl chamber. The sum
δ = ∑`i=1 ω i gives the lowest strongly dominant weight.
Any weight (or root) µ may be written either as µ = ∑i k i αi in the basis { αi} ,
or as µ = ∑i mi ω i in the basis { ω i} . But only in the latter are the coefficients always
integral, which is a clear advantage. So, in the Dynkin basis we may write any
weight (or root) µ = m1 ω 1 + m2 ω 2 + · · · + m` ω ` as µ = ( m1, m2, . . . , m` ), with
mi = µ ( hi) ∈ Z. Correspondence between the two bases is determined from the
Cartan matrix by the relations αi = ∑j ω j a ji and k i = ∑j bij m j , where a ji = αi ( h j )
is the element A ji of the Cartan matrix, and bij is the element ( A−1 )ij of its inverse.
Note that the values of the simple root αi on the CSA basis { h1 , . . . , h` } , namely
αi ( h j ), are given by the entries in the i-th column of the Cartan matrix.
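For a2 this change of basis is a two-line computation. A minimal sketch in exact rational arithmetic (the matrix and relations are the text's; the variable names are ours):

```python
from fractions import Fraction as F

# Cartan matrix of a2 and its inverse
A = [[F(2), F(-1)], [F(-1), F(2)]]
detA = A[0][0]*A[1][1] - A[0][1]*A[1][0]          # = 3
B = [[ A[1][1]/detA, -A[0][1]/detA],
     [-A[1][0]/detA,  A[0][0]/detA]]              # B = A^{-1}

# alpha_i = sum_j omega_j a_{ji}: read the columns of A
# column 1: alpha_1 = 2 w1 - w2; column 2: alpha_2 = -w1 + 2 w2
assert [A[j][0] for j in range(2)] == [2, -1]
assert [A[j][1] for j in range(2)] == [-1, 2]

# k_i = sum_j b_{ij} m_j: express omega_1 (m = (1,0)) and omega_2 (m = (0,1))
# in the root basis {alpha_1, alpha_2}
omega1_in_roots = [B[i][0]*1 + B[i][1]*0 for i in range(2)]
omega2_in_roots = [B[i][0]*0 + B[i][1]*1 for i in range(2)]
assert omega1_in_roots == [F(2, 3), F(1, 3)]   # omega_1 = 2/3 a1 + 1/3 a2
assert omega2_in_roots == [F(1, 3), F(2, 3)]   # omega_2 = 1/3 a1 + 2/3 a2
print("a2 basis conversion verified")
```

The same two matrix multiplications work for any rank: columns of A give the simple roots in the Dynkin basis, columns of A⁻¹ give the fundamental weights in the root basis.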

E XAMPLE 2: From the data on the algebraic structure (Chapter 6) we can deter-
mine its fundamental weights for any simple Lie algebra. Take a2 as an example.
From the Cartan matrix, we obtain the two simple roots: α1 = ω 1 a11 + ω 2 a21 =
2ω 1 − ω 2 and α2 = ω 1 a12 + ω 2 a22 = − ω 1 + 2ω 2, and by inversion, the fun-
damental weights: ω 1 = 2/3α1 + 1/3α2 and ω 2 = 1/3α1 + 2/3α2. On the other
hand, writing ei for the ith coordinate in R n or Cn , and ε i the ith coordinate
function, we know that α1 = ε 1 − ε 2 and α2 = ε 1 + 2ε 2, and so ω 1 = ε 1 and
ω 2 = ε 1 + ε 2. A more direct way to find ω i goes by their definition: We already
know that h1 = e1 − e2 , h2 = e2 − e3 on h0. Since ω i live on h∗0 , they must have
the form ω i = ai ε 1 + bi ε 2 + ci ε 3, with ε 1 + ε 2 + ε 3 = 0. By duality ε i (ej ) = δij and
ω i ( h j ) = δij , with i (or j ) = 1, 2 , we find again ω 1 = ε 1 and ω 2 = ε 1 + ε 2. These


results are summarized in the following table:

ε1 =  2/3 α1 + 1/3 α2 =  ω1      = ω1
ε2 = −1/3 α1 + 1/3 α2 = −ω1 + ω2 = ω1 − α1                (7.5)
ε3 = −1/3 α1 − 2/3 α2 = −ω2      = ω1 − α1 − α2
The various bases for h∗0 we have considered are shown in Fig.7.2. 

[Figure: the vectors ±(εi − εj ), the εi , and ω1 = ε1 , ω2 = ε1 + ε2 , α1 = ε1 − ε2 ,
α2 = ε2 − ε3 drawn in the plane h∗0 .]

Figure 7.2: Vectors in a small sector of h∗0 space of a2 , with various relative angles
given by θ ( ω 1, ω 2 ) = 60◦ , θ ( α1, α2 ) = 120◦ , and θ ( ε 1, ε 2) = 120◦ . The darkened
cone is the fundamental Weyl chamber.

C OMMENT . Sometimes we need to evaluate inner products h µ, µ0i of µ, µ0 ∈ h∗0 .


There are several ways of doing this, making use of the bases { αi} , or { ω i} , or
both. Thus we may write µ = ∑i k i αi (k i ∈ Q) or µ = ∑i mi ω i (mi ∈ Z) with i
going from 1 to `, and similarly for µ0 . Then we have these options: (a) ⟨µ, µ0 ⟩ =
∑ij k i k 0j ⟨αi , αj ⟩; (b) ⟨µ, µ0 ⟩ = ∑ij mi m0j ⟨ωi , ωj ⟩; and (c) ⟨µ, µ0 ⟩ = ∑ij k i m0j ⟨αi , ωj ⟩,
the last sum reducible to a single sum because ⟨ωj , αi ⟩ = 1/2 δij ⟨αi , αi ⟩, by def-
inition (7.4). The elementary inner products ⟨αi , αj ⟩ are known for simple Lie
algebras, from which we can calculate ⟨ωi , αj ⟩ and ⟨ωi , ωj ⟩.

E XAMPLE 3: For the algebra a2 , we have α1 = 2ω 1 − ω 2 and α2 = − ω 1 + 2ω 2,


with inner products ⟨αi , αi ⟩ = 1/3 (for i = 1, 2), ⟨α1 , α2 ⟩ = −1/6; and ω1 =
2/3 α1 + 1/3 α2 and ω2 = 1/3 α1 + 2/3 α2 ; from which we get ⟨ωi , ωi ⟩ = 1/9 (for
i = 1, 2) and ⟨ω1 , ω2 ⟩ = 1/18. That the three expressions for ⟨µ, µ0 ⟩, namely,
( k1 k01 + k2 k02 )/3 − ( k1 k02 + k2 k01 )/6, ( m1 m01 + m2 m02 )/9 + ( m1 m02 + m2 m01 )/18, and
( m1 k01 + m2 k02 )/6, are equivalent can be shown with 3k1 = 2m1 + m2 , 3k2 = m1 +
2m2 and similar relations for k01 and k02 . 
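The three expressions of the Comment can be checked in exact arithmetic. A minimal sketch (our helper names; the elementary inner products are those of Example 3):

```python
from fractions import Fraction as F

# Elementary inner products for a2: <ai, ai> = 1/3, <a1, a2> = -1/6
ip_alpha = {(1, 1): F(1, 3), (2, 2): F(1, 3), (1, 2): F(-1, 6), (2, 1): F(-1, 6)}

def to_roots(m1, m2):
    # 3 k1 = 2 m1 + m2, 3 k2 = m1 + 2 m2 (inverse Cartan matrix of a2)
    return F(2*m1 + m2, 3), F(m1 + 2*m2, 3)

def inner(m, mp):
    """Inner product of mu = (m1, m2), mu' = (m1', m2') in the Dynkin basis."""
    k, kp = to_roots(*m), to_roots(*mp)
    # (a) both factors expanded in the root basis
    a = sum(k[i]*kp[j]*ip_alpha[(i + 1, j + 1)]
            for i in range(2) for j in range(2))
    # (b) both factors in the Dynkin basis: <wi, wi> = 1/9, <w1, w2> = 1/18
    b = F(m[0]*mp[0] + m[1]*mp[1], 9) + F(m[0]*mp[1] + m[1]*mp[0], 18)
    # (c) mixed basis: <mu, mu'> = (m1 k1' + m2 k2')/6
    c = (m[0]*kp[0] + m[1]*kp[1]) / 6
    assert a == b == c
    return a

print(inner((1, 0), (1, 0)))   # <omega1, omega1> = 1/9
```

Running `inner` over any pair of integral weights confirms the equivalence; for example inner((1, 0), (0, 1)) returns 1/18, the value ⟨ω1, ω2⟩ quoted above.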
We now introduce an ordering of the weights in h∗0 . If λ and µ are weights for
a representation π of g, λ is said to be higher than µ (or equivalently µ lower than
λ) if λ − µ is a non-negative integral combination of simple roots: λ − µ = n1 α1 +
n2 α2 + · · · + n` α` with αi ∈ Σ and all integral ni ≥ 0. (We denote this relationship
by λ ⪰ µ.) A weight λ of π is said to be a highest weight if λ ⪰ µ for all weights
µ of π. Hence, if λ of π is
highest, then λ + α is not a weight of π for a positive root α. A weight vector v of
π is a highest-weight vector if Eα v = 0 for all α ∈ ∆+ . Then, as implied by the
Lemma, the weight of v must be a highest weight. (The lowest weight is similarly
defined.) Given the string Si ( λ ) based on a highest weight λ for any αi ∈ Σ, we
have pi = 0, so that λ ( hi ) = qi is a non-negative integer, giving the number of
times the simple root αi can be subtracted from λ. So every highest weight lies in
the lattice Λd , having the form λ = ∑i qi ω i in the Dynkin basis, with integers qi ≥ 0.
To a representation π of g on V , one may associate the dual representation π ∗
on the dual vector space V ∗ (or, in an alternative notation, π̄ on V ), defined by
π ∗ ( X) = − π ( X )T : V ∗ → V ∗ for every X ∈ g. The weights of π ∗ are the negatives
of the weights of π, with the highest weight of one being the negative of the
lowest weight of the other. A representation π is called self-dual if it is equivalent
to its dual π ∗ . For irreducible representations (see the following section), there is a
simple criterion for self-duality: An irreducible representation πλ , with the dominant
weight λ as highest weight, is self-dual if and only if its lowest weight is − λ.
The first task in seeking all representations of a Lie algebra is to determine its
weight lattice (where the weights of all representations must lie). It is generated, as
we know, by the fundamental weights. The root lattice, which is a sublattice of the
weight lattice, is then identified by the simple roots of the algebra. As examples
we now construct the lattices of the rank-2 Lie algebras a2 , b2 , and g2 .
E XAMPLE 4: a2. α1 and α2 are the fundamental roots, given by α1 = ε 1 − ε 2,
α2 = ε 1 + 2ε 2, separated by an angle of 120◦ (as in Example 2). ω 1 and ω 2 are the
fundamental weights, given by ω 1 = ε 1, ω 2 = ε 1 + ε 2 separated by an angle of
60◦ . All weights are of the form µ = m1 ω 1 + m2 ω 2 = ( m1, m2 ) (with integral mi ).
The fundamental co-roots are h1 = e1 − e2 , and h2 = e2 − e3 , with the restriction
ε 1 + ε 2 + ε 3 = 0 on h, so that ε i ( hi ) = 1 and ε i+1 ( hi ) = − 1 with i = 1, 2. The
dominant weights are given by λ = ∑31 ai ε i (mod ∑31 ε i) with integers ai satisfying
a1 ≥ a2 ≥ a3 ≥ 0, so that λ ( hi) are non-negative integers. The Weyl group
(see Example 1) has six elements, generated by wα1 (of order 2) and wα1 wα2 (of
order 3). It is isomorphic to the finite group D3 (the symmetry group for the
equilateral triangle). The weight lattice for a2 is shown in Fig.7.11 at the end of
this chapter, where the open circles indicate dominant weights lying within the
closed fundamental Weyl chamber (the cone between ω 1 and ω 2), while the solid
circles represent the other weights.

E XAMPLE 5: b2 . We again obtain all necessary data from Chapter 6. Calling α1


the long root, we have | α1|2 = 1/3, | α2|2 = 1/6, and h α1, α2 i = − 1/6. By the Cartan
matrix, we have α1 = 2ω 1 − 2ω 2 and α2 = − ω 1 + 2ω 2, and by inversion, we get
ω 1 = α1 + α2 and ω 2 = 1/2α1 + α2. With the known inner products for the simple
roots, we can then calculate h ω 1, α1 i = 2h ω 2, α2 i = 1/6, h ω 1, ω 1 i = 2h ω 2, ω 2 i =
1/6, and h ω 1, ω 2 i = 1/12. From these, we compute the relative angle between
α1 and α2 to be 135◦ , and that between ω 1 and ω 2 to be 45◦ . This gives enough
information to draw the diagram shown in Fig.7.12 at this chapter end. The Weyl
group has eight elements, generated by wα1 (of order 2) and wα1 wα2 (of order 4).
It is isomorphic to D4 , the discrete symmetry group for the square.
E XAMPLE 6: g2. Let α2 be the long root, then we have α1 = 2ω 1 − ω 2 and α2 =
− 3ω 1 + 2ω 2; or inversely ω 1 = 2α1 + α2 and ω 2 = 3α1 + 2α2. Note that these
fundamental weights are also roots, and so the weight lattice and the root lattice
of g2 coincide. We already know that | α2|2 = 3| α1|2 and h α1, α2 i = − 1/2| α2|2.
With ω i known in terms of the αi ’s, a calculation produces | ω 2|2 = 3| ω 1|2 = | α2|2
and h ω 1, ω 2 i = 1/2 | α2|2. It is then seen that the simple roots are separated by an
angle of 150◦ , and the fundamental weights by an angle of 30◦ (see Fig.7.13 at this
chapter end). The Weyl group has twelve elements, generated by wα1 (of order 2)
and wα1 wα2 (of order 6). It is isomorphic to the dihedral group D6 . 
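The angles and length ratios quoted in Examples 4–6 follow directly from the Cartan integers, via the standard relations 4 cos²θ = a12 a21 and |α1|²/|α2|² = a12/a21. A quick numerical check (an illustration, not from the text; the long-root choice follows Examples 5–6, so the g2 matrix is the transpose of the one in Problem 6.5):

```python
import math

# Off-diagonal Cartan integers (a12, a21) for the three rank-2 algebras
cartan = {"a2": (-1, -1), "b2": (-2, -1), "g2": (-1, -3)}

results = {}
for name, (a12, a21) in cartan.items():
    # 4 cos^2(theta) = a12 * a21; simple roots make an obtuse angle
    cos_theta = -math.sqrt(a12 * a21) / 2
    theta = round(math.degrees(math.acos(cos_theta)))
    ratio = a12 / a21        # |alpha1|^2 / |alpha2|^2
    results[name] = (theta, ratio)

print(results)   # angles 120, 135, 150 degrees
```

The output reproduces the angles 120°, 135°, 150° and the squared-length ratios 1, 2, 1/3 (i.e. |α2|² = 3|α1|² for g2), matching the three examples.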

7.2 Irreducible Representations


We will discuss here the main topics of the representation theory of the complex
semisimple Lie algebras. They concern the following issues: to relate each irre-
ducible representation (‘irrep’; cf. Chap. 3, Sec. 3.6) of a Lie algebra to an extremal
dominant weight; to construct the weight space of each irrep; and to decompose
an arbitrary representation into irreps.

Theorem 7.1 (Irreducible representations of complex semisimple Lie algebras).


1. Every finite-dimensional irreducible representation π of a complex semi-simple Lie
algebra g, provided with a CSA h, on a complex vector space V is the direct sum of its
weight spaces, V = ⊕ Vµ, where each µ is a weight (or eigenvalue µ ( h) of the operator
π ( h) for any element h of h).
2. Every irreducible representation of g (denoted by πλ ) has a unique highest weight
λ, which is dominant (being in Λd ), maximal, and of multiplicity one. All other weights
of πλ are of the form λ − ∑i ai αi (where αi ∈ Σ are simple roots, and ai are non-negative
integers), and are contained in the convex closure of w · λ defined by the elements w of
the Weyl group. The representation space is found by applying all possible combinations
of elements of negative-root spaces g−αi for all αi ∈ Σ on the highest-weight vector.
3. Conversely, every λ of the lattice Λd determines a complex finite-dimensional irre-
ducible representation admitting λ as its highest weight. The representation so defined is
essentially unique, in the sense that two irreducible representations with the same highest
weight are equivalent under some linear transformation.
4. Every representation of g is completely reducible, i.e. it may be written as a direct
sum of irreducible representations of g.

Given a complex semisimple Lie algebra g, we denote by πλ = πm1 ,m2 ,...,m`


the irreducible representation (‘irrep’) of g having the dominant weight λ =
∑i mi ω i ≡ ( m1, m2 , . . . , m` ) as its highest weight. An irrep having a fundamental
weight ω i as its highest weight is called a fundamental representation, denoted
π0,...,0,1,0,... (with the 1 at the ith position). The main object of representation the-
ory is to construct all the irreps πλ with λ ∈ Λd .
Construction of the irreducible representation πλ corresponding to a given
dominant weight λ = ( m1, m2, . . . , m` ) and associated highest-weight vector v
may proceed in steps outlined below:
(1) For each positive component mi of λ, subtract the simple root αi from λ mi
times. This gives an si -multiplet described by v, Fi v, Fi2 v, . . . , Fimi v.
(2) For each resulting weight µ (with weight vector u) and each of its positive
components q j , subtract the simple root α j from µ q j times, producing an
sj -multiplet described by u, Fj u, Fj2 u, . . . , Fjqj u.
(3) Repeat step (2) as many times as necessary until a weight is produced with
only non-positive components (of which at least one is negative).
When a weight µ can be reached from λ by different paths, its multiplicity
nµ might be greater than 1, and it might be necessary to orthogonalize the inde-
pendent weight vectors having the same weight. In general πλ , irreducible by
construction in the full algebra g, is reducible under its subalgebras si ∼ = sl (2, C ),
and nµ is just the number of the si -irreps at the weight µ. One can compute nµ by
checking the independence of the vectors having the same weight µ (see exam-
ples below); or alternatively by using the Freudenthal multiplicity formula (see
discussion in the next section).
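The subtraction recipe above is easy to mechanize in the Dynkin basis, where subtracting αj from µ = (m1, . . . , mℓ) lowers the components by the j-th column of the Cartan matrix. The sketch below (our names; it follows the text's recipe for small a2 examples and, as noted, tracks only the distinct weights, not their multiplicities):

```python
# Weights of an a2 irrep with a given highest weight (m1, m2), generated by
# repeatedly subtracting simple roots from every weight with a positive component.
A = [[2, -1], [-1, 2]]   # Cartan matrix; column j = Dynkin components of alpha_j

def weights(highest):
    found = {highest}
    frontier = [highest]
    while frontier:
        mu = frontier.pop()
        for j in range(2):
            if mu[j] > 0:                     # q_j = mu(h_j) subtractions allowed
                for k in range(1, mu[j] + 1):
                    nu = tuple(mu[i] - k * A[i][j] for i in range(2))
                    if nu not in found:
                        found.add(nu)
                        frontier.append(nu)
    return found

adjoint = weights((1, 1))
print(sorted(adjoint))
# → [(-2, 1), (-1, -1), (-1, 2), (0, 0), (1, -2), (1, 1), (2, -1)]
# 7 distinct weights; (0, 0) occurs with multiplicity 2 in the 8-dim adjoint
```

For the fundamental representation (1, 0) the same routine yields the triplet {(1, 0), (−1, 1), (0, −1)}, as in Chapter 5.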
In order to do calculations involving weight vectors, it is useful to generalize
Eq. (4.8) of Chapter 4 (applied there just to sl (2, C )). Let α j be a simple root, and
µ be a weight with positive integral component q j = µ ( h j) such that µ + α j is not
a weight. Let µ − kα j be the weights of the vectors uk = Fjk u0 for integer k such
that 0 ≤ k ≤ q j . Now, if all the weights µ, µ − α j , . . . , µ − q j α j in the sj multiplet
formed in this way have multiplicity one, then the representation matrices of Fj
and Ej are determined by the following relations:
Fj uk = √[(k + 1)(q j − k )] uk+1 ,    0 ≤ k ≤ q j ,
                                                          (7.6)
Ej uk = √[k ( q j − k + 1)] uk−1 ,    u−1 = 0.
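Relations (7.6) can be checked by building the (qj + 1)-dimensional matrices of Ej, Fj, Hj on the multiplet and verifying the sl(2, C) commutation relation [E, F] = H. A small sketch (our helper names; plain nested lists stand in for matrices):

```python
import math

def sl2_multiplet(q):
    """Matrices of E, F, H on the basis u_0, ..., u_q, following Eq. (7.6)."""
    n = q + 1
    E = [[0.0]*n for _ in range(n)]
    F = [[0.0]*n for _ in range(n)]
    H = [[0.0]*n for _ in range(n)]
    for k in range(n):
        H[k][k] = q - 2*k                  # weight of u_k at h_j is q - 2k
        if k < q:
            c = math.sqrt((k + 1)*(q - k))
            F[k+1][k] = c                  # F u_k     = sqrt((k+1)(q-k)) u_{k+1}
            E[k][k+1] = c                  # E u_{k+1} = sqrt((k+1)(q-k)) u_k
    return E, F, H

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k]*B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matsub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

for q in range(1, 6):
    E, F, H = sl2_multiplet(q)
    comm = matsub(matmul(E, F), matmul(F, E))   # [E, F] should equal H
    assert all(abs(comm[i][j] - H[i][j]) < 1e-12
               for i in range(q + 1) for j in range(q + 1))
print("[E, F] = H verified for q = 1..5")
```

Acting on u_k, (EF − FE) u_k = [(k+1)(qj − k) − k(qj − k + 1)] u_k = (qj − 2k) u_k, which is exactly the diagonal of H; the numerical check confirms this for each multiplet size.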

If a certain vector uk happens to have the same weight as a member v r of another


si -multiplet, with i 6 = j, in the same representation πλ , then it is necessary to
check their independence.
The vector space of πλ is spanned by the highest-weight vector v and vectors
of weights λ − ∑i ai αi , of the form Fi1 Fi2 · · · Fik v, where k = 0, 1, . . . , and 1 ≤
ij ≤ `. The number of times, ∑i ai , simple roots must be subtracted from the
highest weight to obtain the corresponding weight is called the height of the basis
vector. For example the height of v is 0, while F1 v, . . . , F` v all have a height equal
to 1. If the weights are placed on successive rows, those of equal heights on the

same line, then the number of weights on each row first increases then decreases
symmetrically as the height increases, giving a vaguely spindle-shaped diagram.
The weight system of any irreducible representation π is invariant under the
Weyl group W, that is, if µ is a weight of π, then so is w · µ for any w ∈ W. A
weight µ and its conjugates w · µ have the same multiplicity and are considered
equivalent. The different equivalent weights form a Weyl orbit (W-orbit), their
number being called the orbital size.
Just as the weights of an irrep of a2 lie on a triangle or a hexagon in the weight
lattice Λ, so the weight systems of irreps of any simple Lie algebra g lie on con-
vex (polygonal) hulls in Λ. As we have seen, the Weyl group W of g sends each
weight µ into a weight of the same representation. In particular, if µ occupies a
corner of the polygon, its mirror image w · µ occupies an opposite corner since
µ and w · µ are the extreme weights of some string of weights. Starting with the
highest weight λ, we can generate other corner weights with every w in W; re-
peating the process, we can identify all other weights and their associated strings,
and so obtain the weight system πλ . These remarks show how the Weyl group
can be useful in finding all the weights of a given irrep.
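In the Dynkin basis the simple reflection wi acts on a weight µ by wi · µ = µ − µi αi, where µi is the ith Dynkin label and αi is written in the same basis; iterating the simple reflections generates the whole Weyl orbit. A small sketch for a2, whose Cartan matrix is symmetric so that its rows give the Dynkin labels of the simple roots (illustrative code, not from the text):

```python
def weyl_orbit(weight, simple_roots):
    """Orbit of a weight (tuple of Dynkin labels) under the Weyl group
    generated by the simple reflections w_i . mu = mu - mu_i * alpha_i."""
    orbit, frontier = {tuple(weight)}, [tuple(weight)]
    while frontier:
        mu = frontier.pop()
        for i, alpha in enumerate(simple_roots):
            image = tuple(m - mu[i] * a for m, a in zip(mu, alpha))
            if image not in orbit:
                orbit.add(image)
                frontier.append(image)
    return orbit

A2 = [(2, -1), (-1, 2)]   # simple roots of a2 in the Dynkin basis
```

The orbit of (1, 1) has size 6 (a regular weight), while that of (1, 0) has size 3; these are the W-orbits in the sense just defined.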
The simplest of the irreps of the Lie algebra a2 are the fundamental repre-
sentations (1, 0) and (0, 1), which we have already seen in Examples 1–2 of this
chapter and Fig. 5.4 of Chapter 5. We now consider a few others.
E XAMPLE 7: π1,1 of a2 = sl (3, C ). Subtracting α1 and α2 from the dominant
weight λ = (1, 1) associated with the weight vector v produces two doublets:
(1, 1), (−1, 2): v, F1 v (s1 doublet),
(1, 1), (2, −1): v, F2 v (s2 doublet).
From (−1, 2), subtracting α2 twice gives a triplet:
(−1, 2), (0, 0)″, (1, −2): F1v, F2F1v, F2²F1v (s2 triplet).
From (1, −2), subtracting α1 once gives
(1, −2), (−1, −1): F2²F1v, F1F2²F1v (s1 doublet).
On the other hand, subtracting α1 twice from (2, −1) gives
(2, −1), (0, 0)′, (−2, 1): F2v, F1F2v, F1²F2v (s1 triplet).
Finally, from (−2, 1), subtracting α2 once gives
(−2, 1), (−1, −1): F1²F2v, F2F1²F2v (s2 doublet).
The bottom line is the all-negative weight (−1, −1).
obtained is complete, we can show that it transforms into itself under the known
Weyl group (described in Example 1). It is illustrated in Fig. 7.3.
Each of the weights (0, 0) and (−1, −1) is reached along two different paths. What is their multiplicity? The weight (−1, −1) is the mirror image of (1, 1) under the Weyl transformation w1w2w1, and so it has multiplicity one.
Now let us look at (0, 0). We have |(0, 0)′⟩ = F1F2v as a member of an s1 triplet, and |(0, 0)″⟩ = F2F1v as a member of an s2 triplet. Let us define the orthonormal vectors |(0, 0)₀⟩ of the s1-singlet and |(0, 0)₁⟩ of the s1-triplet. Then by definition |(0, 0)′⟩ = |(0, 0)₁⟩; and |(0, 0)″⟩ must be a linear combination of the form |(0, 0)″⟩ = F2F1v = a|(0, 0)₁⟩ + b|(0, 0)₀⟩, such that a² + b² = 2.
Applying E1 to this vector, we get on the RHS aE1|(0, 0)₁⟩ + bE1|(0, 0)₀⟩ = a√2 |(2, −1)⟩ = a√2 F2v, where we have used (7.6), whereas on the LHS we have E1F2F1v = F2E1F1v = F2v. Equating RHS and LHS, a√2 = 1, we get a = 1/√2
7.2. IRREDUCIBLE REPRESENTATIONS 225

Figure 7.3: Two views of the weight system of the adjoint representation of a2. Note the two-fold degeneracy of (0, 0).

(with conventional sign and phase). So |(0, 0)′⟩ and |(0, 0)″⟩ are linearly independent, and the weight (0, 0) of π1,1 has multiplicity two, which implies that the dimension of π1,1 is equal to eight.
E XAMPLE 8: π3,0 of a2. Following the usual steps, we obtain the following:
(3, 0), (1, 1), (−1, 2), (−3, 3): v, F1v, F1²v, F1³v (s1 quadruplet);
(1, 1), (2, −1): F1v, F2F1v (s2 doublet);
(−1, 2), (0, 0), (1, −2): F1²v, F2F1²v, F2²F1²v (s2 triplet);
(−3, 3), (−2, 1), (−1, −1), (0, −3): F1³v, F2F1³v, F2²F1³v, F2³F1³v (s2 quadruplet);
(2, −1), (0, 0), (−2, 1): F2F1v, F1F2F1v, F1²F2F1v (s1 triplet);
(1, −2), (−1, −1): F2²F1²v, F1F2²F1²v (s1 doublet).
Again this is the complete Weyl invariant weight system having (3,0) as the
highest weight, as illustrated in the following diagram (cf. Fig. 5.8(d) Chapter 5).

(3, 0)
(1, 1)
(−1, 2)   (2, −1)
(−3, 3)   (0, 0)
(1, −2)   (−2, 1)
(−1, −1)
(0, −3)

The multiplicity of several weights needs to be verified. We will just look at the case of (0, 0), which is present in an s1 triplet, |(0, 0)′⟩ = F1F2F1v, as well as in an s2 triplet, |(0, 0)″⟩ = F2F1²v. Call |(0, 0)₀⟩ and |(0, 0)₁⟩ the orthonormal singlet and triplet states of s1 having the weight (0, 0). Then by definition |(0, 0)′⟩ = |(0, 0)₁⟩; and |(0, 0)″⟩ must be a linear combination of the form |(0, 0)″⟩ = a|(0, 0)₁⟩ + b|(0, 0)₀⟩, such that a² + b² = 2.
Applying E1 to both sides of the equation F2|−1, 2⟩ = a|(0, 0)₁⟩ + b|(0, 0)₀⟩, we obtain on the RHS a√2 |2, −1⟩, where we have used (7.6) with q = 2, k = 1. On the LHS, we have E1F2|−1, 2⟩ = F2E1|−1, 2⟩ = 2F2|1, 1⟩ (with q = 3, k = 2), which is = 2|2, −1⟩ (with q = 1, k = 0). Equating right and left hand sides, we get a√2 = 2, or a = √2. Then from a² + b² = 2 it follows that b = 0, and so (0, 0)′ and (0, 0)″ are parallel vectors, and (0, 0) is of multiplicity one. □
We now describe the irreducible representations of the simple Lie algebras,
essentially following the model of sl (3, C ) presented in Chapter 5. We shall make
use of the data on their structure (roots, co-roots) found in Chapter 6, and the
standard tools of tensor algebra, in particular the idea of induced representations
on tensor product of spaces. Without excluding constructs of mixed symmetries (see
Appendix A), we mention here two especially useful products: the completely anti-
symmetric powers and the completely symmetric powers. Let π be a representation of
a Lie algebra g on a vector space V, and ∧²V the exterior product space of all bilinear combinations ∑ij cij vi ∧ vj with vi, vj ∈ V and cij = −cji ∈ C; then ∧²π = π ∧ π denotes the induced representation on ∧²V, defined such that (π ∧ π)(X) sends v ∧ w ∈ ∧²V, for any v, w ∈ V, to Xv ∧ w + v ∧ Xw. More generally, ∧^r π denotes the induced representation on the r-th exterior power (the r-fold antisymmetric product) ∧^r V. If µ1, µ2, . . . are the weights of π for the vectors v1, v2, . . . , then the weights of ∧^r π are the r-wise sums of distinct weights, µi1 + µi2 + · · · + µir subject to the restrictions i1 < i2 < · · · < ir, and have the products vi1 ∧ vi2 ∧ · · · ∧ vir as weight vectors. Note that all indices must be different to satisfy antisymmetry, and i1 < i2 < · · · to avoid double counting. Similar considerations apply mutatis mutandis to the completely symmetric powers, denoted Sym^r π and Sym^r V, the weights of which are given by the r-wise sums µi1 + µi2 + · · · + µir subject to the restrictions i1 ≤ i2 ≤ · · · ≤ ir.
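These weight rules are easy to mechanize: exterior powers sum weights over strictly increasing index combinations, symmetric powers over weakly increasing ones. A sketch for the standard representation of sl(4, C), whose weights in the Dynkin basis are ε1 = (1, 0, 0), ε2 = (−1, 1, 0), ε3 = (0, −1, 1), ε4 = (0, 0, −1) (illustrative code, not from the text):

```python
from itertools import combinations, combinations_with_replacement

def power_weights(weights, r, symmetric=False):
    """Weights of the r-th exterior (i1 < ... < ir) or symmetric
    (i1 <= ... <= ir) power of a representation with the given weights."""
    chooser = combinations_with_replacement if symmetric else combinations
    return [tuple(map(sum, zip(*combo))) for combo in chooser(weights, r)]

# weights of the standard representation of sl(4, C) in the Dynkin basis
EPS = [(1, 0, 0), (-1, 1, 0), (0, -1, 1), (0, 0, -1)]
wedge2 = power_weights(EPS, 2)                  # 6 weights; highest is omega_2
sym2 = power_weights(EPS, 2, symmetric=True)    # 10 weights; highest is 2*omega_1
```

The six weights of the exterior square are all distinct and include ε1 + ε2 = (0, 1, 0) = ω2; the symmetric square has ten weights, among them 2ε1 = (2, 0, 0) = 2ω1, matching the sl(4, C) examples worked out later in this section.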

1. Special linear Lie algebra a` = sl(` + 1, C).
Here ` is the rank of the algebra, and n = ` + 1 the order of its defining matrices. Let ei be the ith coordinate vector in Cn, and εi the ith coordinate function, such that εi(ej) = δij, where i, j = 1, 2, . . . , n. As usual, hk denotes the kth fundamental co-root and ωk the kth fundamental weight, obeying ωi(hj) = δij (i, j = 1, 2, . . . , `).
From the known Cartan matrix A = [ a ji ] for the algebra, we get its simple roots,
αi ( h j ) = a ji . Here is a brief reminder of known facts on structure, followed by
results on representations.
• Cartan subalgebra, its dual, root space:
CSA: h = {diag[a1, . . . , an]; ai ∈ C, ∑i ai = 0}.
CSA∗: h∗ = {∑i ci εi; ci ∈ C, ∑i ci = 0} / (∑i εi = 0).
Root lattice: ΛR = {∑i ci εi | ci ∈ Z, ∑i ci = 0} / (∑i εi = 0).
All roots: ∆ = {αij = εi − εj, with i ≠ j = 1, . . . , ` + 1}. |∆| = `(` + 1).
Simple roots: Σ = {αi = αi,i+1 = εi − εi+1, with i = 1, . . . , `}, so that the positive roots, αij with 1 ≤ i < j ≤ ` + 1, may be written as αi + αi+1 + · · · + αj−1.
It is useful to note that ∑i αi = ε1 − ε`+1. The fundamental co-roots are hi = ei − ei+1, so that εi(hi) = 1, εi+1(hi) = −1, with 1 ≤ i ≤ `.
• Weight space:
Weight lattice: Λ = {∑i mi εi | mi ∈ Z} / (∑i εi = 0).
Dominant weights: λ = ∑i mi εi, with integers m1 ≥ m2 ≥ · · · ≥ mn ≥ 0, so that the λ(hi) are all non-negative integers.
Fundamental weights: ω1 = ε1, ω2 = ε1 + ε2, . . . , ω` = ε1 + ε2 + · · · + ε`, in accordance with ωi(hj) = δij. Note also that the lowest strongly dominant weight is given by δ = ∑i ωi = ∑k k ε`+1−k.
The root space admits several bases tied together by relations to be given here.
The simple roots may be written in the Dynkin basis { ω 1, . . . , ω ` } as αi = ∑j ω j a ji
where a ji = αi ( h j ) (as already indicated before), or explicitly,
α1 = ε 1 − ε 2 = (2, − 1, 0, . . . ),
α2 = ε 2 − ε 3 = (− 1, 2, −1, 0, . . . ),
α3 = ε 3 − ε 4 = (0, − 1, 2, −1, 0, . . . ), ..., ...,
α`−1 = ε `−1 − ε ` = ( . . . , 0, − 1, 2, − 1), and
α` = ε ` − ε `+1 = ( . . . , 0, − 1, 2).
Upon inverting, we obtain the relations
ε1 = ω 1,
ε2 = ω 2 − ω 1 = ω 1 − α1,
ε3 = ω 3 − ω 2 = ω 1 − α1 − α2, ..., ...,
ε ` = ω ` − ω `−1 = ω 1 − α1 − · · · − α`−1 ,
ε `+1 = − ω ` = ω 1 − α1 − · · · − α` .
• The classification of the irreducible representations of the special linear Lie algebra
sl ( n, C ) has a neat solution: all of its irreps can be built up from its standard repre-
sentation, the latter being defined by traceless matrices of order n.
To begin, it is useful to recall the results concerning the case of sl (3, C ). In
its standard representation, each element is represented by a 3 × 3 traceless com-
plex matrix, and a basis of the CSA may be taken to be 3 × 3 diagonal matri-
ces, e.g. π ( h1) = diag [1, − 1, 0] and π ( h2) = diag [0, 1, − 1]. The basis of C3 is
the usual column vectors ξ i , which are joint eigenvectors of π ( h 1), π ( h2) with
weights ε 1 = (1, 0), ε 2 = (− 1, 1) and ε 3 = (0, − 1), where the components are
given by ε i ( h j ). Alternatively we may write ε 1 = ω 1, ε 2 = ω 2 − ω 1 and ε 3 = − ω 2.
The highest weight is ε 1 corresponding to the highest-weight vector ξ 1 = u. Then
we can identify ξ 2 and ξ 3 with F1 u and F2 F1 u, respectively, where F1 and F2 are
the lowering operators associated with the two simple roots α1 and α2 of sl (3, C ).
As ε 1 = ω 1, this is just the irrep called π1,0.
Now, in the general case of sl ( n, C ), each element h in h is represented on Cn
by a diagonal n × n matrix of the form H = diag [ a1, . . . , an ], where ai ∈ C and
∑n1 ai = 0. Its eigenvectors are the standard basis vectors in Cn , with eigenvalues
a1, . . . , an , so the weights, as elements of h∗ , are ε 1, ε 2, . . . , ε n (because ε i ( H ) = ai ).
As ε 1 = ω 1, ε 2 = ω 1 − α1, etc., we see that ε 1 is the highest weight (whereas
ε n is the lowest), and so the standard representation is π1,0,... . The representation
space Cn is spanned by v, F1v, F2 F1 v,. . . , F` · · · F2 F1 v, where v is the highest-weight
vector, and Fi = Fαi are the lowering operators associated with the simple root αi .
The dual π ∗ to π1,0,... is a representation such that π ∗ ( H ) = − H T , so its
weights are − ε `+1, − ε `, . . . , − ε 1, or ω ` , ω ` − α` , . . . , ω ` − ( α` + α`−1 + · · · + α1 ).
Its highest weight being − ε `+1 = ω ` indicates it must be the irrep π...,0,1 . Both
π1,0,... and π...,0,1 of course have the same dimension n.
Other fundamental representations (those having a fundamental weight as highest weight) are all exterior products of π1,0,..., called ϕ1 for short. For example, take ∧²ϕ1. It has `(` + 1)/2 weights, of the form εi + εj with 1 ≤ i < j ≤ ` + 1, with ε1 + ε2 = ω2 as the highest weight. Once we have convinced ourselves that it is irreducible, we can write ∧²ϕ1 = π0,1,0,.... Generally, the weights of ∧^r ϕ1 (with 2 ≤ r ≤ `) are given by all the r-wise sums of the form εi1 + εi2 + · · · + εir with 1 ≤ i1 < i2 < · · · < ir ≤ ` + 1, forming an irreducible set, with highest weight ε1 + ε2 + · · · + εr = ωr. And so ∧^r ϕ1 is just the irrep π0,...,1,0,... (with the 1 in the rth position), of dimension (` + 1)!/r!(` + 1 − r)!. Each fundamental representation is associated to a node of the Dynkin diagram, as shown in Fig. 7.4.

Figure 7.4: The fundamental representations ϕi = ∧^i ϕ1 = π0,...,1,0,..., with i = 2, . . . , `, of sl(n, C), associated to the nodes of the Dynkin diagram A`: ϕ1, ϕ2, ϕ3, . . . , ϕ`−1, ϕ`.

We turn now to symmetric powers of the fundamental representations. The simplest is Sym^m ϕ1, having the weights εi1 + εi2 + · · · + εim with 1 ≤ i1 ≤ i2 ≤ · · · ≤ im ≤ ` + 1. It turns out that it is irreducible, with highest weight mε1 = mω1; that is, Sym^m ϕ1 = πm,0,.... But symmetric powers like Sym^m ϕr (where ϕi = ∧^i ϕ1, with i > 1) are not irreducible, although they contain the irrep π0,...,0,m,0,... (with m in the rth position) when decomposed into a direct sum.
The general result on the representations of the special linear Lie algebra can be stated as follows: every irreducible representation of sl(` + 1, C) has a highest weight of the form ∑i mi ωi = (m1 + m2 + · · · + m`)ε1 + · · · + m` ε`, where the mi are non-negative integers; conversely, to every `-tuple of non-negative integers m1, . . . , m` corresponds a unique irreducible representation of sl(` + 1, C) having highest weight (m1, m2, . . . , m`). It occurs in the tensor product

Sym^m1 ϕ1 ⊗ Sym^m2 ϕ2 ⊗ · · · ⊗ Sym^m` ϕ` = πm1,m2,...,m` ⊕ · · · ,    (7.7)

where the ϕi are the fundamental representations of the algebra, all obtainable from the standard representation: ϕi = ∧^i ϕ1.
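For a`, the dimension of πm1,...,m` can also be computed directly from Weyl's dimension formula, which takes a product form over the partition coordinates ci = mi + mi+1 + · · · + m`. The sketch below (illustrative code, not from the text) is convenient for checking the examples that follow:

```python
from fractions import Fraction
from itertools import combinations

def dim_sl(m):
    """Dimension of the irrep of sl(l+1, C) with Dynkin labels m = (m1, ..., ml),
    via Weyl's formula: product over i < j of (c_i - c_j + j - i)/(j - i),
    where c_i = m_i + ... + m_l and c_{l+1} = 0."""
    n = len(m) + 1
    c = [sum(m[i:]) for i in range(len(m))] + [0]
    d = Fraction(1)
    for i, j in combinations(range(n), 2):
        d *= Fraction(c[i] - c[j] + j - i, j - i)
    return int(d)
```

For sl(4, C) this gives dim π1,0,0 = 4, dim π0,1,0 = 6, and dim π1,0,1 = 15, matching the decompositions below; for a2, dim π1,1 = 8 and dim π3,0 = 10 reproduce Examples 7 and 8.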

E XAMPLE 9: a3 = sl(4, C). Here ϕ1 = π1,0,0 has four weights, namely ε1, ε2, ε3, ε4, written in the basis (ω1, ω2, ω3) as ε1 = (1, 0, 0), ε2 = (−1, 1, 0), ε3 = (0, −1, 1), and ε4 = (0, 0, −1). The dual representation ϕ1∗ has the weights −εi, the highest weight being −ε4. The weights of ϕ2 = ∧²ϕ1 are ε1 + ε2, ε1 + ε3, ε1 + ε4, ε2 + ε3, ε2 + ε4, and ε3 + ε4. Noting that they are symmetric about the origin, we see that ϕ2 = π0,1,0 is self-dual, being isomorphic with (∧²ϕ1)∗ = ∧²ϕ1∗. Finally, ϕ3 = ∧³ϕ1 has weights ε1 + ε2 + ε3, ε1 + ε2 + ε4, ε1 + ε3 + ε4, and ε2 + ε3 + ε4, and so ϕ3 = π0,0,1. Since ε1 + ε2 + ε3 + ε4 = 0, we see that ∧³ϕ1 = ϕ1∗. In conclusion, the fundamental representations of the Lie algebra a3 = sl(4, C) are π1,0,0 = ϕ1, π0,1,0 = ∧²ϕ1, and π0,0,1 = ∧³ϕ1 = ϕ1∗, of respective dimensions 4, 6, and 4.
Consider now the symmetric powers of ϕ1 (homogeneous polynomials in 4 variables on C4). In particular, Sym²ϕ1 has the weights 2ε1, ε1 + ε2, ε1 + ε3, ε1 + ε4, 2ε2, ε2 + ε3, ε2 + ε4, 2ε3, ε3 + ε4, and 2ε4, all expressible in the form λ − n1α1 − n2α2 − n3α3, where the ni are non-negative integers and λ = 2ε1. So Sym²ϕ1 is the irrep π2,0,0 (of dimension 10, with highest weight 2ε1 = 2ω1 and lowest weight 2ε4 = −2ω3). Similarly, Sym³ϕ1 has the weights aε1 + bε2 + cε3 + dε4, with non-negative integers a, b, c, d such that a + b + c + d = 3, again expressible in the form λ − n1α1 − n2α2 − n3α3 with λ = 3ε1. It is the irrep π3,0,0 (dimension 20, with highest weight 3ε1 = 3ω1 and lowest weight 3ε4 = −3ω3). Generally, Sym^m ϕ1 is the irrep πm,0,0 of highest weight mω1 and dimension (m + 3)!/(m! 3!).
On the other hand, Sym²ϕ2 is not irreducible (admitting 21 weights), although it contains the irrep π0,2,0 (which has 20 weights). In fact, we have the decomposition Sym²ϕ2 = π0,2,0 ⊕ C. In general, Sym^m ϕi is not irreducible for i > 1.
The tensor product Sym^m1 ϕ1 ⊗ Sym^m2 ϕ2 ⊗ Sym^m3 ϕ3 contains a copy of the irreducible representation πm1,m2,m3 with highest weight m1ω1 + m2ω2 + m3ω3. In particular, ϕ1 ⊗ ϕ2 ≅ π1,1,0 ⊕ π0,0,1 (checking the dimensions: 4 × 6 = 20 + 4), and ϕ1 ⊗ ϕ3 ≅ π1,0,1 ⊕ C (consistent with 4 × 4 = 15 + 1), where π1,0,1 is the adjoint representation of sl(4, C).
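The dimension (m + 3)!/(m! 3!) of Sym^m ϕ1 is just the number of weakly increasing index m-tuples, i.e. of degree-m monomials in four variables; a quick sanity check (illustrative code with a hypothetical helper name):

```python
from itertools import combinations_with_replacement
from math import factorial

def dim_sym_standard(m, n=4):
    """Number of weights (counted with multiplicity) of Sym^m of the
    n-dimensional standard representation: weakly increasing m-tuples."""
    return sum(1 for _ in combinations_with_replacement(range(n), m))
```

This gives 10 for m = 2 and 20 for m = 3, agreeing with π2,0,0 and π3,0,0 above, and with (m + n − 1)!/(m!(n − 1)!) in general.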
2. Symplectic Lie algebra sp(`, C) = c`.
• Root space:
CSA: h = {H = diag[a1, . . . , a`; −a1, . . . , −a`]; ai ∈ C}.
∆+: Positive roots are εi ± εj (1 ≤ i < j ≤ `) and 2εi (1 ≤ i ≤ `). |∆+| = `².
Σ: Simple roots are α1 = ε1 − ε2, . . . , α`−1 = ε`−1 − ε`, α` = 2ε`. The long root is α` = 2ε`, with |α`|² = 2|αi|² for i < `. The maximum root is 2ε1.
CSA∗: Fundamental co-roots: h1 = e1 − e2, . . . , h`−1 = e`−1 − e`, h` = e`.
• Weight space:
Weight lattice: Λ = {∑i mi εi; mi ∈ Z}. The dominant-weight lattice Λd consists of the ∑i mi εi with mi ∈ N, m1 ≥ m2 ≥ · · · ≥ m` ≥ 0.
The fundamental weights are ωi = ε1 + ε2 + · · · + εi (1 ≤ i ≤ `), and their sum is δ = `ε1 + (` − 1)ε2 + · · · + 2ε`−1 + ε`.
• The irreps: (1) If ϕ1 = π1,0,... is the standard rep, then the fundamental representation ϕr = π...,0,1,0,... (the 1 is in the rth position; 1 ≤ r ≤ `), with highest weight ωr, is contained in ∧^r ϕ1 (more precisely, it is the kernel of the map ∧^r ϕ1 → ∧^(r−2) ϕ1, i.e. the part of ∧^r ϕ1 that is mapped to 0 in ∧^(r−2) ϕ1). (2) To an arbitrary `-tuple (m1, . . . , m`) ∈ N^` (labeling a weight on Λd) corresponds a unique irrep with highest weight m1ω1 + · · · + m`ω`, to be denoted by πm1,...,m`. This exhausts all possible irreps of the algebra. Each such πm1,...,m` is contained in the tensor product Sym^m1 ϕ1 ⊗ Sym^m2 ϕ2 ⊗ · · · ⊗ Sym^m` ϕ`.
E XAMPLE 10: The simplest case is sp(1, C ): there are two roots, ± 2ε 1, as in sl (2, C )
to which it is isomorphic. In sp(2, C ), of rank 2, we have the simple roots α1 =
ε 1 − ε 2 and α2 = 2ε 2 (the long root), and the positive roots ε 1 ± ε 2, 2ε 1, and 2ε 2.
Its root diagram is shown in Fig. 7.5 (it is equivalent to that for so(5, C ) to be
shown later in Fig. 7.6(b), indicating the isomorphism sp(2, C ) ∼ = so (5, C ) ). The
fundamental weights in sp (2, C ) are ω 1 = ε 1 and ω 2 = ε 1 + ε 2.
The standard representation ϕ1 of sp(2, C) is sp(2, C) itself acting on C4. This means its standard basis vectors ξi on C4 are the joint eigenvectors of H, with eigenvalues ε1, ε2, −ε1, −ε2. Noting that ε1 = ω1, ε2 = ω1 − α1, −ε2 = ω1 − α1 − α2, and −ε1 = ω1 − α1 − α2 − α1, we see that ϕ1 is the irrep with highest weight ω1, which we indicate with ϕ1 = π1,0. From the reflection symmetry of their weights, we conclude that ϕ1 is self-dual.
The second exterior product ∧²ϕ1, which is a representation of highest weight ε1 + ε2, has 6 weights: ±(ε1 + ε2), ±(ε1 − ε2), and 0 (multiplicity 2). Since the five weights ±(ε1 + ε2), ±(ε1 − ε2), and 0 (once) can be reached from ω2 by subtracting α1 and α2, we see that they form an irreducible set associated to π0,1. So ∧²ϕ1 itself is not irreducible, but is expressible as a direct sum of irreps: π0,1 ⊕ C. This example shows that, in contrast to sl(n), the fundamental representations of sp(`, C) are not equal to, but are contained in, exterior powers of ϕ1; in other words, ∧^r ϕ1 = π0,...,0,1,0,... ⊕ · · · (the 1 is in the rth position).

Figure 7.5: Lie algebra sp(2, C). (a) LHS: Root diagram (|∆| = 8), arrows indicating the simple roots. (b) RHS: Weight diagram of ∧²ϕ1, with arrows indicating the fundamental weights.

3. Odd orthogonal Lie algebra so(2` + 1, C) = b`.
• Roots, root space:
Σ: Simple roots are αi = εi − εi+1 (1 ≤ i ≤ ` − 1) and α` = ε`.
∆+: The positive roots are α0^i = εi and α±^ij = εi ± εj (1 ≤ i < j ≤ `). There are 2 × `(` − 1)/2 of the α±^ij and ` of the α0^i, and so |∆+| = `². They can all be written in terms of the simple roots as follows:
α−^ij = αi + αi+1 + · · · + αj−1, (1 ≤ i < j ≤ `),
α+^ij = αi + · · · + αj−1 + 2(αj + · · · + α`),
α0^i = αi + αi+1 + · · · + α`.
The maximum root is α+^12 = ε1 + ε2; the sum of the simple roots is ∑i αi = ε1. The fundamental co-roots are hi = ei − ei+1 (1 ≤ i ≤ ` − 1), and h` = 2e`.
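The expressions just given for the positive roots of b` in terms of simple roots can be checked mechanically by representing every root as an integer vector in the εi basis (a sketch with hypothetical helper names; for ` = 3 one finds |∆+| = 9 and ε1 + ε2 = α1 + 2(α2 + α3)):

```python
def simple_roots_b(l):
    """Simple roots of so(2l+1, C) in the epsilon basis:
    alpha_i = e_i - e_{i+1} for i < l, and alpha_l = e_l."""
    alphas = []
    for i in range(l - 1):
        a = [0] * l
        a[i], a[i + 1] = 1, -1
        alphas.append(tuple(a))
    alphas.append(tuple([0] * (l - 1) + [1]))
    return alphas

def positive_roots_b(l):
    """Positive roots e_i and e_i +/- e_j (i < j) in the epsilon basis."""
    roots = []
    for i in range(l):
        e = [0] * l
        e[i] = 1
        roots.append(tuple(e))
        for j in range(i + 1, l):
            for sign in (1, -1):
                r = [0] * l
                r[i], r[j] = 1, sign
                roots.append(tuple(r))
    return roots

def add(*vectors):
    """Componentwise sum of root vectors."""
    return tuple(map(sum, zip(*vectors)))
```

For ` = 3, add(a1, a2, a3) returns ε1 (the sum of the simple roots) and add(a1, a2, a2, a3, a3) returns ε1 + ε2, the maximum root, both of which indeed lie in the positive-root list.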
• Weights, weight space:
Λ: It is generated by ε1, . . . , ε`, and ½(ε1 + · · · + ε`).
Λd: It consists of all ∑i ai εi, with either all ai integral or all ai half-integral, and a1 ≥ a2 ≥ · · · ≥ a` ≥ 0.
The fundamental weights are ω1 = ε1, ω2 = ε1 + ε2, . . . , ω`−1 = ε1 + ε2 + · · · + ε`−1, and ω` = ½(ε1 + ε2 + · · · + ε`). The lowest strongly dominant weight is δ = ∑i ωi = (` − ½)ε1 + (` − 3/2)ε2 + · · · + (3/2)ε`−1 + ½ε`.
• The irreps: (1) The standard rep ϕ1 has the weights ±εi and 0, with highest weight ε1 = ω1. Except for the one associated with ω`, all other fundamental representations are obtainable from ϕ1 by exterior product: ϕr = ∧^r ϕ1 = π...,0,1,0,...,0 (with 1 in the rth position), with highest weight ωr for 1 ≤ r ≤ ` − 1. (∧^` ϕ1 is also irreducible, but with highest weight 2ω`, twice the expected weight.) (2) Any weight λ may be expanded as λ = m1ω1 + · · · + m`−1ω`−1 + m`ω`; if m` is even, the last term may also be written (m`/2)(2ω`). If λ is in Λd, there exists a (unique) irrep πλ having λ as the highest weight. A copy of πλ can be found in the tensor product (2a) Sym^m1 ϕ1 ⊗ · · · ⊗ Sym^m`−1 ϕ`−1 ⊗ Sym^(m`/2) ϕ` if m` is even; or in (2b) Sym^m1 ϕ1 ⊗ · · · ⊗ Sym^m`−1 ϕ`−1 ⊗ Sym^m` π0,...,0,1 if m` is odd.
Therefore, all the irreps ϕi and the irreps of type (2a) can be constructed from the standard representation ϕ1. But the irrep π0,...,0,1 with highest weight ω` (associated to the end node of the Dynkin diagram, connected to the rest by a double bond), and all the irreps of type (2b), cannot be obtained in this way. The proper way to study these spin representations makes use of the Clifford algebras, to be explained in Appendix B.
4. Even orthogonal Lie algebra so(2`, C) = d`.
• Roots, root space:
Σ: The simple roots are αi = εi − εi+1 (i = 1, . . . , ` − 1) and α` = ε`−1 + ε`.
∆+: The positive roots (|∆+| = `(` − 1)) are α±^ij = εi ± εj (1 ≤ i < j ≤ `), including the maximum root α+^12 = ε1 + ε2. They can all be written in terms of the simple roots as follows:
α−^ij = αi + αi+1 + · · · + αj−1, (1 ≤ i < j ≤ `),
α+^ij = αi + · · · + αj−1 + 2(αj + · · · + α`−2) + α`−1 + α`, (1 ≤ i < j ≤ ` − 2),
α+^(i,`−1) = αi + αi+1 + · · · + α`−2 + α`−1 + α`, (i = 1, . . . , ` − 2),
α+^(i,`) = αi + αi+1 + · · · + α`−2 + α`, (i = 1, . . . , ` − 2),
α+^(`−1,`) = α`.
CSA: h is spanned by hi = ei − ei+1 (i = 1, . . . , ` − 1) and h` = e`−1 + e`.
• Weights, weight space:
Λ: It is generated by ε1, . . . , ε` and ½(ε1 + · · · + ε`).
Λd: It consists of all ∑i ai εi where the ai are either all integral or all half-integral, and such that a1 ≥ a2 ≥ · · · ≥ a`−1 ≥ |a`|. The fundamental weights are ωi = ε1 + ε2 + · · · + εi for 1 ≤ i ≤ ` − 2, ω`−1 = ½(ε1 + ε2 + · · · + ε`−1 − ε`), and ω` = ½(ε1 + ε2 + · · · + ε`−1 + ε`).
Figure 7.6: (a) LHS: Root diagram (|∆| = 4) for so(4, C); note the isomorphism so(4, C) ≅ sl(2, C) × sl(2, C). (b) RHS: Root diagram (|∆| = 8) for so(5, C). Arrows indicate the simple roots.

• The irreps: (1) With ϕ1 being the standard rep, ϕi = ∧^i ϕ1 are the irreps of highest weights ωi for i = 1, . . . , ` − 2, and $ = ε1 + · · · + ε`−1 for i = ` − 1. But ∧^` ϕ1 is not irreducible, being fully reducible to π2ω`−1 ⊕ π2ω`. (2) Any weight λ in Λd may be written as m1ω1 + · · · + m`−1ω`−1 + m`ω`. If m`−1 + m` is even, the last two terms may be written m`$ + ½(m`−1 − m`)(2ω`−1) when m`−1 ≥ m`, and m`−1$ + ½(m` − m`−1)(2ω`) when m` ≥ m`−1. A copy of the irrep πλ may be found in one of these tensor products: (2a) Sym^m1 ϕ1 ⊗ · · · ⊗ Sym^m`−2 ϕ`−2 ⊗ Sym^m` ϕ`−1 ⊗ Sym^((m`−1 − m`)/2) π2ω`−1 if m`−1 + m` is even and m`−1 ≥ m`; (2b) Sym^m1 ϕ1 ⊗ · · · ⊗ Sym^m`−2 ϕ`−2 ⊗ Sym^m`−1 ϕ`−1 ⊗ Sym^((m` − m`−1)/2) π2ω` if m`−1 + m` is even and m` ≥ m`−1; (2c) Sym^m1 ϕ1 ⊗ · · · ⊗ Sym^m`−2 ϕ`−2 ⊗ Sym^m`−1 πω`−1 ⊗ Sym^m` πω` if m`−1 + m` is odd.
The irreducible representations of types (2a) and (2b) have weights in the integral sublattice Z{ε1, . . . , ε`} of Λ. On the other hand, those of type (2c) are built on the irreps of highest weights ω`−1 and ω`, which lie in the half-integral sublattice and correspond to the two nodes that branch off one end of the Dynkin diagram. These reps πω`−1 and πω` are the two half-spin representations of so(2`, C), both of dimension 2^(`−1). Clifford algebras are a more appropriate approach to studying them, as we will explain in Appendix B.
E XAMPLE 11: We now illustrate the above results for some lower-rank algebras. For ` = 1, d1 = so(2, C) ≅ C is not semisimple (Chapter 6); b1 = so(3, C) has the same system of roots as sl(2, C), namely {±ε1}. But while the fundamental weight is ε1 for a1, it is ½ε1 for b1. So the irrep of b1 with highest weight ½ε1 (the standard representation of a1) is a spin representation of an orthogonal Lie algebra.
E XAMPLE 12: Rank 2 includes d2 = so (4, C ) and b2 = so(5, C ). Lie algebra
so (4, C ) has two simple roots, α1 = ε 1 − ε 2 and α2 = ε 1 + ε 2 in a disconnected
Dynkin diagram. Its roots, ±( ε1 − ε 2 ) and ±( ε1 + ε 2), form the root system ∆
shown in Fig.7.6(a). They lie on two orthogonal lines, which means that so (4, C )
is reducible, being decomposable into the direct sum of its two si subalgebras, one
of roots ±( ε1 − ε 2 ) and the other of roots ±( ε1 + ε 2 ). The fundamental coroots of
so (4, C ) are h1 = e1 − e2 and h2 = e1 + e2 , from which follow the fundamental
weights ω1 = ½(ε1 − ε2) and ω2 = ½(ε1 + ε2). The standard representation ϕ1 acts on C4, whose basis vectors ξi carry the weights ±ε1 and ±ε2. The second exterior product ∧²ϕ1 has weights ±(ε1 + ε2), ±(ε1 − ε2), and 0 (multiplicity 2), shown in the weight diagram in Fig. 7.7. Both representations are reducible and can be understood in terms of the representations of sl(2, C) × sl(2, C).

Figure 7.7: Representations of so(4, C). (a) LHS: Weight diagram of the standard representation ϕ1 = 4. (b) RHS: Weight diagram of the exterior product representation ∧²ϕ1 = 6. Arrows indicate the fundamental weights.
The other rank-2 orthogonal Lie algebra, b2 = so (5, C ), has two simple roots,
of unequal sizes, given by α1 = ε 1 − ε 2 and α2 = ε 2. Its root system ∆ =
{± ε1, ± ε 2, ±( ε1 + ε 2 ), ±(ε1 − ε 2 )} is illustrated in Fig.7.6(b). Its fundamental co-
roots are h1 = e1 − e2 and h2 = 2e2 , from which are computed the fundamen-
tal weights ω 1 = ε 1 and ω 2 = 1/2( ε 1 + ε 2). The standard representation ϕ1 of
so(5, C), of highest weight ε1, has weights ±ε1, ±ε2 and 0. Its second exterior product ∧²ϕ1 has weights ±ε1, ±ε2, ±(ε1 + ε2), ±(ε1 − ε2), and 0 (multiplicity 2). It is irreducible and of highest weight ε1 + ε2, being the adjoint representation of so(5, C). The weight diagrams for these two representations are shown in Fig. 7.8. Again, the representation of highest weight ω2 is a spin representation.
E XAMPLE 13: Rank 3: d3 = so(6, C) (|∆| = 12) and b3 = so(7, C) (|∆| = 18).
• so(6, C) (or d3) has the fundamental weights ω1 = ε1, ω2 = ½(ε1 + ε2 − ε3), and ω3 = ½(ε1 + ε2 + ε3). Its standard representation ϕ1 has the weights ±εi (i = 1, 2, 3), and is irreducible with the highest weight ε1 and of dimension 6 (ϕ1 = π1,0,0). The exterior power ∧²ϕ1 has the weights ±(εi ± εj) (i < j) and 0
Figure 7.8: Representations of so(5, C). (a) LHS: Weight diagram of ϕ1 = 5. (b) RHS: Weight diagram of ∧²ϕ1 = 10. Arrows indicate the fundamental weights, generators of the weight lattice.

(3 times), making it 15-dimensional. It is irreducible (the adjoint representation, of dimension 15), with highest weight ε1 + ε2 = ω2 + ω3. However, ∧³ϕ1 is not irreducible, being reducible to two irreducible components of highest weights ε1 + ε2 − ε3 = 2ω2 and ε1 + ε2 + ε3 = 2ω3.
• so(7, C) (or b3) has the fundamental weights ω1 = ε1, ω2 = ε1 + ε2, and ω3 = ½(ε1 + ε2 + ε3). Its standard representation ϕ1 has the weights ±εi (i = 1, 2, 3) and 0. It is irreducible with highest weight ε1 and of dimension 7. The exterior power ∧²ϕ1 has the weights ±(εi ± εj) (i < j), ±εi, and 0 (3 times), making it 21-dimensional. It is irreducible (the adjoint representation, of dimension 21), with highest weight ε1 + ε2 = ω2. The exterior power ∧³ϕ1 has the weights ±ε1 ± ε2 ± ε3; ±εi ± εj; ±εi; 0 (3 times). It turns out to be irreducible, with highest weight ε1 + ε2 + ε3 = 2ω3.
The important point to note is that the method of construction we have been
using produces only irreps with highest weights lying in the integral sublattice of
Λ, thus missing the weights generated by ω 2 and ω 3 in the case of so(6, C ), and
ω 3 in the case of so (7, C ).

5. Exceptional Lie algebra g2.
As g2 is defined by an FSR of α1, α2 such that |α2| = √3 |α1| and θ(α1, α2) = 150°, we will take h to be a subspace of C3, with a basis such that each h ∈ h is ∑i ai ei (i = 1, 2, 3) with the constraint ∑i ai = 0, or ε1 + ε2 + ε3 = 0. In this basis we will have:
Simple roots Σ (2): α1 = ε2, α2 = ε1 − ε2.
Positive roots ∆+ (6): ε1, ε2, −ε3, ε1 − ε2, ε2 − ε3, ε1 − ε3.
Fundamental co-roots FCR: h1 = −e1 + 2e2 − e3, h2 = e1 − e2.
Fundamental weights FW: ω1 = ε1 + ε2, ω2 = ε1 − ε3; |ω1| = (1/√3)|ω2| = |α1|.
Lattices: ΛR = Λ. See Fig. 7.13.
g2 has two fundamental representations:
(1) the 7-dimensional standard representation ϕ1 = π1,0, with weights 0, ±ε1, ±ε2, ±ε3; and
(2) the 14-dimensional adjoint representation π0,1, with weights 0 (occurring twice) and ±εi, ±(εi − εj) (1 ≤ i < j ≤ 3).
All the irreps of g2 are self-dual, the smallest having dimensions 1, 7, 14, 27, 64, 77 (twice), 182, 189, . . . (calculated with the dimension formula (7.28)). Note that both π3,0 and π0,2 have dimension 77, and π1,1 has dimension 64.
E XAMPLE 14: Construction of irrep π1,0 of g2 . The fundamental roots of g2 relative
to a given h are α1 = (2, − 1) and α2 = (− 3, 2). Starting from the weight ω 1 =
(1, 0) = − ε 3, we successively identify the following strings of weights:
Sα1 · (1, 0) : (1,0), (− 1, 1); doublet of s1 ;
Sα2 · (− 1, 1) : (− 1, 1), (2, − 1); doublet of s2 ;
Sα1 · (2, − 1) : (2, − 1), (0,0), (− 2, 1); triplet of s1 ;
Sα2 · (− 2, 1) : (− 2, 1), (1, − 1); doublet of s2 ;
Sα1 · (1, − 1) : (1, − 1), (− 1, 0); doublet of s1 .
The weights are distributed over the orbits of the dominant weights (1, 0) (of
size 6) and (0, 0) (of size 1). This standard representation of g2 is seven-dimensional
and self-dual, as shown in Fig.7.9. Its space is spanned by the following vectors
with the corresponding indicated weights:

vectors: v, F1v, F2F1v, F1F2F1v, F1²F2F1v, F2F1²F2F1v, F1F2F1²F2F1v
weights: −ε3, ε1, ε2, 0, −ε2, −ε1, ε3
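The string-by-string construction of Example 14 can be automated: from a weight µ with ith Dynkin label µi > 0, the weights µ − αi, . . . , µ − µi αi also belong to the representation. This naive descent produces the complete weight system only when every multiplicity is one, which is the case for π1,0 of g2 (an illustrative sketch; the two rows below are α1 = (2, −1) and α2 = (−3, 2) as in the text):

```python
def weights_by_strings(highest, simple_roots):
    """Weight system generated by descending alpha_i-strings from the
    highest weight; a complete listing only if all multiplicities are 1."""
    found, queue = {tuple(highest)}, [tuple(highest)]
    while queue:
        mu = queue.pop()
        for i, alpha in enumerate(simple_roots):
            for k in range(1, max(mu[i], 0) + 1):   # descend the alpha_i-string
                nu = tuple(m - k * a for m, a in zip(mu, alpha))
                if nu not in found:
                    found.add(nu)
                    queue.append(nu)
    return found

G2_SIMPLE = [(2, -1), (-3, 2)]                 # alpha_1, alpha_2 in the Dynkin basis
seven = weights_by_strings((1, 0), G2_SIMPLE)  # the 7 weights listed above
```

Running this reproduces exactly the seven weights found by hand in Example 14.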

Figure 7.9: g2. (a) LHS: Root diagram with all 12 roots; α1 = ε2 is the short root and α2 = ε1 − ε2 the long one. (b) RHS: Weight diagram of the standard (seven-dimensional) representation ϕ1 = π1,0; it is self-dual, with ω1 = (1, 0) = −ε3 as the highest weight, and −ω1 = ε3 the lowest.

E XAMPLE 15: Construction and decomposition of tensor product π1,0 ⊗ π1,0. The tensor-product representation π1,0 ⊗ π1,0 splits into a symmetric part and an antisymmetric part, as in ab = ½(ab + ba) + ½(ab − ba) or, symbolically, using Young diagrams (described in Appendix A):
1
= Sym2 π1,0 ⊕ 2
^
1 ⊗ 2 = 1 2 ⊕ π1,0.
2
V2
The antisymmetric part, denoted Λ²π1,0, corresponds to a representation consisting of the 21 antisymmetric products of two vectors, one from each of the two
copies of π1,0, with weights given by pairwise sums of distinct
weights (0 and ±ε i) of π1,0. It is reducible and decomposes into a direct sum of
two irreducible representations, the standard representation π1,0 and the adjoint
representation π0,1. So we write Λ²π1,0 ≅ π1,0 ⊕ π0,1.
The symmetric part, denoted Sym²π1,0, consists of the 28 symmetric products of
vectors of the two spaces π1,0, with weights given by pairwise sums of weights
of the component vectors. Among these, 2ω1 = 2(ε1 + ε2) is the highest weight
and −2(ε1 + ε2) the lowest; and the weight 0 has multiplicity 4. Sym²π1,0
decomposes into a direct sum of two irreducible representations according to
Sym²π1,0 ≅ C ⊕ π2,0, where C ≅ π0,0 and π2,0 is a 27-dimensional irreducible representation of g2 with highest weight 2(ε1 + ε2).
In conclusion, π1,0 ⊗ π1,0 is reducible, with the decomposition into irreps

π1,0 ⊗ π1,0 = π0,0 ⊕ π1,0 ⊕ π0,1 ⊕ π2,0

(in terms of dimensions, 7 × 7 = 1 + 7 + 14 + 27). All four lowest-dimensional
representations appear through this construction. The weight diagrams
of the two parts are shown in Fig. 7.10.
The reader may have noted that in the above calculations the irrep π2,0 arises in
Sym²π1,0 ≅ π0,0 ⊕ π2,0, whereas π0,1 is obtained from Λ²π1,0 ≅ π1,0 ⊕ π0,1.
In fact, any irrep πa,b of highest weight aω1 + bω2 can be obtained from the
decomposition of some tensor power of the standard representation.
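The multiplicity counts quoted above (28 and 21 products, with weight 0 occurring 4 times in Sym² and 3 = 2 + 1 times in Λ²) can be checked mechanically by forming pairwise sums of the seven weights. A short sketch, writing each weight as an integer pair in the basis (1/2, √3/2) so the arithmetic stays exact:

```python
from itertools import combinations, combinations_with_replacement
from collections import Counter

# Weights of the 7-dimensional rep of g2, as integer pairs in the basis
# (1/2, sqrt(3)/2): zero plus the six short roots.
seven = [(0, 0), (2, 0), (-2, 0), (1, 1), (-1, -1), (-1, 1), (1, -1)]

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

# Weight multisets of the symmetric and antisymmetric squares: pairwise
# sums with repetition allowed (Sym^2) or over distinct pairs (Lambda^2).
sym = Counter(add(u, v) for u, v in combinations_with_replacement(seven, 2))
alt = Counter(add(u, v) for u, v in combinations(seven, 2))

print(sum(sym.values()), sum(alt.values()))   # 28 21
print(sym[(0, 0)], alt[(0, 0)])               # 4 3
```

The zero-weight multiplicity 3 in Λ² matches 2 (adjoint) + 1 (standard), and the 4 in Sym² matches 3 (the 27) + 1 (the singlet).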


Figure 7.10: g2. Construction of higher-dimensional irreps by tensoring the standard representation π1,0. (a) LHS: Weight diagram of Λ²ϕ1, decomposable into
π1,0 ⊕ π0,1 = 7 + 14. (b) RHS: Weight diagram of Sym²ϕ1, decomposable into
π0,0 ⊕ π2,0 = 1 + 27.
7.2. IRREDUCIBLE REPRESENTATIONS 237

6. Exceptional Lie algebra f4.

CSA : h = C4 ; h0 = R4. Basis : { ei ; i = 1, . . . , 4 }.
Σ (4) : α1 = 1/2( ε 1 − ε 2 − ε 3 − ε 4), α2 = ε 4, α3 = ε 3 − ε 4, α4 = ε 2 − ε 3.
∆+ (24) : ε i , ε i + ε j, ε i − ε j ( i < j ), 1/2( ε 1 ± ε 2 ± ε 3 ± ε 4 ) .
FCR (4) : h1 = e1 − e2 − e3 − e4 , h2 = 2e4 , h3 = e3 − e4 , h4 = e2 − e3 .
FW (4) : ω i ( h j) = δij ; ω 1 = (1, 0, 0, 0), ω 2 = 1/2(3, 1, 1, 1), ω3 = (2, 1, 1, 0),
ω 4 = (1, 1, 0, 0);
δ = ∑ ω i = 1/2(11, 5, 3, 1) all written in basis ( ε 1, ε 2, ε 3, ε 4 ).

The dimensions of the finite-dimensional irreps of simple complex or real Lie
algebras are given by the Weyl dimension formula (7.23). For the smallest irreps
of f4, they are 1, 26, 52, 273, 324, 1053 (twice), 1274, 2652, . . . . The fundamental
representations are those with dimensions 26, 52, 273, 1274.

7. Exceptional Lie algebra e6.


For e6, as well as for e7 and e8 that immediately follow, the labeling of the
simple roots is just as shown in Chap. 6, Fig. 6.9.

CSA : h = C 6 ; h0 = R 6 .

Σ (6) : α1 = 1/2(ε1 − ε2 − ε3 − ε4 − ε5 + √3 ε6), α2 = ε1 + ε2,
α3 = ε2 − ε1, α4 = ε3 − ε2, α5 = ε4 − ε3, α6 = ε5 − ε4.

∆+ (36) : εi + εj (i < j ≤ 5), εi − εj (j < i ≤ 5), 1/2(±ε1 ± ε2 ± ε3 ± ε4 ± ε5 + √3 ε6)
(with an even number of minus signs).

FCR (6) : h1 = 1/2(e1 − e2 − e3 − e4 − e5 + √3 e6), h2 = e1 + e2,
hi = ei−1 − ei−2 (i = 3, 4, 5, 6).

FW (6) : ω1 = (0, 0, 0, 0, 0, 2/√3), ω2 = 1/2(1, 1, 1, 1, 1, √3),
ω3 = 1/2(−1, 1, 1, 1, 1, 5/√3), ω4 = (0, 0, 1, 1, 1, √3),
ω5 = (0, 0, 0, 1, 1, 2/√3), ω6 = (0, 0, 0, 0, 1, 1/√3).

∑ ωi : δ = (0, 1, 2, 3, 4, 4√3) in basis (ε1, ε2, ε3, ε4, ε5, ε6).

The smallest irreps have dimensions 1, 27 (twice), 78, 351 (four times), 650,
1728 (twice), 2430, 2925, 3003 (twice). The existence of multiple non-isomorphic
representations of the same dimension can be understood in part from the symmetry of the Dynkin diagram of e6. In the configuration of the Dynkin diagram
consistent with the definitions of αi and ωi given above (where the node α2 is
linked to the middle node α4, and α1, α3 are symmetric with α6, α5, respectively), the fundamental irreps of highest weights ω1, ω2, · · · , ω6 have dimensions 27, 78, 351,
2925, 351, 27. Of all the exceptional algebras, e6 is the only one that has any non-self-dual irreps (making it a possible candidate for a flavor-chiral particle theory).

8. Exceptional Lie algebra e7.

CSA : h = C 7 , h0 = R 7 .

Σ (7) : α1 = 1/2(ε1 − ε2 − · · · − ε6 + √2 ε7), α2 = ε1 + ε2,
αi = εi−1 − εi−2 (i = 3, 4, . . . , 7).

∆+ (63) : εi + εj (i < j ≤ 6); εi − εj (j < i ≤ 6); √2 ε7;
1/2(±ε1 ± ε2 ± · · · ± ε6 + √2 ε7) (odd number of minus signs).

FCR (7) : h1 = 1/2(e1 − e2 − · · · − e6 + √2 e7), h2 = e1 + e2,
hi = ei−1 − ei−2 (i = 3, 4, 5, 6, 7).

FW (7) : ω1 = (0, . . . , 0, √2), ω2 = 1/2(1, 1, 1, 1, 1, 1, 2√2),
ω3 = 1/2(−1, 1, 1, 1, 1, 1, 3√2), ω4 = (0, 0, 1, 1, 1, 1, 2√2),
ω5 = (0, 0, 0, 1, 1, 1, 3/√2), ω6 = (0, . . . , 0, 1, 1, √2),
ω7 = (0, . . . , 0, 1, 1/√2).

∑ ωi : δ = (0, 1, 2, 3, 4, 5, 17/√2).

9. Exceptional Lie algebra e8.

CSA : h = C 8 , h0 = R 8 .
Σ (8) : α1 = 1/2 ( ε 1 − ε 2 − · · · − ε 7 + ε 8), α2 = ε 1 + ε 2,
αi = ε i−1 − ε i−2 ( i = 3, 4, . . . , 8).
∆+ (120) : 1/2(± ε1 ± ε 2 ± · · · ± ε 7 + ε 8) (even number of minus signs),
εi + ε j ( i < j ≤ 8); εi − ε j ( j < i ≤ 8).
FCR (8) : h1 = 1/2 (e1 − e2 − · · · − e7 + e8 ), h2 = e1 + e2 ,
hi = ei−1 − ei−2 ( i = 3, 4, . . . , 8).
FW(8) : ω 1 = (0, . . . , 0, 2), ω 2 = 1/2(1, . . . , 1, 5),
ω 3 = 1/2(− 1, 1, . . . , 1, 7), ω 4 = (0, 0, 1, . . . , 1, 5),
ω 5 = (0, 0, 0, 1, . . . , 1, 4), ω 6 = (0, . . . , 0, 1, 1, 1, 3),
ω 7 = (0, . . . , 0, 1, 1, 2), ω 8 = (0, . . . , 0, 1, 1).
∑ ωi : δ = (0, 1, 2, 3, 4, 5, 6, 23).

The adjoint representation is also the standard representation, of dimension


248, making it the smallest non-trivial irrep, a feature unique to e8 .

7.3 Dimension of Representation


In this Section we give two formulas, one due to Hans Freudenthal for the
weight multiplicities, and the other due to Hermann Weyl for the character of
any irreducible representation (irrep) of a simple Lie algebra. Freudenthal's multiplicity formula (FMF) shows how to compute the multiplicity of a weight in an
irreducible representation by iterating down from the multiplicity of its highest
weight. The character of an irrep is a complex function on a Cartan subalgebra that records the multiplicities of the weights in the representation. The Weyl
character formula (WCF) is remarkable because it gives a simple way of computing the character that requires knowledge not of the complete representation
but only of its highest weight. It has many important applications, one of which is the
derivation of a formula for the dimension of the representation space.

7.3.1 Preliminaries
Let h be a Cartan subalgebra of a simple Lie algebra g, and π a representation of
g on a vector space V. Under the action of h, V decomposes into a direct sum of the
weight spaces Vµ, each attached to a weight µ of π relative to h, with dimension
equal to nµ, the multiplicity of the weight µ. We write accordingly:

V = ⊕ µ Vµ , dim Vµ = nµ . (7.8)

For each positive root α ∈ ∆+, one may define a subalgebra sα of g of type
a1 ≅ sl(2, C), spanned by the operators Hα and E±α (with E−α ≡ Fα), which obey
the standard commutation relations [ Eα , E−α ] = Hα and [ Hα, E±α ] = ± α ( Hα) E±α ,
where Hα is normalized such that α ( Hα ) = 2.
Before considering the FMF and WCF, we will state a few relevant results,
labeled R1, R2, . . . as follows.
R1. In any irreducible representation πn of Lie algebra a1 = { H, E, F } on vector space
V , let vk be the eigenvectors with eigenvalues µk of H and k = 0, 1, . . . , n. Then vk are
also eigenvectors of EF with eigenvalues tk . The following relations hold:
(a) TrV H = µ0 + µ1 + · · · + µn = 0.
(b) tk − tk−1 = µk (with t−1 = 0), so that tk = µ0 + µ1 + · · · + µk .
(a) Since the µk are symmetric about 0 and all of multiplicity one, their sum
vanishes. (b) Since EF and FE commute with H, they also admit vk as eigenvec-
tors with eigenvalues tk and t0k . From Fvk = vk+1 and EFvk = tk vk , we deduce
EFvk−1 = Evk = tk−1 vk−1 . Then from these three relations, we get FEvk = tk−1 vk ,
and so t0k = tk−1 . The commutation relation [ E, F ] = H acting on v k produces
tk − tk−1 = µk . This is an iterative relation for tk , starting with t0 = µ0; its solu-
tion is tk = µ0 + µ1 + · · · + µk , as stated. Further, as we know that µk = n − 2k,
the sum yields tk = ( k + 1)( n − k ). Noting from this result that tn = 0, we also
obtain TrV H = 0, already found in (a).
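R1 can be checked with explicit matrices. The sketch below builds the (n+1)-dimensional irrep of a1 in the convention F v_k = v_{k+1}, E v_k = k(n − k + 1) v_{k−1} (one standard choice, consistent with [E, F] = H but not fixed by the text):

```python
import numpy as np

def rep(n):
    """(n+1)-dimensional irrep of a1 in the basis v_0, ..., v_n."""
    dim = n + 1
    F = np.diag(np.ones(dim - 1), -1)                     # F v_k = v_{k+1}
    E = np.diag([k * (n - k + 1) for k in range(1, dim)], 1)
    H = E @ F - F @ E                                     # H = [E, F]
    return H, E, F

n = 4
H, E, F = rep(n)
# H is diagonal with entries mu_k = n - 2k, summing to zero: R1(a).
assert np.allclose(np.diag(H), [n - 2 * k for k in range(n + 1)])
assert np.isclose(np.trace(H), 0)
# EF is diagonal with entries t_k = (k+1)(n-k): R1(b).
t = np.diag(E @ F)
assert np.allclose(t, [(k + 1) * (n - k) for k in range(n + 1)])
print(t)   # [4. 6. 6. 4. 0.]
```

Note that t_n = 0, as used in the text to recover Tr H = 0 from (b).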
R2. In any representation π of a semisimple Lie algebra g, let Vρ be the weight space
for the weight ρ, and let Hα, Eα, Fα define the Lie subalgebra a1 ≅ sα < g for any root

α ≻ 0 of g. With Trρ denoting the trace on Vρ, let tα,ρ = Trρ Eα Fα and t′α,ρ = Trρ Fα Eα.
Then, if ρ + α is a weight, we have t′α,ρ = tα,ρ+α; and if not, t′α,ρ = 0. Furthermore,
tα,ρ − t′α,ρ = nρ ρ(Hα), where nρ = dim Vρ.
R2 evidently generalizes R1 (b) to the situation where sα is a subalgebra. If
ρ and ρ + α are weights of π, then from the cyclicity property of the trace, we
have Trρ Fα Eα = Trρ+α Eα Fα (by inserting intermediate states). Taking the trace of
both sides of Eα Fα − Fα Eα = Hα and noting that Tr ρ Hα = nρ ρ ( Hα ), we obtain the
stated result.
R3. For each root α ≻ 0, the action of sα < g on the representation space V decomposes
it into irreducible subspaces Wa (a = 1, 2, . . . ), such that each Wa is itself a direct sum of
weight spaces for Hα . We express this fact in the form
V = ⊕ a Wa , Wa = ⊕ jWρa+ jα , (7.9)

where ρ is a weight, and j an integer ranging over an unbroken interval depending on ρ.


R3 is simply a reminder of some previous results. Each Wa corresponds to a
string of weights: the direct sum of one-dimensional spaces ⊕ jWρa+ jα is equivalent
to the α-string of weights containing the weight ρ, which we have denoted by
Sα ( ρ ) = { ρ + jα ; − q ≤ j ≤ p } , and which, as we know, is invariant under sα .
The eigenvectors of Hα in each Wρa+ jα are weight vectors of π with the weight
ρ ( Hα ) + jα ( Hα ) ≡ ( ρ + jα )( Hα ).
For example, the octet representation π1,1 of a2 decomposes under sα1 ≅ a1
into W1, . . . , W4 corresponding to a singlet, two doublets, and a triplet of sα1, each
equivalent to a string Sα1(ρ) for a weight ρ of π1,1 [a2]. The weight space V0,0 is
the direct sum of the weight spaces of the sα1-singlet and sα1-triplet constituents, say W1 and
W4, having (0, 0) as a weight. So the dimension of V0,0 in π1,1 is equal to 2.
R4. For any weight ρ and root α, the relation ∑j=−∞..∞ nρ+jα (ρ + jα)(Hα) = 0 holds, with
the convention that nµ = dim Vµ if µ is a weight of π, and nµ = 0 if µ is not a weight.
The string of weights Sα(ρ) is invariant under the Weyl reflection relative to α:
each image wα(ρ + jα) = ρ − (m + j)α, with m = q − p = ρ(Hα), remains in the string.
Since the multiplicity is preserved, i.e. nwα(ρ+jα) = nρ+jα, the sum of the two paired terms,

nρ+jα (ρ + jα)(Hα) + nρ−(m+j)α (ρ − (m + j)α)(Hα) = nρ+jα (2ρ − mα)(Hα),

is zero since α(Hα) = 2 and m = ρ(Hα). Therefore, the contributions of the
weights to all weight spaces Vµ present in the string (or equivalently to the trace
of Hα over Wa) cancel in pairs, yielding ∑j=−q..p nρ+jα (ρ + jα)(Hα) = 0. The limits
of summation can be extended to infinity by taking nµ = 0 if µ is not a weight.
Further, since the trace of Hα over each Wa vanishes, we also have TrV Hα = 0,
which corresponds to R1(a).
R5. The trace of Eα Fα on the weight space Vµ of a representation π is given by

Trµ (Eα Fα) = ∑i=0..∞ nµ+iα (µ + iα)(Hα). (7.10)

Consider the string of weights Sα ( ρ ) = { ρ − jα; 0 ≤ j ≤ m } , where ρ + α is not


a weight and m = ρ(Hα). Now calculate the trace of Eα Fα on the weight space
corresponding to a weight in Sα(ρ) by iterating tα,µ = tα,µ+α + nµ µ(Hα), which
follows from R2, starting with tα,ρ = nρ ρ(Hα) (since ρ + α is not a weight). Then,
for k ≤ m, one obtains tα,ρ−kα = ∑j=0..k nρ−jα (ρ − jα)(Hα). Now, as µ = ρ − kα is a
weight of π, Vµ is a weight space of π, and tα,µ = Trµ Eα Fα is well defined. So we
obtain (after changing j to i = k − j)

tα,µ = ∑i=0..k nµ+iα (µ + iα)(Hα). (7.11)

Letting k → ∞, we obtain the relation given in R5. This result says that the trace
of Eα Fα on any weight space Vµ is the sum of the values (µ + iα)(Hα) over µ and
all higher weights in the α-string, weighted by their multiplicities.

7.3.2 Casimir Operators


In quantum angular-momentum algebra, the squared angular-momentum oper-
ator J 2 = ∑ Ji Ji commutes with every component Jk . Its eigenvalue in an SU (2)
irreducible representation is invariant and characterizes it. A similar operator
exists in general Lie algebras, and is the main tool in the derivation of the FMF.
Let { xi } be a basis on a simple Lie algebra g, and { x0i } its dual relative to the
Killing form, normalized so that ( xi : x0j ) = δij . In a representation π of g on
a vector space V , denote Xi = π ( xi) and Xi0 = π ( x0i ), and form the quadratic
operator C = ∑i Xi Xi0 : V → V . We have thus defined the (quadratic) Casimir
operator. It has the following basic properties:
(1) Changing the basis, Xi ↦ Ya (with duals X′i ↦ Y′a), leaves C
invariant: ∑i Xi X′i = ∑a Ya Y′a. So C is independent of the basis used to define it:
it behaves as a scalar operator.


(2) C commutes with every element z of g, i.e. [ C, π (z) ] = 0. Let Z = π ( z )
for any z ∈ g, and consider the following bracket operating on V :

[ C, Z] = ∑ [ Xi, Z] Xi0 + ∑ Xi [ Xi0 , Z].


i i

In the basis {Xi}, one may write [Xi, Z] = ∑j cij Xj; pairing with X′j via the Killing form
gives the constants cij = ([Xi, Z] : X′j) = (Xi : [Z, X′j]), in which the second equality
follows from the invariance of the Killing form, ([a, b] : c) = (a : [b, c]). This gives
cij = −([X′j, Z] : Xi), or equivalently, [X′j, Z] = −∑i cij X′i. All the terms in the
expansion of [C, Z] cancel in pairs, leading to [C, Z] = 0 for every Z in π(g).
(3) Now if the representation π is irreducible, then by Schur’s lemma, C acts as
a scalar multiplication in V by a constant which may serve to characterize π.
In any representation π of g, a natural choice of basis is the set of elements
{Hi, i = 1, 2, . . . , ℓ; Eα, α ∈ ∆}, where the Eα are the root vectors corresponding to
the (positive or negative) roots α, and Hi = Hαi are the fundamental co-roots

normalized in the usual way, αi(Hi) = 2. Then the dual basis, obtained from its
definition as X′j = ∑k Xk [κ⁻¹]kj, where [κ⁻¹] is the inverse of the matrix [κ] with elements given by the Killing products κij = (Xi : Xj), can be easily calculated, since
we know that [κ] is block-diagonal by the orthogonality of the CSA and the various
gα ⊕ g−α, so that the restrictions of [κ] to the disjoint subspaces can be calculated separately. The elements H′i dual to Hi are defined by the condition (Hi : H′j) = δij
on the restriction of the Killing form to h. One finds H′i = ∑k Hk bki |αk|²/2, where
|α|² = ⟨α, α⟩, and bki are the elements of the inverse Cartan matrix.
For any α ∈ ∆, one determines the element Hα in h normalized such that
α(Hα) = 2. Then E±α are root vectors satisfying [Hα, E±α] = ±2E±α.
We already know that (Eα : Eβ) = 0 unless α + β = 0, so E±α are dual to each other with
respect to the Killing form; all we need is to normalize them consistently. By
the invariance property of the Killing form, ([Eα, Fα] : H) = (Hα : H) = α(H)(Eα : Fα).
Recalling now that (Hα : H) = α(H)(2/|α|²), we get (Eα : Fα) = 2/|α|².
Hence the dual to Eα is given by E′α = (|α|²/2)E−α, with Fα ≡ E−α.
In the bases {Hi, i = 1, . . . , ℓ; Eα, α ∈ ∆} and {H′i, i = 1, . . . , ℓ; E′α, α ∈ ∆} just
defined, the (quadratic) Casimir operator assumes the form
C = ∑i=1..ℓ Hi H′i + ∑α∈∆ Eα E′α. (7.12)

This is a convenient basis for C because every term Xi X′i acts as multiplication by
a scalar on any weight space, and this scalar can be calculated in terms of weight
multiplicities. In addition, if the representation is irreducible, C = cI, where c is
a (Casimir) constant characteristic of the representation. Taking the trace on both
sides of (7.12) then produces a relation among the weight multiplicities.
E XAMPLE 16: The Lie algebra a1 has a one-dimensional CSA h spanned by the unit
vector ε: there exist one fundamental root α1 = 2ε and two roots α = ±α1 (so
that |α|² = 4). Its Cartan matrix A = 2 has the inverse B = 1/2. So we have
H′1 = H1 B|α|²/2 = H1, and E′α = 2E−α. It follows that the Casimir operator
is C = H1² + 2(Eα1 E−α1 + E−α1 Eα1). The canonical generators correspond to the
angular-momentum operators, H1 = 2J3 and E±α1 = J± = J1 ± iJ2; hence the
Casimir operator is C = 4(J3² + 1/2(J+ J− + J− J+)) = 4J².
It will be seen below that for an irreducible representation of any Lie algebra
with highest weight λ, and with δ half the sum of the positive roots, the Casimir
invariant equals c(λ) = ⟨λ, λ + 2δ⟩. In the case of a1, we have δ = 1/2 α1 and
λ = mω1 = (m/2)α1, where m is a non-negative integer, and so we get the inner product
⟨λ, λ + 2δ⟩ = (m/2)(m/2 + 1)⟨α1, α1⟩, which, with |α1|² = 4, takes the form 4j(j + 1)
found in the angular-momentum algebra, with j = m/2 = 0, 1/2, 1, . . . .
E XAMPLE 17: For a2, the quadratic Casimir invariant of the irrep of highest weight λ = (m1, m2) is
given by c((m1, m2)) = (3m1 + 3m2 + m1² + m2² + m1m2)/9. 
C OMMENTS .
(a) The value of c depends on the normalization of the Xi, or of the αi. A related quantity, I(λ) = [dim(πλ)/dim(g)] c(λ), called the index of the representation,
appears in the renormalization-group equations of quantum field theory.

(b) The C defined here is called the quadratic Casimir operator. For aℓ one may
also define a third-order operator C3 of the same kind, dijk Xi Xj Xk, where dijk is
a completely symmetric invariant tensor; it is involved in the study of anomalies
in particle theory. Casimir operators of higher orders do not have such simple
explicit expressions, but may be written as homogeneous polynomials of higher
orders in the basis elements. They do not exist in all orders for all Lie algebras
(C exists for all simple Lie algebras, but C3 only for aℓ). Thus, for the simple Lie
algebras in the classical series, they exist only in orders 2, 3, . . . , ℓ + 1 for aℓ; of
2, 4, . . . , 2ℓ for bℓ and cℓ; and, finally, of 2, 4, . . . , 2ℓ − 2 and ℓ for dℓ.

7.3.3 Freudenthal’s Multiplicity Formula


Hans Freudenthal gave a remarkable formula for the multiplicity nµ of
a weight µ in π in terms of the multiplicities of the weights in π that are higher
than µ. Its derivation involves calculating the traces of the operators appearing
in (7.12) over the weight space Vµ of π, indicated by the symbol Trµ.
In any representation π, ∑i Hi H′i acts on any vector in Vµ as scalar multiplication by ∑i µ(Hi)µ(H′i). In terms of the fundamental weights ωi and their duals
ω′i in h∗, the weights may be written µ = ∑i µ(Hi)ωi and µ = ∑i µ(H′i)ω′i. And
so we have ⟨µ, µ⟩ = ∑ij µ(Hi)µ(H′j)⟨ωi, ω′j⟩ = ∑i µ(Hi)µ(H′i). This gives

Trµ (∑i Hi H′i) = nµ ⟨µ, µ⟩. (7.13)

Note that an explicit expression for H′i is not needed in this calculation.
Recalling that ρ(Hα) equals (2/|α|²)⟨ρ, α⟩ for any ρ ∈ h∗, and the expression
E′α = (|α|²/2)Fα, we can use (7.10) to write

Trµ (∑α∈∆ Eα E′α) = ∑α∈∆ (|α|²/2) tα,µ = ∑α∈∆ ∑i=0..∞ nµ+iα ⟨µ + iα, α⟩. (7.14)
The two terms with ±α and i = 0 cancel out, leaving us with

∑α∈∆+ ∑i=1..∞ nµ+iα ⟨µ + iα, α⟩ + ∑α∈∆− ∑i=1..∞ nµ+iα ⟨µ + iα, α⟩
= ∑α∈∆+ ∑i=1..∞ nµ+iα ⟨µ + iα, α⟩ − ∑α∈∆+ ∑i=1..∞ nµ−iα ⟨µ − iα, α⟩.

Using the identity given in R4, we see that the second term reduces to the
double sum ∑α≻0 ∑i=0..∞ nµ+iα ⟨µ + iα, α⟩. It follows that

Trµ (∑α∈∆ Eα E′α) = nµ ∑α∈∆+ ⟨µ, α⟩ + 2 ∑α∈∆+ ∑i=1..∞ nµ+iα ⟨µ + iα, α⟩. (7.15)
The trace of the Casimir operator C on Vµ in any representation π then follows
from the above results, together with the definition δ = 1/2 ∑α∈∆+ α:

Trµ C = nµ ⟨µ, µ + 2δ⟩ + 2 ∑α∈∆+ ∑i=1..∞ nµ+iα ⟨µ + iα, α⟩. (7.16)

Now, assuming that π = πλ is irreducible with highest weight λ, we have
C = cI. To evaluate the representation invariant c, we may evaluate it on the
highest-weight space Vλ (where nλ = 1), either by simply letting µ = λ on the
right-hand side of (7.16), or by applying the operators ∑ Hi H′i + ∑ Eα E′α directly
on Vλ. Either way gives us the Casimir invariant:

c = h λ, λ + 2δ i = | λ + δ |2 − | δ |2. (7.17)

In conclusion, the Freudenthal multiplicity formula for the multiplicity nµ of
a weight µ in an irreducible representation πλ of highest weight λ reads

nµ (|λ + δ|² − |µ + δ|²) = 2 ∑α∈∆+ ∑i=1..∞ nµ+iα ⟨µ + iα, α⟩, (7.18)

where |β|² = ⟨β, β⟩; δ is half the sum of the positive roots, δ = 1/2 ∑α≻0 α; and nν = 0
if ν is not a weight of πλ, while nν = dim Vν if it is.
C OMMENTS .
(a) In an irreducible representation with highest weight λ and for any weight
µ ≺ λ, the factor |λ + δ|² − |µ + δ|² = ⟨λ, λ⟩ − ⟨µ, µ⟩ + 2⟨λ − µ, δ⟩ on the LHS
of (7.18) is strictly positive. This is so because ⟨λ, λ⟩ is maximal among all ⟨µ, µ⟩;
because λ − µ = ∑i ki αi is a linear combination of the fundamental roots αi with
non-negative integers ki (at least one ki > 0); and finally because δ(Hi) = 1
for all fundamental co-roots Hi, so that ⟨λ − µ, δ⟩ is positive.
(b) On the right-hand side, all terms in the sum are non-negative, and so the
FMF gives an efficient way of calculating successively the multiplicities of the
weights µ, working downward from the highest weight λ (for which nλ = 1), in
terms of the weights greater than µ, all of which are known at each stage.
(c) The number of terms in the sum ∑i in (7.18) becomes larger and larger for
some weights as λ increases. Thus, the FMF is of greatest usefulness when one needs
the multiplicities of only a few higher weights, such as when evaluating the
formula dim πλ = ∑ν nν sν, summing over all dominant weights ν in πλ. On the
other hand, if for a given λ all multiplicities are somehow known, then we get
dim πλ = ∑µ nµ, summing over all weights of the representation. Another way
of obtaining dim πλ, as a limiting value of the character of πλ, will be described
in the following paragraphs.
E XAMPLE 18: Multiplicities of the adjoint representation of a2. The adjoint rep
has highest weight λ = ω1 + ω2. We also need δ = ω1 + ω2; and ⟨αi, αi⟩ = 1/3,
⟨α1, α2⟩ = −1/6; and ⟨ωi, ωi⟩ = 1/9, ⟨ω1, ω2⟩ = 1/18.
By Weyl symmetry, all weights on the perimeter of the hexagon have multiplicity one, the same as λ; we need to compute nµ only for µ = (0, 0). On the
LHS of (7.18), we have nµ (|λ + δ|² − |µ + δ|²) = 3nµ |δ|² = nµ. On the RHS, for
each of the three positive roots there is just one weight higher than µ contributing, namely µ + α1 = α1, µ + α2 = α2, µ + α3 = α1 + α2, each with multiplicity
one. And so we have for the RHS 2(⟨α1, α1⟩ + ⟨α2, α2⟩ + ⟨α3, α3⟩) = 2. This gives
n0,0 = 2; and we already know that nν = 1 for ν ≠ (0, 0). 
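Example 18 can be automated. The sketch below implements the Freudenthal iteration (7.18) for the adjoint representation of a2, with weights in Dynkin labels and the Gram matrix of the fundamental weights encoding the inner products used above (a minimal illustration, not a general-purpose implementation: the weight system is supplied by hand):

```python
from fractions import Fraction as Fr

# <w_i, w_j> in the normalization of Example 18.
G = [[Fr(1, 9), Fr(1, 18)], [Fr(1, 18), Fr(1, 9)]]

def inner(u, v):
    return sum(G[i][j] * u[i] * v[j] for i in range(2) for j in range(2))

pos_roots = [(2, -1), (-1, 2), (1, 1)]   # alpha_1, alpha_2, alpha_1 + alpha_2
delta = (1, 1)
lam = (1, 1)                              # highest weight of the adjoint
# Weight system of the adjoint: the six roots and (0, 0).
weights = set(pos_roots) | {(-a, -b) for a, b in pos_roots} | {(0, 0)}

def plus(u, v, k=1):
    return (u[0] + k * v[0], u[1] + k * v[1])

def mult(mu, cache={}):
    """Multiplicity n_mu from (7.18), recursing on higher weights."""
    if mu not in weights:
        return 0
    if mu == lam:
        return 1
    if mu in cache:
        return cache[mu]
    rhs = Fr(0)
    for a in pos_roots:
        i = 1
        while plus(mu, a, i) in weights:
            rhs += 2 * mult(plus(mu, a, i)) * inner(plus(mu, a, i), a)
            i += 1
    lhs = inner(plus(lam, delta), plus(lam, delta)) - \
          inner(plus(mu, delta), plus(mu, delta))
    cache[mu] = rhs / lhs
    return cache[mu]

print(mult((0, 0)))                        # 2
print([mult(a) == 1 for a in pos_roots])   # [True, True, True]
```

The iteration reproduces n0,0 = 2 and multiplicity one on the perimeter, and the same skeleton works for any irrep once its weight system is known.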

7.3.4 Weyl’s Character Formula


The character of a finite-dimensional representation π of a semisimple Lie algebra
g, denoted χπ or χ, is a C-valued function on a CSA h of g, given for each element
h ∈ h by a trace over the representation space: χ(h) = Tr exp π(h),
where exp z = 1 + z + z²/2! + · · · . (Recall the definition in Sect. 2.6 and the
exponential map in Sect. 3.5.) As π(h) acts diagonally for h ∈ h, it is a diagonal matrix
in a suitable basis with diagonal entries equal to the weights µ(h) of π. Taking the
trace of exp π(h) leads to a finite sum over the weights of the representation:

χπ ( h ) = ∑ nµ exp µ ( h ), (7.19)
µ∈Λ

where nµ are the multiplicities of the weights µ (i.e. the dimensions of the weight
spaces Vµ ). As h is normally restricted to h0 (the real part of h), we may replace
µ with 2πiµ and exp µ ( h) with exp 2πµ (ih), making apparent the Cartan subal-
gebra ih0 of the compact form of g. Nevertheless, to keep it short, we keep using
the simpler notation without making the factor 2πi explicit.
Hermann Weyl has given a simple formula (WCF) for the character of a finite-
dimensional irreducible representation πλ of a semisimple Lie algebra with high-
est weight λ as the quotient of two sums of exponentials:

χλ(h) = Aλ+δ/Aδ, where Aµ = ∑w∈W (det w) exp(wµ(h)) (7.20)

for any element h of h0. The summation runs over all elements w of the Weyl
group W of the Lie algebra g; and δ is half the sum of the positive roots (equal to
the sum of the fundamental weights). The power of this formula lies in the fact
that it does not depend on the details of the representation, only on its highest
weight λ, and on two characteristics of the Lie algebra, namely δ and W.
For a Lie algebra g of rank ℓ and Cartan subalgebra h, the weights ρ of any
representation of g are integral linear functions on h0, i.e. such that all values
ρ(h̃α), with α ∈ ∆, are integers. All such integral forms constitute a lattice Λ for
g, a free abelian subgroup of h∗0 generated by the fundamental weights ωi. To each
ρ of Λ we associate the function exp ρ, or more generally the symbol e(ρ), subject
to the composition rule e ( ρ)e(σ) = e ( ρ + σ ) and the normalization e (0) = 1
(identity); and consider arbitrary finite integral linear combinations of the form
∑ρ mρ e ( ρ ), with integral coefficients mρ and summation of ρ over a finite subset
of Λ. The collection of all such sums will be conventionally called ZΛ, and the
sum defining the character χ in (7.19) is an element of ZΛ.
The Weyl group W of g acts on h∗0 , and so also on Λ and ZΛ. A Weyl reflection
w acts on ZΛ according to1 we ( ρ) = e ( w · ρ) = e ( wρ). When applied to a product,
it gives w e ( ρ )e (σ ) = e ( wρ) e(wσ ) since e ( ρ) e(σ) = e ( ρ + σ ).
1 In e(ρ)(h ), with ρ ∈ h∗0 and h ∈ h0 , ρ labels the ‘state’ while h labels the function variable. An
operator w acts either on state we(ρ)( h) = e(wρ )( h), or on function variable we(ρ )(h ) = e(ρ)(w −1 h).
In the latter view, the interpretation is ρ(w−1 h) = hρ, w−1 σ i, where σ ∈ h∗0 is the conjugate to h ∈ h0
via the Killing form. The distinction between the roles of ρ and h can be made more apparent by

Among the elements of ZΛ there are those that have a definite symmetry under the action of W: an element a is said to be symmetric if wa = a, and alternating
or skew if wa = (det w)a for all w in W, where det w = (−1)^rw, with rw the number of
positive roots that w sends to negative ones. Cf. the discussion on page 218.
Combinations such as ∑w we(ρ) (summing over all w ∈ W) are symmetric for
any ρ in Λ. Another class of symmetric elements is exemplified by the character
in (7.19), χ = ∑µ nµ e(µ). Their symmetry stems from the invariance of π under W
and nwµ = nµ for all w in W, as can be seen below:

wχ = ∑µ nµ we(µ) = ∑µ nµ e(wµ) = ∑µ nwµ e(wµ) = ∑µ nµ e(µ),

where the third equality uses nµ = nwµ and the last relabels the summation variable.

But the key to the proof of (7.20) lies, not in the symmetric, but rather in the skew
elements of ZΛ, those that may acquire a sign under Weyl reflections w according
to the sign of the det w factors. For an arbitrary ρ ∈ Λ, define

Aρ = ∑ (detw ) we (ρ) = ∑ (det w ) e ( wρ). (7.21)


w∈W w∈W

Aρ is an element of ZΛ, and it is alternating because, for any s ∈ W, we have

sAρ = ∑w (det w) e(swρ) = det s ∑w (det sw) e(swρ) = det s ∑w′ (det w′) e(w′ρ) = (det s) Aρ.

Note that it also satisfies the relation sAρ = Asρ.


The product of two alternating elements is symmetric, and the product of an
alternating and a symmetric element is alternating. An important result is that
the Aρ ’s, with ρ ∈ Λ, span the subspace of alternating elements of ZΛ.
E XAMPLE 19: a1 = sl(2, C) has one positive root α = 2ε, one fundamental weight
ω1 = ε, and the lowest strongly dominant weight δ = ω1. The irreducible representation πλ of highest weight λ = mω1, with m a non-negative integer, has
the weights µ given by m, m − 2, . . . , −m. Then, letting nµ = 1 for every µ,
and µ(h) → iθµ (with real θ), the expression for χλ in (7.19) becomes χλ =
exp(miθ) + exp((m − 2)iθ) + · · · + exp(−miθ). On the other hand, as there are
two elements w = ±1 in the Weyl group, with det w = ±1, we have in (7.20)
Aλ+δ = exp(iθ(m + 1)) − exp(−iθ(m + 1)) and Aδ = exp(iθ) − exp(−iθ), resulting in the quotient Aλ+δ/Aδ = sin((m + 1)θ)/sin θ. 
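The equality of the two expressions in Example 19 is easy to confirm numerically:

```python
import cmath
import math

# Numeric check of Example 19: the a1 character sum over the weights
# m, m-2, ..., -m equals the Weyl quotient sin((m+1)theta)/sin(theta).
def char_sum(m, theta):
    return sum(cmath.exp(1j * (m - 2 * k) * theta) for k in range(m + 1))

def char_wcf(m, theta):
    return math.sin((m + 1) * theta) / math.sin(theta)

for m in range(6):
    for theta in (0.3, 0.7, 1.1):
        assert abs(char_sum(m, theta) - char_wcf(m, theta)) < 1e-12
print("Weyl character formula verified for m = 0..5")
```

The agreement simply reflects the geometric sum exp(miθ)(1 + e^{−2iθ} + · · · + e^{−2miθ}).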
The arguments leading to the WCF shown in (7.20) are outlined as follows.
(i) Define a Laplacian differential operator L for C-valued functions on h0 such
that Le(ρ) = |ρ|² e(ρ), where |ρ|² = ⟨ρ, ρ⟩. It follows that LAρ = |ρ|² Aρ.
(ii) Let ψ = Aδ χλ; then ψ satisfies the Laplacian equation Lψ = |λ + δ|² ψ. As
ψ is the product of a symmetric (χλ) and a skew (Aδ) element, it is skew under W,
and so must be a linear combination of terms of the form (det w) we(ρ) for those ρ's
in Λ that satisfy |ρ|² = |λ + δ|².
changing notation, from e(ρ)(h ) to eρ (h). Then, weρ (h) = ewρ (h) = eρ (w−1 h). We keep the notation
e(ρ) for typographical convenience and in order to avoid confusion with the root elements eα which
we have been extensively using; [Ja] and [FH] also use this convention.

(iii) Multiplying χ = ∑µ∈Λ nµ e(µ) by Aδ = ∑w∈W (det w) e(wδ) produces a
linear combination of the e(µ + wδ) equal to ∑w,µ pw,µ e(µ + wδ), where the w are Weyl
reflections, the µ weights of the rep π, and the pw,µ integral coefficients. Note that the term
e(λ + δ) appears with coefficient exactly equal to 1, since λ is maximal among the
µ, and δ maximal among the wδ.
(iv) Since Le ( µ + wδ ) = | µ + wδ|2e ( µ + wδ), element e ( µ + wδ) is in the charac-
teristic space of the root | µ + wδ|2 = | w−1 µ + δ |2 of the Laplacian operator. Hence
ψ is a linear combination of those e ( µ + wδ) such that | w−1µ + δ |2 = | λ + δ|2. This
implies w−1 µ = λ, or µ = wλ. It follows that ψ is a (skew) linear combination of
e ( w(λ + δ )) = we ( λ + δ ), and contains, as noted, e ( λ + δ ) with coefficient exactly
equal to 1.
(v) As Aλ+δ is a solution to L with characteristic root | λ + δ |2, just as is ψ,
and contains the element e ( λ + δ ) with coefficient equal to 1, we come to the
conclusion that ψ = Aλ+δ , which completes the proof that Aδ χλ = Aλ+δ .
Now, the denominator in the WCF, Aδ = ∑w (det w) e(wδ), can be written as a
product (where α runs over all positive roots in ∆+) in several equivalent ways:

Aδ = ∏α≻0 [e(α/2) − e(−α/2)] (7.22a)
= e(δ) ∏α≻0 [1 − e(−α)] (7.22b)
= e(−δ) ∏α≻0 [e(α) − 1]. (7.22c)

Let B stand for any one of the equivalent expressions on the right-hand side of
(7.22). Then:
(i) The equality of the three displayed expressions stems from the equality of
δ to half the sum of positive roots: e ( δ ) = e ( ∑ α/2) = ∏ e ( α/2).
(ii) B changes sign when the Weyl reflection corresponding to a simple root αi
is applied to it. To see this, use the first expression for B and recall that wαi
changes the sign of αi and permutes the other positive roots. Hence wαi B = −B
for every αi, which implies that B is alternating, and so must be an integral linear
combination of the Aρ's with strongly dominant weights ρ.
(iii) Multiplying out the product in the second expression, we see that e ( δ )
appears with coefficient 1 and has the highest weight, since all other terms cor-
respond to lower weights of the form δ − ∑α with positive α. Moreover, besides e(δ), there are no other contributions from strongly dominant weights.
(iv) As B (any one of the equivalent products on the right-hand side of (7.22)) is skew, and has the term e(δ) with coefficient 1, it must be equal to Aδ. ∎
Dimension of irreducible representations
An important consequence of the WCF is a formula for calculating the dimension of an irreducible representation πλ of g. The dimension of πλ is equal to the value of χλ at h = 0 (recall from Sect. 2.6 that χ(id) = dim V). However, we cannot get it by simply setting e(µ) equal to 1 in the expression Aλ+δ/Aδ, because the denominator would then give zero. We proceed instead as follows.
248 CHAPTER 7. LIE ALGEBRAS: REPRESENTATIONS

To every h in h0 corresponds an element ρ of h∗0 through the relation (h : k) = ρ(k) for all k ∈ h0. So rather than considering the functions χ(h), Aµ(h), or e(α)(h), we will examine the corresponding functions χ(ρ), Aµ(ρ), or e(α)(ρ) on h∗0. We further make the correspondence e(α)(ρ) → exp⟨α, ρ⟩.
We first prove the symmetry relation Aµ ( ρ ) = Aρ ( µ ), which holds for any ρ
and µ in h∗0. This follows from the invariance of ⟨·, ·⟩ under Weyl reflections:

Aµ(ρ) = ∑w det w exp⟨wµ, ρ⟩
      = ∑w det w exp⟨µ, w−1ρ⟩
      = ∑s det s exp⟨µ, sρ⟩ = Aρ(µ),

where we have used the fact that det w = ±1, and that w and s = w−1 go over the same W.
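This symmetry is easy to verify numerically. The sketch below (an illustration, not part of the text) takes a2 as a concrete example: it generates the six-element Weyl group of a2 from its two simple reflections, realized as 2 × 2 orthogonal matrices in the root plane, and checks Aµ(ρ) = Aρ(µ) for a pair of arbitrarily chosen vectors.

```python
import math

def reflection(alpha):
    """Matrix of the reflection across the hyperplane perpendicular to alpha."""
    ax, ay = alpha
    n2 = ax * ax + ay * ay
    return [[1 - 2 * ax * ax / n2, -2 * ax * ay / n2],
            [-2 * ax * ay / n2, 1 - 2 * ay * ay / n2]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# simple roots of a2, at 120 degrees and of equal length
s1 = reflection((1.0, 0.0))
s2 = reflection((-0.5, math.sqrt(3) / 2))

# generate the Weyl group W (6 elements for a2) by closure under the generators
W = [[[1.0, 0.0], [0.0, 1.0]]]
changed = True
while changed:
    changed = False
    for w in list(W):
        for s in (s1, s2):
            ws = matmul(s, w)
            if not any(all(abs(ws[i][j] - v[i][j]) < 1e-9
                           for i in range(2) for j in range(2)) for v in W):
                W.append(ws)
                changed = True
assert len(W) == 6

def A(mu, rho):
    """A_mu(rho) = sum over w in W of det(w) * exp<w mu, rho>."""
    total = 0.0
    for w in W:
        det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
        wmu = (w[0][0] * mu[0] + w[0][1] * mu[1],
               w[1][0] * mu[0] + w[1][1] * mu[1])
        total += det * math.exp(wmu[0] * rho[0] + wmu[1] * rho[1])
    return total

mu, rho = (0.7, 0.2), (-0.3, 1.1)   # arbitrary test vectors
assert abs(A(mu, rho) - A(rho, mu)) < 1e-9   # the symmetry A_mu(rho) = A_rho(mu)
```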
We take the limit of χ ( ρ ) = Aλ+δ ( ρ )/Aδ( ρ ) as ρ goes to 0 along the path
ρ = tδ with t → 0. From the symmetry of Aµ ( ρ ), we have

χ(tδ) = Aλ+δ(tδ) / Aδ(tδ) = Aδ(t(λ+δ)) / Aδ(tδ).

Using the expression in (7.22) for Aδ ( ρ ), we obtain:

χ(tδ) = exp(−⟨λ, δ⟩t) ∏α≻0 [exp(⟨λ+δ, α⟩t) − 1] / [exp(⟨δ, α⟩t) − 1].

The limiting value χ (0) as t → 0 is now well defined and gives the dimension of
the irreducible representation πλ :

dλ[g] = dim πλ = ∏α≻0 ⟨λ+δ, α⟩ / ⟨δ, α⟩.          (7.23)

A good way to calculate dλ[g] from (7.23) is to write λ and δ in terms of the fundamental weights ωi, and every α ∈ ∆+ in terms of the simple roots αi, and use the relation ⟨ωi, αj⟩ = δij |αi|²/2 (no summation) for the dual bases. So, let λ = ∑i mi ωi, α = ∑i kαi αi, δ = ∑i ωi, where mi ≥ 0, kαi ≥ 0, with 1 ≤ i ≤ ℓ. Then one obtains ⟨α, λ+δ⟩ = ∑i kαi (mi + 1)|αi|²/2 in the numerator, and ⟨α, δ⟩ = ∑i kαi |αi|²/2 in the denominator. The dimension formula (7.23) then becomes

dim πλ = ∏α≻0 [∑i kαi (mi + 1)|αi|²] / [∑i kαi |αi|²].          (7.24)

Here |αi|² equals 1, 2, or 3, according to the relative weight of the corresponding node in the Dynkin diagram of the algebra. This formula thus contains all the explicit instructions needed for a programmable computation of dim πλ for any simple Lie algebra.
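Formula (7.24) translates directly into a short program. The sketch below (a Python illustration, not part of the text) takes each positive root as an integer vector of kαi coefficients together with the squared norms |αi|²; the root data entered are those quoted in Examples 20–23.

```python
from fractions import Fraction

def weyl_dim(m, roots, norms):
    """Dimension of the irrep with highest weight (m1, ..., ml) via Eq. (7.24).

    m     -- Dynkin labels (m1, ..., ml) of the highest weight
    roots -- each positive root given by its coefficients (k1, ..., kl)
             over the simple roots
    norms -- squared lengths |alpha_i|^2 of the simple roots
    """
    dim = Fraction(1)
    for k in roots:
        num = sum(ki * (mi + 1) * ni for ki, mi, ni in zip(k, m, norms))
        den = sum(ki * ni for ki, ni in zip(k, norms))
        dim *= Fraction(num, den)
    assert dim.denominator == 1   # the product is always an integer
    return int(dim)

# a2: positive roots a1, a2, a1+a2, all of the same length
A2 = [(1, 0), (0, 1), (1, 1)]
print(weyl_dim((1, 1), A2, (1, 1)))   # 8, the adjoint of a2

# b2: |a1|^2 = 2|a2|^2; positive roots a1, a2, a1+a2, a1+2a2
B2 = [(1, 0), (0, 1), (1, 1), (1, 2)]
print(weyl_dim((0, 1), B2, (2, 1)))   # 4, the spinor of so(5)

# g2: |a2|^2 = 3|a1|^2; six positive roots
G2 = [(1, 0), (0, 1), (1, 1), (2, 1), (3, 1), (3, 2)]
print(weyl_dim((0, 1), G2, (1, 3)))   # 14, the adjoint of g2
```

Exact rational arithmetic guarantees that the intermediate fractions multiply out to an integer, which the final assertion checks.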
7.3. DIMENSION OF REPRESENTATION 249

SPECIAL CASES.
(a) If the highest weight λ of the representation equals δ, then (7.23) reduces
to dim πλ = 2 p , where p = | ∆+ | is the number of positive roots in the algebra.
(b) If all the fundamental roots have the same lengths, then (7.24) reduces to

dim πλ = ∏α≻0 [∑i kαi (mi + 1)] / [∑i kαi].          (7.25)

EXAMPLE 20: The algebra a1 has just one positive root, equal to the single fundamental root, so kα1 = 1. The highest weight λ = mω1 is specified by the non-negative integer m1 = m. The product in (7.25) has a single factor, so that the dimension of the irreducible representation of a1 of highest weight m is given by dm[a1] = m + 1.
EXAMPLE 21: The algebra a2 has three positive roots, α1, α2 and α1 + α2, from
which we read the k αi ’s. The highest weight of the irreducible representation πλ is
λ = ( m1, m2 ). The three factors in (7.25) corresponding to the three positive roots
are (( m1 + 1)1 + 0) / (1 + 0) = m1 + 1, (0 + ( m2 + 1)1) /(0 + 1) = m2 + 1, and
(( m1 + 1)1 + ( m2 + 1)1) /(1 + 1) = ( m1 + m2 + 2) /2. This gives the dimension
of the irreducible representation of a2 of highest weight λ = ( m1, m2 ) as

dm1,m2[a2] = (1/2)(m1 + 1)(m2 + 1)(m1 + m2 + 2).          (7.26)

EXAMPLE 22: The algebra b2 has four positive roots: α1, α2, α1 + α2, and α1 + 2α2. The simple roots α1 and α2 satisfy |α1|² = 2|α2|² = 1/3 and ⟨α1, α2⟩ = −1/6. In
(7.24), the numerator is 2k α1 ( m1 + 1) + k α2 ( m2 + 1), and the denominator 2k α1 + k α2 .
So we get for the four positive roots (in the same order) the factors ( m1 + 1),
( m2 + 1), (2m1 + m2 + 3) /3, and ( m1 + m2 + 2) /2, leading to the dimension for
the representation of b2 of highest weight ( m1, m2 ):

dm1,m2[b2] = (1/6)(m1 + 1)(m2 + 1)(m1 + m2 + 2)(2m1 + m2 + 3).          (7.27)

EXAMPLE 23: The algebra g2 has six positive roots, α1, α2, α1 + α2, 2α1 + α2, 3α1 +
α2, and 3α1 + 2α2, from which we read the k αi ’s. The highest weight of πλ is
λ = m1 ω 1 + m2 ω 2. Remembering now to include the different norms | α1|2 = 1
and | α2|2 = 3, we obtain the following factors in (7.24):
– For α1: m1 + 1;
– For α2: m2 + 1;
– For α1 + α2: (( m1 + 1) + 3( m2 + 1)) /(1 + 3) = ( m1 + 3m2 + 4) /4;
– For 2α1 + α2: (2( m1 + 1) + 3( m2 + 1)) / (2 + 3) = (2m1 + 3m2 + 5) /5;
– For 3α1 + α2: (3( m1 + 1) + 3( m2 + 1)) / (3 + 3) = ( m1 + m2 + 2) /2;
– For 3α1 + 2α2: (3( m1 + 1) + 6( m2 + 1)) /(3 + 6) = ( m1 + 2m2 + 3) /3.
The dimension of the irreducible representation of the Lie algebra g2 of highest weight λ = (m1, m2) is therefore

dm1,m2[g2] = (1/120)(m1 + 1)(m2 + 1)(m1 + m2 + 2)(m1 + 2m2 + 3)(m1 + 3m2 + 4)(2m1 + 3m2 + 5).          (7.28)

EXAMPLE 24: Lie algebra d4 = so(8, C). The simple roots are α1 = ε1 − ε2, α2 = ε2 − ε3, α3 = ε3 − ε4, α4 = ε3 + ε4. There are 12 positive roots (all with the same norms), which are α^{ij}_− = εi − εj and α^{ij}_+ = εi + εj, with 1 ≤ i < j ≤ 4. In the
following Table 7.1 we give the numerator ∑i k αi ( mi + 1) and the denominator
∑i kαi for each α ≻ 0 in (7.25). The dimension of the irrep with highest weight
( m1, m2, m3 , m4 ) is then obtained by simple multiplication of factors N/D.

Table 7.1: Factors in the dimension formula for the irrep with highest weight
( m1, m2, m3 , m4 ) of d4 = so(8, C )

α ≻ 0        α1 α2 α3 α4        Numerator N        Denominator D


ε1 − ε2 1 m1 + 1 1
ε1 − ε3 1 1 m1 + m2 + 2 2
ε1 − ε4 1 1 1 m1 + m2 + m3 + 3 3
ε2 − ε3 1 m2 + 1 1
ε2 − ε4 1 1 m2 + m3 + 2 2
ε3 − ε4 1 m3 + 1 1
ε1 + ε2 1 2 1 1 m1 + 2m2 + m3 + m4 + 5 5
ε1 + ε3 1 1 1 1 m1 + m2 + m3 + m4 + 4 4
ε1 + ε4 1 1 1 m1 + m2 + m4 + 3 3
ε2 + ε3 1 1 1 m2 + m3 + m4 + 3 3
ε2 + ε4 1 1 m2 + m4 + 2 2
ε3 + ε4 1 m4 + 1 1
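As a cross-check on Table 7.1, its kαi columns can be fed into formula (7.25), since all roots of d4 have equal norm. A minimal sketch (illustration only; the root coefficients are transcribed from the table):

```python
from fractions import Fraction

# k-coefficients over (a1, a2, a3, a4) of the 12 positive roots of d4,
# transcribed from the columns of Table 7.1
D4_ROOTS = [
    (1, 0, 0, 0), (1, 1, 0, 0), (1, 1, 1, 0), (0, 1, 0, 0),
    (0, 1, 1, 0), (0, 0, 1, 0), (1, 2, 1, 1), (1, 1, 1, 1),
    (1, 1, 0, 1), (0, 1, 1, 1), (0, 1, 0, 1), (0, 0, 0, 1),
]

def dim_d4(m):
    """Eq. (7.25) for d4 = so(8, C): all simple roots have the same length."""
    dim = Fraction(1)
    for k in D4_ROOTS:
        dim *= Fraction(sum(ki * (mi + 1) for ki, mi in zip(k, m)), sum(k))
    return int(dim)

print(dim_d4((1, 0, 0, 0)))   # 8, the vector irrep of so(8)
print(dim_d4((0, 1, 0, 0)))   # 28, the adjoint
print(dim_d4((0, 0, 0, 1)))   # 8, a spinor irrep (triality)
```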

7.4 Lie Groups in Particle Physics


Physics brings to group theory many interesting applications. Particle physicists,
in particular, have made some crucial advances in their field by exploiting results
from the theory of Lie groups and Lie algebras, and, in return, have opened new
lines of thought for mathematicians. An account can be found in [Ge].
The Standard Model. The basis for our current understanding of matter at its
fundamental level is the Standard Model (SM), which is a theory of particles and
interactions based on the symmetry group GSM = SU(3) × SU(2) × U(1) acting on
the internal (non-spacetime) spaces of color, isospin, and hypercharge. All particles then fall into representations of GSM on finite-dimensional Hilbert spaces V, and since GSM and its component groups are compact Lie groups, every irreducible representation (irrep) V of GSM is a tensor product of the form V3 ⊗ V2 ⊗ V1, where V3, V2, and V1 are respective irreps of SU(3), SU(2), and U(1). Not only does this give us a way to organize particles, but also to uncover their interactions.
In the Standard Model, there are three kinds of particles: the fundamental
fermions, which constitute matter; the gauge bosons, which carry the (strong and
electro-weak) interactions; and the Higgs bosons, which give masses.
Following a long line of theoretical arguments and experimental observations,
all the particles we now know are completely identified by their quantum num-
bers. Thus, we know that the fundamental fermions, of which matter is ultimately
made up, are structureless particles of spin one-half. They are of two kinds, quarks
and leptons, distinguished by the property that the former are sensitive to the
strong interaction whereas the latter are not. They come in different flavors: up
(u), down (d), electron (e), and neutrino (ν). As we will see later, there are others,
but these are the lightest ones, forming what one calls the first generation.
Each flavor of quark (u, d) comes in three different ‘strong charges’ called
colors, labeled i = 1 (red), 2 (green), and 3 (blue). Observations (e.g. the spins of
the baryons and mesons) and the spin–statistics relationship point to an internal
SU(3) symmetry of color in which the quarks of each flavor transform among
themselves in a 3 representation of SU(3). So u1, u2, u3 are basis vectors of the
standard representation of SU(3) in C3 ; and similarly d1, d2, d3 in another copy of
C3 . On the other hand, the leptons (e, ν) carry no color, and so may be considered
color neutral in this context, sitting together in C ⊕ C.
Massless quarks and leptons in motion can be characterized by their helicity
± 1/2, the component of their spin in the direction of motion. Those with he-
licity +1/2 are said to be right-handed, while those of helicity −1/2, left-handed.
The antiparticle of a right-handed particle is left-handed, and vice versa. Helicity
is a relativistically invariant quantity for massless particles, but not for massive
particles (and so these must be linear superpositions of left- and right-handed
components). At the symmetry level where the fundamental fermions carry no
mass, there may exist separate left- and right-handed quarks and leptons, uiL , uiR ,
diL , diR , e L , e R, νL , νR, and their conjugate antiquarks and antileptons. However,
both components need not exist in nature; in fact, right-handed neutrinos and
left-handed antineutrinos seem not to exist at all. So one may take for the first-
generation fundamental fermions of the SM the set of 15 particles uiL , diL , uiR , diR ,
νL , e L , and e R, although at times one may want to include νR (making it a set of 16),
for example when studying neutrino oscillations, or Higgs boson interactions.
A salient feature of the weak force (and only of the weak force) is that it violates parity, i.e. it acts asymmetrically on particles of different helicities. No other physical law is asymmetric in left and right. The parity violation so observed is maximal, such that only left-handed particles or right-handed antiparticles participate in weak interactions. Thus, neutrons (n) decay into protons (p)
via nL → pL + eL + ν̄R, but never via nR → pR + eR + ν̄L. This explains the asymmetric treatment of the left- and the right-handed fermions in what follows.
It has long been established that, if one ignores electromagnetic effects, the
proton and the neutron may be taken as different states of the same particle, more
precisely, as basis vectors of the standard representation of the nuclear isospin
Lie algebra su (2) in which there are two weights T3 = ± 1/2, with highest weight
T = 1/2. The quarks u and d behave in much the same way (for example in their
similar compositions p = uud and n = ddu), and therefore may be considered
members of an isospin doublet. Now, at the quark level, as the weak neutron
decay may be viewed as d L → u L + e L + ν̄R, or even as equivalent to the inelas-
tic scattering u L + e L → d L + νL , we may pair up u L and d L in a doublet, and
νL and e L in another doublet of the isospin symmetry group SU(2), shifting the
concept to apply it to weak-interaction processes only. (One may call it the 'weak isospin' to emphasize this new meaning.) As for the right-handed fermions, they
are each in a trivial representation C of SU(2). The neutron and proton have the
values of their electric charge Q related to those of their (nuclear) isospin T3 by
the relation Q = T3 + 1/2. A similar relation applies to the weak isospin provided
we introduce a new quantum number, called the weak hypercharge Y, such that
Q = T3 + Y/2. Since the electric charge and the (weak) isospin T3 are known for
quarks and leptons, their (weak) hypercharge can be simply calculated.
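Since Q and T3 are known, the hypercharges can indeed be generated mechanically from Q = T3 + Y/2. A small sketch (the Q and T3 values entered below are the familiar first-generation assignments, not given in this paragraph):

```python
from fractions import Fraction as F

def hypercharge(Q, T3):
    """Solve Q = T3 + Y/2 for the weak hypercharge Y."""
    return 2 * (Q - T3)

# (electric charge Q, weak isospin T3) of the first-generation fermions
fermions = {
    'uL':  (F(2, 3),  F(1, 2)),  'dL': (F(-1, 3), F(-1, 2)),
    'nuL': (F(0),     F(1, 2)),  'eL': (F(-1),    F(-1, 2)),
    'uR':  (F(2, 3),  F(0)),     'dR': (F(-1, 3), F(0)),
    'eR':  (F(-1),    F(0)),
}
for name, (Q, T3) in fermions.items():
    print(name, hypercharge(Q, T3))
# reproduces the Y column of Table 7.2: 1/3, 1/3, -1, -1, 4/3, -2/3, -2
```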
Just as a given basis { T1, T2, T3 } of su (2) can generate all elements of the cor-
responding group SU(2) by exponentiation exp (− i ∑ ωi Ti ) for arbitrary constants
ωi, so can the Hermitian operator Y associated to the hypercharge generate all
elements of U(1) in the form e−iωY for an arbitrary constant ω. This is the right
U(1) factor in GSM, rather than the U(1) associated with the electric charge Q,
because Q, as an operator, does not commute with su (2), while Q − T3 does. To
repeat, GSM is the direct product of three commuting simple Lie groups: the color
SU(3), the weak isospin SU(2), and the weak hypercharge U(1). Its smallest irrep
fixes the particle content of the model, as summarized in Table 7.2, where are
also indicated the conserved charges of the fundamental fermions u, d, ν, e. One
could complete the table with information on the antifermions ū, d̄, ν̄, ē, sitting in the dual representations. (For example, the ūiL live in the irrep 3̄ × 1 × 1_{−4/3}.)
In an alternative, strictly equivalent representation FL , one replaces all isospin-
singlet fermions by their charge conjugates, so that it is composed of fermions
and antifermions, all left-handed. This is shown in the last four lines of the table.
Either way, if we call F the direct sum of all the irreps listed in (the fifth column
of) Table 7.2, and F ∗ its dual, then F ⊕ F ∗ is the Standard Model representation of
the first generation of fermions.
So far we have dealt with just the internal subspace of particle space, and so
we have considered only symmetry transformations with constant parameters,
which for this reason are said to be global. When we extend our study to the
full space, where particles are described by products χ ⊗ ψ of an internal vector
χ and a vector ψ in the infinite-dimensional Hilbert space carrying spacetime xµ
dependence, the symmetry transformations applied on ψ ( x) must be generalized
to include their dependence on xµ , which are then said to be local.
A physical system assumed invariant under a global symmetry group G can
Table 7.2: Irreducible representations of GSM = SU(3) × SU(2) × U(1)Y for the first fermion generation in the Standard Model. It is equivalent, and convenient, to replace uR, dR, νR, eR with ūL, d̄L, ν̄L, e+L in the definition of F. The values of Y are indicated explicitly in 1Y.

Fermions      i       T3     Y      GSM irreps         FL      GSM irreps
(uiL, diL)   1,2,3   ±1/2   1/3    3 × 2 × 1_{1/3}
(νL, eL)      −      ±1/2   −1     1 × 2 × 1_{−1}
uiR          1,2,3    0     4/3    3 × 1 × 1_{4/3}     ūiL     3̄ × 1 × 1_{−4/3}
diR          1,2,3    0    −2/3    3 × 1 × 1_{−2/3}    d̄iL    3̄ × 1 × 1_{2/3}
νR            −       0     0      1 × 1 × 1_{0}       ν̄L     1 × 1 × 1_{0}
eR            −       0    −2      1 × 1 × 1_{−2}      e+L     1 × 1 × 1_{2}
acquire a more stringent local symmetry under G when it is made equivalent to


itself at each spacetime point xµ at different ‘scales’ or ‘gauges’ specified by xµ -
dependent group parameters. This type of equivalence is referred to as gauge
invariance, and the transformations between different possible gauges are called
‘gauge transformations’, which together form a gauge group.
Specifically, let us give to that physical system (described by a Lagrangian) the
symmetry Lie group G with elements U ( ω) = e−igω , where ω = ωi Xi , and Xi ∈
Lie(G), with constant scalars g and ωi. Under a global symmetry, a particle field ψ transforms into ψ′ = Uψ ≈ ψ − igωψ, and its spacetime derivative ∂µψ into ∂µψ′ ≈ ∂µψ − igω∂µψ. If now ω is allowed to have spacetime dependence, ∂µψ′ will acquire an extra term, −ig∂µω·ψ, in violation of the symmetry. So for the system to remain gauge-invariant, there must exist for each generator Xi a vector field Aiµ(x) transforming as A′µ = Aµ + ig[Aµ, ω] + ∂µω (where Aµ = Aiµ Xi), so that the terms proportional to ∂µω in ∂µψ′ and in A′µ cancel out. The
quantized fields Aiµ ( x) are the sources of vector particles, called gauge bosons.
They are abelian if G is abelian, and non-abelian (Yang-Mills) if G is non-abelian.
By construction, Xi Aiµ are elements of the complexified Lie algebra Lie( G ). Since
an explicit mass term for a boson field would break gauge invariance, Aiµ must be
massless under gauge symmetry. They interact with particles that transform non-
trivially under G, possibly including themselves, in an invariant manner and with
a coupling constant unique to G. This is a very important result: symmetry forces
the existence of gauge bosons and dictates the form of the interaction. Einstein’s theory
of gravitation is an example where that idea was first applied, in the context of
Poincaré invariance of classical non-quantum physics.
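The cancellation can be illustrated numerically in the abelian case, where the transformation rule reduces to A′ = A + ∂ω. In the one-dimensional toy sketch below the specific functions ω, ψ, and A are arbitrary test choices, not taken from the text; the check confirms that (∂ + igA)ψ transforms covariantly while the bare derivative ∂ψ does not.

```python
import cmath

g = 0.7
omega = lambda x: 0.3 * x**2                 # local gauge parameter (test choice)
domega = lambda x: 0.6 * x                   # its exact derivative
psi = lambda x: cmath.exp(1j * x) + 0.5 * x  # matter field (test choice)
A = lambda x: 0.2 * x                        # abelian gauge potential (test choice)

def d(f, x, h=1e-5):
    """Central finite difference for the x-derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)

U = lambda x: cmath.exp(-1j * g * omega(x))  # gauge phase
psi_p = lambda x: U(x) * psi(x)              # transformed field psi' = U psi
A_p = lambda x: A(x) + domega(x)             # transformed potential A' = A + d(omega)

x0 = 1.3
# the ordinary derivative is NOT covariant: d(psi') differs from U * d(psi)
assert abs(d(psi_p, x0) - U(x0) * d(psi, x0)) > 1e-3
# the covariant derivative D = d + igA transforms as D'psi' = U * (D psi)
D = d(psi, x0) + 1j * g * A(x0) * psi(x0)
D_p = d(psi_p, x0) + 1j * g * A_p(x0) * psi_p(x0)
assert abs(D_p - U(x0) * D) < 1e-6
```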
When GSM = SU(3) × SU(2) × U(1) is made into a local symmetry group,
gauge invariance will be guaranteed by the presence of massless vector bosons
(obeying appropriate transformation rules). There are 12 such gauge bosons—the
eight SU(3) gluons of quantum chromodynamics and the three SU(2) W i bosons
plus one U(1) B boson of the electro-weak dynamics—all belonging to the complexified adjoint representation sl(3, C) ⊕ sl(2, C) ⊕ C, with each type of gauge boson being associated with its own coupling constant, gs, g, or g′.
Whereas experimental evidence strongly supports exact color SU(3) symme-
try, the situation is different with the electroweak sector SU(2) × U(1). Observed
weak interactions are short-ranged, indicating the presence of massive Yukawa
quanta and contradicting exact gauge invariance, and, although the photon is
massless, its interactions involve the electric charge, not the hypercharge. In other
words, SU(2) × U(1) cannot really be an exact symmetry. An elegant solution to
these problems lies in the concept of spontaneous symmetry breaking, whereby a
system with an infinite number of degrees of freedom may arbitrarily select out
of many possible degenerate minimum-energy states a particular member as its
ground (vacuum) state, which then goes on to become our asymmetric physical
world, hiding the underlying symmetry. This phenomenon is not uncommon in
nature; in particular, it is responsible for the existence of magnetism in a ferro-
magnetic material (when rotational invariance is broken) and superconductivity
in some electrical conducting materials (when the phase invariance of charged
particles breaks down).
Now, to produce the desired effects in our physical world, a doublet of com-
plex scalar bosons has to be present to create the required minimum-energy states.
A vacuum is selected so as to partially break the SU(2) × U(1) symmetry, specifi-
cally in the directions T1 ± iT2 and T3 − Y/2, keeping the symmetry in the fourth
direction T3 + Y/2 preserved. As a result, the gauge bosons Z = W 3 − B/2 and
W ± associated with the symmetry breaking directions acquire masses and longi-
tudinal polarizations by absorbing three real scalar bosons, while γ = W 3 + B/2,
the gauge boson associated with the symmetry preserving T3 + Y/2, remains
massless. W ± and Z are just the observed massive quanta of the weak interac-
tions, whereas γ is the massless photon mediating the electromagnetic force. The
surviving scalar boson from the doublet forms a condensate in the vacuum, and
its lowest-energy excitations give rise to a stable state identifiable with a massive
particle, called the Higgs boson (H0). It was also the last particle of the Standard Model to be discovered (in 2012), thus confirming the theory and giving credence to Peter Higgs's scheme for mass generation of the fundamental particles through
their interactions with H 0. This symmetry breaking mechanism is summarized
by the group reduction relation SU(2) × UY (1) → UQ (1), in which the vacuum
selected for our world is a singlet 1 in the UQ (1) of electromagnetism.
Fermion Generations. Besides u, d, e and ν (which we now write νe to be more
specific), the fundamental fermions come in other flavors, composing two more
generations. The second generation consists of the charm quark (c), the strange
quark (s), the muon lepton (µ− ), and the muon neutrino (νµ ); and the third gen-
eration, of the top quark (t), the bottom quark (b), the tau lepton (τ − ), and the
tau neutrino (ντ ). (In the current state of physics, the existence of heavy flavors
and the limit to three generations are accepted experiment-based facts.) Just as
before, each quark comes in three colors; and each quark or lepton comes in two
helicities, as well as with a charge conjugate (antiparticle). So each of the three
generations consists of 16 quarks and leptons plus 16 antiquarks and antileptons.
From the algebraic point of view, the fermion generations are identical to one
another and, as representations of GSM, each generation spans a copy of F ⊕ F ∗
space, similar to that described in Table 7.2.
In the completed Standard Model, the gauge bosons remain unchanged since
they are determined by the gauge group on which the model is based. But once
the Higgs mechanism of symmetry breaking is introduced and interactions are
allowed, many novel complex events arise from mixing the quark fields and the
lepton fields across different generations.
Grand Unified Theories. For all its resounding success, the SM is not the final
solution, and one hopes to build a better model, a Grand Unified Theory (GUT),
possessing a higher degree of symmetry (to avoid multiple undetermined inter-
action coupling constants and possibly clarify some outstanding issues) which is
spontaneously broken down to GSM in which our world exists. That such a hope
is not unjustified is indicated by the observation that if one heats up a system
with a broken symmetry, this symmetry is restored intact at higher temperatures.
Take for example a ferromagnetic material existing in a magnetized state at, say,
room temperature because of a breakdown of its rotational symmetry. It will re-
cover this symmetry and thereby lose its magnetization when heated up to a high
enough temperature. Similarly, in the early universe, when the temperature was extremely high, all symmetries, now broken, existed unbroken, and the laws of nature were highly symmetric.
So we want to find a simple group G (which calls for a single gauge coupling
constant) that contains SU(3) × SU(2) × U(1) as a subgroup. Its Lie algebra Lie(G) evidently must have rank at least four (the number of commuting generators
of GSM ). A possible candidate is SU(5), which also admits distinct complex con-
jugate representations (so as to incorporate inequivalent left- and right-handed
quarks and leptons). Lie group SU(5) has irreps of dimensions 1, 5, 10, 15, 24,
35, . . . . Its 24 generators are elements of the associated algebra sl (5, C ), that is,
hermitian traceless 5 × 5 matrices in C5 . In a basis { ζ 1, ζ 2 , ζ 3, ξ + , ξ − } with colors
in the first three entries and isospin in the last two, the hypercharge generator
Y has the matrix diag [ −2/3, −2/3, −2/3, 1, 1]. Quarks and leptons will sit together
in two different irreps, while the sterile ν̄L forms a representation by itself, all
decomposable in GSM -irreps as shown in the following equations (where ( a, b, c)
used here is the same as a × b × c in Table 7.2):

10 = [ūiL, (uiL, diL), e+L] = (3̄, 1, 1_{−4/3}) ⊕ (3, 2, 1_{1/3}) ⊕ (1, 1, 1_{2}),
5̄ = [d̄iL, (νL, eL)] = (3̄, 1, 1_{2/3}) ⊕ (1, 2, 1_{−1}),
1 = [ν̄L] = (1, 1, 1_{0}).

Thus, all the fundamental fermions of the first generation are defined as basis vectors of three irreps of SU(5): F = 10 ⊕ 5̄ ⊕ 1.
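Two quick consistency checks on these assignments (a sketch using the hypercharges listed in Table 7.2): as an SU(5) generator, Y must be traceless on each irrep, and Tr Q = 0 on the fermion quintet yields the charge relation quoted below.

```python
from fractions import Fraction as F

# Hypercharge Y summed with multiplicity over each SU(5) irrep
# (values from Table 7.2); Y must be traceless, being an SU(5) generator.
Y_10 = 3 * F(-4, 3) + 6 * F(1, 3) + 1 * F(2)   # 10 = (3bar,1) + (3,2) + (1,1)
Y_5bar = 3 * F(2, 3) + 2 * F(-1)               # 5bar = (3bar,1) + (1,2)
assert Y_10 == 0 and Y_5bar == 0

# Tr Q = 0 on the quintet basis [dbar, dbar, dbar, nu, e-]:
# 3*Q(dbar) + 0 + Q(e-) = 0, hence Q(d) = -Q(e+)/3 = -1/3.
Q_e = F(-1)              # electron charge in units of e
Q_dbar = -Q_e / 3
Q_d = -Q_dbar
print(Q_d)               # -1/3
```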
Here the vector gauge bosons, which belong to the adjoint representation 24,
include besides the Wµi , Bµ and the QCD gluons, twelve new gauge bosons, called
the Xµ, Yµ bosons. Symmetry breakdown of SU(5) occurs in two stages at very
different energy scales—the first to give mass to the X’s and Y’s, and the second
to give mass to the W±, Z. In this second event, the symmetry-breaking Higgs scalars, which sit in the representations 5 and 5̄ and carry both color and isospin, are coupled to quarks and leptons, changing their colors and flavors.
SU(5) GUT is defined with a single universal gauge coupling at the grand unification energy scale. It gives a natural explanation for electric-charge quantization, with Tr Q = 0 on the fermion 5̄ representation producing the relation 3Qd + Qe+ = 0, or Qd = −(1/3)Qe+. On the other hand, as quarks, antiquarks, and the electron
appear in the same irreducible representation, the couplings of the color Higgs to
fermions lead to violation of the conservation of the baryon and lepton numbers,
as in H → ud and H → e+ ū, allowing proton decay p → π 0 e+ .
Other approaches to unification with simple Lie groups have been attempted.
In particular, the GUT based on SO(10) (or Spin(10)) has been suggested. There
is a single universal gauge coupling, and a complete generation of quarks and
leptons reside in the 16-dimensional irrep, so that F = 16 (an advantage over
SU(5)). Its adjoint representation 45, which may be written symbolically as a
direct sum 24 ⊕ 10 ⊕ 10̄ ⊕ 1 of SU(5) irreps, contains the 12 SM gauge bosons
as well as the X, Y bosons, and other new particles. SO(10) GUT has two in-
equivalent symmetry-breaking scenarios, namely, SO(10) → SU(5)× U(1) and
SO(10) → SU(4)× SU(2)×SU(2). In the first, another symmetry breakdown at
a lower energy scale may be envisaged, so that SU(5) → GSM, preserving all the
predictions of SU(5) and GSM. In the second scenario, a new symmetry group,
SU(4), appears, which would treat the leptons as the fourth color on the same
footing as the three quark colors, and with it one obtains another possible unifi-
cation scheme, with three independent gauge coupling constants.
Finally, E6, being the only one among the five exceptional groups that has
any complex representations to make a left-right asymmetric spectrum of parti-
cles possible, offers another potential solution. Its fundamental representation
27, which transforms as 16 ⊕ 10 ⊕ 1 under SO(10), may contain a complete gen-
eration of fundamental fermions in the 16 component, but also many more states
which have not been observed and so must be removed from a subsequent effec-
tive low-energy theory to make it viable.
Beyond the symmetries encoded in the compact Lie groups by now familiar
to us, nature could appear in more complex and subtle patterns, hidden under
layers of broken symmetries. Supersymmetry, in which every particle requires
the presence of a partner whose spin differs by 1/2, classifies whole categories
of particles and fields, and offers an attractive explanation for the particle mass
hierarchy and the source of the mysterious cold dark matter. Now, it might turn
out that energy-matter does not exist at all as point-like particles as we have so
far thought, but rather as vibrations of one-dimensional strings, or even two-
dimensional membranes, which then raises the prospect of having all the four
known forces in a unified theory. In spaces where such extended structures
evolve, symmetries of a completely new kind, which relate manifolds of different
geometries, must certainly exist, opening another window on the laws of nature.
Problems
7.1 (Weyl group for a2). Consider the Weyl group for a2 (see Example 1 in the
chapter). (a) Give the transformations of the positive roots and an arbitrary
weight under the Weyl group. (b) Give the orbits of the dominant weights (0, 0),
(1,0), (0,1), (1,1), (2,0), (0,2), (3,0) and (0,3).
7.2 Weyl group for b2 . (a) Find the Weyl group for the Lie algebra b2 . (b) Give the
orbits of the dominant weights (0,0), (1,0), (0,1), (1,1), (2,0), (0,2).
7.3 (Properties of Weyl reflections). Let W be the Weyl group for a Lie algebra
g. Prove the following: (a) The Weyl reflections are isometries on h∗, i.e. transformations that preserve the scalar product. (b) The reflection wα on the hyperplane perpendicular to a simple root α permutes all the other positive roots. (c) Let δ = (1/2)∑β∈∆+ β, i.e. half the sum of the positive roots, and let w ≠ 1 be an arbitrary Weyl reflection; then δ − w·δ is a sum of distinct positive roots. (d) δ defined in (c) is the sum of the fundamental weights, and so is a weight.
7.4 (Dual representations). We know that the NSC for an irrep πλ with highest
weight λ to be self-dual is that its lowest weight is − λ. Determine the self-duality
of the irreps of a2 , b2 , and g2 .
7.5 (Inner products of weights). Inner products for weights can be calculated in
the same way as for roots, since the two entities live on the same space, h∗0 . We
take sl (3, C ) to illustrate, and set µ = ( m1, m2 ) for any weight, where mi = µ ( hi)
and hi are the fundamental coroots. (a) Calculate hµ defined by (hµ : h) = µ(h), where h is any vector of h0. (b) Calculate ⟨µ, µ′⟩ for any weights µ, µ′ ∈ h∗0.
7.6 (Fundamental weights). Let hi with i = 1, . . . , ℓ be the fundamental coroots of a simple Lie algebra, and ωi the fundamental weights, so that ωi(hj) = δij. Recall that for any simple root αi, we also have αi(hj) = aji (the Cartan integers), so that αi = ∑j ωj aji. The sets Σ = {αi; i = 1, . . . , ℓ} and Φ = {ωi; i = 1, . . . , ℓ} are two bases for h∗0. In addition, there is also the canonical coordinate basis {εi}, in which αi = ∑j εj cji (where the cji are given in the text of this chapter for all simple algebras). Find the expansion coefficients of ωi in {εj} for the algebras b3, d4, g2,
f4, e6, e7 , and e8 .
7.7 (Representations of b2 ). Find the irreducible representations of b2 with high-
est weights (0,1), (1,0), (0,2), and (2,0). Discuss the decompositions of the tensor products π0,1 ⊗ π0,1, π1,0 ⊗ π1,0, and π0,1 ⊗ π1,0.
7.8 (Representations of g2 ). (a) Find the Weyl group for the Lie algebra g2 . (b)
Give the Weyl orbits of the dominant weights (1,0), (0,1), (1,1), (2,0), (0,2). (c) Give
the weights of the irreducible representation having (0,1) as the highest weight.
7.9 Find the integral expansion ∑i ki αi (where αi are simple roots) for the positive roots of the classical algebras aℓ, bℓ, cℓ, and dℓ.
7.10 (Dimension of the irreps of a` ). Find the dimension of the irreps of a` for
arbitrary `.
Q. Ho-Kim. Group Theory: A Physicist’s Primer.
Figure 7.11: A section of the weight lattice for a2. α1 and α2 are the simple roots, with |α1| = |α2|; ω1 and ω2 are the fundamental weights. Weights are labeled in the system {ω1, ω2} by (m1, m2).
Figure 7.12: A section of the weight lattice for b2. α1 and α2 are the simple roots, with |α1| = √2 |α2|; ω1 and ω2 are the fundamental weights. Weights are labeled in the system {ω1, ω2} by (m1, m2).
Figure 7.13: A section of the weight lattice for g2. α1 and α2 are the simple roots, with |α2| = √3 |α1|; ω1 and ω2 are the fundamental weights. Weights are labeled in the system {ω1, ω2} by (m1, m2).
Appendix A

The Symmetric Group and Tensor Representations
The symmetric group Sn , or the group of all permutations of n objects, plays a
central role in the study of symmetries in mathematics and in physics. It became
part of abstract algebra when Évariste Galois used it to derive the precise con-
ditions for a polynomial equation to be solvable or not, and, when solvable, the
relations between its various roots. Many other finite groups are subgroups of Sn ,
and so information on the representations of Sn plays a key role in the study of
the representations of finite groups. In addition, tools developed for finding the
irreducible representations of the symmetric group can be adapted to the classi-
fication of tensors in some classical Lie groups and Lie algebras, and therefore
to the classification of the finite representations of these structures themselves.
In physics, whenever we deal with identical particles, properties of the symmet-
ric group will come into play; the identification of the atomic and nuclear states,
and the classification of observed elementary particles and predictions of new
particles are just two examples. This appendix, which complements the briefer
discussion of the subject in Sec. 1.6 of Chapter 1, mainly aims to describe the ir-
reducible representations of Sn , making use of both algebraic and graphical tools, and
to apply these results to classify the irreducible representations of SU ( n) according
to the symmetries of their basis tensors. See also [Ge], [Ha], and [Tu].
A.1 Algebraic Method
1. We explain here the concept of group algebra and show the reader how to use it
to find the irreducible representations (irreps) of finite groups. To begin, we give
a definition:
Definition A.1 (Group Algebra). Given a finite group G, the associated group algebra G̃ is the set of all linear combinations of the elements of G with complex-valued coefficients: x = ∑g xg g, with g ∈ G and xg ∈ C.
2. A group algebra G̃ is closed under the following operations:
(i) multiplication by C: cx = ∑g cxg g is in G̃ if x ∈ G̃ and c ∈ C;
(ii) addition of two elements: x + y = ∑g (xg + yg) g ∈ G̃ if x, y ∈ G̃;
(iii) multiplication of two elements: xy = ∑g (∑h x_{gh⁻¹} yh) g ∈ G̃ if x, y ∈ G̃.
Thus, a group algebra is a complex linear vector space, with characteristic vector
properties (i) and (ii), that also possesses a product rule (iii) for two elements.
This last feature also qualifies it as an algebra; in fact, it is an associative algebra,
since the multiplication law taken over from the group is itself associative. It is
this double structure that gives group algebras additional power, the source of
many of its remarkable properties. As a linear vector space (of dimension | G |),
it may admit as a basis the group elements g ∈ G themselves. As an algebra, its
elements may operate on any vector space, including the vector space G̃ itself.
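Rules (i)–(iii) are easy to realize concretely. The following sketch (our own illustrative Python code, not part of the text) stores an element x = ∑g xg g of the algebra of the cyclic group C3 as a dictionary mapping group elements (encoded by their exponents) to coefficients:

```python
# Sketch: the group algebra of C3 = {e, a, a^2}, with group elements encoded
# by exponents 0, 1, 2, and an algebra element x = sum_g x_g g stored as a
# dict {exponent: coefficient}. All function names are our own choices.

def scal(c, x):                      # rule (i): multiplication by a scalar
    return {g: c * xg for g, xg in x.items()}

def add(x, y):                       # rule (ii): addition of two elements
    keys = set(x) | set(y)
    return {g: x.get(g, 0) + y.get(g, 0) for g in keys}

def mul(x, y):                       # rule (iii): (xy)_g = sum_h x_{gh^-1} y_h
    z = {}
    for i, xi in x.items():          # group law of C3: a^i a^j = a^((i+j) mod 3)
        for j, yj in y.items():
            k = (i + j) % 3
            z[k] = z.get(k, 0) + xi * yj
    return z

x = {0: 1, 1: 2}        # x = e + 2a
y = {1: 1, 2: -1}       # y = a - a^2
e = {0: 1}              # identity of the algebra
```

The product rule inherits associativity from the group law, which is what makes G̃ an associative algebra, as noted above.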
3. The main motivation for studying group algebras is to learn how to construct
representations of a given finite group from those of its associated group alge-
bra, especially in the case of the symmetric group for which the approach is well
suited. We first define ‘representation’:
Definition A.2 (Representations of G̃). A representation of a group algebra G̃ is a mapping U from G̃ to a set of linear operators on a vector space V that preserves the group-algebra structure: if x, y ∈ G̃ and U(x), U(y) are their images, then U(cx + c′y) = cU(x) + c′U(y) and U(xy) = U(x)U(y); that is, properties (i)–(iii) of group algebras are preserved. An irreducible representation of G̃ in V is one that does not have any non-trivial invariant subspace.

To define a matrix representation corresponding to U(G̃), we choose a basis {ui} for the vector space V on which U operates. Then, as usual in group theory, the representation matrix D(x) is defined by U(x)uj = ∑i ui Dij(x). The regular representation of G̃ is one in which the basis consists of all g ∈ G.
4. The linear relationship between the elements of G and G̃, as given in the group-algebra definition, results in a relationship between their respective representations: if x = ∑g xg g, then D(x) = ∑g xg D(g). And so, a representation of a group G is also a representation of the associated algebra G̃. Conversely, any representation of G̃ gives a representation of G. If one of these representations is reducible (or irreducible), then so is the other.
As we shall see, the irreducible representations of a group algebra G̃ can be systematically constructed with projection operators formed by taking appropriate linear combinations of group elements (an operation allowed in algebras but not in groups, and the main motivation for introducing this concept). Once the irreps of G̃ are found, we automatically have those of the associated group.
5. A subalgebra of an algebra G̃ is a linear vector space that is contained in G̃ and closed under the multiplication law of G̃. A subalgebra H̃ of G̃ is called a left ideal if it is invariant under left multiplication by all x ∈ G̃: if h ∈ H̃, then xh is also in H̃ for all x ∈ G̃. Left ideals that contain no proper subideals are said to be minimal. Minimal left ideals correspond to irreps of the algebra G̃, and are therefore objects of interest. (A right ideal is similarly defined as a subalgebra invariant under right multiplication, that is, the set of all vectors h of G̃ such that hx is in the set for all x ∈ G̃.)
6. The regular representation. As we have mentioned, the regular representation DR of a group algebra G̃ results from the operators U(x) acting on the algebra itself. We learned from Chapter 2 that the regular representation of any group (and therefore of any group algebra) is fully reducible and contains each distinct irreducible representation Dµ a number of times equal to the dimension dµ of that irrep. When completely reduced, it appears as a direct sum, DR = ⊕µ dµ Dµ. In other words, the group algebra can be decomposed into a direct sum of irreducible invariant subspaces Lµ^a, where a = 1, 2, . . . , dµ. As subalgebras of G̃, the Lµ^a are minimal left ideals, and we write G̃ = ⊕µ Lµ, where Lµ is a left ideal completely reducible into ⊕a Lµ^a. Symbolically we write

DR = ⊕µ dµ Dµ  ⟺  G̃ = ⊕µ Lµ = ⊕µ (⊕a Lµ^a) .
7. The problem of finding the distinct irreducible representations reduces to identifying all minimal left ideals of G̃. To make our discussion less abstract, let us assume that G̃ may be decomposed into two left ideals, G̃ = L1 ⊕ L2 with L1 ∩ L2 = {0}, where by definition xL1 = L1 and xL2 = L2 for every x ∈ G̃. Then every x ∈ G̃ is expressible in a unique way as the sum x = x1 + x2 of an element x1 ∈ L1 and an element x2 ∈ L2. In particular, the identity e of G̃ (which is also the identity element of G) is uniquely expressible as e = e1 + e2, where e1 (e2) is some element of L1 (L2). By definition of the identity, x = xe, we have

x = xe = x( e1 + e2 ) = xe1 + xe2 . (A.1)

As L1 and L2 are left ideals, xe1 is in L1 and xe2 is in L2, which means

x1 = xe1 and x2 = xe2 . (A.2)

In particular, if x ∈ L1, so that x1 = x and x2 = 0, then (A.2) says that

x = xe1, xe2 = 0 for x ∈ L1 . (A.3)

Setting x = e1 in these results, we have

e1 = e1 e1 = e1²  and  e1 e2 = 0 . (A.4)

By the same reasoning, with x ∈ L2, we get

x = xe2 , xe1 = 0, for x ∈ L2, (A.5)

e2 = e2 e2 = e2² ,  and  e2 e1 = 0 . (A.6)

The elements e1, e2 in G̃, which satisfy (A.4) and (A.6), are mutually orthogonal idempotents; and they generate the left ideals L1 and L2.
8. These ideals may or may not be minimal. If any one is not, we continue the process of resolving the unit element into a sum of idempotents in the same way, until we can go no further: an idempotent eµ is said to be primitive if it cannot be resolved into a sum of idempotents. If eµ is a primitive idempotent, the left ideal Lµ = G̃eµ is minimal; conversely, if Lµ is a minimal left ideal, any generating idempotent of Lµ is primitive.
9. To verify that an idempotent is primitive, one may use a simple criterion:
Theorem A.1 (Primitive idempotents). An idempotent eµ is primitive if and only if eµ x eµ = λx eµ for all x ∈ G̃, where λx is some number, which may be zero or may depend on x.

10. Finally, to select among primitive idempotents those which generate inequiva-
lent representations, one may use the following theorem:
Theorem A.2 (Equivalent primitive idempotents). Two primitive idempotents e1 and e2 generate equivalent irreducible representations if and only if e1 x e2 ≠ 0 for some x ∈ G̃. In other words, if e1 x e2 = 0 for all x ∈ G̃, then the idempotents e1 and e2 must generate inequivalent representations.
To generalize now: the group algebra G̃ associated with a finite group G can be decomposed into left ideals Lµ and written G̃ = ⊕µ Lµ, with µ running over all inequivalent irreps of the group G. Each Lµ = G̃eµ is generated by an idempotent eµ, which satisfies the conditions eµ eν = δµν eµ and ∑µ eµ = e. In addition, each Lµ may be decomposed into dµ minimal left ideals Lµ^a, with a = 1, . . . , dµ, corresponding to the primitive idempotents eµ^a, which obey eµ^a x eµ^b = c eµ^a δab for all x ∈ G̃, where c is a number. The idempotents eµ generate the inequivalent irreps µ of G, each of multiplicity dµ, which is the number of minimal left ideals in µ.
EXAMPLE 1: The identity representation. The algebra G̃ associated with any finite group G contains the element s = |G|⁻¹ ∑h∈G h, which satisfies gs = s and so sgs = ss = s for any g ∈ G. Hence s is a primitive idempotent. It generates
an irreducible representation of G on the group algebra. The basis vectors of the
representation are gs for each g ∈ G, but since gs = s (the same vector for every
g), the representation is one-dimensional, with the representation matrix given
by D ( g) = 1.
EXAMPLE 2: Algebra of C2. Group C2 has two elements a and e = a². Any element of the associated algebra has the form f = αe + βa, and if it is an idempotent, it must also satisfy f f = f, i.e. α² + β² = α and 2αβ = β. Hence α = 1/2 and β = ±1/2. From the previous example, we know that e1 = (e + a)/2 is a primitive idempotent, which generates the identity representation. We check that the second solution, e2 = (e − a)/2, is orthogonal to the first: e1 e2 = 0. In addition, e2 e e2 = e2, e2 a e2 = −e2, and e2 g e1 = 0 for g = e, a. Hence if x = xe e + xa a is any element of the algebra, then e2 x e2 = (xe − xa) e2, which according to Theorem A.1 means that e2 is also a primitive idempotent; and e2 x e1 = 0, which according to Theorem A.2 means that e1, e2 are inequivalent.
From Example 1, we know that e1 generates the identity representation, with matrices D(e) = 1 and D(a) = 1. The irreducible representation generated by e2 has as basis vectors ee2 = e2 and ae2 = −e2 (essentially the same vector), and so is also one-dimensional, with matrices D(e) = 1 and D(a) = −1.
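The relations of this example can also be checked numerically. A sketch (our own illustrative code, with exact rational coefficients) representing algebra elements of C2 as coefficient dictionaries:

```python
from fractions import Fraction

# Elements of the algebra of C2 = {e, a} stored as {0: x_e, 1: x_a};
# group law: a^i a^j = a^((i+j) mod 2). Names are our own choices.
def mul(x, y):
    z = {0: Fraction(0), 1: Fraction(0)}
    for i, xi in x.items():
        for j, yj in y.items():
            z[(i + j) % 2] += xi * yj
    return z

half = Fraction(1, 2)
e1 = {0: half, 1: half}     # e1 = (e + a)/2
e2 = {0: half, 1: -half}    # e2 = (e - a)/2
```

One checks that e1 and e2 are orthogonal idempotents summing to the identity, and that e2 x e2 = (xe − xa) e2, in agreement with the text.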
EXAMPLE 3: Algebra of C3. This order-3 group has three elements a, b = a², and e = a³. We already know that e1 = (e + a + b)/3 is the primitive idempotent that generates the identity representation, D(g) = 1 for g = e, a, a². Any other idempotent of the algebra, which we may write as f = (αe + βa + γb)/3, must satisfy the conditions f e1 = 0 and f f = f, or explicitly

α + β + γ = 0 , (A.7)
α² + 2βγ − 3α = 0 , (A.8)
β² + 2αγ − 3γ = 0 , (A.9)
γ² + 2αβ − 3β = 0 . (A.10)

These equations are equivalent to

α = −(β + γ) , (A.11)
0 = (α − 1)(β − γ) , (A.12)
β³ = γ³ = (α/2)(3 − α)(3 − 2α) . (A.13)
There are two cases: in the first, α = 1 and β ≠ γ; in the second, β = γ and α ≠ 1.
And so there are three solutions:
(i) α = 2, β = γ = −1;
(ii) α = 1, β = ε, γ = ε²;
(iii) α = 1, β = ε², γ = ε;
where ε = exp(i2π/3) = −1/2 + i√3/2. We now check the primitivity (using Theorem A.1) and the equivalence (using Theorem A.2) of the solutions.
. Solution (i): e′ = (2e − a − b)/3.
e′e = ee′ = e′,
e′a = ae′ = (−e + 2a − b)/3,
e′b = be′ = (−e − a + 2b)/3,
e′ee′ = e′,
e′ae′ = (−ee′ + 2ae′ − be′)/3 = e′a.
The last equation shows that e′ is not a primitive idempotent (e′ae′ is not a multiple of e′). It turns out, as seen from the results obtained below, that e′ is decomposable: e′ = e2 + e3.
. Solution (ii): e2 = (e + εa + ε²b)/3.
e2e = ee2 = e2, e2a = ae2 = ε²e2, e2b = be2 = εe2;
e2ee2 = e2, e2ae2 = ε²e2, e2be2 = εe2.
And so e2 is a primitive idempotent.
. Solution (iii): e3 = (e + ε²a + εb)/3.
One can check in a similar manner that e3 is also a primitive idempotent. It remains to see whether e2 and e3 are equivalent or not. Applying Theorem A.2 we find that they are not: e2ee3 = e2e3 = 0, e2ae3 = ε²e2e3 = 0, and e2be3 = εe2e3 = 0.
The left ideal L2 generated by e2 contains the basis vectors ee2 = e2, ae2 = ε²e2, and be2 = εe2. Thus, the corresponding irreducible representation is one-dimensional, with matrices D(e) = 1, D(a) = ε², and D(b) = ε. Similarly, the left ideal L3 generated by e3 gives rise to another one-dimensional representation, with matrices D(e) = 1, D(a) = ε, and D(b) = ε². To summarize, we list all the irreducible representations of the group C3:

µ : e a a2
1 : 1 1 1
2 : 1 ε2 ε
3 : 1 ε ε2
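These statements can be confirmed numerically. A sketch (our own illustrative code; ε is the cube root of unity used above):

```python
import cmath

eps = cmath.exp(2j * cmath.pi / 3)   # the cube root of unity of the text

# Algebra elements of C3 stored as 3-entry lists [x_e, x_a, x_b], with b = a^2;
# group law: index addition mod 3. Names are our own choices.
def mul(x, y):
    z = [0j, 0j, 0j]
    for i in range(3):
        for j in range(3):
            z[(i + j) % 3] += x[i] * y[j]
    return z

def close(x, y, tol=1e-12):          # compare up to floating-point error
    return all(abs(u - v) < tol for u, v in zip(x, y))

e1 = [1 / 3, 1 / 3, 1 / 3]           # identity representation
e2 = [1 / 3, eps / 3, eps ** 2 / 3]
e3 = [1 / 3, eps ** 2 / 3, eps / 3]
a = [0, 1, 0]
```

The checks below reproduce the idempotency, mutual orthogonality, and the relation e2 a e2 = ε² e2 invoked in Theorem A.1.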

A.2 Graphical Method
The preceding discussion applies to any finite group, but from now on we shall
concentrate on the symmetric group Sn . We begin by describing an effective tool
for finding its representations, the graphical method due to Rev. Alfred Young.
11. Partitions of n. A partition [λ] ≡ [λ1, λ2, . . . , λn] of a positive integer n ∈ N∗ is a sequence of non-negative integers λi ∈ N arranged in non-ascending order and whose sum is equal to n:

∑_{i=1}^{n} λi = n ,  λ1 ≥ λ2 ≥ . . . ≥ λn ≥ 0 . (A.14)

When enumerating several partitions, it is useful to give them in some order, and so we define two partitions [λ] and [µ] to be equal if λi = µi for all i, and partition [λ] to be 'greater' than [µ] if the first non-zero number in the sequence λi − µi is positive. The usual convention is to list partitions in this non-ascending order, greatest first. If [λ1, λ2, . . .] contains a value λ repeated k times, the repeated values may be indicated by an exponent, λ^k. It is also customary to suppress trailing zeros in the sequence, as in [λ1, λ2, . . . , λr, 0, 0, . . .] ≡ [λ1, λ2, . . . , λr] with λi ≥ λi+1 > 0.
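This enumeration is easily mechanized. A sketch (our own illustrative code) that lists the partitions of n with non-ascending parts, greatest partition first, in the conventional order just described:

```python
def partitions(n, max_part=None):
    # Returns the partitions of n as tuples with non-ascending parts,
    # in the greatest-first ("dictionary") order of the text.
    if max_part is None:
        max_part = n
    if n == 0:
        return [()]
    result = []
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            result.append((first,) + rest)
    return result
```

For n = 4 this reproduces the list [4], [3,1], [2²], [2,1²], [1⁴] of Example 4 below, in that order.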
12. Young patterns. A partition [ λ1, . . . , λr ] is uniquely represented by a diagram,
called a Young pattern, which consists of n squares (or boxes) arranged in r rows
lined up on the left, the ith row containing λi squares. The ordering convention
for the λi ’s means that each row has no fewer squares than any row below it (as
illustrated in the following examples). We will use the same symbol, e.g. [ λ ], to
designate a partition of n and the Young pattern it represents.
13. Representations of Sn . Recall (Chapters 1 and 2) that every conjugacy class (or
simply ‘class’) of Sn is characterized by a cycle structure, consisting of cycles of
lengths ℓi, such that ∑i ℓi = n. To each solution of this equation there corresponds a class, so the number of classes is equal to the number of solutions. Since the number of solutions of this equation is equal to the number of solutions of (A.14), we see that the number of partitions of n is equal to the number of Young patterns, and to the number of conjugacy classes of Sn, which in turn is equal to the number of inequivalent irreducible representations of Sn. This suggests that one may associate each Young pattern with an irreducible representation of Sn; it is important to note that this correspondence is one-to-one. We will describe this relationship more
precisely in the next section after we have familiarized ourselves with this new
graphical tool.
EXAMPLE 4: For n = 2, 3, 4, we have the following partitions [λ], enumerated in the conventional order in each case, each represented by its Young pattern:
. n = 2: [2] (one row of two boxes) and [1, 1] ≡ [1²] (one column of two boxes).
. n = 3: [3], [2, 1], and [1, 1, 1] ≡ [1³].
. n = 4: [4], [3, 1], [2, 2] ≡ [2²], [2, 1, 1] ≡ [2, 1²], and [1, 1, 1, 1] ≡ [1⁴]. To each partition, there corresponds a Young pattern.

14. Standard Young tableaux. A Young tableau for the symmetric group Sn (of
degree n) is a Young pattern in which n boxes are filled with n symbols (e.g. 1,
2,. . . , n) in any order, each symbol being used only once. A standard Young tableau
is a tableau in which the numbers 1, 2,. . . , n in each row appear in an increasing
order to the right, and those in each column appear increasing from top to bottom.
A normal Young tableau is a tableau in which the labels 1, 2,. . . , n appear in that
order from left to right and from the top row to the bottom row. To each pattern,
there corresponds a unique normal tableau Γ; one can obtain every other standard
tableau having the same pattern by applying an appropriate permutation p acting
on the labels in the normal tableau, producing the tableau pΓ.
This graphical tool yields a simple first result: We can find the dimension of any
irreducible representation of Sn by counting the number of standard tableaux of
the pattern corresponding to that irrep. (Another, more systematic, way is to
use the ‘hook’ formula described later.) The numbers of standard tableaux of
two patterns conjugate under the interchange of rows and columns (and so the
dimensions of the corresponding irreps of Sn ) are equal.
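The first counting method can be carried out by brute force. A sketch (our own illustrative code) that enumerates all fillings of a pattern and keeps those whose rows and columns increase:

```python
from itertools import permutations

def num_standard_tableaux(shape):
    # shape: row lengths of a Young pattern, e.g. (3, 1) for [3,1].
    n = sum(shape)
    cells = [(r, c) for r, row_len in enumerate(shape) for c in range(row_len)]
    count = 0
    for filling in permutations(range(1, n + 1)):
        t = dict(zip(cells, filling))
        rows_ok = all(t[r, c] < t[r, c + 1] for (r, c) in t if (r, c + 1) in t)
        cols_ok = all(t[r, c] < t[r + 1, c] for (r, c) in t if (r + 1, c) in t)
        count += rows_ok and cols_ok
    return count
```

The counts reproduce those of Example 5 below, and conjugate patterns indeed give equal counts, e.g. [3,1] and [2,1²].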
15. Ordering the tableaux. It is conventional to list the tableaux of the same pattern in the 'dictionary order': when all the numbers in the boxes are written down on one line, one row followed by the next, the tableaux are arranged in increasing order of the resulting numbers. For example (writing a tableau's rows separated by bars), the tableau (1346 | 27 | 5) → 1346275 comes before (1347 | 25 | 6) → 1347256, which comes before (1347 | 26 | 5) → 1347265.

EXAMPLE 5: We list below the standard tableaux of the symmetric groups S2, S3, and S4, and give the number of standard tableaux d[λ] for each pattern [λ]; rows of a tableau are separated by bars.
. S2: The partitions [2] and [1,1] of 2 result in two patterns, with one standard tableau (d = 1) for each pattern: [2]: (1 2), d[2] = 1; [1²]: (1 | 2), d[1,1] = 1. Both tableaux are normal, and conjugates of each other.

. S3: There are three possible partitions of 3, namely [3], [2,1], and [1,1,1], corresponding to the three patterns [3], [2,1], and [1³], and producing the following standard tableaux (rows separated by bars):
[3]: (1 2 3), d[3] = 1;
[2,1]: (1 2 | 3) and (1 3 | 2), d[2,1] = 2;
[1³]: (1 | 2 | 3), d[1³] = 1.

All the tableaux are normal, except the tableau (1 3 | 2) of the pattern [2,1], which is simply standard, and can be obtained from the normal tableau of the pattern [2,1] by applying the permutation p = (23) to it. As a rule, the normal tableau for each pattern always appears first in the dictionary ordering of all the tableaux.
. S4: The five Young patterns produce the following standard tableaux (rows separated by bars):
[4]: (1 2 3 4), d[4] = 1;
[3,1]: (1 2 3 | 4), (1 2 4 | 3), (1 3 4 | 2), d[3,1] = 3;
[2²]: (1 2 | 3 4), (1 3 | 2 4), d[2,2] = 2;
[2,1²]: (1 2 | 3 | 4), (1 3 | 2 | 4), (1 4 | 2 | 3), d[2,1,1] = 3;
[1⁴]: (1 | 2 | 3 | 4), d[1⁴] = 1.

In each pattern, the first tableau is normal, the others are not but can be obtained
by permutations of the symbols in the normal tableau of the same pattern. For
example, the standard non-normal tableaux of pattern [3, 1] can be obtained by
applying the permutations (34) and (234) to the normal tableau. Note also that
the conjugate patterns have the same dimensions (d = 1 for the pair [4] and [14 ],
and d = 3 for the pair [3,1] and [2, 12 ]); on the other hand, [22] is self-conjugate,
with dimension d = 2. These examples illustrate the general rule.
16. Number of standard tableaux. There is a simple way of calculating the num-
ber of standard tableaux of any given pattern [ λ ], and hence the dimension of the
corresponding irrep of Sn . It is given by the ‘hook’ formula:
n!
d[λ] ( Sn ) = , (A.15)
H[λ]
where H = ∏ij hij , and hij is the hook number of the box B located on column j
and row i in the given pattern. The hook number h ij is defined as the number of
boxes to the right of B on row i, plus the boxes below B in column j, and finally,
plus one. The hook product H has n factors in it (i.e. the total number of boxes).
For example, in the Young pattern [λ] = [2,1], there are three hooks, which produce the hook numbers h11 = 3, h12 = h21 = 1, resulting in the hook product H = 3 · 1 · 1 = 3, and so d[2,1](S3) = 3!/3 = 2. In both partitions [n] and [1^n], the hook product is H = n!, and so d(Sn) = n!/n! = 1 in both cases.
(The procedure gets its name from the calculational device of drawing a ‘hook’,
or a line passing vertically up through the bottom of column j, making a 90◦ right
turn at row i to move through the boxes to the right end of that row: h ij is the
total number of boxes through which the hook line passes.)
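The hook recipe translates directly into code. A sketch (our own illustrative implementation of formula (A.15)):

```python
from math import factorial

def hook_dim(shape):
    # Dimension of the irrep of S_n labeled by the pattern `shape` (row
    # lengths), via the hook formula d = n! / prod_ij h_ij.
    n = sum(shape)
    H = 1
    for i, row_len in enumerate(shape):
        for j in range(row_len):
            arm = row_len - j - 1                          # boxes to the right
            leg = sum(1 for k in range(i + 1, len(shape))  # boxes below
                      if shape[k] > j)
            H *= arm + leg + 1                             # hook number h_ij
    return factorial(n) // H
```

As a consistency check, the squared dimensions over all patterns of n = 4 sum to 4! = 24, as required by the decomposition of the regular representation.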
A.3 Idempotents and Representations
In this section, we construct the idempotents for S̃n (the group algebra of Sn), which generate the left ideals associated with the representations contained in the regular representation. As we are dealing with the symmetric group, we can make use of the graphical method of Young.
17. Horizontal, vertical, and Young operators. Given a Young tableau Γλ for the
group Sn (a pattern of n boxes carrying the symbols 1, 2, . . . , n in some order λ),
we consider two types of permutations: the horizontal permutations hλ , which are
all those interchanging symbols in the same row; and the vertical permutations v λ ,
which interchange only symbols in the same column. Then we define the horizon-
tal operator or symmetrizer sλ of tableau Γλ as the sum over all horizontal permu-
tations of Γλ , and the vertical operator or antisymmetrizer aλ of Γλ as an alternating
sum of all vertical permutations of Γλ :
sλ = ∑h hλ ,  (symmetrizer of Γλ) (A.16)
aλ = ∑v δv vλ ,  (antisymmetrizer of Γλ) (A.17)

where δv is the parity of v, i.e. δv = +1 if v is an even permutation, and δv = −1 if v is an odd permutation. Next, we define the Young operator:

Yλ = sλ aλ .  (Young operator on Γλ) (A.18)

For a given tableau Γλ , the horizontal operator, vertical operator, and Young op-
erator are uniquely defined (but one may also take aλ sλ ). In practice, to calculate
the sλ of tableau Γλ , one takes the sum of all permutations of the symbols in row i
of Γλ and then forms the product of such sums for all rows: sλ = ∏i ∑ p p(i) . Sim-
ilarly, to calculate aλ , one first takes the sum of all permutations of the symbols in
column j, weighted by factors δ p, then the product of such sums for all columns:
aλ = ∏j ∑ p δ p p(j) . These expressions when expanded become combinations of
permutations, as shown in the equations displayed above.
As we will discuss in more detail below, the Young operator, once properly normalized, is an idempotent, and it generates a left ideal that carries an irreducible representation. So the Young operators for all standard Young tableaux of Sn completely resolve the identity, and give the complete decomposition of the regular representation of Sn.
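To see the construction at work, the following sketch (our own illustrative code) builds Y for the normal tableau (1 2 | 3) of the pattern [2,1] in S̃3 and checks that it is essentially idempotent, Y² = HY with H = 3:

```python
# Permutations of {1,2,3} as tuples p with p[i] = image of i+1, zero-indexed.
E = (0, 1, 2)
T12 = (1, 0, 2)        # horizontal permutation (12) of the tableau (1 2 | 3)
T13 = (2, 1, 0)        # vertical permutation (13)

def comp(p, q):                      # composition (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(3))

def mul(x, y):                       # product in the group algebra
    z = {}
    for p, xp in x.items():
        for q, yq in y.items():
            r = comp(p, q)
            z[r] = z.get(r, 0) + xp * yq
    return {r: c for r, c in z.items() if c != 0}

s = {E: 1, T12: 1}                   # s_[2,1] = e + (12)
a = {E: 1, T13: -1}                  # a_[2,1] = e - (13)
Y = mul(s, a)                        # Y_[2,1] = s a
```

The check Y² = 3Y agrees with H = n!/d[λ] = 6/2 = 3 for this pattern.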
18. The identity representation of Sn . For the normal tableau of one-row pattern
[ n ] the set {h[n] } consists of all elements p ∈ Sn , whereas {v[n] } contains only the
group identity e. So a[n] = e, and s[n] = ∑ p p ≡ s, which is the symmetrizer of the
whole group; it follows that the Young operator for [ n ] is Y[n] = se = s.
Since sp = ps = s for all p ∈ Sn , it follows that ss = n!s, and so s is essentially
idempotent, i.e. it is idempotent up to a constant. Thus, s/n! is strictly idempotent.
In addition, sps = ss = n!s for all p ∈ Sn , and so s is a primitive idempotent.
Therefore, s (or Y[n] = s) generates an irreducible representation of Sn on the
group algebra. Since ps = s for all p ∈ Sn, the representation is one-dimensional, with matrices D(p) = 1 for all p ∈ Sn.
19. The alternating representation of Sn . For the normal tableau of one-column
pattern [1n ], the identity is the only horizontal permutation, but all permutations
are vertical permutations, therefore s[1n ] = e, and a[1n ] = ∑ p δ p p ≡ a is the anti-
symmetrizer of the full group. So, the Young operator for [1n ] is Y[1n ] = ea = a.
With arguments similar to the ones used above, we see that p a = a p = δ p a,
which implies aa = n! a and apa = δ p n! a for all p ∈ Sn . And so a is (essentially)
idempotent and primitive. Furthermore, since pa = δp a, we have xa = (∑p δp xp) a for all vectors x = ∑p xp p ∈ S̃n, and the left ideal S̃n a generated by a consists of multiples of a: this is a one-dimensional vector space. The result obtained is the
alternating representation, in which D( p ) = δ p for all p ∈ Sn .
The symmetrizer s and antisymmetrizer a of Sn are mutually orthogonal, and
generate inequivalent (irreducible) representations, a simple result we can verify:
spa = sa = 0 for all p ∈ Sn .
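These statements are easy to verify mechanically in S̃3; a sketch (our own illustrative code):

```python
from itertools import permutations

def comp(p, q):                       # composition (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(3))

def sign(p):                          # parity via inversion count
    inv = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inv % 2 else 1

def mul(x, y):                        # product in the group algebra
    z = {}
    for p, xp in x.items():
        for q, yq in y.items():
            r = comp(p, q)
            z[r] = z.get(r, 0) + xp * yq
    return {r: c for r, c in z.items() if c != 0}

perms = list(permutations(range(3)))
s = {p: 1 for p in perms}             # symmetrizer of the whole group
a = {p: sign(p) for p in perms}       # antisymmetrizer of the whole group
```

One finds ss = 3! s, aa = 3! a, and sa = as = 0, i.e. both are essentially idempotent and mutually orthogonal.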
20. The symmetrizers and antisymmetrizers for patterns other than [ n ] and [1n ]
are also (essentially) idempotent, but not primitive. But the Young operators Yλ
constructed from them are both essentially idempotent and primitive:

Yλ² = H Yλ , (A.19)
Yλ x Yλ = c Yλ , for all x ∈ S̃n , (A.20)

where H and c are ordinary numbers. It turns out that H = n!/d[λ](Sn) is the hook product for pattern [λ]. Using this normalization factor, we define the normalized primitive idempotent for [λ] as

eλ ≝ (1/H) Yλ = (1/H) sλ aλ . (A.21)

As a primitive idempotent, the Young operator eλ generates a minimal left ideal of S̃n, which corresponds to an irreducible invariant subspace for the group representation.
21. Equivalent representations from the same pattern. Up to now we have used the symbol Γλ to denote any standard Young tableau obtained from the pattern [λ]. Now, let us replace it with a more precise notation, Γλ^p, such that Γλ^1 ≡ Γλ^e denotes the normal tableau, and Γλ^p = pΓλ^1 every one of the other standard tableaux, which, as we know, can be obtained by applying an appropriate permutation p to Γλ^1. If eλ is the primitive idempotent for the normal Young tableau Γλ^1, then eλ^p = p eλ p⁻¹ is the primitive idempotent for the non-normal standard Young tableau Γλ^p. Therefore, the irreducible representations generated by eλ and eλ^p are equivalent, just as are those generated by eλ^p and eλ^q for any permutations p, q ∈ Sn.
This can be easily seen as follows: we need to show that eλ^p x eλ ≠ 0 for some x ∈ S̃n (cf. Theorem A.2). As eλ^p = p eλ p⁻¹, we have eλ^p p eλ = p eλ p⁻¹ p eλ = p eλ eλ = p eλ, which is non-vanishing.
22. The permutation Rji, which relates the Young operators eλ^i and eλ^j corresponding to the Young tableaux Γλ^i and Γλ^j of the same pattern, so that eλ^j = Rji eλ^i Rji⁻¹, can be found either by inspection of the diagrams, or more systematically in the following way: write the labeling numerals in successive rows of tableau Γλ^i on the first line of a two-row representation of Rji, and in the same way the numerals in successive rows of tableau Γλ^j on the second line. Thus, for example, if Γλ^i = (1346 | 27 | 5) and Γλ^j = (1347 | 25 | 6) (rows separated by bars), then the permutation needed is Rji = (1346275 / 1347256) ≡ (567).
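The two-line recipe is mechanical; a sketch (our own illustrative code, returning the permutation as a symbol-to-image map):

```python
def relating_permutation(tab_i, tab_j):
    # Tableaux given as lists of rows; read the numerals of tab_i row by row
    # onto the first line of the two-row form, those of tab_j onto the second.
    top = [x for row in tab_i for x in row]
    bot = [x for row in tab_j for x in row]
    return dict(zip(top, bot))       # symbol -> image, i.e. R_ji in map form

R = relating_permutation([[1, 3, 4, 6], [2, 7], [5]],
                         [[1, 3, 4, 7], [2, 5], [6]])
```

For the example of the text, R fixes 1, 2, 3, 4 and cycles 5 → 6 → 7 → 5, i.e. R = (567).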
23. Inequivalent representations from different patterns. The Young operators for different patterns ([λ] ≠ [µ]) generate inequivalent representations.
The proof runs as follows: let eλ and eµ be the primitive idempotents of the patterns λ and µ ≠ λ, and let p be any element of Sn. Then

eµ p eλ = eµ (p eλ p⁻¹) p = (eµ eλ^p) p = 0 .

It follows that

eµ x eλ = 0 for all x ∈ S̃n .

By Theorem A.2, the primitive idempotents eλ and eµ ([λ] ≠ [µ]) generate inequivalent representations.
24. Irreducible representations of Sn . The Young operators { eλ} associated to the
normal Young tableaux generate all the inequivalent irreducible representations of the
symmetric group Sn .
A proof of this key result is based on the following observations:
(i) The number of the inequivalent irreducible representations of Sn is equal
to the number of Young patterns.
(ii) There is one Young operator eλ for each Young pattern.
(iii) Every eλ generates an inequivalent irreducible representation of the group.
25. Complete reduction of the regular representation. The left ideals generated
by the Young operators associated to the distinct standard Young tableaux are lin-
early independent, and they span the group-algebra space Sen . The group identity
element is resolved into the sum of the Young operators associated to the stan-
dard tableaux.
Let {eλ^p} be the set of dλ primitive idempotents for the distinct standard Young tableaux having the pattern [λ], including the normal tableau. Then the statement above tells us that the decomposition of the group algebra S̃n and the resolution of its identity element e are expressed in the form

S̃n = ∑λ ∑p Lλ^p , where Lλ^p = S̃n eλ^p , (A.22)
e = ∑λ ∑p eλ^p , where (eλ^p)² = eλ^p . (A.23)
λ p
EXAMPLE 6: Group S2. There are two partitions of 2, namely [2] and [1,1]. The corresponding Young patterns produce two standard tableaux, both normal:
. Pattern [2], tableau (1 2): s[2] = e + (12), a[2] = e, and Y[2] = e + (12). With H = 2, we define the normalized primitive idempotent e[2] = [e + (12)]/2. This is a special case of the general situation discussed in §18 above.
. Pattern [1,1], tableau (1 | 2): s[1,1] = e, a[1,1] = e − (12), and Y[1,1] = e − (12). With H = 2, we define the normalized primitive idempotent e[1,1] = [e − (12)]/2. This is a special case of the situation discussed in §19 above.
So, the resolution of the identity on the group algebra S̃2 is e = e[2] + e[1,1].
EXAMPLE 7: Group S3. There are three partitions of 3, namely [3], [2,1], and [1³]. The corresponding Young patterns produce four standard tableaux, all normal except one.
. [3], tableau (1 2 3): this is again a special case of the situation discussed in §18. All permutations are horizontal permutations; the only vertical permutation is e. So s[3] = ∑p p = s (the total symmetrizer), a[3] = e, and Y[3] = s. With the constant H = 3! = 6, the normalized primitive idempotent for [3] is e[3] = s/6.
. [2,1], tableau (1 2 | 3): e and (12) are horizontal, e and (13) are vertical. Hence s[2,1] = e + (12), a[2,1] = e − (13), and Y[2,1] = s[2,1] a[2,1] = e + (12) − (13) − (321). With H = 3, the normalized idempotent of this tableau is e[2,1] = Y[2,1]/3.
. [2,1]′, tableau (1 3 | 2): this non-normal standard tableau can be obtained from the normal tableau of [2,1] by a (23) permutation. We have s[2,1]′ = e + (13), a[2,1]′ = e − (12), and Y[2,1]′ = s[2,1]′ a[2,1]′ = e + (13) − (12) − (123), and the normalized idempotent is e[2,1]′ = Y[2,1]′/3.
. [1³], tableau (1 | 2 | 3): this is another special case of the general situation discussed in §19 above. All permutations are vertical; the only horizontal one is e. Therefore s[1³] = e and a[1³] = ∑p δp p = a (the antisymmetrizer of the full group). The Young operator is Y[1³] = a, which becomes the normalized idempotent e[1³] = a/6 upon including the constant H = 3!.
And so, the resolution of the identity in S̃3 is

e = e[3] + e[2,1] + e[2,1]′ + e[1³] ,

corresponding to the full decomposition of S̃3 into minimal left ideals:

S̃3 = L[3] + L[2,1] + L[2,1]′ + L[1³] .

The dimension of S̃3 matches the sum of the dimensions of the component ideals: 3! = 6 = 1 + 2 + 2 + 1, as it should.
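The resolution of the identity just displayed can be confirmed by direct computation; a sketch (our own illustrative code, with exact rational coefficients):

```python
from fractions import Fraction
from itertools import permutations

def comp(p, q):                       # composition (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(3))

def sign(p):                          # parity via inversion count
    inv = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inv % 2 else 1

def mul(x, y):                        # product in the group algebra
    z = {}
    for p, xp in x.items():
        for q, yq in y.items():
            r = comp(p, q)
            z[r] = z.get(r, Fraction(0)) + xp * yq
    return {r: c for r, c in z.items() if c != 0}

def combine(*terms):                  # sum of algebra elements
    z = {}
    for t in terms:
        for p, c in t.items():
            z[p] = z.get(p, Fraction(0)) + c
    return {p: c for p, c in z.items() if c != 0}

E, T12, T13 = (0, 1, 2), (1, 0, 2), (2, 1, 0)    # e, (12), (13)
perms = list(permutations(range(3)))
s = {p: Fraction(1) for p in perms}              # total symmetrizer
a = {p: Fraction(sign(p)) for p in perms}        # total antisymmetrizer

e3 = {p: c / 6 for p, c in s.items()}            # e_[3] = s/6
e111 = {p: c / 6 for p, c in a.items()}          # e_[1^3] = a/6
Y21 = mul({E: Fraction(1), T12: Fraction(1)},
          {E: Fraction(1), T13: Fraction(-1)})   # Y_[2,1]
Y21p = mul({E: Fraction(1), T13: Fraction(1)},
           {E: Fraction(1), T12: Fraction(-1)})  # Y_[2,1]'
e21 = {p: c / 3 for p, c in Y21.items()}
e21p = {p: c / 3 for p, c in Y21p.items()}
```

The four idempotents sum to the group identity, each is idempotent, and the symmetric and antisymmetric ones are orthogonal, as stated.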
A.4 Irreducible Tensors
26. The representation theory of the symmetric group Sn has an important appli-
cation in the study of the irreducible tensors defined with respect to any group G
of linear transformations on a finite-dimensional space, and so (as these tensors
provide a basis for representations) also in the classification of the irreducible
representations of the group G itself. This applies whenever the group representations
can be constructed by the tensorial techniques (cf. Chapter 7 Section 7.2),
but proves especially useful when the group in question is the unitary Lie group
SU(n), because, with its standard representation realized on C^n, its other representations
are built up on tensor powers of C^n, e.g. exterior powers Λ^k(C^n) for
the fundamental representations, and symmetric powers Sym^k(C^n) for all other
irreducible representations. The simplest example is the case of SU(2): its irreps
are just the symmetric tensor products Sym^k C^2.
27. Let us begin with the Lie group Gn = GL ( n, C ) of invertible linear transfor-
mations g acting on an n-dimensional vector space Cn . Once a basis {| i i} for Cn
(with i = 1, 2, . . . , n) has been chosen, we define the standard representation of
Gn by the invertible matrices gij in the usual way (with summation over repeated
indices):
π(g)|j⟩ = |i⟩ g_ij .   (A.24)

g maps any vector |x⟩ = |i⟩ x_i in C^n into |x^g⟩ = |i⟩ x_i^g, where x_i^g = g_ij x_j. A product
of vector coordinates such as x_i y_j z_k transforms under g into g_iℓ g_jm g_kn x_ℓ y_m z_n, and
is a simple example of the general tensor defined below.
The direct-product space Cn×r ≡ Cn ⊗ · · · ⊗ Cn involving r factors is called a
tensor space of nr dimensions. It may be provided with a basis consisting of
|I⟩ ≝ |i1⟩ ⊗ |i2⟩ ⊗ ⋯ ⊗ |ir⟩,   (A.25)
where 1 ≤ i1 , . . . , ir ≤ n. The action of Gn on the tensor space Cn×r induces a
transformation defined by
π(g)|J⟩ = |I⟩ D_IJ(g), where D_IJ(g) = g_{i1 j1} g_{i2 j2} ⋯ g_{ir jr} .   (A.26)
A rank-r tensor ψ with respect to Gn is an element of C^{n×r} with n^r components
ψ_I in the basis {|I⟩}, such that ψ = |I⟩ ψ_I, which transform under g ∈ Gn like a
product of r vectors of C^n, that is, as ψ_I ↦ ψ_I^g = D_IJ(g) ψ_J.
28. On the other hand, consider now the symmetric group Sr, which is the group
of all permutations of r symbols, p = (1 2 … r ↦ p1 p2 … pr). It acts on C^{n×r} by
changing the sub-indices 1, 2, … of the tensor labels i1, i2, … into i_{p1}, i_{p2}, …, in
the manner described by

φ(p)|J⟩ = |I⟩ D_IJ(p), where D_IJ(p) = δ^{i_{p1}}_{j1} δ^{i_{p2}}_{j2} ⋯ δ^{i_{pr}}_{jr} ≡ δ_{I_p, J} = δ_{I, J_{p⁻¹}} .   (A.27)
A.4. IRREDUCIBLE TENSORS 275

It transforms a rank-r tensor ψ such that the tensor components ψ_I are mapped
to ψ_I^p = D_IJ(p) ψ_J = ψ_{I_p}. For example, if p sends the first sub-index to 1′ and the
second to 2′, then ψ^p_{i1, i2, i3} = ψ_{i1′, i2′, i3}.
29. The representations of the Lie group Gn and the symmetric group Sr on tensor
space Cn×r have a property, crucial to our discussion, of commuting with each
other, which can be proved as follows. First, note that as the order of the factors
g_ij in D(g) is not important, we have:

D_IJ(g) = ∏_k g_{i_k j_k} = ∏_k g_{i_{p_k} j_{p_k}} = D_{I_p J_p}(g).

From (A.26) and (A.27), we have:

φ(p) π(g) |I⟩ = |K⟩ D_KJ(p) D_JI(g) = |J_{p⁻¹}⟩ D_JI(g),
π(g) φ(p) |I⟩ = |K⟩ D_KJ(g) D_JI(p) = |K⟩ D_{K I_{p⁻¹}}(g).

The RHS of the two equations may also be rewritten, after relabeling the repeated
indices, as |J⟩ D_{J_p I}(g) and |J⟩ D_{J I_{p⁻¹}}(g). But D_{J I_{p⁻¹}}(g) = D_{J_p I}(g), and so the LHS
of the two equations are equal. Since | I i is any of the basis vectors of Cn×r , we
have the commutation relation on the tensor space

φ ( p ) π ( g ) | ψ i = π ( g ) φ ( p ) | ψ i, (A.28)

for any ψ ∈ Cn×r , g ∈ Gn , and p ∈ Sr. So those tensors of rank r with a particular
symmetry under Sr will be transformed among themselves by transformations
g ∈ Gn . The decomposition of Cn×r into symmetry classes of Sr automatically
gives a decomposition of the tensor space into subspaces invariant under the
general linear group Gn . The subspaces spanned by tensors of definite symmetry
classes are irreducible invariant subspaces under the finite group Sr, and provide simultaneously
the spaces for the irreducible representations of the Lie group Gn.
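The commutation property (A.28) is easy to check numerically. The sketch below (an illustration assuming NumPy, with n = r = 3, a random g, and a 3-cycle p) confirms that applying g to every slot commutes with permuting the slots:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))  # a generic g in GL(n,C)
psi = rng.standard_normal((n, n, n))                                # a rank-3 tensor

def pi_g(T):
    """pi(g): apply g to every tensor slot."""
    return np.einsum('ia,jb,kc,abc->ijk', g, g, g, T)

def phi_p(T):
    """phi(p): permute the tensor slots by a 3-cycle."""
    return np.transpose(T, (1, 2, 0))

assert np.allclose(phi_p(pi_g(psi)), pi_g(phi_p(psi)))              # relation (A.28)
```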
30. To construct an irreducible representation of the group Gn accessible through
tensorial analysis, it suffices to provide it with a basis in the form of irreducible
tensors of rank r in Cn×r possessing a definite symmetry under the symmetric
group Sr, and this we know how to do—by using Young’s graphical method.
To each Young pattern [ λ1, . . . , λk ], with ∑ki=1 λi = r, corresponds a symmetry
type. To obtain the class of tensors of rank r having that symmetry type, we begin
by assigning the indices i1 , i2 , . . . , ir in this order to the boxes of each successive
row from the top down. Next, we apply the appropriate Young operator Y to the
Young tableau so obtained, that is to say, we symmetrize all the indices in each
row, and then antisymmetrize all the indices in each column.
E XAMPLE 8: Rank-3 tensors in Cn . There are three Young patterns, namely [3],
[2,1] and [13 ], producing four standard tableaux, all normal except one.
. [3]: (row: i1 i2 i3) s[3] = s (the total symmetrizer), a[3] = e, and Y[3] = s. Then we
apply Y on a generic rank-3 tensor ψ_{i1 i2 i3}, resulting in the set of totally symmetric
tensors Y[3] ψ_{i1 i2 i3} = ψ_{{i1 i2 i3}}, which form the basis of an irrep subspace describable
by totally symmetric rank-3 tensors.

. [2,1]: (rows: i1 i2 | i3) The projection operator for this tableau is Y[2,1] = a[2,1] s[2,1],
where s[2,1] = e + (12), a[2,1] = e − (13). The tensors have a mixed symmetry:
ψ_{i1 i2 i3} + ψ_{i2 i1 i3} − ψ_{i3 i2 i1} − ψ_{i2 i3 i1}, and are the basis of a subspace of the symmetry
class [2,1] of symmetry [123].
. [2,1]′: (rows: i1 i3 | i2) (Here we fill the boxes with indices in an order different
from the standard order.) The projection operator for this Young tableau [2,1]′
is Y = [e − (12)][e + (13)], so that the corresponding tensors are of the form
ψ_{i1 i2 i3} − ψ_{i2 i1 i3} + ψ_{i3 i2 i1} − ψ_{i3 i1 i2}. They span an invariant subspace also belonging
to the symmetry class [2,1], but of symmetry [132], different from [123]. The two
sets of tensors are distinct but equivalent, being in the same symmetry class.
. [1³]: (column: i1 | i2 | i3) The Young operator for [1³] is Y[1³] = a e = a, where s[1³] = e
and a[1³] = ∑_p δ_p p = a (the antisymmetrizer of the full group). The basis functions
are given by the tensors ψ_{[i1 i2 i3]}, totally antisymmetric in their indices. 
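A concrete way to see the dimension count for these four subspaces is to realize the Young projectors as 27 × 27 matrices on the space of rank-3 tensors over C³. The NumPy sketch below (an illustration, with n = 3 assumed) verifies that the four normalized operators are idempotent, resolve the identity, and have traces, hence image dimensions, 10 + 8 + 8 + 1 = 27:

```python
import itertools
import numpy as np

n, N = 3, 27                                   # (C^3)^{⊗3} has dimension 3^3 = 27

def flat(t):
    return (t[0] * n + t[1]) * n + t[2]

def P(p):
    """Matrix of the slot permutation p acting on rank-3 tensors."""
    M = np.zeros((N, N))
    for idx in itertools.product(range(n), repeat=3):
        M[flat(idx), flat(tuple(idx[q] for q in p))] = 1
    return M

perms = list(itertools.permutations(range(3)))
sgn = lambda p: (-1) ** sum(p[i] > p[j] for i in range(3) for j in range(i + 1, 3))
I = np.eye(N)

S  = sum(P(p) for p in perms) / 6                  # e[3]   : symmetrizer / 3!
A  = sum(sgn(p) * P(p) for p in perms) / 6         # e[1^3] : antisymmetrizer / 3!
Y  = (I + P((1, 0, 2))) @ (I - P((2, 1, 0))) / 3   # e[2,1]  = Y[2,1]/3
Yp = (I + P((2, 1, 0))) @ (I - P((1, 0, 2))) / 3   # e[2,1]' = Y[2,1]'/3

for Q in (S, Y, Yp, A):
    assert np.allclose(Q @ Q, Q)               # each operator is idempotent
assert np.allclose(S + Y + Yp + A, I)          # resolution of the identity
# for an idempotent, trace = dimension of its image:
assert [round(np.trace(Q)) for Q in (S, Y, Yp, A)] == [10, 8, 8, 1]
```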
C OMMENTS .
(a) By construction, the irreducible tensors in a given tensor subspace satisfy
certain linear relations. For instance, in the [2,1] subspace, writing (a b | c) for the
tableau with first row a b and second row c,

(i) (a b | c) = −(c b | a)   and   (ii) (a b | c) = (b a | c) + (a c | b).

(To check, apply Y[2,1] = a[2,1] s[2,1] on both sides of each equation.)
(b) For a group Gn and tensors of fixed rank r, what are the allowed Young
patterns? The number of columns must be no greater than r (since ∑i λi = r), and
the number of rows must be no greater than n (because an antisymmetric tensor
component with at least one repeated index is equal to zero). Then, every pattern
with n rows or less, and r columns or less is realizable, giving the characteristic
symmetry type of a set of nonzero tensors of rank r.
(c) The Young pattern [1n ] has one standard tableau and the corresponding
irrep of Gn is one-dimensional. The associated tensor has the single component S =
ε_{i1⋯in} ψ_{i1⋯in}, where ε_{i1⋯in} is the invariant unit antisymmetric tensor of rank n.
Under Gn transformations, S is multiplied by det D ( g). (Note that while the
pattern [1n ] gives a one-dimensional (pseudo-scalar) representation in Cn×n , the
pattern [1] corresponds to an n-dimensional (vector) representation in Cn .) Sim-
ilarly, the pattern [2n ] also has one standard tableau, corresponding to a one-
dimensional irrep of Gn . The single tensor associated to it is multiplied by the
factor (det D ( g))2 upon a group transformation. Generally, for each [ sn ], there
exists a one-dimensional irrep S0 , distinguished by the transformation multiplica-
tive factor (det D ( g))s in a Gn transformation, S0 7 → (det D ( g))s S0 . It can also be
seen that the Young patterns [ λ1, . . . , λn ] and [ λ1 + s, . . . , λn + s ] (where s is any
positive integer) have the same standard tableaux, but differ by a scalar factor of
(det D ( g))s in their transformations under Gn .

31. Let us now restrict ourselves to the unimodular subgroups of GL ( n, C ), such


as SL (n, C ), or SU ( n ), in which det g = 1, and so det D ( g) = 1. Then, the irreps
of GL ( n, C ) remain the irreps of the unimodular subgroups, but they do not nec-
essarily remain distinct. For example, [1n ] and [ sn ] (with positive integer s) are
equivalent one-dimensional representations of SU ( n). More generally, the rep-
resentations given by Young patterns differing by n-box columns are equivalent,
so that [λ1, …, λn] = [λ1 + s, …, λn + s] for any integer s. In particular, picking
s = −λn gives [λ1, …, λn] = [λ1 − λn, λ2 − λn, …, λn−1 − λn, 0] (the last 0 is
normally omitted). So, in searching for distinct representations we can remove
n-box columns and need to consider only the patterns having n − 1 rows or less.
In the case of SU (2), admissible Young patterns have just one row (of r boxes),
[ r, 0] = [ r ], and every irrep of SU (2) is a symmetric tensor product of the standard
representation, so that the ( r + 1)-dimensional irrep of SU (2) is V (r) = Symr C2 .
32. For every given Young pattern [λ1, …, λn−1] let us define a complementary
pattern [λ1, λ1 − λn−1, …, λ1 − λ2], such that when placed together in a certain
way they form a rectangle of λ1 columns and n rows, as illustrated in Fig. A.1.
Those patterns correspond to mutually complex conjugate representations1 of the
group SU ( n). This operation is indicated either by an asterisk, or by an over line:

[ λ 1, . . . , λ n − 1 ] ∗ = [ λ 1, λ 1 − λ n − 1 , . . . , λ 1 − λ 2 ] . (A.29)

For example, if r is an integer less than n, then [1]∗ = [1n−1 ], [1r ]∗ = [1n−r ],
and [ r ]∗ = [ rn−1 ]. In particular, if n = 2, we have [ r ]∗ = [ r ], or all the irreps of
SU (2) are self-conjugate, or real.
Figure A.1: Young patterns for complex conjugate representations.

33. The notation [λ1, …, λn−1] gives the lengths of the rows in a pattern. One may
also use a notation that gives the heights (numbers of boxes) of successive columns
in decreasing order, (n−1)^{λ_{n−1}}, (n−2)^{λ_{n−2}−λ_{n−1}}, …, 2^{λ2−λ3}, 1^{λ1−λ2}. This tells us
that the pattern [λ1, …, λn−1] contains λ_{n−1} columns of height n − 1; λ_{n−2} − λ_{n−1}
columns of height n − 2; and so on.
34. Irreducible representations of a Lie group or a Lie algebra are identified by their
highest weights, usually written ( m1, m2, . . . ) in the Dynkin basis (consisting of the
fundamental weights). Consider some irrep of SU ( n ) identified with the Young
pattern [λ1, . . . , λn−1 ], or alternately with the highest weight ( m1, m2 , . . . , mn−1 ).
1 This should not be confused with the conjugation of Young tableaux that interchanges rows and
columns, and produces conjugate representations of the symmetric group.



The correspondence between the two notations is given by a simple rule: the
Dynkin coordinate mi is equal to the number of columns in the Young pattern
containing i boxes, that is to say, mi = λi − λi+1 (taking λn = 0). So, for any irrep
and its complex conjugate, the correspondence is:
[λ1, . . . , λn−1] 7 → ( λ1 − λ2, λ2 − λ3, . . . , λn−2 − λn−1 , λn−1 ), (A.30)
[λ1, . . . , λn−1 ]∗ 7 → ( λn−1, λn−2 − λn−1 , . . . , λ2 − λ3, λ1 − λ2 ). (A.31)
Conversely, given a dominant weight ( m1, . . . , mn−1 ) of SU ( n) in the Dynkin
basis, one can draw the corresponding Young pattern formed by mi columns
of height i, with i = n − 1, n − 2, . . . , 2, 1. The lengths of the rows are then
given by λ1 = m1 + · · · + mn−1 ; λ2 = m2 + · · · + mn−1 ; . . . . . . ; λn−1 = mn−1 .
Thus, the fundamental representations ( . . . , 0, 1, 0, . . .) are 1-column patterns [1r ],
whereas symmetric tensor product representations ( r, 0, . . .) and ( r − s, s, 0, . . .)
correspond to the Young patterns [ r ] and [ r, s ].
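The two conversions are one-liners. The helper functions below (hypothetical names, a sketch assuming partitions and weights are given as tuples) implement the rule m_i = λ_i − λ_{i+1} and its inverse λ_i = m_i + ⋯ + m_{n−1}:

```python
def young_to_dynkin(lam, n):
    """Young pattern -> Dynkin labels: m_i = lambda_i - lambda_{i+1} (lambda_n = 0)."""
    lam = list(lam) + [0] * (n - len(lam))       # pad with zero rows up to n
    return tuple(lam[i] - lam[i + 1] for i in range(n - 1))

def dynkin_to_young(m):
    """Dynkin labels -> row lengths: lambda_i = m_i + m_{i+1} + ... + m_{n-1}."""
    lam = [sum(m[i:]) for i in range(len(m))]
    return tuple(l for l in lam if l > 0)

assert young_to_dynkin((2, 1), 3) == (1, 1)      # [2,1] of SU(3): the adjoint (1,1)
assert dynkin_to_young((1, 1)) == (2, 1)
assert young_to_dynkin((1, 1), 4) == (0, 1, 0)   # [1^2]: a fundamental representation
```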
35. The dimension of the irreps of SU ( n) can be calculated with a formula similar
to the ‘hook’ formula (A.15) devised for the Sn irreps, with modifications made
necessary by the fact that the permuted symbols are all distinct whereas here the
tensor indices may take any allowed (even equal) values. It is given by
d[λ](SU(n)) = F_{n,λ} / H_{[λ]} .   (A.32)
The denominator H[λ] , which depends only on the pattern, is the same as for
(A.15), while the numerator Fn,λ is calculated as follows. Put the number n in
the box in the upper left corner of the tableau, and put integers in all other boxes
increasing by 1 each time moving to the right, and decreasing by 1 each time
moving down. Fn,λ is the product of all these numbers.
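This prescription is easily mechanized. In the sketch below (an illustration, with 0-indexed cells so the content factor in cell (i, j) is n + j − i), the numerator F and the hook product H are accumulated cell by cell:

```python
def su_dim(lam, n):
    """Dimension of the SU(n) irrep with Young pattern `lam`, via (A.32): F/H."""
    conj = [sum(1 for row in lam if row > c) for c in range(lam[0])]  # column heights
    F = H = 1
    for i, row in enumerate(lam):
        for j in range(row):
            F *= n + j - i                          # n in the corner, +1 right, -1 down
            H *= (row - j) + (conj[j] - i) - 1      # hook length of cell (i, j)
    return F // H

assert su_dim((2, 1), 3) == 8       # Example 9: n(n^2 - 1)/3 with n = 3
assert su_dim((3,), 3) == 10        # Sym^3 C^3
assert su_dim((1, 1, 1), 3) == 1    # the determinant (singlet) of SU(3)
```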
E XAMPLE 9: This example and all the examples that follow concern the Lie group
SU(n). For [2, 1], there are three hooks, which produce the hook numbers
h11 = 3, h21 = h12 = 1, resulting in the hook product H = 3 · 1 · 1 = 3. The
numerator is F = n(n + 1)(n − 1). And so the dimension of the irrep [2, 1] of
SU(n) is F/H = n(n² − 1)/3. Note that [2, 1] = (1, 1, 0, …) in the Dynkin basis.
E XAMPLE 10: [1^r] = Λ^r C^n (r boxes in one column): H = 1 · 2 ⋯ r = r! and
F = n(n − 1) ⋯ (n − r + 1), and so the dimension of the corresponding irrep is

d[1^r] = C(n, r) = n! / (r!(n − r)!).

• [1] = C^n: d = n; in Dynkin basis [1] = (1, 0, …).
• [1^{n−1}] = [1]∗ = C^{n∗}: d = n; in Dynkin basis [1^{n−1}] = (…, 0, 1).
• [1^n] = Λ^n C^n: d = 1; in Dynkin basis [1^n] = (0, …, 0).
E XAMPLE 11: [r] = Sym^r C^n (r boxes in one row): F = n(n + 1) ⋯ (n + r − 1) and
H = r!. The dimension of the corresponding irrep is

d[r] = C(n + r − 1, r) = (n + r − 1)! / (r!(n − 1)!).
A.5. PRODUCTS OF REPRESENTATIONS 279

For example, for [3], d = n ( n + 1)( n + 2) /6, so that d = 10 for SU (3), and d = 56
for SU (6). Note that [ r ] = ( r, 0, . . . ) in the Dynkin basis.
E XAMPLE 12: [2, 1^{r−1}] with r > 1 (two-column pattern: one column of r boxes
plus one of one box): H = (r + 1)!/r and F = n(n + 1)(n − 1) ⋯ (n − r + 1) =
(n + 1)!/(n − r)!. This gives the dimension

d[2, 1^{r−1}] = r (n + 1)! / ((r + 1)!(n − r)!) = r · C(n + 1, r + 1).

In particular, if r = n − 1, we have the adjoint representation of SU(n), with
dimension d = n² − 1. Note that [2, 1^{r−1}] = (1, 0, …, 0, 1, 0, …) (r > 1), where
the Dynkin labels 1 are in the first and the r-th positions. 

A.5 Products of Representations


36. Every irreducible representation of SU ( n ) is contained in some tensor product
of representations of lower dimensions (cf. Chapter 7 Sect. 7.2). The problem of
decomposing the product of representations into a direct sum of irreps can be
solved by using a graphical method. As any multiple product can be built up in
steps from double products, we will need to examine only these.
37. As before, we begin by considering, rather than the group SU ( n) itself, the
general linear group Gn = GL ( n, C ). Let πλ ( Gn ) and πµ ( Gn ) be two irreps
of Gn , identified by their highest weights λ and µ, and πλ ( Gn ) ⊗ πµ ( Gn ) their
tensor product. We know from the preceding section that every irrep πλ of Gn
is uniquely associated to an irrep φ[λ] of a symmetric group Sr identified by a
partition [ λ ] of an integer r determined by the irrep πλ ( Gn ). Now, the ten-
sor product πλ ( Gn ) ⊗ πµ ( Gn ) (inner with respect to the general linear group)
uniquely corresponds to the tensor product φ[λ] ( Sr ) ⊗ φ[µ] ( Ss ) (outer with respect
to the symmetric group). And so the reduction πλ ( Gn ) ⊗ πµ ( Gn ) 7 → πν ( Gn )
of the product of irreps of Gn is completely equivalent to the decomposition
φ[λ] ( Sr ) ⊗ φ[µ] ( Ss) 7 → φ[ν] ( Sr+s ). This decomposition results in the relations:
M
π λ ( Gn ) ⊗ π µ ( Gn ) = a ( λ, µ, ν) πν( Gn ) , (A.33)
ν
M
φ[λ] ( Sr ) ⊗ φ[µ] ( Ss ) = a ( λ, µ, ν) φ[ν] ( Sr+s ) , (A.34)
[ν]

which specify the component irreps ν and their multiplicities a ( λ, µ, ν) in each


tensor product. These will be determined in the manner presently described.
We use a simplified notation in which φ[λ] is replaced by [ λ ], which refers to a
partition, a Young pattern, or an irrep according to the context.
38. Consider the tensor product [ λ ] ⊗ [ µ ] of two irreps of a symmetric group,
each being represented by a Young pattern. In the smaller of the two patterns,
say [ µ ], assign the integer j to every box in row j. Then take all the boxes labeled
1 from pattern [ µ ] and add them to pattern [ λ ], each in a different column in

all possible ways to form new tableaux. Next, take the boxes labeled 2 and add
them to every enlarged tableau, again each box in a different column. Repeat
with the remaining boxes of [ µ ]. The tableaux one finally obtains must satisfy the
following conditions:
(i) Every new diagram is a valid Young pattern.
(ii) In each column, an integer-label appears only once.
(iii) Reading the boxes from right to left and the top down, the number of
boxes carrying an integer (say, 1) must be greater than or equal to the number
of boxes labeled with a larger integer (say, 2) at every step.
The diagrams formed in this way correspond to the irreps of Sm+n in [ λ ] ⊗ [µ],
and the number of ways a Young pattern [ ν ] appears is the multiplicity a ( λ, µ, ν)
of the irrep [ ν ] of Sm+n . This automatically gives the decomposition (A.33) for the
irreps of the Lie group Gn .
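The rule can be implemented mechanically. The sketch below (hypothetical helper names, partitions written as tuples of row lengths) builds the labeled patterns step by step, at each step adding the boxes of one label as a horizontal strip (no two in a column), and finally keeps the patterns whose reading word satisfies rule (iii):

```python
from collections import Counter

def horizontal_strips(shape, k):
    """Shapes obtained from `shape` by adding k boxes, no two in the same column."""
    old = list(shape) + [0]                       # allow one new row at the bottom
    def rec(r, left, prev):
        if r == len(old):
            if left == 0:
                yield ()
            return
        cap = min(prev, old[r - 1] if r > 0 else old[r] + left, old[r] + left)
        for new_len in range(old[r], cap + 1):
            for rest in rec(r + 1, left - (new_len - old[r]), new_len):
                yield (new_len,) + rest
    return rec(0, k, 10 ** 9)

def lattice_ok(t):
    """Rule (iii): reading right-to-left, top down, #j never exceeds #(j-1)."""
    seen = Counter()
    for row in t:
        for label in reversed(row):
            if label:
                seen[label] += 1
                if label > 1 and seen[label] > seen[label - 1]:
                    return False
    return True

def lr_multiply(lam, mu):
    """Outer product [lam] ⊗ [mu] as a Counter {nu: multiplicity a(lam, mu, nu)}."""
    states = [tuple((0,) * r for r in lam)]       # 0 marks the boxes of [lam]
    for j, k in enumerate(mu, start=1):           # add the k boxes labeled j
        nxt = []
        for t in states:
            shape = [len(row) for row in t]
            for new in horizontal_strips(shape, k):
                rows = tuple((t[i] if i < len(t) else ())
                             + (j,) * (new[i] - (shape[i] if i < len(shape) else 0))
                             for i in range(len(new)) if new[i])
                nxt.append(rows)
        states = nxt
    out = Counter()
    for t in states:
        if lattice_ok(t):
            out[tuple(len(row) for row in t)] += 1
    return out

res = lr_multiply((2, 1), (2, 1))
assert sum(res.values()) == 8 and res[(3, 2, 1)] == 2    # matches (A.35) below
assert lr_multiply((1,), (1,)) == Counter({(2,): 1, (1, 1): 1})
```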
E XAMPLE 13: [2, 1] ⊗ [2, 1]. Insert digits j = 1, 2 in the boxes of the second pattern:
the two boxes of its first row receive the label 1, and the box of its second row
receives the label 2.
Following the rules, add the two boxes labeled 1 to the first pattern [2, 1], each
in a different column, in all possible ways. This produces four enlarged patterns:

(a) [4, 1], (b) [3, 2], (c) [3, 1²], (d) [2², 1].

Next, to each of these add the remaining and last box, labeled 2, discarding any
placement that violates rules (i)–(iii):

(a) [4, 1] → [4, 2] and [4, 1²] (the placement giving [5, 1] puts the 2 ahead of the
1's in the reading order and violates rule (iii));
(b) [3, 2] → [3²] and [3, 2, 1];
(c) [3, 1²] → [3, 2, 1]′ and [3, 1³];
(d) [2², 1] → [2³] and [2², 1²].

The two [3, 2, 1]'s carry different labelings and are counted separately; so we
obtain finally 8 patterns, leading to the following decomposition:
decomposition:
[2, 1] ⊗ [2, 1] = [4, 2] + [4, 1²] + [3²] + [3, 2, 1] + [3, 2, 1]′ + [2³] + [3, 1³] + [2², 1²] .   (A.35)
C OMMENTS . To check the result given in (A.34), the following formula relating
the dimensions can be used:

[(r + s)!/(r! s!)] d[λ](Sr) d[µ](Ss) = ∑_[ν] a(λ, µ, ν) d[ν](S_{r+s}) ,   (A.36)

where the dimensions of the irreps of the symmetric group are given by the hook
formula (A.15). The combinatorial factor on the LHS reflects the different ways
in which r objects are selected from the r + s objects to form the set acted on
by Sr, with the remaining s objects assigned to the second set, with the symmetry
of Ss. And so, in the above example, (6!/3!3!) × 2 × 2 = 80 for the product, and
9 + 10 + 5 + 2 × 16 + 10 + 5 + 9 = 80 for the sum. (The combinatorial factor
mentioned above is not necessary for the representations of Gn or its subgroups.)
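This check is easy to automate. The sketch below (an illustration, not the text's code) computes S_n dimensions by the hook-length formula and reproduces the 80 = 80 count for Example 13:

```python
from math import factorial

def dim_Sn(lam):
    """Dimension of the S_r irrep [lam] by the hook-length formula (A.15)."""
    conj = [sum(1 for row in lam if row > c) for c in range(lam[0])]
    H = 1
    for i, row in enumerate(lam):
        for j in range(row):
            H *= (row - j) + (conj[j] - i) - 1   # hook length of cell (i, j)
    return factorial(sum(lam)) // H

# LHS of (A.36) for [2,1] ⊗ [2,1]:  (6!/(3!·3!)) · d[2,1]^2 = 20 · 2 · 2
lhs = factorial(6) // (factorial(3) * factorial(3)) * dim_Sn((2, 1)) ** 2
# RHS: the eight component irreps of S_6 in (A.35), with [3,2,1] counted twice
rhs = sum(dim_Sn(nu) for nu in [(4, 2), (4, 1, 1), (3, 3), (3, 2, 1), (3, 2, 1),
                                (2, 2, 2), (3, 1, 1, 1), (2, 2, 1, 1)])
assert lhs == rhs == 80
```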
E XAMPLE 14: More examples of low-rank outer products can be worked out by
the same rules. (The original displays five further such identities as Young
diagrams; the diagrams are not reproduced here.)

The results for these simple cases together with the distributive and associative
properties of outer products can be used to calculate more complicated products
and their decomposition into direct sums containing irreps of higher dimensions.
39. The procedure just described also applies to representations of SU ( n ) pro-
vided that in the final results we remove the patterns having more than n rows,
and remember that columns having exactly n boxes in any patterns are equivalent
to one-dimensional representations. So, for example, if we consider the decomposition
of the product [2, 1] ⊗ [2, 1] for SU(3), we have, in terms of partitions
(Eq. (A.35) minus the last two terms, whose patterns have more than three rows):

[2, 1] ⊗ [2, 1] = [4, 2] + [4, 1²] + [3²] + [3, 2, 1] + [3, 2, 1]′ + [2³]
= [4, 2] + [3] + [3²] + [2, 1] + [2, 1]′ + [1³],

where in the second line columns of exactly three boxes have been removed.
The dimensions match: 8 × 8 = 27 + 10 + 10∗ + 8 + 8′ + 1.


E XAMPLE 15: For SU(n), the tensor product [1^r] ⊗ [1] (with r < n) decomposes
according to

[1^r] ⊗ [1] = [1^{r+1}] + [2, 1^{r−1}].

We can check the dimensions: on the LHS, we have the dimension of the tensor
product d[1^r] · d[1] = n d[1^r], where the dimension of [1^r] is known; whereas on
the RHS we have the sum of two dimensions, d[1^{r+1}] = d[1^r](n − r)/(r + 1) plus
d[2, 1^{r−1}] = d[1^r] r(n + 1)/(r + 1). We see that the two sides match. In particular, if
r = n − 1, the above result tells us that the product of the conjugate fundamental
representation with the fundamental representation contains the adjoint representation
of SU(n) together with a singlet: [1]∗ ⊗ [1] = [1^n] + [2, 1^{n−2}] = 1 ⊕ πAd
for any n = 2, 3, …. 
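The dimension count in this example can be confirmed for a range of n and r using the binomial expressions of Examples 10 and 12; a small sketch (not part of the text):

```python
from math import comb

# d[1^r] = C(n, r),  d[1^{r+1}] = C(n, r+1),  d[2, 1^{r-1}] = r · C(n+1, r+1)
for n in range(2, 9):
    for r in range(1, n):
        assert n * comb(n, r) == comb(n, r + 1) + r * comb(n + 1, r + 1)
```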
40. In conclusion, the techniques based on tensor algebra we have presented in
this Appendix are especially useful in studying representations of groups such as
SU ( n), SL ( n, C ), or Sp ( n, C ), and are complementary to the root-weight analysis
which we followed in this book (e.g. Chapter 7, Sect. 7.2). We can make the cor-
respondence of the notations used in the two approaches through the relations
(A.30)–(A.31).
Appendix B

Clifford Algebras
and Spin Representations

1. In 1870 William Clifford invented a mathematical structure in an attempt to


generalize the quaternions to higher dimensions. It is an associative algebra gen-
erated by n anticommuting square roots of − 1, so that it is R for n = 0, C for
n = 1, and H for n = 2. It finds applications in several fields, in particular, al-
gebra, physics, and information science. In this appendix we will explain how to
use it to construct the half-integral-weight representations of the special orthog-
onal Lie algebras. More can be found in [Ba], [FH] (Chapter 20), and [LM].
Definition B.1 (Clifford algebra). Given a finite-dimensional vector space V over the
field F (R or C) and a symmetric bilinear form B on V, the Clifford algebra Cl ( V, B)
is an associative algebra that contains and is generated by V, such that its multiplication
rule satisfies vv = B ( v, v) 1 for all v ∈ V, where 1 is the unit element.
2. The defining condition takes an equivalent form if we apply it to the sum u + v
of any two vectors u and v of V, resulting in the relation
uv + vu = 2 B ( u, v) 1. (B.1)
The combination uv + vu ≡ { u, v}, called the anticommutator of u and v, is then
the characteristic composition law for any two elements of the Clifford algebra.
So it comes as no great surprise that this algebraic structure could somehow con-
tain representations of the half-integral spin fields.
Since Cl( V, B) is an associative algebra, it can produce a Lie algebra with a
definition for the Lie bracket [ u, v ] = uv − vu. This property will be used when
we consider the orthogonal Lie algebra.
3. A Clifford algebra Cl( V, B ) can be constructed by first defining a tensor space T
generated by V on base field F and given by the direct sum of all allowed tensor
powers,
M
T(V) = V ⊗k = F ⊕ V ⊕ ( V ⊗ V ) ⊕ ( V ⊗ V ⊗ V ) ⊕ · · · , (B.2)
k≥0

283
284 APPENDIX B. CLIFFORD ALGEBRAS

and then taking the quotient of T(V) by the ideal in T(V) generated by all elements
v ⊗ v − B(v, v)1 with v ∈ V, where B(v, v) is the quadratic form associated with
the symmetric bilinear map B : V × V → F. This construction is similar to that
of the exterior algebra Λ*V, which is just the quotient of T(V) by the ideal
generated by all elements v ⊗ v in V⊗2. In fact, the exterior algebra Λ*V is
the situation in which B = 0, which we write as Λ*V = Cl(V, B = 0).
We will assume that B is non-degenerate. Non-degenerate bilinear forms over
a real vector space of dimension n can be put, by a change of basis, in a canonical
form with + 1 (p times) and − 1 (q = n − p times) on the diagonal. The case (+ +
+ , −) corresponds to Minkowski space, and is of special interest to relativistic
physics. But here, for our present purpose, we shall need to consider just the
positive definite (p = n, q = 0) or negative definite (p = 0, q = n) forms.
4. In the following examples, we assume F = R and V = R^n for some positive
integer n, and we write C(n, B) for the Clifford algebra Cl(R^n, B). Given an
orthonormal basis {e1, e2, …, en} of R^n, C(n, B) is generated by the ei's subject to
the condition ei ej + ej ei = 2Bδij, with B becoming now the signature of the
algebra: either B = +1 (positive definite) or B = −1 (negative definite), for every
i, j = 1, …, n.
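The multiplication rule just stated is simple to mechanize: a basis monomial ei1⋯eik can be stored as a sorted tuple of indices, with a sign accumulated while anticommuting factors into place. The sketch below (an illustrative helper, with the negative-definite case B = −1 as default) does this:

```python
def blade_mul(a, b, B=-1):
    """Product of basis monomials of C(n, B), given as sorted index tuples.

    Returns (sign, blade): e.g. e2·e1 = -e1e2, so blade_mul((2,), (1,)) == (-1, (1, 2)).
    """
    sign, out = 1, list(a)
    for i in b:
        k = len(out)
        while k > 0 and out[k - 1] > i:   # anticommute e_i past larger indices
            k -= 1
            sign = -sign
        if k > 0 and out[k - 1] == i:     # e_i e_i = B
            out.pop(k - 1)
            sign *= B
        else:
            out.insert(k, i)
    return sign, tuple(out)

# n = 2, B = -1 reproduces the quaternion relations of Example 2 below,
# with i = e1, j = e2, k = e1e2:
assert blade_mul((1,), (1,)) == (-1, ())        # i^2 = -1
assert blade_mul((1,), (2,)) == (1, (1, 2))     # ij = k
assert blade_mul((1, 2), (1, 2)) == (-1, ())    # k^2 = -1
assert blade_mul((2,), (1, 2)) == (1, (1,))     # jk = i
```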
E XAMPLE 1: n = 1. With the single basis vector e1 of V = R, C(1, B) is spanned
by 1 and e1, so that dim C(1, B) = 2. Any element of the algebra is of the form
x = x0 1 + x1 e1, where xi ∈ R. For B = +, C(1, +) ≅ R ⊕ R. For B = −, meaning
e1² = −1, we may identify the basis vector e1 with the imaginary unit i, and so
C(1, −) is just the algebra of the complex numbers: C(1, −) ≅ C.
E XAMPLE 2: n = 2. Here, V = R 2 = { x1 e1 + x2 e2 | x1, x2 ∈ R } , and C (2, B) is
generated by 1 and the two basis vectors e1, e2 ∈ V, so that any of its elements is
a real linear combination x0 1 + x1 e1 + x2 e2 + x3 e1e2 with xi ∈ R. It is now a non-commutative
associative algebra of dimension 4. For B = +, when ei² = +1,
if x, y ∈ V, then the product xy is a scalar dot product plus a cross product of
x, y; on the other hand, if x, y ∈ C(2, +), then xy is again an element of C(2, +).
As for B = − , that is, e21 = − 1, e22 = − 1, we may make the identification with
Hamilton’s numbers e1 = i, e2 = j, and e1 e2 = k, and recognize that C (2, −) is just
the algebra of the quaternions: C (2, −) ∼= H.
E XAMPLE 3: n = 3. Given a basis e1, e2, e3 of V = R³, obeying the multiplication
rules ei² = B and ei ej = −ej ei (i ≠ j), the Clifford algebra C(3, B) is spanned by the
following basis elements:

Subspace         Elements               k   c_{3,k}
R                1                      0   1
V                e1, e2, e3             1   3
V ⊗ V            e1e2, e1e3, e2e3       2   3
V ⊗ V ⊗ V        e1e2e3                 3   1

where k is the number of factors ei present in each basis element, and c_{n,k} is the
number of elements in C(n, B) for a given k. The dimension of this Clifford
algebra is simply the sum of c_{n,k} over k (n fixed at 3), so dim C(3, B) = 8. With
B = +, C(3, +) may be interpreted as an enlargement of the three-dimensional
SPIN LIE ALGEBRA 285

Euclidean space to include a scalar subspace, and oriented subspaces of one, two,
and three dimensions. With signature B = − (so that ei² = −1 for i = 1, 2, 3),
the four elements 1, e1, e2, e1e2 span a subalgebra isomorphic to C(2, −) ≅ H, and
the remaining four, (1, e1, e2, e1e2)e3, span a copy C(2, −) · e3 ≅ H. So we have
C(3, −) ≅ H ⊕ H. (Note, however, that the octonions are not a Clifford algebra,
since they are non-associative.)
Calculations (e.g. in [LM]) have identified all the Clifford algebras, as listed
in the following table for signature B = −, where C(n) ≡ C(n, −), and A[n]
means n × n matrices with entries in the algebra A, i.e. A[n] ≡ M(n, A).

n   C(n) ≡ C(n, −)        n    C(n) ≡ C(n, −)
0   R                     4    H[2]
1   C                     5    C[4]
2   H                     6    R[8]
3   H ⊕ H                 7    R[8] ⊕ R[8]
                        n+8    C(n) ⊗ R[16]
The last line means that C ( n + 8) consists of 16 × 16 matrices with entries in
the algebra C ( n). This period-8 behavior was discovered by É. Cartan in 1908.
5. Generally, if the vector space V = R n generating Clifford algebra Cl ( V, B) is
provided with an orthonormal basis { e1, e2 , . . . , en } , then e0 = 1 together with the
elements ε I = ei1 ei2 . . . eik with I = { 1 ≤ i1 < i2 < · · · < ik ≤ n } form a basis
for Cl ( V, B ) = C ( n, B). Hence, C ( n, B ) has dimension 2n . In fact, for a given k
there are cn,k = n!/k!(n − k ) ! products ε I (similar to exterior products of degree
k). Including e0 for which cn,0 = 1, we have the total number of basis elements
given by the sum ∑nk=0 cn,k = 2n , which is the dimension of C ( n, B).
The quadratic dependence (B.1) of the ideal v ⊗ v − B ( v, v)1 gives a Z/2Z
grading to the quotient space C ( n, B), with B = ± , splitting it into two parts
C ( n, B) = C + ( n, B ) ⊕ C − ( n, B ), (B.3)
where C + is spanned by e0 and the products of an even number of vectors in V,
and C− is spanned by the products of an odd number. C− cannot form an algebra,
since C−C− ⊂ C+; but C+ is a subalgebra of C(n, B), of dimension 2^{n−1}, and it is
this subalgebra that defines the spin group.
6. Clifford algebras hold a special interest for us by the fact that they contain
as subalgebras the spin Lie algebras spin( n ) associated to the simply-connected
spin Lie groups Spin( n ) which provide the double covers of the (non simply-
connected) special orthogonal Lie groups SO ( n); and so to understand the rep-
resentations of the latter group, it is useful to study those of the former. In order
to see this, consider the Clifford algebra C ( n) of signature B = − generated by
e1, . . . , en satisfying the (anti-commutation) relations
ei e j + e j ei = − 2 δij; i, j = 1, . . . , n. (B.4)
The quadratic products eij ≡ ei ej with i ≠ j obey the commutation relations

[½eij, ½ekl] = δjl ½eik − δjk ½eil − δil ½ejk + δik ½ejl .   (B.5)

Therefore, the set { eij = ei e j | i 6 = j } is closed under the bracket operation, and
forms a vector subspace of C ( n ), which is identified with the spin Lie algebra
spin( n ). Note also that eij acts on any ek to give
[½eij, ek] = δik ej − δjk ei .   (B.6)
On the other hand, we know that the special orthogonal Lie algebra so( n )
admits as a basis n × n real matrices Lij , with i < j, which generate rotations in
the i-j plane, and satisfy commutation relations equivalent to (B.5)—see Sec. 3.6.5
of Chapter 3. So, we have isomorphism of the Lie algebras spin( n ) and so( n )
under ½eij ↔ Lij. To get the simply-connected spin Lie group Spin(n), we just
need to exponentiate spin(n). In particular, for the subalgebra spanned by a given
½eij, we may use the relation (½eij)² = −¼ to obtain the exponential exp(θ ½eij) in
a closed form (e0 = 1):

exp(θ ½eij) = e0 cos(θ/2) + eij sin(θ/2) .   (B.7)
With θ ranging over [0, 4π], this gives a U(1) subgroup of the spin group Spin(n).
Noting that the RHS of this equation is of the form a e0 + b ei ej with a² + b² = 1,
and is an element in C+(n), and that exp spin(n) can contain only even powers of
the ei's, we have the general definition for the spin group:

Spin( n ) = s ∈ C + ( n ) | svs−1 ∈ R n for all v ∈ R n ; s s̄ = 1 , (B.8)

where s̄ is the conjugate of s. The conjugation rules for the ei's are given by

ē0 = e0,   ēi = −ei,   (ei1 ⋯ eik)‾ = (−1)^k eik ⋯ ei1 ,   1 ≤ i1 < ⋯ < ik ≤ n.

Relation (B.8) tells us that Spin( n ) consists of invertible normalized elements


s of C + ( n ) that leave the generating vector space V = R n invariant under conju-
gation by s, meaning that sVs−1 = V.
Leaving aside the trivial case of Spin(1) = {− 1, 1} ∼ = Z/2Z, let us consider a
few cases of lower dimensions to illustrate the structures defined by (B.8).
E XAMPLE 4: Spin(2) is a circle, double covering SO(2). The Clifford algebra C(2) is
generated by 1, e1, e2, and e1e2, satisfying the relations e1² = e2² = −1, e1e2 = −e2e1.
C+(2) consists of all elements of the form r = a1 + b e1e2, where a, b ∈ R; its
conjugate is r̄ = a1 − b e1e2, which equals r⁻¹ when r is normalized: r r̄ = a² + b² =
1. Now, for any r ∈ C+(2) and any v ∈ C−(2), we have rvr⁻¹ ∈ C−(2), which
implies rvr⁻¹ ∈ R² since the only elements of C−(2) are those in R². Then Spin(2)
consists of those r ∈ C+(2) with norm one, r r̄ = 1.
Now the complex numbers z = a + bi normed to one, | z |2 = a2 + b2 = 1,
represent U(1). Upon mapping 1 7 → 1, e1e2 7 → i we have r 7 → z, and thus Spin(2)
is isomorphic with U(1), the unit circle.
SPIN REPRESENTATIONS 287
For any r = a1 + b e1e2 in C⁺(2), with a = cos θ, b = sin θ and 0 ≤ θ ≤ 2π,
and any v = x e1 + y e2 in R², we have, on application of the Clifford algebraic
rules,

    rv = (ax − by) e1 + (bx + ay) e2
       = [(cos θ)x − (sin θ)y] e1 + [(sin θ)x + (cos θ)y] e2.              (B.9)

On the other hand,

    rvr⁻¹ = [(a² − b²)x − 2aby] e1 + [2abx + (a² − b²)y] e2
          = [(cos 2θ)x − (sin 2θ)y] e1 + [(sin 2θ)x + (cos 2θ)y] e2.       (B.10)

If r is considered as a rotation in SO(2), as in (B.9), it rotates the vector v ∈ R² by
an angle θ ranging from 0 to 2π. But as an element of Spin(2), as in (B.10), it
acts on v by rotating it through an angle 2θ ranging over [0, 4π]. So Spin(2) is a
double cover of SO(2).
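The double covering can be checked numerically in a matrix model of C(2). The 2 × 2 complex realization below is a hypothetical choice made for illustration (any pair of anticommuting matrices squaring to −1 would do); the script verifies that conjugation by r, as in (B.10), rotates a vector by 2θ.

```python
import numpy as np

# A matrix model of C(2): two anticommuting "units" with e1^2 = e2^2 = -1.
# (This 2x2 realization is our own illustrative choice.)
e0 = np.eye(2, dtype=complex)
e1 = np.array([[0, 1], [-1, 0]], dtype=complex)
e2 = np.array([[0, 1j], [1j, 0]], dtype=complex)
assert np.allclose(e1 @ e1, -e0) and np.allclose(e2 @ e2, -e0)
assert np.allclose(e1 @ e2, -(e2 @ e1))

theta, x, y = 0.7, 1.3, -0.4
r = np.cos(theta) * e0 + np.sin(theta) * (e1 @ e2)   # element of Spin(2), cf. (B.7)
v = x * e1 + y * e2                                  # a vector of R^2 inside C^-(2)

# Conjugation r v r^{-1} rotates (x, y) by 2*theta, as in (B.10):
w = r @ v @ np.linalg.inv(r)
xp = np.cos(2 * theta) * x - np.sin(2 * theta) * y
yp = np.sin(2 * theta) * x + np.cos(2 * theta) * y
assert np.allclose(w, xp * e1 + yp * e2)
print("conjugation by r rotates v through 2*theta")
```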
E XAMPLE 5: Spin(3) ∼ = SU(2), and the spin representation is the fundamental
representation of SU(2).
Let e1, e2, e3 be a basis of V = R 3 satisfying (B.4) and generating the basis
1, e1, e2, e3, e12, e23, e31, e123 for C (3), where eij = ei e j and e123 = e1 e2 e3. The
definition of a spin group tells us that Spin(3) consists of those elements r of
C⁺(3) (spanned by 1, e12, e23, e31) that satisfy r r̄ = 1 and rvr⁻¹ ∈ V for all v ∈ V. It
turns out in fact that the two conditions r ∈ C⁺(3) and r r̄ = 1 suffice to define
Spin(3), because they also imply that rvr⁻¹ ∈ R³ for all v ∈ R³.
(To prove this, note that r⁻¹ = r̄ and v̄ = −v, and that rvr⁻¹ ∈ C⁻(3), so we may
write rvr⁻¹ = u + a e123, where u ∈ R³ and a ∈ R. Taking conjugates, the
LHS gives (rvr⁻¹)‾ = (r⁻¹)‾ v̄ r̄ = −rvr⁻¹ = −u − a e123, while the RHS gives
(u + a e123)‾ = −u + a e123, since ē123 = e123. Comparing, a = 0, and hence rvr⁻¹ ∈ R³.)
Let i, j, k be the basic elements of H (so that i² = j² = k² = −1 and ij = k),
and SL(1, H) the special linear group of quaternions of order one (see Problem
4.4 in Chapter 4). Then the correspondences e12, e23, e31 ↔ i, j, k show that Spin(3) is
isomorphic with SL(1, H), and so with SU(2) as well. It follows that the fundamental
representation of SU(2) is also that of Spin(3). □
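The isomorphism can be made concrete in a matrix model: below we take the hypothetical realization e_k = iσ_k (our own choice, consistent with (B.4)) and verify that e12, e23, e31 obey the quaternion relations.

```python
import numpy as np

# Matrix check that the even elements e12, e23, e31 of C+(3) satisfy the
# quaternion relations of i, j, k. (The realization e_k = i*sigma_k is ours.)
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
e1, e2, e3 = 1j * s1, 1j * s2, 1j * s3     # e_k^2 = -1 and the e_k anticommute

i_, j_, k_ = e1 @ e2, e2 @ e3, e3 @ e1     # e12 <-> i, e23 <-> j, e31 <-> k
m1 = -np.eye(2)
assert np.allclose(i_ @ i_, m1) and np.allclose(j_ @ j_, m1) and np.allclose(k_ @ k_, m1)
assert np.allclose(i_ @ j_, k_)            # ij = k
assert np.allclose(i_ @ j_ @ k_, m1)       # ijk = -1
print("e12, e23, e31 realize the quaternion units")
```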
7. Representations of so(n, C). In Chapter 7 we have seen that once the standard
representation ϕ1 (with highest weight ω1 = ε1) of the Lie algebra so(2ℓ + 1, C)
(or bℓ) is determined, all other fundamental representations (except one) can be
obtained from it as exterior powers ϕr = Λ^r ϕ1, with their highest weights
ω2, ..., ω_{ℓ−1} lying in ZΛ. The missing irreducible representation ϕℓ, associated
to the short-root node in the Dynkin diagram Bℓ, has the highest weight ωℓ =
1/2(ε1 + ··· + εℓ) and, by Weyl symmetry, general weights 1/2(±ε1 ± ··· ± εℓ),
and so is of dimension 2^ℓ. We shall denote this rep S = ϕℓ. See Fig. B.1.
Similarly, if ϕ1 denotes the standard representation of dℓ = so(2ℓ, C), all except
two of the other fundamental representations have integral weights, and can be
constructed by tensoring, ϕr = Λ^r ϕ1, with r = 2, ..., ℓ − 2. The two exceptional
irreps ϕ_{ℓ−1} and ϕℓ, which cannot be constructed in that way, are associated to
the two nodes that branch off one end of the Dynkin diagram Dℓ. The irrep ϕ_{ℓ−1}
has highest weight ω_{ℓ−1} = 1/2(ε1 + ε2 + ··· + ε_{ℓ−1} − εℓ), with its general weights
given by 1/2 Σ_{i=1}^ℓ ηi εi, where ηi = ±1 and η ≡ Π_{i=1}^ℓ ηi = −1. On the other hand, ϕℓ,
which corresponds to the fundamental weight ωℓ = 1/2(ε1 + ε2 + ··· + ε_{ℓ−1} + εℓ),
has weights of the same form, but with η ≡ Π ηi = +1. One sees then that
both ϕ_{ℓ−1} and ϕℓ are of dimension 2^{ℓ−1}; they will be denoted S− and S+.
We will now consider these special representations (called spin representations,
or, more exactly, fundamental spin representations ) of so( n, C ) on complex vector
spaces from the point of view of (complexified) Clifford algebra and spin Lie
algebra.
[Figure B.1 displays the Dynkin diagrams of Bℓ and Dℓ, each node labeled by the
corresponding fundamental representation: ϕ1, ∧², ..., ∧^{ℓ−1}, S for Bℓ; and
ϕ1, ∧², ..., ∧^{ℓ−2} together with the two branch nodes S−, S+ for Dℓ.]

Figure B.1: Fundamental representations of the orthogonal Lie algebras bℓ =
so(2ℓ + 1, C) and dℓ = so(2ℓ, C). Notation: ∧^k = Λ^k ϕ1.
8. Consider a complex 2^ℓ-dimensional vector space, the type of space where the
spin representations of orthogonal Lie algebras live. It is useful to treat it as a
tensor product of ℓ two-dimensional spaces: (C²)^⊗ℓ = C² ⊗ C² ⊗ ··· ⊗ C², with
typical vectors v = |a, b, ...⟩ = |a⟩ ⊗ |b⟩ ⊗ ··· . An arbitrary matrix on (C²)^⊗ℓ can
be formed from the 2 × 2 identity matrix σ0 = diag[1, 1] and the Pauli matrices
σi (with i = 1, 2, 3 and σ3 = diag[1, −1] by convention). We will write σ_i^j for
the Pauli matrix σi that acts on the j-th subspace C² in (C²)^⊗ℓ. Then define the
following tensor products of ℓ matrices σµ (µ = 0, 1, 2, 3) (cf. [Ge] p. 210):
    γ1 ≡ σ1 ⊗ 1 ⊗ ··· ⊗ 1 = σ_1^1,
    γ2 ≡ σ2 ⊗ 1 ⊗ ··· ⊗ 1 = σ_2^1,
    γ3 ≡ σ3 ⊗ σ1 ⊗ 1 ⊗ ··· ⊗ 1 = σ_3^1 σ_1^2,
    γ4 ≡ σ3 ⊗ σ2 ⊗ 1 ⊗ ··· ⊗ 1 = σ_3^1 σ_2^2,
    ... ,
    γ_{2ℓ−3} ≡ σ3 ⊗ ··· ⊗ σ3 ⊗ σ1 ⊗ 1 = σ_3^1 ··· σ_3^{ℓ−2} σ_1^{ℓ−1},
    γ_{2ℓ−2} ≡ σ3 ⊗ ··· ⊗ σ3 ⊗ σ2 ⊗ 1 = σ_3^1 ··· σ_3^{ℓ−2} σ_2^{ℓ−1},
    γ_{2ℓ−1} ≡ σ3 ⊗ ··· ⊗ σ3 ⊗ σ1 = σ_3^1 ··· σ_3^{ℓ−1} σ_1^ℓ,
    γ_{2ℓ}   ≡ σ3 ⊗ ··· ⊗ σ3 ⊗ σ2 = σ_3^1 ··· σ_3^{ℓ−1} σ_2^ℓ,
    γ_{2ℓ+1} ≡ σ3 ⊗ ··· ⊗ σ3 ⊗ σ3 = σ_3^1 ··· σ_3^{ℓ−1} σ_3^ℓ.
Making use of the defining relations of the Pauli matrices, σi σj = δij + i ε_kij σk
(summed over k), with i, j, k ranging over the values 1, 2, 3, one can prove that the γi's satisfy the
anticommutation relations

    γi γj + γj γi = 2 δij,   i, j = 1, 2, ..., 2ℓ + 1.                     (B.11)
And, keeping in mind (B.4), we see (from ei ↦ iγi) that the γi's are the generators
of a complexified Clifford algebra, which yields the spin representations we are
considering.
Eq. (B.11) indicates that they generalize to higher dimensions the matrices Dirac
introduced in relativistic space-time quantum mechanics, in which γ_{2ℓ+1} =
(−i)^ℓ γ1 γ2 ··· γ_{2ℓ−1} γ_{2ℓ} plays a role comparable to that of Dirac's γ5. We shall now
see that these matrices γi, defined on (C²)^⊗ℓ, provide just the natural framework for
studying the spin representations of the orthogonal Lie groups.
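The tensor-product recipe above is easy to code. The following NumPy sketch (the helper name `gammas` is our own) builds the 2ℓ + 1 matrices for ℓ = 3 and verifies the anticommutation relations (B.11).

```python
import numpy as np
from functools import reduce

# Build the 2*ell+1 gamma matrices on C^(2^ell) by the tensor-product recipe
# given in the text; 'gammas' is our own helper name.
s0 = np.eye(2, dtype=complex)
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

def gammas(ell):
    gs = []
    for a in range(ell):                       # gamma_{2a+1}, gamma_{2a+2}
        for s in (s1, s2):
            gs.append(reduce(np.kron, [s3] * a + [s] + [s0] * (ell - a - 1)))
    gs.append(reduce(np.kron, [s3] * ell))     # gamma_{2*ell+1}
    return gs

g = gammas(3)                                  # seven 8 x 8 matrices
d = g[0].shape[0]
for i, gi in enumerate(g):
    for j, gj in enumerate(g):
        assert np.allclose(gi @ gj + gj @ gi, 2 * (i == j) * np.eye(d))  # (B.11)
print("(B.11) verified for ell = 3")
```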
Now, given n γ-matrices subject to the anticommutation relations (B.11), we
define the antisymmetric quadratic products

    Mij ≡ (1/2i) γi γj = (1/4i) [γi, γj],   (i ≠ j; i, j = 1, ..., n),     (B.12)

(of which n(n − 1)/2 are independent), which obey the following bracket relations
characteristic of SO(n):

    [Mij, Mkm] = i (δik Mjm − δjk Mim − δim Mjk + δjm Mik).                (B.13)
In addition, the commutation relations

    [Mij, γk] = i (δik γj − δjk γi)                                        (B.14)

mean that the γi's are a set of tensors that transform as the n-dimensional vector
representation of SO(n).
9. Spin representation S of bℓ = so(2ℓ + 1, C). The Lie algebra bℓ is provided
with a basis of ℓ(2ℓ + 1) elements Mij = (1/2i) γi γj, with 1 ≤ i < j ≤ 2ℓ + 1.
Noting that (B.13) implies [Mij, Mkm] = 0 for distinct indices i, j, k, m, we choose the
Cartan subalgebra h as {M_{1,2}, M_{3,4}, ..., M_{2ℓ−1,2ℓ}} ≡ {H1, H2, ..., Hℓ}. Explicitly,
we have

    Hi ≡ M_{2i−1,2i} = (1/2i) γ_{2i−1} γ_{2i} = (1/2) σ_3^i,   1 ≤ i ≤ ℓ.   (B.15)

The equations Hi vi = µ(Hi) vi on C² admit as solutions vi = |1/2 ηi⟩ with
eigenvalues µ(Hi) = 1/2 ηi and ηi = ±1. So the weight vectors in the irrep S
on (C²)^⊗ℓ wrt the CSA h are tensor products of the form v = |1/2 η1, 1/2 η2, ..., 1/2 ηℓ⟩
with the weight (1/2 η1, 1/2 η2, ..., 1/2 ηℓ), where ηi = +1 or −1. Clearly, S is 2^ℓ-
dimensional and has highest weight (1/2, 1/2, ..., 1/2) in the given CSA basis.
10. Spin representations S± of dℓ = so(2ℓ, C). Here we take as a basis
the ℓ(2ℓ − 1) matrices Mij, with 1 ≤ i < j ≤ 2ℓ, satisfying the commutation
relations (B.13). In the usual Killing–Cartan analysis, we next specify a Cartan
subalgebra by the set of elements {H1, H2, ..., Hℓ}, where Hj = M_{2j−1,2j} =
(1/2) σ_3^j and j = 1, ..., ℓ. And so the 2^ℓ-dimensional representation we obtain con-
sists of the joint eigenvectors of {H1, H2, ..., Hℓ}, namely, the weight vectors
v = |1/2 η1, 1/2 η2, ..., 1/2 ηℓ⟩ with weights (±1/2, ±1/2, ..., ±1/2) (with ℓ entries)
wrt the chosen CSA basis. But this time there is a nontrivial element, namely,
γ2`+1 = σ31 · · · σ3`−1 σ3` (which does not participate in the construction of the set
Mij ), which by (B.14) commutes with all other elements, [ Mij , γ2`+1 ] = 0 for
all Mij ∈ so(2`, C ). It acts on any weight vector v = | 1/2η1 , 1/2η2 , . . . , 1/2η` i
as a multiplication by a scalar η = η1 · · · η` , where ηi = ± 1. In other words,
we have γ2`+1 v = η v, where η is equal to − 1 if the number of minus signs
is odd, and to +1 if it is even. (γ_{2ℓ+1} is comparable to the γ5 of the Dirac
algebra, and η to chirality.) It follows that this 2^ℓ-dimensional representation
is reducible, and its space may be split into disjoint subspaces of equal size via
v = 1/2(1 − γ_{2ℓ+1}) v + 1/2(1 + γ_{2ℓ+1}) v, corresponding to the decomposition of the
representation into two irreducible representations, which are the fundamental
spin representations S− and S+ of so(2ℓ, C), each of dimension 2^{ℓ−1}.
EXAMPLE 6: so(5) (ℓ = 2). The 5 generators of the Clifford algebra are γ1 = σ_1^1,
γ2 = σ_2^1, γ3 = σ_3^1 σ_1^2, γ4 = σ_3^1 σ_2^2, and γ5 = σ_3^1 σ_3^2. The basis elements of the Lie algebra
so(5) are {Mij, 1 ≤ i < j ≤ 5}, and for its CSA we take H1 = M12 = (1/2)σ_3^1 and
H2 = M34 = (1/2)σ_3^2. Call |±⟩ the eigenvectors of σ3 on C² with eigenvalues ±1.
Then the fundamental spin representation of so(5) is 4-dimensional, consisting of
the eigenvectors of (H1, H2) on (C²)^⊗2 given by |++⟩, |+−⟩, |−+⟩, and |−−⟩,
with respective weights (1/2, 1/2), (1/2, −1/2), (−1/2, 1/2), and (−1/2, −1/2).
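Example 6 can be verified numerically; the sketch below (variable names ours) builds the ℓ = 2 gamma matrices, checks (B.15), and confirms the four weights (±1/2, ±1/2).

```python
import numpy as np

# ell = 2: check (B.15), H_i = (1/2i) gamma_{2i-1} gamma_{2i} = (1/2) sigma_3^i,
# and the four weights (+-1/2, +-1/2) of the spin representation of so(5).
s0 = np.eye(2, dtype=complex)
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

g1, g2 = np.kron(s1, s0), np.kron(s2, s0)
g3, g4 = np.kron(s3, s1), np.kron(s3, s2)
H1 = (g1 @ g2) / 2j
H2 = (g3 @ g4) / 2j
assert np.allclose(H1, np.kron(s3, s0) / 2)        # H1 = (1/2) sigma_3^1
assert np.allclose(H2, np.kron(s0, s3) / 2)        # H2 = (1/2) sigma_3^2

up, dn = np.array([1., 0.]), np.array([0., 1.])    # sigma_3 eigenvectors |+>, |->
for eta1, v1 in ((1, up), (-1, dn)):
    for eta2, v2 in ((1, up), (-1, dn)):
        v = np.kron(v1, v2)                        # weight vector |eta1/2, eta2/2>
        assert np.allclose(H1 @ v, 0.5 * eta1 * v)
        assert np.allclose(H2 @ v, 0.5 * eta2 * v)
print("weights (+-1/2, +-1/2) confirmed")
```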
EXAMPLE 7: so(6) (ℓ = 3). The generators of the Clifford algebra are γ1 = σ_1^1,
γ2 = σ_2^1, γ3 = σ_3^1 σ_1^2, γ4 = σ_3^1 σ_2^2, γ5 = σ_3^1 σ_3^2 σ_1^3, and γ6 = σ_3^1 σ_3^2 σ_2^3. They serve
to determine the 15 basis elements of so(6), namely {M12, ..., M16, ..., M56}; of
these, Hi = M_{2i−1,2i} = (1/2)σ_3^i (i = 1, 2, 3) are selected to define the CSA. Then
the resulting representation on (C²)^⊗3 consists of 2³ = 8 weight vectors |±, ±, ±⟩
of weights (±1/2, ±1/2, ±1/2), divided into 4 vectors |+++⟩, |+−−⟩, |−+−⟩,
and |−−+⟩ of chirality η = +1 (representation S+); and 4 vectors |++−⟩,
|+−+⟩, |−++⟩, and |−−−⟩ of chirality η = −1 (representation S−). □
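The chirality split of Example 7 can likewise be checked numerically (a short NumPy sketch with our own variable names):

```python
import numpy as np
from functools import reduce

# ell = 3 (so(6)): gamma_7 = sigma_3 x sigma_3 x sigma_3 acts on each weight
# vector with eigenvalue eta = eta1*eta2*eta3, splitting C^8 into S+ and S-.
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
g7 = reduce(np.kron, [s3, s3, s3])

up, dn = np.array([1., 0.]), np.array([0., 1.])    # sigma_3 eigenvectors |+>, |->
plus, minus = [], []
for etas in [(a, b, c) for a in (1, -1) for b in (1, -1) for c in (1, -1)]:
    v = reduce(np.kron, [up if e == 1 else dn for e in etas])
    eta = etas[0] * etas[1] * etas[2]
    assert np.allclose(g7 @ v, eta * v)            # gamma_7 v = eta v
    (plus if eta == 1 else minus).append(etas)
assert len(plus) == 4 and len(minus) == 4          # dim S+- = 2^(3-1) = 4
print("S+ labels:", plus)
print("S- labels:", minus)
```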
Q. Ho-Kim. Group Theory: A Physicist’s Primer.

Appendix C

Manifolds

A manifold is a topological space that locally resembles Euclidean space, though
globally it may not. This means each point of an n-dimensional manifold has
a neighborhood that is in one-to-one correspondence with an area in Euclidean
space of the same dimension. Points in a region of a manifold can be projected
onto the Euclidean ‘plane’, producing a map, or chart; several such charts, mak-
ing up an atlas for the manifold, may be needed to cover the whole manifold.
When a region appears in two overlapping projections, the two representations
do not exactly coincide and a transformation is needed to convert one to the other.
Charts of manifolds are comparable to representations of curves or surfaces lying
on some n-dimensional space that one gives in terms of parameters, which are
then interpreted as coordinates in some Euclidean space R m .
BASIC DEFINITIONS. We give first some basic definitions and properties of sets, maps,
and mappings. As a concrete example for the following discussion, think of the map
f : Rᵐ → Rⁿ, with f : x ↦ y, which takes an m-tuple (x1, ..., xm) in Rᵐ to a unique
n-tuple (y1, ..., yn) in Rⁿ such that yi = f_i(x1, ..., xm) are uniquely defined functions
of x1, ..., xm. We will treat Euclidean spaces and their subspaces as sets, a concept more
adaptable to other structures.
1. Let X, Y be any sets. A function f maps elements x from its domain X to ele-
ments y from its codomain Y. This is written variously as f(x) = y, or f : x ↦ y,
or f : X → Y, where x, y, f are abstract elements possibly in multi-dimensional
spaces. Injections, surjections, and bijections are classes of functions (or maps)
distinguished by the manner in which their arguments (expressions from the do-
main) and their images (expressions from the codomain) are related (mapped)
to each other. Functions belonging to such classes are said to be, respectively,
injective, surjective, and bijective.
2. The function f : X → Y is injective if and only if for all x, x0 ∈ X, the relation
f ( x ) = f ( x0) implies x = x0 . Therefore, f is injective, or one-to-one, if each
element of the codomain Y is mapped to by at most one (i.e. 0 or 1) element of
the domain X. Alternatively f : X → Y is injective iff either X is empty, or f is
left-invertible; that is, there is a function g : f(X) → X such that g ◦ f = idX, the
identity on X. Symbolically we write |X| ≤ |Y|. If some images are associated
with more than one argument, f is said to be non-injective.
3. The function f : X → Y is surjective if and only if for all y ∈ Y, there is some
x ∈ X such that f ( x) = y. Therefore, f is surjective, or onto, if each element of
the codomain Y is mapped to by at least one (i.e. 1, 2, . . . ) element of the domain
X. Alternatively, f : X → Y is surjective iff it is right-invertible; that is, iff there is
a function g : Y → X such that f ◦ g = idY, the identity on Y. As a mnemonic, we
write |X| ≥ |Y|. If there are images without any associated arguments, f is said
to be non-surjective.
4. The function f : X → Y is bijective if and only if for all y ∈ Y, there is a unique
x ∈ X such that f ( x) = y. So f is bijective, or in one-to-one correspondence, iff each
element of the codomain Y is mapped to by exactly one (i.e. 1) element of the
domain X. Equivalently, f : X → Y is bijective iff it is invertible; that is, there is a
function g : Y → X such that, simultaneously, g ◦ f = idX and f ◦ g = idY . It is
both injective and surjective, and | X| = |Y |.
5. E XAMPLES to illustrate the terms injection, surjection, and bijection:
(1) Injective and non-surjective: f : R → R, x ↦ eˣ is injective because for any
given y = eˣ there exists a unique real x = ln y; and non-surjective because if y ≤ 0
there is no real x.
(2) Surjective and non-injective: R → R, x ↦ x³ − x = y. A cubic equation
always has at least one real root. (Graphically, a cubic must cross the x-axis at
least once, giving at least one real root.) In this example, if y = 0 we have
x = 0, ±1; for other values of y there may be one, two, or three real solutions for x.
(3) Bijective (injective and surjective): R> → R>, x ↦ x²; also its inverse
R> → R>, x ↦ √x. Another example: exp : R → R>, x ↦ exp x; also its inverse
ln : R> → R, x ↦ ln x. (Here and in the following, R> is the set of all positive
real numbers.)
(4) Non-injective and non-surjective: R → R, x ↦ y = sin(x), because y = 0
for x = nπ with n = 0, ±1, ...; and |y| > 1 has no corresponding x. Another
example is R → R : y = x².
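For finite sets, the three notions can be tested mechanically from the definitions in items 2–4; the helpers below (our own names) encode a function as a Python dict.

```python
# Illustrative finite-set checks of injectivity, surjectivity, and bijectivity.
# (is_injective, is_surjective, is_bijective are our own helper names.)
def is_injective(f, X, Y):
    return len({f[x] for x in X}) == len(X)        # distinct arguments -> distinct images

def is_surjective(f, X, Y):
    return {f[x] for x in X} == set(Y)             # every element of Y is hit

def is_bijective(f, X, Y):
    return is_injective(f, X, Y) and is_surjective(f, X, Y)

X, Y = {1, 2, 3}, {"a", "b", "c"}
f = {1: "a", 2: "b", 3: "c"}                       # a bijection: |X| = |Y|
g = {1: "a", 2: "a", 3: "b"}                       # neither injective nor surjective
assert is_bijective(f, X, Y)
assert not is_injective(g, X, Y) and not is_surjective(g, X, Y)
print("finite-set checks agree with the definitions")
```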
6. In Euclidean space R n (with the usual metric) an open ball centered at x ∈ R n
of radius r is the set of all points y ∈ R n such that | y − x| < r ; in other words, the
open ball is the interior of an n-sphere of radius r centered at x.
An open neighborhood of a point x in a metric space ( X, d ) is the set Uε ( x) =
{ y ∈ X| | x − y | < ε } . For example, in R 2 (with the usual metric) an open neigh-
borhood is an open disc (one not containing its boundary); in R 3 it is an open
ball.
A subset V of a metric space ( X, d ) is said to be open in X if for each element
x ∈ V there is an open neighborhood Uε ( x) centered at x that lies entirely in V.
(An open set V ⊂ R n is the interior of some ( n − 1)-dimensional closed surface
in R n .) Any metric space is an open subset of itself. The empty set is an open
subset of any metric space. In a discrete metric space (in which d ( x, y ) = 1 for
every x 6 = y) every subset is open. The union of an arbitrary number of open sets
is open. The intersection of finitely many open sets is open.
In the same topology, a set U ⊂ (X, d) is closed in X if every point outside U
has an open neighborhood disjoint from U. (Equivalently, U ⊂ X is closed iff its
complement, X\U, is open; or, iff it contains all its limit points.) It is possible for a
set to be neither open nor closed, e.g., the half-closed R-interval (0, 1].
7. The class of functions (possibly of several variables) f ∈ S having continuous
derivatives of all orders ≤ k is denoted C k ( S ). Functions possessing derivatives of
arbitrary order are called smooth functions or smooth maps, and form the class
C ∞ ( S ). A C ∞ ( S ) map is continuous and differentiable as many times as we want,
while a C 0( S ) map is continuous but not necessarily differentiable. The notations
are often simplified, writing for example C k rather than C k ( S ).
TOPOLOGICAL SPACES. Topological spaces are a class of spaces with specified struc-
ture which contain the familiar linear vector spaces and metric spaces, and also, more
importantly, the manifolds that define Lie groups.
8. Let T be a collection of subsets of a nonempty set X such that: (a) The empty
set ∅ and the set X are both in T; (b) The union of an arbitrary (finite or infinite)
number of elements of T belongs to T; (c) The intersection of a finite number of
elements of T belongs to T. Then we say that ( X, T ) (or simply X) is a topological
space with topology T, and the elements of T are, by definition, the open sets of
the topological space X.
9. E XAMPLES to illustrate different types of topologies and topological spaces:
(1) First a few simple examples: Let X = { a, b } and let T = { ∅, { a}, X }.
Then T is a topology (called the Sierpinski topology). Now let X = { 1, 2, 3} with
T = { ∅, {1} , { 1, 2}, X } and T 0 = { ∅, { 3}, {2, 3}, X }. Both T and T 0 are topologies
on X, defining topological spaces ( X, T ) and ( X, T 0 ).
(2) Let X = R, the real-number line. Define T to be the empty set ∅, all
intervals of the type a < x < b for any a, b, x ∈ R, and every subset that is the
union of them. Then, properties (a)–(c) are satisfied and ( X, T ) is a topological
space. Note that the intersection of all open intervals of the form − a < x < a
(a arbitrary) is a single point (x = 0) and so is not an open set. That’s why it is
necessary to limit the intersection to a finite number of open sets.
(3) Let X be a nonempty set and define T = {∅, X}; then T is a topology on
X, called the trivial (or indiscrete) topology.
(4) For an arbitrary nonempty set X, let T be the set of ∅ and all subsets of X,
then T is a discrete topology. It is the topology associated with the discrete metric.
Finite sets can have many topologies on them. A topology with many open sets
is called strong; one with few open sets is weak. The discrete topology is the
strongest topology on a set, while the trivial topology is the weakest.
(5) Let X = R n , and the open sets be ∅ and all open balls in R n . Then R n
together with this collection of open sets (called the standard topology, with open
sets being the usual open balls) becomes a topological space. Similarly one can
define the standard topology for Cn .
(6) Every metric space (X, d) together with the unions of open balls taken as
open sets of X is a topological space. A vector space R n (or Cn ), together with the
usual definition of the distance d ( x, y) = | x − y | between two vectors x and y, is
the standard example of a metric space.
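Axioms (a)–(c) of item 8 can be checked by brute force for finite topologies; the sketch below (with our own helper `is_topology`) confirms the two topologies of Example (1).

```python
from itertools import chain, combinations

# Brute-force check of axioms (a)-(c) for finite topologies.
# (is_topology is our own illustrative helper, not a library function.)
def is_topology(X, T):
    T = {frozenset(s) for s in T}
    X = frozenset(X)
    if frozenset() not in T or X not in T:
        return False                                # axiom (a)
    groups = chain.from_iterable(combinations(T, r) for r in range(1, len(T) + 1))
    for S in groups:
        if frozenset().union(*S) not in T:          # axiom (b): unions stay in T
            return False
        inter = S[0]
        for s in S[1:]:
            inter &= s
        if inter not in T:                          # axiom (c): finite intersections
            return False
    return True

X = {1, 2, 3}
assert is_topology(X, [set(), {1}, {1, 2}, X])      # T  of Example (1)
assert is_topology(X, [set(), {3}, {2, 3}, X])      # T' of Example (1)
assert not is_topology(X, [set(), {1}, {2}, X])     # {1} U {2} = {1, 2} is missing
print("topology axioms verified on the finite examples")
```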
10. Let (X, T) be a topological space. A set V ⊂ X is said to be a neighborhood of
a point x ∈ X if there is an open set O ⊂ V with x ∈ O (but V itself need not be
open). A topology T on X is Hausdorff if every pair of distinct elements x, y ∈
X has a pair of disjoint (non-intersecting) neighborhoods, that is to say, there
are neighborhoods V ( x) of x and V ( y) of y such that V ( x ) ∩ V ( y) = ∅. ( X, T )
is called a Hausdorff topological space if T is a Hausdorff topology. Actually,
almost all topological spaces in analysis are Hausdorff. It is in order to exclude
the peculiar exceptional cases that one requires Hausdorffness.
11. E XAMPLES of Hausdorff spaces:
(1) Every metric space is Hausdorff, because if x and y are distinct points,
their mutual distance is positive, and the open balls centered at x and y, each
with radius half this distance will be disjoint by the triangle inequality.
(2) Every vector space R n (or Cn ), together with the standard topology, is a
Hausdorff topological space.
(3) The topological space given by a set X of at least two elements equipped
with the trivial topology { ∅, X} is not Hausdorff.
CONTINUITY and HOMEOMORPHISM. When we say a function f ( x) is continuous
at x0 , we mean f ( x) is ‘close’ to f ( x0) if x is sufficiently ‘close’ to x0 . More precisely:
12. Given the map f : X → Y, where X, Y are topological spaces, we say that f
is continuous at a point x ∈ X if for every neighborhood V (y) ⊂ Y of y = f ( x)
there is a neighborhood U ( x) ⊂ X of x such that f (U ) ⊂ V. And f is said to be
continuous if it is continuous at every point x ∈ X.
13. Let X and Y be topological spaces; A ⊂ X and B ⊂ Y. A map f : A → B
that is continuous, bijective, and has a continuous inverse f −1 is called a home-
omorphism between A and B. We then say that A and B are homeomorphic.
Homeomorphism forms an equivalence relation on the class of all topological
spaces, because it satisfies reflexivity, symmetry, and transitivity. The resulting
equivalence classes are called homeomorphism classes.
R EMARKS: In algebra ‘homomorphism’ is a structure-preserving map between
two algebraic structures, such as groups, rings, or vector spaces; whereas the
term ‘homeomorphism’ in topology means a continuous bijective map from one
topological space to another, which has a continuous inverse; ‘diffeomorphism’ is
a differentiable bijection possessing a differentiable inverse; every diffeomorphism
is a homeomorphism, but the converse is not true.
14. E XAMPLES to illustrate continuity, homeomorphism, and diffeomorphism:
(1) Simple examples are provided by maps of the real line. The map f : R → R,
f(x) = x³, is continuous and bijective, and its inverse f⁻¹(y) = y^{1/3} is also
continuous; so f is a homeomorphism. It is not a diffeomorphism, however, because
f⁻¹ fails to be differentiable at y = 0. The map f : (0, 1) → (1, ∞) defined by
f(x) = 1/x, which stretches a finite interval to infinite length, is a homeomorphism
(and in fact a diffeomorphism, since x = 0 lies outside the domain).
(2) A circle S1 = {( x, y) ∈ R 2 | x2 + y2 = 1} and a square T = {( x, y) ∈
R 2 | | x| + | y | = 1} are homeomorphic (S1 ≈ T) by mapping f : S1 → T, with the
definition f(x, y) = (x/s, y/s) (s = |x| + |y|), which is continuous, bijective, and
has a continuous inverse, given by f⁻¹(x, y) = (x/t, y/t) (with t = √(x² + y²)).
Similarly, the sphere and the cube are topologically identical; but the sphere and
the torus are not.
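The homeomorphism of Example (2) between the circle and the square can be sampled numerically (function names are ours):

```python
import numpy as np

# Example (2): f maps the unit circle S^1 onto the unit square T, and
# f_inv (our name for f^{-1}) maps it back.
def f(p):
    x, y = p
    s = abs(x) + abs(y)
    return (x/s, y/s)

def f_inv(p):
    x, y = p
    t = (x*x + y*y) ** 0.5
    return (x/t, y/t)

for theta in np.linspace(0.1, 2*np.pi, 40):
    p = (np.cos(theta), np.sin(theta))              # a point on S^1
    q = f(p)
    assert np.isclose(abs(q[0]) + abs(q[1]), 1.0)   # image lies on the square T
    assert np.allclose(f_inv(q), p)                 # round trip returns to p
print("f and f_inv invert each other on 40 sample points")
```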
(3) A more subtle example is provided by the parameterized curve Γ defined
by γ(t) = (cos t, sin 2t) in R². It draws a figure ∞ self-intersecting at (0, 0), which
corresponds to γ(kπ/2) for all odd integers k (±1, ±3, ...). If we restrict t to the
interval I, given by −π/2 < t < 3π/2, the restriction of γ to I is an injective regular
curve γ I passing through (0, 0) at t = π/2. But it is not a homeomorphism from
I to Γ for the following reason: For any open interval ( π/2) − e < t < ( π/2) + e
where 0 < e < π, the neighborhood of (0, 0) = γ ( π/2) in R 2 necessarily contains
points from the other branch passing through (0, 0).
CHART and ATLAS. A chart of a subset of a set X is its coordinate system, and the col-
lection of all charts needed to map the whole X is called its atlas. So generally speaking,
given an arbitrary set X, a chart for X is a bijective map φ : V → O, where V ⊂ R n
is an open set in R n (n is a fixed natural number) and O ⊂ X is a subset of X, such
that φ ( V ) = O. It is a codification of O written in R n . For our purpose however
we require more from X.
15. Let X be a Hausdorff topological space. A chart for X is a homeomorphism
φ : V → O, where V ⊂ R n is an open set in R n and O ⊂ X is an open set in
X (meaning φ ( V ) = O and φ−1 (O ) = V are continuous and give a one-to-one
correspondence between points in O and in V). Such a chart may be denoted
{ φ, V }, or equivalently { φ−1, O } . See Fig. C.1.
Figure C.1: Charting maps φU : U → X and φV : V → Y of subsets U and V of the
set M onto X, Y ⊂ Rⁿ. Coordinates of charts of the intersection U ∩ V are related via the
transition map φV ◦ φU⁻¹. (Illustration from ncatlab.org/nlab - Introduction to Topology)
16. EXAMPLE of a chart: General coordinates on an open domain of Rⁿ.
Points x ∈ Rⁿ are given in the standard coordinates by (x1, ..., xn) (meaning the
xi are the components of x over the canonical basis vectors e1 = (1, 0, ..., 0), ...,
en = (0, ..., 0, 1)). A system of general (non-standard) coordinates (y1, ..., yn) is
defined on an open domain O ⊆ Rⁿ if the yi can be uniquely defined in terms of
the xj by differentiable functions yi = f_i(x1, ..., xn). In other words, dy = J dx
and, inversely, dx = J⁻¹ dy, where J = Mat[∂yj/∂xi] is the invertible Jacobi matrix.
If J is a constant invertible matrix, we have a linear coordinate system (which
works for the whole space R n ); if not, we have some sort of curvilinear (e.g.
polar, spherical, etc.) coordinates (which applies to open sets O ⊂ R n ). This
is a common example of a chart φ : R n → R n . Other examples of charts will be
given below.
17. An n-dimensional smooth atlas of a Hausdorff topological space X is a col-
lection of charts φα : Vα → Oα (α ∈ I, a set of labels), where Oα ⊂ X and Vα ⊂ Rⁿ
(n a natural number fixed for all α ∈ I) are open sets, such that (1) the collection
{Oα} covers the whole of X, X = ∪_{α∈I} Oα; and (2) there is a smooth transition
between overlaps, i.e., for each pair α, β ∈ I the map φα⁻¹ ◦ φβ is smooth from the
open set φβ⁻¹(Oα ∩ Oβ) ⊂ Rⁿ to the open set φα⁻¹(Oα ∩ Oβ) ⊂ Rⁿ.
A chart φ : V → O gives a one-to-one correspondence between x ∈ O ⊂ X
and the point (x1, ..., xn) ∈ V ⊂ Rⁿ defined by the maps φ and φ⁻¹, such that
x = φ(x1, ..., xn) and (x1, ..., xn) = φ⁻¹(x). So a chart on X is simply a lo-
cal coordinate system on X (local means covering only O ⊂ X). Consider now
non-disjoint sets Oα, Oβ ⊂ X. To the intersection Oα ∩ Oβ ⊂ X correspond two
subsets φα⁻¹(Oα ∩ Oβ) ⊂ Vα and φβ⁻¹(Oα ∩ Oβ) ⊂ Vβ, where Vα, Vβ ⊂ Rⁿ. Any
point x ∈ Oα ∩ Oβ has two coordinate representations (x_α¹, ..., x_αⁿ) = φα⁻¹(x) and
(x_β¹, ..., x_βⁿ) = φβ⁻¹(x), related by a smooth invertible map (called the transition map)
describing the change of coordinates between the charts φα and φβ:

    φα⁻¹ ◦ φβ : φβ⁻¹(Oα ∩ Oβ) → φα⁻¹(Oα ∩ Oβ),   (x_β¹, ..., x_βⁿ) ↦ (x_α¹, ..., x_αⁿ);

and its inverse:

    φβ⁻¹ ◦ φα : φα⁻¹(Oα ∩ Oβ) → φβ⁻¹(Oα ∩ Oβ),   (x_α¹, ..., x_αⁿ) ↦ (x_β¹, ..., x_βⁿ).
18. Just as there exist in general different equivalent ways to parameterize a sur-
face, so there are different equivalent charts on the space X. The maps φα : Vα → Oα
and φβ : Vβ → Oβ are said to be compatible charts (1) if both φα⁻¹(Oα ∩ Oβ) and
φβ⁻¹(Oα ∩ Oβ) are open sets in Rⁿ; and (2) if the mutually inverse changes of coor-
dinates (x_α¹, ..., x_αⁿ) ↔ (x_β¹, ..., x_βⁿ) are given by smooth functions. Thus, one can
say briefly that a smooth atlas is a collection of pairwise compatible charts. (Note that we
require only continuity for a chart, but need C∞-differentiability, via the condition
of 'compatible charts', to define a 'smooth atlas'.) This equivalence relation extends
to atlases: Two n-dimensional smooth atlases A1 and A2 on the same topological
space X are said to be equivalent if any chart from A1 and any chart from A2 are
compatible.
In summary, a chart can be thought of as a coordinate system on some open
set; and an atlas, as a collection of charts that are smoothly related on their overlaps.
19. EXAMPLES of charts and atlases:
(1) Circle S¹ in the polar angle θ. A circle of unit radius in R² is usually defined
by the equation x² + y² = 1 in standard coordinates (x, y) ∈ R². It can also be
specified as the set S¹ = {(x, y) ∈ R² | x² + y² = 1}. Alternatively, one may use the
polar angle θ, defined up to a multiple of 2π. It is made a single-valued (injective)
function of x, y by the restriction 0 < θ < 2π, thus excluding the point (x, y) = (1, 0). So
the open set V1 = {θ ∈ R | 0 < θ < 2π} covers just O1 = S¹\{(1, 0)}, with
θ(x, y) = arctan(y/x). Now define V2 = {θ′ ∈ R | −π < θ′ < π}, which covers
O2 = S¹\{(−1, 0)}, with θ′(x, y) = arctan(y/x). So, we may choose for S¹ the
atlas {(O1, θ); (O2, θ′)}. Another choice of parameterization of S¹ is described in
the following example.
(2) Unit circle S¹ in stereographic coordinate. Let u ∈ R and define the map
φ1 : u ↦ (x, y) by the formulas x = 2u/(u² + 1) and y = (u² − 1)/(u² + 1), which
satisfy x² + y² = 1. Conversely, one has φ1⁻¹ : (x, y) ↦ u given by u = x/(1 − y)
for (x, y) ∈ S¹ and y ≠ 1. So, the map φ1⁻¹ is the stereographic projection of S¹
from the 'north pole' N = (0, 1) onto the line y = 0. To make the connection with
the general discussion, we let V1 = R and O1 = S¹\N, and we see that φ1 : V1 →
O1 is a homeomorphism between R and S¹\N, and so is a chart on S¹\N. We still
need to chart the point N, and so we define φ2 : u′ ↦ (x, y) by x = 2u′/(1 + u′²)
and y = (1 − u′²)/(1 + u′²), which satisfy x² + y² = 1. The corresponding inverse
map φ2⁻¹(x, y) for (x, y) ∈ S¹, given by u′ = x/(1 + y), is well defined provided
that y ≠ −1. It may be viewed as the stereographic projection of S¹ from the
'south pole' S = (0, −1) onto the line y = 0. Letting V2 = R and O2 = S¹\S, we see
that φ2 : V2 → O2 is a homeomorphism between R and S¹\S, and so is a chart
on S¹\S. Together the charts φ1 and φ2 cover the whole circle S¹. Noting that the
intersection O1 ∩ O2 = S¹\{N, S} corresponds to R\{0}, we have the transition
map on the overlap φ1⁻¹ ◦ φ2 : R\{0} → R\{0}, or the change of coordinates
u′ ↦ u = 1/u′, well defined on R\{0}. Thus, we have found a smooth atlas for
S¹ consisting of the charts φ1 and φ2.
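The two charts and the transition map u = 1/u′ can be confirmed numerically (a short Python sketch; function names are ours):

```python
import numpy as np

# The two charts of S^1 from the text and the transition u = 1/u' on the overlap.
def phi1(u):                                   # chart omitting the north pole N
    return (2*u/(u*u + 1), (u*u - 1)/(u*u + 1))

def phi2(up):                                  # chart omitting the south pole S
    return (2*up/(1 + up*up), (1 - up*up)/(1 + up*up))

for up in (-3.0, -0.5, 0.25, 2.0):
    x, y = phi2(up)
    assert np.isclose(x*x + y*y, 1.0)          # phi2 lands on S^1
    u = x / (1 - y)                            # phi1^{-1}(x, y)
    assert np.isclose(u, 1.0/up)               # transition map u = 1/u'
    assert np.allclose(phi1(u), (x, y))        # both charts name the same point
print("transition map u = 1/u' verified")
```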
Figure C.2: Stereographic projection of the sphere from the north pole N to the equatorial
plane. Points Z on the sphere are projected to points z on the plane.
(3) Unit sphere S² in stereographic coordinates u, v (Fig. C.2). The unit sphere S²
in R³ is defined by the set S² = {(x, y, z) ∈ R³ | x² + y² + z² = 1}. Just as in the
previous example for S¹, mapping the two-dimensional surface S² requires two
maps, φ1 and φ2, using the local coordinates (u, v) ∈ R² and (u′, v′) ∈ R²:

    φ1 : R² → S²\N ⊂ R³,   (u, v) ↦ (2u/s, 2v/s, (s − 2)/s),
    φ2 : R² → S²\S ⊂ R³,   (u′, v′) ↦ (2u′/s′, 2v′/s′, (2 − s′)/s′),

where N = (0, 0, 1), S = (0, 0, −1), s = u² + v² + 1, and s′ = u′² + v′² + 1.
The inverse map φ1⁻¹ : S²\N → R², (x, y, z) ↦ (u, v), is given by u = x/(1 − z),
v = y/(1 − z) for any point (x, y, z) ∈ S²\N. It represents the stereographic
projection of any point on S²\N from the north pole N = (0, 0, 1) onto the
equatorial plane z = 0. Similarly, the inverse map φ2⁻¹ : S²\S → R², (x, y, z) ↦ (u′, v′),
is given by u′ = x/(1 + z), v′ = y/(1 + z) for any (x, y, z) ∈ S²\S. It represents the
stereographic projection of any point on S²\S from the south pole S = (0, 0, −1)
onto the equatorial plane z = 0. The maps φ1 and φ2 just defined are continuous
and have continuous inverses, and so are homeomorphisms charting O1 = S²\N
and O2 = S²\S, the union of which is the whole sphere S².
On O1 ∩ O2 = S2 \{ N, S}, the intersection of O1 and O2 , we have the transition
map

φ1−1 ◦ φ2 : φ2−1 (O1 ∩ O2 ) → φ1−1 (O1 ∩ O2 )


( u0, v0 ) 7 → (u, v ) .

Noting the identities of the coordinates, u0 v0 /uv = ( s0 /s )2 and u0 /v0 = u/v, we


obtain u = u0 / ( u02 + v02 ) and v = v0 / ( u02 + v02 ); and conversely u0 = u/ ( u2 +
v2), v0 = v/ ( u2 + v2 ). They are smooth transition maps on the overlap. Thus, we
have found a smooth atlas for S2 consisting of the charts φ1 and φ2.
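These chart and transition formulas are easy to check numerically; the following sketch (the function names are ours, not the text's) verifies that φ2 applied to (u′, v′) and φ1 applied to the changed coordinates land on the same point of the sphere.

```python
# Stereographic charts for the unit sphere S^2 and their transition map
# (an illustrative check of the formulas above; function names are ours).
def phi1(u, v):
    # chart covering S^2 \ N, projection from the north pole N = (0, 0, 1)
    s = u*u + v*v + 1.0
    return (2*u/s, 2*v/s, (s - 2)/s)

def phi2(up, vp):
    # chart covering S^2 \ S, projection from the south pole S = (0, 0, -1)
    sp = up*up + vp*vp + 1.0
    return (2*up/sp, 2*vp/sp, (2 - sp)/sp)

def transition(up, vp):
    # change of coordinates u = u'/(u'^2 + v'^2), v = v'/(u'^2 + v'^2)
    r2 = up*up + vp*vp
    return (up/r2, vp/r2)

up, vp = 0.7, -1.3
p = phi2(up, vp)
q = phi1(*transition(up, vp))
assert abs(sum(c*c for c in p) - 1.0) < 1e-12        # p lies on the unit sphere
assert all(abs(a - b) < 1e-12 for a, b in zip(p, q)) # both charts agree on the overlap
```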
(4) Unit n-sphere Sn ⊂ R n+1 . One can readily generalize the above discussion
to any Sn ⊂ R n+1 . Take any point C of Sn as the center of projection and an
n-dimensional plane not containing C as the plane of projection. Then the stereographic
projection maps an arbitrary point p ∈ Sⁿ, p ≠ C, to the unique point q located
at the intersection of the projection plane with the line passing through C and p.
Just as for S1 and S2, we need two such maps to chart the whole surface Sn . We
cannot cover the full manifold with one set of coordinates because we require
that the coordinates come from an open set and that the map from the manifold
to the coordinates be bijective. Topologically, we can also see that this must be
so because there cannot be a continuous bijection between the sphere, which is
compact, and an open set, which is not. Now take C = N = (0, . . . , 0, 1) ∈ R n+1
for φ1 and C = S = (0, . . . , 0, − 1) ∈ R n+1 for φ2, and the equatorial plane as the
projection plane. Then define

    φ1 : Rⁿ → Sⁿ \ N ⊂ Rⁿ⁺¹,   u ↦ (2u/s, (s − 2)/s),
    φ2 : Rⁿ → Sⁿ \ S ⊂ Rⁿ⁺¹,   u′ ↦ (2u′/s′, (2 − s′)/s′),

where s = |u|² + 1 and s′ = |u′|² + 1. On the intersection of the sets that chart
Sⁿ \ N and Sⁿ \ S, the change between the coordinates u and u′ is given by the
function φ1⁻¹ ∘ φ2 : Rⁿ \ {0} → Rⁿ \ {0}, u′ ↦ u = u′/|u′|². As this function is its
own inverse, it can be used in both directions u′ ↔ u, and as it defines a smooth
transition map, we have a smooth atlas. In conclusion, Sⁿ admits a smooth atlas
consisting of the charts φ1 : Rⁿ → Sⁿ \ N and φ2 : Rⁿ → Sⁿ \ S.
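That the transition map u ↦ u/|u|² is its own inverse can be confirmed in a couple of lines (an illustrative sketch; the helper name is ours):

```python
# The S^n transition map t(u) = u/|u|^2 (inversion in the unit sphere) is an
# involution, so a single formula serves for both directions u <-> u'.
def t(u):
    r2 = sum(x*x for x in u)
    return [x/r2 for x in u]

u = [0.3, -1.2, 2.5, 0.8]   # a point of R^4 \ {0}, charting part of S^4
assert all(abs(a - b) < 1e-12 for a, b in zip(t(t(u)), u))
```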
MANIFOLDS. With the basic notions laid out, let us now discuss manifolds themselves.
20. D EFINITION . An n-dimensional smooth manifold ( X, A) is a Hausdorff topologi-
cal space X equipped with an n-dimensional smooth atlas A. Two manifolds (X, A1)
and (X, A2), with different atlases on the same space X, are regarded as the
same manifold if the atlases A1 and A2 are equivalent. (The qualifier 'smooth'
is always implied when we say simply 'manifold(s)'.)
21. We can make ‘larger’ manifolds by constructing products of manifolds, for
which we need a ‘product topology’ and a ‘product atlas’. If M and N are Haus-
dorff topological spaces, the Cartesian product M × N of M and N is a Hausdorff
topological space having as topology the product topology (i.e. one in which a sub-
set R ⊂ M × N is open iff for each point ( x, y ) ∈ R there are open sets O ⊂ M and
U ⊂ N such that ( x, y ) ∈ O × U ⊂ R).
Explicitly, let M and N be smooth manifolds of dimensions m and n equipped with
smooth atlases {φµ , µ = 1, 2, . . . } and {ψν , ν = 1, 2, . . . }, respectively. This means
we have the charts φµ : Vµ → Oµ and ψν : Wν → Uν , where Vµ ⊂ R m , Wν ⊂
R n , Oµ ⊂ M, and Uν ⊂ N. In the now familiar notations, x ∈ O ⊂ M and
( x1, . . . , xm ) ∈ V ⊂ R m for the chart φµ : Vµ → Oµ ; and y ∈ U ⊂ N, ( y1, . . . , yn ) ∈
W ⊂ R n for the chart ψν : Wν → Uν ; the charts on the product topological space
M × N are
    φµ × ψν : Vµ × Wν → Oµ × Uν ,
    (x1, . . . , xm, y1, . . . , yn) ↦ (x, y).

The collection of the charts φµ × ψν is a product atlas – a smooth atlas on M × N of


dimension m + n. We call the product topological space M × N equipped with this
product atlas {φµ × ψν} the product manifold of M and N.
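As a small illustration (our own sketch, not from the text), a product chart on the torus S¹ × S¹ ⊂ R² × R² is simply a pair of circle charts, with the dimensions adding as 1 + 1 = 2:

```python
import math

# Product chart on the torus S^1 x S^1: a chart of the product manifold is a
# pair of charts of the factors, as in the product-atlas construction above.
def circle_chart(theta):
    # angle coordinate on (part of) S^1 in R^2
    return (math.cos(theta), math.sin(theta))

def product_chart(theta, psi):
    # chart of S^1 x S^1 in R^2 x R^2 = R^4; local coordinates (theta, psi)
    return circle_chart(theta) + circle_chart(psi)

p = product_chart(0.4, 1.1)
assert len(p) == 4                                 # a point of R^4
assert abs(p[0]**2 + p[1]**2 - 1.0) < 1e-12        # first factor on S^1
assert abs(p[2]**2 + p[3]**2 - 1.0) < 1e-12        # second factor on S^1
```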
22. E XAMPLES of manifolds:
(1) Manifold of zero dimension. By definition R 0 is the trivial vector space { 0}
consisting of just the zero vector. Let X be an arbitrary set. A map φ : R 0 → X has
an image consisting of just a single point φ(0) = p ∈ X. As the map 0 ↦ p is, by
definition, smooth and regular for each p ∈ X, the collection of all these maps is

a 0-dimensional smooth atlas on X, and hence defines a 0-dimensional manifold in X.
It is the same as a discrete subset, i.e. a collection of isolated points of X. (An
element p of a set S ⊂ X is said to be isolated if it is the only point of S in some
neighborhood of p; and the set S is said to be discrete if all its points are isolated.)
(2) Open subset of R n . Any open subset O of R n with standard topology is
a smooth manifold of dimension n. One possible atlas is the identity map, φ =
id : O → O. The maximal atlas is the set of all maps φ : O → Rⁿ such that φ(O)
is open, and φ and φ⁻¹ : φ(O) → O are C^k differentiable; the smooth atlas is
the sub-atlas in which φ and φ⁻¹ are required to be C^∞ differentiable. Of course,
one possible choice of the set O is Rⁿ itself; then a non-maximal atlas is just
φ = id : Rⁿ → Rⁿ (as described in detail in §16). Rⁿ looks like Rⁿ not only
locally but also globally; Euclidean space is a non-compact manifold.
(3) The unit n-sphere Sⁿ. For n ≥ 1, Sⁿ with the maximal smooth atlas
determined by the two charts φ1 : Rⁿ → Sⁿ \ N and φ2 : Rⁿ → Sⁿ \ S (the
stereographic projections described in §19) is a compact smooth manifold.
(4) R m identified with the set {( x1, . . . , xm , c1, . . . , cn−m )} ⊂ R n (where c1, . . . ,
cn−m are some fixed constants) is an m-dimensional manifold in R n .
(5) Constructions of curves (e.g. S1 ⊂ R 2 ) and surfaces (e.g. Sm ⊂ R m+1
for m > 1) can be generalized as follows. Take any open set O ⊂ R n . Let x =
( x1, . . . , xn ) ∈ O, and assume that the n coordinates xj are tied together by k ≤ n
independent conditions expressed by a set of smooth functions f i ( x) = ci (some
constants) with i = 1, . . . , k, or more succinctly f : O → R k . Then, if not empty,
the set S = { x ∈ O | f i( x) = ci , i = 1, . . . , k } is an (n − k)-dimensional smooth
manifold in R n .
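A minimal sketch of this level-set construction, with S² cut out of R³ by the single condition f(x) = 1 (so n = 3, k = 1, and the dimension is n − k = 2); the non-vanishing gradient is what makes the condition independent:

```python
# Level-set description of a manifold: S^2 = {x in R^3 | f(x) = 1} with
# f(x) = x1^2 + x2^2 + x3^2, one smooth condition (k = 1) in n = 3 variables,
# giving an (n - k) = 2-dimensional manifold.
def f(x):
    return sum(t*t for t in x)

def grad_f(x):
    return [2*t for t in x]

x = (0.6, 0.0, 0.8)                       # a point of S^2
assert abs(f(x) - 1.0) < 1e-12
assert any(abs(g) > 0 for g in grad_f(x)) # the condition is independent at x
n, k = 3, 1
assert n - k == 2                         # dimension of S^2
```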
(6) M(n, R), the space of all real n × n matrices A, is a vector space in bijective
correspondence with R^(n²) (the coordinates of each point A being identified with
its n² entries Aij), and so is a smooth manifold of dimension n². Similarly, the set
M(n, C) of all complex n × n matrices can be viewed as a vector space in bijective
correspondence with R^(2n²), and hence is a smooth manifold of dimension 2n².
(We always give the dimension over R; of course, the dimension over R is double
the dimension over C.)
R EMARKS. The definition of a manifold involves two parts, one concerning
the topological nature (open sets) of X and its regularity (Hausdorffness); and
the other, the conceptual condition for it to be locally Euclidean (each point has a
neighborhood homeomorphic to an open ball in some Euclidean space). So a set
may not be a manifold either because it has no satisfactory topology, or because it
does not look locally like Euclidean space. The ∞-shaped curve is not a manifold
because no neighborhood of its self-intersection point is homeomorphic to an open interval. Two
cones intersecting at their vertices (like the complete space-time light cone) can-
not be a manifold because the intersection does not look locally like a Euclidean
space. A single cone is a manifold, but not a smooth one. A finite cylinder with-
out its circular boundaries is a manifold, but is not when the boundaries are included,
because points on the boundary circles have no neighborhood homeomorphic to an
open disk. In general, manifolds with boundary are not manifolds according to our
definition.
301

LIE GROUPS. Lie groups are an important example of manifolds.


23. D EFINITION . A Lie group is a group G, which is also a smooth manifold of
finite dimension, such that the group operations, multiplication m : G × G → G,
m(x, y) = xy, and inversion inv : G → G, inv(x) = x⁻¹, are smooth maps. The
Lie group is said to be real if the manifold is real; and complex if the manifold is
complex. The dimension of a Lie group is defined as that of its manifold.
24. E XAMPLES of Lie groups:
(1) Lie group of dimension 1. The set R × of non-zero real numbers (R \ { 0} ) with
usual topology defined by the metric d ( x, y) = | x − y |, and with multiplication
as the group operation and 1 as the identity element, is a one-dimensional Lie
group. And so is the circle group S1, which is isomorphic to R/Z.
(2) Lie group of dimension 2. The set C× = C \ { 0} , equipped with a metric
d ( x, y) = | x − y | and complex multiplication as the group operation, is a two-
dimensional Lie group. Let x = x1 + ix2 be a group element. Then the product
xy = x1y1 − x2y2 + i(x1y2 + x2y1) and the inverse x⁻¹ = (x1 − ix2)/(x1² + x2²) are
both smooth functions of the entries xi, yj. Alternatively, we may also say that
C× is a complex Lie group of dimension 1 (over C).
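A quick numerical confirmation of the inverse formula (our own sketch):

```python
# Smoothness of inversion in the Lie group C^x = C \ {0}: the formula
# x^{-1} = (x1 - i x2)/(x1^2 + x2^2) is rational in (x1, x2), hence smooth
# away from the origin.
x1, x2 = 1.5, -0.8
d = x1*x1 + x2*x2
inv = complex(x1/d, -x2/d)
assert abs(inv * complex(x1, x2) - 1.0) < 1e-12   # inv really is x^{-1}
```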
(3) Real vector space. Every real vector space V of finite dimension n, equipped
with standard topology, and with the addition of vectors as the group operation,
and the zero vector as the identity element, is a group. The map m ( x, y ) = x + y
is linear, and so smooth. Similarly, inv(x) = −x is also smooth. Hence V is a
Lie group of dimension n.
(4) Real orthogonal matrices 2 × 2. Let O (2) be the subset of the set M(2, R )
of all 2 × 2 real matrices A restricted by the orthogonality condition AAT = I,
where I is the identity matrix, and AT is the transpose of A. It has the structure
of a group G, with matrix multiplication as the group operation. Every matrix
A ∈ O (2) has the form
   
        ( x   y )                ( x   y )
    A = ( −y  x ) ,    or   A = ( y  −x ) ,

which satisfies det A = 1 or det A = −1, respectively, and is in one-to-one correspondence
with the unit circle S¹ = {(x, y) ∈ R² | x² + y² = 1} in R². So we see that O(2) is
a smooth manifold, the disjoint union of two circles. Further, matrix multiplication
produces smooth expressions in the entries x, y, x′, y′, and so is a smooth map
G × G → G; similarly, the inversion A ↦ A⁻¹ = Aᵀ, which amounts to y ↦ −y
(det A = 1) or y ↦ +y (det A = −1), is smooth. Therefore, O(2)
is a 1-dimensional Lie group. The subset of A ∈ O (2) with det A = 1 preserves
both group and topological properties of O (2), and is thus itself a Lie group of di-
mension one, called SO (2). Other Lie groups will be considered in the following.
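The two families, and the properties AAᵀ = I and det A = ±1, can be verified with a few lines of arithmetic (an illustrative sketch; the function names are ours):

```python
import math

# The two families of O(2) matrices in the text: rotations (det A = +1) and
# reflections (det A = -1), each parametrized by a point (x, y) of the unit
# circle x^2 + y^2 = 1.
def rotation(theta):
    x, y = math.cos(theta), math.sin(theta)
    return [[x, y], [-y, x]]

def reflection(theta):
    x, y = math.cos(theta), math.sin(theta)
    return [[x, y], [y, -x]]

def det(A):
    return A[0][0]*A[1][1] - A[0][1]*A[1][0]

def is_orthogonal(A, tol=1e-12):
    # check the defining relation A A^T = I entrywise
    n = len(A)
    return all(abs(sum(A[i][k]*A[j][k] for k in range(n)) - (i == j)) < tol
               for i in range(n) for j in range(n))

R, M = rotation(0.7), reflection(-1.2)
assert is_orthogonal(R) and is_orthogonal(M)
assert abs(det(R) - 1.0) < 1e-12 and abs(det(M) + 1.0) < 1e-12
```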
25. General linear group. Let G = {A ∈ M(n, R) | det A ≠ 0} be the space of all
real invertible n × n matrices (n ≥ 1). It is a group, with matrix multiplication as the
group operation and a well-defined inverse for every element. It is also a manifold,
because it is an open subset of M(n, R), which, from §22 (6), is an n²-manifold.
Moreover, the matrix multiplication m( A, B ) = AB with A, B ∈ M( n, R ) is given

by a smooth function of the entries (involving their products and sums), and so
is a smooth map. Its restriction to G × G is also smooth. As for inversion, the
map inv : G → G, A ↦ A⁻¹, is smooth because, by Cramer's formula, the
entries of A⁻¹ are rational functions of the entries of A and of det A (a
continuous function, nonzero on G). So G = GL(n, R) is a Lie group, of dimension n². Note the
special case GL(1, R) = R×.
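Cramer's formula can be made concrete as follows (a sketch in exact rational arithmetic; the helper names are ours, and the cofactor expansion is the naive one, used only to exhibit that the entries of A⁻¹ are rational functions of the entries of A):

```python
from fractions import Fraction

# Cramer's formula: (A^{-1})_{ij} = (-1)^{i+j} det(minor_{ji}(A)) / det(A),
# so each entry of A^{-1} is a rational (hence smooth) function of the
# entries of A wherever det A != 0.
def minor(A, i, j):
    # delete row i and column j
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    n = len(A)
    if n == 1:
        return A[0][0]
    # cofactor expansion along the first row
    return sum((-1)**j * A[0][j] * det(minor(A, 0, j)) for j in range(n))

def inverse(A):
    n, d = len(A), det(A)
    return [[(-1)**(i + j) * det(minor(A, j, i)) / d for j in range(n)]
            for i in range(n)]

A = [[Fraction(2), Fraction(1)], [Fraction(7), Fraction(4)]]
assert det(A) == 1
assert inverse(A) == [[Fraction(4), Fraction(-1)], [Fraction(-7), Fraction(2)]]
```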
If the matrices are defined over the complex field C, the corresponding group
GL(n, C) = {A ∈ M(n, C) | det A ≠ 0} is an open subset of the vector space of
n × n matrices over C, and hence is a smooth manifold of dimension 2n². So
GL(n, C) is a Lie group, of dimension 2n² over R (or, equivalently, n² over C). Note
that GL(1, C) = C×.
Most Lie groups we consider are ‘contained’ in the general linear group; they
can be generated as subgroups of GL ( n, F ) for F ∈ { R, C } , and are Lie groups
by virtue of the following result: a subgroup of the Lie group GL(n, F) that is also a
manifold in R^N, where N = n² (or 2n²) for F = R (or C), is a Lie group. This
is so because the multiplication map of a subgroup H of the Lie group G is the
restriction mH of the multiplication map mG of G, which is itself smooth. And
so the subgroup, endowed with a compatible manifold structure, is a Lie group.
When the subgroup H is specified by a system of simultaneous equations on
R^N that are equivalent or reducible to k independent conditions, it is a manifold in
R^N of dimension N − k, and therefore is a Lie group; cf. §22 (5).
26. E XAMPLES of Lie groups that are subgroups of GL( n, R ) and GL ( n, C ):
(1) The subgroup GL+ ( n, R ) of GL ( n, R ) of invertible matrices of positive de-
terminant. This is indeed a subgroup because det ( AB) = det A · det B > 0 and
det I = 1 > 0. Moreover, GL+(n, R) is the preimage of the open interval (0, ∞)
under the continuous determinant map, hence an open set, and so a manifold.
Therefore GL+(n, R) is a Lie group.
(2) The special linear groups SL( n, R ) and SL ( n, C ) are the subgroups of the
general linear group GL( n, F ) with determinant 1:

SL( n, F ) = { A ∈ GL( n, F )| det A = 1} .

SL(n, F) is a subgroup because for any A, B ∈ GL(n, F) with determinant 1, one
has det(AB) = det A · det B = 1. It is the subset of R^N (N = n² or 2n²) cut out
by the smooth scalar condition det A = 1, and so, by §22 (5), is a manifold.
Therefore SL(n, F) is a Lie group, of dimension n² − 1 for F = R, and 2n² − 2
for F = C.
(3) The orthogonal group O(n) is the group of R-linear isometries of Rⁿ, defined
by O(n) = O(n, R) = {A ∈ M(n, R) | AᵀA = I}. Noting (AB)ᵀ = BᵀAᵀ
and Aᵀ = A⁻¹ for any matrices A, B ∈ O(n), one verifies that it indeed satisfies
the group axioms. As we have seen for GL(n, F), the group multiplication and
inversion operations are smooth maps. Moreover, O(n) has a manifold structure,
for the following reason. The matrix equation AᵀA = I corresponds to a system
of n² equations for the n² real matrix entries, of which n(n + 1)/2 are independent
for every A ∈ O(n). Therefore the space of solutions, of constant dimension
n² − n(n + 1)/2 = n(n − 1)/2, is a smooth manifold, and O(n) is a Lie group
of dimension n(n − 1)/2.
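The constraint count can be checked by linearizing AᵀA = I at A = I: the derivative is the linear map D ↦ Dᵀ + D, whose rank as an n² × n² matrix is n(n + 1)/2, leaving n(n − 1)/2 free directions (the antisymmetric matrices). A pure-Python sketch (helper names are ours):

```python
# Count the independent conditions in A^T A = I by computing the rank of the
# linearization D -> D^T + D at A = I, written as an n^2 x n^2 matrix.
def rank(M, tol=1e-9):
    # rank by Gaussian elimination on a list-of-lists matrix
    M = [row[:] for row in M]
    r, rows = 0, len(M)
    for c in range(len(M[0])):
        piv = next((i for i in range(r, rows) if abs(M[i][c]) > tol), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(rows):
            if i != r and abs(M[i][c]) > tol:
                f = M[i][c] / M[r][c]
                M[i] = [a - f*b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r

def jacobian_rank(n):
    # row (i, j): entry (i, j) of D^T + D, i.e. D_{ji} + D_{ij}
    J = []
    for i in range(n):
        for j in range(n):
            row = [0.0] * (n * n)
            row[j * n + i] += 1.0
            row[i * n + j] += 1.0
            J.append(row)
    return rank(J)

for n in (2, 3, 4):
    k = jacobian_rank(n)
    assert k == n * (n + 1) // 2            # independent conditions
    assert n * n - k == n * (n - 1) // 2    # dim O(n)
```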

The unitary group U(n) is the group of C-linear isometries of Cⁿ, defined by

    U(n) = U(n, C) = {A ∈ M(n, C) | A†A = I}

(A† is the hermitian adjoint of A, so that (A†)ij = A*ji, with * denoting complex
conjugation). The argument runs just as for O(n), but now the matrix entries are
complex. The total number of (real) coordinates for every A ∈ U(n) is N = 2n²,
subject to k = n + 2(n² − n)/2 = n² independent conditions, so the space of
solutions of A†A = I, and thus the manifold, has dimension N − k = 2n² − n² = n².
The group U(n) is a Lie group of dimension n² (over R).
(4) The special orthogonal group SO(n). The defining relation AᵀA = I for any
A ∈ O(n) implies (det A)² = 1, or det A = ±1, which may be expressed as the
determinant map det : O(n) → {±1}. This means that O(n) is the disjoint union
of two components, only one of which, the one with det A = 1, is connected to
the identity and forms a subgroup, called the special orthogonal group SO(n).
In the complex case, for any A ∈ U(n) one has |det A| = 1, and so det A = e^{iθ}
for some real θ; in other words, det : U(n) → S¹. The set of unitary matrices with
determinant one is the special unitary group SU( n ). The group SO( n ) is a Lie
group of the same dimension, n(n − 1)/2, as O(n); the group SU(n) is also a Lie
group, but its dimension, n² − 1, is one less than that of U(n). This reflects the
difference in the determinant mapping in the real and complex cases.
(5) The defining condition for every A ∈ O(n) may be written AᵀIA = I, where
Iij = δij. This symmetric metric may be replaced by other non-degenerate bilinear
forms, in particular the antisymmetric matrix of even dimension
 
        ( 0   I )
    J = ( −I  0 ) ,

where I is the identity n × n matrix. The set of all real 2n × 2n matrices A that
satisfy AT J A = J forms a group, called the symplectic group over R:

    Sp(n, R) = {A ∈ M(2n, R) | AᵀJA = J} .

Arguments similar to those for O(n) show that Sp(n, R) is a Lie group. Its
dimension is given by N − k, where N = (2n)² is the number of entries in any A ∈
Sp(n, R), and k is the number of independent conditions on them. This number
can be calculated from the system of equations AᵀJA = J: there are n(n − 1)
independent equations from the two n × n diagonal blocks, and n² from the two
n × n off-diagonal blocks, leading to k = n² + n(n − 1) = 2n² − n. So the space
of solutions (the manifold of the group) has dimension N − k = n(2n + 1),
which is also the dimension of Sp(n, R). The same arguments apply to the
complex group Sp(n, C) = {A ∈ M(2n, C) | AᵀJA = J}, where one should note
the presence of the transpose Aᵀ, not the hermitian adjoint A†. It has the
same dimension, n(2n + 1), as Sp(n, R), now counted over C.
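One can test the defining relation on a sample symplectic matrix, the block matrix [[I, B], [0, I]] with B symmetric, together with the dimension count (a sketch; the helper names are ours):

```python
# Check A^T J A = J for A = [[I, B], [0, I]] with B symmetric, and the
# dimension count N - k = (2n)^2 - (2n^2 - n) = n(2n + 1) for Sp(n, R).
def matmul(A, B):
    return [[sum(a*b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(row) for row in zip(*A)]

def block(M11, M12, M21, M22):
    # assemble a 2n x 2n matrix from four n x n blocks
    top = [r1 + r2 for r1, r2 in zip(M11, M12)]
    bot = [r1 + r2 for r1, r2 in zip(M21, M22)]
    return top + bot

n = 2
I = [[float(i == j) for j in range(n)] for i in range(n)]
Z = [[0.0] * n for _ in range(n)]
B = [[1.0, 2.0], [2.0, -3.0]]                       # symmetric
J = block(Z, I, [[-x for x in r] for r in I], Z)    # J = [[0, I], [-I, 0]]
A = block(I, B, Z, I)

lhs = matmul(transpose(A), matmul(J, A))
assert all(abs(lhs[i][j] - J[i][j]) < 1e-12
           for i in range(2 * n) for j in range(2 * n))
assert (2 * n) ** 2 - (2 * n * n - n) == n * (2 * n + 1)   # dim Sp(n, R)
```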

Figure C.3: The Möbius strip and the Klein bottle.

Q. Ho-Kim. Group Theory: A Physicist’s Primer.


References

[As] M. Aschbacher, The Status of the Classification of the Finite Simple Groups.
Notices of the Amer. Math. Soc. 51, 736–740, 2004.
[Ba] J. C. Baez, The Octonions. Bull. Amer. Math. Soc. 39, 145–205, 2002.
[Bal] L. Ballentine, Quantum Mechanics. World Scientific Publishing Co., Singa-
pore, 1998.
[Be] W. Berry, Structural Functions in Music. Prentice Hall, Englewood Cliffs,
N.J., 1976; repr. Dover Publications, Mineola, N.Y., 1987.
[Ca] R. N. Cahn, Semi-Simple Lie Algebras and Their Representations. Dover Pub-
lications, Inc., New York, N.Y., 2006.
[CS] J. H. Conway and D. A. Smith, On Quaternions and Octonions: Their Geom-
etry, Arithmetic and Symmetry. A. K. Peters, Ltd., 2004.
[Dy1] E. B. Dynkin, The structure of semisimple Lie algebras. Uspehi Mat. Nauk 2,
59–127, 1947; Amer. Math. Soc. Transl. 1, 1–143, 1950.
[Dy2] E. B. Dynkin, Semisimple subalgebras of semisimple Lie algebras. Mat. Sbornik
30, 349–462, 1952; Amer. Math. Soc. Transl. Ser. 2, 6, 111–244, 1957.
[Fa] G. Fano, Mathematical Methods of Quantum Mechanics. McGraw-Hill Book
Company, New York, N.Y., 1971.
[FH] W. Fulton and J. Harris, Representation Theory. Springer-Verlag New York,
Inc., New York, N.Y., 1991.
[Ge] H. Georgi, Lie Algebras in Particle Physics. Benjamin/Cummings, Reading,
MA., 1982.
[Gi] R. Gilmore, Lie Groups, Physics, and Geometry. Cambridge U. Press, Cam-
bridge, U.K., 2008.
[Go] D. Gorenstein, Finite Simple Groups: An Introduction to their Classification.
Plenum, New York, N.Y., 1992.
[GLS] D. Gorenstein, R. Lyons, and R. Solomon, The Classification of the Finite
Simple Groups. Amer. Math. Soc., Providence, R.I., 2002.
[Ha] B. C. Hall, Lie Groups, Lie Algebras, and Representations. Springer-Verlag
New York, Inc., New York, N.Y., 2003.
[Ham] M. Hamermesh, Group Theory and its Applications to Physical Problems.
Addison-Wesley Publishing Company, Reading, MA., 1962.


[Hu] T. W. Hungerford, Algebra. Springer-Verlag, New York, N.Y., 1974.


[Ja] N. Jacobson, Lie Algebras. Dover Publications, Inc., New York, N.Y., 1979.
[Ke] S. Kettle, Symmetry and Structure: Readable Group Theory for Chemists (3rd
ed.). John Wiley and Sons, New York, 2007.
[LM] H. B. Lawson and M.-L. Michelsohn, Spin Geometry. Princeton University
Press, Princeton, N.J., 1989.
[Sa] H. Samelson, Notes on Lie Algebras. Lecture Notes, Cornell University,
Ithaca, N.Y., 1974.
[Sl] R. Slansky, Group Theory for Unified Model Building. Physics Reports 79,
1–128, 1981.
[Tu] W. K. Tung, Group Theory in Physics. World Scientific Publishing Co.,
Singapore, 1985.
Index

a1 , sl(2, C), 101, 115, 123–128, 182 matrix, 190


canonical normalization, 123, 165 subalgebra (CSA), 178, 216
representation, 126–130, 249 Cartan–Killing metric, 114
a2 , sl(3, C), 155–165, 182, 191, 193, 220, 249 Cartan–Weyl basis, 193
representation, 165–173 Cartesian product, 2
a3 , sl(4, C), 228 Casimir invariant, 152, 242, 244
an , sl(n + 1, C), 202 Casimir operator, 127, 138, 241
representation, 226 Cauchy theorem, 29
Abel, Niels, 2 Cayley theorem, 26
abelian Cayley–Klein parameters, 135
group, 2 center of a group, 8, 16
Lie algebra, 91 center of Lie algebra, 94
adjoint action, 107, 156 centralizer, 8, 16
adjoint representation, 108, 114, 159, 160, 172 character, 56, 111
Ado theorem, 103 formula, 245
affine algebra, group, 79, 93, 106 of representation, 56, 141, 245
alternating group, 25, 30 orthogonal, 58
alternating representation, 41, 271 regular, 59
atlas, 295, 296 character table, 60
automorphism, 9, 94 characteristic polynomial, 37
chart, 295
b2 , so(5, C), 249 compatible, 296
b3 , so(7, C), 195 chirality, 290
bn , so(2n + 1, C), 208 class function, 56
representation, 230, 289 classical groups of Lie type, 9, 30
Baker–Campbell–Hausdorff (BCH), 102 classical Lie algebras, groups, 80, 201
basis of Lie algebra, 189, 193, 202 classification of
bijection, 292 complex simple Lie algebras, 201
bilinear form, 35, 51, 113, 181, 283 simple finite groups, 30
bounded set, 81 Clebsch–Gordan coefficient, 67, 149
Burnside theorem, 30 Clifford algebra, 231, 283–285
closed set, 81, 293
cn , sp(n, C), 203, 229 closed subgroup, 76
C1 –C6 , 2, 4, 13, 19 color, 174, 175, 251
representation, 41, 61 commutation relation, 88
Cn (cyclic group), 4, 10, 13, 61 compact Lie group, 81, 112, 116
representation, 41 complex representation, 106, 112, 277
Cartan, Élie, 177 complexification of Lie algebra, 97
criterion, 115, 178 Condon–Shortley phase convention, 149
integer, 184, 185 congruence, 11


conjugacy class, 16, 57, 266 exceptional Lie algebra, 201


conjugate representation, 277 exponential map, 98, 134, 140
conjugation, 16 exterior (tensor) algebra, 226, 274, 284
connected component (G0 ), 82
connected Lie group, 82 f4 , 210
coroot, 165, 182 representation, 237
coset, 10 factor group, 18
covering group (G̃), 84
cycle (cyclic permutation), 24 Fischer–Griess Monster group, 31
cyclic group, 4, 10, see also Cn flavor, 155, 174, 251
of prime order, 12, 30 Freudenthal multiplicity formula (FMF), 244
fundamental representation, 223
d2 , so(4, C), 104, 201, 232 fundamental root, 159, 188
d4 , so(8, C), 250 fundamental system of roots (FSR), 189
dn , so(2n, C), 206 fundamental weight (ω i ), 219
representation, 231, 289
D2 –D4 , 5, 13, 16, 19, 20, 26 g2 , 191, 210, 249
representation, 42, 45, 50, 62 representation, 234
Dn (dihedral group), 14, 22, 29 gl(n, F), 91, 99, 116
representation, 46, 63 GL(n, F), 3, 76, 83, 99, 274, 302
derived algebra, 94, 101 Galilean group, 118
determinant, 35 Galois, Évariste, 1, 73, 261
diffeomorphism, 294 gauge boson, 251, 253, 254
dihedral group, see Dn gauge group, 80, 253
dimension formula, 173, 248 gauge invariance, 253, 254
dimension of Gell-Mann matrices (λi ), 175
Lie algebra, 98, 104 general linear group, 3, 7, 8, 76, 99, 302
representation, 173, 248, 269, 278 general linear Lie algebra, 91, 99
Dirac matrices (γµ ), 289 generalized orthogonal group, 79, 100
Dirac notation, 35 generalized orthogonal Lie algebra, 92, 96,
direct product group (×), 19, 22, 64 100
direct sum of algebras, 101, 116 generator of group, 4, 86
direct sum of representations (⊕), 49, 110 gluon, 175, 254
direct sum of spaces (⊕), 37 Gram–Schmidt orthogonalization, 36
dual of a representation, 171, 221 Grand Unified Theory (GUT), 255
dual of a vector space, 36 group, 2
dual of root space, 156, 160 abelian, 3
Dynkin, Eugene, 177 cyclic, 10
basis, 219, 277 nonabelian, 3
diagram, 194–200 normal, 15
of finite order, 7
e6 –e8 , 211–213 of order 1–4, 4–5
representation, 237–238 of order 6, 13
eightfold way, 175 of prime order, 12, 28
equivalence classes, 6, 16, 45 sporadic, 30, 31
equivalence relation (∼), 10, 15 group algebra, 261–266
equivalent representation, 44, 48 representation, 262
Euclidean group commutator, 88
group, 79, 83, 87, 95 group of integers modulo n (Zn ), 3, 6, 10, 29
space, 35, 74, 81, 185, 291, 300 group of transformation, 86

H (quaternion algebra), 91, 284, see also Q8 of dimension three, 95, 101
Haar measure, 111 of dimension two, 94, 101
Hausdorff topology, 294 of rank one, 187
Heisenberg group, algebra, 80, 95, 114 of rank two, 187
helicity, 251 representation, 105
Higgs boson, 251, 254 subalgebra, 91
homeomorphism, 294 Lie bracket, 91
homomorphism, 7, 9, 38, 54, 294 Lie group, 75, 301
hook formula, 269, 278 character, 111, 141, 245
hypercharge, 252 compact, 80, 111, 116
complex, 75
ideal (Lie subalgebra), 94, 102, 178, 263 connected component, 82
idempotent, 263, 270 dimension, 75
primitive, 264, 271 generator, 86
image (of a map), 9 of finite dimension, 9
index of a subgroup, 12 of infinite dimension, 80
index of representation, 242 of transformation, 75
infinitesimal operator, 85 real, 75
injection, 291 representation, 106
inner product, 35, 220 Lie subalgebra, 91, 94
inner-product space, 35 Lie subgroup, 76, 108, 302
integration measure, 81, 111 normal, 82, 102
invariant subgroup, 16 Lie third theorem, 103
invariant subspace, 36 Lie, Sophus, 73
irreducible representation linear group, 76
see representation – irreducible linear vector space, 33–37
isometry, 79 Lorentz group, 79, 119
isomorphism, 9, 38, 94, 109
isospin, 252 manifold, 74, 291, 299
differentiable, 74
Jacobi identity, 88–89 dimensionality, 74, 299
examples, 299–300
kernel, 7, 9, 18 product, 299
Killing form, 114–116, 160, 178–180 smooth, 74
Killing, Wilhelm, 177 matrix
Klein bottle, 304 cyclic, 99
Klein group, 5, 19, 20, 21, 61, see also V diagonal, 37
nilpotent, 99
Lagrange theorem, 12 orthogonal, 9, 56, 77
lattice, 219 representation, 34, 38, 106
of dominant weights (Λd ), 219 unimodular, 3, 99
lepton, 251 unitary, 9, 36, 37
Lie algebra, 91 metric space, 76, 81, 292, 293
automorphism, 94 Mobius strip, 304
center, 94 module, 38
complexification, 95, 97 morphism, 101, 109
derived algebra, 94 multiplication table, 4, 157
homomorphism, 94
isomorphism, 94 neighborhood, 74, 292, 294
of dimension one, 93, 101 non-isotropy of root, 181

normal group, 15, 102 rectangle, 5


complete classes, 16 regular permutation, 28
kernel, 18 representation, 38, 126, 215, 222
adjoint, 216, 233, 236
o(2, 1; R), 96 alternating, 41
o(3), so(3), 96, 108, 115 completeness, 59
o(n, F), 92, 99 defining, 39, 42
o(n; m), 92, 100 direct sum, 49
O (3), 130 equivalent, 44, 58
O (n), 77, 85, 104, 302 in function space, 43
O (n, F), 9, 99 irreducible (simple), 45, 58, 110, 126,
O (n, F), 83, 99 169, 222–238
O (n; m), 79, 100 isomorphism, 110
octonion, 213 of finite group, 38–50
one-dimensional representations, 41 of Lie algebra or group, 105–119, 128,
one-parameter subgroup, 90, 98 215–238
open ball, 292 of product group, 64
open set, 74, 76, 292, 293 permutational, 40, 43
operator reducible, 45, 48
antilinear, 34, 117 regular, 40, 43, 49, 50, 59, 108, 262, 272
antiunitary, 117 semisimple (completely reducible), 48,
linear, 34 52, 58, 111
normal, 37 spin, 233, 288
orthogonal, 36 standard, 106, 227
projection, 36 unitary, 51
self-adjoint (Hermitian), 36, 55 root, 156, 159, 179, 191
symmetric, 36 diagram, 162
unitary, 36, 37, 51 generator, 181
order of element, 8 lattice (ΛR ), 219, 221
order of group, 2, 7 ordering, 188
ordering of weights, 221 space, 156, 159, 179, 188
orthogonal Lie algebra, 92, 99, 205, 230, 286 system, 186
orthogonal Lie group, 9, 77, 99, 130, 285, 302 vector, 179
rotation, 40, 87, 118, 130–134
π0 (group of components), 82
matrix, 140
π1 (fundamental group), 84
parameters, 131
particle physics, 174, 250
partition of an integer, 266
sl(2, F), see a1
Pauli matrices (σi ), 96, 135, 288
sl(3, F), see a2
permutation, 24, 25
sl(n, F), see an
permutational representation, 40, 43, 63
so(2, C), 104, 208, 232
Poincaré group, 79, 83, 119
so(4, C), 201
Q8 (quaternion group), 2, 16, 32, 91 so(n, F), 92, 97, 99 see bn , dn
quantum physics, 116 so(n; m), 100
quark, 174, 251 sp(n), 92, 100
quaternion, see H, Q8 sp(n, F), 92, 87, 100, 114; see cn
quotient group, 18 su(2), 96, 102, 104, 115, 123
quotient Lie algebra, 102 su(n), 92, 100, 102
S1 (circle), 74, 84, 297, 301
rank of Lie algebra, 178 S2 (sphere), 75, 298

Sn (n-sphere), 81, 83, 84, 298, 300 subgroup, 8


S2 –S4 , 14, 26 cyclic, 8
representation, 41, 63 index, 12
Sn (symmetric group), 24, 25, 41, 128, 266 of S3 , 27
alternating representation, 271 of S4 , 27
representation, 272 superselection rule, 118
SL(n, F), 3, 9, 76, 99, 302 surjection, 292
SO (3), 124, 130 symmetric group, 14, 24–28, 128, see also Sn
SO (10), 256 symmetric power (representations), 129, 172,
SO (n), 77, 83, 99, 303 226
Sp(n), 78, 100 symmetry
Sp(n, F), 78, 100, 303 global, 252
SU(2), 104, 124, 135–138 local, 252
character, 141 symmetry transformation, 5, 14
exponential map, 137 symplectic group, 78, 100, 303
SU(3), 156 symplectic Lie algebra, 92, 100, 203, 229
SU(3) × SU(2) × U(1) , 251
SU(5), 255 table of
SU(n), 78, 100, 303 character, 61
Schur Lemma, 53–55, 60, 127, 241 complex simple Lie algebras, 201
semi-direct product group, 22 elements of sl(3, C), 157
semi-direct sum of algebras, 116 Galilean Lie algebra, 118
semisimple group, 17 global properties of Lie groups, 85
semisimple Lie algebra, 94, 100, 115, 178 group multiplication, 4
semisimple representation 48, 58 Killing form of sl(3, C), 161
similarity transformation, 35 particles in the Standard Model, 253
simple (fundamental) root, 159, 188 Poincaré Lie algebra, 119
simple character, 56, 58 properties of Lie algebras, 104
simple group, 17 simple finite groups, 30
simple Lie algebra, 101, 178 tangent space (T1 G), 87
simple representation tensor product (⊗)
see representation – irreducible of matrices, 65
simply-connected Lie group, 84 of operators, 64
smooth map, 293 of representations, 64, 112, 146, 171, 226,
special linear group, 3, 7, 9, 76, 99, 302 274, 279–282
special linear Lie algebra, 91, 99, 202, 226 of spaces, 64, 128
special orthogonal group, 77, 130, 303 topological space, 293
special orthogonal Lie algebra, 99, 286 topology, 293
special unitary group, 77, 100, 135, 303 trace form, 113, 179
special unitary Lie algebra, 92, 100 transposition, 24
spin Lie group, 285–286 triangle, 14
spin representation, 231, 233, 287–290 trivial group, 4
spontaneous symmetry breaking, 254 trivial representation, 106
sporadic group, 30, 31 trivial subgroup, 8
Standard Model (of particles), 251
standard representation, 106, 110, 114, 227 u(2), 102, 103
string of roots, 183, 191 u(n), 92, 100, 102
string of weights, 168, 217 U(1), U(2), 78, 104, 135, 252
structure constant, 88, 93, 165, 176 U(1) × SU(2), 135

U(n ), 77, 81, 100, 303


U(n, C), 9, 77, 303
unitary group, 9, 77, 100, 303
unitary Lie algebra, 92, 100, 102
unitary operator, 36, 37
unitary representation, 51, 111

V (Klein group), 5, 19, 20, 21

weight, 125, 156, 158, 166, 216


diagram, 151, 170
dominant, 219
fundamental, 219
highest, 126, 167, 221
lattice, 219, 221, 258–260
lemma, 166, 216
multiplicity, 216, 239
ordering, 167, 221
space, 125, 156, 158, 216
vector, 125, 156, 216
weight lattice of
a2 , 221, 258
b2 , 221, 259
g2 , 222, 260
Weyl, Hermann
chamber, 217
character formula (WCF), 245
group, 217
orbit, 224
reflection, 217
reflection symmetry, 246
Wigner (rotation matrix) formula, 144

Young, Rev. Alfred


operator, 270
pattern, 266
tableau, 267

Zn (integers modulo n), 3, 6, 10, 11, 19
