
Department of Mathematical Sciences, Tsinghua University

Undergraduate Academic Journal

荷思

2021, Issue 15

Published by: Editorial Board of 《荷思》
Editor-in-chief: 卜辰璟
Editorial board (in pinyin order of surname):
卜辰璟 陈起渊 段哲凡 贺宇昕 黄虞来
孔繁浩 秦珺辉 尚鉴桥 肖咏涵 谢雨潇
张锐翀 张睿桐
Typesetting: 陈起渊

Contents

Topic Introductions

𝑝-adic Cohomology Theories 陈起渊

Sheaf Cohomology 贺宇昕

An Introduction to Factorization Algebras and Factorization Homology 孔繁浩

Research and Discussion

Stable Irrationality of Varieties 卜辰璟

Applications of the Eells–Kuiper Invariant to Exotic 7-Spheres 蓝青

Transversality Theorems by Example 谢雨潇

Mathematical Essays

Hyperreal Numbers, Editorial Board of 《荷思》

Origami Problems 黄虞来

A Game About Competing Area 尚鉴桥

Alternating Sign Matrices and Integrable Systems 徐凯

𝑝‐adic Cohomology Theories

Chen Qiyuan¹

ABSTRACT
In this article, we define two 𝑝-adic cohomology theories, crystalline cohomology and rigid cohomology, as variants of de Rham cohomology, and prove the properties expected of a Weil cohomology theory, such as finiteness, the Künneth formula and Poincaré duality. As an application of 𝑝-adic cohomology theories, we prove the Katz conjecture.

Contents

1 Introduction

2 Crystalline cohomology
Divided powers and crystalline topoi
Calculus with divided powers
Verification of the desired properties
Frobenius action

3 Rigid cohomology
Basic definitions and comparison results
Verification of the desired properties

1 Introduction
Let us fix the notation: 𝑝 is a prime, ℓ a prime different from 𝑝, 𝑘 a field of characteristic 𝑝 and 𝐾 a field of characteristic 0.
The motivation for a “good” cohomology theory (more precisely, a Weil cohomology theory, [Stacks]) for algebraic varieties comes from the Weil conjectures, which concern the zeta function of an algebraic variety 𝑋 over a finite field $\mathbb{F}_q$ of characteristic 𝑝,

\[ \zeta(X, t) = \exp\Big( \sum_{n=1}^{\infty} |X(\mathbb{F}_{q^n})| \, \frac{t^n}{n} \Big). \]

¹ 陈起渊, Class Math 80 (数 80), Department of Mathematical Sciences, Tsinghua University.


Theorem 1.1 (Weil conjecture). The following hold:

• (Rationality)
\[ \zeta(X, t) = \prod_{i=0}^{2d} P_i(t)^{(-1)^{i+1}}, \]
with $P_i(t) \in \mathbb{Q}[t]$ and 𝑑 the dimension of 𝑋.

• (Functional equation) For proper smooth 𝑋,
\[ \zeta\Big(X, \frac{1}{q^d t}\Big) = \pm q^{dE/2} \, t^E \, \zeta(X, t), \qquad E = \sum_{i=0}^{2d} (-1)^i \deg P_i(t). \]

• (Purity) The inverses of the roots of $P_i(t)$ are algebraic integers with absolute value $q^{k/2}$, where 𝑘 is an integer between 0 and 2𝑖. When 𝑋 is proper and smooth, 𝑘 = 𝑖. (This is an analogue of the purity of mixed Hodge structures.)
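The rationality and purity statements can be sanity-checked by brute force in a small case. The sketch below is only an illustration: the curve 𝑦² = 𝑥³ + 𝑥 + 1 over 𝔽5, the helper names and the model of 𝔽25 as pairs are assumptions chosen for the example, not taken from the text. For an elliptic curve over 𝔽𝑞 one has 𝑃1(𝑡) = 1 − 𝑎𝑡 + 𝑞𝑡² with 𝑎 = 𝑞 + 1 − |𝑋(𝔽𝑞)|, so the theorem predicts |𝑋(𝔽𝑞²)| = 𝑞² + 1 − (𝑎² − 2𝑞) and 𝑎² ≤ 4𝑞.

```python
def nonresidue(p):
    """Smallest quadratic non-residue modulo an odd prime p."""
    squares = {x * x % p for x in range(p)}
    return next(r for r in range(2, p) if r not in squares)


def count_points(p, a, b, n):
    """Count projective points of y^2 = x^3 + a*x + b over F_{p^n}, n in {1, 2}.

    Elements of F_{p^2} are pairs (u, v) standing for u + v*s with s^2 = r,
    where r is a non-residue mod p; for n = 1 only pairs (u, 0) are used.
    """
    r = nonresidue(p) if n == 2 else 0
    elems = [(u, v) for u in range(p) for v in (range(p) if n == 2 else [0])]

    def add(x, y):
        return ((x[0] + y[0]) % p, (x[1] + y[1]) % p)

    def mul(x, y):
        return ((x[0] * y[0] + r * x[1] * y[1]) % p,
                (x[0] * y[1] + x[1] * y[0]) % p)

    # roots[z] = number of field elements y with y^2 = z
    roots = {}
    for y in elems:
        sq = mul(y, y)
        roots[sq] = roots.get(sq, 0) + 1

    total = 1  # the point at infinity
    for x in elems:
        rhs = add(add(mul(mul(x, x), x), mul((a % p, 0), x)), (b % p, 0))
        total += roots.get(rhs, 0)
    return total


p = 5                                   # illustrative choice of prime
n1 = count_points(p, 1, 1, 1)           # points over F_5
trace = p + 1 - n1                      # coefficient a of P_1(t) = 1 - a*t + p*t^2
predicted_n2 = p * p + 1 - (trace * trace - 2 * p)
n2 = count_points(p, 1, 1, 2)           # brute-force count over F_25
print(n1, n2, predicted_n2)
```

The count over 𝔽25 is computed independently of the prediction, so agreement of the two numbers is a genuine (if tiny) check of the shape of the zeta function.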
To prove the theorem one needs a Weil cohomology theory, that is, contravari-
ant functors 𝐻 • (𝑋), 𝐻𝑐• (𝑋), 𝐻𝑍• (𝑋) (the usual cohomology, cohomology with
compact support and cohomology supported on a closed set) from varieties over a
field 𝑘 of characteristic 𝑝 to vector spaces over some characteristic 0 field, satisfy-
ing:
• (Finiteness) 𝐻 • (𝑋), 𝐻𝑍• (𝑋), 𝐻𝑐• (𝑋) are finite dimensional with 𝐻 𝑖 (𝑋) = 0
for 𝑖 < 0 or 𝑖 > 2𝑑.
• (Künneth formula) 𝐻 • (𝑋 × 𝑌 ) = 𝐻 • (𝑋) ⊗ 𝐻 • (𝑌 ), similarly for 𝐻𝑐• and
𝐻𝑍• .
• (Poincaré duality) For smooth 𝑋, there is a trace map $H^{2d}(X) \to K$ such that the composite $H^\bullet(X) \otimes H_c^{2d-\bullet}(X) \to H^{2d}(X) \to K$ is a perfect pairing.
• (Cycle class) There is a natural transformation of functors $A^\bullet(X) \to H^{2\bullet}(X)$.
One then deduces the Lefschetz fixed point theorem and applies it to the Frobenius $F^n \colon X \to X$ to gain information about the rational points.
The “Riemann hypothesis” part gives information about the archimedean absolute values of the roots of $P_k(t)$, and hence about the number of rational points. It is then natural to ask about the 𝑝-adic absolute values of those roots (their ℓ-adic absolute values satisfy $|\alpha|_\ell = 1$). However, 𝑝-adic étale cohomology behaves badly for varieties of characteristic 𝑝. Thus, alternative cohomology theories are necessary.
In characteristic zero, the de Rham cohomology $H^i(X) = \mathbb{H}^i(\Omega^\bullet_{X/K})$ is a good cohomology theory [Stacks]. Therefore, a rough idea is to lift 𝑋∕𝑘 to something over 𝐾 = Frac(𝑊(𝑘)) and to work out its de Rham cohomology. Two ways to realize this idea are crystalline cohomology and rigid cohomology.
Another benefit of 𝑝-adic cohomology theories concerns computation: the Frobenius action on étale cohomology cannot be computed easily, whereas in a 𝑝-adic cohomology theory it can be represented as an action on a de Rham complex. This leads to a convenient algorithm for counting rational points ([Ked01]).

2 Crystalline cohomology
Divided powers and crystalline topoi
A few basic facts about divided power rings are listed here for later use. For details, see [Stacks], [BO78], [Ber74].
Definition 2.1. A divided power ring is a triple (𝐴, 𝐼, 𝛾), where 𝐴 is a ring, 𝐼 an ideal of 𝐴 and $\gamma_n$, 𝑛 ∈ ℕ, a collection of maps 𝐼 → 𝐴 satisfying the properties expected of “$x^n/n!$” (𝑥, 𝑦 ∈ 𝐼 and 𝑎 ∈ 𝐴 in the following):
• $\gamma_0(x) = 1$, $\gamma_1(x) = x$, and $\gamma_n(I) \subset I$ for all $n > 0$.
• $\gamma_n(x)\gamma_m(x) = \frac{(n+m)!}{n!\,m!}\,\gamma_{n+m}(x)$.
• $\gamma_n(x+y) = \sum_{i=0}^{n} \gamma_i(x)\gamma_{n-i}(y)$.
• $\gamma_n(ax) = a^n\gamma_n(x)$.
• $\gamma_n(\gamma_m(x)) = \frac{(nm)!}{n!\,(m!)^n}\,\gamma_{nm}(x)$.

A morphism between two divided power rings (𝐴, 𝐼, 𝛾) and (𝐵, 𝐽, 𝛿) is a ring homomorphism 𝑓 ∶ (𝐴, 𝐼) → (𝐵, 𝐽) such that $f \circ \gamma_n = \delta_n \circ f$. Given a divided power ring (𝐴, 𝐼, 𝛾), a divided power algebra over 𝐴 is a divided power ring (𝐵, 𝐽, 𝛿) together with a morphism (𝐴, 𝐼, 𝛾) → (𝐵, 𝐽, 𝛿).
Divided power rings form a category with initial object (ℤ, 0, 0).
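The axioms of Definition 2.1 can be checked numerically in the model case 𝛾𝑛(𝑥) = 𝑥ⁿ∕𝑛!, computed over ℚ. The following sketch uses hypothetical function names and arbitrary sample inputs; it verifies the identities on rational numbers and also checks that 𝑝ⁿ∕𝑛! has positive 𝑝-adic valuation for 𝑛 ≥ 1, which is the reason the ideal (𝑝) of ℤ𝑝 carries a divided power structure.

```python
from fractions import Fraction
from math import comb, factorial


def gamma(n, x):
    """Divided power x^n / n!, computed over the rationals."""
    return Fraction(x) ** n / factorial(n)


def check_axioms(x, y, a, max_n=6):
    """Check the identities of Definition 2.1 for gamma on sample inputs."""
    assert gamma(0, x) == 1 and gamma(1, x) == x
    for n in range(max_n):
        assert gamma(n, a * x) == a ** n * gamma(n, x)
        assert gamma(n, x + y) == sum(gamma(i, x) * gamma(n - i, y)
                                      for i in range(n + 1))
        for m in range(1, max_n):
            # product rule with the binomial coefficient (n+m)! / (n! m!)
            assert gamma(n, x) * gamma(m, x) == comb(n + m, n) * gamma(n + m, x)
            # composition rule with the coefficient (nm)! / (n! (m!)^n)
            coeff = Fraction(factorial(n * m), factorial(n) * factorial(m) ** n)
            assert gamma(n, gamma(m, x)) == coeff * gamma(n * m, x)
    return True


def p_adic_valuation(q, p):
    """p-adic valuation of a nonzero rational number q."""
    q = Fraction(q)
    num, den, v = q.numerator, q.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


print(check_axioms(Fraction(3), Fraction(5, 2), Fraction(7, 3)))
print([p_adic_valuation(gamma(n, 2), 2) for n in range(1, 9)])
```

Exact rational arithmetic is used so that the identities are tested as equalities rather than up to floating-point error.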
Definition 2.2. Let (𝐴, 𝐼, 𝛾) be a divided power ring, 𝐵 an 𝐴-algebra and 𝐽 an ideal of 𝐵. The divided power envelope of (𝐵, 𝐽), denoted $(D_{B,\gamma}(J), \bar{J}, \bar{\gamma})$, is the object representing the functor
\[ (C, K, \delta) \mapsto \mathrm{Hom}_A((B, J), (C, K)) \]
on the category of divided power algebras over 𝐴.

Example 2.3. Let (𝐴, 𝐼, 𝛾) = (ℤ, 0, 0) and (𝐵, 𝐽) = (ℤ[𝑥], (𝑥)). Then
\[ D_{B,\gamma}(J) = \mathbb{Z}\langle x \rangle = \Big\{ \sum_{\text{finite}} a_n \frac{x^n}{n!} : a_n \in \mathbb{Z} \Big\}. \]

Proposition 2.4. If (𝐴, 𝐼, 𝛾) is a divided power ring with 𝑚𝐴 = 0 for some integer 𝑚 > 0, then 𝐽 is nilpotent for any divided power algebra (𝐵, 𝐽, 𝛿) over (𝐴, 𝐼, 𝛾).
Proof. For 𝑥 ∈ 𝐽 we have $x^m = m!\,\delta_m(x) = 0$. ◻
Definition 2.5. Let (𝐴, 𝐼, 𝛾), (𝐵, 𝐽, 𝛿) be divided power rings and 𝐴 → 𝐵 a ring morphism. We say that 𝛿 is compatible with 𝛾 if 𝛿 extends to a divided power structure $\bar{\delta}$ on 𝐽 + 𝐼𝐵 such that $(A, I, \gamma) \to (B, J + IB, \bar{\delta})$ is a morphism of divided power rings. In particular, if (𝐵, 𝐽, 𝛿) = (𝐵, 0, 0), we say that the divided power structure of 𝐴 extends to 𝐵.
For example, if 𝐵 is an 𝐴∕𝐼-algebra, the divided power structure of 𝐴 extends to 𝐵.

Definition 2.6. Let (𝐵, 𝐽, 𝛿) be a divided power ring over a ring 𝐴. A PD-derivation 𝑑 ∶ 𝐵 → 𝑀 is a derivation in the usual sense such that $d\delta_n(x) = \delta_{n-1}(x)\,dx$ for 𝑥 ∈ 𝐽. As in the usual case, there is a universal module $\tilde{\Omega}^1_{B/A}$ and a PD-de Rham complex $\tilde{\Omega}^\bullet_{B/A}$. We shall denote the latter simply by $\Omega^\bullet_{B/A}$, which should not cause confusion.

The crystalline cohomology is defined via the crystalline sites and topoi. In
the above discussion, we may replace “rings” by “ringed topoi” to define similar
notions for divided power schemes. For example:

Definition 2.7. A divided power scheme is a triple (𝑆, ℐ, 𝛾) with ℐ a quasi-coherent ideal of 𝒪𝑆 and 𝛾 a compatible family of divided power structures on the sections of ℐ over each open set.

Other notions, such as divided power envelopes, extensions and divided power de Rham complexes, can be defined similarly.

Definition 2.8. A divided power thickening is an arrow 𝑈 → 𝑇 in which 𝑇 is a divided power scheme with ideal 𝒥, 𝑈 is the closed subscheme defined by 𝒥, and 𝒥 is nilpotent. (This is always the case if 𝑚𝒪𝑆 = 0 for some 𝑚.)

Like étale cohomology, crystalline cohomology works well only in the “finite” case (or the “profinite” case). That is, we will assume 𝑚𝒪𝑆 = 0 for some integer 𝑚 without further mention.

Definition 2.9. Let (𝑆, ℐ, 𝛾) be a divided power scheme, 𝑆0 the closed subscheme of 𝑆 defined by ℐ, and 𝑋 an 𝑆0-scheme such that the divided power structure extends. The objects of the big crystalline site CRIS(𝑋∕𝑆) are the commutative diagrams

\[ \begin{array}{ccc} U & \to & T \\ \downarrow & & \downarrow \\ X & \to & S, \end{array} \]

in which 𝑈 → 𝑇 is a divided power thickening whose divided power structure is compatible with that of 𝑆. Morphisms are morphisms of divided power thickenings satisfying the evident compatibility conditions. A collection of morphisms {(𝑈𝑖 → 𝑇𝑖) → (𝑈 → 𝑇)} is a covering if {𝑇𝑖} forms an open covering of 𝑇 and, for each 𝑖, the commutative diagram below is cartesian.

\[ \begin{array}{ccc} U_i & \to & T_i \\ \downarrow & & \downarrow \\ U & \to & T \end{array} \]

The small crystalline site Cris(𝑋∕𝑆) consists of those 𝑈 → 𝑇 for which 𝑈 is an open subscheme of 𝑋, with morphisms and coverings defined in the same way.


It can be checked that CRIS(𝑋∕𝑆) and Cris(𝑋∕𝑆) are actually sites. Then the
big crystalline topos (𝑋∕𝑆)CRIS is the topos corresponding to the big crystalline
site and similarly for the small crystalline topos (𝑋∕𝑆)Cris . From the definition,
one notices that a sheaf on the crystalline site is a compatible collection of Zariski
sheaves.
The inclusion of sites Cris(𝑋∕𝑆) → CRIS(𝑋∕𝑆) is continuous and cocontinuous. Hence it induces functors $i_*, i_!$ from sheaves on Cris(𝑋∕𝑆) to sheaves on CRIS(𝑋∕𝑆), and a functor $i^{-1}$ in the opposite direction. Moreover, for a morphism $f \colon X/S \to Y/S'$, the natural map CRIS(𝑋∕𝑆) → CRIS(𝑌∕𝑆′) defines functors $f^{-1}$ and $f_*$.
We are then able to define the push-forward and pull-back between crystalline topoi. For a morphism $f \colon X/S \to Y/S'$, we have arrows:

\[ (X/S)_{\mathrm{Cris}} \underset{i^{-1}}{\overset{i_*}{\rightleftarrows}} (X/S)_{\mathrm{CRIS}} \underset{f^{-1}}{\overset{f_*}{\rightleftarrows}} (Y/S')_{\mathrm{CRIS}} \underset{i_!}{\overset{i^{-1}}{\rightleftarrows}} (Y/S')_{\mathrm{Cris}}. \]

The composites $i^{-1} \circ f_* \circ i_*$ and $i^{-1} \circ f^{-1} \circ i_!$ define a morphism of topoi $(X/S)_{\mathrm{Cris}} \to (Y/S')_{\mathrm{Cris}}$; they are again denoted by $f_*$ and $f^{-1}$. From now on, we will consider the small crystalline site only.
The structure sheaf $\mathcal{O}_{X/S}$ associates to each divided power thickening 𝑈 → 𝑇 the ring 𝒪(𝑇). For a morphism 𝑋∕𝑆 → 𝑌∕𝑆′, there is a canonical morphism $\mathcal{O}_{Y/S'} \to \mathcal{O}_{X/S}$. We may then define the push-forward and pull-back of modules, denoted $f_*$ and $f^*$ respectively.
The crystalline site and the Zariski site are linked. There is a morphism of sites

\[ u_{X/S} \colon \mathrm{Cris}(X/S) \to X_{\mathrm{Zar}}, \qquad (U \to T) \mapsto U. \]

Moreover, for an object 𝑈 → 𝑇 of the crystalline site, the objects lying over it form a subsite, which is equivalent to the Zariski site of 𝑇. For a sheaf ℱ ∈ (𝑋∕𝑆)Cris, the corresponding Zariski sheaf on 𝑇 is denoted by $\mathcal{F}_T$.
As in the Zariski case, quasi-coherent modules are the main objects of study. They are called “crystals”, for intuitive reasons.
Definition 2.10. A crystal on Cris(𝑋∕𝑆) is a sheaf ℱ of $\mathcal{O}_{X/S}$-modules such that for any morphism $u \colon T' \to T$ the induced map $u^*\mathcal{F}_T \to \mathcal{F}_{T'}$ is an isomorphism, and such that $\mathcal{F}_T$ is a quasi-coherent $\mathcal{O}_T$-module for each 𝑇.
By definition, a crystal is indeed a quasi-coherent sheaf of $\mathcal{O}_{X/S}$-modules.
Example 2.11. The sheaf $\mathcal{O}_X$ defined by $\mathcal{O}_X(U \to T) = \mathcal{O}(U)$ is a crystal. There is a canonical morphism $\mathcal{O}_{X/S} \to \mathcal{O}_X$, which is surjective.
Remark 2.12. In the definition of the crystalline site, we require 𝑈 to be a Zariski open subset of 𝑋 in each object 𝑈 → 𝑇. One can replace this condition by requiring 𝑈 → 𝑋 to be étale, and the whole theory still goes through.


Calculus with divided powers


First we recall the characteristic 0 case. Let 𝑋 be a smooth scheme over a field 𝐾 of characteristic 0. The following abelian categories are equivalent:
• Crystals on Cris(𝑋∕𝐾),
• Quasi-coherent sheaves on 𝑋 with an integrable connection,
• Quasi-coherent modules on 𝑋 with a stratification,
• Quasi-coherent 𝒟-modules.
Now we return to the usual case, that is, 𝑚𝒪𝑆 = 0. Similar results hold. We shall follow [Stacks].
Let 𝑖 ∶ 𝑋 → 𝑌 be an immersion over 𝑆 with ideal of definition 𝒥 and 𝑌∕𝑆 smooth. Let $D_{X,\gamma}(Y)$ be the spectrum of the quasi-coherent algebra $D_{\mathcal{O}_Y,\gamma}(\mathcal{J})$. Then $\mathrm{Hom}(-, D_{X,\gamma}(Y)) \simeq i^*\mathrm{Hom}(-, Y)$ by definition. We denote $D_{X,\gamma}(Y^{n+1})$ by 𝐷(𝑛).
Theorem 2.13. The following categories are equivalent:
• Crystals on Cris(𝑋∕𝑆),
• 𝐷𝑋,𝛾 (𝑌 )-modules with HPD-stratification as 𝒪𝑌 -modules,
• 𝐷𝑋,𝛾 (𝑌 )-modules with integrable, quasi-nilpotent connection as 𝒪𝑌 -
modules,
• 𝐷𝑋,𝛾 (𝑌 )-modules ℰ which are also “divided power” 𝒟 -modules.
In this article we shall only use the first and the third categories. “Quasi-nilpotent” means that the action of “taking derivatives” is locally nilpotent.
Proof. There is an exact sequence:
0 → Ω𝑌 ∕𝑆 → 𝒪𝐷(1) → 𝒪𝐷(0) → 0.
The sequence is split by two sections 𝑝∗1 , 𝑝∗2 , where 𝑝1 , 𝑝2 ∶ 𝑌 × 𝑌 → 𝑌 . 𝐷(0)
and 𝐷(1) are objects in the crystalline site. Then for a crystal ℰ , we have an exact
sequence:
0 → ℰ𝐷(0) ⊗ Ω𝑌 ∕𝑆 → ℰ𝐷(1) → ℰ𝐷(0) → 0.
The two projections 𝑝1 , 𝑝2 induce 𝑝∗1 , 𝑝∗2 ∶ ℰ𝐷(0) → ℰ𝐷(1) . 𝑝∗1 − 𝑝∗2 gives such a
connection. One checks that the compatibility condition for a crystal translates to
the integrability condition.
Conversely, given a module with integrable connection, ℰ , by smoothness, for
any object 𝑈 → 𝑇 in the crystalline site, locally there is a map 𝑇 → 𝑌 lifting the
immersion 𝑈 → 𝑌 . Then locally there is a map 𝑇 → 𝐷(0) by universal property
of divided power envelopes. One defines ℰ𝑇 by pulling ℰ back via the map above.
From the connection and the integrability, one checks that those ℰ𝑇 can form a
crystal. ◻

We prove the fundamental comparison theorem in this section and derive some corollaries. We shall use the notation of the previous sections.
Lemma 2.14 (Poincaré lemma). Let (𝐴, 𝐼, 𝛾) be a divided power ring and $P = A\langle t_1, \dots, t_n \rangle$ with the natural divided power structure. Then there is an exact sequence:
\[ 0 \to A \to P \to \Omega^1_{P/A} \to \cdots \to \Omega^n_{P/A} \to 0. \]

Proof. One checks it directly by noticing
\[ \Omega^1_{P/A} \simeq \bigoplus_{i=1}^{n} P\,dt_i. \]
◻

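Lemma 2.15. Suppose there is a commutative diagram

\[ \begin{array}{ccc} & & Y' \\ & \nearrow & \downarrow p \\ X & \to & Y \\ & & \downarrow \\ & & S, \end{array} \]

with 𝑋, 𝑌, 𝑆 as above and $Y' = \mathbb{A}^n_Y$. For a crystal ℰ ∈ (𝑋∕𝑆)Cris, there is a quasi-isomorphism of complexes:
\[ Rp_*\big(\mathcal{E}_{D_X(Y')} \otimes \Omega^\bullet_{Y'/S}\big) \simeq \mathcal{E}_{D_X(Y)} \otimes \Omega^\bullet_{Y/S}. \]

Proof. We have an exact sequence
\[ 0 \to \Omega_{Y'/Y} \to \Omega_{Y'/S} \to p^*\Omega_{Y/S} \to 0. \]
We have the Gauss–Manin filtration of $\Omega^j_{Y'/S}$, given by
\[ \mathrm{Fil}^k\,\Omega^j_{Y'/S} = \mathrm{im}\big(\Omega^k_{Y'/Y} \otimes \Omega^{j-k}_{Y'/S} \to \Omega^j_{Y'/S}\big), \]
with graded pieces
\[ \mathrm{gr}^k\,\Omega^j_{Y'/S} \simeq \Omega^k_{Y'/Y} \otimes p^*\Omega^{j-k}_{Y/S}. \]
It suffices to check the claim on each graded piece.
Moreover,
\[ \mathcal{E}_{D_X(Y')} \simeq \mathcal{E}_{D_X(Y)} \otimes_{\mathcal{O}_{D_X(Y)}} \mathcal{O}_{D_X(Y')} \]
by the crystal condition. The result then follows from the Poincaré lemma 2.14. ◻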
Lemma 2.16. Let 𝑖 ∶ 𝑋 → 𝑌 be a closed immersion with 𝑌∕𝑆 smooth, and let ℰ be a quasi-coherent $D_{X,\gamma}(Y)$-module with a connection. Then $Ru_{X/S*}\mathcal{E}$ is represented by the cosimplicial module (or by a complex, via the Dold–Kan correspondence)
\[ \mathcal{E}_{D(0)} \to \mathcal{E}_{D(1)} \to \mathcal{E}_{D(2)} \to \cdots. \]

Proof. $u_{X/S}$ is acyclic (indeed, an isomorphism) on the subsite of objects over 𝐷(0). Moreover, 𝐷(0) covers the “final object” ∗ by the lifting property of smooth maps. One applies the Čech resolution for 𝐷(0) → ∗ and concludes by noticing that $D(n) = D(0)^{n+1}$ in the category Cris(𝑋∕𝑆). ◻

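Lemma 2.17. For $A^\bullet$ a cosimplicial ring, the cosimplicial module $M^\bullet$ defined by
\[ M^n = \bigoplus_{i=1}^{n} A^n e_i \]
with the obvious arrows is contractible.

Proof. The homotopy
\[ h \colon M^\bullet \to \mathrm{Hom}(\Delta[1], M^\bullet) \]
is given by
\[ h^n(e_i)(\alpha^n_j) = \begin{cases} e_i & i < j, \\ 0 & i \ge j, \end{cases} \]
where $\alpha^n_j$ is the element of $\Delta[1]_n$ given by
\[ \alpha^n_j \colon \{0, 1, \dots, n\} \to \{0, 1\}, \qquad \alpha^n_j(i) = \begin{cases} 0 & i < j, \\ 1 & i \ge j. \end{cases} \]
◻

Lemma 2.18. Notation as above. When $Y = \mathbb{A}^n_S$, the cosimplicial module
\[ \Omega^1_{D(0)/S} \to \Omega^1_{D(1)/S} \to \Omega^1_{D(2)/S} \to \cdots \]
is contractible.

Proof. One has
\[ \Omega^1_{D(0)/S} \simeq \bigoplus_{i=1}^{d} \mathcal{O}_{D(0)}\,dt_i, \]
where 𝑑 = 𝑛 is the relative dimension of 𝑌 over 𝑆. One applies the 𝑑-fold direct sum of the homotopy ℎ above to conclude. ◻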

Theorem 2.19. Situation as above. The module ℰ corresponds to a crystal on (𝑋∕𝑆)Cris, also denoted by ℰ. Then there is a canonical isomorphism:
\[ Ru_{X/S*}\mathcal{E} \simeq \mathcal{E} \otimes_Y \Omega^\bullet_{Y/S} \simeq \mathcal{E} \otimes_D \Omega^\bullet_{D/S}. \]
In particular,
\[ R\Gamma((X/S)_{\mathrm{Cris}}, \mathcal{E}) \simeq R\Gamma(X_{\mathrm{Zar}}, \mathcal{E} \otimes \Omega^\bullet_{Y/S}). \]

Proof. There is a commutative diagram:

\[ \begin{array}{ccccc}
\mathcal{E}_{D(0)} & \to & \mathcal{E}_{D(0)} \otimes \Omega^1_{D(0)/S} & \to & \cdots \\
\downarrow & & \downarrow & & \\
\mathcal{E}_{D(1)} & \to & \mathcal{E}_{D(1)} \otimes \Omega^1_{D(1)/S} & \to & \cdots \\
\downarrow & & \downarrow & & \\
\vdots & & \vdots & &
\end{array} \]

where the rows are de Rham complexes and the vertical arrows are given by $p_1^* - p_2^* + \cdots + (-1)^n p_{n+1}^*$.
The problem is étale local. Thus we may assume that $Y = \mathbb{A}^n_S$. The left column of the diagram represents $Ru_{X/S*}\mathcal{E}$ by 2.16. The other columns have zero cohomology, for they are contractible. Applying the spectral sequence of the double complex, one concludes that the cohomology of the total complex is the cohomology of $Ru_{X/S*}\mathcal{E}$.
We now apply the spectral sequence in the other direction. Applying 2.15 to $p_i \colon Y^{n+1} \to Y^n$, one deduces that the $E_1$ page of the spectral sequence becomes

\[ \begin{array}{ccc}
\mathcal{H}^0 & \mathcal{H}^1 & \cdots \\
\downarrow{\scriptstyle 0} & \downarrow{\scriptstyle 0} & \\
\mathcal{H}^0 & \mathcal{H}^1 & \cdots \\
\vdots & \vdots &
\end{array} \]

Here $\mathcal{H}^i$ denotes the cohomology of the first row. Thus, all terms outside the first row cancel on the $E_2$ page of the spectral sequence, and the cohomology of the total complex is the cohomology of the first row of the original commutative diagram, i.e., of the de Rham complex. The theorem follows by comparing the spectral sequences in the two directions. ◻

Verification of the desired properties


The theorem above provides a way to express crystalline cohomology explicitly in terms of de Rham cohomology. We shall use this to prove the results that a good cohomology theory should satisfy. The main references here are [BO78] and [Ber74].
Theorem 2.20 (Vanishing theorem). Let 𝑓 ∶ 𝑋 → 𝑆 be quasi-compact and quasi-separated with 𝑋∕𝑆0 smooth. Denote $f_{X/S} = f \circ u_{X/S}$. Then there exists an integer 𝑟 such that $R^i f_{X/S*}\mathcal{E}$ vanishes for any 𝑖 > 𝑟 and any crystal ℰ.

Proof. Using the Čech-to-derived spectral sequence, we reduce to the local case, in which 𝑋 admits a lifting 𝑌 such that 𝑌∕𝑆 is smooth. Then there is an isomorphism:
\[ Rf_{X/S*}\mathcal{E} \simeq Rf_*(\mathcal{E} \otimes_Y \Omega^\bullet_{Y/S}). \]
The result follows from the vanishing result for the cohomology of (Zariski) quasi-coherent sheaves. ◻

Theorem 2.21 (Base change theorem). Suppose there is a base change diagram for 𝑓 ∶ 𝑋 → 𝑆 and 𝑓′ ∶ 𝑋′ → 𝑆′,

\[ \begin{array}{ccc}
X' & \xrightarrow{u'} & X \\
\downarrow & & \downarrow \\
S_0' & \to & S_0 \\
\downarrow & & \downarrow \\
S' & \xrightarrow{u} & S.
\end{array} \]

Then there is an isomorphism:
\[ Lu^* Rf_{X/S*}\mathcal{E} \simeq Rf'_{X'/S'*} Lu'^*\mathcal{E}, \]
for any flat crystal ℰ on (𝑋∕𝑆)Cris.

Proof. When there is a smooth lifting 𝑌∕𝑆 of 𝑋, the statement follows from the base change theorem for Zariski sheaves. The general case follows from cohomological descent. ◻

Theorem 2.22 (Finiteness theorem). Let 𝑋∕𝑆0 be proper and smooth and 𝑆 be noetherian. Then for any coherent crystal ℱ flat over $\mathcal{O}_{X/S}$, the complex $Rf_{X/S*}\mathcal{F}$ is quasi-isomorphic to a perfect complex, i.e. a complex with finitely many non-zero terms, each of which is a locally free sheaf of finite rank.

Proof. By the Čech resolution and the computation in the local case, $Rf_{X/S*}\mathcal{F}$ has finite tor-dimension. Hence it suffices to prove that $Rf_{X/S*}\mathcal{F}$ has coherent cohomology. Since 𝑆 is noetherian, $\mathcal{I}^n = 0$ for sufficiently large 𝑛. By dévissage, it suffices to prove the claim for coherent $\mathcal{O}_{X/S_0}$-modules. Then
\[ Rf_{X/S*}\mathcal{F} \simeq Rf_{X/S_0*}\mathcal{F} \simeq Rf_*(\mathcal{F} \otimes_X \Omega^\bullet_{X/S}). \]
The result then follows from the finiteness of the cohomology of coherent sheaves in the Zariski topology. ◻

We may then define the “completed” version. That is, 𝑆 is a 𝑝-adically complete formal scheme with ℐ = (𝑝) and $S_n = S/\mathcal{I}^n$. The completed crystalline site $\mathrm{Cris}(X/S)^{\wedge}$ is the union of all $\mathrm{Cris}(X/S_n)$, and crystals are defined similarly. One then defines the completed crystalline topos.

Proposition 2.23. For a crystal ℰ on (𝑋∕𝑆)Cris with restriction $\mathcal{E}_n$ on $(X/S_n)_{\mathrm{Cris}}$, there is an isomorphism:
\[ Ru_{X/S*}\mathcal{E} = R\lim_{n} Ru_{X/S_n*}\mathcal{E}_n. \]

The proof is given in [BO78]. The completed versions of the comparison results then hold. For example,
\[ Ru_{X/S*}\mathcal{E} \simeq \mathcal{E} \otimes \Omega^\bullet_{\hat{D}/S}, \]
where $\hat{D}$ is the 𝑝-adic completion of $D_X(Y)$ and the latter ℰ is a $\hat{D}$-module. In particular, there is an isomorphism:
\[ R\Gamma((X/S)_{\mathrm{Cris}}, \mathcal{E}) \simeq R\Gamma(X_{\mathrm{Zar}}, \mathcal{E} \otimes \Omega^\bullet_{\hat{D}/S}). \]


Then the vanishing theorem, the base change theorem and the finiteness theorem
also hold in the completed case for coherent crystals, since 𝑅 lim works well then.
If moreover 𝑋∕𝑆0 is proper, ℰ is coherent and 𝑋 lifts to 𝑌 , then 𝑌 is also
proper. By Grothendieck’s existence theorem, there exists a coherent 𝒪𝑌 -module,
also denoted by ℰ such that

𝑅Γ((𝑋∕𝑆)Cris , ℰ ) ≃ 𝑅Γ(𝑋Zar , ℰ ⊗ Ω•𝑌 ∕𝑆 ).

In particular,
𝑅Γ((𝑋∕𝑆)Cris , 𝒪𝑋∕𝑆 ) ≃ 𝑅Γ(𝑋Zar , Ω•𝑌 ∕𝑆 ).

We will denote the left hand side by $H_{\mathrm{cris}}(X/S)$. We restrict to the case 𝑆 = Spec(W(𝑘)) and 𝑆0 = Spec(𝑘) for some perfect field 𝑘 of characteristic 𝑝; this gives the crystalline cohomology of varieties in characteristic 𝑝.
We first discuss the case where 𝑋∕𝑘 is proper and 𝑋 admits a lifting. Then there exists a diagram with cartesian squares (provided there exists an embedding W(𝑘) → ℂ, which is the case for 𝑘 finite):

\[ \begin{array}{ccccc}
X & \to & \mathcal{X} & \leftarrow & X^{\mathrm{an}} \\
\downarrow & & \downarrow & & \downarrow \\
\mathrm{Spec}(k) & \to & \mathrm{Spec}(W(k)) & \leftarrow & \mathrm{Spec}(\mathbb{C}),
\end{array} \]

with 𝒳 a proper smooth scheme over W(𝑘). We then have the following comparison theorem between crystalline cohomology and Betti cohomology.
Theorem 2.24. There is a canonical isomorphism
\[ H^n_{\mathrm{cris}}(X/W(k)) \otimes \mathbb{C} \simeq H^n(X^{\mathrm{an}}, \mathbb{C}), \]
where the right hand side is the Betti cohomology.

Proof. There exists a canonical isomorphism:
\[ H^n_{\mathrm{cris}}(X/W(k)) \simeq \mathbb{H}^n(\Omega^\bullet_{\mathcal{X}/W(k)}). \]
There exists a spectral sequence
\[ E_1^{pq} = \mathbb{H}^q(\Omega^p_{\mathcal{X}/W(k)}) \Rightarrow \mathbb{H}^{p+q}(\Omega^\bullet_{\mathcal{X}/W(k)}). \]
All the modules on the left hand side are locally free. After tensoring with ℂ, it becomes the spectral sequence
\[ E_1^{pq} = \mathbb{H}^q(\Omega^p_{X^{\mathrm{an}}/\mathbb{C}}) \Rightarrow \mathbb{H}^{p+q}(\Omega^\bullet_{X^{\mathrm{an}}/\mathbb{C}}). \]
By (complex) Hodge theory, this spectral sequence degenerates and the right hand side is isomorphic to $H^{p+q}(X^{\mathrm{an}}, \mathbb{C})$. Hence the original spectral sequence degenerates and the result follows. ◻
For example, $\mathbb{P}^n$ and hypersurfaces in $\mathbb{P}^n$ admit liftings. Moreover, by obstruction theory, such a lifting exists if and only if the obstruction class $\xi \in H^2(X, \mathcal{T}_X)$ vanishes. In particular, this is the case when 𝑋 is a curve.
Theorem 2.25 (Künneth formula). For a crystal ℰ1 on 𝑋1 and a crystal ℰ2 on 𝑋2 ,
ℰ1 ⊠ ℰ2 defines a crystal on 𝑋1 × 𝑋2 and

𝑅Γ((𝑋1 × 𝑋2 ∕𝑆)cris , ℰ1 ⊠ ℰ2 ) ≃ 𝑅Γ((𝑋1 ∕𝑆)Cris , ℰ1 ) ⊗𝐿 𝑅Γ((𝑋2 ∕𝑆)Cris , ℰ2 ).

Proof. The map is natural and we may check it locally. Then there are closed
immersions 𝑋1 → 𝑌1 and 𝑋2 → 𝑌2 with 𝑌1 , 𝑌2 ∕𝑆 smooth. The result follows
from the fact that the tensor product of the de Rham complex of ℰ1 and ℰ2 is
isomorphic to the de Rham complex of ℰ1 ⊠ ℰ2 . ◻
Theorem 2.26 (Poincaré duality). For 𝑋∕𝑘 proper smooth of dimension 𝑑, there is a trace map $\mathrm{tr} \colon H^{2d}_{\mathrm{cris}}(X) \simeq W(k)$ such that, for every coherent crystal ℰ over 𝑋∕W(𝑘), the pairing
\[ R\Gamma((X/W(k))_{\mathrm{Cris}}, \mathcal{E}) \times R\Gamma((X/W(k))_{\mathrm{Cris}}, \mathcal{E}^\vee) \to W(k)[2d] \]
is perfect.
Proof. We first define the trace map. Locally, 𝑋 lifts to a smooth scheme $\tilde{X}$ over W(𝑘), whose 𝑝-adic completion we denote by 𝒳 (though it is no longer compact). Take a compactification $\bar{X}$ of $\tilde{X}$. The morphism $\bar{X} \to \mathrm{Spec}(W(k))$ has a dualizing complex 𝜔. Then
\[ \omega|_{\tilde{X}} \simeq \Omega^d_{\tilde{X}/W(k)}. \]
The trace map is defined as the composite (by the usual procedure, one checks that the map is independent of $\bar{X}$)
\[ H^{2d}_{\mathrm{cris}}(X) \simeq H^d(\mathcal{X}, \Omega^d_{\mathcal{X}}) \to H^d(\bar{X}, \omega) \simeq W(k). \]
By the derived Nakayama lemma and the finiteness theorem, it suffices to prove the statement after tensoring with 𝑘. By the base change theorem, it suffices to prove the analogous result on (𝑋∕𝑘)Cris. There the derived global sections are represented by de Rham complexes, and the result follows from Serre duality. ◻
Remark 2.27. The cycle class map can also be defined: one defines it locally via de Rham cohomology and then applies cohomological descent.

Remark 2.28. Crystalline cohomology does not work well for non-proper or non-
smooth schemes. See 3.6 and tag 07LI in [Stacks].

Remark 2.29. For varieties with normal crossing singularities, and for varieties that are open in a proper variety with complement a normal crossing divisor, there is a canonical log structure. One defines their log-crystalline sites (similar to the crystalline site, but requiring all objects to carry a log structure) and the corresponding cohomology, called log crystalline cohomology.
Log crystalline cohomology has comparison results with de Rham cohomology similar to those above, and one can use them to show that it is a good cohomology theory.
In fact, it can be proved that there is a comparison between log crystalline cohomology and rigid cohomology. Nevertheless, log crystalline cohomology is indispensable, for it carries an operator 𝑁, the log monodromy operator. Such operators are required in 𝑝-adic Hodge theory.

Frobenius action
For this section, let 𝑆 = Spec(W(𝑘)) and 𝑆0 = Spec(𝑘), where 𝑘 is a perfect field of characteristic 𝑝, and let 𝑋 be a smooth scheme over 𝑘. The (absolute) Frobenius morphisms $F_X$ and $F_S$ induce a commutative diagram, in which the square is cartesian and the composite of the top row is $F_X$:

\[ \begin{array}{ccccc}
X & \xrightarrow{F_{X/S}} & X' & \xrightarrow{F_S'} & X \\
 & \searrow & \downarrow & & \downarrow \\
 & & S & \xrightarrow{F_S} & S.
\end{array} \]

We obtain a morphism $\mathcal{O}_{X/S} \to RF_{X*}\mathcal{O}_{X/S}$ and a morphism $\mathcal{O}_{X'/S} \to RF_{X/S*}\mathcal{O}_{X/S}$. The first induces the Frobenius action on $H^\bullet_{\mathrm{cris}}(X/W(k))$, which makes the cohomology of 𝑋 modulo torsion into an 𝐹-crystal; the second induces the W(𝑘)-linearization of this action.
As we have mentioned, the eigenvalues of the Frobenius action provide information about the zeta function. We analyse the Frobenius in this section; the goal is to prove the Katz conjecture. The main reference here is [BO78].
We first analyse the Frobenius action when there exists a lifting $(Y, F_Y)$ of $(X, F_X)$. Then the above diagram lifts to a similar diagram for 𝑌. As the map $\Omega^1_{X'/k} \to F_{X/k*}\Omega^1_{X/k}$ is zero, the map $\Omega^1_{Y'/S} \to F_{Y/S*}\Omega^1_{Y/S}$ factors through $pF_{Y/S*}\Omega^1_{Y/S}$. Thus the image of the Frobenius action lies in the largest subcomplex of $p^iF_{Y/S*}\Omega^i_{Y/S}$.

Proposition 2.30. The image of the map above is exactly the largest subcomplex
of 𝑝𝑖 𝐹𝑌 ∕𝑆∗ Ω𝑖𝑌 ∕𝑆 . Moreover, the map is an isomorphism to its image.

To prove this, we shall prove a general result. We need some terminology:



Definition 2.31. A gauge is a map 𝜖 ∶ ℤ → ℕ such that 𝜖(𝑖+1) ≤ 𝜖(𝑖) ≤ 𝜖(𝑖+1)+1. A cogauge is a map 𝜂 ∶ ℤ → ℕ such that 𝜂(𝑖) ≤ 𝜂(𝑖+1) ≤ 𝜂(𝑖)+1. For a map 𝜁 ∶ ℤ → ℕ and a complex 𝐾, $K_\zeta$ is the largest subcomplex of 𝐾 whose degree-𝑖 term is contained in $p^{\zeta(i)}K^i$. For two gauges 𝜖 and 𝜖′, 𝜖′ is called a simple augmentation of 𝜖 at 𝑖 if 𝜖(𝑖) = 𝜖′(𝑖) − 1 and 𝜖(𝑗) = 𝜖′(𝑗) for 𝑗 ≠ 𝑖.

Proposition 2.32. We have a quasi-isomorphism:
\[ (\Omega^\bullet_{Y'/S})_{\epsilon} \simeq (F_{Y/S*}\Omega^\bullet_{Y/S})_{\epsilon+1}. \]

To prove this, we first state the Cartier theorem:

Proposition 2.33. There is an isomorphism
\[ c^{-1} \colon \Omega^i_{X/k} \xrightarrow{\ \sim\ } \mathcal{H}^i(\Omega^\bullet_{X/k}) \]
with $c^{-1}(x) = x^p$ and $c^{-1}(dx) = x^{p-1}dx$; moreover, $c^{-1}$ is compatible with the wedge product.

Proof. Note that both sides are compatible with étale base change. Hence it suffices
to construct a canonical isomorphism when 𝑋 = Spec(𝑘[𝑡1 , ⋯ , 𝑡𝑛 ]). Then the
result follows from a direct calculation. ◻
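The direct calculation can be illustrated for 𝑋 = Spec(𝑘[𝑡]) with 𝑘 = 𝔽𝑝. The sketch below is a minimal check with hypothetical function names: since 𝑑 sends each monomial to a multiple of a monomial, the classes of the monomial forms 𝑡ᵉ𝑑𝑡 that are not hit by 𝑑 span ℋ¹, and the code compares their exponents with those of the forms 𝑡^(𝑝𝑗)·𝑡^(𝑝−1)𝑑𝑡, the images of 𝑡ʲ𝑑𝑡 under the inverse Cartier operator (𝑡 ↦ 𝑡ᵖ, 𝑑𝑡 ↦ 𝑡^(𝑝−1)𝑑𝑡).

```python
def d(coeffs, p):
    """de Rham differential on F_p[t].

    coeffs[m] is the coefficient of t^m; the result lists the coefficients
    of t^e dt, using d(t^m) = m * t^(m-1) dt with m reduced mod p.
    """
    return [(m * c) % p for m, c in enumerate(coeffs)][1:]


def h1_exponents(p, n):
    """Exponents e < n such that t^e dt is not d of a monomial in F_p[t]."""
    hit = set()
    for m in range(n + 1):
        monomial = [0] * m + [1]            # the polynomial t^m
        hit |= {e for e, c in enumerate(d(monomial, p)) if c != 0}
    return set(range(n)) - hit


def inverse_cartier_exponents(p, n):
    """Exponents e < n of the forms t^(p*j) * t^(p-1) dt for j >= 0."""
    return {p * j + p - 1 for j in range(n) if p * j + p - 1 < n}


for p in (2, 3, 5, 7):
    print(p, h1_exponents(p, 30) == inverse_cartier_exponents(p, 30))
```

Only the degree-one part of the statement in one variable is tested here; the general case follows the same pattern coordinate by coordinate.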

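Now we are able to prove the proposition above:

Proof. Since the complex is bounded below, we may restrict the domain of 𝜖 to ℕ. We proceed by induction: we can find a sequence of simple augmentations $\{\epsilon_i\}_{i=1}^{\infty}$ such that $\epsilon_1 = \epsilon$ and $\epsilon_m(n) \to \infty$ as $m \to \infty$, which induces a filtration of $(\Omega^\bullet_{Y/S})_{\epsilon}$. Then it suffices to prove the quasi-isomorphism for each graded piece. We then have a commutative diagram, in which $\eta_l = \epsilon_l + 1$ and $\epsilon_{l+1}$ is the simple augmentation of $\epsilon_l$ at 𝑗:

\[ \begin{array}{ccc}
(\Omega^\bullet_{Y'/S})_{\epsilon_l} / (\Omega^\bullet_{Y'/S})_{\epsilon_{l+1}} & \to & (F_{Y/S*}\Omega^\bullet_{Y/S})_{\eta_l} / (F_{Y/S*}\Omega^\bullet_{Y/S})_{\eta_{l+1}} \\
\downarrow{\scriptstyle \cdot p^{-\epsilon_l(j)}} & & \downarrow \\
\Omega^j_{X'/k}[-j] & \xrightarrow{\ c^{-1},\ \sim\ } & \mathcal{H}^j(F_{X/k*}\Omega^\bullet_{X/k})[-j].
\end{array} \]

Hence the top arrow is an isomorphism, as the other arrows are. ◻

To globalize the result, we define $(-)_\eta$ in the derived category.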

Definition 2.34. For a cogauge 𝜂, the functor 𝐿𝜂 in the derived category of abelian
sheaves 𝒟 (𝒜 𝑏(𝑋)) is defined as follows: for a complex 𝐾, take a flat (as a ℤ -
module) resolution and then apply (−)𝜂 on it. For two cogauges 𝜂, 𝜂 ′ , with 𝜂 ′ ≤ 𝜂,
the functor 𝐿𝜂 ′ ∕𝜂 is defined by the distinguished triangle:

𝐿𝜂 → 𝐿𝜂 ′ → 𝐿𝜂 ′ ∕𝜂.

It can be proved that 𝐿𝜂 is well-defined.



Theorem 2.35. There is an isomorphism, where 𝜂(𝑖) = 𝑖:

𝑅𝑢𝑋 ′ ∕𝑆 𝒪𝑋 ′ ∕𝑆 ≃ 𝐹𝑋∕𝑆∗ 𝐿𝜂𝑅𝑢𝑋∕𝑆 𝒪𝑋∕𝑆 .

Proof. Choose such a lifting locally and glue the above local isomorphisms to-
gether by cohomological descent. ◻
Definition 2.36. For an 𝐹-crystal 𝑀 (i.e. a free W(𝑘)-module of finite rank with a semilinear Frobenius action), choose a basis; the Frobenius action is then represented by a matrix. The slopes of the action are the 𝑝-adic valuations of the eigenvalues of this matrix (the slopes are independent of the choice of basis). Let $\alpha_1 \le \cdots \le \alpha_m$ be the slopes counted with multiplicity. The Newton polygon of 𝑀 is the polygon with domain [0, 𝑚] and slope $\alpha_i$ on the interval [𝑖 − 1, 𝑖]. If $M' = \mathrm{im}(F) = \bigoplus_{i=0}^{l} p^i M_i$, the Hodge numbers of 𝑀 are $(e_0, \dots, e_l)$, the ranks of $M_0, \dots, M_l$.
Definition 2.37. The Hodge polygon with parameters $(a_0, \dots)$ is the polygon with slope 𝑖 on the interval $[a_0 + \cdots + a_{i-1},\, a_0 + \cdots + a_i]$.
Lemma 2.38. The following basic results hold:
• The Newton polygon of an 𝐹-crystal lies on or above its Hodge polygon.
• The Hodge polygon with parameters $(a_0, \dots)$ lies on or above the Hodge polygon with parameters $(b_0, \dots)$ if and only if $ia_0 + (i-1)a_1 + \cdots + a_{i-1} \le ib_0 + (i-1)b_1 + \cdots + b_{i-1}$ holds for every $i \ge 1$.
• $\ell(M'/M' \cap p^iM) = ie_0 + (i-1)e_1 + \cdots + e_{i-1}$, where ℓ denotes length.
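The polygon comparisons in Definitions 2.36 and 2.37 and Lemma 2.38 are easy to make concrete. The sketch below uses hypothetical function names and reads “lies on” as “lies on or above”; the test data are the standard slopes for 𝐻¹ of an elliptic curve, (0, 1) in the ordinary case and (1/2, 1/2) in the supersingular case, with Hodge parameters (1, 1). Both polygons have unit-length segments between integer abscissae, so comparing heights at integers suffices.

```python
from fractions import Fraction


def polygon_heights(slopes):
    """Heights at x = 0, 1, ..., m of the convex polygon whose slopes,
    sorted increasingly, are each used on an interval of length 1."""
    heights = [Fraction(0)]
    for s in sorted(slopes):
        heights.append(heights[-1] + Fraction(s))
    return heights


def hodge_slopes(params):
    """Slope i repeated params[i] times, as in the Hodge polygon."""
    return [i for i, a in enumerate(params) for _ in range(a)]


def lies_on_or_above(upper, lower):
    """Compare two polygons with the same domain at integer abscissae."""
    hu, hl = polygon_heights(upper), polygon_heights(lower)
    return len(hu) == len(hl) and all(u >= l for u, l in zip(hu, hl))


def weighted_sum_criterion(a, b):
    """The inequalities i*a_0 + ... + a_{i-1} <= i*b_0 + ... + b_{i-1}."""
    def s(params, i):
        return sum((i - j) * c for j, c in enumerate(params[:i]))
    return all(s(a, i) <= s(b, i) for i in range(1, max(len(a), len(b)) + 1))


hodge = hodge_slopes((1, 1))
print(lies_on_or_above([0, 1], hodge))                      # ordinary case
print(lies_on_or_above([Fraction(1, 2)] * 2, hodge))        # supersingular case
print(weighted_sum_criterion((1, 1), (2, 0)))
```

The last function only tests indices up to the number of parameters given, which is enough when both parameter lists have the same total length.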
We are then able to prove the Katz conjecture:
Theorem 2.39 (Katz, Mazur–Ogus). The Newton polygon of $M = H^n_{\mathrm{cris}}(X/W(k))$ lies on the Hodge polygon with parameters $a_q = h^{n-q}(X, \Omega^q_{X/k})$.

Proof. By the above facts, we are reduced to proving the following:
\[ \ell(M'/M' \cap p^iM) \le ih^0 + (i-1)h^1 + \cdots + h^{i-1}. \]
There exists a commutative diagram (where $L\eta Ru_{X/S}\mathcal{O}_{X/S}$ is simply denoted $L\eta$)

\[ \begin{array}{ccccc}
\mathbb{H}^n(X, L(\zeta_i + i)) & \to & \mathbb{H}^n(X, L\zeta_0) & \xrightarrow{q} & \mathbb{H}^n(X, L\zeta_0/(\zeta_i + i)) \\
\downarrow & & \downarrow{\scriptstyle \varphi} & & \\
\mathbb{H}^n(X, Li) & \xrightarrow{\psi} & H^n_{\mathrm{cris}}(X/W(k)), & &
\end{array} \]

where $\zeta_i(n) = \max\{0, n - i\}$. Modulo torsion, the image of 𝜑 is 𝑀′ and the image of 𝜓 is $p^iM$. Then $M'/M' \cap p^iM$ is a quotient of im 𝑞. Moreover, there is a distinguished triangle
\[ \mathbb{H}^n(X, L(\zeta_{j-1} + j - 1)/(\zeta_j + j)) \to \mathbb{H}^n(X, L\zeta_0/(\zeta_j + j)) \to \mathbb{H}^n(X, L\zeta_0/(\zeta_{j-1} + j - 1)) \]
which induces a long exact sequence of cohomology. Moreover,
\[ \mathbb{H}^n(X, L(\zeta_{j-1} + j - 1)/(\zeta_j + j)) \simeq \mathbb{H}^n(X, \tau_{\le j-1}\Omega^\bullet_{X/k}), \]
where 𝜏 denotes truncation. Then there is a spectral sequence, in which the isomorphism is given by the Cartier isomorphism:
\[ E_2^{pq}\ (q \le j - 1) = H^p(X, \Omega^q_{X/k}) \simeq H^p(X, \mathcal{H}^q(\Omega^\bullet_{X/k})) \Rightarrow \mathbb{H}^{p+q}(X, \tau_{\le j-1}\Omega^\bullet_{X/k}). \]
Thus $\dim \mathbb{H}^{p+q}(X, \tau_{\le j-1}\Omega^\bullet_{X/k}) \le h^0 + \cdots + h^{j-1}$. Combining these estimates for all 𝑗 gives the result. ◻

3 Rigid cohomology

Basic definitions and comparison results


In this section, let 𝑘 be a perfect field of characteristic 𝑝, 𝒪 = W(𝑘) its ring of Witt vectors and 𝐾 = Frac(W(𝑘)). 𝑋, 𝑌 etc. denote varieties over 𝑘 and 𝒫 a formal scheme over 𝒪; $\mathcal{P}_K$ and $\mathcal{P}_k$ denote the generic fibre and the special fibre of 𝒫. We first define convergent cohomology, as a first approach.

Definition 3.1. Let 𝑋 → 𝒫 be an immersion and $\mathrm{sp} \colon \mathcal{P}_K \to \mathcal{P}_k$ the specialization map. The tube of 𝑋 is the inverse image $\mathrm{sp}^{-1}(X) \subset \mathcal{P}_K$, denoted $]X[_{\mathcal{P}}$.

Definition 3.2. If there exists an immersion 𝑋 → 𝒫 such that $]X[_{\mathcal{P}}$ is smooth, the convergent cohomology of 𝑋 is defined as
\[ H_{\mathrm{conv}}(X) = R\Gamma(\Omega^\bullet_{]X[_{\mathcal{P}}/K}), \]
where $\Omega^\bullet_{]X[_{\mathcal{P}}/K}$ is the de Rham complex.

Proposition 3.3 (Weak fibration theorem [Stu74]). Suppose there is a commutative diagram

\[ \begin{array}{ccc} & & \mathcal{P}' \\ & \nearrow & \downarrow p \\ X & \to & \mathcal{P} \\ & & \downarrow \\ & & S, \end{array} \]

where 𝒫′ → 𝒫 is smooth in a neighbourhood of 𝑋. Then $]X[_{\mathcal{P}'} = ]X[_{\mathcal{P}} \times B^\circ(0, 1)^d$ locally on 𝑋, where 𝑑 is the relative dimension of 𝒫′ → 𝒫.

Proposition 3.4. The cohomology $H_{\mathrm{conv}}(X)$ is independent of the choice of 𝒫, which means that convergent cohomology is well-defined.

Proof. For two immersions 𝑋 → 𝒫1 and 𝑋 → 𝒫2 with 𝒫1 and 𝒫2 smooth, 𝑋 → 𝒫 = 𝒫1 × 𝒫2 maps to both of them via the projections $p_1$ and $p_2$. By the method of the Gauss–Manin filtration and the Poincaré lemma used in 2.15, one proves that
\[ Rp_{1*}\Omega^\bullet_{]X[_{\mathcal{P}}/K} \simeq \Omega^\bullet_{]X[_{\mathcal{P}_1}/K}. \]
Taking global sections gives the result. ◻

Remark 3.5. Such an immersion exists locally: an affine 𝑋 can be embedded into $\mathbb{A}^n_k$, and hence into $\hat{\mathbb{A}}^n_{\mathcal{O}}$. We may use cohomological descent to define convergent cohomology in general.

The convergent cohomology behaves badly for the non-proper varieties.


Example 3.6. Let $X = \mathbb{A}^1_k$ be the affine line and $\mathcal{P} = \mathrm{Spf}(\mathbb{Z}_p\langle T \rangle)$, where $\mathbb{Z}_p\langle T \rangle$ consists of all the convergent power series. Then $\mathcal{P}_K = \mathrm{Sp}(K\langle T \rangle)$ and $]X[_{\mathcal{P}} = \mathcal{P}_K$. By “Theorem B” in rigid geometry, all the sheaves $\Omega^n_{]X[_{\mathcal{P}}/K}$ are acyclic. Hence the convergent cohomology is the cohomology of the complex
\[ K\langle T \rangle \xrightarrow{\ d\ } K\langle T \rangle\,dT. \]
However, $H^1$ is not finite dimensional, for there are infinitely many linearly independent convergent series that no longer converge after integration.

We then introduce rigid cohomology to resolve the problem mentioned above.

Definition 3.7. A frame of 𝑋 consists of morphisms
\[ X \xrightarrow{\ j\ } Y \xrightarrow{\ i\ } \mathcal{P} \]
such that 𝑗 is an open immersion and 𝑖 a closed immersion. A proper smooth frame is a frame such that 𝑌 is proper and $]Y[_{\mathcal{P}}$ is smooth.

Definition 3.8. Let ℱ be an abelian sheaf on ]𝑌[𝒫. The overconvergent sheaf $j_X^{\dagger}\mathcal{F}$ is

$$j_X^{\dagger}\mathcal{F} = \varinjlim_{V}\; j_{V*}\, j_V^{*}\,\mathcal{F},$$

where 𝑉 runs through all the strict neighbourhoods of ]𝑋[𝒫 in ]𝑌[𝒫 and 𝑗𝑉 is the inclusion 𝑉 → ]𝑌[𝒫.

Definition 3.9. If there exists a proper smooth frame of 𝑋, the rigid cohomology of 𝑋 is

$$H^{\bullet}_{\mathrm{rig}}(X) = R\Gamma\bigl(j_X^{\dagger}\,\Omega^{\bullet}_{]Y[_{\mathcal{P}}/K}\bigr),$$

where $\Omega^{\bullet}_{]Y[_{\mathcal{P}}/K}$ is the de Rham complex.

Proposition 3.10. The rigid cohomology of 𝑋 is independent of the choice of the frame, if such a frame exists.

Proof. It is similar to the case of convergent cohomology. See chapter 6 in [Stu74] for details. ◻

Remark 3.11. A proper smooth frame of 𝑋 exists locally. That is, an affine 𝑋 can be embedded into $\mathbb{A}^n_k$ and then into $\mathbb{P}^n_k$. We may then take 𝑌 to be the closure $\bar{X} \subset \mathbb{P}^n_k$ and 𝒫 to be $\widehat{\mathbb{P}}^n_{\mathcal{O}}$. In general, we may use cohomological descent to define rigid cohomology ([Tsu03]).
The rigid cohomology resolves the problem above.
Example 3.12. Let $X = \mathbb{A}^1_k$, $Y = \mathbb{P}^1_k$, $\mathcal{P} = \widehat{\mathbb{P}}^1_{\mathcal{O}}$. Then ]𝑋[𝒫 = Sp(𝐾⟨𝑇⟩) and $]Y[_{\mathcal{P}} = \mathbb{P}^{1,\mathrm{an}}_K$. The family $\{V_\lambda = \mathrm{Sp}(K\langle \lambda T\rangle) \mid |\lambda| < 1\}$ forms a cofinal system of strict neighbourhoods of ]𝑋[𝒫 in ]𝑌[𝒫. The cohomology then becomes that of the complex

$$K\langle T\rangle^{\dagger} \xrightarrow{\;d\;} K\langle T\rangle^{\dagger}\,dT,$$

where 𝐾⟨𝑇⟩† consists of all the overconvergent series (i.e. series converging on a disc of some radius 𝜇 > 1). Then $H^1 = 0$, as the antiderivative of an overconvergent series remains overconvergent.

In general, for an affine smooth variety 𝑋 = Spec(𝐴), 𝐴 can be lifted to a finitely presented formally smooth algebra 𝒜 = 𝒪⟨𝑇1, ⋯, 𝑇𝑛⟩∕(𝑓1, ⋯, 𝑓𝑚). Then the rigid cohomology of 𝑋 becomes the cohomology of the de Rham complex $\Omega^{\bullet}_{\mathcal{A}_K^{\dagger}/K}$, where $\mathcal{A}_K^{\dagger} \simeq K\langle T_1, \dots, T_n\rangle^{\dagger}/(f_1, \dots, f_m)$. That is, the Monsky–Washnitzer cohomology.
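
The claim in Example 3.12 that integration preserves overconvergence can be checked by a short estimate. The sketch below assumes the normalisation $|p| = p^{-1}$ and describes 𝐾⟨𝑇⟩† as the series whose coefficients satisfy $|a_n| r^n \to 0$ for some $r > 1$; the symbols $r$ and $\rho$ are introduced here for illustration.

```latex
% Assumption: |p| = 1/p, so that |n+1| >= 1/(n+1) for every integer n >= 0.
% Let f = \sum a_n T^n with |a_n| r^n -> 0 for some r > 1, and let
\[
  F \;=\; \sum_{n \ge 0} \frac{a_n}{n+1}\, T^{\,n+1}
\]
% be its formal antiderivative. For any radius 1 < \rho < r,
\[
  \left| \frac{a_n}{n+1} \right| \rho^{\,n+1}
  \;\le\; (n+1)\,|a_n|\,\rho^{\,n+1}
  \;=\; \rho\,(n+1)\left(\frac{\rho}{r}\right)^{\!n} \cdot |a_n|\, r^{n}
  \;\longrightarrow\; 0 ,
\]
% because (n+1)(\rho/r)^n -> 0 and |a_n| r^n -> 0.
% Hence F converges on the disc of radius \rho > 1, i.e. F is overconvergent,
% every form f\,dT is exact, and H^1 = 0.
```

The polynomial loss $(n+1)$ coming from the denominators is absorbed by shrinking the radius from $r$ to $\rho$, which is exactly what is impossible for merely convergent series.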
Remark 3.13. The use of overconvergent series can be justified in the language of adic spaces. Indeed, ]𝑋[𝒫 is not closed in ]𝑌[𝒫 when they are viewed as adic spaces, and we should not expect a non-closed subspace to have a good cohomology theory. Thus we should consider the closure of ]𝑋[𝒫 in ]𝑌[𝒫 and work out its cohomology. Indeed, the strict neighbourhoods are neighbourhoods of the closure $\overline{]X[_{\mathcal{P}}}$ of ]𝑋[𝒫. Thus, for $i \colon \overline{]X[_{\mathcal{P}}} \to\, ]Y[_{\mathcal{P}}$,

$$H^{\bullet}_{\mathrm{rig}}(X) = R\Gamma_{]Y[_{\mathcal{P}}}\bigl(i_{*}\, i^{*}\,\Omega^{\bullet}_{]Y[_{\mathcal{P}}/K}\bigr).$$

In the proper case, convergent and rigid cohomology coincide.


Proposition 3.14. For proper 𝑋, there is a canonical isomorphism in the derived category

$$H^{\bullet}_{\mathrm{conv}}(X) \simeq H^{\bullet}_{\mathrm{rig}}(X).$$

Proof. The proposition is trivial for projective 𝑋, for there exists a closed immersion $X \to \mathbb{P}^n_k$, hence an immersion $X \to \widehat{\mathbb{P}}^n_{\mathcal{O}}$ and a frame $X \to X \to \widehat{\mathbb{P}}^n_{\mathcal{O}}$. In general, one checks that both the convergent cohomology and the rigid cohomology functors are right Kan extensions from the category of projective varieties to the category of proper varieties (a proper variety has a simplicial resolution by projective varieties). See [Tsu03] for details. ◻

We are now able to prove the comparison theorem between crystalline coho-
mology and convergent/rigid cohomology.

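Theorem 3.15. There exists a canonical isomorphism for smooth 𝑋:

$$H^{\bullet}_{\mathrm{cris}}(X/\mathrm{W}(k)) \otimes K \simeq H^{\bullet}_{\mathrm{conv}}(X).$$

Proof. By cohomological descent, it suffices to prove the statement locally. For affine 𝑋, there exists a formally smooth lift 𝒫 = 𝒳. Then ]𝑋[𝒫 = 𝒳𝐾 and

$$H^{\bullet}_{\mathrm{cris}}(X/\mathrm{W}(k)) \simeq \Omega^{\bullet}_{\mathcal{X}/\mathcal{O}}, \qquad H^{\bullet}_{\mathrm{conv}}(X) \simeq \Omega^{\bullet}_{\mathcal{X}_K/K}.$$

Hence the result follows from the fact that

$$\Omega^{\bullet}_{\mathcal{X}/\mathcal{O}} \otimes K \simeq \Omega^{\bullet}_{\mathcal{X}_K/K}.$$ ◻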

The comparison between crystalline and rigid cohomology is a direct consequence of the above two facts.

Corollary 3.16 ([Ber86]). For proper smooth 𝑋, there exists an isomorphism

$$H^{\bullet}_{\mathrm{cris}}(X/\mathrm{W}(k)) \otimes K \simeq H^{\bullet}_{\mathrm{rig}}(X).$$

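As crystalline cohomology can be defined for crystals, rigid cohomology can be defined for “overconvergent isocrystals”. As with the categorical equivalences mentioned above, the following abelian categories are equivalent (chapter 7 in [Stu74]); here 𝑋 → 𝑌 → 𝒫 is a proper smooth frame:

• Coherent $j^{\dagger}\mathcal{O}_{]Y[_{\mathcal{P}}/K}$-modules with integrable connection,

• Coherent $j^{\dagger}\mathcal{O}_{]Y[_{\mathcal{P}}/K}$-modules with stratification,

• $j^{\dagger}\mathcal{D}_{]Y[_{\mathcal{P}}/K}$-modules which are coherent as $j^{\dagger}\mathcal{O}_{]Y[_{\mathcal{P}}/K}$-modules,

• Overconvergent isocrystals, denoted by $\mathrm{Isoc}^{\dagger}(X \to Y)$. That is, the data associating to each morphism of frames

$$\begin{array}{ccccc}
X' & \longrightarrow & Y' & \longrightarrow & \mathcal{P}' \\
\downarrow & & \downarrow & & \downarrow \\
X & \longrightarrow & Y & \longrightarrow & \mathcal{P}
\end{array}$$

a coherent $j^{\dagger}\mathcal{O}_{]Y'[_{\mathcal{P}'}/K}$-module $\mathcal{E}_{\mathcal{P}'}$, satisfying the crystal relations as above.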



Moreover, those categories are “stacks” in some sense. That is, taking the last one for example, for an open covering 𝒰 = {𝑌𝑖 → 𝑌} there is an equivalence

$$\mathrm{Isoc}^{\dagger}(X \to Y) \to \varprojlim_{\mathcal{U}}\, \mathrm{Isoc}^{\dagger}(X_{\bullet} \to Y_{\bullet}),$$

where $X_{\bullet}$, $Y_{\bullet}$ denote the simplicial schemes associated with the open covering 𝒰. We can then define those concepts by descent when 𝑋 does not admit a proper smooth frame.

For an overconvergent isocrystal ℰ (with the corresponding coherent sheaf with integrable connection also denoted ℰ), we may define the rigid cohomology with coefficients

$$H^{n}_{\mathrm{rig}}(X, \mathcal{E}) := \mathbb{H}^{n}\bigl(\mathcal{E} \otimes \Omega^{\bullet}_{]Y[_{\mathcal{P}}/K}\bigr).$$

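Then there is a comparison theorem between crystalline cohomology and rigid cohomology.

Theorem 3.17. There is a functor for smooth 𝑋:

$$\{\text{coherent crystals on } (X/\mathrm{W}(k))_{\mathrm{Cris}}\} \to \mathrm{Isoc}(X)$$

and a functor for proper smooth 𝑋:

$$\{\text{coherent crystals on } (X/\mathrm{W}(k))_{\mathrm{Cris}}\} \to \mathrm{Isoc}^{\dagger}(X)$$

(Isoc(𝑋) means convergent isocrystals), denoted $\mathcal{E} \mapsto \mathcal{E}_K$, such that for smooth 𝑋,

$$R^{i}\Gamma((X/S)_{\mathrm{Cris}}, \mathcal{E}) \otimes K \simeq H^{i}_{\mathrm{conv}}(X, \mathcal{E}_K),$$

and for proper smooth 𝑋,

$$R^{i}\Gamma((X/S)_{\mathrm{Cris}}, \mathcal{E}) \otimes K \simeq H^{i}_{\mathrm{rig}}(X, \mathcal{E}_K).$$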
Proof. For the first functor, when 𝑋 admits a smooth lifting 𝒳 (this is the local case), ℰ corresponds to an 𝒪𝒳-module with an integrable connection, also denoted ℰ. Then ℰ𝐾 ∶= ℰ ⊗ 𝐾 is a coherent sheaf over 𝒳𝐾 with the induced integrable connection. The comparison then holds. We may use descent theory to treat the general case.
For the second functor, note that for projective 𝑋, there is a frame 𝑋 → 𝑋 → 𝒫 and thus Isoc†(𝑋) ≃ Isoc(𝑋). Resolving general proper smooth varieties by projective smooth varieties proves the general case. ◻
Remark 3.18. In practice, we shall use overconvergent 𝐹 -isocrystals only. That
is, overconvergent isocrystals with a compatible Frobenius action.
One can define the cohomology with compact support and cohomology sup-
ported on a closed set in rigid cohomology theory.
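Definition 3.19. Let 𝑋 → 𝑌 → 𝒫 be a proper smooth frame. View ]𝑌[𝒫 as an adic space, so that ]𝑋[𝒫 is an open subspace of it. Denote by 𝑖 ∶ ]𝑋[𝒫 → ]𝑌[𝒫 the inclusion and by 𝑗 the inclusion of its complement. The rigid cohomology with compact support is

$$H^{\bullet}_{c,\mathrm{rig}}(X) = R\Gamma_{]Y[_{\mathcal{P}}}\bigl(i_{*}\, i^{!}\,\Omega^{\bullet}_{]Y[_{\mathcal{P}}/K}\bigr) \simeq R\Gamma_{]Y[_{\mathcal{P}}}\bigl(\Omega^{\bullet}_{]Y[_{\mathcal{P}}/K} \to j_{*}\, j^{*}\,\Omega^{\bullet}_{]Y[_{\mathcal{P}}/K}\bigr).$$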

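Definition 3.20. Let 𝑋 → 𝑌 → 𝒫 be a proper smooth frame and 𝑍 a closed subvariety of 𝑋. Denote by 𝑗 ∶ ]𝑍[𝒫 → ]𝑌[𝒫 the inclusion and by 𝑖 the inclusion of its complement. The rigid cohomology supported on 𝑍 is

$$H^{\bullet}_{Z}(X) = R\Gamma_{]Y[_{\mathcal{P}}}\bigl(j_{!}\, j^{*}\,\Omega^{\bullet}_{]Y[_{\mathcal{P}}/K}\bigr) \simeq R\Gamma\bigl(\Omega^{\bullet}_{]Y[_{\mathcal{P}}/K} \to i_{*}\, i^{*}\,\Omega^{\bullet}_{]Y[_{\mathcal{P}}/K}\bigr).$$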

Then we have exact sequences for 𝑋 = 𝑈 ∪ 𝑍, with 𝑈 open and 𝑍 closed (we
shall omit the subscript “rig” if there is no confusion):

⋯ → 𝐻𝑍• (𝑋) → 𝐻 • (𝑋) → 𝐻 • (𝑈 ) → 𝐻𝑍•+1 (𝑋) → ⋯

and
⋯ → 𝐻𝑐• (𝑈 ) → 𝐻𝑐• (𝑋) → 𝐻𝑐• (𝑍) → 𝐻𝑐•+1 (𝑈 ) → ⋯ ,

and some excision theorems. For example, for 𝑍 ⊂ 𝑌 ⊂ 𝑋 being closed varieties,
there is an exact sequence

⋯ → 𝐻𝑌• (𝑋) → 𝐻𝑍• (𝑋) → 𝐻𝑍• (𝑌 ) → 𝐻𝑌•+1 (𝑋) → ⋯ .

Verification of the desired properties


We verify the axioms of a Weil cohomology theory in the remainder of this section.

Proposition 3.21 (Gysin isomorphism, [Stu74]). For a smooth variety 𝑋 and a smooth closed subvariety 𝑍 of it, if 𝑋 is liftable, there is an isomorphism

$$H^{\bullet}_{Z}(X) \simeq H^{\bullet - 2c}(Z),$$

called the Gysin isomorphism, where 𝑐 is the codimension of 𝑍 in 𝑋.

Proof. Since 𝑋 is liftable, so is 𝑍. One represents these rigid cohomology groups by de Rham cohomology and then defines the Gysin map. It can be checked that it is an isomorphism. ◻
Theorem 3.22 (Finiteness theorem, [Ber97a]). For any variety 𝑋∕𝑘 and any overconvergent 𝐹-isocrystal ℰ on it, the cohomology $H^{\bullet}_{\mathrm{rig}}(X, \mathcal{E})$ is finite dimensional.

Proof. We shall only prove the case where ℰ is trivial. First consider the case where 𝑋 is smooth. We apply induction. Consider the following two propositions:

$(a)_n$ $H^{\bullet}(X)$ is finite dimensional for all smooth varieties 𝑋 of dimension no greater than 𝑛.

$(b)_n$ $H^{\bullet}_{Z}(X)$ is finite dimensional for all varieties 𝑍 of dimension no greater than 𝑛 and all smooth 𝑋.

The proposition $(a)_0$ is clear. $(b)_0$ follows from the Gysin isomorphism. (One can choose an affine open set 𝑈 containing 𝑍; then $H^{\bullet}_{Z}(U) \simeq H^{\bullet}_{Z}(X)$ by excision, and 𝑈 is liftable.)

The implication $(b)_{n-1} \Rightarrow (a)_n$: for such 𝑋, by de Jong’s alteration theorem (theorem 4.1 in [De 96]), there is a projective smooth variety 𝑋′ and an open subset 𝑈 of it, such that 𝑝 ∶ 𝑈 → 𝑋 is proper and generically étale. There is then a dense open set 𝑈1 of 𝑋 such that $p \colon p^{-1}(U_1) \to U_1$ is étale (hence finite). By the finiteness theorem for crystalline cohomology, the cohomology of 𝑋′ is of finite dimension. By the long exact sequence above and $(b)_{n-1}$, so is the cohomology of $p^{-1}(U_1)$. There is a trace map for finite morphisms (it can be defined locally via de Rham cohomology, and then globally via gluing), which is a section (up to a scalar) of the pull-back map. Thus the cohomology of 𝑈1 is finite dimensional. By $(b)_{n-1}$ and the long exact sequence again, the cohomology of 𝑋 is finite dimensional.

The implication $(b)_{n-1}$ and $(a)_n \Rightarrow (b)_n$: one applies excision to reduce to the case where 𝑋 and 𝑍 are smooth and 𝑋 is liftable. Then apply $(a)_n$ and the Gysin isomorphism to conclude.

For general 𝑋, one concludes by choosing a simplicial resolution of 𝑋 by smooth varieties and applying cohomological descent. See [Tsu03] for details. ◻

Using similar methods, one proves that the cohomology with compact support
is of finite dimension.

Theorem 3.23 (Künneth formula, [Ber97b]). For arbitrary varieties 𝑋 and 𝑌, an overconvergent isocrystal ℰ1 on 𝑋 and an overconvergent isocrystal ℰ2 on 𝑌, the external product ℰ1 ⊠ ℰ2 is an overconvergent isocrystal on 𝑋 × 𝑌 and there is an isomorphism

$$H^{\bullet}(X, \mathcal{E}_1) \otimes H^{\bullet}(Y, \mathcal{E}_2) \simeq H^{\bullet}(X \times Y, \mathcal{E}_1 \boxtimes \mathcal{E}_2).$$

Similarly for cohomology with compact supports.

Proof. It is almost the same as the case of crystalline cohomology. One can check it by first working locally via the de Rham complex and then noticing that the map is canonical. ◻

Theorem 3.24 (Poincaré duality, [Ber97b]). For smooth 𝑋 of dimension 𝑑, there is a trace map $\mathrm{tr} \colon H^{2d}(X) \to K$ such that for an overconvergent 𝐹-isocrystal ℰ, the pairing

$$H^{\bullet}(X, \mathcal{E}) \otimes H^{2d-\bullet}_{c}(X, \mathcal{E}^{\vee}) \to H^{2d}(X) \to K$$

is perfect.

Proof. The trace map is defined in almost the same way as for crystalline cohomology. We shall prove the case where ℰ is trivial. We consider two propositions:

$(a)_n$ Poincaré duality holds for all smooth varieties of dimension no greater than 𝑛.

$(b)_n$ The pairing (𝑑 is the dimension of 𝑍)

$$H^{\bullet}_{Z}(X) \times H^{2d-\bullet}_{c}(Z) \to K$$

(the trace map can be similarly defined) is perfect for 𝑍 (possibly not smooth) of dimension no greater than 𝑛 and 𝑋 smooth.

The case $(a)_0$ is trivial. $(b)_0$ follows from the Gysin isomorphism. $(b)_{n-1} \Rightarrow (a)_n$ is done by de Jong’s alteration theorem and the Poincaré duality for crystalline cohomology (note that the trace map for finite morphisms is compatible with all those constructions). $(b)_{n-1}$ and $(a)_n \Rightarrow (b)_n$ is done by excision and the Gysin isomorphism. The whole process is similar to the proof of finiteness and we shall omit it. ◻
Remark 3.25. The case for general overconvergent 𝐹 -isocrystals ℰ can be proved
by “𝑝-adic local monodromy theorem”, which claims that after a suitable base
change, the isocrystal is unipotent, i.e., has a filtration with each subquotient be-
coming trivial. See [Ked06] for details.
Remark 3.26. The cycle class map can be defined firstly by de Rham cohomology
locally and then applying cohomological descent.
All requirements for a good cohomology theory are therefore verified. We end this article with an example.
Example 3.27. If a family of varieties over 𝑘 comes from reduction of a proper
smooth family, the cohomology of all varieties in the family has the same dimen-
sion. The family of hypersurfaces in the projective space is the case. For example,
all plane cubic curves have ℎ0 = 1, ℎ1 = 2, ℎ2 = 1 and ℎ𝑘 = 0 for 𝑘 > 2. Here
ℎ𝑘 ∶= dim 𝐻 𝑘 .
Proof. The first statement follows from the fact that Betti numbers are invariant over a family and the comparison theorem between de Rham cohomology and Betti cohomology.
The second statement follows from the fact that for any homogeneous polynomial 𝑓 ∈ 𝑘[𝑥0, …, 𝑥𝑛], there is a lift $\tilde{f} \in \mathrm{W}(k)[x_0, \dots, x_n]$ defining a smooth hypersurface of $\mathbb{P}^n_K$, as the set of singular hypersurfaces is Zariski closed in the parameter space and the set of all such lifts $\tilde{f}$ is Zariski dense in the parameter space.
The last statement follows from computations of Betti numbers of cubic curves. ◻
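
The Betti numbers quoted for cubic curves can be recovered, in the smooth case, from the genus–degree formula. The following worked computation is a sketch for a smooth plane curve of degree $d$ over ℂ; the symbols $d$ and $g$ are introduced here for illustration.

```latex
% Genus-degree formula for a smooth plane curve of degree d:
\[
  g \;=\; \frac{(d-1)(d-2)}{2} .
\]
% A smooth projective curve of genus g is a compact oriented surface, so
\[
  h^{0} = 1, \qquad h^{1} = 2g, \qquad h^{2} = 1 .
\]
% For a cubic, d = 3:
\[
  g \;=\; \frac{(3-1)(3-2)}{2} \;=\; 1,
  \qquad h^{1} \;=\; 2g \;=\; 2 .
\]
```

This reproduces the values ℎ0 = 1, ℎ1 = 2, ℎ2 = 1 stated in Example 3.27 for a smooth member of the family.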

The example illustrates that for singular varieties, rigid cohomology may not
be the “intuitive” one. For example, the first Betti number of a cuspidal cubic curve
over ℂ is 0 and the first rigid cohomology of a cuspidal cubic curve over 𝑘 is of
dimension 2.

References
[Ber74] Pierre Berthelot (1974). Cohomologie cristalline des schémas de caractéristique 𝑝 > 0. Lecture Notes in Mathematics 406. Springer.
[Ber86] Pierre Berthelot (1986). “Géométrie rigide et cohomologie des variétés algébriques de caractéristique 𝑝”. Mémoires de la Société Mathématique de France (23), 7–32.
[Ber97a] Pierre Berthelot (1997). “Finitude et pureté cohomologique en cohomologie rigide”. Inventiones mathematicae 128, 329–377.
[Ber97b] Pierre Berthelot (1997). “Dualité de Poincaré et formule de Künneth en cohomologie rigide”. Comptes Rendus de l’Académie des Sciences - Series I - Mathematics 325, 493–498.
[BO78] Pierre Berthelot and Arthur Ogus (1978). Notes on crystalline cohomology. Princeton University Press. ISBN: 9780691648323.
[De 96] Aise Johan De Jong (1996). “Smoothness, semi-stability and alterations”. Publications Mathématiques de l’IHÉS 83, 51–93.
[Ked01] Kiran Kedlaya (2001). “Counting points on hyperelliptic curves using Monsky-Washnitzer cohomology”. J. Ramanujan Math. Soc. 16, 323–338.
[Ked06] Kiran Kedlaya (2006). “Finiteness of rigid cohomology with coefficients”. Duke Math. J. 134, 15–97.
[Stacks] The Stacks Project Authors (2018). Stacks Project. https://stacks.math.columbia.edu.
[Stu74] Bernard Le Stum (1974). Rigid cohomology. Cambridge Tracts in Mathematics. Cambridge University Press.
[Tsu03] Nobuo Tsuzuki (2003). “Cohomological descent of rigid cohomology for proper coverings”. Inventiones Mathematicae 151, 101–133.
Sheaf Cohomology

He Yuxin1

ABSTRACT
We come across many cohomology theorems when we study geometry and topology. Theorems like the de Rham theorem and the Dolbeault theorem describe the connection between different cohomology theories. This article introduces axiomatic cohomology theory to give a more general view of the phenomenon. The connection does not rely on a particular cohomology theory, but is a natural consequence of propositions in homological algebra.

Contents

1 Preliminaries
Sheaves and presheaves
Tensors and exact sequences

2 Axiomatic sheaf cohomology
Existence of sheaf cohomology theories
Uniqueness

3 Examples
Alexander–Spanier cohomology
de Rham cohomology

4 Multiplicative structure

1 Preliminaries

Sheaves and presheaves


Fix 𝐾 to be a principal ideal domain.
1 He Yuxin, Class Math 72, Department of Mathematical Sciences, Tsinghua University.

Definition 1.1. A sheaf of 𝐾-modules over a topological space 𝑀 consists of a topological space 𝑆 and a projection 𝜋 ∶ 𝑆 → 𝑀 satisfying:
1. 𝜋 is a local homeomorphism and is surjective,
2. $\pi^{-1}(x)$ is a 𝐾-module for each 𝑥 ∈ 𝑀,
3. the operations for 𝐾-modules are continuous.
Here the third condition means the following: if 𝐴 = {(𝑥, 𝑦) ∈ 𝑆 × 𝑆 ∣ 𝜋(𝑥) = 𝜋(𝑦)} ⊂ 𝑆 × 𝑆 is the set of pairs of elements of 𝑆 with the same image in 𝑀, then 𝐴 → 𝑆, (𝑠1, 𝑠2) ↦ 𝑠1 − 𝑠2 is continuous, and for fixed 𝑘 ∈ 𝐾, 𝑆 → 𝑆, 𝑠 ↦ 𝑘𝑠 is continuous. It follows that 𝐴 → 𝑆, (𝑠1, 𝑠2) ↦ 𝑠1 + 𝑠2 is continuous. For a sheaf of 𝐾-algebras, the multiplication 𝐴 → 𝑆, (𝑠1, 𝑠2) ↦ 𝑠1 ∘ 𝑠2 should also be continuous.
Call $\pi^{-1}(x)$ the stalk over 𝑥. For an open subset 𝑈 ⊂ 𝑀, a continuous map 𝑓 ∶ 𝑈 → 𝑆 satisfying $\pi \circ f = \mathrm{id}_U$ is called a section on 𝑈, and the set of such sections is denoted by Γ(𝑈, 𝑆). In particular, set Γ(𝑆) = Γ(𝑀, 𝑆). Γ(𝑈, 𝑆) has a natural 𝐾-module structure and plays a role similar to that of the space $C^{\infty}(U)$ of smooth functions on 𝑈.
Notice that 𝜋 may not be a covering map. The space of germs of continuous functions on ℝ, for example, is not even a Hausdorff space.
Example 1.2. The constant sheaf 𝒢 = 𝑀 × 𝐺, where 𝐺 is a 𝐾-module with the discrete topology.
Example 1.3. The sheaf of smooth functions $C^{\infty}(M) = \bigcup_{x \in M} F_x$. Here $F_x$ is the set of germs of smooth functions at 𝑥: its elements are the equivalence classes of pairs (𝑓, 𝑈) with 𝑓 ∶ 𝑈 → ℝ smooth and 𝑥 ∈ 𝑈, where (𝑓, 𝑈) ∼ (𝑔, 𝑉) if there exists an open 𝑊 ⊂ 𝑈 ∩ 𝑉 with 𝑥 ∈ 𝑊 and $f|_W = g|_W$. The topology of $C^{\infty}(M)$ is generated by the basis of sets $\bigcup_{x \in U} \{f_x\}$, where 𝑓 ∶ 𝑈 → ℝ is smooth and $f_x$ denotes the germ of 𝑓 at 𝑥.
Definition 1.4. If 𝑆 and 𝑆′ are two sheaves over 𝑀 with projections 𝜋, 𝜋′, a continuous map 𝜑 ∶ 𝑆 → 𝑆′ is called a sheaf homomorphism if the diagram

$$\begin{array}{ccc}
S & \xrightarrow{\ \varphi\ } & S' \\
 & {\scriptstyle \pi}\searrow \quad \swarrow{\scriptstyle \pi'} & \\
 & M &
\end{array}$$

commutes and 𝜑 induces a homomorphism of 𝐾-modules on each stalk.
Definition 1.5. An open set 𝑅 ⊂ 𝑆 is called a subsheaf if $R_x = R \cap S_x$ is a 𝐾-submodule of $S_x$ for each 𝑥. It is also a sheaf over 𝑀.
For a sheaf homomorphism 𝜑 ∶ 𝑆 → 𝑆′, taking the kernel and the image at each stalk and then the union over all stalks gives subsheaves ker(𝜑) of 𝑆 and im(𝜑) of 𝑆′. If 𝜑 is injective, identify 𝑆 with 𝜑(𝑆) and consider 𝑆 as a subsheaf of 𝑆′. The sheaf homomorphism 𝜑 ∶ 𝑆 → 𝑆′ induces a sheaf isomorphism $S/\ker(\varphi) \to \operatorname{im}(\varphi)$.
If 𝑆 is a subsheaf of 𝑆′, define

$$J = \bigcup_{x \in M} S'_x / S_x,$$

and equip it with the quotient topology. 𝐽 is called the quotient sheaf of 𝑆′ modulo 𝑆.
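Definition 1.6. An exact sequence of sheaves is a sequence of sheaves and homomorphisms

$$\cdots \longrightarrow S^{i-1} \xrightarrow{\ \varphi^{i-1}\ } S^{i} \xrightarrow{\ \varphi^{i}\ } S^{i+1} \xrightarrow{\ \varphi^{i+1}\ } \cdots$$

with $\operatorname{im}(\varphi^{i}) = \ker(\varphi^{i+1})$. Short exact sequences are of the form

$$0 \longrightarrow R \longrightarrow S \longrightarrow T \longrightarrow 0,$$

with 0 the constant sheaf 𝑀 × 𝐺, where 𝐺 is the trivial 𝐾-module.
The sequence is exact if and only if for each 𝑥 ∈ 𝑀, the sequence

$$\cdots \longrightarrow S^{i-1}_x \xrightarrow{\ \varphi^{i-1}_x\ } S^{i}_x \xrightarrow{\ \varphi^{i}_x\ } S^{i+1}_x \xrightarrow{\ \varphi^{i+1}_x\ } \cdots$$

is exact. Here $\varphi^{i}_x$ is the induced 𝐾-module homomorphism on stalks.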


Sheaves can also be defined as particular presheaves. The two definitions are equivalent.
Definition 1.7. A presheaf of 𝐾-modules $P = \{S_U; \rho_{U,V}\}$ consists of a 𝐾-module $S_U$ for each open set 𝑈 ⊂ 𝑀 and a 𝐾-module homomorphism $\rho_{U,V} \colon S_V \to S_U$ for each 𝑈 ⊂ 𝑉, satisfying $\rho_{U,U} = \mathrm{id}$ and $\rho_{U,W} = \rho_{U,V} \circ \rho_{V,W}$ for 𝑈 ⊂ 𝑉 ⊂ 𝑊. These are modelled on the set of continuous functions on 𝑈 and the restriction maps.
A morphism from 𝑃 to 𝑃′ consists of 𝐾-module morphisms $\varphi_U \colon S_U \to S'_U$ for each open subset 𝑈 such that 𝜑 is compatible with 𝜌, that is to say, the following diagram commutes for every 𝑈 ⊂ 𝑉:

$$\begin{array}{ccc}
S_V & \xrightarrow{\ \rho_{U,V}\ } & S_U \\
\downarrow{\scriptstyle \varphi_V} & & \downarrow{\scriptstyle \varphi_U} \\
S'_V & \xrightarrow{\ \rho'_{U,V}\ } & S'_U
\end{array}$$

If 𝑆 is a sheaf, then $\{\Gamma(U, S); \rho_{U,V}\}$ is a presheaf with 𝜌 the usual restriction map. Denote this presheaf by 𝛼(𝑆).
If $P = \{S_U; \rho_{U,V}\}$ is a presheaf of 𝐾-modules over 𝑀, define a sheaf $S = \bigcup_{x \in M} S_x$ as follows. Let $S_x = \bigl(\bigcup_{x \in U} S_U\bigr)/\sim$, where $f \in S_U$ and $g \in S_V$ satisfy 𝑓 ∼ 𝑔 if and only if there exists some open 𝑊 with 𝑥 ∈ 𝑊 ⊂ 𝑈 ∩ 𝑉 such that $\rho_{W,U} f = \rho_{W,V} g$, similar to the construction of germs of functions. The projection 𝜋 sends $[f] \in S_x$ to 𝑥, where $f \in S_U$. The fibre $\pi^{-1}(x) = S_x$ is a 𝐾-module via $k[f] = [kf]$ and $[f_1] + [f_2] = [\rho_{W,U} f_1 + \rho_{W,V} f_2]$, where $f_1 \in S_U$, $f_2 \in S_V$ and 𝑥 ∈ 𝑊 ⊂ 𝑈 ∩ 𝑉. Denote the canonical map $S_U \to S_x$ for 𝑥 ∈ 𝑈 by $\rho_{x,U}$; then the topology of 𝑆 is the one generated by the basis $\{O_f = \{\rho_{x,U} f \mid x \in U\} \mid f \in S_U\}$. 𝑆 is a sheaf, denoted by 𝛽(𝑃).
Note that in some references a presheaf is called a sheaf if it satisfies the following two conditions, where for 𝑈 ⊂ 𝑉 and $f \in S_V$ we write $f|_U$ for $\rho_{U,V} f$:

1. if $U = \bigcup_{i \in I} U_i$ and $f_i \in S_{U_i}$ with $f_i|_{U_i \cap U_j} = f_j|_{U_i \cap U_j}$, then there exists $f \in S_U$ satisfying $f|_{U_i} = f_i$;

2. if $U = \bigcup_{i \in I} U_i$ and $f, g \in S_U$ satisfy $f|_{U_i} = g|_{U_i}$ for all 𝑖, then 𝑓 = 𝑔.

If a presheaf $P = \{S_U; \rho_{U,V}\}$ satisfies the two conditions, we say that the presheaf is complete.
Notice that a sheaf homomorphism 𝜑 ∶ 𝑆 → 𝑆′ induces a presheaf homomorphism $\varphi_{*} \colon \alpha(S) \to \alpha(S')$, and $(\psi \circ \varphi)_{*} = \psi_{*} \circ \varphi_{*}$. Besides, a presheaf homomorphism $\{\varphi_U\}$ also induces a sheaf homomorphism $\varphi_{*}$ between 𝛽(𝑃) and 𝛽(𝑃′).

If 𝑆 is a sheaf, then the canonical map

$$\beta(\alpha(S)) \to S, \qquad \xi = \rho_{p,U} f \mapsto f(p), \quad \text{where } f \in \Gamma(U, S),$$

is a sheaf isomorphism. The inverse map exists because 𝜋 is a local homeomorphism.
If 𝑃 is merely a presheaf, 𝑃 and 𝛼(𝛽(𝑃)) may not be isomorphic. However, the following theorem holds.

Theorem 1.8. 𝑃 and 𝛼(𝛽(𝑃)) are isomorphic if and only if 𝑃 is complete.

The map in question is

$$S_U \to \Gamma(U, \beta(P)), \qquad f \mapsto \bigl(p \mapsto \rho_{p,U} f\bigr).$$

Tensors and exact sequences


Definition 1.9. For two presheaves $P = \{S_U; \rho_{U,V}\}$ and $P' = \{S'_U; \rho'_{U,V}\}$, their tensor presheaf is defined as

$$P \otimes P' = \{S_U \otimes S'_U;\ \rho_{U,V} \otimes \rho'_{U,V}\},$$

and for two sheaves 𝑆 and 𝑆′, their tensor sheaf is defined as

$$S \otimes S' = \beta(\alpha(S) \otimes \alpha(S')).$$

By the construction, we have $(S \otimes S')_x \simeq S_x \otimes S'_x$.


The tensor product of sheaf homomorphisms is defined canonically: 𝜑 ∶ 𝑆 → 𝑇 and 𝜓 ∶ 𝑆′ → 𝑇′ induce 𝜑 ⊗ 𝜓 ∶ 𝑆 ⊗ 𝑆′ → 𝑇 ⊗ 𝑇′, and $(\varphi \otimes \psi)|_{(S \otimes S')_x} \simeq \varphi|_{S_x} \otimes \psi|_{S'_x}$.
Similarly, define the direct sum of presheaves and of sheaves as

$$P \oplus P' = \{S_U \oplus S'_U;\ \rho_{U,V} \oplus \rho'_{U,V}\}, \qquad S \oplus S' = \beta(\alpha(S) \oplus \alpha(S')).$$

Recall that smooth manifolds always admit partitions of unity. For sheaves there is a more general concept.
Definition 1.10. A sheaf 𝑆 over 𝑀 is a fine sheaf if for any locally finite open covering $\{U_i\}$ of 𝑀, there exist endomorphisms $\{l_i\}$ of 𝑆 satisfying $\operatorname{supp}(l_i) \subset U_i$ and $\sum_i l_i = \mathrm{id}_S$. Here $\operatorname{supp}(l_i)$ is the closure of the set of points 𝑥 where $l_i|_{S_x}$ is not zero. $\{l_i\}$ is called a partition of unity of the sheaf 𝑆 subordinate to the covering $\{U_i\}$ of 𝑀.
It can be checked that if 𝑆 or 𝑆′ is fine then 𝑆 ⊗ 𝑆′ is fine: it suffices to take the tensor products of the endomorphisms $l_i$ with the identity of the other factor.
Notice that when

$$0 \longrightarrow S' \longrightarrow S \longrightarrow S'' \longrightarrow 0$$

is a short exact sequence, the sequence

$$0 \longrightarrow \Gamma(S') \longrightarrow \Gamma(S) \longrightarrow \Gamma(S'') \longrightarrow 0$$

is not always exact. In fact, the sequence may fail to be exact at Γ(𝑆′′). We only have the following exact sequence:

$$0 \longrightarrow \Gamma(S') \longrightarrow \Gamma(S) \longrightarrow \Gamma(S'').$$

How the sequence differs from an exact sequence will be seen by extending it to a long exact sequence.
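
A standard illustration of the failure of exactness at Γ(𝑆′′), not taken from the source, is the following sketch. It assumes 𝐾 = ℤ and 𝑀 the circle ℝ∕ℤ; the sheaves $\mathcal{C}$ and $\mathcal{T}$ are names introduced here.

```latex
% Assumptions: K = \mathbb{Z}, M = \mathbb{R}/\mathbb{Z} (the circle).
% \mathcal{C} : germs of continuous functions M -> \mathbb{R},
% \mathcal{T} : germs of continuous maps    M -> \mathbb{R}/\mathbb{Z},
% and \underline{\mathbb{Z}} = M \times \mathbb{Z} the constant sheaf.
\[
  0 \longrightarrow \underline{\mathbb{Z}} \longrightarrow \mathcal{C}
    \xrightarrow{\ f \,\mapsto\, f \bmod \mathbb{Z}\ } \mathcal{T}
    \longrightarrow 0
\]
% This is exact on stalks: a continuous map to \mathbb{R}/\mathbb{Z} lifts
% to \mathbb{R} on any small arc. On global sections, however,
\[
  \mathrm{id}_M \in \Gamma(\mathcal{T})
  \quad\text{is not the image of any } f \in \Gamma(\mathcal{C}),
\]
% since a map of the form f mod \mathbb{Z} with f : M -> \mathbb{R} continuous
% has degree 0, while \mathrm{id}_M has degree 1. Hence
\[
  \Gamma(\mathcal{C}) \longrightarrow \Gamma(\mathcal{T})
  \quad\text{is not surjective.}
\]
```

In this example the kernel sheaf is the constant sheaf, which is not fine, so there is no conflict with the surjectivity criterion for fine kernels.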
Theorem 1.11. If there is a short exact sequence

$$0 \longrightarrow R \longrightarrow S \longrightarrow T \longrightarrow 0,$$

where 𝑅 = ker(𝑆 → 𝑇) is a fine sheaf, then it induces the short exact sequence

$$0 \longrightarrow \Gamma(R) \xrightarrow{\ i\ } \Gamma(S) \xrightarrow{\ \varphi\ } \Gamma(T) \longrightarrow 0.$$

Proof. The exactness at Γ(𝑅) is checked pointwise. The exactness at Γ(𝑆) follows from the definition of the kernel sheaf. So it suffices to check surjectivity at Γ(𝑇). Let 𝑡 ∈ Γ(𝑇). Since sheaves are locally homeomorphic to 𝑀, there exist a locally finite open covering $\{U_i\}$ of 𝑀 and sections $s_i \in \Gamma(U_i, S)$ such that $\varphi \circ s_i = t|_{U_i}$. Set $s_{ij} = s_i - s_j$ over $U_i \cap U_j$; this is a section of 𝑅. Let $\{l_i\}$ be a partition of unity of 𝑅 subordinate to $\{U_i\}$, and set

$$s'_i = \sum_j l_j \circ s_{ij},$$

where $l_j \circ s_{ij}$, extended by zero, can be seen as an element of $\Gamma(U_i, S)$.
On $U_i \cap U_j$ the equation

$$s'_i - s'_j = \sum_k l_k \circ (s_{ik} - s_{jk}) = \sum_k l_k \circ s_{ij} = s_i - s_j$$

holds, so by the gluing lemma there exists some 𝑠 ∈ Γ(𝑆) with $s|_{U_i} = s_i - s'_i$, and then 𝜑 ∘ 𝑠 = 𝑡. ◻
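
The two identities used at the end of the proof of Theorem 1.11 can be spelled out as follows. This is a worked verification in the notation of the proof, with no new assumptions beyond $\sum_k l_k = \mathrm{id}$ and $\varphi \circ l_k = 0$ (the $l_k$ take values in 𝑅 = ker 𝜑).

```latex
% Step 1: the differences s_{ik} - s_{jk} do not depend on k.
\[
  s_{ik} - s_{jk} \;=\; (s_i - s_k) - (s_j - s_k) \;=\; s_i - s_j \;=\; s_{ij}.
\]
% Step 2: sum against the partition of unity (\sum_k l_k = id).
\[
  s'_i - s'_j \;=\; \sum_k l_k \circ (s_{ik} - s_{jk})
            \;=\; \Bigl(\sum_k l_k\Bigr) \circ s_{ij}
            \;=\; s_i - s_j .
\]
% Step 3: hence the local sections s_i - s'_i agree on overlaps,
\[
  (s_i - s'_i) - (s_j - s'_j) \;=\; (s_i - s_j) - (s'_i - s'_j) \;=\; 0
  \quad \text{on } U_i \cap U_j ,
\]
% and glue to a global section s. Since s'_i is a section of R = \ker\varphi,
\[
  \varphi \circ s\,|_{U_i} \;=\; \varphi \circ s_i - \varphi \circ s'_i
  \;=\; t|_{U_i} - 0 \;=\; t|_{U_i}.
\]
```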
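For the tensor product, consider the tensor functor.
Proposition 1.12. If there is a short exact sequence

$$0 \longrightarrow S' \longrightarrow S \longrightarrow S'' \longrightarrow 0$$

and either 𝑆′′ or 𝑇 is torsionless, then it induces the short exact sequence

$$0 \longrightarrow S' \otimes T \longrightarrow S \otimes T \longrightarrow S'' \otimes T \longrightarrow 0.$$

Here torsionless means that all the stalks are torsionless 𝐾-modules.

The exactness of

$$S' \otimes T \longrightarrow S \otimes T \longrightarrow S'' \otimes T \longrightarrow 0$$

holds for all sheaves, since

$$A \xrightarrow{\ u\ } B \xrightarrow{\ v\ } C \longrightarrow 0$$

is exact if and only if

$$0 \longrightarrow \mathrm{Hom}(C, N) \xrightarrow{\ v^{*}\ } \mathrm{Hom}(B, N) \xrightarrow{\ u^{*}\ } \mathrm{Hom}(A, N)$$

is exact for all 𝐾-modules 𝑁, together with the fact that $\mathrm{Hom}(A \otimes B, N) \simeq \mathrm{Hom}(A, \mathrm{Hom}(B, N))$.
As for the injectivity of 𝑆′ ⊗ 𝑇 → 𝑆 ⊗ 𝑇: since the tensor product commutes with direct limits, it suffices to consider the case of finitely generated torsion-free 𝐾-modules, and all such modules have the form $K^{\oplus n}$, for which the injectivity is clear.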
Theorem 1.13. If

$$0 \longrightarrow S' \longrightarrow S \longrightarrow S'' \longrightarrow 0$$

is an exact sequence of sheaves of 𝐾-modules over 𝑀, 𝑇 is a sheaf of 𝐾-modules over 𝑀, and 𝑇 or 𝑆′′ is torsionless, then there is an exact sequence

$$0 \longrightarrow S' \otimes T \longrightarrow S \otimes T \longrightarrow S'' \otimes T \longrightarrow 0.$$

If, in addition, either 𝑇 or 𝑆′ is a fine sheaf, then there is an exact sequence

$$0 \longrightarrow \Gamma(S' \otimes T) \longrightarrow \Gamma(S \otimes T) \longrightarrow \Gamma(S'' \otimes T) \longrightarrow 0.$$

2 Axiomatic sheaf cohomology


Definition 2.1. A sheaf cohomology theory ℋ for 𝑀 with coefficients in sheaves of 𝐾-modules over 𝑀 consists of
1. A 𝐾-module $H^{q}(M, S)$ for each sheaf 𝑆 and each integer 𝑞, called the 𝑞-th cohomology module of 𝑀 with coefficients in the sheaf 𝑆 relative to the cohomology theory ℋ,
2. A homomorphism $H^{q}(M, S) \to H^{q}(M, S')$ for each sheaf homomorphism 𝑆 → 𝑆′ and each integer 𝑞,
3. A homomorphism $H^{q}(M, S'') \to H^{q+1}(M, S')$ for each 𝑞 and each short exact sequence 0 → 𝑆′ → 𝑆 → 𝑆′′ → 0,
and they should satisfy:
1. $H^{q}(M, S) = 0$ for 𝑞 < 0.
2. There are natural isomorphisms $H^{0}(M, S) \simeq \Gamma(S)$, such that for each homomorphism 𝑆 → 𝑆′ the following diagram commutes:

$$\begin{array}{ccc}
H^{0}(M, S) & \xrightarrow{\ \simeq\ } & \Gamma(S) \\
\downarrow & & \downarrow \\
H^{0}(M, S') & \xrightarrow{\ \simeq\ } & \Gamma(S').
\end{array}$$

3. $H^{q}(M, S) = 0$ for all 𝑞 > 0 if 𝑆 is fine.

4. For each short exact sequence 0 → 𝑆′ → 𝑆 → 𝑆′′ → 0, there is the long exact sequence

$$\cdots \to H^{q}(M, S') \to H^{q}(M, S) \to H^{q}(M, S'') \to H^{q+1}(M, S') \to \cdots.$$

5. id ∶ 𝑆 → 𝑆 induces $\mathrm{id} \colon H^{q}(M, S) \to H^{q}(M, S)$.

6. The commutative diagram

$$\begin{array}{ccc}
S' & \longrightarrow & S \\
 & \searrow & \downarrow \\
 & & S''
\end{array}$$

induces the commutative diagram

$$\begin{array}{ccc}
H^{q}(M, S') & \longrightarrow & H^{q}(M, S) \\
 & \searrow & \downarrow \\
 & & H^{q}(M, S'').
\end{array}$$

7. A homomorphism of short exact sequences

$$\begin{array}{ccccccccc}
0 & \to & S' & \to & S & \to & S'' & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \to & T' & \to & T & \to & T'' & \to & 0
\end{array}$$

induces the commutative diagram

$$\begin{array}{ccc}
H^{q}(M, S'') & \longrightarrow & H^{q+1}(M, S') \\
\downarrow & & \downarrow \\
H^{q}(M, T'') & \longrightarrow & H^{q+1}(M, T').
\end{array}$$

Existence of sheaf cohomology theories


Definition 2.2. A resolution of a sheaf 𝐴 is an exact sequence of sheaves

$$0 \to A \to C^{0} \to C^{1} \to \cdots.$$

If each $C^{i}$ is fine (resp. torsionless), then the resolution is called a fine resolution (resp. torsionless resolution).
For a resolution $C^{*}$ and a sheaf 𝑆, consider the cochain complex

$$0 \to \Gamma(C^{0} \otimes S) \to \Gamma(C^{1} \otimes S) \to \cdots,$$

denoted by $\Gamma(C^{*} \otimes S)$. The homomorphisms are induced by tensoring with id ∶ 𝑆 → 𝑆.
A homomorphism 𝑆 → 𝑆′ induces $\Gamma(C^{*} \otimes S) \to \Gamma(C^{*} \otimes S')$, which is a cochain homomorphism.
If there is a fine and torsionless resolution $C^{*}$ of the constant sheaf 𝒦 = 𝑀 × 𝐾, then it induces a sheaf cohomology theory determined by $H^{q}(\Gamma(C^{*} \otimes S))$. More precisely, we define it as follows:
1. $H^{q}(M, S) = H^{q}(\Gamma(C^{*} \otimes S))$,
2. $H^{q}(M, S) \to H^{q}(M, S')$ for 𝑆 → 𝑆′ is induced by $H^{q}(\Gamma(C^{*} \otimes S)) \to H^{q}(\Gamma(C^{*} \otimes S'))$,
3. $H^{q}(M, S'') \to H^{q+1}(M, S')$ for 0 → 𝑆′ → 𝑆 → 𝑆′′ → 0 is induced by applying the snake lemma to the sequence of cochain maps

$$0 \to \Gamma(C^{*} \otimes S') \to \Gamma(C^{*} \otimes S) \to \Gamma(C^{*} \otimes S'') \to 0,$$

which is exact since the resolution is fine and torsionless (Theorem 1.13).

To show that this gives a cohomology theory, it suffices to check that if 𝑆 is fine and 𝑞 > 0 then $H^{q}(M, S) = 0$, and that $H^{0}(M, S) = \Gamma(S)$.
Set $L^{q} = \ker(C^{q} \to C^{q+1})$. Then the short exact sequence

$$0 \to L^{q} \to C^{q} \to L^{q+1} \to 0$$

induces the short exact sequence

$$0 \to L^{q} \otimes S \to C^{q} \otimes S \to L^{q+1} \otimes S \to 0,$$

since $L^{q}$ is torsion-free. Therefore we have the exact sequence

$$0 \to \Gamma(L^{q} \otimes S) \to \Gamma(C^{q} \otimes S) \to \Gamma(L^{q+1} \otimes S),$$

which means

$$\ker\bigl(\Gamma(C^{q} \otimes S) \to \Gamma(C^{q+1} \otimes S)\bigr) = \ker\bigl(\Gamma(C^{q} \otimes S) \to \Gamma(L^{q+1} \otimes S)\bigr) = \Gamma(L^{q} \otimes S).$$

So

$$H^{q}(M, S) = \Gamma(L^{q} \otimes S) / \operatorname{Im}\bigl(\Gamma(C^{q-1} \otimes S)\bigr).$$

If 𝑆 is fine, then there is an exact sequence

$$0 \to \Gamma(L^{q-1} \otimes S) \to \Gamma(C^{q-1} \otimes S) \to \Gamma(L^{q} \otimes S) \to 0.$$

So $H^{q}(M, S) = 0$ for 𝑞 > 0 if 𝑆 is fine.
To compute $H^{0}(M, S)$, note that the exactness of $0 \to \mathcal{K} = M \times K \to C^{0} \to C^{1} \to \cdots$ means $\mathcal{K} \simeq L^{0}$, so $\Gamma(S) \simeq \Gamma(\mathcal{K} \otimes S) \simeq \Gamma(L^{0} \otimes S) = H^{0}(M, S)$.

Uniqueness
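Having proved the existence of a cohomology theory using a resolution of the constant sheaf, it remains to prove that it is unique up to isomorphism. First we must make precise what a homomorphism between cohomology theories is.
Definition 2.3. For two sheaf cohomology theories ℋ and $\tilde{\mathcal{H}}$, a homomorphism between them consists of a homomorphism $H^{q}(M, S) \to \tilde{H}^{q}(M, S)$ for each 𝑆 and 𝑞, such that the following diagrams commute.
1.

$$\begin{array}{ccc}
H^{0}(M, S) & \xrightarrow{\ \simeq\ } & \Gamma(S) \\
\downarrow & & \downarrow{\scriptstyle \mathrm{id}} \\
\tilde{H}^{0}(M, S) & \xrightarrow{\ \simeq\ } & \Gamma(S).
\end{array}$$

Here the isomorphisms are the canonical ones defined before.

2. For each homomorphism 𝑆 → 𝑇,

$$\begin{array}{ccc}
H^{q}(M, S) & \longrightarrow & H^{q}(M, T) \\
\downarrow & & \downarrow \\
\tilde{H}^{q}(M, S) & \longrightarrow & \tilde{H}^{q}(M, T).
\end{array}$$

3. For each exact sequence 0 → 𝑆′ → 𝑆 → 𝑆′′ → 0,

$$\begin{array}{ccc}
H^{q}(M, S'') & \longrightarrow & H^{q+1}(M, S') \\
\downarrow & & \downarrow \\
\tilde{H}^{q}(M, S'') & \longrightarrow & \tilde{H}^{q+1}(M, S').
\end{array}$$

We claim that there is always a unique homomorphism between two cohomology theories. If this holds, then every cohomology theory is isomorphic to the one given by a fine torsionless resolution, provided that such a resolution exists. Indeed, for homomorphisms $\varphi \colon \mathcal{H} \to \tilde{\mathcal{H}}$ and $\psi \colon \tilde{\mathcal{H}} \to \mathcal{H}$, applying the uniqueness to the triangles

$$\begin{array}{ccc}
\mathcal{H} & \xrightarrow{\ \mathrm{id}\ } & \mathcal{H} \\
 & {\scriptstyle \varphi}\searrow \quad \nearrow{\scriptstyle \psi} & \\
 & \tilde{\mathcal{H}} &
\end{array}
\qquad \text{and} \qquad
\begin{array}{ccc}
\tilde{\mathcal{H}} & \xrightarrow{\ \mathrm{id}\ } & \tilde{\mathcal{H}} \\
 & {\scriptstyle \psi}\searrow \quad \nearrow{\scriptstyle \varphi} & \\
 & \mathcal{H} &
\end{array}$$

shows that 𝜓 ∘ 𝜑 and 𝜑 ∘ 𝜓 are the identities.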

Lemma 2.4. For each sheaf 𝑆, there exist a fine sheaf $S_0$ and a sheaf $\bar{S}$ with the short exact sequence

$$0 \to S \to S_0 \to \bar{S} \to 0.$$

Proof. Set $S_0$ to be the sheaf of germs of discontinuous sections of 𝑆, and set $\bar{S}$ to be the quotient sheaf $S_0 / S$. To construct $S_0$, let $S_U$ be the set of not necessarily continuous sections over 𝑈, that is, maps 𝑓 ∶ 𝑈 → 𝑆 with 𝜋 ∘ 𝑓 = id. This forms a presheaf, and we let $S_0$ be the associated sheaf.
It suffices to prove that $S_0$ is a fine sheaf. Let $\{U_i\}$ be a locally finite open covering; then there exists a refined open covering $\{V_i\}$ with $\bar{V}_i \subset U_i$. Let $\chi_{V_i}$ be the characteristic function of $V_i$, and assign to every 𝑥 ∈ 𝑀 one of the sets $V_i$ containing it, denoted by $V_{i_x}$. Set $\psi_i(x) = \chi_{V_i}(x)\,\delta_{i_x, i}$; then $\sum_i \psi_i = 1$. Now define the presheaf endomorphism $\tilde{l}_i(s)(m) = \psi_i(m)\, s(m)$; it induces a sheaf endomorphism $l_i$ of $S_0$. The $l_i$ form the desired partition of unity.
The injection $S \to S_0$ is given by the local homeomorphism at every 𝑠 ∈ 𝑆: a point of 𝑆 is sent to the germ of the local continuous section through it. This is clearly injective and a sheaf homomorphism. ◻
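
Combined with the axioms of Definition 2.1, Lemma 2.4 gives the usual dimension-shifting isomorphisms. The following derivation is a sketch valid for any cohomology theory in the sense of Definition 2.1, using only the long exact sequence and the vanishing on fine sheaves.

```latex
% Long exact sequence of 0 -> S -> S_0 -> \bar S -> 0, with S_0 fine:
\[
  \Gamma(S_0) \to \Gamma(\bar S) \to H^{1}(M,S) \to H^{1}(M,S_0) = 0 ,
\]
% so the first cohomology is a cokernel of global sections:
\[
  H^{1}(M,S) \;\simeq\; \Gamma(\bar S) \,/\, \operatorname{im}\bigl(\Gamma(S_0)\bigr).
\]
% For q >= 2 the neighbouring terms vanish,
\[
  0 = H^{q-1}(M,S_0) \to H^{q-1}(M,\bar S) \to H^{q}(M,S) \to H^{q}(M,S_0) = 0 ,
\]
% hence the connecting homomorphism is an isomorphism:
\[
  H^{q}(M,S) \;\simeq\; H^{q-1}(M,\bar S) \qquad (q \ge 2).
\]
```

Iterating the second isomorphism reduces every $H^{q}$ to an $H^{1}$, and hence to global sections, which is why the values of a cohomology theory are forced by the axioms.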

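Theorem 2.5. For two cohomology theories ℋ and $\tilde{\mathcal{H}}$, there exists a unique homomorphism $\mathcal{H} \to \tilde{\mathcal{H}}$.

Proof. By the first condition in the definition of a homomorphism of cohomology theories, the diagram

$$\begin{array}{ccc}
H^{0}(M, S) & \xrightarrow{\ \simeq\ } & \Gamma(S) \\
\downarrow & & \downarrow{\scriptstyle \mathrm{id}} \\
\tilde{H}^{0}(M, S) & \xrightarrow{\ \simeq\ } & \Gamma(S)
\end{array}$$

commutes, so $H^{0}(M, S) \to \tilde{H}^{0}(M, S)$ is uniquely determined by this commutative diagram.
The commutative diagram

$$\begin{array}{ccccccc}
\Gamma(S_0) & \to & \Gamma(\bar{S}) & \to & H^{1}(M, S) & \to & 0 \\
\downarrow{\scriptstyle \mathrm{id}} & & \downarrow{\scriptstyle \mathrm{id}} & & \downarrow & & \\
\Gamma(S_0) & \to & \Gamma(\bar{S}) & \to & \tilde{H}^{1}(M, S) & \to & 0
\end{array}$$

implies the uniqueness of $H^{1}(M, S) \to \tilde{H}^{1}(M, S)$.
For 𝑞 ≥ 2, the diagram

$$\begin{array}{ccccccc}
0 & \to & H^{q-1}(M, \bar{S}) & \to & H^{q}(M, S) & \to & 0 \\
 & & \downarrow & & \downarrow & & \\
0 & \to & \tilde{H}^{q-1}(M, \bar{S}) & \to & \tilde{H}^{q}(M, S) & \to & 0
\end{array}$$

implies the uniqueness of $H^{q}(M, S) \to \tilde{H}^{q}(M, S)$, provided that $H^{q-1}(M, \bar{S}) \to \tilde{H}^{q-1}(M, \bar{S})$ is unique. By induction, the uniqueness of the homomorphism between two cohomology theories follows.
Now focus on the existence of the homomorphism.
In fact, the above three commutative diagrams uniquely determine all the homomorphisms $H^{q}(M, S) \to \tilde{H}^{q}(M, S)$. For 𝑞 ≤ 0 and 𝑞 ≥ 2 this is clear. For 𝑞 = 1 and any $a \in H^{1}(M, S)$, choose $b \in \Gamma(\bar{S})$ which maps to 𝑎 in the first row. Then the image of 𝑎 under $H^{1}(M, S) \to \tilde{H}^{1}(M, S)$ is defined as the image of 𝑏 in the second row. This is well defined, since for another 𝑏′ with image 𝑎, the difference 𝑏 − 𝑏′ lies in the image of $\Gamma(S_0)$.
We then check that this is a homomorphism of cohomology theories. The first commutative diagram holds automatically.
The second commutative diagram can be proved by induction.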

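For 𝑞 ≤ 0 it holds automatically. For 𝑞 > 0, consider the commutative cube whose two horizontal faces are

$$\begin{array}{ccccc}
H^{q-1}(M, \bar{S}) & \to & H^{q}(M, S) & \to & 0 \\
\downarrow & & \downarrow & & \\
\tilde{H}^{q-1}(M, \bar{S}) & \to & \tilde{H}^{q}(M, S) & \to & 0
\end{array}
\qquad \text{and} \qquad
\begin{array}{ccccc}
H^{q-1}(M, \bar{T}) & \to & H^{q}(M, T) & \to & 0 \\
\downarrow & & \downarrow & & \\
\tilde{H}^{q-1}(M, \bar{T}) & \to & \tilde{H}^{q}(M, T) & \to & 0,
\end{array}$$

and whose vertical edges, from each vertex of the first square to the corresponding vertex of the second, are induced by 𝑆 → 𝑇 and $\bar{S} \to \bar{T}$.

By the induction hypothesis, the other five faces of the cube are commutative; since $H^{q-1}(M, \bar{S}) \to H^{q}(M, S)$ is surjective, the desired right face of the cube is also commutative. Notice that 𝑆 → 𝑇 induces the commutative diagram

$$\begin{array}{ccccccccc}
0 & \to & S & \to & S_0 & \to & \bar{S} & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \to & T & \to & T_0 & \to & \bar{T} & \to & 0.
\end{array}$$

The commutativity of the first square is easy to see, and then the map $\bar{S} \to \bar{T}$ can be constructed canonically.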
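For the third commutative diagram, the short exact sequence $0 \to R \to S \to T \to 0$ can be extended to the commutative diagram

\[
\begin{CD}
0 @>>> R @>>> S @>>> T @>>> 0 \\
@. @VV{\mathrm{id}}V @VVV @. @. \\
0 @>>> R @>>> S_0 @>>> G @>>> 0 \\
@. @AA{\mathrm{id}}A @AAA @. @. \\
0 @>>> R @>>> R_0 @>>> \bar R @>>> 0,
\end{CD}
\]

where $G = S_0/R$.
The diagram can be extended to

\[
\begin{CD}
0 @>>> R @>>> S @>>> T @>>> 0 \\
@. @VV{\mathrm{id}}V @VVV @VVV @. \\
0 @>>> R @>>> S_0 @>>> G @>>> 0 \\
@. @AA{\mathrm{id}}A @AAA @AAA @. \\
0 @>>> R @>>> R_0 @>>> \bar R @>>> 0,
\end{CD}
\]

keeping it commutative; the construction of the induced maps has been described before. It induces the commutative diagram

\[
\begin{CD}
\cdots @>>> \Gamma(S) @>>> \Gamma(T) @>>> H^1(M,R) @. \\
@. @VVV @VVV @VV{\mathrm{id}}V @. \\
\cdots @>>> \Gamma(S_0) @>>> \Gamma(G) @>>> H^1(M,R) @>>> 0 \\
@. @AAA @AAA @AA{\mathrm{id}}A @. \\
\cdots @>>> \Gamma(R_0) @>>> \Gamma(\bar R) @>>> H^1(M,R) @>>> 0.
\end{CD}
\]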

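So $H^0(M,T) \to H^1(M,R)$ is the composition of

\[
H^0(M,T) \xrightarrow{\ \simeq\ } \Gamma(T) \longrightarrow \Gamma(G)/\Gamma(S_0) \xleftarrow{\ \simeq\ } \Gamma(\bar R)/\Gamma(R_0) \xrightarrow{\ \simeq\ } H^1(M,R),
\]

and by the construction of the map $\mathcal{H} \to \tilde{\mathcal{H}}$, the following diagram is commutative:

\[
\begin{CD}
H^0(M,T) @>{\simeq}>> \Gamma(T) @>>> \Gamma(G)/\Gamma(S_0) @<{\simeq}<< \Gamma(\bar R)/\Gamma(R_0) @>{\simeq}>> H^1(M,R) \\
@VVV @VV{\mathrm{id}}V @VV{\mathrm{id}}V @VV{\mathrm{id}}V @VVV \\
\tilde H^0(M,T) @>{\simeq}>> \Gamma(T) @>>> \Gamma(G)/\Gamma(S_0) @<{\simeq}<< \Gamma(\bar R)/\Gamma(R_0) @>{\simeq}>> \tilde H^1(M,R).
\end{CD}
\]

The left and right squares commute by construction.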
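For $q \ge 1$ and $H^q(M,T) \to H^{q+1}(M,R)$, consider the commutative diagram

\[
\begin{CD}
H^q(M,S) @>>> H^q(M,T) @>>> H^{q+1}(M,R) @>>> \cdots \\
@. @VVV @VV{\mathrm{id}}V @. \\
0 @>>> H^q(M,G) @>{\simeq}>> H^{q+1}(M,R) @>>> 0 \\
@. @AAA @AA{\mathrm{id}}A @. \\
0 @>>> H^q(M,\bar R) @>{\simeq}>> H^{q+1}(M,R) @>>> 0,
\end{CD}
\]

which shows that $H^q(M,T) \to H^{q+1}(M,R)$ can be decomposed into

\[
H^q(M,T) \longrightarrow H^q(M,G) \xleftarrow{\ \simeq\ } H^q(M,\bar R) \xrightarrow{\ \simeq\ } H^{q+1}(M,R).
\]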

Therefore the following diagram is commutative:

\[
\begin{CD}
H^q(M,T) @>>> H^q(M,G) @<{\simeq}<< H^q(M,\bar R) @>{\simeq}>> H^{q+1}(M,R) \\
@VVV @VVV @VVV @VVV \\
\tilde H^q(M,T) @>>> \tilde H^q(M,G) @<{\simeq}<< \tilde H^q(M,\bar R) @>{\simeq}>> \tilde H^{q+1}(M,R).
\end{CD}
\]

The first two squares have already been proved to be commutative, and the third square is commutative by construction. This completes the proof. ◻

We have now shown that the cohomology theory is unique up to isomorphism. It is given by a fine torsion-free resolution of the constant sheaf, assuming that such a resolution exists. That is, $H^q(M,S) \simeq H^q(\Gamma(C^* \otimes S))$, where $C^*$ is a fine torsion-free resolution of the constant sheaf.

Theorem 2.6. For a fine resolution of a sheaf $S$ itself (not of the constant sheaf)

\[
0 \to S \to C^0 \to C^1 \to \cdots,
\]

there is a canonical isomorphism $H^q(M,S) \simeq H^q(\Gamma(C^*))$.

Proof. The exact sequence $0 \to S \to C^0 \to C^1 \to \cdots$ induces the exact sequence $0 \to \Gamma(S) \to \Gamma(C^0) \to \Gamma(C^1)$, so $H^0(M,S) \simeq \Gamma(S) \simeq H^0(\Gamma(C^*))$. Set $H^q = \ker(C^q \to C^{q+1})$ for positive $q$; then there are short exact sequences

\[
0 \to S \to C^0 \to H^1 \to 0
\]

and

\[
0 \to H^q \to C^q \to H^{q+1} \to 0.
\]

Since $C^0$ is fine, $H^1(M,S) \simeq H^0(M,H^1)/\operatorname{Im} H^0(M,C^0) \simeq \Gamma(H^1)/\operatorname{Im}\Gamma(C^0) \simeq H^1(\Gamma(C^*))$, and $H^q(M,S) \simeq H^{q-1}(M,H^1)$ for $q > 1$. Hence

\[
H^q(M,S) \simeq H^{q-1}(M,H^1) \simeq H^{q-2}(M,H^2) \simeq \cdots \simeq H^1(M,H^{q-1}) \simeq \Gamma(H^q)/\operatorname{Im}\Gamma(C^{q-1}) \simeq H^q(\Gamma(C^*)),
\]

since $H^k$ has the fine resolution $0 \to H^k \to C'^0 \to C'^1 \to \cdots$, where $C'^i = C^{k+i}$. ◻

Remark 2.7. All the above claims are based on the assumption that a fine torsionless resolution of the constant sheaf on $M$ exists. An example of such a resolution is given in Lemma 3.1, which introduces Alexander–Spanier cohomology.

3 Examples

Alexander–Spanier cohomology
Assume $M$ is a paracompact Hausdorff space here. For an open set $U \subset M$ and each $p \ge 0$, let $A^p(U,K)$ denote the $K$-module of functions $U^{p+1} \to K$, with pointwise operations. The coboundary operator $d \colon A^p(U,K) \to A^{p+1}(U,K)$ is defined by

\[
df(x_0, x_1, \dots, x_{p+1}) = \sum_{i=0}^{p+1} (-1)^i f(x_0, \dots, \hat{x}_i, \dots, x_{p+1})
\]

for $f \in A^p(U,K)$ and $x_i \in U$, $i = 0, \dots, p+1$. A direct computation shows $d \circ d = 0$. With the natural restrictions $\rho_{V,U} \colon A^p(U,K) \to A^p(V,K)$, the family $\{A^p(U,K), \rho_{V,U}\}$ forms the presheaf of Alexander–Spanier $p$-cochains. For $p \ge 1$ it is not a complete presheaf, since $\bigcup U_i^{p+1}$ does not contain $(\bigcup U_i)^{p+1}$ in general.
Let $\mathcal{A}^p(M,K)$ denote the associated sheaf of germs of this presheaf. The operator $d$ induces a coboundary operator $d \colon \mathcal{A}^p(M,K) \to \mathcal{A}^{p+1}(M,K)$.
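The identity $d \circ d = 0$ can be checked by brute force on a small example. The following is a minimal Python sketch, assuming a three-point set standing in for $U$ and integer values standing in for $K$; the helper names `coboundary` and `random_cochain` are illustrative and do not come from the text.

```python
from itertools import product
import random


def coboundary(f, p):
    """Return d f, where f takes p + 1 points and d f takes p + 2 points."""
    def df(*xs):
        # Alternating sum over the omitted point x_i, as in the definition of d.
        return sum((-1) ** i * f(*(xs[:i] + xs[i + 1:])) for i in range(p + 2))
    return df


def random_cochain(points, p, seed=0):
    """An arbitrary integer-valued function on (p + 1)-tuples of points."""
    rng = random.Random(seed)
    table = {xs: rng.randint(-5, 5) for xs in product(points, repeat=p + 1)}
    return lambda *xs: table[xs]


points = (0, 1, 2)   # hypothetical finite stand-in for the open set U
p = 1
f = random_cochain(points, p)
ddf = coboundary(coboundary(f, p), p + 1)

# d(d f) takes p + 3 points and should vanish on every tuple.
dd_is_zero = all(ddf(*xs) == 0 for xs in product(points, repeat=p + 3))
print(dd_is_zero)
```

The check only covers one randomly chosen cochain on one finite set, so it illustrates the cancellation of terms rather than proving it.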

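Lemma 3.1. The cochain complex

\[
0 \to \mathcal{K} \to \mathcal{A}^0(M,K) \xrightarrow{\ d\ } \mathcal{A}^1(M,K) \xrightarrow{\ d\ } \cdots
\]

is a fine torsionless resolution of $\mathcal{K}$, where elements of $\mathcal{K} = M \times K$ are regarded as constant functions.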

Proof. Since $K$ is an integral domain, $\mathcal{A}^p(M,K)$ is torsionless.
For a locally finite open covering $\{U_i\}$ of $M$, take a partition of unity $\{\phi_i\}$ subordinate to it that takes only the values 0 and 1. Define endomorphisms $l_i$ of $A^p(U,K)$ by $l_i(f)(x_0, \dots, x_p) = \phi_i(x_0) f(x_0, \dots, x_p)$. They are compatible with the restriction maps, so they induce the desired sheaf endomorphisms. Therefore $\mathcal{A}^p(M,K)$ is fine.
It remains to show exactness. First we show the exactness of the sequence

\[
0 \to K \to A^0(U,K) \xrightarrow{\ d\ } A^1(U,K) \xrightarrow{\ d\ } \cdots.
\]

Exactness at $K$ is obvious. At $A^0(U,K)$, $df(x,y) = 0$ implies $f(y) - f(x) = 0$ for all $(x,y) \in U^2$, so $f$ is a constant function with value in $K$. For $p \ge 1$ and $f \in A^p(U,K)$ with $df = 0$, fix $m \in U$ and define $g \in A^{p-1}(U,K)$ by

\[
g(m_0, \dots, m_{p-1}) = f(m, m_0, \dots, m_{p-1}).
\]

Then

\[
dg(x_0, x_1, \dots, x_p) = \sum_{i=0}^{p} (-1)^i g(x_0, \dots, \hat{x}_i, \dots, x_p) = \sum_{i=0}^{p} (-1)^i f(m, x_0, \dots, \hat{x}_i, \dots, x_p) = f(x_0, \dots, x_p),
\]

where the last equality follows from $df(m, x_0, \dots, x_p) = 0$. Thus $dg = f$, which gives exactness at $A^p(U,K)$.
Passing to germs, this implies exactness at $\mathcal{A}^p(M,K)$. ◻
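The contraction used in this proof (fix a point $m$ and insert it as the first argument) can be tested numerically. The sketch below is only an illustration under assumptions: a three-point set replaces $U$, integers replace $K$, and a closed cochain $f$ is produced as $f = dh$ for an arbitrary $h$; the names `coboundary`, `h`, `g` are hypothetical helpers.

```python
from itertools import product
import random


def coboundary(f, p):
    """Coboundary of a cochain f of p + 1 arguments (alternating sum over omitted points)."""
    def df(*xs):
        return sum((-1) ** i * f(*(xs[:i] + xs[i + 1:])) for i in range(p + 2))
    return df


points = (0, 1, 2)           # hypothetical finite stand-in for U
rng = random.Random(1)
table = {xs: rng.randint(-5, 5) for xs in product(points, repeat=2)}
h = lambda *xs: table[xs]    # arbitrary cochain of two arguments

f = coboundary(h, 1)         # cochain of three arguments; closed because d(dh) = 0
m = points[0]                # the fixed base point from the proof
g = lambda *xs: f(m, *xs)    # g(m_0, m_1) = f(m, m_0, m_1)
dg = coboundary(g, 1)

f_is_closed = all(coboundary(f, 2)(*xs) == 0 for xs in product(points, repeat=4))
dg_equals_f = all(dg(*xs) == f(*xs) for xs in product(points, repeat=3))
print(f_is_closed, dg_equals_f)
```

Any other choice of base point `m` works equally well; the resulting `g` changes, but `dg` still equals `f`.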

Now the cohomology theory can be defined as

\[
H^q(M,S) = H^q(\Gamma(\mathcal{A}^*(M,K) \otimes S)),
\]

but it is still worth introducing the classical definition of the Alexander–Spanier cohomology modules of $M$ with coefficients in a $K$-module $G$.
Replace $K$ by $G$ and $\mathcal{K}$ by $\mathcal{G}$ in the notions above, so that the functions considered are $U^{p+1} \to G$. Set

\[
A_0^p(M,G) = \{ f \in A^p(M,G) \mid \rho_{x,M} f = 0,\ \forall x \in M \}.
\]

Notice that $\{A^p(U,K), \rho_{V,U}\}$ is not a complete presheaf, and neither is $\{A^p(U,G), \rho_{V,U}\}$. The operator $d$ maps $A_0^p(M,G)$ into $A_0^{p+1}(M,G)$, so it induces a map

\[
A^p(M,G)/A_0^p(M,G) \to A^{p+1}(M,G)/A_0^{p+1}(M,G),
\]

and hence a cochain complex denoted by $A^*(M,G)/A_0^*(M,G)$. The $p$-th Alexander–Spanier cohomology module of $M$ with coefficients in $G$ is the $p$-th cohomology module of this complex, that is,

\[
H^p_{A\text{-}S}(M;G) = H^p(A^*(M,G)/A_0^*(M,G)).
\]
Lemma 3.2. If $P = \{P_U, \rho_{V,U}\}$ is a presheaf and $S = \beta(P)$, set $S_0 = \{s \in P_M \mid \rho_{x,M} s = 0,\ \forall x \in M\}$. Then there is an exact sequence

\[
0 \to S_0 \to P_M \to \Gamma(S) \to 0,
\]

where the morphism $P_M \to \Gamma(S)$ sends an element $s \in P_M$ to the map $M \to S$ which takes the value $\rho_{x,M} s$ at $x$.

By Theorem 2.6, $H^q(M,\mathcal{G}) \simeq H^q(\Gamma(\mathcal{A}^*(M,G)))$, and by Lemma 3.2, $A^*(M,G)/A_0^*(M,G) \simeq \Gamma(\mathcal{A}^*(M,G))$, so $H^q(M,\mathcal{G}) \simeq H^q_{A\text{-}S}(M;G)$, the desired conclusion.

de Rham cohomology
De Rham cohomology can also be viewed as a sheaf cohomology theory. Set $K = \mathbb{R}$, and let $\mathcal{R} = M \times \mathbb{R}$ be the constant sheaf.
$\{E^p(U), \rho_{V,U}\}$ forms a presheaf, where $E^p(U)$ is the set of differential $p$-forms on $U$. The exterior differential acts as the coboundary operator. The associated sheaf is denoted by $\mathcal{E}^p(M)$, and $d$ denotes the induced coboundary operator.
There is a fine torsionless resolution of $\mathcal{R}$,

\[
0 \to \mathcal{R} \to \mathcal{E}^0(M) \xrightarrow{\ d\ } \mathcal{E}^1(M) \xrightarrow{\ d\ } \mathcal{E}^2(M) \xrightarrow{\ d\ } \cdots,
\]

since the Poincaré lemma shows that closed forms are locally exact. Define the cohomology theory for $M$ with coefficients in sheaves of real vector spaces by setting

\[
H^q(M,\mathcal{J}) = H^q(\Gamma(\mathcal{E}^*(M) \otimes \mathcal{J})).
\]

If $\mathcal{J} = \mathcal{R}$, then

\[
H^q(M,\mathcal{R}) = H^q(\Gamma(\mathcal{E}^*(M) \otimes \mathcal{R})) \simeq H^q(\Gamma(\mathcal{E}^*(M))),
\]

and the isomorphism commutes with the coboundary operator.
Define the $q$-th de Rham cohomology group by $H^q_{deR}(M) = H^q(E^*(M))$ as usual. Notice that $\{E^p(U), \rho_{V,U}\}$ is complete, so $E^p(M) \simeq \Gamma(\mathcal{E}^p(M))$, and this induces an isomorphism between the cochain complexes $\Gamma(\mathcal{E}^*(M))$ and $E^*(M)$. Hence

\[
H^q(M,\mathcal{R}) \simeq H^q_{deR}(M),
\]

the desired conclusion.

4 Multiplicative structure
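Definition 4.1. For cochain complexes $C^*$ and $\hat{C}^*$, define their tensor product by

\[
(C^* \otimes \hat{C}^*)^r = \bigoplus_{p+q=r} C^p \otimes \hat{C}^q,
\]

and the $r$-th coboundary operator is defined by

\[
\bigoplus_{p+q=r} \left( d^p \otimes \widehat{\mathrm{id}}^{\,q} + (-1)^p\, \mathrm{id}^p \otimes \hat{d}^q \right).
\]

There is a canonical homomorphism

\[
H^p(C^*) \otimes H^q(\hat{C}^*) \to H^{p+q}(C^* \otimes \hat{C}^*).
\]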

Lemma 4.2. The tensor of two torsionless 𝐾 modules is torsionless.


Now if there is a fine torsionless resolution of the constant sheaf $\mathcal{K}$ of the form

\[
0 \to \mathcal{K} \to C_0 \to C_1 \to \cdots,
\]

tensor it with itself to get a sequence

\[
0 \to \mathcal{K} \simeq \mathcal{K} \otimes \mathcal{K} \to C_0 \otimes C_0 \to (C_0 \otimes C_1) \oplus (C_1 \otimes C_0) \to \cdots \to \bigoplus_{p+q=r} C_p \otimes C_q \to \cdots
\]

with the coboundary operators

\[
\bigoplus_{p+q=r} \left( d_p \otimes \mathrm{id}_q + (-1)^p\, \mathrm{id}_p \otimes d_q \right).
\]

This is a fine torsionless resolution of the constant sheaf $\mathcal{K}$. Each term is fine, since tensor products and direct sums preserve fineness. Lemma 4.2 shows that each term is torsionless. To show exactness at the $r$-th place ($r \ge 1$), apply the Künneth formula at each stalk. The conditions of the Künneth formula are satisfied automatically, since the second factor is torsionless, so that tensoring with it preserves exactness. Exactness at $\mathcal{K}$ and at $C_0 \otimes C_0$ can be checked by direct computation.
Theorem 4.3 (Künneth formula). If $C^*$ and $C'^*$ are two cochain complexes such that $C^* * C'^*$ is an acyclic (i.e. exact) cochain complex, then there is a functorial split short exact sequence

\[
0 \to (H^*(C^*) \otimes H^*(C'^*))^q \to H^q(C^* \otimes C'^*) \to (H^*(C^*) * H^*(C'^*))^{q+1} \to 0.
\]

Here $A * B$ denotes $\operatorname{Tor}_1(A,B)$.


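If $S$ is a sheaf of algebras over $M$, which means that it is equipped with a homomorphism $S \otimes S \to S$, it induces a homomorphism of the form

\[
\Gamma(C_p \otimes S) \otimes \Gamma(C_q \otimes S) \to \Gamma(C_p \otimes C_q \otimes S \otimes S) \to \Gamma(C_p \otimes C_q \otimes S).
\]

Summing over all pairs $(p,q)$ with $p + q = r$ gives a cochain map

\[
\Gamma(C^* \otimes S) \otimes \Gamma(C^* \otimes S) \to \Gamma(C^* \otimes C^* \otimes S),
\]

and hence a homomorphism

\[
H^k(\Gamma(C^* \otimes S) \otimes \Gamma(C^* \otimes S)) \to H^k(\Gamma(C^* \otimes C^* \otimes S)).
\]

Compose it with the homomorphism $H^p(\Gamma(C^* \otimes S)) \otimes H^q(\Gamma(C^* \otimes S)) \to H^{p+q}(\Gamma(C^* \otimes S) \otimes \Gamma(C^* \otimes S))$ to get the homomorphism

\[
H^p(M,S) \otimes H^q(M,S) \simeq H^p(\Gamma(C^* \otimes S)) \otimes H^q(\Gamma(C^* \otimes S)) \to H^{p+q}(\Gamma(C^* \otimes C^* \otimes S)) \simeq H^{p+q}(M,S),
\]

where the last isomorphism holds since $C^* \otimes C^*$ is a fine torsionless resolution of the constant sheaf $\mathcal{K}$.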


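This homomorphism gives a multiplicative structure on sheaf cohomology, but it remains to prove that the multiplication is independent of the resolution $C^*$ of $\mathcal{K}$.
For another fine torsionless resolution

\[
0 \to \mathcal{K} \to \tilde{C}_0 \to \tilde{C}_1 \to \tilde{C}_2 \to \cdots,
\]

tensor it with $C^*$ to get a fine torsionless resolution

\[
0 \to \mathcal{K} \to C_0 \otimes \tilde{C}_0 \to \cdots \to \bigoplus_{p+q=r} C_p \otimes \tilde{C}_q \to \cdots,
\]

which will serve as a bridge between the two tensor products $C^* \otimes C^*$ and $\tilde{C}^* \otimes \tilde{C}^*$.
Consider the injection

\[
C_p \simeq C_p \otimes \mathcal{K} \to C_p \otimes \tilde{C}_0 \to \bigoplus_{r+s=p} C_r \otimes \tilde{C}_s,
\]

which induces the commutative diagram

\[
\begin{CD}
C_p @>>> C_p \otimes \mathcal{K} @>>> C_p \otimes \tilde{C}_0 @>>> \bigoplus_{r+s=p} C_r \otimes \tilde{C}_s \\
@VVV @VVV @. @VVV \\
C_{p+1} @>>> C_{p+1} \otimes \mathcal{K} @>>> C_{p+1} \otimes \tilde{C}_0 @>>> \bigoplus_{r+s=p+1} C_r \otimes \tilde{C}_s.
\end{CD}
\]

Notice that the arrow $C_p \otimes \tilde{C}_0 \to C_{p+1} \otimes \tilde{C}_0$ cannot be added: the coboundary maps an element $c_p \otimes \tilde{c}_0$ to $d_p(c_p) \otimes \tilde{c}_0 + (-1)^p c_p \otimes \tilde{d}_0(\tilde{c}_0)$, and for $c_p \ne 0$ this equals $d_p(c_p) \otimes \tilde{c}_0$ if and only if $\tilde{c}_0$ comes from $\mathcal{K}$.
The diagram induces a cochain map $\Gamma(C^* \otimes S) \to \Gamma(C^* \otimes \tilde{C}^* \otimes S)$, which in turn induces the identity map $H^q(M,S) \to H^q(M,S)$, since this holds for $q = 0$.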
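Furthermore, consider the commutative diagram of cochain complexes and cochain maps

\[
\begin{CD}
\Gamma(C^* \otimes S) \otimes \Gamma(C^* \otimes S) @>>> \Gamma(C^* \otimes C^* \otimes S) \\
@VVV @VVV \\
\Gamma(C^* \otimes \tilde{C}^* \otimes S) \otimes \Gamma(C^* \otimes \tilde{C}^* \otimes S) @>>> \Gamma(C^* \otimes \tilde{C}^* \otimes C^* \otimes \tilde{C}^* \otimes S),
\end{CD}
\]

which induces the commutative diagram

\[
\begin{tikzcd}
H^p(\Gamma(C^* \otimes S)) \otimes H^q(\Gamma(C^* \otimes S)) \arrow[r] \arrow[d] & H^{p+q}(\Gamma(C^* \otimes S) \otimes \Gamma(C^* \otimes S)) \arrow[d] \arrow[dl] \\
H^{p+q}(\Gamma(C^* \otimes \tilde{C}^* \otimes S) \otimes \Gamma(C^* \otimes \tilde{C}^* \otimes S)) \arrow[r] & H^{p+q}(\Gamma(C^* \otimes \tilde{C}^* \otimes C^* \otimes \tilde{C}^* \otimes S)).
\end{tikzcd}
\]

Similarly, starting from $\tilde{C}^*$ we obtain the commutative diagram

\[
\begin{tikzcd}
H^p(\Gamma(\tilde{C}^* \otimes S)) \otimes H^q(\Gamma(\tilde{C}^* \otimes S)) \arrow[r] \arrow[d] & H^{p+q}(\Gamma(\tilde{C}^* \otimes S) \otimes \Gamma(\tilde{C}^* \otimes S)) \arrow[d] \arrow[dl] \\
H^{p+q}(\Gamma(C^* \otimes \tilde{C}^* \otimes S) \otimes \Gamma(C^* \otimes \tilde{C}^* \otimes S)) \arrow[r] & H^{p+q}(\Gamma(C^* \otimes \tilde{C}^* \otimes C^* \otimes \tilde{C}^* \otimes S)).
\end{tikzcd}
\]

By these two diagrams, the multiplicative structure is independent of the choice of fine torsion-free resolution [Spa66].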

References
[AM69] M. F. Atiyah and I. G. Macdonald (1969). Introduction to Commutative Algebra. Addison-Wesley Publishing Company.
[GH94] P. A. Griffiths and J. D. Harris (1994). Principles of Algebraic Geometry. Wiley Classics Library.
[Spa66] E. H. Spanier (1966). Algebraic Topology. McGraw-Hill Book Company.
[War83] F. W. Warner (1983). Foundations of Differentiable Manifolds and Lie Groups. Springer-Verlag.
An Introduction to Factorization Algebras and
Factorization Homology

Kong Fanhao1

ABSTRACT
We introduce the theory of factorization algebras and factorization homology. They are based on the theory of higher operads and algebras over them. We also give two applications of factorization algebras and factorization homology: the Deligne conjecture and the Bar constructions.

Contents

1 Preliminaries: Operads, higher operads and algebras over them
    Basic definitions
    Higher operads
    The little cubes operads and $E_n$-algebras

2 Factorization algebras
    Basic definitions
    Basic properties

3 Factorization homology
    Homology theory for manifolds
    Factorization homology
    Relation to factorization algebra and applications to $E_n$-algebras
    Stratified spaces and applications to $E_n$-modules

4 Further applications
    Centralizers and (higher) Deligne conjecture
    An introduction to the Bar constructions

1 Kong Fanhao, Class Math 72, Department of Mathematical Sciences, Tsinghua University.

For many years, people have tried to study quantum field theories using the tools of mathematics. A number of physical ideas have been introduced into mathematics, including fields, phase spaces, quantization, Feynman diagrams, observables, etc. Roughly speaking, we know that observables on small sets induce observables on large sets. The study of this property of observables yields a new structure, called the “factorization algebra”. Since the start of the 21st century, when the notion of factorization algebras was introduced, it has been studied by a number of mathematicians, and many notable results have been obtained.
On the other hand, there exists another mathematical structure which is closely
related to the factorization algebra, called “factorization homology”. It is a struc-
ture to describe homology theories specific to manifolds of a specific dimension,
with some specific structures. The specific structures can be smooth structures,
framings, or simply no structure at all. They are introduced by axioms similar
to the Eilenberg-Steenrod axioms for homology theory for spaces, including the
monoidal axiom and the excision axiom.
In order to examine factorization homology and factorization algebras, a
number of preliminaries are required; they include the theory of operads and 𝐸𝑛 -
algebras. The theory of operads is used to study the structures of multimorphisms
and the relations that the multimorphisms must satisfy, while the 𝐸𝑛 -algebras are
introduced to study iterated loop spaces and configuration spaces, and have been
heavily studied in algebraic topology ever since the seventies.
It has been proved that factorization homology and factorization algebras are closely related. This leads to further studies of $E_n$-algebras, together with related concepts in algebraic topology connected to quantum mechanics, including deformations of $E_n$-algebras, the (higher) Deligne conjecture, (higher) string topology and Bar constructions of iterated loop spaces. It is also related to Hochschild homology and cohomology, which arise as specific examples of factorization homology.
In this article, we will present the preliminaries we need for defining factoriza-
tion algebras and factorization homology in Section 1. In Section 2, we will give
the definition of factorization algebras, and present some basic operations, such
as pullbacks, pushforwards, extending from a basis and gluing. The definition of
homology theories for manifolds, factorization homology, together with their prop-
erties, relationships with factorization algebras and basic examples, will be given
in Section 3. In the final section, two applications will be introduced, including the
Bar constructions and the Deligne conjecture.

Notation. We fix the following notations throughout the whole article:

• All manifolds are assumed to be smooth.

• When we talk about a “topological space”, we mean a 𝐾-space, whose definition and basic properties may be found in [Hov99]. The category of 𝐾-spaces is denoted by K.

• SSet is the category of simplicial sets. By [Hov99], there exists a singular functor Sing ∶ K → SSet.

• The definitions and basic properties of model categories are from [Hov99] as well. Note that the definitions are slightly different from other references.

• An introduction to the language of ∞-categories can be found in [Lur09]. Further references for ∞-categories include [Lur16].

• S denotes the ∞-category of topological spaces, and H denotes the homotopy category of topological spaces. The 1-morphisms in S are continuous maps between spaces, and the 𝑛-morphisms in S are homotopies between (𝑛 − 1)-morphisms; the morphisms in H are homotopy classes of continuous maps.

• For any commutative ring with unit 𝑅, we use Ch(𝑅) to denote the (ordinary) category of chain complexes over 𝑅, and we use ℂ𝕙(𝑅) to denote the ∞-category of chain complexes over 𝑅. For a minimal introduction to the ∞-category of chain complexes, the first chapter of [Lur16] may be helpful.

• For a topological space 𝑋, $C_\bullet(X)$ denotes the singular chain complex of 𝑋, and $C^\bullet(X)$ denotes the singular cochain complex of 𝑋.

1 Preliminaries: Operads, higher operads and algebras over them
In this section, we introduce the language of operads, which will be used in the sections below. We will see how operads express different structures on multilinear operators in a single, simple language. We will also introduce the language of higher operads, which allows us to interpret multilinear operators in different categories in one single language, as well as homotopy coherent algebras. Finally, we introduce the little cubes operad and $E_n$-algebras, as we will use them in the third section.

Basic definitions
Operads were first introduced by Peter May in his book [May72]. We first introduce their definition and give some evident generalizations.

Definition 1.1. Suppose (C, ⊗, 𝟙) is a symmetric monoidal category. An operad with values in C consists of:

i) A sequence of objects 𝐹 (𝑛) ∈ C for all nonnegative integers 𝑛;

ii) A unit map 𝑒 ∶ 𝟙 → 𝐹 (1);

iii) A right action 𝜌𝑛 of the symmetric group 𝑆𝑛 on 𝐹 (𝑛);



iv) A composition map

𝑐 ∶ 𝐹 (𝑘) ⊗ 𝐹 (𝑛1 ) ⊗ ⋯ ⊗ 𝐹 (𝑛𝑘 ) → 𝐹 (𝑛1 + ⋯ + 𝑛𝑘 )

for nonnegative integers 𝑘, 𝑛1 , ⋯ , 𝑛𝑘 ;

such that the following conditions are satisfied:

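i) The unital identity: The following maps are the identity maps:

\[
F(n) \xrightarrow{\ \cong\ } \mathbb{1} \otimes F(n) \xrightarrow{\ e \otimes \mathbb{1}\ } F(1) \otimes F(n) \xrightarrow{\ c\ } F(n),
\]

\[
F(n) \xrightarrow{\ \cong\ } F(n) \otimes \mathbb{1}^{\otimes n} \xrightarrow{\ \mathbb{1} \otimes e^{\otimes n}\ } F(n) \otimes F(1)^{\otimes n} \xrightarrow{\ c\ } F(n);
\]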

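ii) The associativity diagram: The following diagram is commutative:

\[
\begin{tikzcd}
F(k) \otimes \Bigl( \bigotimes_{i=1}^{k} F(n_i) \Bigr) \otimes \Bigl( \bigotimes_{i=1}^{k} \bigotimes_{j=1}^{n_i} F(m_{ij}) \Bigr) \arrow[r] \arrow[dd, "c \otimes \bigl( \bigotimes_{i=1}^{k} \bigotimes_{j=1}^{n_i} \mathbb{1} \bigr)"'] & F(k) \otimes \bigotimes_{i=1}^{k} \Bigl( F(n_i) \otimes \bigotimes_{j=1}^{n_i} F(m_{ij}) \Bigr) \arrow[d, "\mathbb{1} \otimes \bigotimes_{i=1}^{k} c"] \\
 & F(k) \otimes \bigotimes_{i=1}^{k} F\Bigl( \sum_{j=1}^{n_i} m_{ij} \Bigr) \arrow[d, "c"] \\
F\Bigl( \sum_{i=1}^{k} n_i \Bigr) \otimes \Bigl( \bigotimes_{i=1}^{k} \bigotimes_{j=1}^{n_i} F(m_{ij}) \Bigr) \arrow[r, "c"] & F\Bigl( \sum_{i=1}^{k} \sum_{j=1}^{n_i} m_{ij} \Bigr);
\end{tikzcd}
\]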

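iii) Compatibility with the $S_n$-action: The following diagram is commutative:

\[
\begin{CD}
F(k) \otimes \bigotimes_{i=1}^{k} F(n_i) @>{\cong}>> F(k) \otimes \bigotimes_{i=1}^{k} F(n_{\sigma^{-1}(i)}) \\
@V{\rho_\sigma \otimes \mathbb{1}}VV @VV{c}V \\
F(k) \otimes \bigotimes_{i=1}^{k} F(n_i) @>{c}>> F(n_1 + \cdots + n_k)
\end{CD}
\]

for any $\sigma \in S_k$; and the following diagram is commutative:

\[
\begin{CD}
F(k) \otimes \bigotimes_{i=1}^{k} F(n_i) @>{c}>> F(n_1 + \cdots + n_k) \\
@V{\mathbb{1} \otimes \bigotimes_{i=1}^{k} \rho_{\sigma_i}}VV @VV{\rho_{(\sigma_1, \dots, \sigma_k)}}V \\
F(k) \otimes \bigotimes_{i=1}^{k} F(n_i) @>{c}>> F(n_1 + \cdots + n_k)
\end{CD}
\]

for any $\sigma_i \in S_{n_i}$, where $(\sigma_1, \dots, \sigma_k)$ denotes the image of the tuple under the evident inclusion $S_{n_1} \times \cdots \times S_{n_k} \to S_{n_1 + \cdots + n_k}$.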

If C = SSet, we call such an operad with values in SSet a simplicial operad.

What an operad describes is some sort of “multimorphisms on a single object”; that is, we may regard 𝐹 (𝑛) as “the space of 𝑛-ary operations”, 𝑒 as the “identity operation”, 𝜌 as “permuting the inputs”, and 𝑐 as the “composition”, where these operations satisfy a number of coherence diagrams. To be more explicit, we have the following example:

Example 1.2. Suppose C is a symmetric monoidal category, D is a symmetric monoidal C-enriched category, and 𝑋 ∈ D. There exists a canonical operad given by

\[
F(n) = \operatorname{Map}(X^{\otimes n}, X),
\]

where 𝑒 is the identity map

\[
\mathbb{1}_X \colon \mathbb{1}_{\mathsf{C}} \to \operatorname{Map}(X, X)
\]

(regarded as an element of Map(𝑋, 𝑋)), and the $S_n$-actions and compositions are given in the obvious way. This operad indeed describes multimorphisms on the object 𝑋. We denote this operad by End(𝑋), and call it the endomorphism operad of 𝑋.
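To make the composition in the endomorphism operad concrete, here is a minimal Python sketch for the special case where D is the category of sets with the cartesian product, so that an element of $F(n)$ is just an $n$-ary function on a set. The pair representation `(function, arity)` and the name `operad_compose` are illustrative assumptions and not notation from the text; the symmetric group actions are omitted.

```python
def operad_compose(f, gs):
    """Operadic composition c(f; g_1, ..., g_k) for n-ary functions.

    f is a pair (function, k); gs is a list of k pairs (function, n_i).
    The result is a pair (function, n_1 + ... + n_k).
    """
    func, k = f
    assert k == len(gs), "f needs exactly one input operation per argument"
    total = sum(n for _, n in gs)

    def composite(*xs):
        assert len(xs) == total
        outputs, start = [], 0
        for g, n in gs:
            # Feed the next block of n inputs into g.
            outputs.append(g(*xs[start:start + n]))
            start += n
        return func(*outputs)

    return composite, total


identity = (lambda x: x, 1)        # the unit e in F(1)
add = (lambda x, y: x + y, 2)      # an element of F(2)
mul = (lambda x, y: x * y, 2)      # another element of F(2)
neg = (lambda x: -x, 1)            # an element of F(1)

h, arity = operad_compose(add, [mul, neg])
value = h(2, 3, 4)                 # computes 2 * 3 + (-4)
print(arity, value)
```

The unital identities of Definition 1.1 correspond here to the fact that composing with `identity` on either side leaves an operation unchanged.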

More examples of operads include Examples 1.8 and 1.9.


In many cases we also need to consider multimorphisms on not only one object
but maybe more than one: for example, observables in small sets lead to observ-
ables in large sets. In order to deal with this situation, we introduce a structure
called colored operads. The following definition is from [nLab].

Definition 1.3. Suppose 𝐶 is a set, called the set of colors. A colored operad with values taken in a symmetric monoidal category C consists of:

i) An object $\operatorname{Hom}(c_1, \dots, c_n; c)$ in C for every nonnegative integer 𝑛 and $c_1, \dots, c_n, c \in C$;

ii) A unit map $e \colon \mathbb{1} \to \operatorname{Hom}(c; c)$ for all $c \in C$;

iii) An action by the symmetric group

\[
\rho_\sigma \colon \operatorname{Hom}(c_1, \dots, c_n; c) \to \operatorname{Hom}(c_{\sigma(1)}, \dots, c_{\sigma(n)}; c)
\]

for $\sigma \in S_n$ and $c_1, \dots, c_n, c \in C$;

iv) A composition law

\[
\operatorname{Hom}(c_1, \dots, c_k; c) \otimes \bigotimes_{i=1}^{k} \operatorname{Hom}(c_{i1}, \dots, c_{i n_i}; c_i) \to \operatorname{Hom}(c_{11}, \dots, c_{k n_k}; c);
\]

such that the composition is associative, unital and compatible with the $S_n$-action.
If C = SSet, we call such a colored operad with values in SSet a simplicial colored operad.
From this definition, we see that an operad is a special case of a colored operad: it is a colored operad with only one color. In this case, we have

\[
F(n) = \operatorname{Hom}(\underbrace{*, \dots, *}_{n \text{ copies}}; *).
\]

For this reason, unless focusing on specific examples, we aim at colored operads
instead of single-colored operads.
The endomorphism operad can also be generalized to the colored case, as fol-
lows:
Example 1.4. Suppose C is a symmetric monoidal category, D is a symmetric
monoidal C-enriched category, 𝑋 = {𝑋𝑐 }𝑐∈𝐶 is a set of objects in D, where 𝐶 is
the set of colors. Consider the colored operad given by

Hom(𝑐1 , ⋯ , 𝑐𝑛 ; 𝑐) = Map(𝑋𝑐1 ⊗ ⋯ ⊗ 𝑋𝑐𝑛 , 𝑋𝑐 ),

where the identity maps, 𝑆𝑛 -actions and compositions are given in the obvious way.
We denote this operad by End(𝑋), and call it the endomorphism operad of 𝑋. If
𝑋 is the set of all objects in D, we also write it as End(D).
Next we discuss morphisms between (colored) operads.
Definition 1.5. Suppose O1 , O2 are colored operads, with colorings 𝐶1 , 𝐶2 , respec-
tively. A map 𝐹 from O1 to O2 consists of a map 𝐹 ∶ 𝐶1 → 𝐶2 of colors, and a
map
𝐹 ∶ Hom(𝑐1 , ⋯ , 𝑐𝑛 ; 𝑐) → Hom(𝐹 (𝑐1 ), ⋯ , 𝐹 (𝑐𝑛 ); 𝐹 (𝑐))

of morphisms for any objects 𝑐1 , ⋯ , 𝑐𝑛 , 𝑐 ∈ 𝐶1 , such that the map of morphisms


preserves unit, composition, and is 𝑆𝑛 -equivariant.
Thus, we may define the category of single-colored operads and colored oper-
ads with values in C, which we will denote by Op(C) and COp(C).
Now we are ready to introduce what an algebra over a (colored) operad is.

Definition 1.6. Suppose O is a colored operad, taking values in a symmetric monoidal category C, with set of colors 𝐶, and D is a symmetric monoidal C-enriched category.
An algebra over O in D is a pair (𝑋, 𝜃), where $X = \{X_c\}_{c \in C}$ is a collection of objects in D indexed by 𝐶, and 𝜃 ∶ O → End(𝑋) is a map of colored operads, such that $\theta(c) = X_c$ for all $c \in C$. Equivalently, an algebra over O in D is a map O → End(D) of colored operads.
A map 𝑓 between algebras (𝑋, 𝜃) and (𝑌 , 𝜉) over O in D consists of a map $f \colon X_c \to Y_c$ for every $c \in C$, such that the following diagram is commutative for any colors $c_1, \dots, c_n, c \in C$:

\[
\begin{CD}
\operatorname{Hom}(c_1, \dots, c_n; c) @>{\theta}>> \operatorname{Hom}(X_{c_1}, \dots, X_{c_n}; X_c) \\
@V{\xi}VV @VV{f_*}V \\
\operatorname{Hom}(Y_{c_1}, \dots, Y_{c_n}; Y_c) @>{f^*}>> \operatorname{Hom}(X_{c_1}, \dots, X_{c_n}; Y_c).
\end{CD}
\]

The category of algebras over O in D will be denoted by Alg(O, D).

Loosely speaking, an algebra over an operad is a structure such that the “formal
operations” in an operad turn into “actual operations” between certain objects.
The language of operads allows us to interpret different structures in a common language. We first state a lemma, whose proof is simply a reinterpretation of the definitions:
Lemma 1.7. Suppose O is a colored operad, taking values in a symmetric monoidal category C, with set of colors 𝐶, and D is a symmetric monoidal C-enriched category that is tensored over C. Giving an algebra (𝑋, 𝜃) over O in D is equivalent to giving a set of objects $X = \{X_c\}_{c \in C}$ in D indexed by 𝐶, and maps

\[
\theta \colon \operatorname{Hom}(c_1, \dots, c_n; c) \otimes X_{c_1} \otimes \cdots \otimes X_{c_n} \to X_c,
\]

such that the unital, $S_n$-equivariance and associativity diagrams commute.
We now state some basic examples.
Example 1.8. Suppose C is a symmetric monoidal cocomplete category, D is a symmetric monoidal C-enriched category. Define the (single-colored) operad Assoc to be the operad with:
i) $\mathrm{Assoc}(n) = \coprod_{\sigma \in S_n} (\mathbb{1}_{\mathsf{C}})_\sigma$;
ii) The unit map is the identity map of the unit object;
iii) The $S_n$-action is given by right multiplication;
iv) The composition map is induced by the inclusion $S_k \times S_{n_1} \times \cdots \times S_{n_k} \to S_{n_1 + \cdots + n_k}$, which is in turn induced by an action of $S_k \times S_{n_1} \times \cdots \times S_{n_k}$ on $\{1, \dots, n_1 + \cdots + n_k\}$: we first divide the set into 𝑘 blocks, the 𝑖-th block having size $n_i$, then permute the blocks by the element of $S_k$, and permute each block by the element of $S_{n_i}$.
Then an algebra over Assoc is equivalent to a monoid object in D, and a map of algebras over Assoc is equivalent to a homomorphism between two monoid objects. In particular, if D = $\mathrm{Mod}_R$ for some ring 𝑅, these are exactly associative 𝑅-algebras and homomorphisms between associative 𝑅-algebras; if D = $\mathrm{Ch}_R$ for some ring 𝑅, these are exactly associative dg-𝑅-algebras and homomorphisms between associative dg-𝑅-algebras.
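The block permutation behind the composition map of Assoc can be sketched in Python. This is only an illustration under an assumed convention: permutations are 0-indexed lists, and position `j` of the output receives the entry at position `perm[j]` of the input. The text does not fix a left or right convention, and the function name `block_permutation` is hypothetical.

```python
def block_permutation(sigma, taus):
    """Combine sigma in S_k with tau_i in S_{n_i} into a permutation of n_1 + ... + n_k.

    The set {0, ..., n_1 + ... + n_k - 1} is cut into k consecutive blocks,
    each block is permuted by its tau_i, and the blocks are then permuted by sigma.
    """
    assert len(sigma) == len(taus)
    blocks, start = [], 0
    for tau in taus:
        block = list(range(start, start + len(tau)))
        # Permute inside the block: output position j gets block[tau[j]].
        blocks.append([block[tau[j]] for j in range(len(tau))])
        start += len(tau)
    # Permute the blocks themselves: output block i is blocks[sigma[i]].
    permuted = [blocks[sigma[i]] for i in range(len(sigma))]
    return [x for block in permuted for x in block]


sigma = [1, 0]                    # swap the two blocks
taus = [[0, 1], [2, 0, 1]]        # identity on a block of size 2, a 3-cycle on a block of size 3
result = block_permutation(sigma, taus)
print(result)
```

With the inputs above, the second block `[2, 3, 4]` becomes `[4, 2, 3]` and is moved in front of the first block `[0, 1]`.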

Example 1.9. Suppose C is a symmetric monoidal category, D is a symmetric monoidal C-enriched category. Define the (single-colored) operad Comm to be the operad with:

i) $\mathrm{Comm}(n) = \mathbb{1}_{\mathsf{C}}$;

ii) The unit map and the composition maps are the identity map of the unit object;

iii) The $S_n$-action is the trivial action.

Then an algebra over Comm is equivalent to a commutative monoid object in D, and a map of algebras over Comm is equivalent to a homomorphism between two commutative monoid objects. In particular, if D = $\mathrm{Mod}_R$ for some ring 𝑅, these are exactly commutative 𝑅-algebras and homomorphisms between commutative 𝑅-algebras; if D = $\mathrm{Ch}_R$ for some ring 𝑅, these are exactly commutative dg-𝑅-algebras and homomorphisms between commutative dg-𝑅-algebras.

Higher operads
In practice we often have to change the underlying category of an operad. For
example, we sometimes face the problem of changing the underlying category of
an operad O from the category of topological spaces to some other category (for
example Ch(𝑅) for a ring 𝑅). If there exists a symmetric monoidal functor between
the underlying categories (in the case Ch(𝑅) we have the singular chain complex
functor), then we may take the pushforward of O along the symmetric monoidal
functor; however this is not always the case. In this section, we will introduce the
notion of a higher operad, which solves this problem nicely for topological operads.
To do this, we first review how we get a symmetric monoidal ∞-category from an ordinary symmetric monoidal category.

Construction 1.10. For a symmetric monoidal category C, we may construct a category $\mathsf{C}^\otimes$ as follows:

i) The objects of $\mathsf{C}^\otimes$ are finite (possibly empty) sequences of objects in C, which will be written in the form $[C_1, \dots, C_n]$;

ii) For any $[C_1, \dots, C_n], [D_1, \dots, D_m] \in \mathsf{C}^\otimes$, a morphism

\[
f \colon [C_1, \dots, C_n] \to [D_1, \dots, D_m]
\]

consists of a subset $S \subseteq \{1, \dots, n\}$, a map $f \colon S \to \{1, \dots, m\}$, and a map

\[
f_i \colon \bigotimes_{j \in f^{-1}(i)} C_j \to D_i
\]

for any $1 \le i \le m$;

iii) Suppose

\[
f \colon [C_1, \dots, C_n] \to [D_1, \dots, D_m], \qquad g \colon [D_1, \dots, D_m] \to [E_1, \dots, E_l]
\]

are morphisms in $\mathsf{C}^\otimes$, determining subsets $S \subseteq \{1, \dots, n\}$ and $T \subseteq \{1, \dots, m\}$. The composition $g \circ f$ consists of the set $S' = f^{-1}(T) \subseteq \{1, \dots, n\}$, the map

\[
g \circ (f|_{S'}) \colon S' \to \{1, \dots, l\},
\]

and the composition

\[
(g \circ f)_i \colon \bigotimes_{k \in (g \circ (f|_{S'}))^{-1}(i)} C_k \cong \bigotimes_{j \in g^{-1}(i)} \bigotimes_{k \in f^{-1}(j)} C_k \to \bigotimes_{j \in g^{-1}(i)} D_j \to E_i
\]

for every $1 \le i \le l$.
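The index bookkeeping in this composition rule can be illustrated with a short Python sketch. It models only the underlying partial maps (the subset together with the map on it) as dictionaries and ignores the maps $f_i$ between tensor products; the names `compose` and `fiber` are hypothetical helpers introduced for illustration.

```python
def compose(f, g):
    """Compose partial maps given as dicts; the result is defined on f^{-1}(dom g)."""
    return {k: g[f[k]] for k in f if f[k] in g}


def fiber(f, i):
    """The preimage f^{-1}(i), as a sorted list."""
    return sorted(k for k, v in f.items() if v == i)


# f is defined on S = {1, 2, 3} inside {1, 2, 3, 4}, with values in {1, 2}.
f = {1: 1, 2: 1, 3: 2}
# g is defined on T = {1} inside {1, 2}, with values in {1}.
g = {1: 1}

gf = compose(f, g)
S_prime = sorted(gf)     # the set S' = f^{-1}(T)

# The fiber of g∘f over i is the union of the fibers of f over the fiber of g over i,
# which is what makes the regrouping isomorphism of tensor products possible.
nested = sorted(k for j in fiber(g, 1) for k in fiber(f, j))
print(S_prime, fiber(gf, 1), nested)
```

Sending every index outside the domain to the base point turns such a partial map into a pointed map between finite pointed sets, which is the form used for the category Fin∗ below.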

Next, take the category Fin∗ , the category of pointed finite sets, where the ob-
jects are ⟨𝑛⟩ ∶= {∗, 1, ⋯ , 𝑛} for all nonnegative integers 𝑛, and the morphisms are
all maps that preserve the base point. Then we have a forgetful functor 𝑝 ∶ C⊗ →
Fin∗ , that carries an object [𝐶1 , ⋯ , 𝐶𝑛 ] to the set ⟨𝑛⟩. Moreover, 𝑝 satisfies the
following properties:
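Proposition 1.11. i) 𝑝 is a coCartesian fibration, which means that for any

\[
\alpha \colon \langle n \rangle \to \langle m \rangle \in \mathsf{Fin}_*, \qquad C = [C_1, \dots, C_n] \in \mathsf{C}^\otimes,
\]

there exists a map

\[
C \to D = [D_1, \dots, D_m] \in \mathsf{C}^\otimes
\]

lifting 𝛼 that is coCartesian in the following sense: for any $E \in \mathsf{C}^\otimes$ the map

\[
\operatorname{Hom}_{\mathsf{C}^\otimes}(D, E) \to \operatorname{Hom}_{\mathsf{C}^\otimes}(C, E) \times_{\operatorname{Hom}_{\mathsf{Fin}_*}(\langle n \rangle, p(E))} \operatorname{Hom}_{\mathsf{Fin}_*}(\langle m \rangle, p(E))
\]

is an isomorphism. This implies that there exists a functor $\alpha_! \colon \mathsf{C}^\otimes_{\langle n \rangle} \to \mathsf{C}^\otimes_{\langle m \rangle}$ between the fibers.
ii) The map

\[
\mathsf{C}^\otimes_{\langle n \rangle} \to (\mathsf{C}^\otimes_{\langle 1 \rangle})^n \cong \mathsf{C}^n
\]

induced by the 𝑛 maps $\rho_i \colon \langle n \rangle \to \langle 1 \rangle$ for every $1 \le i \le n$, where $\rho_i(i) = 1$ and $\rho_i(j) = *$ if $i \ne j$, is an equivalence of categories.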

Thus the following definition is reasonable:

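Definition 1.12. A symmetric monoidal ∞-category is an ∞-category $\mathsf{C}^\otimes$ equipped with a coCartesian fibration $p \colon \mathsf{C}^\otimes \to \mathrm{N}(\mathsf{Fin}_*)$ (where N is the nerve functor that takes a category to its nerve), such that for every $n \ge 0$, the map $\mathsf{C}^\otimes_{\langle n \rangle} \to (\mathsf{C}^\otimes_{\langle 1 \rangle})^n$ induced by the 𝑛 maps $\rho_i \colon \langle n \rangle \to \langle 1 \rangle$ for every $1 \le i \le n$, is an equivalence of ∞-categories. The underlying category of $\mathsf{C}^\otimes$ is defined to be $\mathsf{C} := \mathsf{C}^\otimes_{\langle 1 \rangle}$.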

We now turn back to the case of (colored) operads.

Construction 1.13. For any colored operad O over Set, we may define a category $\mathsf{O}^\otimes$ as follows:

i) The objects of $\mathsf{O}^\otimes$ are finite (possibly empty) sequences of colors in O, which will be written in the form $[C_1, \dots, C_n]$;

ii) For any $[C_1, \dots, C_n], [D_1, \dots, D_m] \in \mathsf{O}^\otimes$, a morphism

\[
f \colon [C_1, \dots, C_n] \to [D_1, \dots, D_m]
\]

consists of a map $f \colon \langle n \rangle \to \langle m \rangle$, and an element $f_i \in \operatorname{Hom}((C_j)_{j \in f^{-1}(i)}; D_i)$ for any $1 \le i \le m$;

iii) Composition of morphisms in $\mathsf{O}^\otimes$ is determined by the composition in $\mathsf{Fin}_*$ and in the colored operad O.

Then we have a forgetful functor $p \colon \mathsf{O}^\otimes \to \mathsf{Fin}_*$, which carries an object $[C_1, \dots, C_n]$ to the set $\langle n \rangle$. Moreover, 𝑝 satisfies properties similar to those of the functor $\mathsf{C}^\otimes \to \mathsf{Fin}_*$ constructed above. Consequently, we may define ∞-operads in a way similar to that of symmetric monoidal ∞-categories. We make the following definition, which is due to [Lur16]:

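Definition 1.14. An ∞-operad is an ∞-category $\mathsf{O}^\otimes$ equipped with a functor $p \colon \mathsf{O}^\otimes \to \mathrm{N}(\mathsf{Fin}_*)$, such that:

i) For any inert morphism $f \colon \langle n \rangle \to \langle m \rangle$, which means that $f^{-1}(\{1, \dots, m\}) \to \{1, \dots, m\}$ is a bijection, and any $C \in \mathsf{O}^\otimes_{\langle n \rangle}$, there exists a 𝑝-coCartesian morphism $C \to D$ in $\mathsf{O}^\otimes$ lifting 𝑓;

ii) For every $C_1, \dots, C_n \in \mathsf{O}^\otimes_{\langle 1 \rangle}$, there exist $C \in \mathsf{O}^\otimes_{\langle n \rangle}$ and 𝑝-coCartesian morphisms $C \to C_i$ covering $\rho_i$;

iii) For any objects $C \in \mathsf{O}^\otimes_{\langle n \rangle}$, $C' \in \mathsf{O}^\otimes_{\langle m \rangle}$, and a morphism $f \in \operatorname{Hom}_{\mathsf{Fin}_*}(\langle n \rangle, \langle m \rangle)$, take

\[
\operatorname{Map}^f_{\mathsf{O}^\otimes}(C, C') = \operatorname{Map}_{\mathsf{O}^\otimes}(C, C') \times_{\operatorname{Map}_{\mathrm{N}(\mathsf{Fin}_*)}(\langle n \rangle, \langle m \rangle)} \{f\},
\]

and take 𝑝-coCartesian morphisms $C' \to C'_i$ covering $\rho_i$ for every $1 \le i \le m$; then the induced map

\[
\operatorname{Map}^f_{\mathsf{O}^\otimes}(C, C') \to \prod_{1 \le i \le m} \operatorname{Map}^{\rho_i \circ f}_{\mathsf{O}^\otimes}(C, C'_i)
\]

is an isomorphism in H.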
Remark 1.15. i) By this definition, for any Set-valued colored operad $\mathcal{O}$, $\mathrm{N}(\mathcal{O}^{\otimes})$ is an ∞-operad. Thus this definition generalizes the definition of a (Set-valued colored) operad to the higher categorical setting.
ii) If $\mathcal{O}^{\otimes}$ is an ∞-operad, we define the underlying category of $\mathcal{O}^{\otimes}$ to be $\mathcal{O} := \mathcal{O}^{\otimes}_{\langle 1\rangle}$. This is an ∞-category, since $p$ must be an inner fibration by [Lur09], Proposition 2.3.1.5.
iii) By definition, for every $n \ge 0$ the map $\mathcal{O}^{\otimes}_{\langle n\rangle} \to \mathcal{O}^{n}$, induced by the $n$ maps $\rho_i \colon \langle n\rangle \to \langle 1\rangle$ for every $1 \le i \le n$, is an equivalence of ∞-categories. Thus, as in the ordinary case, we may regard objects in $\mathcal{O}^{\otimes}_{\langle n\rangle}$ as finite sequences of objects in $\mathcal{O}$ of length $n$.
iv) Any symmetric monoidal ∞-category $\mathcal{C}^{\otimes}$ is an ∞-operad. This is because the condition that $p$ is a coCartesian fibration guarantees i) and ii), and the equivalence between $\mathcal{C}^{\otimes}_{\langle n\rangle}$ and $\mathcal{C}^{n}$ guarantees iii).
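As a sanity check on Construction 1.13 and Remark 1.15 i), one can work out the smallest example by hand. The sketch below is an illustration that is not taken from the source; it assumes that Comm denotes the single-colored operad with exactly one operation in every arity, and writes $[\ast]^n$ as shorthand for the sequence $[\ast, \dots, \ast]$ of length $n$.

```latex
% Worked example (illustrative): the commutative operad Comm,
% with a single color * and exactly one operation of every arity.
\begin{align*}
\mathrm{Ob}(\mathrm{Comm}^{\otimes})
  &= \{\, [\ast]^n : n \ge 0 \,\} \;\cong\; \{\, \langle n\rangle : n \ge 0 \,\},\\
\mathrm{Hom}_{\mathrm{Comm}^{\otimes}}\bigl([\ast]^n, [\ast]^m\bigr)
  &= \coprod_{f \colon \langle n\rangle \to \langle m\rangle}\;
     \prod_{1 \le i \le m} \mathrm{Hom}\bigl((\ast)_{j \in f^{-1}(i)}; \ast\bigr)
   \;\cong\; \mathrm{Hom}_{\mathrm{Fin}_*}\bigl(\langle n\rangle, \langle m\rangle\bigr).
\end{align*}
% Each factor Hom((*)_j ; *) is a one-point set, so a morphism is determined by f alone.
% Hence p : Comm^{\otimes} -> Fin_* is an isomorphism of categories, and
% N(Comm^{\otimes}) \cong N(Fin_*), with p the identity, is an \infty-operad.
```

In this case the conditions of Definition 1.14 hold trivially, since every morphism is $p$-coCartesian when $p$ is an isomorphism.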

Next we define morphisms between ∞-operads.


Definition 1.16. Suppose O⊗ , O′⊗ are ∞-operads. A map of ∞-operads from
O⊗ to O′⊗ is a map of ∞-categories 𝑓 ∶ O⊗ → O′⊗ , such that:
i) 𝑓 is compatible with the forgetful maps to N(Fin∗ );

ii) 𝑓 preserves inert morphisms, where a morphism in O⊗ is inert if it is 𝑝-coCartesian, and its image in N(Fin∗ ) is inert.

The full sub-∞-category of $\mathrm{Fun}_{\mathrm{N}(\mathrm{Fin}_*)}(\mathcal{O}^{\otimes}, \mathcal{O}'^{\otimes})$ spanned by maps of ∞-operads will be denoted Alg(O, O′ ). A map of ∞-operads from O⊗ to O′⊗ is also called an algebra over O⊗ in O′⊗ , if O′⊗ is a symmetric monoidal ∞-category.
Remark 1.17. i) Since by definition, if 𝑓 ∶ O → O′ is a map of (ordinary) oper-
ads, the map N(𝑓 ⊗ ) ∶ N(O⊗ ) → N(O′⊗ ) is a map of ∞-operads, this is really
a generalization of the original definition.
ii) By this definition, we obtain an ∞-category of algebras over O⊗ in O′⊗ .
iii) If O′⊗ is a symmetric monoidal ∞-category, then the symmetric monoidal
structure on O′⊗ induces a symmetric monoidal structure on Alg(O, O′ ), where
for any 𝑓 , 𝑔 ∈ Alg(O, O′ ) and 𝑝 ∈ O⊗ , (𝑓 ⊗𝑔)(𝑝) = 𝑓 (𝑝)⊗𝑔(𝑝). Verifications
of the axioms are straightforward.

We now show how these definitions solve the problem mentioned at the beginning of the section. We first notice that the construction of O⊗ also makes sense in the case of a simplicial colored operad, except that we need to change ii) of Construction 1.13 into the following construction:

ii’) For any $[C_1, \dots, C_n], [D_1, \dots, D_m] \in \mathcal{O}^{\otimes}$, we define

$$ \mathrm{Hom}_{\mathcal{O}^{\otimes}}\bigl([C_1, \dots, C_n], [D_1, \dots, D_m]\bigr) = \coprod_{f \in \mathrm{Hom}_{\mathrm{Fin}_*}(\langle n\rangle, \langle m\rangle)} \; \prod_{1 \le i \le m} \mathrm{Hom}\bigl((C_j)_{j \in f^{-1}(i)}; D_i\bigr). $$

Since O⊗ is a simplicial category, we may take its simplicial nerve to obtain a simplicial set N⊗ (O), which we will call the operadic nerve of O. This simplicial set may or may not be an ∞-category, but we have the following proposition:

Proposition 1.18. Suppose all the Hom-spaces of O are Kan complexes. Then
N⊗ (O) is an ∞-operad.

We refer the reader to [Lur16], Proposition 2.1.1.27 for the proof.


In the case that O is a topological operad, we may take the pushforward of
O along the symmetric monoidal functor Sing ∶ K → SSet, and repeat the above
constructions, obtaining a simplicial set N⊗ (Sing O), which we will simply denote
by N⊗ (O), called the operadic nerve of O. Now the operadic nerve of a topological
operad is always an ∞-operad, since Sing(𝑋) is always a Kan complex for any
topological space 𝑋.
Since any symmetric monoidal category C is a symmetric monoidal ∞-
category, we may regard an algebra over O in C as an algebra over N⊗ (O) in
N⊗ (C).

Remark 1.19. If a map of topological colored operads induces weak equivalences on all morphism spaces, then the operadic nerves of the two topological colored operads are equivalent ∞-categories; thus the associated ∞-categories of algebras over the two topological colored operads are equivalent.

The little cubes operads and 𝐸𝑛 ‐algebras


The little cubes operads are one of the most often discussed types of operads. We
give the definitions of the little cubes operads and 𝐸𝑛 -algebras, and basic examples
of 𝐸𝑛 -algebras.

Definition 1.20. A rectilinear morphism is a map 𝑓 ∶ ℝ𝑛 → ℝ𝑛 , such that 𝑓 can be written in the form 𝑓1 × ⋯ × 𝑓𝑛 , where each 𝑓𝑖 ∶ ℝ → ℝ is an affine morphism.

Definition 1.21. Suppose $n \ge 1$ is an integer. We define the topological single-colored operad $\mathbb{E}_n$, called the little 𝒏-cubes operad, as follows:

i) For any $r \ge 0$, the morphism space $\mathbb{E}_n(r)$ is the configuration space of $r$ ordered disjoint open rectangles whose edges are parallel to the coordinate axes (each called a little 𝒏-cube) lying in the unit open cube; in other words, $\mathbb{E}_n(r)$ is the space of rectilinear embeddings from $\coprod_{i=1}^{r} (0,1)^n$ to $(0,1)^n$, which will be denoted by

$$ \mathrm{Rect}\Bigl(\coprod_{i=1}^{r} (0,1)^n,\ (0,1)^n\Bigr); $$

ii) The 𝑆𝑟 -action is given by permuting the 𝑟 little 𝑛-cubes;


iii) The unit map is the identity map of (0, 1)𝑛 ;
iv) Composition is given by the composition of the rectilinear embeddings.
An 𝑬𝒏 -algebra is an algebra over E𝑛 . The ∞-category of 𝐸𝑛 -algebras in a sym-
metric monoidal ∞-category C will be denoted E𝑛 -Alg(C) = Alg(N⊗ (E𝑛 ), C).
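The operad structure of Definition 1.21 can be made concrete in a few lines. The following is a minimal sketch, assuming that a little n-cube is stored as a tuple of (lo, hi) pairs, one per coordinate axis, describing the image of its rectilinear embedding; the names `compose_cube`, `is_configuration` and `operad_compose` are illustrative and do not come from the source.

```python
from itertools import combinations

# A little n-cube is a tuple of (lo, hi) pairs, one per axis: it is the image of
# the rectilinear embedding x_i -> lo_i + (hi_i - lo_i) * x_i of (0, 1)^n.

def compose_cube(outer, inner):
    """Image of the cube `inner` under the rectilinear embedding with image `outer`."""
    return tuple((olo + (ohi - olo) * ilo, olo + (ohi - olo) * ihi)
                 for (olo, ohi), (ilo, ihi) in zip(outer, inner))

def disjoint(c1, c2):
    """Two open boxes are disjoint iff they are separated along at least one axis."""
    return any(h1 <= l2 or h2 <= l1 for (l1, h1), (l2, h2) in zip(c1, c2))

def is_configuration(cubes):
    """Check that the cubes lie in the unit cube and are pairwise disjoint."""
    inside = all(0 <= lo < hi <= 1 for cube in cubes for lo, hi in cube)
    return inside and all(disjoint(a, b) for a, b in combinations(cubes, 2))

def operad_compose(f, gs):
    """Compose f in E_n(r) with gs = [g_1, ..., g_r]: put g_k inside the k-th cube of f."""
    if len(f) != len(gs):
        raise ValueError("need exactly one configuration per cube of f")
    return [compose_cube(outer, inner) for outer, g in zip(f, gs) for inner in g]

# Example in E_2: split the square into left and right halves, then split the left half.
f = [((0.0, 0.5), (0.0, 1.0)), ((0.5, 1.0), (0.0, 1.0))]
g1 = [((0.0, 1.0), (0.0, 0.5)), ((0.0, 1.0), (0.5, 1.0))]
g2 = [((0.0, 1.0), (0.0, 1.0))]  # the unit: the identity embedding
result = operad_compose(f, [g1, g2])
```

Composition here is literally composition of affine maps coordinate by coordinate, which is why the result is again a configuration of pairwise disjoint cubes.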
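By definition, there exists a canonical map $i \colon \mathbb{E}_n \to \mathbb{E}_{n+1}$, given by

$$ i(r) \colon \mathrm{Rect}\Bigl(\coprod_{i=1}^{r} (0,1)^n,\ (0,1)^n\Bigr) \to \mathrm{Rect}\Bigl(\coprod_{i=1}^{r} (0,1)^{n+1},\ (0,1)^{n+1}\Bigr), \qquad f \mapsto f \times \mathbb{1}_{(0,1)}, $$

which is an inclusion of topological spaces, for all $r \ge 0$. Taking colimits, we obtain an operad, which we will call the little ∞-cubes operad, denoted $\mathbb{E}_\infty$. Explicitly, we have $\mathbb{E}_\infty(r) = \operatorname{colim}_{n \to \infty} \mathbb{E}_n(r)$. An 𝑬∞ -algebra is an algebra over $\mathbb{E}_\infty$. The ∞-category of $E_\infty$-algebras in a symmetric monoidal ∞-category $\mathcal{C}$ will be denoted $\mathbb{E}_\infty\text{-Alg}(\mathcal{C}) = \mathrm{Alg}(\mathrm{N}^{\otimes}(\mathbb{E}_\infty), \mathcal{C})$.
Note that the inclusions $i$ compose to maps $\mathbb{E}_m \to \mathbb{E}_n$, which induce functors

$$ i^* \colon \mathbb{E}_n\text{-Alg}(\mathcal{C}) \to \mathbb{E}_m\text{-Alg}(\mathcal{C}) $$

for any $1 \le m \le n \le \infty$. This shows that any $E_n$-algebra can be regarded as an $E_m$-algebra if $n \ge m$.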
For one of the basic properties of E𝑛 we have the following proposition, which
is stated and proved in Theorem 4.8, [May72]:
Proposition 1.22. For any 1 ≤ 𝑛 ≤ ∞, 𝑟 ≥ 0, the space E𝑛 (𝑟) is 𝑆𝑟 -equivariantly
homotopy equivalent to Conf(ℝ𝑛 , 𝑟), the configuration space of 𝑟 ordered points in
ℝ𝑛 .
It is well known that the configuration space Conf(ℝ𝑛 , 𝑟) is (𝑛 − 2)-connected for 𝑛 ≥ 2 (see, for instance, [FN62]). This proposition shows that an 𝐸𝑛 -algebra may be regarded as an algebra that is “homotopy commutative up to the (𝑛 − 1)-th degree”. Also, note that E1 (𝑟) is a disjoint union of 𝑟! contractible spaces, while E∞ (𝑟) is weakly contractible, since the connectivity of E𝑛 (𝑟) grows with 𝑛. Therefore, the maps 𝜋0 ∶ E1 → Assoc and 𝜋0 ∶ E∞ → Comm induce weak equivalences in all morphism spaces. Thus, 𝐸1 -algebras and 𝐸∞ -algebras can be regarded as “homotopy refinements” of associative algebras and commutative algebras.

Example 1.23. Iterated loop spaces. This is one of the first examples of $E_n$-algebras that have been studied. Suppose $(X, \ast)$ is a pointed topological space. We define a map

$$ \theta \colon \mathbb{E}_n(r) \times (\Omega^n X)^r \to \Omega^n X $$

as follows: for any

$$ \mathbb{E}_n(r) \ni f \colon \coprod_{i=1}^{r} (0,1)^n \to (0,1)^n $$

given by $f_1, \dots, f_r \colon (0,1)^n \to (0,1)^n$, and $\alpha_1, \dots, \alpha_r \colon [0,1]^n \to X$ representing $r$ elements in $\Omega^n X$, $\theta(f, \alpha_1, \dots, \alpha_r)$ is the map $\beta \colon [0,1]^n \to X$ such that for any $u \in (0,1)^n$, if $u = f_k(v)$ for some $1 \le k \le r$ and $v \in (0,1)^n$, then $\beta(u) = \alpha_k(v)$; otherwise $\beta(u) = \ast$. It is easy to verify that this gives the iterated loop space $\Omega^n X$ an $E_n$-algebra structure. Moreover, the singular chain complex $C_\bullet(\Omega^n X)$ is an $E_n$-algebra in chain complexes.
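The action map of Example 1.23 can be sketched numerically in the simplest case. The code below is an illustration under stated assumptions: it takes n = 1, models the pointed space X by the real line with basepoint 0.0, represents a little interval by its endpoints (lo, hi), and represents a loop by a Python function on [0, 1] that vanishes at 0 and 1. The names `theta` and `tent` are hypothetical helpers, not notation from the source.

```python
def theta(intervals, loops, basepoint=0.0):
    """Action of E_1(r) on r loops: run loop k inside the k-th little interval.

    intervals: list of (lo, hi) with pairwise disjoint open intervals in (0, 1)
    loops:     list of functions [0, 1] -> X sending 0 and 1 to the basepoint
    Returns the loop beta described in Example 1.23.
    """
    def beta(u):
        for (lo, hi), alpha in zip(intervals, loops):
            if lo < u < hi:
                # u = f_k(v) with f_k(v) = lo + (hi - lo) * v, so v = (u - lo) / (hi - lo)
                return alpha((u - lo) / (hi - lo))
        return basepoint  # outside every little interval
    return beta

def tent(t):
    """A sample loop in X = R based at 0: rises to 1 at t = 1/2 and returns to 0."""
    return 1.0 - abs(2.0 * t - 1.0)

# Concatenate `tent` with a rescaled copy of it, using the two halves of (0, 1).
beta = theta([(0.0, 0.5), (0.5, 1.0)], [tent, lambda t: 2.0 * tent(t)])
```

With the two half-intervals this recovers the usual concatenation of loops; other points of E_1(2) give reparametrized concatenations, all homotopic to one another within each ordering of the intervals.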

𝐸𝑛 -algebras play an important role in the study of homology theories of manifolds and factorization homology; we will see this in the following sections.

2 Factorization algebras
Factorization algebras are a way to describe multilinear structures that satisfy certain homotopy coherent properties. They are somewhat similar to algebras over the little cubes operad, but this time the operad is replaced by a colored operad, with the cubes (0, 1)𝑛 replaced by manifolds. We shall now present the definitions and the basic properties, which are mostly based on [CG16] and [Gin13].

Basic definitions
Definition 2.1. Suppose 𝑋 is a topological space (which in the case of a factorization algebra is often taken to be a manifold). Define the colored operad Fact(𝑋) to be the following operad:

i) The colors of Fact(𝑋) are the connected open sets in 𝑋.

ii) For any connected open sets 𝑈1 , ⋯ , 𝑈𝑛 , 𝑈 in 𝑋, define Hom(𝑈1 , ⋯ , 𝑈𝑛 ; 𝑈 ) to be the single-element set if the 𝑈𝑖 ’s are pairwise disjoint and all contained in 𝑈 , and the empty set otherwise.

iii) The unit morphism, 𝑆𝑛 -action, and the composition are given in the obvious way.

The morphism from (𝑈1 , ⋯ , 𝑈𝑛 ) to 𝑈 can be thought of as “putting an element into each of the open sets 𝑈𝑖 and getting an element in 𝑈 ”. We now state this precisely.

Definition 2.2. Suppose C is a symmetric monoidal ∞-category and 𝑋 is a topological space. A prefactorization algebra 𝑨 over 𝑿 with values in C is an algebra over Fact(𝑋) in C. The ∞-category of prefactorization algebras over 𝑋 in C is denoted PreFA(𝑋, C). If 𝑈 = ∐𝑖∈𝐼 𝑈𝑖 is an open set in 𝑋 where each 𝑈𝑖 is connected, we define 𝐴(𝑈 ) to be ⨂𝑖∈𝐼 𝐴(𝑈𝑖 ); and we define 𝐴(∅) to be the tensor identity. A morphism between prefactorization algebras is called an equivalence if it induces an equivalence on every open subset.
Loosely speaking, a prefactorization algebra assigns an object to each open set, and a morphism to each inclusion of disjoint small open sets into a larger one.
Example 2.3. Suppose $A$ is an associative algebra. We construct a prefactorization algebra on the real line $\mathbb{R}$: for any open interval $(a, b) \subseteq \mathbb{R}$ we set $A(a, b) = A$; for any

$$ -\infty \le a \le a_1 < b_1 \le a_2 < b_2 \le \cdots \le a_n < b_n \le b \le \infty $$

we set the map

$$ A(a_1, b_1) \otimes \cdots \otimes A(a_n, b_n) \to A(a, b) $$

to be the map

$$ A^{\otimes n} \to A, \qquad x_1 \otimes \cdots \otimes x_n \mapsto x_1 \cdots x_n. $$

It is easy to show that this forms a prefactorization algebra on $\mathbb{R}$.
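The structure maps of Example 2.3 can be tried out on a small noncommutative example. The sketch below is an assumption-laden illustration: it replaces the associative algebra by a monoid given as a pair (`mul`, `unit`), represents an open interval by a pair (lo, hi), and uses the hypothetical names `disjoint_and_inside` and `structure_map`. It multiplies the inputs in the left-to-right order of their intervals, which is what makes the maps compatible with composition.

```python
from functools import reduce

def disjoint_and_inside(intervals, target):
    """Check that open intervals are pairwise disjoint and contained in `target`."""
    a, b = target
    ordered = sorted(intervals)
    inside = all(a <= lo < hi <= b for lo, hi in ordered)
    separated = all(prev[1] <= cur[0] for prev, cur in zip(ordered, ordered[1:]))
    return inside and separated

def structure_map(inputs, target, mul, unit):
    """Structure map A(a_1,b_1) x ... x A(a_n,b_n) -> A(a,b) of Example 2.3.

    inputs: list of ((lo, hi), element) pairs, in any order
    Returns the product of the elements taken from left to right along the line.
    """
    if not disjoint_and_inside([iv for iv, _ in inputs], target):
        raise ValueError("no operation: intervals must be disjoint and inside the target")
    ordered = sorted(inputs, key=lambda pair: pair[0][0])
    return reduce(mul, (x for _, x in ordered), unit)

# Words under concatenation form a noncommutative monoid, so the order matters.
concat = lambda x, y: x + y
word = structure_map([((2, 3), "b"), ((0, 1), "a")], (0, 5), concat, "")
```

Feeding in the empty list of inputs returns the unit, which matches the convention that the value on the empty set is the tensor identity.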


We now discuss the definition of a factorization algebra. Factorization algebras are to prefactorization algebras as cosheaves are to precosheaves. We first introduce a special type of cover, called a factorizing cover.
Definition 2.4. Suppose 𝑈 is an open set in a topological space 𝑋. A cover {𝑈𝑖 ↪
𝑈 }𝑖∈𝐼 is called a factorizing cover if for any 𝑥1 , ⋯ , 𝑥𝑛 ∈ 𝑈 there exists a finite
collection of pairwise disjoint open sets 𝑈𝑖1 , ⋯ , 𝑈𝑖𝑚 in the open cover such that
𝑥1 , ⋯ , 𝑥𝑛 ∈ 𝑈𝑖1 ∐ ⋯ ∐ 𝑈𝑖𝑚 .
Remark 2.5. Note that a nontrivial factorizing cover exists whenever 𝑋 is Hausdorff. If 𝑋 is a manifold, we may obtain a factorizing cover by taking a neighborhood basis at every point of 𝑋.
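Definition 2.4 can be tested by brute force in a finite toy model. The sketch below is only an illustration: it assumes the open set U is replaced by a finite set of points and the cover by a list of subsets (ignoring any topology), and the function names are hypothetical.

```python
from itertools import combinations

def pairwise_disjoint(sets):
    """True if no two of the given sets share a point."""
    return all(a.isdisjoint(b) for a, b in combinations(sets, 2))

def is_factorizing(cover, points):
    """Brute-force check of the condition in Definition 2.4 on a finite set of points."""
    cover = [frozenset(u) for u in cover]
    # All nonempty subfamilies of the cover whose members are pairwise disjoint.
    families = [fam
                for k in range(1, len(cover) + 1)
                for fam in combinations(cover, k)
                if pairwise_disjoint(fam)]
    # Every finite collection of points must lie in the union of one such family.
    for k in range(1, len(points) + 1):
        for chosen in combinations(sorted(points), k):
            if not any(set(chosen) <= frozenset().union(*fam) for fam in families):
                return False
    return True

# An ordinary cover by two overlapping sets is not factorizing:
# the points 0 and 2 cannot be separated into disjoint members.
ordinary = [{0, 1}, {1, 2}]
# Adding smaller members repairs this.
refined = [{0, 1}, {1, 2}, {0}, {2}]
```

The example mirrors the situation on the real line, where two overlapping intervals cover their union but do not form a factorizing cover of it.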
For a factorizing cover $\{U_i \hookrightarrow U\}_{i \in I}$, we denote by $PI$ the set of finite subsets $\alpha \subseteq I$ such that for all distinct $i, j \in \alpha$, the sets $U_i$ and $U_j$ are disjoint. $PI$ describes the tuples of open sets that appear in the structure maps of a prefactorization algebra.
Now, for any prefactorization algebra $A$ and any $\alpha_1, \dots, \alpha_n \in PI$, we denote

$$ A(\alpha_1, \dots, \alpha_n) = A\Bigl(\coprod_{i_1 \in \alpha_1, \dots, i_n \in \alpha_n} U_{i_1} \cap \cdots \cap U_{i_n}\Bigr); $$

then there exists a natural map

$$ d_i \colon A(\alpha_1, \dots, \alpha_n) \to A(\alpha_1, \dots, \widehat{\alpha_i}, \dots, \alpha_n) $$

for every integer $1 \le i \le n$, and a natural map

$$ s_i \colon A(\alpha_1, \dots, \alpha_n) \to A(\alpha_1, \dots, \alpha_i, \alpha_i, \dots, \alpha_n) $$

for every integer $1 \le i \le n$. Together these define a simplicial object $\check{C}_\bullet(I, A)$ in $\mathcal{C}$, where

$$ \check{C}_n(I, A) = \coprod_{\alpha \in PI^{n+1}} A(\alpha), $$

and the face and degeneracy maps are given above. We call this the simplicial Čech complex of $I$. Note that there exists a map of simplicial objects $\check{C}_\bullet(I, A) \to A(U)$, where we regard $A(U)$ as the constant simplicial object.
We now assume that C is a stable ∞-category, whose tensor product is compatible with colimits over simplicial diagrams (say, for example, ℂ𝕙(𝕜)). Then by the ∞-categorical Dold-Kan correspondence and taking the colimit, we obtain an object $\check{C}(I, A)$ from the simplicial Čech complex. We call this the Čech complex of 𝐼. Moreover we have a canonical map $\check{C}(I, A) \to A(U)$.
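To make the construction less abstract, the two lowest levels of the simplicial Čech complex can be written out directly from the formulas above. This is a sketch that assumes, for the description of $A(\alpha)$, that each $U_i$ is connected, so that the convention of Definition 2.2 applies.

```latex
% Lowest levels of the simplicial Čech complex of a factorizing cover {U_i -> U}.
\begin{align*}
\check{C}_0(I, A) &= \coprod_{\alpha \in PI} A(\alpha),
  & A(\alpha) &= A\Bigl(\coprod_{i \in \alpha} U_i\Bigr) = \bigotimes_{i \in \alpha} A(U_i),\\
\check{C}_1(I, A) &= \coprod_{(\alpha, \beta) \in PI^{2}} A(\alpha, \beta),
  & A(\alpha, \beta) &= A\Bigl(\coprod_{i \in \alpha,\ j \in \beta} U_i \cap U_j\Bigr).
\end{align*}
% The two face maps \check{C}_1 -> \check{C}_0 are
%   d_1 : A(\alpha, \beta) -> A(\beta),   induced by U_i \cap U_j \subseteq U_j,
%   d_2 : A(\alpha, \beta) -> A(\alpha),  induced by U_i \cap U_j \subseteq U_i,
% and the augmentation \check{C}_0 -> A(U) is given on A(\alpha) by the structure map
%   \bigotimes_{i \in \alpha} A(U_i) -> A(U).
```

Both composites from level 1 to $A(U)$ agree, because each is the structure map attached to the inclusion of the disjoint sets $U_i \cap U_j$ into $U$.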
We are now ready to state the definition of a factorization algebra:

Definition 2.6. A prefactorization algebra 𝐴 is said to be a homotopy factorization algebra, or a factorization algebra for short, if for any factorizing cover {𝑈𝑖 ↪ 𝑈 }𝑖∈𝐼 , the morphism $\check{C}(I, A) \to A(U)$ is an equivalence in C. The full subcategory of PreFA(𝑋, C) spanned by factorization algebras will be denoted FA(𝑋, C).

Note that in the case that C = ℂ𝕙(𝕜) where 𝕜 is a field, the above definition is equivalent to requiring that the chain map $\check{C}(I, A) \to A(U)$ be a quasi-isomorphism; a morphism between factorization algebras is an equivalence if and only if it induces a quasi-isomorphism on every open subset.
Naively speaking, homotopy factorization algebras are to prefactorization algebras as homotopy cosheaves are to precosheaves, where we change the underlying site to the category of open subsets of 𝑋 with coverings given by factorizing covers.
In the case that 𝑋 is a manifold, there exists a notion of a locally constant factorization algebra, which we define as follows:

Definition 2.7. Suppose 𝑋 is an 𝑛-dimensional topological manifold. We say that an open subset 𝑈 of 𝑋 is a disk if it is homeomorphic to ℝ𝑛 . A (pre-)factorization algebra 𝐴 on 𝑋 is called locally constant, if for any inclusion 𝑈 ↪ 𝑉 of disks, the map 𝐴(𝑈 ) → 𝐴(𝑉 ) is an equivalence.

Remark 2.8. In Example 3.24, we will prove that a locally constant prefactorization algebra can be turned into a factorization algebra, using factorization homology. We denote the full subcategory of FA(𝑋, C) spanned by locally constant factorization algebras by FA𝑙𝑐 (𝑋, C).

Basic properties
We now discuss basic operations and properties of factorization algebras. Most of
them will be useful in the next section.
Firstly, we have the following proposition.

Proposition 2.9. The categories PreFA(𝑋, C), FA(𝑋, C), FA𝑙𝑐 (𝑋, C) are symmetric
monoidal ∞-categories.

Proof. By Remark 1.17, the ∞-category PreFA(𝑋, C) is symmetric monoidal. In order to verify that FA(𝑋, C) and FA𝑙𝑐 (𝑋, C) are symmetric monoidal, it suffices to verify that for any 𝑓 , 𝑔 ∈ FA(𝑋, C) (resp. FA𝑙𝑐 (𝑋, C)), the tensor product of 𝑓 and 𝑔 again satisfies the axiom of FA(𝑋, C) (resp. FA𝑙𝑐 (𝑋, C)). But this is straightforward since the tensor product in C is compatible with colimits over simplicial diagrams and equivalences are compatible with the tensor product. ◻

Next we discuss changing the underlying topological space $X$. Suppose we are given a continuous map $f \colon X \to Y$ of topological spaces. Just as for precosheaves and cosheaves, we can take the pushforward of a (pre-)factorization algebra $A$ on $X$ to a (pre-)factorization algebra $f_*(A)$ on $Y$, given by

$$ f_*(A)(V) = A(f^{-1}(V)) $$

for every open set $V \subseteq Y$. If $A$ is a prefactorization algebra, then $f_*(A)$ is a prefactorization algebra, since it is the composite

$$ \mathrm{Fact}(Y) \xrightarrow{\ f^{-1}\ } \mathrm{Fact}(X) \xrightarrow{\ A\ } \mathcal{C}^{\otimes}. $$

If $A$ is a factorization algebra, then since the pullback of any factorizing cover on $Y$ by $f$ is again a factorizing cover, $f_*(A)$ is a factorization algebra. Furthermore, we have the following property:

Proposition 2.10. Suppose $f \colon X \to Y$ is a locally trivial fibration between manifolds. Then the pushforward of any locally constant factorization algebra $A$ is again locally constant.

Proof. Suppose $U \hookrightarrow V$ is an inclusion of disks in $Y$. Since $V$ is contractible, by assumption, we may assume that $f^{-1}(V) = V \times F$ for some manifold $F$. Take a factorizing cover of $F$ by disks, namely $\{F_i \to F\}_{i \in I}$. Then $\{V \times F_i \to V \times F\}_{i \in I}$ and $\{U \times F_i \to U \times F\}_{i \in I}$ are factorizing covers by disks. On the other hand, for any $i \in I$, the map $A(U \times F_i) \to A(V \times F_i)$ is an equivalence. Thus the map $\check{C}(I, U \times F) \to \check{C}(I, V \times F)$ is an equivalence, which means that the map $f_*(A)(U) \to f_*(A)(V)$ is an equivalence. ◻
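Two small instances of the pushforward formula may help fix ideas. They are illustrations computed from $f_*(A)(V) = A(f^{-1}(V))$ and the convention of Definition 2.2 for disjoint unions; the fold map $\nabla$ and the labels $(a,b)_1$, $(a,b)_2$ for the two copies of an interval are notation introduced here, not in the source.

```latex
% Pushforward along the map to a point, and along the fold map of two lines.
\begin{align*}
p \colon X \to \ast: \qquad
  & p_*(A)(\ast) = A\bigl(p^{-1}(\ast)\bigr) = A(X),\\
\nabla \colon \mathbb{R} \sqcup \mathbb{R} \to \mathbb{R}: \qquad
  & \nabla_*(A)\bigl((a,b)\bigr) = A\bigl((a,b)_1 \sqcup (a,b)_2\bigr)
    = A\bigl((a,b)_1\bigr) \otimes A\bigl((a,b)_2\bigr).
\end{align*}
% In the first case the pushforward records the global sections of A.
% In the second case the value on an interval is the tensor product of the
% values of A on the two copies of that interval.
```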

On the other hand, if we have an inclusion 𝑖 ∶ 𝑌 → 𝑋 such that 𝑌 is an open subspace of 𝑋, and 𝐴 is a (locally constant) (pre-)factorization algebra on 𝑋, then we may restrict 𝐴 to 𝑌 ; namely, reduce the domain of 𝐴 to the smaller colored operad Fact(𝑌 ). We denote this restriction by 𝐴|𝑌 . When 𝐴 is a (locally constant) (pre-)factorization algebra, so is 𝐴|𝑌 .
We next discuss how to extend a “partially defined” factorization algebra to a “real” factorization algebra.
Definition 2.11. Suppose 𝒰 is an open cover of 𝑋. Define Fact(𝒰) to be the full suboperad of Fact(𝑋) spanned by those colors in 𝒰. A 𝓤-prefactorization algebra is an algebra over Fact(𝒰). A 𝓤-factorization algebra is a 𝒰-prefactorization algebra 𝐴 such that for any factorizing cover {𝑈𝑖 ↪ 𝑈 }𝑖∈𝐼 with all open sets in 𝒰, the morphism $\check{C}(I, A) \to A(U)$ is an equivalence.

Now, suppose 𝒰 is a topological basis of 𝑋 that is stable under finite intersections (such a 𝒰 is called a factorizing basis). We state the following proposition, which allows us to extend a 𝒰-factorization algebra to an ordinary factorization algebra:
Proposition 2.12. Suppose $A$ is a $\mathcal{U}$-factorization algebra. For any open subset $V$ of $X$, define

$$ i^{\mathcal{U}}_*(A)(V) := \check{C}(\mathcal{U}_V, A), $$

where $\mathcal{U}_V$ consists of those open sets in $\mathcal{U}$ that are contained in $V$. Then $i^{\mathcal{U}}_*(A)$ is a factorization algebra, whose restriction to $\mathrm{Fact}(\mathcal{U})$ is equivalent to $A$. Moreover, if $A'$ is a factorization algebra whose restriction to $\mathrm{Fact}(\mathcal{U})$ is equivalent to $A$, then $A'$ is equivalent to $i^{\mathcal{U}}_*(A)$. Furthermore, the construction $i^{\mathcal{U}}_*$ is functorial.

The proof is rather technical, so we refer the interested reader to [CG16], Appendix B. We note that the uniqueness follows directly from the definition of a factorization algebra.
This proposition tells us that a factorization algebra is determined by its values on a factorizing basis. This is particularly useful in the case of a smooth manifold, since open disks form a factorizing basis, and the value of a factorization algebra on a disk is usually easy to determine.
Next, we discuss what happens if we take the product of two spaces. Suppose 𝑋, 𝑌 are topological spaces and 𝜋 ∶ 𝑋 × 𝑌 → 𝑋 is the projection. By the pushforward construction above, 𝜋 induces a functor 𝜋∗ ∶ PreFA(𝑋 × 𝑌 ) → PreFA(𝑋). We make a stronger claim:
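Proposition 2.13. There exists a canonical functor

$$ \pi^* \colon \mathrm{PreFA}(X \times Y) \to \mathrm{PreFA}(X, \mathrm{PreFA}(Y)) $$

such that the following diagram is commutative:

$$ \begin{array}{ccc}
\mathrm{PreFA}(X \times Y) & \xrightarrow{\ \pi^*\ } & \mathrm{PreFA}(X, \mathrm{PreFA}(Y)) \\
 & {\scriptstyle \pi_*}\searrow & \big\downarrow {\scriptstyle \mathrm{PreFA}(X, p_*)} \\
 & & \mathrm{PreFA}(X).
\end{array} $$

Here $p \colon Y \to \ast$ is the canonical map.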



Proof. We do the construction as follows. Suppose 𝐴 is a prefactorization algebra on 𝑋 × 𝑌 . For any 𝑈 ⊆ 𝑋, we set 𝜋 ∗ (𝐴)(𝑈 ) to be the prefactorization algebra on 𝑌 which assigns to an open set 𝑉 ⊆ 𝑌 the object 𝜋 ∗ (𝐴)(𝑈 )(𝑉 ) = 𝐴(𝑈 × 𝑉 ), and for any connected open sets 𝑉1 , ⋯ , 𝑉𝑛 , 𝑉 in 𝑌 , such that the 𝑉𝑖 ’s are pairwise disjoint and all contained in 𝑉 , define the map

𝜋 ∗ (𝐴)(𝑈 )(𝑉1 ) ⊗ ⋯ ⊗ 𝜋 ∗ (𝐴)(𝑈 )(𝑉𝑛 ) → 𝜋 ∗ (𝐴)(𝑈 )(𝑉 )

to be the structure map

𝐴(𝑈 × 𝑉1 ) ⊗ ⋯ ⊗ 𝐴(𝑈 × 𝑉𝑛 ) → 𝐴(𝑈 × 𝑉 ).

Since 𝐴 is a prefactorization algebra, the axioms of a prefactorization algebra ensure that 𝜋 ∗ (𝐴)(𝑈 ) is a prefactorization algebra on 𝑌 . Moreover, the axioms of a prefactorization algebra ensure that for any connected open sets 𝑈1 , ⋯ , 𝑈𝑛 , 𝑈 in 𝑋, such that the 𝑈𝑖 ’s are pairwise disjoint and all contained in 𝑈 , the collection of the maps

𝜋 ∗ (𝐴)(𝑈1 )(𝑉 ) ⊗ ⋯ ⊗ 𝜋 ∗ (𝐴)(𝑈𝑛 )(𝑉 )


≅ 𝐴(𝑈1 × 𝑉 ) ⊗ ⋯ ⊗ 𝐴(𝑈𝑛 × 𝑉 ) → 𝐴(𝑈 × 𝑉 ) ≅ 𝜋 ∗ (𝐴)(𝑈 )(𝑉 )

for all 𝑉 ⊆ 𝑌 forms a map

𝜋 ∗ (𝐴)(𝑈1 ) ⊗ ⋯ ⊗ 𝜋 ∗ (𝐴)(𝑈𝑛 ) → 𝜋 ∗ (𝐴)(𝑈 )

of prefactorization algebras on 𝑌 , the collection of which again forms a prefactorization algebra on 𝑋 with values in prefactorization algebras on 𝑌 . Finally, since 𝑝∗ sends a prefactorization algebra on 𝑌 to its global sections, the identity PreFA(𝑋, 𝑝∗ ) ∘ 𝜋 ∗ = 𝜋∗ is straightforward. ◻
The above construction also holds for factorization algebras, and has a more
interesting property:
Proposition 2.14. There exist functors

𝜋 ∗ ∶ FA(𝑋 × 𝑌 ) → FA(𝑋, FA(𝑌 ))

and
𝜋 ∗ ∶ FA𝑙𝑐 (𝑋 × 𝑌 ) → FA𝑙𝑐 (𝑋, FA𝑙𝑐 (𝑌 ))

between ∞-categories, and the latter is an equivalence.


Proof. We first check that if $A$ is a factorization algebra on $X \times Y$, then for any $U \subset X$, $\pi^*(A)(U)$ is a factorization algebra on $Y$. To prove this, take any factorizing cover $\mathcal{V}$ of $V \subseteq Y$ where $V$ is open; then $U \times \mathcal{V}$ forms a factorizing cover of $U \times V$, and the Čech complex $\check{C}(\mathcal{V}, \pi^*(A)(U))$ is equal to $\check{C}(U \times \mathcal{V}, A)$. Hence the map $\check{C}(\mathcal{V}, \pi^*(A)(U)) \to \pi^*(A)(U)(V)$ is equal to $\check{C}(U \times \mathcal{V}, A) \to A(U \times V)$, which is an equivalence since $A$ is a factorization algebra.

Next we check that if 𝐴 is a factorization algebra on 𝑋 × 𝑌 , then 𝜋 ∗ (𝐴) is a factorization algebra on 𝑋 with values in FA(𝑌 ). To prove this, take any factorizing cover 𝒰 of 𝑈 ⊆ 𝑋 where 𝑈 is open; it suffices to show that the map $\check{C}(\mathcal{U}, \pi^*(A)(-)(V)) \to \pi^*(A)(U)(V)$ is an equivalence for all 𝑉 ⊆ 𝑌 . But this follows from the same argument as above. Therefore, 𝜋 ∗ restricts to a functor FA(𝑋 × 𝑌 ) → FA(𝑋, FA(𝑌 )).
If 𝐴 is locally constant, applying Proposition 2.10 to the two natural projections shows that 𝜋 ∗ (𝐴) ∈ FA𝑙𝑐 (𝑋, FA𝑙𝑐 (𝑌 )).
We shall now show that 𝜋 ∗ is an equivalence in the locally constant case by giving an inverse to it. Take metrics on 𝑋, 𝑌 , and take 𝒰, 𝒱 to be the collections of all strictly convex subsets of 𝑋, 𝑌 , respectively. Then 𝒰, 𝒱 are stable under intersection, form bases of 𝑋, 𝑌 , and are factorizing bases of 𝑋, 𝑌 , respectively. By Proposition 2.12, in order to give a factorization algebra on 𝑋 × 𝑌 , it suffices to give a 𝒰 × 𝒱 -factorization algebra, since 𝒰 × 𝒱 is a factorizing basis for 𝑋 × 𝑌 . Now any 𝑈 ∈ 𝒰, 𝑉 ∈ 𝒱 must be homeomorphic to Euclidean disks. Thus the construction of the structure maps for opens in 𝒰 × 𝒱 that lie in 𝑈 × 𝑉 reduces the problem to the case that 𝑋 = ℝ𝑚 , 𝑌 = ℝ𝑛 for some integers 𝑚, 𝑛. By Theorem 3.23² and Dunn’s theorem (Theorem 3.30), the functor

𝜋 ∗ ∶ FA𝑙𝑐 (ℝ𝑚+𝑛 ) → FA𝑙𝑐 (ℝ𝑚 , FA𝑙𝑐 (ℝ𝑛 ))

is indeed an equivalence, making it possible to define a 𝒰 × 𝒱 -factorization algebra 𝑗(𝐵) for arbitrary manifolds 𝑋, 𝑌 from any 𝐵 ∈ FA𝑙𝑐 (𝑋, FA𝑙𝑐 (𝑌 )), by the previous discussion. We again denote by 𝑗(𝐵) the factorization algebra on 𝑋 × 𝑌 obtained from the 𝒰 × 𝒱 -factorization algebra 𝑗(𝐵) by extending from a basis. Again, by Dunn’s theorem, 𝑗(𝐵) is locally constant.
It remains to show that 𝑗 is an inverse to 𝜋 ∗ . This again follows from Proposition 2.12, which tells us that the factorization algebra extending a factorization algebra on a factorizing basis is unique. ◻

We next discuss how to define the pullback of a factorization algebra. We do this in the case that $f \colon X \to Y$ is an open immersion of smooth manifolds. Suppose $B$ is a factorization algebra on $Y$. Define

$$ \mathcal{U}_f = \{\, U \subseteq X \mid U \text{ open},\ f \colon U \xrightarrow{\ \sim\ } f(U) \,\}. $$

Then 𝒰𝑓 is a factorizing basis for 𝑋. For any 𝑈 ∈ 𝒰𝑓 , define 𝑓 ∗ (𝐵)(𝑈 ) = 𝐵(𝑓 (𝑈 )). The axioms of a factorization algebra ensure that 𝑓 ∗ (𝐵) is a 𝒰𝑓 -factorization algebra. By Proposition 2.12 it can be extended to a factorization algebra on 𝑋. We denote this factorization algebra again by 𝑓 ∗ (𝐵) and call it the pullback of 𝐵 along 𝑓 . By the functoriality of the construction above and Proposition 2.12, 𝑓 ∗ ∶ FA(𝑌 ) → FA(𝑋) is a functor. Moreover, by the uniqueness property of Proposition 2.12, for any open immersions 𝑓 ∶ 𝑋 → 𝑌 , 𝑔 ∶ 𝑌 → 𝑍 and 𝐶 a factorization algebra on 𝑍, the restrictions of (𝑔 ∘ 𝑓 )∗ (𝐶) and 𝑓 ∗ 𝑔 ∗ (𝐶) to

𝒰 ∶= {𝑈 ⊆ 𝑋 ∣ 𝑈 ∈ 𝒰𝑓 , 𝑈 ∈ 𝒰𝑔∘𝑓 , 𝑓 (𝑈 ) ∈ 𝒰𝑔 }

are equal; since 𝒰 is a factorizing basis, (𝑔 ∘ 𝑓 )∗ (𝐶) and 𝑓 ∗ 𝑔 ∗ (𝐶) are equivalent. Since the above construction is functorial, the functors (𝑔 ∘ 𝑓 )∗ and 𝑓 ∗ 𝑔 ∗ are equivalent.

² The only usage of the proposition is to prove the Fubini formula in Example 3.28, so there is no circularity.

Remark 2.15. We note that the functors 𝑓∗ and 𝑓 ∗ are not adjoint functors. To
see this, take 𝑋 = {𝑐, 𝑑} to be the discrete space with two points, and 𝑌 = ∗. A
factorization algebra 𝐵 on 𝑌 in C is an object 𝐵 ∈ C and a map 𝑏 ∶ 𝟙 → 𝐵 in
C; while a factorization algebra 𝐴 on 𝑋 in C consists of two objects 𝐶, 𝐷 ∈ C,
two maps 𝑐 ∶ 𝟙 → 𝐶, 𝑑 ∶ 𝟙 → 𝐷 in C, corresponding to the objects 𝐴(𝑐) and
𝐴(𝑑). By direct computation, the factorization algebra 𝑓∗ (𝑓 ∗ (𝐵)) is 𝐵 ⊗ 𝐵, and
the factorization algebra 𝑓 ∗ (𝑓∗ (𝐴)) is 𝐴 ⊗ 𝐴. But there does not exist a natural
map 𝐴 ⊗ 𝐴 → 𝐴, nor does there exist a natural map 𝐵 ⊗ 𝐵 → 𝐵, thus 𝑓∗ and 𝑓 ∗
are not adjoint functors.
Nevertheless, if $f \colon X \to Y$ is an open immersion of smooth manifolds and $A$ is a factorization algebra on $X$, then for any $U \in \mathcal{U}_f$ there exists a map

$$ A(U) \xrightarrow{\ \text{induced by } U \hookrightarrow f^{-1}(f(U))\ } A\bigl(f^{-1}(f(U))\bigr) \cong \bigl(f^*(f_*(A))\bigr)(U), $$

thus inducing a map of $\mathcal{U}_f$-factorization algebras $A \to f^*(f_*(A))$. By Proposition 2.12, it induces a map of factorization algebras $A \to f^*(f_*(A))$. Moreover this construction is compatible with morphisms between factorization algebras. Therefore we do obtain a natural transformation between functors $\mathbb{1} \to f^* f_* \colon \mathrm{FA}(X) \to \mathrm{FA}(X)$.

We end this section by gluing factorization algebras. Suppose $\mathcal{U} = \{U_i\}_{i \in I}$ is an open cover of a space $X$, such that any point is contained in only finitely many open sets in $\mathcal{U}$. We define a set of “gluing data” as follows:

Definition 2.16. A set of gluing data for $\mathcal{U}$ consists of a factorization algebra $A_J$ on the intersection $V_J$ of the elements of the collection $\{U_j\}_{j \in J}$ for every finite set $J \subseteq I$, and an equivalence $r_{J,j} \colon A_J \to (A_{J - \{j\}})|_{V_J}$ for any $J$ and $j \in J$, such that the following diagram commutes for every $J$ and $j, k \in J$:

$$ \begin{array}{ccc}
A_J & \xrightarrow{\ r_{J,j}\ } & (A_{J - \{j\}})|_{V_J} \\
{\scriptstyle r_{J,k}}\big\downarrow & & \big\downarrow{\scriptstyle r_{J - \{j\},k}} \\
(A_{J - \{k\}})|_{V_J} & \xrightarrow{\ r_{J - \{k\},j}\ } & A_{J - \{j,k\}}.
\end{array} $$

Now, if we have a set of gluing data, we may take a factorizing basis 𝒱 of 𝑋, which is given by all open subsets of 𝑈𝑖 for all 𝑖 ∈ 𝐼. We define 𝐴(𝑉 ) = 𝐴𝐽 (𝑉 ) where 𝐽 is the largest subset of 𝐼 satisfying 𝑉𝐽 ⊇ 𝑉 . It is easy to verify that the coherence property of the structure maps 𝑟𝐽 ,𝑗 gives 𝐴 the structure of a 𝒱 -factorization algebra. Moreover, if we define 𝑟𝐽 to be the canonical restriction equivalence 𝐴𝐽 → 𝐴|𝑉𝐽 , the maps 𝑟𝐽 and 𝑟𝐽 −{𝑗} ∘ 𝑟𝐽 ,𝑗 are equal for any 𝐽 and 𝑗 ∈ 𝐽 . Combining this with Proposition 2.12, we obtain the following proposition:
Proposition 2.17. Given a set of gluing data, there exists a factorization algebra 𝐴
on 𝑋 that is unique up to equivalence, whose restriction to each 𝑉𝐽 is canonically
equivalent to 𝐴𝐽 .
In particular, given two open sets $U, V$ of $X$ such that $X = U \cup V$, there exists an equivalence of ∞-categories

$$ \mathrm{FA}(X) \cong \mathrm{FA}(U) \times_{\mathrm{FA}(U \cap V)} \mathrm{FA}(V). $$

In the case of locally constant factorization algebras, we state the following proposition, which says that being locally constant is some sort of a “local property”:
Proposition 2.18. Suppose 𝑀 is a topological manifold and 𝐴 is a factorization
algebra on 𝑀. If there exists an open cover 𝒰 of 𝑀 such that 𝐴|𝑈 is locally
constant for every 𝑈 ∈ 𝒰, then 𝐴 is locally constant.
This is Proposition 13 of [Gin13], whose proof is given in Section 9.1 of that paper. The proof is rather technical and sheds little light on the proofs of the other propositions, so we omit it here.
Combining the two propositions above, we see that if we have a set of gluing data consisting of locally constant factorization algebras, the resulting global factorization algebra is locally constant.

3 Factorization homology
Factorization homology is a type of homology theory on manifolds. It produces invariants of manifolds, just as classical homology theories produce invariants of topological spaces. On certain manifolds, factorization homology reduces to some well-known structures, such as Hochschild homology. In this section, we will introduce homology theories on manifolds and factorization homology, and study their properties, as well as their relationships with factorization algebras, especially locally constant factorization algebras.

Homology theory for manifolds


We first specify an interesting category of manifolds that we will study.
Definition 3.1. Consider the topological category of 𝑛-dimensional manifolds (without boundaries), with morphism spaces the spaces of all topological embeddings. We define the nerve of the topological category to be Mfld𝑛 , and regard this ∞-category as the ∞-category of 𝒏-dimensional manifolds.

Note that we consider topological manifolds instead of smooth manifolds. However, we may also consider smooth manifolds, whose terminology will be introduced later.
We may also add other structures that we may be interested in, such as orien-
tations, framings, etc. These structures may be interpreted as follows:

Definition 3.2. Suppose $E \to X$ is a topological $n$-dimensional vector bundle, which is equivalent to a space $X$ together with a homotopy class of maps $e \in [X, B\operatorname{Aut}(\mathbb{R}^n)]$, classifying $E$, where $\operatorname{Aut}(\mathbb{R}^n)$ is the automorphism topological group of $\mathbb{R}^n$. We define an (𝑿, 𝒆)-structure on an $n$-dimensional manifold $M$ to be a map $f \colon M \to X$, such that the tangent bundle $TM$ is classified by the map $e \circ f$. (In other words, if we pull back $E$ along $f$, we obtain $TM$.) We consider the (homotopy) pullback

$$ \mathrm{Mfld}^{(X,e)}_n := \mathrm{Mfld}_n \times_{\mathcal{S}_{/B\operatorname{Aut}(\mathbb{R}^n)}} \mathcal{S}_{/X}, $$

and we regard this ∞-category as the ∞-category of 𝒏-dimensional manifolds with an (𝑿, 𝒆)-structure. We define $\mathrm{Emb}^{(X,e)}(M, N)$ to be the space of $(X, e)$-embeddings from $M$ to $N$.

Example 3.3. i) If $X = \ast$, then $E$ is trivial, and an $(X, e)$-structure on $M$ is a trivialization of $TM$. Hence $\mathrm{Mfld}^{(X,e)}_n$ consists of framed manifolds. We define $\mathrm{Mfld}^{\mathrm{fr}}_n$ to be $\mathrm{Mfld}^{(\ast,e)}_n$, and call it the ∞-category of framed 𝒏-dimensional manifolds. Note that a morphism in $\mathrm{Mfld}^{\mathrm{fr}}_n$ from $(M, f)$ to $(N, g)$ is an embedding from $M$ to $N$ together with a homotopy between the framing that already exists on $TM$ and the pullback of the framing on $TN$.

ii) If $X = BO(n)$, and $e$ is the canonical map $BO(n) \to B\operatorname{Aut}(\mathbb{R}^n)$, then $\mathrm{Mfld}^{(X,e)}_n$ consists of smooth manifolds. This is essentially because of the map from $O(n)$ to the topological group of all diffeomorphisms of $\mathbb{R}^n$ (a homotopy equivalence) and the characterization of smooth manifolds in terms of their microbundle structure, which is shown in [KS77]. We define $\mathrm{Mfld}^{\mathrm{un}}_n$ to be $\mathrm{Mfld}^{(BO(n),e)}_n$, and call it the ∞-category of smooth 𝒏-dimensional manifolds. A morphism in $\mathrm{Mfld}^{\mathrm{un}}_n$ is equivalent to a smooth embedding.

iii) If $X = BSO(n)$, and $e$ is the canonical map $BSO(n) \to B\operatorname{Aut}(\mathbb{R}^n)$, then $\mathrm{Mfld}^{(X,e)}_n$ consists of oriented smooth manifolds. We define $\mathrm{Mfld}^{\mathrm{or}}_n$ to be $\mathrm{Mfld}^{(BSO(n),e)}_n$, and call it the ∞-category of smooth oriented 𝒏-dimensional manifolds. A morphism in $\mathrm{Mfld}^{\mathrm{or}}_n$ is equivalent to a smooth oriented embedding.

iv) If $X$ is an $n$-dimensional manifold, then we may take $e$ to be the canonical map classifying $TX$. We define $\mathrm{Mfld}^{(X,TX)}_n$ to be the associated ∞-category with respect to $(X, e)$. Note that every open set of $X$ is canonically an object of $\mathrm{Mfld}^{(X,TX)}_n$.

For any $M, N \in \mathrm{Mfld}^{(X,e)}_n$, $M \sqcup N$ is canonically a manifold with an $(X, e)$-structure. Therefore, $(\mathrm{Mfld}^{(X,e)}_n, \sqcup)$ is a symmetric monoidal ∞-category, where the axioms are easily verified. Note that there is no embedding $M \sqcup M \to M$ in general, so that $\sqcup$ is not a coproduct on $\mathrm{Mfld}^{(X,e)}_n$. Indeed, $\mathrm{Mfld}^{(X,e)}_n$ does not admit finite coproducts.
We now take a special $n$-dimensional manifold, $\mathbb{R}^n$. Since it is contractible, for any $(X, e)$ there exists a canonical framing on $\mathbb{R}^n$ such that it lies in the ∞-category $\mathrm{Mfld}^{(X,e)}_n$. Moreover, unlike other manifolds, there exist $(X, e)$-structured embeddings $\coprod_{i=1}^{m} \mathbb{R}^n \to \mathbb{R}^n$ for any integers $m, n$ and any $(X, e)$. We define $\mathrm{Disk}^{(X,e)}_n$ to be the full subcategory of $\mathrm{Mfld}^{(X,e)}_n$ spanned by the disjoint unions of $\mathbb{R}^n$. We observe the following property:

Proposition 3.4. $\mathrm{Disk}^{(X,e)}_n$ is an ∞-operad, with only one color, being $\mathbb{R}^n$.

Proof. This is by Proposition 1.18, since it is the nerve of a topological single colored
operad, with morphism space $\mathrm{Disk}^{(X,e)}_n(r)$ the subspace of $\operatorname{Hom}(\coprod_r \mathbb{R}^n, \mathbb{R}^n)$
consisting of all $(X, e)$-structured embeddings $\coprod_r \mathbb{R}^n \to \mathbb{R}^n$. ◻

We now specify a case: $X = *$. In this case $\mathrm{Disk}^{\mathrm{fr}}_n(r)$ is the subspace of
$\operatorname{Hom}(\coprod_r \mathbb{R}^n, \mathbb{R}^n)$ consisting of all framed embeddings $\coprod_r \mathbb{R}^n \to \mathbb{R}^n$. In this case
we have the following proposition, see [MathOverflow]:

Proposition 3.5. The ∞-operads $\mathrm{Disk}^{\mathrm{fr}}_n$ and $\mathsf{E}_n$ are equivalent.

Thus, the ∞-category of algebras over the ∞-operad $\mathrm{Disk}^{\mathrm{fr}}_n$ is equivalent to the
∞-category of $E_n$-algebras.

Definition 3.6. Suppose $\mathsf{C}$ is a symmetric monoidal ∞-category. We define
$\mathrm{Disk}^{(X,e)}_n\text{-}\mathrm{Alg}(\mathsf{C})$ to be the ∞-category of algebras over $\mathrm{Disk}^{(X,e)}_n$ in $\mathsf{C}$. In some
cases we give the ∞-category $\mathrm{Disk}^{(X,e)}_n\text{-}\mathrm{Alg}(\mathsf{C})$ some special names:

i) If $X = BO(n)$, and $e$ is the canonical map $BO(n) \to B\operatorname{Aut}(\mathbb{R}^n)$, we call the
objects of $\mathrm{Disk}^{\mathrm{un}}_n\text{-}\mathrm{Alg}$ the unoriented (smooth) 𝑬𝒏-algebras;

ii) If $X = BSO(n)$, and $e$ is the canonical map $BSO(n) \to B\operatorname{Aut}(\mathbb{R}^n)$, we call
the objects of $\mathrm{Disk}^{\mathrm{or}}_n\text{-}\mathrm{Alg}$ the oriented (smooth) 𝑬𝒏-algebras.

We are now able to present what a homology theory for ($n$-dimensional) manifolds
is. We interpret this in a way similar to the Eilenberg–Steenrod axioms.
First, we consider the excision axiom.

Proposition 3.7. Suppose $H \colon (\mathrm{Mfld}^{(X,e)}_n, \amalg) \to \mathsf{C}$ is a symmetric monoidal functor.
Then:

i) For any (𝑛 − 𝑠)-dimensional manifold 𝑁 such that 𝑁 × ℝ𝑠 is endowed with an


(𝑋, 𝑒)-structure, 𝐻(𝑁 × ℝ𝑠 ) is an 𝐸𝑠 -algebra in C;

ii) Suppose $M$ is an $(X, e)$-structured manifold such that an end of $M$ is trivialized
as $N \times \mathbb{R}$ such that $N$ is of codimension 1 and the open part of $M$ lies in
the neighborhood of $N \times \{-\infty\}$, then $H(M)$ is a left (pointed) module³ of the
$E_1$-algebra $H(N \times \mathbb{R})$; dually, if the open part of $M$ lies in the neighborhood
of $N \times \{+\infty\}$, then $H(M)$ is a right module of the $E_1$-algebra $H(N \times \mathbb{R})$;

iii) $H(\mathbb{R}^n)$ is a $\mathrm{Disk}^{(X,e)}_n$-algebra.

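Proof. i) Notice that for any finite sets $I, J$, any framed embedding $\coprod_I \mathbb{R}^s \to \coprod_J \mathbb{R}^s$
induces an $(X, e)$-embedding $\coprod_I (N \times \mathbb{R}^s) \to \coprod_J (N \times \mathbb{R}^s)$, thus it induces
a map $\gamma^N_{I,J}$ from the space of framed embeddings from $\coprod_I \mathbb{R}^s$ to $\coprod_J \mathbb{R}^s$
to the space of $(X, e)$-embeddings from $\coprod_I (N \times \mathbb{R}^s)$ to $\coprod_J (N \times \mathbb{R}^s)$. Therefore,
$N \times \mathbb{R}^s$ is an $E_s$-algebra object in the ∞-category $\mathrm{Mfld}^{(X,e)}_n$. Since $H$ is
symmetric monoidal, $H(N \times \mathbb{R}^s)$ is an $E_s$-algebra in $\mathsf{C}$.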

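ii) It suffices to give a map
\[
\mathrm{Emb}^{\mathrm{fr}}_0\Big(\Big(\coprod_I \mathbb{R}\Big) \amalg (-\infty, 0],\ (-\infty, 0]\Big) \otimes \bigotimes_I H(N \times \mathbb{R}) \otimes H(M) \to H(M),
\]
where $\mathrm{Emb}^{\mathrm{fr}}_0$ is the subspace of $\mathrm{Emb}^{\mathrm{fr}}$ consisting of maps that map 0 in the last
component to 0. Using i), the map is given by
\[
\begin{aligned}
&\mathrm{Emb}^{\mathrm{fr}}_0\Big(\Big(\coprod_I \mathbb{R}\Big) \amalg (-\infty, 0],\ (-\infty, 0]\Big) \otimes \bigotimes_I H(N \times \mathbb{R}) \otimes H(M) \\
&\quad\xrightarrow{\ \gamma^N_{I \amalg *,\, *}\ } \mathrm{Emb}^{\mathrm{fr}}\Big(\Big(\coprod_I (N \times \mathbb{R})\Big) \amalg (N \times (-\infty, 0]),\ N \times (-\infty, 0]\Big) \otimes \bigotimes_I H(N \times \mathbb{R}) \otimes H(M) \\
&\quad\to \mathrm{Emb}^{\mathrm{fr}}_0\Big(\Big(\coprod_I (N \times \mathbb{R})\Big) \amalg M,\ M\Big) \otimes H\Big(\Big(\coprod_I (N \times \mathbb{R})\Big) \amalg M\Big) \\
&\quad\to H(M).
\end{aligned}
\]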

iii) This is because by definition $\mathbb{R}^n$ is a $\mathrm{Disk}^{(X,e)}_n$-algebra object in the ∞-category
$\mathrm{Mfld}^{(X,e)}_n$ and $H$ is symmetric monoidal.

With this proposition, we are able to express the excision axiom for homology
theories for manifolds, which is again due to [Gin13]:
³ We assume once and for all that all modules are pointed.

Definition 3.8. A homology theory for $(X, e)$-manifolds with values in a symmetric
monoidal stable ∞-category $\mathsf{C}$ is a functor
\[
H \colon \mathrm{Mfld}^{(X,e)}_n \to \mathsf{C}, \qquad M \mapsto H(M),
\]

such that 𝐻 is symmetric monoidal, preserves sequential colimits, and satisfies the
excision axiom:
• For any $(X, e)$-manifold $M$, if there exists a codimension 1 submanifold $N$
of $M$ and a trivialization $N \times \mathbb{R}$ of its neighborhood such that $M$ is decomposable
as $R \amalg_{N \times \mathbb{R}} L$, where $R, L$ are submanifolds of $M$ glued along $N \times \mathbb{R}$,
then the natural map
\[
H(L) \otimes_{H(N \times \mathbb{R})} H(R) \to H(M)
\]
is an equivalence.
The ∞-category of homology theories is defined to be the full subcategory of
$\mathrm{Alg}(\mathrm{Mfld}^{(X,e)}_n)$ spanned by homology theories, denoted $\mathrm{HT}(\mathrm{Mfld}^{(X,e)}_n)$.

Factorization homology
Now we will introduce a specific homology theory, called factorization homology.
We will discuss the properties of factorization homology, and we will prove the
following result, which is from [AF12]:
Theorem 3.9. For any $\mathrm{Disk}^{(X,e)}_n$-algebra $A$, there exists, up to contractible choices,
a unique homology theory for $(X, e)$-manifolds, satisfying the dimension axiom:
• The value of the homology theory on $\mathbb{R}^n$ is naturally equivalent to $A$;
which is exactly the factorization homology with coefficients in $A$ (whose definition
will be given below).
We start with the definition of the factorization homology.
Definition 3.10. Any $\mathrm{Disk}^{(X,e)}_n$-algebra $A$ defines a functor $\mathrm{Disk}^{(X,e)}_n \to \mathsf{C}$. On the
other hand, if $M$ is in $\mathrm{Mfld}^{(X,e)}_n$, then $M$ defines a functor $E_M \colon (\mathrm{Disk}^{(X,e)}_n)^{op} \to \mathsf{K}$,
which maps an object in $\mathrm{Disk}^{(X,e)}_n$ to the space of $(X, e)$-embeddings from that
object to $M$. Together $M$ and $A$ define a functor:
\[
E_M \otimes A \colon (\mathrm{Disk}^{(X,e)}_n)^{op} \times \mathrm{Disk}^{(X,e)}_n \to \mathsf{K} \times \mathsf{C} \longrightarrow \mathsf{C}.
\]
We define the factorization homology of $M$ with coefficients in $A$ to be the (homotopy)
coend of the above functor:
\[
\int_M A := E_M \otimes_{\mathrm{Disk}^{(X,e)}_n} A.
\]

Note that by the definition of a coend, factorization homology can also be interpreted as
\[
\int_M A = \operatorname{colim}\Big( (\mathrm{Disk}^{(X,e)}_n)_{/M} \to \mathrm{Disk}^{(X,e)}_n \xrightarrow{\ A\ } \mathsf{C} \Big).
\]

Thus, factorization homology preserves sequential colimits.
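As a quick sanity check of the colimit formula, the following sketch evaluates it on $M = \mathbb{R}^n$. It relies only on two general facts: a slice category over one of its own objects has a terminal object, and a colimit over a category with a terminal object is computed by evaluating at that object.

```latex
% Sketch: factorization homology of R^n recovers the algebra itself.
Since $\mathbb{R}^n$ is an object of $\mathrm{Disk}^{(X,e)}_n$, the slice
$(\mathrm{Disk}^{(X,e)}_n)_{/\mathbb{R}^n}$ has the terminal object
$(\mathbb{R}^n, \mathrm{id})$. Therefore
\[
  \int_{\mathbb{R}^n} A
  = \operatorname{colim}\Big( (\mathrm{Disk}^{(X,e)}_n)_{/\mathbb{R}^n}
      \to \mathrm{Disk}^{(X,e)}_n \xrightarrow{\ A\ } \mathsf{C} \Big)
  \simeq A(\mathbb{R}^n).
\]
```

This is the dimension axiom of Theorem 3.9 in its most direct form.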


We shall now verify some properties of factorization homology. The fully faithful
symmetric monoidal functor $i \colon \mathrm{Disk}^{(X,e)}_n \to \mathrm{Mfld}^{(X,e)}_n$ induces a functor
\[
i^* \colon \mathrm{Alg}(\mathrm{Mfld}^{(X,e)}_n, \mathsf{C}) \to \mathrm{Disk}^{(X,e)}_n\text{-}\mathrm{Alg}(\mathsf{C}).
\]
The following proposition identifies factorization homology as a left adjoint to this
functor provided $\mathsf{C}$ admits some additional properties:
Proposition 3.11. Suppose $\mathsf{C}$ is ⊗-presentable (which means that $\mathsf{C}$ is presentable
and the monoidal structure distributes over small colimits). Then there exists a left
adjoint
\[
i_! \colon \mathrm{Disk}^{(X,e)}_n\text{-}\mathrm{Alg}(\mathsf{C}) \to \mathrm{Alg}(\mathrm{Mfld}^{(X,e)}_n, \mathsf{C})
\]
to $i^*$, which is given by
\[
i_!(A)(M) = \int_M A.
\]

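Proof. From the theory of ∞-categories, we know that there exists a functor
\[
i_! \colon \mathrm{Fun}(\mathrm{Disk}^{(X,e)}_n, \mathsf{C}) \to \mathrm{Fun}(\mathrm{Mfld}^{(X,e)}_n, \mathsf{C}),
\]
that is the left adjoint to the restriction functor
\[
i^* \colon \mathrm{Fun}(\mathrm{Mfld}^{(X,e)}_n, \mathsf{C}) \to \mathrm{Fun}(\mathrm{Disk}^{(X,e)}_n, \mathsf{C}).
\]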

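Moreover, $i_!(A)$ sends every manifold $M$ to
\[
\operatorname{colim}\Big( (\mathrm{Disk}^{(X,e)}_n)_{/M} \to \mathrm{Disk}^{(X,e)}_n \xrightarrow{\ A\ } \mathsf{C} \Big),
\]
which coincides with the definition of factorization homology. Therefore, it suffices
to show that the image of $\mathrm{Disk}^{(X,e)}_n\text{-}\mathrm{Alg}(\mathsf{C})$ under $i_!$ lies in $\mathrm{Alg}(\mathrm{Mfld}^{(X,e)}_n, \mathsf{C})$.
To prove this, it is equivalent to prove that for any $A \in \mathrm{Disk}^{(X,e)}_n\text{-}\mathrm{Alg}(\mathsf{C})$, and for
any map $f \colon \langle n \rangle \to \langle m \rangle$, the following diagram commutes, up to an equivalence: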

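\[
\begin{array}{ccc}
(\mathrm{Mfld}^{(X,e)}_n)^n & \xrightarrow{\ (i_!(A))^n\ } & \mathsf{C}^n \\
\downarrow{\scriptstyle f_*} & & \downarrow{\scriptstyle f_*} \\
(\mathrm{Mfld}^{(X,e)}_n)^m & \xrightarrow{\ (i_!(A))^m\ } & \mathsf{C}^m .
\end{array}
\]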

Since every map $f$ can be factored as the composition of an inert map followed by
a surjective active map followed by an injective active map (where an active map
is a map such that the inverse image of ∗ is ∗), it suffices to prove this in these three
cases.
If $f$ is inert, then $f_*$ is a projection, so the commutativity is obvious.
If $f$ is injective and active, since $A$ is symmetric monoidal, $A$ preserves the
monoidal unit, thus $i_!(A)$ also preserves the monoidal unit. As $f_*$ is the
canonical inclusion (inserting monoidal units), the commutativity follows.
If $f$ is surjective and active, by taking all fibers of the map $f$, we may assume
that $m = 1$. Then the map $f_*$ is the $n$-fold tensor product. For any $(M_1, \cdots, M_n) \in (\mathrm{Mfld}^{(X,e)}_n)^n$,
we consider the following diagram:

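\[
\begin{array}{ccc}
\prod_{i=1}^n (\mathrm{Disk}^{(X,e)}_n)_{/M_i} & \xrightarrow{\ \coprod\ } & (\mathrm{Disk}^{(X,e)}_n)_{/\coprod_{i=1}^n M_i} \\
\downarrow & & \downarrow \\
(\mathrm{Disk}^{(X,e)}_n)^n & \xrightarrow{\ \coprod\ } & \mathrm{Disk}^{(X,e)}_n \\
\downarrow{\scriptstyle A^n} & & \downarrow{\scriptstyle A} \\
\mathsf{C}^n & \xrightarrow{\ \bigotimes\ } & \mathsf{C}.
\end{array}
\]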

This diagram is commutative. Moreover the top row is fully faithful and essen-
tially surjective, which means that it is an equivalence. Therefore, we obtain the
following natural equivalence

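\[
\operatorname{colim}\Big( (\mathrm{Disk}^{(X,e)}_n)_{/\coprod_{i=1}^n M_i} \to \mathrm{Disk}^{(X,e)}_n \xrightarrow{\ A\ } \mathsf{C} \Big)
\cong \operatorname{colim}\Big( \prod_{i=1}^n (\mathrm{Disk}^{(X,e)}_n)_{/M_i} \to (\mathrm{Disk}^{(X,e)}_n)^n \xrightarrow{\ A^n\ } \mathsf{C}^n \xrightarrow{\ \bigotimes\ } \mathsf{C} \Big).
\]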

On the other hand, since ⊗ and colim commute in C, there exists a natural equiva-
lence
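\[
\operatorname{colim}\Big( \prod_{i=1}^n (\mathrm{Disk}^{(X,e)}_n)_{/M_i} \to (\mathrm{Disk}^{(X,e)}_n)^n \xrightarrow{\ A^n\ } \mathsf{C}^n \xrightarrow{\ \bigotimes\ } \mathsf{C} \Big)
\cong \bigotimes_{i=1}^n \operatorname{colim}\Big( (\mathrm{Disk}^{(X,e)}_n)_{/M_i} \to \mathrm{Disk}^{(X,e)}_n \xrightarrow{\ A\ } \mathsf{C} \Big).
\]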

The composition of the two equivalences yields the commutativity of the diagram,
as desired. ◻
The above proposition shows that factorization homology can be expressed as
symmetric monoidal left Kan extension, at least when C is ⊗-presentable. In par-
ticular, factorization homology is symmetric monoidal on 𝑀.
Next we show the independence of factorization homology with respect to 𝑋.

Proposition 3.12. Suppose $\varphi \colon (X, e) \to (X', e')$ is a map of spaces over
$B\operatorname{Aut}(\mathbb{R}^n)$, $M$ is an $(X, e)$-manifold, and $A$ is a $\mathrm{Disk}^{(X',e')}_n$-algebra. Let $\varphi M$ denote
the $(X', e')$-manifold $M$, and $\varphi A$ denote the $\mathrm{Disk}^{(X,e)}_n$-algebra $A$. Then there exists a
natural equivalence
\[
\int_{\varphi M} A \cong \int_M \varphi A.
\]
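Proof. It suffices to show that the ∞-categories $(\mathrm{Disk}^{(X,e)}_n)_{/M}$ and $(\mathrm{Disk}^{(X',e')}_n)_{/M}$
are equivalent. By definition of $\mathrm{Disk}^{(X,e)}_n$ and $\mathrm{Disk}^{(X',e')}_n$, it suffices to show that the
map
\[
((\mathsf{S}_{/B\operatorname{Aut}(\mathbb{R}^n)})_{/X})_{/M} \to ((\mathsf{S}_{/B\operatorname{Aut}(\mathbb{R}^n)})_{/X'})_{/M}
\]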

is an equivalence of ∞-categories. But this map is an isomorphism. ◻


Remark 3.13. Factorization homology does depend on the choice of the map 𝑀 →
𝐵 Aut(ℝ𝑛 ). For example, if 𝑀 = ℝ𝑛 is equipped with its standard framing and 𝐴 is
an 𝐸𝑛 -algebra, then ∫𝑀 𝐴 = 𝐴. However, if 𝑀 = ℝ𝑛 is equipped with the opposite
framing, then ∫𝑀 𝐴 = 𝐴𝑜𝑝 . In particular, factorization homology is not homotopy
invariant on manifolds.
Next we discuss the excision property of factorization homology.
Proposition 3.14. Factorization homology satisfies the excision property: for any
(𝑋, 𝑒)-manifold 𝑀, if there exists a codimension 1 submanifold 𝑁 of 𝑀 and a triv-
ialization 𝑁 × ℝ of its neighborhood such that 𝑀 is decomposable as 𝑅 ∐𝑁×ℝ 𝐿,
where 𝑅, 𝐿 are submanifolds of 𝑀 glued along 𝑁 × ℝ, then the map

\[
\int_L A \otimes_{\int_{N \times \mathbb{R}} A} \int_R A \to \int_M A
\]

is an equivalence.
Proof. First, we write the definition of factorization homology in another way. The
objects in $(\mathrm{Disk}^{(X,e)}_n)_{/M}$ are those open subsets of $M$ that are homeomorphic to a
disjoint union of disks. Thus, if we define $\mathrm{Ball}(M)$ to be the collection of the open
subsets in $M$ that are homeomorphic to a disk, then
\[
\int_M A = \operatorname*{colim}_{U_1, \cdots, U_l \in \mathrm{Ball}(M),\ \text{pairwise disjoint}}\ \bigotimes_{i=1}^{l} \int_{U_i} A.
\]

Now, the tensor product

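\[
\int_L A \otimes_{\int_{N \times \mathbb{R}} A} \int_R A
\]
can be interpreted as
\[
\operatorname{colim}\Big( \int_L A \otimes \int_{N \times \mathbb{R}} A \otimes \int_R A \rightrightarrows \int_L A \otimes \int_R A \Big),
\]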

where the two maps are the two module structures. Define 𝐿0 and 𝑅0 to be 𝐿−𝑁 ×
(−∞, 0] and 𝑅 − 𝑁 × [0, ∞), respectively. Then we have canonical equivalences

\[
\int_L A \cong \int_{L_0} A, \qquad \int_R A \cong \int_{R_0} A.
\]

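Under this equivalence, using the definition of the module structure, the colimit
above is equivalent to
\[
\operatorname*{colim}_{U_1, \cdots, U_l \in \widetilde{\mathrm{Ball}},\ \text{pairwise disjoint}}\ \bigotimes_{i=1}^{l} \int_{U_i} A,
\]
where $\widetilde{\mathrm{Ball}} = \mathrm{Ball}(L_0) \cup \mathrm{Ball}(R_0) \cup \mathrm{Ball}(N \times \mathbb{R})$. Therefore, it suffices to show
that the map
\[
\operatorname*{colim}_{U_1, \cdots, U_l \in \widetilde{\mathrm{Ball}},\ \text{pairwise disjoint}}\ \bigotimes_{i=1}^{l} \int_{U_i} A
\ \to\
\operatorname*{colim}_{U_1, \cdots, U_l \in \mathrm{Ball}(M),\ \text{pairwise disjoint}}\ \bigotimes_{i=1}^{l} \int_{U_i} A
\]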

is an equivalence. But if 𝑈 ∈ Ball(𝑀) intersects 𝑁 × {0}, we may take an open set


𝑈 ′ ⊆ 𝑈 such that 𝑈 ′ ∈ Ball(𝑁 × ℝ), thus inducing a map ∫𝑈 ′ 𝐴 → ∫𝑈 𝐴, which
is an equivalence since both sides are naturally equivalent to 𝐴. Thus the colimit
on the right hand side is equivalent to the colimit on the left hand side, completing
the proof. ◻
Combining the results above, we obtain the following result:
Proposition 3.15. For any $\mathrm{Disk}^{(X,e)}_n$-algebra $A$, the functor $\int_{-} A$ is a homology
theory for $(X, e)$-manifolds.
Example 3.16 (The relationship with Hochschild homology). Hochschild homol-
ogy is an example of factorization homology. Take 𝑀 = 𝑆 1 and take a framing
on 𝑀 induced by its Lie group structure. Then 𝑀 can be viewed as the gluing:
𝑀 = ℝ ∐{1,−1}×ℝ ℝ, where ℝ is endowed with the trivial framing. Now, let 𝐴
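be an associative algebra (or more generally an $A_\infty$-algebra), then $A$ is a $\mathrm{Disk}^{\mathrm{fr}}_1$-algebra,
and the factorization homology on $M$ with coefficient $A$ is
\[
\int_M A \cong \int_{\mathbb{R}} A \otimes_{\int_{\{1\} \times \mathbb{R}} A \,\otimes\, \int_{\{-1\} \times \mathbb{R}} A} \int_{\mathbb{R}} A = A \otimes_{A \otimes A^{op}} A,
\]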

which is the Hochschild homology of 𝐴.
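For readers who want an explicit model, the sketch below recalls the classical Hochschild complex that computes this derived tensor product. It assumes that $\mathsf{C}$ is the ∞-category of chain complexes over a field $k$ and that $A$ is an ordinary associative $k$-algebra; both are assumptions of the sketch, not of the text.

```latex
% Classical Hochschild (cyclic bar) complex, assuming A is a k-algebra over a field k.
\[
  C_m(A, A) = A^{\otimes (m+1)}, \qquad m \ge 0,
\]
\[
  b(a_0 \otimes \cdots \otimes a_m)
  = \sum_{i=0}^{m-1} (-1)^i\, a_0 \otimes \cdots \otimes a_i a_{i+1} \otimes \cdots \otimes a_m
  \;+\; (-1)^m\, a_m a_0 \otimes a_1 \otimes \cdots \otimes a_{m-1}.
\]
\[
  HH_*(A) = H_*\bigl(C_*(A, A), b\bigr), \qquad HH_0(A) = A / [A, A].
\]
```

The last summand of $b$ multiplies the final tensor factor onto the first, which reflects the two copies of $\mathbb{R}$ being glued at both ends to form the circle.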


We may now prove Theorem 3.9:
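Theorem 3.17. If $\mathsf{C}$ is a symmetric monoidal ⊗-presentable ∞-category, then
there exists a pair of mutually inverse equivalences
\[
\int \colon \mathrm{Disk}^{(X,e)}_n\text{-}\mathrm{Alg}(\mathsf{C}) \rightleftarrows \mathrm{HT}(\mathrm{Mfld}^{(X,e)}_n, \mathsf{C}) \colon \mathrm{ev}_{\mathbb{R}^n}
\]
between $\mathrm{Disk}^{(X,e)}_n$-algebras and homology theories for $(X, e)$-manifolds.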

Proof. By Proposition 3.11, $\int$ is a left adjoint to the functor $\mathrm{ev}_{\mathbb{R}^n}$. The unit of the
adjunction is an equivalence, because $\mathrm{Disk}^{(X,e)}_n \to \mathrm{Mfld}^{(X,e)}_n$ is fully faithful, and
the Kan extension along a fully faithful functor restricts to the original functor. The
counit of the adjunction evaluates on a symmetric monoidal functor $F$ as a morphism
$\int F(\mathbb{R}^n) \to F$. It suffices to show that this map is an equivalence, which means
we only have to prove that the map $\int_M F(\mathbb{R}^n) \to F(M)$ is an equivalence for any
$(X, e)$-manifold $M$. We denote $A := F(\mathbb{R}^n)$.
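If $M$ is the disjoint union of some copies of $\mathbb{R}^n$, this is obvious. Suppose that
it is true for $M = S^i \times \mathbb{R}^{n-i}$ for some $0 \le i < n$. Then $S^{i+1} \times \mathbb{R}^{n-i-1}$
can be decomposed as $(\mathbb{R}^{i+1}_{-} \times \mathbb{R}^{n-i-1}) \amalg_{S^i \times \mathbb{R}^{n-i}} (\mathbb{R}^{i+1}_{+} \times \mathbb{R}^{n-i-1})$. Since both $F$ and
$\int_{-} A$ have the excision property, we have
\[
\begin{aligned}
\int_{S^{i+1} \times \mathbb{R}^{n-i-1}} A
&\cong \int_{\mathbb{R}^{i+1}_{-} \times \mathbb{R}^{n-i-1}} A \otimes_{\int_{S^i \times \mathbb{R}^{n-i}} A} \int_{\mathbb{R}^{i+1}_{+} \times \mathbb{R}^{n-i-1}} A \\
&\cong F(\mathbb{R}^{i+1}_{-} \times \mathbb{R}^{n-i-1}) \otimes_{F(S^i \times \mathbb{R}^{n-i})} F(\mathbb{R}^{i+1}_{+} \times \mathbb{R}^{n-i-1}) \\
&\cong F(S^{i+1} \times \mathbb{R}^{n-i-1}).
\end{aligned}
\]
Thus the map is an equivalence for $M = S^{i+1} \times \mathbb{R}^{n-i-1}$. By induction, for any
$0 \le i \le n$, the above map is an equivalence for $M = S^i \times \mathbb{R}^{n-i}$.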
Now, for any manifold $M$ that is obtained from a manifold $N$ making the counit
map an equivalence by adding a handle of index $i + 1$, $M$ can be expressed as the
gluing $N \amalg_{S^i \times \mathbb{R}^{n-i}} \mathbb{R}^n$. As above, since $N$, $S^i \times \mathbb{R}^{n-i}$ and $\mathbb{R}^n$ all make the
counit map an equivalence, so does $M$. Therefore, by induction, the counit map is
an equivalence for all handlebodies.
We finish the proof by noticing that all smooth manifolds can be regarded as the
direct colimit of inclusions of handlebodies, and the functors $\int_{-} A$ and $F$ preserve
sequential colimits. ◻

Relation to factorization algebra and applications to 𝐸𝑛 ‐algebras


We now explain the relation between factorization homology and locally con-
stant factorization algebras. We will prove the following theorem, which is from
[GTZ10]:
Theorem 3.18. There exists an equivalence of ∞-categories from the ∞-category
of $\mathrm{Disk}^{(M,TM)}_n$-algebras to the ∞-category of locally constant factorization algebras
on $M$, for any manifold $M$.
We will need a lemma, which is stated in [Lur16] and will not be proved here:
Lemma 3.19. Suppose 𝐴 is an 𝐸𝑛 -algebra, 𝒰 is a collection of open sets in ℝ𝑛
covering ℝ𝑛 , each homeomorphic to ℝ𝑛 . The natural 𝒰-prefactorization algebra
with respect to 𝐴, given by 𝑈 ↦ 𝐴(𝑈 ), is a 𝒰-factorization algebra.
We will also use the following lemma, which presents the excision property of
locally constant factorization algebras:

Lemma 3.20. Suppose $M$ is a manifold such that it can be decomposed as
$R \amalg_{N \times \mathbb{R}} L$, and $A$ is a locally constant factorization algebra on $M$. Then $A(R)$
and $A(L)$ are right and left $E_1$-modules over $A(N \times \mathbb{R})$, and
\[
A(M) \cong A(R) \otimes_{A(N \times \mathbb{R})} A(L).
\]

Proof. By the proof of Proposition 3.7, to prove that $A(R)$ and $A(L)$ are right
and left $E_1$-modules over $A(N \times \mathbb{R})$, it suffices to show that the map
$A(R - N \times [t, +\infty)) \to A(R)$ is an equivalence for all $t \in \mathbb{R}$. By Proposition 2.10,
the map $A(N \times (a, b)) \to A(N \times (a', b'))$ is an equivalence for all $a' \le a < b \le b'$,
thus the Čech complexes of the covers
\[
\{R - N \times [t - 1, +\infty)\} \cup \{N \times (a, b) \mid t - 2 \le a < b\}
\]
and
\[
\{R - N \times [t - 1, +\infty)\} \cup \{N \times (a, b) \mid t - 2 \le a < b \le t\}
\]
are naturally equivalent. Since $A$ is a factorization algebra, the map
$A(R - N \times [t, +\infty)) \to A(R)$ is an equivalence for all $t \in \mathbb{R}$. Using the same argument as in
Proposition 3.7, we see that $A(R)$ and $A(L)$ are right and left $E_1$-modules over $A(N \times \mathbb{R})$.
To show that the excision property holds, we take the following factorizing
cover of 𝑀:
𝒰 ∶= {𝑅 − 𝑁 × [𝑡, +∞) ∣ −∞ < 𝑡 ≤ ∞}
∪ {𝐿 − 𝑁 × (−∞, 𝑡] ∣ −∞ ≤ 𝑡 < ∞} ∪ {𝑁 × (𝑎, 𝑏) ∣ −∞ ≤ 𝑎 < 𝑏 ≤ ∞}.

Then we know that $A(M)$ is equivalent to the Čech complex $\check{C}(\mathcal{U}, A)$. But by direct
computation, the Čech complex is given by the classical two-sided bar construction
$B(A(R), A(N \times \mathbb{R}), A(L))$, which is equivalent to the tensor product $A(R) \otimes_{A(N \times \mathbb{R})} A(L)$
(by [CG16]). Thus $A(M)$ is equivalent to $A(R) \otimes_{A(N \times \mathbb{R})} A(L)$, completing
the proof. ◻
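For orientation, the sketch below writes out the standard simplicial formula for the two-sided bar construction used in the proof. The element notation ($r$, $e_j$, $l$) is heuristic and only literally meaningful when $\mathsf{C}$ has underlying elements, for example chain complexes; the indexing conventions are an assumption and may differ from those of [CG16].

```latex
% Two-sided bar construction: standard simplicial formula (conventions assumed).
\[
  B_k\bigl(A(R), A(N \times \mathbb{R}), A(L)\bigr)
  = A(R) \otimes A(N \times \mathbb{R})^{\otimes k} \otimes A(L),
  \qquad k \ge 0,
\]
with face maps, for $r \in A(R)$, $e_j \in A(N \times \mathbb{R})$, $l \in A(L)$,
\[
  d_i(r \otimes e_1 \otimes \cdots \otimes e_k \otimes l) =
  \begin{cases}
    (r e_1) \otimes e_2 \otimes \cdots \otimes e_k \otimes l, & i = 0, \\
    r \otimes \cdots \otimes e_i e_{i+1} \otimes \cdots \otimes l, & 0 < i < k, \\
    r \otimes e_1 \otimes \cdots \otimes e_{k-1} \otimes (e_k l), & i = k,
  \end{cases}
\]
and the relative tensor product is its geometric realization:
\[
  A(R) \otimes_{A(N \times \mathbb{R})} A(L)
  \simeq \bigl| B_\bullet\bigl(A(R), A(N \times \mathbb{R}), A(L)\bigr) \bigr|.
\]
```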

Using these lemmas, we state the following proposition:

Proposition 3.21. Suppose $M$ is an $(X, e)$-manifold. Then for any $\mathrm{Disk}^{(X,e)}_n$-algebra
$A$, the map $U \mapsto \int_U A$ for any open set $U$ of $M$ defines a locally constant
factorization algebra, which we will denote by $i(A)$.

Proof. Since the inclusion of open sets of 𝑀 are (𝑋, 𝑒)-embeddings, the map 𝑈 ↦
∫𝑈 𝐴 is a prefactorization algebra. To prove that it is a factorization algebra, we
construct a factorization algebra that is locally constant, and show that it is naturally
equivalent to 𝑖(𝐴). We deduce this by a path similar to the proof of Theorem 3.9.
We begin with the case $M \cong \mathbb{R}^n$. Then $A$ is a $(\mathrm{Disk}^{(X,e)}_n)_{/M}$-algebra. But by
definition $(\mathrm{Disk}^{(X,e)}_n)_{/M}$ is naturally isomorphic to $\mathrm{Disk}^{(M,TM)}_n$, thus if $M = \mathbb{R}^n$,
then $A$ is naturally a $\mathrm{Disk}^{(\mathbb{R}^n, T\mathbb{R}^n)}_n$-algebra, which is equivalent to a $\mathrm{Disk}^{\mathrm{fr}}_n$-algebra.

By Lemma 3.19 and Proposition 2.12, we obtain a factorization algebra on ℝ𝑛 ,


whose value on each open subset that is homeomorphic to a disk is 𝐴. We will
denote it by 𝑖′ (𝐴).
To prove that 𝑖(𝐴) and 𝑖′ (𝐴) are naturally equivalent, we again mimic the proof
of Theorem 3.9. We recall that the essential part of the proof of Theorem 3.9 is
the excision property. By Lemma 3.20, similar to the proof of Theorem 3.9, we
can show that 𝑖(𝐴)(𝑈 ) and 𝑖′ (𝐴)(𝑈 ) are naturally equivalent firstly for 𝑈 homeo-
morphic to some 𝑆 𝑖 × ℝ𝑛−𝑖 , secondly for all handlebodies, and finally for all open
subsets. (Note that the Čech complex functor preserves sequential colimits of inclu-
sions of classes of coverings by its definition.) Thus 𝑖(𝐴) is a factorization algebra
when 𝑀 ≅ ℝ𝑛 .
Now we show that 𝑖(𝐴) is a factorization algebra for all manifolds 𝑀. By the
proof of Theorem 3.9, we only have to consider the following two cases:
i) 𝑀 can be decomposed as the gluing 𝑅 ⨿𝑁 𝐿, where 𝑅, 𝑁, 𝐿 are open subsets
of 𝑀 such that 𝑖(𝐴)|𝑅 , 𝑖(𝐴)|𝑁 , 𝑖(𝐴)|𝐿 are factorization algebras. In this case
𝑖(𝐴)|𝑅 , 𝑖(𝐴)|𝑁 , 𝑖(𝐴)|𝐿 together with the restriction maps form a gluing data,
thus defining a factorization algebra 𝑖′ (𝐴), that is locally constant by Proposi-
tion 2.18; which means that the values of 𝑖′ (𝐴) and 𝑖(𝐴) are naturally equiv-
alent for all open disks in 𝑀. Again using the lemma, 𝑖(𝐴)(𝑈 ) and 𝑖′ (𝐴)(𝑈 )
are naturally equivalent firstly for 𝑈 homeomorphic to some 𝑆 𝑖 × ℝ𝑛−𝑖 , sec-
ondly for all handlebodies, and finally for all open subsets. Thus 𝑖(𝐴) is a
factorization algebra.
ii) $M$ is a sequential colimit $\operatorname{colim}(M_1 \hookrightarrow M_2 \hookrightarrow \cdots)$ of inclusions
of open sets, such that $i(A)|_{M_n}$ is a factorization algebra for each $M_n$.
Although in this case 𝑖(𝐴)|𝑀𝑛 do not form a gluing data, the proof of Propo-
sition 2.17 is still available in the case of sequential colimit of inclusions of
open sets. Thus the data 𝑖(𝐴)|𝑀𝑛 defines a factorization algebra 𝑖′ (𝐴), that is
locally constant by Proposition 2.18; which means that the values of 𝑖′ (𝐴) and
𝑖(𝐴) are naturally equivalent for all open disks in 𝑀. Again using the lemma,
𝑖(𝐴)(𝑈 ) and 𝑖′ (𝐴)(𝑈 ) are naturally equivalent firstly for 𝑈 homeomorphic to
some 𝑆 𝑖 × ℝ𝑛−𝑖 , secondly for all handlebodies, and finally for all open subsets.
Thus 𝑖(𝐴) is a factorization algebra.
After discussing the two cases, we conclude that $i(A)$ is a factorization algebra
firstly for manifolds homeomorphic to some $S^i \times \mathbb{R}^{n-i}$, secondly for all handlebodies,
and finally for all $(X, e)$-manifolds, completing the proof. ◻
Proof of Theorem 3.18. By the above proposition, it suffices to show that the functor
\[
i \colon \mathrm{Disk}^{(M,TM)}_n\text{-}\mathrm{Alg} \to \mathrm{FA}^{lc}(M), \qquad A \mapsto \Big( U \mapsto \int_U A \Big)
\]
induces an equivalence of ∞-categories. We take the functor $j$ from $\mathrm{FA}^{lc}(M)$ to
$\mathrm{Disk}^{(M,TM)}_n\text{-}\mathrm{Alg}$ to be the evaluation at any disk inside $M$, and verify that this functor
is the inverse of the above one.

i) For any $\mathrm{Disk}^{(M,TM)}_n$-algebra $A$, $j(i(A))$ is $\int_{\mathbb{R}^n} A$, which is naturally
equivalent to $A$.

ii) For any locally constant factorization algebra $B$, $i(j(B))$ sends every open subset $U$
of $M$ to $\int_U B(\mathbb{R}^n)$. Now, take a metric on $M$, and take $\mathcal{U}$ to be the collection
of all strictly convex subsets of $M$. Then $\mathcal{U}$ is stable under intersection, forms
a basis of $M$, and is a factorization basis of $M$. Moreover the values of $i(j(B))$
and $B$ on $\mathcal{U}$ are naturally equivalent, since $\int_{\mathbb{R}^n} B(\mathbb{R}^n)$ and $B(\mathbb{R}^n)$ are naturally
equivalent. By Proposition 2.12, $i(j(B))$ and $B$ are naturally equivalent.

The above discussion shows that 𝑗 is an inverse to 𝑖, which completes the proof.

Using this theorem, we may make some interesting applications:

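Example 3.22. We take $M = \mathbb{R}^n$. Then a preferred choice of framing for $\mathbb{R}^n$
together with the projection map $\mathbb{R}^n \to *$ induces an equivalence of ∞-categories
$\mathrm{Disk}^{(\mathbb{R}^n, T\mathbb{R}^n)}_n \to \mathrm{Disk}^{\mathrm{fr}}_n$, which induces equivalences of ∞-categories
\[
\mathrm{FA}^{lc}(\mathbb{R}^n) \cong \mathrm{Disk}^{(\mathbb{R}^n, T\mathbb{R}^n)}_n\text{-}\mathrm{Alg} \cong \mathrm{Disk}^{\mathrm{fr}}_n\text{-}\mathrm{Alg} \cong \mathsf{E}_n\text{-}\mathrm{Alg}.
\]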

Hence we obtain the following theorem, which also appears in [Lur16]:

Theorem 3.23. There exists an equivalence of ∞-categories

FA𝑙𝑐 (ℝ𝑛 ) ≅ E𝑛 -Alg.

Example 3.24. Suppose 𝐴 is a locally constant prefactorization algebra. Then we


may take its value on any open subset of the underlying space that is homeomor-
phic to a disk, which we denote by 𝐴0 . By Theorem 3.18, 𝑖(𝐴0 ) is a factorization
algebra, whose value on open disks is naturally equivalent to 𝐴. This shows that a
locally constant prefactorization algebra can be reformed to a factorization algebra,
as stated in Remark 2.8.

Example 3.25. The notion of a factorization algebra is in some sense a generaliza-


tion of factorization homology. Indeed, suppose 𝐴 is a Disk(𝑋,𝑒)
𝑛
-algebra, and 𝑀 is
an (𝑋, 𝑒)-manifold. Then by Proposition 3.21, 𝐴 corresponds to a locally constant
factorization algebra on 𝑀, which is 𝑖(𝐴), in the language of Theorem 3.18. Take
the map 𝑝 ∶ 𝑀 → ∗, we see that

\[
p_*(i(A))(*) = i(A)(M) = \int_M A.
\]

From this observation, it is meaningful to make the following definition:

Definition 3.26. Suppose 𝐴 is a factorization algebra on a space 𝑋, 𝑝 ∶ 𝑋 → ∗


is the unique map. The factorization homology of 𝐴 is defined to be ∫𝑋 𝐴 ∶=
𝑝∗ (𝐴)(∗).

When 𝑋 is a manifold and 𝐴 is locally constant, this restricts to the original


definition of factorization homology.
We now discuss the behavior of factorization homology in the case that the
underlying space changes, again using Theorem 3.18.
Example 3.27. Suppose 𝑋 and 𝑌 are spaces, 𝑓 ∶ 𝑋 → 𝑌 is a continuous map.
Then, if we set 𝑝1 ∶ 𝑋 → ∗ and 𝑝2 ∶ 𝑌 → ∗, we have 𝑝1 = 𝑝2 ∘ 𝑓 . Thus for any
factorization algebra 𝐴 on 𝑋, we have

\[
\int_X A = (p_1)_*(A)(*) = (p_2)_* f_*(A)(*) = \int_Y f_*(A),
\]

which we will call the pushforward formula.
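The pushforward formula can be unwound on open sets. The sketch below assumes the usual definition of the pushforward of a (pre)factorization algebra, $(f_*A)(V) = A(f^{-1}(V))$ for $V \subseteq Y$ open; we take this to be the convention of the earlier sections, which is an assumption of the sketch.

```latex
% Unwinding the pushforward formula, assuming (f_* A)(V) = A(f^{-1}(V)).
\[
  \int_X A = (p_1)_*(A)(*) = A\bigl(p_1^{-1}(*)\bigr) = A(X),
\]
\[
  \int_Y f_*(A) = (p_2)_*\bigl(f_*(A)\bigr)(*) = (f_*A)(Y) = A\bigl(f^{-1}(Y)\bigr) = A(X).
\]
```

Under this convention the factorization homology of $A$ is simply its value on the whole space, and the formula says that this value does not change when $A$ is first pushed forward along $f$.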


Example 3.28. Suppose 𝑋 and 𝑌 are spaces, 𝑝1 ∶ 𝑋 → ∗ and 𝑝2 ∶ 𝑌 → ∗ are
the canonical maps. If we take 𝜋 ∶ 𝑋 × 𝑌 → 𝑋 to be the projection, then by
Proposition 2.14, there exists an equivalence of ∞-categories 𝜋 ∗ ∶ FA𝑙𝑐 (𝑋 × 𝑌 ) →
FA𝑙𝑐 (𝑋, FA𝑙𝑐 (𝑌 )), such that the following diagram is commutative:
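\[
\begin{array}{ccc}
\mathrm{FA}^{lc}(X \times Y) & \xrightarrow{\qquad \pi_* \qquad} & \mathrm{FA}^{lc}(X) \\
 & {\scriptstyle \pi^{*},\ \cong}\ \searrow \qquad \nearrow\ {\scriptstyle \mathrm{FA}^{lc}(X, (p_2)_*)} & \\
 & \mathrm{FA}^{lc}(X, \mathrm{FA}^{lc}(Y)) &
\end{array}
\]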

If we compose the diagram with $(p_1)_*$, we obtain that for any locally constant
factorization algebra $A$ on $X \times Y$, there is an equivalence
\[
\int_{X \times Y} A = (p_1 \times p_2)_*(A)(*) = (p_1)_*(\pi_*(A))(*) = \int_X \mathrm{FA}^{lc}(X, (p_2)_*(A))(*) = \int_X \int_Y A.
\]

Restricting to the case of manifolds yields the Fubini formula:


Proposition 3.29. Suppose $M, N$ are manifolds with dimensions $m, n$, respectively,
and $A$ is a $\mathrm{Disk}^{(M \times N,\, TM \times TN)}_{m+n}$-algebra. Then $\int_N A$ is canonically a $\mathrm{Disk}^{(M,TM)}_m$-algebra,
and
\[
\int_{M \times N} A \cong \int_M \int_N A.
\]
Furthermore, if we set $M = \mathbb{R}^m$, $N = \mathbb{R}^n$, using Theorem 3.23, we see that the
example is a generalization of Dunn's Theorem, which is stated in [Dun88] and
rewritten in [Lur16]:
Theorem 3.30. There exists an equivalence of ∞-categories
E𝑚+𝑛 -Alg ≅ E𝑚 -Alg(E𝑛 -Alg)

for any positive integers 𝑚, 𝑛.
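To get a feeling for the statement in the simplest setting, the sketch below treats $m = n = 1$ with values in an ordinary 1-category such as sets or vector spaces (an assumption made only for this illustration). There an $E_1$-algebra in $E_1$-algebras is an object with two unital multiplications $\cdot$ and $\circ$ satisfying the interchange law, and the classical Eckmann–Hilton argument shows that the two coincide and are commutative.

```latex
% Eckmann–Hilton argument (1-categorical illustration of the case m = n = 1).
Interchange law: $(a \circ b) \cdot (c \circ d) = (a \cdot c) \circ (b \cdot d)$.

Units agree:
\[
  1_{\cdot} = 1_{\cdot} \cdot 1_{\cdot}
  = (1_{\cdot} \circ 1_{\circ}) \cdot (1_{\circ} \circ 1_{\cdot})
  = (1_{\cdot} \cdot 1_{\circ}) \circ (1_{\circ} \cdot 1_{\cdot})
  = 1_{\circ} \circ 1_{\circ} = 1_{\circ}.
\]
Write $1$ for the common unit. Then
\[
  a \cdot b = (a \circ 1) \cdot (1 \circ b) = (a \cdot 1) \circ (1 \cdot b) = a \circ b,
\]
\[
  a \cdot b = (1 \circ a) \cdot (b \circ 1) = (1 \cdot b) \circ (a \cdot 1) = b \circ a.
\]
Hence $\cdot = \circ$ and the multiplication is commutative.
```

In an ∞-category the two multiplications are only identified up to coherent homotopy, which is why $E_2$-algebras are in general not commutative.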



Stratified spaces and applications to 𝐸𝑛 ‐modules


In this section, we briefly discuss locally constant factorization algebras on strati-
fied spaces, and take some important examples. The notion of stratified spaces is
a slight generalization of manifolds without boundaries. Pointed manifolds (which
are manifolds with a distinguished point), manifolds with or without boundaries can
all be regarded as stratified spaces. Most results in this section can be proved using
the strategies given in the previous sections, so we will mostly omit the proofs.

Definition 3.31. By a stratified space of dimension 𝑛, we mean a Hausdorff para-


compact topological space 𝑋, together with a filtration ∅ = 𝑋−1 ⊆ 𝑋0 ⊆ ⋯ ⊆
𝑋𝑛 = 𝑋 of closed subsets, such that for any 𝑥 ∈ 𝑋𝑖 − 𝑋𝑖−1 , there exists a neigh-
borhood 𝑈𝑥 of 𝑥 that is homeomorphic to ℝ𝑖 × 𝐶(𝐿), where 𝐶(𝐿) is the open cone
on a stratified space of dimension 𝑛 − 𝑖 − 1 if 𝑖 < 𝑛, 𝐶(𝐿) = ∗ if 𝑖 = 𝑛, and the
homeomorphism preserves the filtration; and 𝑋 − 𝑋𝑛−1 is dense in 𝑋. The con-
nected components of 𝑋𝑖 − 𝑋𝑖−1 are called the dimension 𝒊-strata of 𝑋. In our
discussion, we always assume that 𝑋 has at most countable strata.

For example, a manifold with boundary can be regarded as a stratified space,


with its (𝑛 − 1)-dimensional strata being its boundary; a pointed manifold can be
regarded as a stratified space with its 0-dimensional strata being the base point.

Definition 3.32. An open subset 𝐷 of 𝑋 is called a (stratified) disk, if it is home-


omorphic to ℝ𝑖 × 𝐶(𝐿), where 𝐶(𝐿) is the open cone on a stratified space of di-
mension 𝑛 − 𝑖 − 1 if 𝑖 < 𝑛, 𝐶(𝐿) = ∗ if 𝑖 = 𝑛, and the homeomorphism preserves
the filtration, and further 𝐷 ∩ 𝑋𝑖 ≠ ∅ and 𝐷 ⊆ 𝑋 − 𝑋𝑖−1 . The integer 𝑖 is called
the index of 𝐷.
A (stratified) disk 𝐷 is called a good neighborhood at 𝑋𝑖 if 𝐷 has index 𝑖, and
𝐷 intersects only one connected component of 𝑋𝑖 − 𝑋𝑖−1 .
A factorization algebra 𝐴 on a stratified space 𝑋 is called locally constant if for
any inclusion of (stratified) disks 𝑈 ↪ 𝑉 such that 𝑈 and 𝑉 have the same index 𝑖
and 𝑈 , 𝑉 are good neighborhoods at 𝑋𝑖 , the map 𝐴(𝑈 ) → 𝐴(𝑉 ) is an equivalence.

Remark 3.33. In the case of pointed manifolds and manifolds with or without
boundaries, all (stratified) disks are homeomorphic to Euclidean space or a
Euclidean half-space.

Some results on locally constant factorization algebras in the usual case remain
true in the stratified case. We state some useful ones; the proofs are similar.

Proposition 3.34 (c.f. Proposition 2.10). Suppose 𝑓 ∶ 𝑋 → 𝑌 is a locally trivial


fibration between stratified spaces, that is adequately stratified, in the sense that 𝑌
has an open cover by trivializing (stratified) disks 𝑉 which are good neighborhoods
satisfying:

i) 𝑓 −1 (𝑉 ) ≅ 𝑉 × 𝐹 has a cover by (stratified) disks of the form 𝑉 × 𝐷 which are


good neighborhoods in 𝑋;

ii) For sub-disks 𝑇 ⊆ 𝑈 which are good neighborhoods (in 𝑉 ) with the same
index, 𝑇 × 𝐷 is a good neighborhood in 𝑋 of same index as 𝑈 × 𝐷.
Then pushforward along 𝑓 preserves local constantness.
Proposition 3.35 (c.f. Proposition 2.14). Suppose 𝑋, 𝑌 are stratified spaces with
finitely many strata. Then the projections 𝑋 × 𝑌 → 𝑋 and 𝑋 × 𝑌 → 𝑌 are
adequately stratified, and the functor 𝜋 ∗ ∶ FA(𝑋 × 𝑌 ) → FA(𝑋, FA(𝑌 )) restricts to
a functor 𝜋 ∗ ∶ FA𝑙𝑐 (𝑋 × 𝑌 ) → FA𝑙𝑐 (𝑋, FA𝑙𝑐 (𝑌 )), which is an equivalence.
Proposition 3.36 (c.f. Proposition 2.18). Suppose 𝑋 is a stratified space, 𝐴 is a
factorization algebra on 𝑋. If there exists an open cover 𝒰 of 𝑋 such that 𝐴|𝑈 is
locally constant for every 𝑈 ∈ 𝒰, then 𝐴 is locally constant.
We furthermore have the following proposition:
Proposition 3.37. Suppose 𝑖 ∶ 𝑋 → 𝑌 is a stratified embedding of stratified
spaces such that 𝑖(𝑋) is a union of strata of 𝑌 . Then taking the pushforward along
𝑖 preserves local constantness.
We will now focus on some examples.
Example 3.38. The half line, 𝑋 = [0, +∞), with the dimension-0 stratum given
by {0}. Then all connected open subsets of 𝑋 form a factorization basis ℐ of 𝑋.
An example of a stratified locally constant prefactorization algebra on ℐ arises as
follows: choose an $E_1$-algebra $E$ and a right $E$-module $M$, and define the prefactorization
algebra $A_{E,M}$ as follows:
i) For any 0 < 𝑎 < 𝑏, define 𝐴𝐸,𝑀 ((𝑎, 𝑏)) = 𝐸;
ii) For any 0 < 𝑎, define 𝐴𝐸,𝑀 ([0, 𝑎)) = 𝑀;
iii) The structure maps are given by the multiplication on 𝐸 and the structure of
𝑀 being a right 𝐸-module.
It is easy to verify that this forms an ℐ -prefactorization algebra, and satisfies
the locally constant condition. Furthermore, we have:
Proposition 3.39. i) The ℐ -prefactorization algebra 𝐴𝐸,𝑀 given above is an
ℐ -factorization algebra, and hence extends uniquely into a factorization al-
gebra on 𝑋, which we will still denote by 𝐴𝐸,𝑀 ;
ii) 𝐴𝐸,𝑀 is locally constant on 𝑋;
iii) Moreover, every locally constant factorization algebra on $X$ is equivalent to
some $A_{E,M}$;

iv) Finally, there exists an equivalence between $\mathrm{FA}^{lc}([0, \infty))$ and the ∞-category
$\mathsf{E}_1\text{-}\mathrm{RMod}$ of (pointed) right modules over $E_1$-algebras (whose objects are ordered
pairs $(E, M)$ such that $E$ is an $E_1$-algebra and $M$ is a right $E$-module);

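v) The equivalence of ∞-categories satisfies the following commutative diagram
up to equivalence:
\[
\begin{array}{ccc}
\mathrm{FA}^{lc}([0, \infty)) & \xrightarrow{\ \cong\ } & \mathsf{E}_1\text{-}\mathrm{RMod} \\
\downarrow & & \downarrow \\
\mathrm{FA}^{lc}((0, \infty)) & \xrightarrow{\ \cong\ } & \mathsf{E}_1\text{-}\mathrm{Alg}.
\end{array}
\]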

Hence this equivalence can be used to characterize right 𝐸1 -modules, which


may be regarded as a generalization of Theorem 3.23 in the one dimensional case.
Similarly, there exists an equivalence between FA𝑙𝑐 ((−∞, 0]) and the ∞-category
E1 -LMod of (pointed) left modules over 𝐸1 -algebras.

Example 3.40. The unit interval, 𝑋 = [0, 1], with the dimension-0 stratum given
by {0, 1}. Proposition 3.36 shows that

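\[
\mathrm{FA}^{lc}(X) \cong \mathrm{FA}^{lc}([0, 1)) \times_{\mathrm{FA}^{lc}((0,1))} \mathrm{FA}^{lc}((0, 1]) \cong \mathsf{E}_1\text{-}\mathrm{RMod} \times_{\mathsf{E}_1\text{-}\mathrm{Alg}} \mathsf{E}_1\text{-}\mathrm{LMod}.
\]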

Thus, giving a locally constant factorization algebra 𝐴 on 𝑋 is equivalent to giving


a triple (𝐸, 𝑀𝑟 , 𝑀𝑙 ), where 𝐸 is an 𝐸1 -algebra (where for any 0 ≤ 𝑎 < 𝑏 ≤ 1
we have 𝐴((𝑎, 𝑏)) = 𝐸), 𝑀𝑟 is a right 𝐸-module (where for any 0 < 𝑎 ≤ 1 we
have 𝐴([0, 𝑎)) = 𝑀𝑟 ), and 𝑀𝑙 is a left 𝐸-module (where for any 0 ≤ 𝑎 < 1
we have 𝐴((𝑎, 1]) = 𝑀𝑙 ). Furthermore by discussion in the previous section, the
factorization homology of 𝐴 is

\[
\int_{[0,1]} A \cong M_r \otimes_E M_l.
\]
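As a small worked instance of this formula, the sketch below computes the factorization homology of the interval in one concrete case. It assumes that $\mathsf{C}$ is chain complexes over a field $k$, that $E = k[x]$ is the polynomial algebra, and that $M_r = M_l = k$ with $x$ acting by zero; all three choices are assumptions made for illustration.

```latex
% Worked example: E = k[x], M_r = M_l = k with x acting by 0 (derived tensor product).
Resolve $M_l = k$ by free $k[x]$-modules (Koszul resolution):
\[
  0 \to k[x] \xrightarrow{\ \cdot x\ } k[x] \to k \to 0.
\]
Applying $k \otimes_{k[x]} (-)$ to the resolution gives the two-term complex
\[
  k \xrightarrow{\ 0\ } k,
\]
so that
\[
  H_j\Bigl(\int_{[0,1]} A\Bigr) = \operatorname{Tor}^{k[x]}_j(k, k) =
  \begin{cases}
    k, & j = 0, 1, \\
    0, & \text{otherwise.}
  \end{cases}
\]
```

The class in degree 1 is invisible to the underived tensor product $k \otimes_{k[x]} k = k$, which illustrates why the tensor product in the formula has to be taken in the ∞-categorical (derived) sense.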

Example 3.41. The pointed Euclidean space, $X = \mathbb{R}^n_*$, with the dimension-0 stratum
given by $\{0\}$. Similar to the case above, we will discuss the relationship between
locally constant factorization algebras on $X$ and $E_n$-modules.
To do this, we notice that there exists a factorization basis of 𝑋, the open convex
sets, which we will denote by 𝒞 . Suppose 𝑀 is a module over an 𝐸𝑛 -algebra 𝐸.
We define a 𝒞 -prefactorization algebra 𝐴𝐸,𝑀 as follows:

i) For any 𝑈 ∈ 𝒞 , 0 ∈ 𝑈 , define 𝐴𝐸,𝑀 (𝑈 ) = 𝑀;

ii) For any 𝑈 ∈ 𝒞 , 0 ∉ 𝑈 , define 𝐴𝐸,𝑀 (𝑈 ) = 𝐸;

iii) The structure maps are given by the multiplication on 𝐸 and the structure of
𝑀 being an 𝐸-module.

It is easy to verify that this forms a 𝒞 -prefactorization algebra, and satisfies


the locally constant condition. Similar to Proposition 3.36, we have the following
result:

Proposition 3.42. i) The 𝒞 -prefactorization algebra 𝐴𝐸,𝑀 given above is a 𝒞 -


factorization algebra, and hence extends uniquely to a factorization algebra
on 𝑋, which we will still denote by 𝐴𝐸,𝑀 ;
ii) 𝐴𝐸,𝑀 is locally constant on 𝑋;

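iii) We obtain a functor 𝜓 ∶ E𝑛 -Mod → FA𝑙𝑐 (𝑋), which fits into the following commutative diagram:

\[
\begin{array}{ccc}
E_n\text{-Alg} & \xrightarrow{\;\cong\;} & \mathrm{FA}^{lc}(\mathbb{R}^n) \\
\downarrow & & \downarrow \\
E_n\text{-Mod} & \xrightarrow{\;\psi\;} & \mathrm{FA}^{lc}(X);
\end{array}
\]

iv) The maps

\[
E_n\text{-Mod} \xrightarrow{\;\psi\;} \mathrm{FA}^{lc}(X) \to \mathrm{FA}^{lc}(\mathbb{R}^n - \{0\})
\]

and

\[
E_n\text{-Mod} \to E_n\text{-Alg} \cong \mathrm{FA}^{lc}(\mathbb{R}^n) \to \mathrm{FA}^{lc}(\mathbb{R}^n - \{0\})
\]

are equivalent, and identify E𝑛 -Mod with the pullback $\mathrm{FA}^{lc}(X) \times_{\mathrm{FA}^{lc}(\mathbb{R}^n - \{0\})} \mathrm{FA}^{lc}(\mathbb{R}^n)$.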
Hence this proposition can be used to characterize 𝐸𝑛 -modules, which may
be regarded as a generalization of Theorem 3.23 in the 𝑛-dimensional case. In
particular, by taking the fiber we have the following corollary:
Corollary 3.43. The functor

\[
E_n\text{-Mod}(A) \to \mathrm{FA}^{lc}(\mathbb{R}^n_*) \times_{\mathrm{FA}^{lc}(\mathbb{R}^n - \{0\})} \{A\}
\]

is an equivalence for any 𝐸𝑛 -algebra 𝐴.


We may also use this example to characterize 𝐸1 -bimodules. Take 𝑛 = 1; then, unlike the case 𝑛 ≥ 2, there exist two (instead of one) strata of maximal dimension in 𝑋. Mimicking Example 3.38, we have the following proposition:
Proposition 3.44. There exists an equivalence of ∞-categories between FA𝑙𝑐 (ℝ∗ ) and the ∞-category E1 -BiMod of 𝐸1 -bimodules. The equivalence assigns to an (𝐿, 𝑅)-bimodule 𝑀 the locally constant factorization algebra 𝐴 whose value on (𝑎, 𝑏) is 𝐿 if 𝑏 ≤ 0, 𝑅 if 𝑎 ≥ 0, and 𝑀 otherwise, and whose structure maps are given by the structure maps of the bimodule.
Example 3.45. The closed unit disk, 𝑋 = 𝐷𝑛 , with the (𝑛 − 1)-dimensional
strata given by 𝑆 𝑛−1 . Then by Proposition 3.36, there exists an equivalence of
∞-categories
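\[
\mathrm{FA}^{lc}(D^n) \cong \mathrm{FA}^{lc}(D^n - \{0\}) \times_{\mathrm{FA}^{lc}(D^n - S^{n-1} - \{0\})} \mathrm{FA}^{lc}(D^n - S^{n-1}).
\]

Furthermore, since

\[
\begin{aligned}
\mathrm{FA}^{lc}(D^n - S^{n-1}) &\cong \mathrm{FA}^{lc}(\mathbb{R}^n) \cong E_n\text{-Alg}, \\
\mathrm{FA}^{lc}(D^n - S^{n-1} - \{0\}) &\cong \mathrm{FA}^{lc}(\mathbb{R}^n - \{0\}) \cong \mathrm{FA}^{lc}(S^{n-1} \times \mathbb{R}) \cong E_1\text{-Alg}(\mathrm{FA}^{lc}(S^{n-1})), \\
\mathrm{FA}^{lc}(D^n - \{0\}) &\cong \mathrm{FA}^{lc}(S^{n-1} \times (-\infty, 0]) \cong E_1\text{-LMod}(\mathrm{FA}^{lc}(S^{n-1})),
\end{aligned}
\]

we obtain an equivalence

\[
\mathrm{FA}^{lc}(D^n) \cong E_n\text{-Alg} \times_{E_1\text{-Alg}(\mathrm{FA}^{lc}(S^{n-1}))} E_1\text{-LMod}(\mathrm{FA}^{lc}(S^{n-1})).
\]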

Now, for any map 𝑓 ∶ 𝐸 → 𝐹 of 𝐸𝑛 -algebras, 𝐹 has an 𝐸𝑛 -module structure over 𝐸 induced by 𝑓 . Furthermore, 𝐸 and 𝐹 can be regarded as locally constant factorization algebras on ℝ𝑛 , hence on ℝ𝑛 − {0} ≅ 𝑆 𝑛−1 × ℝ. Let 𝑞 ∶ 𝑆 𝑛−1 × ℝ → 𝑆 𝑛−1 be the projection; then 𝑞∗ (𝐸) is an 𝐸1 -algebra, and 𝑞∗ (𝐹 ) is an 𝐸1 -module over 𝑞∗ (𝐸). Thus the equivalence above assigns to the map 𝑓 a locally constant factorization algebra on 𝐷𝑛 , which we denote by 𝜔(𝑓 ). Its value on a good neighborhood of index (𝑛 − 1) is 𝐹 , and its value on a good neighborhood of index 𝑛 is 𝐸. Moreover, we may collapse the boundary of 𝐷𝑛 to a point, which turns 𝜔(𝑓 ) into a locally constant factorization algebra on 𝑆∗𝑛 . We denote this locally constant factorization algebra on 𝑆∗𝑛 by 𝜔′ (𝑓 ).

4 Further applications
In this section, we will briefly talk about two applications of the above results on
factorization algebras and factorization homology: the Deligne conjecture and the
Bar constructions. We will solve the Deligne conjecture by the tool of centralizers,
and extend the study of Bar constructions on associative algebras to arbitrary 𝐸𝑛 -
algebras. We will mostly present ideas and sketches of proofs; the details can be
found in [Fra11] and [GTZ12].

Centralizers and (higher) Deligne conjecture


The original Deligne conjecture on the structure of Hochschild cohomology of
an associative algebra (more generally an 𝐴∞ -algebra) states that Hochschild co-
homology naturally has the structure of a BV-algebra. More generally, the full
Hochschild cohomology complex has the structure of an algebra over the (framed)
little disk operad. Here we prove the following result, which is usually referred to as the “higher Deligne conjecture”:

Theorem 4.1 (Higher Deligne conjecture). Suppose 𝐴 is an 𝐸𝑛 -algebra. Then the


𝐸𝑛 -Hochschild cohomology HH𝐸𝑛 (𝐴, 𝐴) has an 𝐸𝑛+1 -algebra structure.

We will first review the definition of the 𝐸𝑛 -Hochschild cohomology.



Definition 4.2. Suppose C is a closed symmetric monoidal ∞-category. Let 𝑀 be an 𝐸𝑛 -module over an 𝐸𝑛 -algebra 𝐴. The 𝑬𝒏 -Hochschild cohomology of 𝐴 with values in 𝑀 is defined to be

\[
\mathrm{HH}_{E_n}(A, M) := \mathrm{Map}^{E_n}_A(A, M) := \mathrm{Map}_{E_n\text{-Mod}(A)}(A, M) \in \mathsf{C}.
\]

Remark 4.3. Since an 𝐸1 -module over 𝐴 is equivalent to a left 𝐸1 -module over 𝐴 ⊗ 𝐴𝑜𝑝 , we find that

HH𝐸1 (𝐴, 𝑀) = MapE1 -Mod(𝐴) (𝐴, 𝑀) ≅ MapE1 -LMod(𝐴⊗𝐴𝑜𝑝 ) (𝐴, 𝑀)

coincides with the usual definition of Hochschild cohomology.

Now we introduce the language of the centralizer of a map of 𝐸𝑛 -algebras, which is due to [Lur16] and can be used to prove the Deligne conjecture:

Definition 4.4. Suppose 𝑓 ∶ 𝐴 → 𝐵 is a map of 𝐸𝑛 -algebras. The centralizer of 𝑓 is an 𝐸𝑛 -algebra 𝔷𝑛 (𝑓 ) together with a map 𝑒𝑓 ∶ 𝐴 ⊗ 𝔷𝑛 (𝑓 ) → 𝐵 of 𝐸𝑛 -algebras,
such that 𝑒𝑓 ∘ (𝟙 ⊗ 1𝔷𝑛 (𝑓 ) ) = 𝑓 ; and for any 𝐸𝑛 -algebra 𝐶 together with a map
𝜑 ∶ 𝐴 ⊗ 𝐶 → 𝐵 of 𝐸𝑛 -algebras, such that 𝜑 ∘ (𝟙 ⊗ 1𝐶 ) = 𝑓 , there exists a unique
map 𝜅 ∶ 𝐶 → 𝔷𝑛 (𝑓 ) such that 𝜑 = 𝑒𝑓 ∘ (𝟙 ⊗ 𝜅). We define the center of 𝐴 to be
𝔷𝑛 (𝐴) ∶= 𝔷𝑛 (𝟙𝐴 ).

Example 4.5. Suppose 𝐺 is an 𝐸1 -algebra in the category Set, that is, a monoid. Then 𝔷1 (𝐺) is the usual center 𝑍(𝐺). If 𝑓 ∶ 𝐻 → 𝐺 is an inclusion of monoids, then 𝔷1 (𝑓 ) is the usual centralizer 𝑍𝐻 (𝐺). This explains the name “centralizer”. Notice that in this example, 𝑍(𝐺) is commutative, meaning that it is an 𝐸2 -algebra.
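The set-level case of this example can be checked by brute force. The following is a minimal sketch, not taken from the references: it assumes the monoid 𝐺 is the monoid of all self-maps of a three-element set under composition, and 𝐻 is the submonoid generated by a cyclic shift. The names `compose`, `centralizer`, `T3` and `H` are chosen here for illustration only.

```python
from itertools import product

def compose(f, g):
    """Return f after g for maps encoded as tuples: (f . g)(x) = f[g[x]]."""
    return tuple(f[g[x]] for x in range(len(g)))

def centralizer(monoid, subset):
    """Elements of `monoid` commuting with every element of `subset`."""
    return {z for z in monoid
            if all(compose(z, h) == compose(h, z) for h in subset)}

n = 3
# Hypothetical example monoid G: all maps {0,1,2} -> {0,1,2} under composition.
T3 = set(product(range(n), repeat=n))
identity = tuple(range(n))
shift = (1, 2, 0)  # the cyclic shift x -> x + 1 (mod 3)
# Submonoid H generated by the shift.
H = {identity, shift, compose(shift, shift)}

center = centralizer(T3, T3)   # plays the role of Z(G)
cent_H = centralizer(T3, H)    # plays the role of Z_H(G)

# The center is commutative, as the example predicts.
is_commutative = all(compose(a, b) == compose(b, a)
                     for a in center for b in center)
```

In this toy case the center consists of the identity alone, while the centralizer of 𝐻 is 𝐻 itself; both are computed directly from the commutation condition rather than from the universal property of Definition 4.4.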

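Remark 4.6. By the definition, for any maps of 𝐸𝑛 -algebras 𝑓 ∶ 𝐴 → 𝐵 and 𝑔 ∶ 𝐵 → 𝐶, there exists a commutative diagram

\[
\begin{tikzcd}[column sep=small]
 & & A \otimes \mathfrak{z}_n(f) \otimes \mathfrak{z}_n(g) \arrow[rd, "e_f \otimes \mathbb{1}"] & & \\
 & A \otimes \mathfrak{z}_n(f) \arrow[ru, "\mathbb{1} \otimes 1_{\mathfrak{z}_n(g)}"] \arrow[rd, "e_f"] & & B \otimes \mathfrak{z}_n(g) \arrow[rd, "e_g"] & \\
A \arrow[ru, "\mathbb{1} \otimes 1_{\mathfrak{z}_n(f)}"] \arrow[rr, "f"'] & & B \arrow[ru, "\mathbb{1} \otimes 1_{\mathfrak{z}_n(g)}"'] \arrow[rr, "g"'] & & C
\end{tikzcd}
\]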

which induces a natural map

𝔷𝑛 (∘) ∶ 𝔷𝑛 (𝑓 ) ⊗ 𝔷𝑛 (𝑔) → 𝔷𝑛 (𝑔 ∘ 𝑓 )

of 𝐸𝑛 -algebras by the universal property.

The following theorem, also known as the “relative Deligne conjecture”, claims
that the centralizer is computed by 𝐸𝑛 -Hochschild cohomology:

Theorem 4.7 (Relative Deligne conjecture). Suppose 𝑓 ∶ 𝐴 → 𝐵 is a map of 𝐸𝑛 -algebras. Denote by 𝐵𝑓 the algebra 𝐵 endowed with the 𝐸𝑛 -module structure over 𝐴 induced by 𝑓 . Then there is an 𝐸𝑛 -algebra structure on HH𝐸𝑛 (𝐴, 𝐵𝑓 ), making it the centralizer of 𝑓 . In particular, 𝔷𝑛 (𝑓 ) exists. Moreover, for any other map 𝑔 ∶ 𝐵 → 𝐶 of 𝐸𝑛 -algebras, the following diagram is commutative:

\[
\begin{array}{ccc}
\mathfrak{z}_n(f) \otimes \mathfrak{z}_n(g) & \xrightarrow{\;\mathfrak{z}_n(\circ)\;} & \mathfrak{z}_n(g \circ f) \\
{\scriptstyle\cong}\downarrow & & \downarrow{\scriptstyle\cong} \\
\mathrm{Map}^{E_n}_A(A, B_f) \otimes \mathrm{Map}^{E_n}_B(B, C_g) & \xrightarrow{\;\circ\;} & \mathrm{Map}^{E_n}_A(A, C_{g \circ f}),
\end{array}
\]

where the lower arrow is induced by composition of maps.

Sketch of proof. Details can be found in [GTZ12]. By the characterization given in Proposition 3.42, 𝐴 and 𝐵 can be regarded as locally constant factorization algebras on ℝ𝑛∗ , and a map of 𝐸𝑛 -modules over 𝐴 from 𝐴 to 𝐵𝑓 is equivalent to a map of factorization algebras on ℝ𝑛∗ from 𝐴 to 𝐵 whose restriction to ℝ𝑛 − {0} is 𝑓 .
We first give the 𝐸𝑛 -algebra structure on $\mathrm{HH}_{E_n}(A, B_f) = \mathrm{Map}^{E_n}_A(A, B_f)$. It suffices to give a locally constant factorization algebra on ℝ𝑛 whose global sections are $\mathrm{Map}^{E_n}_A(A, B_f)$, and for this it is enough to define it on 𝒞 , the basis of convex open sets. For any 𝑈 ∈ 𝒞 , we choose a central point 𝑥𝑈 and associate with 𝑈 the object $Z(U) = \mathrm{Map}^{E_n}_A(A, B_f)$, represented by all maps of factorization algebras from 𝐴|𝑈 to 𝐵|𝑈 whose restriction to 𝑈 − {𝑥𝑈 } is 𝑓 ; i.e.

\[
\mathrm{Map}_{\mathrm{FA}(U)}(A|_U, B|_U) \times_{\mathrm{Map}_{\mathrm{FA}(U - \{x_U\})}(A|_U, B|_U)} \{f\}.
\]

For any 𝑈1 , ⋯ , 𝑈𝑟 , 𝑉 ∈ 𝒞 such that 𝑈1 , ⋯ , 𝑈𝑟 are pairwise disjoint and contained


in 𝑉 , we need to define a map

𝜌𝑈1 ,⋯,𝑈𝑟 ,𝑉 ∶ 𝑍(𝑈1 ) ⊗ ⋯ ⊗ 𝑍(𝑈𝑟 ) → 𝑍(𝑉 ).

Given 𝑔𝑖 ∶ 𝐴|𝑈𝑖 → 𝐵|𝑈𝑖 in 𝑍(𝑈𝑖 ) for each 𝑖, we define 𝜌𝑈1 ,⋯,𝑈𝑟 ,𝑉 (𝑔1 , ⋯ , 𝑔𝑟 ) as a map of factorization algebras from 𝐴 to 𝐵 on a factorizing basis of 𝑉 . We take the following basis:

• All open sets in 𝑉 that are contained either in some 𝑈𝑖 or in 𝑉 − {𝑥𝑈1 , ⋯ , 𝑥𝑈𝑟 };

and define the map 𝜌𝑈1 ,⋯,𝑈𝑟 ,𝑉 (𝑔1 , ⋯ , 𝑔𝑟 ) of factorization algebras as follows:

• Its value on an open set 𝐷 contained in 𝑈𝑖 is given by 𝑔𝑖 ;

• Its value on an open set 𝐷 contained in 𝑉 − {𝑥𝑈1 , ⋯ , 𝑥𝑈𝑟 } is given by 𝑓 .

Then we observe that if we take a sufficiently large closed disk 𝐷′ contained in 𝑉 and containing all the 𝑥𝑈𝑖 , and collapse 𝐷′ to a point, then 𝜌𝑈1 ,⋯,𝑈𝑟 ,𝑉 (𝑔1 , ⋯ , 𝑔𝑟 ) turns into a map of factorization algebras on 𝑉 ∕𝐷′ from 𝐴 to 𝐵 whose restriction to 𝑉 − 𝐷′ is 𝑓 , and hence lies in $\mathrm{Map}^{E_n}_A(A, B_f) = Z(V)$.
It can be checked that the constructions given above make 𝑍 a locally constant 𝒞 -factorization algebra, which therefore extends to a locally constant factorization algebra on ℝ𝑛 , making $\mathrm{Map}^{E_n}_A(A, B_f)$ an 𝐸𝑛 -algebra. Moreover, using the constructions above, one can check that the evaluation map $\mathrm{ev} \colon A \otimes \mathrm{Map}^{E_n}_A(A, B_f) \to B$ is a map of 𝐸𝑛 -algebras.
To prove that HH𝐸𝑛 (𝐴, 𝐵𝑓 ) is equivalent to 𝔷𝑛 (𝑓 ), we choose an arbitrary 𝐸𝑛 -algebra 𝐶 together with a map 𝜑 ∶ 𝐴 ⊗ 𝐶 → 𝐵 of 𝐸𝑛 -algebras such that 𝜑 ∘ (𝟙 ⊗ 1𝐶 ) = 𝑓 . Then 𝜑 induces a map 𝜃𝜑 ∶ 𝐶 → MapE𝑛 -Alg (𝐴, 𝐵). On the other hand, there exists a canonical map MapE𝑛 -Alg (𝐴, 𝐵) → MapE𝑛 -Mod(𝐴) (𝐴, 𝐵𝑓 ); composing the two yields a map $\tilde{\theta}_\varphi \colon C \to \mathrm{HH}_{E_n}(A, B_f)$ which, by the definition of an adjunction, satisfies the identity $\mathrm{ev} \circ (\mathbb{1} \otimes \tilde{\theta}_\varphi) = \varphi$. Since the composite

\[
\mathrm{Map}^{E_n}_A(A, B_f) \xrightarrow{\;\mathrm{Map}^{E_n}_A(A,\, A \otimes \mathbb{1})\;} \mathrm{Map}^{E_n}_A\bigl(A, A \otimes \mathrm{Map}^{E_n}_A(A, B_f)\bigr) \xrightarrow{\;\mathrm{Map}^{E_n}_A(A,\, \mathrm{ev})\;} \mathrm{Map}^{E_n}_A(A, B_f)
\]

is the identity map, the uniqueness of the map $\tilde{\theta}_\varphi$ follows.


The commutativity of the diagram follows immediately from the universal property.

We can now prove Theorem 4.1.

Proof of Theorem 4.1. We first notice that for any object 𝑋 in C, the composi-
tion map ∘ ∶ Map(𝑋, 𝑋) ⊗ Map(𝑋, 𝑋) → Map(𝑋, 𝑋) makes Map(𝑋, 𝑋) an 𝐸1 -
algebra. Now, by the relative Deligne conjecture, HH𝐸𝑛 (𝐴, 𝐴) is an 𝐸1 -algebra
in the ∞-category E𝑛 -Alg, which means it is an 𝐸𝑛+1 -algebra, by Dunn’s Theo-
rem. ◻

We state some examples.

Example 4.8. Suppose A is a monoidal category, which is an 𝐸1 -algebra in the


category of categories. Then 𝔷1 (A) is an 𝐸2 -algebra in the category of categories,
which is a braided monoidal category. It is shown in [Lur16] that 𝔷1 (A) is actually
the Drinfeld center of A.

Example 4.9. Suppose 𝐴 is an 𝐸𝑛 -algebra. Then the unit map 1𝐴 ∶ 𝟙 → 𝐴 has centralizer 𝔷𝑛 (1𝐴 ) = 𝐴. The composition map 𝔷𝑛 (1𝐴 ) ⊗ 𝔷𝑛 (𝐴) → 𝔷𝑛 (1𝐴 ) exhibits 𝔷𝑛 (1𝐴 ) = 𝐴 as a left module over 𝔷𝑛 (𝐴) in the ∞-category E𝑛 -Alg, which means that the 𝐸𝑛+1 -algebra HH𝐸𝑛 (𝐴, 𝐴) acts naturally on 𝐴. This is also called the “Swiss-Cheese version of Deligne conjecture”.

An introduction to the Bar constructions


Bar constructions were introduced in topology as a model for the coalgebra structure on the cochains on Ω𝑋, the loop space of a pointed space 𝑋. In this section, we give a short introduction to the definition and basic properties of the Bar constructions. We refer readers interested in further properties of the Bar constructions and their relationship to iterated loop spaces to [Fra11] and [GTZ12].
We recall the original definition of the Bar construction.
Definition 4.10. Suppose 𝐴 is an augmented associative algebra in a symmetric
monoidal ∞-category C; that is, an associative algebra together with an algebra
homomorphism 𝐴 → 𝟙, where 𝟙 is the unit of the tensor product of C. The Bar
object4 is given by 𝐵(𝐴) ∶= 𝟙 ⊗𝐴 𝟙.
Remark 4.11. In the setting of chain complexes, if 𝐴 is flat over the base field 𝕜, then 𝐵(𝐴) is computed by the usual standard chain complex $\bigoplus_{n \ge 0} \bar{A}^{\otimes n}$, where $\bar{A}$ is the kernel of the augmentation.
It is well known that the Bar construction has a standard coalgebra structure.
Moreover, if 𝐴 is a commutative algebra, then the Bar construction has a standard
commutative algebra structure. We now generalize the results into the case of 𝐸𝑛 -
algebras.
Definition 4.12. An augmented 𝑬𝒏 -algebra is an 𝐸𝑛 -algebra 𝐴 together with a
map 𝜀 ∶ 𝐴 → 𝟙 of 𝐸𝑛 -algebras, which is called the augmentation. The ∞-
category of augmented 𝐸𝑛 -algebras is denoted E𝑛 -Alg𝑎𝑢𝑔 .
Now, according to Example 3.45, an augmented 𝐸𝑛 -algebra defines a locally
constant factorization algebra on the closed disk 𝐷𝑛 . Moreover, we have the fol-
lowing functors

𝜔𝑖 ∶ E𝑛 -Alg𝑎𝑢𝑔 → E𝑖 -Alg𝑎𝑢𝑔 (E𝑛−𝑖 -Alg𝑎𝑢𝑔 ) → FA𝑙𝑐 (𝐷𝑖 , E𝑛−𝑖 -Alg𝑎𝑢𝑔 ),

where the first is induced by the equivalence E𝑛 -Alg ≅ E𝑖 -Alg(E𝑛−𝑖 -Alg).


Definition 4.13. Suppose 𝐴 is an augmented 𝐸𝑛 -algebra. Define its Bar object to be

\[
B(A) := \int_{I \times \mathbb{R}^{n-1}} A \;\otimes_{\int_{S^0 \times \mathbb{R}^{n-1}} A}\; \mathbb{1}.
\]

It can be directly verified that the right hand side of the definition is equivalent
to 𝑝∗ (𝜔1 (𝐴)), where 𝑝 ∶ 𝐼 → ∗ is the unique map. Thus the definition is just the
factorization homology of the associated factorization algebra on 𝐷1 . In particular,
this definition agrees with the definition for augmented associative algebras.
Hence, the above discussion shows that 𝐵(𝐴) has a natural structure of an 𝐸𝑛−1 -
algebra. We can therefore iterate this construction at most 𝑛 times:
4
This is often called the “Bar complex”, since the underlying category is often taken to be the derived
∞-category of chain complexes, ℂ𝕙(𝕜).

Definition 4.14. Suppose 𝐴 is an augmented 𝐸𝑛 -algebra, and 0 ≤ 𝑖 ≤ 𝑛. The 𝒊-th


iterated Bar object of 𝐴 is the augmented 𝐸𝑛−𝑖 -algebra
𝐵 𝑖 (𝐴) = 𝐵(𝐵(⋯ (𝐵(𝐴)) ⋯)).
By the definition of the Bar object, and the excision axiom of factorization
homology, we may prove the following proposition:
Proposition 4.15. Suppose 𝐴 is an augmented 𝐸𝑛 -algebra, and 0 ≤ 𝑖 ≤ 𝑛. There
exist natural equivalences of 𝐸𝑛−𝑖 -algebras

\[
B^i(A) \cong \int_{D^i \times \mathbb{R}^{n-i}} A \;\otimes_{\int_{S^{i-1} \times \mathbb{R}^{n-i}} A}\; \mathbb{1} \cong p_*(\omega_i(A)),
\]

where 𝑝 ∶ 𝐷𝑖 → ∗ is the unique map.


In particular, taking 𝑛 = ∞, we obtain the 𝐸∞ -structure on the iterated Bar
objects 𝐵 𝑖 (𝐴) of an augmented 𝐸∞ -algebra, which restricts to the standard com-
mutative algebra structure on the Bar object if 𝐴 is a commutative algebra.
We now sketch the construction of the 𝐸𝑖 -coalgebra structure on the 𝑖-th iterated Bar construction of an 𝐸𝑛 -algebra. To do so, we only need to construct a locally constant “cofactorization algebra” on ℝ𝑖 whose global sections are 𝐵 𝑖 (𝐴); as in the previous section, it suffices to do so on 𝒞 , the basis of convex open sets.
We take the locally constant factorization algebra on ℝ𝑖 with values in
E𝑛−𝑖 -Alg𝑎𝑢𝑔 determined by 𝐴, which will be still denoted 𝐴. For any 𝑉 ∈ 𝒞 , by
Example 3.45, the augmentation gives a stratified locally constant factorization
algebra 𝜔′ (𝐴|𝑉 ) on 𝑉 ∪ {∞}. (Notice that 𝑉 ∪ {∞} is homeomorphic to 𝑆∗𝑖 , since 𝑉 is a convex open subset of ℝ𝑖 .) We assign to 𝑉 the object

\[
B^i(A)(V) := \int_{V \cup \{\infty\}} \omega'(A|_V).
\]

By definition, $B^i(A)(V)$ is equivalent to $B^i(A)$. For any 𝑈1 , ⋯ , 𝑈𝑟 , 𝑉 ∈ 𝒞 such that 𝑈1 , ⋯ , 𝑈𝑟 are pairwise disjoint and contained in 𝑉 , there exists a continuous map 𝜋 ∶ 𝑉 ∪ {∞} → 𝑈1 ∪ ⋯ ∪ 𝑈𝑟 ∪ {∞}, and the augmentation defines a map

\[
\varepsilon \colon \pi_*(\omega'(A|_V)) \to \bigotimes_{k=1}^{r} \omega'(A|_{U_k}).
\]

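We then obtain a structure map

\[
\nabla_{U_1, \dots, U_r, V} \colon B^i(A)(V) = \int_{V \cup \{\infty\}} \omega'(A|_V) \xrightarrow{\;\int \varepsilon\;} \bigotimes_{k=1}^{r} \int_{U_k \cup \{\infty\}} \omega'(A|_{U_k}) = \bigotimes_{k=1}^{r} B^i(A)(U_k).
\]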

The following theorem from [GTZ12] shows that this indeed gives the 𝐸𝑖 -coalgebra structure on 𝐵 𝑖 (𝐴):

Theorem 4.16. The maps ∇𝑈1 ,⋯,𝑈𝑟 ,𝑉 form the structure maps of a locally constant factorization coalgebra, making 𝐵 𝑖 (𝐴) into an 𝐸𝑖 -coalgebra. Thus 𝐵 𝑖 lifts to a
functor
𝐵 𝑖 ∶ E𝑛 -Alg𝑎𝑢𝑔 → E𝑖 -coAlg(E𝑛−𝑖 -Alg𝑎𝑢𝑔 ).
In particular, taking 𝑖 = 1, we obtain the 𝐸1 -coalgebra structure on the Bar
object 𝐵(𝐴) of an augmented 𝐸𝑛 -algebra, which restricts to the standard coalgebra
structure on the Bar object if 𝐴 is an associative algebra.

References
[AF12] D. Ayala and J. Francis (2012). Factorization Homology of Topological
Manifolds. arXiv: 1206.5522 [math.RT].
[CG16] K. Costello and O. Gwilliam (2016). Factorization Algebras in Quantum Field Theory. Cambridge University Press.
[Dun88] G. Dunn (1988). “Tensor Product of Operads and Iterated Loop Spaces”.
Journal of Pure and Applied Algebra 50, 237–258.
[FN62] E. Fadell and L. Neuwirth (1962). “Configuration Spaces”. Mathematica
Scandinavica 10, 111–118.
[Fra11] J. Francis (2011). The Tangent Complex and Hochschild Cohomology of
𝐸𝑛 -rings. arXiv: 1104.0181 [math.RT].
[Gin13] G. Ginot (2013). Notes on Factorization Algebras, Factorization Homol-
ogy and Applications. arXiv: 1307.5213 [math.RT].
[GTZ10] G. Ginot, T. Tradler, and M. Zeinalian (2010). Derived Higher
Hochschild Homology, Topological Chiral Homology and Factorization
algebras. arXiv: 1011.6483 [math.RT].
[GTZ12] G. Ginot, T. Tradler, and M. Zeinalian (2012). Higher Hochschild Co-
homology, Brane Topology and Centralizers of 𝐸𝑛 -algebra Maps. arXiv:
1205.7056 [math.RT].
[Hov99] M. Hovey (1999). Model Categories. Vol. 63. Mathematical Surveys and Monographs. American Mathematical Society.
[KS77] R. C. Kirby and L. C. Siebenmann (1977). Foundational Essays on Topo-
logical Manifolds, Smoothings, and Triangulations. Princeton University
Press.
[Lur09] J. Lurie (2009). Higher Topos Theory. Vol. 170. Annals of Mathematics
Studies. Princeton University Press.
[Lur16] J. Lurie (2016). Higher Algebra. URL: http://www.math.harvard.edu/~lurie/papers/HA.pdf.
[MathOverflow] 𝑘-Disk algebras versus 𝐸𝑘 algebras. https://mathoverflow.net/questions/181828/k-disk-algebras-versus-e-k-algebras.
[May72] J. P. May (1972). The Geometry of Iterated Loop Spaces. Vol. 271. Lec-
ture Notes in Mathematics. Springer.
[nLab] The nLab. https://ncatlab.org.
Stable Irrationality of Varieties

Bu Chenjing1

ABSTRACT
We present several methods of showing the existence of stably irrational
varieties within a given family, and we show the infinitude of stable birational
classes in the family of quartic threefolds.

Contents

1 Introduction

2 Criteria for rationality
Chow groups
Rationality and zero-cycles
Decomposition of the diagonal
The Brauer group

3 The deformation method
Families of cycles
Locus of rational equivalence
Locus of decomposability of the diagonal
Stable equivalence

4 The specialisation method
The specialisation map
Rationality and specialisation

5 Example: Quartic threefolds
The example
Consequences

6 Example: Cubic threefolds

7 Example: Quadric surface bundles
Irrationality
Density of the rational locus

1
Bu Chenjing, Class Math 72, Department of Mathematical Sciences, Tsinghua University.

1 Introduction
For a projective variety, there are various notions of rationality, describing how
close a variety is to the projective space.
Definition 1.1. Let k be a field, and let 𝑋 be a projective k-variety.
• 𝑋 is rational, if there exists 𝑛 ∈ N, such that

𝑋 is birational to P𝑛k .

Equivalently, we have k(𝑋) ≃ k(𝑥1 , … , 𝑥𝑛 ) as k-algebras.


• 𝑋 is stably rational, if there exist 𝑚, 𝑛 ∈ N, such that

𝑋 × P𝑚k is birational to P𝑛k .

Equivalently, we have k(𝑋)(𝑦1 , … , 𝑦𝑚 ) ≃ k(𝑥1 , … , 𝑥𝑛 ) as k-algebras.


• 𝑋 is retract rational, if there exists 𝑛 ∈ N, and open sets 𝑈 ⊂ 𝑋, 𝑉 ⊂ P𝑛k ,
together with two maps

𝑓∶ 𝑈 →𝑉, 𝑔 ∶ 𝑉 → 𝑈,

such that 𝑔 ∘ 𝑓 = id𝑈 .


• 𝑋 is unirational, if there exists a dominant rational map

P𝑛k ⇢ 𝑋.

Equivalently, there exists a map k(𝑋) → k(𝑥1 , … , 𝑥𝑛 ) of k-algebras.


• 𝑋 is rationally connected, if for every algebraically closed field K containing
k, and any two K-points 𝑥, 𝑦 ∈ 𝑋(K), there exists a rational curve

𝑓 ∶ P1K → 𝑋K

joining them, i.e. we have 𝑓 (0) = 𝑥 and 𝑓 (∞) = 𝑦.


These notions are sorted from strong to weak, i.e. each notion implies the next one. A natural question is whether these implications are strict.
In 1972, Artin and Mumford [AM72] gave an example of a variety that is uni-
rational but not retract rational. More recently, Voisin [Voi15] developed a defor-
mation method which can show that a very general member of a family of varieties
is not retract rational, as long as it contains one particular example that is not retract

rational. This method was then modified by Colliot-Thélène and Pirutka [CTP16]
to show that over C, a very general quartic hypersurface in P4 is not retract rational.
They also developed the specialisation method, with which they can provide more
general examples of smooth varieties that are not retract rational over number fields
and local fields.
In this article, we give an exposition of the methods and results mentioned
above.

2 Criteria for rationality


In this section, we relate rationality with several other invariants of a variety. We
show that retract rationality implies that these invariants are trivial. The results are
summarised in the following diagram, although some assumptions are dropped:
retract rational ⟹ universally CH0 -trivial ⟺ decomposition of the diagonal ⟹ trivial Brauer group.

Chow groups
In this subsection, we recall the definition and some basic properties of the Chow
groups of a variety. A general reference is [Ful98].
In the following, let k be a field, and let 𝑋 be a k-variety.
Definition 2.1. Let 𝑑 be a non-negative integer. The free abelian group

\[
\mathrm{Z}_d(X) = \bigoplus_{Z \subset X} \mathbf{Z} \cdot [Z],
\]

where 𝑍 runs through all 𝑑-dimensional integral closed subvarieties of 𝑋, is called the group of 𝑑-cycles of 𝑋.
In other words, a 𝑑-cycle of 𝑋 is a finite sum ∑ 𝑛𝑖 [𝑍𝑖 ], where each 𝑛𝑖 is an
integer, and each 𝑍𝑖 is a subvariety of 𝑋. For example, a 0-cycle is a linear com-
bination of closed points.
Let 𝑉 ⊂ 𝑋 be a (𝑑 + 1)-dimensional integral closed subvariety, and let 𝑓 ∈
k(𝑉 )× be a rational function on 𝑉 . The principal divisor of 𝑉 corresponding to 𝑓 ,
denoted by div(𝑓 ), can be naturally seen as a 𝑑-cycle of 𝑋. This defines a map of
abelian groups
div ∶ k(𝑉 )× → Z𝑑 (𝑋).
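Definition 2.2. The 𝑑-th Chow group of 𝑋 is defined by

\[
\mathrm{CH}_d(X) = \operatorname{coker}\Bigl( \bigoplus_{V \subset X} \mathbf{k}(V)^{\times} \to \mathrm{Z}_d(X) \Bigr),
\]

where 𝑉 runs through all (𝑑 + 1)-dimensional integral closed subvarieties of 𝑋.
If 𝑋 has dimension 𝑛, we write

\[
\mathrm{Z}^d(X) = \mathrm{Z}_{n-d}(X) \quad \text{and} \quad \mathrm{CH}^d(X) = \mathrm{CH}_{n-d}(X).
\]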

An element of the Chow group is thus a class of cycles. We say that the cycles
in the same class are rationally equivalent.
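As a quick illustration of rational equivalence, consider the projective line. This is a standard example rather than one taken from the text: the affine coordinate 𝑥 and the two distinct k-rational points 𝑎, 𝑏 are names chosen only for this sketch.

```latex
\[
  f = \frac{x - a}{x - b} \in \mathbf{k}(\mathbf{P}^1)^{\times},
  \qquad
  \operatorname{div}(f) = [a] - [b] \in \mathrm{Z}_0(\mathbf{P}^1_{\mathbf{k}}),
\]
\[
  \text{so that } [a] = [b] \text{ in } \mathrm{CH}_0(\mathbf{P}^1_{\mathbf{k}}).
\]
```

In other words, any two rational points of the projective line are rationally equivalent, since their difference is the divisor of a single rational function.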
Here we explain four operations on Chow groups: proper pushforward, flat pullback, the intersection product, and the Gysin map.

Definition 2.3. Let 𝑓 ∶ 𝑋 → 𝑌 be a proper map of k-varieties, and let 𝑍 ⊂ 𝑋 be an integral closed subvariety of dimension 𝑑. We define

\[
f_*[Z] =
\begin{cases}
0, & \text{if } \dim f(Z) < d, \\
[\mathbf{k}(Z) : \mathbf{k}(f(Z))] \, [f(Z)], & \text{if } \dim f(Z) = d,
\end{cases}
\]

as a 𝑑-cycle of 𝑌 , where

• 𝑓 (𝑍) is an irreducible closed subset of 𝑌 , which we see as an integral closed subscheme.

• The field extension k(𝑍)∕k(𝑓 (𝑍)) is finite because it is finitely generated and of transcendence degree 0.

This extends linearly to a pushforward map

𝑓∗ ∶ Z𝑑 (𝑋) → Z𝑑 (𝑌 ).

It turns out that proper pushforward preserves rational equivalence [Ful98,


§1.4]. We thus obtain a pushforward map of Chow groups

𝑓∗ ∶ CH𝑑 (𝑋) → CH𝑑 (𝑌 ).
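To see the degree factor in the pushforward at work, here is a small worked example. It is a sketch under assumptions that are not in the text: the base field is taken to be Q, the map is the squaring map on the projective line, and 𝑡, 𝑢 denote affine coordinates on the source and target.

```latex
\[
  f \colon \mathbf{P}^1_{\mathbf{Q}} \to \mathbf{P}^1_{\mathbf{Q}}, \quad t \mapsto u = t^2,
  \qquad
  Z = \{ t^2 - 2 = 0 \}, \quad f(Z) = \{ u = 2 \},
\]
\[
  \kappa(Z) = \mathbf{Q}(\sqrt{2}), \quad \kappa(f(Z)) = \mathbf{Q},
  \qquad
  f_*[Z] = [\mathbf{Q}(\sqrt{2}) : \mathbf{Q}] \, [f(Z)] = 2 \, [f(Z)].
\]
```

Both [𝑍] and 2[𝑓 (𝑍)] have degree 2 over Q, which is consistent with the fact that proper pushforward preserves the degree of 0-cycles.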

Next, we introduce the flat pullback of Chow groups.

Definition 2.4. Let 𝑓 ∶ 𝑋 → 𝑌 be a map of k-varieties, which is flat of relative dimension 𝑟. Let 𝑍 ⊂ 𝑌 be an integral closed subvariety of dimension 𝑑. We define
𝑓 ∗ [𝑍] = [𝑓 −1 (𝑍)]

as a (𝑑 + 𝑟)-cycle of 𝑋, where

• 𝑓 −1 (𝑍) is the scheme-theoretic inverse image, i.e., 𝑍 ×𝑌 𝑋.

• [𝑓 −1 (𝑍)] denotes the sum ∑ 𝑚𝑖 [𝑍𝑖 ], where the 𝑍𝑖 are the irreducible com-
ponents of 𝑓 −1 (𝑍), and 𝑚𝑖 is the geometric multiplicity of 𝑍𝑖 in 𝑓 −1 (𝑍),
defined as the length of the local ring 𝒪𝑓 −1 (𝑍),𝑍𝑖 .

This extends linearly to a pullback map

𝑓 ∗ ∶ Z𝑑 (𝑌 ) → Z𝑑+𝑟 (𝑋).
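The role of the multiplicities in the flat pullback can be seen in the simplest ramified example. The following is an illustrative sketch, not taken from the text: the squaring map on the affine line is assumed as the flat map (of relative dimension 0), with 𝑡 and 𝑢 the coordinates on source and target.

```latex
\[
  f \colon \mathbf{A}^1_{\mathbf{k}} \to \mathbf{A}^1_{\mathbf{k}}, \quad t \mapsto u = t^2,
  \qquad
  Z = \{ u = 0 \},
\]
\[
  f^{-1}(Z) = \operatorname{Spec} \mathbf{k}[t]/(t^2),
  \qquad
  f^*[Z] = \operatorname{length}\bigl( \mathbf{k}[t]/(t^2) \bigr) \, [\{ t = 0 \}] = 2 \, [\{ t = 0 \}].
\]
```

The scheme-theoretic inverse image is non-reduced here, and the length of its local ring records that the origin is counted twice.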

It turns out again that flat pullback preserves rational equivalence [Ful98, §1.7]. Switching to the cohomological indexing notation, we get a pullback map of Chow groups

\[
f^* \colon \mathrm{CH}^d(Y) \to \mathrm{CH}^d(X).
\]

Now, we introduce the intersection product of cycles.


Let 𝑍1 , 𝑍2 ⊂ 𝑋 be two integral closed subvarieties. We can form their
“scheme-theoretic intersection”

𝑍1 ∩ 𝑍2 = 𝑍1 ×𝑋 𝑍2 .

However, this does not always produce cycles of the expected dimension, as the
subvarieties may not be in a general position to intersect. To work around this
difficulty, we use the Gysin map of the diagonal map, which acts as a pullback
along a closed embedding.
The Gysin map is defined for vector bundles as follows.
Theorem 2.5. Let 𝑝 ∶ 𝐸 → 𝑋 be a vector bundle of rank 𝑟. Then the flat pullback

𝑝∗ ∶ CH𝑑 (𝑋) → CH𝑑+𝑟 (𝐸)

is an isomorphism for all 𝑑. Its inverse is called the Gysin map, and denoted by 𝑖! ,
where 𝑖 is the zero section map 𝑋 → 𝐸.
See [Ful98, §3.3].
Recall that a regular embedding is a closed embedding of schemes, such that
the ideal sheaf is locally generated by regular sequences. For example, a closed
embedding of smooth varieties is always a regular embedding.
For a regular embedding, the normal cone is a vector bundle. We can use this
property to extend the definition of the Gysin map to this case.
Definition 2.6. Let 𝑖 ∶ 𝑍 → 𝑋 be a regular embedding of constant codimension 𝑒.
The Gysin map
𝑖! ∶ CH𝑑 (𝑋) → CH𝑑−𝑒 (𝑍)

is defined as follows. Let 𝑁𝑍 𝑋 denote the normal bundle of 𝑍 in 𝑋, and let

𝜎 ∶ Z𝑑 (𝑋) → Z𝑑 (𝑁𝑍 𝑋)

be the map given by


[𝑉 ] ↦ [𝑁𝑍∩𝑉 𝑉 ].

This map respects rational equivalence [Ful98, §5.2], inducing a map

𝜎 ∶ CH𝑑 (𝑋) → CH𝑑 (𝑁𝑍 𝑋)

We then compose this map with the Gysin map defined in Theorem 2.5, giving the
desired map
𝑖! ∶ CH𝑑 (𝑋) → CH𝑑−𝑒 (𝑍).

Using the Gysin map as a pullback along the diagonal map, we can define the
intersection product of cycles.

Definition 2.7. Let 𝑋 be an 𝑛-dimensional projective variety. The intersection


product is a map of graded abelian groups

⋅ ∶ CH• (𝑋) ⊗ CH• (𝑋) → CH• (𝑋),

defined as follows.

• If 𝑋 is smooth, we define this map by the composition

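\[
\mathrm{CH}_{d_1}(X) \otimes \mathrm{CH}_{d_2}(X) \xrightarrow{\;\times\;} \mathrm{CH}_{d_1 + d_2}(X \times X) \xrightarrow{\;\Delta^!\;} \mathrm{CH}_{d_1 + d_2 - n}(X),
\]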

where × denotes the cross product map sending [𝑍1 ] ⊗ [𝑍2 ] to [𝑍1 × 𝑍2 ],
and Δ ∶ 𝑋 → 𝑋 × 𝑋 is the diagonal map, which is a regular embedding.

• If 𝑋 is arbitrary, we can always embed 𝑋 in some P𝑁 , so that we can regard


cycles of 𝑋 as cycles of P𝑁 , and intersect them in P𝑁 .

This product equips the Chow groups with the structure of a graded ring, called
the Chow ring.

Rationality and zero‐cycles


Definition 2.8. We say that a map 𝑓 ∶ 𝑋 → 𝑌 of k-varieties is universally CH0 -
trivial, if

• 𝑓 is proper.

• For any field extension 𝐹 ∕k, the pushforward map

𝑓∗ ∶ CH0 (𝑋𝐹 ) → CH0 (𝑌𝐹 )

is an isomorphism.

If 𝑌 = Spec k, then we say 𝑋 is universally CH0 -trivial. This means that

• 𝑋 is complete.

• For any field extension 𝐹 ∕k, the degree map

deg𝐹 ∶ CH0 (𝑋𝐹 ) → Z

is an isomorphism.

We will show that retract rationality implies universal CH0 -triviality. The proof needs a few lemmas. First of all, we prove a moving lemma for 0-cycles.

Lemma 2.9. Let 𝑋 be a smooth projective k-variety, with k infinite and perfect,
and let 𝑈 ⊂ 𝑋 be a dense open set. Then every 0-cycle of 𝑋 is rationally equivalent
to one supported in 𝑈 .

Proof. We follow [CT05, Complément]. Write 𝑍 = 𝑋 ⧵ 𝑈 , and let 𝑝 ∈ 𝑍 be a


closed point. It suffices to show that the 0-cycle [𝑝] is equivalent to one supported
in 𝑈 .
Let 𝑔 ∈ 𝒪𝑋,𝑝 be a locally defined non-zero function that vanishes on 𝑍. Since
𝑋 is smooth, we can find a regular sequence 𝑓1 , … , 𝑓𝑛−1 of 𝒪𝑋,𝑝 , where 𝑛 = dim 𝑋,
such that 𝑔 ≠ 0 in the quotient 𝒪𝑋,𝑝 ∕(𝑓1 , … , 𝑓𝑛−1 ). This can be done by working
in affine coordinates and taking the 𝑓𝑖 to be linear functions.
The ideal (𝑓1 , … , 𝑓𝑛−1 ) defines a curve in a neighbourhood of 𝑝. Taking its
closure in 𝑋, we obtain a closed integral curve 𝐶 in 𝑋, which is regular at 𝑝 and is
not contained in 𝑍.
Let 𝑓 ∶ 𝐷 → 𝐶 be the normalisation of 𝐶. Then 𝐷 is regular (normality im-
plies regularity in codimension one), and hence smooth since k is perfect. Also,
𝐷 is quasi-projective [EGA-II, Corollary 7.4.10], and 𝑓 is finite [EGA-II, Corol-
lary 7.4.6], and hence proper [EGA-II, Corollary 6.1.11].
Let 𝑞 be the inverse image of 𝑝, which is a single point as 𝐶 is regular at 𝑝. Let
𝑊 be a neighbourhood of 𝑞 in 𝐷, such that 𝑓 |𝑊 is an isomorphism. For example,
we can take 𝑊 to be the inverse image of the smooth locus of 𝐶.
The 0-cycle [𝑞] is equivalent to a 0-cycle supported in 𝑊 ⧵ 𝑓 −1 (𝑍). Indeed,
we have to find a rational function on 𝐷 that has a simple zero at 𝑞, and is defined
on the finite set (𝐷 ⧵ 𝑊 ) ∪ 𝑓 −1 (𝑍). As 𝐷 is quasi-projective, this can be done by
taking a suitable linear function on the projective space.
Finally, we consider the pushforward along the proper map 𝐷 → 𝐶 → 𝑋.
Since it preserves rational equivalence, we are done. ◻

Remark 2.10. This is a special case of [EGA-II, Proposition 7.4.9], but the proof
given here is more elementary.

Lemma 2.11. Let k be a finite field, and let 𝑈 ⊂ P𝑛k be a non-empty open set.
Then for any extension 𝐹 ∕k of sufficiently large degree 𝑑, the open set 𝑈𝐹 ⊂ P𝑛𝐹
contains an 𝐹 -rational point.

Proof. Let 𝑞 be the cardinality of k. Let 𝑓 be a non-zero homogeneous polynomial


over k, which vanishes outside 𝑈 .
In P𝑛𝐹 , when $q^d > \deg f$, there are at least $(q^d - \deg f)^n$ rational points where 𝑓 does not vanish. Indeed, by induction on 𝑛, one easily shows that a non-zero polynomial of degree 𝑟 on A𝑛𝐹 is non-zero at no fewer than $(q^d - r)^n$ rational points, provided that $q^d > r$.
Hence 𝑈𝐹 contains a rational point whenever $q^d > \deg f$. ◻
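The counting bound used in this proof can be checked numerically in small cases. The sketch below is only an illustration under simplifying assumptions: it works over a prime field F_p (so that arithmetic is plain reduction mod p, rather than over a general extension of degree 𝑑), and the sample polynomial and the helper names `count_nonvanishing` and `total_degree` are hypothetical choices made here.

```python
from itertools import product

def count_nonvanishing(poly, p, n):
    """Count points of (F_p)^n at which `poly` is non-zero.

    `poly` maps exponent tuples of length n to integer coefficients.
    """
    count = 0
    for point in product(range(p), repeat=n):
        value = 0
        for exponents, coeff in poly.items():
            term = coeff % p
            for x, e in zip(point, exponents):
                term = term * pow(x, e, p) % p
            value = (value + term) % p
        if value != 0:
            count += 1
    return count

def total_degree(poly):
    """Total degree of a polynomial given by its exponent tuples."""
    return max(sum(exponents) for exponents in poly)

# Hypothetical sample polynomial x^2*y + 3*y^2 + 1 in n = 2 variables over F_7.
poly = {(2, 1): 1, (0, 2): 3, (0, 0): 1}
p, n = 7, 2
r = total_degree(poly)
nonzero_points = count_nonvanishing(poly, p, n)
bound = (p - r) ** n  # the lower bound (q^d - r)^n with q^d replaced by p
```

For this sample the degree is 3 and the bound is 16, and the brute-force count `nonzero_points` can be compared against it directly; the lemma guarantees that the count is at least the bound whenever p exceeds the degree.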

Theorem 2.12 (Colliot-Thélène and Pirutka). Let 𝑋 be a smooth projective k-


variety. If 𝑋 is retract rational, then 𝑋 is universally CH0 -trivial.

Proof. First, we suppose that the base field k is infinite. Since retract rationality is
preserved under change of base field, it suffices to prove that 𝑋 is CH0 -trivial over
k, i.e. degk ∶ CH0 (𝑋) → Z is an isomorphism.
By definition, there exist non-empty open sets 𝑈 ⊂ 𝑋, 𝑉 ⊂ P𝑛k , and maps

\[
U \xrightarrow{\;f\;} V \xrightarrow{\;g\;} U,
\]

whose composition is id𝑈 .


Let 𝑃 ∈ 𝑈 be a closed point, and write 𝑄 = 𝑓 (𝑃 ) ∈ 𝑉 . Then we have
induced maps of residue fields 𝜅(𝑃 ) → 𝜅(𝑄) → 𝜅(𝑃 ), whose composition is
id𝜅(𝑃 ) . Therefore, we have
𝜅(𝑃 ) ≃ 𝜅(𝑄).

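Let 𝐹 denote this field. We then consider the diagram

\[
\begin{array}{ccccccc}
\mathbf{P}^n_F & \supset & V_F & \xrightarrow{\;g_F\;} & U_F & \subset & X_F \\
 & & \downarrow{\scriptstyle p} & & \downarrow & & \\
\mathbf{P}^n_{\mathbf{k}} & \supset & V & \xrightarrow{\;g\;} & U & \subset & X.
\end{array}
\]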

There exists 𝑅 ∈ 𝑝−1 (𝑄) such that 𝜅(𝑅) ≃ 𝐹 . This is because 𝑝−1 (𝑄) ≃ Spec(𝐹 ⊗k
𝐹 ), and we can take 𝑅 to be the point defined by the maximal ideal which is the
kernel of the multiplication map 𝐹 ⊗k 𝐹 → 𝐹 .
Let 𝐴 ∈ 𝑉 ⊂ P𝑛k be a k-rational point, which exists since k is infinite. Let
𝐿 ≃ P1𝐹 ⊂ P𝑛𝐹 be a line connecting 𝑅 and 𝐴𝐹 . The map 𝑔𝐹 sends 𝐿 to a rational
line 𝐿′ in 𝑈𝐹 , which is a rational 𝐹 -map P1𝐹 ⇢ 𝑈𝐹 . This map extends to a map

𝐿′ ∶ P1𝐹 → 𝑋𝐹 ,

which is proper since P1𝐹 is complete [EGA-II, Corollary 5.4.3]. This is a line
connecting the points 𝑔𝐹 (𝑅) and 𝑔𝐹 (𝐴𝐹 ).
As 𝑅 is an 𝐹 -rational point, we have 𝜅(𝑔𝐹 (𝑅)) ≃ 𝐹 and 𝜅(𝑔𝐹 (𝐴𝐹 )) ≃ 𝐹 .
Hence, the pushforward map of 𝐿′ on CH0 gives

[𝑔𝐹 (𝑅)] = [𝑔𝐹 (𝐴𝐹 )] ∈ CH0 (𝑋𝐹 ).

Pushing forward to 𝑋, and noticing that 𝜅(𝑔(𝐴)) ≃ k, we thus have

[𝑃 ] = [𝐹 ∶ k] [𝑔(𝐴)] ∈ CH0 (𝑋).

By the moving lemma 2.9, every 0-cycle of $X$ is equivalent to one supported
in $U$, which, as we have shown, is equivalent to a multiple of $[g(A)]$. Since
$\deg_k [g(A)] = 1$, this shows that $X$ is $\mathrm{CH}_0$-trivial over $k$.
Finally, if $k$ is a finite field, by Lemma 2.11, we can apply the above argument
to an extension $F/k$ of sufficiently large degree $d$. Since the composition
\[ \mathrm{CH}_0(X) \xrightarrow{\ p^*\ } \mathrm{CH}_0(X_F) \xrightarrow{\ p_*\ } \mathrm{CH}_0(X) \]
is multiplication by $d$, where $p : X_F \to X$ is the projection, it follows that every
0-cycle of $X$ of degree 0 is $d$-torsion in $\mathrm{CH}_0(X)$. Hence it must be zero, since $d$
can be chosen to be two coprime values. Moreover, this also shows that $X$ has two
0-cycles of coprime degrees, and hence, the degree map $\deg : \mathrm{CH}_0(X) \to \mathbb{Z}$ is
surjective. ◻
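
The last two steps of the proof can be made explicit as follows. This is a sketch under stated assumptions: $d_1, d_2$ denote two coprime degrees of extensions $F_1, F_2$ to which the argument applies, and $x_1, x_2$ denote 0-cycles of $X$ of degrees $d_1, d_2$ (for instance pushforwards of rational points of $X_{F_1}$, $X_{F_2}$). These names are introduced here and do not appear in the source.

```latex
% Assumption: gcd(d_1, d_2) = 1, and z in CH_0(X) has degree 0,
% so that d_1 z = 0 and d_2 z = 0 by the argument above.
\begin{align*}
a d_1 + b d_2 &= 1 \qquad \text{for some } a, b \in \mathbb{Z}
   \quad (\text{B\'ezout}), \\
z &= (a d_1 + b d_2)\, z = a\,(d_1 z) + b\,(d_2 z) = 0, \\
\deg(a x_1 + b x_2) &= a d_1 + b d_2 = 1 .
\end{align*}
```

The first two lines show that the degree map is injective, and the last line shows that it is surjective.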

We also mention the following criterion for universal $\mathrm{CH}_0$-triviality of a
morphism, which will be useful later.

Theorem 2.13 (Colliot-Thélène and Pirutka). Let $f : \tilde{X} \to X$ be a proper
morphism of $k$-varieties. Suppose that

• For every point $M \in X$, not necessarily closed, the fibre $\tilde{X}_M$ is universally
$\mathrm{CH}_0$-trivial as a $\kappa(M)$-variety.

Then $f$ is universally $\mathrm{CH}_0$-trivial.

Proof. It suffices to show that $f_* : \mathrm{CH}_0(\tilde{X}) \to \mathrm{CH}_0(X)$ is an isomorphism.
By assumption, this map $f_*$ is surjective. Let $x$ be a 0-cycle of $\tilde{X}$, such that
$f_*(x)$ is equivalent to zero. We need to show that $x$ is equivalent to zero.
In this case, there exist finitely many integral curves $C_i \subset X$, and functions
$g_i \in k(C_i)$, such that
\[ f_*(x) = \sum_i \operatorname{div}_{C_i}(g_i). \]

Let $\eta_i$ be the generic point of $C_i$. By hypothesis, each fibre $\tilde{X}_{\eta_i}$ contains a 0-cycle
\[ \sum_j n_{ij} [D_{ij}] \]
of degree 1, where $n_{ij} \in \mathbb{Z}$. We regard each $D_{ij}$ as a curve in $\tilde{X}$. Then each
function $g_i \circ f$ defines a rational function $g_{ij}$ on $D_{ij}$. Write
\[ x' = x - \sum_{i,j} n_{ij} \operatorname{div}_{D_{ij}}(g_{ij}). \]
Then we have an equality of cycles $f_*(x') = 0$.


Let us write $x' = \sum_i x'_{Q_i}$, where $Q_i \in X$ are distinct points, and $x'_{Q_i}$ is a
0-cycle of $\tilde{X}$ supported in the fibre $\tilde{X}_{Q_i}$. The fact that $f_*(x') = 0$ implies that each
$x'_{Q_i}$ has degree 0. It follows from the hypothesis applied to $\tilde{X}_{Q_i}$ that $x'_{Q_i}$ is rationally
equivalent to zero. Therefore, $x'$ is equivalent to zero, and so is $x$. ◻

Decomposition of the diagonal


We now give an equivalent characterisation of universal CH0 -triviality. We show
that it is equivalent to the existence of a decomposition of the diagonal class in the
Chow group of 𝑋 × 𝑋.

Definition 2.14. Let $X$ be a complete $k$-variety of dimension $n$. A decomposition
of the diagonal of $X$ is given by an equation

[Δ𝑋 ] = 𝐷 + [𝑋] × 𝑥0 in CH𝑛 (𝑋 × 𝑋),

where
• [Δ𝑋 ] is the pushforward of [𝑋] ∈ CH𝑛 (𝑋) along the diagonal map 𝑋 →
𝑋 × 𝑋.
• 𝐷 is an 𝑛-cycle of 𝑋 × 𝑋, supported in 𝑍 × 𝑋 for some closed subvariety
𝑍 ⊂ 𝑋 of codimension at least 1.
• 𝑥0 is a 0-cycle of 𝑋 of degree 1.
In order to show that this property is equivalent to CH0 -triviality, we introduce
the notion of a correspondence.
Definition 2.15. Let 𝑋 and 𝑌 be complete k-varieties, of dimensions 𝑚 and 𝑛,
respectively. A correspondence from 𝑋 to 𝑌 is an element of the set

Corr(𝑋, 𝑌 ) = CH𝑚 (𝑋 × 𝑌 ).

We view a correspondence as a generalised version of a graph of a map from $X$
to $Y$. In this way, we can compose correspondences as if we are composing graphs
of maps. Namely, for $f \in \mathrm{Corr}(X, Y)$ and $g \in \mathrm{Corr}(Y, Z)$, we define

𝑔 ∘ 𝑓 = 𝑝∗ (([𝑋] × 𝑔) ⋅ (𝑓 × [𝑍])) ∈ Corr(𝑋, 𝑍),

where 𝑝 ∶ 𝑋 × 𝑌 × 𝑍 → 𝑋 × 𝑍 is the projection map.


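Moreover, we have a group homomorphism
\[
\mathrm{Corr}(X, Y) \otimes_{\mathbb{Z}} \mathrm{CH}_\bullet(X) \to \mathrm{CH}_\bullet(Y), \qquad
(f, \alpha) \mapsto p_*(f \cdot (\alpha \times [Y])),
\tag{2.15.1}
\]
where $p : X \times Y \to Y$ is the projection map. In particular, this induces an action
of $\mathrm{Corr}(X, X)$ on $\mathrm{CH}_\bullet(X)$.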
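
To illustrate the action (2.15.1), the graph of a map acts as the usual pushforward. The following is a sketch under assumptions not stated in the source: $X$ and $Y$ are smooth and complete (so that the intersection product is defined), $\varphi : X \to Y$ is a $k$-map, and $\Gamma_\varphi \subset X \times Y$ denotes its graph, with graph embedding $\gamma : X \to X \times Y$.

```latex
% Assumptions: X, Y smooth complete, \varphi : X -> Y a k-map,
% \gamma = (id, \varphi) : X -> X x Y, so that \gamma_*[X] = [\Gamma_\varphi].
\begin{align*}
[\Gamma_\varphi] &\in \mathrm{CH}_m(X \times Y) = \mathrm{Corr}(X, Y),
   \qquad m = \dim X, \\
[\Gamma_\varphi] \cdot (\alpha \times [Y])
   &= \gamma_*(\alpha)
   \qquad (\text{projection formula}), \\
[\Gamma_\varphi]_*(\alpha)
   &= p_*\bigl([\Gamma_\varphi] \cdot (\alpha \times [Y])\bigr)
    = p_* \gamma_*(\alpha) = \varphi_*(\alpha), \\
[\Delta_X]_*(\alpha) &= \alpha
   \qquad (\text{the case } \varphi = \mathrm{id}_X).
\end{align*}
```

In particular, the diagonal class acts as the identity, which is the fact used later in the proof of Theorem 2.18.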
Proposition 2.16. Complete k-varieties and correspondences form a category,
which admits a functor from the category of complete k-varieties and k-maps. The
functor CH• factors through this functor. ◻
Before proving the main theorem, we need a lemma.
Lemma 2.17. Let 𝑋 be an integral k-variety, and let 𝜂 be its generic point. Con-
sider the map

𝜂 × id𝑋 ∶ Spec(k(𝑋)) × 𝑋 ≃ 𝑋k(𝑋) → 𝑋 × 𝑋.

The pullback of the diagonal class is the class of the generic point of 𝑋, which is a
0-cycle of 𝑋k(𝑋) of degree 1.

Proof. We may assume 𝑋 = Spec 𝐴 is affine. Then the pullback of the diagonal
class is the closed point defined by the maximal ideal, which is the kernel of the
multiplication map
k(𝑋) ⊗ 𝐴 → k(𝑋).

The residue field is thus k(𝑋). ◻


Theorem 2.18 (Colliot-Thélène and Pirutka). Let 𝑋 be a smooth, integral, com-
plete k-variety. Then the following are equivalent.
(i) 𝑋 is universally CH0 -trivial.
(ii) 𝑋 has a 0-cycle of degree 1, and the degree map deg ∶ CH0 (𝑋k(𝑋) ) → Z is
an isomorphism.
(iii) 𝑋 admits a decomposition of the diagonal.
Proof. The implication (i) ⇒ (ii) follows from the definition of universal CH0 -
triviality.
Assume (ii). Let 𝛼 be a 0-cycle of 𝑋 of degree 1, and let 𝛽 ∈ CH0 (𝑋k(𝑋) ) be
the class of the generic point of 𝑋. Then by hypothesis, we have

𝛼k(𝑋) = 𝛽 in CH0 (𝑋k(𝑋) ).

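By [Blo10, Lemma 1A.1], we have
\[ \mathrm{CH}_n(X_{k(X)}) \simeq \operatorname*{colim}_{U \subset X \text{ open}} \mathrm{CH}_n(U \times X). \]
By Lemma 2.17, the map $\mathrm{CH}_n(X \times X) \to \mathrm{CH}_n(\operatorname{Spec}(k(X)) \times X) \simeq \mathrm{CH}_0(X_{k(X)})$
maps the diagonal class $[\Delta_X]$ to $\beta$. Therefore, there exists a non-empty open set
$U \subset X$, such that
\[ [U] \times \alpha = [\Delta_U] \quad \text{in } \mathrm{CH}_n(U \times X). \]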

Let 𝑍 = 𝑋 ⧵ 𝑈 . By [Ful98, §1.8], we have an exact sequence

CH𝑛 (𝑍 × 𝑋) → CH𝑛 (𝑋 × 𝑋) → CH𝑛 (𝑈 × 𝑋) → 0,

which implies that there exists 𝐷 ∈ CH𝑛 (𝑍 × 𝑋), such that

𝐷 = [𝑋] × 𝛼 − [Δ𝑋 ] in CH𝑛 (𝑋 × 𝑋).

This proves (iii).


Now assume (iii), and let

[Δ𝑋 ] = 𝐷 + [𝑋] × 𝑥0 in CH𝑛 (𝑋 × 𝑋)

be a decomposition, with $D$ supported in $Z \times X$. Let $F$ be an extension of $k$.
Consider the action of $\mathrm{Corr}(X, X)$ on $\mathrm{CH}_0(X_F)$, as defined in (2.15.1). Since
$[\Delta_X]$ acts as the identity, so does $D + [X] \times x_0$.

On the other hand, by the moving lemma 2.9, every 0-cycle of 𝑋 can be moved
out of 𝑍. This shows that the action of 𝐷 is 0, as in the defining equation (2.15.1),
we are taking the intersection product of two disjoint cycles. But for any 0-cycle 𝛾
of 𝑋, the action of [𝑋] × 𝑥0 sends it to

𝑝∗ (([𝑋] × 𝑥0 ) ⋅ (𝛾 × [𝑋])) = 𝑝∗ (𝛾 × 𝑥0 ) = (deg 𝛾) 𝑥0 ,

where $p : X \times X \to X$ is the second projection. This implies that $\mathrm{CH}_0(X_F)$ is
generated by $x_0$, which has degree 1. This proves (i). ◻
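
As a sanity check of condition (iii), the projective line admits an explicit decomposition of the diagonal. The example below is not from the source; it assumes $X = \mathbb{P}^1_k$ and a chosen $k$-rational point $x_0 \in \mathbb{P}^1$.

```latex
% Example (assumption: X = P^1_k, x_0 a k-rational point).
% CH_1(P^1 x P^1) is free on the two ruling classes,
% and the diagonal has bidegree (1, 1):
\[
[\Delta_{\mathbb{P}^1}]
  = [\{x_0\} \times \mathbb{P}^1] + [\mathbb{P}^1 \times \{x_0\}]
  \quad \text{in } \mathrm{CH}_1(\mathbb{P}^1 \times \mathbb{P}^1).
\]
% Here D = [\{x_0\} \times P^1] is supported in Z x X with Z = \{x_0\},
% and [X] x x_0 = [P^1 \times \{x_0\}], with x_0 a 0-cycle of degree 1.
```

This matches the shape $[\Delta_X] = D + [X] \times x_0$ of Definition 2.14, as expected for a rational variety.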

The Brauer group


In this subsection, we show that the existence of a decomposition of the diagonal
implies the triviality of the Brauer group.
Definition 2.19. The (cohomological) Brauer group of a scheme 𝑋 is the étale
cohomology group
Br(𝑋) = 𝐻 2 (𝑋, Gm ).

For a field k, the Brauer group Br(k) = Br(Spec k) coincides with the classical
notion defined as the group of equivalence classes of central simple algebras. A
classical reference is [Gro68].
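The Kummer exact sequence of étale sheaves
\[ 0 \to \mu_n \to \mathbb{G}_m \xrightarrow{\ (-)^n\ } \mathbb{G}_m \to 0 \]
induces a long exact sequence
\[ \cdots \to \mathrm{Pic}(X) \to H^2(X, \mu_n) \to \mathrm{Br}(X) \xrightarrow{\ \cdot n\ } \mathrm{Br}(X) \to \cdots . \]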

When 𝑋 = Spec 𝑅, where 𝑅 is a local ring, we have Pic(𝑋) = 0, so that

Br(𝑋) [𝑛] ≃ 𝐻 2 (𝑋, μ𝑛 ),

where the left hand side denotes the 𝑛-torsion subgroup of Br(𝑋).
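
The identification of the $n$-torsion can be read off from the long exact sequence as follows. This is a sketch, assuming that $n$ is invertible on $X$ (so that the Kummer sequence is exact) and using $\mathrm{Pic}(X) = H^1(X, \mathbb{G}_m)$.

```latex
% Assumption: n is invertible on X.
\begin{align*}
&\mathrm{Pic}(X) \xrightarrow{\ \cdot n\ } \mathrm{Pic}(X)
   \longrightarrow H^2(X, \mu_n)
   \longrightarrow \mathrm{Br}(X) \xrightarrow{\ \cdot n\ } \mathrm{Br}(X), \\
&0 \longrightarrow \mathrm{Pic}(X)/n
   \longrightarrow H^2(X, \mu_n)
   \longrightarrow \mathrm{Br}(X)[n] \longrightarrow 0, \\
&\mathrm{Pic}(X) = 0 \ \Longrightarrow\
   H^2(X, \mu_n) \simeq \mathrm{Br}(X)[n].
\end{align*}
```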
Definition 2.20. Let $M$ be a contravariant functor from schemes to abelian groups
(e.g. étale cohomology), and let $k \subset K$ be two fields. Let
\[
M_{\mathrm{nr}}(K/k) = \bigcap_{k \subset A \subset K} \operatorname{image}\bigl(M(\operatorname{Spec} A) \to M(\operatorname{Spec} K)\bigr),
\]
where $A$ runs through all discrete valuation rings with fraction field $K$.
This is called the unramified version of the functor $M$. For example, one has the
unramified Brauer group $\mathrm{Br}_{\mathrm{nr}}(K/k)$, and unramified cohomology $H^q_{\mathrm{nr}}(K/k, \mu_n)$.
A deep result on the cohomological purity of the Brauer group gives rise to the
following theorem.

Theorem 2.21. Let 𝑋 be a regular, complete, integral k-variety. Then the natural
map Br(𝑋) → Br(k(𝑋)) induces an isomorphism
Br(𝑋) ≃ Brnr (k(𝑋)∕k).
See [CTS19, Proposition 5.2.2].
The discussion above immediately implies the following.
Corollary 2.22. Let $X$ be a regular, proper, integral $k$-variety. Then
\[ \mathrm{Br}(X)[n] \simeq H^2_{\mathrm{nr}}(k(X)/k, \mu_n). \] ◻
Since the Brauer group is a torsion group [CTS19, Proposition 1.3.6], it can
thus be computed by the unramified cohomology groups.
On the other hand, the second cohomology of $\mu_n$ is a part of the cycle module
(in the sense of [Ros96])
\[ \bigoplus_i H^i(-, \mu_n^{\otimes (i-1)}), \]
whose precise definition we do not give here. This allows it to be regarded as a
“coefficient group” for Chow groups. As a result, there is an action of correspondences
\[ \mathrm{Corr}(X, Y) \otimes H^2_{\mathrm{nr}}(k(X)/k, \mu_n) \longrightarrow H^2_{\mathrm{nr}}(k(Y)/k, \mu_n) \]
for k-varieties 𝑋, 𝑌 . This action will relate the Brauer group with the decomposi-
tion of the diagonal.
Theorem 2.23. Let 𝑋 be a smooth projective k-variety. If 𝑋 admits a decomposi-
tion of the diagonal, then the natural map induces an isomorphism
Br(k) ⥲ Br(𝑋).
In particular, if k is separably closed, then Br(𝑋) = 0.
Sketch of proof. The decomposition of the diagonal implies that the identity map
and the constant map (to be precise, a sum of constant maps) induce the same action
on
\[ H^2_{\mathrm{nr}}(k(X)/k, \mu_n). \]
But the action of a constant map factors through $H^2_{\mathrm{nr}}(k/k, \mu_n)$, via the corestriction
map, whence the result follows. ◻

3 The deformation method


In this section, we describe the deformation method which produces irrational va-
rieties. We show that if we have a good family of varieties, and if one of them does
not have a decomposition of the diagonal, then a very general one in the family will
not have a decomposition, and hence, will not be retract rational.
This method was developed by C. Voisin [Voi15], and modified by J.-L. Colliot-Thélène
and A. Pirutka [CTP16], to show that a very general quartic threefold in
$\mathbb{P}^4_{\mathbb{C}}$ is not retract rational. We will present a proof of this result.

Families of cycles
Definition 3.1 (Kollár [Kol96, Definition 3.10]). Suppose that
• k is an algebraically closed field of characteristic 0.
• 𝑆 is a k-scheme.
• 𝑋∕𝑆 is a projective 𝑆-scheme, with a chosen relatively ample line bundle.
• 𝐵∕𝑆 is a reduced normal 𝑆-scheme.
• 𝑑 and 𝑑 ′ are non-negative integers.
A well-defined family of 𝑑-cycles of 𝑋 of degree 𝑑 ′ parametrised by 𝐵 is a cycle

𝐶 = ∑𝑖 𝑚𝑖 [𝐶𝑖 ] of 𝑋 ×𝑆 𝐵,

such that
• Each 𝐶𝑖 is an integral closed subscheme of 𝑋 ×𝑆 𝐵.
• For each 𝑖, the image of the projection map 𝑔𝑖 ∶ 𝐶𝑖 → 𝐵 is an irreducible
component of 𝐵. In particular, 𝑔𝑖 is flat over a dense open subset of 𝐵.
• Each fibre of 𝑔𝑖 defines a 𝑑-cycle of 𝑋 of degree 𝑑 ′ . This means that the
fibre is either empty or of dimension 𝑑, and that 𝑔𝑖 is flat over a dense open
subset of 𝐵.
The deep theorem below shows the existence of a universal family of cycles, in
that every family of cycles is realised as its pullback.
Theorem 3.2. Under the assumptions of Definition 3.1, for an $S$-scheme $Z$, define
\[
\mathrm{Chow}^{d,d'}_{X/S}(Z) = \left\{
\begin{array}{c}
\text{well-defined families of non-negative $d$-cycles} \\
\text{of $X$ of degree $d'$ parametrised by $Z$}
\end{array}
\right\}.
\]
Then

• $\mathrm{Chow}^{d,d'}_{X/S}$ is a contravariant functor from the semi-normal $S$-schemes to sets.

• Moreover, this functor is represented by a projective semi-normal $S$-scheme
$\mathrm{Chow}^{d,d'}_{X/S}$, called the Chow scheme, so that there exists a universal
well-defined family of non-negative $d$-cycles
\[ \mathrm{Univ}^{d,d'}_{X/S} \ \text{of $X$ parametrised by}\ \mathrm{Chow}^{d,d'}_{X/S}, \]
such that every other family of cycles is its pullback.


See [Kol96, Theorem I.3.21].
We also recall the existence of Hilbert schemes.

Theorem 3.3. Let $S$ be a locally noetherian scheme. Let $X \to S$ be a projective
morphism. For an $S$-scheme $Z$, define
\[
\mathrm{Hilb}^{d,d'}_{X/S}(Z) = \left\{
\begin{array}{c}
\text{closed subschemes of $X \times_S Z$ flat over $Z$} \\
\text{of relative dimension $d$ and relative degree $d'$}
\end{array}
\right\}.
\]
The functor $\mathrm{Hilb}^{d,d'}_{X/S}$ is represented by an $S$-scheme $\mathrm{Hilb}^{d,d'}_{X/S}$, called the Hilbert
scheme, whose irreducible components are projective over $S$. As a result, there
exists a universal family of subschemes
\[ U \subset X \times_S \mathrm{Hilb}^{d,d'}_{X/S}, \]
such that every other family of subschemes is its pullback.

Below, we will write
\[
\mathrm{Chow}_{X/S} = \coprod_{d,d'} \mathrm{Chow}^{d,d'}_{X/S}
\quad \text{and} \quad
\mathrm{Hilb}_{X/S} = \coprod_{d,d'} \mathrm{Hilb}^{d,d'}_{X/S}.
\]

Locus of rational equivalence


Situation 3.4. Suppose

• k is an algebraically closed field of characteristic 0.

• 𝐵 is a smooth k-scheme.

• 𝑋 → 𝐵 is a projective morphism.

Lemma 3.5. In Situation 3.4, for any non-negative integer 𝑑, there exists

• A countable family of normal, irreducible, quasi-projective 𝐵-schemes {𝑇𝑖 }.

• For each index $i$, a family of smooth $(d+1)$-dimensional varieties $W_i \to T_i$,
with two families of divisors $E_{i,1}, E_{i,2} \to T_i$ of $W_i$,

such that

• For any $b \in B$ and any subvariety $V \subset X_b$ of dimension $d+1$, there exists
a desingularisation $\tilde{V}$, such that for any two effective divisors $D_1, D_2$ of $\tilde{V}$,
such that $D_1 - D_2$ is principal, there exists $i$ and $t \in (T_i)_b(k)$, such that the
data $(\tilde{V}, D_1, D_2)$ is identical to $((W_i)_t, (E_{i,1})_t, (E_{i,2})_t)$.

The reason to consider a desingularisation of $V$, instead of $V$ itself, is that on a
smooth variety, a Weil divisor is the same thing as a Cartier divisor, and the Weil
divisor class group is the same as the Picard group. The normality of $T_i$ is required
in order to (later) satisfy the definition of a well-defined family of cycles.

Proof. By [EGA-IV3, Theorem 9.7.7], the set of points in the Hilbert scheme
$\mathrm{Hilb}^{d+1}_{X/B}$ corresponding to the geometrically integral subvarieties is locally
constructible. Let $G$ be an irreducible component of this set, equipped with the
reduced scheme structure. Then $G$ is quasi-projective over $B$, as the components of
$\mathrm{Hilb}^{d+1}_{X/B}$ are projective. Let
\[ W \subset G \times_B X \]
be the universal family of $(d+1)$-dimensional subschemes. The morphism $W \to G$
is thus projective, flat, with geometrically integral fibres.
The generic fibre $W_{k(G)}$ is integral, as its irreducible components correspond
to irreducible components of a general fibre. By Hironaka’s theorem, let
$\tilde{W}_{k(G)} \to W_{k(G)}$ be a desingularisation map. This map extends to a map
\[ \tilde{W}_1 \to W_1 \]
of schemes over an open set $G_1 \subset G$, where $W_1 = W|_{G_1}$. Shrinking $G_1$ if necessary,
we can assume that for any $t \in G_1(k)$, the map $\tilde{W}_{1,t} \to W_t$ of fibres over $t$ is
a desingularisation map.
By noetherian induction, we can find a decomposition
\[ G = \bigcup_{j=1}^{m} G_j, \]
with $G_j$ locally closed in $G$, together with maps $\tilde{W}_j \to W_j$ over $G_j$, where
$W_j = W|_{G_j}$, such that for all $t \in G_j$, the map $\tilde{W}_{j,t} \to W_t$ is a desingularisation map.
Let $\tilde{G}_j \to G_j$ be a desingularisation, and we still denote by $\tilde{W}_j \to \tilde{G}_j$ the
pullback of the family $\tilde{W}_j \to G_j$.
Since $\tilde{W}_j$ is projective and flat over $\tilde{G}_j$, with geometrically integral fibres, there
exist schemes with a morphism
\[ \mathrm{Ab} : \mathrm{Div}_{\tilde{W}_j/\tilde{G}_j} \to \mathrm{Pic}_{\tilde{W}_j/\tilde{G}_j}, \]
where $\mathrm{Div}_{\tilde{W}_j/\tilde{G}_j}$ is the scheme parametrising the effective Cartier divisors [FAG,
Theorem 9.3.7], which is quasi-projective over $\tilde{G}_j$, and hence over $B$, and $\mathrm{Pic}_{\tilde{W}_j/\tilde{G}_j}$
is the Picard scheme [FAG, Theorem 9.4.8]. Let
\[ \Delta_j \subset \mathrm{Div}_{\tilde{W}_j/\tilde{G}_j} \times \mathrm{Div}_{\tilde{W}_j/\tilde{G}_j} \]
be the inverse image of the diagonal of $\mathrm{Pic}_{\tilde{W}_j/\tilde{G}_j} \times \mathrm{Pic}_{\tilde{W}_j/\tilde{G}_j}$, under the map
$\mathrm{Ab} \times \mathrm{Ab}$, equipped with the reduced scheme structure. Let $T$ be one of its irreducible
components, and let $\tilde{T}$ be the normalisation of $T$. Thus $\tilde{T}$ is quasi-projective over
$B$.
The family of all the schemes $\tilde{T}$, together with the two universal families of
divisors given by the Div schemes, satisfies the requirement of the lemma. ◻

Of course, a rational equivalence of two 𝑑-cycles may involve more than one
(𝑑 + 1)-dimensional subvariety. The next lemma deals with this situation.
For simplicity, if 𝑉 ⊂ 𝑋𝑏 is a subvariety, we will say “the desingularisation”
of 𝑉 when we refer to the variety 𝑉̃ given by the previous lemma, and we simply
add a tilde to indicate this desingularisation.

Lemma 3.6. In Situation 3.4, for any non-negative integer 𝑑, there exists

• A countable family of normal irreducible $B$-schemes $\{H_i\}$.

• For each index $i$, an integer $n_i \ge 1$, and $n_i$ triples $(W_{i,j}, E_{i,j,1}, E_{i,j,2})_{j=1}^{n_i}$,
where for each $j$, $W_{i,j} \to H_i$ is a smooth projective family of $(d+1)$-dimensional
varieties, and $E_{i,j,1}, E_{i,j,2} \to H_i$ are two families of divisors
of $W_{i,j}$,

such that

• For any $b \in B(k)$, and any data $(V_j, D_{j,1}, D_{j,2})_{j=1}^{n}$, where each $V_j$ is an
integral subscheme of $X_b$ of dimension $d+1$, and $D_{j,1}, D_{j,2}$ are two effective
Weil divisors on the desingularisation $\tilde{V}_j$ of $V_j$, such that $D_{j,1} - D_{j,2}$ is a
principal divisor on $V_j$, there exists $i$ and $t \in (H_i)_b(k)$, such that the fibre
$((W_i)_t, (E_{i,j,1})_t, (E_{i,j,2})_t)$ is identical to the given data.
Proof. For each 𝑛-tuple (𝑇1 , … , 𝑇𝑛 ) as given by the previous lemma, we consider
the normalisation 𝐻 of the product 𝑇1 ×⋯×𝑇𝑛 , equipped with the data of 𝑛 triples as
given by the previous lemma. The collection of all such 𝐻 satisfies the requirement
of this lemma. ◻

Our effort to parametrise all possibilities for a rational equivalence allows us to
prove the following result.

Lemma 3.7. In Situation 3.4, let
\[ Z_1, Z_2 \in \mathrm{Chow}^d_{X/B}(B) \]
be two well-defined families of cycles. Then there exists

• A countable family of quasi-projective $B$-schemes $\{M_i\}$.

• For each index $i$, the data $(W_{i,j}, E_{i,j,1}, E_{i,j,2})_{j=1}^{n_i}$ as in Lemma 3.6, with $M_i$
in place of $H_i$,

such that

• The union of the images of $M_i(k)$ in $B(k)$ is exactly the set
\[ \{ b \in B(k) \mid [Z_{1,b}] = [Z_{2,b}] \text{ in } \mathrm{CH}_\bullet(X_b) \}. \]

• For any $b \in B(k)$, and any data $(V_i, D_{i,1}, D_{i,2})_{i=1}^{n}$ as in Lemma 3.6, such
that
\[ Z_{1,b} + \sum_{i=1}^{n} [D_{i,1}] = Z_{2,b} + \sum_{i=1}^{n} [D_{i,2}] \quad \text{in } Z_\bullet(X_b), \]
there exists $i$ and a point $t \in (M_i)_b(k)$, such that the fibre $((W_i)_t, (E_{i,j,1})_t,
(E_{i,j,2})_t)$ is identical to the given data.

Proof. Let $\{H_i\}$ be the family in Lemma 3.6. We define a morphism
\[
f : H_i \to \mathrm{Chow}_{X/B} \times \mathrm{Chow}_{X/B}, \qquad
t \mapsto \Bigl( Z_1 + \sum_{j=1}^{n_i} (E_{i,j,1})_t,\ Z_2 + \sum_{j=1}^{n_i} (E_{i,j,2})_t \Bigr),
\]

where (𝐸𝑖,𝑗,1 or 2 )𝑡 is regarded as a cycle of 𝑋 via the pushforward along the desin-
gularisation map (onto a closed subvariety of 𝑋). Now let 𝑀𝑖 be the inverse image
of the diagonal along 𝑓 , with the reduced scheme structure, equipped with the data
of 𝑛𝑖 triples given by that of 𝐻𝑖 . This proves the second statement.
For the first statement, write $Z = Z_1 - Z_2$. Let $b \in B$ be a point where $Z_b$
is rationally equivalent to zero. Then, there exist subvarieties $V_j \subset X_b$, where
$j = 1, \dots, n$, and rational functions $g_j$ on $V_j$, which give rise to rational functions
$\tilde{g}_j$ on a desingularisation $\tilde{V}_j$, such that
\[ Z_b = \sum_{j=1}^{n} (f_j)_*(\operatorname{div} \tilde{g}_j), \]
where $f_j$ denotes the map $\tilde{V}_j \to X_b$. Conversely, the existence of this data implies
that $Z_b$ is rationally equivalent to zero, since $M_i$ is taken to be the inverse image
of the diagonal. Therefore, the locus where $Z_b$ is equivalent to zero is exactly the
union of the images of the $M_i$. ◻

We are now getting close to the main theorem, which states that the locus where
[𝑍1,𝑏 ] = [𝑍2,𝑏 ] is a countable union of closed sets. There is one further lemma
needed.

Lemma 3.8. Let $M$ be a smooth $k$-variety of dimension $m$, with $k$ algebraically
closed, and let $f : W \to M$ be a flat morphism of relative dimension $r$. Let $Z$ be
an $n$-cycle on $W$. Suppose that

• There is a dense open set $M^\circ \subset M$, such that $Z|_{f^{-1}(M^\circ)}$ is rationally
equivalent to 0 in $f^{-1}(M^\circ)$.

Then

• For any 𝑡 ∈ 𝑀(k), the fibre 𝑍𝑡 is rationally equivalent to 0 in 𝑊𝑡 .

Proof. Let 𝑡 ∈ 𝑀(k) ⧵ 𝑀 ∘ (k) be a point. As in the proof of Lemma 2.9, we can
find a curve 𝐶 in 𝑀, passing through 𝑡, and not contained in 𝑀 ⧵ 𝑀 ∘ . Taking the
normalisation of this curve, we may thus assume that 𝑀 is a smooth curve.

Let $D = M \setminus M^\circ$, which is now a finite set. There is an exact sequence [Ful98,
§1.8]
\[ \mathrm{CH}_n(f^{-1}(D)) \xrightarrow{\ i_*\ } \mathrm{CH}_n(W) \longrightarrow \mathrm{CH}_n(f^{-1}(M^\circ)) \to 0, \]
so that $Z = i_*(z)$ for some $z \in \mathrm{CH}_n(f^{-1}(D))$, where $i : f^{-1}(D) \to W$ denotes
the inclusion. But by the projection formula [Ful98, §2.3], the intersection of 𝑖∗ (𝑧)
with the divisor 𝑓 −1 (𝐷) of 𝑊 is 𝑖∗ 𝑖∗ (𝑓 −1 (𝐷) ⋅ 𝑧) = 0, so that 𝑍𝑡 is rationally
equivalent to 0 for any 𝑡 ∈ 𝐷. ◻
Now we are ready to prove the main result, and our proof follows that of [Voi15,
Proposition 2.4].
Theorem 3.9 (Voisin). In Situation 3.4, let
\[ Z_1, Z_2 \in \mathrm{Chow}^d_{X/B}(B) \]

be two well-defined families of cycles. Then there exists a countable family {𝐵𝑖 } of
closed subschemes of 𝐵, such that

{ 𝑏 ∈ 𝐵(k) | [𝑍1,𝑏 ] = [𝑍2,𝑏 ] in CH𝑑 (𝑋𝑏 ) } = ⋃𝑖 𝐵𝑖 (k).


Proof. Let {𝑀𝑖 } be as in Lemma 3.7. Replacing each 𝑀𝑖 by its desingularisation,
we can assume that all 𝑀𝑖 are smooth.
Let 𝐵𝑖 ⊂ 𝐵 be the closure of the image of 𝑀𝑖 in 𝐵, as a closed integral sub-
variety. By Lemma 3.7, the equation [𝑍1,𝑏 ] = [𝑍2,𝑏 ] implies 𝑏 ∈ 𝐵𝑖 for some 𝑖.
Thus, it suffices to show that it holds for all 𝑏 ∈ 𝐵𝑖 .
Let $B_i^\circ \subset B_i$ be an open subset contained in the image of $M_i$, and let $M_i^\circ$ be
the inverse image of $B_i^\circ$ in $M_i$. Let $X_{M_i} = X \times_B M_i$, and $Z_i$ the pullback of
$Z = Z_1 - Z_2$ along the morphism $X_{M_i} \to X$, which is actually the Chow pullback
of families of cycles along the map $M_i \to B$. Then $Z_i$ is equal to the universal
cycle $\sum_{j=1}^{n_i} (E_{i,j,1} - E_{i,j,2})$ on $M_i$, and hence is rationally equivalent to zero.
By taking the closure in a projective bundle, the morphism $M_i \to B_i$ extends
to a projective morphism $\overline{M}_i \to B_i$. Again, taking a desingularisation, we may
assume $\overline{M}_i$ is smooth. Write $X_{\overline{M}_i} = X \times_B \overline{M}_i$. Now apply Lemma 3.8 with
$M = \overline{M}_i$, $W = X_{\overline{M}_i}$, and $M^\circ$ the inverse image of $M_i^\circ$. This shows that for all
$b \in B_i$, the cycle $Z_b$ is equivalent to zero. ◻

Locus of decomposability of the diagonal


Lemma 3.10. Suppose
• k is an algebraically closed field of characteristic 0.
• 𝐵 is a smooth k-scheme.
• 𝑋 → 𝐵 is a projective morphism, and write 𝑌 = 𝑋 ×𝐵 𝑋.
Then there exists
• A countable family of smooth irreducible $B$-schemes $\{F_i\}$.
• For each index $i$, a well-defined family of non-negative $d_i$-cycles $C_i$ of $Y$ of
degree $d_i'$, parametrised by $F_i$,
such that
• For any 𝑏 ∈ 𝐵 and any non-negative 𝑑-cycle 𝐶 of 𝑌𝑏 of degree 𝑑 ′ , supported
in 𝑍 × 𝑋𝑏 for a codimension 1 subset 𝑍 ⊂ 𝑋𝑏 , there exists 𝑖 and 𝑥 ∈ (𝐹𝑖 )𝑏
such that 𝐶 = (𝐶𝑖 )𝑥 .
• For any 𝑥 ∈ (𝐹𝑖 )𝑏 , the cycle 𝐶 = (𝐶𝑖 )𝑥 is supported in 𝑍 × 𝑋𝑏 for a codi-
mension 1 subset 𝑍 ⊂ 𝑋𝑏 .
The condition “supported in 𝑍×𝑋𝑏 ” is the main point of this lemma. In fact, the
proof would be a lot easier if we dropped this condition. This lemma will be used
to parametrise all possibilities for the term 𝐷 in a decomposition of the diagonal,
as in Definition 2.14.
Proof. First, we need to parametrise all the subschemes of 𝑋 that are codimension
1 in 𝑋𝑏 at each 𝑏 ∈ 𝐵. Therefore, we consider an irreducible component

𝐻 ⊂ Hilb𝑋∕𝐵 ,

parametrising the codimension 1 subschemes. Let $U \subset H \times_B X$ be the universal
subscheme. Thus if we look at the fibre at $b \in B$, then $H_b$ parametrises the
codimension 1 subschemes of $X_b$, and $U_b \subset H_b \times X_b$ is a subscheme whose intersection
with $\{c\} \times X_b$ gives the subscheme of $X_b$ corresponding to $c$.
Next, we want to parametrise all the subschemes of 𝑌 which have the form
(codim 1 subset) × 𝑋𝑏 when restricted to the fibres. This is given by the universal
subscheme
𝑈 ′ = 𝑈 ×𝐵 𝑋 ⊂ 𝐻 ×𝐵 𝑋 ×𝐵 𝑋,

which, at $b \in B$, when intersected with $\{c\} \times X_b \times X_b$, gives the subscheme of
$X_b \times X_b$ corresponding to $c$.
Finally, we parametrise cycles of 𝑌 supported in a subset of the form of the
previous step. Thus we consider an irreducible component

𝐶 ⊂ Chow𝑈 ′ ∕𝐻 .

Let 𝑉 ∈ Z• (𝐶 ×𝐻 𝑈 ′ ) be the universal family. Since

𝐶 ×𝐻 𝑈 ′ ⊂ 𝐶 ×𝐻 𝐻 ×𝐵 𝑋 ×𝐵 𝑋 ≃ 𝐶 ×𝐵 𝑋 ×𝐵 𝑋,

we can view $V$ as a family of cycles of $Y$ parametrised by $C$.
Thus, all choices of $H$ and $C$ will give a countable set of families, which together
parametrise all the cycles of $Y$ of the given form.
However, the parametrising schemes need to be smooth. We thus apply Hironaka’s
desingularisation theorem to the schemes $C$. ◻

Proposition 3.11. Suppose

• k is an algebraically closed field of characteristic 0.

• 𝐵 is a smooth k-scheme.

• 𝑋 → 𝐵 is a projective morphism.

Then there exists a countable family {𝐵𝑖 } of closed subschemes of 𝐵, such that

{𝑏 ∈ 𝐵(k) ∣ 𝑋𝑏 has a decomposition of the diagonal} = ⋃𝑖 𝐵𝑖 (k).

Proof. Let $F_i$, $F_{i'}$ be two of the schemes as in Lemma 3.10, with $d_i = d_{i'} =
\dim(X/B)$, and let $C_i$, $C_{i'}$ be the universal cycles, lying in $X \times_B X \times_B F_{i \text{ or } i'}$.
Let $G_j$, $G_{j'}$ be irreducible components of $\mathrm{Chow}^{0,d}_{X/B}$ and $\mathrm{Chow}^{0,d+1}_{X/B}$, respectively,
where $d$ is arbitrary, and let $D_j$, $D_{j'}$ be the universal cycles lying in
$X \times_B G_{j \text{ or } j'}$.
We define two cycles of 𝑌 = 𝐹𝑖 ×𝐵 𝐺𝑗 ×𝐵 𝑋 ×𝐵 𝑋 ×𝐵 𝐺𝑗 ′ ×𝐵 𝐹𝑖′ by

𝑍1 = ([𝐺𝑗 ] × 𝐶𝑖 + [𝐹𝑖 ] × [𝐺𝑗 ] × [Δ𝑋∕𝐵 ] + [𝐹𝑖 ] × [𝑋] × 𝐷𝑗 ) × [𝐺𝑗 ′ ] × [𝐹𝑖′ ],


𝑍2 = [𝐹𝑖 ] × [𝐺𝑗 ] × (𝐶𝑖′ × [𝐺𝑗 ′ ] + [𝑋] × 𝐷𝑗 ′ × [𝐹𝑖′ ]),

where [Δ𝑋∕𝐵 ] is the diagonal class.


Now apply Theorem 3.9, where we take 𝑋 to be 𝑌 , and take 𝐵 to be

𝐹𝑖 ×𝐵 𝐹𝑖′ ×𝐵 𝐺𝑗 ×𝐵 𝐺𝑗 ′ .

At the point

𝑡 = (𝑡1 , 𝑡2 , 𝑥1 , 𝑥2 ) ∈ (𝐹𝑖 )𝑏 × (𝐹𝑖′ )𝑏 × (𝐺𝑗 )𝑏 × (𝐺𝑗 ′ )𝑏 ,

the cycle 𝑍1 gives [Δ𝑋𝑏 ]+𝑧1 +[𝑋𝑏 ]×𝑥1 , where 𝑧1 is a non-negative cycle supported
in 𝑍 × 𝑋𝑏 for 𝑍 ⊂ 𝑋𝑏 of codimension 1, and similarly, the cycle 𝑍2 gives [𝑋𝑏 ] ×
𝑥2 + 𝑧2 , with 𝑧2 likewise.
Therefore, Theorem 3.9 implies that the locus where the equation
\[ [\Delta_{X_b}] + z_1 + [X_b] \times x_1 = [X_b] \times x_2 + z_2 \in \mathrm{CH}_{\dim X_b}(X_b \times X_b) \]
holds (for non-negative $x_1, x_2, z_1, z_2$) is the union of countably many closed
subsets. ◻

This result is restated as follows.

Theorem 3.12. Suppose

• k is an algebraically closed field of characteristic 0.

• 𝐵 is a smooth k-scheme.

• 𝑋 → 𝐵 is a dominant projective morphism.


• There exists a k-point 0 ∈ 𝐵, such that the fibre 𝑋0 does not have a decom-
position of the diagonal.
Then for a “very general” k-point 𝑏 ∈ 𝐵, the fibre 𝑋𝑏 will not have a decomposition
of the diagonal. ◻
By “very general”, we mean “except a countable union of closed sets of codi-
mension ≥ 1”.
This means that if we can find one example in a family of varieties, which we
can show has non-trivial Brauer group, and hence does not have a decomposition
of the diagonal, then a very general variety in this family is not retract rational.
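
The following illustration, which is not part of the source, indicates why the notion of a very general point is only meaningful over an uncountable base field. It assumes $B = \mathbb{A}^1_k$ and a countable family of proper closed subsets $B_i \subsetneq B$.

```latex
% Illustration (assumptions: B = A^1_k, B_i proper closed subsets of B).
\begin{align*}
&\text{Each } B_i(k) \text{ is finite, so }
   \bigcup_{i \in \mathbb{N}} B_i(k) \text{ is countable.} \\
&k = \mathbb{C}: \quad
   B(k) \setminus \bigcup_{i} B_i(k) \neq \varnothing
   \quad (\text{indeed uncountable}). \\
&k = \overline{\mathbb{Q}}: \quad
   B(k) \text{ is countable, so } \bigcup_{b \in B(k)} \{b\} = B(k),
   \text{ and no very general point need exist.}
\end{align*}
```

This is the reason for the uncountability hypothesis in statements such as Corollary 3.17.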

Stable equivalence
This subsection gives a variant of the above result, concerning stable equivalence
instead of retract rationality.
Definition 3.13. Two projective $k$-varieties $X$ and $Y$ are stably equivalent if
\[ X \times \mathbb{P}^m \text{ is birational to } Y \times \mathbb{P}^n \]
for some $m, n \in \mathbb{N}$.
Stable rationality is the same as stable equivalence to a point.
Lemma 3.14. Let 𝑋, 𝑌 be two k-varieties, such that there exist open sets 𝑈 ⊂ 𝑋,
𝑉 ⊂ 𝑌 × P𝑛 , and two morphisms 𝑝 ∶ 𝑈 → 𝑉 , 𝑞 ∶ 𝑉 → 𝑈 , such that 𝑞 ∘ 𝑝 = id𝑈 .
Then there exist two correspondences

𝑓 ∈ Corr(𝑋, 𝑌 ), 𝑔 ∈ Corr(𝑌 , 𝑋),

such that for any field extension K∕k, the induced map

(𝑔 ∘ 𝑓 )∗ ∶ CH0 (𝑋K ) → CH0 (𝑋K )

is the identity map. When $X$ is smooth, we have a decomposition
\[ [\Delta_X] = D + g \circ f \quad \text{in } \mathrm{Corr}(X, X), \]
where $D$ is supported in $Z \times X$ for some closed subvariety $Z \subset X$ of codimension
at least 1.
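Proof. The correspondence $f$ is given by the rational map
\[ X \overset{\supset}{\dashrightarrow} U \xrightarrow{\ p\ } V \hookrightarrow Y \times \mathbb{P}^n \to Y, \]
and $g$ is given by a rational map
\[ Y \hookrightarrow Y \times \mathbb{P}^n \overset{\supset}{\dashrightarrow} V \xrightarrow{\ q\ } U \hookrightarrow X, \]
where the inclusion $Y \hookrightarrow Y \times \mathbb{P}^n$ is chosen so that the composition is defined. To
prove that $g \circ f$ induces the identity map on $\mathrm{CH}_0(X_K)$, it suffices to prove that the
map
\[ Y \times \mathbb{P}^n \to Y \hookrightarrow Y \times \mathbb{P}^n \]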

induces the identity map on CH0 . This is because every closed point of 𝑌 × P𝑛
is sent to another point that lives in the same slice of P𝑛 , and hence, is rationally
equivalent to it as a 0-cycle.
For the second part, we use an argument as in the proof of Theorem 2.18.
Namely, we change the base field to k(𝑋), to find that

𝑔k(𝑋) ∘ 𝑓k(𝑋) (𝛽) = 𝛽 in CH0 (𝑋k(𝑋) ),

where 𝛽 is the class of the generic point. The rest of the proof is analogous to the
proof of Theorem 2.18, (ii) ⇒ (iii). ◻
Note that the assumptions of this lemma are satisfied when 𝑋 and 𝑌 are stably
equivalent.
Using an argument as in the proof of Proposition 3.11, we obtain the following
result.
Theorem 3.15. Suppose
• k is an algebraically closed field of characteristic 0.
• 𝐵 is a smooth k-scheme.
• 𝑋 → 𝐵 and 𝑌 → 𝐵 are two projective morphisms.
Then the set of all points 𝑏 ∈ 𝐵(k) such that there exist correspondences

𝑓 ∈ Corr(𝑋𝑏 , 𝑌𝑏 ), 𝑔 ∈ Corr(𝑌𝑏 , 𝑋𝑏 ),

and 𝐷 as before, such that

[Δ𝑋𝑏 ] = 𝐷 + 𝑔 ∘ 𝑓 ,

is a countable union of closed sets.


Proof. We apply Theorem 3.9, where we take 𝑋 to be

𝑋 ×𝐵 𝑋 ×𝐵 𝐹𝑖 ×𝐵 𝐹𝑖′ ×𝐵 𝐺𝑗 ×𝐵 𝐺𝑗 ′ ×𝐵 𝐻𝑘 ×𝐵 𝐻𝑘′ ,

where
• $F_i$, $F_{i'}$ are given by Lemma 3.10.
• $G_j$, $G_{j'}$, $H_k$, $H_{k'}$ are irreducible components of $\mathrm{Chow}_{X \times_B Y/B}$, parametrising
the correspondences from $X$ to $Y$ for $G_j$ and $G_{j'}$, and from $Y$ to $X$ for $H_k$,
$H_{k'}$.

The rest of the proof is analogous to the proof of Proposition 3.11. ◻

Corollary 3.16. Under the assumptions of Theorem 3.15, the set of all points $b \in
B(k)$ such that $X_b$ is stably equivalent to $Y_b$ is contained in a countable union of
closed sets. Moreover, this union does not contain any point $b \in B(k)$ such that
$X_b$ is smooth and does not have a decomposition of the diagonal, while $Y_b$ has a
decomposition of the diagonal.

Proof. The countable union of closed sets given by Theorem 3.15 satisfies this
requirement. Indeed, for those 𝑏 ∈ 𝐵(k) such that 𝑋𝑏 and 𝑌𝑏 are stably equivalent,
Lemma 3.14 shows that 𝑏 is in this union.
To prove the last statement, let 𝑏 ∈ 𝐵(k) be such a point. We show that such
correspondences 𝑓 , 𝑔 as in Theorem 3.15 do not exist between 𝑋𝑏 and 𝑌𝑏 . In fact,
if they exist, then 𝑔 ∘ 𝑓 acts on CH0 (𝑋) by the identity map. But since id𝑌 sends
every 0-cycle to its degree multiplied by a fixed 0-cycle of degree 1, so does the
correspondence 𝑔 ∘ 𝑓 = 𝑔 ∘ id𝑌 ∘ 𝑓 . Moreover, this holds over any field extension of
k. By Theorem 2.18, 𝑋 has a decomposition of the diagonal, a contradiction. ◻

In particular, if we take $X$ to be a constant family which is smooth, we deduce
that every stable equivalence class in a family of varieties is contained in a
countable union of closed sets.

Corollary 3.17. Suppose

• k is an uncountable algebraically closed field of characteristic 0.

• 𝐵 is a smooth k-scheme.

• 𝑋 → 𝐵 is a dominant projective morphism, with smooth generic fibre.

• There exist two k-points 𝑏0 , 𝑏1 ∈ 𝐵, such that the fibre 𝑋𝑏0 has a decompo-
sition of the diagonal, while the fibre 𝑋𝑏1 does not have a decomposition of
the diagonal.

Then there are uncountably many stable equivalence classes of varieties in this
family.

Proof. For those smooth fibres 𝑋𝑏 that do not have a decomposition of the diagonal,
we apply Corollary 3.16 to the constant family 𝐵 ×𝑋𝑏 → 𝐵 and the family 𝑋 → 𝐵.
It follows that the set of 𝑏′ ∈ 𝐵(k) such that 𝑋𝑏 is stably equivalent to 𝑋𝑏′ is
contained in a countable union of closed sets, which can not coincide with the
whole space.
By Theorem 3.12, the locus of smooth fibres with no decomposition of the
diagonal is the complement of a countable union of closed subsets of 𝐵. Therefore,
in order to cover this locus, there must be uncountably many stable equivalence
classes of fibres of 𝑋 → 𝐵, since each of these classes is contained in a countable
union of closed sets. ◻

4 The specialisation method

The specialisation map


The main idea of the specialisation method is to build a way to transport properties
between the generic fibre and the special fibre. We consider the following situation.

Situation 4.1. Let 𝐴 be a discrete valuation ring, with fraction field K and residue
field k. Let 𝒳 be an 𝐴-scheme. Suppose that

• The special fibre 𝑋 s = 𝒳 ×𝐴 k is a k-variety.

• The generic fibre 𝑋 = 𝒳 ×𝐴 K is a K-variety.

After Colliot-Thélène and Pirutka [CTP16], we introduce the specialisation
map on the Chow groups.

Proposition 4.2. In Situation 4.1, there is a specialisation map

𝜎 ∶ CH0 (𝑋) → CH0 (𝑋 s ),

which preserves the degree of 0-cycles.

Proof. By [Ful98, §1.8 and §20.1], there is an exact sequence

𝑖∗ 𝑗∗
CH1 (𝑋 s ) ⟶ CH1 (𝒳 ) ⟶ CH0 (𝑋) → 0,

where 𝑖, 𝑗 are the obvious inclusions. (The last term is CH0 instead of CH1 , since
Spec K is a 1-dimensional point in Spec 𝐴.)
By [Ful98, §2.6 and §20.1], there is a Gysin map

𝑖! ∶ CH1 (𝒳 ) → CH0 (𝑋 s ),

given by intersection with the divisor $X^s$ of 𝒳. By [Ful98, Proposition 2.6 (c)],
we have $i^! \circ i_* = 0$. Thus the map $i^!$ factors through the cokernel of $i_*$, giving the
desired map. ◻
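
The construction in this proof can be unwound as a short diagram chase. The block below is only a sketch restating the argument; the symbols $\tilde z$, $\tilde z'$, $w$ and $Z$ are illustrative names that do not appear in the text, and the final formula for a closed point is the usual description of specialisation (compare the treatment of schemes over a discrete valuation ring in [Ful98, Chapter 20]).

```latex
% Sketch: why \sigma is well defined. The names \tilde z, \tilde z', w, Z are ours.
\begin{align*}
&\text{Given } z \in \mathrm{CH}_0(X), \text{ choose } \tilde z \in \mathrm{CH}_1(\mathcal{X})
   \text{ with } j^*\tilde z = z
   && (j^* \text{ is surjective}), \\
&\text{and set } \sigma(z) := i^!(\tilde z) \in \mathrm{CH}_0(X^s). \\
&\text{If } j^*\tilde z = j^*\tilde z', \text{ then } \tilde z - \tilde z' = i_* w
   \text{ for some } w \in \mathrm{CH}_1(X^s)
   && (\text{exactness in the middle}), \\
&\text{hence } i^!(\tilde z) - i^!(\tilde z') = i^! i_* w = 0
   && (i^! \circ i_* = 0).
\end{align*}
% For a closed point x of X with closure Z in \mathcal{X} (a 1-dimensional integral scheme):
\[
  \sigma([x]) = i^![Z] = [Z \cap X^s].
\]
```

In words: one takes the closure of a 0-cycle of the generic fibre inside 𝒳 and intersects it with the special fibre.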

Lemma 4.3. In Situation 4.1, suppose that 𝐴 is henselian, and 𝒳 is proper and
flat over 𝐴. Let $X^s_{\mathrm{sm}} \subset X^s$ be the open set where $X^s$ is smooth. Then every 0-cycle
of $X^s$ supported in $X^s_{\mathrm{sm}}$ can be lifted along the specialisation map

$$\sigma \colon \mathrm{CH}_0(X) \to \mathrm{CH}_0(X^s)$$

to a 0-cycle supported in $X_{\mathrm{sm}}$.


Proof. We follow [EKW16, §4]. It is enough to lift the closed points of $X^s_{\mathrm{sm}}$.
Let $x \in X^s_{\mathrm{sm}}$ be a closed point, and let $a_1, \dots, a_n \in \mathcal{O}_{X^s,x}$ be a regular sequence
generating the maximal ideal. Choose liftings $\bar a_1, \dots, \bar a_n \in \mathcal{O}_{\mathcal{X},x}$. Since
$\mathcal{O}_{X^s,x} \simeq \mathcal{O}_{\mathcal{X},x}/\pi\mathcal{O}_{\mathcal{X},x}$, where 𝜋 ∈ 𝐴 is a uniformiser, it follows that $\pi, \bar a_1, \dots, \bar a_n$ is a
regular sequence in the (𝑛 + 1)-dimensional local ring $\mathcal{O}_{\mathcal{X},x}$. Therefore, the ideal
$(\bar a_1, \dots, \bar a_n) \subset \mathcal{O}_{\mathcal{X},x}$ defines a 1-dimensional subset of $\operatorname{Spec} \mathcal{O}_{\mathcal{X},x}$, whose closure
in 𝒳 is a 1-dimensional subscheme 𝑍 ⊂ 𝒳. Then 𝑍 is flat of relative dimension 0
over 𝐴, and hence quasi-finite over 𝐴. By properness, it is finite over 𝐴. It follows
that 𝑍 ≃ Spec 𝐵 for a finite 𝐴-algebra 𝐵. Since 𝐴 is henselian, 𝐵 is a product of
local rings. Therefore, the irreducible component of 𝑍 containing 𝑥 meets $X^s$ at the
single point 𝑥. The corresponding 0-cycle of 𝑋 has the desired property. ◻
Lemma 4.4. In Situation 4.1, suppose that
• 𝐴 is henselian, and 𝒳 is proper and flat over 𝐴.
• The generic fibre 𝑋 has a desingularisation $p \colon \tilde X \to X$, such that $\tilde X$ is
universally CH0 -trivial.
Then every 0-cycle of $X^s$ of degree 0, supported in the open set $X^s_{\mathrm{sm}} \subset X^s$ where
$X^s$ is smooth, is zero in $\mathrm{CH}_0(X^s)$.
Proof. Let 𝑈 ⊂ 𝑋 be a dense open set such that $p \colon p^{-1}(U) \to U$ is an isomor-
phism. Let 𝑥 be a 0-cycle of $X^s_{\mathrm{sm}}$ of degree 0. By Lemma 4.3, 𝑥 lifts to a 0-cycle of
$X_{\mathrm{sm}}$ of degree 0. By the moving lemma 2.9, it is equivalent to a 0-cycle supported
in 𝑈, which then lifts to a 0-cycle of $\tilde X$. This 0-cycle is equivalent to 0 in $\tilde X$ by
hypothesis. Therefore, applying the map

$$\mathrm{CH}_0(\tilde X) \xrightarrow{\;p_*\;} \mathrm{CH}_0(X) \xrightarrow{\;\sigma\;} \mathrm{CH}_0(X^s),$$

we see that 𝑥 = 0 in $\mathrm{CH}_0(X^s)$. ◻

Rationality and specialisation


The main result is that the rationality (or more precisely, universal CH0 -triviality)
of the generic fibre can be specialised to the special fibre, so that once we show
that the special fibre is irrational, we know that the generic fibre is also irrational.
First, we make clear what we need to obtain universal triviality for a desingu-
larisation.
Lemma 4.5. Let $f \colon \tilde X \to X$ be a desingularisation map between two projective,
geometrically integral k-varieties. Suppose that
• $\tilde X$ has a 0-cycle of degree 1.
• 𝑓 is universally CH0 -trivial.
• There exists an open set 𝑈 ⊂ 𝑋, with $\tilde U = f^{-1}(U)$, such that $f \colon \tilde U \to U$
is an isomorphism, and for any extension 𝐹∕k, every 0-cycle of degree 0
supported in $U_F$ is rationally equivalent to zero in $X_F$.
Then $\tilde X$ is universally CH0 -trivial.

Proof. Since the conditions of the lemma are preserved by base change, we only
need to prove that $\deg_{\mathrm{k}} \colon \mathrm{CH}_0(\tilde X) \to \mathbf{Z}$ is an isomorphism. By the first hypothesis,
this map is surjective. Thus, it suffices to show that every 0-cycle 𝑥 of $\tilde X$ of
degree 0 is rationally equivalent to 0. By the moving lemma 2.9, 𝑥 is equiv-
alent to a cycle supported in $\tilde U$, which induces a cycle in 𝑈, which is equivalent to 0 in
𝑋 by hypothesis. Since 𝑓 is universally CH0 -trivial, 𝑥 is equivalent to 0 in $\tilde X$. ◻

Before the main theorem, we mention a convenient result in commutative
algebra.

Lemma 4.6. Let 𝐴 be a discrete valuation ring with residue field k, and let 𝐹 ∕k be
an extension. Then there exists a complete discrete valuation ring 𝐵 with residue
field 𝐹 , together with a local map 𝐴 → 𝐵 inducing the field map k → 𝐹 .

See [Bou06, Chapter IX, Appendix, §2, Corollary, and Exercise 4].

Theorem 4.7 (Colliot-Thélène and Pirutka). In Situation 4.1, suppose that

• 𝒳 is faithfully flat and proper over 𝐴, with geometrically integral fibres.

• The special fibre $X^s$ has a desingularisation $f \colon \tilde X^s \to X^s$, such that 𝑓 is
universally CH0 -trivial, and $\tilde X^s$ has a 0-cycle of degree 1.

• The generic fibre 𝑋 has a desingularisation $\tilde X \to X$.

Then if $\tilde X$ is universally CH0 -trivial, so is $\tilde X^s$.

Proof. The proof is done by putting the previous lemmas together.

• By Theorem 2.18, it suffices to show that $\tilde X^s_{\mathrm{k}(X^s)}$ is CH0 -trivial, where we
notice that $\mathrm{k}(\tilde X^s) \simeq \mathrm{k}(X^s)$.

• By Lemma 4.5, it suffices to show that the open set $U = (X^s_{\mathrm{k}(X^s)})_{\mathrm{sm}} \subset X^s_{\mathrm{k}(X^s)}$
satisfies the third assumption of Lemma 4.5.

• By Lemma 4.4, it suffices to show that $X^s_{\mathrm{k}(X^s)}$ can play the rôle of $X^s$ in that
lemma.

• By Lemma 4.6, we take a complete discrete valuation ring 𝐵, with residue
field $\mathrm{k}(X^s)$, and a local map 𝐴 → 𝐵 inducing the map of fields $\mathrm{k} \to \mathrm{k}(X^s)$.
Then 𝐵 is henselian. Doing a base change along the map 𝐴 → 𝐵 for every-
thing will complete the proof. ◻

There is a stronger variant of this result, which considers the geometrical
generic fibre over $\bar{\mathrm{K}}$, instead of over K. Before introducing the result, we need a
lemma.

Lemma 4.8. Let 𝑋 be a smooth, integral, projective k-variety. If $X_{\bar{\mathrm{k}}}$ is universally
CH0 -trivial, then $X_F$ is universally CH0 -trivial for some finite extension 𝐹∕k.

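Proof. By Theorem 2.18, $X_{\bar{\mathrm{k}}}$ has a decomposition of the diagonal

$$[\Delta_{X_{\bar{\mathrm{k}}}}] = D_{\bar{\mathrm{k}}} + [X_{\bar{\mathrm{k}}}] \times x_0 \quad \text{in } \mathrm{CH}_n(X_{\bar{\mathrm{k}}} \times_{\bar{\mathrm{k}}} X_{\bar{\mathrm{k}}}).$$

By Galois descent, there exists a finite extension 𝐹∕k over which everything in the
equation is defined, and we have

$$\mathrm{CH}_n(X_{\bar{\mathrm{k}}} \times_{\bar{\mathrm{k}}} X_{\bar{\mathrm{k}}}) = \operatorname*{colim}_{E/F \text{ finite}} \mathrm{CH}_n(X_E \times_E X_E).$$

Therefore, there exists a finite extension 𝐸 such that the equation of the decompo-
sition of the diagonal holds over 𝐸. ◻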

Theorem 4.9 (Colliot-Thélène and Pirutka). In Situation 4.1, suppose that

• The residue field k is algebraically closed.

• 𝒳 is faithfully flat and proper over 𝐴, with geometrically integral fibres.

• The special fibre $X^s$ has a desingularisation $f \colon \tilde X^s \to X^s$, such that 𝑓 is
universally CH0 -trivial.

• The geometrical generic fibre $\bar X = \mathcal{X} \times_A \bar{\mathrm{K}}$ is a $\bar{\mathrm{K}}$-variety, with a desingu-
larisation $\tilde X \to \bar X$.

Then if $\tilde X$ is universally CH0 -trivial, so is $\tilde X^s$.

Proof. Our plan is to find a suitable base change in order to apply Theorem 4.7.
First, we replace 𝐴 by its completion.
Let 𝐹 be a finite extension of K, over which $\tilde X$ is defined. In other words, there
exists a smooth variety 𝑌 over 𝐹, such that $Y_{\bar{\mathrm{K}}} \simeq \tilde X$, and there is a desingularisation
map $Y \to X_F$, which coincides with the map $\tilde X \to \bar X$ over $\bar{\mathrm{K}}$.
By Lemma 4.8, we may replace 𝐹 by a finite extension of it, so we may assume
that 𝑌 is universally CH0 -trivial.
Let 𝐵 be the integral closure of 𝐴 in 𝐹 . By [Ser79, Proposition I.3], 𝐵 is also a
discrete valuation ring. Since k is algebraically closed, the residue field of 𝐵 is also
k. We can thus do a base change along the map 𝐴 → 𝐵, and apply Theorem 4.7 to
complete the proof. ◻

There is an even stronger variant of this result, where instead of $\bar{\mathrm{K}}$, we consider
any field containing $\bar{\mathrm{K}}$.

Lemma 4.10. Let 𝑋 be a projective k-variety. If $X_F$ is retract rational for some
extension 𝐹∕k, then $X_{F'}$ is retract rational for a certain finite extension 𝐹′∕k.

Proof. In the definition of the retract rationality of $X_F$, everything is defined over
a finitely generated extension of k. Thus we may assume 𝐹∕k is finitely generated.
For the same reason, the lemma is true when 𝐹 is the algebraic closure of k.
Therefore, we may assume that k is algebraically closed.

In this case, there exists a k-variety 𝑌 such that 𝐹 is isomorphic to k(𝑌 ), and
there exist non-empty open sets 𝑈 ⊂ 𝑋 ×k 𝑌 and 𝑉 ⊂ P𝑛𝑌 , such that 𝑈 is a retract
of 𝑉 as a 𝑌 -scheme. There exists a closed point 𝑦 ∈ 𝑌 such that the fibres 𝑈𝑦 and
𝑉𝑦 are non-empty. This proves the lemma. ◻
Theorem 4.11 (Colliot-Thélène and Pirutka). Assume that the four assumptions of
Theorem 4.9 are satisfied. If $\tilde X_F$ is retract rational for a field 𝐹 containing
$\bar{\mathrm{K}}$, then $\tilde X^s$ is universally CH0 -trivial. ◻

5 Example: Quartic threefolds


In this section, we construct an explicit example of a quartic threefold which is
unirational, but has non-trivial Brauer group. This example was originally due to
Artin and Mumford [AM72], and slightly modified in [CTP16] so that it embeds
in P4 .

The example
• Let k be an algebraically closed field with char k ≠ 2.
• Let 𝐴 ⊂ P2 be a smooth conic, defined by the quadratic equation

𝛼(𝑧0 , 𝑧1 , 𝑧2 ) = 0.

• Let 𝐷1 , 𝐷2 ⊂ P2 be two smooth cubics, defined by

𝛿1 = 0 and 𝛿2 = 0,

such that they each meet 𝐴 tangentially at 3 points, giving six tangent points
𝑄1 , … , 𝑄6 , and such that 𝐷1 ∩ 𝐷2 consists of nine distinct points 𝑃1 , … , 𝑃9 . These
nine points do not lie on 𝐴, since otherwise 𝐷1 and 𝐷2 would meet tangentially
at that point.
• Let 𝐵 ⊂ P2 be a cubic that intersects 𝐴 in the six points 𝑄1 , … , 𝑄6 . In fact,
for any nine given points on the plane, there exists a cubic curve passing
through all of them. We use these six points, and choose three other points
which are non-collinear and not on 𝐴. This ensures that the cubic does not
contain 𝐴, and can only intersect 𝐴 in these six points.
As cycles on 𝐴, we have

(𝐷1 + 𝐷2 ) ⋅ 𝐴 = 2𝐵 ⋅ 𝐴.

This means that $\alpha \mid \delta_1 \delta_2 - \beta^2$, where 𝛽 is the polynomial defining 𝐵. Thus
we may write
$$\delta_1 \delta_2 = \beta^2 - 4\alpha\gamma$$
for some 𝛾 of degree 4.



• Let 𝑆 ⊂ P3 be the quartic surface defined by

$$g = \alpha(z_0, z_1, z_2)\, z_3^2 + \beta(z_0, z_1, z_2)\, z_3 + \gamma(z_0, z_1, z_2) = 0.$$

Using the projection to P2 which sends (𝑧0 ∶ 𝑧1 ∶ 𝑧2 ∶ 𝑧3 ) to (𝑧0 ∶ 𝑧1 ∶ 𝑧2 ),
the surface 𝑆 ⧵ (0 ∶ 0 ∶ 0 ∶ 1) can be seen as a double cover of P2 , ramified
along the curves 𝐷1 and 𝐷2 .
• After applying a linear coordinate change in 𝑧0 , 𝑧1 , 𝑧2 , we may assume

The hyperplane 𝑧0 = 0 does not contain 𝑄1 , … , 𝑄6 ,
or any point of 𝑀 ⧵ {𝑃0 }, and is not tangent to 𝐴, (5.0.1)

where
$$M = \Bigl\{\, g = 0,\ \Bigl( \frac{\partial g}{\partial z_1} = 0 \ \text{or}\ \frac{\partial g}{\partial z_2} = 0 \Bigr),\ \frac{\partial g}{\partial z_3} = 0 \,\Bigr\},$$
and 𝑃0 = (0 ∶ 0 ∶ 0 ∶ 1). This is a technical assumption which we make to
avoid bad singularities.
• Now let 𝑇 ⊂ P4 be the quartic threefold defined by

$$f = \alpha(z_0, z_1, z_2)\, z_3^2 + \beta(z_0, z_1, z_2)\, z_3 + \gamma(z_0, z_1, z_2) + z_0^2 z_4^2 = 0.$$

It is a double cover of P3 , ramified along the surface 𝑆 and the hyperplane
𝑧0 = 0.
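
The two double-cover descriptions in the list above can be checked by completing the square. The following is a sketch under the standing assumption char k ≠ 2, using only the relation 𝛿1𝛿2 = 𝛽² − 4𝛼𝛾 from the construction; the first identity is meant on the open set where 𝛼 ≠ 0.

```latex
% Sketch: the quadratic polynomials g (in z_3) and f (in z_4) and their discriminants.
\begin{align*}
4\alpha\, g
  &= 4\alpha^2 z_3^2 + 4\alpha\beta z_3 + 4\alpha\gamma \\
  &= (2\alpha z_3 + \beta)^2 - (\beta^2 - 4\alpha\gamma) \\
  &= (2\alpha z_3 + \beta)^2 - \delta_1 \delta_2 .
\end{align*}
% Hence on S the two values of z_3 over a point of P^2 coincide exactly where
% \delta_1 \delta_2 = 0, i.e. over D_1 \cup D_2.
\[
  f = z_0^2\, z_4^2 + g(z_0, z_1, z_2, z_3),
  \qquad
  \operatorname{disc}_{z_4}(f) = 0^2 - 4 z_0^2\, g = -4\, z_0^2\, g .
\]
% The discriminant in z_4 vanishes exactly on S \cup \{ z_0 = 0 \}.
```

This matches the stated branch loci: 𝐷1 ∪ 𝐷2 for 𝑆 over P2, and 𝑆 together with the hyperplane 𝑧0 = 0 for 𝑇 over P3.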

We will see that 𝑇 has the property of being unirational but not having a de-
composition of the diagonal.
We next construct an explicit desingularisation of the threefold 𝑇 constructed
above, following [CTP16, Appendix A]. We do this in order to get a desingulari-
sation map which is universally CH0 -trivial, and such that the Brauer group of the
desingularisation is non-trivial.

• We observe that the threefold 𝑇 is singular along the line

$$L \colon z_0 = z_1 = z_2 = 0$$

in P4 , and outside of this line, it has ordinary quadratic singularities at the
points 𝑃1 , … , 𝑃9 on the hyperplane 𝑧4 = 0.
• Let 𝑇1 → 𝑇 be the blow-up along 𝐿. Standard computation shows that the
exceptional divisor is a rational surface, and 𝑇1 is singular along a line 𝐿1
which is the inverse image of the point 𝑃 = (0 ∶ 0 ∶ 0 ∶ 0 ∶ 1).
• Let 𝑇2 → 𝑇1 be the blow-up along 𝐿1 . Standard computation shows that the
exceptional divisor is a rational surface, and 𝑇2 only has ordinary quadratic
singularities, at the inverse images of the nine points 𝑃1 , … , 𝑃9 .

• Finally, we blow up 𝑇2 at the nine points 𝑃1 , … , 𝑃9 . This gives a
desingularisation of 𝑇 . The exceptional divisors over the nine points are rational
surfaces, and the exceptional divisor over 𝐿 is a union of two rational surfaces.
We can thus apply Theorem 2.13 to conclude that the desingularisation map is
universally CH0 -trivial.
• Artin and Mumford [AM72, §2] showed that over C, there is a smooth
projective threefold 𝑉 , birational to 𝑇 , such that the singular cohomology
𝐻 3 (𝑉 , Z) contains non-trivial 2-torsion. It follows from the universal
coefficient theorem and the comparison theorem [Mil80, Theorem III.3.12,
p. 117] that the étale cohomology group 𝐻 3 (𝑉 , Z2 ) contains non-trivial
2-torsion. By Lemma 5.4 below, Br(𝑉 ) contains non-trivial 2-torsion, and
so does Br(𝑇2 ) by Theorem 2.21. In particular, Br(𝑇2 ) ≠ 0. This shows that
𝑇 is not retract rational, by Theorem 2.23.
For general k of characteristic zero, a consequence of the smooth base change
theorem for étale cohomology [Mil80, Corollary VI.4.3, p. 231] shows that
$\mathrm{Br}(T_2) \simeq H^2(T_2, \mathbf{G}_m)$ contains non-trivial 2-torsion. Indeed, the cited
theorem shows that this holds over $\bar{\mathbf{Q}}$, and applying it again shows that it holds
for any algebraically closed k of characteristic zero, so that 𝑇 is not retract
rational.
In summary, we have the following result.
Theorem 5.1. Suppose that k is algebraically closed of characteristic zero. Then
the quartic threefold 𝑇 admits a desingularisation
𝑓 ∶ 𝑇̃ → 𝑇 ,
such that 𝑓 is universally CH0 -trivial. Moreover, we have Br(𝑇̃ ) ≠ 0. ◻
Finally, we prove the relationship between the Brauer group and the ℓ-adic étale
cohomology group which was used above.
Lemma 5.2. Let 𝑋 be a rationally connected k-variety, with char k = 0. Then
𝐻 𝑝 (𝑋, 𝒪𝑋 ) = 0
for all 𝑝 > 0.
Proof. See [Deb03, §3.4]. ◻
Let 𝑋 be a variety over C. The exact sequence of sheaves
0 → Z → 𝒪𝑋 → 𝒪𝑋∗ → 0
on 𝑋 (as an analytic space) induces a long exact sequence
⋯ → 𝐻 1 (𝑋, 𝒪𝑋 ) → Pic(𝑋) → 𝐻 2 (𝑋, Z) → 𝐻 2 (𝑋, 𝒪𝑋 ) → ⋯ .
The image of Pic(𝑋) in 𝐻 2 (𝑋, Z) is called the Néron–Severi group, and its rank,
denoted by 𝜌(𝑋), is called the Picard number of 𝑋.

Lemma 5.3. Let 𝑋 be a rationally connected complex variety. Then the Picard
number 𝜌(𝑋) is equal to the Betti number 𝑏2 (𝑋). ◻

We fix some notation. For an abelian group 𝐴 and a prime number ℓ, we
denote
$$A\{\ell\} = \{\, x \in A \mid \ell^n x = 0 \text{ for some } n \,\},$$
which is naturally a Zℓ -module. Suppose 𝑀 is a Zℓ -module of cofinite type, i.e.
one has
$$M \simeq (\mathbf{Q}_\ell / \mathbf{Z}_\ell)^{\oplus r} \oplus (\text{finite group}),$$
then we denote by $M_{\mathrm{fin}}$ its finite part, which is the largest finite submodule of 𝑀
that is a direct summand.

Lemma 5.4. Let 𝑋 be a rationally connected complex variety, and let ℓ be a prime
number. Then
$$\mathrm{Br}(X)\{\ell\} \simeq H^3(X, \mathbf{Z}_\ell(1))\{\ell\},$$
where the right hand side is the étale cohomology of the sheaf $\mathbf{Z}_\ell(1) = \varprojlim_n \mu_{\ell^n}$,
which may be identified with Zℓ over C.

Proof. By [Gro68, II, Theorem 3.1, p. 80], we have an exact sequence

0 → Pic(𝑋) ⊗Z Qℓ ∕Zℓ ⟶ 𝐻 2 (𝑋, μℓ∞ ) ⟶ Br(𝑋) {ℓ} → 0.

Since the first term is a finite sum of copies of Qℓ ∕Zℓ , we have

Br(𝑋) {ℓ}fin ≃ 𝐻 2 (𝑋, μℓ∞ )fin .

By Lemma 5.3, the “corank” of Br(𝑋) {ℓ} (i.e. number of summands Qℓ ∕Zℓ ) is
𝑏2 − 𝜌 = 0, so that
Br(𝑋) {ℓ}fin ≃ Br(𝑋) {ℓ}.

By [Gro68, III, (8.3), p. 144], we have an exact sequence

$$0 \to H^2(X, \mu_{\ell^\infty})_{\mathrm{fin}} \longrightarrow H^3(X, \mathbf{Z}_\ell(1)) \longrightarrow \varprojlim_n H^3(X, \mu_{\ell^\infty})[\ell^n] \to 0,$$

where [ℓ𝑛 ] indicates the subgroup of elements killed by ℓ𝑛 . Since the last term is
torsion-free, we have

𝐻 2 (𝑋, μℓ∞ )fin ≃ 𝐻 3 (𝑋, Zℓ (1)) {ℓ},

whence the result follows. ◻



Consequences
Following Colliot-Thélène and Pirutka [CTP16], we present some consequences
of the example given in the previous subsection.
The first result provides examples of smooth quartic threefolds over complex
numbers, which are not retract rational.
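Theorem 5.5. Let $P \to \mathbf{P}^N_{\mathbf{C}}$ be the family of all quartic hypersurfaces in $\mathbf{P}^4_{\mathbf{C}}$. Let
$t \in \mathbf{C} \setminus \bar{\mathbf{Q}}$ be a transcendental number. Then the set

$$\{\, z \in \mathbf{P}^N_{\mathbf{C}} \mid z \text{ has coordinates in } \bar{\mathbf{Q}}(t), \text{ and the hypersurface } P_z \text{ is smooth but not retract rational} \,\}$$

is Zariski dense in $\mathbf{P}^N_{\mathbf{C}}$.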

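Proof. By Theorem 5.1, there is a quartic hypersurface $X \subset \mathbf{P}^4_{\bar{\mathbf{Q}}}$, with a desingu-
larisation
$$f \colon \tilde X \to X$$
such that 𝑓 is universally CH0 -trivial, and $\mathrm{Br}(\tilde X) \neq 0$.
Let $W \subset \mathbf{P}^N_{\bar{\mathbf{Q}}}$ be the closed subset corresponding to the singular hypersurfaces.
Let $M \in \mathbf{P}^N_{\bar{\mathbf{Q}}}$ be the point corresponding to 𝑋. Choose a point $M' \in \mathbf{P}^N_{\bar{\mathbf{Q}}} \setminus W$, and
let
$$L \simeq \mathbf{P}^1_{\bar{\mathbf{Q}}} \subset \mathbf{P}^N_{\bar{\mathbf{Q}}}$$
be the straight line connecting 𝑀 and 𝑀′, with generic point 𝜂. Then Theorem 4.9
implies that the quartic threefold $X^\circ$ defined by 𝜂, over the field $\bar{\mathbf{Q}}(x)$, is not geo-
metrically retract rational. Moreover, by Theorem 4.11, for any embedding

$$\bar{\mathbf{Q}}(x) \hookrightarrow \mathbf{C},$$

the base change $X^\circ_{\mathbf{C}}$ is not retract rational, where a desingularisation of $X^\circ$ can be
obtained via Hironaka’s theorem, as is required by Theorem 4.11.
Let 𝑅 ∈ 𝐿(C) be a point whose coordinates are in $\bar{\mathbf{Q}}(t)$, but not in $\bar{\mathbf{Q}}$. Then the
quartic threefold $P_R$ is isomorphic to $X^\circ_{\mathbf{C}}$, for some embedding $\bar{\mathbf{Q}}(x) \hookrightarrow \mathbf{C}$. Indeed,
we have a diagram of pull-back squares

$$\begin{array}{ccccccc}
P_R & \longrightarrow & P_{L_{\mathbf{C}}} & \longrightarrow & P_{\bar{\mathbf{Q}},\, L} & \longleftarrow & X^\circ \\
\downarrow & & \downarrow & & \downarrow & & \downarrow \\
\operatorname{Spec} \mathbf{C} & \xrightarrow{\;R\;} & L_{\mathbf{C}} & \longrightarrow & L_{\bar{\mathbf{Q}}} & \xleftarrow{\;\eta\;} & \operatorname{Spec} \bar{\mathbf{Q}}(x),
\end{array}$$

where $P_{\bar{\mathbf{Q}}} \to \mathbf{P}^N_{\bar{\mathbf{Q}}}$ denotes the family of all quartic hypersurfaces in $\mathbf{P}^4_{\bar{\mathbf{Q}}}$. By
the choice of 𝑅, in the diagram, the image of Spec C in $L_{\bar{\mathbf{Q}}}$ is the generic point
(since any other point is a $\bar{\mathbf{Q}}$-rational point, and cannot be 𝑅). Therefore, the
map $\operatorname{Spec} \mathbf{C} \to L_{\bar{\mathbf{Q}}}$ in the diagram factors through $\operatorname{Spec} \bar{\mathbf{Q}}(x)$, giving the desired
embedding.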

Therefore, 𝑃𝑅 is not retract rational. This shows that every line passing through
𝑀 and a point not in 𝑊 contains infinitely many points where the hypersurface is
not retract rational. This implies Zariski density. ◻

Together with results from §3, this result allows us to obtain general statements
on the irrationality of quartic threefolds over complex numbers.

Theorem 5.6 (Colliot-Thélène and Pirutka). A very general quartic hypersurface
in $\mathbf{P}^4_{\mathbf{C}}$ is not retract rational.

Proof. By Theorem 3.12, and by Theorem 5.5. ◻

Theorem 5.7. There are uncountably many stable equivalence classes in the family
of quartic hypersurfaces in P4C .

Proof. We apply Theorem 3.17.


We have seen that the family contains a threefold with no decomposition of the
diagonal. But the (singular) hypersurface

𝑋 ∶ 𝑥0 𝑥1 (𝑥0 + 𝑥1 )(𝑥0 − 𝑥1 ) = 0

has a decomposition of the diagonal. Indeed, let 𝑧 = (0 ∶ 0 ∶ 0 ∶ 0 ∶ 1) ∈ 𝑋,
and let 𝑋1 , … , 𝑋4 be the irreducible components of 𝑋, each isomorphic to P3 .
The diagonal class of 𝑋𝑖 × 𝑋𝑖 is rationally equivalent to [𝑋𝑖 ] × [𝑧], up to a minor
term 𝐷 as before. Summing over 𝑖, we see that the diagonal of 𝑋 × 𝑋 is rationally
equivalent to [𝑋] × [𝑧], up to a minor term. ◻

Remark 5.8. In these two theorems, the degree 4 can be replaced by any positive
multiple of 4, since one can consider quartic threefolds in P4 with multiplicity
𝑚 > 1, which will be a non-reduced hypersurface of degree 4𝑚. We use the fact that
the Chow group of a “non-reduced variety” is isomorphic to that of its reduction
[Ful98, Example 1.3.1].

Finally, we mention a result of Colliot-Thélène and Pirutka which provides
examples of smooth quartic hypersurfaces in $\mathbf{P}^4_{\mathbf{C}}$, which are defined over Q, but not
retract rational over C.

Theorem 5.9. There exist smooth quartic hypersurfaces in P4Q that are not univer-
sally CH0 -trivial over any field containing Q, and hence not retract rational over
any field containing Q, and in particular, over C.

Proof. By Theorem 5.1, there is a singular quartic hypersurface $X \subset \mathbf{P}^4_{\bar{\mathbf{Q}}}$, with a
desingularisation
$$f \colon \tilde X \to X$$
such that 𝑓 is universally CH0 -trivial, and $\mathrm{Br}(\tilde X)\{2\} \neq 0$. By Lemma 5.4, we thus
have
$$H^3(\tilde X, \mathbf{Z}_2)\{2\} \neq 0.$$

By Galois descent and by Lemma 4.8, we choose a finite extension 𝐾∕Q, over
which 𝑋, $\tilde X$ and 𝑓 are defined, such that 𝑓 is universally CH0 -trivial.
Let 𝒪𝐾 be the ring of integers of 𝐾. Let 𝑈 ⊂ Spec 𝒪𝐾 be an open set, such that
there exists a map of 𝑈 -schemes

𝒻 ∶ 𝒳̃ → 𝒳 ,

which coincides with $f \colon \tilde X \to X$ over the generic point of 𝑈. Indeed, we
define them to be cut out by the same set of equations as 𝑋 and $\tilde X$ in the projective
space.
Shrinking 𝑈 if necessary, we assume that 𝑈 contains no 2-adic points, and that
𝒳̃ is smooth over 𝑈 .
Shrinking 𝑈 again, we assume that for any closed point 𝑣 ∈ 𝑈, the map of geomet-
rical fibres
$$\tilde{\mathcal{X}}_{\overline{\kappa(v)}} \to \mathcal{X}_{\overline{\kappa(v)}}$$
induced by 𝒻 is a desingularisation map which is universally CH0 -trivial. In fact, it is what we
get if we start with $\mathrm{k} = \overline{\kappa(v)}$ in §5.
Applying the smooth specialisation property of étale cohomology [Mil80,
Corollary VI.4.2, p. 230], we see that for any 𝑣 ∈ 𝑈 ,

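$$H^3(\tilde{\mathcal{X}}_{\overline{\kappa(v)}}, \mathbf{Z}_2) \simeq H^3(\tilde X_{\bar{\mathbf{Q}}}, \mathbf{Z}_2).$$

It follows that $H^3(\tilde{\mathcal{X}}_{\overline{\kappa(v)}}, \mathbf{Z}_2)\{2\} \neq 0$, and hence $\mathrm{Br}(\tilde{\mathcal{X}}_{\overline{\kappa(v)}}) \neq 0$ by Lemma 5.4.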


Fix a point 𝑣 ∈ 𝑈. We regard 𝑣 as a discrete valuation on 𝐾. By Lemma 4.6,
there is an extension of discrete valuation rings

$$\mathcal{O}_{K,v} \subset A,$$

such that the residue field of 𝐴 is $\overline{\kappa(v)}$. Let 𝐿 be the fraction field of 𝐴.
Finally, there exists an 𝐴-scheme with smooth generic fibre whose special fibre is $\mathcal{X}_{\overline{\kappa(v)}}$. Indeed,
in the projective space $\mathbf{P}^N_L$ parametrising the quartic hypersurfaces in $\mathbf{P}^4_L$, the set
of points in $\mathbf{P}^N_L$ with coordinates in $\mathcal{O}_L$ lying over $\mathcal{X}_{\overline{\kappa(v)}}$ is Zariski dense. But there
is an open set of $\mathbf{P}^N_L$ whose points correspond to smooth hypersurfaces.
Now we apply Theorem 4.11 to complete the proof. ◻

6 Example: Cubic threefolds


In this section, we consider a general example of a cubic hypersurface in P4 , fol-
lowing [CTP16].

• Let 𝑝 ≠ 3 be a prime number.

• Consider one of the following two situations.



  – Let k be either a finite extension of Q𝑝 , or the field F𝑞 ((𝑡)), where 𝑞 is a
    power of 𝑝. Let 𝐴 be its ring of integers and F the finite residue field.
    Let 𝜋 ∈ 𝐴 be a uniformising element.
  – Or, let k be a number field in which 𝑝 is a prime. Let 𝐴 ⊂ k be the
    corresponding discrete valuation ring, and let F be the finite residue
    field. Let 𝜋 = 𝑝.
• Let K∕k be a cubic extension which is unramified at 𝜋, giving a cubic exten-
sion E∕F of residue fields.
• Let 𝛼 be an element of the ring of integers of K, such that K = k(𝛼). Let
𝛽 ∈ E be its image.
• Let
Φ ∈ 𝐴[𝑢, 𝑣, 𝑤, 𝑥, 𝑦]

be a cubic homogeneous polynomial, which defines a smooth hypersurface
in $\mathbf{P}^4_A$.
• Let $\mathcal{X} \subset \mathbf{P}^4_A$ be the hypersurface cut out by the equation

$$\Psi = \operatorname{Norm}_{\mathrm{K}/\mathrm{k}}(u + \alpha v + \alpha^2 w) + xy(x - y) + \pi^m\, \Phi(u, v, w, x, y) = 0,$$

where 𝑚 > 0 is an integer that is to be chosen.
• We can choose 𝑚 so that the generic fibre

$$X^\circ = \mathcal{X} \times_A \mathrm{k}$$

is smooth over k. In fact, the discriminant [Sal76, Article 105, p. 93] of Ψ
is a polynomial in $\pi^m$, which is non-zero since the coefficient of its leading
term is the discriminant of Φ. Therefore there are only finitely many values
of 𝑚 for which it is zero.
• We now look at the special fibre

$$X = \mathcal{X} \times_A \mathrm{F},$$

which is defined by the equation

$$\operatorname{Norm}_{\mathrm{E}/\mathrm{F}}(u + \beta v + \beta^2 w) + xy(x - y) = 0.$$

Let 𝛽1 , 𝛽2 , 𝛽3 be the conjugates of 𝛽. Consider the linear coordinate change
over E

$$(u, v, w) \mapsto (u + \beta_1 v + \beta_1^2 w,\ u + \beta_2 v + \beta_2^2 w,\ u + \beta_3 v + \beta_3^2 w). \tag{6.0.1}$$

The equation is now simplified over E:

$$uvw + xy(x - y) = 0.$$

Geometrically, this hypersurface has three singular points

$$(1 : 0 : 0 : 0 : 0), \quad (0 : 1 : 0 : 0 : 0), \quad (0 : 0 : 1 : 0 : 0). \tag{6.0.2}$$

They define a single point 𝑀 ∈ 𝑋 with residue field E.

• Let 𝑋1 be the blow-up of 𝑋 at 𝑀. Standard computation shows that under
the map
$$(X_1)_{\mathrm{E}} \to X_{\mathrm{E}},$$
the inverse image of each of the 3 singular points is a union of two surfaces
$\mathbf{P}^2_{\mathrm{E}}$, which intersect in a line $\mathbf{P}^1_{\mathrm{E}}$, which contains 3 singular points.

• The Galois group 𝐺 = Gal(E∕F) acts on $X_{\mathrm{E}}$ by permuting 𝑢, 𝑣, 𝑤 cyclically.
It follows that $X_1 \simeq (X_1)_{\mathrm{E}} / G$, as a blow-up of 𝑋, has an exceptional divisor
which is a union of two surfaces $\mathbf{P}^2_{\mathrm{E}}$, and contains 3 singular points with
residue field E.

• Let 𝑋2 be the blow-up of 𝑋1 at these three points. Standard computation
shows that 𝑋2 is smooth over F, and that the exceptional divisor over each
point is a rational surface.

• We conclude by Theorem 2.13 that the map

𝑋2 → 𝑋

is a desingularisation map that is universally CH0 -trivial.

• We have Br(𝑋2 ) ≠ 0 by Theorem 6.4 below. Note that Br(F) = 0 by
Wedderburn’s theorem, so that Br(𝑋2 )∕Br(F) ≠ 0.
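
Three computations used in the list above can be spelled out. The block below is a sketch only: the names $\Psi_0$, $s$, $D$, $u', v', w'$ and $G$ are ours, $D$ denotes the degree of the discriminant as a homogeneous polynomial in the coefficients of a cubic form, and part (b) assumes that the conjugates $\beta_1, \beta_2, \beta_3$ are pairwise distinct, which the coordinate change (6.0.1) implicitly requires. Part (c) is where the hypothesis 𝑝 ≠ 3 enters.

```latex
% (a) The discriminant of \Psi as a polynomial in s = \pi^m.
%     Write \Psi = \Psi_0 + s\,\Phi with \Psi_0 = Norm(u + \alpha v + \alpha^2 w) + xy(x - y).
\[
  \operatorname{Disc}(\Psi_0 + s\,\Phi)
    = s^{D} \operatorname{Disc}(\Phi + s^{-1}\Psi_0)
    = s^{D} \operatorname{Disc}(\Phi) + (\text{terms of lower degree in } s).
\]
% Disc(\Phi) \neq 0 because \Phi defines a smooth hypersurface, so this polynomial
% in s is non-zero and has finitely many roots.

% (b) The norm becomes a product in the new coordinates.
\[
  \operatorname{Norm}_{\mathrm{E}/\mathrm{F}}(u + \beta v + \beta^2 w)
    = \prod_{i=1}^{3} (u + \beta_i v + \beta_i^2 w) = u' v' w',
  \qquad
  \det \begin{pmatrix} 1 & \beta_1 & \beta_1^2 \\ 1 & \beta_2 & \beta_2^2 \\ 1 & \beta_3 & \beta_3^2 \end{pmatrix}
    = \prod_{i<j} (\beta_j - \beta_i) \neq 0 .
\]

% (c) Singular points of G = uvw + xy(x - y) = uvw + x^2 y - x y^2.
\begin{align*}
  \partial_u G &= vw, & \partial_v G &= uw, & \partial_w G &= uv, \\
  \partial_x G &= y(2x - y), & \partial_y G &= x(x - 2y). &&
\end{align*}
% If y = 0, then \partial_y G = x^2 = 0 forces x = 0.
% If y \neq 0, then y = 2x and \partial_y G = x(x - 4x) = -3x^2 = 0, so x = 0
% (as the characteristic is not 3), hence y = 0, a contradiction.
% So x = y = 0, and vw = uw = uv = 0 forces two of u, v, w to vanish:
\[
  \operatorname{Sing} = \{ (1:0:0:0:0),\ (0:1:0:0:0),\ (0:0:1:0:0) \}.
\]
```

The cyclic action of 𝐺 on 𝑢, 𝑣, 𝑤 permutes these three points, which is why they form a single closed point 𝑀 of 𝑋 with residue field E.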

In summary, we have obtained the following theorem.

Theorem 6.1. Let k be one of the following:

• a number field,

• a finite extension of Q𝑝 with 𝑝 ≠ 3, or

• the field F𝑞 ((𝑡)) of characteristic not equal to 3.

Then there exist smooth cubic hypersurfaces in $\mathbf{P}^4_{\mathrm{k}}$ that are not universally CH0 -
trivial over k, and hence not retract rational over k.

Proof. By the above construction, and by Theorem 4.7, the generic fibre 𝑋 ∘ , as a
smooth k-variety, is not universally CH0 -trivial. ◻

Now, we complete the computation of the Brauer group of 𝑋2 .



Lemma 6.2. Let K be a field, and let

𝑅 = K[𝑢, 𝑣, 𝑤]∕(𝑢𝑣𝑤 − 1).

Then an element of 𝑅 is invertible if and only if it is of the form $t\, u^m v^n$, with $t \in \mathrm{K}^\times$
and 𝑚, 𝑛 ∈ Z. ◻
Lemma 6.3. Let 𝑝 ∶ 𝑋 → 𝐵 be a morphism of smooth varieties, where 𝐵 is an
integral curve. Suppose that
• Pic(𝐵) = 0.
• The Picard group of the generic fibre of 𝑝 is zero.
• For each 𝑏 ∈ 𝐵 such that 𝑋𝑏 is not integral, every irreducible component of
𝑋𝑏 is a principal divisor of 𝑋.
Then Pic(𝑋) = 0.
Proof. Let 𝐷 ⊂ 𝑋 be an irreducible divisor, with generic point 𝜂. We want to
show that 𝐷 is principal. There are two cases.
• 𝑝(𝜂) is a closed point 𝑏 ∈ 𝐵. Then 𝐷 is an irreducible component of the
fibre 𝑋𝑏 . If 𝑋𝑏 is not irreducible, the result follows from the hypotheses.
Otherwise, we have 𝐷 = 𝑋𝑏 , so that the rational function on 𝐵 establishing
the divisor 𝑏 ∈ 𝐵 as principal, also establishes 𝐷 as principal.
• 𝑝(𝜂) is the generic point. Then 𝜂 defines a divisor of the generic fibre, which
is principal by hypothesis. We thus obtain a rational function on 𝑋, whose
divisor is the sum of 𝐷 and some other divisors, each being an irreducible
component of a fibre of 𝑝. We can thus apply the first case. ◻
Theorem 6.4. Using the above notation, let 𝑌 be any desingularisation of 𝑋. For
example, we may take 𝑌 = 𝑋2 . Then

Br(𝑌 ) ≃ Z∕3.

Proof. As before, we consider the coordinate change (6.0.1), so that $X_{\mathrm{E}}$ is defined
by the equation
$$uvw + xy(x - y) = 0,$$

with the action of the Galois group 𝐺 = Gal(E∕F) ≃ Z∕3 by permuting the coor-
dinates 𝑢, 𝑣, 𝑤 cyclically.
Let 𝑈 ⊂ 𝑋 be the smooth locus, so that 𝑈E ⊂ 𝑋E is the complement of the
three singular points (6.0.2).
Let 𝑉 ⊂ 𝑈 be the open set given by 𝑥𝑦 ≠ 0. Then 𝑈E ⧵ 𝑉E consists of six
irreducible components Δ𝑢,𝑥 , Δ𝑣,𝑥 , Δ𝑤,𝑥 , Δ𝑢,𝑦 , Δ𝑣,𝑦 , Δ𝑤,𝑦 , where for example, Δ𝑢,𝑥
is defined by 𝑢 = 𝑥 = 0. The group 𝐺 acts on them by permuting 𝑢, 𝑣, 𝑤.
Let $p \colon V_{\mathrm{E}} \to \mathbf{A}^1_{\mathrm{E}} \setminus \{0\}$ be the projection given by (𝑢, 𝑣, 𝑤, 𝑥, 𝑦) ↦ 𝑥∕𝑦. The
generic fibre is isomorphic to the surface 𝑢𝑣𝑤 = 1 in the affine space $\mathbf{A}^3_{\mathrm{E}(x)}$, which
is isomorphic to the open subset 𝑢𝑣 ≠ 0 of $\mathbf{A}^2_{\mathrm{E}(x)}$, so that its Picard group is zero.
Moreover, the only non-integral fibre is the fibre at 1, which consists of three irre-
ducible components, each being a principal divisor of $V_{\mathrm{E}}$, since they are defined in
$V_{\mathrm{E}}$ by the equations 𝑢 = 0, 𝑣 = 0 and 𝑤 = 0 respectively. Applying Lemma 6.3,
we obtain $\mathrm{Pic}(V_{\mathrm{E}}) = 0$.
Let Div𝑈E ⧵𝑉E (𝑈E ) denote the group of divisors of 𝑈E supported in 𝑈E ⧵ 𝑉E . It
is a free abelian group of rank 6, generated by the divisors Δ𝑢,𝑥 , etc. The canonical
map
𝛽 ∶ Div𝑈E ⧵𝑉E (𝑈E ) → Pic(𝑈E )

is surjective, as its image is the kernel of the restriction map Pic(𝑈E ) → Pic(𝑉E ), the
latter group being zero. The kernel of 𝛽 consists of those divisors that are principal
in 𝑈E . We thus have an exact sequence of 𝐺-modules
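$$0 \to \mathrm{E}[V_{\mathrm{E}}]^\times / \mathrm{E}[U_{\mathrm{E}}]^\times \xrightarrow{\;\alpha\;} \mathrm{Div}_{U_{\mathrm{E}} \setminus V_{\mathrm{E}}}(U_{\mathrm{E}}) \xrightarrow{\;\beta\;} \mathrm{Pic}(U_{\mathrm{E}}) \to 0. \tag{6.4.1}$$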

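Let us take a closer look at the first term. Suppose $f \in \mathrm{E}[V_{\mathrm{E}}]^\times$. Using the
projection $p \colon V_{\mathrm{E}} \to \mathbf{A}^1_{\mathrm{E}} \setminus \{0\}$ mentioned above, we can apply Lemma 6.2 to K =
E(𝑥), to conclude that 𝑓 has the form $f = t(x/y)\, u^m v^n$, for 𝑡 a rational function,
and 𝑚, 𝑛 ∈ Z. Since 𝑓 has to be invertible on $V_{\mathrm{E}}$, we must have 𝑚 = 𝑛 = 0, and
$t(x/y) = c\,(x/y)^k$ for some $c \in \mathrm{E}^\times$ and 𝑘 ∈ Z. It follows that $\mathrm{E}[V_{\mathrm{E}}]^\times \simeq \mathrm{E}^\times \oplus \mathbf{Z}$
and $\mathrm{E}[U_{\mathrm{E}}]^\times \simeq \mathrm{E}^\times$. The sequence (6.4.1) is thus

$$0 \to \mathbf{Z} \xrightarrow{\;\alpha\;} \mathbf{Z}[G] \oplus \mathbf{Z}[G] \xrightarrow{\;\beta\;} \mathrm{Pic}(U_{\mathrm{E}}) \to 0,$$

and the map 𝛼 sends 𝑘 ∈ Z to the divisor of the function $(x/y)^k$, which is the
element $(k\varepsilon, -k\varepsilon) \in \mathbf{Z}[G] \oplus \mathbf{Z}[G]$, where $\varepsilon = \sum_{g \in G} g \in \mathbf{Z}[G]$.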
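
The description of 𝛼 can be verified directly from the six components Δ. The following is a sketch; the identification of each copy of Z[𝐺] with the free module on {Δ𝑢,𝑥 , Δ𝑣,𝑥 , Δ𝑤,𝑥 } (respectively on the three Δ∗,𝑦 ), with Δ𝑢,∗ corresponding to 1 ∈ Z[𝐺], is our choice of basis.

```latex
% On U_E, the function x vanishes only where x = 0, which forces uvw = 0,
% i.e. along \Delta_{u,x}, \Delta_{v,x}, \Delta_{w,x}; similarly for y.
% At the generic point of \Delta_{u,x} the functions v, w, y, x - y are units, so the
% defining equation expresses u as a unit multiple of x: x is a uniformiser there,
% and the multiplicity is 1.
\begin{align*}
  \operatorname{div}(x/y)
    &= (\Delta_{u,x} + \Delta_{v,x} + \Delta_{w,x}) - (\Delta_{u,y} + \Delta_{v,y} + \Delta_{w,y}) \\
    &= (\varepsilon, -\varepsilon) \in \mathbf{Z}[G] \oplus \mathbf{Z}[G],
  \qquad \varepsilon = \sum_{g \in G} g, \\
  \alpha(k) = \operatorname{div}\bigl((x/y)^k\bigr) &= (k\varepsilon, -k\varepsilon).
\end{align*}
```

In particular the divisor of 𝑥∕𝑦 is a norm in the 𝐺-module sense, namely the norm of Δ𝑢,𝑥 − Δ𝑢,𝑦 .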
Now is where the proof really begins. The exact sequence (6.4.1) itself will not be
used in what follows; what we use is its analogue with 𝑌 in place of
𝑈, where 𝑌 is a desingularisation of 𝑋 as in the statement of this theorem. Such
a sequence is obtained by the same process as in the above argument. This gives an exact
sequence of Tate cohomology groups

$$0 \to \hat H^{-1}(G, \mathrm{Pic}(Y_{\mathrm{E}})) \xrightarrow{\;\delta\;} \underbrace{\hat H^{0}\bigl(G, \overbrace{\mathrm{E}[V_{\mathrm{E}}]^\times / \mathrm{E}^\times}^{\simeq\, \mathbf{Z}}\bigr)}_{\simeq\, \mathbf{Z}/3} \xrightarrow{\;\alpha'\;} \hat H^{0}(G, \mathrm{Div}_{Y_{\mathrm{E}} \setminus V_{\mathrm{E}}}(Y_{\mathrm{E}})) \to \cdots,$$

where the $\hat H^{-1}$ of $\mathrm{Div}_{Y_{\mathrm{E}} \setminus V_{\mathrm{E}}}(Y_{\mathrm{E}})$ vanishes, since the latter is a direct sum of copies
of Z[𝐺] and Z, which both have zero $\hat H^{-1}$ by direct computation.
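
The "direct computation" can be written out as follows. This is a sketch using the standard description of Tate cohomology for a cyclic group; the generator $\sigma$ of 𝐺 ≃ Z∕3 and the norm element $N = 1 + \sigma + \sigma^2$ are our notation.

```latex
% For a cyclic group G = <\sigma> of order 3 and a G-module M:
\[
  \hat H^{0}(G, M) = M^{G} / N M,
  \qquad
  \hat H^{-1}(G, M) = \ker(N \colon M \to M) \,/\, (\sigma - 1) M .
\]
% (1) M = \mathbf{Z} with trivial action: N acts as multiplication by 3.
\[
  \hat H^{0}(G, \mathbf{Z}) = \mathbf{Z} / 3\mathbf{Z},
  \qquad
  \hat H^{-1}(G, \mathbf{Z}) = \ker(3) / 0 = 0 .
\]
% (2) M = \mathbf{Z}[G]: N(a_0 + a_1 \sigma + a_2 \sigma^2) = (a_0 + a_1 + a_2)\,\varepsilon.
\begin{align*}
  \mathbf{Z}[G]^{G} &= \mathbf{Z}\,\varepsilon = N\, \mathbf{Z}[G]
    &&\Longrightarrow\quad \hat H^{0}(G, \mathbf{Z}[G]) = 0, \\
  \ker N &= \{ a_0 + a_1 + a_2 = 0 \} = (\sigma - 1)\, \mathbf{Z}[G]
    &&\Longrightarrow\quad \hat H^{-1}(G, \mathbf{Z}[G]) = 0 .
\end{align*}
```

Case (1) also explains the middle term of the sequence: the module $\mathrm{E}[V_{\mathrm{E}}]^\times / \mathrm{E}^\times \simeq \mathbf{Z}$ is generated by 𝑥∕𝑦, on which 𝐺 acts trivially, so its $\hat H^{0}$ is Z∕3.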
The generator 𝑥∕𝑦 of the second non-zero term goes to a divisor of $Y_{\mathrm{E}}$ which is
the norm (in the 𝐺-module sense) of a divisor of $Y_{\mathrm{E}}$. Indeed, we have seen that the
divisor of the rational function 𝑥∕𝑦 on $U_{\mathrm{E}}$ is the norm of an element. But $Y_{\mathrm{E}} \setminus U_{\mathrm{E}}$ is
the inverse image of the three singular points of $X_{\mathrm{E}}$, and the action of 𝐺 permutes
these three parts of $Y_{\mathrm{E}} \setminus U_{\mathrm{E}}$. Therefore, the divisor of 𝑥∕𝑦 on $Y_{\mathrm{E}} \setminus U_{\mathrm{E}}$ is the norm
of the divisor of 𝑥∕𝑦 on one of these three parts. This shows that 𝛼′(𝑥∕𝑦) = 0, so
that 𝑥∕𝑦 is in the image of 𝛿. It follows that $\hat H^{-1}(G, \mathrm{Pic}(Y_{\mathrm{E}})) \simeq \mathbf{Z}/3$.
By [CTS77, Lemma 15], there is a short exact sequence

0 → Br(F, E) ⟶ Br(𝑌 , E) ⟶ 𝐻 1 (𝐺, Pic(𝑌E )) → 0,

where $\mathrm{Br}(Y, \mathrm{E}) = \ker(\mathrm{Br}(Y) \to \mathrm{Br}(Y_{\mathrm{E}}))$, and similarly for Br(F, E). Since F and E
are finite, one has Br(F, E) = 0 by Wedderburn’s theorem. Since 𝐺 is cyclic, one
has $H^1(G, \mathrm{Pic}(Y_{\mathrm{E}})) \simeq \hat H^{-1}(G, \mathrm{Pic}(Y_{\mathrm{E}})) \simeq \mathbf{Z}/3$. It follows that the middle term is
Z∕3. In other words, there is a short exact sequence

0 → Z∕3 ⟶ Br(𝑌 ) ⟶ Br(𝑌E ) → 0.

Recall the projection $p \colon V_{\mathrm{E}} \to \mathbf{A}^1_{\mathrm{E}}$ defined above. The inverse
image of the complement of {0, 1} is isomorphic to the subset of $\mathbf{A}^3_{\mathrm{E}}$ defined by
𝑢𝑣𝑥(𝑥 − 1) ≠ 0. This shows that $Y_{\mathrm{E}}$ is rational, so that $\mathrm{Br}(Y_{\mathrm{E}}) \simeq \mathrm{Br}(\mathrm{E}) \simeq 0$ by
Theorem 2.23 and Wedderburn’s theorem. Therefore, Br(𝑌 ) ≃ Z∕3. ◻
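
The isomorphism with an open subset of affine 3-space used at the end of the proof can be made explicit. This is a sketch in the affine chart 𝑦 = 1, starting from the simplified equation 𝑢𝑣𝑤 + 𝑥𝑦(𝑥 − 𝑦) = 0 of this section (the sign convention plays no rôle); the names $\varphi$ and $\psi$ are ours.

```latex
% On p^{-1}(\mathbf{A}^1 \setminus \{0, 1\}) we may set y = 1, and then x \neq 0, 1 and
%   uvw = -x(x - 1) \neq 0,
% so u, v, w are all invertible and w is determined by (u, v, x).
\begin{align*}
  \varphi &\colon p^{-1}(\mathbf{A}^1_{\mathrm{E}} \setminus \{0,1\}) \to \{\, uvx(x-1) \neq 0 \,\} \subset \mathbf{A}^3_{\mathrm{E}},
    & (u, v, w, x) &\mapsto (u, v, x), \\
  \psi &\colon \{\, uvx(x-1) \neq 0 \,\} \to p^{-1}(\mathbf{A}^1_{\mathrm{E}} \setminus \{0,1\}),
    & (u, v, x) &\mapsto \Bigl( u,\ v,\ -\frac{x(x-1)}{uv},\ x \Bigr).
\end{align*}
% \varphi and \psi are mutually inverse, so a dense open subset of Y_E is isomorphic
% to an open subset of \mathbf{A}^3_E.
```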

7 Example: Quadric surface bundles


This section presents the result of Hassett, Pirutka and Tschinkel [HPT16], which
states that over complex numbers, a very general fourfold which is a quadric surface
bundle over P2 is not retract rational, while those fourfolds that are rational are
dense in the family, in euclidean topology.
Definition 7.1. Let 𝑆 be an integral surface. A quadric surface bundle over 𝑆 is
a fourfold 𝑋 ⊂ 𝑆 × P3 , such that the composition
$$\pi \colon X \hookrightarrow S \times \mathbf{P}^3 \xrightarrow{\;\mathrm{pr}_1\;} S$$
is flat with smooth generic fibre.

Irrationality
We are interested in the case 𝑆 = P2 , and we consider the family of all hypersur-
faces of P2 × P3 of bidegree (2, 2). A general member of this family is a quadric
surface bundle over P2 .
As before, we use a specific example to establish the irrationality of a very
general member.

• Consider the fourfold 𝑋 ⊂ P2 × P3 , given by

$$yz\, s^2 + xz\, t^2 + xy\, u^2 + F(x, y, z)\, v^2 = 0,$$

where 𝑥, 𝑦, 𝑧 are the coordinates of P2 , and 𝑠, 𝑡, 𝑢, 𝑣 the coordinates of P3 ,
with
$$F(x, y, z) = x^2 + y^2 + z^2 - 2yz - 2xz - 2xy.$$

• Hassett, Pirutka and Tschinkel [HPT16, §5] constructed a universally CH0 -
trivial desingularisation of 𝑋.
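
The special shape of 𝐹 is what makes this example work: it restricts to a perfect square on each coordinate line. The identities below are a direct expansion and can be checked by hand.

```latex
\begin{align*}
  F(0, y, z) &= y^2 + z^2 - 2yz = (y - z)^2, \\
  F(x, 0, z) &= x^2 + z^2 - 2xz = (x - z)^2, \\
  F(x, y, 0) &= x^2 + y^2 - 2xy = (x - y)^2, \\
  F(x, y, z) &= (x + y - z)^2 - 4xy .
\end{align*}
% Geometrically: the conic F = 0 is tangent to each of the lines x = 0, y = 0, z = 0.
```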
Let 𝐴 be a discrete valuation ring, with valuation 𝜈, fraction field K, and residue
field 𝜅. There is a residue map
$$\partial_\nu \colon H^2(\mathrm{K}, \mathbf{Z}/2) \to H^1(\kappa, \mathbf{Z}/2) \simeq \kappa^\times / \kappa^{\times 2},$$
which sends
$$(a, b) \mapsto (-1)^{\nu(a)\,\nu(b)}\, a^{\nu(b)} / b^{\nu(a)},$$
where $a, b \in \mathrm{K}^\times$, and (𝑎, 𝑏) = 𝑎 ∪ 𝑏 is the cup product of 𝑎 and 𝑏. The kernel of $\partial_\nu$
coincides with the image of $H^2(\operatorname{Spec} A, \mathbf{Z}/2)$, so that an element of $H^2(\mathrm{K}, \mathbf{Z}/2)$
is unramified (Definition 2.20) if and only if it is in the kernel of $\partial_\nu$ for all 𝜈. See
[CT95] for more details.
Proposition 7.2. Br(𝑋) contains non-trivial 2-torsion. In fact, let
𝛼 = (𝑥∕𝑧, 𝑦∕𝑧) ∈ Br(C(P2 )) [2],
and let 𝛼 ′ ∈ Br(C(𝑋)) be its image. Then 𝛼 ′ is non-zero and unramified, i.e., lies
in Br(𝑋).
Proof. The generic fibre $X^\circ$ of 𝑋 → P2 is a quadric surface over the field K =
C(𝑥∕𝑧, 𝑦∕𝑧), and its discriminant is not a square in K. By [CTS19, Propo-
sition 6.2.3 (c)], the natural map
$$i \colon \mathrm{Br}(\mathrm{K}) \to \mathrm{Br}(X^\circ)$$
is an isomorphism, so that 𝛼′ ≠ 0 because 𝛼 ≠ 0 (it has the non-trivial residues
listed below). As $\mathrm{K}(X^\circ) \simeq \mathbf{C}(X)$, it remains to show that 𝛼′ is unramified,
i.e., $\partial_\nu(\alpha') = 0$ for all valuations 𝜈 on C(𝑋)∕C.
Let us first look at the residues of 𝛼. By definition, only the following residues
are non-trivial:
• 𝜕𝑥 (𝛼) = 𝑦∕𝑧 ∈ C(𝑦∕𝑧)× ∕C(𝑦∕𝑧)×2 , along the line 𝐿𝑥 ∶ 𝑥 = 0.
• 𝜕𝑦 (𝛼) = 𝑥∕𝑧 ∈ C(𝑥∕𝑧)× ∕C(𝑥∕𝑧)×2 , along the line 𝐿𝑦 ∶ 𝑦 = 0.

• 𝜕𝑧 (𝛼) = 𝑥∕𝑦 ∈ C(𝑥∕𝑦)× ∕C(𝑥∕𝑦)×2 , along the line 𝐿𝑧 ∶ 𝑧 = 0.


Now let 𝜈 be a valuation on C(𝑋)∕C. We need to show that 𝜕𝜈 (𝛼 ′ ) = 0.
Let 𝒪𝜈 ⊂ C(𝑋) be the valuation ring of 𝜈. If 𝒪𝜈 contains K, then 𝜕𝜈 (𝛼 ′ ) = 0.
Therefore, if we consider the centre of 𝜈 in P2 , there are two remaining cases.
• The centre is the generic point of a curve 𝐶 ⊂ P2 . The inclusion of discrete
valuation rings 𝒪P2 ,𝐶 ⊂ 𝒪𝜈 induces a commutative diagram
\[
\begin{array}{ccc}
\alpha \in H^2(\mathrm{K}, \mathbb{Z}/2) & \xrightarrow{\ \partial_\nu\ } & H^1(\kappa(C), \mathbb{Z}/2) \\
\downarrow & & \downarrow \\
\alpha' \in H^2(\mathbb{C}(X), \mathbb{Z}/2) & \xrightarrow{\ \partial_\nu\ } & H^1(\kappa(\nu), \mathbb{Z}/2).
\end{array}
\]
It follows that if 𝐶 is different from 𝐿𝑥 , 𝐿𝑦 or 𝐿𝑧 , then 𝜕𝜈 (𝛼 ′ ) = 0, since
𝜕𝜈 (𝛼) = 0. If, for example, 𝐶 = 𝐿𝑥 , then 𝜕𝜈 (𝛼 ′ ) = 𝑦 in the residue field
\[ \mathbb{C}(y, t, u)\bigl[ s = \sqrt{F(0, y, 1)/y} \,\bigr] \simeq \mathbb{C}(\sqrt{y}, t, u), \]
where we have set 𝑧 = 1 and 𝑣 = 1. Therefore, 𝜕𝜈 (𝛼 ′ ) is a square in the
residue field, and hence is trivial. (The key point is that 𝐹 (𝑥, 𝑦, 𝑧) is a square
modulo any one of 𝑥, 𝑦, 𝑧.)

• The centre is a closed point 𝑃 ∈ P2 . There are three cases.

(i) 𝑃 ∉ 𝐿𝑥 ∪ 𝐿𝑦 ∪ 𝐿𝑧 . Then 𝜈(𝑥∕𝑧) = 𝜈(𝑦∕𝑧) = 0, so that 𝜕𝜈 (𝛼 ′ ) = 0.

(ii) 𝑃 lies on one of the three lines, say 𝐿𝑥 . Then 𝑦∕𝑧 ≠ 0 at 𝑃 , so that
𝑦∕𝑧 is a square in the completion $\hat{\mathcal{O}}_{\mathbb{P}^2,P}$, which embeds in $\hat{\mathcal{O}}_\nu$, whose
fraction field is the completion C(𝑋)𝜈 . Thus 𝑦∕𝑧 is a square in C(𝑋)𝜈 ,
and 𝛼 ′ = 0 in 𝐻 2 (C(𝑋)𝜈 , Z∕2), so that 𝜕𝜈 (𝛼 ′ ) = 0.

(iii) 𝑃 lies on two of the three lines, say 𝐿𝑥 and 𝐿𝑦 . As in the previous
case, 𝐹 (𝑥, 𝑦, 𝑧)∕𝑧2 is a square in the completion $\hat{\mathrm{K}}$. Applying [CTS19,
Proposition 6.2.3 (c)] to the quadric $\hat{X}^\circ$, we see that the image of 𝛼 in
$H^2(\hat{\mathrm{K}}(\hat{X}^\circ), \mathbb{Z}/2)$ is zero. The natural map of fields $\hat{\mathrm{K}}(\hat{X}^\circ) \to \mathbb{C}(X)_\nu$
shows that 𝛼 is zero in 𝐻 2 (C(𝑋)𝜈 , Z∕2). Therefore, 𝜕𝜈 (𝛼 ′ ) = 0.
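The key point used in the proof, that 𝐹 (𝑥, 𝑦, 𝑧) is a square modulo any one of 𝑥, 𝑦, 𝑧, can be checked mechanically. The sketch below is only an illustration: it compares 𝐹 restricted to each coordinate line with the square of a linear form on a grid of integer points, which suffices because two polynomials of degree 2 in two variables that agree on an 11 × 11 grid are equal. The function name `square_on_coordinate_lines` is hypothetical.

```python
from itertools import product


def F(x, y, z):
    """The quadratic form F from the defining equation of X."""
    return x * x + y * y + z * z - 2 * y * z - 2 * x * z - 2 * x * y


def square_on_coordinate_lines(bound: int = 5) -> bool:
    """Check F(0,y,z) = (y-z)^2, F(x,0,z) = (x-z)^2, F(x,y,0) = (x-y)^2 on a grid."""
    grid = range(-bound, bound + 1)
    return all(
        F(0, a, b) == (a - b) ** 2
        and F(a, 0, b) == (a - b) ** 2
        and F(a, b, 0) == (a - b) ** 2
        for a, b in product(grid, repeat=2)
    )
```

For instance, on 𝐿𝑥 with 𝑧 = 1 this gives 𝐹 (0, 𝑦, 1) = (𝑦 − 1)2 , which is why the residue field above is obtained by adjoining only √𝑦.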

Applying Theorem 2.23, and applying Theorem 3.12 to the family of bidegree
(2, 2) hypersurfaces in P2 × P3 , we obtain the following.

Corollary 7.3. A very general bidegree (2, 2) hypersurface in P2 × P3 is not retract
rational. ◻

Density of the rational locus


Now we prove a remarkable fact about this example: the rational members of
the family of quadric surface bundles over P2 are dense. This also shows
that in Theorem 3.12, “a countable union of closed sets” cannot be improved to “a
closed set”.
By a multisection of 𝑋∕𝑆 of degree 𝑑, we mean a family of 0-cycles of degree 𝑑,
in the sense of Definition 3.1.

Lemma 7.4. Let 𝑆 be a rational surface, and 𝑋 → 𝑆 a quadric surface bundle.
Suppose that 𝑋∕𝑆 has a multisection of odd degree. Then 𝑋 is rational.

Proof. 𝑋 is rational, if and only if the generic fibre 𝑋 ∘ is rational over the field
C(𝑆). Since 𝑋 ∘ is a smooth quadric surface, it is rational if and only if it has a
rational point, as the projection from a rational point will give a birational map
between 𝑋 ∘ and P2 . Thus it suffices to show that 𝑋 ∘ has a C(𝑆)-rational point.
By a theorem of Springer [Spr52], 𝑋 ∘ has a C(𝑆)-rational point, if and only
if 𝑋 ∘ has a K-rational point for some extension K∕C(𝑆) of odd degree. Thus, we
only need to show that 𝑋 ∘ has a 0-cycle of odd degree, which will imply that 𝑋 ∘
has a closed point of odd degree.
But by hypothesis, 𝑋∕𝑆 has a multisection of odd degree, which gives rise to
a 0-cycle of 𝑋 ∘ of odd degree. ◻

For a quadric surface bundle 𝑋 → 𝑆, and an integral (2, 2)-class, that is, an
element 𝛼 ∈ 𝐻 2,2 (𝑋) ∩ 𝐻 4 (𝑋; Z), we say that 𝛼 meets the fibre 𝑋𝑠 in degree 𝑑,
where 𝑠 ∈ 𝑆, if the pairing of 𝛼 with the homology class of 𝑋𝑠 equals 𝑑.

Lemma 7.5. Let 𝑆 be a rational surface, and 𝜋 ∶ 𝑋 → 𝑆 a quadric surface bundle.
Suppose that 𝑋 has an integral (2, 2)-class meeting the fibres of 𝜋 in odd degree.
Then 𝑋 is rational.

Proof. Let 𝑆0 ⊂ 𝑆 be the locus where the rank of the quadratic form is ≥ 3 (the
full rank is 4), and let 𝑋0 = 𝑋 ×𝑆 𝑆0 .
Let 𝐹1 → 𝑆 be the relative variety of lines of 𝜋, i.e., the points of the fibre (𝐹1 )𝑠
correspond to straight lines contained in the fibre 𝑋𝑠 . When 𝑋𝑠 is non-degenerate,
it contains 2 families of lines, each parametrised by P1 . When the rank of the
quadratic form drops by 1, 𝑋𝑠 becomes a quadric cone, which contains 1 family of
lines parametrised by P1 .
This shows that 𝐹1 |𝑆0 → 𝑆0 factors as
\[ F_1|_{S_0} \xrightarrow{\ p\ } T_0 \longrightarrow S_0, \]
where 𝑝 is an étale P1 -bundle, and 𝑇0 → 𝑆0 is a double cover branched along
𝑆0 ∩ 𝐷, where 𝐷 ⊂ 𝑆 is the locus of degenerate fibres.
Let 𝐹 be a desingularisation of the closure of 𝐹1 |𝑆0 in 𝐹1 . The correspondence
Γ1 = {(𝑥, ℓ) ∣ 𝑥 ∈ ℓ} ⊂ 𝑋 ×𝑆 𝐹1 induces a correspondence Γ from 𝑋 to 𝐹 , which
induces a map
Γ∗ ∶ 𝐻 2,2 (𝑋) → 𝐻 1,1 (𝐹 ).

On the other hand, let 𝜂 be the generic point of 𝑆. There is a map

Ξ∗ ∶ Pic(𝐹𝜂 ) ≃ CH0 (𝐹𝜂 ) → CH0 (𝑋𝜂 )

constructed as follows. For a divisor 𝑍 ⊂ 𝐹𝜂 , i.e. a choice of 𝑛 lines from each
family of lines on each quadric surface, let Ξ∗ (𝑍) ⊂ 𝑋𝜂 be the 𝑛2 points where
these lines intersect. Note that Ξ∗ sends a divisor of odd degree on each geometric
component of 𝐹𝜂 to a multisection of odd degree.
Now let us prove the lemma. By hypothesis, 𝑋 has an integral (2, 2)-class meeting
the fibres in odd degree. Applying the map Γ∗ , we obtain an integral (1, 1)-class
of 𝐹 meeting the fibres in odd degree. By the Lefschetz theorem on (1, 1)-classes
[GH94, p. 163], 𝐹 has a divisor which meets the fibres in odd degree. Finally,
applying the map Ξ∗ to this divisor, we obtain a multisection of 𝑋∕𝑆 which meets
the fibres in odd degree. Applying Lemma 7.4 completes the proof. ◻

Next, we analyse the Hodge classes in the case 𝑆 = P2 , in order to verify the
assumption of this lemma. The key tool is the following technique of Voisin.
Lemma 7.6. Let 𝑌 → 𝐵 be a flat, projective family of complex varieties. Suppose
there exists 𝑏 ∈ 𝐵 and 𝜆 ∈ 𝐻 𝑝,𝑝 (𝑌𝑏 , R), such that the infinitesimal period map

∇(𝜆) ∶ 𝑇𝐵,𝑏 → 𝐻 𝑝−1,𝑝+1 (𝑌𝑏 )

is surjective, where 𝑇𝐵,𝑏 denotes the tangent space of 𝐵 at 𝑏. Then for any open
set 𝑈 ⊂ 𝐵 (in euclidean topology) containing 𝑏, such that 𝑌 |𝑈 → 𝑈 is a trivial
bundle, the map (notations are explained below)
\[ \phi \colon \mathcal{H}^{p,p}_{\mathbb{R}}|_U \hookrightarrow \mathcal{H}^{2p}_{\mathbb{R}}|_U \simeq H^{2p}(Y_b, \mathbb{R}) \times U \to H^{2p}(Y_b, \mathbb{R}) \to F^{p-1} H^{2p}(Y_b, \mathbb{R}) \]
is submersive at 𝜆.
We use the notation $\mathcal{H}^{p,q}$, $\mathcal{H}^{p,q}_{\mathbb{R}}$, etc., to refer to the vector bundles over 𝐵,
whose fibres are the cohomology of the fibres of 𝑌 → 𝐵. The notation $F^{p-1} H^{2p}$
refers to the Hodge filtration, and in this case, it is equal to
$H^{p-1,p+1} \oplus H^{p,p} \oplus \cdots \oplus H^{2p,0}$.
Proof. See, for example, [Voi07, §5.3.4]. ◻
In the following, we use the notation

𝑌 →𝐵

for the family of all smooth bidegree (2, 2) hypersurfaces in P2 ×P3 , and the notation
𝑌𝑏 refers to its fibres.
Proposition 7.7. The Hodge and Betti numbers of 𝑌𝑏 are given by
• 𝑏0 = 𝑏8 = 1.
• 𝑏1 = 𝑏3 = 𝑏5 = 𝑏7 = 0.
• 𝑏2 = ℎ1,1 = 𝑏6 = ℎ3,3 = 2.
• 𝑏4 = 46, ℎ0,4 = ℎ4,0 = 0, ℎ1,3 = ℎ3,1 = 3, ℎ2,2 = 40.
Proof. The Lefschetz hyperplane theorem shows that

𝑏𝑘 (𝑌𝑏 ) = 𝑏𝑘 (P2 × P3 ) and ℎ𝑝,𝑞 (𝑌𝑏 ) = ℎ𝑝,𝑞 (P2 × P3 )

for 𝑘 < 4 and 𝑝 + 𝑞 < 4. This, together with Poincaré/Serre duality, gives the first
three items.
We compute 𝑏4 by analysing the map 𝑌𝑏 → P2 . Let 𝐷 ⊂ P2 be the locus of
degenerate fibres. If 𝑌𝑏 is defined by the equation
\[ \sum_{i,j=0}^{2} \sum_{k,l=0}^{3} a_{ijkl}\, x_i x_j y_k y_l = 0, \]
where the coefficients 𝑎𝑖𝑗𝑘𝑙 are assumed to be symmetric with respect to 𝑖, 𝑗 and
𝑘, 𝑙, then 𝐷 is cut out by the equation
\[ \det \Bigl( \sum_{i,j=0}^{2} a_{ijkl}\, x_i x_j \Bigr)_{0 \le k,\, l \le 3} = 0. \]
Therefore, 𝐷 is an octic curve, and hence has genus 21 and Euler number −40.
Recall that for a complex variety 𝑋 and a closed subvariety 𝑍 ⊂ 𝑋, we have
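an additive formula 𝜒(𝑋) = 𝜒(𝑍) + 𝜒(𝑋 ⧵ 𝑍) of Euler numbers. Hence we have
\begin{align*}
\chi(Y_b) &= \chi(\mathbb{P}^1 \times \mathbb{P}^1)\, \chi(\mathbb{P}^2 \setminus D) + \chi(\text{quadric cone})\, \chi(D) \\
&= 4 \cdot (3 - (-40)) + 3 \cdot (-40) = 52,
\end{align*}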
and it follows that 𝑏4 (𝑌𝑏 ) = 𝜒(𝑌𝑏 ) − 𝑏0 − 𝑏2 − 𝑏6 − 𝑏8 = 46.
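The arithmetic behind 𝑏4 = 46 can be replayed in a few lines. This is a small sketch that takes as given the facts used in the text (a smooth plane curve of degree 𝑑 has genus (𝑑 − 1)(𝑑 − 2)∕2, a smooth quadric surface has Euler number 4, a quadric cone has Euler number 3); the variable names are illustrative only.

```python
def plane_curve_genus(d: int) -> int:
    """Genus of a smooth plane curve of degree d."""
    return (d - 1) * (d - 2) // 2


# D is cut out by the determinant of a 4x4 matrix of quadrics, so deg D = 4 * 2 = 8.
degree_D = 4 * 2
genus_D = plane_curve_genus(degree_D)
chi_D = 2 - 2 * genus_D

chi_P2 = 3              # Euler number of the projective plane
chi_smooth_quadric = 4  # P1 x P1
chi_quadric_cone = 3    # vertex plus an affine-line bundle over a conic

# Additivity of the Euler number over the stratification P2 = (P2 \ D) ∪ D.
chi_Y = chi_smooth_quadric * (chi_P2 - chi_D) + chi_quadric_cone * chi_D

# b1 = b3 = b5 = b7 = 0, so chi = b0 + b2 + b4 + b6 + b8.
b0, b2, b6, b8 = 1, 2, 2, 1
b4 = chi_Y - (b0 + b2 + b6 + b8)
```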
To compute the remaining Hodge numbers, we apply the result of Batyrev and
Cox on hypersurfaces in toric varieties [BC94, Theorem 10.13], which implies that
the vanishing cohomology, defined by

𝐻 𝑝,𝑞 (𝑌𝑏 )van = 𝐻 𝑝,𝑞 (𝑌𝑏 )∕𝐻 𝑝,𝑞 (P2 × P3 )

is given by the formula

𝐻 𝑝,4−𝑝 (𝑌𝑏 )van ≃ Jac(𝐹 )(7−2𝑝,6−2𝑝) ,

where 𝐹 is the defining equation of 𝑌𝑏 , and

Jac(𝐹 ) = C[𝑥, 𝑦, 𝑧; 𝑠, 𝑡, 𝑢, 𝑣]∕ℐ (𝐹 )

is the $\mathbb{Z}^2$-graded Jacobian ring of 𝐹 , where ℐ (𝐹 ) is the ideal generated by the
partial derivatives of 𝐹 .
Using this method, we obtain
• ℎ4,0 = dim Jac(𝐹 )(−1,−2) = 0, and hence ℎ0,4 = 0 as well.
• ℎ3,1 = dim Jac(𝐹 )(1,0) = 3, and hence ℎ1,3 = 3 as well.
• ℎ2,2 = 𝑏4 − (ℎ0,4 + ℎ1,3 + ℎ3,1 + ℎ4,0 ) = 40. ◻
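The two dimension counts above need no computer algebra. The partial derivatives of 𝐹 have bidegree (1, 2) or (2, 1), so ℐ (𝐹 ) contains nothing of bidegree (1, 0), and Jac(𝐹 )(1,0) is the full space of bidegree (1, 0) polynomials. The sketch below counts monomials of a given bidegree in C[𝑥, 𝑦, 𝑧; 𝑠, 𝑡, 𝑢, 𝑣]; the function name `dim_bigraded` is hypothetical.

```python
from math import comb


def dim_bigraded(a: int, b: int) -> int:
    """Number of monomials of degree a in x, y, z and degree b in s, t, u, v."""
    if a < 0 or b < 0:
        return 0  # no polynomials in negative degree
    return comb(a + 2, 2) * comb(b + 3, 3)


h40 = dim_bigraded(-1, -2)       # Jac(F)_(-1,-2) vanishes for degree reasons
h31 = dim_bigraded(1, 0)         # spanned by x, y, z
h22 = 46 - (h40 + h31 + h31 + h40)

# The ambient projective space of bidegree (2, 2) forms, in which B is open.
dim_ambient = dim_bigraded(2, 2) - 1
```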
Corollary 7.8. There exists 𝑏 ∈ 𝐵 which satisfies the assumption of Lemma 7.6,
with 𝑝 = 2.
Proof. Since 𝐵 ⊂ P(C[𝑥, 𝑦, 𝑧; 𝑠, 𝑡, 𝑢, 𝑣](2,2) ), we may identify

𝑇𝐵,𝑏 ≃ C[𝑥, 𝑦, 𝑧; 𝑠, 𝑡, 𝑢, 𝑣](2,2) ∕(C ⋅ 𝐹 ),

where 𝐹 is the defining equation of 𝑌𝑏 . The infinitesimal period map

∇ ∶ 𝑇𝐵,𝑏 × 𝐻 2,2 (𝑌𝑏 ) → 𝐻 1,3 (𝑌𝑏 )

is given by multiplication
\[ \bigl( \mathbb{C}[x, y, z; s, t, u, v] / (F) \bigr)_{(2,2)} \times \operatorname{Jac}(F)_{(3,2)} \to \operatorname{Jac}(F)_{(5,4)}, \]
by [Voi07, Theorem 6.13], which applies by the identifications [BC94, Corollary 10.2,
Theorem 10.6, and Theorem 10.13]. We consider the fibre 𝑌𝑏 given
by

𝐹 = 𝑥2 𝑠2 + 𝑦2 𝑡2 + 𝑧2 𝑢2 + 𝑦𝑧𝑠2 + 𝑥𝑧𝑡2 + 𝑥𝑦𝑢2 + 𝑥2 𝑠𝑣 + 𝑦2 𝑡𝑣 + 𝑧2 𝑢𝑣 = 0.

One verifies, using computer software, that it is a smooth hypersurface in P2 × P3 ,
and that Jac(𝐹 )(5,4) has a basis consisting of 𝑥𝑧4 𝑣4 , 𝑦𝑧4 𝑣4 and 𝑧5 𝑣4 .
Therefore, if we take 𝜆 = 𝑧3 𝑣2 ∈ Jac(𝐹 )(3,2) , then the map

⋅ 𝜆 ∶ C[𝑥, 𝑦, 𝑧; 𝑠, 𝑡, 𝑢, 𝑣](2,2) → Jac(𝐹 )(5,4)

is surjective. ◻
Theorem 7.9. The set of those 𝑏 ∈ 𝐵 such that 𝑌𝑏 has an integral (2, 2)-class
meeting the fibres of 𝑌𝑏 → P2 in odd degree is dense in 𝐵, in euclidean topology.
Proof. Instead of finding an integral class, we only need to find such a class with
the coefficient ring
𝑅 = {𝑚∕𝑛 ∣ 𝑚, 𝑛 ∈ Z, 2 ∤ 𝑛},

as we can multiply by an odd integer to turn such a class into an integral class. (We
have an obvious definition of an odd element in 𝑅.)
Let 𝑏0 ∈ 𝐵 be as in the previous corollary, and let 𝑈 ⊂ 𝐵 be an open set which
trivialises 𝑌 → 𝐵 near 𝑏0 . Such a trivialisation preserves the homology classes of
the fibres of 𝑌𝑏 → P2 .
We have shown that 𝐻 0,4 (𝑌𝑏0 ) = 0. Thus, Lemma 7.6 shows that the image of
the map
\[ \phi \colon \mathcal{H}^{2,2}_{\mathbb{R}}|_U \to H^4(Y_{b_0}, \mathbb{R}) \]

contains an open set. Since the image consists of those classes that are of type (2, 2)
over some 𝑏 ∈ 𝑈 , it suffices to show that the elements of 𝐻 4 (𝑌𝑏0 , 𝑅) that meet the
fibres of 𝑌𝑏0 → P2 in odd degree are dense in 𝐻 4 (𝑌𝑏0 , R), so that one such element
lies in the image. We only need to prove this for 𝑏0 ∈ 𝐵, as the set of such 𝑏0 is
Zariski open in 𝐵.
The quadric surface bundle 𝑌𝑏0 has a constant section P2 → 𝑌𝑏0 given by 𝑠 =
𝑡 = 𝑢 = 0. This gives rise to an element 𝛼 ∈ 𝐻 4 (𝑌𝑏0 , Z), which intersects the fibres
of 𝑌𝑏0 → P2 in degree 1. For any 𝛽 ∈ 𝐻 4 (𝑌𝑏0 , 𝑅), the class 𝛼 + 2𝛽 also intersects
the fibres of 𝑌𝑏0 → P2 in odd degree. Such classes are dense in 𝐻 4 (𝑌𝑏0 , R), since 2𝑅 is dense in R. ◻
The results are summarised as follows.
Theorem 7.10. In the family of all bidegree (2, 2) hypersurfaces in P2 × P3 , a very
general member is not retract rational, while the rational members form a dense
subset in the family, in euclidean topology. ◻

References
[AM72] M. Artin and D. Mumford (1972). “Some elementary examples of unirational
varieties which are not rational”. Proceedings of the London Mathematical
Society 25 (3), 75–95.
[BC94] V. V. Batyrev and D. A. Cox (1994). “On the Hodge structure of projective
hypersurfaces in toric varieties”. Duke Mathematical Journal 75 (2), 293–
338.
[Blo10] S. Bloch (2010). Lectures on Algebraic Cycles. 2nd ed. New Mathematical
Monographs 16. Cambridge University Press.
[Bou06] N. Bourbaki (2006). Algèbre Commutative, Chapitres 8 et 9. Springer.
[CT05] J.-L. Colliot-Thélène (2005). “Un théorème de finitude pour le groupe de
Chow des zéro-cycles d’un groupe algébrique linéaire sur un corps 𝑝-adique”.
Inventiones mathematicae 159, 589–606.
[CT95] J.-L. Colliot-Thélène (1995). “Birational invariants, purity and the Gersten
conjecture”. 𝐾-Theory and Algebraic Geometry: Connections with Quadratic
Forms and Division Algebras, Part 1. Proceedings of Symposia in Pure Math-
ematics 58. AMS, 1–64.
[CTP16] J.-L. Colliot-Thélène and A. Pirutka (2016). “Hypersurfaces quartiques de di-
mension 3 : non-rationalité stable”. Annales Scientifiques de l’École Normale
Supérieure 49 (2), 371–397.
[CTS19] J.-L. Colliot-Thélène and A. N. Skorobogatov (2019). The Brauer–
Grothendieck group. Preprint.
[CTS77] J.-L. Colliot-Thélène and J.-J. Sansuc (1977). “La 𝑅-équivalence sur les
tores”. Annales Scientifiques de l’École Normale Supérieure 10 (2), 175–229.
[Deb03] O. Debarre (2003). “Variétés rationnellement connexes”. Astérisque 290,
243–266.
[EGA-II] A. Grothendieck (1961). “Éléments de géométrie algébrique II. Étude glob-
ale élémentaire de quelques classes de morphismes”. Publications mathéma-
tiques de l’I.H.É.S. 8, 5–222.
[EGA-IV3] A. Grothendieck (1966). “Éléments de géométrie algébrique IV. Étude locale
des schémas et des morphismes de schémas. 3e partie”. Publications mathé-
matiques de l’I.H.É.S. 28, 5–255.
[EKW16] H. Esnault, M. Kerz, and O. Wittenberg (2016). “A restriction isomorphism
for cycles of relative dimension zero”. Cambridge Journal of Mathematics
4 (2), 163–196.
[FAG] B. Fantechi et al. (2005). Fundamental Algebraic Geometry: Grothendieck’s
FGA Explained. Mathematical Surveys and Monographs 123. AMS.
[Ful98] W. Fulton (1998). Intersection Theory. 2nd ed. Ergebnisse der Mathematik
und ihrer Grenzgebiete 2. Springer.
[GH94] P. Griffiths and J. Harris (1994). Principles of Algebraic Geometry. Wiley
Classics Library. John Wiley & Sons.

[Gro68] A. Grothendieck (1968). “Le groupe de Brauer I, II, III”. Dix exposés sur la
cohomologie des schémas. Masson & North-Holland, 46–188.
[HPT16] B. Hassett, A. Pirutka, and Yu. Tschinkel (2016). Stable rationality of quadric
surface bundles over surfaces. arXiv: 1603.09262.
[Kol96] J. Kollár (1996). Rational curves on algebraic varieties. Ergebnisse der Math-
ematik und ihrer Grenzgebiete 32. Springer.
[Mil80] J. S. Milne (1980). Étale Cohomology. Princeton Mathematical Series 33.
Princeton University Press.
[Ros96] M. Rost (1996). “Chow groups with coefficients”. Documenta Mathematica
1, 319–393.
[Sal76] G. Salmon (1876). Lessons Introductory to the Modern Higher Algebra.
3rd ed. Dublin.
[Ser79] J.-P. Serre (1979). Local Fields. Trans. by M. J. Greenberg. Graduate Texts in
Mathematics 67. Springer.
[Spr52] T. A. Springer (1952). “Sur les formes quadratiques d’indice zéro”. C. R.
Acad. Sci. Paris 234, 1517–1519.
[Voi07] C. Voisin (2007). Hodge Theory and Complex Algebraic Geometry II. Trans.
by L. Schneps. Cambridge Studies in Advanced Mathematics 77. Cambridge
University Press.
[Voi15] C. Voisin (2015). “Unirational threefolds with no universal codimension 2
cycle”. Inventiones mathematicae 201 (1), 207–237.
Applications of the Eells–Kuiper Invariant
to Exotic 7‐Spheres

Lan Qing1

ABSTRACT
We introduce the Eells–Kuiper invariant, and apply the Eells–
Kuiper invariant to find the 28 differentiable structures on 𝑆 7 . We
also apply it to circle bundles over homotopy ℂ𝑃 3 , to show that any
homotopy 7-sphere that admits smooth regular 𝑆 1 -actions is realized
as the total space of a principal 𝑆 1 -bundle over some homotopy ℂ𝑃 3 ,
with primitive Euler class.

Contents

1 Introduction

2 The invariants of Milnor
The 𝜆 invariant
The 𝜆′ invariant

3 The Eells–Kuiper 𝝁 invariant
The domain of 𝜇
The definition of 𝜇
Rohlin invariant
Definition of the Rohlin invariant
Comparison with 𝜇

4 Applications to 𝑺 3 bundles over 𝑺 4
Calculation of characteristic classes of 𝑆 3 bundles over 𝑆 4
Differentiable structures on 𝑆 7

5 Applications to 𝑺 1 bundles over homotopy ℂ𝑷 3
Characteristic classes of bundles over ℂ𝑃 3
A generalization of 𝜇 for nonspin case
Calculation for 𝑆𝐸−1
Calculation for spin manifolds homotopy equivalent to ℂ𝑃 3

6 Further Discussions

1
Lan Qing, Mathematics Class 83, Department of Mathematics, Tsinghua University.

1 Introduction
In 1956, Milnor discovered in [Mil56] that there exists a differentiable
structure on 𝑆 7 that is different from the usual one; there he introduced a
differential invariant and applied it to a collection of 𝑆 3 -bundles over 𝑆 4 .
From the theory of h-cobordism discussed in [Mil59a] and [Sma61], there
are exactly 28 differentiable structures on 𝑆 7 . In 1962, Eells and Kuiper
introduced the 𝜇 invariant, which takes different values on different differentiable
structures on 𝑆 7 and thus completely classifies the 28 differentiable structures on
𝑆 7 . In [KM63], the group Θ𝑛 of homotopy spheres is discussed, where for
𝑛 ≥ 5, Θ𝑛 describes exotic spheres of dimension 𝑛.
In this thesis, we introduce the invariants of Milnor and Eells–Kuiper,
and apply the Eells–Kuiper invariant to the collection of 𝑆 3 -bundles over
𝑆 4 defined in [Mil56] to find 16 differentiable structures on 𝑆 7 , and then
use connected sums to find the 28 differentiable structures on 𝑆 7 .
We also apply a generalized Eells–Kuiper invariant to circle bundles
over homotopy ℂ𝑃 3 , to show that any homotopy 7-sphere that admits
smooth regular 𝑆 1 -actions is realized as the total space of a principal
𝑆 1 -bundle over some homotopy ℂ𝑃 3 , with primitive Euler class.
Throughout this thesis we will use ℚ as the coefficient ring, unless oth-
erwise stated.

2 The invariants of Milnor

The 𝜆 invariant
In [Mil56], Milnor introduced the 𝜆 invariant for the discovery of exotic
7-spheres. For an oriented closed 7-manifold 𝑀 such that 𝐻 3 (𝑀) =
𝐻 4 (𝑀) = 0, since the seventh oriented cobordism group $\Omega^{SO}_7$ vanishes, 𝑀
bounds a compact oriented 8-manifold 𝐵. The assumption on the cohomology
of 𝑀 implies that we have an isomorphism
\[ j \colon H^4(B, M) \to H^4(B). \]
Let 𝜈 be the orientation class in 𝐻8 (𝐵, 𝑀), and let 𝜏(𝐵) be the signature
of 𝐵. Then we have a well-defined differential invariant
\[ \lambda(M) := 2 \langle (j^{-1} p_1(B))^2, \nu \rangle - \tau(B) \bmod 7, \]

which is indeed independent of the choice of 𝐵. For two such choices 𝐵1 , 𝐵2 , we
reverse the orientation of 𝐵2 and glue them to obtain a closed manifold 𝐶,
and by the Hirzebruch signature theorem we have
\[ 0 \equiv 45 \tau(C) + \langle p_1(C)^2, [C] \rangle \equiv 4 \bigl( -\tau(C) + 2 \langle p_1(C)^2, [C] \rangle \bigr) \bmod 7. \]
Thus
\[ -\tau(B_1) + 2 \langle (j_1^{-1} p_1(B_1))^2, \nu_1 \rangle \equiv -\tau(B_2) + 2 \langle (j_2^{-1} p_1(B_2))^2, \nu_2 \rangle \bmod 7. \]
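The congruence used here is elementary but easy to get wrong by a sign. The following sketch checks it by brute force over all residues modulo 7, writing `tau` for 𝜏(𝐶) and `q` for the Pontrjagin number ⟨𝑝1 (𝐶)2 , [𝐶]⟩; both names are illustrative.

```python
def congruence_holds(modulus: int = 7) -> bool:
    """Check 45*tau + q ≡ 4*(-tau + 2*q) (mod 7) for all residues tau, q."""
    return all(
        (45 * tau + q - 4 * (-tau + 2 * q)) % modulus == 0
        for tau in range(modulus)
        for q in range(modulus)
    )


def closed_manifold_constraint_kills_lambda(modulus: int = 7) -> bool:
    """If 45*tau + q ≡ 0 (signature theorem mod 7), then 2*q - tau ≡ 0 as well."""
    return all(
        (2 * q - tau) % modulus == 0
        for tau in range(modulus)
        for q in range(modulus)
        if (45 * tau + q) % modulus == 0
    )
```

The second check is the statement that the expression defining 𝜆 vanishes modulo 7 on a closed 8-manifold, which is what makes 𝜆(𝑀) independent of the bounding manifold.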

Using this invariant, Milnor finds an 𝑆 3 -bundle over 𝑆 4 which is homeomorphic
to the 7-sphere but has nonzero 𝜆. Thus it cannot bound 𝐷8 , whose
fourth Betti number is zero, and consequently it is an exotic sphere.
Later, this invariant is generalized in [Mil59b]. For a closed smooth
oriented (4𝑘 − 1)-manifold 𝑀 such that over ℚ,

𝐻 2𝑘 (𝑀) = 𝐻 4𝑖 (𝑀) = 0, ∀0 < 𝑖 < 𝑘,

by Poincaré duality and universal coefficient theorem we have

𝐻 2𝑘−1 (𝑀) = 𝐻 4𝑖−1 (𝑀) = 0, ∀0 < 𝑖 < 𝑘.

If 𝑀 bounds a compact smooth oriented 4𝑘-manifold 𝑊 , we still have
isomorphisms
\[ j \colon H^{4i}(W, M) \to H^{4i}(W). \]
Recall that {𝐿𝑘 (𝑝1 , … , 𝑝𝑘 )} is the multiplicative sequence associated
to the formal power series √𝑡∕tanh(√𝑡), where 𝐿𝑘 is a homogeneous
polynomial of degree 4𝑘 and deg(𝑝𝑖 ) = 4𝑖. Note that
\[ L_k(p_1, \dots, p_k) = L_k(p_1, \dots, p_{k-1}, 0) + s_k p_k, \qquad s_k = \frac{2^{2k} (2^{2k-1} - 1) B_k}{(2k)!}, \]
where 𝐵𝑘 denotes the Bernoulli numbers. Let 𝜈 = [𝑊 , 𝑀] be the fundamental class.
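As a sanity check on the formula for 𝑠𝑘 , the first few values can be computed exactly. The sketch below assumes the convention 𝐵1 = 1∕6, 𝐵2 = 1∕30, 𝐵3 = 1∕42 for the Bernoulli numbers (all positive), which is the one that makes the formula agree with the leading coefficients of 𝐿1 , 𝐿2 , 𝐿3 ; the table `BERNOULLI` and the function `s` are illustrative names.

```python
from fractions import Fraction
from math import factorial

# Bernoulli numbers in the "positive" convention assumed here.
BERNOULLI = {1: Fraction(1, 6), 2: Fraction(1, 30), 3: Fraction(1, 42)}


def s(k: int) -> Fraction:
    """Coefficient of p_k in L_k: 2^{2k} (2^{2k-1} - 1) B_k / (2k)!."""
    return Fraction(2 ** (2 * k) * (2 ** (2 * k - 1) - 1)) * BERNOULLI[k] / factorial(2 * k)


s1, s2, s3 = s(1), s(2), s(3)
```

These are 1∕3, 7∕45 and 62∕945, matching 𝐿1 = 𝑝1 ∕3 and the coefficient of 𝑝2 in 𝐿2 = (7𝑝2 − 𝑝1^2)∕45.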
Definition 1. The 𝜆 invariant in ℚ∕ℤ is defined to be

𝜆(𝑀) ∶ = (𝜏(𝑊 ) − ⟨𝐿𝑘 (𝑗 −1 𝑝1 (𝑊 ), … , 𝑗 −1 𝑝𝑘−1 (𝑊 ), 0), 𝜈⟩)∕𝑠𝑘 mod 1.

Observe that if 𝑀 is empty, by Hirzebruch signature theorem we have

𝜏(𝑊 ) = ⟨𝐿𝑘 (𝑝1 (𝑊 ), … , 𝑝𝑘 (𝑊 )), [𝑊 ]⟩

and then
\[ \lambda(\varnothing) = \langle s_k p_k(W), [W] \rangle / s_k = \langle p_k(W), [W] \rangle \equiv 0 \bmod 1 \]
becomes trivial. In essentially the same way as above we can show that
the 𝜆 invariant is well-defined, since two definitions will differ by some top
Pontrjagin number, which is an integer.
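In the special case where 𝑘 = 2, we have $L_2(p_1, p_2) = (7 p_2 - p_1^2)/45$ and $s_2 = 7/45$, so
\begin{align*}
\lambda(M) &= \frac{45}{7} \Bigl( \tau(W) - \bigl\langle -(j^{-1} p_1)^2 / 45, [W, M] \bigr\rangle \Bigr) \\
&\equiv \frac{3}{7} \tau(W) + \frac{1}{7} \langle (j^{-1} p_1)^2, [W, M] \rangle \bmod 1.
\end{align*}
Trivially for integers 𝑎, 𝑏, 2(𝑎 + 3𝑏) ≡ 2𝑎 − 𝑏 mod 7 and 4(2𝑎 − 𝑏) ≡ 𝑎 +
3𝑏 mod 7. Consequently this 𝜆 invariant and the previous one defined for
7-manifolds indeed determine each other.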
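To see concretely why the two versions of 𝜆 determine each other, one can tabulate both on all residues modulo 7. In the sketch below, `a` stands for the Pontrjagin number ⟨(𝑗 −1 𝑝1 )2 , [𝑊 , 𝑀]⟩ and `b` for 𝜏(𝑊 ), so that 7 times the generalized invariant is 𝑎 + 3𝑏 and Milnor's original invariant is 2𝑎 − 𝑏; the function names are hypothetical.

```python
def new_lambda_times_7(a: int, b: int) -> int:
    """Numerator of the generalized invariant (3*tau + <p1^2>)/7, modulo 7."""
    return (a + 3 * b) % 7


def old_lambda(a: int, b: int) -> int:
    """Milnor's original invariant 2*<p1^2> - tau, modulo 7."""
    return (2 * a - b) % 7


# Multiplication by 2 and by 4 are inverse bijections of Z/7, since 2 * 4 ≡ 1.
mutually_determined = all(
    (2 * new_lambda_times_7(a, b)) % 7 == old_lambda(a, b)
    and (4 * old_lambda(a, b)) % 7 == new_lambda_times_7(a, b)
    for a in range(7)
    for b in range(7)
)
```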

The 𝜆′ invariant
We say that a manifold 𝑊 is almost parallelizable if for some finite subset
𝐹 , 𝑊 − 𝐹 is parallelizable, i.e. 𝑊 − 𝐹 has trivial tangent bundle.
Let 𝜏𝑘 be the greatest common divisor of all 𝜏(𝑊 ) where 𝑊 ranges over
all almost parallelizable 4𝑘-manifolds without boundary.
Suppose 𝑀 is a smooth oriented homology (4𝑘 − 1)-sphere over ℤ and
𝑀 bounds a compact smooth oriented parallelizable manifold 𝑊 . Then
𝜏(𝑊 ) mod 𝜏𝑘 is a well-defined invariant of 𝑀. In fact, under the
assumption above, 𝜏(𝑊 ) and 𝜏𝑘 are always divisible by 8.

Definition 2. The 𝜆′ invariant of 𝑀 is defined to be

𝜆′ (𝑀) ≡ 𝜏(𝑊 )∕8 mod 𝜏𝑘 ∕8.

3 The Eells–Kuiper 𝜇 invariant

The domain of 𝜇
We would like to define another invariant for certain manifolds. An oriented
manifold is said to be spin if its second Stiefel-Whitney class is zero.
Let 𝑀 be a closed smooth oriented (4𝑘 − 1)-manifold that bounds a
compact smooth oriented spin 4𝑘-manifold 𝑊 , such that
(a) over ℚ,

𝐻 2𝑘 (𝑀) = 𝐻 4𝑖 (𝑀) = 0, ∀0 < 𝑖 < 𝑘,

(b) the inclusion induces an epimorphism

𝑖∗ ∶ 𝐻 1 (𝑊 ; ℤ2 ) → 𝐻 1 (𝑀; ℤ2 ).

Again condition (a) implies 𝐻 2𝑘−1 (𝑀) = 𝐻 4𝑖−1 (𝑀) = 0, ∀0 < 𝑖 <
𝑘, so 𝑗 ∶ 𝐻 4𝑖 (𝑊 , 𝑀) → 𝐻 4𝑖 (𝑊 ) and 𝑗 ∶ 𝐻 2𝑘 (𝑊 , 𝑀) → 𝐻 2𝑘 (𝑊 ) are
still isomorphisms. Indeed, condition (a) can be replaced by the slightly
weaker condition requiring them to be isomorphisms, so that one can pull
back Pontrjagin classes of 𝑊 . We call this condition (a’).

One reason to introduce condition (b) is that the definition seems to
depend on the spin structure on 𝑀. If (𝑊 , 𝜎) is a spin manifold with
boundary 𝑀, inducing the spin structure 𝜎𝑀 on 𝑀, then for any spin structure
𝜎′𝑀 of 𝑀, condition (b) allows us to replace the spin structure of 𝑊 so that
the induced spin structure of 𝑀 becomes 𝜎′𝑀 . Later we will see that the
formula for the 𝜇 invariant is independent of the spin structure on 𝑊 , so
the 𝜇 invariant is indeed independent of the spin structure of 𝑀.

An example satisfying these conditions is given by 𝑆 3 -bundles 𝑀 over
𝑆 4 with nonvanishing Euler class, which are the boundaries of the
corresponding disk bundles 𝑊 . These disk bundles, having the homotopy type
of 𝑆 4 , are automatically spin. From the Serre spectral sequence it follows that
the first cohomology of the sphere bundle is 0, so condition (b) is satisfied.
Also, it follows from the Serre spectral sequence that if the Euler class does not
vanish, then over ℚ the 𝐸∞ page consists of only two nonzero terms $E_\infty^{0,0}$
and $E_\infty^{4,3}$, so condition (a) is also satisfied. The following diagram shows the
𝐸4 page over ℤ, where the arrow is given by cup product with the Euler class
(up to a sign).

ℤ 0 0 0 ℤ

0 0 0 0 0

0 0 0 0 0

ℤ 0 0 0 ℤ

Here the arrow is the differential $d_4 \colon E_4^{0,3} \to E_4^{4,0}$, from the top-left ℤ to the
bottom-right ℤ.

The following diagram is the 𝐸∞ page over ℚ, assuming the Euler class
does not vanish.

0 0 0 0 ℚ

0 0 0 0 0

0 0 0 0 0

ℚ 0 0 0 0

However, if the Euler class vanishes, the 𝐸∞ page is

ℚ 0 0 0 ℚ

0 0 0 0 0

0 0 0 0 0

ℚ 0 0 0 ℚ

and consequently 𝐻 3 (𝑀; ℚ) = 𝐻 4 (𝑀; ℚ) = ℚ is not zero, and condition
(a) is not satisfied.
In fact, in this case the weaker condition (a’) is not satisfied either. We
have
\[
\begin{array}{ccc}
H^0(S^4) & & \\
\downarrow {\scriptstyle \cup\, \text{Thom class}} & \searrow {\scriptstyle \cup\, e \,=\, 0} & \\
H^4(W, M) & \xrightarrow{\ j\ } & H^4(W) = \mathbb{Q}
\end{array}
\]
where the vertical map is the Thom isomorphism. Thus 𝑗 ∶ 𝐻 4 (𝑊 , 𝑀) →
𝐻 4 (𝑊 ) is not an isomorphism.

The definition of 𝜇

Let $\{\hat{A}_k = \hat{A}_k(p_1, \dots, p_k)\}$ be the multiplicative sequence associated to
$\tfrac{1}{2} z^{1/2} / \sinh(\tfrac{1}{2} z^{1/2})$. The following theorem of Hirzebruch will be used. For
a proof, see [BH58].

Theorem 1. Let 𝑋 be a closed, smooth, oriented spin manifold of dimension
4𝑘. Then the $\hat{A}$-genus $\hat{A}[X] = \langle \hat{A}_k(p_1(X), \dots, p_k(X)), [X] \rangle$ is an integer.
If 𝑘 is odd, $\hat{A}[X]$ is an even integer.

For manifolds 𝑀 in our domain, Pontrjagin classes 𝑝1 , … , 𝑝𝑘−1 of 𝑊
can be pulled back to 𝐻 ∗ (𝑊 , 𝑀). From
\begin{align*}
\tau(X) &= L_k(p_1, \dots, p_{k-1}, 0)[X] + L_k(0, 0, \dots, 1)\, p_k[X], \\
\hat{A}_k[X] &= \hat{A}_k(p_1, \dots, p_{k-1}, 0)[X] + \hat{A}_k(0, 0, \dots, 1)\, p_k[X],
\end{align*}
we eliminate 𝑝𝑘 [𝑋] and get
\[ \hat{A}_k[X] = N_k(p_1, \dots, p_{k-1})[X] + t_k \tau(X), \]
where
\begin{align*}
t_k &= \frac{\hat{A}_k(0, 0, \dots, 1)}{L_k(0, 0, \dots, 1)}, \\
N_k(p_1, \dots, p_{k-1})[X] &= \hat{A}_k(p_1, \dots, p_{k-1}, 0)[X] - t_k L_k(p_1, \dots, p_{k-1}, 0)[X].
\end{align*}

Let 𝑎𝑘 = 1 if 𝑘 is even, and 𝑎𝑘 = 2 if 𝑘 is odd.

Definition 3. The 𝜇 invariant of 𝑀 4𝑘−1 in ℚ∕ℤ is

𝜇(𝑀) = (⟨𝑁𝑘 (𝑗 −1 𝑝1 (𝑊 ), … , 𝑗 −1 𝑝𝑘−1 (𝑊 )), [𝑊 , 𝑀]⟩+𝑡𝑘 𝜏(𝑊 ))∕𝑎𝑘 mod 1.
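For 𝑘 = 2 the constants in this definition can be made explicit. The sketch below assumes the standard low-degree polynomials 𝐿2 = (7𝑝2 − 𝑝1^2)∕45 and 𝐴̂2 = (7𝑝1^2 − 4𝑝2 )∕5760, which are not written out in the text; the variable names are illustrative.

```python
from fractions import Fraction

# Coefficients of p1^2 and p2 in the assumed polynomials L_2 and A-hat_2.
L2_p1sq, L2_p2 = Fraction(-1, 45), Fraction(7, 45)
A2_p1sq, A2_p2 = Fraction(7, 5760), Fraction(-4, 5760)

# t_2 is the ratio of the top coefficients; N_2 is what remains after eliminating p2.
t2 = A2_p2 / L2_p2
N2_p1sq = A2_p1sq - t2 * L2_p1sq


def mu_7_manifold(p1_squared: int, signature: int) -> Fraction:
    """mu = (N_2 * <p1^2> + t_2 * tau) / a_2 mod 1, with a_2 = 1 since k = 2 is even."""
    return (N2_p1sq * p1_squared + t2 * signature) % 1
```

With these values the definition reads 𝜇(𝑀) = (⟨(𝑗 −1 𝑝1 )2 , [𝑊 , 𝑀]⟩ − 4𝜏(𝑊 ))∕896 mod 1, so 𝜇 of a 7-manifold in the domain takes values in multiples of 1∕896.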

Theorem 2. The 𝜇 invariant is well-defined, i.e. independent of the choice of 𝑊 .

Proof. This is essentially the same as in the previous section. Let (𝑊1 , 𝑀1 )
and (𝑊2 , 𝑀2 ) be two such choices for the space 𝑀. Reverse the orientation
of the latter and glue the two manifolds along the boundary to get 𝑋. Let 𝑟
be 2𝑘 or 4𝑖(0 < 𝑖 < 𝑘). We have a commutative diagram

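\[
\begin{array}{ccc}
H^r(W_1, M) \oplus H^r(W_2, M) & \xleftarrow{\ h\ } & H^r(X, M) \\
\downarrow {\scriptstyle j_1 \oplus j_2} & & \downarrow {\scriptstyle j} \\
H^r(W_1) \oplus H^r(W_2) & \xleftarrow{\ k\ } & H^r(X)
\end{array}
\]
where ℎ is an isomorphism by the Mayer–Vietoris sequence, and 𝑗1 ⊕ 𝑗2 is an
isomorphism by assumption.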
Since
\[ H^r(M) \xleftarrow{\ \text{zero}\ } H^r(W_1) \xleftarrow{\ \text{isom.}\ } H^r(W_1, M) \xleftarrow{\ \text{zero}\ } H^{r-1}(M), \]
we find that $H^r(X) \to H^r(W_1) \xrightarrow{\ \text{zero}\ } H^r(M)$ is the zero map. Thus
\[ H^r(M) \xleftarrow{\ \text{zero}\ } H^r(X) \xleftarrow{\ j\ } H^r(X, M) \]
implies that 𝑗 is epic.
It follows from easy diagram chasing that 𝑗 is an isomorphism and thus
𝑘 is an isomorphism.
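If 𝛼 = 𝑗ℎ−1 (𝛼1 ⊕ 𝛼2 ) ∈ 𝐻 2𝑘 (𝑋), then
\begin{align*}
\langle \nu, \alpha^2 \rangle &= \langle \nu, j h^{-1}(\alpha_1^2 \oplus \alpha_2^2) \rangle \\
&= \langle \nu_1 \oplus (-\nu_2), \alpha_1^2 \oplus \alpha_2^2 \rangle = \langle \nu_1, \alpha_1^2 \rangle - \langle \nu_2, \alpha_2^2 \rangle,
\end{align*}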
where 𝜈 = [𝑋], 𝜈1 = [𝑊1 , 𝑀], 𝜈2 = [𝑊2 , 𝑀] are orientation classes. Thus
the intersection form of 𝑋 splits into the direct sum of those of 𝑊1 and 𝑊2 ,
and
𝜏(𝑋) = 𝜏(𝑊1 ) − 𝜏(𝑊2 ).
Clearly, the pullback of the tangent bundle of 𝑋 by inclusion 𝑖𝑙 ∶ 𝑊𝑙 →
𝑋 is the tangent bundle of 𝑊𝑙 , 𝑙 = 1, 2. By naturality, 𝐻 ∗ (𝑋) → 𝐻 ∗ (𝑊1 )⊕
𝐻 ∗ (𝑊2 ) → 𝐻 ∗ (𝑊𝑙 ), 𝑝𝑠 (𝑋) ↦ 𝑝𝑠 (𝑊𝑙 ) for any Pontrjagin class 𝑝𝑠 . So
𝑘(𝑝𝑠 (𝑋)) = 𝑝𝑠 (𝑊1 ) ⊕ 𝑝𝑠 (𝑊2 ). As above, for any monomial 𝑓 , 𝑓 (𝑝(𝑋))
becomes the direct sum of 𝑓 (𝑝(𝑊1 )) and 𝑓 (𝑝(𝑊2 )), so
𝑓 (𝑝1 , … , 𝑝𝑘−1 , 0)[𝑋] = ⟨𝑓 (𝑗1−1 𝑝1 (𝑊1 ), … , 𝑗1−1 𝑝𝑘−1 (𝑊1 ), 0), [𝑊1 , 𝑀]⟩
−⟨𝑓 (𝑗2−1 𝑝1 (𝑊2 ), … , 𝑗2−1 𝑝𝑘−1 (𝑊2 ), 0), [𝑊2 , 𝑀]⟩.
For simplicity, write
𝑓 (𝑝1 , … , 𝑝𝑘−1 , 0)[𝑊 ] = ⟨𝑓 (𝑗 −1 𝑝1 (𝑊 ), … , 𝑗 −1 𝑝𝑘−1 (𝑊 ), 0), [𝑊 , 𝑀]⟩.
The equality above becomes
𝑓 (𝑝1 , … , 𝑝𝑘−1 , 0)[𝑋] = 𝑓 (𝑝1 , … , 𝑝𝑘−1 , 0)[𝑊1 ] − 𝑓 (𝑝1 , … , 𝑝𝑘−1 , 0)[𝑊2 ].
Combining results above, with 𝑓 replaced by 𝑁𝑘 ,
\[ \mu(M_1) - \mu(M_2) = \hat{A}[X] / a_k \]
is an integer, i.e. zero modulo 1, assuming 𝑋 is spin.


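Now we show that 𝑋 is spin. From the exact sequence
\begin{align*}
\cdots \leftarrow H^2(W_1; \mathbb{Z}_2) \oplus H^2(W_2; \mathbb{Z}_2) &\xleftarrow{\ k_1^* \oplus k_2^*\ } H^2(X; \mathbb{Z}_2) \xleftarrow{\ \Delta\ } H^1(M; \mathbb{Z}_2) \\
&\xleftarrow{\ i_1^* - i_2^*\ } H^1(W_1; \mathbb{Z}_2) \oplus H^1(W_2; \mathbb{Z}_2) \leftarrow \cdots
\end{align*}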
we see that 𝑖∗1 − 𝑖∗2 is surjective by condition (b), so Δ is zero. Therefore
𝑘∗1 ⊕ 𝑘∗2 is injective. Since by naturality 𝑘∗1 ⊕ 𝑘∗2 (𝑤2 (𝑋)) = 𝑤2 (𝑊1 ) ⊕
𝑤2 (𝑊2 ) = 0 ⊕ 0 = 0, 𝑋 is spin. ◻
Here are some facts about the 𝜇 invariant.

Proposition 1. 1) 𝜇(−𝑀) = −𝜇(𝑀).
2) If 𝑀1 , 𝑀2 are h-cobordant, then 𝜇(𝑀1 ) = 𝜇(𝑀2 ).
3) If 𝑀1 , 𝑀2 are in the domain of 𝜇, then so is their connected sum
𝑀1 #𝑀2 , and 𝜇(𝑀1 #𝑀2 ) = 𝜇(𝑀1 ) + 𝜇(𝑀2 ).

Given two spin 𝑛-manifolds 𝑋1 , 𝑋2 , we can form a spin structure on
𝑋1 #𝑋2 such that 𝑋1 #𝑋2 and 𝑋1 ∐ 𝑋2 are spin cobordant. See Milnor’s
paper in [Cai15] or Remark 2.17 in [LM89].

Rohlin invariant
This subsection is added to show that in the 3-dimensional case, on the common
domain of the Rohlin invariant and the Eells–Kuiper invariant, these
invariants are equivalent. This will not be used in the applications.

Definition of the Rohlin invariant


The following facts can be found in [Sav99].

Theorem 3. Any closed orientable 3-dimensional manifold can be obtained
from 𝑆 3 by Dehn surgery along an even link.

The procedure of Dehn surgery along links in 𝑆 3 can be identified with
the behavior of the boundary when one glues handles 𝐷2 ×𝐷2 to the boundary
of 𝐷4 .
Let ℒ = 𝐿1 ∪ ⋯ ∪ 𝐿𝑛 be an oriented framed link in 𝑆 3 , the 𝑖-th component
being framed by 𝑒𝑖 ∈ ℤ. The symmetric matrix 𝐴 = (𝑎𝑖𝑗 ), 𝑖, 𝑗 =
1, … , 𝑛, with the entries
\[ a_{ij} = \begin{cases} e_i, & \text{if } i = j, \\ \operatorname{lk}(L_i, L_j), & \text{if } i \ne j \end{cases} \]
is called the linking matrix of ℒ .

Theorem 4. Let 𝑀 be a 4-manifold with boundary obtained by integral
surgery on a framed link ℒ in 𝑆 3 . Then the intersection form 𝑄𝑀 is
isomorphic to the linking matrix of ℒ .

Let 𝑄 be a ℤ-valued unimodular symmetric ℤ-bilinear form defined on a
finitely generated free abelian group 𝐿. It is even if 𝑄(𝑥, 𝑥) is even, ∀𝑥 ∈ 𝐿.
There is a basic fact from linear algebra that the signature of such an even
form is divisible by 8, see [Ser12] or [MH73].
Let Σ be a compact oriented integral homology 3-sphere. From theo-
rems above we deduce that Σ bounds a smooth simply-connected oriented
4-manifold 𝑊 with even intersection form. Thus 𝜏(𝑊 ) ≡ 0 mod 8.
Definition 4. Let Σ be a compact oriented integral homology 3-sphere,
and let 𝑊 be a simply-connected, compact, smooth, oriented manifold with
boundary Σ. Assume that the intersection form 𝑄𝑊 is even. The Rohlin
invariant of Σ in ℤ∕2ℤ ⊂ ℚ∕ℤ is defined to be
\[ \nu(\Sigma) = \frac{\tau(W)}{16} \bmod 1. \]
This is well-defined because of the theorem of Rohlin:
Theorem 5. If 𝑀 is a simply-connected, closed, smooth, oriented 4-
manifold with even intersection form then 𝜏(𝑀) ≡ 0 mod 16.
Proposition 2. 𝜈(−𝑀) = −𝜈(𝑀), and 𝜈(𝑀1 #𝑀2 ) = 𝜈(𝑀1 ) + 𝜈(𝑀2 ).
There is another definition of the Rohlin invariant, using the fact that
homology 3-spheres have a unique spin structure, and the third spin cobordism
group is zero. Define the Rohlin invariant of a homology 3-sphere to be
\[ \nu(\Sigma) = \frac{\tau(W)}{16} \bmod 1, \]
where 𝑊 is any smooth compact spin 4-manifold with spin boundary the
homology sphere.
To show that this is independent of 𝑊 , there is a corresponding version
of Rohlin’s theorem:
Theorem 6. The signature of a smooth closed spin 4-manifold 𝑋 is divisible
by 16.
This extends the previous definition. In fact, if the manifold 𝑊 is ob-
tained from 𝐷4 by gluing 2-handles according to an even surgery on a link in
𝑆 3 , then the canonical spin structure on 𝐷4 extends to 𝑊 . See Proposition
5.7.1 in [GS99] for a proof.

Comparison with 𝝁
Let Σ be a compact oriented integral homology 3-sphere. From universal
coefficient theorem, since for any 𝑘, 𝐻𝑘 (𝑆 3 ; ℤ) = 𝐻𝑘 (Σ; ℤ) is free over ℤ,
all Tor and Ext terms vanish and thus Σ is a compact oriented homology
3-sphere over any ℤ-algebra and in particular over ℚ and ℤ2 .
To show that Σ is in the domain of 𝜇, note that the third spin cobordism
group is zero, so Σ always bounds a spin manifold. Since 𝐻 2 (Σ; ℚ) = 0,
condition (a) is satisfied. Since 𝐻 1 (Σ; ℤ2 ) = 0, condition (b) is satisfied.
From direct calculation one shows that for any 3-manifold 𝑀 in the
domain of 𝜇,
\[ \mu(M) = -\frac{\tau(W)}{16} \bmod 1. \]
Consequently, in the 3-dimensional case, on the common domain of the
Rohlin invariant and the Eells–Kuiper invariant, these invariants are
equivalent.

4 Applications to 𝑆 3 bundles over 𝑆 4

Calculation of characteristic classes of 𝑆 3 bundles over 𝑆 4


Define a collection of 𝑆 3 -bundles over 𝑆 4 as follows. For each (ℎ, 𝑗) ∈
ℤ ⊕ ℤ, consider the element 𝜙ℎ𝑗 ∈ 𝜋3 (𝑆𝑂4 ) = ℤ ⊕ ℤ, used as the clutching
function of the sphere bundle 𝑆 3 → 𝑆𝐸ℎ𝑗 → 𝑆 4 , defined by

$$\phi_{hj}(x)\,v := x^{h} v x^{j} \in \mathbb{H} \cong \mathbb{R}^4, \qquad x \in S^3 \subset \mathbb{H},\ v \in \mathbb{H} \cong \mathbb{R}^4,$$

where ℍ denotes the quaternions.


Let 𝑆𝐸ℎ𝑗 , 𝐷𝐸ℎ𝑗 , 𝐸ℎ𝑗 be the sphere bundle, disk bundle and vector bundle
associated to 𝜙ℎ𝑗 respectively. We need to calculate the Euler class and
the first Pontrjagin class of these bundles, in order to apply the 𝜇 invariant.
Theorem 7. 𝑆𝐸ℎ𝑗 is homeomorphic to the standard sphere 𝑆 7 iff ℎ + 𝑗 =
±1.

Proof. We give a sketch of proof. By a simple application of the long exact
sequence, 𝑆𝐸ℎ𝑗 is simply connected. From the Poincaré conjecture or the
h-cobordism theorem, 𝑆𝐸ℎ𝑗 is homotopy equivalent to 𝑆 7 iff 𝑆𝐸ℎ𝑗 is
homeomorphic to 𝑆 7 . Since a simply connected integral homology sphere is a
homotopy sphere, we only need to find out for which ℎ, 𝑗 the space 𝑆𝐸ℎ𝑗 is
an integral homology sphere.
Applying the Mayer–Vietoris sequence to the restrictions of the sphere
bundle to the northern hemisphere 𝐷+4 and southern hemisphere 𝐷−4 , it
follows that 𝑆𝐸ℎ𝑗 is an integral homology sphere iff

𝐻3 (𝜕𝐷4 × 𝑆 3 ; ℤ) → 𝐻3 (𝐷+4 × 𝑆 3 ; ℤ) ⊕ 𝐻3 (𝐷−4 × 𝑆 3 ; ℤ)

is an isomorphism, where both sides are isomorphic to ℤ ⊕ ℤ. Choosing
a basis for these ℤ-modules and using the definition of the clutching map,
this map is given by the matrix

$$\begin{pmatrix} 0 & 1 \\ h + j & 1 \end{pmatrix},$$

which is invertible iff ℎ + 𝑗 = ±1. ◻
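
The invertibility criterion can be checked mechanically. The following Python sketch is only an illustration: the function names are hypothetical, and it assumes the matrix has rows (0, 1) and (ℎ + 𝑗, 1) as above. It uses the fact that an integer matrix is invertible over ℤ exactly when its determinant is ±1.

```python
def clutching_matrix_det(h: int, j: int) -> int:
    """Determinant of the integer matrix [[0, 1], [h + j, 1]]."""
    a, b = 0, 1
    c, d = h + j, 1
    return a * d - b * c


def is_integral_isomorphism(h: int, j: int) -> bool:
    """An integer matrix is invertible over Z iff its determinant is +1 or -1."""
    return abs(clutching_matrix_det(h, j)) == 1


if __name__ == "__main__":
    for h, j in [(1, 0), (2, -1), (0, -1), (1, 1), (3, 2)]:
        print((h, j), clutching_matrix_det(h, j), is_integral_isomorphism(h, j))
```

The determinant is −(ℎ + 𝑗), so the check reduces to ℎ + 𝑗 = ±1, in agreement with the statement of the theorem.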


Lemma 1. ℤ ⊕ ℤ → 𝜋3 (𝑆𝑂4 ) ≅ 𝜋4 (𝐵𝑆𝑂4 ), (𝑖, 𝑗) ↦ 𝜙𝑖𝑗 is a group homo-
morphism.
Proof. Note that 𝜙𝑖+𝑗,𝑘+𝑙 (𝑞) = 𝜙𝑖,𝑘 (𝑞)𝜙𝑗,𝑙 (𝑞) for each 𝑞 ∈ 𝑆 3 , where the
multiplication on right hand side is the multiplication in 𝑆𝑂4 . But in a
Lie group 𝑆𝑂4 , 𝑞 ↦ 𝜙𝑖,𝑘 (𝑞)𝜙𝑗,𝑙 (𝑞) is homotopic to the sum in 𝜋3 (𝑆𝑂4 ) of
𝑞 ↦ 𝜙𝑖,𝑘 (𝑞) and 𝑞 ↦ 𝜙𝑗,𝑙 (𝑞). ◻

Lemma 2. 𝜋3 (𝑆𝑂4 ) → 𝐻 4 (𝑆 4 ; ℤ), 𝑓 ∈ 𝜋3 (𝑆𝑂4 ) ↦ the Euler class of the


bundle determined by 𝑓 , is a group homomorphism. The same is true for
the first Pontrjagin class.

Proof. Let 𝐸𝑓 denote the bundle determined by 𝑓 . Assume 𝑓 , 𝑔 ∈ 𝜋3 (𝑆𝑂4 )
are represented by basepoint preserving maps. Their sum is the composite

$$f + g \colon S^3 \to S^3 \vee S^3 \xrightarrow{\,f \vee g\,} SO_4,$$

where the first map is the pinch map. The map 𝑓 ∨ 𝑔 determines a bundle on
the reduced suspension Σ(𝑆 3 ∨ 𝑆 3 ) = 𝑆 4 ∨ 𝑆 4 , and the suspended pinch map
ℎ ∶ 𝑆 4 → 𝑆 4 ∨ 𝑆 4 pulls back this bundle to the bundle 𝐸𝑓 +𝑔 .
Consequently, ℎ∗ (𝑒(𝐸𝑓 ) ⊕ 𝑒(𝐸𝑔 )) = 𝑒(𝐸𝑓 +𝑔 ). But ℎ∗ (𝑎 ⊕ 𝑏) = 𝑎 + 𝑏.
The result follows. ◻
From lemmas above, we deduce that after picking a generator
𝑢 ∈ 𝐻 4 (𝑆 4 ; ℤ),
𝑒(𝐸𝑖𝑗 ) = (𝛼𝑖 + 𝛽𝑗)𝑢 ∈ 𝐻 4 (𝑆 4 ; ℤ)

𝑝1 (𝐸𝑖𝑗 ) = (𝛾𝑖 + 𝛿𝑗)𝑢 ∈ 𝐻 4 (𝑆 4 ; ℤ)

for some 𝛼, 𝛽, 𝛾, 𝛿.

Lemma 3. 𝐸−𝑗,−𝑖 is obtained from 𝐸𝑖,𝑗 by reversing the orientation of fiber.

Proof. We define an orientation reversing bundle isomorphism 𝜓 ∶ 𝐸𝑖,𝑗 →
𝐸−𝑗,−𝑖 as follows. For each 𝑣 in a fiber ℍ ⊂ 𝐸𝑖,𝑗 , define 𝜓(𝑣) = $\bar{v}$ ∈ ℍ. The
conjugation in ℍ is orientation reversing. To show that this is well-defined
near the equator, simply note that $\overline{q^{i} v q^{j}} = q^{-j}\,\bar{v}\,q^{-i}$ for 𝑞 ∈ 𝑆 3 . ◻
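
The quaternion identities behind Lemma 1 and Lemma 3 are easy to test numerically. The following Python sketch is only a sanity check under stated assumptions: quaternions are stored as (w, x, y, z) tuples, and the helper names `qmul`, `qconj`, `qpow`, `phi`, `random_unit` and `max_identity_error` are illustrative choices, not notation from the text. It tests, on random unit quaternions 𝑞, that applying 𝜙 with indices (𝑗, 𝑙) and then (𝑖, 𝑘) agrees with 𝜙 with indices (𝑖 + 𝑗, 𝑘 + 𝑙), and that conjugation intertwines 𝜙 with indices (𝑖, 𝑗) and 𝜙 with indices (−𝑗, −𝑖).

```python
import math
import random


def qmul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def qconj(a):
    """Quaternion conjugate."""
    w, x, y, z = a
    return (w, -x, -y, -z)


def qpow(q, n):
    """Integer power of a unit quaternion; its inverse is its conjugate."""
    base = q if n >= 0 else qconj(q)
    result = (1.0, 0.0, 0.0, 0.0)
    for _ in range(abs(n)):
        result = qmul(result, base)
    return result


def phi(h, j, q, v):
    """Clutching map applied to v: v -> q^h v q^j."""
    return qmul(qmul(qpow(q, h), v), qpow(q, j))


def random_unit(rng):
    """Random point of S^3, obtained by normalising a Gaussian vector."""
    c = [rng.gauss(0.0, 1.0) for _ in range(4)]
    n = math.sqrt(sum(t * t for t in c))
    return tuple(t / n for t in c)


def max_identity_error(trials=50, seed=0):
    """Largest coordinate error over random tests of both identities."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        q = random_unit(rng)
        v = tuple(rng.gauss(0.0, 1.0) for _ in range(4))
        i, j, k, l = (rng.randint(-3, 3) for _ in range(4))
        # Lemma 1: phi_{i,k}(q) composed with phi_{j,l}(q) is phi_{i+j,k+l}(q).
        lhs1 = phi(i + j, k + l, q, v)
        rhs1 = phi(i, k, q, phi(j, l, q, v))
        # Lemma 3: conj(q^i v q^j) = q^{-j} conj(v) q^{-i}.
        lhs3 = qconj(phi(i, j, q, v))
        rhs3 = phi(-j, -i, q, qconj(v))
        for x, y in zip(lhs1 + lhs3, rhs1 + rhs3):
            worst = max(worst, abs(x - y))
    return worst


if __name__ == "__main__":
    print("largest error:", max_identity_error())
```

A floating-point check of this kind cannot replace the algebraic argument, but it guards against index and sign slips in the exponents.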
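Since when the orientation of the bundle is reversed, the Euler class
changes sign while the first Pontrjagin class does not change, Lemma 3 gives
𝑒(𝐸−𝑗,−𝑖 ) = −𝑒(𝐸𝑖𝑗 ) and 𝑝1 (𝐸−𝑗,−𝑖 ) = 𝑝1 (𝐸𝑖𝑗 ) for all 𝑖, 𝑗, which forces
𝛽 = 𝛼 and 𝛿 = −𝛾. Hence we have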

𝑒(𝐸𝑖𝑗 ) = 𝛼(𝑖 + 𝑗)𝑢 ∈ 𝐻 4 (𝑆 4 ; ℤ)

𝑝1 (𝐸𝑖𝑗 ) = 𝛾(𝑖 − 𝑗)𝑢 ∈ 𝐻 4 (𝑆 4 ; ℤ)

for some 𝛼, 𝛾.
To find out the coefficients, it suffices to calculate for a concrete 𝐸𝑖𝑗 , say
𝐸01 .
Note that the clutching function of 𝐸01 is complex linear, which implies
that it can be identified with a complex vector bundle. The space 𝑆𝐸01 is
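homeomorphic to 𝑆 7 , so from the fourth page of the Serre spectral sequence

[Diagram: the 𝐸4 page, with columns 𝑝 = 0, …, 4 and rows 𝑞 = 0, …, 3. The only nonzero entries are ℤ at (𝑝, 𝑞) = (0, 0), (4, 0), (0, 3) and (4, 3); the differential 𝑑4 maps the entry at (0, 3) to the entry at (4, 0).]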

we see that the Euler class must give rise to an isomorphism. Thus

𝑒(𝐸𝑖𝑗 ) = ±(𝑖 + 𝑗) ∈ 𝐻 4 (𝑆 4 ; ℤ) ≅ ℤ

where the sign depends on our choice of generator of ℤ.


Since 𝐸01 is a complex bundle,

$$c(E_{01} \otimes \mathbb{C}) = c(E_{01} \oplus \overline{E_{01}}) = (1 + c_2(E_{01}))^2 = 1 + 2c_2(E_{01}) = 1 + 2e(E_{01}),$$

$$p_1(E_{01}) = -c_2(E_{01} \otimes \mathbb{C}) = -2e(E_{01}) = \pm 2u.$$

Consequently,

𝑝1 (𝐸𝑖𝑗 ) = ±2(𝑖 − 𝑗) ∈ 𝐻 4 (𝑆 4 ; ℤ) ≅ ℤ.

If we set 𝑒(𝐸01 ) = 1, then the sign convention is

𝑒(𝐸𝑖𝑗 ) = 𝑖 + 𝑗 ∈ 𝐻 4 (𝑆 4 ; ℤ) ≅ ℤ

𝑝1 (𝐸𝑖𝑗 ) = 2(𝑖 − 𝑗) ∈ 𝐻 4 (𝑆 4 ; ℤ) ≅ ℤ.

Differentiable structures on 𝑆 7
If 𝑘 = 2, the formula for the Eells–Kuiper invariant becomes

$$\mu(M^7) \equiv \frac{p_1^2[W] - 4\tau[W]}{2^7 \times 7} \bmod 1.$$

Since 𝑆𝐸ℎ𝑗 is homeomorphic to the standard sphere 𝑆 7 iff the Euler


class of the bundle 𝑒(𝑆𝐸ℎ𝑗 ) = ±1, discussions in previous sections show
that the Eells–Kuiper 𝜇 invariant applies to these topological spheres. Since
by reversing the orientation of fiber we obtain 𝐸ℎ𝑗 = 𝐸−𝑗,−ℎ , it suffices to
consider the case ℎ + 𝑗 = 1.
Assume ℎ + 𝑗 = 1, and write 𝑝 = ℎ − 𝑗 = 2ℎ − 1. Let 𝑀𝑝 = 𝑆𝐸ℎ𝑗 and
𝐵𝑝 = 𝐷𝐸ℎ𝑗 be the sphere bundle and disk bundle associated to 𝜙ℎ𝑗 , and let
𝐸ℎ𝑗 be the associated vector bundle.
Since 𝐻 4 (𝐵𝑝 ; ℤ) = 𝐻 4 (𝐵𝑝 , 𝑀𝑝 ; ℤ) = ℤ, the orientation may be chosen
such that

$$\langle (j^{-1}\pi^* u)^2, [B_p, M_p] \rangle = 1 \quad\text{and}\quad \tau(B_p) = 1,$$

where 𝑢 is a generator of 𝐻 4 (𝑆 4 ; ℤ), 𝜋 ∗ ∶ 𝐻 4 (𝑆 4 ) → 𝐻 4 (𝐷𝐸ℎ𝑗 ) is an
isomorphism, 𝜋 ∗ 𝑢 is a generator of 𝐻 4 (𝐷𝐸ℎ𝑗 ; ℤ), and 𝑗 is the isomorphism
defined before. This can be done because of the following lemma.

Lemma 4. Let 𝑀 4𝑘 be a compact oriented connected 4𝑘-manifold with


connected boundary 𝜕𝑀, such that 𝐻 2𝑘−1 (𝑀) = 0. Assume all cohomol-
ogy groups are free ℤ-modules. Then the intersection form is unimodular
iff 𝐻2𝑘 (𝜕𝑀) = 0.

Proof. Assume first that the intersection form

$$Q \colon H^{2k}(M, \partial M) \otimes H^{2k}(M, \partial M) \to \mathbb{Z},$$

which is dual to

$$Q \colon H_{2k}(M) \otimes H_{2k}(M) \to \mathbb{Z},$$

is unimodular. Let 𝑖∗ ∶ 𝐻2𝑘 (𝜕𝑀) → 𝐻2𝑘 (𝑀) be induced by the inclusion.
Since 𝑄(𝑎, 𝑏) = 0 for every 𝑎 ∈ im 𝑖∗ and every 𝑏 ∈ 𝐻2𝑘 (𝑀), it follows that
im 𝑖∗ = 0. By Poincaré duality and the universal coefficient theorem,
𝐻2𝑘+1 (𝑀, 𝜕𝑀) = 𝐻 2𝑘−1 (𝑀) = Hom(𝐻2𝑘−1 (𝑀), ℤ), so 𝐻2𝑘−1 (𝑀) = 0 implies
𝐻2𝑘+1 (𝑀, 𝜕𝑀) = 0. From the long exact sequence

$$\cdots \to H_{2k+1}(M, \partial M) \to H_{2k}(\partial M) \xrightarrow{\,i_*\,} H_{2k}(M) \to H_{2k}(M, \partial M) \to H_{2k-1}(\partial M) \to \cdots$$

we see that im 𝑖∗ = 0 implies 𝐻2𝑘 (𝜕𝑀) = 0.


Conversely, if 𝐻2𝑘 (𝜕𝑀) = 0, then 𝐻2𝑘−1 (𝜕𝑀) = 0. Thus 𝐻2𝑘 (𝑀) ≅
𝐻2𝑘 (𝑀, 𝜕𝑀) and 𝐻 2𝑘 (𝑀) ≅ 𝐻 2𝑘 (𝑀, 𝜕𝑀). From Poincaré duality we
have that

𝐻 2𝑘 (𝑀) ⊗ 𝐻 2𝑘 (𝑀, 𝜕𝑀) → ℤ, (𝑎, 𝑏) ↦ ⟨𝑎 ∪ 𝑏, [𝑀, 𝜕𝑀]⟩

is nondegenerate. The result follows. ◻


To calculate the first Pontrjagin class, we use the fact that 𝑝1 (𝐸ℎ𝑗 ) =
±2(ℎ − 𝑗)𝑢 ∈ 𝐻 4 (𝑆 4 ; ℤ) where 𝑢 is a generator, and find over ℚ that

𝑝1 (𝑇 𝐷𝐸ℎ𝑗 ) = 𝑝1 (𝜋 ∗ 𝐸ℎ𝑗 ⊕ 𝜋 ∗ 𝑇 𝑆 4 ) = 𝜋 ∗ 𝑝1 (𝐸ℎ𝑗 ) = ±2(ℎ − 𝑗)𝜋 ∗ 𝑢.


Consequently

$$p_1^2[B_{2h-1}] = 4(h - j)^2 = 4(2h - 1)^2,$$

$$\mu(M_{2h-1}) \equiv \frac{4(2h - 1)^2 - 4}{2^7 \times 7} \equiv \frac{h(h - 1)}{56} \bmod 1.$$
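
The simplification of 𝜇(𝑀2ℎ−1 ) can be confirmed with exact rational arithmetic. The sketch below is an illustration only; the function names are hypothetical, and it assumes the 𝑘 = 2 formula (𝑝1² − 4𝜏)∕(2⁷ × 7) together with the values 𝑝1²[𝐵] = 4(2ℎ − 1)² and 𝜏(𝐵) = 1 stated above.

```python
from fractions import Fraction


def mu_from_characteristic_numbers(p1_squared: int, signature: int) -> Fraction:
    """Eells-Kuiper formula for k = 2: (p1^2 - 4*tau) / (2^7 * 7), reduced mod 1."""
    value = Fraction(p1_squared - 4 * signature, 2**7 * 7)
    return value % 1


def mu_of_sphere_bundle(h: int) -> Fraction:
    """mu(M_{2h-1}) computed from p1^2[B] = 4(2h-1)^2 and tau(B) = 1."""
    return mu_from_characteristic_numbers(4 * (2 * h - 1) ** 2, 1)


if __name__ == "__main__":
    for h in range(-3, 6):
        print(h, mu_of_sphere_bundle(h), Fraction(h * (h - 1), 56) % 1)
```

For ℎ = 2 this gives 1∕28, the value used for 𝑀3 in the proof of Theorem 8.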

Lemma 5. The possible values of ℎ(ℎ − 1)∕2 ∈ ℤ∕28ℤ are

0, 1, 3, 6, 7, 8, 10, 13, 14, 15, 17, 20, 21, 22, 24, 27.

Proof. Note that ℎ(ℎ − 1) ≡ 𝑚(𝑚 − 1) mod 56 iff (𝑚 + ℎ − 1)(𝑚 − ℎ) ≡


0 mod 56 iff

(𝑚 + ℎ − 1)(𝑚 − ℎ) ≡ 0 mod 8 and (𝑚 + ℎ − 1)(𝑚 − ℎ) ≡ 0 mod 7.

Thus it suffices to calculate for the 16 values of ℎ such that

ℎ ≡ 1, 2, 3, 4 mod 7 and ℎ ≡ 1, 2, 3, 4 mod 8.

This lemma follows from direct calculation. ◻
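
The direct calculation in Lemma 5 can be reproduced by brute force. The following Python sketch (with the hypothetical function name `lemma5_values`) enumerates ℎ(ℎ − 1)∕2 modulo 28 over one full period; it relies on the observation that ℎ(ℎ − 1) modulo 56 depends only on ℎ modulo 56.

```python
from typing import List


def lemma5_values(modulus: int = 28) -> List[int]:
    """Sorted list of the residues h(h-1)/2 mod `modulus` as h ranges over Z."""
    seen = set()
    # h(h-1) mod 2*modulus is periodic in h with period 2*modulus,
    # so one period of h covers every residue that can occur.
    for h in range(2 * modulus):
        seen.add((h * (h - 1) // 2) % modulus)
    return sorted(seen)


if __name__ == "__main__":
    values = lemma5_values()
    print(len(values), values)
```

The enumeration returns exactly the 16 residues listed in the lemma.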


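Now we have found 16 pairwise non-diffeomorphic differentiable structures
on the topological 7-sphere in the collection of sphere bundles over spheres,
so at least 15 of them are exotic. Actually we can construct more.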
Theorem 8. There are at least 28 different differentiable structures on the
7-dimensional topological sphere.

Proof. Let ℎ = 2. Since 𝜇(𝑀3 ) ≡ 1∕28 mod 1, ∀1 ≤ 𝑚 ≤ 28, the 𝜇


invariant of the connected sum of 𝑚 copies of 𝑀3 is 𝑚∕28 mod 1. ◻
Indeed, there are exactly 28 different differentiable structures on the 7-
dimensional topological sphere. Thus two such manifolds are diffeomor-
phic iff their 𝜇 are the same.
Corollary 1. The group Θ7 formed by these spheres is a cyclic group.
Corollary 2. The 16 spheres of the form 𝑀𝑝 admit infinitely many essen-
tially different differentiable fibrations by 𝑆 3 .

Proof. For each of the values in the lemma above, there are infinitely many
values of ℎ corresponding to it. ◻
Proposition 3. For any 7-dimensional manifold 𝑀 in the domain of 𝜇,
the underlying topological manifold has at least 28 different differentiable
structures.

Proof. Consider 𝑀#(𝑀3 # … #𝑀3 ) where there are 𝑚 copies of 𝑀3 . The


underlying topological manifolds are the same, but their 𝜇 invariants vary
among 28 different values in ℚ∕ℤ. ◻
Proposition 4. There are at least 14 different differentiable structures on


the 7-dimensional real projective space ℝ𝑃 7 .

Proof. Consider ℝ𝑃 7 #(𝑀3 # … #𝑀3 ) where there are 𝑚 copies of 𝑀3 . Its


universal cover is the connected sum of 2𝑚 copies of 𝑀3 , with 𝜇 invariant
2𝑚∕28 = 𝑚∕14 mod 1. ◻
It is proved in Milnor’s paper in [Cai15] that there are at least 28 different
differentiable structures on ℝ𝑃 7 .

5 Applications to 𝑆 1 bundles over homotopy ℂ𝑃 3

Consider certain circle bundles over homotopy ℂ𝑃 3 whose total space is


homeomorphic to 𝑆 7 . The purpose of this section is to use Eells–Kuiper
invariant to find out the diffeomorphism classes of these bundles over ℂ𝑃 3 ,
or more generally over homotopy ℂ𝑃 3 .

Characteristic classes of bundles over ℂ𝑃 3


Circle bundles over homotopy ℂ𝑃 3 are completely determined by their
Chern classes, which can be identified with an integer. Recall for example
that 𝒪(−1) is the tautological line bundle, and 𝒪(1) is the hyperplane
bundle.
Denote the total space of the line bundle 𝒪(𝑑) by 𝐸𝑑 , and denote the
corresponding disk bundle and sphere bundle by 𝐷𝐸𝑑 and 𝑆𝐸𝑑 respectively.

Theorem 9. 𝑆𝐸𝑑 is homeomorphic to 𝑆 7 iff 𝑑 = ±1.

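Proof. From the cell structure we see that ℂ𝑃 3 is simply connected. The
second page 𝐸2 of the Serre spectral sequence associated to 𝑆𝐸𝑑 is

[Diagram: the 𝐸2 page, with columns 𝑝 = 0, …, 6 and rows 𝑞 = 0, 1. Both rows read ℤ, 0, ℤ, 0, ℤ, 0, ℤ; each differential 𝑑2 from position (𝑝, 1) to position (𝑝 + 2, 0), for 𝑝 = 0, 2, 4, is multiplication by 𝑑.]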

So 𝑆𝐸𝑑 is a homology sphere iff 𝑑 = ±1.


If 𝑆𝐸𝑑 is a homology sphere, the first nontrivial homotopy group is
𝜋7 (𝑆𝐸𝑑 ) = ℤ. Take a generator of 𝜋7 (𝑆𝐸𝑑 ) = ℤ and we see that this
map induces isomorphisms in homology groups between simply connected
manifolds (proved in the next lemma), so this map gives rise to a homotopy
equivalence. Thus 𝑆𝐸𝑑 is a homotopy 7-sphere, which implies that 𝑆𝐸𝑑 is
homeomorphic to 𝑆 7 . ◻
Lemma 6.

$$\pi_1(SE_d) = \begin{cases} \mathbb{Z}, & \text{if } d = 0, \\ 0, & \text{if } d = \pm 1, \\ \mathbb{Z}/|d|\mathbb{Z}, & \text{otherwise.} \end{cases}$$

Proof. From the long exact sequence

⋯ → 𝜋2 (ℂ𝑃 3 ) → 𝜋1 (𝑆 1 ) → 𝜋1 (𝑆𝐸𝑑 ) → 𝜋1 (ℂ𝑃 3 ) → ⋯

where 𝜋2 (ℂ𝑃 3 ) = 𝜋1 (𝑆 1 ) = ℤ, 𝜋1 (ℂ𝑃 3 ) = 0, we see that 𝜋1 (𝑆𝐸𝑑 ) is a
quotient of ℤ and hence is abelian, so 𝜋1 (𝑆𝐸𝑑 ) = 𝐻1 (𝑆𝐸𝑑 ; ℤ) = 𝐻 6 (𝑆𝐸𝑑 ; ℤ).
The result follows from the Serre spectral sequence associated to 𝑆𝐸𝑑 . ◻
Let 𝜋 be the bundle projection. Let 𝑣 ∈ 𝐻 2 (ℂ𝑃 3 ; ℤ) be the first Chern
class of the hyperplane bundle. Then

𝐻 ∗ (ℂ𝑃 3 ; ℤ) = ℤ[𝑣]∕(𝑣4 ).

Lemma 7. 𝑤2 (𝑇 𝐷𝐸𝑑 ) ≡ 𝑑 ⋅ 𝜋 ∗ 𝑣 mod 2. Consequently, 𝐷𝐸𝑑 is spin iff 𝑑


is even, and in particular 𝐷𝐸−1 is not spin.

Proof.

𝑤2 (𝑇 𝐷𝐸𝑑 ) = 𝜋 ∗ (𝑤2 (𝐸𝑑 ⊕ 𝑇 ℂ𝑃 3 )) = 𝜋 ∗ (𝑤2 (𝐸𝑑 ) + 0 + 𝑤2 (𝑇 ℂ𝑃 3 ))


≡ 𝜋 ∗ (𝑑𝑣 + 4𝑣) ≡ 𝑑 ⋅ 𝜋 ∗ 𝑣 mod 2.

Here we use the fact that when considering the real bundle underlying a
complex rank 𝑛 vector bundle 𝐸, 𝑤2 (𝐸) ≡ 𝑐1 (𝐸) mod 2. ◻
It is possible to apply the Eells–Kuiper invariant to find out the diffeo-
morphism classes of the homotopy spheres 𝑆𝐸1 and 𝑆𝐸−1 , which are spin.
Also since 𝐸1 is just the complex conjugate of 𝐸−1 , they are the same as
real vector bundles and thus 𝑆𝐸1 and 𝑆𝐸−1 are diffeomorphic, via an ori-
entation reversing diffeomorphism. It suffices to calculate the Eells–Kuiper
invariant for 𝑆𝐸−1 . However, 𝐷𝐸−1 is not spin. We cannot apply the for-
mula in previous sections directly to the pair (𝐷𝐸−1 , 𝑆𝐸−1 ).

A generalization of 𝜇 for nonspin case


We need to calculate 𝜇 for a nonspin coboundary when considering disk
bundles with boundary circle bundles over homotopy ℂ𝑃 3 . This follows
from [KS88], [KS91].
Let 𝑀 be a 7-dimensional nonspin closed manifold with 𝐻 4 (𝑀; ℚ) =
0, together with a class 𝑢 ∈ 𝐻 2 (𝑀; ℤ) whose mod 2 reduction is
𝑤2 (𝑀). Assume there exist an 8-dimensional manifold 𝑊 and el-
ements 𝑧, 𝑐 ∈ 𝐻 2 (𝑊 ; ℤ) restricting to 𝑢, 0 respectively, such that
𝑤2 (𝑊 ) = 𝑧 + 𝑐 mod 2.
Similarly, if 𝑀 is a 7-dimensional spin closed manifold with 𝐻 4 (𝑀; ℚ) =


0, take 𝑢 = 0, 𝑧 = 0 and 𝑐 ∈ 𝐻 2 (𝑊 ; ℤ) restricting to 0, such that
𝑤2 (𝑊 ) = 𝑐 mod 2.
Define invariants 𝑠𝑖 (𝑀, 𝑢) = 𝑆𝑖 (𝑊 , 𝑧, 𝑐) ∈ ℚ∕ℤ, 𝑖 = 1, 2, 3 as follows:

$$S_1(W, z, c) = \langle e^{(c+z)/2} \hat{A}(W), [W, \partial W] \rangle,$$

$$S_2(W, z, c) = \langle \operatorname{ch}(\lambda(z) - 1)\, e^{(c+z)/2} \hat{A}(W), [W, \partial W] \rangle,$$

$$S_3(W, z, c) = \langle \operatorname{ch}(\lambda^2(z) - 1)\, e^{(c+z)/2} \hat{A}(W), [W, \partial W] \rangle.$$

Here 𝜆(𝑧) is the complex line bundle over 𝑊 with first Chern class 𝑧,
ch is the Chern character, and $\hat{A}(W)$ is the $\hat{A}$-polynomial of 𝑊 . Since we
are requiring 𝐻 4 (𝑀; ℚ) = 0, $p_1$ and $z^2$ can be considered as elements of
𝐻 4 (𝑊 , 𝑀; ℚ), and thus $p_1^2$, $p_1 z^2$ and $z^4$ can be evaluated on the
fundamental class [𝑊 , 𝜕𝑊 ] = [𝑊 , 𝑀]. However, $p_2$ appears in $e^{(c+z)/2}\hat{A}(W)$,
and it cannot be considered as a relative class in general. Our strategy is
the same as for the 𝜇 invariant: use the signature of 𝑊 to eliminate $p_2$ from
the expression, i.e. replace some constant multiple of 𝐿(𝑊 ) (evaluated on
the fundamental class) by some constant multiple of 𝜏(𝑊 ), so that the $p_2$ term
disappears.
These characteristic numbers for closed manifolds are integers. For
a proof, see Theorem 26.1.1. in [HBS13]. It follows that 𝑠𝑖 (𝑀, 𝑢) =
𝑆𝑖 (𝑊 , 𝑧, 𝑐) ∈ ℚ∕ℤ, 𝑖 = 1, 2, 3 are well-defined.
Note that, if 𝑊 is spin and 𝑀 has the induced spin structure, we may
take 𝑧 = 0, 𝑐 = 0 and thus

$$s_1(M, 0) = \langle \hat{A}(W), [W, \partial W] \rangle = \mu(M).$$

So 𝑠1 is a generalization of 𝜇.
If 𝑐 = 0, explicit formulas are: for the spin case

$$S_1(W, z, 0) = -\frac{1}{2^5 \cdot 7}\,\tau(W) + \frac{1}{2^7 \cdot 7}\,p_1^2,$$

for the nonspin case

$$S_1(W, z, 0) = -\frac{1}{2^5 \cdot 7}\,\tau(W) + \frac{1}{2^7 \cdot 7}\,p_1^2 - \frac{1}{2^6 \cdot 3}\,z^2 p_1 + \frac{1}{2^7 \cdot 3}\,z^4,$$

where, by abuse of notation, everything is considered as a relative cohomology
class and is evaluated on [𝑊 , 𝜕𝑊 ] = [𝑊 , 𝑀]. If 𝑀 is spin but 𝑊 is
not, then

$$S_1(W, z, c) = -\frac{1}{2^5 \cdot 7}\,\tau(W) + \frac{1}{2^7 \cdot 7}\,p_1^2 - \frac{1}{2^6 \cdot 3}\,(z + c)^2 p_1 + \frac{1}{2^7 \cdot 3}\,(z + c)^4.$$
Calculation for 𝑆𝐸−1


Take 𝑢 = 0, 𝑧 = 0, 𝑐 = 𝜋 ∗ 𝑣. Since 𝐻 4 (𝐷𝐸−1 ; ℤ) ≅ 𝐻 4 (𝐷𝐸−1 , 𝑆𝐸−1 ; ℤ) =
ℤ, the orientation may be chosen so that

$$\langle (j^{-1}\pi^* v)^4, [DE_{-1}, SE_{-1}] \rangle = 1 \quad\text{and}\quad \tau(DE_{-1}) = 1,$$

where $\pi^* v^2$ ∈ 𝐻 4 (𝐷𝐸−1 ; ℤ) is a generator, and 𝑣 ∈ 𝐻 2 (ℂ𝑃 3 ; ℤ) denotes
the first Chern class of the hyperplane bundle. This can be done because in
this case the intersection form is unimodular, which follows from Lemma
4. A different choice of orientation leads to the negative of the final result.
To calculate the 𝑠1 invariant, we need to calculate the first Pontrjagin
class from Chern classes:

$$p_1(TDE_{-1}) = \pi^*\bigl(p_1(E_{-1}) + p_1(T\mathbb{CP}^3)\bigr) = \pi^*(v^2 + 4v^2) = 5\pi^* v^2.$$

Substituting into the last formula in the previous section, we find


$$\mu(SE_{-1}) = s_1(SE_{-1}, 0) = \frac{1}{32}\left(-\frac{1}{7} + \frac{25}{28} - \frac{5}{6} + \frac{1}{12}\right) = 0.$$
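
The vanishing of this sum can be verified exactly. The Python sketch below is an illustration with a hypothetical function name; it assumes the formula for 𝑆1 (𝑊 , 𝑧, 𝑐) in the case where 𝑀 is spin but 𝑊 is not, and the characteristic numbers 𝜏 = 1, 𝑝1² = 25, (𝑧 + 𝑐)²𝑝1 = 5 and (𝑧 + 𝑐)⁴ = 1 obtained from 𝑝1 = 5𝜋∗𝑣², 𝑧 + 𝑐 = 𝜋∗𝑣 and the normalization of the orientation.

```python
from fractions import Fraction


def s1_nonspin_coboundary(signature: int, p1_sq: int, zc_sq_p1: int, zc_4: int) -> Fraction:
    """S_1(W, z, c) for M spin and W nonspin, from evaluated characteristic numbers.

    signature : tau(W)
    p1_sq     : p1^2 evaluated on [W, M]
    zc_sq_p1  : (z + c)^2 p1 evaluated on [W, M]
    zc_4      : (z + c)^4 evaluated on [W, M]
    """
    return (
        -Fraction(signature, 2**5 * 7)
        + Fraction(p1_sq, 2**7 * 7)
        - Fraction(zc_sq_p1, 2**6 * 3)
        + Fraction(zc_4, 2**7 * 3)
    )


if __name__ == "__main__":
    # Disk bundle DE_{-1} over CP^3: tau = 1, p1 = 5 v^2, z + c = v, v^4 = 1.
    print(s1_nonspin_coboundary(1, 25, 5, 1) % 1)
```

Here the sum is zero as a rational number, not merely modulo 1.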
This implies
Theorem 10. 𝑆𝐸−1 and 𝑆𝐸1 are diffeomorphic to 𝑆 7 with the standard
differentiable structure.

Calculation for spin manifolds homotopy equivalent to ℂ𝑃 3


In [Wal66], a classification of certain 6-manifolds is given by Theorem 3
and Theorem 5. For spin manifolds homotopy equivalent to ℂ𝑃 3 , this is
restated as follows:
Theorem 11. Diffeomorphism classes of closed, smooth, simply-connected
spin 6-manifolds 𝑀 homotopy equivalent to ℂ𝑃 3 correspond bijectively to
isomorphism classes of the homomorphism

𝑝1 ∶ 𝐻 2 (𝑀; ℤ) ≅ 𝐻4 (𝑀; ℤ) ≅ ℤ → ℤ

subject to

∀𝑥 ∈ 𝐻 2 (𝑀; ℤ), 𝑝1 (𝑥) ≡ 4𝜇(𝑥, 𝑥, 𝑥) mod 24.

Equivalently, these manifolds are characterized by their first Pontrjagin


classes satisfying 𝑝1 (𝑇 𝑀) = (24𝑘 + 4)𝑣2 , 𝑘 ∈ ℤ, where 𝑣 ∈ 𝐻 2 (𝑀; ℤ)
corresponds to 𝑣 ∈ 𝐻 2 (ℂ𝑃 3 ; ℤ) defined above. Write 𝑋𝑘 for this manifold.
Note that 𝑋0 is just ℂ𝑃 3 .
Again circle bundles over 𝑋𝑘 are completely determined by their Chern
classes, which can be identified with an integer. Write 𝐸𝑑 for the complex
line bundle over 𝑋𝑘 with first Chern class 𝑑𝑣, and define 𝑆𝐸𝑑 , 𝐷𝐸𝑑 in the
same way. One verifies that only 𝑆𝐸1 and 𝑆𝐸−1 are homeomorphic to 𝑆 7
as in the previous sections, and it suffices to consider 𝑆𝐸−1 .
𝑋𝑘 is spin, so 𝑤2 (𝑇 𝐷𝐸𝑑 ) ≡ 𝑑 ⋅ 𝜋 ∗ 𝑣 mod 2 and it is zero iff 𝑑 is divisible
by 2, and in particular 𝐷𝐸−1 is not spin. 𝑆𝐸−1 is spin, being a homotopy
sphere.
Take 𝑢 = 0, 𝑧 = 0, 𝑐 = 𝜋 ∗ 𝑣. Since

𝐻 4 (𝐷𝐸−1 ; ℤ) ≅ 𝐻 4 (𝐷𝐸−1 , 𝑆𝐸−1 ; ℤ) = ℤ,

the orientation may be chosen so that

$$\langle (j^{-1}\pi^* v)^4, [DE_{-1}, SE_{-1}] \rangle = 1 \quad\text{and}\quad \tau(DE_{-1}) = 1,$$

where $\pi^* v^2$ ∈ 𝐻 4 (𝐷𝐸−1 ; ℤ) is a generator. This can be done for the same
reason.
The first Pontrjagin class is

$$p_1(TDE_{-1}) = \pi^*\bigl(p_1(E_{-1}) + p_1(TX_k)\bigr) = \pi^*\bigl(v^2 + (24k + 4)v^2\bigr) = (24k + 5)\pi^* v^2.$$

The 𝑠1 invariant is

$$\mu(SE_{-1}) = s_1(SE_{-1}, 0) = -\frac{1}{2^5 \cdot 7} + \frac{(24k + 5)^2}{2^7 \cdot 7} - \frac{24k + 5}{2^6 \cdot 3} + \frac{1}{2^7 \cdot 3} = \frac{9}{14}k^2 + \frac{1}{7}k \bmod 1.$$
14 7
Equivalently, identifying (1∕28)ℤ∕ℤ with ℤ∕28ℤ via multiplication by 28,
we may consider

𝜇(𝑆𝐸−1 ) = 18𝑘2 + 4𝑘 mod 28.

For 𝑘 ranging from 0 to 27, the values are

0, 22, 24, 6, 24, 22, 0, 14, 8, 10, 20, 10, 8, 14,
0, 22, 24, 6, 24, 22, 0, 14, 8, 10, 20, 10, 8, 14.

So the possible values of 𝜇(𝑆𝐸−1 ) (with the orientation chosen in the


proof above) are
0, 6, 8, 10, 14, 20, 22, 24.

Note that 𝑆𝐸−1 and 𝑆𝐸1 are diffeomorphic via an orientation reversing
diffeomorphism, so 𝜇(𝑆𝐸1 ) = −𝜇(𝑆𝐸−1 ). The possible values of 𝜇(𝑆𝐸−1 ) and
𝜇(𝑆𝐸1 ) are therefore
0, 4, 6, 8, 10, 14, 18, 20, 22, 24.
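
The tables of values can be regenerated from the formula for 𝑠1 . The sketch below is illustrative only (the function names are hypothetical); it assumes 𝜏 = 1, 𝑝1 = (24𝑘 + 5)𝜋∗𝑣² and 𝑧 + 𝑐 = 𝜋∗𝑣 as above, evaluates 𝑠1 exactly, rescales by 28, and then adds the negatives to account for the opposite orientation.

```python
from fractions import Fraction
from typing import List


def s1_over_Xk(k: int) -> Fraction:
    """s_1(SE_{-1}, 0) mod 1 for the circle bundle over X_k, with p1 = (24k + 5) v^2."""
    p1 = 24 * k + 5
    value = (
        -Fraction(1, 2**5 * 7)
        + Fraction(p1 * p1, 2**7 * 7)
        - Fraction(p1, 2**6 * 3)
        + Fraction(1, 2**7 * 3)
    )
    return value % 1


def mu_times_28(k: int) -> int:
    """The residue 28 * mu(SE_{-1}) in Z/28Z."""
    scaled = s1_over_Xk(k) * 28
    assert scaled.denominator == 1, "mu should be a multiple of 1/28"
    return int(scaled)


def values_one_orientation() -> List[int]:
    return sorted({mu_times_28(k) for k in range(28)})


def values_both_orientations() -> List[int]:
    one = values_one_orientation()
    return sorted(set(one) | {(-v) % 28 for v in one})


if __name__ == "__main__":
    print([mu_times_28(k) for k in range(28)])
    print(values_one_orientation())
    print(values_both_orientations())
```

The sequence is periodic in 𝑘 with period 14, which is why the list for 𝑘 from 0 to 27 repeats itself.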

In [Jia13], Corollary 4.1 states that


Proposition 5. Among the 28 homotopy 7-spheres Σ𝑟 , 0 ≤ 𝑟 ≤ 27, the


following ones admit smooth regular circle actions
Σ𝑟 , 𝑟 = 0, 4, 6, 8, 10, 14, 18, 20, 22, 24.
Thus we arrive at the final result
Theorem 12. Among the 28 homotopy 7-spheres Σ𝑟 , 0 ≤ 𝑟 ≤ 27, the fol-
lowing ones admit smooth regular circle actions
Σ𝑟 , 𝑟 = 0, 4, 6, 8, 10, 14, 18, 20, 22, 24.
Each of these Σ𝑟 can be realized as the total space of a principal 𝑆 1 -bundle
over some homotopy ℂ𝑃 3 with primitive Euler class, and hence admits
smooth regular 𝑆 1 -actions.

6 Further Discussions
For more information about 𝑆 1 -actions on homotopy spheres, see [Hsi66],
[MY68] and [Sch71].
The topology of sphere bundles over spheres has been discussed in
[Jam62a], [Jam62b]. A complete classification of 𝑆 3 -bundles over 𝑆 4 is
done in [CM03].
In [KM63], the group Θ𝑛 of homotopy spheres is discussed. The h-
cobordism classes of homotopy 𝑛-spheres form an abelian group Θ𝑛 under
connected sum operation. Actually, for 𝑛 ≥ 5 homotopy 𝑛-spheres are in-
deed homeomorphic to 𝑆 𝑛 , and two homotopy 𝑛-spheres are h-cobordant
if and only if they are diffeomorphic. Consequently Θ𝑛 describes exotic
spheres of dimension 𝑛, for 𝑛 ≥ 5.
A generalized version of the Eells–Kuiper invariant is the Kreck–Stolz
𝑠-invariant. See [TW15].

References
[BH58] A. Borel and F. Hirzebruch (1958). “Characteristic Classes and Homo-
geneous Spaces, I”. American Journal of Mathematics 80 (2), 458–538.
ISSN: 00029327, 10806377. URL: http://www.jstor.org/stable/
2372795.
[Bre13] G.E. Bredon (2013). Topology and Geometry. Graduate Texts in Math-
ematics. Springer New York. ISBN: 9781475768480. URL: https://
books.google.com.hk/books?id=wuUlBQAAQBAJ.
[BT13] R. Bott and L.W. Tu (2013). Differential Forms in Algebraic Topol-
ogy. Graduate Texts in Mathematics. Springer New York. ISBN:
9781475739510. URL: https://books.google.com.hk/books?
id=COuPBAAAQBAJ.
[Cai15] S.S. Cairns (2015). Differential and Combinatorial Topology: A
Symposium in Honor of Marston Morse (PMS-27). Princeton Mathematical
Series. Princeton University Press. ISBN: 9781400874842. URL:
https://books.google.com.hk/books?id=Vg3WCgAAQBAJ.
[CM03] Diarmuid Crowley and Christine M. Escher (2003). “A classification of
𝑆 3 -bundles over 𝑆 4 ”. Differential Geometry and its Applications 18 (3),
363–380. ISSN: 0926-2245.
[EK62] James Eells and Nicolaas H. Kuiper (1962). “An invariant for certain
smooth manifolds”. Annali di Matematica Pura ed Applicata 60 (1),
93. ISSN: 1618-1891. DOI: 10.1007/BF02412768. URL:
https://doi.org/10.1007/BF02412768.
[FS14] F.T. Farrell and Y. Su (2014). Introductory Lectures on Manifold Topology:
Signposts. Surveys of modern mathematics. International Press.
ISBN: 9781571462879. URL:
https://books.google.com.hk/books?id=J1vuoAEACAAJ.
[GS99] R.E. Gompf and A.I. Stipsicz (1999). 4-Manifolds and Kirby Calculus.
Graduate studies in mathematics. American Mathematical Society.
ISBN: 9780821809945. URL:
https://books.google.com.hk/books?id=TpGDAwAAQBAJ.
[Hat02] A. Hatcher (2002). Algebraic Topology. Cambridge University Press.
ISBN: 9780521795401. URL:
https://books.google.com.hk/books?id=BjKs86kosqgC.
[HBS13] F. Hirzebruch, A. Borel, and R.L.E. Schwarzenberger (2013). Topological
Methods in Algebraic Geometry. Grundlehren der mathematischen
Wissenschaften. Springer Berlin Heidelberg. ISBN:
9783662306970. URL:
https://books.google.com.hk/books?id=DmLoCAAAQBAJ.
[Hsi66] Wu-chung Hsiang (1966). “A Note on Free Differentiable Actions of 𝑆 1
and 𝑆 3 on Homotopy Spheres”. Annals of Mathematics 83 (2), 266–272.
ISSN: 0003486X. URL: http://www.jstor.org/stable/1970431.
[Jam62a] I.M. James (1962). “THE HOMOTOPY THEORY OF SPHERE BUN-
DLES OVER SPHERES (I)”. Algebraic and Classical Topology. Ed. by
I.M. James. Pergamon, 37–59. ISBN: 978-0-08-009872-2.
[Jam62b] I.M. James (1962). “THE HOMOTOPY THEORY OF SPHERE BUN-
DLES OVER SPHERES (II)”. Algebraic and Classical Topology. Ed.
by I.M. James. Pergamon, 61–79. ISBN: 978-0-08-009872-2.
[Jia13] Yi Jiang (2013). “Regular Circle Actions on 2-connected 7-manifolds”.
Journal of the London Mathematical Society 90. DOI: 10.1112/jlms/
jdu028.
[KM63] Michel A. Kervaire and John W. Milnor (1963). “Groups of Homotopy
Spheres: I”. Annals of Mathematics 77 (3), 504–537. ISSN: 0003486X.
URL: http://www.jstor.org/stable/1970128.
[KS88] Matthias Kreck and Stephan Stolz (1988). “A Diffeomorphism Classification
of 7-Dimensional Homogeneous Einstein Manifolds with
𝑆𝑈 (3) × 𝑆𝑈 (2) × 𝑈 (1)-symmetry”. Annals of Mathematics 127 (2),
373–388. ISSN: 0003486X. URL: http://www.jstor.org/stable/2007059.
[KS91] Matthias Kreck and Stephan Stolz (1991). “Some nondiffeomorphic
homeomorphic homogeneous 7-manifolds with positive sectional
curvature”. J. Differential Geom. 33 (2), 465–486. DOI:
10.4310/jdg/1214446327. URL: https://doi.org/10.4310/jdg/1214446327.
[LM89] H.B. Lawson and M.L. Michelsohn (1989). Spin Geometry. Princeton
Mathematical Series. Princeton University Press. ISBN: 9780691085425.
URL: https://books.google.com.hk/books?id=3d9JkN8w3X8C.
[MH73] J.W. Milnor and D. Husemöller (1973). Symmetric Bilinear Forms.
Ergebnisse der Mathematik und ihrer Grenzgebiete. Springer-Verlag.
ISBN: 9783540060093. URL:
https://books.google.com.gi/books?id=dopWtQEACAAJ.
[Mil56] John Milnor (1956). “On Manifolds Homeomorphic to the 7-Sphere”.
Annals of Mathematics 64 (2), 399–405. ISSN: 0003486X. URL: http:
//www.jstor.org/stable/1969983.
[Mil59a] John Milnor (1959). “Differentiable manifolds which are homotopy
spheres”.
[Mil59b] John Milnor (1959). “Differentiable Structures on Spheres”. American
Journal of Mathematics 81 (4), 962–972. ISSN: 00029327, 10806377.
URL: http://www.jstor.org/stable/2372998.
[MS74] J.W. Milnor and J.D. Stasheff (1974). Characteristic Classes. Annals
of mathematics studies. Princeton University Press. ISBN:
9780691081229. URL:
https://books.google.com.hk/books?id=5zQ9AFk1i4EC.
[MY68] Deane Montgomery and C. T. Yang (1968). “Free Differentiable Ac-
tions on Homotopy Spheres”. Proceedings of the Conference on Trans-
formation Groups. Ed. by Paul S. Mostert. Berlin, Heidelberg: Springer
Berlin Heidelberg, 175–192. ISBN: 978-3-642-46141-5.
[Sav99] N. Saveliev (1999). Lectures on the Topology of 3-manifolds: An In-
troduction to the Casson Invariant. De Gruyter textbook. Walter de
Gruyter. ISBN: 9783110162721. URL: https://books.google.com.
hk/books?id=ErraOM8HYcIC.
[Sch71] Reinhard Schultz (1971). “The Nonexistence of Free 𝑆 1 Actions on
Some Homotopy Spheres”. Proceedings of the American Mathematical
Society 27 (3), 595–597. ISSN: 00029939, 10886826. URL:
http://www.jstor.org/stable/2036505.
[Ser12] J.P. Serre (2012). A Course in Arithmetic. Graduate Texts in Mathemat-
ics. Springer New York. ISBN: 9781468498844. URL: https://books.
google.com.hk/books?id=8fPTBwAAQBAJ.
[Sma61] Stephen Smale (1961). “Generalized Poincaré’s Conjecture in Dimensions
Greater Than Four”. Annals of Mathematics 74 (2), 391–406. ISSN:
0003486X. URL: http://www.jstor.org/stable/1970239.
[TW15] Wilderich Tuschmann and David J. Wraith (2015). Moduli spaces of
Riemannian metrics. Springer.
[Wal66] C. T. C. Wall (1966). “Classification problems in differential topology.
V”. Inventiones mathematicae 1 (4), 355–374. ISSN: 1432-1297. DOI:
10.1007/BF01389738. URL: https://doi.org/10.1007/BF01389738.
Transversality Theorems by Example

Xie Yuxiao1

ABSTRACT

We prove a general transversality theorem in the setting of Fredholm maps


between Banach manifolds. After that, we illustrate its power by describing
applications in three geometric contexts.

Contents

1 Introduction 164

2 Transversality in general 165


Banach manifolds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Interlude: Examples in geometry . . . . . . . . . . . . . . . . . . 166
Fredholm theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
The Sard–Smale theorem . . . . . . . . . . . . . . . . . . . . . . . . . 167
A general transversality theorem . . . . . . . . . . . . . . . . . . . . . 169
Remarks on smoothness . . . . . . . . . . . . . . . . . . . . . . . . . . 170

3 Example: Classical Morse theory 170


Genericity of Morse functions . . . . . . . . . . . . . . . . . . . . . . 170
Moduli spaces of flow lines . . . . . . . . . . . . . . . . . . . . . . . . 171
Setting the stage . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
Fredholm property . . . . . . . . . . . . . . . . . . . . . . . . . 172
Regular value . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
Passage to 𝐶 ∞ . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173

1 Xie Yuxiao (谢雨潇), Class Math 80, Department of Mathematical Sciences, Tsinghua University.
4 Example: Pseudoholomorphic curves 173


Background material . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
Pseudoholomorphic curves . . . . . . . . . . . . . . . . . . . . . 174
The symplectic setting . . . . . . . . . . . . . . . . . . . . . . . 174
Moduli spaces of 𝐽 -holomorphic curves . . . . . . . . . . . . . . . . . 174
Setting the stage . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
Fredholm property . . . . . . . . . . . . . . . . . . . . . . . . . 175
Regular value . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
Passage to 𝐶 ∞ . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177

5 Example: Yang–Mills connections 177


Background material . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
Connections on principal bundles . . . . . . . . . . . . . . . . . . 178
Anti-self-dual connections . . . . . . . . . . . . . . . . . . . . . 178
Moduli spaces of self-dual connections . . . . . . . . . . . . . . . . . . 178

Convention. Unless otherwise stated, the word “manifold” refers to one which has
no boundary but which is not necessarily smooth. Whenever we write 𝐶 𝑘 or 𝑊 𝑘,𝑝 ,
it is tacitly understood that 𝑘 ≥ 1, 1 < 𝑝 < ∞.

1 Introduction
A transversality theorem is a theorem stating that a certain desirable property can
be achieved by an arbitrarily small perturbation, or that the property in question is
generic. We begin by making this notion precise.
Let 𝑋 be a topological space, 𝑃 a property that points of 𝑋 may or may not
satisfy. We say 𝑃 holds for generic 𝑥 ∈ 𝑋 if the set

{𝑥 ∈ 𝑋 ∣ 𝑃 holds for 𝑥}

is residual in the sense of Baire, i.e., it contains a countable intersection of dense


open sets. Note that residual sets in complete metric spaces are dense, as a conse-
quence of the Baire category theorem.
In classical differential topology, the prototypical example of a transversality
theorem is the result that transversal intersection of two submanifolds is generic.
This follows from a general parametric transversality theorem, which, in turn, re-
lies on Sard’s theorem. The same technique has been extended to the infinite-
dimensional setting so that it applies to, e.g., spaces of maps between manifolds
and spaces of sections of fiber bundles. Explaining and illustrating the use of this
extension is the purpose of the present paper.
Most of the transversality theorems that we shall prove fit into a larger program,
which can be abstractly formulated as follows. Here one is interested in the moduli
space ℳ of a class of geometric objects, typically defined by a differential equation.
Such a program studies the structure of ℳ in the following aspects:
1. Transversality.
In the generic case, ℳ is a finite-dimensional smooth manifold. The present
paper concerns itself with how one proves such a result.
2. Orientation.
The moduli space ℳ is orientable. This seemingly innocent fact could be
highly nontrivial. Incidentally, an important tool for proving this is the de-
terminant line bundle on the space of Fredholm operators.
3. Compactness.
Any sequence in ℳ has a subsequence “converging” in a suitable sense.
Adding all possible limits to ℳ, we obtain a compactification ℳ̅. Of course,
such results are necessarily quite analytic in nature.
4. Gluing.
The compactified moduli space ℳ̅ is a smooth manifold with an expected
boundary: taking a hint from the previous step, we expect the boundary to be
the space of another class of geometric objects. That all such objects arise
as boundary points is proved by a gluing procedure.

As an example, ℳ could be the moduli space of flow lines in Morse or Floer


homology, in which case the boundary (or corner) points are broken flow lines.

2 Transversality in general
In this section, we establish the theoretical foundations for proving transversality
theorems. Since we shall be mostly working with infinite-dimensional manifolds,
we first recall the relevant theory and fix our notation. After that, we prove the
Sard–Smale theorem and deduce from it a general transversality theorem, which
we shall apply in the sections that follow.
For a Banach space 𝒳 , we denote by ‖⋅‖𝒳 the norm defining 𝒳 .

Banach manifolds
Let 𝒳 , 𝒴 be Banach spaces, 𝑈 ⊂ 𝒳 an open set. A map 𝐹 ∶ 𝑈 → 𝒴 is (Fréchet)
differentiable at 𝑥0 ∈ 𝑈 if there is a continuous linear map 𝐴 ∶ 𝒳 → 𝒴 such that

$$\lim_{x \to x_0} \frac{\lVert F(x) - F(x_0) - A(x - x_0) \rVert_{\mathcal{Y}}}{\lVert x - x_0 \rVert_{\mathcal{X}}} = 0.$$

In this case, 𝐴 is called the differential of 𝐹 at 𝑥0 and denoted by 𝐷𝐹𝑥0 . If 𝐹
is differentiable at each point in 𝑈 , we consider 𝐷𝐹 ∶ 𝑈 → ℒ (𝒳 , 𝒴 ), where
ℒ (𝒳 , 𝒴 ) is the Banach space of continuous linear maps 𝒳 → 𝒴 . We say 𝐹 is
continuously differentiable if 𝐷𝐹 is continuous. Iterating this definition, 𝐹 is 𝐶 𝑘
if 𝐷𝐹 is 𝐶 𝑘−1 .
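
In finite dimensions the definition can be tested numerically. The following Python sketch is only an illustration: the map `F` on ℝ², its Jacobian `DF` and the helper `remainder_ratio` are hypothetical choices made for this example. It evaluates the quotient of the norm of the remainder 𝐹 (𝑥) − 𝐹 (𝑥0 ) − 𝐴(𝑥 − 𝑥0 ) by the norm of 𝑥 − 𝑥0 along a fixed direction and shows it shrinking as 𝑥 → 𝑥0 .

```python
import math


def F(x):
    """Sample smooth map R^2 -> R^2 (an illustrative choice)."""
    x1, x2 = x
    return (x1 * x1 - x2, math.sin(x1) + x1 * x2)


def DF(x0):
    """Jacobian matrix of F at x0, playing the role of the linear map A."""
    x1, x2 = x0
    return ((2.0 * x1, -1.0), (math.cos(x1) + x2, x1))


def remainder_ratio(x0, direction, t):
    """|F(x) - F(x0) - A(x - x0)| / |x - x0| for x = x0 + t * direction."""
    delta = (t * direction[0], t * direction[1])
    x = (x0[0] + delta[0], x0[1] + delta[1])
    A = DF(x0)
    linear = (
        A[0][0] * delta[0] + A[0][1] * delta[1],
        A[1][0] * delta[0] + A[1][1] * delta[1],
    )
    fx, fx0 = F(x), F(x0)
    remainder = (fx[0] - fx0[0] - linear[0], fx[1] - fx0[1] - linear[1])
    return math.hypot(*remainder) / math.hypot(*delta)


if __name__ == "__main__":
    for t in (1e-1, 1e-2, 1e-3):
        print(t, remainder_ratio((0.5, -1.0), (1.0, 2.0), t))
```

For a twice continuously differentiable map the quotient decays linearly in the step size, which is what the printed values should exhibit.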
With this notion of differentiability, we define 𝐶 𝑘 diffeomorphisms, immer-


sions, submersions, regular values, etc., just as in the finite-dimensional case. Sim-
ilarly, a 𝐶 𝑘 Banach manifold modeled on a Banach space 𝒳 is a Hausdorff space
obtained by piecing together open sets in 𝒳 by transition maps that are 𝐶 𝑘 diffeo-
morphisms. Note that we do not require a Banach manifold to be second countable,
since the model space 𝒳 itself may not be second countable. Tangent vectors are
defined as equivalence classes of ordered triples so that 𝑇𝑥 𝑈 = 𝒳 for an open set
𝑈 ⊂ 𝒳 . Much of the basic theory of finite-dimensional 𝐶 𝑘 manifolds carries over
to Banach manifolds. A standard reference in this direction is [Lan99].

Theorem 2.1 (Inverse mapping theorem). Let 𝒳 , 𝒴 be 𝐶 𝑘 Banach manifolds,
𝐹 ∶ 𝒳 → 𝒴 a 𝐶 𝑘 map. If 𝑥 ∈ 𝒳 is such that 𝐷𝐹𝑥 ∶ 𝑇𝑥 𝒳 → 𝑇𝐹 (𝑥) 𝒴 is an isomorphism,
then there are open neighborhoods 𝑈 , 𝑉 of 𝑥, 𝐹 (𝑥) in 𝒳 , 𝒴 , respectively,
such that 𝐹 ∶ 𝑈 → 𝑉 is a 𝐶 𝑘 diffeomorphism.

Theorem 2.2 (Implicit mapping theorem). Let 𝒳 , 𝒴 be 𝐶 𝑘 Banach manifolds,
𝐹 ∶ 𝒳 → 𝒴 a 𝐶 𝑘 map. If 𝑦 ∈ 𝒴 is a regular value of 𝐹 , then 𝐹 −1 (𝑦) is a 𝐶 𝑘
Banach submanifold of 𝒳 with 𝑇𝑥 𝐹 −1 (𝑦) = ker 𝐷𝐹𝑥 .

For our applications, the maps are always sufficiently smooth, and we shall omit
the technical verifications of smoothness for reasons of space.

Interlude: Examples in geometry

Let 𝑀 be a compact smooth manifold, 𝐸 → 𝑀 a smooth vector bundle. Fix a
Riemannian metric on 𝑀, a bundle metric on 𝐸, and a metric connection on 𝐸.
For a smooth section 𝑠 ∈ Γ(𝑀, 𝐸), define

\[ \|s\|_{C^k} := \sum_{i=0}^{k} \sup_M |\nabla^i s|, \qquad
   \|s\|_{W^{k,p}} := \Big( \sum_{i=0}^{k} \int_M |\nabla^i s|^p \Big)^{1/p}. \]

The completions of Γ(𝑀, 𝐸) with respect to these norms are Banach spaces, which
we denote by 𝐶 𝑘 (𝑀, 𝐸) and 𝑊 𝑘,𝑝 (𝑀, 𝐸), respectively. We also have the space 𝐶 ∞ (𝑀, 𝐸), which
is a Fréchet space defined by the sequence (‖⋅‖𝐶 𝑘 )𝑘 of norms, i.e., 𝑠𝑛 → 𝑠 in
𝐶 ∞ (𝑀, 𝐸) if and only if ‖𝑠𝑛 − 𝑠‖𝐶 𝑘 (𝑀,𝐸) → 0 for all 𝑘. Since 𝑀 is compact,
it is easy to check that the topologies on these spaces do not depend on the metrics
and connections chosen.
It is technically more difficult to define manifold structures on spaces of sections
of general fiber bundles, due to lack of linear structures. The case of trivial fiber
bundles, which amounts to spaces of maps between two manifolds, is discussed in,
e.g., [MS12, Remark B.1.24]. We shall not bother with this, since knowledge of
their tangent spaces already suffices in most applications.
Fredholm theory
The Fredholm property is a crucial technical condition for generalizing finite-
dimensional phenomena. Fortunately, Fredholm maps abound in geometry. For
example, elliptic differential operators are Fredholm. In this article, we omit
the analytical verification of the Fredholm property and the computation of the
Fredholm index for reasons of time. However, we do recall the definitions:
Let 𝒳 , 𝒴 be Banach spaces. A continuous linear map 𝐴 ∶ 𝒳 → 𝒴 is Fredholm
if it has closed image and finite-dimensional kernel and cokernel. In this case,
its (Fredholm) index is defined to be ind 𝐴 ∶= dim ker 𝐴 − dim coker 𝐴. It is
standard that the space of Fredholm linear maps is open in the norm topology, and
the Fredholm index is locally constant on this space.
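
To make the index concrete, here is a standard example that is not taken from the article: the shift operators on the sequence space ℓ²(ℕ). The names 𝑅 and 𝐿 are ours.

```latex
% Assumption (ours): X = Y = \ell^2(\mathbb{N}), with
%   R(x_1, x_2, \dots) = (0, x_1, x_2, \dots)   (right shift),
%   L(x_1, x_2, \dots) = (x_2, x_3, \dots)      (left shift).
\begin{align*}
\ker R &= 0, & \operatorname{coker} R &= \ell^2 / \operatorname{im} R \cong \mathbb{R},
  & \operatorname{ind} R &= 0 - 1 = -1, \\
\ker L &= \mathbb{R}\, e_1, & \operatorname{coker} L &= 0,
  & \operatorname{ind} L &= 1 - 0 = +1 .
\end{align*}
% Both images are closed, so R and L are Fredholm. Moreover L \circ R = \mathrm{id},
% and indeed \operatorname{ind}(L \circ R) = 0 = \operatorname{ind} L + \operatorname{ind} R.
```

Neither operator is invertible, yet both are Fredholm, which shows that the index records how far a map is from being an isomorphism.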
Let 𝒳 , 𝒴 be Banach manifolds. A 𝐶 1 map 𝐹 ∶ 𝒳 → 𝒴 is Fredholm if
𝐷𝐹𝑥 ∶ 𝑇𝑥 𝒳 → 𝑇𝐹 (𝑥) 𝒴 is Fredholm for all 𝑥 ∈ 𝒳 . If 𝒳 is connected, then by
continuity, ind 𝐷𝐹𝑥 is independent of 𝑥 ∈ 𝒳 , so 𝐹 has a well-defined Fredholm
index.

The Sard–Smale theorem


In 1965, Smale [Sma65] proved the following generalization of Sard’s theorem:
Theorem 2.3 (Sard–Smale). Let 𝒳 , 𝒴 be 𝐶 𝑘 Banach manifolds with 𝒳 separable,
𝐹 ∶ 𝒳 → 𝒴 a 𝐶 𝑘 Fredholm map of index 𝑚, where 𝑘 ≥ max(1, 𝑚 + 1). Then a
generic 𝑦 ∈ 𝒴 is a regular value of 𝐹 , so that 𝐹 −1 (𝑦) is a 𝐶 𝑘 submanifold of 𝒳
of dimension 𝑚.
The proof is by reducing to the finite-dimensional case using the Fredholm
property. To describe this reduction, we introduce the following notion.
Let 𝒳 , 𝒴 be Banach spaces, 𝐴 ∶ 𝒳 → 𝒴 a continuous linear map. A pseudo-
inverse of 𝐴 is a continuous linear map 𝐵 ∶ 𝒴 → 𝒳 such that 𝐴𝐵𝐴 = 𝐴, 𝐵𝐴𝐵 =
𝐵. In this case, we have 𝒳 = 𝒳0 ⊕ 𝒳1 , 𝒴 = 𝒴0 ⊕ 𝒴1 , where 𝒳0 ∶= ker 𝐴,
𝒳1 ∶= im 𝐵, 𝒴0 ∶= ker 𝐵, 𝒴1 ∶= im 𝐴. With respect to these splittings,

\[ A = \begin{pmatrix} 0 & 0 \\ 0 & A_1 \end{pmatrix}, \qquad
   B = \begin{pmatrix} 0 & 0 \\ 0 & B_1 \end{pmatrix}, \]

where 𝐴1 ∶= 𝐴|𝒳1 and 𝐵1 ∶= 𝐵|𝒴1 are mutual inverses.


It is easy to see that 𝐴 has a pseudoinverse if and only if 𝐴 has closed im-
age, ker 𝐴 has a complement in 𝒳 , and im 𝐴 has a complement in 𝒴 . In fact, if
𝒳 = ker 𝐴 ⊕ 𝒳1 , 𝒴 = 𝒴0 ⊕ im 𝐴, we simply define 𝐵 by 𝐵|𝒴0 = 0, 𝐵|im 𝐴 =
(𝐴|𝒳1 )−1 . Since closed subspaces of finite dimension or codimension are always
complemented, linear Fredholm maps admit pseudoinverses.
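
A small finite-dimensional instance may make the four subspaces easier to picture. The matrices below are our own choice and serve only as an illustration.

```latex
% Assumption (ours): X = Y = \mathbb{R}^2 with standard basis e_1, e_2.
\[
A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad
AB = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
BA = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.
\]
% Hence ABA = A and BAB = B, and
\[
\mathcal{X}_0 = \ker A = \mathbb{R} e_1, \quad
\mathcal{X}_1 = \operatorname{im} B = \mathbb{R} e_2, \quad
\mathcal{Y}_0 = \ker B = \mathbb{R} e_2, \quad
\mathcal{Y}_1 = \operatorname{im} A = \mathbb{R} e_1,
\]
% with A_1 : e_2 \mapsto e_1 and B_1 : e_1 \mapsto e_2 mutually inverse.
```

Note that 𝐴𝐵 and 𝐵𝐴 are the projections onto im 𝐴 and im 𝐵, respectively; this holds for every pseudoinverse.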
Lemma 2.4. Let 𝒳 , 𝒴 be Banach spaces, 𝑈 ⊂ 𝒳 an open neighborhood of 0,
𝐹 ∶ 𝑈 → 𝒴 a 𝐶 𝑘 map with 𝐹 (0) = 0. Suppose 𝐴 ∶= 𝐷𝐹0 has a pseudoinverse 𝐵.
Then there is an open neighborhood 𝑉 of 0 in 𝒳 , a 𝐶 𝑘 diffeomorphism Φ ∶ 𝑉 →
Φ(𝑉 ) onto an open set Φ(𝑉 ) ⊂ 𝑈 , and a 𝐶 𝑘 map 𝐹0 ∶ 𝑉 → 𝒴0 = ker 𝐵 such
that (𝐹 ∘ Φ)(𝑥) = 𝐹0 (𝑥) + 𝐴𝑥 for 𝑥 ∈ 𝑉 , and Φ(0) = 0, 𝐷Φ0 = id𝒳 , 𝐹0 (0) = 0,
(𝐷𝐹0 )0 = 0.

In other words, after changing the chart on the domain, 𝐹 takes the form (𝐹0 , 𝐴)
with respect to the splitting 𝒴 ≅ 𝒴0 ⊕ 𝒴1 .

Proof of lemma. We apply the inverse mapping theorem to Ψ ∶ 𝑈 → 𝒳 , 𝑥 ↦
𝑥 + 𝐵(𝐹 (𝑥) − 𝐴𝑥) at 0. Since Ψ(0) = 0, 𝐷Ψ0 = id𝒳 , there is an open neighborhood
𝑊 of 0 such that Ψ ∶ 𝑊 → Ψ(𝑊 ) is a 𝐶 𝑘 diffeomorphism. Let 𝑉 ∶= Ψ(𝑊 ),
Φ ∶= Ψ−1 ∶ 𝑉 → 𝑊 , 𝐹0 ∶= (id𝒴 − 𝐴𝐵)(𝐹 ∘ Φ). Since 𝐴Ψ = 𝐴 + 𝐴𝐵𝐹 − 𝐴𝐵𝐴 =
𝐴𝐵𝐹 on 𝑊 , we have 𝐴 = 𝐴𝐵𝐹 ∘ Φ on 𝑉 , so 𝐹 ∘ Φ = 𝐹0 + 𝐴𝐵𝐹 ∘ Φ = 𝐹0 + 𝐴
on 𝑉 . The rest is easy to verify. ◻

We now commence the proof proper of the Sard–Smale theorem.

Step 1. Any point in 𝒳 has a neighborhood 𝑉 such that the set of regular values
of 𝐹 |𝑉 is dense in 𝒴 .

Proof. Working in local charts, we are in the situation of the lemma. Thus we may
assume 𝐹 = (𝐹0 , 𝐴) on 𝑉 with respect to 𝒴 = 𝒴0 ⊕ 𝒴1 , where we retain the
notation above. The equation 𝐹 (𝑥) = 𝑦 becomes 𝑦0 = 𝐹0 (𝑥0 , 𝑥1 ), 𝑦1 = 𝐴1 𝑥1 for
𝑥 = 𝑥0 +𝑥1 ∈ 𝒳0 ⊕𝒳1 , 𝑦 = 𝑦0 +𝑦1 ∈ 𝒴0 ⊕𝒴1 . Thus 𝑦 is a regular value for 𝐹 |𝑉 if
and only if 𝑦0 is a regular value of 𝑥0 ↦ 𝐹0 (𝑥0 , 𝐵1 𝑦1 ), which is a finite-dimensional
map. Then the claim follows from Sard’s theorem for 𝐶 𝑘 maps. ◻

Note that we cannot conclude that the set of regular values of 𝐹 |𝑉 is residual,
since Sard’s theorem only implies that the set of regular values of 𝐹 |𝑉 is residual
on each section 𝒴0 ⊕ {𝑦1 }.

Step 2. With 𝑉 constructed in Step 1, the set of regular values of 𝐹 |𝐾 (see footnote 2) is open for
any closed set 𝐾 ⊂ 𝑉 .

Proof. We may assume 𝑉 is bounded. Let (𝑥𝑛 )𝑛 ⊂ 𝐾 be a sequence of critical
points of 𝐹 with 𝑦𝑛 ∶= 𝐹 (𝑥𝑛 ) → 𝑦 ∈ 𝒴 . Write 𝑥𝑛 = 𝑥𝑛,0 + 𝑥𝑛,1 ∈ 𝒳0 ⊕ 𝒳1 . Then 𝑥𝑛,1 =
𝐵𝐴𝑥𝑛 = 𝐵𝑦𝑛 → 𝐵𝑦, since 𝐵𝐹0 = 0. Since (𝑥𝑛 )𝑛 is bounded, so is (𝑥𝑛,0 )𝑛 . Since 𝒳0
is finite-dimensional, passing to a subsequence, we may assume 𝑥𝑛,0 → 𝑥0 ∈ 𝒳0 .
Then 𝑥𝑛 → 𝑥 ∶= 𝑥0 + 𝐵𝑦, and 𝑥 ∈ 𝐾 since 𝐾 is closed. Since the set of surjective continuous linear maps
between Banach spaces is open in the norm topology, 𝐷𝐹𝑥 = lim 𝐷𝐹𝑥𝑛 cannot be
surjective, so 𝑦 = 𝐹 (𝑥) is a critical value. Hence the set of critical values of 𝐹 |𝐾 is closed. ◻

Step 3. Conclusion of the proof.

Proof of Sard–Smale. We have proved that any point in 𝒳 has a neighborhood 𝑉
such that the set of regular values of 𝐹 |𝐾 is dense and open for any closed set
𝐾 ⊂ 𝑉 . Since 𝒳 is separable, it can be covered by countably many such open sets. But
the set of regular values of 𝐹 is the intersection of those of 𝐹 |𝑉 as 𝑉 ranges over
this countable cover. ◻
Footnote 2. We say 𝑦 ∈ 𝒴 is a regular value of 𝐹 |𝑉 if 𝐷𝐹𝑥 ∶ 𝑇𝑥 𝒳 → 𝑇𝑦 𝒴 is surjective for all 𝑥 ∈ 𝑉 ∩ 𝐹 −1 (𝑦).
A general transversality theorem


Instead of applying the Sard–Smale theorem directly, we shall streamline our
proofs using the following general transversality theorem:

Theorem 2.5 (General transversality theorem). Let 𝒳 , 𝒴 be separable Banach
manifolds, ℰ → 𝒳 × 𝒴 a Banach vector bundle, 𝒮 ∶ 𝒳 × 𝒴 → ℰ a section.
Suppose they are sufficiently smooth and for all (𝑥, 𝑦) ∈ 𝒮 −1 (0),

(a) 𝐷1 𝒮(𝑥,𝑦) ∶ 𝑇𝑥 𝒳 → ℰ(𝑥,𝑦) is Fredholm with index 𝑚;

(b) 𝐷𝒮(𝑥,𝑦) ∶ 𝑇𝑥 𝒳 × 𝑇𝑦 𝒴 → ℰ(𝑥,𝑦) (see footnote 3) is surjective.

Then for a generic 𝑦 ∈ 𝒴 , 𝒮 −1 (0) ∩ (𝒳 × {𝑦}) is an appropriately smooth submanifold
of dimension 𝑚, and 𝐷1 𝒮(𝑥,𝑦) ∶ 𝑇𝑥 𝒳 → ℰ(𝑥,𝑦) is surjective for all
𝑥 ∈ 𝒳 with (𝑥, 𝑦) ∈ 𝒮 −1 (0).

In applications, 𝒮 is usually a nonlinear differential operator on the space 𝒳
that depends on an auxiliary structure given by elements of 𝒴 , and the moduli
space in question is the solution space of 𝒮 = 0. The desired transversality theo-
rem is that the moduli space is a smooth manifold for a generic auxiliary structure.
Incidentally, 𝒮 −1 (0) is sometimes called the universal moduli space in the litera-
ture.

Proof. By the implicit mapping theorem, (b) implies 𝒮 −1 (0) is a Banach manifold.
We apply the Sard–Smale theorem to the projection 𝜋 ∶ 𝒮 −1 (0) → 𝒴 . It suffices
to show 𝜋 is Fredholm with index 𝑚. For (𝑥, 𝑦) ∈ 𝒮 −1 (0), since 𝑇(𝑥,𝑦) 𝒮 −1 (0) =
{(𝑣, 𝑤) ∈ 𝑇𝑥 𝒳 × 𝑇𝑦 𝒴 ∣ 𝐷1 𝒮(𝑥,𝑦) (𝑣) + 𝐷2 𝒮(𝑥,𝑦) (𝑤) = 0}, 𝐷𝜋(𝑥,𝑦) ∶ (𝑣, 𝑤) ↦ 𝑤,
we see that ker 𝐷𝜋(𝑥,𝑦) = ker 𝐷1 𝒮(𝑥,𝑦) , im 𝐷𝜋(𝑥,𝑦) = (𝐷2 𝒮(𝑥,𝑦) )−1 (im 𝐷1 𝒮(𝑥,𝑦) ) is
closed, and 𝐷2 𝒮(𝑥,𝑦) induces an isomorphism coker 𝐷𝜋(𝑥,𝑦) → coker 𝐷1 𝒮(𝑥,𝑦) , so
𝐷𝜋(𝑥,𝑦) is Fredholm with ind 𝐷𝜋(𝑥,𝑦) = ind 𝐷1 𝒮(𝑥,𝑦) by (a). For the last conclusion,
𝑇𝑥 𝒳 + 𝑇(𝑥,𝑦) 𝒮 −1 (0) = 𝑇(𝑥,𝑦) (𝒳 × 𝒴 ) since 𝑦 is a regular value of 𝜋, but 𝐷𝒮(𝑥,𝑦)
vanishes on 𝑇(𝑥,𝑦) 𝒮 −1 (0), so im 𝐷1 𝒮(𝑥,𝑦) = im 𝐷𝒮(𝑥,𝑦) = ℰ(𝑥,𝑦) . ◻

Remark 2.6. Note that in the finite-dimensional case, this reduces to the familiar
parametric transversality theorem with 𝒴 the space of parameters that provides
room for perturbation.
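
A toy finite-dimensional instance shows what the theorem asserts. The choice 𝒳 = 𝒴 = ℝ, ℰ the trivial line bundle, and 𝒮 (𝑥, 𝑦) = 𝑥² − 𝑦 is ours and is meant only as a sketch.

```latex
% Assumption (ours): X = Y = \mathbb{R}, E = trivial line bundle, S(x, y) = x^2 - y.
\begin{align*}
\mathcal{S}^{-1}(0) &= \{(x, x^2)\} \quad \text{(a parabola)}, \\
D_1\mathcal{S}_{(x,y)} &= 2x \quad \text{(Fredholm of index } m = 0\text{)}, \\
D\mathcal{S}_{(x,y)}(v, w) &= 2x\,v - w \quad \text{(always surjective)} .
\end{align*}
% The projection \pi(x, y) = y on the parabola has the single critical value y = 0.
% For y > 0 the slice is \{\pm\sqrt{y}\}, a 0-manifold on which D_1 S = \pm 2\sqrt{y} \neq 0;
% for y < 0 the slice is empty; for y = 0 the slice is \{0\} but D_1 S = 0 is not surjective.
```

Here the generic parameters are 𝑦 ≠ 0, an open dense set, and the exceptional parameter 𝑦 = 0 is exactly where the slice fails to be cut out transversally.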

We have made a deliberate effort to recast the proofs of transversality theorems
into the following four-step outline:

1. Setting the stage.
Specify 𝒳 , 𝒴 , ℰ , 𝒮 and their Banach manifold structures.
Footnote 3. More precisely, this is the differential 𝐷𝒮(𝑥,𝑦) followed by the vertical projection 𝑇𝒮 (𝑥,𝑦) ℰ → ℰ(𝑥,𝑦) .
This is possible since 𝑇𝒮 (𝑥,𝑦) ℰ ≅ 𝑇(𝑥,𝑦) (𝒳 × 𝒴 ) ⊕ ℰ(𝑥,𝑦) splits naturally where 𝒮 (𝑥, 𝑦) = 0.
2. Fredholm property.
Check that 𝐷1 𝒮(𝑥,𝑦) ∶ 𝑇𝑥 𝒳 → ℰ(𝑥,𝑦) is Fredholm.

3. Regular value.
Check that 𝐷𝒮(𝑥,𝑦) ∶ 𝑇𝑥 𝒳 × 𝑇𝑦 𝒴 → ℰ(𝑥,𝑦) is surjective.

4. Passage to 𝐶 ∞ .
See the next subsection.

Remarks on smoothness
So far we have been solely concerned with Banach manifolds. However, many
spaces in geometry do not admit a natural Banach structure. This includes the
space of smooth objects or objects on noncompact manifolds, which are merely
Fréchet manifolds. Fortunately, it is often possible to remedy this problem. We
shall presently describe two ways to pass to smooth objects, i.e., to prove transver-
sality theorems of the form “For a generic smooth ..., we have ...”
The first approach is due to Clifford H. Taubes. It is based on the observation
that the set of good 𝐶 ℓ objects is open. Sometimes further modification is needed.
Since we shall follow this approach here, the reader will see this in action.
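The second approach is due to Andreas Floer. Instead of 𝐶 ℓ objects, one considers,
for a sequence (𝜀ℓ )ℓ ⊂ ℝ>0 , the Banach space 𝐶𝜀∞ of those 𝐶 ∞ objects for which the norm
\[ \|\cdot\|_{C^\infty_\varepsilon} := \sum_{\ell=0}^{\infty} \varepsilon_\ell \, \|\cdot\|_{C^\ell} \]
is finite.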

This approach is adopted by, e.g., [Sch93].

3 Example: Classical Morse theory

Genericity of Morse functions


As a baby example, we prove the classical result that Morse functions are generic:

Theorem 3.1. Let 𝑀 be a compact smooth manifold. Then a generic smooth func-
tion on 𝑀 is Morse.

Proof. We apply the general transversality theorem with

𝒳 : the manifold 𝑀;

𝒴 : the Banach space 𝐶 𝑘 (𝑀) of 𝐶 𝑘 functions on 𝑀;

ℰ : the pullback 𝜋 ∗ 𝑇 ∗ 𝑀 via the projection 𝜋 ∶ 𝑀 × 𝐶 𝑘 (𝑀) → 𝑀;

𝒮 : the section 𝑀 × 𝐶 𝑘 (𝑀) → 𝜋 ∗ 𝑇 ∗ 𝑀, (𝑥, 𝑓 ) ↦ 𝑑𝑓𝑥 .


Here 𝑘 ≥ 2. For (𝑥, 𝑓 ) ∈ 𝒮 −1 (0), 𝐷2 𝒮(𝑥,𝑓 ) (ℎ) = 𝑑ℎ𝑥 , 𝐷1 𝒮(𝑥,𝑓 ) (𝑣) = ∇𝑣 𝑑𝑓 ,
where ∇ is any fixed connection on 𝑇 ∗ 𝑀. Note that the latter is well-defined since
𝑑𝑓𝑥 = 0. The assumptions are trivial to verify: (a) 𝐷1 𝒮(𝑥,𝑓 ) is Fredholm with index 0
since it is a map between spaces of the same finite dimension; (b) 𝐷2 𝒮(𝑥,𝑓 ) , and hence
𝐷𝒮(𝑥,𝑓 ) , is already surjective. Thus the general transversality theorem implies that for a generic
𝑓 ∈ 𝐶 𝑘 (𝑀), 𝑣 ↦ ∇𝑣 𝑑𝑓 is surjective whenever 𝑑𝑓𝑥 = 0, i.e., 𝑓 is Morse.
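To pass to $C^\infty$, we use the argument of Taubes. Let $C^k_{\mathrm{M}}(M)$ be the space of
$C^k$ Morse functions on $M$. We have proved that $C^k_{\mathrm{M}}(M)$ is dense in $C^k(M)$. Since
Morse functions are stable, $C^k_{\mathrm{M}}(M)$ is open in $C^k(M)$, and hence $C^k_{\mathrm{M}}(M) \cap C^\infty(M)$
is open in $C^\infty(M)$. It remains to show that $C^k_{\mathrm{M}}(M) \cap C^\infty(M)$ is dense in $C^\infty(M)$, so that
$C^\infty_{\mathrm{M}}(M) = \bigcap_{k=2}^{\infty} \big( C^k_{\mathrm{M}}(M) \cap C^\infty(M) \big)$ is
residual. This is done by a $1/k$-argument: Let $f \in C^\infty(M)$. For each $k$, take
$f_k \in C^k_{\mathrm{M}}(M)$ with $\|f_k - f\|_{C^k} \le 1/k$, take $\delta_k > 0$ such that $\|\tilde f - f_k\|_{C^k} < \delta_k$
implies $\tilde f \in C^k_{\mathrm{M}}(M)$, then take $\tilde f_k \in C^\infty(M)$ with $\|\tilde f_k - f_k\|_{C^k} < \min(1/k, \delta_k)$,
which is possible by the Whitney approximation theorem. Then $\tilde f_k \in C^\infty_{\mathrm{M}}(M)$ and
$\tilde f_k \to f$ in $C^\infty(M)$. ◻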

Moduli spaces of flow lines


Let 𝑀 be a smooth manifold, 𝑓 a 𝐶 2 Morse function on 𝑀. We denote by Crit(𝑓 )
the set of critical points of 𝑓 , and for 𝑝 ∈ Crit(𝑓 ), by ind(𝑝) = ind𝑓 (𝑝) the Morse
index of 𝑓 at 𝑝. Fix a Riemannian metric 𝑔 on 𝑀. Consider the negative gradient
flow generated by 𝑓 . A flow line for this flow is a curve 𝛾 ∶ ℝ → 𝑀 such that
𝛾 ′ = −∇𝑓 ∘ 𝛾. For 𝑝, 𝑞 ∈ Crit(𝑓 ), the (parametrized) moduli space of flow lines
from 𝑝 to 𝑞 for the pair (𝑓 , 𝑔), denoted by ℳ(𝑝, 𝑞, 𝑓 ; 𝑔), is defined to be the space
of all flow lines 𝛾 such that lim𝑡→−∞ 𝛾(𝑡) = 𝑝, lim𝑡→∞ 𝛾(𝑡) = 𝑞.

Remark 3.2. Once we know it is a smooth manifold, it is not hard to show that
𝛾 ↦ 𝛾(0) gives a diffeomorphism between the moduli space considered here and the
usual moduli space defined as the intersection of stable and unstable manifolds.

This subsection is devoted to proving the following theorem:

Theorem 3.3. Let 𝑀 be a compact smooth manifold, 𝑓 a 𝐶 𝑘+1 Morse function
on 𝑀, 𝑝, 𝑞 ∈ Crit(𝑓 ). Then for a generic Riemannian metric 𝑔 on 𝑀, the moduli
space ℳ(𝑝, 𝑞, 𝑓 ; 𝑔) is a smooth manifold of dimension ind(𝑝) − ind(𝑞).

If 𝑝 = 𝑞, then ℳ(𝑝, 𝑝, 𝑓 ; 𝑔) is a singleton. Assume 𝑝 ≠ 𝑞.

Setting the stage


We shall apply the general transversality theorem with

𝒳 : the space 𝒫 1,2 of 𝑊 1,2 curves on 𝑀 from 𝑝 to 𝑞;

𝒴 : the space 𝒢 ℓ of 𝐶 ℓ Riemannian metrics on 𝑀;


ℰ : the bundle ℰ 0,2 → 𝒫 1,2 × 𝒢 ℓ with fiber $\mathcal{E}^{0,2}_{(\gamma,g)} = L^2(\gamma^* TM)$;

𝒮 : the section 𝒫 1,2 × 𝒢 ℓ → ℰ 0,2 , (𝛾, 𝑔) ↦ 𝛾 ′ + ∇𝑓 ∘ 𝛾.


Here ℓ is sufficiently large. Clearly ℳ(𝑝, 𝑞, 𝑓 ; 𝑔) = 𝒮 −1 (0) ∩ (𝒫 1,2 × {𝑔}).
Let us explain the Banach manifold structures on these spaces. Since 𝒢 ℓ is an
open subset of the Banach space 𝐶 ℓ (Sym2 𝑇 ∗ 𝑀), where Sym2 𝑇 ∗ 𝑀 is the vector
bundle of symmetric (0, 2)-tensors on 𝑀, it is a smooth Banach manifold with
𝑇𝑔 𝒢 ℓ = 𝐶 ℓ (Sym2 𝑇 ∗ 𝑀). The manifold structure on 𝒫 1,2 is more complicated.
Recall that a 𝑊 1,2 map on ℝ is continuous by the Sobolev embedding theorem.
Thus to be more precise, we define
𝒫 1,2 ∶= {𝛾 ∈ 𝑊 1,2 (ℝ, 𝑀) ∩ 𝐶 0 (ℝ, 𝑀) ∶ 𝛾(−∞) = 𝑝, 𝛾(∞) = 𝑞},
equipped with the 𝑊 1,2 topology. One can show that 𝒫 1,2 is a smooth Banach
manifold with 𝑇𝛾 𝒫 1,2 = 𝑊 1,2 (𝛾 ∗ 𝑇 𝑀). Charts on 𝒫 1,2 are given by 𝑇𝛾 𝒫 1,2 →
𝒫 1,2 , 𝜉 ↦ exp𝛾 𝜉, where the exponential map is with respect to some fixed Rie-
mannian metric on 𝑀. One can also show that ℰ 0,2 → 𝒫 1,2 × 𝒢 ℓ is a smooth
Banach vector bundle and 𝒮 is a smooth section. See [Sch93, Appendix A].
For a flow line 𝛾, write
𝐷𝛾 ∶= 𝐷1 𝒮(𝛾,𝑔) ∶ 𝑊 1,2 (𝛾 ∗ 𝑇 𝑀) → 𝐿2 (𝛾 ∗ 𝑇 𝑀)
and its formal adjoint
𝐷𝛾∗ ∶ 𝑊 1,2 (𝛾 ∗ 𝑇 𝑀) → 𝐿2 (𝛾 ∗ 𝑇 𝑀).
By definition,
\[ D_h \nabla f := D_2\mathcal{S}_{(\gamma,g)}(h) = \frac{\partial}{\partial t}\Big|_{t=0} \nabla^{g+th} f, \]
where ∇𝑔 denotes gradient with respect to 𝑔. To compute it, applying $\frac{\partial}{\partial t}\big|_{t=0}$ to
(𝑔 + 𝑡ℎ)(∇𝑔+𝑡ℎ 𝑓 , 𝜉) = 𝜉𝑓 gives ⟨𝐷ℎ ∇𝑓 , 𝜉⟩ = −ℎ(∇𝑓 , 𝜉) for any vector field 𝜉.
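
For readers who want the intermediate steps, the computation can be spelled out as follows. The abbreviations $g_t$ and $X_t$ and the local-coordinate formula at the end are our additions.

```latex
% Notation (ours): g_t := g + t h,  X_t := \nabla^{g_t} f,  \dot X_0 := \partial_t X_t |_{t=0}.
\begin{align*}
g_t(X_t, \xi) &= df(\xi) = \xi f \qquad \text{for all } t, \\
0 = \frac{\partial}{\partial t}\Big|_{t=0} g_t(X_t, \xi)
  &= h(X_0, \xi) + g(\dot X_0, \xi), \\
\langle D_h \nabla f, \xi \rangle = g(\dot X_0, \xi) &= -\,h(\nabla f, \xi).
\end{align*}
% In local coordinates, with (\nabla f)^k = g^{kl} \partial_l f, this reads
\[
(D_h \nabla f)^i = -\, g^{ij}\, h_{jk}\, g^{kl}\, \partial_l f .
\]
```

In particular 𝐷ℎ ∇𝑓 depends on ℎ only pointwise, which is what allows local perturbations of the metric by cutoff functions.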

Fredholm property

Proposition 3.4. For (𝛾, 𝑔) ∈ 𝒮 −1 (0),


𝐷1 𝒮(𝛾,𝑔) = 𝐷𝛾 ∶ 𝑊 1,2 (𝛾 ∗ 𝑇 𝑀) → 𝐿2 (𝛾 ∗ 𝑇 𝑀)
is Fredholm with index ind(𝑝) − ind(𝑞).
This follows from the theory of spectral flows, cf. [Sch93, Section 2.2].
Sketch of proof. Since 𝛾 is an embedding and ℝ is contractible, 𝛾 ∗ 𝑇 𝑀 is a trivial
bundle, so 𝐷1 𝒮(𝛾,𝑔) is similar to an operator of the form 𝜕𝑡 + 𝐴 ∶ 𝑊 1,2 (ℝ, ℝ𝑛 ) →
𝐿2 (ℝ, ℝ𝑛 ) where 𝐴 ∈ 𝐶𝑏0 (ℝ, End(ℝ𝑛 )). One checks that such operators are all
Fredholm. Since the space of such operators with fixed 𝐴(±∞) is convex, the
Fredholm index is constant on it, i.e., depends only on 𝐴(±∞). Thus to compute
ind(𝜕𝑡 +𝐴), one takes a nice 𝐴 such that one can write down explicitly the solutions
to the ODE (𝜕𝑡 + 𝐴)𝑠 = 𝑠′ , so that ind(𝜕𝑡 + 𝐴) is directly computable. ◻
Regular value

Lemma 3.5. For 𝜂 ∈ ℝ𝑛 ⧵ {0}, the map Sym2 ℝ𝑛 → ℝ𝑛 , ℎ ↦ ℎ𝜂 = (𝜂 ⊺ ℎ)⊺ (see footnote 4) is surjective.

Proof. Conjugating by an orthogonal matrix, we may assume 𝜂 = (𝑡, 0, … , 0)⊺
with 𝑡 > 0, in which case it is obvious. ◻
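
Alternatively, one can write down a preimage explicitly. The formula below is our own and is offered as one possible choice rather than the argument used in the article; 𝑣 denotes an arbitrary target vector.

```latex
% Assumption (ours): v \in \mathbb{R}^n is the prescribed value of h\eta.
\[
h := \frac{v\,\eta^{\top} + \eta\, v^{\top}}{|\eta|^{2}}
     \;-\; \frac{\langle \eta, v\rangle}{|\eta|^{4}}\, \eta\,\eta^{\top}
\;\in\; \operatorname{Sym}^2 \mathbb{R}^n ,
\]
\[
h\eta = v + \frac{\langle v, \eta\rangle}{|\eta|^{2}}\,\eta
          - \frac{\langle \eta, v\rangle}{|\eta|^{2}}\,\eta = v .
\]
```

The correction term is needed because the symmetrization of 𝑣𝜂 ⊺ alone overshoots by a multiple of 𝜂.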

Proposition 3.6. For (𝛾, 𝑔) ∈ 𝒮 −1 (0),

𝐷𝒮(𝛾,𝑔) ∶ 𝑊 1,2 (𝛾 ∗ 𝑇 𝑀) × 𝐶 ℓ (Sym2 𝑇 ∗ 𝑀) → 𝐿2 (𝛾 ∗ 𝑇 𝑀)

is surjective.

Proof. Since 𝐷1 𝒮(𝛾,𝑔) is Fredholm, im 𝐷1 𝒮(𝛾,𝑔) and hence im 𝐷𝒮(𝛾,𝑔) is closed and
has finite codimension. Thus it suffices to show (im 𝐷𝒮(𝛾,𝑔) )⟂ = 0. Suppose 𝜂 ∈
𝐿2 (𝛾 ∗ 𝑇 𝑀) and
\[ \int_{\mathbb{R}} \langle \eta, D_\gamma \xi \rangle = 0, \qquad \forall\, \xi \in W^{1,2}(\gamma^* TM), \]
\[ \int_{\mathbb{R}} \langle \eta, D_h \nabla^g f \rangle = 0, \qquad \forall\, h \in C^\ell(\operatorname{Sym}^2 T^* M). \]
By elliptic regularity for 𝐷𝛾∗ (see footnote 5), the first equation implies 𝜂 ∈ 𝑊 1,2 . Thus 𝜂 is
continuous by the Sobolev embedding theorem. If 𝜂(𝑡) ≠ 0 for some 𝑡 ∈ ℝ, then, since
∇𝑓 (𝛾(𝑡)) ≠ 0 and ⟨𝐷ℎ ∇𝑓 , 𝜂⟩ = −ℎ(∇𝑓 , 𝜂), by
the lemma there exists $h \in \operatorname{Sym}^2 T^*_{\gamma(t)} M$ with ⟨𝜂(𝑡), 𝐷ℎ ∇𝑓 (𝛾(𝑡))⟩ > 0. Extend ℎ to
a section in 𝐶 ℓ (Sym2 𝑇 ∗ 𝑀). By continuity, ⟨𝜂, 𝐷ℎ ∇𝑓 ⟩ > 0 in some neighborhood
𝐼 of 𝑡. Since 𝛾 is a flow line between distinct critical points, it is an embedding. If
𝜙 is a smooth nonnegative cutoff function with 𝜙(𝛾(𝑡)) > 0, supported in an open set 𝑈 ⊂ 𝑀 with 𝛾 −1 (𝑈 ) ⊂ 𝐼,
then ⟨𝜂, 𝐷𝜙ℎ ∇𝑓 ⟩ ≥ 0 on ℝ, > 0 near 𝑡. This contradicts the second equation. ◻

Passage to 𝑪 ∞
We omit this.

4 Example: Pseudoholomorphic curves


Pseudoholomorphic curves were introduced by M. Gromov in 1985 and soon be-
came an indispensable tool in modern symplectic geometry. In this section, we
prove the fundamental transversality theorem that for a generic almost complex
structure on a compact symplectic manifold, the space of simple pseudoholomor-
phic curves has the structure of a finite-dimensional smooth manifold.
Footnote 4. We view elements in ℝ𝑛 as 𝑛 × 1 column vectors, elements in Sym2 ℝ𝑛 as symmetric 𝑛 × 𝑛 matrices, and ⊺
denotes matrix transposition.
Footnote 5. This is an ODE, so it is straightforward to check this regularity.
Background material
Pseudoholomorphic curves
Let (𝑀, 𝐽 ) be an almost complex manifold, (Σ, 𝑗) a Riemann surface. A pseudo-
holomorphic or 𝐽 -holomorphic curve on 𝑀 parametrized by Σ is a smooth map
𝑢 ∶ Σ → 𝑀 such that 𝑑𝑢 is complex linear, i.e., 𝐽 ∘ 𝑑𝑢 = 𝑑𝑢 ∘ 𝑗, or equivalently $\bar\partial_J u = 0$, where
\[ \bar\partial_J u := \tfrac{1}{2}\,(du + J \circ du \circ j) \in \Omega^{0,1}(\Sigma, u^* TM). \]
We shall always consider the case that Σ is compact.
A 𝐽 -holomorphic curve 𝑢 ∶ Σ → 𝑀 is multiply covered if it factors nontrivially
through another Riemann surface, i.e., if there is a compact Riemann surface Σ′ , a
holomorphic branched covering 𝜑 ∶ Σ → Σ′ with deg 𝜑 > 1, and a 𝐽 -holomorphic
curve 𝑢′ ∶ Σ′ → 𝑀 such that 𝑢 = 𝑢′ ∘ 𝜑. Otherwise 𝑢 is simple.
A smooth map 𝑢 ∶ Σ → 𝑀 is somewhere injective if it has an injective point, i.e., a
point 𝑧 ∈ Σ such that 𝑑𝑢(𝑧) ≠ 0 and 𝑢−1 (𝑢(𝑧)) = {𝑧}.

Proposition 4.1. Let 𝑀 be a smooth manifold, 𝐽 a 𝐶 2 almost complex structure
on 𝑀, Σ a compact Riemann surface, 𝑢 ∶ Σ → 𝑀 a 𝐽 -holomorphic curve. Then 𝑢
is simple if and only if it is somewhere injective, in which case the complement of
the set of injective points is countable and can only accumulate at critical points
of 𝑢.

See [MS12, Proposition 2.5.1] for a proof.

The symplectic setting


Let (𝑀, 𝜔) be a symplectic manifold. An almost complex structure 𝐽 on 𝑀 is 𝜔-
compatible if 𝜔(⋅, 𝐽 ⋅) defines a Riemannian metric on 𝑀. Fix such a 𝐽 , a compact
Riemann surface Σ, and a homology class 𝐴 ∈ 𝐻2 (𝑀; ℤ). The moduli space of
𝐽 -holomorphic curves on 𝑀 that are parametrized by Σ and represent 𝐴, denoted
by ℳ(𝐴, Σ; 𝐽 ), is the space of all 𝐽 -holomorphic curves 𝑢 such that [𝑢] = 𝐴. The
subspace of simple curves is denoted by ℳ ∗ (𝐴, Σ; 𝐽 ).

Moduli spaces of 𝐽 ‐holomorphic curves


This subsection is devoted to proving the following theorem:

Theorem 4.2. Let (𝑀, 𝜔) be a compact symplectic 2𝑛-manifold, 𝐴 ∈ 𝐻2 (𝑀; ℤ),
(Σ, 𝑗) a compact Riemann surface. Then for a generic 𝜔-compatible almost complex
structure 𝐽 on 𝑀, the moduli space ℳ ∗ (𝐴, Σ; 𝐽 ) is a smooth manifold of
dimension 𝑛𝜒(Σ) + 2𝑐1 (𝐴), where 𝑐1 (𝐴) denotes the first Chern number.

Setting the stage


We shall apply the general transversality theorem with
𝒳 : the space ℬ 𝑘,𝑝 of somewhere injective 𝑊 𝑘,𝑝 maps Σ → 𝑀 that represent 𝐴;

𝒴 : the space 𝒥 ℓ of 𝐶 ℓ 𝜔-compatible almost complex structures on 𝑀;

ℰ : the bundle ℰ 𝑘−1,𝑝 → ℬ 𝑘,𝑝 × 𝒥 ℓ with fibers $\mathcal{E}^{k-1,p}_{(u,J)} = W^{k-1,p}(\Sigma, \Lambda^{0,1} \otimes_J u^* TM)$,
where Λ0,1 ⊗𝐽 𝑢∗ 𝑇 𝑀 is the bundle of 𝐽 -antilinear 1-forms;

𝒮 : the section ℬ 𝑘,𝑝 × 𝒥 ℓ → ℰ 𝑘−1,𝑝 , $(u, J) \mapsto \bar\partial_J u$.

Here
𝑘 ≥ 2, 𝑝 > 2, ℓ ≥ 𝑘 + 1 + max(1, 𝑛𝜒(Σ) + 2𝑐1 (𝐴) + 1).

Clearly ℳ ∗ (𝐴, Σ; 𝐽 ) = 𝒮 −1 (0) ∩ (ℬ 𝑘,𝑝 × {𝐽 }), since by Proposition 4.1 the simple curves are exactly the somewhere injective ones.

Let us explain the Banach manifold structures on these spaces. It is easy to
see that 𝒥 ℓ is a smooth Banach manifold with 𝑇𝐽 𝒥 ℓ = 𝐶 ℓ (𝑀, End(𝑇 𝑀, 𝐽 , 𝜔)),
where End(𝑇 𝑀, 𝐽 , 𝜔) is the vector bundle of (1, 1)-tensors on 𝑀 that preserve 𝐽
and 𝜔. By the Sobolev embedding theorem, 𝑊 𝑘,𝑝 ⊂ 𝐶 1 on Σ, so ℬ 𝑘,𝑝 is open in
𝑊 𝑘,𝑝 (Σ, 𝑀). Similarly to the space 𝒫 1,2 of 𝑊 1,2 curves discussed in the previous
section, the latter is a smooth Banach manifold with 𝑇𝑢 𝑊 𝑘,𝑝 (Σ, 𝑀) = 𝑇𝑢 ℬ 𝑘,𝑝 =
𝑊 𝑘,𝑝 (Σ, 𝑢∗ 𝑇 𝑀) and charts are given by the exponential map. One can show that
ℰ 𝑘−1,𝑝 → ℬ 𝑘,𝑝 × 𝒥 ℓ is a 𝐶 ℓ−𝑘 bundle and 𝒮 is a 𝐶 ℓ−𝑘 section ([MS12, p.50]).

For a 𝐽 -holomorphic curve 𝑢, write

𝐷𝑢 ∶= 𝐷1 𝒮(𝑢,𝐽 ) ∶ 𝑊 𝑘,𝑝 (Σ, 𝑢∗ 𝑇 𝑀) → 𝑊 𝑘−1,𝑝 (Σ, Λ0,1 ⊗𝐽 𝑢∗ 𝑇 𝑀)

and its formal adjoint
\[ D_u^* : W^{k,p'}(\Sigma, \Lambda^{0,1} \otimes_J u^* TM) \to W^{k-1,p'}(\Sigma, u^* TM). \]

Clearly
\[ D\mathcal{S}_{(u,J)}(\xi, Y) = D_u \xi + \tfrac{1}{2}\, Y(u) \circ du \circ j. \]
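
The second term comes from differentiating in the 𝐽 direction alone. A short sketch, with the path $J_t$ introduced by us for the purpose of the computation:

```latex
% Notation (ours): J_t is a path in \mathcal{J}^\ell with J_0 = J and \partial_t J_t |_{t=0} = Y.
\begin{align*}
D_2\mathcal{S}_{(u,J)}(Y)
  &= \frac{\partial}{\partial t}\Big|_{t=0} \bar\partial_{J_t} u
   = \frac{\partial}{\partial t}\Big|_{t=0} \tfrac{1}{2}\big(du + J_t(u) \circ du \circ j\big)
   = \tfrac{1}{2}\, Y(u) \circ du \circ j, \\
D\mathcal{S}_{(u,J)}(\xi, Y)
  &= D_1\mathcal{S}_{(u,J)}(\xi) + D_2\mathcal{S}_{(u,J)}(Y)
   = D_u \xi + \tfrac{1}{2}\, Y(u) \circ du \circ j .
\end{align*}
```

Only the term containing 𝐽 contributes, because 𝑢 and 𝑗 are held fixed along the path.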

Fredholm property

Proposition 4.3. For (𝑢, 𝐽 ) ∈ 𝒮 −1 (0),

𝐷1 𝒮(𝑢,𝐽 ) = 𝐷𝑢 ∶ 𝑊 𝑘,𝑝 (Σ, 𝑢∗ 𝑇 𝑀) → 𝑊 𝑘−1,𝑝 (Σ, Λ0,1 ⊗𝐽 𝑢∗ 𝑇 𝑀)

is Fredholm with index 𝑛𝜒(Σ) + 2𝑐1 (𝐴).

This follows from the generalized Riemann–Roch theorem for the Cauchy–
Riemann operator 𝐷𝑢 , cf. [MS12, Theorem C.1.10].
Regular value
Lemma 4.4. Let 𝐽0 , 𝜔0 be the standard linear almost complex and symplectic
structures on ℝ2𝑛 . Then End(ℝ2𝑛 , 𝐽0 , 𝜔0 ) acts transitively on ℝ2𝑛 ⧵ {0}. In other
words, for any 𝜉, 𝜂 ∈ ℝ2𝑛 ⧵{0}, there exists 𝑌 ∈ ℝ2𝑛×2𝑛 such that 𝑌 = 𝑌 ⊺ = 𝐽0 𝑌 𝐽0 ,
𝑌 𝜉 = 𝜂.
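Proof. Simply take
\[ Y := \frac{1}{|\xi|^2}\big(\eta\xi^{\top} + \xi\eta^{\top} + J_0(\eta\xi^{\top} + \xi\eta^{\top})J_0\big)
      - \frac{1}{|\xi|^4}\big(\langle\eta,\xi\rangle(\xi\xi^{\top} + J_0\xi\xi^{\top}J_0)
      + \langle\eta, J_0\xi\rangle(J_0\xi\xi^{\top} - \xi\xi^{\top}J_0)\big). \]
◻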
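
The three required properties of 𝑌 can be checked by hand, but a quick exact-arithmetic test is also easy to write. The sketch below uses only the Python standard library; the helper names, the block form [[0, −I], [I, 0]] chosen for 𝐽0 , and the sample vectors are our assumptions, not part of the article.

```python
from fractions import Fraction

def matmul(A, B):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def outer(u, v):
    """Outer product u v^T."""
    return [[a * b for b in v] for a in u]

def apply(A, v):
    """Matrix-vector product A v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lin(*terms):
    """Linear combination of equally sized matrices; terms are (coefficient, matrix)."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(cols)]
            for i in range(rows)]

def standard_J(n):
    """Assumed standard complex structure [[0, -I], [I, 0]] on R^(2n)."""
    J = [[Fraction(0)] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        J[i][n + i] = Fraction(-1)
        J[n + i][i] = Fraction(1)
    return J

def lemma_Y(xi, eta, J):
    """The matrix Y from the proof of Lemma 4.4, built term by term."""
    n2 = dot(xi, xi)                                   # |xi|^2
    S = lin((1, outer(eta, xi)), (1, outer(xi, eta)))  # eta xi^T + xi eta^T
    P = outer(xi, xi)                                  # xi xi^T
    JSJ = matmul(matmul(J, S), J)
    JPJ = matmul(matmul(J, P), J)
    JP, PJ = matmul(J, P), matmul(P, J)
    a = dot(eta, xi)                                   # <eta, xi>
    b = dot(eta, apply(J, xi))                         # <eta, J xi>
    return lin((1 / n2, S), (1 / n2, JSJ),
               (-a / n2**2, P), (-a / n2**2, JPJ),
               (-b / n2**2, JP), (b / n2**2, PJ))

# Sample data in R^4 (arbitrary nonzero vectors chosen for illustration).
J0 = standard_J(2)
xi = [Fraction(v) for v in (1, 2, -1, 3)]
eta = [Fraction(v) for v in (0, -2, 5, 1)]
Y = lemma_Y(xi, eta, J0)
```

Working with `Fraction` rather than floats means the identities 𝑌 𝜉 = 𝜂, 𝑌 = 𝑌 ⊺ and 𝑌 = 𝐽0 𝑌 𝐽0 can be tested with exact equality instead of a tolerance.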

Proposition 4.5. For (𝑢, 𝐽 ) ∈ 𝒮 −1 (0),

𝐷𝒮(𝑢,𝐽 ) ∶ 𝑊 𝑘,𝑝 (Σ, 𝑢∗ 𝑇 𝑀)×𝐶 ℓ (𝑀, End(𝑇 𝑀, 𝐽 , 𝜔)) → 𝑊 𝑘−1,𝑝 (Σ, Λ0,1 ⊗𝐽 𝑢∗ 𝑇 𝑀)

is surjective.
Proof for 𝑘 = 1. Since 𝐷1 𝒮(𝑢,𝐽 ) is Fredholm, im 𝐷1 𝒮(𝑢,𝐽 ) and hence im 𝐷𝒮(𝑢,𝐽 ) is
closed and has finite codimension. Thus by the Hahn–Banach theorem, it suffices
to show (im 𝐷𝒮(𝑢,𝐽 ) )⟂ = 0, where ⟂ denotes its annihilator in the dual space.

Suppose 𝜂 ∈ 𝐿𝑝 (Σ, Λ0,1 ⊗𝐽 𝑢∗ 𝑇 𝑀) and
\[ \int_\Sigma \langle \eta, D_u \xi \rangle = 0, \qquad \forall\, \xi \in W^{1,p}(\Sigma, u^* TM), \]
\[ \int_\Sigma \langle \eta, Y(u) \circ du \circ j \rangle = 0, \qquad \forall\, Y \in C^\ell(M, \operatorname{End}(TM, J, \omega)). \]
By elliptic regularity for 𝐷𝑢∗ ([MS12, Proposition 3.1.11 and Theorem C.2.3]), the
first equation implies 𝜂 ∈ 𝑊 1,𝑝 . Thus 𝜂 is continuous by the Sobolev embedding
theorem. We claim that 𝜂 vanishes at every injective point of 𝑢. Since 𝑢 is simple, the set
of such points is open and dense, so by continuity, the claim implies 𝜂 = 0.
Let 𝑧 be an injective point of 𝑢. If 𝜂(𝑧) ≠ 0, then by the lemma, there exists
𝑌 ∈ End(𝑇𝑢(𝑧) 𝑀, 𝐽𝑢(𝑧) , 𝜔𝑢(𝑧) ) with ⟨𝜂(𝑧), 𝑌 ∘ 𝑑𝑢(𝑧) ∘ 𝑗(𝑧)⟩ > 0. Extend 𝑌 to a
section in 𝐶 ℓ (𝑀, End(𝑇 𝑀, 𝐽 , 𝜔)). By continuity, ⟨𝜂, 𝑌 (𝑢) ∘ 𝑑𝑢 ∘ 𝑗⟩ > 0 in some
neighborhood 𝑉 of 𝑧 in Σ. By injectivity and compactness, take a neighborhood
𝑈 of 𝑢(𝑧) in 𝑀 disjoint from 𝑢(Σ ⧵ 𝑉 ). If 𝜙 is a smooth nonnegative cutoff function supported
in 𝑈 with 𝜙(𝑢(𝑧)) > 0, then ⟨𝜂, (𝜙𝑌 )(𝑢) ∘ 𝑑𝑢 ∘ 𝑗⟩ ≥ 0 on Σ, > 0 near 𝑧. This contradicts the second
equation. ◻
Proof for general 𝑘. Let 𝜂 ∈ 𝑊 𝑘−1,𝑝 (Σ, Λ0,1 ⊗𝐽 𝑢∗ 𝑇 𝑀). By the case 𝑘 = 1, there
exist (𝜉, 𝑌 ) ∈ 𝑊 1,𝑝 (Σ, 𝑢∗ 𝑇 𝑀) × 𝐶 ℓ (𝑀, End(𝑇 𝑀, 𝐽 , 𝜔)) with 𝐷𝒮(𝑢,𝐽 ) (𝜉, 𝑌 ) =
$D_u \xi + \tfrac{1}{2} Y(u) \circ du \circ j = \eta$. Then $D_u \xi = \eta - \tfrac{1}{2} Y(u) \circ du \circ j \in W^{k-1,p}$, since $u \in W^{\ell,p}$
by elliptic regularity for $\bar\partial_J$ ([MS12, Theorem B.4.1]). By elliptic regularity for 𝐷𝑢
([MS12, Theorem C.2.3]), this implies 𝜉 ∈ 𝑊 𝑘,𝑝 . Thus 𝐷𝒮(𝑢,𝐽 ) is surjective. ◻
Passage to 𝑪 ∞
We shall use the argument of Taubes to pass to 𝐶 ∞ almost complex structures.

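Proof of main theorem. From the proof of the general transversality theorem,
$\mathcal{M}^*(A, \Sigma; J)$ is a manifold if $D_u$ is surjective for every simple $J$-holomorphic curve
$u : \Sigma \to M$ representing $A$. Let $\mathcal{J}^\ell_{\mathrm{reg}}$ be the set of such $C^\ell$ almost complex structures. We
have proved that $\mathcal{J}^\ell_{\mathrm{reg}}$ is dense in $\mathcal{J}^\ell$. For $K > 0$, let $\mathcal{J}^\ell_{\mathrm{reg},K}$ be the set of $C^\ell$ almost
complex structures $J$ such that $D_u$ is surjective for any $J$-holomorphic curve
$u : \Sigma \to M$ representing $A$ such that (1) $\|du\|_{L^\infty} \le K$, and (2) there exists $z \in \Sigma$ with
\[ \inf_{w \ne z} \frac{d(u(w), u(z))}{d(w, z)} \ge \frac{1}{K}. \]
Note that (2) implies $u$ is injective at $z$ and hence simple, so $\mathcal{J}^\ell_{\mathrm{reg}} \subset \mathcal{J}^\ell_{\mathrm{reg},K}$.
We claim that $\mathcal{J}^\infty_{\mathrm{reg},K}$ is open and dense in $\mathcal{J}^\infty$, so that
$\mathcal{J}^\infty_{\mathrm{reg}} = \bigcap_{K>0} \mathcal{J}^\infty_{\mathrm{reg},K}$ is residual. (Since these sets decrease as $K$ increases,
it suffices to intersect over positive integers $K$.)

To see that $\mathcal{J}^\infty_{\mathrm{reg},K}$ is open in $\mathcal{J}^\infty$, let $(J_n)_n$, $(u_n)_n$, $(z_n)_n$ be sequences where
$u_n : \Sigma \to M$ is $J_n$-holomorphic with $\|du_n\|_{L^\infty} \le K$,
$\inf_{w \ne z_n} d(u_n(w), u_n(z_n)) / d(w, z_n) \ge 1/K$,
and $D_{u_n}$ not surjective. Suppose $J_n \to J$ in $\mathcal{J}^\infty$. By elliptic regularity estimates
([MS12, Theorem B.4.2]) and compactness, passing to a subsequence, we may
assume $u_n \to u$ in $C^\infty$, $z_n \to z$. Then $u$ is $J$-holomorphic with $\|du\|_{L^\infty} \le K$,
$\inf_{w \ne z} d(u(w), u(z)) / d(w, z) \ge 1/K$, and $D_u = \lim D_{u_n}$ is not surjective, so
$J \notin \mathcal{J}^\infty_{\mathrm{reg},K}$. The same argument shows $\mathcal{J}^\ell_{\mathrm{reg},K}$ is open in $\mathcal{J}^\ell$ for all $\ell$.

That $\mathcal{J}^\infty_{\mathrm{reg},K}$ is dense in $\mathcal{J}^\infty$ is proved by a $1/\ell$-argument as before: Let $J \in \mathcal{J}^\infty$.
For each $\ell$, take $J_\ell \in \mathcal{J}^\ell_{\mathrm{reg}}$ with $\|J_\ell - J\|_{C^\ell} \le 1/\ell$, take $\delta_\ell > 0$ such that
$\|\tilde J - J_\ell\|_{C^\ell} < \delta_\ell$ implies $\tilde J \in \mathcal{J}^\ell_{\mathrm{reg},K}$, then take $\tilde J_\ell \in \mathcal{J}^\infty$ with
$\|\tilde J_\ell - J_\ell\|_{C^\ell} < \min(1/\ell, \delta_\ell)$. Then $\tilde J_\ell \in \mathcal{J}^\infty_{\mathrm{reg},K}$ and $\tilde J_\ell \to J$ in $\mathcal{J}^\infty$. ◻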

Remark 4.6. The smooth structure on ℳ ∗ (𝐴, Σ; 𝐽 ) ⊂ 𝑊 𝑘,𝑝 (Σ, 𝑀) does not de-
pend on 𝑘, 𝑝, since from the proof we see that charts are given by 𝜉 ↦ exp𝑢 𝜉 for a
fixed 𝐽 -holomorphic curve 𝑢, which does not depend on 𝑘, 𝑝.

5 Example: Yang–Mills connections


Mathematical gauge theory was inspired by physics and is now a basic tool in
low-dimensional geometry. In this section, we sketch the proof of the fundamen-
tal transversality theorem that for a generic Riemannian metric on a compact 4-
manifold, the space of irreducible self-dual connections on a fixed principal SU(2)-
bundle has the structure of a smooth 5-manifold. This is a major step in the proof
of the celebrated diagonalizability theorem of S. K. Donaldson.

Background material
The purpose of this subsection is mainly to fix notation.
Connections on principal bundles


Let 𝐺 be a Lie group, 𝔤 its Lie algebra. Let 𝑀 be a smooth manifold, 𝑃 → 𝑀 a
smooth principal 𝐺-bundle. Given an open cover (𝑈𝛼 )𝛼 of 𝑀 on which 𝑃 → 𝑀 is
trivial, a connection on 𝑃 can be described by connection 1-forms 𝐴𝛼 ∈ Ω1 (𝑈𝛼 ; 𝔤).
The difference of two connection 1-forms on 𝑃 is a section ∈ Ω1 (ad 𝑃 ), so the space
of connections on 𝑃 is an affine space modeled on Ω1 (ad 𝑃 ). For a connection 𝐷
on 𝑃 , we denote by 𝐹𝐷 ∈ Ω2 (ad 𝑃 ) the curvature 2-form of 𝐷.

Anti-self-dual connections
Let 𝑀 be a compact smooth 4-manifold, 𝑃 → 𝑀 a principal SU(2)-bundle.
We shall usually identify it with the associated vector bundle 𝑃 ×𝜌 ℂ2 , where 𝜌
is the standard representation of SU(2) on ℂ2 . A connection on 𝑃 is reducible if
𝑃 ×𝜌 ℂ2 splits into the direct sum of two line bundles and the induced connection
on it also splits. Otherwise it is irreducible.
Fix a Riemannian metric 𝑔 on 𝑀. This induces the Hodge star operator
∗ ∶ Λ2 → Λ2 with ∗2 = id. The subbundles Λ2± ∶= ker(∗ ∓ 1) are called the
bundles of self-dual and anti-self-dual forms, respectively. This definition extends to forms with coefficients
in any vector bundle. In particular, a connection 𝐷 on 𝑃 is (anti-)self-dual if
𝐹𝐷 is. Such connections are automatically Yang–Mills in the sense that they are
critical points of the Yang–Mills functional. Since we shall make no use of this
functional, we omit its definition.

Moduli spaces of self‐dual connections


This subsection is devoted to sketching the proof of the following equivariant
transversality theorem:
Theorem 5.1. Let 𝑀 be a compact smooth 4-manifold, 𝑃 → 𝑀 a smooth principal
SU(2)-bundle. Then for a generic Riemannian metric 𝑔 on 𝑀, the moduli space
ℳ ∗ (𝑃 ; 𝑔) of irreducible self-dual connections on 𝑃 modulo gauge transformations is a smooth manifold of dimension 5.
We intend to apply the general transversality theorem with
𝒳 : the space 𝒜 𝑘−1,2 of irreducible 𝑊 𝑘−1,2 connections on 𝑃 ;
𝒴 : the space 𝒞 ℓ ∶= 𝐶 ℓ (GL(𝑇 𝑀));
ℰ : the product bundle ℰ 𝑘−2,2 with fiber 𝑊 𝑘−2,2 (Λ2− ⊗ ad 𝑃 );
𝒮 : the map 𝒜 𝑘−1,2 × 𝒞 ℓ → ℰ 𝑘−2,2 , (𝐷, 𝜑) ↦ (𝐷, 𝜑, 𝑃− ((𝜑−1 )∗ 𝐹𝐷 )).
Here ℓ ≫ 𝑘, ∗ is with respect to a fixed Riemannian metric 𝑔 on 𝑀, and 𝑃− denotes the
corresponding projection onto Λ2− . As 𝜑 ranges
over 𝒞 ℓ , 𝜑∗ 𝑔 ranges over all 𝐶 ℓ metrics on 𝑀 (see footnote 6). The projection onto anti-self-dual forms with
respect to 𝜑∗ 𝑔 is 𝜑∗ 𝑃− (𝜑−1 )∗ , so ℳ ∗ (𝑃 ; 𝜑∗ 𝑔) = (𝒮 −1 (0) ∩ (𝒜 𝑘−1,2 × {𝜑}))∕𝒢 𝑘 .
Footnote 6. This trick is from [FU91]. The technical advantage of this parameter space is that it turns ℰ into a
trivial bundle. One could also proceed with 𝒴 the space of 𝐶 ℓ Riemannian metrics, much like in the
previous subsection.
Let us explain the Banach manifold structures on these spaces. Since 𝒞 ℓ is an
open subset of the Banach space 𝐶 ℓ (End 𝑇 𝑀), it is a smooth Banach manifold. In
fact, it is a Banach Lie group with Lie algebra 𝑇id 𝒞 ℓ = 𝐶 ℓ (End 𝑇 𝑀). Recall
that 𝒜 𝑘−1,2 is an affine space modeled on 𝑊 𝑘−1,2 (Λ1 ⊗ ad 𝑃 ), so it is a Banach
manifold with 𝑇𝐷 𝒜 𝑘−1,2 = 𝑊 𝑘−1,2 (Λ1 ⊗ ad 𝑃 ).
Now we pause for a moment. The general transversality theorem only gives
that 𝒮 −1 (0) ∩ (𝒜 𝑘−1,2 × {𝜑}) is a smooth manifold, and we have to quotient out by
𝒢 𝑘 to get the theorem. However, it is not feasible to take the quotient only at the end. Instead, we
proceed in the following steps:
1. Regular value.
0 is a regular value of 𝒮 . Thus the universal moduli space 𝒮 −1 (0) is a Banach
manifold.
2. Slices.
For 𝐷 ∈ 𝒜 𝑘−1,2 , 𝒜 𝑘−1,2 is locally 𝒢 𝑘 -equivariantly diffeomorphic to
ker 𝐷∗ × 𝒢 𝑘 near 𝐷, where 𝒢 𝑘 is the space of 𝐶 𝑘 gauge transformations.
Thus the orbit space 𝒳 𝑘−1,2 ∶= 𝒜 𝑘−1,2 ∕𝒢 𝑘 is a Banach manifold. This is a
simple application of the implicit mapping theorem. Similarly, 𝒮 −1 (0)∕𝒢 𝑘
is a Banach manifold.
3. Fredholm property.
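The projection 𝜋 ∶ 𝒮 −1 (0)∕𝒢 𝑘 → 𝒞 ℓ is Fredholm with index 5. This follows
from the Atiyah–Singer index theorem for the elliptic complex
\[ 0 \to \Omega^0(\operatorname{ad} P) \xrightarrow{\;D\;} \Omega^1(\operatorname{ad} P)
   \xrightarrow{\;D_1\mathcal{S}_{(D,\varphi)}\;} \Omega^2_-(\operatorname{ad} P) \to 0. \]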

4. Application of Sard–Smale.
The theorem now follows immediately from the Sard–Smale theorem since
ℳ ∗ (𝑃 ; 𝜑∗ 𝑔) = 𝜋 −1 (𝜑).
See [FU91] for the complete proof.

References
[FU91] Daniel S. Freed and Karen K. Uhlenbeck (1991). Instantons and Four-Manifolds.
Second edition. Vol. 1. Mathematical Sciences Research Institute Publications. Springer.
[Lan99] Serge Lang (1999). Fundamentals of Differential Geometry. Vol. 191. Graduate
Texts in Mathematics. Springer.
[MS12] Dusa McDuff and Dietmar Salamon (2012). 𝐽 -holomorphic Curves and Symplectic
Topology. Second edition. Vol. 52. Colloquium Publications. American Mathematical
Society.
[Sch93] Matthias Schwarz (1993). Morse Homology. Vol. 111. Progress in Mathematics.
Birkhäuser.
[Sma65] Stephen Smale (1965). “An infinite dimensional version of Sard’s theorem”.
American Journal of Mathematics 87 (4), 861–866.
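Surreal Numbers

Editorial Board of 《荷思》

ABSTRACT

The surreal numbers form a number system obtained by adjoining infinitely large
and infinitesimally small elements to the real numbers. We first describe the
construction of the surreal numbers, and then discuss their application to
mathematical analysis, namely nonstandard analysis, as well as the connection
between surreal numbers and game theory.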

目录

1 超实数 182
超实数的构造 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
超实数的运算 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184

2 超实数与非标准分析 186
非标准实数 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
非标准微积分 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189

3 超实数与博弈论 190
游戏局面与超实数的对应 . . . . . . . . . . . . . . . . . . . . . . . . 190
游戏策略 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192

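1 Surreal numbers
The surreal numbers¹ are a number system obtained by adjoining infinitely large
and infinitesimal elements to the real numbers. In this section we describe how
the surreal numbers are constructed and define some basic operations on them.

Construction of the surreal numbers
The surreal numbers are constructed as follows. First, write

$$0 = (\ :\ ),$$

and then set

$$1 = (0 :\ ), \qquad -1 = (\ : 0).$$

Here $(L : R)$ denotes a number that is larger than every number in $L$ and
smaller than every number in $R$. Similarly, set

$$2 = (1 :\ ), \qquad -2 = (\ : -1).$$

Continuing this inductive construction, we obtain all the integers.
Next, set

$$\frac{1}{2} = (0 : 1), \qquad \frac{3}{2} = (1 : 2).$$

In this way we obtain all half-integers:

$$\frac{2k+1}{2} = (k : k+1).$$

Going further, set

$$\frac{4k+1}{4} = \Bigl(k : k+\frac{1}{2}\Bigr), \qquad \frac{4k+3}{4} = \Bigl(k+\frac{1}{2} : k+1\Bigr).$$

This gives all integer multiples of $1/4$. Continuing in this way, we obtain
every number with a finite binary expansion. These numbers are dense in the
real numbers, so the remaining real numbers can be obtained by a procedure
similar to Dedekind cuts.
Besides the ordinary real numbers, the surreal numbers contain other numbers.
For example,

$$\omega = (1, 2, 3, \dots :\ )$$

is a number larger than every real number, and

$$\varepsilon = \Bigl(0 : \frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \dots\Bigr)$$

is a positive number smaller than every positive real number. Both are
legitimate surreal numbers.
This procedure can be continued indefinitely: whenever we have two sets of
surreal numbers $X_L$ and $X_R$ such that $x_L < x_R$ for all $x_L \in X_L$ and
$x_R \in X_R$, the pair $(X_L : X_R)$ is a legitimate surreal number. The objects
obtained by this recursion are exactly the surreal numbers.

¹ In Chinese: 超实数, also commonly rendered as 超现实数.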
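The order on the surreal numbers is defined recursively as follows. For two
surreal numbers

$$x = (X_L : X_R), \qquad y = (Y_L : Y_R),$$

we say that $x \le y$ if $x_L \not\ge y$ and $x \not\ge y_R$ hold for all
$x_L \in X_L$ and $y_R \in Y_R$; we say that $x < y$ if $x \le y$ and
$y \not\le x$. We write $x = y$ if $x \le y$ and $y \le x$, and we regard such
$x$ and $y$ as equal surreal numbers.
The class of all surreal numbers is denoted $\mathbb{S}$. It is a proper class,
that is, it is larger than any set and is therefore not a set.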
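Following the construction, we can classify surreal numbers by their
“complexity”. Let $\mathbb{S}_0$ be the set of surreal numbers that can be
constructed from the empty set $\emptyset$, that is,

$$\mathbb{S}_0 = \{0\}.$$

For each ordinal $\alpha$, let $\mathbb{S}_\alpha$ be the set of surreal numbers
that can be constructed from $\mathbb{S}_{<\alpha}$, where

$$\mathbb{S}_{<\alpha} = \bigcup_{\beta<\alpha} \mathbb{S}_\beta.$$

We say that a surreal number $x$ is defined on day $\alpha$ if
$x \in \mathbb{S}_\alpha$ and $x \notin \mathbb{S}_{<\alpha}$.

For example, in this notation $\mathbb{S}_1$ is the set of surreal numbers that
can be constructed from $\mathbb{S}_0$, namely

$$\mathbb{S}_1 = \{-1, 0, 1\};$$

and $\mathbb{S}_{<\omega}$ is the set of all surreal numbers obtainable in
finitely many steps, that is, all numbers with a finite binary expansion, where
$\omega$ is the least infinite ordinal. Moreover, one can check that

$$\mathbb{S}_\omega = \mathbb{R} \cup \{\pm\omega,\ \mathbb{S}_{<\omega} \pm \varepsilon\}.$$

This classification helps in comparing surreal numbers: comparing a number in
$\mathbb{S}_\alpha$ with a number in $\mathbb{S}_\beta$ reduces to comparing
numbers in $\mathbb{S}_{<\alpha}$ with numbers in $\mathbb{S}_\beta$, and numbers
in $\mathbb{S}_\alpha$ with numbers in $\mathbb{S}_{<\beta}$. By induction, one
can verify the following properties of the order on the surreal numbers:

• any two surreal numbers are comparable;
• the order is reflexive and transitive;
• if $x = (X_L : X_R)$ is a surreal number, then $X_L < x < X_R$.

A surreal number can have several different representations. For example, one
can check that

$$2 = (1 :\ ) = (0, 1 :\ ).$$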
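Arithmetic of the surreal numbers
Surreal numbers can not only be compared; like real numbers, they can also be
added, subtracted, multiplied and divided. We first define addition and
subtraction, for which it suffices to define the sum of two surreal numbers and
the negative of a surreal number.

Definition 1.1. Let

$$x = (X_L : X_R), \qquad y = (Y_L : Y_R)$$

be surreal numbers. We define

$$-x = (-X_R : -X_L),$$

and

$$x + y = \bigl((x + Y_L) \cup (y + X_L) : (x + Y_R) \cup (y + X_R)\bigr).$$

Here

$$-S = \{-s \mid s \in S\}, \qquad x + S = \{x + s \mid s \in S\}.$$

Example 1.2. Here are some examples of addition and subtraction of surreal
numbers.

• For every surreal number $x = (X_L : X_R)$ we have

$$x + 0 = (X_L : X_R) + (\ :\ ) = (X_L : X_R) = x.$$

• We have

$$1 + 1 = (0 :\ ) + (0 :\ ) = (1 :\ ) = 2.$$

• We have

$$\omega + 1 = (0, 1, \dots, \omega :\ ) = (\omega :\ ),$$

and, as another example,

$$2\omega = \omega + \omega = (\omega + 1, \omega + 2, \dots :\ ).$$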
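The recursive definitions of the order, the negative and the sum can be tried out on surreal numbers with finitely many options. The following Python sketch is only an illustration under that assumption: the class `Surreal` and the helpers `le`, `lt`, `eq`, `neg` and `add` are hypothetical names introduced here, and numbers with infinite option sets such as $\omega$ and $\varepsilon$ are out of its reach.

```python
class Surreal:
    """A surreal number (L : R) with finitely many left and right options."""

    def __init__(self, left=(), right=()):
        self.left = tuple(left)
        self.right = tuple(right)


def le(x, y):
    # x <= y iff no left option of x is >= y and no right option of y is <= x.
    return (not any(le(y, xl) for xl in x.left)
            and not any(le(yr, x) for yr in y.right))


def lt(x, y):
    return le(x, y) and not le(y, x)


def eq(x, y):
    # Equality of surreal numbers is "x <= y and y <= x", not identity of forms.
    return le(x, y) and le(y, x)


def neg(x):
    # -x = (-X_R : -X_L)
    return Surreal([neg(r) for r in x.right], [neg(l) for l in x.left])


def add(x, y):
    # x + y = ((x + Y_L) u (y + X_L) : (x + Y_R) u (y + X_R))
    left = [add(x, yl) for yl in y.left] + [add(xl, y) for xl in x.left]
    right = [add(x, yr) for yr in y.right] + [add(xr, y) for xr in x.right]
    return Surreal(left, right)


ZERO = Surreal()                 # 0 = ( : )
ONE = Surreal([ZERO])            # 1 = (0 : )
TWO = Surreal([ONE])             # 2 = (1 : )
HALF = Surreal([ZERO], [ONE])    # 1/2 = (0 : 1)

print(eq(add(ONE, ONE), TWO))            # compares 1 + 1 with 2
print(eq(TWO, Surreal([ZERO, ONE])))     # compares (1 : ) with (0, 1 : )
print(eq(add(HALF, HALF), ONE))          # compares 1/2 + 1/2 with 1
```

The recursion is exponential in the depth of the numbers, so this is suitable only for very small examples.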

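The definitions of multiplication and division of surreal numbers are somewhat
more complicated. We omit the rigorous definitions; the reader may consult
[Gon86].

Example 1.3. Here are some examples of multiplication of surreal numbers.

• The product of $2$ and $\omega$ is $2\omega = \omega + \omega$.
• The product of the surreal numbers $\varepsilon$ and $\omega$ is $1$; in other
  words, these two surreal numbers are reciprocals of each other.

Surreal numbers admit a base-$\omega$ expansion. Let $x = (X_L : X_R)$ be a
surreal number. We define

$$\omega^x = (Y_L : Y_R),$$

where

$$Y_L = \{0\} \cup \{s \cdot \omega^{x_l} \mid s \in \mathbb{R},\ s > 0,\ x_l \in X_L\},$$
$$Y_R = \{s \cdot \omega^{x_r} \mid s \in \mathbb{R},\ s > 0,\ x_r \in X_R\}.$$

One can check that for every surreal number $x$ there exist a real number $r$
and a surreal number $y$ such that for every real number $s$,

$$s \cdot (x - r \cdot \omega^y) < x.$$

Moreover, such $r$ and $y$ are unique. Repeatedly replacing $x$ by
$x - r \cdot \omega^y$, we obtain the following theorem.

Theorem 1.4. For every surreal number $x$ there is a unique ordinal $\alpha$
and, for each ordinal $\beta$ less than $\alpha$, a unique real number
$r_\beta$ and a unique surreal number $y_\beta$, such that:

• the sequence $(y_\beta)$ is monotonically decreasing;
• we have

$$x = \sum_{\beta<\alpha} r_\beta \cdot \omega^{y_\beta}.$$

This expression is called the base-$\omega$ expansion of $x$.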

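As for the real numbers, one can define an exponential function
$x \mapsto \exp(x)$ on the surreal numbers. For the precise definition, see
[Gon86, Chapter 10]. Here we only give some properties and examples of the
exponential function on the surreal numbers:

• The restriction of $\exp$ to the real numbers is the usual exponential
  function. For example, $\exp(0) = 1$ and $\exp(1) = e$.
• The image of $\exp$ is the class $\mathbb{S}_+$ of all positive surreal
  numbers. Moreover, the map

$$\exp \colon \mathbb{S} \to \mathbb{S}_+$$

  is monotonically increasing, and hence a bijection. Its inverse is denoted
  $\log$ and called the logarithm.
• For all surreal numbers $x, y$ we have

$$\exp(x + y) = \exp(x) \cdot \exp(y).$$

Example 1.5. We have

• $\exp \omega = \omega^\omega$.

• $\log \omega = \omega^\varepsilon$.

In particular, this shows that $\omega^\omega$ and $\exp(\omega \log \omega)$ are
different: the former is merely a notation that we have defined, not a genuine
exponential.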

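2 Surreal numbers and nonstandard analysis

In mathematical analysis, notions such as limits, continuity and derivatives of
functions are defined in the $\varepsilon$-$\delta$ language. In the surreal
numbers we have a ready-made infinitesimal element $\varepsilon$, which can be
used to bypass the $\varepsilon$-$\delta$ language and define limits,
continuity, derivatives and so on directly, in a way that matches intuition more
closely. This subject is called nonstandard analysis.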

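Hyperreal numbers

In nonstandard analysis we do not use all the surreal numbers, because there
are too many of them to form a set. Nonstandard analysis uses the hyperreal
numbers² ${}^*\mathbb{R}$, that is, the surreal numbers defined before day
$\omega_1$, where $\omega_1$ is the least uncountable ordinal. Precisely, we
define

$${}^*\mathbb{R} = \mathbb{S}_{<\omega_1},$$

and one can show that it is closed under the four arithmetic operations and is
therefore a field. It contains all the real numbers, and it has the same
cardinality as the set of real numbers.
The power of nonstandard analysis lies not in the construction of the hyperreal
numbers but in a property called the transfer principle: a statement about the
real numbers $\mathbb{R}$ (with some restrictions on the statement, of course)
is true if and only if the corresponding statement about the hyperreal numbers
${}^*\mathbb{R}$ is true. This means that to prove a theorem about
$\mathbb{R}$, it suffices to prove it over ${}^*\mathbb{R}$; the original
theorem about $\mathbb{R}$ then follows at once.
The transfer principle also means that functions on $\mathbb{R}$ can be
transferred to ${}^*\mathbb{R}$ and become functions on ${}^*\mathbb{R}$. One can
then define limits, continuity, derivatives and so on over ${}^*\mathbb{R}$, and
the resulting notions agree with those defined over $\mathbb{R}$.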
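To carry out this transfer, nonstandard analysis uses another, equivalent,
definition of the hyperreal numbers. We define hyperreal numbers as sequences
of real numbers $\{a_n\}_{n\in\mathbb{N}}$, for example:

$$x = (x, x, x, x, x, \dots) \quad (x \in \mathbb{R}),$$
$$\omega = (1, 2, 3, 4, 5, \dots),$$
$$\omega + 1 = (2, 3, 4, 5, 6, \dots),$$
$$\omega^2 = (1, 4, 9, 16, 25, \dots),$$
$$\varepsilon = 1/\omega = (1, 1/2, 1/3, 1/4, \dots),$$
$$\sqrt{\omega} + \varepsilon = (1 + 1, \sqrt{2} + 1/2, \sqrt{3} + 1/3, \dots),$$

and so on. The precise definition is as follows.

Construction 2.1. Let $\mathbb{R}^{\mathbb{N}}$ be the ring of all sequences of
real numbers $\{a_n\}_{n\in\mathbb{N}}$, with addition and multiplication
defined componentwise. Note that each set of the form

$$I_k = \{\{a_n\}_{n\in\mathbb{N}} \in \mathbb{R}^{\mathbb{N}} \mid a_k = 0\}$$

is a maximal ideal of $\mathbb{R}^{\mathbb{N}}$, and
$\mathbb{R}^{\mathbb{N}}/I_k \simeq \mathbb{R}$. But $\mathbb{R}^{\mathbb{N}}$
has other maximal ideals: for example, the ideal

$$\{\{a_n\}_{n\in\mathbb{N}} \in \mathbb{R}^{\mathbb{N}} \mid \text{only finitely many } a_n \text{ are nonzero}\}$$

is contained in some maximal ideal, and this maximal ideal cannot be any of the
$I_k$. We choose a maximal ideal $I$ different from all the $I_k$ and define

$${}^*\mathbb{R} = \mathbb{R}^{\mathbb{N}}/I.$$

Since the quotient of a commutative ring by a maximal ideal is a field,
${}^*\mathbb{R}$ is a field. We call this ${}^*\mathbb{R}$ the hyperreal
numbers. One can prove that the ${}^*\mathbb{R}$ obtained in this way agrees
with the earlier definition, but this is beyond the scope of this article; see
[Ehr12].

² In Chinese: 非标准实数, also commonly rendered as 超实数.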

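Remark 2.2. The existence of the maximal ideal $I$ relies on the axiom of
choice. That is, we cannot construct such a maximal ideal explicitly; we can
only prove that it exists. Some of the constructions below depend on the choice
of $I$, but the transfer principle holds no matter how $I$ is chosen.

Remark 2.3. The hyperreal field ${}^*\mathbb{R}$ is a complete ordered field,
where completeness means that every Cauchy sequence converges. From analysis we
know that a complete ordered field satisfying the Archimedean property is
isomorphic to $\mathbb{R}$. Here ${}^*\mathbb{R}$ is an example of a complete
ordered field that does not satisfy the Archimedean property.

Remark 2.4. We can define the nonstandard integers ${}^*\mathbb{Z}$ as the image
of $\mathbb{Z}^{\mathbb{N}} \subset \mathbb{R}^{\mathbb{N}}$ in ${}^*\mathbb{R}$,
the nonstandard rationals ${}^*\mathbb{Q}$ as the image of
$\mathbb{Q}^{\mathbb{N}} \subset \mathbb{R}^{\mathbb{N}}$ in ${}^*\mathbb{R}$, and
so on. For example, $\omega$ is a nonstandard integer; $\omega + 1/2$ is not a
nonstandard integer, but it is a nonstandard rational.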
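In this way we can transfer functions on $\mathbb{R}$ to ${}^*\mathbb{R}$.

Construction 2.5. Let $f \colon \mathbb{R} \to \mathbb{R}$ be any function (not
necessarily continuous). Its transfer to ${}^*\mathbb{R}$ is the function
${}^*f \colon {}^*\mathbb{R} \to {}^*\mathbb{R}$ defined by

$$({}^*f)(a_1, a_2, a_3, \dots) = (f(a_1), f(a_2), f(a_3), \dots).$$

When there is no ambiguity, we also write $f$ for ${}^*f$.

Example 2.6. By this construction we have

$$\sin \omega = (\sin 1, \sin 2, \sin 3, \dots).$$

We also obtain a strange but correct identity:

$$\sin(\omega\pi) = (0, 0, 0, 0, \dots) = 0.$$

However,

$$\cos(\omega\pi) = (-1, 1, -1, 1, \dots).$$

Note that its square equals $1$, and in the field ${}^*\mathbb{R}$ the number $1$
has at most two square roots, so

$$\cos(\omega\pi) = \pm 1.$$

So is it $1$ or $-1$? This depends on the choice of the maximal ideal
$I \subset \mathbb{R}^{\mathbb{N}}$ made when constructing the hyperreal numbers;
both values are possible.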
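The termwise transfer of a function can be illustrated numerically. The following Python sketch is only a finite caricature: it keeps the first `N` terms of each sequence (an arbitrary truncation chosen here), it does not model the quotient by the maximal ideal $I$, and the helper name `transfer` is hypothetical.

```python
import math


def transfer(f):
    """Lift f: R -> R to act termwise on a (truncated) sequence."""
    return lambda seq: [f(a) for a in seq]


N = 8  # number of terms kept; an arbitrary truncation for illustration
omega = [float(n) for n in range(1, N + 1)]   # (1, 2, 3, ...)
omega_pi = [n * math.pi for n in omega]       # omega * pi, termwise

sin_terms = transfer(math.sin)(omega_pi)      # every term is 0 up to rounding
cos_terms = transfer(math.cos)(omega_pi)      # terms alternate between -1 and 1
cos_squared = [c * c for c in cos_terms]      # every term is 1 up to rounding

print([round(c) for c in cos_terms])
```

The alternating signs make visible why the value of $\cos(\omega\pi)$ in ${}^*\mathbb{R}$ cannot be read off from the sequence alone: it depends on which set of indices the chosen ideal treats as negligible.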

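Example 2.7. We also have

$$\sin \varepsilon = (\sin 1, \sin(1/2), \sin(1/3), \dots) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!}\, \varepsilon^{2n+1},$$

because we know that

$$\varepsilon^n = (1, 1/2^n, 1/3^n, \dots),$$

and computing the sum above with this formula gives
$(\sin 1, \sin(1/2), \sin(1/3), \dots)$.

This example shows the power of nonstandard analysis: we have not yet
introduced any calculus, and we have already obtained the Taylor expansion of a
function!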
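Nonstandard calculus
Using the language of nonstandard analysis, we can rebuild the theory of
calculus. As an example, we use nonstandard analysis to define the limit and the
derivative of a function, among other notions.

Construction 2.8. Recall that every surreal number $x$ can be written uniquely
in the form

$$x = \sum_{n\in\mathbb{S}} a_n \omega^n$$

(Theorem 1.4), where $a_n \in \mathbb{R}$. In this expansion, the coefficient
$a_0$ of the term $\omega^0 = 1$ is called the standard part of $x$, denoted
$\operatorname{st} x$. For example, $\operatorname{st} \varepsilon = 0$,
$\operatorname{st} 1 = 1$ and $\operatorname{st} \omega = 0$.

Definition 2.9. We say that $h \in {}^*\mathbb{R}$ is an infinitesimal if it is
nonzero and its absolute value (as a hyperreal number) is smaller than every
positive real number.

Definition 2.10 (Limit). Let $f \colon \mathbb{R} \to \mathbb{R}$ be a function
and let $x_0 \in \mathbb{R}$. If there exists $y \in \mathbb{R}$ such that for
every infinitesimal $h \in {}^*\mathbb{R}$ we have

$$\operatorname{st} f(x_0 + h) = y,$$

we say that $y$ is the limit of $f(x)$ as $x \to x_0$, and write

$$\lim_{x \to x_0} f(x) = y.$$

In mathematical analysis, continuity and the derivative of a function are both
defined through limits. Now that we have changed the definition of a limit,
continuity and the derivative can be written down in an intuitive way.

Definition 2.11. Let $f \colon \mathbb{R} \to \mathbb{R}$ be a function and let
$x \in \mathbb{R}$. We say that $f$ is continuous at $x$ if for every
infinitesimal $h \in {}^*\mathbb{R}$ we have

$$\operatorname{st} f(x + h) = f(x).$$

Definition 2.12. Let $f \colon \mathbb{R} \to \mathbb{R}$ be a function and let
$x \in \mathbb{R}$. We say that $f$ is differentiable at $x$ if there exists a
real number $f'(x) \in \mathbb{R}$, called the derivative of $f$ at $x$, such
that for every infinitesimal $h \in {}^*\mathbb{R}$ we have

$$\operatorname{st} \frac{f(x + h) - f(x)}{h} = f'(x).$$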
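As a quick illustration of the definition of the derivative via the standard part (a worked example added here, not taken from the source), take $f(x) = x^2$ and any infinitesimal $h$:

```latex
\operatorname{st}\frac{(x+h)^2 - x^2}{h}
  = \operatorname{st}\frac{2xh + h^2}{h}
  = \operatorname{st}(2x + h)
  = 2x .
```

The last step uses only that $2x$ is real and $h$ is infinitesimal, so $f'(x) = 2x$ is obtained without any $\varepsilon$-$\delta$ argument.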

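Example 2.13. We compute the derivative of the exponential function
$\exp \colon \mathbb{R} \to \mathbb{R}$:

$$\exp'(0) = \operatorname{st} \frac{\exp h - 1}{h} = \operatorname{st}\Bigl(\sum_{n=0}^{\infty} \frac{h^{n-1}}{n!} - \frac{1}{h}\Bigr) = 1.$$

Definitions of this kind can also simplify the proofs of many statements. Here
is an example.

Example 2.14. We prove the chain rule

$$(f \circ g)'(x) = f'(g(x))\, g'(x).$$

If $g(x + \varepsilon) \ne g(x)$, then

$$\begin{aligned}
\text{LHS} &= \operatorname{st} \frac{f(g(x+\varepsilon)) - f(g(x))}{\varepsilon} \\
&= \operatorname{st} \frac{f(g(x+\varepsilon)) - f(g(x))}{g(x+\varepsilon) - g(x)} \cdot \operatorname{st} \frac{g(x+\varepsilon) - g(x)}{\varepsilon} = \text{RHS}.
\end{aligned}$$

This is because $\operatorname{st}$ is compatible with multiplication for finite
numbers. If $g(x + \varepsilon) = g(x)$, then the expression above is $0$, and
the chain rule also holds.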

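3 Surreal numbers and game theory
In this section we give a simple application of surreal numbers to game theory.

Correspondence between game positions and surreal numbers
We consider games satisfying the following conditions: the game has two
players and a given initial position, and the players move alternately; the
game ends when the position satisfies a given condition, and there is a unique
winner; the game necessarily ends after finitely many moves (for example, Go
under a rule that strictly forbids repeating a whole-board position satisfies
this condition).
We shall represent a game position by a generalized surreal number. Generalized
surreal numbers are defined like surreal numbers, that is, they still have the
form $(X_L : X_R)$, but $X_L < X_R$ is not required. For example, $(0 : 0)$ is a
generalized surreal number but not a surreal number.
Call the two players L and R. If a position in which it is L's turn triggers
the end condition and L loses, we set $X_L = \emptyset$, and likewise for R. In
this way we define inductively the generalized surreal number $(X_L : X_R)$
corresponding to each position. For example, the position in which whoever is
to move loses is $(\emptyset : \emptyset) = 0$; the position in which R loses if
it is R's turn, but L, if it is L's turn, can turn the position into $0$, is
$(0 : \emptyset) = 1$; if a move by either player turns the position into $0$,
the position is $(0 : 0)$, a generalized surreal number that is not a surreal
number. We denote $(0 : 0)$ by $*$.
Conversely, given a generalized surreal number $(X_L : X_R)$, we can construct
the following game: when it is L's turn, L may replace the number by an element
of $X_L$; if $X_L = \emptyset$ when it is L's turn, then L loses, and likewise
for R. Since sets can only be nested finitely deep, the game necessarily ends
after finitely many moves. We have thus constructed a one-to-one correspondence
between game positions and generalized surreal numbers, and from now on we use
generalized surreal numbers to study games.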
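On generalized surreal numbers one can define operations (addition, negation
and multiplication), an order relation and an equivalence relation, just as for
surreal numbers. They form a partially ordered commutative ring with unit,
which is not a field ($*^2 = *$) and is not totally ordered ($*$ and $0$ are
incomparable).
Under this correspondence, addition, subtraction and the order relation of
generalized surreal numbers all have a meaning for games. Addition and
subtraction have the following meaning:

Theorem 3.1. Under the above correspondence, addition of generalized surreal
numbers corresponds to the union of games, namely the following game: given
several generalized surreal numbers
$(X_{L,1} : X_{R,1}), \dots, (X_{L,n} : X_{R,n})$, a player may choose to move in
any one of them, and the game ends as follows: if a player whose turn it is
cannot move in any of the numbers (that is, all the numbers have been turned
into the form $(\emptyset : x_R)$), then that player loses. The negative of a
generalized surreal number corresponds to the reversed game, in which each
player makes the moves available to the other player in the original game.

The order relation corresponds to winning and losing. Suppose that a game
position corresponds to a generalized surreal number $x$. Then:

Theorem 3.2. $x = 0$ if and only if, in the corresponding position, the player
who moves first loses. $x > 0$ if and only if L wins in the corresponding
position (no matter who moves first), and $x < 0$ if and only if R wins. $x$ is
incomparable with $0$ if and only if the player who moves first wins.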

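Remark 3.3. Multiplication of surreal numbers has no good counterpart in the
theory of games.

The order relation on generalized surreal numbers is rather subtle, as the
following example shows.

Example 3.4. For every surreal number $x > 0$ we have $x > *$. We have
$(0 : *) > 0$, and yet $*$ and $(0 : *)$ are incomparable. $(1 : -1)$ is a
generalized surreal number, but it is incomparable with every surreal number
$x$ satisfying $-1 \le x \le 1$.

To address this problem, we define the mean of a generalized surreal number.
From now on we consider only generalized surreal numbers that are defined on a
finite day (the generalized surreal numbers corresponding to game positions all
satisfy this condition).

Definition 3.5. For a generalized surreal number $x$, there exist real numbers
$x_1, x_2$ such that $x_1 > x$ and $x_2 < x$. We define the upper bound of $x$
as

$$\sup(x) := \inf\{x_1 \in \mathbb{R} \mid x_1 \ge x\}.$$

The lower bound $\inf(x)$ is defined in the same way.

Definition 3.6. For a generalized surreal number $x$, the limit (in the real
numbers)

$$\lim_{n \to \infty} \frac{\sup(nx)}{n}$$

exists. Its value is defined to be the mean of $x$, denoted
$\operatorname{mean}(x)$ (replacing $\sup$ by $\inf$ gives the same result).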
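One can see that the larger the mean, the more likely the position is to favour
L.

Example 3.7. If a generalized surreal number $x$ is a real number, its mean is
its usual value. If $x$ has the form $(X_L : X_R)$, where $X_L > X_R$ are real
numbers, its mean is $(X_L + X_R)/2$. For real numbers
$X_{L,L} > X_{L,R} > X_R$, the mean of $((X_{L,L} : X_{L,R}) : X_R)$ is
$\min\{(X_{L,L} + X_{L,R} + 2X_R)/4,\ X_{L,R}\}$.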

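Game strategy
For a general game position, generalized surreal numbers do not give a good
analysis. Sometimes, however, a position can be expressed as the sum of several
simpler sub-positions (with the sum defined as above), and then generalized
surreal numbers yield a fairly good strategy. A representative example of such
positions is the theory of the endgame in Go.
We first define the value of a move.

Definition 3.8. Let $x = (X_L : X_R)$ be a position and suppose it is L's turn.
Define the value of L's move in $x$ as

$$\alpha_L = \sup\{\operatorname{mean}(y) \mid y \in X_L\} - \operatorname{mean}(x),$$

that is, the difference between the means of the positions after and before
L's move.

Consider a position $x = (X_L : X_R)$ such that

$$x = x_1 + \dots + x_n, \qquad x_i = (X_{L,i} : X_{R,i}).$$

Then the strategy more likely to favour L is to move in the $x_i$ whose move
value $\alpha_{i,L}$ is largest.

Example 3.9. Let the $x_i$ be as above. If $x_i = (X_{L,i} : X_{R,i})$, where
$X_{L,i} > X_{R,i}$ are real numbers, then the most favourable move for L is to
move in the $x_i$ for which $X_{L,i} - X_{R,i}$ is largest. This agrees with the
conclusion obtained from the theory of generalized surreal numbers.

However, such a strategy is not necessarily optimal, as the following example
shows.

Example 3.10. Let $x = x_1 + x_2$, where $x_1 = (3 : 0)$ and
$x_2 = (0 : (-2 : -10))$. Then $\alpha_{1,L} = 3/2$ and $\alpha_{2,L} = 2$.
However, L wins by moving in $x_1$ and loses by moving in $x_2$. As another
example, let $x = x_1 + x_2 + x_3$, where $x_1 = (0 : *)$ and $x_2 = x_3 = *$.
Then every move has value $0$, and yet L can win only by moving in $x_1$.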
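The win/loss claims in this section can be checked by brute force on small positions. The following Python sketch assumes the rule stated above (a player who cannot move loses) and represents a position as a pair of tuples of positions; the names `integer`, `add`, `mover_wins` and `left_wins_after` are hypothetical helpers introduced only for this illustration.

```python
from functools import lru_cache

# A position is a pair (left_options, right_options) of tuples of positions.
ZERO = ((), ())
STAR = ((ZERO,), (ZERO,))          # * = (0 : 0)


def integer(n):
    """The position n: (n-1 : ) for n > 0 and ( : n+1) for n < 0."""
    g = ZERO
    for _ in range(abs(n)):
        g = ((g,), ()) if n > 0 else ((), (g,))
    return g


@lru_cache(maxsize=None)
def add(x, y):
    """Sum of positions: a move in x + y is a move in exactly one summand."""
    left = tuple(add(xl, y) for xl in x[0]) + tuple(add(x, yl) for yl in y[0])
    right = tuple(add(xr, y) for xr in x[1]) + tuple(add(x, yr) for yr in y[1])
    return (left, right)


@lru_cache(maxsize=None)
def mover_wins(g, player):
    """True if `player` (0 = L, 1 = R), moving first in g, can force a win."""
    return any(not mover_wins(option, 1 - player) for option in g[player])


def left_wins_after(position):
    """L has just moved to `position`; L wins iff R, moving next, cannot win."""
    return not mover_wins(position, 1)


# First position of Example 3.10: x1 = (3 : 0), x2 = (0 : (-2 : -10)).
x1 = ((integer(3),), (ZERO,))
x2 = ((ZERO,), (((integer(-2),), (integer(-10),)),))
move_in_x1 = left_wins_after(add(integer(3), x2))   # L replaces x1 by 3
move_in_x2 = left_wins_after(add(x1, ZERO))         # L replaces x2 by 0

# Second position: (0 : *) + * + *.
up = ((ZERO,), (STAR,))
move_in_up = left_wins_after(add(add(ZERO, STAR), STAR))    # L moves in (0 : *)
move_in_star = left_wins_after(add(add(up, ZERO), STAR))    # L moves in one *

print(move_in_x1, move_in_x2, move_in_up, move_in_star)
```

Exhaustive search of this kind is feasible only for tiny positions; the point of the mean and of move values is precisely to avoid it, at the cost of the inaccuracy shown in the example.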

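This shows the limitations of using generalized surreal numbers to analyse
games. Only in some simple situations can one determine the order relation
between generalized surreal numbers with the same mean (for example,
$(0 : *) > 0$) and thus obtain an exact strategy; see [BW97]. In general one can
only make approximate estimates.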
References
[BW97] E. R. Berlekamp and D. Wolfe (1997). Mathematical Go: Chilling gets the last
point. A K Peters/CRC Press.
[Ehr12] P. Ehrlich (2012). “The Absolute Arithmetic Continuum and the Unification of
All Numbers Great and Small”. Bulletin of Symbolic Logic 18 (1), 1–45.
[Gon86] H. Gonshor (1986). An Introduction to the Theory of Surreal Numbers.
Cambridge University Press.
[Ros04] E. E. Rosinger (2004). Short Introduction to Nonstandard Analysis.
arXiv: math/0407178.
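Origami Problems

黄虞来¹

Abstract

This article gives a complete characterization of the points that can be
constructed by origami, and presents two interesting instances of origami
constructions: trisecting an angle and folding a regular $N$-gon.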

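1 The origami axioms
Origami is in effect a process of constructing new geometric objects from known
ones. We call the points and lines constructed in origami constructible points
and constructible lines. We identify the sheet of paper with the complex plane
$\mathbb{C}$, so that constructible points and constructible lines are certain
special subsets of the complex plane.
In practice, we find that the following six kinds of construction are the main
ones used (we call them the six axioms of origami):

(1) the line through two constructible points is constructible;

(2) the intersection point of two constructible lines is a constructible point;

(3) the perpendicular bisector of two constructible points is a constructible
    line;

(4) the angle bisector of two constructible lines is constructible;

(5) given a constructible line $l$ and constructible points $P, Q$, a line
    through $Q$ that reflects $P$ onto $l$ (if it exists) is constructible;

(6) given constructible lines $l, m$ and constructible points $P, Q$, a line
    that reflects $P$ onto $l$ and $Q$ onto $m$ (if it exists) is constructible.

¹ Mathematics Class 92, Department of Mathematical Sciences, Tsinghua University.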

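In axioms (5) and (6), a line satisfying the requirement need not exist. In
fact, these lines are characterized by the following lemma.

Lemma 1.1. Let $P$ be a point in the plane and $l$ a line with $P \notin l$, and
let $\Gamma$ be the parabola with focus $P$ and directrix $l$. Then a line $m$
reflects $P$ onto $l$ if and only if $m$ is a tangent line of $\Gamma$.

The proof of the lemma is left to the reader. The lemma shows that axiom (5)
amounts to drawing a tangent to a parabola through a given point, and axiom (6)
amounts to drawing a common tangent to two parabolas. Such constructions do not
always exist.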

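2 Algebraic structure
Let $\mathcal{P} \subset \mathbb{C}$ be a subset of the complex plane
$\mathbb{C}$, and let $\mathcal{L}$ be a set of lines in $\mathbb{C}$. Axioms
(1) to (6) can be understood as relations between the sets $\mathcal{P}$ and
$\mathcal{L}$: for example, axiom (1) says that the line through two points of
$\mathcal{P}$ belongs to $\mathcal{L}$.
We first characterize the points and lines that can be constructed with the
first five origami axioms.

Definition 2.1. A pair $(\mathcal{P}, \mathcal{L})$ as above is called a Euclid
construction if it satisfies axioms (1) to (5). The elements of $\mathcal{P}$
and $\mathcal{L}$ are called the constructible points and the constructible
lines of this Euclid construction, respectively.

Lemma 2.2. Let $(\mathcal{P}_\alpha, \mathcal{L}_\alpha)_{\alpha\in\mathcal{A}}$
be a family of Euclid constructions. Then
$(\bigcap_\alpha \mathcal{P}_\alpha, \bigcap_\alpha \mathcal{L}_\alpha)$ is also a
Euclid construction.

With this lemma we obtain the notion of “generation”: for $T \subset \mathbb{C}$
containing $\{0, 1\}$, the Euclid construction $(\mathcal{P}, \mathcal{L})$
generated by $T$ is defined as the intersection of all Euclid constructions
containing $T$. For convenience, we write
$\mathcal{P}_r := \mathcal{P} \cap \mathbb{R}$ for the set of real numbers in
$\mathcal{P}$.

Theorem 2.3 (Characterization of Euclid constructions). Let
$(\mathcal{P}, \mathcal{L})$ be the Euclid construction generated by $T$. Then
$\mathcal{P}$ is the smallest subfield of $\mathbb{C}$ that contains $T$ and is
closed under conjugation and square roots, and $\mathcal{L}$ is the set of all
lines defined by linear equations in two variables with coefficients in
$\mathcal{P}_r$.

Before proving Theorem 2.3, we give some direct consequences of axioms (1) to
(4):

(a) if $P, Q, R \in \mathcal{P}$, then the point $S$ with
    $\overrightarrow{PQ} = \overrightarrow{RS}$ lies in $\mathcal{P}$;

(b) if $P \in \mathcal{P}$ and $l \in \mathcal{L}$, then the perpendicular $m$
    from $P$ to $l$ lies in $\mathcal{L}$, and the reflection $Q$ of $P$ in $l$
    lies in $\mathcal{P}$;

(c) if $t \in \mathcal{P}_r$ and $t \ne 0$, then $1/t \in \mathcal{P}_r$;

(d) if $a, b \in \mathcal{P}_r$, then $ab \in \mathcal{P}_r$.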
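A useful observation is that we can find plenty of constructible points on a
constructible line (in fact, a dense set), which saves a good deal of trouble
in the arguments below. The proof of consequence (a) is shown in the figure
below; the remaining proofs are left to the reader.

Figure 1: Proof of consequence (a) (points P, Q, R, S', S)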

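Proposition 2.4. $\mathcal{P}$ is a subfield of $\mathbb{C}$ and is closed under
conjugation.

Proof. Since $0, 1 \in \mathcal{P}$, the real axis is a constructible line. Use
axiom (3) to construct the perpendicular bisector of $0$ and $1$, and then use
consequence (a) to translate it to $0$; hence the imaginary axis is also a
constructible line. By axiom (4) the $45^\circ$ line $l \colon y = x$ is
constructible, and by consequence (b) the constructible points on the real axis
and on the imaginary axis are symmetric with respect to $l$. Since the
projections of a constructible point onto the real and imaginary axes are
constructible points, we have

$$\mathcal{P} = \{x + iy \mid x, y \in \mathcal{P}_r\}.$$

Finally, consequences (c) and (d) tell us that $\mathcal{P}$ is a subfield of
$\mathbb{C}$. ◻

Figure 2: Closure under square roots (points P, Q)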

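Note that the proof of Proposition 2.4 uses only axioms (1) to (4). We now show
that axiom (5), that is, the operation of drawing a tangent to a parabola
through a given point, ensures that $\mathcal{P}$ is closed under square roots.

Let $r > 0$, and consider the point $P \colon (0, -r/2)$ and the parabola
$\Gamma \colon x^2 = 2y$ in the complex plane. The tangent to $\Gamma$ through
$P$ meets the real axis at $Q \colon (\sqrt{r}/2, 0)$. If $r \in \mathcal{P}$ is
a constructible point, then $P$ and the focus and directrix of $\Gamma$ are all
constructible, so the tangent is constructible as well, and hence
$\sqrt{r} \in \mathcal{P}$. Taking the square root of a complex number only
involves bisecting the argument and taking the square root of the modulus, so
$\mathcal{P}$ is closed under square roots.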
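The position of the point $Q$ can be checked by a short computation (added here as a worked step; the parameter $t$ for the point of tangency is introduced only for this purpose):

```latex
\text{Tangent to } \Gamma : x^2 = 2y \text{ at } (t, t^2/2): \qquad y = t\,x - \tfrac{1}{2}t^2 .
\\
\text{It passes through } P = (0, -r/2) \iff -\tfrac{r}{2} = -\tfrac{1}{2}t^2 \iff t = \pm\sqrt{r} .
\\
\text{For } t = \sqrt{r}: \quad 0 = \sqrt{r}\,x - \tfrac{r}{2} \implies x = \frac{\sqrt{r}}{2},
\quad\text{so } Q = \Bigl(\frac{\sqrt{r}}{2},\, 0\Bigr) .
```

Doubling the real number $\sqrt{r}/2$ is a field operation, which is why constructibility of $Q$ gives $\sqrt{r} \in \mathcal{P}$.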
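Let $\mathcal{M} \subset \mathbb{C}$ be the smallest subfield of $\mathbb{C}$
that contains $T$ and is closed under conjugation and square roots. The
preceding analysis shows that $\mathcal{M} \subset \mathcal{P}$. Since
$(\mathcal{P}, \mathcal{L})$ is the smallest Euclid construction containing $T$,
to prove $\mathcal{P} = \mathcal{M}$ it suffices to show that $\mathcal{M}$ is
the set of constructible points of some Euclid construction: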

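Proposition 2.5. Let $\mathcal{M} \subset \mathbb{C}$ be the smallest subfield
of $\mathbb{C}$ that contains $T$ and is closed under conjugation and square
roots, and let $\mathcal{S}$ be the set of all lines defined by linear equations
in two variables with coefficients in $\mathcal{M}_r$. Then
$(\mathcal{M}, \mathcal{S})$ is a Euclid construction.

Proof. Since a field is closed under the four arithmetic operations, it is easy
to see that $(\mathcal{M}, \mathcal{S})$ satisfies axioms (1) to (3). For axiom
(4), take two intersecting lines in $\mathcal{S}$, translate their intersection
point to the origin, and let the inclination angles of the two lines be
$\theta, \eta \in [0, \pi]$. Let the line with inclination $\theta$ have the
equation

$$ax + by = 0 \quad (a, b \in \mathcal{M}_r).$$

Since $\mathcal{M}$ is closed under square roots,
$\sqrt{a^2 + b^2} \in \mathcal{M}_r$. Moreover,

$$\cos\theta = \pm\frac{b}{\sqrt{a^2 + b^2}}, \qquad \sin\theta = \pm\frac{a}{\sqrt{a^2 + b^2}},$$

so $\sin\theta, \cos\theta \in \mathcal{M}_r$. Similarly,
$\sin\eta, \cos\eta \in \mathcal{M}_r$. The coefficients

$$\sin\Bigl(\frac{\theta \pm \eta}{2}\Bigr), \qquad \cos\Bigl(\frac{\theta \pm \eta}{2}\Bigr)$$

of the equations of the angle bisectors can be obtained from the trigonometric
functions of $\theta$ and $\eta$ by the four arithmetic operations and square
roots, so the angle bisectors are constructible. Finally, we leave it to the
reader to verify axiom (5); hence $(\mathcal{M}, \mathcal{S})$ is a Euclid
construction. ◻

From Proposition 2.5 we obtain $\mathcal{P} = \mathcal{M}$. It is also easy to
check that $\mathcal{S}$ is contained in $\mathcal{L}$, hence
$\mathcal{S} = \mathcal{L}$, and Theorem 2.3 holds. Interestingly, Theorem 2.3
shows that origami axioms (1) to (5) are in fact equivalent to ruler-and-compass
constructions, which is not obvious from the axioms themselves.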

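3 The sixth axiom
Axiom (6) is the least trivial of the origami axioms; it takes origami beyond
the limits of ruler-and-compass constructions.

Definition 3.1. We call $(\mathcal{P}, \mathcal{L})$ an Origami construction if
it satisfies axioms (1) to (6).

As in the previous section, let $T \subset \mathbb{C}$ with
$\{0, 1\} \subset T$. We can likewise define the Origami construction
“generated” by $T$.

Theorem 3.2. Let $(\mathcal{P}, \mathcal{L})$ be the Origami construction
generated by $T \subset \mathbb{C}$. Then $\mathcal{P}$ is the smallest subfield
of $\mathbb{C}$ that contains $T$ and is closed under conjugation, square roots
and cube roots, and $\mathcal{L}$ is the set of all lines with coefficients in
$\mathcal{P}_r$.

Let $\mathcal{M} \subset \mathbb{C}$ be the smallest subfield of $\mathbb{C}$
that contains $T$ and is closed under conjugation, square roots and cube roots,
and let $\mathcal{S}$ be the set of all lines with coefficients in
$\mathcal{M}_r$. Then we only need to prove that
$(\mathcal{P}, \mathcal{L}) = (\mathcal{M}, \mathcal{S})$. By the previous
section, $\mathcal{P}$ is a subfield closed under conjugation and square roots,
so it remains to show:

• $\mathcal{P}$ is closed under cube roots;

• $(\mathcal{M}, \mathcal{S})$ is indeed an Origami construction.

Axiom (6) ensures the first point. Consider a common tangent $m$ of the
parabolas

$$\Gamma_1 \colon \Bigl(y - \frac{1}{2}p\Bigr)^2 = 2qx, \qquad \Gamma_2 \colon y = \frac{1}{2}x^2.$$

When $p, q \in \mathcal{P}_r$, the line $m$ is constructible, and its slope
$k \in \mathcal{P}_r$ satisfies the equation $k^3 + pk + q = 0$. Taking the cube
root of a complex number involves trisecting the argument and taking the cube
root of the modulus, and both can be done by solving cubic equations with real
coefficients.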
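The cubic equation for the slope $k$ of the common tangent can be derived in a few lines (a worked computation added here; the intercept $c$ is a symbol introduced only for this derivation, and $k \ne 0$, $q \ne 0$ are assumed):

```latex
\text{Tangent to } \Gamma_2 : y = \tfrac{1}{2}x^2 \text{ with slope } k: \qquad y = kx - \tfrac{1}{2}k^2 .
\\
\text{A line } y = kx + c \text{ is tangent to } \Gamma_1 : \bigl(y - \tfrac{1}{2}p\bigr)^2 = 2qx
\iff c - \tfrac{1}{2}p = \frac{q}{2k}
\\
\text{(the discriminant of } (kx + c - \tfrac{1}{2}p)^2 = 2qx \text{ in } x \text{ vanishes).}
\\
\text{With } c = -\tfrac{1}{2}k^2: \qquad -\tfrac{1}{2}k^2 - \tfrac{1}{2}p = \frac{q}{2k}
\iff k^3 + pk + q = 0 .
```

So every real root of a depressed cubic with coefficients in $\mathcal{P}_r$ arises as the slope of a line produced by axiom (6).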
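We now verify the second point. By Proposition 2.5, $(\mathcal{M}, \mathcal{S})$
satisfies axioms (1) to (5), so it remains to verify that it satisfies axiom
(6). In fact a stronger statement holds: the common tangents of two conics with
coefficients in $\mathcal{M}_r$ lie in $\mathcal{S}$. We first recall some
analytic geometry.
Let $A$ be a real symmetric $3 \times 3$ matrix, and write
$X = (x, y, 1)^t \in \mathbb{R}^3$. When $A$ is invertible, the locus of the
solutions $(x, y)$ of the quadratic equation

$$X^t A X = 0$$

is a conic $\Gamma_A$. The tangent line at a point $(x_0, y_0)$ of $\Gamma_A$
has the equation

$$X_0^t A X = 0,$$

where $X_0 = (x_0, y_0, 1)^t$ and $X = (x, y, 1)^t$.

Proposition 3.3. Let $A, B$ be invertible symmetric $3 \times 3$ matrices with
entries in $\mathcal{M}_r$, and let $m$ be a line in the complex plane. If $m$
is a common tangent of $\Gamma_A$ and $\Gamma_B$, then $m \in \mathcal{S}$.

Proof. Let the points of tangency of $m$ with $\Gamma_A$ and $\Gamma_B$ be
$X = (x, y, 1)^t$ and $Y = (z, w, 1)^t$, respectively. By the form of the
tangent equation, there exists $\lambda \ne 0$ such that
$X^t A = \lambda Y^t B$. Moreover, $X$ and $Y$ satisfy the equations of the
conics:

$$\begin{cases} X^t A X = 0; \\ Y^t B Y = 0. \end{cases}$$

Eliminating $Y$, we obtain two quadratic equations satisfied by $(x, y)$:

$$\begin{cases} X^t A X = 0; & (*) \\ X^t A B^{-1} A X = 0. & (**) \end{cases}$$

Viewing $(*)$ and $(**)$ as quadratic polynomials in $x$ and $y$, their
resultant $R(y)$ with respect to $x$ is a polynomial in $y$ of degree at most
four. If $R(y)$ is identically $0$, the reader may verify that $\Gamma_A$ and
$\Gamma_B$ coincide², which leads to a contradiction. Since the coefficients of
$R(y)$ all lie in $\mathcal{M}_r$, the formula for the roots of a quartic
equation gives $y \in \mathcal{M}_r$. Solving equation $(*)$ gives
$x \in \mathcal{M}_r$. Hence the coefficients $X^t A$ of $m$ all lie in
$\mathcal{M}_r$, which proves the proposition. ◻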

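This completes the proof of Theorem 3.2. We now present two concrete origami
problems: trisecting an angle and folding regular polygons.

4 Trisecting an angle by origami
Given an angle formed by two rays in the plane, it cannot always be trisected
with straightedge and compass alone. This is because trisecting an angle in
effect involves taking cube roots:

$$\cos(3\theta) = -3\cos(\theta) + 4\cos^3(\theta),$$

where $3\theta$ is the given angle. Since axiom (6) makes it possible to take
cube roots, Theorem 3.2 shows that trisecting an angle by origami can be done.
One method is given below [Hul96]; its proof is a simple argument with similar
triangles.

Figure 3: Trisecting an angle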

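5 Folding a regular $N$-gon
Folding a five-pointed star from a strip of paper brings back fond childhood
memories for many people. It is natural to ask: for which $N$ can a regular
$N$-gon be folded?
In the discussion so far we have tacitly assumed that the Origami construction
$(\mathcal{P}, \mathcal{L})$ generated by $T \subset \mathbb{C}$ consists
precisely of the points and lines that can be folded in practice. In fact, the
notion of “generation” was defined through intersections of sets; this was done
only for the sake of a concise mathematical formulation, and need not match
practice.
To describe how the set of constructible points changes in the course of
folding, we introduce the notion of a radical tower:

Definition 5.1. Let $F$ be a subfield of the complex numbers. If an extension
$F(u_1, \dots, u_n)$ of $F$ satisfies the condition that for each
$i \in \{1, 2, \dots, n\}$ we have $u_i^2 \in F(u_1, \dots, u_{i-1})$ or
$u_i^3 \in F(u_1, \dots, u_{i-1})$, then $F(u_1, \dots, u_n)$ is called a radical
tower over $F$.

² Five points in the plane, no three of which are collinear, determine a unique conic.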

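Lemma 5.2. Let $T \subset \mathbb{C}$ with $\{0, 1\} \subset T$. Let $F$ be the
subfield of $\mathbb{C}$ generated by the elements of $T$ and their conjugates,
and let $(\mathcal{P}, \mathcal{L})$ be the Origami construction generated by
$T$. Then the union of all radical towers over $F$ is $\mathcal{P}$.

Proof. By Theorem 3.2, every radical tower over $F$ is contained in
$\mathcal{P}$. On the other hand, the union of all radical towers over $F$ is a
subfield of $\mathbb{C}$ that is closed under conjugation, square roots and cube
roots and contains $T$, so the lemma follows. ◻

Since every point of a radical tower can be obtained by finitely many folding
operations, $\mathcal{P}$ is indeed the set of constructible points that can be
obtained from $T$ in finitely many steps. From now on let $T = \{0, 1\}$, so
that $F = \mathbb{Q}$. We consider whether a regular $N$-gon can be folded when
two points are given.

Proposition 5.3. Suppose two points $0, 1$ are given on the paper. If
$z \in \mathbb{C}$ can be obtained by origami, then $z$ is algebraic over
$\mathbb{Q}$, and the degree of its minimal polynomial has the form
$2^\alpha 3^\beta$, where $\alpha, \beta$ are non-negative integers.

Proof. By Lemma 5.2, $z$ lies in some radical tower
$\mathbb{Q}(u_1, \dots, u_n)$ over $\mathbb{Q}$. By the definition of a radical
tower, $[\mathbb{Q}(u_1, \dots, u_i) : \mathbb{Q}(u_1, \dots, u_{i-1})] \le 3$.
Hence the only possible prime factors of
$[\mathbb{Q}(u_1, \dots, u_n) : \mathbb{Q}]$ are $2$ and $3$, so
$[\mathbb{Q}(z) : \mathbb{Q}]$ has the form $2^\alpha 3^\beta$, where
$\alpha, \beta$ are non-negative integers. ◻

Starting from Proposition 5.3 and using the Galois correspondence and facts
about cyclotomic fields, one obtains the final characterization of the regular
$N$-gons that can be folded:

Theorem 5.4. A regular $N$-gon can be obtained by origami if and only if
$N = 2^\alpha 3^\beta p_1 \cdots p_r$, where $\alpha, \beta \ge 0$ and the $p_i$
are distinct primes of the form $2^a 3^b + 1$.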
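The criterion of Theorem 5.4 is easy to evaluate for small $N$. The following Python sketch tests the stated condition by trial division; the function names `strip_factor`, `is_three_smooth` and `origami_constructible` are hypothetical, and the code only checks the arithmetic condition on $N$, it does not produce a folding sequence.

```python
def strip_factor(m, q):
    """Divide m by q as many times as possible."""
    while m % q == 0:
        m //= q
    return m


def is_three_smooth(m):
    """True if m has the form 2^a * 3^b."""
    return strip_factor(strip_factor(m, 2), 3) == 1


def origami_constructible(n):
    """Check whether n = 2^a 3^b p_1 ... p_r with distinct primes p_i
    such that each p_i - 1 has the form 2^a 3^b."""
    if n < 3:
        raise ValueError("a polygon needs at least 3 sides")
    m = strip_factor(strip_factor(n, 2), 3)   # powers of 2 and 3 are free
    p = 5
    while p * p <= m:
        if m % p == 0:                        # p is the smallest prime factor left
            m //= p
            if m % p == 0 or not is_three_smooth(p - 1):
                return False                  # repeated prime, or p - 1 not 2^a 3^b
        p += 1
    return m == 1 or is_three_smooth(m - 1)   # m is 1 or the last prime factor


not_foldable = [n for n in range(3, 31) if not origami_constructible(n)]
print(not_foldable)
```

For comparison, the analogous ruler-and-compass criterion allows only powers of $2$ times distinct Fermat primes, so $N = 7, 9, 13, 19$ are examples that this test accepts although they are not constructible with ruler and compass.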

The details of the proof of Theorem 5.4 can be found in [Vid97]. To conclude
this article, consider the following example: the regular heptagon cannot be
constructed with ruler and compass, but since $7 = 2 \times 3 + 1$, it can be
obtained by origami. A concrete, if somewhat tedious, construction is as
follows:

Figure 4: Folding a regular heptagon

References
[Hul96] Thomas Hull (1996). “A Note on ‘Impossible’ Paper Folding”. The American
Mathematical Monthly 103 (3), 240–241.
[Vid97] Carlos R. Videla (1997). “On Points Constructible from Conics”.
Mathematical Intelligencer 19 (2), 53–57.
[毛天一 08] 毛天一, 石权, 林洁, 王芝兰 (2008). “折纸的代数结构”. 荷思 (2), 4–14.
A Game About Competing Area

尚鉴桥1

ABSTRACT
This article studies a problem from daily life: KFC and McDonald’s each want
to open 𝑛 restaurants in a city. McDonald’s opens its restaurants first, and
KFC then opens its restaurants knowing the locations chosen by McDonald’s. If
everyone goes to the nearest restaurant, can McDonald’s make sure that it gets
at least half of the customers?

Contents

1 Introduction

2 Lemmas

3 The Situation of a Square

4 Two Theorems when 𝑻 or 𝒏 is Fixed

5 Summary

1 Introduction
Before the formal discussion, let us consider a question from daily life: why
are McDonald’s and KFC so often next to each other on a street? To answer it,
we set up a mathematical model. Assume that the population density along the
street is constant and that everyone chooses the closest restaurant for a meal
(if the two restaurants are at the same point, people choose one at random).
Suppose you are the owner of McDonald’s and you open your restaurant before KFC
does. If you want to get as many customers as you can, which place will you
choose?
As the picture below shows, suppose McDonald’s and KFC open their restaurants
at 𝐴 and 𝐵, and let 𝐶 be the midpoint of the segment 𝐴𝐵. It is easy to prove
that the customers on one side of 𝐶 go to McDonald’s and the others go to KFC.
So if you do not want fewer customers than your competitor, you should choose
the midpoint of the street; otherwise he could open his restaurant at the
midpoint of the street and get more than half of the customers.

1 Mathematics Class 80, Department of Mathematical Sciences, Tsinghua University.

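Figure: a street with the points 𝐴, 𝐶, 𝐵 marked in this order.

If you want to open 𝑛 restaurants on the street, it can be proved that the best
choice is to divide the street into 𝑛 equal parts and choose the midpoint of
each part.

Figure: a street with the points 𝐴1, 𝐵1, 𝐴2, 𝐵2, 𝐵3, 𝐴3, 𝐴4, 𝐵4 marked in this order.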

So we naturally want to know what happens if we replace the street by a whole
city. Can the first player always make sure that he gets at least half of the
customers?
We restate the question as follows.
For a convex set 𝑇 in ℝ2 and a positive integer 𝑛, can we find a set
𝐴 = {𝐴1 , 𝐴2 , ..., 𝐴𝑛 } of 𝑛 points in 𝑇 which satisfies the following
condition? For any set 𝐵 = {𝐵1 , 𝐵2 , ..., 𝐵𝑛 } of 𝑛 points in the plane, we
color 𝑇 with two colors: for a point 𝑠 in 𝑇 , find the closest point to it in
𝐴 ∪ 𝐵; if it is a point of 𝐴, color 𝑠 red, otherwise color it green. The
condition is that the area of the red part is never less than the area of the
green part (here the metric is always the Euclidean metric).
In this article we try to find all pairs (𝑇 , 𝑛) which meet this requirement.
We first give some lemmas in the second section. Then, in the third section, we
consider a special case (𝑇 is a square) as an example. In the fourth section we
give two strong conclusions which settle the case when 𝑛 is big enough.
We always use capital letters for points and domains and lower-case letters for
lines, and we write |𝐴𝐵| ∶= 𝑑(𝐴, 𝐵). For a domain 𝑇 we always use 𝑆𝑇 to
denote the area of 𝑇 .

2 Lemmas
In this section we give some simple but useful lemmas for this problem.
Definition 1. For any point 𝑃 ∈ 𝐴, we call the set
$\mathbb{T}_P := \{K : K \in T,\ |KP| = \min_{1 \le i \le n} |KA_i|\}$ the domain
of 𝑃 , and we call 𝑃 the origin point of 𝕋𝑃 .

Definition 2. If 𝑇 and 𝑛 satisfy the condition in the question, we call the
pair (𝑇 , 𝑛) a good pair.

In this section we only consider the properties of good pairs.

Lemma 1. The common boundary of two adjacent domains lies on the perpendicular
bisector of the two origin points they correspond to.

This lemma follows from the definition of a domain.

Lemma 2. Domains are all convex.

Figure: the points 𝐴1, 𝑄, 𝑅 and 𝐴2.

Proof. If 𝕋𝐴1 is not convex, we may assume that ∠𝑃 𝑄𝑅 is greater than 𝜋,
where 𝑃 , 𝑄, 𝑅 are consecutive vertices of 𝕋𝐴1 . Because 𝑇 is convex, neither
𝑃 𝑄 nor 𝑄𝑅 can lie on a side of 𝑇 . So the reflection of 𝐴1 in 𝑄𝑅 must be
another origin point, which we call 𝐴2 . Since ∠𝑃 𝑄𝑅 > 𝜋, the length |𝑃 𝐴1 |
is larger than |𝑃 𝐴2 |. So 𝑃 cannot lie on the boundary of 𝕋𝐴1 , which is a
contradiction. ◻

There are some lemmas about the area of each domain. For convenience, we
sometimes write $S_i$ for $S_{\mathbb{T}_i}$ and $S_{A_i}$ for
$S_{\mathbb{T}_{A_i}}$.

Lemma 3. For any domain $\mathbb{T}_{A_i}$ and any $\epsilon > 0$, we can choose
a point as one of the points of 𝐵 so that it turns almost half of the area of
$\mathbb{T}_{A_i}$ green (that is, it turns a region of area at least
$\frac{S_{\mathbb{T}_{A_i}}}{2} - \epsilon$ green). We can also choose two points
𝐵1 and 𝐵2 so that together they turn almost the whole of $\mathbb{T}_{A_i}$
green.
Figure: the lines 𝑘 and 𝑙 through 𝐴1 and the points 𝐵1, 𝐵2.

Proof. Let 𝑙 be any line passing through 𝐴1 , and let 𝑘 be the line through 𝐴1
perpendicular to 𝑙. The line 𝑙 divides 𝕋𝐴1 into two parts 𝕋1 , 𝕋2 (we assume
𝑆1 ≥ 𝑆2 ). Let 𝑑 be the diameter of 𝕋𝐴1 , and choose a point 𝐵1 on 𝑘 in 𝕋1
such that $|B_1A_1| \le \frac{\epsilon}{2d}$. This point meets the first
requirement. Then choose another point 𝐵2 on 𝑘 in 𝕋2 such that
$|B_2A_1| \le \frac{\epsilon}{2d}$. The two points 𝐵1 and 𝐵2 meet the second
requirement. ◻

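Corollary 1. The areas of all the domains are equal.

Proof. Suppose that $S_{A_1} \ge S_{A_2} \ge \dots \ge S_{A_n}$. By Lemma 3, for
every $\epsilon$ we can choose two points in $\mathbb{T}_{A_1}$, one point in
$\mathbb{T}_{A_2}$, one point in $\mathbb{T}_{A_3}$, and so on up to one point in
$\mathbb{T}_{A_{n-1}}$, so that these 𝑛 points turn a region of area at least
$S_{A_1} + \frac{S_{A_2} + \dots + S_{A_{n-1}}}{2} - \epsilon$ green. Since the
green area is at most half of the total area, we get that
$$S_{A_1} + \frac{S_{A_2} + \dots + S_{A_{n-1}}}{2} - \epsilon \le \frac{S_{A_1} + \dots + S_{A_n}}{2}$$
is true for every $\epsilon$. So $S_{A_1} \le S_{A_n}$, hence
$S_{A_1} = S_{A_n}$, and we get the corollary. ◻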

Lemma 4. Every line through the origin point of a domain divides the domain
into two parts of equal area.

Proof. Using the notation of Lemma 3, for each line passing through 𝐴𝑖 , choose
a point 𝐵 which turns a region of area at least $S_1 - \epsilon$ green, that
is, $\frac{S_1 - S_2}{2} - \epsilon$ more than half of the domain, and choose
the other 𝑛 − 1 points so that they turn almost half of the remaining area
green. Since the green area is at most half of the total area, we get
𝑆1 ≤ 𝑆2 , so 𝑆1 = 𝑆2 . ◻

We can get the most useful conclusion from Lemma 4.

Lemma 5. All the domains are centrally symmetric.

Figure: a line through 𝐴1 meeting the boundary at 𝐵1 and 𝐵2, and the rotated
line meeting it at 𝐶1 and 𝐶2.

Proof. Each line passing through 𝐴1 intersects the boundary of the domain at
two points 𝐵1 and 𝐵2 . If $|A_1B_1| > |A_1B_2|$, we rotate the line by a small
angle 𝜃 so that the two new points of intersection 𝐶1 and 𝐶2 satisfy
$|A_1C_1| \ge |A_1C_2|$. Then
$$S_{A_1B_1C_1} = \tfrac{1}{2}|A_1B_1||A_1C_1|\sin\theta > \tfrac{1}{2}|A_1B_2||A_1C_2|\sin\theta = S_{A_1B_2C_2}.$$
But Lemma 4 gives $S_{A_1B_1C_1} = S_{A_1B_2C_2}$, which is a contradiction.
Hence $|A_1B_1| = |A_1B_2|$ for every line, which implies that 𝕋𝐴1 is centrally
symmetric. ◻

Now we have all the tools we need. Next we discuss an example which uses these
tools.

3 The Situation of a Square

In this section we prove that if 𝑇 is a square, the only 𝑛 which makes (𝑇 , 𝑛)
a good pair is 1. This special case suggests how to approach the general
question. To achieve our goal, we first use Lemma 5 to determine what shapes
𝕋𝐴1 , 𝕋𝐴2 , ... can have, and then discuss the resulting situations in detail.
In the following discussion, unless otherwise specified, “lines” means the
segments along which two domains meet and “points” means the endpoints of these
segments.

Theorem 3.1. The lines which meet the boundary of the square are perpendicular
to the boundary.

We prove this with Lemma 5.

Figure: the side 𝐴𝐵 with the points 𝐶𝑘, 𝐶𝑘+1 on it, the lines 𝑗 and ℎ ending
at these points, and the points 𝐺, 𝐻 above.

Proof. We only consider one side 𝐴𝐵 of the square.

Let 𝐶1 , 𝐶2 , ..., 𝐶𝑛 be the points on 𝐴𝐵. For a line 𝑖 which has an endpoint
𝐶𝑖 on 𝐴𝐵, we denote the angle between 𝑖 and 𝐴𝐵 by ∠𝑖𝐶𝑖 𝐵.
We will show that for two lines 𝑖 and ℎ with endpoints 𝐶𝑘 and 𝐶𝑘+1 , we have
∠𝑖𝐴𝐵 ≤ ∠ℎ𝐴𝐵. Assume that 𝑖 is the rightmost line with endpoint 𝐶𝑘 and ℎ is
the leftmost line with endpoint 𝐶𝑘+1 (otherwise we first compare the rightmost
line with endpoint 𝐶𝑘 and the leftmost line with endpoint 𝐶𝑘+1 ). Then 𝑖 and ℎ
are sides of the same domain. Because this domain is centrally symmetric and
convex, we get that ∠𝑖𝐶𝑘 𝐶𝑘+1 + ∠ℎ𝐶𝑘+1 𝐶𝑘 ≤ 𝜋 for any 𝑘.
We know that the two sides of the square adjacent to 𝐴𝐵 are both perpendicular
to 𝐴𝐵, so equality must hold in each of these inequalities, which completes
the proof. ◻

By the theorem above, the domains along the sides must be rectangles. If we
draw the picture, we see that the whole diagram must look like a table, which
leads to the next theorem.

Theorem 3.2. If 𝐴1 , ..., 𝐴𝑛 satisfy the conditions, then the domains are
congruent rectangles.

Proof. First we show that the domains along one side are congruent rectangles
(Picture 1). For any two adjacent domains 𝕋𝐴1 and 𝕋𝐴2 , their common boundary
must lie on the perpendicular bisector of the two origin points 𝐴1 and 𝐴2 .
Thus 𝕋𝐴1 and 𝕋𝐴2 are congruent rectangles, since they have the same area. Then
we prove that all the domains are congruent rectangles (Picture 2). Consider a
rectangle 𝕋𝐴3 in a corner. The reflection 𝐴4 of 𝐴3 in its top side is another
origin point, and we can prove that 𝕋𝐴4 is also a rectangle by applying
Theorem 3.1 again to the top sides of all the domains along the bottom side of
the square. Then we can prove that 𝕋𝐴4 is congruent to 𝕋𝐴3 , and we can show
that there is a row of rectangles on top of the first row (Picture 3).
Repeating this operation, we find that all the domains are congruent
rectangles.

Figure: the origin points 𝐴1, 𝐴2, 𝐴3 in one row, with 𝐴4 above.

It remains to discuss the situation described by Theorem 3.2. If 𝑛 = 1, it is
clear that one can choose 𝐴 at the center of the square so that at least half
of the square stays red. Now we want to show that 𝑛 = 1 is the only good
situation.
If 𝑛 > 1, there are only two cases: either the table has at least two rows and
at least two columns, or it has only one row or only one column.

Figure: four rectangles with origin points 𝐴1, 𝐴2 (upper row) and 𝐴3, 𝐴4
(lower row), together with the points 𝐵1, 𝐵2, 𝐵3, 𝐵4.

For the first case, we consider four rectangles arranged as in the picture
above. Choose the midpoint of 𝐴1 𝐴3 as 𝐵1 , and choose 𝐵2 , 𝐵3 , 𝐵4 behind and
very close to 𝐴2 , 𝐴3 , 𝐴4 (by Lemma 3 we can choose them well enough). Then
𝐵1 turns at least $\frac{ab}{2} + \frac{a^3}{16b}$ green, where 𝑎 and 𝑏 are the
length and width of the rectangle, and 𝐵2 , 𝐵3 , 𝐵4 turn almost
$\frac{3ab}{2}$ green. Because $\frac{a^3}{16b}$ is a positive number and the
other points can turn $\frac{S_T - ab}{2} - \epsilon$ green, we get a
contradiction.

Figure: two adjacent rectangles with origin points 𝐴1, 𝐴2, together with the
points 𝑂, 𝐵1, 𝐵2, 𝐸, 𝐷, 𝑇 and 𝐷′.

For the second case, select two adjacent rectangles 𝕋𝐴1 , 𝕋𝐴2 . Because the
length 𝑎 of the rectangles is at least twice their width 𝑏, we can choose a
point 𝑂 on the perpendicular bisector of 𝐴1 𝐴2 such that 2|𝑂𝐴1 | < |𝑂𝑇 | by
taking |𝑂𝑇 | to be a suitable number (for example $\frac{5a}{9}$). Here 𝑇 is an
endpoint of the common side of the two domains. Let 𝐸 be the intersection of
the line 𝑂𝐴1 with the left side of 𝕋𝐴1 , and rotate 𝑂𝐸 counterclockwise by a
small angle 𝜃 to a segment 𝑂𝐷 with |𝑂𝐷| ≤ |𝑂𝑇 |. Find a point 𝐷′ on the
bottom side of 𝕋𝐴2 such that ∠𝑇 𝑂𝐷′ = 𝜃. Then 𝑆𝐸𝑂𝐷 ≤ 𝑆𝑇 𝑂𝐷′ . Choose 𝐵 to be
the reflection of 𝐴1 in 𝑂𝐷, and let 𝐹 be the lower left corner of 𝕋𝐴1 . Then
𝐵 turns the region 𝐷𝑂𝐷′ 𝐹 green. Comparing 𝑆𝑇 𝑂𝐷′ with 𝑆𝐸𝑂𝐷 , the area of
𝐷𝑂𝐷′ 𝐹 is more than half the area of a rectangle. Choose the other points
suitably so that they turn $\frac{S_T - ab}{2} - \epsilon$ green, and we get a
contradiction.

Now we have solved the case where 𝑇 is a square. It shows a basic idea for
solving the question, and we will see in the next section that this example is
more important than it seems: the idea of the first main theorem comes from
here, and it is a case which needs to be discussed in the second main theorem.

4 Two Theorems when 𝑇 or 𝑛 is Fixed


In this section two main theorems are stated.

Theorem 4.1. For a fixed 𝑛 there exists a 𝑇 which makes (𝑇 , 𝑛) a good pair.

Theorem 4.2. For a fixed 𝑇 and a large enough 𝑛, (𝑇 , 𝑛) is not a good pair.
The two theorems show that (𝑇 , 𝑛) may be good or bad as 𝑇 or 𝑛 changes.
Moreover, the proofs reveal another fact: the situation is better when a domain
does not “interfere” with another domain.

We prove the first theorem first, as it is easier.

Proof of Theorem 4.1. We just need to construct 𝑇 for each 𝑛.

Let 𝑇 be a rectangle of length 8𝑛 and width 2, and let 𝐴𝑘 be (8𝑘 − 4, 1) (we
take the lower left corner of 𝑇 as the origin of the coordinate system).
To prove that this example meets the requirement, we need to prove that each 𝐵
turns at most an area of measure 8 green.
Note that a point 𝐵 turns points in at most three different domains green. So
we argue according to the number of domains that 𝐵 changes.
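The bound on the green area can be explored numerically for this configuration. The following Python sketch estimates the green area on a grid of cell centers; it is only an approximation, the function name `green_area` and the grid step are choices made here, and points equidistant from the nearest red and green points are counted as red.

```python
import math


def green_area(width, height, reds, greens, step=0.1):
    """Approximate the area of the part of [0, width] x [0, height] that is
    strictly closer to some green point than to every red point."""
    nx, ny = round(width / step), round(height / step)
    area = 0.0
    for i in range(nx):
        for j in range(ny):
            x, y = (i + 0.5) * step, (j + 0.5) * step   # center of the cell
            d_red = min(math.hypot(x - a, y - b) for a, b in reds)
            d_green = min(math.hypot(x - a, y - b) for a, b in greens)
            if d_green < d_red:
                area += step * step
    return area


n = 2
A = [(8 * k - 4, 1) for k in range(1, n + 1)]   # A_k = (8k - 4, 1)

# One green point halfway between A_1 and A_2: the green strip is 6 < x < 10.
between = green_area(8 * n, 2, A, [(8, 1)])
# A green point far outside T takes nothing.
far_away = green_area(8 * n, 2, A, [(100, 100)])
print(between, far_away)
```

Trying other positions for the green point gives a quick sanity check of the claim before working through the case analysis, although a grid estimate cannot replace the proof.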
If 𝐵 changes two domains, we have two cases.

The first case: the area that 𝐵 changes is a trapezoid.

Figure: the points 𝑇, 𝐵, 𝐴1, 𝐸, 𝐴2 and 𝐷.

Label the points as shown in the picture above; then 𝐻𝐼 and 𝐿𝑀 are the
perpendicular bisectors of 𝐴1 𝐵 and 𝐴2 𝐵. We need to prove that 𝑆𝐻𝐼𝑀𝐿 ≤ 4.
Note that |𝐴1 𝐷| = |𝐵𝐷| ≥ |𝐷𝐸|. Then we get 𝑆𝐻𝐼𝐺𝐾 ≤ 𝑆𝐹 𝐻𝐼𝐽 . Similarly we
have 𝑆𝐺𝐾𝑀𝐿 ≤ 𝑆𝑀𝐿𝑁𝑂 , and the conclusion follows.

The second case: the area that 𝐵 changes is a triangle.

Figure: the points 𝐽, 𝐷, 𝐴1, 𝐴2, 𝐾, 𝐿, 𝐵, 𝐻, 𝐸, 𝐼 and 𝐹.

We may assume that the coordinates of the two origin points are (4, 1), (12, 1)
and that the coordinates of 𝐵 are (4 + 𝑎, 1 − 𝑏).
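$$\begin{aligned}
S_{DIF} &= \frac{1}{2}|DI||IF| = |DI| \times \frac{|DI|}{|k_{DF}|} \le 2|DI||k_{BA_2}| = 2|DE| \frac{a}{\sqrt{a^2+b^2}} \cdot \frac{b}{8-a} \\
&\le |DE| \frac{b}{2} \le |ED||KA_1| = 2S_{EDA_1} \le S_{EDA_1} + S_{LEA_1} \\
&\le S_{EDA_1} + S_{LHEA_1} = S_{HEDJ}
\end{aligned} \tag{4.0.1}$$

So $S_{DEF} \le S_{IJH}$.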

If 𝐵 changes three domains:

Figure: the quadrilateral with vertices 𝐴, 𝐵, 𝐶, 𝐷.

Let the coordinates of the three origin points be (4, 1), (12, 1), (20, 1) and
the coordinates of 𝐵 be (12 − 𝑎, 1 − 𝑏), where 𝑎 and 𝑏 are positive. We can
calculate the area of 𝐴𝐵𝐶𝐷:

$$\begin{aligned}
&\frac{4b - 2b^2 - 2a^2}{b} + \frac{1}{2k_{AB}}\Bigl(\frac{8a - a^2 + 2b - b^2}{2b}\Bigr)^2 + \frac{1}{2k_{CD}}\Bigl(\frac{-8a - a^2 + 2b - b^2}{2b}\Bigr)^2 \\
&\le \frac{4b - 2b^2 - 2a^2}{b} + \frac{b}{7}\Bigl(\frac{8a - a^2 + 2b - b^2}{2b}\Bigr)^2 + \frac{b}{7}\Bigl(\frac{-8a - a^2 + 2b - b^2}{2b}\Bigr)^2 \\
&\le 4 - 2b + \frac{b}{7}\Bigl(\frac{2b + 2b}{2b}\Bigr)^2 + \frac{b}{7}\Bigl(\frac{2b}{2b}\Bigr)^2 \\
&\le 4
\end{aligned} \tag{4.0.2}$$

In this inequality we use the facts that 4𝑎 ≤ 𝑏 and 𝑏 ≤ 1. The first
inequality can be proved from the geometric relation |𝐵𝐶| < |𝐶𝐴2 | (otherwise
𝐵 changes only two domains). From the fact that 𝐵 changes three domains we
also get the inequality 𝑎2 + 𝑏2 ≥ 8𝑎 − 2𝑏.
This finishes the proof of Theorem 4.1. ◻

Remark 1. In fact we have not discussed all the cases, such as the case
discussed at the end of Section 3. However, the other cases can be treated as
parts of the three cases discussed above.
Remark 2. It is clear that the length of the small rectangles is not the best
constant here. The best value of the ratio length ∶ width is around 2.
Now it is time to turn to our last main theorem. The main idea of the proof is
that when 𝑛 is big enough there must be many quadrilateral and hexagonal
domains, and we need to derive a contradiction from this.
First we prove that there cannot be quadrilaterals in most cases.
Theorem 4.3. If 𝐴1 , 𝐴2 , ..., 𝐴𝑛 satisfy the condition and one of the domains
𝕋𝐴𝑖 is a quadrilateral, then all the domains must be as in the case of Theorem
4.1 (a row of rectangles).
Proof. By Lemma 5 the quadrilateral must be a parallelogram. First we prove
that it must be a rectangle. Let 𝐴1 be the origin point of the quadrilateral
𝕋𝐴1 . The reflection of 𝐴1 in a side of 𝕋𝐴1 which is not on the boundary of 𝑇
must be another origin point, which we call 𝐴2 . By Lemma 4 and Lemma 5 we know
that 𝕋𝐴2 is also a quadrilateral. If 𝕋𝐴1 is not a rectangle, then two of its
angles are obtuse. We select the obtuse angle one of whose sides is also a side
of 𝕋𝐴2 and label its vertex 𝑆. Then we label the other points as in the
picture.

𝐵1
𝐴2
𝐻 𝐼
𝑈
𝑆 𝐵1
𝐾 𝐽
𝐴1

Call the intersection of 𝑈𝑆 and 𝐴1𝐴2 the point 𝐵. Here $|HS| = \frac{|ST|}{4}$, and then 𝐵1 changes the black area into green. Consider the disk 𝐷(𝑆, 𝑟) centered at 𝑆 with radius

$$r = \frac{|SA_2 - SB_1|}{2}.$$

The (purple) area of 𝐷(𝑆, 𝑟) ∩ (𝕋𝐴1 ∪ 𝕋𝐴2)ᶜ is also turned green by 𝐵, because for any point 𝑇 in 𝐷(𝑆, 𝑟) and any other origin point 𝐴 we have

$$|AT| \le |AS| - \frac{|SA_2 - SB_1|}{2} \le |SA_1| - \frac{|SA_2 - SB_1|}{2} \le |SB_1|.$$

Thus 𝐵 changes more than half (both the gray and the purple part) of the quadrilateral. Then we use the trick we have used many times to select the other 𝐵𝑖, and we get a contradiction.
If 𝕋𝐴1 is a rectangle, the trick used in section 3 shows that all domains are rectangles and that they form a table. Applying the proof in section 3 again, we see that the table has only one row (or one column), which gives the conclusion. ◻

Hence a quadrilateral can occur among 𝕋𝐴1, 𝕋𝐴2, ..., 𝕋𝐴𝑛 only if 𝑇 is a rectangle. However, when 𝑇 is a rectangle and 𝑛 is large enough, we are in a case from section 3, which turns out to be bad.

Next we prove that there must be many hexagons, using Euler's formula 𝑉 − 𝐸 + 𝐹 = 1.

Theorem 4.4. Divide a polygon 𝑇 into small polygons, each with an even number of edges. If there are no quadrilaterals, then the number of points on the edges of 𝑇, the number of polygons with at least eight edges, and the number of points in 𝑇 that are endpoints of more than four lines are all bounded by a constant 𝑐 depending only on 𝑇.

Remark 3. By lemma 6 each 𝑇𝑖 must be a polygon (we even know that it is centrally symmetric), so each of these polygons has an even number of edges. This is why theorem 4.4 is the natural statement to consider.

Proof. Let 𝑛 be the number of edges of 𝑇, 𝑝 the number of points on the edges of 𝑇, 𝑟 the number of points in the interior of 𝑇, ℎ the number of hexagons, 𝑡 the number of polygons with more than seven edges, 𝑙 the number of lines in 𝑇, and 𝑓 the number of points in 𝑇 that are endpoints of more than four lines. Then we have

$$
\begin{cases}
n - 2 + p + 2r \ge 4h + 6t, \\
h + t + p + r - l - p = 1, \\
l \ge \dfrac{p + 3r + f}{2}.
\end{cases}
$$

The first inequality is obtained by summing angles, the second follows from Euler's formula, and the third holds because every point inside 𝑇 has at least 3 lines from it and every point on the edges has at least one.
Solving this system of inequalities we get 𝑛 − 6 ≥ 𝑝 + 2𝑡 + 2𝑓, which gives the conclusion. ◻
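For readers who want the elimination spelled out, here is one possible way to carry it out. It is only a sketch, using nothing beyond the three relations of the proof, with the Euler relation simplified to $h + t + r - l = 1$.

```latex
\begin{align*}
h + t &= 1 + l - r
  && \text{(Euler relation)} \\
n - 2 + p + 2r &\ge 4(h + t) + 2t = 4 + 4l - 4r + 2t
  && \text{(first inequality)} \\
n - 6 + p + 6r &\ge 4l + 2t \ge 2(p + 3r + f) + 2t
  && \text{(third inequality)} \\
n - 6 &\ge p + 2t + 2f .
\end{align*}
```

Since 𝑝, 𝑡 and 𝑓 are nonnegative and 𝑛 depends only on 𝑇, each of them is at most 𝑛 − 6.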

This theorem points the way to our goal: it tells us that there are enough hexagons which are not on the edges. Interestingly, a single hexagon does not lead to a contradiction directly, but a hexagon surrounded by other hexagons does.
By theorem 4.4, if 𝑛 is large enough we can select many adjacent hexagons, each of which is surrounded by six other hexagons and each of whose vertices is a vertex of exactly three domains.
Consider such a hexagon:

Theorem 4.5. Such a hexagon must be a diagonal parallel hexagon.

We now define diagonal parallel hexagons.

Definition 3. A diagonal parallel hexagon is a hexagon in which each main diagonal is parallel to the two opposite sides that it does not meet.

[Figure: a diagonal parallel hexagon 𝐴𝐵𝐶𝐷𝐸𝐹, with 𝐴𝐵 ∥ 𝐹𝐶 ∥ 𝐸𝐷, 𝐴𝐹 ∥ 𝐵𝐸 ∥ 𝐶𝐷 and 𝐹𝐸 ∥ 𝐴𝐷 ∥ 𝐵𝐶.]

We now give the proof. Because computing the areas exactly is somewhat complicated, the idea is to translate 𝐵 a small distance from 𝐴. For convenience we use different colors to mark different domains. We fix some notation: 𝒯(𝜖) denotes any function such that lim_{𝜖→0} 𝒯(𝜖)/𝜖 = 1, and 𝑜(𝜖) denotes a higher-order infinitesimal of 𝜖.

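[Figure: the hexagon with origin points 𝐴1, 𝐴2, 𝐴3, 𝐴4, the point 𝐵, and the labelled points 𝐺, 𝐻, 𝐼, 𝐽, 𝐾, 𝐿, 𝑀, 𝑁, 𝑂, 𝑃, 𝑄, 𝑅, 𝑆, 𝑇, 𝑈, 𝑉, 𝑊, 𝑋, 𝑌, 𝑍; the regions are colored yellow, black, orange, gray, purple and cyan.]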

Here is a local picture of points around 𝑈 .



[Figure: local picture near 𝑈, showing the points 𝑄, 𝑀 and 𝑈.]

Proof. First we describe the construction of the points in the picture. We translate 𝐵 an infinitesimal distance 𝜖 from 𝐴 along the line 𝐴𝑊, the perpendicular from 𝐴 to 𝑆𝑇. 𝐴2, 𝐴3 and 𝐴4 are the reflections of 𝐴 across 𝑈𝑇, 𝑇𝑆 and 𝑆𝑅. 𝐼𝐽, 𝑀𝑁 and 𝐾𝐿 are the perpendicular bisectors of 𝐴1𝐵, 𝐴1𝐴2 and 𝐵𝐴3. The intersection of 𝐵𝐴2 and 𝑀𝑁 is 𝑂, and 𝑃𝑄 is the line through 𝑂 perpendicular to 𝐵𝐴2.
𝐵 changes the orange, black, gray and purple parts, together with the symmetric counterpart of the gray and purple parts (call it 𝐷), into its own domain.
What we want to prove now is that 𝑆cyan + 𝑆gray + 𝑆purple + 𝑆𝐷 − 𝑆yellow = 𝑐𝒯(𝜖), where 𝑐 ≥ 0, with 𝑐 = 0 if and only if the hexagon is a diagonal parallel hexagon.
Notice that

$$S_{\text{yellow}} = \frac{|GH|}{2}\,\mathcal{T}(\epsilon), \qquad S_{\text{cyan}} = \frac{|TY|}{2}\,\mathcal{T}(\epsilon),$$

because $|A_1V| = |WX| = \frac{\epsilon}{2}$.
Now we calculate 𝑆gray + 𝑆purple. Let 𝑓𝑈𝑇 = |𝑇𝑆| − |𝑆𝑈| and let 𝜃 = 𝜋 − ∠𝑈𝑇𝑆. First we notice that

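$$
\begin{aligned}
S_{\text{gray}} + S_{\text{black}}
&= |UT|\,\bigl(|OR| + o(|OR|)\bigr) \\
&= |UT|\,\bigl(|SO|\,|\cos\angle SOR| + o(|SO|)\bigr) \\
&= |UT| \cos\theta \,\frac{\mathcal{T}(\epsilon)}{2}
\end{aligned}
\tag{4.0.3}
$$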

and

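$$
\begin{aligned}
S_{\text{purple}} - S_{\text{black}}
&= \mathcal{T}\Bigl(\tfrac12\,|NO^2 - MO^2|\,|\sin\angle PON|\Bigr) \\
&= \mathcal{T}\Bigl(\tfrac12\,|TS^2 - SU^2|\,\sin\angle A_1A_2B\Bigr) \\
&= \mathcal{T}\Bigl(f_{UT}\,|UT|\,\epsilon\,\frac{\sin\angle A_2A_1B}{A_1A_2}\Bigr).
\end{aligned}
\tag{4.0.4}
$$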

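Thus the area of the domain that 𝐵 changes is (here 𝜃′ denotes 𝜋 − ∠𝑅𝑆𝑇)

$$
\begin{aligned}
&\frac{S_{\text{hexagon}}}{2} + \frac{\mathcal{T}(\epsilon)}{2}\,|UT|\cos\theta + \mathcal{T}\Bigl(f_{UT}\,|UT|\,\epsilon\,\frac{\sin\angle A_2A_1B}{A_1A_2}\Bigr) + \frac{\mathcal{T}(\epsilon)}{2}\,|RS|\cos\theta' \\
&\qquad - \mathcal{T}\Bigl(f_{SR}\,|SR|\,\epsilon\,\frac{\sin\angle A_4A_1B}{A_4A_2}\Bigr) - \frac{|GH|}{2}\,\mathcal{T}(\epsilon) + \frac{|TS|}{2}\,\mathcal{T}(\epsilon) \\
&= \frac{S_{\text{hexagon}}}{2} + \mathcal{T}(\epsilon)\Bigl(\tfrac12\,|UT|\cos\theta + \tfrac12\,|RS|\cos\theta' + f_{UT}\,|UT|\,\frac{\sin\angle A_2A_1B}{A_1A_2} \\
&\qquad - f_{SR}\,|SR|\,\frac{\sin\angle A_4A_1B}{A_4A_2} + \frac{|TS|}{2} - \frac{|GH|}{2}\Bigr).
\end{aligned}
\tag{4.0.5}
$$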

This is the case in which 𝐵 moves a small distance 𝜖 along the perpendicular from 𝐴 to 𝑆𝑇; call this point 𝐵1. Similarly, we let 𝐵 move a distance 𝜖 along the perpendiculars to the other edges. Averaging over all these choices, the area of the domain that 𝐵 changes is

$$\frac{S_{\text{hexagon}}}{2} + \frac{1}{6}\,\mathcal{T}(\epsilon)\Bigl(\frac12\sum_{\text{cyc}}\bigl(|UT|\cos\theta + |RS|\cos\theta' + |TS| - |GH|\bigr)\Bigr).$$

Here $\sum_{\text{cyc}}$ denotes the cyclic sum over the edges of the hexagon.


We notice that |𝑈𝑇| cos 𝜃 + |𝑅𝑆| cos 𝜃′ + |𝑇𝑆| ≥ |𝐺𝐻|, with equality if and only if 𝐺𝐻 ∥ 𝑇𝑆. This can be proved geometrically. There are three cases; the first two are:

[Figure: the first two cases, each showing the points 𝑃, 𝑄, 𝐺, 𝐻, 𝑈, 𝑅, 𝐴1, 𝐿, 𝑇, 𝑆 and 𝐾.]

For the first case we know that

|𝑈 𝑇 | cos 𝜃 + |𝑅𝑆| cos 𝜃 ′ + |𝑇 𝑆| ≥ |𝐽 𝑇 | + |𝑇 𝑆| + |𝑆𝐾| ≥ |𝑇 𝑆|.

For the second case, writing LHS for the left-hand side |𝑈𝑇| cos 𝜃 + |𝑅𝑆| cos 𝜃′ + |𝑇𝑆|, we know that

LHS ≥ |𝐽𝑇| + |𝑇𝑆| + |𝑆𝐾| = |𝐽𝑊| + |𝑊𝑆| ≥ |𝐽𝑊| + |𝑊𝐾| = |𝑇𝑆|.

Hence |𝑈 𝑇 | cos 𝜃 + |𝑅𝑆| cos 𝜃 ′ + |𝑇 𝑆| − |𝐺𝐻| ≥ 0.



If equality failed in one of these inequalities, we could always find a 𝐵 in the hexagon that changes more than half of the area of the hexagon into its own domain. We would then use the trick we have used many times to select the other 𝐵𝑖. Thus equality holds in every case.
There is one case left: when 𝑈𝑇 ⟂ 𝑇𝑆, the projection of 𝐺𝐻 is the same as 𝑈𝑅. In that case, we can change the direction in which 𝐵 is moved from 𝐴 to get a contradiction. ◻

With such a strong conclusion in hand, we can select three adjacent hexagons and compute their angles to obtain a further conclusion.

Theorem 4.6. If 3 hexagons are adjacent, they are all regular hexagons.

[Figure: three adjacent hexagons, with the points 𝐴1, 𝐴2, 𝐴3, 𝐶, 𝐷 and 𝑇 labelled.]

Proof. We just need to calculate angles. In the equation below we use the facts that 𝐴1𝐵𝑇𝐶, 𝐴3𝐷𝑇𝐵 and 𝐴2𝐷𝑇𝐶 are parallelograms, and that 𝐴1 and 𝐴2, 𝐴2 and 𝐴3, and 𝐴3 and 𝐴1 are symmetric about 𝑇𝐶, 𝑇𝐷 and 𝑇𝐵 respectively.

∠𝑇 𝐴3 𝐵 = ∠𝐷𝑇 𝐴3 = 𝜋 − ∠𝐶𝑇 𝐵 = ∠𝑇 𝐵𝐴1 . (4.0.6)

Similarly we have ∠𝑇𝐴1𝐵 = ∠𝑇𝐵𝐴3, and hence ∠𝐵𝑇𝐴1 = ∠𝐵𝑇𝐴3. It follows that ∠𝐵𝑇𝐴1 = ∠𝐵𝑇𝐴3 = ∠𝐷𝑇𝐴3 = ∠𝐷𝑇𝐴2 = ∠𝐶𝑇𝐴2 = ∠𝐶𝑇𝐴1. Therefore ∠𝐵𝑇𝐴3 = ∠𝑇𝐵𝐴3 = 𝜋/3, so 𝕋𝐴3 is a regular hexagon. Similarly, 𝕋𝐴1 and 𝕋𝐴2 are regular hexagons. ◻

Consider four adjacent regular hexagons. The data in the following picture show that this configuration is not good.

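[Figure: four adjacent regular hexagons with origin points 𝐴1, 𝐴2, 𝐴3, 𝐴4, the point 𝐵, and the region 𝑂𝑃𝑄𝑅. Data: 𝐴1 = (0.500, 0.866), 𝐵 = (0.404, 2.121), 𝑆hexagon = 2.598, 𝑆𝑂𝑃𝑄𝑅 = 1.309, 𝑆𝑂𝑃𝑄𝑅 / 𝑆hexagon = 0.504.]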

We can choose 𝐵 = (0.404, 2.121) so that it changes more than half of a hexagon, so this situation is also not good.
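The figures quoted for this configuration can be sanity-checked numerically. The sketch below assumes the hexagons have side length 1 (an assumption inferred from the reported area 2.598, not stated in the text) and takes the area of 𝑂𝑃𝑄𝑅 as reported rather than recomputing it; `polygon_area` is a helper introduced only for this illustration.

```python
import math

def polygon_area(vertices):
    """Shoelace formula for a simple polygon given as (x, y) pairs."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

# Assumption: the regular hexagons in the figure have side length 1.
hexagon = [(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)) for k in range(6)]
hexagon_area = polygon_area(hexagon)

# Area of the region OPQR as reported in the figure (not recomputed here).
reported_changed_area = 1.309
ratio = reported_changed_area / hexagon_area

print(round(hexagon_area, 3))  # 2.598
print(round(ratio, 3))         # 0.504
```

Under this assumption the ratio exceeds one half, which is what the argument needs.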

This completes the proofs of all the theorems and gives some conclusions about the question. Unfortunately, we have not solved the problem completely; we give some ideas about the remaining question in the next section.

5 Summary
First we recall the conclusions obtained so far. For the initial question, (square, 𝑛) is good if and only if 𝑛 = 1. For any 𝑛 there exists 𝑇 such that (𝑇, 𝑛) is good. For any 𝑇, (𝑇, 𝑛) is bad whenever 𝑛 is large enough.
It turns out to be hard to construct another good pair (𝑇, 𝑛) when 𝑛 is more than 2 and 𝑇 is not a rectangle (except those in theorem 4.1). The difficulty is that a convex polygon cannot easily be divided into centrally symmetric polygons with more than five edges. Moreover, when a polygon is surrounded by other polygons, we need it to be an area-equivalent polygon: one that is diagonal parallel (for polygons with 4𝑛 + 2 edges), or one in which every edge has two edges perpendicular to it (for polygons with 4𝑛 edges). It is hard to achieve all of this at once. One may conjecture that there are no good pairs other than those in theorem 4.1. On the other hand, since we no longer have the condition that 𝑛 is large, it is not easy to treat this problem locally. One possible approach is to prove that a polygon cannot be divided into such polygons, for example by considering the lines between origin points and using lengths and angles to calculate areas. It is also possible to compute integrals over a suitable region around an origin point to prove the existence of the 𝐵𝑖. However, both approaches require nontrivial calculation.

As a generalization of the question, we can replace 𝑇 by a Riemannian manifold of constant curvature, for example 𝑆² with its canonical metric. It is easy to see that (𝑆², 1) and (𝑆², 2) are good but (𝑆², 3) is not. However, (𝑆², 4) is good again. The question on Riemannian manifolds of constant curvature may be much more complicated, and it may be connected with their isometry groups. The constant curvature condition is needed here because the perpendicular bisector is well defined in this case.
To conclude, what we have proved shows that in such a game the second player has the advantage most of the time. As an easy corollary, even if the two players choose different numbers of points, the average area of the second player is almost always larger than that of the first. In situations of this kind, knowing the other player's position is an important advantage.
Alternating Sign Matrices and Integrable Systems
From a Xinlingjun (新领军) Test Problem to Modern Mathematics

Xu Kai (徐凯)1

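ABSTRACT
In this short note we start from the problem of counting alternating sign matrices, reformulate it as the computation of a state sum in the six-vertex model, and obtain an explicit solution by means of the Yang–Baxter equation and integrability.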

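The following problem appeared in the 2021 Xinlingjun (新领军) comprehensive test of Tsinghua University:

Problem. A matrix is required to satisfy: (1) its entries are only −1, 0 and 1; (2) no row or column is identically 0; (3) in every row and every column, after deleting all the 0s, the remaining entries have the form 1, −1, 1, … , −1, 1. Given that there are 7 such matrices of order three, how many are there of order four?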
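For small orders the three conditions can be checked by exhaustive search. The following is only a brute-force sketch for illustration, not the method developed in this note; the function names `is_alternating` and `count_asm` are introduced here and do not come from the text.

```python
from itertools import product

def is_alternating(line):
    """True if the nonzero entries of `line` read 1, -1, 1, ..., -1, 1."""
    nonzero = [v for v in line if v != 0]
    # Conditions (2) and (3): not all zero, and an odd number of nonzero entries.
    if not nonzero or len(nonzero) % 2 == 0:
        return False
    return all(v == (1 if i % 2 == 0 else -1) for i, v in enumerate(nonzero))

def count_asm(n):
    """Count n x n matrices satisfying conditions (1)-(3) by exhaustive search."""
    valid_rows = [row for row in product((-1, 0, 1), repeat=n) if is_alternating(row)]
    count = 0
    for rows in product(valid_rows, repeat=n):
        # zip(*rows) iterates over the columns of the candidate matrix.
        if all(is_alternating(col) for col in zip(*rows)):
            count += 1
    return count

print(count_asm(3))  # 7, as stated in the problem
print(count_asm(4))  # 42
```

The search space grows very quickly with 𝑛, which is one reason a closed formula is of interest.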

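Although the statement of this problem is completely elementary, a very rich structure lies behind it, and it offers secondary-school students a rare opportunity to encounter the depth of modern mathematics. This note attempts a brief analysis in the hope of prompting better ones. The main proof presented here is due to G. Kuperberg.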

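1 From alternating sign matrices to the six-vertex model
We first restate the problem of counting alternating sign matrices, reducing it to the six-vertex model of statistical mechanics.
In the six-vertex model we consider square-lattice graphs in the plane in which every edge carries an orientation and every vertex has exactly two incoming and two outgoing edges; each such graph is called a state. Given a state, there are six possible orientations near each vertex: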

[Figure: the six possible orientations at a vertex, labelled 𝑎, 𝑏, 𝑐, 𝑑, 𝑒, 𝑓.]
1 Formerly of Class 41 (数 41 班), Department of Mathematics, Tsinghua University.

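We assign a weight 𝑤(𝑖) to each possibility 𝑖. The weight of a graph is defined as the product of the weights of its vertices, and the sum of the weights of all states (satisfying given conditions) is called the state sum. Boundary conditions can also be imposed in the six-vertex model, by introducing vertices attached to only one edge, with a prescribed orientation. In particular, we can consider an 𝑛 × 𝑛 square grid with the boundary condition that the edges on the left and right sides point inward and those on the top and bottom point outward; this model is called square ice. A square ice state can be converted into an alternating sign matrix by the following correspondence (note that the values −1, 0, 1 here have nothing to do with the weights):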

[Figure: the six vertex orientations with their corresponding matrix entries 1, −1, 0, 0, 0, 0.]

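One can check that this is a one-to-one correspondence. Hence counting alternating sign matrices is equivalent to computing the state sum of the six-vertex model with all weights equal to 1, so it suffices to solve the six-vertex model. The six-vertex model can be solved exactly, and the key structure involved is the Yang–Baxter equation (YBE for short).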
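The equality between the number of square ice states and the number of alternating sign matrices can be checked by direct enumeration for small 𝑛. The sketch below is a brute-force illustration only; the encoding of edge orientations by ±1 and the function name `count_square_ice` are choices made here, not taken from the text.

```python
from itertools import product

def count_square_ice(n):
    """Count six-vertex states on an n x n grid with the boundary condition
    'left and right edges point inward, top and bottom edges point outward'.

    h[i][j] = +1 means the j-th horizontal edge in row i points right;
    v[i][j] = +1 means the i-th vertical edge in column j points down.
    """
    inner_h = [(i, j) for i in range(n) for j in range(1, n)]
    inner_v = [(i, j) for i in range(1, n) for j in range(n)]
    count = 0
    for signs in product((1, -1), repeat=len(inner_h) + len(inner_v)):
        h = [[0] * (n + 1) for _ in range(n)]
        v = [[0] * n for _ in range(n + 1)]
        for i in range(n):
            h[i][0], h[i][n] = 1, -1   # left and right edges point inward
        for j in range(n):
            v[0][j], v[n][j] = -1, 1   # top and bottom edges point outward
        for (i, j), s in zip(inner_h, signs):
            h[i][j] = s
        for (i, j), s in zip(inner_v, signs[len(inner_h):]):
            v[i][j] = s
        # Ice rule: exactly two of the four edges at each vertex point into it.
        if all(
            (h[i][j] == 1) + (h[i][j + 1] == -1)
            + (v[i][j] == 1) + (v[i + 1][j] == -1) == 2
            for i in range(n) for j in range(n)
        ):
            count += 1
    return count

print([count_square_ice(n) for n in (1, 2, 3)])  # [1, 2, 7]
```

The enumeration runs over all orientations of the interior edges, so it is practical only for very small grids.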

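2 The field-theoretic origin of the YBE

In this section we briefly describe how the YBE arises in field theory. An integrable field theory can be viewed as the continuum limit of an integrable lattice model. For example, taking the continuum limit of the six-vertex model in the time direction gives the Heisenberg spin chain, and taking the continuum limit in both space and time gives the sine-Gordon model, which is one of the simplest integrable field theories. The content of this section is logically independent of what follows; it only serves to motivate the notion of the YBE through a natural source, and is therefore rather brief. Readers who are not interested may skip it.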
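Integrability (in relativistic field theory) is a phenomenon peculiar to two dimensions. In fact, Coleman and Mandula argued that in higher dimensions, if there is a conserved charge of spin at least 2, then we can always use the corresponding symmetry to move the particles into general position so that their trajectories never intersect; no scattering occurs, and therefore the 𝑆-matrix (the linear transformation from the initial state to the final state of a scattering process) must be 1.
In the plane, however, two lines in general position always intersect, so the argument above fails. Nevertheless, by moving to general position we can always avoid three lines meeting at a point. All scattering then consists of pairwise elastic collisions, so every scattering amplitude is determined by that of the elastic collision. Moreover, given three oriented lines 𝑙, 𝑚, 𝑛, if the intersection of 𝑚 and 𝑛 lies to the left of 𝑙, we can always use the symmetry above to move it to the right. The resulting configuration is not topologically equivalent to the original one, and equality of the two scattering amplitudes requires the amplitude of the elastic collision to satisfy a cubic equation, which is the YBE (here the momentum is one of the spectral parameters).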

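3 The solution of the six-vertex model
Take an indeterminate 𝑞 and write

$$[x] = \frac{q^{x/2} - q^{-x/2}}{q^{1/2} - q^{-1/2}}.$$

For a vertex labelled 𝑥, we assign weights to the six orientations as follows:

[Figure: the six vertex orientations with weights −𝑞^{−𝑥/2}, −𝑞^{𝑥/2}, [𝑥 − 1], [𝑥 − 1], [𝑥], [𝑥].]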

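This can be viewed as specifying the 𝑆-matrix (of size 2² × 2²) of the elastic collision in a two-dimensional field theory (in integrable lattice models it is usually called the 𝑅-matrix), and the YBE is precisely the consistency condition of this integrable field theory: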

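Theorem 1 (Baxter). If 𝑥 = 𝑦 + 𝑧, then the 𝑅-matrices 𝑅(𝑥), 𝑅(𝑦) and 𝑅(𝑧) satisfy the following: the matrices corresponding to the two diagrams below are equal.

[Figure: two diagrams of three pairwise crossing lines, with vertices labelled 𝑦, 𝑧, 𝑥 in the first and 𝑥, 𝑧, 𝑦 in the second.]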

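We first explain what the matrix corresponding to a diagram is: in short, it is the 𝑆-matrix of the scattering diagram. Each diagram has six external edges, and the external edges of the two diagrams are in natural one-to-one correspondence. The external edges have 64 orientations in total, and each orientation, used as a boundary condition, gives a state sum; these are exactly the entries of the matrices.
We extend the definition of the weights to more general systems of smooth curves. For a chosen orientation, we assign the weight −𝑞^{1/2} to each point where the tangent is horizontal and points left if the curve is convex upward there, and −𝑞^{−1/2} if it is convex downward; points where the tangent is horizontal and points right get weight 1. Multiplying these together gives a monomial. Adding up the monomials obtained from all orientations, the resulting polynomial is the state sum. We have the following state sum expressions: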

[Figure: a chain of equalities between closed curves, each with state sum −𝑞^{1/2} − 𝑞^{−1/2} = −[2].]

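We see that the contribution of a curve depends only on its homotopy class, and that 𝑅(𝑥) can be expressed as

[Figure: the vertex 𝑅(𝑥) written as [𝑥] times one planar curve diagram plus [𝑥 − 1] times another.]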

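Hence a state sum can be written as a sum over curves, in which every closed loop contributes a factor −[2] (this calculus originates in the Temperley–Lieb category and is closely related to the quantum group of 𝔰𝔩2). The YBE can then be verified by direct computation. In what follows we use the notation

[Figure: a crossing of a line labelled 𝑥 with a line labelled 𝑦 denotes a vertex labelled 𝑥 − 𝑦.]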

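For 𝑋 = (𝑥𝑖) and 𝑌 = (𝑦𝑖), let 𝑍(𝑛; 𝑋, 𝑌) be the following state sum:

[Figure: an 𝑛 × 𝑛 grid whose horizontal lines are labelled 𝑥0, 𝑥1, …, 𝑥𝑛−1 and whose vertical lines are labelled 𝑦0, 𝑦1, …, 𝑦𝑛−1.]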

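𝑍 is symmetric in the 𝑥𝑖 and, separately, in the 𝑦𝑖: this can be proved directly from the YBE, or obtained by an argument similar to the field-theoretic derivation of the YBE in section 2. This puts strong constraints on the structure of 𝑍, and by carefully analyzing its zeros and poles one can solve for it explicitly: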

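Theorem 2 (Izergin, Korepin). The state sum 𝑍(𝑛; 𝑋, 𝑌) is given by

$$Z(n; X, Y) = \frac{(-1)^n \Bigl(\prod_{i=0}^{n-1} q^{(y_i - x_i)/2}\Bigr) \prod_{0 \le i, j < n} [x_i - y_j][x_i - y_j - 1]}{\Bigl(\prod_{0 \le j < i < n} [x_i - x_j]\Bigr)\Bigl(\prod_{0 \le i < j < n} [y_i - y_j]\Bigr)} \det M,$$

where

$$M_{i,j} = \frac{1}{[x_i - y_j][x_i - y_j - 1]}.$$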
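By the argument of section 1, the number of 𝑛 × 𝑛 alternating sign matrices is exactly the number of square ice states. In a square ice state there are 𝑛 more vertices of type 𝑎 than of type 𝑏, the numbers of vertices of types 𝑐 and 𝑑 are equal, and so are those of types 𝑒 and 𝑓. Hence the number of alternating sign matrices is

$$\left[\tfrac12\right]^{n - n^2} (-1)^n q^{n/4} \, Z_{1/2}(n) \Big|_{x=1},$$

where $Z_{1/2}(n) = Z(n; 1/2, \ldots, 1/2, 0, \ldots, 0)$ and $x = \left[\tfrac12\right]^{-2} = q^{1/2} + q^{-1/2} + 2$.
This is a (removable) singularity of the expression in Theorem 2. Using L'Hôpital's rule we can compute directly the following expression (for the details of the computation see [Kup96]):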

Theorem 3. The number of 𝑛 × 𝑛 alternating sign matrices is

$$\frac{1!\,4!\,7!\cdots(3n-2)!}{n!\,(n+1)!\,(n+2)!\cdots(2n-1)!}.$$

This solves the problem of counting alternating sign matrices.
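As a quick check, the product formula can be evaluated exactly for small 𝑛. This is a minimal sketch; the function name `asm_count` is introduced here for illustration, and exact rational arithmetic is used because the partial products need not be integers.

```python
from fractions import Fraction
from math import factorial

def asm_count(n):
    """Evaluate 1! 4! 7! ... (3n-2)! / (n! (n+1)! ... (2n-1)!) exactly."""
    value = Fraction(1)
    for k in range(n):
        # The k-th factor pairs (3k+1)! in the numerator with (n+k)! below it.
        value *= Fraction(factorial(3 * k + 1), factorial(n + k))
    assert value.denominator == 1  # the full quotient is an integer
    return int(value)

print([asm_count(n) for n in range(1, 6)])  # [1, 2, 7, 42, 429]
```

The value for 𝑛 = 3 agrees with the 7 matrices given in the problem, and the value for 𝑛 = 4 answers the question asked there.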

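4 Summary
The deep influence of integrable systems on modern mathematics goes far beyond this. Solutions of the YBE are usually called 𝑅-matrices; they carry the core information of quantum groups and braided tensor categories, and the study of their relation to field theory has inspired a great deal of important mathematics. For example, the celebrated Kazhdan–Lusztig equivalence relates the 𝑅-matrix of a quantum group to the monodromy of solutions of the Knizhnik–Zamolodchikov equations in conformal field theory. As another example, Maulik and Okounkov used 𝑅-matrices to construct quantum group actions on the cohomology of instanton moduli spaces, giving a mathematical verification of the AGT correspondence in quantum field theory. We hope this note will inspire secondary-school students and give them an opportunity to learn about the many deep and subtle insights of modern mathematics.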

References
[Kup96] Greg Kuperberg (1996). “Another proof of the alternating-sign matrix conjecture”. International Mathematics Research Notices 1996 (3), 139–150.
