You are on page 1of 39

Cent. Eur. J. Math.

• 9(2) • 2011 • 205-243


DOI: 10.2478/s11533-011-0003-5

Central European Journal of Mathematics

Fredholm determinants
Invited Review Article

Henry P. McKean1

1 CIMS, 251 Mercer Street, New York, NY 10012 USA

Received 15 November 2010; accepted 16 January 2011

Abstract: The article provides with a down to earth exposition of the Fredholm theory with applications to Brownian motion
and KdV equation.

MSC: 46E20, 35Q53

Keywords: Fredholm detemninant • Brownian motion • KdV equation


© Versita Sp. z o.o.

Preface
The work of I. Fredholm [10] is one of the loveliest chapters in mathematics, coming just at the turn of the century,
when serious attention was first given to analysis in infinite-dimensional (function) space. Nowadays, it is explained
mostly in an abstract and, if I may say so, a superficial way. So I thought it would be nice to do it from scratch,
concretely, following Fredholm in using only the simplest methods and adding a few illustrations of what it is good for.
The treatment has much in common with Peter Lax [16] which you may wish to look at, too; see also Courant–Hilbert [6]
and/or Whittaker–Watson [25], and, of course Fredholm’s 1903 paper, itself.
Cramer’s rule expresses the inverse of a square d matrix by ratios of sub-determinants, as I trust you know. There, the
dimension is finite. What Fredholm did was to adapt this rule to infinite dimensions, i.e. to function space. Poincaré
had posed the problem, giving as his opinion that it would not be solved in his lifetime. Fredholm did it just two years
later.
Chapter 1 is to remind you about determinants and Cramer’s rule in d < ∞ dimensions, first in the conventional form
and then in Fredholm’s form which is specially adapted to the passage from d = 1, 2, 3, . . . to d = ∞. Obviously, if
this is to work, you want only matrices which are close enough to the identity, i.e. something like I + K where I is the
identity, K is “small”, and you need to express det (I + K ) as 1 + [terms in K , K 2 , . . . ] under effective control. This is a
matter of bookkeeping. Look at a I + λK with a variable λ, develop its determinant in powers of λ (it is a polynomial
after all), and put λ back equal to 1. This produces the formula
 
" # K ii Kij K ik
X X Kii Kij X
det (I + K ) = 1 + Kij + det  Kji Kjj Kjk  + . . . + det K ,
 
det +
Kji Kjj
1≤i≤d 1≤i<j≤d 1≤i<j<k≤d Kki Kkj Kkk

205
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
Fredholm determinants

which is the key to the passage to d = ∞ carried out in Chapter 2. Think now of K as an integral operator

Z1
K : f 7→ K (x, y)f(y) dy,
0

acting on C [0, 1] say, and approximate its action by that of the matrix

   
i j 1
K ,
n n n 1≤i,j≤d

acting in dimension d. Then the preceding formula for the determinant of the matrix I + K makes plain the good sense
of Fredholm’s recipe for the determinant of the operator I + K :

Z Z " #
K (x1 , x1 ) K (x1 , x2 ) 2
det (I + K ) = 1 + K (x, x) dz + det dx
K (x2 , x1 ) K (x2 , x2 )
0≤x≤1 0≤x1 <x2 ≤1
 
Z K (x1 , x1 ) K (x1 , x2 ) K (x1 , x3 )
det K (x2 , x1 ) K (x2 , x2 ) K (x2 , x3 ) d3 x + ad infinitem.
 
+
0≤x1 <x2 <x3 ≤1 K (x3 , x1 ) K (x3 , x2 ) K (x3 , x3 )

Chapter 2 develops this idea along with Cramer’s rule, in detail. (Don’t worry. I’ll make a better notation when the time
comes.) Particular attention is paid to the special but important case of symmetric kernels: K (x, y) = K (y, x); of these
I give only the simplest examples.
Chapter 3 presents three serious examples of what you can do with this machinery. The first application is to Gaussian
probabilities with emphasis on Brownian motion, as initiated by Cameron and Martin [3]. The second is to the Kortweg–
de Vries equation (KdV) describing long waves in shallow water. Here, I follow mostly Ch. Pöppe’s [21] explanation of
a formula of Dyson [8], representing the solution of KdV by means of a special type of Fredholm determinant. The third
application is to the N-dimensional unitary ensemble, so-called, i.e. the group of N×N unitary matrices equipped with
its invariant volume element. What is in question is the distribution of eigenvalues of a typical matrix both for N < ∞
and for N = ∞. Here, the principal names are Dyson [7], Jimbo et al. [12], Mehta [19], and Tracy–Widom [22].
I cannot close this preface without a little joke. I gave a lecture in Paris a while back: something about “Fred Holm’s
determinants” the advertisement said. Oh well! The name is Ivar. Don’t forget it.

Prerequisites
Not much in fact. You will need: 1) standard linear algebra, both real and complex; 2) calculus in several variables, of
course; 3) some acquaintance with Lebesgue measure (on the unit interval only) and the associated space L2 [0, 1] of (real)
R1
measurable functions with 0 f 2 < ∞; 4) elementary complex function-theory. I will explain anything non-standard or
more advanced. Good references are: for 1), Lax [15]; for 2), Courant [5]; for 3), Munroe [20]; for 4), Ahlfors [1].

1. An old story with a new twist


I collect here what you need to know about linear maps of the finite-dimensional space Rd to itself, first in the conventional
format and then in Fredholm’s way.

206
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
H.P. McKean

1.1. Simplest facts



L : Rd → Rd is the map, R = y : y = Lx for some x ∈ Rd is its range, N = {x : Lx = 0} its null space, and similarly
for the transposed map L† : range R † and null space N † . The figure displays these notions in a convenient way.

The position of N † to the right indicates that it is the annihilator R o of R: Indeed, y ∈ R o if and only if (Lx, y) = (x, L† y)
vanishes for every x, which is to say L† y = 0, and similarly to the left: N = (R † )o . L kills N and so maps R † one-
to-one onto R; likewise, L† kills N † and maps R one-to-one onto R † . It follows that dim R † = dim R ≡ r and so also
 
dim N † = dim N ≡ n = d − r. The map L : x → y may be expressed by a matrix M = mij 1≤i,j≤d relative to the standard
frame:
e1 = (100 . . . 0), e2 = (010 . . . 0), ..., ed = (000 . . . 1)

as in
d
X
yi = mij xj , i = 1, . . . , d.
j=1

R is the span of the columns of M, R † the span of its rows, and dim R † = dim R expresses the equality of “row rank”
and “column rank”.

1.2. Determinant and Cramer’s rule

It is desirable to have a practical way to tell if L : Rd → Rd is one-to-one and onto (and so invertible), and also a recipe
for its inverse L−1 in this case. Introduce (out of the air) the determinant:

X d
Y
det M = sign π miπ(i) ,
π∈Sd i=1

the sum being taken over the “symmetric group” Sd of all permutations π of the “letters” 1, 2, . . . , d. Then L is one-to-one
6 0, in which case L−1 is expressed by the inverse matrix M −1 = m−1
 
and onto if and only if det M = ij 1≤i,j≤d formed
according to Cramer’s rule:
1) strike out the ith row and jth column of M and take the (d − 1) × (d − 1) determinant of what’s left, this being the ij
“cofactor” of M, so-called; 2) supply the signature (−1)i+j ; 3) transpose; 4) divide by det M. More briefly,

m−1 −1
(−1)i+j × the ji cofactor of M .
 
ij = (det M)

This is, I trust, familiar. If not, don’t worry. I will derive it all in a geometrically appealing way in the next three sections.

207
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
Fredholm determinants

1.3. Signed volume

The determinant, which seems at first so complicated and unmotivated, has a simple geometrical meaning: It is the
volume V of the parallelepiped spanned by the vectors f1 = Le1 , . . . , fd = Led , with a suitable signature attached.
Plainly, L is one-to-one and onto if and only if dim R = d, which is to say that the family F = [f1 , . . . , fd ] is a “frame”,
i.e. that it spans Rd , and this in turn is the same as to say that the associated (signed) volume V = V (F ) does not
vanish. I discuss this signed volume one dimension at a time.
d = 1. e1 = +1 is the standard frame; any other frame F is just a number f1 = 6 0, and V (F ) = f1 is the length |f1 |
with the signature of f1 attached. The frame f1 is “proper” or “improper” according as f1 points in the same direction as
e1 = +1 or not. Obviously, you cannot deform a proper into an improper frame without passing through f1 = 0 which is
not a frame at all, i.e. the proper and improper frames are disconnected. This is all simple stuff made complicated. The
next step conveys the idea more plainly.
d = 2. Here is the standard frame.

A second frame F = [f1 , f2 ] can be lined up with it by a (counter-clockwise) rotation aligning f1 with e1 as in the next
figure.

The resulting frame is “proper” or “improper” according as the rotated f2 lies in the upper or the lower half plane. It
cannot lie on the line directed by e1 as that would make it depend upon f1 ; in particular, the proper and improper
frames are disconnected, one from the other, as in d = 1. The signed volume V is now declared to be the area of the
parallelogram spanned by F , taken with the signature plus or minus according as the frame is proper or not, i.e.

V = |f1 | |f2 | × sin θ,

208
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
H.P. McKean

so that the plus sign prevails for 0 < θ < π and the minus sign for π < θ < 2π. Let’s spell it out: The absolute volume
is


|V | = |f1 | × the coprojection of f2 on f1 = |f1 | × |f2 − e1 (f2 , e1 )|
p
= |f1 | × |f2 |2 − |f1 |−2 (f2 , f1 )2 e1 being f1 divided by its length |f1 |
p q
= |f1 |2 |f2 |2 − (f2 , f1 )2 = (f11 2
+ f12
2 2
)(f21 + f22
2
) − (f21 f11 + f22 f12 )2
p
= (f11 f22 − f12 f21 )2 = ±(f11 f22 − f12 f21 ).

d = 3. Here is the standard frame again, obeying the right-hand screw rule.

The absolute volume for the frame F = [f1 , f2 , f3 ] is


|V | = |f1 | × coprojection of f2 on f1 × coprojection of f3 on the plane of f1 & f2 ,

but what signature should be attributed to V itself? To decide, consider the following four steps: 1) rotate F so as to
align f1 with e1 and adjust its length so they match; 2) rotate about e1 to bring f2 to the “upper” half of the plane of
e1 and e2 and deform it to e2 plain; 3) f3 now lies either above or below the plane of e1 & e2 and can be deformed
to ±e3 : in short, F can be deformed (via honest frames) to one of the two frames [e1 , e2 , ±e3 ], and it is this signature,
attached to e3 , that is ascribed to V . In a fancier language, the frames of R3 fall into two connected deformation classes,
“proper” and “improper”, distinguished by the above signature, much as for d ≤ 2, which is to say that R3 is capable of
two different “orientations” exemplified by the right- and left-hand screw rules.
d ≥ 4. It is much the same in higher dimensions. A frame determines an increasing family of subspaces Rn =
span {fk : 1 ≤ k ≤ n}, ending with Rd itself; the absolute volume is the product, from n = 1 up to d, of the length
of the coprojection of fn on Rn−1 ; and the signature of V is determined by the deformation class, of which there are two,
as before: Any frame may be deformed to [e1 , e2 , . . . , ±ed ], and it is this signature, attached to ed , that is ascribed to
V . The definition of the signed volume is now complete, but what has it to do with the determinant? I postpone the
punchline in favor of two simple observations.

Observation 1.

V is “skew” in a sense I will now explain. Let π ∈ Sd be a permutation of the letters 1, . . . , d. Then V (πF ) =
 
V fπ(1) , fπ(2) , . . . represents the same absolute volume as V (F ): it is only the signature that changes, as in V (πF ) =
χ(π)V (F ) with χ(π) = ±1. Plainly, this χ is a character of Sd meaning that χ(π1 π2 ) = χ(π1 ) χ(π2 ), and it is a fact
that Sd has only one such character besides the trivial character χ ≡ 1: Obviously, every permutation is a product
of “transpositions”, exchanging two letters, fixing the rest. Now you will check that if χ ascribes the value +1 to any
transposition, then it is the same for every transposition, in which case χ is trivial, i.e. χ ≡ 1. Otherwise, it ascribes the

209
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
Fredholm determinants

value −1 to every transposition and χ(π) = (−1)# , # being the number of transpositions in π. This is nothing but the
signature of π appearing in the formula for the determinant. The meaning of “V skew” is now this: V (πF ) = χ(π)×V (F ).

Aside. Here you have a somewhat indirect proof that χ(π) = (−1)# , alias the signature of π, is well-defined, i.e. that
the parity of # is always the same, however π may be decomposed into transpositions. Another way is to examine how
Q
π changes the signature of the “discriminant” i>j (xi − xj ) of d distinct real numbers x1 < x2 < · · · < xd . Try it.

Observation 2.

V is now extended to any family of vectors F = [f1 , . . . , fd ], independent or not, by putting V = 0 in the second case,
when the parallelepiped collapses. Then (most important) V is linear in each of its arguments f1 , . . . , fd . Because V
is skew, you have to check this only for fd . That is obvious for d = 2 from the explicit formula V = f11 f22 − f12 f21 and
extends easily to d ≥ 3: in computing V , the part of fd in the span of fn , 1 ≤ n < d, changes nothing; it is only the
projection of fd upon the normal g to that span which counts, as in V (F ) = V [f1 , . . . , fd−1 , g] × (fd • g). This makes the
linearity obvious.

1.4. Why V is the determinant

It remains to compute V from the two properties, 1) skewness and 2) the linearity elicited in Section 1.3, plus the
d
P
normalization 3) V [e1 , . . . , ed ] = 1. Write fi = fij ej for 1 ≤ i ≤ d. Then
j=1

d
X d
X
V = ... f1j1 . . . fdjd V [e1j1 , . . . , e1jd ] by 2).
j1 =1 jd =1

Here, it is only summands with distinct j1 , . . . , jd that survive, V being skew by 1), so you may write

X d
Y X d
Y
V = V [eπ(1) , . . . , eπ(d) ] fiπ(i) = sign π fiπ(i) by 1) and 3)
π∈Sd i=1 π∈Sd i=1
 
= det fij 1≤i,j≤d ,

and if now f1 = Le1 , etc., so that the preceding matrix is the old M transposed, then it is seen that det M † is nothing but
the signed volume of the image of the unit cube C under L. Actually, det M † = det M, as in rule 1 of the next section,
so with an imprecise notation, det M = V (LC ).

Aside. Besides its interpretation as a signed volume, det M has also a spectral meaning: it is the value of the
characteristic polynomial det (M − λI) at λ = 0; as such, it is the product of the (generally complex) eigenvalues of M.
These are either real or else they appear in conjugate pairs, M being real, in accord with the reality of the determinant.

1.5. Simplest rules

Here, A, B, C , etc. are d × d matrices.

Rule 1.

det C † = det C : indeed,


X d
Y X d
Y X d
Y
χ(π) ciπ(i) = χ(π) cjπ −1 (j) = χ π −1 cjπ(j)
i=1 j=1 j=1

since π −1 runs once through the symmetric group as π does so, and χ(π) = χ(π −1 ).

210
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
H.P. McKean

Rule 2.

det (AB) = det A × det B = det (BA). This is also pretty clear. The absolute value | det C | represents the factor by
which the volume of any geometrical figure in Rd is changed by application of the associated linear map, as you will
see by dividing space up into little cubes, so the formula is correct, up to a signature to the right in case none of the
determinants vanish. But then each of the associated frames can be deformed without change of signature to [e1 , . . . , ed ]
or to [e1 , . . . , −ed ], and you can check the rest by hand: for example, if A is proper (orientation-preserving) and B is
improper (orientation-reversing), then AB is improper and the signs work out as follows (−1) = (+1) × (−1).

Rule 3.

det C −1 = (det C )−1 if det C =
6 0, as is immediate from Rule 2.

Rule 4.

det C does not depend on the basis of Rd . This is important: It means that the determinant is an attribute of the
associated geometrical transformation Rd → Rd and not an artifact of its presentation by the matrix C . The point is
that under change of basis C is replaced by D −1 C D with det D 6= 0 and det (D −1 C D) = det C by Rules 2 and 3.

Rule 5.

ln det C = sp ln C if C is near the identity. Here sp is the trace. Now it is better to write C = I + K with small K so
that ln C = ln (I + K ) = K − K 2 /2 + K 3 /3 − . . . , by definition. The identity is trivial at K = 0 (C = I). Let ∆K be a
tiny variation of K . Then, with a self-evident notation,

 
∆ ln det (I + K ) = ln det (I + K + ∆K )/ det (I + K )
= ln det I + (I + K )−1 ∆K

by Rules 2 and 3
−1
 
' ln 1 + sp (I + K ) ∆K why ?
' sp (I + K )−1 ∆K = sp I − K + K 2 − K 3 + . . . ∆K

 
1 1 
= sp ∆K − (K ∆K + ∆K K ) + K ∆K + K ∆K K + ∆K K − . . .
2 2
2 3
= sp [∆ ln (I + K )] = ∆ sp ln (I + K ).

The identity sp (AB) = sp (BA) is used in line 5. I leave the rest to you.

1.6. Cramer’s rule

The determinant of M being linear in each of its columns,

d
( ) d
∂ det M X Y 1 if π(1) = 1 X Y
= χ(π) mkπ(k) × = χ(π 0 ) mkπ 0 (k)
∂m11 0 if π(1) =
6 1
π∈S d k=2 π 0 ∈Sd−1 k=2

where π 0 permutes 2, . . . , d leaving 1 fixed. This is nothing but the 11 cofactor of M. To evaluate ∂ det M/∂mij , put
the ith row/jth column in first place, forming a new matrix M 0 with m011 = mij . This can be done by permutations of
rows and columns requiring i + j transpositions, counting modulo 2: for example, m23 is brought to the place of m11 by
transposing a) columns 3 and 2, b) columns 1 and 2, and c) rows 1 and 2, requiring 5 = (2 + 3) transpositions. It follows
that det M 0 = (−1)i+j det M, so the rule is

∂ det M
= (−1)i+j × [the ij cofactor of M].
∂mij

211
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
Fredholm determinants

Now look at
d
X ∂ det M
× mkj ,
k=1
∂mki
in which you see ∂ det M/∂m transposed. This represents the signed volume associated to M with its ith column replaced
by its jth column; as such, it is just det M if i = j, as you will see by Observation 2 of Section 1.3, but the parallelepiped
collapses if i 6= j and you get 0, i.e., with an abbreviated notation,
 †
∂ det M
M = det M × [the identity].
∂m

That is Cramer’s rule, expressing M −1 as (det M)−1 × (∂ det M/∂m)† .

1.7. Fredholm’s idea


To extend this machinery from d = 1, 2, 3, etc. to d = ∞ dimensions, it is clear that M cannot be too far from the
identity, i.e. you must have M = I + K with K “small”. For example, the ∞ × ∞ determinant
 
1+1 0 0 0 ...
 0 1 + 1/2
 0 0 . . .
 Y ∞  
1
. . . =
 
det  0 0 1 + 1/3 0 1+ = +∞
  n
 0 0 0 1 + 1/4 . . . n=1
... ... ... ... ...

makes no sense, but  


1 1/2 0 0 0 ...
1/2
 1 1/4 0 0 . . .

. . .
 
det  0 1/4 1 1/8 0
 
 0 0 1/8 1 1/16 . . .
... ... ... ... ... ...
should be OK: after all the determinant as volume cannot be more than the product of the lengths of the individual
columns, so the square of the determinant should be less than
       ∞
!
X
1 1 1 1 1 5 5 5 5 −n 5 5
1+ 1+ + 1+ + · ... = 1+ 1+ · . . . < exp 5 4 = exp .
4 4 16 16 64 4 16 64 4 n=2
4 12

The task is now to re-express det (I + K ) in such a way that the smallness of K can be exploited when d ↑ ∞. Following
Fredholm, this is done by introducing a parameter λ, for book-keeping, as in ∆(λ) = det (I + λK ), expanding in powers
of λ:
∆(λ) = ∆0 + λ∆1 + λ2 ∆2 + · · · + λd ∆d ,
and putting λ back equal 1. Here, ∆0 = ∆(0) = 1, ∆d = det K as you will check, and the rest is not hard to come by:
For 1 ≤ p ≤ d and 1 ≤ i1 < i2 < . . . < ip ≤ d, write i = (i1 , . . . , ip ), and |i| = p. Then, with [eij ] = I, the identity matrix,

X d
Y d d
 X X X Y X X
det (I + λK ) = χ(π) eiπ(i) + λKiπ(i) = λp χ(π) Kiπ(i) = λp det [Kij ]i,j∈i ,
π i=1 p=1 |i|=p π fixing i i∈i p=1 |i|=p

i.e.
 
" # Kii Kij Kik
X X Kii Kij X
det (I + λK ) = 1 + λ Kii + λ2 + λ3 det  Kji Kjj Kjk  + . . . + λd det K
 
det
Kji Kjj
1≤i≤d 1≤i<j≤d 1≤i<j<k≤d Kki Kkj Kkk
= 1 + λ∆1 + λ2 ∆2 + λ3 ∆3 + . . . + λd ∆d .

212
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
H.P. McKean

Cramer’s rule is to be recast in this format with a view to inverting I + λK if ∆(λ) =


6 0. Taking λ into account, the recipe
is  †
∂∆
(I + λK )−1 = ∆−1 × ×
1
,
λ ∂K
but I want to spell it out.
Let i be any unordered collection of p distinct indices 1 ≤ i1 , . . . , ip ≤ d. [Kij ]1≤i,j≤p is abbreviated as Kij . Then

1 X
∆p = det Kij
p! |i|=p
i=j

with a self-evident notation. Here, the extension of the sum to unordered indices merely produces p! copies of each
sub-determinant in ∆p , whence the factor 1/p! . Now look at ∂∆p+1 /∂Kij on diagonal: with further self-evident notations,

∂∆p+1 1 h i
= × the sum over i = j 3 i with |i| = p + 1 of the ii cofactor of Kij
∂Kii (p + 1)!
1   X
= × the p + 1 ways of placing i in i × det Ki0 j0
0 0
(p + 1)!
i =j 63i
|i0 |=p

× [the unrestricted sum over i0 − the restricted sum with i0 3 i]


1
=
p!
" # " #
1  0
 X Kii Kij 1 X Kii Kij
= ∆p − × the p ways of placing i in i × det = ∆p − det ,
p! i=j63i
Kii Kij (p − 1)! i=j3i Kii Kij
|i|=p−1 |i|=p−1

from which the restriction i 63 i can be removed since the determinant vanishes if i 3 i. The off-diagonal is much the
same: with i < j for definiteness,
 †
∂∆p+1
× the sum over i 3 both i and j of (−1)i+j × the ji cofactor of Kij
1    
=
∂Kij (p + 1)!
X
(−1)i+j × [the determinant of the matrix displayed below]
1
=
(p + 1)! i=j3i,j

in which you keep ij(•) and remove ji(◦) together with its attendant row and column. Let’s move ij to the upper left-hand
corner, keeping the indices k =6 i or j in their original order. This takes i − 1 transpositions for i and j − 2 for j, the
intervening ith column having been crossed out, for a total of i − 1 + j − 2 ≡ i + j + 1 (mod 2) changes of sign. Taking
the factor (−1)i+j into account leaves just one change of sign, net, and if now you also reflect that there are (p + 1) × p
ways of placing i and j in i, you will obtain
 † " #
∂∆p+1 X K K
det ij ij ,
1
=−
∂Kij (p − 1)! i=j63i,j Kij Kij
|i|=p−1

213
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
Fredholm determinants

just as on diagonal, only now the term ∆p is missing. I leave it to you to bring all this into the form
 
 † d−1
" #
∂∆ X X Kij K ij 
λp
1
= ∆ − the matrix  .

det
λ ∂K K ij Kij

p=0 i=j
|i|=p 1≤i,j≤d

It is instructive to see what these objects are when expressed more compactly. You know that
 †
∂∆
(I + λK ) λ−1 =∆
∂K

always, by Cramer’s rule, and also that



X
(I + λK )−1 = (−λK )p
p=0

if λ is small, so
∞  †  † ∞ ∞ ∞
X ∂∆p+1 ∂∆ X X X X
λp = ∆ × (I + λK )−1 = λp ∆p × (−λK )q = λn ∆p (−K )q ,
1
=
p=0
∂K λ ∂K p=0 q=0 n=0 p+q=n

i.e.
 † p
∂∆p+1 X
= ∆p−q (−K )q .
∂K q=0

It is this kind of book-keeping that is the key to Fredholm’s success for d = ∞, as you will see in the next chapter.

2. Fredholm in function space

2.1. Where to start

I take the simplest case presenting the least technical difficulties; variants will readily suggest themselves and will be
used in Chapter 3 without further comment. I propose to replace Rd by the ∞-dimensional space C [0, 1] and K by the
integral operator
Z1
K : f 7→ K (x, y) f(y) dy
0

with kernel K (x, y), 0 ≤ x, y ≤ 1, of class C [0, 1] . Here and below, I do not distinguish between the operator and the
2

kernel by means of which it is expressed: K means what it has to mean, by the context.

2.2. Aside on compactness

It is always some kind of compactness that makes things manageable for d = ∞; in effect, it reduces d = ∞ to finite d.
Two kinds of compactness are needed here. The proofs are relegated to an appendix.

The space C[0, 1].

This is the space of (real) continuous functions f on the unit interval, provided with the norm kfk∞ =
max {|f(x)| : 0 ≤ x ≤ 1}. A figure F ⊂ C is compact relative to the natural distance d(f1 , f2 ) = kf1 − f2 k∞ if it is
1) closed, 2) bounded, and 3) |f(b) − f(a)| ≤ ω(|b − a|) for every f ∈ F , with one and the same, positive, increasing
function ω(h), h > 0, subject to ω(0+) = 0. Below, fn → f∞ means this type of convergence.

214
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
H.P. McKean

The space L2 [0, 1].


R1
This is the space of (real) Lebesgue-measurable functions subject to 0 f 2 dx < ∞, with the Pythagorean norm kfk2 =
qR R1
f dx and the associated inner product (f1 , f2 ) = 0 f1 f2 dx. Here, the unit ball B = {kfk2 ≤ 1} is not compact
1 2
0
relative to the distance d(f1 , f2 ) = kf1 − f2 k2 : in fact, B contains the mutually perpendicular functions en (x) = sin nπx,
n ≥ 1, and d(ei , ej ) = 1 for any i = 6 j. Fortunately, B is weakly compact and this will be enough. It means that if
en ∈ B for n ≥ 1, then you can pick indices n1 < n2 < n3 < etc. and a function e∞ ∈ B so as to make (en , f) tend to
(e∞ , f) as n = n1 , n2 , . . . ↑ ∞, for every f ∈ L2 simultaneously. Below, en e∞ means this type of convergence. Take
it as an exercise to check that sin nπx 0.

Example.

Obviously, K maps L2 into C . Let’s prove that F = K B is compact in C by verifying 1), 2), 3) in reverse order: 3), 2), 1).
3) For any function f = K e in K B,

s
Z1 Z 1
|f(b) − f(a)| ≤ |K (b, y) − K (a, y)| |e(y)| dy ≤ kek2 |K (b, y) − K (a, y)|2 dy ≤ ω(|b − a|)
0
0

with

ω(h) = max |K (b, y) − K (a, y)| : 0 ≤ a, b ≤ 1, |b − a| ≤ h, 0 ≤ y ≤ 1 .

3) follows from the fact that ω(0+) = 0. Why so?


2) More simply,
s
Z 1 
|f(x)| ≤ |K (x, y)|2 dy ≤ max |K (x, y)| : 0 ≤ x, y ≤ 1 ≡ m < ∞,
0

so K B is bounded.
1) K B is closed: in fact, if K B 3 fn = K en → f∞ and if you assume, as you may, that en e∞ , then you will have
f∞ = K e∞ , as required.

2.3. Some geometry

Now to business, recapitulating Section 1.1. Let R/R † be the range and N/N † the null space of I +K /I +K † , respectively,
in the space C [0, 1].

Item 1.

N, and likewise N † , is finite-dimensional; in particular, the associated (L2 ) projections map L2 into C .

Proof. Pick ek , 1 ≤ k ≤ n, from N, each of unit length:

Z1
kek k22 = e2k = 1
0

and mutually perpendicular:


Z1
(ei , ej ) = ei ej = 0, i 6= j.
0

215
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
Fredholm determinants

Then K ek = −ek , so

n
X n
X Z 1

|ek (x)| =
2 the projection of K (x, ·) upon ek 2 ≤ (K (x, y))2 dy,
2
k=1 k=1 0

this being an instance of Bessel’s inequality, expressive of the Pythagorean rule. Now integrate to obtain

n Z
X
1 Z1 Z1
n= |ek | dx ≤
2
|K (x, y)|2 dxdy.
k=1 0 0 0

Item 2.

N † = R o and N = (R † )o as in the picture, just as for d < ∞; in particular, (I + K ) : R † → R is one-to-one and onto.

Item 3.

R, and so also R † , is closed as it stands; in particular, C is the perpendicular sum of closed subspaces R † ⊕ N or
R ⊕ N† .

Proof. Let R 3 fn = (I + K ) en → f∞ . If ken k2 ↑ ∞, then

fn en
fn0 = = (I + K ) e0n with e0n = ,
ken k2 ken k2

and you may suppose that e0n e0∞ . Now fn0 → 0, so e0n = fn0 − K e0n → −K e0∞ on the one hand and, on the other
hand, to e∞ itself, the upshot being that e0∞ , which cannot vanish as it is of unit length, belongs to N. But you may
0

also suppose, from the start, that en ∈ N o , whence e0∞ ∈ N o too, and this is contradictory. If now lim inf ken k2 < ∞ you
may suppose en e∞ . Then K en → K e∞ and fn → f∞ force en → e∞ and so also f∞ = (I + K ) e∞ , as wanted.

This was all plain sailing with the help of compactness. What is not clear is whether dim N = dim N † is still true. Now
you can’t just count dimensions as for d < ∞ when you could see at once that dim R = dim R † . Both dim R and dim R †
are infinite and it is not obvious what to do.

Item 4.

n = dim N and n† = dim N † are one and the same.


Proof. For conversation’s sake let n† be larger than n, let ek , 1 ≤ k ≤ n, and ek , 1 ≤ k ≤ n† , be unit mutually
perpendicular functions spanning N and N † respectively, and replace K by

n
X  † †
K0 = K +

ek ⊗ ek ≡ ek (x)en (y) .
k=1

216
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
H.P. McKean

 †
Then I + K 0 has no null space: It maps C one-to-one onto R 0 = R ⊕ span en : 1 ≤ k ≤ n , permitting a reduction to
the simplified case: n = 0 < n† . Pick e† from N † and discretize (I + K )† e† = 0 so:

  X d      
i j i 1 † j i
e† + K , e ≡f for 1≤i≤d
d j=1
d d d d d

with
d  
X † k 2 1
e
d d =1
k=1

as you may do and  


i
max f = o(1) as d ↑ ∞.
i d
This can be rewritten:
  X d         
i
† j i i † j 1 † j
e + K , −f e e = 0.
d j=1
d d d d d d

Now for d < ∞, it is guaranteed that the transposed problem also has a solution, e:

  X d         
i i j i j j
− e†
1
e + K , f e =0
d j=1
d d d d d d

with
d   2
e k 1 = 1
X

k=1
d d

if you want, and the idea is to produce a (contradictory) null function e ∈ N by making d ↑ ∞ here. This is easy:


    2 X
d d   2
X
j j 1
f j 1 .

f e ≤


j=1 d d d j=1
d d

is negligible, so
  X d    
i i j j 1
e + K , e = o(1)
d j=1
d d d d

in which the error o(1) is small independently of 1 ≤ i ≤ d. Let fd (x) be the function with values e(i/d) for (i − 1)/d ≤
x < i/d, and suppose, as you may, that fd f∞ , kfd k2 being unity by display no. 3 above. Then

Z1
fd (x) + K (x, y) fd (y) dy = 0
0

up to a like error, small independently of 0 ≤ x ≤ 1, and if you now make d ↑ ∞, you will find 1) K fd → K f∞ , 2) fd → f∞
with kf∞ k2 = 1, and 3) (I + K ) f∞ = 0, with the implication f∞ ∈ C , i.e. n > 0 which is contradictory.

217
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
Fredholm determinants

2.4. Fredholm’s determinant

To make a determinant for I + K in the ∞-dimensional space C [0, 1], you do the obvious thing: Replace the function f(x),
 
0 ≤ x ≤ 1, by the d-dimensional vector f(k/d), 1 ≤ k ≤ d, and the operator K by the d × d matrix K (i/d, j/d)/d 1≤i,j≤d
so that
d     Z1
X i j 1 j
K , f approximates K (x, y) f(y) dy,
j=1
d d d d
0
 
write out Fredholm’s series for the determinant of I + K (i/d, j/d)/d 1≤i,j≤d as in

d    
X 1 X i j 1
1+ det K , ,
p=1
p! d d d i,j∈k
|k|=p

and make d ↑ ∞ to produce

X∞ Z1 Z1
. . . det K (xi , xj ) 1≤i,j≤p dp x.
1  
det (I + K ) = 1 +
p=1
p!
0 0

The convergence is obvious term-wise, only you must have an estimate to be sure the tail of the sum is small. This is
easy: a finite-dimensional determinant is the signed volume of the parallelepiped spanned by the rows/columns of the
matrix in hand so with |k| = p,

v
      2
i j 1 Yu
K i , j × d−2 ≤ pp/2 m ,
uX  p
det K , ≤ t
d d d i,j∈k i∈k j∈k
d d d


in which m = max |K (x, y)| : 0 ≤ x, y ≤ 1 and

   
1 X i j 1
det K ,
p! d d d i,j∈k
|k|=p

is dominated by

 m p pp/2 mp
d (d − 1) . . . (d − p + 1) pp/2
1
. √ by Stirling’s approximation
p! d pp+ 2 e−p 2π
1

< (me)p p−p/2

independently of d. This is plenty.


Without further ado, the main rules (1,2,3) of Section 1.5 carry over:

Rule 1. det (I + K ) = det I + K † .
Rule 2. det [(I + K )(I + G)] = det (I + K ) × det (I + G).
Rule 3 applies when det (I + K ) does not vanish. Then, as will be seen in Section 2.5, I + K has a nice inverse I + G
 
of the same type as itself, and Rule 2 implies det (I + K )−1 = [det (I + K )]−1 , as it should be.
Rule 4 is left aside; it’s analogue is not needed.
Rule 5. ln det (I + K ) = sp ln (I + K ) if everything makes sense. This is proved as in Section 2.4 for small K . Take it as
R1
an exercise. Here and below sp G means 0 G(x, x) dx.

218
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
H.P. McKean

 
Rule 6. If K depends upon a parameter h and K 0 = dK /dh is nice, then [ln det (I + K )]0 = sp (I + K )−1 K 0 . I leave it
to you to see what is needed and make a proof.
The function ∆(λ) = det (I + λK ) of λ ∈ C is central to the rest of the discussion: it is an integral function of order ≤ 2
in view of the estimate

X (constant × r)p
|∆(λ)| ≤ 1 + ≤ A exp (Br 2 )
p=1
(p/2)!

with r = |λ| and suitable constants A and B. Take this also as an exercise. The estimate can be much improved if
K (x, y) is smooth: for example, if it is of class C 1 , then it pays, in the estimation of det [K (xi , xj )]1≤i,j≤p , to subtract, from
each row of index i ≥ 2, the row before. This does not change the determinant and allows a better estimate:

det [K (xi , xj )]1≤i,j≤p ≤ [ an unimportant geometrical factor ] × pp/2 very roughly
× (x2 − x1 )(x3 − x2 ) . . . (xp − xp−n )

for x1 < x2 < . . . < xp . Here, line 2 is largest when the points are equally spaced (why?), leading to the estimate
Z Z  p
1 p

| det K (xi , xj )| dp x ≤ pp/2 ×
1 1 1
p! det [K (xi , xj )] d x ≤ × ' 3p/2 ,
roughly
p p! p
x1 <x2 <...<xp


whence |∆(λ)| ≤ A exp Br 2/3 , i.e. ∆ is of order 2/3 as the phrase is. Note in this connection, the implication of
Hadamard’s theorem (Ahlfors [1, p. 208]): that such a function of order < 1 with ∆(0) = 1 is the product Π(1 − λ/λn ) over
its roots λ1 , λ2 , etc. counted according to multiplicity.

Example.

Here K (x, y) = x(1 − y) if x ≤ y and y(1 − x) if x > y. Let’s compute ∆ explicitly: det [K (xi , xj )]1≤i,j≤p is a polynomial
in x1 , x2 , . . . , xp of degree 2 in each variable: for example with p = 3 and x1 < x2 < x3 ,
   
x1 (1 − x1 ) x1 (1 − x2 ) x1 (1 − x3 ) 1 − x1 1 − x2 1 − x3
det x1 (1 − x2 ) x2 (1 − x2 ) x2 (1 − x3 ) = x1 (1 − x3 ) × det x1 (1 − x2 ) x2 (1 − x2 ) x2 (1 − x3 )
   
x1 (1 − x3 ) x2 (1 − x3 ) x3 (1 − x3 ) x1 x2 x3
   
1 1 1 1 0 0
= x1 (1 − x3 ) × det x1 x2 x2  = det x1 x2 − x1 0  = x1 (x2 − x1 )(x3 − x2 )(1 − x3 ).
   
x1 x2 x3 x1 x2 − x1 x3 − x2

Now in general, det [K (xi , xj )]1≤i,j≤p vanishes: in respect to x1 at 0 and x2 ; in respect to x2 at x1 and x3 ; in respect to x3
at x2 and x4 ; etc., so what could it be but a constant multiple of

x1 (x2 − x1 )(x3 − x2 ) . . . (xp − xp−1 )(1 − xp ).

This constant is unity, as you will check, so

Z Z1 Zx3 Zx2
1 p 1
det [K (xi , xj )]1≤i,j≤p d x = (1 − xp ) dxp . . . (x3 − x2 ) dx2 (x2 − x1 ) x1 dx1 = .
p! (2p + 1)!
0 0 0

In short,
∞ ∞ √ √
X λp 1 X ( λ)2p+1 sh λ
∆(λ) = = √ = √ ,
p=0
(2p + 1)! λ p=0 (2p + 1)! λ

which is of order 1/2.

219
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
Fredholm determinants

2.5. Fredholm’s formulas

These implement Cramer’s rule in dimension d = ∞. They follow at once from Section 1.7 in the style of Section 2.4,
as I ask you to check. [K (xi , xj )]1≤i,j≤p is abbreviated as K (i, j); the value of p will be obvious from the context. Now


X Z1 Z1
p
det K (i, j) dp x,
1
∆(λ) = 1 + ∆p λ with ∆p = ...
p=1
p!
0 0

and
 † Z1 Z1 " #
∂∆p+1 K (x, y) K (x, j)
dp−1 x
1
= ∆p − ... det
∂K (x, y) (p − 1)! K (i, y) K (i, j)
0 0

with a self-evident notation, a factor of 1/d having been dropped from both sides in comparison to Section 1.7: symbol-
ically,
p
∂∆p+1 † X
 
= ∆p−q (−K )q .
∂K q=0

Also,
† ∞ Z1 Z1 " #
λp

1 ∂∆ X K (x, y) K (x, j) p
= ∆−λ . . . det d x,
λ ∂K p! K (i, y) K (i, j)
p=0 0 0

and  †
1 ∂∆
(I + λK ) = ∆(λ) × [the identity].
λ ∂K
I leave it to you to check this line by line from Section 1.7.

6 0: Then (I + λK )(I + G/∆) = I with ∆ + G = (∂∆/∂K )† /λ, and


The last display shows that I + λK is invertible if ∆(λ) =

the same applies to I + λK so that

†
G†
   
G
I + λK †

I+ (I + λK ) = I+ = I,
∆ ∆

as it should be. But what if ∆(λ) = 0? Then I + λK kills G(·, y) for every 0 ≤ y ≤ 1 and it is desired to conclude that
1 + λK is not invertible, i.e. N =
6 0. This is an automatic if G(x, y) does not vanish identically, and even if it did, there
is an easy way out: For large d,    
i j 1
det I + λ(d) K ,
d d d 1≤i,j≤d

vanishes for some λ(d) close to λ, by Hurwitz’s theorem (Ahlfors [1, p. 178]), so there must be an associated null vector
P
e(k/d), 1 ≤ k ≤ d, which may be adjusted so that |e(k/d)|2 /d = 1. Now, upon making d ↑ ∞, the sort of compactness
employed in Section 2.3, item 4 produces, from

  d    
i X i j 1 j
e + λ(d) K , e ≡ 0,
d j=1
d d d d

R1
a genuine null function e(x), 0 ≤ x ≤ 1, of I + λK with 0
e2 = 1, and all is well.
Here Fredholm’s strategy comes into focus: If λ is small, the series for ∆ + G reduces to ∆×[the standard Neumann
series for (I + λK )−1 ], as in

∞ p ∞ ∞
X X X X
∆+G = λp ∆p−q (−K )q = ∆p λp (−λK )q = ∆(I + λK )−1 .
p=0 q=0 p=0 q=0

220
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
H.P. McKean

But, at best, the sum (−λK )q represents the inverse only out to the first root of ∆(λ) = 0 where it fails to exist.
P

Contrariwise, Fredholm’s recipe, expressing the inverse as a “rational” function (∆ + G)/∆, works in the large, provided
Pp
only that you avoid roots of ∆. The trick is that the coefficients Gp = q=0 ∆p−q (−K )q of the series for ∆ + G may be
expressed by determinants of size [(p/2)!]−1 , more or less, providing a technical control not at all apparent at the start.
That is the key to Fredholm’s success.

Aside. The notation ∂∆/∂K is not fanciful. Recall that for smooth functions F : Rd → R, you have


F (x + hy) = F (x) + h × the inner product ∂F /∂x • y] + o(h),

which you may take as the definition of grad F = ∂F /∂x. It is the same in dimension d = ∞: A reasonable function
F : C [0, 1] → R ought to satisfy

 
F (f + hg) = F (f) + h × the inner product of something with g + o(h),

and you declare the “something” to be grad F = ∂F /∂f(x), as in

Z1
∂F
F (f + hg) = F (f) + h g dx + o(h).
∂f
0

The idea extends in an obvious way to ∆ which is a function on the space C [0, 1]2 where the kernel K (x, y) lives. You
may like to check that with this understanding, the formula for ∂∆/∂K is quite correct.

Some examples.
Rx
1. K is convolution by x, i.e. K f = x ∗ f ≡ 0 (x − y) f(y) dy with kernel K (x, y) = (x − y)+ vanishing for x ≤ y. Then
P p Pp q
(−λK )q . This must be OK as it stands:
P
∆ ≡ 1 and ∆ + G = λ q=0 ∆p−q (−K ) reduces to the Neumann series
−1
√ √
in fact, from x ∗ · · · ∗ x p-fold = x 2p−1
/(2p − 1)!, it develops that (I + λK ) is the convolution operator I − λ sin λx ∗,
as you will readily check.
Rx
2. K is now Volterra’s operator f 7→ 1 ∗ f(x) = 0 f(y) dy. This is outside the present rules since K (x, y) jumps down
from 1 to 0 at the diagonal, but never mind: just take K (x, x) = 0 and hope for the best. Then ∆ ≡ 1, and

" #
(−1)p
Z
K (x, y) K (x, j) p
det d x= (x − y)p or 0 according as x > y or not.
K (i, y) K (i, j) p!
x1 <x2 <...<xp

For example, if p = 2 and x1 < x2 , the determinant vanishes except on the figure x > x2 > x1 > y where it reduces to

 
111
 
det 1 0 0 = 1,
110

and the integral is just the volume of that figure: (x − y)2 /2! . Try it for p = 3. The outcome is

∞ Z " #
X
p K (x, y) K (x, j)
λ det = e−λ(x−y) or 0 according as x > y or not,
K (i, y) K (i, j)
0 x1 <...<xp

i.e. (I + λK )−1 = I − λe−λx ∗. Check this by hand.

221
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
Fredholm determinants

3. It is amusing to observe that if K is the even more singular operator


Z x
f(y)
f 7→ √ dy
0 x−y
Rx
of Abel, then (I + λK )−1 may be found by a trick due to himself: K 2 f = π 0 f, as you will check, so from (I + λK ) f = g
Rx R x
you find K f + λπ 0 f = K g, i.e. f − λ2 π 0 f = (I − λK ) g, to which the inversion I + λ2 π exp (λ2 πx)∗ may be applied
to recover f. Check it out. What is ∆ now?
4. K (x, y) = K (y, x) = x (1 − y) if x ≤ y, as in the Example of Section 2.4. I leave it as an exercise to verify the formula
 
 √ Zx √ √ Z1 √ 
−1 −1
(I + λK ) f = f − ∆ × sh λ (1 − x) sh λy f(y) dy + sh λx sh λ (1 − y) f(y) dy ,
 
0 x

√ √
in which ∆ is the determinant det (I + λK ) = (sh λ)/ λ computed in Section 2.4. Here, Fredholm’s integrals are
formidable: even for p = 3 you must evaluate
 
K (x, y) K (x, x1 ) K (x, x2 ) K (x, x3 )
Z
 (x1 , y)
K x1 (1 − x1 ) x1 (1 − x2 ) x1 (1 − x3 )
 
det   dx1 dx2 dx3
K (x2 , y) x1 (1 − x2 ) x2 (1 − x2 ) x2 (1 − x3 )
x1 <x2 <x3
K (x3 , y) x1 (1 − x3 ) x2 (1 − x3 ) x3 (1 − x3 )

and the integral has to be split into 16 pieces according to the placement of x and y among the 5 points 0 < x1 < x2 <
x3 < 1! Let’s try p = 1 only. Then, with x < y,

Z1 " # Zx
x(1 − y) K (x, z) x(1 − y) x3
det dz = z(1 − x)z(1 − y) dz = (1 − x)(1 − y)
K (z, y) z(1 − z) 6 3
0 0
Zy  
y2 y3 x2 x3
− x(1 − z)z(1 − y) dz = x(1 − y) − − +
2 3 2 3
x
Z1
(1 − y)3
− x(1 − z)y(1 − z) dz = xy
3
y

1 1
= x 3 (1 − y) + x(1 − y)3 ,
6 6
√ √
which is, indeed, the coefficient of λ2 in sh λ x sh λ (1 − y), God be thanked!

2.6. A question of multiplicities

It is natural to ask what if any is the relation between n the dimension of the null space N of I + λK and m the
multiplicity of the root of ∆(λ) = 0. So far, it is known only that n = 0 if and only if m = 0. The general rule is n ≤ m
with n = 0 only if m = 0. Skip this at a first reading if you like.

Example.

To illustrate the possibility of n < m, consider K = f1 ⊗ e1 + f2 ⊗ e2 . Taking


  " #
Z1
ab
G= fi e j  = ,
cd
0 1≤i,j≤2

222
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
H.P. McKean

the question is quickly reduced to the dimension n of the null space of I + λG vs. the multiplicity m of the root, if any,
of det (1 + λG). Take a = c = d = 1 and b = 0. Then ∆ = (1 + λ)2 has a double root at −1 but

" #
I−G =
0 0
−1 0

0
has only the one null vector 1
.

Proof of n ≤ m (adapted from Kato [13, p. 40]). If n = 0, there is nothing to prove, so take 1 ≤ n and let ∆
6 0 for nearby λ 6= 1. (I + λK ) is then invertible, and you
vanish at λ = 1, say, with multiplicity m ≥ 1. Obviously ∆(λ) =
can form the operator I

P=− √ · (I + λK )−1
1
,
2π −1 λ
the integral being taken about a small circle enclosing λ = 1.

Item 1.

P is a projection, i.e. P 2 = P. To confirm this, express P 2 as a double integral:

I I
dλ dµ
√ √ (I + λK )−1 (I + µK )−1
1 1
P2 = · ·
2π −1 λ 2π −1 µ
I I  
λ µ
√ · dλ √
1 1 1 1
= · dµ − ,
2π −1 2π −1 λ − µ I + λK I + µK λµ

where the first/second integral is taken about the outer/inner circle seen in the picture.

Do the inner integral first to wash out the contribution of (I + λK )−1 (λ is outside). Then do the outer integral of what
remains to produce I

√ · (I + µK )−1
1
− = P.
2π −1 µ

Item 2.

P acts as the identity on the null space N of I + K : indeed, K e = −e for e ∈ N, so

I

√ (1 − λ)−1
1
Pe = − e = e,
2π −1 λ

as advertised.

223
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
Fredholm determinants

Item 3.
√ −1 H
P = − 2π −1 (G/∆) dλ/λ inherits from G a nice kernel of class C [0, 1]2 and so is compact, meaning that the image
 R1
PB of the unit null B = f : 0 f 2 ≤ 1 is compact in L2 [0, 1]; as such, its range Q is of some finite dimension d. (Why?)
Now P commutes with K , so you may write K = PK P + (I − P) K (I − P) and infer that


∆(λ) = det (I + λK ) = det (I + λPK P) × det I + λ(I − P)K (I − P) .


Here, the second factor cannot vanish at λ = 1 since I + (I − P)K (I − P) f = 0 implies Pf = 0 and so also (I + K ) f = 0,
i.e. f ∈ N, and f = Pf = 0, by Item 2.

Item 4.

(I + K )m P = 0, m being the multiplicity of the root of ∆: indeed,

I   I
I+K dλ (1 − λ)K dλ
√ =− √
1 1
(I + K )P = − · −I ·
2π −1 I + λK λ 2π −1 I + λK λ
I   I
I K dλ (1 − λ)2 K 2 dλ
(I + K )2 P = − √ =− √
1 + 1
· (1 − λ) K −I ·
2π −1 I + λK λ 2π −1 I + λK λ

and so forth, up to

(1 − λ)m K m dλ
I I
G dλ
(I + K )m P = − √ =− √ (1 − λ)m K m
1 1
· · .
2π −1 I + λK λ 2π −1 ∆ λ

This vanishes as the integrand has no pole anymore.

Item 5.

Put (I + K ) P = J and keep it in mind that P commutes with K so that J m = 0, by Item 4. Then

I + λPK P = (1 − λ)(I + ΛJ) with Λ = λ(1 − λ)−1

is a d × d matrix, d being the dimension of Q the range of P. It follows that the root-bearing factor det (I + λPK P) of
∆(λ) from Item 3 can be expressed as

det (1 − λ) (I + ΛJ) = (1 − λ)d det (I + ΛJ) for λ 6= 1

But det (I + ΛJ) = 1 being a polynomial in Λ without roots: in fact, a root entails a null function, and (I + ΛJ) f = 0 only
if f = −ΛJf = (−Λ)m J m f = 0. This means that ∆ = (1 − λ)d × [a non-vanishing factor], so d = m, and since P = I on N,
by Item 2, so Q ⊃ N and m = dim Q ≥ n, as wanted.

2.7. Symmetric K

This is a very special case, but it’s important for all sorts of applications and so worthy to be spelled out. For simplicity
R1
and because it is all I need, I take K non-negative in the sense that the quadratic form Q[f] = 0 fK f is always ≥ 0.

224
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
H.P. McKean

Existence of eigenfunctions.

For d < ∞, the eigenvalue problem K e = λe has just enough solutions to span the range of K and it is the same here.
In keeping with the tone of this chapter, I will obtain the existence of enough eigenfunctions by the simplest means. The
quadratic form vanishes identically only if K ≡ 0 (Why?). Otherwise, it is capable of positive values, as I now suppose.
R1  
Q is controlled by 0 f 2 × max |K (x, y)| : 0 ≤ x, y ≤ 1 , so sup Q[f] : kfk2 ≤ 1 = λ is finite and positive. Replace K
by the d × d matrix [K (i/d, j/d)/d]1≤i,j≤d . The top value of its quadratic form in the ball of vectors e(k/d), 1 ≤ k ≤ d,
with dk=1 |e(k/d)|2 /d ≤ 1 is its top eigenvalue λd , which is close to λ if d is large, as I invite you to check; moreover,
P

you can make the associated eigenvector ed approximate a function e ∈ C [0, 1] by choice d = d1 < d2 < . . . ↑ ∞,
by the now familiar type of compactness employed in Section 2.3. Then for d = ∞, you have K e = λe, i.e. e is an
R1
eigenfunction of K with eigenvalue λ and unit length: 0 e2 = 1. Now write λ = λ1 , e = e1 , and look at the reduced
operator K − λ1 e1 ⊗e1 acting on the annihilator of e1 . It may be that its quadratic form is incapable of positive values.
Then Q[f] = λ1 (e1 , f)2 , K = λ1 e1 ⊗ e1 , and you stop. Otherwise, Q is capable of positive values on the annihilator
of e1 , in which case you can repeat the procedure to produce a new unit eigenfunction e2 , perpendicular to e1 , with
eigenvalue 0 < λ2 ≤ λ1 , and so on. If the quadratic form of K − dn=1 λn en ⊗ en vanishes on the annihilator of en ,
P
Pd
n ≤ d, the procedure stops at stage d, K ≡ 1 λn en ⊗en , and “enough” eigenfunctions are in hand. Otherwise, it never
stops, producing an infinite number of mutually perpendicular eigenfunctions en , n ≥ 1, with diminishing eigenvalues
λ1 ≥ λ2 ≥ . . . > 0.

Mercer’s Theorem.
P∞
Mercer’s Theorem states that K (x, y) = n=1 λn en ⊗ en with uniform convergence of the sum on the square [0, 1] ;
2
P∞
in particular, n=1 en (en , f) converges to f in C [0, 1] for every function f in the range of K : that is what “enough”
eigenfunctions means. The case of finite rank when the procedure stops is left aside.
Pd
Proof. Fix d < ∞. The reduced operator K − n=1 λn en ⊗en is non-negative, i.e.

Z1 Z1 " d
#
X
f(x) K (x, y) − λn en ⊗en f(y) dx dy ≥ 0,
0 0 1

from which you learn the following.


Pd
1) λn e2n (x) ≤ K (x, x) by making f approximate h−1 × [the indicator of |x − y| ≤ h/2] and taking h down to 0+.
n=1
P∞ R
2) n=1 λn ≤ K (x, x)dx < ∞, by integration.
P∞
3) n=1 λn en ⊗en is now seen to converge in the whole square and, indeed, uniformly in x/y for fixed y/x in view of 1)
and the tail estimate

2
X X X X
λn en (x)en (y) ≤ λn e2n (x) × λn e2n (y) ≤ K (x, x) λn e2n (y).

e.g.

n≥d n≥d n≥d n≥d

P P
4) λn en⊗en f = λn en (en , f) is likewise uniformly convergent by the tail estimate:

2 Z1
X X X
λn en (en , f) ≤ λ2n e2n (en , f)2 ≤ λd K (x, x) f 2 dx



n≥d n≥d n≥d 0

P
and the fact that λd ↓ 0 by 2); alternatively, 1) and the decrease of n≥d (en , f)
2
to 0 do the trick; see Dini’s theorem
in the appendix.

225
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
Fredholm determinants

5) Now the continuous function f0 = K f − ∞


P
n=1 λn en (en , f) is perpendicular to en , n ≥ 1, and since Q[f0 ] ≤ λd |f0 |2 ↓ 0
2

as d ↑ ∞ by 2), so Q[f0 ] = 0. Therefore, Q[f0 + hf] is minimum (0) at h = 0, and

Z Z1
d
Q[f0 + hf] = 2 f0 K f = 2 f02 = 0,
dh
0

i.e. f0 ≡ 0. It follows that



Z1 X
K f(x) = λn en (x) en (y) f(y) dy,
0 n=1

for every f ∈ C [0, 1] and every 0 ≤ x ≤ 1, whence K (x, y) = ∞


P
n=1 λn en (x)en (y) at all points 0 ≤ x, y ≤ 1, by 3).
P
6) Take y = x. Then n≤d λn e2n (x) increases with d to the continuous function K (x, x). Dini’s theorem now tells you
P
that this convergence is uniform. Therefore, the convergence of the sum λn en ⊗en for K (x, y) must be uniform, too.
Check it out.

Item 6) plus 5) is the content of Mercer’s theorem.

Expressing ∆ as a product.
P
Because K = λn en ⊗en , with uniform convergence in the square,

d
Y d
Y ∞
    Y
det (I + λK ) = lim det I + λ (λn en ⊗en ) = lim det I + λ (λn en ⊗en ) = (1 + λλn )
d↑∞ d↑∞
n=1 n=1 n=1

since all the little p × p determinants entering into the series for det (I + λe ⊗ e) vanish for p ≥ 2. Here, there is no
P 
question as to the convergence of the product in view of 2). Besides |∆(λ)| ≤ exp r λn for r = |λ|, so ∆ is of order 1,
only.

Example.

I go back to the symmetric kernel of Section 2.4: K (x, y) = x(1 − y) if x ≤ y. This is


√ positive, being the inverse of −D
2

acting on the domain C [0, 1] ∩ {f : f(0) = 0 = f(1)}, with eigenfunctions en (x) = 2 sin πnx and eigenvalues 1/n π .
2 2 2

But you know the determinant of 1 + λK from Section 2.4 already, and its present evaluation may be looked upon as a
proof of the standard product

∞   √ ∞  
sin λ Y λ2 sh λ Y λ
= 1− 2 2 in the present form √ = 1+ 2 2 .
λ n=1
nπ λ n=1

3. Fredholm applied
I collect here three applications of Fredholm determinants to: 1) Brownian motion, 2) long surface waves in shallow
water, and 3) unitary matrices in high dimension. Quite a variety, no?

3.1. Brownian motion: preliminaries

The Brownian motion describes the path of, e.g. a (visible) dust note in a sun beam, impelled unpredictably, this way
and that, by collisions with the (invisible) molecules of the air. Brown [2] studied it first, in Nature; later, Einstein [9]
related it to diffusion and the underlying heat equation ∂p/∂t = 21 ∂2 p/∂x 2 .

226
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
H.P. McKean

The mathematical set-up required can be reduced to the unit interval, equipped with its standard Lebesgue measure.
I suppose you know something about that. Only the language changes: a measurable set A ⊂ [0, 1] is now an “event”;
the measure of A is its “probability” P(A); a measurable function x : [0, 1] → R is a “random variable”; the integral
R1
0
x ≡ E(x) is its “mean-value” or “expectation”.
With this lingo in place, a collection of random variables x is said to be a Gaussian family if any combination k • x =
k1 x1 + . . . + kn xn with (k1 , . . . , kn ) ∈ Rn − 0 is Gauss-distributed with mean 0 and positive root-mean-square σ , i.e.

Zb
e−x /2σ
2 2

P(a ≤ k • x < b) = √ dx for any a < b.


2πσ
a

Here, σ 2 is a quadratic form in k:

Z+∞
e−x /2σ
2 2
X
σ2 = x2 √ dx = E(k • x)2 = ki kj E(xi xj ) ≡ k • Qk,
2πσ
−∞ 1≤i,j≤n

Q being the (symmetric, positive-definite) matrix of “correlations” Qij = [E(xi xj )]1≤i,j≤n . Now

Z+∞ √
√ e−x /2σ
2 2
−1k•x
e −1 k•x √ dx = e−σ /2 = e−k•Qk/2

E e
2
=
2πσ
−∞

is the Fourier transform of the joint distribution of x1 , . . . , xn . This may be inverted to obtain their joint probability
density as function of x = (x1 , . . . , xn ) ∈ Rn :

−1
√ e−x•Q x/2
Z
e− −1 x•k
e−k•Qk/2 dn k =
1
p(x) = .
(2π)n
p
(2π)n/2 det Q
Rn

You see how the determinant enters in and can imagine where all this could lead. I spell out the computation: Q can be
brought to principal axes by a rotation k → k 0 of Rn . This does not change the volume element dn k but simplifies the
quadratic form as in k • Qk = ni=1 λi (ki0 )2 , the λ’s being the (positive) eigenvalues of Q. Then, with the same rotation
P
0
x→x,
√ √
Z Z
− −1 x 0 •k 0 − λi (ki0 )2 /2 n 0 0
P P
e− −1 x •k e− λi (ki ) /2 dn k.
1 1
p(x) = e e d k
2
n n
=
(2π) (2π)
The joint density is now seen to be the product of

Z+∞ √
0
e −1 xi k e−λi k /2 dk
1 2


−∞

taken over 1 ≤ i ≤ n, i.e.


n 0 2 −1
Y e−(xi ) /2λi e−x•Q x/2
p(x) = √ = p
i=1
2πλi (2π)n/2 det Q

as advertised. The formula itself is nice, and you learn from it two important things: 1) that the statistics of a Gaussian
family are completely determined by its correlations, and 2) that two such families are statistically independent if and
only if they are uncorrelated, one with the other: indeed, if x1 , . . . , xp and xp+1 , . . . , xp+q are uncorrelated, then their joint
correlation Q breaks up into blocks as in the figure,

227
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
Fredholm determinants

Q −1 does likewise, and the joint density splits, which is what independence is all about: if A is any event concerning
xi , i ≤ p, and B any event concerning the other xj , j > p, then P(A ∩ B) = P(A) × P(B).
A little trick will help to convince you that the simple space [0, 1], equipped with its Lebesgue measure, really is rich
enough to support such families. Let x ∈ [0, 1] be expanded in binary notation: x = . x1 x2 . . . The variables xn , n ≥ 1,
are independent (check this!) with common distribution P(xn = 0) = P(xn = 1) = 1/2.
Define

x1 = . x1 x3 x6 x10 . . . ,
x2 = . x2 x5 x9 . . . ,
x3 = . x4 x8 . . . ,
x4 = . x7 . . . ,

and so on.
You see the pattern. These new variables are independent, with common distribution P(x ≤ x) = x, 0 ≤ x ≤ 1. Then
R x0n
the modified variables x0n defined by xn = −∞ (2π)−1/2 e−x /2 dx are also independent with common probability density
2

−1/2 −x 2 /2
p
(2π) e . Now just one more step: if Q is any n × n, symmetric, positive matrix and Q its symmetric positive
root, then the family x00i = nj=1 ( Q)ij x0j , 1 ≤ i ≤ n, is Gaussian with correlation Q.
P p

The generalities now are ended and I turn to the (standard, 1-dimensional) Brownian motion itself. This is the Gaussian
family x(t), t ≥ 0, with correlation E[x(t1 )x(t2 )] = the smaller of t1 and t2 .
It is immediate that 1) P[x(0) = 0] = 1; 2) for t1 ≤ t2 , the increment x(t2 ) − x(t1 ) has mean-square t2 − t1 ; 3) increments
taken over disjoint intervals of time are uncorrelated and so also statistically independent. It follows that for 0 < t1 <
t2 < . . . < tn , the joint probability density of the “observations” x(t1 ), x(t2 ), . . . , x(tn ) is

e−x1 /2t1 e−(x2 −x1 ) /2(t2 −t1 ) e−(xn −xn−1 ) /2(tn −tn−1 )
2 2 2

p(x) = √ p ... p .
2πt1 2π(t2 − t1 ) 2π(tn − tn−1 )

Wiener [26] proved that the Brownian motion has continuous paths. I explain what he meant, technically. The present
machinery cannot deal directly with the event “x(·) is continuous” since it involves an uncountable number of observations
p
of the Brownian path t 7→ x(t). What is needed is something like this: with probability 1, |x(i/n) − x(j/n)| ≤ 3 i/n − j/n
for all 0 ≤ j/n < i/n ≤ 1 with sufficiently small separation max |i/n − j/n| ≤ h, independently of n ≥ 1. This permits
you to complete the countable Gaussian family x(k/n), k ≤ n, n ≥ 1, to the uncountable family x(t) = limk/n↓t x(k/n),
0 ≤ t ≤ 1, with the correct joint distributions for any finite number of observations and guaranteed continuity of paths.
The proof takes some work; see McKean [18] for Ciesielski’s slick version.
It is natural to write the measure induced from [0, 1] onto path-space, symbolically:
R∞ •
e− 0 [x (t)] /2 dt ∞
2

dP = d x.
(2π0+)∞/2

Here nothing is kosher: with probability 1, x• (t) does not exist for any t ≥ 0; (2π0+)∞/2 is silly; and the flat volume
element d∞ x is a bogus, too. Nevertheless, the whole reminds you of what is happening and is even a reliable guide to
computation.

228
Brought to you by | Pontificia Universidad Catolica de Chile
Authenticated
Download Date | 8/13/19 4:23 PM
H.P. McKean

A variant of this “free” Brownian motion figures below. This is the “tied” Brownian motion, alias the free motion,
considered for times t ≤ 1 and conditioned so that x(1) = 0. The conditioning does not change the Gaussian character
of the family, so only the correlation needs to be found.
Here is a neat way to do that. Look at the picture.

The variables
x(1) − x(t1 )
ξ = x(t1 ) − t1 x(1), η = x(t2 ) − x(t1 ) − × (t2 − t1 ), and x(1)
1 − t1
are uncorrelated, as you will check, and so independent. Now condition by x(1) = 0. The independence of ξ and η is not
spoiled, whence the (conditional) correlation between x(t1 ) = ξ and x(t2 ) = ξ (1 − t2 )/(1 − t1 ) + η is just the unconditional
correlation E(ξ 2 ) (1 − t2 )/(1 − t1 ) = t1 (1 − t2 ).
A nice model of the tied Brownian motion is obtained from the free Brownian motion in the form x(t) − tx(1), 0 ≤ t ≤ 1.
You have only to check (in one line) that the correlations are what they should be.
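That one-line check can also be done by machine; here is a sketch (all sizes are my own choices) confirming the correlation t1(1 − t2) of the model x(t) − t x(1):

```python
import numpy as np

rng = np.random.default_rng(3)
n, paths = 400, 40000
t = np.arange(1, n + 1) / n
x = np.cumsum(rng.normal(0, np.sqrt(1 / n), (paths, n)), axis=1)  # free motion
b = x - x[:, [-1]] * t                                            # tied motion

i, j = 99, 299                      # t1 = 0.25, t2 = 0.75
print(np.mean(b[:, i] * b[:, j]))   # empirical
print(t[i] * (1 - t[j]))            # t1 (1 - t2) = 0.0625
```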

3.2. Brownian motion: some applications

Simplest examples.
I want to compute the mean-value of F = exp(−(λ/2)∫₀¹ x²(t) dt) for the free Brownian motion. The task is simplified
by the continuity of the Brownian path, permitting the reduction of F to Fn = exp(−(λ/2) Σ_{k=1}^n x²(k/n)/n) with a
subsequent passage to the limit n ↑ ∞. Write the joint density for x(k/n), 1 ≤ k ≤ n, as

exp(−x•Q^{−1}x/2) / [(2π)^{n/2} √(det Q)]   with Qij = min(i/n, j/n), 1 ≤ i, j ≤ n.

Then

E(Fn) = ∫_{R^n} exp(−x•((λ/n)I + Q^{−1})x/2) / [(2π)^{n/2} √(det Q)] d^n x = √[det((λ/n)I + Q^{−1})^{−1} / det Q] = [det(I + λQ/n)]^{−1/2},

so that

E[exp(−(λ/2)∫₀¹ x² dt)] = 1/√(det(I + λK))

with Fredholm’s determinant and the (symmetric) operator

K : f 7→ ∫₀¹ min(x, y) f(y) dy.


The determinant is easy to compute: K is the (symmetric) inverse to −D² acting on C²[0, 1] ∩ {f : f(0) = f′(1) = 0}, with
eigenfunctions √2 sin π(n + 1/2)x and eigenvalues [π(n + 1/2)]^{−2}, n ≥ 0, so

det(I + λK) = Π_{n=0}^∞ [1 + λ/π²(n + 1/2)²] = ch √λ

and

E[exp(−(λ/2)∫₀¹ x² dt)] = (ch √λ)^{−1/2}.

The computation for the tied Brownian motion is much the same: now Qij = (i/n)(1 − j/n) for i ≤ j; K(x, y) is the
symmetric kernel x(1 − y) if x ≤ y; the associated operator inverts −D² acting on C²[0, 1] ∩ {f : f(0) = f(1) = 0}; the
eigenvalues are (πn)^{−2}, n ≥ 1; and det(I + λK) = λ^{−1/2} sh √λ, as in Section 2.7, so now

E[exp(−(λ/2)∫₀¹ x² dt)] = (λ^{−1/2} sh √λ)^{−1/2}.
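Both determinants are easy to check by brute force: discretize Q as above and compare det(I + λQ/n) with the closed forms. A sketch, with λ = 2 and the grid size as my own choices:

```python
import numpy as np

lam, n = 2.0, 800
s = np.arange(1, n + 1) / n

# free motion: Q_ij = min(i/n, j/n); the limit is ch sqrt(lam)
Q = np.minimum.outer(s, s)
print(np.linalg.det(np.eye(n) + lam * Q / n), np.cosh(np.sqrt(lam)))

# tied motion: Q_ij = (i/n)(1 - j/n) for i <= j; limit sh sqrt(lam)/sqrt(lam)
Qt = np.minimum.outer(s, s) * (1 - np.maximum.outer(s, s))
print(np.linalg.det(np.eye(n) + lam * Qt / n),
      np.sinh(np.sqrt(lam)) / np.sqrt(lam))
```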

P. Lévy’s area.

Let x : t ∈ [0, 1] → R2 be the plane motion of two independent copies x1 & x2 of the tied Brownian motion and consider
P. Lévy’s area [17, p. 254]
A = (1/2)∫₀¹ (x1 dx2 − x2 dx1) = ∫₀¹ x1 dx2 = lim_{n↑∞} An

with

An = Σ_{k=0}^{n−1} x1(k/n) [x2((k+1)/n) − x2(k/n)].

The Brownian path is too irregular for this to be any kind of classical integral, but the limit exists with probability 1 and
is declared to be the area. I ask you to believe this and proceed to the evaluation of the Fourier transform E exp(√−1 λA)
of the distribution of A. The individual Brownian motions x1 and x2 are independent, so their joint probabilities split and
you may begin by fixing x1 and integrating out x2 with the help of the general rule

∫_{R^n} e^{x•y} [e^{−x•P^{−1}x/2} / ((2π)^{n/2} √(det P))] d^n x = e^{y•Py/2}.

Here, P is to be taken as the correlation of the increments of x2 figuring in An , and y is the vector x1 (k/n), 0 ≤ k < n.
You have Pij = −1/n2 plus an extra 1/n on diagonal, as you will check, so with x1 fixed, you have

  
E[e^{√−1 λAn} | x1(t), 0 ≤ t ≤ 1] = exp(−(λ²/2)[Σ_{k=0}^{n−1} x1²(k/n)(1/n) − (Σ_{k=0}^{n−1} x1(k/n)(1/n))²]) = exp(−(λ²/2) y•Py).

It is convenient to shift the summations above from 0 ≤ k < n to 0 ≤ k ≤ n; this makes no difference to the limit. Then,
with the correlations
   
Qij = (i/n)(1 − j/n), i ≤ j, for x1(k/n), 1 ≤ k ≤ n, alias y,


you find

E[e^{√−1 λAn}] = ∫ e^{−λ² x•Px/2} [e^{−x•Q^{−1}x/2} / ((2π)^{n/2} √(det Q))] d^n x = √[det(λ²P + Q^{−1})^{−1} / det Q] = 1/√(det(I + λ²QP)),

in which QP = Q/n − (Q1/n²)⊗1. This matrix approximates the operator with the familiar kernel K(x, y) = x(1 − y)
minus e⊗1 with e = K1, so E exp(√−1 λA) is 1 over the square root of Fredholm’s determinant

det[I + λ²(K − e⊗1)] = det(I + λ²K) × det[I − λ²(I + λ²K)^{−1} e⊗1] = λ^{−1} sh λ × (1 − sp[λ²(I + λ²K)^{−1} e⊗1]).

It remains to evaluate the trace (sp), which is not hard: K = Σ_{n=1}^∞ en⊗en/n²π² with en = √2 sin nπx for n ≥ 1, so

sp[λ²(I + λ²K)^{−1} e⊗1] = λ² ∫₀¹ (I + λ²K)^{−1} e dx = λ² Σ_{n=1}^∞ (∫₀¹ en)²/(λ² + n²π²) = 8λ² Σ_{odd n} 1/[(λ² + n²π²) n²π²] = 1 − (2/λ) sh(λ/2)/ch(λ/2).
The last expression comes from the product λ^{−1} sh λ = Π_{n=1}^∞ (1 + λ²/n²π²), by logarithmic differentiation and a bit of
manipulation which I leave to you. Then
det[I + λ²(K − e⊗1)]   reduces to   [sh(λ/2)/(λ/2)]²

for the final evaluation:

E[e^{√−1 λA}] = (λ/2)/sh(λ/2).
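A Monte-Carlo check of the final evaluation, with A approximated by An as above; the sample sizes and the choice λ = 3 are mine, and the cosine suffices since the law of A is symmetric:

```python
import numpy as np

rng = np.random.default_rng(5)
n, paths, lam = 400, 40000, 3.0
t = np.arange(1, n + 1) / n

def tied(paths):
    x = np.cumsum(rng.normal(0, np.sqrt(1 / n), (paths, n)), axis=1)
    return x - x[:, [-1]] * t          # the model x(t) - t x(1)

x1, x2 = tied(paths), tied(paths)
A = np.sum(x1[:, :-1] * np.diff(x2, axis=1), axis=1)   # A_n = sum x1 dx2

print(np.mean(np.cos(lam * A)))        # Monte-Carlo E cos(lam A)
print((lam / 2) / np.sinh(lam / 2))    # exact: ~ 0.7045 at lam = 3
```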

Inverting.
This is all very well, but what is really wanted in these examples is the inverse transform, i.e. the probability density
of the variable in hand. P. Lévy’s area A is the simplest in this regard, the density being (π/2) × ch^{−2}(πx), as is easily
checked. I leave it to you: just compute the inverse transform (2π)^{−1} ∫ e^{−√−1 λx} (λ/2)[sh(λ/2)]^{−1} dλ for x < 0 by residues
in the upper half-plane.
The other inversions, of 1) (ch √λ)^{−1/2} and 2) (λ^{−1/2} sh √λ)^{−1/2}, are more difficult. 1) was done by Cameron and Martin
[3, pp. 195–209]. The present computation is more economical:

1/√(ch √λ) = √2 e^{−√λ/2}/√(1 + e^{−2√λ}) = √2 Σ_{n=0}^∞ (−1)^n [1·3·5·…·(2n−1)/(2·4·6·…·2n)] e^{−(2n+1/2)√λ}
= √2 Σ_{n=0}^∞ (−1)^n [1·3·5·…·(2n−1)/(2·4·6·…·2n)] (n + 1/4) ∫₀^∞ [e^{−(n+1/4)²/x}/√(πx³)] e^{−λx} dx,

where

1/√(1 − x) = Σ_{n=0}^∞ [1·3·5·…·(2n−1)/(2·4·6·…·2n)] x^n

is used for the second equality and

∫₀^∞ [x/√(2πt³)] e^{−x²/2t} e^{−λt} dt = e^{−x√(2λ)}

for the third one; the latter is easily deduced from the fact that p = (2πt)^{−1/2} e^{−x²/2t} solves the heat equation ∂p/∂t =
∂²p/2∂x². The probability density of ∫₀¹ x²(t) dt for the free Brownian motion may now be read off:

√2 Σ_{n=0}^∞ (−1)^n [1·3·5·…·(2n−1)/(2·4·6·…·2n)] × (n + 1/4) e^{−(n+1/4)²/x}/√(πx³).

It is related to one of Jacobi’s theta functions. Not very attractive, but there it is. The inversion of 2) (λ^{−1/2} sh √λ)^{−1/2}
is worse, so I leave the matter here.
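If you distrust the bookkeeping, feed the series back through the Laplace transform by machine; the sketch below (the 40 terms and the quadrature grid are my own choices) recovers (ch √λ)^{−1/2} to several digits:

```python
import numpy as np

lam = 2.0

def q(x, terms=40):
    # sqrt(2) sum (-1)^n c_n (n + 1/4) exp(-(n + 1/4)^2/x) / sqrt(pi x^3),
    # c_n = 1*3*...*(2n-1)/(2*4*...*2n), c_0 = 1 being the empty product
    total, c = 0.0, 1.0
    for n in range(terms):
        a = n + 0.25
        total = total + (-1) ** n * c * a * np.exp(-a * a / x)
        c *= (2 * n + 1) / (2 * n + 2)
    return np.sqrt(2) * total / np.sqrt(np.pi * x ** 3)

x = np.linspace(1e-4, 60.0, 200001)
dx = x[1] - x[0]
print(np.sum(q(x) * np.exp(-lam * x)) * dx)   # the transform, numerically
print(1 / np.sqrt(np.cosh(np.sqrt(lam))))     # (ch sqrt(lam))^(-1/2)
```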


3.3. Robt. Brown’s Brownian motion

I come to the subject by indirection.


Let’s modify the symbolic expression of the free Brownian measure for “short” paths x(t), 0 ≤ t ≤ 1, by the introduction
of a density function, as in

e^{m/2} × exp(−(m/2) x²(1) − (m²/2)∫₀¹ x²) × [exp(−(1/2)∫₀¹ (x•)²) / (2π0+)^{∞/2}] d^∞x.

The exponent is of degree 2 in x, so the measure is still Gaussian, up to a normalizer to make the total mass
be 1. That is the role of the factor e^{m/2}: The total mass of the rest, alias the free Brownian mean-value of
exp(−m x²(1)/2 − (m²/2)∫₀¹ x²), may be reduced, in the now familiar way, to the reciprocal square root of ∆ =
det(I + m²K + mK e⊗e) with K(x, y) = min(x, y) and e the unit mass at x = 1, i.e. ∫₀¹ fe ≡ f(1). The symbol
e⊗e is not quite kosher but don’t worry: it is only of rank 1 and will take care of itself. The computation of this
determinant follows:

∆ = det[(I + m²K) × (I + (I + m²K)^{−1} mK e⊗e)] = ch m × (1 + sp[(I + m²K)^{−1} mK e⊗e])
= ch m × (1 + (I + m²K)^{−1} mK evaluated at (1, 1)).

Now you need to know (I + m²K)^{−1} ≡ I + G. Here, K is inverse to −D² acting on C²[0, 1] ∩ {f : f(0) = 0 = f′(1)}, so
G is inverse to D²/m² − I acting on the same domain, and an easy computation produces its (symmetric) kernel

G(x, y) = −(m/ch m) sh mx ch m(1 − y)   if x ≤ y.

Then

(I + m²K)^{−1} K = (1/m²)(I + m²K)^{−1} × (I + m²K − I) = (1/m²)[I − (I + m²K)^{−1}] = −G/m²,

so ∆ = ch m × (1 − G(1, 1)/m) = e^m, as wanted, i.e. the measure is normalized.
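The normalization ∆ = e^m may be verified numerically as well; in the sketch below (grid and m are my own choices), the rank-one piece mK e⊗e takes no quadrature weight in its second slot, e being a point mass:

```python
import numpy as np

m, n = 1.5, 2000
h = 1.0 / n
s = np.arange(1, n + 1) * h            # grid on (0, 1], last point is x = 1
K = np.minimum.outer(s, s)             # kernel min(x, y)

M = np.eye(n) + m * m * K * h          # I + m^2 K
M[:, -1] += m * K[:, -1]               # + m K(x, 1) f(1): the m K e(x)e piece
print(np.linalg.det(M), np.exp(m))     # ~ e^1.5 = 4.4817
```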
But what Gaussian measure is this anyhow? The correlation

Q = e^{m/2} E[exp(−(m/2) x²(1) − (m²/2)∫₀¹ x²) x(t1) x(t2)]

will tell, or, what is just as good, the associated quadratic form

(f, Qf) = e^{m/2} E[exp(−(m/2) x²(1) − (m²/2)∫₀¹ x²) (∫₀¹ fx)²],

which may be obtained from Z = e^{m/2} E[exp(−m x²(1)/2 − (m²/2)∫₀¹ x²) exp(−(λ/2)(∫₀¹ fx)²)] by taking −2∂/∂λ at λ = 0.
The computation follows the pattern for the normalizer: Z is reduced to

e^{m/2} [det(I + m²K + mK e⊗e + λK f⊗f)]^{−1/2}

from which rule 6 of Section 2.4 produces

(f, Qf) = −2e^{m/2} × (−1/2) [det(I + m²K + mK e⊗e)]^{−3/2} × det(I + m²K + mK e⊗e) × sp[(I + m²K + mK e⊗e)^{−1} K f⊗f].



Now det(I + m²K + mK e⊗e) = e^m, as you already know, so line 1 is cancelled. Only line 2 remains, which is to say

Q = (I + m²K + mK e⊗e)^{−1} K = [I + (I + m²K)^{−1} mK e⊗e]^{−1} (I + m²K)^{−1} K.

Here, (I + m²K)^{−1} K = −G/m² as before, and the simple rule

(I + a⊗b)^{−1} = I − (1 + (a, b))^{−1} a⊗b   with a = −Ge/m and b = e

produces

Q = [I − (1 − G(1, 1)/m)^{−1} (−(1/m)Ge)⊗e] × (−(1/m²)G),

and since 1 − G(1, 1)/m = 1 + th m = e^m/ch m, this is to say

Q(x, y) = −(1/m²) G(x, y) − (e^{−m} ch m/m³) G(x, 1) G(y, 1) = (1/m) sh mx e^{−my}   for x ≤ y,

as you will check, only now you should put x = t1 and y = t2.


OK. But, still, what is that? Something more tangible is wanted. A symbolic calculation will guide us: Let F(x) be any
nice function of the short path x(t), 0 ≤ t ≤ 1. Then

e^{m/2} ∫ F(x) exp(−(m/2) x²(1) − (m²/2)∫₀¹ x²) [exp(−(1/2)∫₀¹ (x•)²) / (2π0+)^{∞/2}] d^∞x
= e^{m/2} ∫ F(x) [exp(−(1/2)∫₀¹ (x• + mx)²) / (2π0+)^{∞/2}] d^∞x.
0

Put x• + mx = y• , or what is the same for x(0) = y(0) = 0,

x(t) = e−mt ∗ y• = y(t) − me−mt ∗ y,

where ∗ stands for the convolution. The prior display now takes the form

e^{m/2} ∫ F(e^{−mt} ∗ y•) [exp(−(1/2)∫₀¹ (y•)²) / (2π0+)^{∞/2}] det(∂x/∂y) d^∞y

with the formal Jacobian determinant of x in respect to y:

∂x(t1)/∂y(t2) = [the identity] − m × e^{m(t2−t1)} if t2 < t1, and = 0 otherwise.

This is ambiguous on diagonal, but if you take it to be −m/2 by way of compromise, then the Jacobian determinant is
e−m/2 , and the expectation assumes its final form:
  
e^{m/2} E[F(x) exp(−(m/2) x²(1) − (m²/2)∫₀¹ x²)] = E[F(e^{−mt} ∗ y•)]

with the free Brownian mean-value to the right. The formula originates with Cameron and Martin [3]; it suggests that
the Gaussian probabilities
   
e^{m/2} exp(−(m/2) x²(1) − (m²/2)∫₀¹ x²) [exp(−(1/2)∫₀¹ (x•)²) / (2π0+)^{∞/2}] d^∞x


are one and the same as what the free Brownian y induces on path space by the transformation y(t) → x(t) = e^{−mt} ∗ y•;
in fact, for these paths x,

E[x(t1) x(t2)] = e^{−mt1} e^{−mt2} ∫₀^{t1} e^{2mt} dt = (1/m) sh(mt1) e^{−mt2}   for t1 ≤ t2,

as it ought to be. Take that as an exercise.
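Here is that exercise done by machine, a sketch with my own step size, sample count, and check times; x(t) = e^{−mt} ∗ y• is realized as an exponentially damped running sum of the white noise:

```python
import numpy as np

rng = np.random.default_rng(8)
m, n, paths = 2.0, 1000, 20000
t = np.arange(1, n + 1) / n
dy = rng.normal(0, np.sqrt(1 / n), (paths, n))

x = np.zeros_like(dy)
acc = np.zeros(paths)
damp = np.exp(-m / n)
for k in range(n):                     # x = exp(-mt) * dy, step by step
    acc = acc * damp + dy[:, k]
    x[:, k] = acc

i, j = 299, 699                        # t1 = 0.3, t2 = 0.7
print(np.mean(x[:, i] * x[:, j]))                  # empirical
print(np.sinh(m * t[i]) * np.exp(-m * t[j]) / m)   # sh(mt1) e^{-mt2}/m
```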


Finally, I come to the “real” Brownian motion: Ornstein–Uhlenbeck [23] advocated x as a model of velocity of the
Brownian motion of Robt. Brown. I change notation accordingly, from x to v. Then the position of the Brownian particle,
starting from rest at 0, is

x(t) = ∫₀^t v(t′) dt′

and you have, symbolically, Newton’s law:

[mass 1] × [acceleration] = d²x/dt² = dv/dt = −mv + dy/dt = [force],

in which −mv is descriptive of air resistance (drag) and dy/dt of the (thermal) agitation of the air molecules. Now
intensify this agitation by a factor m, so that drag and agitation are comparable, and make m ↑ ∞. Then

d²x/dt² = −m (dx/dt) + m (dy/dt),

and with x(0) = v(0) = 0, you have

x(t) = y + [a negligible error of root-mean-square √((1 − e^{−2mt})/2m)].

In this way, Ornstein–Uhlenbeck reconciled their “real” Brownian motion x with that of Einstein [9] who took it to be
the free Brownian motion y, plain.

3.4. Shallow water: background and preliminaries

The equation of Korteweg–de Vries (1895) is descriptive of the leading edge of long surface waves in shallow water; in
convenient units, it reads: ∂v/∂t + v∂v/∂x = ∂3 v/∂x 3 , v(t, x) being the velocity of the wave, taken to be independent of
y as in the figure.


Here, you have a competition between 1) ∂v/∂t + v ∂v/∂x = 0 and 2) ∂v/∂t = ∂³v/∂x³. 1) exhibits “shocks”. 2) is “disper-
sive”. I explain: 1) states that points at height h on the graph of v move to the right at speed h, so if f = v(0+, ·) is in-
creasing nothing bad happens, but if it is decreasing, then high values overtake low values and pass beyond, so that v(t, ·)
ceases to be a single-valued function. The inception of this turning over is a “shock”. As to 2), v(t, x) = sin k(x − k²t)
is a solution, showing that waves of different frequencies (k) move at different speeds (k²). This is “dispersion”. In KdV,
dispersion overcomes the shocks and the solution is perfectly nice for all times t ≥ 0 and, indeed, for t ≤ 0, too.
The explicit solution of KdV by Gardner–Greene–Kruskal–Miura [11] makes a fascinating story. I cannot tell it here,
but see Lamb [14] for the best introduction to what has come to be called “KdV and all that”. But one remarkable aspect
of it can be told quite simply, following a beautiful computation of Ch. Pöppe [21].
Take a solution w = w(t, x) of KdV with the non-linearity v∂v/∂x crossed out, i.e. ∂w/∂t = ∂3 w/∂x 3 , and use it to make
an operator W on the half-line [0, ∞), depending upon 0 ≤ t < ∞ and −∞ < x < ∞, as in

W : f ∈ C[0, ∞) 7→ ∫₀^∞ w(t, x + ξ + η) f(η) dη.

Now let ∆ = ∆(t, x) be the determinant det (I + W ), introduced into the subject by Dyson [8]. Then, miraculously,
v(t, x) = −(1/12) ∂2 ln ∆/∂x 2 solves KdV, which is to say that this machinery w → ∆ → v converts the (easy) linear flow
∂w/∂t = ∂3 w/∂x 3 into the (hard) non-linear KdV flow ∂v/∂t + v ∂v/∂x = ∂3 v/∂x 3 , but more of this below.
Of course, you must take care that v makes sense: The function little w had better decay at x = +∞ so that ∆ is kosher
as a Fredholm determinant, as would be the case if w(0+, ·) were smooth and rapidly vanishing at ±∞ and you took

w(t, x) = (1/2π) ∫_{−∞}^{+∞} e^{−√−1 kx} e^{√−1 k³t} ŵ(0+, k) dk,

in which ˆ signifies the transform f 7→ f̂(k) = ∫ e^{√−1 kx} f(x) dx. Then ∆ = 1 at x = +∞, but you still need to keep ∆ positive
elsewhere so that ln ∆ is nice, which is to say that I + W has never any null-space. Now if f ∈ C[0, ∞) is such a (real)
null-function, then the rapid vanishing of w at +∞ imparts the same decay to f itself, and with the understanding that
f(ξ) vanishes for ξ ≤ 0, you may conclude from
f(ξ) + (1/2π) ∫ e^{−√−1 kξ} e^{−√−1 kx} e^{√−1 k³t} ŵ(0+, k) f̂(−k) dk = 0
that
∫₀^∞ f² = (1/2π) ∫_{−∞}^{+∞} |f̂|² = (1/2π) ∫_{−∞}^{+∞} |ŵ|² |f̂|² dk

in view of f̂(k)* = f̂(−k), f being real. This cannot be maintained if |ŵ| < 1 unless f̂, and so also f, vanishes: in fact,
|ŵ| ≤ 1 suffices and is necessary, as well, but that is a little longer story which I omit. This ends the preparations.

3.5. Pöppe’s computation


Pöppe showed, by direct manipulation of determinants, that −(1/12)(ln ∆)″ solves KdV. The simpler function V ≡ −(ln ∆)′
is better suited to the proof. It is −sp (I + W)^{−1}DW with D = d/dx, by rule 6 of Section 2.4, and may be reduced to
(1/2)[(I + W)^{−1}W](0, 0) by means of the “Hankel” character of W, by which I mean that its kernel is a function of ξ + η
alone: in fact, the kernel of (I + W)^{−1}DW is (∂/∂η)[(I + W)^{−1}W](ξ, η), so

V = −sp (I + W)^{−1}DW = −(1/2) sp [DW(I + W)^{−1} + (I + W)^{−1}DW]
= −(1/2) sp [D(I + W)^{−1}W + (I + W)^{−1}DW] = −(1/2) ∫₀^∞ (d/dξ)[(I + W)^{−1}W](ξ, ξ) dξ = (1/2)[(I + W)^{−1}W](0, 0).


I will need V′, V″, V‴, (V′)², and also V•. The first three are easy, as is the last: with the notation W′ = DW etc. and
abbreviating [ ](0, 0) by [ ] plain, you find

V′ = −(1/2)[(I + W)^{−1}W′(I + W)^{−1}W] + (1/2)[(I + W)^{−1}W′] = (1/2)[(I + W)^{−1}W′(I + W)^{−1}],

V″ = −[(I + W)^{−1}W′(I + W)^{−1}W′(I + W)^{−1}] + (1/2)[(I + W)^{−1}W″(I + W)^{−1}],

V‴ = 3[(I + W)^{−1}W′(I + W)^{−1}W′(I + W)^{−1}W′(I + W)^{−1}] − 3[(I + W)^{−1}W″(I + W)^{−1}W′(I + W)^{−1}]
+ (1/2)[(I + W)^{−1}W‴(I + W)^{−1}],

and

V• = (1/2)[(I + W)^{−1}W‴(I + W)^{−1}].

The reduction of (V′)² to like form employs the Hankel trick again: Let the operators A, B, C and D have nice kernels
and good decay at +∞, abandoning temporarily the prior meaning of D, and let B and C be Hankel besides. Then,
with an obvious notation,

[AB](0, 0)[CD](0, 0) = −∫₀^∞ (d/dξ)[AB](0, ξ) × [CD](ξ, 0) dξ − ∫₀^∞ [AB](0, ξ) × (d/dξ)[CD](ξ, 0) dξ
= −∫₀^∞ [AB′](0, ξ)[CD](ξ, 0) dξ − ∫₀^∞ [AB](0, ξ)[C′D](ξ, 0) dξ = −[A(B′C + BC′)D](0, 0).

Now write 2V′ in its initial form:

2V′ = [(I + W)^{−1}W′] − [(I + W)^{−1}W′(I + W)^{−1}W],

dropping the (0, 0) as before, and square:

4(V′)² = [(I + W)^{−1}W′][W′(I + W)^{−1}] − 2[(I + W)^{−1}W′][W(I + W)^{−1}W′(I + W)^{−1}]
+ [(I + W)^{−1}W′(I + W)^{−1}W][W(I + W)^{−1}W′(I + W)^{−1}],
in which the symmetric character of W, W′, and (I + W)^{−1} is used to manipulate the second factors one by one. Then
apply the ABCD rule to produce

4(V′)² = −[(I + W)^{−1}(W″W′ + W′W″)(I + W)^{−1}] + 2[(I + W)^{−1}(W″W + (W′)²)(I + W)^{−1}W′(I + W)^{−1}]
− [(I + W)^{−1}W′(I + W)^{−1}(W′W + WW′)(I + W)^{−1}W′(I + W)^{−1}],
and write out V • − V 000 + 6(V 0 )2 in the present notation:

V• − V‴ + 6(V′)²
= (1/2)[(I + W)^{−1}W‴(I + W)^{−1}]   (1)
− 3[(I + W)^{−1}W′(I + W)^{−1}W′(I + W)^{−1}W′(I + W)^{−1}]   −(2)
+ 3[(I + W)^{−1}W″(I + W)^{−1}W′(I + W)^{−1}]   (3)
− (1/2)[(I + W)^{−1}W‴(I + W)^{−1}]   −(4)
− 3[(I + W)^{−1}W″W′(I + W)^{−1}]   −(5)
+ 3[(I + W)^{−1}W″(I − (I + W)^{−1})W′(I + W)^{−1}]   (6) − (7)
+ 3[(I + W)^{−1}(W′)²(I + W)^{−1}W′(I + W)^{−1}]   (8)
− 3[(I + W)^{−1}W′(I + W)^{−1}W′(I − (I + W)^{−1})W′(I + W)^{−1}]   −(9) + (10)


in which (1) and (4) cancel, also (5) and (6), (3) and (7), (8) and (9), (2) and (10), i.e. the whole thing vanishes.
Remarkable! Now take v = V 0 . Then v • − v 000 + 12vv 0 = 0 which is KdV aside from the nuisance factor 12, and this is
removed by changing v into v/12.

Example: solitons.
Take, as solution of ∂w/∂t = ∂3 w/∂x 3 , the function
w(t, x) = Σ_{i=1}^n mi exp(−ki x − ki³ t)

with positive numbers m and distinct positive k’s. Then v = −(1/12)(ln ∆)00 solves KdV. Here, ∆ can be expressed pretty
explicitly. Let’s look for eigenfunctions of W at t = 0:

W : f 7→ Σ mi e^{−ki ξ} ∫₀^∞ e^{−ki η} f(η) dη,
so any such eigenfunction is of the form f(ξ) = Σ_{j=1}^n fj e^{−kj ξ}, and a self-evident computation shows that Wf = λf is the
same as to say

λ fi = Σ_{j=1}^n [mi/(ki + kj)] fj   for 1 ≤ i ≤ n,

i.e. ∆ = det(I + W) is the ordinary determinant

det[I + mi/(ki + kj)]_{1≤i,j≤n},

which may be spelled out in Fredholm’s way as

1 + Σ_{p=1}^n Σ_{|n|=p} (m/2k)^n Π_{i<j, i,j∈n} [(ki − kj)/(ki + kj)]²,

in which n is a collection of indices 1 ≤ n1 < n2 < ⋯ < np ≤ n and (m/2k)^n means (m_{n1}/2k_{n1}) × (m_{n2}/2k_{n2}) × etc.
I leave this to you as an exercise with a hint to be verified first:
det[1/(ki + kj)]_{1≤i,j≤n} = Π_{i<j} (ki − kj)² / Π_{i,j} (ki + kj).

Omitting the corrective 1/12, the corresponding solution of KdV is then v(t, x) = −(ln ∆)00 with
 
∆ = det[I + mi/(ki + kj) e^{−ki x} e^{−ki³ t}]_{1≤i,j≤n}.

For n = 1, ∆ is simply 1 + (m/2k) exp(−kx − k³t), and

v(t, x) = −(k²/4) ch^{−2}(kx/2 + k³t/2 + (1/2) ln(2k/m)),

as you will check. This is the famous “soliton” of “KdV and all that”, representing, e.g. a solitary wave (tsunami) in the
open ocean, traveling east to west; compare Lamb [14]. For n ≥ 2, v is a non-linear combination of such solitons. Here
is why: The soliton has amplitude k 2 /4 and speed k 2 , i.e. big solitons go fast, small solitons go slow. Now you can
start off a solution of KdV by placing widely-spaced solitons in order of diminishing amplitude/speed, right to left. They
hardly feel each other: the individual soliton dies off fast from its peak value, so the non-linearity vv 0 does not count for
much, and you can simply add the n solitons together. Now turn on KdV: The big solitons catch up to the little ones,
and a complicated (non-linear) interaction takes place, but by and by you see the original solitons emerging to the left,
moving off to −∞ in order of increasing amplitude/speed, left to right, with a little evanescent “radiation” in between.
It is this evolution that v = −(ln ∆)00 describes. I omit the computation, but try it for n = 2.
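Here is the n = 2 experiment by machine, a sketch with parameters of my own choosing (k = 1, 2 and m = 2k, so that the phase terms ln(2k/m) vanish): ∆ is the explicit 2×2 determinant, v = −(ln ∆)″ is taken by a centered second difference (the corrective 1/12 omitted, as above), and long before and after the collision you see two clean troughs of depths ≈ −k²/4, traveling at speeds k²:

```python
import numpy as np

k = np.array([1.0, 2.0])
m = np.array([2.0, 4.0])               # m_i = 2 k_i, for convenience

def v(t, x, h=1e-3):
    def logdet(xx):
        M = np.eye(2) + (m[:, None] / (k[:, None] + k[None, :])) \
            * np.exp(-k[:, None] * xx - k[:, None] ** 3 * t)
        return np.log(np.linalg.det(M))
    return -(logdet(x + h) - 2 * logdet(x) + logdet(x - h)) / h ** 2

for t, lo, hi in ((-12.0, 0.0, 60.0), (12.0, -60.0, 0.0)):
    xs = np.linspace(lo, hi, 1201)
    w = np.array([v(t, x) for x in xs])
    dips = [i for i in range(1, len(w) - 1)
            if w[i] < w[i - 1] and w[i] < w[i + 1] and w[i] < -1e-3]
    print(t, [(round(xs[i], 1), round(w[i], 3)) for i in dips])
    # troughs near x = -t and x = -4t, of depths ~ -0.25 and -1.0
```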


Coda.

Now the only trouble is that you don’t know how to pick w(0+, ·) to produce the initial velocity profile v(0+, ·) you want.
That is a pretty long story, how it’s done. Here, I am after determinants and just refer you to Lamb [14] and Dyson [8] for
the full story. It involves an elaborate detour via a sort of non-linear Fourier transform (“scattering”), quite remarkable
in itself but, to my mind, a little wrong-headed. If only you could invert the map w(0+, ·) → v(0+, ·) directly, in real
space so to say, then all would be well. I like to make this caricature: To solve x• = 1 is easy even if you know nothing;
x• = √(1 − x²) looks harder, but if you have the wit to introduce the special function y = sin^{−1} x = ∫₀^x (1 − r²)^{−1/2} dr, then
y• = 1, x = sin y, and you’re back in business. The map v → w is just a variant of this device, i.e. it is an ∞-dimensional
“special function”, inverse to w → v = −(ln ∆)″. How bad can it be? The question merits a direct answer.

3.6. The unitary ensemble: background and preparations

The idea to model atomic energy levels by the spectra of random matrices is due to Wigner [27]. The subject has been
much developed since then; see Mehta [19] for the best general account. Curiously, there even seems to be a mysterious
connection to Gauss’s prime number theorem, #{primes ≤ n} ≃ n/log n, mediated by Riemann’s zeta function, for which
see Conrey [4]. The group G of N×N unitary matrices provides the simplest example of what people do. G is equipped
with its (one and only) translation invariant volume element, normalized to make the total volume equal 1. That is the
only natural way to introduce probabilities into the picture, and it is the statistics of the eigenvalues e^{√−1 θ} of a “typical”
unitary matrix which is sought, especially in high dimensions N ↑ ∞. The pretty fact is that these statistics can be
expressed by means of Fredholm determinants: for example, for N < ∞, the probability that no eigenvalue falls in the
circular arc {e^{√−1 θ} : α ≤ θ ≤ β} is

det[I − (1/2π) sin N(θ − φ)/2 / sin (θ − φ)/2]_{α≤θ,φ≤β}.

The discussion is adapted from Tracy–Widom [22], more or less.

The invariant volume.



I follow Weyl [24, pp. 194–198]. Let D ⊂ G be the commutative subgroup of diagonal unitaries e = e^{√−1 θ}, by which I
mean e^{√−1 θn}, 1 ≤ n ≤ N, on diagonal.

The figure depicts G fibered by D over the base G/D. To understand what is meant, take a general element g ∈ G
and bring it, as you may, to diagonal form by a unitary similarity: g = heh−1 with e ∈ D and h ∈ G. Here, h can
be multiplied from the right by any e0 ∈ D without changing g, so it is only the coset hD that counts. In fact, for g in
“general position”, both this coset and the covering point e figuring in g = heh−1 are fully determined in the small by
the requirement that e not change much in response to a little change in g – e is made from the eigenvalues of g which
can only be permuted, and for g in general position, such a permutation is a big change, so if now h is determined at
the base, then also e is known, and likewise in a small patch about g = heh−1 , with e and h changing only a little, by
de and dh in response to a little change dg in g. Obviously


g^{−1} dg = he^{−1}h^{−1} (dh e h^{−1} + h de h^{−1} − heh^{−1} dh h^{−1}) = h (e^{−1} h^{−1}dh e − h^{−1}dh + √−1 dθ) h^{−1},


in which you may reduce the apparent supernumerary (real) degrees of freedom (N² for dh plus N more for dθ)
to the correct number (N²) by the observation that h^{−1}dh is pure imaginary on diagonal, permitting the restriction
h^{−1}dh = √−1 dθ there. The volume element Ω is now produced by wedging the several 1-forms g^{−1}dg up to top
dimension. Each of these is insensitive to left translations, so Ω inherits this feature. Besides, the associated line
element

sp[(g^{−1}dg)† g^{−1}dg] = sp[(dg)† dg] ≡ |dg|²

is insensitive to the presence of h^{−1} to the left and h to the right, so these can be dropped and the bare 1-forms
e^{−1}h^{−1}dh e − h^{−1}dh + √−1 dθ wedged up to top dimension to produce the volume element:

Ω = Π_{i≠j} (ej/ei − 1) dθ1 dθ2 ⋯ × [a volume element ω on G/D],

or, taking the factors of the product by pairs,

Ω = Π_{i<j} |ei − ej|² dθ × ω.
i<j

The spectral density.

The form of ω does not matter anymore. What is wanted is the volume element induced upon the eigenvalues:

Z^{−1} × Π_{i<j} |e^{√−1 θi} − e^{√−1 θj}|² Π dθk ≡ p(θ) dθ,

in which Z is a normalizing factor, to be determined presently, fixing the total volume at 1 so that these are probabilities.
fixing the total volume at 1 so that these are probabilities.

To put them in a better form, introduce the matrix Q = e −1 (i−1) θj 1≤i,j≤N with the Vandermonde-type determinant:

Y √ √
−1 θj −1 θi

e −e .
i<j

Then

Π_{i<j} |e^{√−1 θj} − e^{√−1 θi}|² = det Q† × det Q = det Q†Q,

and from

(Q†Q)ij = Σ_{k=0}^{N−1} e^{√−1 (θi−θj)k} = [e^{√−1 θiN/2}/e^{√−1 θi/2}] × [sin N(θi − θj)/2 / sin (θi − θj)/2] × [e^{−√−1 θjN/2}/e^{−√−1 θj/2}],

in which the outside phase factors conjugate away in the determinant, you find

p(θ) = Z^{−1} det[K(θi − θj)]_{1≤i,j≤N}

with

K(θ − φ) = (1/2π) sin N(θ − φ)/2 / sin (θ − φ)/2

and the modified normalizer

Z = ∫ det[(1/2π) Σ_{k=0}^{N−1} e^{√−1 (θi−θj)k}]_{1≤i,j≤N} dθ.


Now

det[Σ_{k=0}^{N−1} e^{√−1 (θi−θj)k}]_{1≤i,j≤N} = Σ_{k1=0}^{N−1} ⋯ Σ_{kN=0}^{N−1} det[e^{√−1 (θi−θj)ki}]_{1≤i,j≤N} = Σ_{k1=0}^{N−1} ⋯ Σ_{kN=0}^{N−1} det[e^{√−1 θi(ki−kj)}]_{1≤i,j≤N},

and as the latter determinant vanishes unless the k’s are distinct, so Z reduces to the sum over such k’s of

Σ_π [χ(π)/(2π)^N] ∫ Π_{i=1}^N e^{√−1 θi(ki − kπi)} dθ.

But all these integrals vanish unless π is the identity, so the whole reduces to the number of choices of N distinct k’s
from 0, 1, . . . , N − 1, i.e. Z = N! .
The same style of computation produces a nice formula for the marginal density p(θ1, . . . , θn) with the other angles θk,
N ≥ k > n, integrated out:

p(θ1, . . . , θn) = [(N − n)!/N!] det[K(θi − θj)]_{1≤i,j≤n}.

I leave that to you as an exercise.
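For n = 1 the marginal is the constant K(0)/N = 1/2π, i.e. a single angle is uniformly distributed over the circle, though the angles are far from independent. This much is easily checked by sampling; the sketch below draws Haar-distributed unitaries by the usual QR recipe (dimension, sample count, and the phase fix-up are my own choices):

```python
import numpy as np

rng = np.random.default_rng(10)
N, reps = 8, 4000

def haar_unitary(n):
    z = (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))       # fix the phases so the law is Haar

angles = np.concatenate(
    [np.angle(np.linalg.eigvals(haar_unitary(N))) for _ in range(reps)])
hist, _ = np.histogram(angles, bins=8, range=(-np.pi, np.pi), density=True)
print(hist)                          # every bin ~ 1/(2 pi) = 0.159
```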

3.7. Occupation numbers, determinants and the scaling limit

Now Fredholm determinants enter the story. Let # be the number of angles θ that fall within an interval J ⊂ [0, 2π).
It is desired to evaluate the probabilities P(# = n) for n ≥ 0. Everything is symmetric in the angles and there are
“N choose n” ways of specifying which particular angles fall in J, so

P(# = n) = (N choose n) × P(∩_{i=1}^n Ei ∩ ∩_{j=n+1}^N Ej′)

in which Ei is the event “θi ∈ J” and Ej′ the complement of Ej, i.e. “θj ∉ J”. This yields to the principle of inclusion-
exclusion: with ∩_{i=1}^n Ei = E for short, you have

   
P(E ∩ ∩_{j=n+1}^N Ej′) = P(E) − P(E ∩ ∪_{j=n+1}^N Ej)
= P(E) − Σ_{i>n} P(E ∩ Ei) + Σ_{j>i>n} P(E ∩ Ei ∩ Ej) − Σ_{k>j>i>n} P(E ∩ Ei ∩ Ej ∩ Ek) + ⋯
= Σ_{p=n}^N (−1)^{p−n} (N−n choose p−n) P(E ∩ E_{n+1} ∩ ⋯ ∩ Ep)   by symmetry
= Σ_{p=n}^N (−1)^{p−n} (N−n choose p−n) [(N − p)!/N!] ∫_{J^p} det[K(θi − θj)]_{1≤i,j≤p} dθ,

which reduces, after multiplication by “N choose n”, to

P(# = n) = Σ_{p=n}^N [(−1)^{p−n}/(n!(p − n)!)] ∫_{J^p} det[K(θi − θj)]_{1≤i,j≤p} dθ.


Here is the punch-line: multiply by (1 − λ)^n, sum from n = 0 to N, and reverse the two summations to produce

Σ_{n=0}^N (1 − λ)^n P(# = n) = Σ_{p=0}^N [Σ_{n=0}^p (1 − λ)^n (−1)^{p−n} p!/(n!(p − n)!)] (1/p!) ∫_{J^p} det[K(θi − θj)]_{1≤i,j≤p} dθ
= Σ_{p=0}^N [(−λ)^p/p!] ∫_{J^p} det[K(θi − θj)]_{1≤i,j≤p} dθ = det[I − λ (K restricted to J × J)],

K being of rank N, i.e. P(# = n) is the coefficient of (1 − λ)^n in det[I − λ (K restricted to J × J)].
Now let 0 < θ1∗ < θ2∗ < . . . be the angles taken in order. Obviously, the mean-value of θ2∗ − θ1∗ is 2π/N, by symmetry,
so if you want to see what happens for infinite dimensions, i.e. for N ↑ ∞, you’d better scale the angles up by the factor
N/2π, as in

P(#[0, 2πL/N) = n) = the coefficient of (1 − λ)^n in

Σ_{p=0}^N [(−λ)^p/p!] ∫₀^{2πL/N} ⋯ ∫₀^{2πL/N} det[(1/2π) sin N(θi − θj)/2 / sin (θi − θj)/2]_{1≤i,j≤p} dθ
= Σ_{p=0}^N [(−λ)^p/p!] ∫₀^L ⋯ ∫₀^L det[sin π(xi − xj) / (N sin π(xi − xj)/N)]_{1≤i,j≤p} dx
≃ Σ_{p=0}^∞ [(−λ)^p/p!] ∫₀^L ⋯ ∫₀^L det[sin π(xi − xj)/π(xi − xj)]_{1≤i,j≤p} dx   for large N
= det[I − λ sin π(x − y)/π(x − y)]_{0≤x,y≤L} = det(I − λK)

with

K(x, y) = sin π(x − y)/π(x − y)   restricted to 0 ≤ x, y ≤ L.
It follows that

lim_{N↑∞} P(#[0, 2πL/N) = n)

exists for each n ≥ 0, and it is a source of satisfaction that these numbers add to unity, i.e. they represent a genuine
probability distribution. This is because det(I − λK) reduces to 1 at λ = 0.
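The limiting determinant is also easy to produce numerically: discretize the sine kernel on [0, L] and take an ordinary determinant, as in the sketch below (grid size and the values of L are my own choices). At λ = 1 this gives the probability of an empty interval; the comparison with exp(−π²L²/8) is only rough at such small L, that formula being just the leading term of ln ∆(L) as L ↑ ∞:

```python
import numpy as np

def sine_det(lam, L, n=400):
    x = (np.arange(n) + 0.5) * L / n         # midpoint rule on [0, L]
    K = np.sinc(np.subtract.outer(x, x))     # np.sinc(t) = sin(pi t)/(pi t)
    return np.linalg.det(np.eye(n) - lam * K * L / n)

for L in (0.5, 1.0, 2.0):   # P(no scaled angle in [0, L]) vs. the Gaussian tail
    print(L, sine_det(1.0, L), np.exp(-np.pi ** 2 * L ** 2 / 8))
```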

Jimbo et al. [12] discovered the surprising fact that ∆ = det (I − λK ), as a function of L, solves a differential equation –
and a famous one too, called Painlevé 5 – but that lies outside the scope and purpose of these notes. A nice account may
be found in Tracy–Widom [22]. Here I note only one particular consequence: For λ = 1, ∆(L) is just the probability that

no (scaled) angle θ is found in the interval between 0 and L, and this has a Gaussian tail, to wit, ∆(L) ' exp − π 2 L2 /8 .

Appendix on compactness
The space C[0, 1].

The statement is that the figure F ⊂ C is compact relative to the distance d(f1, f2) = ‖f1 − f2‖∞ if (and only if) it is 1)
closed, 2) bounded, and 3) |f(b) − f(a)| ≤ ω(|b − a|) for every f ∈ F with some universal, positive, increasing function
ω(h), h > 0, subject to ω(0+) = 0. I sketch the proof.


Sufficiency.

Let r1 , r2 , . . . be a list of the (countably many) rational points 0 ≤ r ≤ 1. The numbers fn (rm ) are bounded for fixed
m ≥ 1, so you can pick 1) a subsequence n1 = (n1 , n2 , . . .) of n = (1, 2, 3, . . .) to make fn (r1 ), n ∈ n1 , tend to some
number f∞ (r1 ), 2) a further subsequence n2 = (n21 , n22 , . . .) ⊂ n1 , to make fn (r2 ), n ∈ n2 , tend to some number f∞ (r2 ),
and so on, and if you now pass to the diagonal sequence d = (n11 , n22 , n33 , . . .), then fd (rm ) will tend to f∞ (rm ) for every
m = 1, 2, 3, . . . simultaneously. 3) does the rest.

Necessity.

3) states that

sup{|f(b) − f(a)| : f ∈ F, 0 ≤ a < b ≤ 1, b − a ≤ h} ↓ 0 with h.

Its negation asserts the existence of a number ω > 0 of this nature: that for every n ≥ 1, you can find fn ∈ F and
0 ≤ an < bn ≤ 1 with bn − an ≤ 1/n so as to make |fn (bn ) − fn (an )| ≥ ω. But if F were compact, you could weed
out to make fn → f∞ and, at the same time, make an & bn tend to a common value c, [0, 1] being compact. Then
0 < ω ≤ |fn (bn ) − fn (an )| tends to 0, which is contradictory.
Dini’s Theorem, used twice in Section 2.7, belongs to the same circle of ideas. It states that if fn ∈ C[0, 1] decreases
pointwise to 0, then fn → 0 uniformly, i.e. ‖fn‖∞ ↓ 0.
Otherwise, for each n ≥ 1, fn (xn ) exceeds some fixed number ω > 0 for some 0 ≤ xn ≤ 1. Then you can weed out to
make xn tend to x∞ as n ↑ ∞, with the contradictory result that

fm(x∞) = lim_{n↑∞} fm(xn) ≥ lim_{n↑∞} fn(xn) ≥ ω,

i.e. fm(x∞) does not decrease to 0.

The space L²[0, 1].

Now comes the “weak” compactness of the unit ball B = {e : ∫₀¹ e² ≤ 1}. The key to this is the fact that L² contains a
countable dense family of functions fn, n ≥ 1, such as the broken lines with corners at rational places and rational slopes
between. Take en, n ≥ 1, from B. The diagonal method employed above permits you to weed out so as to make (en, fm)
tend, as n ↑ ∞, to some number L(fm) for every m = 1, 2, 3, . . ., simultaneously. Obviously, |L(fi) − L(fj)| ≤ ‖fi − fj‖2,
whence L extends to a bounded linear map from L²[0, 1] to R and (en, f) tends to the extended L(f) for every f ∈ L²,
simultaneously. The final step employs the fact that any such L is of the form L(f) = (e∞, f) with a fixed function
e∞ ∈ L²[0, 1]. The proof is easy. Let N be the null space of L and N° its annihilator. Then either L(f) ≡ 0, i.e. en → 0
weakly, or else L(e) ≠ 0 for any non-vanishing e ∈ N°. Pick such an e = e1. Then for any other e2 ∈ N°, L sends
L(e1)e2 − L(e2)e1 to 0, from which you infer that dim N° = 1, i.e. N° = Re with ‖e‖2 = 1 if you want. Now split f into
its projection e(e, f) onto N° plus the rest, viz. f − e(e, f) ∈ N. Then L(f) = L(e)(e, f), which is to say en tends weakly
to e∞ = L(e) × e. This is the weak compactness of the unit ball.
That’s all you need to know about compactness, but I will add a simple

Exercise.

The functions en = √2 sin nπx, n ≥ 1, form a unit perpendicular base of L²[0, 1], i.e. every f ∈ L² can be expanded as
f = Σ f̂(n) en with f̂(n) = (en, f). Prove that the ellipsoid E = {f : Σ |f̂(n)|²/ln² ≤ 1} is (truly) compact in L²[0, 1] if and
only if the positive numbers ln² tend to 0 as n ↑ ∞.


References

[1] Ahlfors L.V., Complex Analysis, Internat. Ser. Pure Appl. Math., 3rd ed., McGraw-Hill, New York, 1979
[2] Brown R., A brief account of microscopical observations made in the months of June, July, and August, 1827, on the
particles contained in the pollen of plants, etc., Philosophical Magazine, 1828, 4, 161–173
[3] Cameron R.H., Martin W.T., The Wiener measure of Hilbert neighborhoods in the space of real continuous functions,
Journal of Mathematics and Physics Mass. Inst. Tech., 1944, 23, 195–209
[4] Conrey J.B., The Riemann hypothesis, Notices Amer. Math. Soc., 2003, 50(3), 341–353
[5] Courant R., Differential and Integral Calculus, vol. 2, Wiley Classics Lib., John Wiley & Sons, New York, 1988
[6] Courant R., Hilbert D., Methoden der Mathematischen Physik, Springer, Berlin, 1931
[7] Dyson F.J., Statistical theory of the energy levels of complex systems. I, J. Mathematical Phys., 1962, 3, 140–156
[8] Dyson F.J., Fredholm determinants and inverse scattering problems, Comm. Math. Phys., 1976, 47(2), 171–183
[9] Einstein A., Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüs-
sigkeiten suspendierten Teilchen, Ann. Phys., 1905, 17, 549–560; reprinted in: Investigations on the Theory of the
Brownian Movement, Dover, New York, 1956
[10] Fredholm I., Sur une classe d’équations fonctionelles, Acta Math., 1903, 27(1), 365–390
[11] Gardner C.S., Greene J.M., Kruskal M.D., Miura R.M., Methods for solving the Korteweg–de Vries equation, Phys.
Rev. Lett., 1967, 19(19), 1095–1097
[12] Jimbo M., Miwa T., Môri Y., Sato M., Density matrix of an impenetrable Bose gas and the fifth Painlevé transcendent,
Phys. D, 1980, 1(1), 80–158
[13] Kato T., A Short Introduction to Perturbation Theory for Linear Operators, Springer, New York–Berlin, 1982
[14] Lamb G.L., Jr., Elements of Soliton Theory, Pure Appl. Math. (N. Y.), John Wiley & Sons, New York, 1980
[15] Lax P.D., Linear Algebra, Pure Appl. Math. (N. Y.), John Wiley & Sons, New York, 1997
[16] Lax P.D., Functional Analysis, Pure Appl. Math. (N. Y.), John Wiley & Sons, New York, 2002
[17] Lévy P., Le Mouvement Brownien, Mémoir. Sci. Math., 126, Gauthier-Villars, Paris, 1954
[18] McKean H.P., Jr., Stochastic Integrals, Probab. Math. Statist., Academic Press, New York–London, 1969
[19] Mehta M.L., Random Matrices, 2nd ed., Academic Press, Boston, 1991
[20] Munroe M.E., Introduction to Measure and Integration, Addison–Wesley, Cambridge, 1953
[21] Pöppe Ch., The Fredholm determinant method for the KdV equations, Phys. D, 1984, 13(1-2), 137–160
[22] Tracy C.A., Widom H., Introduction to random matrices, In: Geometric and Quantum Aspects of Integrable Systems,
Scheveningen, 1992, Lecture Notes in Phys., 424, Springer, Berlin, 1993, 103–130
[23] Uhlenbeck G.E., Ornstein L.S., On the theory of the Brownian motion, Phys. Rev., 1930, 36, 823–841
[24] Weyl H., The Classical Groups, Princeton University Press, Princeton, 1939
[25] Whittaker E.T., Watson G.N., A Course of Modern Analysis, 4th ed., Cambridge University Press, New York, 1962
[26] Wiener N., Differential space, Journal of Mathematics and Physics Mass. Inst. Tech., 1923, 2, 131–174
[27] Wigner E., On the distribution of the roots of certain symmetric matrices, Ann. of Math., 1958, 67, 325–326

