Feinsilver, Schott - Algebraic Structures and Operator Calculus. Vol 2, Special Functions & Computer Science (Kluwer 1994)
Managing Editor:
M. HAZEWINKEL
Centre for Mathematics and Computer Science, Amsterdam, The Netherlands
Also of interest:
Algebraic Structures and Operator Calculus. Volume I: Representations and Probability Theory, by P. Feinsilver and R. Schott, 1993, x + 224 pp., ISBN 0-7923-2116-2, MAIA 241.
Volume 292
Algebraic Structures
and Operator Calculus
Volume II:
Special Functions
and Computer Science
by
Philip Feinsilver
Department of Mathematics,
Southern Illinois University,
Carbondale, Illinois, U.S.A.
and
Rene Schott
CRIN,
Université de Nancy I,
Vandoeuvre-lès-Nancy, France
KLUWER ACADEMIC PUBLISHERS
DORDRECHT / BOSTON / LONDON
A C.I.P. Catalogue record for this book is available from the Library of Congress.
ISBN 0-7923-2921-X
Preface ix
Introduction
I. General remarks 1
II. Some notations 2
III. Orthogonal polynomials and continued fractions 3
IV. Bessel functions, Lommel polynomials, theta functions 5
V. Some analytic techniques 7
VI. Polynomials: reference information 9
References 139
Index 145
Preface
As in all volumes of this series, this one is suitable for self-study by researchers. It is appropriate as well as a text for a course or advanced seminar.
The solutions are tackled with the help of various analytical techniques, such as generating functions, and probabilistic methods/insights appear regularly. An interesting feature is that, as has been the case in classical applications to physics, special functions arise, here in complexity analysis. And, as in physics, their appearance indicates an underlying Lie structure.
INTRODUCTION
General remarks
In Chapter 1, we present the basic data structures we will be studying and introduce the ideas guiding the analysis. Chapter 2 presents applications of orthogonal polynomials and continued fractions to the analysis of dynamic data structures in the model introduced by J. Françon and D.E. Knuth, called Knuth's model. The approach here is original with this exposition. The underlying algebraic/analytic structures play a prominent role.
Chapter 3 presents some results involving Bessel functions and Lommel polynomials arising
from a study of the behavior of the symbol table. Then, in another direction, theta
functions come up as solutions of the heat equation on bounded domains, describing the
limiting behavior of the evolution of stacks and the banker's algorithm. In Chapter 4 we
present some basic material on representations of finite groups, including Fourier transform.
Then Fourier transform is considered in more detail for abelian groups, particularly cyclic
groups. Next we look at Krawtchouk polynomials, which arise in a variety of ways, e.g.,
in the study of random walks. The concluding Chapter 5 discusses representations of the
symmetric group and connections with Young tableaux. Then, a variety of applications of
Young tableaux are shown, including examples related to questions in parallel processing.
Chapters 4 and 5 mainly include material from the literature that we find particularly
interesting as well as some new approaches. It is hoped that the reader will find these
chapters to be useful as background for further study as well as providing examples having
effective illustrative and reference value.
1. First, we recall the gamma and beta functions, given by the integrals

$$\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt, \qquad B(x,y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)} = \int_0^1 t^{x-1}(1-t)^{y-1}\,dt \tag{2.1}$$

for Re x, Re y > 0.
2. In hypergeometric functions, we use the standard Pochhammer notation:

$$(a)_n = \frac{\Gamma(a+n)}{\Gamma(a)}$$

where $\Gamma$ denotes the gamma function. Thus, we have for binomial coefficients:

$$\binom{a}{k} = (-1)^k\,\frac{(-a)_k}{k!}$$

3. The hypergeometric function is

$${}_pF_q\!\left({a_1, a_2, \ldots, a_p \atop b_1, b_2, \ldots, b_q}\,\Big|\; x\right) = \sum_{n=0}^{\infty} \frac{(a_1)_n (a_2)_n \cdots (a_p)_n}{(b_1)_n (b_2)_n \cdots (b_q)_n}\,\frac{x^n}{n!}$$

for example,

$${}_2F_0(a,b;\ ;x) = 1 + abx + a(a+1)b(b+1)\frac{x^2}{2} + \cdots$$
4. For Stirling numbers, we use $S_{n,k}$ and $s_{n,k}$ for Stirling numbers of the first and second kinds, respectively. They may be defined by the relations:

$$(x)_n = \sum_{k=0}^{n} S_{n,k}\,x^k, \qquad x^n = \sum_{k=0}^{n} (-1)^{n-k}\, s_{n,k}\,(x)_k \tag{2.2}$$

See Knuth[53], p. 65ff. (where, however, a different notation is used).
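The relations (2.2) can be checked symbolically. The sketch below assumes the conventions stated above: $(x)_n$ the rising factorial, $S_{n,k}$ the (unsigned) Stirling numbers of the first kind and $s_{n,k}$ those of the second kind, computed from their standard recurrences; the function names are ours.

```python
import sympy as sp

x = sp.symbols('x')

def stirling1(n, k):
    """Unsigned Stirling numbers of the first kind (the S_{n,k} above):
    c(n, k) = c(n-1, k-1) + (n-1) c(n-1, k)."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return stirling1(n - 1, k - 1) + (n - 1) * stirling1(n - 1, k)

def stirling2(n, k):
    """Stirling numbers of the second kind (the s_{n,k} above):
    S(n, k) = S(n-1, k-1) + k S(n-1, k)."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

def rising(y, n):
    """Pochhammer symbol (y)_n = y (y+1) ... (y+n-1)."""
    r = sp.Integer(1)
    for j in range(n):
        r *= y + j
    return r

n = 5
# (x)_n = sum_k S_{n,k} x^k
assert sp.expand(rising(x, n)) == sp.expand(
    sum(stirling1(n, k) * x**k for k in range(n + 1)))
# x^n = sum_k (-1)^(n-k) s_{n,k} (x)_k
assert sp.expand(x**n) == sp.expand(
    sum((-1)**(n - k) * stirling2(n, k) * rising(x, k) for k in range(n + 1)))
```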
5. For finite sets E, we denote the cardinality of E by |E| or #E.
We will take F to be normalized so that μ₀ = 1. It is assumed that the moments exist for all n ≥ 0. Integration with respect to dF, mathematical expectation or expected value, is denoted by angle brackets ⟨ · ⟩; in particular, we can write the moments as μₙ = ⟨Xⁿ⟩. In the discrete case, we have a probability sequence (discrete distribution), {p_k}_{k≥0}, satisfying p_k ≥ 0, Σ p_k = 1, with corresponding moments

$$\mu_n = \sum_{k=0}^{\infty} (a_k)^n\, p_k$$

where a_k denotes the value taken on with corresponding probability p_k. In the (absolutely) continuous case, we have a nonnegative density function p(x), and

$$\mu_n = \int x^n\, p(x)\,dx$$
We define an inner product ⟨·, ·⟩ on the set of polynomials, which is the mathematical expectation of their product. On the basis {xⁿ}, this is given in terms of the moments: ⟨xⁿ, xᵐ⟩ = μ_{n+m}. The corresponding sequence of orthogonal polynomials, {φₙ(x)}_{n≥0}, say, may be determined by applying the Gram-Schmidt orthogonalization procedure to the basis {xⁿ}. The polynomials {φₙ(x)} satisfy ⟨φₙ, φₘ⟩ = 0 for n ≠ m.
With φₙ(x) = xⁿ + ⋯, monic polynomials, the {φₙ(x)} satisfy a three-term recurrence relation of the form:

$$x\,\phi_n = \phi_{n+1} + a_n\,\phi_n + b_n\,\phi_{n-1}$$

with squared norms

$$\gamma_n = \langle \phi_n, \phi_n \rangle = b_1 b_2 \cdots b_n \tag{3.1.4}$$
Pₙ and Qₙ are the n-th partial numerators and partial denominators, respectively. They satisfy the recurrence relations:

$$P_n = b_n P_{n-1} + a_n P_{n-2}, \qquad Q_n = b_n Q_{n-1} + a_n Q_{n-2}$$

with

$$P_{-1} = 1, \quad P_0 = b_0, \quad Q_{-1} = 0, \quad Q_0 = 1$$

respectively. One may remark that this setup corresponds to the matrix relations:

$$\begin{pmatrix} Q_{n-1} & P_{n-1} \\ Q_n & P_n \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ a_n & b_n \end{pmatrix} \cdots \begin{pmatrix} 0 & 1 \\ a_1 & b_1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & b_0 \end{pmatrix}$$
for n ≥ 1, where the numerator polynomials, Pₙ(x), satisfy the initial conditions P₀ = 0, P₁ = 1, and the denominator polynomials, Qₙ(x), satisfy Q₀ = 1, Q₁ = 1 − a₀x.
The reciprocal polynomials, ψₙ(x), defined as

$$\psi_n(x) = x^n\, Q_n(1/x)$$

with initial conditions ψ₀ = 1, ψ₁ = x − a₀. These give a sequence of orthogonal polynomials with squared norms γₙ = b₁b₂ ⋯ bₙ, n ≥ 0.
The reader is referred to G.N. Watson's [84] treatise for an extensive presentation of
the theory of Bessel functions. Here we indicate the features that are important for our
applications.
$$J_\nu(x) = \left(\frac{x}{2}\right)^{\nu} \sum_{n=0}^{\infty} \frac{(-1)^n\,(x/2)^{2n}}{n!\,\Gamma(n+\nu+1)} \tag{4.1.1}$$

This function arises as the solution of the differential equation

$$\frac{d^2u}{dx^2} + \frac{1}{x}\,\frac{du}{dx} + \left(1 - \frac{\nu^2}{x^2}\right)u = 0$$
In the present study, a principal feature of Bessel functions is the fundamental recurrence:

$$R_{n+1}(x) = \frac{2(\nu+n)}{x}\,R_n(x) - R_{n-1}(x)$$

with the initial conditions R₋₁ = 0, R₀ = 1. The Rₙ may be given explicitly in the form

$$R_{n,\nu}(x) = \sum_{k=0}^{\lfloor n/2 \rfloor} (-1)^k \binom{n-k}{k} \frac{\Gamma(\nu+n-k)}{\Gamma(\nu+k)} \left(\frac{2}{x}\right)^{n-2k}$$
i.e., with initial values ψ₋₁ = c₂, ψ₀ = c₁. Eq. (4.1.4) can be interpreted as expressing the solution to xψₙ = ψ_{n+1} + εₙψₙ + ψ_{n−1} in the form
V. Some analytic techniques
Here we indicate some basic analytic techniques that will be used. Included are: asymptotic behavior of coefficients of a power series, central limit approximation of sums involving binomial coefficients, and Lagrange inversion.
Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ be analytic in a neighborhood of 0 in the complex plane. We are interested in the behavior of |aₙ| as n → ∞. The easiest observation is:
5.1.1 Proposition. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ be analytic in a disk of radius R > 1. Then the coefficients aₙ decrease exponentially fast.
Proof: The terms of the series $\sum_{n\ge 0} |a_n|(R-\epsilon)^n$ converge to zero for any ε > 0. Choose ε so that R − ε > 1. ∎
Because of this, one looks for the singularities of functions when it is known that the coefficients aₙ grow as n → ∞. For our purposes, the following suffices.
5.1.2 Lemma. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ be meromorphic, analytic in a neighborhood of 0 in the complex plane. Let ζ denote the pole of f nearest to the origin. If ζ is a pole of order p, then we have the estimate

$$a_n \approx \frac{(-1)^p A}{\Gamma(p)}\,\frac{n^{p-1}}{\zeta^{n+p}}$$

where $A = \lim_{z\to\zeta} f(z)(z-\zeta)^p$.
Proof: Write $f(z) = A(z-\zeta)^{-p} + g(z)$, where the second term has at most a pole of order < p. Now, for |z| < |ζ|, the binomial theorem implies

$$\frac{A}{(z-\zeta)^p} = \frac{(-1)^p A}{\zeta^p} \sum_{n=0}^{\infty} \binom{n+p-1}{n} \frac{z^n}{\zeta^n}$$

At a pole of larger modulus, the same estimate gives a rate of growth exponentially slower, whereas terms of lower order poles grow at correspondingly lower powers of n. ∎
See Flajolet-Vitter[28] for more along these lines.
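As a numerical sketch of the lemma, take f(z) = 1/((1 − 2z)(1 − z)) (an illustrative function of our choosing), whose coefficients are aₙ = 2^{n+1} − 1 by partial fractions; the nearest pole is ζ = 1/2, of order p = 1, with A = −1:

```python
from math import gamma

# f(z) = 1/((1-2z)(1-z)) = sum a_n z^n with a_n = 2^(n+1) - 1.
# Nearest pole: zeta = 1/2, order p = 1, A = lim (z - zeta) f(z) = -1.
zeta, p, A = 0.5, 1, -1.0

def estimate(n):
    # a_n ~ (-1)^p A / Gamma(p) * n^(p-1) / zeta^(n+p)
    return (-1)**p * A / gamma(p) * n**(p - 1) / zeta**(n + p)

n = 30
exact = 2**(n + 1) - 1
# the pole estimate is already accurate to ~1 part in 10^9 here
assert abs(estimate(n) / exact - 1) < 1e-8
```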
5.2 CENTRAL LIMIT APPROXIMATION
For large N (N even), the binomial coefficients satisfy the local approximation

$$2^{-N}\binom{N}{k+N/2} \approx \sqrt{\frac{2}{\pi N}}\; e^{-2k^2/N}$$

so that, setting α = k/√N and noting the spacing √N dα = 1,

$$\sum_{|k|<N/2} f\!\left(\frac{k}{\sqrt{N}}\right) 2^{-N} \binom{N}{k+N/2} \approx \sqrt{\frac{2}{\pi}} \int_{-\infty}^{\infty} f(\alpha)\, e^{-2\alpha^2}\,d\alpha$$
Proof: This follows from the above remarks and the Gaussian integral

$$\int_{-\infty}^{\infty} \alpha^{2p}\, e^{-2\alpha^2}\,d\alpha = 2\int_{0}^{\infty} \alpha^{2p}\, e^{-2\alpha^2}\,d\alpha = \frac{1}{2^{p+1/2}} \int_{0}^{\infty} t^{p-1/2}\, e^{-t}\,dt = \frac{\Gamma(p+1/2)}{2^{p+1/2}}$$

via the substitution t = 2α². The result follows, using Γ(½) = √π. ∎
See Canfield[9] for more information on this topic.
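A quick numerical check of the local approximation 2^{−N}·C(N, k+N/2) ≈ √(2/(πN))·e^{−2k²/N} (the parameters below are chosen only for illustration):

```python
from math import comb, exp, pi, sqrt

# local central limit approximation for the binomial coefficients
N = 1000
for k in (0, 5, 10, 20):
    exact = comb(N, k + N // 2) / 2**N
    approx = sqrt(2 / (pi * N)) * exp(-2 * k**2 / N)
    # already within 1% at N = 1000 for moderate k
    assert abs(approx / exact - 1) < 0.01
```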
5.3 LAGRANGE INVERSION
Let u(x) be analytic at x₀, with u(x₀) = 0 and u′(x₀) ≠ 0. Then the inverse function has the expansion $x = x_0 + \sum_{k\ge 1} c_k u^k$, with coefficients

$$c_k = \frac{1}{k!}\,\lim_{x \to x_0} \left(\frac{d}{dx}\right)^{k-1} \left(\frac{x - x_0}{u(x)}\right)^{k}$$

See Sansone-Gerretsen[79] for details. The proof uses Cauchy's integral formula and integration by parts.
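As an illustration of the inversion formula, take u(x) = x e^{−x} at x₀ = 0 (our choice, not from the text); then (x/u(x))^k = e^{kx}, and the formula yields the coefficients c_k = k^{k−1}/k! of the tree function. A sympy sketch:

```python
import sympy as sp

x = sp.symbols('x')
u = x * sp.exp(-x)      # u(0) = 0, u'(0) = 1
for k in range(1, 6):
    # c_k = (1/k!) lim_{x->0} (d/dx)^(k-1) (x/u(x))^k
    ck = sp.limit(sp.diff((x / u)**k, x, k - 1), x, 0) / sp.factorial(k)
    # coefficients of the inverse series: c_k = k^(k-1) / k!
    assert ck == sp.Rational(k**(k - 1), sp.factorial(k))
```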
Here we include some basic reference material on the classes of orthogonal polynomials that come up, notably in the study of dynamic data structures. In general, references are Szego[81] and Chihara[10].
First we note
1. The inner product is denoted by angle brackets ⟨·, ·⟩, thus

$$\langle f, g \rangle = \int_{-\infty}^{\infty} f(x)\,g(x)\,dF(x)$$
Here we list basic information on polynomials orthogonal with respect to the binomial and Poisson distributions, respectively.
with K₀ = 1, K₁ = x.
Squared norms

$$\gamma_n = n!\,(-1)^n(-N)_n$$

Moments

$$\sum_{n=0}^{\infty} \frac{s^n}{n!}\,\mu_n = (\cosh s)^N$$

Generating function for the polynomials

$$\sum_{n=0}^{N} \frac{v^n}{n!}\,K_n(x) = (1+v)^{(N+x)/2}\,(1-v)^{(N-x)/2}$$
with P₀ = 1, P₁ = x − t.
Measure of orthogonality and squared norms

$$p_k = e^{-t}\,\frac{t^k}{k!}, \qquad \gamma_n = n!\,t^n$$

Moments

$$\mu_n = \sum_{k} s_{n,k}\,t^k, \qquad \sum_{n=0}^{\infty} \frac{s^n}{n!}\,\mu_n = \exp\big(t(e^s - 1)\big)$$

Generating function for the polynomials

$$\sum_{n=0}^{\infty} \frac{v^n}{n!}\,P_n(x,t) = e^{-tv}(1+v)^x, \qquad P_n(x,t) = (-1)^n \sum_{k=0}^{n} \binom{n}{k}\, t^{n-k}\,(-x)_k$$
Remark. The moments for the Poisson distribution are readily calculated by considering the falling factorial moments

$$\langle x(x-1)\cdots(x-n+1) \rangle = e^{-t} \sum_{k=0}^{\infty} k(k-1)\cdots(k-n+1)\,\frac{t^k}{k!} = t^n$$

Then the Stirling numbers of the second kind, eq. (2.2), convert these to the usual moments given above.
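The remark can be verified symbolically from the moment generating function exp(t(e^s − 1)) given above; `stirling2` below computes the second-kind numbers from their standard recurrence (the function names are ours):

```python
import sympy as sp

t, s = sp.symbols('t s')

def stirling2(n, k):
    """Stirling numbers of the second kind: S(n,k) = S(n-1,k-1) + k S(n-1,k)."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

mgf = sp.exp(t * (sp.exp(s) - 1))     # <e^{sX}> for the Poisson distribution
for n in range(1, 7):
    mu_n = sp.diff(mgf, s, n).subs(s, 0)            # n-th moment from the mgf
    via_stirling = sum(stirling2(n, k) * t**k for k in range(n + 1))
    assert sp.expand(mu_n - via_stirling) == 0      # mu_n = sum_k s_{n,k} t^k
```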
Here we have Tchebychev and Hermite polynomials. Recall that the Tchebychev polynomials are determined according to

$$T_n(\cos\theta) = \cos n\theta, \qquad U_n(\cos\theta) = \frac{\sin(n+1)\theta}{\sin\theta}$$

The Hermite polynomials are orthogonal with respect to the Gaussian distribution.
Measure of orthogonality and squared norms

$$\langle T_n, T_m \rangle = \frac{1}{\pi}\int_{-1}^{1} T_n(x)\,T_m(x)\,\frac{dx}{\sqrt{1-x^2}} = \gamma_n\,\delta_{nm}, \qquad \gamma_0 = 1, \quad \gamma_n = \tfrac{1}{2},\ n \ge 1$$

Moments

$$\mu_{2n} = 4^{-n}\binom{2n}{n}$$

Generating function for the polynomials

$$\sum_{n=0}^{\infty} v^n\,T_n(x) = \frac{1-xv}{1-2xv+v^2}$$

$$T_n(x) = \frac{n}{2}\sum_{k=0}^{\lfloor n/2\rfloor} \frac{(-1)^k}{n-k}\binom{n-k}{k}(2x)^{n-2k}, \qquad n \ge 1$$
$$\langle U_n, U_m \rangle = \frac{2}{\pi}\int_{-1}^{1} U_n(x)\,U_m(x)\,\sqrt{1-x^2}\,dx = \gamma_n\,\delta_{nm}, \qquad \gamma_n = 1,\ n \ge 0$$

Moments

$$\mu_{2n} = \frac{4^{-n}}{n+1}\binom{2n}{n}$$

$$U_n(x) = \sum_{k=0}^{\lfloor n/2\rfloor} (-1)^k \binom{n-k}{k}(2x)^{n-2k}$$
Measure of orthogonality and squared norms

$$\langle H_n, H_m \rangle = \frac{1}{\sqrt{2\pi t}}\int_{-\infty}^{\infty} H_n(x)\,H_m(x)\,e^{-x^2/2t}\,dx = \gamma_n\,\delta_{nm}, \qquad \gamma_n = n!\,t^n$$

Moments

$$\mu_{2n} = (2t)^n\left(\tfrac{1}{2}\right)_n = \frac{(2n)!}{2^n\,n!}\,t^n$$

Generating function for the polynomials

$$\sum_{n=0}^{\infty} \frac{v^n}{n!}\,H_n(x) = e^{xv - tv^2/2}$$

$$H_n(x) = n! \sum_{k=0}^{\lfloor n/2\rfloor} \frac{(-t/2)^k\,x^{n-2k}}{k!\,(n-2k)!}$$
Chapter 1 BASIC DATA STRUCTURES
In this chapter are introduced the basic data structures that will be the principal subject of the first two chapters (and part of Chapter 3). First, we introduce basic terminology and constructions. In Chapter 2, detailed computations and results will be presented.
The structures discussed here are representations of files or lists of data stored in a computer. The idea is to organize the data for ready access and effective use of the information.
1. Basic data structures
We start with a universe of elements, items. It is assumed that one can equip the universe with a total order. Once given an order structure, the items are referred to as keys.
1.1 Definition. A list is a sequence of items.
When the items can be ordered, i.e., when the list consists of keys, then one has a sorted or
unsorted list accordingly. Particular data types are determined according to operations used to construct corresponding lists. A sequence of operations performed on a given structure is represented as a word in a language with alphabet corresponding to the operations allowed to be performed on the data structure. Here we consider structures which permit insertions, denoted I, deletions, denoted D, and record searches (examining items for information) called queries, denoted Q⁺ or Q⁻ according as the result of the search was positive (successful) or negative (unsuccessful). This gives a basic alphabet {I, D, Q⁺, Q⁻}.
Remark. Since the words can be of arbitrary length, one could consider the set of infinite sequences of letters, restricting to the subclass of tame sequences, having all but finitely many components equal to ε, the 'empty symbol', denoting a blank, which thus would be adjoined (formally) as part of the alphabet.
A basic structure of interest is the linear list. This is a list which is constructed using
only insertions and deletions.
1.1.1 Definition. A linear list is a list that admits the operations I and D only.
For purposes of illustration, we denote the items by lowercase letters {a, b, c, …} and the corresponding operations as a word using the alphabet {I, D}. For example, starting from abc, inserting d, e, f, then deleting a and f might give successively: dabc, dabec, dabfec, dbfec, dbec, corresponding to the word IIIDD. It is clear that many different lists may correspond to a given sequence of operations. A main feature of the analysis is how to deal with precisely this fact in determining the complexity of a particular structure.
Probably the simplest and most well-known example of a linear list is the stack.
1.1.2 Definition. A stack is a linear list for which the insertions and deletions are all made at only one end of the list.
For a stack, a given word corresponds to only one possible set of consequences. In the
above example, after three insertions, the deletions would necessarily indicate removal of
/ and then e from the list.
1.1.1 Queues
1.1.1.1 Definition. A queue is a linear list for which insertions are made at one end
of the list and deletions are made at the other end.
As with a stack, a series of operations corresponds to only one possible set of consequences.
An important modification is the priority queue. We assume that the items are keys,
i.e., come from a totally ordered universe. In the priority queue, only the highest priority
item can be deleted, with priority according to the ordering of the keys.
1.1.1.2 Definition. A priority queue is a linear list of keys for which only the minimal
key may be deleted. Insertions are unrestricted.
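For illustration (a sketch, with interfaces of our own choosing), the three linear-list disciplines can be simulated on a word over {I, D}; note how the same word IIIDD forces different deletions in each case:

```python
import heapq
from collections import deque

def run_stack(ops, items):
    """Insertions and deletions at the same end (LIFO)."""
    st, it, out = [], iter(items), []
    for op in ops:
        if op == 'I':
            st.append(next(it))
        else:                       # 'D'
            out.append(st.pop())
    return out

def run_queue(ops, items):
    """Insertions at one end, deletions at the other (FIFO)."""
    q, it, out = deque(), iter(items), []
    for op in ops:
        if op == 'I':
            q.append(next(it))
        else:
            out.append(q.popleft())
    return out

def run_priority_queue(ops, items):
    """Only the minimal key may be deleted."""
    h, it, out = [], iter(items), []
    for op in ops:
        if op == 'I':
            heapq.heappush(h, next(it))
        else:
            out.append(heapq.heappop(h))
    return out

# word IIIDD inserting d, e, f: a stack must delete f, then e
assert run_stack('IIIDD', 'def') == ['f', 'e']
assert run_queue('IIIDD', 'def') == ['d', 'e']
assert run_priority_queue('IIIDD', 'def') == ['d', 'e']
```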
1.1.2 Dictionaries
Dictionaries are modelled on the everyday notion of dictionary, i.e., queries, corresponding to looking up information, arise. Thus, all four operations I, D, Q⁺, Q⁻ are admitted. The elements are keys.
1.1.2.1 Definition. A dictionary is a list of keys admitting the four operations I, D, Q⁺, Q⁻.
1.1.2.2 Definition. A symbol table is a list of keys admitting the three operations I, D, Q⁺. Deletions operate only on the last (most recent) key inserted.
At this point it is not clear which structure is the most effective in a given situation.
The question is how to determine what practical implementations are most efficient. In
the next section we consider some basic data organizations — how the structures are
implemented in the machine. The mathematical analysis, average case analysis, gives one
an idea of the relative efficiency of the various data organizations.
1.2 IMPLEMENTATIONS
We will be concerned primarily with linear lists, priority queues, dictionaries, and
symbol tables. Each of these data types may be implemented as sorted lists, for keys, as
well as unsorted lists. Additionally there are various implementations available for these
structures. Below, we illustrate the use of pagodas, a structure developed by Françon, Viennot & Vuillemin [35].
Remark. Sorting algorithms are a good example of the differences that can arise between worst-case and average-case behavior. Consider the most efficient sorting algorithm, 'quicksort', developed by C.A.R. Hoare. Analysis shows that this algorithm sorts a sequence of n elements in an average time of the form O(n log n), i.e., bounded by a constant times n log n. Comparison among all sorting algorithms yields the conclusion that quicksort is the most efficient on the average. However, the worst case, when the data is almost in order at the start, is quite poor, taking on the order of n² steps. Fortunately these worst cases are rare. See Knuth[53], v. 3, p. 114ff.
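A minimal quicksort sketch (first-element pivot, our own illustrative code) exhibits this contrast: on already-sorted input it performs n(n−1)/2 comparisons, while on random input the count stays near n log n:

```python
import random

def quicksort(a):
    """Plain quicksort with first-element pivot; returns (sorted list,
    number of comparisons). Worst case is already-sorted input."""
    if len(a) <= 1:
        return list(a), 0
    pivot, rest = a[0], a[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    ls, lc = quicksort(left)
    rs, rc = quicksort(right)
    return ls + [pivot] + rs, lc + rc + len(rest)

random.seed(1)
n = 300
data = [random.random() for _ in range(n)]
s, avg_comps = quicksort(data)
assert s == sorted(data)
_, worst_comps = quicksort(sorted(data))   # sorted input: n(n-1)/2 comparisons
assert worst_comps == n * (n - 1) // 2
assert avg_comps < worst_comps / 5         # random input is far cheaper
```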
For priority queues, in addition to sorted and unsorted lists, there are implementations
as binary tournaments and pagodas. The idea of these structures is to take advantage of
the priorities among the items.
Here we identify a priority queue of size n with a permutation of the integers from 1 to n, the keys. To these correspond labelled, rooted trees. The labels are the keys 1, …, n.
First we recall the definition of a binary tree. A binary tree T is given recursively as a triple T = (l(T), v(T), r(T)) where l(T) and r(T) are binary trees, the left and right subtrees respectively of T, and v(T) is the label (value) of the root of T. The empty tree is included in the definition of a binary tree. Similarly, a binary tournament is defined recursively by the condition that the left and right subtrees of the root are binary tournaments satisfying v(T) < min{ v(ζ) : ζ a node of l(T) or r(T) }. In other words, for any node ζ, its sons ζ′ satisfy v(ζ) < v(ζ′), priority increasing with the depth of the node.
For example, consider the queue (permutation) 573961428. Successively taking out
the minimal values, partitioning accordingly, and otherwise keeping order, leads to the
following binary tournament:
          1
        /   \
       3     2
      / \   / \
     5   6 4   8
      \   /
       7 9

(here 7 is the right son of 5, and 9 the left son of 6)
or, adjoining 0's so that every node has left and right subtrees, this can be represented by the set of triples. For example, b = (d, 2, h) indicates that the node 2 has attached left subtree d and right subtree h.
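The recursive construction of a binary tournament from a permutation can be sketched as follows (representation as nested triples; names ours); applied to 573961428 it reproduces the tree above:

```python
def tournament(seq):
    """Recursively extract the minimum; the parts to its left and right,
    in their original order, become the left and right subtrees."""
    if not seq:
        return None
    i = seq.index(min(seq))
    return (tournament(seq[:i]), seq[i], tournament(seq[i+1:]))

def preorder(t):
    if t is None:
        return []
    left, v, right = t
    return [v] + preorder(left) + preorder(right)

t = tournament([5, 7, 3, 9, 6, 1, 4, 2, 8])
assert t[1] == 1                          # root carries the minimal key
assert t[0][1] == 3 and t[2][1] == 2      # its sons
assert preorder(t) == [1, 3, 5, 7, 6, 9, 2, 4, 8]
```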
1.2.1 Pagodas
Now we are ready for the construction of pagodas. Start with a binary tournament T, given as a set of triples (as above). The pagoda representation is a set of triples determined as follows. For each subtree P, determine the left inverse subtree l⁻¹(P) as the unique subtree, Q, say, such that l(Q) = P in the triple corresponding to Q. For example, in the tournament of the previous section, we have l⁻¹(e) = c. Similarly, determine the right inverse subtree of P. Next, define recursively the leftmost descendant of a subtree P as follows

$$l^*(P) = \begin{cases} P, & \text{if } l(P) \text{ is empty} \\ l^*(l(P)), & \text{otherwise} \end{cases}$$

r*(P) is defined correspondingly. (Here, empty subtrees correspond to the 0's in the triples used in the representation given above for tournaments.) Now we have
1.2.1.1 Definition. Let T be a binary tournament. A pagoda is a representation of T as a set of triples (λ(P), v(P), ρ(P)), one for each nonempty subtree P of T, such that

$$\lambda(P) = \begin{cases} l^{-1}(P), & \text{if } l^{-1}(P) \text{ exists} \\ l^*(P), & \text{otherwise} \end{cases}$$

and

$$\rho(P) = \begin{cases} r^{-1}(P), & \text{if } r^{-1}(P) \text{ exists} \\ r^*(P), & \text{otherwise} \end{cases}$$

Briefly, for each subtree locate its (left or right) ancestor if there is one; if none, then give its furthest descendant. For example, the pagoda representation for the tournament considered above is given by:
The motivation for studying dynamic data structures is to compare the efficiency of
the various data structures as implemented. How is one to analyze the behavior of these
systems? Two basic ideas are time costs and space costs, involving respectively the complexity due to the number of operations involved and the complexity due to storage (memory) requirements. Here we study time costs.
2.1 HISTORIES
We present a useful formalism. Let O denote the alphabet of operations; in our case we have O ⊆ {I, D, Q⁺, Q⁻}. Thus, a word means a sequence of elements belonging to O. We denote the set of words over the alphabet O by O*. We want to count the number of operations involved when the operations are performed on files under constraints imposed according to the various data structures. First, we make a general definition, corresponding to the necessary condition that a file cannot be of negative size.
2.1.1 Definition. For w ∈ O* and O ∈ O, denote by |w|_O the number of times O appears in w.
Note that the difference h(w) = |w|_I − |w|_D between the number of insertions and deletions is just the size of the file. Now we have
2.1.3 Definition. A schema is a word w = O₁O₂ ⋯ Oₙ ∈ O* such that the sequence of subwords wᵢ = O₁O₂ ⋯ Oᵢ satisfies h(wᵢ) ≥ 0, for all i, 1 ≤ i ≤ n.
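The schema condition is straightforward to check mechanically; a sketch (the function name is ours):

```python
def is_schema(word):
    """A word over {I, D, Q+, Q-} is a schema when every prefix w_i has
    h(w_i) = |w_i|_I - |w_i|_D >= 0 (the file never goes negative)."""
    h = 0
    for op in word:
        if op == 'I':
            h += 1
        elif op == 'D':
            h -= 1
        if h < 0:
            return False
    return True

assert is_schema(['I', 'I', 'Q+', 'D', 'D'])
assert not is_schema(['I', 'D', 'D', 'I'])   # deletes from an empty file
```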
When counting file operations involving keys, we consider only their relative order.
The method of counting the number of ways an operation can be applied to a given file in a
given situation is determined by a sequence of nonnegative integers, ranks, corresponding
to (the relative ordering of) keys.
2.1.4 Definition. Let r₁, …, rₙ be a sequence of nonnegative integers, ranks. A history is a sequence of ordered pairs ((O₁, r₁), …, (Oₙ, rₙ)) such that O₁O₂ ⋯ Oₙ is a schema and r₁, …, rₙ are admissible ranks (for a given model).
The ranks allowed determine the particular models. Below, we discuss two models that have been studied: the markovian model and Knuth's model. So a model is determined by the alphabet O and, for each data structure, the specification of allowed histories. (In Chapter 2, detailed results for Knuth's model will be presented.)
Here we continue with a sketch of the general approach, assuming a model has been
given.
1) $\mathcal{H}_{k,l,n}$ denotes the set of histories of length n with initial height k and final height l. Denote the cardinality $|\mathcal{H}_{k,l,n}|$ by $H_{k,l,n}$.
2) $\mathcal{H}_n$ denotes the set of histories starting and ending with an empty file, i.e., $\mathcal{H}_n = \mathcal{H}_{0,0,n}$. Denote the cardinality $|\mathcal{H}_n|$ by $H_n$.
Suppose we are given a cost functional on the set of histories: cost(h), some measure of the complexity of the histories. Then we have

$$K_n = \frac{1}{H_n} \sum_{h \in \mathcal{H}_n} \mathrm{cost}(h)$$
In order to analyze time costs, it is convenient that the data structure satisfy a condition of
stationarity. Many implementations commonly used in computer science have the required
property.
Here is one way of stating what is needed. Denote by E_k the set of all states of a structure of size k. This depends on the particular implementation. For example, for a sorted list, there is only one state for each k, while there are k! possible states of size k for an unsorted list. As another example, there are k! pagodas of size k. One defines
The structures we will consider are known to be stationary. This property permits the
calculation of costs in a particularly effective manner. Namely, we can use the notion of
individual costs.
From this table we can see that the relative efficiencies are not evident. For example, look at dictionaries. The table shows that, implemented as a list, the average cost of an insertion in a dictionary of size k is (k+2)/2 if the list is sorted, 0 if unsorted. The average cost for deletions and positive queries is independent of whether the list is sorted, while the average cost of a negative query is (k+2)/2 for sorted lists, k for unsorted. Comparing these costs, it is unclear how to determine which list structure is more efficient. This is where the idea (of Françon, Knuth, et al.) to consider the dynamic behavior of these structures comes from.
To compute the average (integrated) cost we use the individual costs COk- First we
have
2.2.6 Definition. The level crossing number $NO_{k,n}$ is the number of operations of type O performed on a file of size k in the course of the histories $\mathcal{H}_n$.
where the size of the file after an operation O performed at size k is

$$\begin{cases} k, & \text{for } O = Q^+ \text{ or } Q^- \\ k+1, & \text{for } O = I \\ k-1, & \text{for } O = D \end{cases}$$
From the definitions, we see that
2.2.7 P r o p o s i t i o n . For stationary structures, the integrated cost is given by
The models presented here differ principally in the way one counts insertions (and negative queries). In [54], Knuth discusses two types of insertions in a data structure. Take the interval [0,1] as the set of keys. Consider inserting the next key into a file of k keys:
I₀ means insertion of a random number by order in the sense that the new number is equally likely to fall in any of the k + 1 intervals determined by the keys in the file
I means the insertion of a random number uniformly distributed on the interval [0,1], independent of previously inserted numbers.
G.D. Knott[52] showed that I₀ differs from I. Considering insertions of type I₀ gives the markovian model (studied in Françon[29]).
$$r_j = 0 \quad (O_j = D), \qquad 0 \le r_j < h(O_1 \cdots O_{j-1}) \quad (O_j = I)$$

$$r_j = 0 \quad (O_j = D), \qquad 0 \le r_j < h(O_1 \cdots O_{j-1}) \quad (O_j = Q^+), \qquad 0 \le r_j < h(O_1 \cdots O_{j-1}) \quad (O_j = I)$$

$$r_j = 0 \quad (O_j = D), \qquad 0 \le r_j < |O_1 \cdots O_{j-1}|_I \quad (O_j = I)$$

$$r_j = 0 \quad (O_j = D), \qquad 0 \le r_j < h(O_1 \cdots O_{j-1}) \quad (O_j = Q^+), \qquad 0 \le r_j < |O_1 \cdots O_{j-1}|_I \quad (O_j = I)$$

Thus, the possibility functions are given by:

Linear list      i   k
Priority queue   i   1
Dictionary       i   k   k
Symbol table     i   1   k
Chapter 2 DATA STRUCTURES AND ORTHOGONAL POLYNOMIALS
In this chapter, we first present the analysis involved in finding the behavior of data structures in Knuth's model. Then we indicate some features of a finite universe model, which has perhaps a more realistic flavor than the models first discussed. Then we consider mutual exclusion. This models a system of parallel processes accessed, at random, by various users. Each process can be accessed by only one user at a time, hence the description. A probabilistic model again leads to recurrences of the type studied in this chapter. We conclude with some remarks on a general approach to duality, relating the space-time recurrence with the dual action of the operator of multiplication by X.
Here we will analyze the behavior of the data structures of Chapter 1, §2.3.2. We present
the approach based on path counting, used as well in the study of random walks. Using the
notion of dual recurrences this leads to orthogonal polynomials and an associated operator
calculus. We then employ the operator calculus in our analysis of the data structures.
We will look in detail at the behavior of histories and corresponding integrated costs
for linear lists, priority queues, and dictionaries, following the indications of Ch. 1, §2.2.
First, we enumerate the histories. Then we obtain expressions for the integrated costs using
level crossing numbers and the individual costs. This puts us in a position to compare the
relative efficiency of different implementations.
Before looking at Knuth's model, we introduce the basic technique in the context of
the markovian model, where the approach is directly applicable. Then we will return to
Knuth's model.
Here we use the approach used in the study of random walks, where transitions from
state (position) to state are governed by certain probabilities. For counting paths, the
idea is the same, but we use the number of possible transitions, rather than probabilities.
Consider the markovian model of Ch. 1, §2.3.1. Let c_{nk} denote the number of paths, i.e., histories, starting from height 0 that are at level k at time n. This is, in fact, the same as H_{0,k,n}. We think of the height, file size, as the location of a particle (random walker) on the line. It jumps right or left or sits at each time step according to whether the operation I, D, or a query is applied to the file. Denote
Then we have, considering the one-step transition from time n to time n -(- 1,
1.1.1 Proposition. Let c_{nk} denote the number of ways starting from level 0 to reach level k in n steps. The c_{nk} satisfy the recurrence

$$c_{n+1,k} = i_{k-1}\,c_{n,k-1} + q_k\,c_{n,k} + d_{k+1}\,c_{n,k+1}$$

encoded in the matrix

$$\begin{pmatrix}
q_0 & i_0 & 0 & 0 & \cdots \\
d_1 & q_1 & i_1 & 0 & \cdots \\
0 & d_2 & q_2 & i_2 & \cdots \\
0 & 0 & d_3 & q_3 & \cdots \\
0 & 0 & 0 & d_4 & \ddots
\end{pmatrix} \tag{1.1.1}$$

where the kl entry indicates the number of ways of going from level k to level l in one step, k, l ≥ 0. The n-th power of this matrix thus gives the number of ways of going from one level to another in n steps.
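The matrix (1.1.1) gives a direct way to count histories: truncate it, take the n-th power, and read off the (k, l) entry. A sketch; with i_k = d_k = 1 and q_k = 0 (no queries, one way up or down per step, an illustrative choice) the histories are Dyck paths, and the count from level 0 back to 0 in 2n steps is the n-th Catalan number:

```python
def transition_matrix(i, q, d, size):
    """Truncated tridiagonal one-step matrix of eq. (1.1.1): entry (k, l)
    counts the ways of going from level k to level l in one step."""
    M = [[0] * size for _ in range(size)]
    for k in range(size):
        M[k][k] = q(k)
        if k + 1 < size:
            M[k][k + 1] = i(k)       # insertion: k -> k+1
            M[k + 1][k] = d(k + 1)   # deletion:  k+1 -> k
    return M

def matpow(M, n):
    size = len(M)
    R = [[int(a == b) for b in range(size)] for a in range(size)]
    for _ in range(n):
        R = [[sum(R[a][c] * M[c][b] for c in range(size))
              for b in range(size)] for a in range(size)]
    return R

# i_k = d_k = 1, q_k = 0: histories are Dyck paths
M = transition_matrix(lambda k: 1, lambda k: 0, lambda k: 1, 12)
assert matpow(M, 8)[0][0] == 14      # Catalan(4)
```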
Now the idea is to look at the dual recurrence. Think of the matrix in (1.1.1) as an operator X on a space with basis {φ₀, φ₁, …, φ_k, …}. Consider the spectral problem where X acts as multiplication by the variable x. Writing Φ = (φ₀, φ₁, …) we want XΦ = xΦ. This turns into the three-term recurrence relation for the sequence {φ_k}:

$$x\,\phi_k = i_k\,\phi_{k+1} + q_k\,\phi_k + d_k\,\phi_{k-1} \tag{1.1.2}$$

with initial conditions φ₋₁ = 0, φ₀ = 1. Thus, the {φ_k} are identified as orthogonal polynomials in the variable x. As in eq. (3.1.1) of the Introduction, we denote integration with respect to the corresponding distribution, expected value, by ⟨ · ⟩.
The operator X is the one-step transition operator and may be thought of as a position operator (as in quantum mechanics). Another way to see that the recurrence relation (1.1.2) is dual to that of Prop. 1.1.1 is given by:
1.1.2 Proposition. The action of X on the basis {φ_k} satisfies

$$X^n \phi_0 = \sum_{k=0}^{n} c_{nk}\,\phi_k$$
$$= \sum_{k} c_{n+1,k}\,\phi_k$$

by Prop. 1.1.1. ∎
This shows Xⁿ as the n-step transition operator. Shifting all indices by k:
1.1.3 Corollary. Denoting by c_{nl,k} the number of ways of going from level k to level l in n steps, we have

$$X^n \phi_k = \sum_{l \ge 0} c_{nl,k}\,\phi_l$$
We can introduce the monic polynomials ψ_k = φ_k · i₀i₁ ⋯ i_{k−1}. They satisfy the recurrence relation

$$x\,\psi_k = \psi_{k+1} + q_k\,\psi_k + i_{k-1} d_k\,\psi_{k-1}$$

Equation (3.2.1.2) of the Introduction indicates the link with the continued fraction approach of §1.6. As follows by equations (3.1.3), (3.1.4) of the Intro., the squared norm of ψ_k is

$$\|\psi_k\|^2 = i_0 i_1 \cdots i_{k-1}\; d_1 d_2 \cdots d_k$$

As in §3.1 of the Intro., eq. (3.1.2), for the inner product we have

$$\langle \phi_j, \phi_k \rangle = \begin{cases} 0, & j \ne k \\ \epsilon_k, & j = k \end{cases}$$

with the squared norms

$$\epsilon_k = \|\phi_k\|^2 = \frac{d_1 d_2 \cdots d_k}{i_0 i_1 \cdots i_{k-1}}$$
Thus, the markovian model indeed corresponds to the notion of Markov chain or random walk in this case. Let us rewrite the above corollary in terms of histories. We have the result (Proposition 7 of Flajolet, Françon & Vuillemin [26])
In particular,

$$H_{0,k,n} = \langle X^n \phi_k \rangle / \epsilon_k, \qquad H_{k,0,n} = \langle X^n \phi_k \rangle$$

with

$$H_{0,0,n} = \langle X^n \rangle = \mu_n$$

the moments.
And we have the observation, from the point of view of operator calculus, via the first statement of the Theorem, that these are the matrix elements of the operator $e^{X}$ in the basis {φ_k}.
It turns out that, appropriately modified, these techniques can be applied to the study
of Knuth's model as well.
Here we introduce $c^s_{nk}$, the number of paths starting from 0 that are of height k after n steps, with s the number of insertions and negative queries combined. Prop. 1.1.1 is modified accordingly; set

$$c_{nk}(t) = \sum_{s=0}^{\infty} t^s\, c^s_{nk}$$

Thus, the s is summed out, leading to a situation similar to the markovian model. We see immediately that
$$X^n \phi_0 = \sum_{k=0}^{n} c_{nk}(t)\,\phi_k$$

as in Prop. 1.1.2. Dual to Prop. 1.1.1.2 is
1.1.1.3 Proposition. The polynomials {φ_k(x,t)} satisfy the recurrence

$$X\phi_k = t\,\phi_{k+1} + (t + q_k)\,\phi_k + d_k\,\phi_{k-1}$$

with φ₋₁ = 0, φ₀ = 1. The squared norms are given by

$$\epsilon_k(t) := \|\phi_k\|^2 = t^{-k}\, d_1 d_2 \cdots d_k$$

The corresponding monic polynomials {ψ_k} are given by ψ_k = t^k φ_k, satisfying the recurrence

$$x\,\psi_k = \psi_{k+1} + (t + q_k)\,\psi_k + t\,d_k\,\psi_{k-1} \tag{1.1.1.1}$$
The n-step transitions satisfy
Proof: In the above Proposition, integrate both sides with respect to e^{−t} from 0 to ∞, using the relation

$$\int_0^\infty t^s\, e^{-t}\,dt = s!$$

The result on the right-hand side is
$$\phi_k(x,t) = t^{-k/2}\, U_k\!\left(\frac{x}{2\sqrt{t}}\right)$$

Tchebychev polynomials of the second kind, and the moments

$$\mu_{2n}(t) = \frac{1}{n+1}\binom{2n}{n}\, t^n$$

Proof: The recurrence formula follows directly from Prop. 1.1.1.3, modified according to the absence of queries. Then #2 follows from the recurrence formula for the Uₙ given in Intro., §6.2. The density function for the orthogonality relations for the Uₙ is $\frac{2}{\pi}\sqrt{1-x^2}$ on [−1,1]. Changing variables x → x/(2√t) gives the density

$$\frac{1}{2\pi t}\sqrt{4t - x^2}$$

on [−2√t, 2√t].
with C₀₀(t) = 1, C₀ₖ(t) = 0, k > 0. (Notice that this is very close to Pascal's triangle.) Inserting the formula in the statement of the Proposition, one sees that it must be verified, after cancelling common factors of $t^{(n+k+1)/2}$.
Here, as in the previous section, there are no queries, while d_k = k, for k ≥ 1. Thus,
Proof: The recurrence formula follows from Prop. 1.1.1.3, modified according to the absence of queries, and #2 follows from the recurrence formula and facts given in Intro., §6.2. ∎
1.2.3 Dictionaries
Proceeding as in the above two sections, we find
1.2.3.1 P r o p o s i t i o n . For dictionaries we have:
1. The recurrence relations
Look at the dual recurrence showing the operator X acting on the basis vector φ_k.
Each term of this recurrence corresponds to an action on the file: insertion, query, deletion.
We define corresponding operators: R, the raising operator; N, the neutral operator; and L, the lowering operator, named according to their effect on the index k:
R φ_k = i_k φ_{k+1}
N φ_k = q_k φ_k
L φ_k = d_k φ_{k−1}
A useful operator V is defined by
V φ_k = (k/i_{k−1}) φ_{k−1}
so that
RV φ_k = k φ_k
that is, RV is the number operator, multiplying by the level k.
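A minimal matrix sketch of these operators (our illustration; the sample choices i_k = t and d_k = k are the Knuth-model and linear-list coefficients used later in this chapter) confirms that RV is the number operator, and that L = tV when d_k = k:

```python
K = 8
t = 2.0
i = lambda k: t            # sample: i_k = t (Knuth's model)
d = lambda k: float(k)     # sample: d_k = k (linear lists / dictionaries)

def apply_R(c):  # R phi_k = i_k phi_{k+1}
    out = [0.0]*K
    for k in range(K - 1):
        out[k+1] = i(k) * c[k]
    return out

def apply_L(c):  # L phi_k = d_k phi_{k-1}
    out = [0.0]*K
    for k in range(1, K):
        out[k-1] = d(k) * c[k]
    return out

def apply_V(c):  # V phi_k = (k / i_{k-1}) phi_{k-1}
    out = [0.0]*K
    for k in range(1, K):
        out[k-1] = (k / i(k-1)) * c[k]
    return out

for k in range(K):
    e = [0.0]*K; e[k] = 1.0
    assert apply_R(apply_V(e))[k] == k                    # RV phi_k = k phi_k
    assert apply_L(e) == [t*v for v in apply_V(e)]        # L = tV when d_k = k
```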
Remark. Some of the Lie algebras of dimensions three and four correspond to various choices of the coefficients i_k, q_k, d_k. It turns out that our basic data structures corresponding to classical orthogonal polynomials correspond as well to fundamental Lie algebras. The latter connection is a major theme of Volume 1. In particular, the Poisson distribution, with the Poisson–Charlier polynomials, corresponds to the oscillator algebra, with i_k = 1, q_k = t + k, d_k = k; and the Gaussian distribution, with the Hermite polynomials, corresponds to the Heisenberg algebra, with i_k = 1, q_k = 0, d_k = k. The discussion here applies to the Meixner class in general (see Chapter 4 for complete details), but in fact we will apply it here explicitly for just the Poisson and Gaussian cases.
Observe that the coherent states
ψ_b = Σ_{n=0}^∞ (b^n/n!) ψ_n(x)
satisfy corresponding relations under the operators defined above.
Proof: These follow from the definition and the Proposition above describing the action on the ψ_n.   •
From the general theory of Meixner polynomials, we have the following structure. The moment generating function for X_t is
⟨e^{sX_t}⟩ = e^{tH(s)} = Σ_{n=0}^∞ (s^n/n!) μ_n(t)
Define the function V(s) as the derivative of the function H_0(s), the function H(s) for the centered distribution, i.e., with mean zero. And we denote the functional inverse to V by U, i.e., V(U(s)) = s.
1.3.3 Proposition. For the Meixner class, we have the coherent state in the form
ψ_a = e^{xU(a) − tH(U(a))}
For the Poisson distribution: H(s) = e^s − 1, H_0(s) = e^s − 1 − s, V(s) = e^s − 1, U(v) = log(1 + v).
For the Gaussian distribution: H(s) = s²/2, with V(s) = s = U(s).
DATA STRUCTURES AND ORTHOGONAL POLYNOMIALS 33
1.3.4 Proposition. For the Poisson and Gaussian cases, these functions satisfy the relation
H(a + b) − H(a) − H(b) = V(a)V(b)
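The relation is elementary to verify; the following sketch (our own check, not from the text) tests it numerically for the Poisson and Gaussian pairs (H, V) given in Prop. 1.3.3:

```python
import math
import random

def check(H, V, trials=200):
    # verify H(a+b) - H(a) - H(b) = V(a) V(b) at random points
    for _ in range(trials):
        a, b = random.uniform(-1, 1), random.uniform(-1, 1)
        assert abs(H(a + b) - H(a) - H(b) - V(a)*V(b)) < 1e-12

check(lambda s: math.exp(s) - 1, lambda s: math.exp(s) - 1)  # Poisson case
check(lambda s: s*s/2,           lambda s: s)                # Gaussian case
```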
Notice that, by orthogonality, the inner product of two coherent states is a generating function for the squared norms ||ψ_n||². However, using the observations above, this can be calculated directly as follows.
1.3.5 Proposition. For the Poisson and Gaussian cases, the inner product of the coherent states is given by
⟨ψ_a, ψ_b⟩ = e^{tab}
Proof: We have ⟨e^{aX_t} e^{bX_t}⟩ = ⟨e^{(a+b)X_t}⟩ = exp(tH(a + b)), and thus, by the cocycle identity, using the formulation of Prop. 1.3.3,
⟨ψ_a, ψ_b⟩ = e^{t[H(U(a)+U(b)) − H(U(a)) − H(U(b))]} = e^{tV(U(a))V(U(b))} = e^{tab}   •
The coherent state representation (CSR) of an operator Q is
⟨Q⟩_{ab} = ⟨ψ_a, Qψ_b⟩ / ⟨ψ_a, ψ_b⟩
so that the result for e^{aR} follows by Prop. 1.3.5 after dividing out ⟨ψ_a, ψ_b⟩. For e^{βV}, use the relation Vψ_b = bψ_b. Thus, e^{βV} ψ_b = e^{βb} ψ_b and the result follows. The number operator acts as
RV ψ_b = Σ_{n=0}^∞ (b^n/n!) n ψ_n(x)
Thus
⟨R⟩_{ab} = at,   ⟨V⟩_{ab} = b
and
⟨RV⟩_{ab} = abt,   ⟨(RV)²⟩_{ab} = a²b²t² + abt
and evaluating at 0 yields the result for R. In general, differentiate with respect to the appropriate parameters and evaluate at 0 for R and V, at 1 for RV, to get the result.   •
From the CSRs we see that the adjoint of R is tV, with CSR tb.
Remark. Note that the coherent state representation of e^{sRV} is the same as the moment generating function for a Poisson distribution with parameter abt.
Now for a main result.
1.3.1.3 Theorem. For an operator Q, we have
⟨e^{aX} Q e^{bX}⟩ = e^{tH(a+b)} ⟨Q⟩_{V(a),V(b)}   •
Now we apply this to the analysis of data structures. So, we are given a model for the data structures. Define the operators R, N, L as above, where i_k, q_k, d_k are the corresponding numbers of possibilities for the operations. In general, we denote the operator corresponding to the operation O by ξ_O. Thus ξ_I = R, ξ_Q = N, and ξ_D = L. For Knuth's model, i_k becomes t. And for d_k = k, as for linear lists and dictionaries, we have L = tV.
The level crossing numbers satisfy
NO_{kn}(t) = Σ_i ⟨X^{n−1−i} ξ_O φ_k, φ_0⟩ ⟨X^i φ_0, φ_k⟩ / e_k
NO_{kn}(t) = Npos(O, k) Σ_i H_{0,k,i}(t) H_{k,0,n−1−i}(t)
To see the second form, substitute the formulas giving the path numbers in terms of matrix elements. What must be seen, then, is the significance of this formula. It really contains the assumptions of the model. Expanding in t, with the superscripts denoting the number of insertions/negative queries, and combining like powers of t on the right-hand side yields the result.
Proof: Applying the cost operator to the level crossing formula given by the lemma and summing over k and i,
= Σ_i ⟨X^{n−1−i} ξ_O C_O(RV) X^i⟩
where Parseval's formula is used to go from the second to the third line.   •
K_n(t) H_n = Σ_i ⟨X^{n−1−i} X X^i⟩ = n ⟨X^n⟩ = n μ_n(t)
Multiplying by e^{−t} and integrating from 0 to ∞, recalling that H_n = ∫_0^∞ e^{−t} μ_n(t) dt, the result follows.   •
Proof: By the first main formula, writing Q for ξ_O C_O(RV) and noting that for n = 0 the first term is zero, the result follows by summing over n and i.   •
Proof: Let u = st. Then the integral reduces to the beta function (Intro., §2).   •
Now we have the
1.4.7 Theorem. Third main formula for integrated cost. The exponential generating function for KO_n(t)H_n satisfies
Σ_{n=0}^∞ (s^n/n!) KO_n(t) H_n = e^{tH(s)} ∫_0^s ⟨ξ_O C_O(RV)⟩_{V(s−u),V(u)} du
Proof: Expanding the left-hand side term by term gives the first form of the result; the second form follows by the theorem on CSRs, Theorem 1.3.1.3.   •
For Knuth's model, KO_n(t) must be multiplied by e^{−t} and integrated from 0 to ∞ to get the integrated costs.
Along with this remark, to get the number of histories H_n, we observe that multiplying the moment generating function by e^{−t} and integrating from 0 to ∞ yields the result, recalling that the moments convert to the H_n.
Now we apply the results of the previous section to find the behavior of the integrated costs. We consider the case where the cost function is linear in k; thus, we consider the cost function C_{Ok} = k. As noted above, the contribution of a constant to the integrated cost is of order n. We will see that the contribution of a cost of k is of order n². Here we will find the leading asymptotic behavior of the integrated costs for list implementations.
First, we consider linear lists and dictionaries, so that ξ_D = L = tV. With the cost function k, we need the coherent state representation ⟨RRV⟩_{ab} for insertions; for queries we need ⟨RVRV⟩_{ab}; and for deletions, ⟨tVRV⟩_{ab}. As in deriving the Corollary to Theorem 1.3.1.1, i.e., differentiating with respect to parameters to bring down needed factors, we have
1.5.1 Proposition. From Theorem 1.3.1.1, we have
⟨RRV⟩_{ab} = a²bt²
⟨tVRV⟩_{ab} = bt + ab²t²
⟨RVRV⟩_{ab} = a²b²t² + abt
We see that the leading asymptotic behavior is given by the terms involving t², as these correspond to the highest-order singularities. In the Gaussian case V(s) = s, so that, e.g., for insertions,
∫_0^s (s − u)² u du = s⁴/12
Multiplying by e^{−t} and integrating from 0 to ∞ gives the generating function.
1.5.2 Dictionaries
This is the Poisson case. Here H(s) = V(s) = e^s − 1. First,
1.5.2.1 Proposition. The number of histories H_n satisfies
H_n/n! ~ (1/2)(1/l)^{n+1}
where l = log 2.
Proof: The generating function for H_n/n! is 1/(1 − H(s)), in this case (2 − e^s)^{−1}. Apply the technique of the Intro., §5.1. There is a first-order pole at s = l. To evaluate A, note that this is just the residue, which can be found by evaluating the derivative of the denominator at the singularity. This yields the factor 1/2 and the result follows.   •
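Assuming the generating function 1/(2 − e^s) of the Proposition, the coefficients H_n/n! satisfy a binomial convolution recurrence (these are the ordered Bell numbers), and the simple-pole asymptotics can be checked numerically; a sketch:

```python
from math import comb, factorial, log

# h_n with EGF 1/(2 - e^s): h_0 = 1 and h_n = sum_{j>=1} C(n,j) h_{n-j}
h = [1]
for n in range(1, 25):
    h.append(sum(comb(n, j) * h[n - j] for j in range(1, n + 1)))

l = log(2)
# Asymptotics from the simple pole at s = log 2: h_n ~ (n!/2) (1/log 2)^{n+1}
approx = factorial(24) / 2 * (1 / l) ** 25
assert abs(h[24] / approx - 1) < 1e-9
```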
Using the third main formula, we consider e^{tH(s)} t² times the coherent state integral. Multiplying by e^{−t} and integrating from 0 to ∞ gives the generating function for K_nH_n. Note the calculation, for s near 0, so that |H(s)| is small,
∫_0^∞ e^{−t} t² e^{tH(s)} dt = 2/(1 − H(s))³
As noted above, the pole is at l = log 2. To get the asymptotic behavior, we use the technique of the Intro., §5.1. So we calculate the coherent state integral and evaluate at s = log 2. Since the singularity is a 3rd-order pole, we get asymptotic behavior of order n².
The coherent state integrals and their corresponding contributions to the integrated cost are as follows:
For negative queries, here we just multiply the number operator RV by t. Thus, using the Corollary to Theorem 1.3.1.1, after extracting the factor of t², we have just ab, or V(s−u)V(u):
∫_0^s (e^u − 1)(e^{s−u} − 1) du = se^s − 2(e^s − 1) + s
Evaluating at s = l gives a contribution of 3l − 2.
For insertions and deletions:
∫_0^s (e^{s−u} − 1)²(e^u − 1) du = e^{2s}/2 − 2se^s + 2e^s − s − 5/2
Evaluating at s = l gives a contribution of 7/2 − 5l.
For positive queries:
∫_0^s [(e^u − 1)(e^{s−u} − 1)]² du = se^{2s} + e^{2s} − 1 + s − 4e^s(e^s − 1) + 4se^s − 4(e^s − 1)
Evaluating at s = l gives a contribution of 13l − 9.
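The quoted contributions 3l − 2, 7/2 − 5l and 13l − 9 can be verified by direct numerical quadrature of the coherent-state integrands V(s−u)V(u), V(s−u)²V(u) and [V(s−u)V(u)]², with V(u) = e^u − 1; a sketch using composite Simpson's rule:

```python
from math import exp, log

def simpson(f, a, b, n=2000):
    # composite Simpson's rule with n (even) subintervals
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2*i - 1)*h) for i in range(1, n//2 + 1))
    s += 2 * sum(f(a + 2*i*h) for i in range(1, n//2))
    return s * h / 3

l = log(2)
E = lambda u: exp(u) - 1       # V(u) = e^u - 1 in the Poisson case

q_neg = simpson(lambda u: E(u) * E(l - u), 0, l)          # ab term
ins   = simpson(lambda u: E(l - u)**2 * E(u), 0, l)       # a^2 b term
q_pos = simpson(lambda u: (E(u) * E(l - u))**2, 0, l)     # (ab)^2 term

assert abs(q_neg - (3*l - 2)) < 1e-9
assert abs(ins - (3.5 - 5*l)) < 1e-9
assert abs(q_pos - (13*l - 9)) < 1e-9
```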
Combining these with the appropriate factors from Chapter 1, §2.2, we have these contributions:
For unsorted lists: (14l − 19/2)/2
For sorted lists: (6l − 4)/2
Using the Intro., Lemma 5.1.2, we see that we pick up a factor A = (1/2)², and 1/Γ(3) = 1/2. Combining these gives the behavior. Scaling out the factors of H_n, the net effect of the factors other than powers of l is a factor of 1/8. So,
1.5.2.2 Theorem. For dictionaries, we have the behavior of the integrated costs:
For unsorted lists: (7/(4 log 2) − 19/(16 (log 2)²)) n²
For sorted lists: (3/(4 log 2) − 1/(2 (log 2)²)) n²
1.5.3 Priority queues
Here we have
M(s) = (1 − √(1 − 4s²t)) / (2s²t)
and
V(s) = (1 − √(1 − 4s²t)) / (2s)
Proof: This may be verified by substituting the indicated expressions into the generating function
(1 − sx)^{−1} = M(s) Σ_n V(s)^n φ_n(x)
To apply Theorem 1.4.5, we need to expand the coefficients V(s)^n in powers of s. This is where we use Lagrange inversion.
1.5.3.3 Proposition. Let V(s) = (1 − √(1 − 4s²t))/(2s). Then
V(s)^m = Σ_{k≥0} (m/(2k + m)) C(2k + m, k) t^{k+m} s^{2k+m}
Proof: Let x = V(s) = (1 − √(1 − 4s²t))/(2s). Then one readily finds that
s = x/(x² + t)
Applying the Lagrange inversion formula, Ch. 0, §5.3, we have, with x_0 = s_0 = 0, writing D for d/dx, then expanding (x² + t)^k by the binomial theorem and differentiating accordingly, the result follows.   •
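The Lagrange-inversion expansion can be checked by formal power series arithmetic: solving V = s(V² + t) iteratively and comparing coefficients. A sketch, with a sample integer value of t:

```python
from math import comb
from fractions import Fraction

t = 2   # sample value of t
N = 16  # truncation order in s

def mul(A, B):
    # truncated product of power series (coefficient lists in s)
    C = [0] * N
    for i, a in enumerate(A):
        if a:
            for j, b in enumerate(B[:N - i]):
                C[i + j] += a * b
    return C

# Solve V = s (V^2 + t) by fixed-point iteration on truncated series.
V = [0] * N
for _ in range(N):
    W = mul(V, V)
    V = [0] + [W[i] + (t if i == 0 else 0) for i in range(N - 1)]

# Check: [s^{2k+m}] V^m = (m/(2k+m)) C(2k+m, k) t^{k+m}
for m in range(1, 4):
    P = [1] + [0] * (N - 1)
    for _ in range(m):
        P = mul(P, V)
    for k in range((N - m) // 2):
        assert P[2*k + m] == Fraction(m, 2*k + m) * comb(2*k + m, k) * t**(k + m)
```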
Now we will calculate the asymptotic behavior of the integrated cost for unsorted lists. Referring to Ch. 1, §2.2, we see that deletion on a file of size k costs k − 1.
1.5.3.4 Lemma. For unsorted list implementation of priority queues, we have for the generating function for the integrated costs
(1/s) Σ_{m=1}^∞ m V(s)^{2m+3} / t^{m+2}
Proof: Writing out the sum, it takes the form
s M(s)² V(s) Σ_m m V(s)^{2m} t^{−m}
using the squared norms ||φ_n||² = t^{−n}. Now observe that M(s) = V(s)/(st). Substituting in the formula above, the result follows.   •
1.5.3.5 Theorem. For unsorted lists, the integrated cost for priority queues satisfies
K_n ~ (1/2) n √(πn)
Proof: In the Lemma, substitute in for V(s)^{2m+3} using Prop. 1.5.3.3; denote the resulting expansion by eq. (1.5.3.1). The initial factor of 1/2 comes from the fact that the sum is only for k > 0, so that only half of the Gaussian integral contributes. To get the integrated costs, we have to integrate out the factor of t^n; this yields n!/(n + 1) × C(2n, n) for the number of histories. Dividing this into the above result, eq. (1.5.3.1), the theorem follows.   •
For priority queues, we list the results for some other implementations. The behavior of K_n as n → ∞ is given by:
We review the technique of continued fractions to derive the generating functions for the number of histories. In the Introduction, §3.2, we indicated how the three-term recurrence for the approximants of the continued fraction gives a family of orthogonal polynomials ψ_n, the reciprocal polynomials of the denominator polynomials. The connection with orthogonal polynomials is based on classic work of Stieltjes and Markov. See Jones & Thron[47] for details (especially pp. 241–256, 331–333, 342–344). Markov's Theorem says that the continued fraction, of the type in §3.2 of the Introduction, converges to the generating function for the moments of the distribution with respect to which the polynomials ψ_n are orthogonal. Another way of stating the result is that the continued fraction (the corresponding J-fraction) converges to the Stieltjes transform of the distribution. Below, we will illustrate this in our context. We indicate the continued fraction approach to counting histories, detailed in Flajolet[23].
For example, the words {A*}, each consisting of a string of A's of arbitrary length, map to the geometric series 1/(1 − a) via the correspondence A ↔ a.
Knuth's model counts the number of ordered possibilities of i insertions as i!, regardless of the size of the file, as long as it has total length at least i. Thus, for dictionaries, since in the model considered here insertions and negative queries are combined, the number of histories of a given length n, say, depends only on the total s = i + q, for i insertions and q negative queries. In the generating function, we denote by r the combined number of deletions and positive queries. Denote by H^s_r the number of histories of length n = s + r. We consider the generating function H(t, x, z), summed over i + q = s and i + q + r = n, in which t marks the insertions and negative queries, z marks the positive queries and deletions, and x additionally marks the deletions.
We denote by H^(h) the corresponding generating function for histories of height ≤ h.
1.6.1 Theorem. For dictionaries, H(t, x, z) has the continued fraction expansion
H(t, x, z) = 1/(1 − t − q_0 z − d_1 xz/(1 − t − q_1 z − d_2 xz/(1 − t − q_2 z − ⋯)))
where q_k = Npos(Q⁺, k), d_k = Npos(D, k). For linear lists and priority queues, the corresponding result holds, after setting q_k = 0, k ≥ 0, and t = 0.
A = {I, Q⁻, Q⁺_0, Q⁺_1, …, D_1, D_2, …}
where O_k, O ∈ {D, Q⁺}, denotes operation O performed on a file of size k. Let S^(h) denote the set of schemas represented by words having height ≤ h, with initial and final level 0. The S^(h) are given by the following regular expressions:
we get
H^(0)(t, x, z) = 1/(1 − t − q_0 z)
H^(1)(t, x, z) = 1/(1 − t − q_0 z − d_1 xz/(1 − t − q_1 z))
and so on. Generally, H^(h+1)(t, x, z) is obtained by the substitution
t + q_h z → t + q_h z + d_{h+1} xz/(1 − t − q_{h+1} z)
in H^(h)(t, x, z). Now let h go to infinity. The result for linear lists and priority queues follows from the fact that there are no queries.   •
For priority queues and linear lists we can recognize the continued fractions directly. Call it η. As this is a periodic continued fraction, substituting s² = xz, we see that it satisfies
η = 1/(1 − s²η)
whence
η = (1 − √(1 − 4s²)) / (2s²)
Comparing with the moment generating function connected with the Tchebychev polynomials of the second kind given in Intro., §6.2, we see that this equals Σ (2s)^{2n} μ_{2n}. Comparing with Theorem 1.6.1 and using the formula for the moments, we have the number of histories.
In terms of the hypergeometric function
F(a, b; s) = ₂F₀(a, b; s) = Σ_n (a)_n (b)_n s^n/n!
we have
₂F₀(1/2, 1; 2x) = Σ_{n=0}^∞ (1/2)_n (2x)^n
and hence
H^K_{2n} = n! 2^n (1/2)_n
where we recognize the moments for the Gaussian distribution given in Intro., §6.2. Thus, we should expect a connection with the Hermite polynomials, as indeed we have seen.
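For the Hermite case (d_k = k, no queries), the weighted-path interpretation of the moments gives a direct check of H^K_{2n} = n! 2^n (1/2)_n = (2n)!/2^n; a sketch, with the factor n! accounting for Knuth's ordered insertions:

```python
from math import factorial
from functools import lru_cache

@lru_cache(maxsize=None)
def D(level, steps):
    # weighted Dyck paths: up step weight 1, down step from level k weight k
    if steps == 0:
        return 1 if level == 0 else 0
    total = D(level + 1, steps - 1)
    if level > 0:
        total += level * D(level - 1, steps - 1)
    return total

for n in range(1, 8):
    paths = D(0, 2*n)
    # the weighted path count is the Gaussian moment (2n-1)!! = (2n)!/(2^n n!)
    assert paths == factorial(2*n) // (2**n * factorial(n))
    # ordering the n insertions in all n! ways gives H_{2n} = n! 2^n (1/2)_n = (2n)!/2^n
    assert factorial(n) * paths == factorial(2*n) // 2**n
```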
So far, we have assumed that the universe of keys is infinite. Since in practice there are only a finite number of keys, it is of some interest to see this explicitly taken into account. Flajolet and Françon considered the markovian model and gave the time cost generating functions. Here we will follow Françon, Randrianarimanana & Schott[34]. We will state the results of their analysis, without presenting details here.
In other words, to perform an insertion I at step i, a key is taken from the universe and added to the structure. The operation D consists of discarding a key from the structure. A positive query Q⁺ modifies neither the universe nor the structure, while performing a negative query Q⁻ consists of discarding a key from the universe, leaving the structure unmodified.
The number of histories corresponding to a given schema is
(N)_i Π_{j∈S} Npos(D, k_j) Π_{l∈P} Npos(Q⁺, k_l)
where S (resp. P) is the set of indices j such that O_j = D (resp. Q⁺), and i denotes the total number of insertions I and negative queries Q⁻ in the schema.
The analysis can now be done with the same kind of techniques as for the infinite-keys case. The corresponding generating functions, normalized by N!, are as follows.
a) Dictionary
b) Linear list
H^LL_N(0, x, z) = 1/(1 − 1·xz/(1 − 2·xz/(1 − ⋯)))
c) Priority queue
H^PQ_N(0, x, z) = 1/(1 − xz/(1 − xz/(1 − ⋯ /(1 − xz))))
a) Dictionary
where the s_{n,i} are the Stirling numbers of the second kind.
b) Linear list
c) Priority queue
H_{2n} = (N)_n (2n)!/(n + 1)!
a) Priority queue
– binary tournament: 3n log(n) + O(n), n ≤ N
– pagoda
b) Linear list
– sorted list: (n² + 5n)/3, n ≤ N
– unsorted list: (n² + 5n)/6, n ≤ N
c) Dictionary
– sorted list
Remark. Letting N → ∞ recovers the results proved for the infinite-universe case. We leave this for the reader.
II. Mutual exclusion
We have a system of processes which are accessing a common set of resources. Each process either accesses or releases a resource in a given time period. The system is ruled by the restriction that only one process may access any given resource at a given time, and (consequently) each process releases at most one resource. The objective is to avoid a deadlock state where processes have no available resources. There are various ways to model this. One way is to use a combinatorial approach, which will be discussed in Chapter 5. Another approach is to use a probabilistic model, e.g., a type of birth-and-death process counting the number of active processes and assigning a certain probability of the system locking up.
Consider p processes P_1, P_2, …, P_p and r resources R_1, R_2, …, R_r. A process P_i may access resource R_j, denoted by P_i(R_j), or it may release it, denoting this by P̄_i(R_j). Françon[31] considers all 'behaviors' corresponding to a given resource R_j. This is the language L(R_j), which we represent by the formal sum over k ≥ 0 and 1 ≤ i ≤ p of the corresponding words (assuming that all resources have been released at the end). Since r resources are used, the behavior of the full system is an element of the shuffle product of the L(R_j), 1 ≤ j ≤ r. The idea is now to use the exponential generating function. This is effective since the exponential generating function of the shuffle product of two languages associated to disjoint alphabets is the product of the corresponding generating functions. By this method, Françon obtains the exponential generating function of the admissible behaviors, which leads to the asymptotic result for the number of admissible behaviors as n → ∞.
We present an alternative probabilistic approach to the average behavior of such systems.
Remark. Raynal[75] is of particular interest for this topic. Also see Lavault[55].
The mutual exclusion protocol can be represented by a graph whose nodes correspond to processes at work, the edges indicating requests for resources. For example,
P1 → P2 ← P3 ← P4
P6 → P5
means that the processes P1 and P3 wait for some resources used by P2, while P4 (resp. P6) waits for a resource used by P3 (resp. P5). Assume that P2 releases the resource requested by P3; we get the new graph
P1 → P2   P3 ← P4
P6 → P5
Should P3 ask for a resource used by P1, the system runs into a (partial) deadlock situation.
Another view is this. If the state E_k indicates that k processes are at work with the mutual exclusion protocol, we have the birth-and-death graph of possible transitions
E_0 ⇄ E_1 ⇄ E_2 ⇄ ⋯
Writing P(t) as the row vector with components P_k(t), we can write the system in the form
P′(t) = P(t) M
where the matrix M has columns
(…, λ_k, −(α_k + λ_k + μ_k), μ_k, …)ᵀ
Note that the α_k are the infinitesimal rates at which the system deadlocks.
See Feller[22] for thorough background on Markov processes and how to analyze such systems.
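A small sketch of the birth-and-death system P′(t) = P(t)M; the rate values below are illustrative assumptions, not from the text, and the chain is truncated at a finite number of states, so some probability mass is absorbed (deadlock) or lost to truncation:

```python
K = 6
lam = [1.0] * K                    # lambda_k: a new process becomes active
mu  = [0.5 * k for k in range(K)]  # mu_k: a process releases its resource
alp = [0.1 * k for k in range(K)]  # alpha_k: infinitesimal deadlock rates

# generator: M[i][j] is the rate of the transition E_i -> E_j
M = [[0.0] * K for _ in range(K)]
for k in range(K):
    M[k][k] = -(alp[k] + lam[k] + mu[k])
    if k + 1 < K:
        M[k][k + 1] = lam[k]
    if k - 1 >= 0:
        M[k][k - 1] = mu[k]

# integrate P' = P M by small Euler steps from P(0) = (1, 0, ..., 0)
P = [1.0] + [0.0] * (K - 1)
dt, T = 1e-3, 5.0
for _ in range(int(T / dt)):
    P = [P[j] + dt * sum(P[i] * M[i][j] for i in range(K)) for j in range(K)]

leak = 1.0 - sum(P)   # mass lost to deadlock (and truncation) by time T
assert 0.0 < leak < 1.0
assert all(p >= 0.0 for p in P)
```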
Here we can apply the transition approach of Section 1 as well. Assuming that access/release of a resource is independent of the time n, and that access/release depends only on the number of active processes, we have a recurrence of the same form as before, where H_{k,n} is the number of histories such that k processes are active at time n. A deadlock is modelled by the fact that a path touches the x-axis, i.e., there are no processes active. To calculate the behaviors, we could enumerate all histories whose height is strictly greater than 0. This is like the problems studied in Section 1, and similar methods are applicable.
III. Elements of duality theory
Here we briefly indicate the general approach to duality: the correspondence between the recurrence for the c_{nk} and the recurrence for the orthogonal polynomial basis vectors used in §1.1. The idea is to interpret the c_{nk} as components of a vector C_n:
C_{n+1} = A C_n
with C_n a vector of components C_n(k). This gives a matrix C(n, k) = C_n(k). The operator X is determined by the relation
X^n φ_0 = C_n · Φ
And
X^{n+1} φ_0 = X(X^n φ_0) = C_n · XΦ = A C_n · Φ = C_n · A*Φ
So the action of X on Φ is dual to that of A on C:
XΦ = A*Φ
Example. Consider the factorial powers x^(n) = x(x − 1)(x − 2) ⋯ (x − n + 1). Then the Stirling numbers of the second kind are determined by the relations
x^n = Σ_k S_{n,k} x^(k)
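The Stirling relation of the Example is easy to verify exactly, using the standard triangle recurrence for the S_{n,k}:

```python
from functools import lru_cache

def falling(x, k):
    # factorial power x^(k) = x (x-1) ... (x-k+1)
    r = 1
    for j in range(k):
        r *= (x - j)
    return r

@lru_cache(maxsize=None)
def S(n, k):
    # Stirling numbers of the second kind: S(n,k) = S(n-1,k-1) + k S(n-1,k)
    if n == k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return S(n - 1, k - 1) + k * S(n - 1, k)

# x^n = sum_k S(n,k) x^(k), checked exactly at integer points
for x in range(-3, 6):
    for n in range(7):
        assert x**n == sum(S(n, k) * falling(x, k) for k in range(n + 1))
```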
I. Analysis of the symbol table in Knuth's model
The approach developed by Flajolet et al. leads in this case to a divergent generating function, so a different approach was called for. Recall that for Knuth's model, one considers the generating function H_n(t) = Σ_s (t^s/s!) H_n^s, where s counts the number of insertions. A study of the numbers H_n(1) as a first step towards the solution was made in Flajolet & Schott[27], where the asymptotic behaviour of these numbers was obtained with the help of Bessel functions. We recall these results in the next section.
Here the notation H_n refers to H_n(1), i.e., the function H_n(t) evaluated at t = 1. Flajolet & Schott[27] proved that, as n → ∞,
H_n ≈ (n/(2e log n))^n
Remark. The numbers H_n are denoted S*_n in [27].
The starting point for the study is the fundamental recurrence for Bessel functions:
J_{ν+1}(x) = 2νx^{−1} J_ν(x) − J_{ν−1}(x)
Rewriting this relation as
J_ν(x)/J_{ν−1}(x) = 1/((2ν/x) − J_{ν+1}(x)/J_ν(x))
and iterating gives a continued fraction. This continued fraction provides the connection with symbol table histories for the analysis of this data structure. We now recall some of the results from Flajolet & Schott[27].
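The fundamental recurrence can be checked directly from the defining series for J_ν (here at argument x = 2, with a sample non-integral order):

```python
from math import gamma

def J(nu, z=2.0, terms=60):
    # J_nu(z) = sum_k (-1)^k (z/2)^{nu+2k} / (k! Gamma(nu+k+1))
    s = 0.0
    for k in range(terms):
        s += (-1)**k * (z/2)**(nu + 2*k) / (gamma(k + 1) * gamma(nu + k + 1))
    return s

nu = 0.7   # sample non-integral order
lhs = J(nu + 1)
rhs = (2*nu/2.0) * J(nu) - J(nu - 1)   # J_{nu+1}(x) = (2 nu / x) J_nu(x) - J_{nu-1}(x)
assert abs(lhs - rhs) < 1e-12
```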
BESSEL FUNCTIONS AND LOMMEL POLYNOMIALS 55
1.1.2 Lemma. The ratio J_{ν−1}(2)/(ν J_ν(2)) admits an expansion as a series over r ≥ 0.
For the symbol table in Knuth's model we have the recurrence, cf. Ch. 2, Prop. 1.1.1.3.
Proof: Observe that φ_k = t^{−k/2} ψ_k(x/√t), where the polynomials ψ_k satisfy the correspondingly rescaled recurrence.
1.2.2 Definition. For a given τ, the set of zeros of the function of x given by J_{−x−1}(τ) is denoted by Ξ_τ.
Now we have
1.2.3 Lemma. Let ζ ∈ Ξ_τ be a zero of J_{−1−x}(τ). Then the corresponding mass p(ζ) is expressed through the ratio R_{n,−ζ}(τ)/J_{−ζ}(τ).
Proof: Replacing, in Intro., eq. (4.1.6), cx by x and the argument −2b by τ, we have, dividing out J_{−ζ}(τ),
M(s) = ⟨1/(1 − sx)⟩ = J_{−1/s}(τ) / (s√t J_{−1−1/s}(τ))
with τ = −2√t. The Stieltjes transform is
⟨1/(s − x)⟩ = J_{−s}(τ) / (√t J_{−1−s}(τ))
giving the spectrum Ξ_τ as the zeros of the denominator. The corresponding probability measure is given by the residues.
Proof: From Watson[84], pp. 153, 303, as noted in the previous section, the ratio of Bessel functions can be expressed as a continued fraction, which yields the equation
M(s) = 1/(1 − s − ts²/(1 − 2s − ts²/(1 − 3s − ts²/(1 − ⋯))))
Now we see that the discussion in [27] is for the case t = 1. For general t we have the asymptotic form of the zeros.
We follow the technique used in Chapter 2 for priority queues. We want an expansion of (1 − sx)^{−1} in terms of the polynomials φ_n.
Write
⟨(1 − sx)^{−1} φ_n(x)⟩ = M(s) V_n(s)   (1.3.1)
We know that φ_0 = 1, φ_1 = x/t from the recurrence relation. We also know that V_0(s) = 1, M(0) = 1, and that V_n(0) = 0 for n > 0, by orthogonality. Introduce the operator Δ_s acting on functions f(s) by
Δ_s f(s) = (f(s) − f(0))/s
that is, the functions (1 − sx)^{−1} are eigenfunctions of Δ_s with spectrum x. Applying Δ_s to both sides of eq. (1.3.1), it follows via the recurrence for φ_n that, for n > 0, the V_n(s) satisfy the same recursion as do the φ_n, except for initial conditions. Setting n = 0 gives, using φ_1 = x/t,
(M(s) − 1)/s = t M(s) V_1(s)
via the operator Δ_s on the left-hand side, while from the right-hand side we have the recurrence. Therefore
V_1(s) = (M(s) − 1)/(ts M(s))
and
(1 − sx)^{−1} = M(s) Σ_n V_n(s) φ_n(x)/e_n
In terms of Bessel functions,
M(s) = J_{−1/s}(τ) / (s√t J_{−1−1/s}(τ))
The cost operator acts as
ξ_O C_O(RV) φ_n = n² φ_n
Set
W_n(s) = M(s) V_n(s)/e_n
so that (1 − sx)^{−1} = Σ W_n(s) φ_n(x). Using the second main formula for integrated cost,
Ch. 2, Theorem 1.4.5, we want to calculate the corresponding double sum over n and m, using the orthogonality relations of the φ_n. Substituting back in the expression in terms of Bessel functions for W_m(s) yields sums involving the ratios
J_{m−1/s}(τ) / (s√t J_{−1−1/s}(τ))
with Ξ_τ the poles of the summands. So we look at an expansion in 1/s of
J_{m−s}(τ) / J_{−1−s}(τ)
Expanding in partial fractions, the main contributions come from the terms with double poles:
J_{m−ζ}(τ)² p(ζ) / (s − ζ)² + lower order
The Stieltjes transform says that
Σ_ζ p(ζ)/(s − ζ) = J_{−s}(τ) / (√t J_{−1−s}(τ))
with the p(ζ) the corresponding residues. Hence the double-pole contributions can be evaluated on Ξ_τ. Now, Σ_ζ φ_m(ζ)² p(ζ) sums to ||φ_m||² = t^{−m}, so that this last expression takes the form of a double sum over n and m giving an upper estimate for KO_n(t)H_n. Dividing out by the H_n will yield the result. Though the calculations have not been completed at this time, preliminary computations (Louchard, private communication) indicate that the result is of order n²/(log n)^{1/2}.
J. Coulomb[13] first investigated the zeros of the Bessel functions J_ν(x), with x fixed. He observed that the zeros converge to the negative integers. The study Flajolet & Schott[27] gave a quantitative version of Coulomb's result in the context of some combinatorial problems related to non-overlapping partitions. Their work led to the question of the speed of convergence of zeros of Lommel polynomials to zeros of corresponding Bessel functions. It is likely that the technique used here will prove useful in similar cases for other classes of functions.
In the following, the argument of the Bessel function is fixed and ν is variable, and will be denoted by x. Consider the (modified) Lommel polynomials:
R_{m,x}(z) = Σ_{n=0}^{[m/2]} (−1)^n C(m − n, n) (x + n)_{m−2n} (2/z)^{m−2n}
and set
H_m(x) = R_{m,x}(2)
We let
J(x) = J_x(2)
For m → ∞, recalling that x is non-integral,
J(x + m) = (1/Γ(x + m + 1))(1 + O(1/m))
and   (1.4.1)
J(−x − m) = (1/Γ(−x − m + 1))(1 + O(1/m))
where the O terms are uniform (in x) on compact sets. The principal formula required is the following, which may be found in Watson[84, p. 295]:
H_m(x) = (π/sin πx) [J(x + m) J(1 − x) + (−1)^m J(−x − m) J(x − 1)]   (1.4.2)
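Independently of the particular formula cited from Watson, the basic identity J_{ν+m}(z) = R_{m,ν}(z) J_ν(z) − R_{m−1,ν+1}(z) J_{ν−1}(z) relating Bessel functions and Lommel polynomials can be verified numerically at z = 2, where the three-term recurrence reduces to R_{m+1} = (ν + m) R_m − R_{m−1}; a sketch:

```python
from math import gamma

def J(nu, z=2.0, terms=80):
    # Bessel J via its defining series; valid for non-integral nu
    return sum((-1)**k * (z/2)**(nu + 2*k) / (gamma(k + 1) * gamma(nu + k + 1))
               for k in range(terms))

def R(m, nu):
    # Lommel polynomials at z = 2: R_0 = 1, R_1 = nu, R_{m+1} = (nu+m) R_m - R_{m-1}
    a, b = 1.0, nu
    if m == 0:
        return a
    for j in range(1, m):
        a, b = b, (nu + j) * b - a
    return b

nu, m = 0.7, 6
lhs = J(nu + m)
rhs = R(m, nu) * J(nu) - R(m - 1, nu + 1) * J(nu - 1)
assert abs(lhs - rhs) < 1e-10
```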
Now let μ be a positive integer. Define the domain Ω_μ = {x : 0 < |x + μ| < 1}. The first lemma is related to Hurwitz' result for Lommel polynomials [84, p. 302].
1.4.1 Lemma. The functions f_m(x) = H_m(x)/Γ(x + m) converge normally to J(x − 1) on Ω_μ.
Proof: From (1.4.2),
H_m(x)/Γ(x + m) = (π J(1 − x)/sin πx) / (Γ(x + m) Γ(x + m + 1)) + (π/sin πx)(−1)^m J(−x − m) J(x − 1)/Γ(x + m)
The first term is negligible. We have to show that the coefficient of J(x − 1) in the second term converges normally to 1. In fact, this follows from (1.4.1) and the reflection formula for the gamma function.
Recalling that the zeros of J(x) are very nearly negative integers as x → −∞, we have
1.4.2 Lemma. Let J(ζ − 1) = 0. Let ζ_m be the zero of H_m(x) closest to ζ. Then, as m → ∞, we have the asymptotic relation
|ζ_m − ζ| ~ |H_m(ζ)| / (|J′(ζ − 1)| Γ(ζ + m))
Proof: (Note: the idea is similar to that of Newton's method for solving equations.)
Let f_m(x) = H_m(x)/Γ(x + m). So f_m and H_m have the same roots near ζ. By the mean value theorem, there exists ξ_m, e.g., on the segment between ζ and ζ_m, such that
f_m(ζ) − f_m(ζ_m) = (ζ − ζ_m) f_m′(ξ_m)
so
|ζ − ζ_m| = |f_m(ζ)| / |f_m′(ξ_m)|
where, since {f_m(x)} converges normally to J(x − 1), the derivatives are equicontinuous (by Cauchy's formula). And for large m, we replace f_m′(ξ_m) by J′(ζ − 1).   •
And finally we have:
1.4.3 Theorem. Let ζ denote a zero of J(x − 1). Let ζ_m denote the zero of H_m(x) nearest to ζ. Then, as m → ∞,
|ζ_m − ζ| ~ |π J(1 − ζ) / (sin πζ · J′(ζ − 1))| · 1/(Γ(ζ + m) Γ(ζ + m + 1))   (1.4.3)
Proof: We apply eq. (1.4.2) at x = ζ, so that the J(x − 1) term vanishes. Thus,
H_m(ζ)/Γ(ζ + m) = (π J(1 − ζ)/sin πζ) · 1/(Γ(ζ + m) Γ(ζ + m + 1)) · (1 + O(1/m))
Combining eq. (1.4.1) with the above Lemma yields the result.   •
Remark. Relation (1.4.3) may be written in the form
|ζ_m − ζ| ~ |J(1 − ζ)/J′(ζ − 1)| · |Γ(−ζ − m)/Γ(ζ + m)|
via the reflection formula for the gamma function.
Numerical Results

|ζ_m − ζ| for three successive zeros ζ; Theory vs. Practice:

  m    Theory      Practice      Theory      Practice      Theory      Practice
  4    1 × 10⁻³    9 × 10⁻⁴
  5    5 × 10⁻⁵    4 × 10⁻⁵
  6    1 × 10⁻⁶    1 × 10⁻⁶
  7    3 × 10⁻⁸    2 × 10⁻⁸     8 × 10⁻²    4 × 10⁻²
  8    4 × 10⁻¹⁰   4 × 10⁻¹⁰    7 × 10⁻³    4 × 10⁻³
  9                              3 × 10⁻⁴    2 × 10⁻⁴
 10                              1 × 10⁻⁵    8 × 10⁻⁶     8 × 10⁻²    4 × 10⁻²
 11                              3 × 10⁻⁷    2 × 10⁻⁷     7 × 10⁻³    4 × 10⁻³
 12                              5 × 10⁻⁹    4 × 10⁻⁹     3 × 10⁻⁴    2 × 10⁻⁴
 13                                                       1 × 10⁻⁵    8 × 10⁻⁶
 14                                                       3 × 10⁻⁷    2 × 10⁻⁷
 15                                                       5 × 10⁻⁹
The purpose of this discussion is to convince the reader that even very simple distributed algorithms can lead to difficult mathematical analysis, involving special functions and advanced probabilistic techniques.
The two stacks begin on opposite ends of the block of size m and grow until the cumulative size exhausts the available storage. The time to absorption and the final stack sizes are random variables whose distributions depend on m and on the probabilities of the elementary operations: insertion I_i (resp. deletion D_i), i = 1, 2, is performed with probability p_i (resp. q_i). Obviously p_1 + q_1 + p_2 + q_2 = 1. Therefore, as shown in Flajolet[25], the natural formulation of this shared storage allocation algorithm is in terms of random walks inside a triangle on a 2-dimensional lattice: a state is a pair formed by giving the sizes of the two stacks. Consider a particle that starts at the point (a, b) of the plane lattice and moves according to the transition rule
(x, y) → (x + 1, y), with probability p_1
         (x − 1, y), with probability q_1
         (x, y + 1), with probability p_2
         (x, y − 1), with probability q_2
where p_1 + p_2 + q_1 + q_2 = 1. It stops when it hits the absorbing barrier x + y = m.
If the parameter m is the memory size, then we can say that the transitions above correspond respectively to random insertion I_1 and deletion D_1 (resp. I_2, D_2) in stack 1 (resp. stack 2). So a natural formulation of the two stacks problem is in terms of random walks inside a 2-dimensional lattice triangle with two reflecting barriers along the axes, since a deletion has no effect on an empty stack, and one absorbing barrier along the diagonal, since the algorithm stops when the combined sizes of the stacks exhaust the available storage m. (See Figure at the end of this section.)
In a similar way, we can see that the formulation of this problem is in terms of random walks inside a 2-dimensional lattice rectangle with a broken corner, four reflecting barriers and one absorbing barrier. (See Figure at end of this section.)
The problem is to find the time and the position of the particle when it reaches the absorbing boundary. The general case involves four parameters, p_1 + p_2 + q_1 + q_2 = 1; it is rather complicated and some problems remain open. Therefore we will restrict our study to the isotropic case, where p_1 = p_2 = q_1 = q_2 = 1/4. We refer the reader to the papers mentioned above for further details.
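A Monte Carlo sketch of the isotropic two-stacks walk (our own illustration; the diffusion approximation below suggests absorption times of order m²):

```python
import random

def two_stacks(m, trials=2000, seed=7):
    # isotropic case p1 = p2 = q1 = q2 = 1/4; reflecting axes, absorbing x + y = m
    random.seed(seed)
    total_time = 0
    for _ in range(trials):
        x = y = steps = 0
        while x + y < m:
            r = random.randrange(4)
            if r == 0:
                x += 1               # I1
            elif r == 1:
                x = max(x - 1, 0)    # D1: no effect on an empty stack
            elif r == 2:
                y += 1               # I2
            else:
                y = max(y - 1, 0)    # D2
            steps += 1
        total_time += steps
    return total_time / trials

t16 = two_stacks(16)
t32 = two_stacks(32)
# absorption needs at least m steps, and the mean grows quickly with m
assert t32 > t16 >= 16
```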
Let Y_m be the random walk described above with initial condition Y_m(0) = x_m, and denote by Z_m (resp. T_m) the hitting place (resp. hitting time) on the absorbing boundary. Flajolet has given an analytical proof based on continued fractions and orthogonal polynomials, methods similar to those used for his analysis of dynamic data structures. Here we follow Louchard, who derives a diffusion approximation. The densities involved may be expressed in terms of theta functions; see Intro., §4.2.
Y_m(m²t)/m → W(t)
where W(t) is a two-dimensional (reflected and absorbed) Brownian motion (B.M.) with appropriate boundary conditions. The convergence is in the Skorohod sense. Let the hitting time for W(·) be T; the density of W(·), P_x[W(t) ∈ dy, t < T], is given by a double theta-function series.
Proof: We include a sketch of the proof. The proof presented here is based on Chung & Williams[11]. The weak convergence is easily deduced from Chung & Williams, Th. 8.4. Our reflecting probabilities are such that (in one dimension, for instance) X_n = Σ_{i=1}^n Δ_i with
P(Δ_i = 1) = p,   P(Δ_i = −1) = q
By the central limit theorem, for each t > 0, as n → ∞, X_t^n = (σ√n)^{−1} X_{[nt]} → B_t in distribution, with B a standard Brownian motion on R. For each m ∈ N, {X^n} converges weakly in D[0, m] to {B_t, 0 ≤ t ≤ m}. Then it follows by the continuous mapping theorem that
X_t^n + max_{0≤s≤t} (−X_s^n), 0 ≤ t ≤ m
converges in the weak sense to the corresponding functional of B. Since B_0 = 0, we have, in distribution,
max_{0≤s≤t} (−B_s) = max_{0≤s≤t} B_s
Weak convergence on D[0, m] for each m ∈ N implies convergence in distribution for each time t, hence
X_t^n + max_{0≤s≤t} (−X_s^n) → B_t + max_{0≤s≤t} (−B_s)
in distribution for each t.
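The reflection map used here, W_n = X_n − min(0, min_{s≤n} X_s), coincides with the walk reflected at 0; a deterministic sketch on a fixed path:

```python
steps = [1, -1, -1, -1, 1, 1, -1, 1, -1, -1, 1]

X = 0
run_min = 0          # min of X over the past, including X_0 = 0
w = 0                # directly reflected walk
W_map, W_rec = [], []
for d in steps:
    X += d
    run_min = min(run_min, X)
    W_map.append(X - run_min)   # X_n + max_{s<=n}(-X_s), the reflection map
    w = max(w + d, 0)           # reflected recursion: W_n = max(W_{n-1} + d, 0)
    W_rec.append(w)

assert W_map == W_rec
assert all(v >= 0 for v in W_map)
```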
The continuous mapping technique used in Chung-Williams leads to the reflected B.M. W. Also, the hitting time T is a continuous functional of W (almost surely), so that the weak convergence is still valid for our absorbed process. The classical one-dimensional B.M. with reflecting boundaries at 0 and 1 is well known, see Feller[22, Ex. X.5.e and prob. XIX.9.11]. Calling this process W₁, its density is given by

P_x[W₁(t) ∈ dy] = ( 1 + 2 Σ_{k=1}^∞ e^{−k²π²t/2} cos(kπx) cos(kπy) ) dy
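As a quick numerical sanity check (our own sketch, not part of the text; the time t, starting point x, and truncation K below are arbitrary choices), the theta-series density of W₁ is nonnegative and integrates to 1 over [0, 1]:

```python
import numpy as np

t, x, K = 0.1, 0.3, 50       # arbitrary time, starting point, truncation
y = np.linspace(0.0, 1.0, 2001)
k = np.arange(1, K + 1)

# theta-series density of the reflected Brownian motion W_1 on [0, 1]
p = 1 + 2*np.sum(np.exp(-k[:, None]**2 * np.pi**2 * t/2)
                 * np.cos(k[:, None]*np.pi*x)
                 * np.cos(k[:, None]*np.pi*y[None, :]), axis=0)

dy = y[1] - y[0]
total = (p[0]/2 + p[1:-1].sum() + p[-1]/2) * dy   # trapezoid rule
print(p.min() > 0, abs(total - 1.0) < 1e-4)       # True True
```

The integral is 1 exactly, since each cosine term integrates to zero over a full period.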
For the two-dimensional process, the joint law P_x[T ∈ dt, Z(T) ∈ dz₂] is a series of the same theta-function type, built from the terms

e^{−k²π²t/2} cos(kπx₁) sin[kπ(1/2 + z₂/√2)]  and  e^{−λ²π²t/2} cos(λπx₂) sin[λπ(1/2 − z₂/√2)]

The marginal P_x[Z(T) ∈ dz₂] is obtained by integrating over t, which produces, in addition, double sums over (k, λ) with denominators k² − λ².
and

P_x[T ∈ dt] = [ 4 Σ_{k odd>0} e^{−k²π²t/2} cos(kπx₁) + 4 Σ_{λ odd>0} e^{−λ²π²t/2} cos(λπx₂)
    + 8 Σ_{k odd>0} Σ_{λ even>0} e^{−(k²+λ²)π²t/2} ((k²+λ²)/(k²−λ²)) cos(kπx₁) cos(λπx₂)
    + 8 Σ_{k even>0} Σ_{λ odd>0} e^{−(k²+λ²)π²t/2} ((k²+λ²)/(k²−λ²)) cos(kπx₁) cos(λπx₂) ] dt
Here the coordinates are related by a rotation, y₁ = 1/2 + (z₁ + z₂)/√2, and similarly for y₂ (this is a classical outward heat flow). This gives the first formula after standard manipulations. Note that permuting x₁ with x₂ changes z₂ into −z₂ (as it should). Integrating
over t gives the second formula. The last result is obtained from the first formula after
some tedious but simple computations. One could wonder if this last expression is an
honest density. This can be checked as follows. Integrating over t gives, for the first term
4 Σ_{k odd>0} cos(kπx₁) · 2/(k²π²) = −2x₁ + 1
by standard results on Fourier series. The second term gives −2x₂ + 1. The fourth term leads to

(16/π²) Σ_{k even>0} Σ_{λ odd>0} cos(kπx₁) cos(λπx₂) / (k² − λ²)
    = −(4/π) Σ_{k even>0} cos(kπx₁) sin(kπx₂) / k
    = −(2/π) Σ_{k even>0} ( sin[kπ(x₁ + x₂)] + sin[kπ(x₂ − x₁)] ) / k
Assume, without loss of generality, that x₁ ≥ x₂. By Fourier series results, this last reduces to 2x₂. Similarly, the third term of our original expression leads to −1 + 2x₁. Collecting results, we see that we have indeed an honest density: it integrates to 1. ∎
By the first Theorem above, the original hitting place (Z_m) and hitting time (T_m) are given asymptotically by

Z_m/m ⇒ Z(T),  T_m/m² ⇒ T

Indeed, Z and T are (almost surely) continuous functionals of W. The weak convergence leads to our asymptotic approximations. Note also that the particular case x₁ = x₂ = 0 leads to results of Flajolet[25], Th. 2, and our last expression above goes into the bivariate θ distribution announced in Flajolet, Th. 4.
In this situation the absorption time T_m corresponds to deadlock detection. This can be modelled by a random walk in a rectangle with a rectangular section removed from the upper right corner. As long as the random walk remains inside the main rectangle the banker satisfies requests without delay, but inside the sectioned-off rectangle prevention is necessary, since the customers P₁ and P₂ violate the announced upper bounds m₁ and m₂. The diffusion approximation f satisfies

∂f/∂t = Δf  in Ω
f = 0, ∀t, along Γ_a
∂f/∂ν = 0, ∀t, along Γ_r
f(x, y, 0) = δ(x − x₀, y − y₀)
The solution has the eigenfunction expansion

f(x, y, t) = Σ_{n=1}^∞ D_n e^{−λ_n t} u_n(x, y)

where the D_n are real coefficients to be chosen in order to satisfy the initial condition. The eigenvalues λ_n and the eigenfunctions u_n are solutions of a classical mixed homogeneous boundary value problem for the 2-d Helmholtz equation:

Δu_n + λ_n u_n = 0  in Ω
u_n = 0  along Γ_a
∂u_n/∂ν = 0  along Γ_r

To find u_n and λ_n we can make use of the large singular finite element method. This method has been specially developed to tackle elliptic boundary value problems in polygonal domains, Descloux-Tolley[14].
The particular case of the Helmholtz equation has been studied, both from analytical and numerical points of view, in Descloux-Tolley[14]. As shown by them, the eigenfunctions u_n may be written as:

u_n = Σ_{p=1}^∞ C_inp J_{μ_ip}(√λ_n r_i) g(μ_ip θ_i)  for (r_i, θ_i) ∈ Ω_i

where

Ω̄_i is the closure of the i-th subdomain Ω_i, i = 1, 2, ..., N, that constitute the subdivision of Ω, Ω = Ω₁ ∪ Ω₂ ∪ ··· ∪ Ω_N. Every Ω_i contains at most one vertex of ∂Ω and Ω_i ∩ Ω_j = ∅ for i ≠ j. Moreover, ∂Ω_i ∩ ∂Ω_j = Γ_ij for i ≠ j, with Γ_ij = ∅ if it is formed of a finite number of points.

r_i, θ_i are polar coordinates centered at a point P_i lying on ∂Ω ∩ ∂Ω̄_i. In general, P_i is located at a vertex of ∂Ω.

C_inp, i = 1, 2, ..., N, p = 1, 2, ..., are real coefficients to be properly chosen. More precisely, the C_inp should be taken in order to ensure the continuity of u_n and of its normal derivative along Γ_ij, i ≠ j; i, j = 1, 2, ..., N.

μ_ip are real numbers whose values depend on the particular boundary conditions.

J_{μ_ip}(·) is the Bessel function of the first kind and of order μ_ip.

g(·) is a sine or cosine function, depending on the boundary conditions.
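Each separated term J_μ(√λ r) g(μθ) satisfies the Helmholtz equation exactly, which can be verified by finite differences; the values λ = 7.3 and μ = 2/3 in the sketch below are hypothetical, chosen only for illustration:

```python
import math

def J(mu, z, terms=40):
    # Bessel function of the first kind, from its power series
    return sum((-1)**m / (math.factorial(m) * math.gamma(m + mu + 1))
               * (z/2)**(2*m + mu) for m in range(terms))

lam, mu = 7.3, 2.0/3.0        # hypothetical eigenvalue and corner exponent
def u(x, y):
    r, th = math.hypot(x, y), math.atan2(y, x)
    return J(mu, math.sqrt(lam)*r) * math.cos(mu*th)

# five-point Laplacian at a point away from the corner singularity
x0, y0, h = 0.4, 0.3, 1e-3
lap = (u(x0+h, y0) + u(x0-h, y0) + u(x0, y0+h) + u(x0, y0-h) - 4*u(x0, y0))/h**2
print(abs(lap + lam*u(x0, y0)) < 1e-3)   # True: Delta u + lambda u = 0
```

The non-integer order μ models the corner behavior; the singularity at r = 0 is precisely why these expansions require the careful subdomain treatment described above.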
Finally, taking into account that for λ_n ≠ λ_m

⟨u_n, u_m⟩ = ∫∫_Ω u_n(x, y) u_m(x, y) dΩ = 0 if m ≠ n, and = ||u_n||² if m = n

we obtain

D_n = ||u_n||⁻² ∫∫_Ω δ(x − x₀, y − y₀) u_n(x, y) dΩ = u_n(x₀, y₀) / ||u_n||²
Remark. The separation of Ω into subdomains Ω_i can be done in several ways. However, it is important to take into account the convergence properties of the series. For some eigenvalue λ_n, with the proper coefficients C_inp, the series converges in a disk centered at P_i with radius R_i > 0, where R_i is the distance from P_i to the "nearest singularity of u_n". We shall not discuss here the precise meaning of this. Note that if we define a computable radius R̃_i, then R̃_i ≤ R_i. For practical applications we can thus use R̃_i instead of R_i. As a consequence, Ω should be divided so that every Ω_i lies entirely in a disk centered at P_i with a radius ρ_i < R̃_i. (Ω_i contains at most one vertex of Ω̄.)
For the deadlock problem for the two-dimensional banker algorithm, one is working in a five-sided domain consisting of a square with the northeast corner cut off. This leads to eigenfunctions of the form

u_n = Σ_{p=1}^{P_i} C_inp J_{μ_ip}(√λ_n r_i) g(μ_ip θ_i)  for (r_i, θ_i) ∈ Ω_i

and

D_n = u_n(0, 0)/||u_n||² = C_1n1/||u_n||²

The approximate values C̃_inp and λ̃_n of C_inp and λ_n are found as explained in [14].
Numerical investigations show that it is possible, though not easy, to obtain accurate approximations to the eigenvalues λ_n and eigenfunctions u_n. The main difficulty is due to the presence of the Bessel functions, which lead to very ill-conditioned matrices if not suitably scaled (as noticed by Fox, Henrici, and Moler).
Another numerical problem arises from the initial condition. In order to have a sharp approximation of the initial condition in the form of the generalized Fourier development

f(x, y, 0) = δ(x, y) ≈ Σ_{n=1}^{N} D_n u_n(x, y)

it would be necessary to have a very high value of N. This is practically impossible due to the enormous amount of calculation required. However, as the values of λ_n (and λ̃_n) increase rather quickly with n, even for small values of t, f̃(x, y, t) may be considered a fairly good approximation to f(x, y, t) due to the rapid decay of e^{−λ_n t}. As a criterion to test the validity of the approximation one can evaluate the flux through the absorbing boundary,

−∫₀^t ∫_{Γ_a} (∂f̃/∂ν) ds dt′

and compare this quantity to the exact value of the corresponding absorption probability.
Remark. With a = 0 the deadlock problem for the banker algorithm degenerates into that of the evolution of two stacks in bounded space, whose solution is well known, see Louchard-Schott[61].
Remark. These methods apply to any kind of polygonal domain and of course to triangles (i.e., the 2-stacks problem). The question remains how to solve the general problem. It looks like the exit-time technique of Matkowsky et al.[70] is applicable and will give at least the asymptotic behavior of ⟨T⟩, the expected value of the hitting time, cf. Maier[64]. The question remains: how to get the limiting distributions of T and Z in the general case?
Remark. The classic example of the analysis of an algorithm that involves Bessel functions is Jonassen-Knuth[46]. For some probabilistic methods applicable to the sorts of problems discussed here, see Aldous[4].
BESSEL FUNCTIONS AND LOMMEL POLYNOMIALS 75
[Figures: the random-walk domains, with reflecting boundaries marked R: "Colliding stacks" and "Banker algorithm".]
Chapter 4
FOURIER TRANSFORM ON FINITE GROUPS
AND RELATED TRANSFORMS
A group is a set with an associative operation and an identity for the operation, with the property that every element has an inverse. We will conventionally denote the operation as multiplication, and 1 for the identity. Typical examples of abelian, i.e., commutative, groups include rings and fields, with respect to addition. Nonabelian groups that arise naturally are permutation groups and groups of geometric transformations.
A group action on a set S means a mapping from G to the set of invertible functions on S, which form a group under composition. Denote the mapping by the correspondence g → φ_g. We can also consider this as a map G × S → S:

(g, s) → φ_g(s)

If the set S is a vector space of dimension d < ∞, and G maps to linear transformations on S, g → T_g, then the action is a representation of G of degree d. A simple example is the trivial representation in which every element of G maps to the identity operator on S. For a given basis on S, the elements of G are realized as d × d matrices.
Representations T^{(1)} and T^{(2)} such that, for all g ∈ G,

T_g^{(2)} = Q T_g^{(1)} Q⁻¹

for some invertible operator Q are equivalent. Realized as matrices, these differ only in the choice of basis. It is usually understood that referring to representations means up to equivalence.
1.1 Definition. If S is a vector space over ℂ with an inner product ⟨ , ⟩, then a unitary representation is one such that the operators T_g are unitary for every g:

⟨T_g v, T_g w⟩ = ⟨v, w⟩

for all v, w ∈ S.
An observation that is often useful is that for a finite group G, any given representation is equivalent to a unitary representation. This may be seen as follows. Let [v, w] be any inner product on S. Given the representation T, define

⟨v, w⟩ = (1/|G|) Σ_{g′∈G} [T_{g′} v, T_{g′} w]

Replacing v, w here by T_g v, T_g w reduces back to ⟨v, w⟩, since g′g for fixed g is a permutation of the elements of G. Now, we know from linear algebra that ⟨v, w⟩ has the form [Pv, Pw] for some invertible P. Thus, T is equivalent to T̃ defined by T̃_g = P T_g P⁻¹, which is unitary with respect to the original inner product.
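The averaging construction can be carried out concretely; in the following sketch (our own example, not from the text) the matrix A generates a non-unitary representation of the two-element group, and conjugation by P = M^{1/2} renders it orthogonal:

```python
import numpy as np

A = np.array([[1., 1.], [0., -1.]])      # A @ A = I: a non-unitary rep of Z_2
reps = [np.eye(2), A]

# averaged inner product <v, w> = v^T M w, M = (1/|G|) sum_g T_g^T T_g
M = sum(T.T @ T for T in reps) / len(reps)

# P = M^{1/2} via the eigendecomposition of the symmetric matrix M
w, Q = np.linalg.eigh(M)
P = Q @ np.diag(np.sqrt(w)) @ Q.T

At = P @ A @ np.linalg.inv(P)            # equivalent representation
print(np.allclose(At.T @ At, np.eye(2))) # True: now unitary
```

The invariance A^T M A = M is exactly the averaged-inner-product identity from the text, and it forces P A P⁻¹ to be orthogonal.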
The character of a representation T is the function

χ(g) = tr T_g

which depends only on the equivalence class of T. The character of the trivial representation is just dim S, the degree of the representation.
The idea of the Fourier transform on groups is to use representations to study functions on the group. Here we will only be concerned with the case of |G| finite. One thinks of functions f: G → ℂ as elements of Γ(G), the group algebra consisting of formal sums over G with coefficients given by the corresponding functions:

γ(f) = Σ_{g∈G} f(g) g

This is an alternative way of writing the |G|-tuple (f(g))_{g∈G}, displaying the values of the function f. The principal feature of the algebra Γ is that multiplication in Γ corresponds to convolution of functions:

γ(f₁)γ(f₂) = γ(f₁ * f₂),  where (f₁ * f₂)(g) = Σ_{g′∈G} f₁(g′) f₂(g′⁻¹g)
Proof: Write

γ(f₁)γ(f₂) = Σ_{g′,g″} f₁(g′) f₂(g″) g′g″ = Σ_g ( Σ_{g′} f₁(g′) f₂(g′⁻¹g) ) g = γ(f₁ * f₂)  ∎
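As an illustration (our own, with random integer coefficients), the convolution rule can be verified directly on the symmetric group S₃:

```python
import random
from itertools import permutations

# S3: permutations of {0,1,2}; composition (p∘q)(i) = p[q[i]]
G = list(permutations(range(3)))
mul = lambda p, q: tuple(p[q[i]] for i in range(3))
inv = lambda p: tuple(sorted(range(3), key=lambda i: p[i]))

random.seed(0)
f1 = {g: random.randint(-3, 3) for g in G}
f2 = {g: random.randint(-3, 3) for g in G}

# product in the group algebra: gamma(f1) gamma(f2) = sum f1(g')f2(g'') g'g''
prod = {g: 0 for g in G}
for gp in G:
    for gpp in G:
        prod[mul(gp, gpp)] += f1[gp] * f2[gpp]

# convolution: (f1 * f2)(g) = sum_{g'} f1(g') f2(g'^{-1} g)
conv = {g: sum(f1[gp] * f2[mul(inv(gp), g)] for gp in G) for g in G}
print(prod == conv)  # True
```

Since S₃ is nonabelian, the order of the factors matters: γ(f₁)γ(f₂) and γ(f₂)γ(f₁) generally differ.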
This is the 'generic' version of the Fourier transform. For any representation T, we have, corresponding to Γ, the operator algebra generated by {T_g}_{g∈G} within the algebra of linear transformations on the given vector space. Since the product of any two operators of the form T_g is again of the same form, the elements of the operator algebra have the form

Σ_{g∈G} c_g T_g

for some scalars c_g. This is exactly the image of γ(f) ∈ Γ corresponding to the function f(g) = c_g. Thus, for any representation T, we define the Fourier transform of f by

f̂(T) = Σ_{g∈G} f(g) T_g
For a character χ, the transform is the scalar

f̂(χ) = Σ_{g∈G} f(g) χ(g)
Suppose that we can parametrize the irreducible representations, call them T^{(λ)}, with corresponding characters χ^λ. Then we write

f̂(T^{(λ)}) = Σ_{g∈G} f(g) T_g^{(λ)}

f̂(λ) = Σ_{g∈G} f(g) χ^λ(g)
Since

χ(hgh⁻¹) = tr(T_h T_g T_{h⁻¹}) = tr T_g = χ(g)

the characters depend only on elements equivalent up to conjugation, namely on the conjugacy classes. We denote a typical conjugacy class by ρ.
1.2.1 Definition. We denote by χ_ρ^λ the value of the character of an irreducible representation T^{(λ)} on the conjugacy class ρ.

1.2.2 Definition. A class function is a function f on G satisfying

f(g) = f(hgh⁻¹)

for all h ∈ G.
Equivalently, a class function is determined by the property that

f(gg′) = f(g′g)

for all g, g′ ∈ G. It follows readily from the definitions that

1.2.3 Proposition. The convolution of two class functions is a class function.

Another terminology is central functions. Recall that the center of G consists of those elements commuting with all other elements. Similarly, the center of an algebra consists of elements commuting with all others. Elements of the center are called central.
1.2.4 Proposition. f is a class function if and only if γ(f) = Σ f(g) g is in the center of the group algebra Γ, i.e.,

γ(f)γ(φ) = γ(φ)γ(f)

for all functions φ on G.
Proof: This is the same as f * φ = φ * f for all φ:

Σ_{k∈G} f(k) φ(k⁻¹g) = Σ_{k∈G} φ(k) f(k⁻¹g)

for f central. Choosing φ(k) = 1 if k = g′⁻¹, 0 otherwise, the above equation reads f(g′g) = f(gg′), yielding the converse. ∎
Now we have the fundamental

1.2.5 Lemma. (Schur's Lemma) Let T and T′ be irreducible representations and U an operator satisfying

U T_g = T′_g U  for all g ∈ G

Then: a. if T′ = T, U is a scalar multiple of the identity; b. if U ≠ 0, the representations are equivalent.

Proof: For a., note that any eigenspace of U will be invariant under the T_g. If U is nonzero, then it has at least one nontrivial eigenspace, which by irreducibility must be the whole space, so that U is a scalar multiple of the identity.
For b., let v be in the kernel of U, i.e., Uv = 0. Then

U T_g v = T′_g U v = 0

so that the kernel of U is an invariant subspace for T; if U ≠ 0, then it must be zero. Hence U is one-to-one. Similarly, the image of U is invariant under T′, so that U is onto. I.e., U is an isomorphism and the representations are equivalent. ∎
Thus, any element of the center of the group algebra Γ maps to a scalar multiple of the identity under any irreducible representation. I.e.,

1.2.6 Corollary. For any class function f on G, f̂(T) is a scalar multiple of the identity for any irreducible representation T.

For a conjugacy class ρ, denote the function which is 1 on ρ, 0 otherwise, i.e., the indicator function of ρ, by 1_ρ.
With

γ(ρ) = γ(1_ρ) = Σ_{g∈ρ} g

we have

1̂_ρ(T^{(λ)}) = (|ρ| χ_ρ^λ / d_λ) I,    1̂_ρ(λ) = |ρ| χ_ρ^λ

and for any class function f,

f̂(T^{(λ)}) = (1/d_λ) f̂(λ) I

Proof: By Schur's Lemma, 1̂_ρ(T^{(λ)}) is of the form c_λ I for some scalar c_λ. Taking traces gives

Σ_{g∈ρ} tr T_g^{(λ)} = |ρ| χ_ρ^λ = tr(c_λ I) = c_λ d_λ

which gives c_λ = |ρ| χ_ρ^λ / d_λ. Thus the result for 1_ρ, and hence the result for any class function f via f = Σ_ρ f(ρ) 1_ρ. ∎
Observe that this Proposition is a refinement of the Fourier transform of a class function:

f̂(λ) = Σ_ρ |ρ| f(ρ) χ_ρ^λ
The adjoint of γ(f) is given by

γ(f)* = γ(f⁺),  where  f⁺(x) = f̄(x⁻¹)

We have

1.3.3 Corollary. The inner product can be expressed in terms of convolution as

⟨f₁, f₂⟩ = (f₁ * f₂⁺)(1)

Also, on the group algebra,

tr γ(f) = f(1)|G|

Proof: For each g ∈ G, the contribution to the trace is given by the coefficient of g, corresponding to the diagonal element of the matrix of γ(f) on Γ. This gives f(1). Summing over g ∈ G yields the result. ∎
1.3.5 Theorem. We have the isometry

⟨f₁, f₂⟩ = |G|⁻¹ tr( γ(f₁) γ(f₂)* )

Proof: By the above,

⟨f₁, f₂⟩ = (f₁ * f₂⁺)(1) = |G|⁻¹ tr( γ(f₁) γ(f₂)* )

as required. ∎
Now consider the operators T_ρ, the convolution operators corresponding to multiplication by γ(ρ) for the conjugacy class ρ. We then have the decomposition of the group algebra into eigenspaces M_λ of the γ(ρ), the characters appearing in the eigenvalues: on M_λ,

γ(ρ) f = (|ρ| χ_ρ^λ / d_λ) f
Remark. Note that the trivial representation M₁ has basis γ(u) = Σ_{g∈G} g, corresponding to the constant function u(g) = 1 for all g. This element is invariant under multiplication by all g ∈ G.
So the M_λ contain copies of the irreducible representation T^{(λ)}. Denote the projections e_λ: Γ → M_λ. Then tr e_λ = m_λ d_λ, where m_λ is the multiplicity of the irreducible representation T^{(λ)} in M_λ. It turns out that m_λ = d_λ. We can now show

1.3.8 Theorem. (Fourier inversion) We have the expansion

f(g) = |G|⁻¹ Σ_λ d_λ tr{ T^{(λ)}_{g⁻¹} f̂(T^{(λ)}) }

Proof: Expanding f over the eigenspaces M_λ gives

f(g) = |G|⁻¹ Σ_λ m_λ tr{ T^{(λ)}_{g⁻¹} f̂(T^{(λ)}) }

and the result follows from the equality m_λ = d_λ. ∎
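For an abelian group every d_λ = 1 and the inversion formula reduces to classical finite Fourier inversion. A minimal numerical sketch (our own illustration) on the cyclic group of order 8:

```python
import numpy as np

N = 8
zeta = np.exp(2j*np.pi/N)
rng = np.random.default_rng(1)
f = rng.standard_normal(N)

# hat f(k) = sum_g f(g) chi_k(g), with chi_k(g) = zeta**(g*k)
fhat = np.array([sum(f[g]*zeta**(g*k) for g in range(N)) for k in range(N)])
# inversion: f(g) = |G|^{-1} sum_k fhat(k) chi_k(g^{-1})
frec = np.array([sum(fhat[k]*zeta**(-g*k) for k in range(N))/N for g in range(N)])
print(np.allclose(frec.real, f), np.allclose(frec.imag, 0))  # True True
```

Here χ_k(g⁻¹) = ζ^{−gk}, so the inversion sum is just the inverse discrete Fourier transform.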
From this theorem, we can find the form of the projections e_λ.

1.3.9 Proposition. The projections e_λ are given by

(e_λ f)(g) = (d_λ/|G|) tr{ T^{(λ)}_{g⁻¹} f̂(T^{(λ)}) }

For the matrix elements t_ij(g) = (T_g^{(λ)})_{ij} of an irreducible unitary representation we have the orthogonality relations

⟨t_kl, t_ij⟩ = d_λ⁻¹ |G| δ_ij δ_kl
Proof: For each transformation A of appropriate size (if T and T′ are of different degrees), define the intertwining operator

L(A) = Σ_{g∈G} T_g A T′_{g⁻¹}

Then T_h L(A) = L(A) T′_h for all h ∈ G. Thus, by Schur's Lemma, it is zero if T and T′ are nonequivalent, while otherwise it is a scalar multiple of the identity. Letting A = E^{ij} be the matrix having a nonzero entry equal to 1 in the ij position, we get the kl-th entry

L(E^{ij})_{kl} = Σ_{g∈G} (T_g)_{ki} (T′_{g⁻¹})_{jl}

For T = T′, writing L(E^{ij}) = c_ij I and taking traces,

d c_ij = δ_ij |G|

using the representation property and the fact that T_{g⁻¹} = T_g⁻¹. The result follows. ∎
Now we apply this to find the orthogonality relations for the characters.

1.3.1.2 Theorem. (Schur orthogonality relations)

a. For irreducible representations λ, λ′,

⟨χ^λ, χ^{λ′}⟩ = δ_{λλ′} |G|

Proof: To prove part a., the orthogonality of the matrix elements says

Σ_{g∈G} t^λ_{ki}(g) t̄^{λ′}_{lj}(g) = 0  for λ ≁ λ′

Using the relation, eq. (1.3.1), χ^{λ′}(g⁻¹) = χ̄^{λ′}(g), this reads ⟨χ^λ, χ^{λ′}⟩ = 0. For λ = λ′, we have, using the formulation for unitary representations,

⟨t_kl, t_ij⟩ = d⁻¹|G| δ_ij δ_kl

Summing over k = i gives on the right-hand side d⁻¹|G| δ_jl, and summing over j = l yields the result.
For b., apply Theorem 1.3.5, first observing that multiplication by γ(ρ′)*, as the adjoint to multiplication by γ(ρ′), acts on M_λ by multiplication by the complex conjugate eigenvalue. On the other hand, tr γ(g g′⁻¹) vanishes unless g g′⁻¹ = 1, in which case it equals |G| (for any representation, χ(1) is the degree). Thus,

|G| |ρ| δ_ρρ′ = tr( γ(ρ) γ(ρ′)* ) = Σ_λ d_λ² (|ρ| χ_ρ^λ/d_λ)(|ρ′| χ̄_{ρ′}^λ/d_λ) = |ρ||ρ′| Σ_λ χ_ρ^λ χ̄_{ρ′}^λ

hence the result:

b. Σ_λ χ_ρ^λ χ̄_{ρ′}^λ = (|G|/|ρ|) δ_ρρ′  ∎
Remark. The matrix with entries χ_ρ^λ is called the character table of the group G. It is customary to label the rows by the irreducible representations and the columns by the conjugacy classes. Part b. thus expresses the orthogonality of the columns:

Σ_λ χ_ρ^λ χ̄_ρ^λ = |G|/|ρ|

Multiplying both sides by |ρ| and summing over conjugacy classes yields, C denoting the number of conjugacy classes,

Σ_λ Σ_ρ |ρ| |χ_ρ^λ|² = C|G|

By part a., the inner sum over ρ equals |G| for each λ, so the number of irreducible representations equals C. We also have

Σ_λ d_λ² = |G|

Proof: We have noted that tr e_λ = m_λ d_λ = d_λ². From the decomposition of the regular representation,

|G| = tr 1 = Σ_λ tr e_λ = Σ_λ d_λ²

as required. ∎
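As an example (ours, using the well-known character table of S₃), the row and column orthogonality relations and the degree formula Σ d_λ² = |G| can be checked directly:

```python
import numpy as np

# character table of S3; rows: trivial, sign, 2-dim standard;
# columns: classes of e, the 3 transpositions, the 2 three-cycles
chi = np.array([[1.,  1.,  1.],
                [1., -1.,  1.],
                [2.,  0., -1.]])
sizes = np.array([1., 3., 2.])           # class sizes, |G| = 6

gram = chi @ np.diag(sizes) @ chi.T      # row orthogonality
print(np.allclose(gram, 6*np.eye(3)))    # True
col = chi.T @ chi                        # column orthogonality
print(np.allclose(np.diag(col), [6., 2., 3.]))  # True: |G|/|rho|
print((chi[:, 0]**2).sum())              # 6.0 = sum of squared degrees
```

The first column lists the degrees 1, 1, 2, and indeed 1² + 1² + 2² = 6 = |S₃|.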
For abelian groups, all elements commute, so that each element is its own conjugacy class. Thus, there are |G| irreducible representations. They are given as one-dimensional unitary representations, homomorphisms from G to the unit circle in ℂ. These are called the characters of the group G. (Since these are representations of degree 1, this is consistent with the terminology for nonabelian groups, but it is a more restrictive usage.)
The simplest abelian groups are the cyclic groups having a single generator x. A finite cyclic group is determined by x such that x^N = 1 for some positive integer N. The elements of such a group can be enumerated as g_i = x^i, for i = 0, 1, ..., N − 1. An infinite cyclic group is isomorphic to the integers ℤ, with elements x^n, n ∈ ℤ. The fundamental theorem on abelian groups states that a finitely generated abelian group is a direct product of cyclic groups. Thus, the characters of a finite abelian group factor into representations of cyclic groups

χ(g) = χ_{n₁}(g) χ_{n₂}(g) ··· χ_{n_r}(g)

corresponding to the factorization of the group. Here we will focus on representations of finite cyclic groups and the connection with the finite Fourier transform.

The characters of the cyclic group with generator x satisfying x^N = 1 are given by the mappings χ_k:

χ_k(x^j) = ζ^{jk},  with ζ = e^{2πi/N}
2.1.1 Circulants

The group algebra of the cyclic group with generator x satisfying x^N = 1 is the algebra of N × N circulants, so called because successive rows of the matrix are generated by cyclic permutation; i.e., each row has the same entries as the first row, but permuted cyclically. If we map

x → U =
( 0 1 0 ... 0 )
( 0 0 1 ... 0 )
( ...         )
( 0 0 0 ... 1 )
( 1 0 0 ... 0 )

then corresponding to the function on the group such that f(x^j) = a_j is the matrix

( a₀      a₁      a₂   ...  a_{N−1} )
( a_{N−1} a₀      a₁   ...  a_{N−2} )
( ...                               )
( a₁      a₂      a₃   ...  a₀      )
Remark. Note that this is actually the transpose of the regular representation, which maps x to

( 0 0 0 ... 1 )
( 1 0 0 ... 0 )
( 0 1 0 ... 0 )
( ...         )
( 0 0 ... 1 0 )
On the k-th irreducible representation, x acts as multiplication by ζ^k, so that a basis for the representation is given by the vector, t denoting transpose,

(1, ζ^k, ζ^{2k}, ..., ζ^{(N−1)k})^t

which is cyclically permuted by multiplication by ζ^k. The group algebra is diagonalized by the finite Fourier transform given by the unitary matrix F with entries ζ^{ij}/√N, 0 ≤ i, j ≤ N − 1:

F = N^{−1/2} ·
( 1  1        1          ...  1            )
( 1  ζ        ζ²         ...  ζ^{N−1}      )
( 1  ζ²       ζ⁴         ...  ζ^{2(N−1)}   )
( ...                                      )
( 1  ζ^{N−1}  ζ^{2(N−1)} ...  ζ^{(N−1)²}   )
2.1.1.1 Proposition. We have the relation

U = F Λ F*

where Λ is the diagonal matrix with entries 1, ζ, ζ², ..., ζ^{N−1}. Consequently, every circulant is diagonalized by F:

f̂(U) = Σ_k f(k) U^k = F f̂(Λ) F*
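This diagonalization is easy to verify numerically; the sketch below (our own, with arbitrary entries a_j) checks that F diagonalizes the circulant and that the eigenvalues are the finite Fourier transform of the first row:

```python
import numpy as np

N = 6
a = np.array([3., 1., 4., 1., 5., 9.])                 # f(x^j) = a_j, arbitrary
C = np.array([[a[(j - i) % N] for j in range(N)]       # circulant: each row is a
              for i in range(N)])                      # cyclic shift of the first

zeta = np.exp(2j*np.pi/N)
F = np.array([[zeta**(i*j) for j in range(N)] for i in range(N)])/np.sqrt(N)

D = F.conj().T @ C @ F
eigs = np.diag(D)
print(np.allclose(D, np.diag(eigs)))                   # True: F diagonalizes C
print(np.allclose(eigs, np.fft.fft(a)[(-np.arange(N)) % N]))  # True
```

The index reversal in the last line accounts for the sign convention of `np.fft.fft`, which uses e^{−2πijk/N} where the text uses ζ^{jk}.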
In Chapter 5 we will see the connection with inverses of circulants and symmetric
functions.
We will look at random walks and martingales for finite probability distributions. Orthogonal polynomial systems for the multinomial distribution will be found. A particular class of generalized Krawtchouk polynomials is determined by a random walk generated by roots of unity. Relations with hypergeometric functions and some limit theorems are discussed. This presentation is based on Feinsilver-Schott[19].
The generating function of the Krawtchouk polynomials is

G(v) = Σ_{a=0}^N v^a K_a(x, N) = (1 + v)^{(N+x)/2} (1 − v)^{(N−x)/2}

We will also use the expansion

(1 − v)^{b−c} (1 − (1 − s)v)^{−b} = Σ_{n=0}^∞ (c)_n (v^n/n!) ₂F₁(−n, b; c; s)    (3.1.1)
One class arises from a random walk on the lattice ℕ[1, ζ, ζ², ..., ζ^{d−1}], where ζ = e^{2πi/d}. There are evident connections with the finite Fourier transform. It is quite likely that the classes of polynomials discussed here will prove useful in image processing, among other possible applications.
In the next section we give the probabilistic approach to the binomial case. Then we present a general approach for finite probability distributions. After that, some related limit theorems are presented. Specialization to the cyclic case completes the study.

1. Throughout, N will denote 'time', the number of steps taken in the random walk.
2. We use ⟨ ⟩ to denote expected value, e.g., ⟨X⟩.
3. For brevity, we call Krawtchouk polynomials simply K-polynomials.
Consider a random walk on ℤ, with equiprobable increments ±1. We write X_j, 1 ≤ j ≤ N, for the corresponding Bernoulli variables. The generating function is

G(v) = Π_{j=1}^N (1 + vX_j)

where x = Σ X_j is the position after N steps. As noted above, G(v) = Σ v^a K_a(x, N), with Krawtchouk polynomials K_a.

One can take the viewpoint of 'quantum probability' and consider the X_j as the spectrum of an operator A, an N × N matrix. Then the condition on A is that λ ∈ spectrum(A) ⇒ λ ∈ {−1, 1}, and the variables of interest are the multiplicities. The variables x, N are simply tr A, tr A². We thus have the principal observation that since the X_j take two values, two variables suffice to specify the K_a, which are seen to be elementary symmetric functions in the X_j. The variables x, N are the corresponding power sums: x = Σ X_j, N = Σ X_j².
The averaged product

⟨G(v) G(w)⟩ = (1 + vw)^N

exhibits the orthogonality of the K_a nicely. So the K_a are important mainly for these two features: they are the elementary symmetric functions in the increments, and they are orthogonal with respect to the binomial distribution.
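The orthogonality ⟨K_a K_b⟩ = δ_ab C(N, a) can be checked directly from the generating function; a small sketch (ours, with N = 8 an arbitrary choice):

```python
from math import comb

N = 8
def K(a, x):
    # coefficient of v^a in (1+v)^{(N+x)/2} (1-v)^{(N-x)/2}
    p, q = (N + x)//2, (N - x)//2
    return sum((-1)**j * comb(p, a - j) * comb(q, j) for j in range(a + 1))

xs = [2*j - N for j in range(N + 1)]          # positions x = 2j - N
w = [comb(N, j)/2**N for j in range(N + 1)]   # binomial weights

ok = all(abs(sum(wi*K(a, xi)*K(b, xi) for xi, wi in zip(xs, w))
             - (comb(N, a) if a == b else 0)) < 1e-9
         for a in range(N + 1) for b in range(N + 1))
print(ok)  # True: <K_a K_b> = delta_ab * C(N, a)
```

The squared norms C(N, a) are exactly the coefficients of (vw)^a in (1 + vw)^N, as the averaged product shows.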
The probabilistic approach may be carried out for general finite probability spaces. Each increment X takes d possible values {ξ₀, ..., ξ_δ} with P(X = ξ_j) = p_j, 0 ≤ j ≤ δ, where throughout we will use the convention δ = d − 1. Denote the mean and variance by μ and σ² as usual.
Take N independent copies of X: X_j, 1 ≤ j ≤ N. Define the martingale products

G(v) = Π_{j=1}^N (1 + v(X_j − μ)) = Π_{j=0}^δ (1 + v(ξ_j − μ))^{n_j} = Σ_{a=0}^N v^a K_a(n)

where n_j = #{i : X_i = ξ_j}, this last equality defining our generalized K-polynomials. One quickly gets

3.3.1 Proposition. Denoting the multi-index n = (n₀, ..., n_δ) and by e_j the standard basis on ℤ^d, the K-polynomials satisfy the recurrence

K_a(n + e_j) = K_a(n) + (ξ_j − μ) K_{a−1}(n)

Expanding the product,

K_a(n) = Σ_{|k|=a} Π_{j=0}^δ C(n_j, k_j) (ξ_j − μ)^{k_j}

where |k| = k₀ + ··· + k_δ.
3.3.3 Proposition. If ξ₀ = 0, then

K_a(n) = (−N)_a Σ_{|r|=a} [ Π_{j=1}^δ (p_j ξ_j)^{r_j}/r_j! ] F_B(−r, −n; −N; p₁⁻¹, ..., p_δ⁻¹)

Proof: Let v_j = v p_j ξ_j, b_j = −n_j, t = −N, s_j = p_j⁻¹ in eq. (3.3.1), for 1 ≤ j ≤ δ. Note that Σ v_j = vμ and N − (n₁ + ··· + n_δ) = n₀. ∎
As for the binomial case, we may use the power sum variables, e.g., for centered increments,

x_i = Σ_{j=1}^N (X_j − μ)^i = Σ_{j=0}^δ n_j (ξ_j − μ)^i

to express the functions K_a. (This will be useful in the cyclic case.)
We have

⟨K_a K_b⟩ = δ_ab σ^{2a} C(N, a)

Proof: Averaging over a single increment,

⟨(1 + v(X − μ))(1 + w(X − μ))⟩ = 1 + vw σ²

Thus, ⟨G(v) G(w)⟩ = (1 + vw σ²)^N. This shows orthogonality and yields the squared norms as well. ∎
Limit theorems for products Π(1 + vX_j) are known for X_j the discrete increments of a process. The limits yield iterated integrals of the process.

Here we look at limit theorems based on the power sum functions. First, define the normalized values β_j = (ξ_j − μ)/σ. Then the central limit theorem says that

(1/√N) Σ_{j=1}^N (X_j − μ)/σ = N^{−1/2} Σ_{j=0}^δ n_j β_j

converges in distribution to a standard Gaussian as N → ∞. We look at the scaled power sums of the normalized variables:

S_i = N^{−i/2} Σ_{j=0}^δ n_j β_j^i

3.4.1 Theorem. Let Y_j satisfy n_j = p_j N + Y_j √N. Then, as N → ∞, Σ_{j=0}^δ Y_j β_j converges to a standard Gaussian.
Then the scaled martingale

G(v) = Π_{j=0}^δ (1 + vβ_j/√N)^{n_j}

converges to the Brownian martingale at time 1,

G(v) → e^{vY − v²/2}

where Y is the standard Gaussian denoting the limit of the normalized sums.

Proof: Write

G(v) = Π_j exp( n_j log(1 + vβ_j/√N) ) = exp( Σ_{i≥1} (−1)^{i−1} (v^i/i) S_i )

where the S_i are the scaled power sums N^{−i/2} Σ_j n_j β_j^i. The above Theorem deals with i = 1. For i = 2 we have, with n_j = p_j N + Y_j √N,

N^{−1} Σ_j n_j β_j² = Σ_j p_j β_j² + N^{−1/2} Σ_j Y_j β_j² = 1 + o(1)

since the β's are scaled to variance 1. For i ≥ 3, S_i → 0 as N → ∞. Denoting the limit of Σ Y_j β_j by Y, the result follows. ∎
We conclude that the corresponding K-polynomials converge to Hermite polynomials in
the variable Y.
Now consider a random walk in ℂ, with increments X taking values in the d-th roots of unity: 1, ζ, ζ², ..., ζ^δ, with ζ = e^{2πi/d} and δ = d − 1 as above. For simplicity we discuss the isotropic case, all values occurring with equal probability 1/d. The power sums are

x_k = Σ_{j=0}^δ n_j ζ^{jk},  in particular  x₀ = n₀ + n₁ + ··· + n_δ = N

Notice that the x_k are the finite Fourier transform of the variables (n₀, ..., n_δ). So here we can conveniently go back and forth between the two sets of variables.
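This correspondence can be seen concretely (our sketch; the modulus d and walk length are arbitrary): the occupation numbers n_j of a simulated walk have the x_k as their discrete Fourier transform:

```python
import numpy as np

d = 5
zeta = np.exp(2j*np.pi/d)
rng = np.random.default_rng(0)
steps = rng.integers(0, d, size=30)       # increments zeta**step
n = np.bincount(steps, minlength=d)       # occupation numbers n_0, ..., n_delta

x = np.array([sum(n[j]*zeta**(j*k) for j in range(d)) for k in range(d)])
print(abs(x[0] - 30) < 1e-9)              # True: x_0 = N
print(np.allclose(x, np.fft.fft(n)[(-np.arange(d)) % d]))  # True
```

As before, the index reversal matches the e^{−2πijk/d} convention of `np.fft.fft` to the ζ^{jk} convention of the text.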
From algebraic number theory, see Ribenboim[77, p. 266ff.], it is known that for d a prime power, φ(d) denoting Euler's function, the powers 1, ζ, ζ², ..., ζ^{φ(d)−1} form a basis for the ℤ-module spanned by the ζ^j, 0 ≤ j ≤ δ. I.e., each sum of the form Σ n_j ζ^j has a unique expression as a sum involving ζ^j for j < φ(d). The problem is that we have to count how many general sums, involving all the ζ^j, reduce to the same canonical form, involving only ζ^j, j < φ(d). From the point of view of Fourier analysis and probability theory this has the flavor of finite prediction theory and requires a separate study.

Here we note that if d is a prime, then 1, ζ, ζ², ..., ζ^{δ−1} form a ℤ-basis. And we have the elementary relation Σ ζ^j = 0. Thus,

3.5.1 Proposition. Let d be prime. Given x₀ = N and x₁ = Σ n_j ζ^j, the n_j are uniquely determined.

Proof: By the result quoted above, the numbers n_j − n_δ, 0 ≤ j < δ, are uniquely determined. Thus, their sum, call it ν = n₀ + ··· + n_{δ−1} − δn_δ, is known, whence dn_δ = n₀ + ··· + n_δ − ν = N − ν is determined, and hence the n_j, 0 ≤ j < δ, as well. ∎
Now we look at the K-polynomials for the cyclic case. The generating function is

G(v) = Π_{j=0}^δ (1 + vζ^j)^{n_j} = Σ_a v^a K_a

so that

K_a(n₀, ..., n_δ) = Σ_{|k|=a} Π_j C(n_j, k_j) ζ^{j k_j}

Applying eq. (3.3.1) with v_j = vζ^j and b_j = −n_j, note that

Σ_{j=1}^δ v_j = −v  and  Σ_{j=1}^δ b_j = n₀ − N
3.5.2 Theorem. In G(v), scale v → v n^{−1/2}, x₀ → x₀ n and, for 1 ≤ k ≤ δ, scale x_k according to (d, k), the greatest common divisor of d and k. Then G(v) converges to a limit of exponential (Gaussian) type as n → ∞.
For fixed k, let (d, k) = g. Then, for 0 ≤ j < d, we can write j = l d/g + r, with r < d/g. Thus,

jk = l(dk/g) + rk ≡ rk (mod d)

Since ζ^k is a primitive (d/g)-th root of unity, the product over a full set of (d/g)-th roots of unity gives

Π_{r=0}^{(d/g)−1} (1 + v ζ^{rk}) = 1 − (−v)^{d/g}

With the scalings indicated, v → v n^{−1/2}, each factor of the product converges to an expression of exponential form in the limiting variables.
That is, the Ψ_a are the functions F normalized to have unit expectation. The term states comes from physics, denoting a function of unit norm in L² of p. The idea is that the state is a line, or ray, in the vector space, an equivalence class of functions up to multiplication by scalars. Note here that the Ψ_a remain invariant if the functions F are multiplied by scalars.

The Appell states are used to define transforms of operators acting on functions of x, typically L² of p.
Thus, these are the normalized matrix elements of the operator Q with respect to the Appell states. The coherent state representation used in Chapter 2 is the Appell transform for the exponential family. Here is a useful lemma connecting the Appell transform and matrix elements with the family F itself.
4.1.3 Lemma. For an operator Q, whenever the matrix elements are defined we have

⟨Ψ_a, Q Ψ_b⟩ / ⟨Ψ_a, Ψ_b⟩ = ⟨F(a,X), Q F(b,X)⟩ / ⟨F(a,X), F(b,X)⟩

Proof: Write F(s, X) = M(s) Ψ_s(X) on the right-hand side. The factors M(a)M(b) cancel out. ∎
We define the operator X_s, acting on the variable s, by

X_s F(s, x) = x F(s, x)

Denote the family of orthogonal polynomials with respect to p by {φ_n}, with squared norms γ_n = ||φ_n||². We define the transforms

V_n(s) = ⟨φ_n, Ψ_s⟩     (4.2.1)

Thus, we have the expansion (in general, under the assumption of completeness of the φ_n)

Ψ_s = Σ_{n=0}^∞ V_n(s) φ_n(x)/γ_n

The polynomials satisfy the three-term recurrence

x φ_n = c_n φ_{n+1} + a_n φ_n + b_n φ_{n−1}

with initial conditions φ₋₁ = 0, φ₀ = 1. Observe that the recurrence relation implies φ₁(x) = (x − a₀)/c₀. We have
4.2.1 Theorem. V₀ = 1, and the transforms satisfy

M(s)⁻¹ X_s ( M(s) V_n(s) ) = c_n V_{n+1}(s) + a_n V_n(s) + b_n V_{n−1}(s)

Proof: With φ₀ = 1, setting n = 0 in eq. (4.2.1) yields V₀ = 1. For general n, write eq. (4.2.1) in the form

⟨φ_n, F(s, X)⟩ = M(s) V_n(s)

so that

⟨φ_n, X Ψ_s⟩ = M(s)⁻¹ X_s ( M(s) V_n(s) )

Now use the recurrence formula for φ_n on the left-hand side and apply eq. (4.2.1) to get the result. For n = 0, this procedure yields

⟨X Ψ_s⟩ = M(s)⁻¹ X_s M(s)     (4.2.2)

But eq. (4.2.1) for n = 1 says ⟨φ₁, Ψ_s⟩ = V₁. Writing X = c₀ φ₁ + a₀ and taking inner products with Ψ_s thus gives V₁. ∎
For general systems of orthogonal polynomials, we use the states corresponding to the family F(s, x) = (1 − sx)⁻¹. The theory applies whenever this is in L² of the underlying measure. For example, extending to complex values, if s = iσ, σ real, then |(1 − sx)⁻¹| ≤ 1 for all real x. Here, the operator X_s = Δ_s, which acts as

Δ_s u(s) = ( u(s) − u(0) ) / s

And Δ_s F(s, x) = x F(s, x) holds, as required. We use the notations of the discussion above. Theorem 4.2.1 yields
for n > 0,

V_n(s)/s = c_n V_{n+1} + a_n V_n + b_n V_{n−1}

Proof: This follows directly from Theorem 4.2.1. We have to check that for n > 0, V_n(0) = 0, so that M⁻¹ X_s M V_n(s) reduces to V_n(s)/s. We have F(0, x) = 1 for all x, and M(0) = 1, so that Ψ₀ = 1. Hence

V_n(0) = ⟨φ_n, Ψ₀⟩ = 0

follows by orthogonality. ∎
An interesting application of this theorem is to the Chebyshev polynomials. As we saw in Ch. 2, Prop. 1.5.3.1, the coefficients of the expansion of (1 − sx)⁻¹ are of the form V_n(s) = V(s)^n. It is not difficult to see from the above Theorem that this form of V_n(s) implies a quadratic equation for V(s), which leads back to the form of V(s) we had in Chapter 2, with extra parameters coming from the recurrence relations.
In the proof of Theorem 4.2.1, eq. (4.2.2) yields, with X_s = d/ds,

M′(s)/M(s) = c₀ V₁(s) + a₀

This gives the formula for V₁. As well, for the left-hand side of the recurrence, we have

M(s)⁻¹ (d/ds)( M(s) V_n(s) ) = ( M′(s)/M(s) ) V_n(s) + V_n′(s)

For Meixner systems the transforms have the form

V_n(s) = V(s)^n

where, in particular, V₁(s) = V(s). From the above theorem we have

4.2.2.2 Theorem. For Meixner systems we have the expansion

e^{xU(s)} = M(s) Σ_{n=0}^∞ V(s)^n φ_n(x)/γ_n

where

V′ = γ + 2αV + βV²

and the recurrence formula for the orthogonal polynomials is of the form

x φ_n = c_n φ_{n+1} + a_n φ_n + b_n φ_{n−1}

Proof: Substituting V_n = V^n into the recurrence given by the theorem above yields

V′ = ((c_n − c₀)/n) V² + ((a_n − a₀)/n) V + b_n/n

which is independent of n precisely when

c_n = c₀ + nβ,  a_n = 2nα + a₀,  b_n = nγ  ∎
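The simplest instance is the Hermite case (our specialization: α = β = 0, γ = 1, c₀ = 1, a₀ = 0, so that V(s) = s, U(s) = s, M(s) = e^{s²/2}, γ_n = n!), for which the expansion reduces to the classical generating function of the probabilists' Hermite polynomials, e^{xs} = e^{s²/2} Σ s^n He_n(x)/n!:

```python
import math

def He(n, x):
    # probabilists' Hermite polynomials via He_{k+1} = x He_k - k He_{k-1}
    a, b = 1.0, x
    if n == 0:
        return a
    for k in range(1, n):
        a, b = b, x*b - k*a
    return b

x, s = 0.7, 0.4
series = sum(s**n * He(n, x) / math.factorial(n) for n in range(40))
print(abs(series - math.exp(x*s - s**2/2)) < 1e-12)  # True
```

Here the recurrence x He_n = He_{n+1} + n He_{n−1} indeed has c_n = 1, a_n = 0, b_n = n, matching the coefficient formulas of the theorem.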
V. O r t h o g o n a l p o l y n o m i a l e x p a n s i o n s via Fourier t r a n s f o r m
The Meixner polynomials are special families of orthogonal polynomials closely related
to operator calculus and Lie algebras. A particular class, the Krawtchouk polynomials, can
be thought of as functions on the finite abelian group ZJ and thus the calculation of the
Krawtchouk transform is of interest. E.g., Diaconis-Rockmore[15] mention the question of
rapid calculation of the Krawtchouk transform. Here, we use the close relationship between
Meixner polynomials and operator calculus to give expressions for the generalized Fourier
coefficients of a function expanded in a series of orthogonal polynomials of Meixner type.
From these formulas, we obtain, in conjunction with the fast Fourier transform ( F F T ) ,
efficient methods for the calculation of the coefficients.
First we present the basic facts concerning Meixner polynomials and their connection
with operator calculus. Next, we give a detailed discussion of the role of the lowering
operator V. The Krawtchouk case is discussed in detail. After that comes the general
Meixner case, and a return to the Krawtchouk expansions. We reduce the transforms to
multiplication by a Vandermonde matrix and the FFT.
A set of three operators A, B, C is a basis for the Heisenberg algebra if they satisfy
[A, B] = C and [A, C] = [B, C] = 0, i.e., C commutes with A and B.
Given a vector space with basis ψ_n, a representation of the Heisenberg algebra may be
given in terms of the raising and lowering operators R and V defined by their action on
the basis vectors:

Rψ_n = ψ_{n+1}
Vψ_n = nψ_{n−1}

It is readily seen that, with the Lie bracket given by the commutator VR − RV,

[V, R] = I
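As an illustration (a sketch, not from the text), the relations can be checked with finite matrices on the span of ψ_0, ..., ψ_{N−1}; on such a truncation [V, R] = I necessarily fails in the last column:

```python
# Realize R and V on the span of psi_0, ..., psi_{N-1} and check [V, R] = I
# away from the truncation edge.
N = 8

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

# R psi_n = psi_{n+1}: column n maps to row n+1.
R = [[1 if i == j + 1 else 0 for j in range(N)] for i in range(N)]
# V psi_n = n psi_{n-1}: column n maps to n times row n-1.
V = [[j if i == j - 1 else 0 for j in range(N)] for i in range(N)]

# Commutator VR - RV
C = [[a - b for a, b in zip(ra, rb)]
     for ra, rb in zip(matmul(V, R), matmul(R, V))]

# [V, R] = I holds on psi_0, ..., psi_{N-2}; the last column feels the cutoff.
for n in range(N - 1):
    assert all(C[i][n] == (1 if i == n else 0) for i in range(N))
```

The failure in the last column is the usual obstruction: the Heisenberg commutation relation has no finite-dimensional representation, so any matrix model is only approximate.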
5.1.2 Proposition. For the Meixner classes of polynomials the V operators take the
form:

where, for the general case, α, β are given parameters and q² = α² − β.
The role of V as lowering operator may be seen from the generating functions. These
have a particular structure that is determined by the function H(s), the logarithm of the
moment generating function of the measure of orthogonality. Each class of polynomials
corresponds to a convolution family of measures, p_t. They satisfy the relation

where H is analytic in a neighborhood of the origin (in the complex plane). Let U be the
functional inverse of V (in a neighborhood of 0) and set M(s) = H(U(s)). The generating
functions have the form

e^{xU(s) − tM(s)} = Σ_{n=0}^∞ (s^n/n!) J_n(x, t)

the function V being given as the derivative of H. (This is discussed in detail below.) The
measures p_t are a convolution family, p_t corresponding to the t-th convolution power of p_1: p_t = p_1^{*t}.
106 CHAPTER 4
The operator V is the lowering operator, the action of which on the polynomials is given
by

From orthogonality, one finds that V satisfies a Riccati equation, cf. the previous
section on Appell states, which in standard form may be written

V' = 1 + 2αV + βV²

for real constants α, β, the prime denoting differentiation. Consider the measure of or-
thogonality p(dx) (i.e., p_1(dx)). With s replacing is in (5.1.1.1) we express the moment
generating function in the form

e^{H(s)} = ∫ e^{sx} p(dx)

Differentiating with respect to s,

V(s) e^{H(s)} = ∫ e^{sx} x p(dx)

with V(s) = H'(s). By the Riccati equation it follows that repeated differentiation leads
to a relation of the form

Thus, from the Fourier point of view, V^n corresponds to the operator of multiplication by
φ_n(x). We will see that φ_n(x) is proportional to J_n(x, 1).
5.2.1 Proposition. The polynomials defined via (5.2.1) satisfy

φ_n(x) = (n!/γ_n) J_n(x, 1) = V'(D)^n p(x) / p(x)

Proof:
FOURIER TRANSFORM ON FINITE GROUPS 107
∫ J_m(x, 1) V'(D)^n p(x) dx = ⟨V^n, J_m⟩ = n! δ_{nm}

Comparing with the orthogonality relations of the J's shows the Rodrigues-type formula

V'(D)^n p(x) = (n!/γ_n) J_n(x, 1) p(x)

with γ_n = ⟨J_n, J_n⟩. Now compare with (5.2.2). •
In the generating function

Σ_{n=0}^∞ (s^n/n!) J_n(x, t) = e^{xU(s) − tM(s)}

replacing s → V(s) gives

Σ_{n=0}^∞ (V(s)^n/n!) J_n(x, t) = e^{xs − tH(s)}

In this form the action of V is clear. Multiplication by V(D) on the left multiplies by V(s)
on the right, resulting in J_n → n J_{n−1}.
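A concrete instance of the lowering action (a sketch, assuming the Gaussian member of the Meixner scheme, where H(s) = s²/2, V(s) = H'(s) = s, and the J_n(x, 1) are, up to normalization, the probabilists' Hermite polynomials He_n): here V(D) = d/dx and lowering reads He_n' = n He_{n−1}.

```python
# Gaussian case: V(s) = s, so V(D) = d/dx lowers: He_n' = n He_{n-1}.
def hermite(n):
    """Coefficients of He_n, lowest degree first: He_{n+1} = x He_n - n He_{n-1}."""
    p, q = [1], [0, 1]                  # He_0, He_1
    if n == 0:
        return p
    for k in range(1, n):
        r = [0] + q                     # multiply q by x (shift up one degree)
        r = [a - k * b for a, b in zip(r, p + [0] * (len(r) - len(p)))]
        p, q = q, r
    return q

def deriv(c):
    """Coefficients of the derivative of the polynomial with coefficients c."""
    return [k * c[k] for k in range(1, len(c))]

for n in range(1, 8):
    assert deriv(hermite(n)) == [n * a for a in hermite(n - 1)]
```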
First we see how the Krawtchouk polynomials fit into the above scheme. Then calculation
of the Krawtchouk expansions is considered.
For the binomial distribution, the measure p_t is given by the discrete weights p_N(x) =
2^{−N} (N choose π) with π = (N + x)/2. We have the moment generating function:

(1 + v)^{(N+x)/2} (1 − v)^{(N−x)/2} = Σ_{n=0}^N v^n K_n(x, N)

where N = π + ν is the total number of steps and x = π − ν is the position of the walker
after N steps. This may be rewritten in the form

Now substitute v = tanh s to get
5.3.1 Proposition.

e^{sx} = Σ_{n=0}^N cosh^{N−n}(s) sinh^n(s) K_n(x, N)    (5.3.1.1)

2π(N − 2k)/(1 + N),  0 ≤ k ≤ N
If f has Krawtchouk-expansion Σ f_n K_n, then, denoting the finite Fourier trans-
form of f by f̂,

f̂(s) = (N + 1)^{−1} Σ_x e^{−isx} f(x)

where the sum on x runs from −N, ..., N in steps of two, (5.3.1.1) yields

5.3.1.1 Proposition.

f_n = i^n Σ_s f̂(s) cos^{N−n}(s) sin^n(s)    (5.3.1.2)
2) Denote here d/ds by D. Thus, f(x) = e^{xD} f(s)|_0, which may be denoted just e^{xD} f(0),
as we will do below. From (5.3.1) we have

e^{xD} f(s) = Σ_{n=0}^N cosh^{N−n}(D) sinh^n(D) f(s) · K_n(x, N)

1) As in (5.3.1.2):

f_n = i^n Σ_s cos^N(s) tan^n(s) f̂(s)

Observe that the exponent in the first factor may be computed as (1/2)(N − |N − 2n|).
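As a numerical check (a sketch, not the book's procedure), the K_n can be generated directly from the generating function (1+v)^{(N+x)/2}(1−v)^{(N−x)/2} above, and the orthogonality Σ_x 2^{−N}(N choose π) K_n K_m = (N choose n) δ_{nm}, which follows from that generating function, verified exactly:

```python
# Build K_n(x, N) by multiplying out the generating function and verify
# orthogonality under the binomial weights 2^{-N} C(N, (N+x)/2).
from math import comb
from fractions import Fraction

N = 6

def krawtchouk_row(x):
    """[K_0(x,N), ..., K_N(x,N)] as coefficients of v in the generating function."""
    pi = (N + x) // 2                     # number of +1 steps
    poly = [Fraction(1)]
    for _ in range(pi):                   # multiply by (1 + v)
        poly = [a + b for a, b in zip(poly + [0], [0] + poly)]
    for _ in range(N - pi):               # multiply by (1 - v)
        poly = [a - b for a, b in zip(poly + [0], [0] + poly)]
    return poly[: N + 1]

K = {x: krawtchouk_row(x) for x in range(-N, N + 1, 2)}
for n in range(N + 1):
    for m in range(N + 1):
        ip = sum(Fraction(comb(N, (N + x) // 2), 2**N) * K[x][n] * K[x][m]
                 for x in K)
        assert ip == (comb(N, n) if n == m else 0)
```

Exact rational arithmetic (`Fraction`) keeps the orthogonality check free of floating-point error.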
In this section we present two methods for calculating the expansion of a function for the
general Meixner case. The first method is adapted for numerical computation, the second
for symbolic calculation.
/n = { V " / ) / n !
(y"/)= fe*"^"^V(is)"f(s)ds
Proof: Recall the (inverse) Fourier transform of the measure p_t given by (5.1.1.1),

∫ e^{isx} p_t(dx) = e^{tH(is)}

Thus,

Using Parseval's formula, for any function g we have, in the sense of distributions,

From the Fourier inversion formula we have the action of the operator V(d/dx):

f(x) = ∫ e^{isx} f̂(s) ds

Combining this with formula (5.4.1.1) for calculating expected values yields the result.
•
e^{xU(s) − tM(s)} = Σ_{n=0}^∞ (s^n/n!) J_n(x, t)
We extend (5.3.1) for the Krawtchouk case to the general Meixner case.
5.4.2.1 Proposition. For the Meixner polynomials we have the expansion

e^{xs} = e^{tH(s)} Σ_{n=0}^∞ (V(s)^n / n!) J_n(x, t)

Using the operational formula e^{aD} f(x) = f(x + a), put x → a, s → D, then x → 0, a → x,
which gives the expansion

Thus
5.4.2.2 Proposition. For the Meixner polynomials the expansion coefficients are given
by
I. Representations of the symmetric group

Given n > 0, S_n, the symmetric group, is the group of permutations of n symbols;
equivalently, the group of one-to-one mappings of a set of cardinality n onto itself. The
composition of mappings provides the group law. Here we will discuss the basic features
of the symmetric groups and their representation theory. Some applications including
MacMahon's Master Theorem and Molien's Theorem will be presented.
If one cycle acts on another by conjugation, the result is that of applying the substi-
tution to the cycle acted upon. An example illustrates this:

(532146)^{−1}(524)(532146) = (641235)(524)(532146)
                           = (316)
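The example can be verified mechanically (a sketch; the convention that conjugation relabels the entries of the cycle through x is assumed here):

```python
# Check the conjugation example: x^{-1}(524)x = (316) for x = (532146).
def cycle_to_map(cycle, n=6):
    """Permutation of {1,...,n} sending each cycle entry to the next one."""
    m = {i: i for i in range(1, n + 1)}
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        m[a] = b
    return m

def compose(f, g):                     # (f after g): x -> f(g(x))
    return {x: f[g[x]] for x in g}

x = cycle_to_map([5, 3, 2, 1, 4, 6])
x_inv = {v: k for k, v in x.items()}
sigma = cycle_to_map([5, 2, 4])

# Conjugation relabels each entry a of the cycle to x(a): 5->3, 2->1, 4->6.
conj = compose(x, compose(sigma, x_inv))
assert conj == cycle_to_map([3, 1, 6])
# x^{-1} is the cycle (641235), as used in the text.
assert x_inv == cycle_to_map([6, 4, 1, 2, 3, 5])
```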
1.1.1 Proposition. The conjugacy classes are determined according to the decompositions into disjoint cycles.
We denote by

(n^{p_n} ... 2^{p_2} 1^{p_1})

the conjugacy class of elements having p_j j-cycles. Observe that Σ_j j p_j = n.

1.1.2 Definition. The alternating representation of S_n is the mapping x → sgn(x).

Acting on n symbols we have
A standard or Young tableau is a Young frame filled with the integers 1, 2, ..., n so that
the numbers are strictly increasing along each row and down each column. For example,
1 3 4 6 7
2 5 8 10
9 12
11 13
14
is a standard tableau of the above shape. Many properties of Young tableaux depend on
the notion of hooks.
1.2.1 Definition. The hook corresponding to a cell in a Young frame is the cell itself
plus the cells lying below and to its right.
In the above tableau, e.g., 5, 8, 10, 12, 13 define the hook of 5 having length 5. A beautiful
result of Frame, Robinson and Thrall counts the number of standard tableaux of a given
shape.
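The Frame-Robinson-Thrall count can be sketched as follows (a minimal illustration, not the book's code; the hook length of a cell is arm + leg + 1):

```python
# Frame-Robinson-Thrall: d_lambda = n! / (product of the hook lengths).
from math import factorial

def hook_count(shape):
    """Number of standard tableaux of the given shape (a partition)."""
    n = sum(shape)
    cols = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    prod = 1
    for i, row in enumerate(shape):
        for j in range(row):
            arm, leg = row - j - 1, cols[j] - i - 1
            prod *= arm + leg + 1
    return factorial(n) // prod

assert hook_count([2, 2]) == 2          # the [2,2] example in the text
assert hook_count([3, 2, 2, 1]) == 70   # cf. the shape [3221] discussed below
```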
114 CHAPTER 5
d_λ = n! / ∏ (hook lengths)

For the tableau

1 3
2 4

we have
QP = ((1 - (12))(1 - (34))) ((1 + (13))(1 + (24)))
The hook formula gives d_λ = 2, with the other tableau given by exchanging 2 and 3 in the
one above.
YOUNG TABLEAUX AND COMBINATORIAL ENUMERATION 115
Schur and Weyl developed the connection between the representations of the symmetric
group and representations of GL(N), the general linear group of N × N invertible matrices
(here we are interested only in real or complex entries). Given a vector space V, the n-
fold tensor product V^{⊗n} consists of linear combinations of formal products v_1 ⊗ v_2 ⊗ ⋯ ⊗ v_n
of vectors which are multilinear, i.e., scalars factor out, and the distributive law holds, i.e.,

v ⊗ (x + y) = v ⊗ x + v ⊗ y

Remark. Sometimes it is convenient to drop the tensor product signs and think of products
of noncommuting variables or concatenation, as in multiplying words of a formal language.

If {e_1, e_2, ..., e_N} is a basis for V then a basis for V^{⊗n} is given by

e_{i_1 i_2 ... i_n} = e_{i_1} ⊗ e_{i_2} ⊗ ⋯ ⊗ e_{i_n}

corresponding to N choices per each of n positions. Thus, dim V^{⊗n} = N^n.
Example. For [21], use the symmetrizer (1 − (13))(1 + (12)) corresponding to the tableau

1 2
3

We write just abc for the tensor e_1 ⊗ e_2 ⊗ e_3. Thus,

v_1 = abc + bac − cba − cab
v_2 = (23)v_1 = acb + bca − cab − cba

With this as basis, one checks the representation; e.g., the transposition (23) exchanges
v_1 and v_2 and so is represented by the matrix

(0 1)
(1 0)
That is, A effectively acts on each factor independently. We want to calculate the character
of this action of A, i.e., the trace. By a similarity transformation, we may assume A is
upper triangular with the eigenvalues, say, x_1, ..., x_N on the diagonal. Acting on n-tensors,
contributions to the trace will consist of monomials in the x_i.

For each shape λ, applying the Young symmetrizer corresponding to a given standard
tableau to an n-tensor gives a tensor of symmetry class λ. Since repetitions are allowed in
any given basis element of V^{⊗n}, one considers column-strict tableaux, where along rows
only nondecreasing numbering is required. Each such tableau corresponds to a given basis
tensor in the symmetry class and the contribution to the trace is given by the corresponding
monomial in the variables x_i. The sum of these monomials gives the character of the
representation of GL(N) and is denoted {λ}, the Schur function or S-function. To
illustrate:
Example. Consider the action of a 2 × 2 matrix on V^{⊗5} corresponding to the shape [32].
Then the S-function {32} = x_1³x_2² + x_1²x_2³, corresponding to the diagrams

1 1 1    1 1 2
2 2      2 2
Two particularly important classes are the symmetric tensors and the antisymmetric
tensors. The symmetric tensors are gotten by symmetrizing over all positions. On n-
tensors, this action is the Young symmetrizer for the shape [n]. Similarly, antisymmetrizing
over all indices, i.e., putting in factors according to the parity of the permutations, is
the Young symmetrizer for shape [1^n]. The corresponding representations of GL(N) are
respectively the fully symmetric and antisymmetric representations. The corresponding
S-functions are respectively the homogeneous symmetric functions and the elementary
symmetric functions. That is,

1.3.2.1 Proposition. The S-function {n} is the homogeneous symmetric function con-
sisting of the sum over all monomials homogeneous of degree n in the variables x_1, ..., x_N.
The S-function {1^n} is the elementary symmetric function consisting of the sum of all
monomials x_{i_1} ⋯ x_{i_n} corresponding to all choices of n distinct indices.
1.3.2.2 Definition. We denote by h_n and a_n the n-th homogeneous and elementary
symmetric functions respectively, in a given set of variables x_1, ..., x_N. We denote the
power sum symmetric functions by

s_n = Σ_{i=1}^N x_i^n

Remark. Note that if A is an N × N matrix with eigenvalues x_1, ..., x_N, then the power
sum s_n = tr A^n.
{λ} = det(x_i^{λ_j + N − j}) / det(x_i^{N − j})
    = det(h_{λ_i + j − i})
    = (1/n!) Σ_{g ∈ S_n} χ^λ(g) s_{ρ(g)}

with det(x_i^{N−j}) the Vandermonde determinant in the variables x_1, ..., x_N, and where
s_ρ denotes ∏_j s_j^{p_j} for a permutation of cycle type ρ = (n^{p_n} ⋯ 1^{p_1}). The equality of
the first two lines is the Jacobi-Trudi identity, and the last formula noted is the Frobenius-
Schur formula expressing the S-function {λ} as the Fourier transform of monomials in
the power sums.
We will see some more formulas for S-functions later on in the Chapter.
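The Jacobi-Trudi identity admits a small numerical check (a sketch; the shape [3,2] and the evaluation point (2, 3) are arbitrary choices, not from the text): the determinant of h's is compared with the tableau definition of {λ} as a sum over column-strict fillings.

```python
# Compare det(h_{lambda_i + j - i}) with the column-strict-tableau sum
# defining {lambda}, for lambda = [3,2] in two variables at (x1, x2) = (2, 3).
from itertools import product

x = (2, 3)
lam = (3, 2)

def h(n):
    """h_n(x1, x2): sum of all monomials of degree n in two variables."""
    if n < 0:
        return 0
    return sum(x[0]**a * x[1]**(n - a) for a in range(n + 1))

def schur_tableaux(lam):
    """Brute force: rows weakly increase, columns strictly increase."""
    cells = [(i, j) for i, r in enumerate(lam) for j in range(r)]
    total = 0
    for fill in product(range(1, len(x) + 1), repeat=len(cells)):
        t = dict(zip(cells, fill))
        if any(j and t[(i, j - 1)] > t[(i, j)] for i, j in cells):
            continue
        if any(i and t[(i - 1, j)] >= t[(i, j)] for i, j in cells):
            continue
        m = 1
        for v in fill:
            m *= x[v - 1]
        total += m
    return total

# 2x2 Jacobi-Trudi determinant: det [[h_3, h_4], [h_1, h_2]]
jt = h(lam[0]) * h(lam[1]) - h(lam[0] + 1) * h(lam[1] - 1)
assert jt == schur_tableaux(lam)
```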
Here we discuss MacMahon's Master Theorem and Molien's Theorem. We present some
simple examples to show the basic ideas behind the theorems.
So take as variables ξ_1, ..., ξ_N, and consider polynomials in these variables. The
symmetric representation of a matrix is the action on polynomials induced by the substitution
ξ → Aξ, where we are actually working with the tensor components, as usually done when working
with vectors. Now it is easy to see directly that if A is upper triangular, with eigenvalues
x_1, ..., x_N, then the trace of this action on the space of polynomials homogeneous of degree
n is exactly h_n(x_1, ..., x_N), the n-th homogeneous symmetric function. We recall, as are
readily proved,

∏_{i=1}^N (1 + v x_i) = Σ_n v^n a_n(x_1, ..., x_N)

∏_{i=1}^N (1 − v x_i)^{−1} = Σ_n v^n h_n(x_1, ..., x_N)

Thus,

1/det(I − tA) = Σ_{n=0}^∞ t^n tr Sym_n A
1/det(I − tA) = ∏_{i=1}^N (1 − t x_i)^{−1} = Σ_{n=0}^∞ t^n h_n(x_1, ..., x_N)

as seen above. •
F_{n+1} = F_n + F_{n−1}

1/det(I − tA) = 1/(1 − t − t²) = Σ_{n=0}^∞ t^n F_{n+1}

Expanding,

1/(1 − t(1 + t)) = Σ_{j=0}^∞ t^j (1 + t)^j

Contributions to the coefficient of t^n come from the terms with j = n − k, namely
(n−k choose k). Thus, with a slight change of notation

1.4.1.3 Proposition. The Fibonacci numbers satisfy

F_{n+1} = Σ_{0 ≤ k ≤ n/2} (n−k choose k)
It is clear that a similar approach works for any recurrence with constant coefficients,
choosing for A a companion matrix, i.e., one with ones on the superdiagonal, except for
the last row, which contains the coefficients of the recurrence.
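The Fibonacci example can be checked by expanding 1/det(I − tA) as a power series (a sketch, not the book's computation):

```python
# For the companion matrix A = [[0, 1], [1, 1]] of F_{n+1} = F_n + F_{n-1},
# det(I - tA) = 1 - t - t^2, so h_n(eigenvalues) = F_{n+1}.
from math import comb

def series_inverse(p, terms):
    """Power series coefficients of 1/p(t), where p(0) = 1."""
    c = [0] * terms
    c[0] = 1
    for n in range(1, terms):
        c[n] = -sum(p[k] * c[n - k] for k in range(1, min(n, len(p) - 1) + 1))
    return c

fib = series_inverse([1, -1, -1], 12)        # 1/(1 - t - t^2)
assert fib[:8] == [1, 1, 2, 3, 5, 8, 13, 21]

# Proposition 1.4.1.3: F_{n+1} = sum_k C(n-k, k)
for n in range(12):
    assert fib[n] == sum(comb(n - k, k) for k in range(n // 2 + 1))
```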
Example. The Tchebychev polynomials of the second kind have the generating function

1/(1 − 2st + t²) = Σ_{n=0}^∞ t^n U_n(s)

We have y_1 = ξ_2, y_2 = −ξ_1 + 2sξ_2. Applying MacMahon's Theorem as in the previous
example, we find

U_n(s) = Σ_k (n−k choose k) (−1)^k (2s)^{n−2k}
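A quick check of this closed form against the three-term recurrence U_{n+1} = 2s U_n − U_{n−1} (a sketch; integer sample points keep the comparison exact):

```python
# Closed form U_n(s) = sum_k (-1)^k C(n-k, k) (2s)^{n-2k} vs. the recurrence.
from math import comb

def U_closed(n, s):
    return sum((-1)**k * comb(n - k, k) * (2 * s)**(n - 2 * k)
               for k in range(n // 2 + 1))

def U_recur(n, s):
    a, b = 1, 2 * s                 # U_0, U_1
    for _ in range(n):
        a, b = b, 2 * s * b - a
    return a

for n in range(10):
    for s in (-2, 0, 1, 3):
        assert U_closed(n, s) == U_recur(n, s)
```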
Now we consider any matrix representation, say of degree N, of a finite group. An element
g is represented as an N × N matrix, say A(g), and acts on the space of polynomials via
the symmetric representation. A polynomial p(x) is invariant under the group if it is
unchanged by this action, i.e., p(gx) = p(x) for all g ∈ G, where gx denotes the variables
x transformed according to the matrix A(g). The homogeneous invariant polynomials are
a subspace of the space of homogeneous polynomials of given degree, a finite-dimensional
space. Thus, by Molien's Theorem, the generating function for the dimensions of the
spaces of invariant polynomials homogeneous of degree n is given by the average

(1/|G|) Σ_{g ∈ G} 1/det(I − tA(g))
Proof: We know from MacMahon's Theorem that this is 1/|G| times the sum over
G of the trace of the symmetric representation. Recall from our discussion of the group
algebra in Chapter 4 that the sum (1/|G|) Σ_g g is the projection onto the trivial representation
in the group algebra. Alternatively, think of decomposing into irreducible representations;
then the Schur orthogonality relations imply that summing any character of an irreducible
representation yields zero except for the trivial representation, when the result is |G|. So
summing any character over all of the group yields |G| times the multiplicity of the trivial
representation in the given representation. In the present context, the trivial representa-
tions are just the invariant polynomials. •
Let's look at some examples.
Example. For the cyclic group of order 3, we have, as we saw in our study of representations
of cyclic groups in Chapter 4, two elements x and x² with characteristic polynomial det(I −
tx) = det(I − tx²) = 1 − t³, where we take for x the matrix of the cyclic permutation of
the coordinates. The 3 × 3 identity matrix has characteristic polynomial (1 − t)³. Thus we
have the generating function for the dimensions d_n of the invariants:

Σ_{n=0}^∞ d_n t^n = (1/3) · 1/(1 − t)³ + (2/3) · 1/(1 − t³)

For n = 3, we have, in the variables ξ_1, ξ_2, ξ_3, as invariants: a_3 = ξ_1ξ_2ξ_3, the homogeneous
symmetric function h_3, the power sum ξ_1³ + ξ_2³ + ξ_3³, and the Vandermonde determinant
(ξ_1 − ξ_2)(ξ_2 − ξ_3)(ξ_3 − ξ_1).
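The series can be checked against a direct count (a sketch: for a permutation action, the invariants in each degree are spanned by sums over orbits of monomials, so the dimension equals the number of orbits):

```python
# Molien series for C3 permuting three variables, versus an orbit count.
from itertools import combinations_with_replacement
from fractions import Fraction
from math import comb

def dim_invariants(n):
    """Number of C3-orbits of degree-n monomials in three variables."""
    monos = combinations_with_replacement(range(3), n)
    orbits = set()
    for m in monos:
        orbit = [tuple(sorted((v + shift) % 3 for v in m)) for shift in range(3)]
        orbits.add(min(orbit))
    return len(orbits)

def molien_coeff(n):
    # [t^n] of (1/3)/(1-t)^3 + (2/3)/(1-t^3)
    val = Fraction(comb(n + 2, 2), 3) + Fraction(2 if n % 3 == 0 else 0, 3)
    assert val.denominator == 1
    return int(val)

for n in range(8):
    assert molien_coeff(n) == dim_invariants(n)
# Degree 3: dimension 4, matching the four invariants listed in the text.
assert molien_coeff(3) == 4
```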
Example. For the symmetric group S_3, we use the 3 × 3 matrix representation of the action
as permutations on three symbols. The 3-cycles have characteristic polynomial 1 − t³ as
in the previous example. The transpositions have characteristic polynomial (1 − t)(1 − t²),
as they fix one symbol and exchange the other two. With the identity this gives

Here the invariants are symmetric polynomials. To expand the right-hand side, we observe
the identity

1/((1 − t)(1 − t²)) = (1/2) [1/(1 − t)² + 1/(1 − t²)]
Consider a process where an operation consists of creating a new file and adding an item
either to one of the previously existing files or to the new file. The state of the system is
described by the number of items in the files. After N operations, there are N files and
N items. The occupancy numbers, i.e., the number of items in each file, give a partition
of N and the system can be represented by a Young frame.
d_λ = Σ_μ d_μ

where, for λ a partition of N, the sum is over partitions μ of N − 1 such that the shape λ
comes from a shape μ by adjoining a single box at the end of a row or column. We define
evolution functions φ_λ by the equation

where the factors φ_{λ|μ} correspond to the transition μ → λ. For example, for N = 8,
λ = [3221] comes from [2221], [3211], and [322]. The hook formula verifies the relation
among the degrees: d_λ = 70 = 14 + 35 + 21. The transition [3211] → [3221] is indicated
by the diagram
2.2 Theorem. For a class function f(ρ) on S_N, we have the Plancherel identity

Consequently,
2.3 Proposition. If d_λ φ_λ = f(λ) for some class function f, then the normalizing
constant in the corresponding generalized Plancherel measure is the average value of f²
over the group.

If

then we have what is usually called the Plancherel measure on the shapes. This is the case
where φ_λ = 1 for all λ. These Plancherel measures have been studied by Versik-Kerov[82]
and have come up in work of Logan-Shepp[57].
χ'_ρ = sgn(ρ) χ_ρ
{A} = det{hxi+j-i)
Remark. An important feature to note is that the relationships connecting the h's, a's,
s's, and {λ}'s hold regardless of the existence of underlying variables that make these
symmetric functions. The determinantal relations as well as the Frobenius-Schur formula only
depend on the following generating function relations, given a formal series L(t) that is
the generating function for the variables corresponding to the power sums:

L(t) = Σ_{n=1}^∞ (s_n/n) t^n

Σ_{n=0}^∞ h_n t^n = exp(L(t))

Σ_{n=0}^∞ a_n t^n = exp(−L(−t))
λ = (a_1 a_2 ... a_s | b_1 b_2 ... b_s)

where, moving along the principal diagonal, the a_i count the number of boxes to the right,
including the diagonal box, and similarly, the b_i count the number of boxes below and
including the i-th diagonal box. For λ = [5422],

we see that s = 2 and λ = (53|43). Observe that in this notation, the conjugate shape
is (b_1 ... b_s | a_1 ... a_s). In general,

a_i = λ_i − i + 1,  b_j = λ'_j − j + 1
Now we fix a parameter q and define corresponding functions φ. With φ_0 = 1, we have
for integer n > 0

φ_n = ∏_{j=0}^{n−1} (1 + jq),   φ̄_n = ∏_{j=0}^{n−1} (1 − jq)

and, for a partition μ,

φ(μ) = ∏_i φ_{μ_i}

2.2.1 Definition. For a conjugacy class ρ denote the number of non-zero parts by r(ρ),
i.e., if ρ = (n^{p_n} ... 2^{p_2} 1^{p_1}), then r(ρ) = Σ_j p_j.

We define the class function

F(ρ) = q^{N − r(ρ)}

which is F(ρ). We have then to check the generating function relations connecting h_n and
a_n with the variables s_n. We have for the s_n: s_n = q^{n−1}, so that

Σ_{n=0}^∞ h_n t^n = Σ_{n=0}^∞ (φ_n/n!) t^n = (1 − qt)^{−1/q}

This clearly equals exp(L(t)) as required. The result for the a_n follows by mapping q → −q
for the conjugate φ̄_n. •
F(λ) = det(φ_{λ_i − i + j})

Proof: In the expansion, the column indices j are some permutation of the correspond-
ing row indices i. The subscripts on φ in any given summand are of the form λ_i − i + j_i,
summing to

Σ_i (λ_i − i + j_i) = Σ_i λ_i = N
Proof: Think of F(λ) given as the determinant in the previous Lemma. The first
column of the determinant is

where there are either φ_0's or 0's below row s. Since the subscripts increase across each
row, the determinant clearly has a factor of ∏ φ_{a_i}. Similarly, consider the conjugate shape
and replace φ by φ̄. In that form there is clearly a factor of ∏ φ̄_{b_i}. •
deg(F(λ)) = N − s = deg φ_λ

since Σ (a_i + b_i) = N + s. By the previous Lemma, then, deg F(λ) ≥ N − s. We will show
that the maximal degree of any φ_{(μ)} in the expansion of the determinant is no greater than
N − s.

In det(φ_{λ_i − i + j}), row i has a φ_0 entry just in case λ_i < i, in which case φ_0 occurs in
column i − λ_i. Note that if, say, i_1 > i_2, then i_1 − λ_{i_1} = i_2 − λ_{i_2} implies 0 < i_1 − i_2 =
λ_{i_1} − λ_{i_2} ≤ 0, a contradiction. That is, the φ_0's in different rows occur in different columns.
The degree of a non-zero product φ_{(μ)} is just N minus the number of factors having non-
zero subscripts, since μ is a partition of N and each non-zero μ_j contributes a factor φ_{μ_j}
of degree μ_j − 1. Thus, the terms of maximal degree have the most factors of φ_0. Since
s = #{λ_i : λ_i ≥ i}, the number of rows containing φ_0 equals

#{λ_i : λ_i < i} = b_1 − s

as b_1 is the number of rows of λ, i.e., λ'_1. So, a non-zero φ_{(μ)} has b_1 factors in all, with at
most b_1 − s factors of φ_0, i.e., at least s factors φ_{μ_j} with μ_j ≠ 0. Hence the result. •
Proof: By the Lemmas preceding Lemma 2.2.6, we have that F(λ) = c_λ φ_λ for some
constant c_λ, independent of q. Evaluating at q = 0, Lemma 2.2.6 completes the proof.
•
Combining this with the Plancherel identity, we have
2.2.8 Theorem. For any real q, the measure on shapes

P(Λ = λ) = d_λ² φ_λ² / (N! φ_N(q²))

is a probability measure.
Proof: We just have to check the average value of F² over the group. This is clearly
the same as F([N]) with q² replacing q. The result then follows from Lemma 2.2.2. •
Another observation:
128 CHAPTERS
Example. Take q = 1/2. The partitions with non-zero probability are of the form [N] or
[N − j, j], with 1 ≤ j ≤ N/2. In Frobenius notation, for j ≥ 2, [N − j, j] = (N − j, j − 1 | 2, 1),
with [N − 1, 1] = (N − 1 | 2). Thus,

And

P(Λ = [N]) = φ_N(1/2)² / (N! φ_N(1/4)) = 6(N + 1) / ((N + 2)(N + 3))

The probability that there are two nonempty files is N(N − 1)/((N + 2)(N + 3)). The expected num-
ber of nonempty files, the possibilities being either one or two only, is
a. For N = 2,

P(Λ = [2]) = (1 + q)² / (2(1 + q²)),   P(Λ = [11]) = (1 − q)² / (2(1 + q²))

b. For N = 3,

P(Λ = [3]) = (1 + q)²(1 + 2q)² / (6(1 + q²)(1 + 2q²))

P(Λ = [21]) = 4(1 − q)²(1 + q)² / (6(1 + q²)(1 + 2q²))

P(Λ = [111]) = (1 − q)²(1 − 2q)² / (6(1 + q²)(1 + 2q²))
We can consider the limit as q → ∞. Note that a shape consisting of a single hook is
of the form [N − j, 1^j]. We have

d_{[N−j, 1^j]} = (N−1 choose j)

and

P(Λ = [N − j, 1^j]) → (N−1 choose j)² (j!(N − 1 − j)!)² / (N!(N − 1)!) = 1/N

so that in the limit the measure is uniform on the N hook shapes.
Finally, we remark that by choosing q to be purely imaginary, we construct a symmetric
measure, assigning equal probabilities to conjugate partitions:

P(Λ = λ) = P(Λ = λ') = d_λ² |φ_λ|² / (N! φ_N(|q|²))
Using the theory of S-functions, we can develop general formulas for the inverses of
circulant matrices of arbitrary (fixed) size. Let A be N × N with minimal polynomial

p(x) = x^m − Σ_{j=0}^{m−1} α_{m−j} x^j

and set

π(x) = 1 − Σ_{j=1}^m α_j x^j
Σ_{j=0}^{m−1} (tA)^j π_{m−1−j}(t) − Σ_{j=0}^{m−1} (tA)^{j+1} π_{m−1−j}(t)

= Σ_{j=1}^{m−1} (tA)^j (π_{m−1−j}(t) − π_{m−j}(t)) + π_{m−1}(t) − t^m Σ_{j=0}^{m−1} α_{m−j} A^j

= π(t)

•
We can find a formula for any power of A in terms of S-functions. We associate
S-functions to the polynomial p(x) by considering, with a_0 = 1, a_n = (−1)^{n+1} α_n, as
elementary symmetric functions in the theory discussed above, as in fact they are the ele-
mentary symmetric functions of the eigenvalues of A for p(x) the characteristic polynomial.
We associate h_n's via the generating function π(x)^{−1} = Σ x^n h_n.
a_n = {1^n} = det(h_{1−i+j})

And in the determinant for {n, 1^m}, expanding by cofactors along the first row, the minors
that arise all have just h_1 and h_0 on the diagonal, giving corresponding a_k's. The elements
of the first row are of the form h_{n+k} and the indicated result follows. •
3.3 Theorem. Let A have minimal polynomial p(x) = x^m − Σ_{j=0}^{m−1} α_{m−j} x^j. Then

A^n = Σ_{j=0}^{m−1} {n − m + 1, 1^{m−1−j}} (−1)^{m−1−j} A^j
Proof: In Prop. 3.1, expand the inverses, using a geometric series for the left-hand
side and on the right-hand side expanding in terms of the h_n. The geometric series will
converge in a neighborhood of the origin in C for |t| less than the reciprocal of the spectral
radius of A. (Optionally, one can consider these as calculations with formal series.)
Σ_{k=0}^∞ t^k A^k = (Σ_{k=0}^∞ t^k h_k) Σ_{j=0}^{m−1} π_{m−1−j}(t) (tA)^j

and the result follows on comparing coefficients of t^n on both sides. •
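The theorem admits a numerical check (a sketch under assumptions made here, not taken from the text: a particular cubic p, and the computation of hook S-functions from the identity {r, 1^k} = Σ_{i=0}^k (−1)^i a_{k−i} h_{r+i}, which follows from the generating relations):

```python
# Verify A^n = sum_j {n-m+1, 1^{m-1-j}} (-1)^{m-1-j} A^j for the companion
# matrix of p(x) = x^3 - 2x^2 - x - 1, i.e. alpha = (2, 1, 1).
m = 3
alpha = [2, 1, 1]
A = [[0, 1, 0], [0, 0, 1], [1, 1, 2]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

TERMS = 24
# h_n from 1/pi(x), pi(x) = 1 - alpha_1 x - ... - alpha_m x^m
h = [1] + [0] * (TERMS - 1)
for n in range(1, TERMS):
    h[n] = sum(alpha[j - 1] * h[n - j] for j in range(1, min(n, m) + 1))

def a(n):
    """Elementary symmetric functions a_n = (-1)^{n+1} alpha_n, a_0 = 1."""
    if n == 0:
        return 1
    return (-1) ** (n + 1) * alpha[n - 1] if 1 <= n <= m else 0

def hook(r, k):
    """Hook S-function {r, 1^k} = sum_i (-1)^i a_{k-i} h_{r+i}."""
    return sum((-1) ** i * a(k - i) * h[r + i] for i in range(k + 1))

identity = [[1 if i == j else 0 for j in range(m)] for i in range(m)]
for n in range(m, 12):
    power = identity
    for _ in range(n):
        power = matmul(power, A)
    combo, Aj = [[0] * m for _ in range(m)], identity
    for j in range(m):
        c = hook(n - m + 1, m - 1 - j) * (-1) ** (m - 1 - j)
        for r in range(m):
            for s in range(m):
                combo[r][s] += c * Aj[r][s]
        Aj = matmul(Aj, A)
    assert power == combo
```

For n = m the formula reduces to Cayley-Hamilton: A³ = 2A² + A + I for this p.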
Now recall the Kronecker product of matrices. This is the same as the tensor product
of the matrices. If T and A act on V, then A ⊗ T acts on v_1 ⊗ v_2 ∈ V ⊗ V as

extending to all of V ⊗ V by linearity. Notice that if {ξ_i} and {η_j} denote the eigenvalues
of A and T respectively, then the eigenvalues of A ⊗ T are {ξ_i η_j}. So det(I ⊗ I − A ⊗ T) =
∏_{ij} (1 − ξ_i η_j). Now the idea is to keep A and take the 'partial determinant' with respect
to T, yielding

f(A) = det(I ⊗ I − A ⊗ T)    (3.1)
Proof: To get the first line, in eq. (3.1), interchange the order of taking determinants:

= det π(T)

By Prop. 3.1,

π(I ⊗ T) = (I ⊗ I − A ⊗ T) Σ_{j=0}^{m−1} π_{m−1−j}(I ⊗ T)(A ⊗ T)^j

so that

det(f(A)) = f(A) · det Σ_{j=0}^{m−1} π_{m−1−j}(I ⊗ T)(A ⊗ T)^j
At this point, we need another result on S-functions. It was noted by Foulkes that any
S-function can be expressed simply as a determinant of hook S-functions:

{λ} = det({n_i, 1^{d_j}})

for the partition λ = (n_1, n_2, ... | d_1 + 1, d_2 + 1, ...) in the standard Frobenius notation,
where the entries are S-functions for the hook diagrams indicated.
and, for λ = [54], with n = (5, 3), d = (1, 0),

{54} = det
  {51} {31}
  {5}  {3}
3.2 CIRCULANTS
3.2.2 Lemma. Let T be the companion matrix for f(x) = 1 − Σ_{j=1}^l β_j x^j. Then, for
k ≥ l,

(T^k)_{ij} = (−1)^{l−j} {k − l + i, 1^{l−j}}

with the S-functions corresponding to the b_n.
Proof: The Cayley-Hamilton theorem yields T^l = Σ_{j=0}^{l−1} β_{l−j} T^j. Consider the action
of T as a matrix on the column vector T = (I, T, ..., T^{l−1})^t, the t denoting transpose:
T^k T = (T^k, T^{k+1}, ..., T^{k+l−1})^t. From Theorem 3.3, we have, for r ≥ l,

T^r = Σ_{j=0}^{l−1} {r − l + 1, 1^{l−j−1}} (−1)^{l−j−1} T^j

Substituting into (T^k, T^{k+1}, ..., T^{k+l−1})^t for k ≤ r < k + l and matching with the action
on T yields the required form. •
where |i| = i_1 + i_2 + ⋯ + i_k, and the reduced determinantal form refers to the S-functions
corresponding to b_k = (−1)^{k+1} β_k.
Example. One can apply this result to find the following formulae:
3.2.5 Definition. Given n > 0, we let h'_r denote the truncated homogeneous polynomial,
which is the homogeneous symmetric function of degree r in given variables, {ξ_i}, but
only including monomials with the property that no variable occurs with an exponent
greater than n − 1.

So, we write

In general, we have
3.2.6 Proposition. Write f^{(n)}(x) = 1 − Σ_{j=1}^l β_j^{(n)} x^j. For 0 ≤ k < n, we have

= f^{(n)}(x^n) Σ_j x^j h'_j

However, the determinant on the left-hand side equals, as we just saw, Σ_r h'_r x^r. Equating
coefficients of x^{j+rn} yields the result. •
Finally,
3.2.7 Theorem. For a circulant f(U),

Proof: To check the limits of summation over r, observe that, by construction, h'_s = 0
if s > (n − 1)l. This says that j + rn ≤ (n − 1)l, or r ≤ l − (l + j)/n < l. •
Here we indicate the connection between Young tableaux and parallel processing.
Françon[32] recently proposed a quantitative approach to the serializability problem by
counting commutation classes, i.e., classes of words that can be derived by commuting
some letters in a word. He proved that there is a one-to-one correspondence between cer-
tain commutation classes and Young tableaux of a given shape. A basic question in this
area is to count the number of Young tableaux with a given number, or at most a given
number, of rows. Work of Regev[76], Askey-Regev[7], and D. Gouyou-Beauchamps[43] is
of particular interest in this regard.

Remark. There are other applications of Young tableaux as well. E.g., the connection
between Young tableaux and invariant theory leads to applications in such fields as com-
puter vision. See, e.g., S.S. Abhyankar[1]. Also see Knuth's discussion of Young tableaux,
[53], v. 3, p. 48ff.
Giving exact formulas for the number Y_{n,k} of Young tableaux having n cells and at
most k rows is in general an open problem. For k = 2 and k = 3, exact formulae have
been given by Regev[76]:

where C_n = (2n)!/(n!(n + 1)!) is the Catalan number and M_n the Motzkin number. D.
Gouyou-Beauchamps[43] proved that

and

Y_{n,5} = Σ_i (3! n! (2i + 2)!) / ((n − 2i)! i! (i + 1)! (i + 2)! (i + 3)!)
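A check of the small-k counts (a sketch, assuming the standard forms of these results: Y_{n,2} = (n choose ⌊n/2⌋) and Y_{n,3} = M_n), with the tableaux counted by the hook formula:

```python
# Count standard Young tableaux with at most k rows and compare with the
# central binomial (k = 2) and Motzkin (k = 3) numbers.
from math import factorial, comb

def syt(shape):
    """Number of standard tableaux of a shape, by the hook formula."""
    cols = [sum(1 for r in shape if r > j) for j in range(shape[0])]
    prod = 1
    for i, row in enumerate(shape):
        for j in range(row):
            prod *= (row - j - 1) + (cols[j] - i - 1) + 1
    return factorial(sum(shape)) // prod

def partitions_at_most(n, k, maxpart=None):
    """Partitions of n into at most k parts."""
    if maxpart is None:
        maxpart = n
    if n == 0:
        yield ()
        return
    if k == 0:
        return
    for first in range(min(n, maxpart), 0, -1):
        for rest in partitions_at_most(n - first, k - 1, first):
            yield (first,) + rest

def motzkin(n):
    M = [1]
    for m in range(1, n + 1):
        M.append(M[m - 1] + sum(M[k] * M[m - 2 - k] for k in range(m - 1)))
    return M[n]

for n in range(1, 9):
    assert sum(syt(p) for p in partitions_at_most(n, 2)) == comb(n, n // 2)
    assert sum(syt(p) for p in partitions_at_most(n, 3)) == motzkin(n)
```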
4.1.2 Definition. Let w and w' be two words of X*; the shuffle product of w and w',
denoted by w ш w', is the set defined by:

L ш L' = ∪_{w_1 ∈ L, w_2 ∈ L'} (w_1 ш w_2)

4.1.4 Lemma. Let A, A' be two disjoint alphabets and w ∈ A* (resp. w' ∈ A'*) of
length n (resp. n'); then:

card(w ш w') = (n + n')! / (n! n'!)
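The cardinality formula is easy to check by generating shuffles recursively (a sketch):

```python
# Shuffle product of two words over disjoint alphabets; its cardinality
# is the binomial coefficient (n + n')! / (n! n'!).
from math import factorial

def shuffle(w1, w2):
    """Set of all interleavings of w1 and w2 preserving each word's order."""
    if not w1:
        return {w2}
    if not w2:
        return {w1}
    return ({w1[0] + u for u in shuffle(w1[1:], w2)} |
            {w2[0] + u for u in shuffle(w1, w2[1:])})

s = shuffle("ab", "cde")
assert len(s) == factorial(5) // (factorial(2) * factorial(3))   # 10
assert "abcde" in s and "cadbe" in s
```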
Remark. The problem of analyzing the evolution of two stacks in a common memory area
has been solved by Flajolet[25] using shuffles of languages corresponding to one-dimensional
random walks.

We now consider the application of Young tableaux to the enumeration of some com-
mutation classes. The results of this section are due to Françon[32]. Consider an al-
phabet X = {x_1, x̄_1, x_2, x̄_2, ..., x_n, x̄_n} and the following commutation rules: the only
letters which do not commute are x_i and x_j, and x_i and x̄_j, 1 ≤ i ≤ n, 1 ≤ j ≤ n. For the
commutation class C_n = [x_1 x_2 ⋯ x_n x̄_1 x̄_2 ⋯ x̄_n], we get card(C_n) = 1 · 3 · 5 ⋯ (2n − 1).
As shown by Françon, this example has an interpretation in queuing theory. Consider a
queue where only two kinds of events can happen:

The behavior of the queue is a sequence of such events, two events never being realized
at the same time. If the departure is determined by some priority, such as first-in-first-
out or last-in-first-out, we get a priority queue and each element of C_n corresponds to a
possible behavior of the priority queue. Françon found an interesting link between some
commutation classes and Young tableaux.
138 CHAPTER 5
4.1.5 Proposition. Let C_{n,m} be the commutation class [x_1 x_2 ⋯ x_n x̄_1 x̄_2 ⋯ x̄_m], for
0 ≤ m ≤ n, with the following commutation rules: the only non-commuting letters are x_i
and x_j, x̄_i and x̄_j, and x_i and x̄_j for i ≠ j, i = 1, 2, ..., n and j = 1, 2, ..., m. Then C_{n,m} is in
one-one correspondence with the set of two-rowed Young tableaux, having the first row of
length n, the second of length m.

In addition,

card(C_{n,m}) = ((n − m + 1)/(n + 1)) (n+m choose m)
4.1.6 Proposition. Consider an alphabet {x_1^{(1)}, x_2^{(1)}, ..., x_1^{(2)}, x_2^{(2)}, ...} with commu-
tation rules defined as follows: x_i^{(j)} and x_k^{(l)} commute if and only if i ≠ k and j ≠ l. Let
n_1, n_2, ..., n_k be k integers such that n_1 ≥ n_2 ≥ ⋯ ≥ n_k > 0. Then the commutation class
of [x_1^{(1)} x_2^{(1)} ⋯ x_{n_1}^{(1)} x_1^{(2)} x_2^{(2)} ⋯ x_{n_2}^{(2)} ⋯ x_1^{(k)} x_2^{(k)} ⋯ x_{n_k}^{(k)}] is in one-one correspondence
with the set of Young tableaux of shape [n_1, n_2, ..., n_k].
17. P. Feinsilver, Bernoulli systems in several variables, Springer Lecture Notes 1064
(1984) 86-98.
18. P. Feinsilver and R. Schott, Orthogonal polynomial expansions via Fourier transform,
Rapport INRIA 1745, 1992.
19. P. Feinsilver and R. Schott, Krawtchouk polynomials and finite probability theory,
Conference Proceedings: Probability Measures on Groups X, 129-136, Plenum Press,
1991.
20. P. Feinsilver and R. Schott, On Bessel functions and rate of convergence of zeros of
Lommel polynomials, Mathematics of Computation 59, 199 (1992) 153-156.
21. P. Feinsilver and R. Schott, Algebraic structures and operator calculus,
Volume 1: Representations and Probability Theory, Kluwer, 1993.
22. W. Feller, Introduction to probability theory and its applications, 2 vols., Wiley,
1971.
23. Ph. Flajolet, Combinatorial aspects of continued fractions. Discrete Math., 32 (1980)
125-161.
25. Ph. Flajolet, The evolution of two stacks in bounded space and random walks in a
triangle, Rapport INRIA 518, 1986 and Proceedings of MFCS'86, Lecture Notes in
Comp. Sc., 233, (1986) 325-340.
26. Ph. Flajolet, J. Françon and J. Vuillemin, Sequence of operations analysis for dynamic
data structures, J. of Algorithms, 1 (1981) 111-141.
27. Ph. Flajolet and R. Schott, Non-overlapping partitions, continued fractions, Bessel
functions, and a divergent series, European J. Combinatorics, 11, (1990) 421-432.
28. Ph. Flajolet and J.S. Vitter, Average-case analysis of algorithms and data structures.
Handbook of Theoretical Computer Science, chapter 9, Elsevier Sc. Pub. B. V., 1990.
52. G.D. Knott, Deletion in binary storage trees, Report Stan-CS 75-491, 1975.
53. D.E. Knuth, The art of computer programming, 3 volumes, Addison-Wesley, 1973.
54. D.E. Knuth, Deletions that preserve randomness, IEEE Trans. Software Eng., SE-3,
5 (1977) 351-359.
55. C. Lavault, Analysis of a distributed algorithm for mutual exclusion, INRIA Report
1309, October 1990.
56. J. van Leeuwen, ed., Algorithms and complexity, Handbook of Theoretical Computer
Science, Vol. A, Elsevier, 1990.
57. B.F. Logan and L. Shepp, A variational problem for random Young tableaux, Adv.
Math., 26 (1977) 206-222.
58. G. Louchard, Random walks, Gaussian processes and list structures, T.C.S., 53
(1987) 99-124.
59. G. Louchard, C. Kenyon and R. Schott, Data structures maxima, Proceedings
FCT'91, Lecture Notes in Comp. Science 529 (1991) 339-349.
60. G. Louchard, B. Randrianarimanana and R. Schott, Probabilistic analysis of dynamic
algorithms in D.E. Knuth's model, T.C.S., 93 (1992) 201-225.
61. G. Louchard and R. Schott, Probabilistic analysis of some distributed algorithms,
Random Structures and Algorithms, 2, 2, (1991) 151-186.
62. G. Louchard, R. Schott, J. Tolley and P. Zimmermann, Random walks, heat equation
and distributed algorithms, Journal of Computational and Applied Mathematics (in
press).
63. R.S. Maier, A path integral approach to data structures evolution, Journal of
Complexity, 7, 3, (1991) 232-260.
64. R.S. Maier, Colliding stacks: a large deviations analysis, Random Structures and
Algorithms, 2 (1991) 379-420.
65. R.S. Maier and R. Schott, The exhaustion of shared memory: stochastic results,
Proceedings WADS'93, Lecture Notes in Comp. Science 709 (1993) 494-505.
66. R.S. Maier and R. Schott, Regular approximations to shuffle products of context-
free languages and convergence of their generating functions, Proceedings FCT'93,
Lecture Notes in Comp. Science 710 (1993) 352-362.
67. K. Mehlhorn, Data structures and algorithms 1: Sorting and searching, EATCS
Monographs, Springer Verlag, 1984.
68. K. Mehlhorn and A. Tsakalidis, Data structures, Handbook of Theoretical Computer
Science, chapter 6, Elsevier Sc. Pub. B. V., 1990.
69. J. Meixner, Orthogonale Polynomsysteme mit einer besonderen Gestalt der erzeu-
genden Funktion, J. London Math. Soc., 9 (1934) 6-13.
70. T. Naeh, M.M. Klosek, B.J. Matkowsky and Z. Schuss, A direct approach to the exit
problem, SIAM J. Appl. Math. 50 (1990) 595-627.
71. C. Pair, R. Mohr and R. Schott, Construire les algorithmes: les améliorer, les
connaître, les évaluer, Dunod informatique, 1988.
72. J. Peterson and A. Silberschatz, Operating Systems Concepts, Addison-Wesley,
1983.
73. E.D. Rainville, Special functions, The Macmillan Company, 1960.
74. B. Randrianarimanana, Complexité des structures de données dynamiques, Thèse
de l'Université de Nancy 1, 1989.
75. M. Raynal, Algorithmique du parallélisme: le problème de l'exclusion mutuelle,
Dunod informatique, 1984.
76. A. Regev, Asymptotic values for degrees associated with strips of Young diagrams,
Advances in Mathematics, 41 (1981) 115-136.
77. P. Ribenboim, Algebraic numbers, Wiley-Interscience, 1972.
78. G.C. Rota (ed.), Finite operator calculus, Academic Press, 1975.
79. G. Sansone and J. Gerretsen, Lectures on the theory of functions of a complex
variable, P. Noordhoff, 1960.
80. R. Sedgewick, Algorithms, Addison-Wesley, 1988.
81. G. Szego, Orthogonal polynomials, American Math. Soc., 1975.
82. A.M. Versik and S.V. Kerov, Asymptotics of the Plancherel measure of the symmetric
group and the limiting form of Young tableaux, Sov. Math. Dokl., 18 (1977) 527-531.
83. G.X. Viennot, Algèbres de Lie libres et monoïdes libres, Lect. Notes in Math. 691,
Springer Verlag, 1978.
84. G.N. Watson, A Treatise on the Theory of Bessel Functions, Cambridge University
Press, 1980.
85. A.C. Yao, An analysis of a memory allocation scheme for implementing stacks, SIAM
J. Comput., 10 (1981) 398-403.