P. Holgate
Department of Mathematics and Statistics, Birkbeck College, London WC1E 7HX, UK
to calculate, and those arising in the general distribution are obtained as the
same linear combination of those arising in the basic cases. This approach is
described in (Holgate 1968). If there are $r_i + 1$ alleles at locus $i$, there is an
$r_1 r_2 \cdots r_k$-dimensional family of equilibria (see Theorem 4). The result relating
the dimensionality of the manifold of equilibria to the multiplicity of the
principal train root ½, although applicable in the multilocus algebra, does not
hold in as great generality as stated in Eq. (25) (see Wörz-Busekros 1980, pp.
65-69). The principal train roots are 1, and the numbers $\lambda(I) = \sum_{J \supseteq I} \lambda(J, J^c)$.
The value $\lambda(I)$ is the probability that the loci in $I$ come from the father,
irrespective of what happens elsewhere. Then, since clearly $\lambda(I) \le \tfrac{1}{2}$ for
nonempty $I$, any sequence of vectors of genetic proportions under random mating converges to an
equilibrium, by Gonshor's Theorem 2.2.
Heuch (1972) extended Reiersøl's approach to multiple loci linked to the sex
factor. Theorem 2 explicitly establishes the relevant modification of the
Hardy-Weinberg law for this case. In a later paper, Heuch (1973) dealt with the k-locus
autosomal problem, allowing for mutation at all loci, using a method related to
that of my 1968 paper. Heuch took as the basis of the set of recombination
distributions, the special cases where at each gap between chromosomes, crossing
over was either impossible or compulsory. In §6 Heuch points out the advantages
of his approach. Every actual distribution appears as a convex combination of
the basic distributions, and the calculation of the weighting coefficients is more
straightforward. On the other hand, there is some advantage in having as basis
elements distributions that could conceivably occur in nature. This line of
research was continued by Heuch (1977), where particular emphasis was placed
on the determination of the equilibria of a k-locus system with mutation. The
last paragraph of §4 contains an explicit statement of the multilocus
Hardy-Weinberg law. Here Heuch extended Reiersøl's method for sex-linkage, and
applied it to a certain incompatibility system.
In (Holgate 1979) I introduced a convenient calculus of chromosome
recombination. Let the alleles at each locus be arbitrarily labelled 0, 1 and for each
subset $I$ of the loci $\{1, 2, \ldots, k\}$, denote by $a(I)$ the chromosome containing the
'1' allele at the loci of $I$, and the '0' allele at those of $I^c$. The output of union
between $a(I)$ and $a(J)$ can be written down in terms of the recombination
distribution. We now extend the notation so that $\lambda(I, J)$, for any pair $I, J$ of
subsets of $\{1, 2, \ldots, k\}$, denotes the probability that the loci of $I$ come from the
father, and those of $J$ from the mother. These coefficients are mutually dependent,
and $\lambda(I, J) = 0$ if $I \cap J \neq \emptyset$. We take a new basis in the genetic algebra of
multilocus gametes, given by $c(I) = \sum_{J \subseteq I} (-1)^{|J|} a(J)$, $a(I) = \sum_{J \subseteq I} (-1)^{|J|} c(J)$.
We then have the simple multiplication $c(I)c(J) = \lambda(I, J)\, c(I \cup J)$. Since
$c(\emptyset)c(J) = \lambda(\emptyset, J)\, c(J)$, the principal train roots, which are the eigenvalues of
multiplication by any element representing a population, e.g. $c(\emptyset)$, are the set
$\{\lambda(I, \emptyset)\}$ ($= \{\lambda(I)\}$ for brevity), and the corresponding eigenfunctions are $\{c(I)\}$.
The first purpose of this communication is to draw attention to genetic
algebra, specifically its treatment of multilocus, multiallele systems, thus
outlining its suitability for handling complex systems in nonselective genetics, but it
is opportune to announce a number of further results pertinent to this problem.
Let the population $\sum x(I)a(I)$ in respect of the natural basis be denoted by
$\sum y(I)c(I)$ with respect to the canonical basis, so that $y(I) = (-1)^{|I|} \sum_{J \supseteq I} x(J)$.
Proposition 1 The disequilibrium functions of a nonselective, k-locus, diallelic
system are $\{\prod_{t=1}^{s} y(I_t)\}$ ($= \{\sigma(I_1, \ldots, I_s)\}$ say), where the product is taken over
each mutually exclusive, but not necessarily exhaustive set of subsets $I_1, I_2, \ldots, I_s$,
excluding the functions $y(\emptyset)$ and $\{y(i),\ i \text{ a single locus}\}$. The corresponding
eigenvalues of the operator taking a population to its offspring generation are
$\{2^s \prod_{t=1}^{s} \lambda(I_t)\}$.
Proof. We represent the population with canonical coordinates $\{y(I) : y(\emptyset) = 1\}$
by the vector of values $\sigma(I_1, \ldots, I_s)$ specified in the proposition. They are
partially ordered by refinement of the partition in the argument. We have
$\{c(\emptyset) + \sum y(I)c(I)\}^2 = \sum y(I)y(J)\lambda(I, J)\, c(I \cup J)$. Thus $\sigma(I_1, \ldots, I_s)$ is
transformed into $2^s \prod_{t=1}^{s} \lambda(I_t)\, \sigma(I_1, \ldots, I_s)$ + terms with more refined arguments.
Thus the squaring operator is equivalent to a linear operator with eigenfunctions
$\{\sigma(I_1, \ldots, I_s)\}$, and eigenvalues $\{2^s \prod_{t=1}^{s} \lambda(I_t)\}$ as required. However the
eigenvalues of $y(\emptyset)$ and of each $y(i)$ are 1.
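For two diallelic loci the statement can be illustrated directly. The sketch below is an illustration under stated assumptions rather than part of the proof: the recombination fraction `theta`, the gamete frequencies in `x`, and the helper names are all hypothetical. It applies the squaring formula from the proof in canonical coordinates and then forms the combination $y(\{1,2\}) - y(\{1\})y(\{2\})$. Since $y(\{1,2\})$ is sent to $2\lambda(\{1,2\})\,y(\{1,2\})$ plus a multiple of the invariant product $y(\{1\})y(\{2\})$, this combination should be multiplied by $2\lambda(\{1,2\}) = 1 - \theta$ in each generation.

```python
def to_canonical(x):
    """y(I) = (-1)^{|I|} * sum of x(J) over J containing I (subsets as bitmasks)."""
    n = len(x)
    return [(-1) ** bin(I).count("1") * sum(x[J] for J in range(n) if J & I == I)
            for I in range(n)]

def two_locus_lambda(theta):
    """lambda(I, J) for two loci, assuming a symmetric recombination fraction theta."""
    # r[S] = lambda(S, S^c): probability that exactly the loci in S are paternal
    r = [(1 - theta) / 2, theta / 2, theta / 2, (1 - theta) / 2]
    return lambda I, J: sum(w for S, w in enumerate(r) if S & I == I and S & J == 0)

def next_generation(y, lam):
    """Square the population: y'(K) = sum of y(I) y(J) lambda(I, J) over I u J = K."""
    out = [0.0] * len(y)
    for I, yi in enumerate(y):
        for J, yj in enumerate(y):
            out[I | J] += yi * yj * lam(I, J)   # lambda vanishes when I and J overlap
    return out

def disequilibrium(y):
    """Combination of sigma({1,2}) = y[3] and sigma({1},{2}) = y[1] * y[2]."""
    return y[3] - y[1] * y[2]

theta = 0.2                     # hypothetical recombination fraction
x = [0.4, 0.1, 0.2, 0.3]        # hypothetical frequencies of gametes 00, 10, 01, 11
y0 = to_canonical(x)
y1 = next_generation(y0, two_locus_lambda(theta))
```

In natural coordinates `disequilibrium(y0)` is the familiar $x_{11} - p_1 p_2$, with $p_1$ and $p_2$ the frequencies of the '1' allele at the two loci, and the single-locus coordinates `y1[1]` and `y1[2]` are unchanged by the mating step.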
The idea of the diallelic k-locus algebra of (Holgate 1979) can be extended to
the multiallelic case with $r_i + 1$ alleles at locus $i$, $i = 1, \ldots, k$. At any locus $i$ we
can distinguish one allele, say the $t_i$-th, and amalgamate the others. We form the
direct sum of copies of the k-locus diallelic algebra, corresponding to each string
$\tau = t_1, \ldots, t_k$ specifying the choice of distinguished allele at each locus. One
allele at each locus can be ignored, so we need $\prod r_i$ ($= R$) summands.
Definition. The genetic algebra for k linked loci with $r_i + 1$ alleles at locus $i$, is
the direct sum of $R = \prod r_i$ copies of the diallelic k-locus algebra. The natural
basis elements are the symbols $\bigoplus_{i=1}^{R} a_i(I_{\tau_i})$, where each $I_{\tau_i}$ runs through all the
subsets of $K$. Its elements are linear combinations $\sum x(\tau_1, \ldots, \tau_R) \bigoplus_{i=1}^{R} a_i(I_{\tau_i})$.
The canonical basis is related to the natural basis by the equations
$c(I_\tau) = \sum_{J_\tau \subseteq I_\tau} (-1)^{|J_\tau|} a(J_\tau)$ for each subscripted string $\tau$. In each direct
summand the product rule is $c(I_\tau)c(J_\tau) = \lambda(I_\tau, J_\tau)\, c(I_\tau \cup J_\tau)$, products corresponding
to different strings $\tau$ being zero.
An individual is represented by the basis element $\bigoplus_{i=1}^{R} a_i(I_{\tau_i})$ whose $i$-th
component is the $a_i(I_{\tau_i})$ for which $I_{\tau_i}$ is that set of loci at which the profile of the
individual agrees with the string $\tau_i$. As usual, populations are represented by
those elements with $x(\tau_1, \ldots, \tau_R) \ge 0$, $\sum x(\tau_1, \ldots, \tau_R) = 1$. The coefficient of the
component corresponding to allele $t_i$ at locus $i$, with respect to the canonical
basis, will be denoted by $y(\tau_1, \ldots, \tau_R)$.
Proposition 2 The principal train roots of the multilocus, multiallelic genetic algebra
are those of the diallelic algebra, with the multiplicity of each multiplied by R.
Proof. Each of the direct summands in the algebra is isomorphic to a diallelic
algebra with the same crossover distribution.
Genetic algebras of nonselective genetic systems admit obvious discrete
groups of automorphisms corresponding to permutations of the labels of the
alleles. However, the simplest law of gametic combination, $a_1 a_2 = \tfrac{1}{2}(a_1 + a_2)$,
implies that if $b_1 = \theta a_1 + (1 - \theta)a_2$, $b_2 = \varphi a_1 + (1 - \varphi)a_2$, then $b_1 b_2 = \tfrac{1}{2}(b_1 + b_2)$.
This means that not only can a single locus, $(r + 1)$-allelic population be
described equally well by any $r + 1$ linear combinations of gametic frequencies
that add up to 1, of which $r$ are linearly independent, but we can calculate with
these new coordinates from generation to generation just as if they were gametic
frequencies. The automorphism group $\mathrm{Aut}(k; r_1, \ldots, r_k)$ for the multilocus
situation is the product of the affine groups on spaces of dimensions $r_i + 1$,
$i = 1, \ldots, k$, seen more naturally in terms of the canonical coordinates as
The multilocus Hardy Weinberg law 105
in terms of the natural coordinates. Whenever the index '1' occurs it is replaced
by $i_t$, the $t$-th digit of the $\tau$-string of the relevant component of the direct sum.
A coordinate containing '0's is replaced by a sum of coordinates in which
the indices take all values except $i_t$. Some care is necessary here since the
natural coordinatisation is 'singular', since $\sum x_{i_1 \ldots i_k} = 1$. For the string 11
the disequilibrium function is transformed into $(x_{00} + x_{02} + x_{20} + x_{22})x_{11} - (x_{01} + x_{21})(x_{10} + x_{12})$.
For the strings 12, 21, 22 we obtain respectively:
$(x_{00} + x_{20} + x_{01} + x_{21})x_{12} - (x_{01} + x_{21})(x_{10} + x_{12}) = (x_{00} + x_{20})x_{12} - (x_{01} + x_{21})x_{10}$;
$(x_{00} + x_{10})x_{21} - (x_{02} + x_{12})x_{20}$; $(x_{00} + x_{01} + x_{10} + x_{11})x_{22} - (x_{02} + x_{12})(x_{20} + x_{21})$,
the third and fourth functions being written down from the second and first
respectively by interchanging 1 and 2. For three loci each with three alleles, we
have the disequilibrium functions as above for each of the three pairs of loci, the
eigenvalues being the probabilities of nonrecombination between each pair. In
addition, the diallelic situation leads to the further disequilibrium function, which
by Proposition 1 is $y_{111}$ in canonical coordinates. Use of the symbolic calculus with
$y_{ijk} = \eta_i \eta_j \eta_k$, the substitution $\xi = \eta_1 - \eta_0$ and the replacement $\xi_i \xi_j \xi_k = x_{ijk}$ leads
:
References
Abraham, V. M.: Linearising quadratic transformations in genetic algebras. Proc. Lond. Math. Soc.,
III. Ser. 40, 346-363 (1980)
Abraham, V. M.: The induced linear transformation in a genetic algebra. Proc. Lond. Math. Soc., III.
Ser. 40, 364-384 (1980)
Bennett, J. H.: On the theory of random mating. Ann. Eugen. 18, 311-317 (1954)
Geiringer, H.: On the probability theory of linkage in Mendelian heredity. Ann. Math. Stat. 15, 25-57
(1944)
Gonshor, H.: Special train algebras arising in genetics. Proc. Edinb. Math. Soc., II. Ser. 12, 41-53
(1960)
Heuch, I.: k loci linked to a sex factor in haploid individuals. Biom. Z. 13, 57-68 (1972)
Heuch, I.: The linear algebra for linked loci with mutation. Math. Biosci. 16, 263-271 (1973)
Heuch, I.: Genetic algebras for systems with linked loci. Math. Biosci. 34, 35-47 (1977)
Hill, W. G.: Disequilibrium among several linked neutral genes in finite populations. I. Theor. Popul.
Biol. 5, 366-392 (1974)
Holgate, P.: Sequences of powers in genetic algebras. J. Lond. Math. Soc. 42, 489-496 (1967)
Holgate, P.: The genetic algebra of k linked loci. Proc. Lond. Math. Soc., III. Ser. 18, 315-327 (1968)
Holgate, P.: Direct products of genetic algebras and Markov chains. J. Math. Biol. 3, 289-295 (1976)
Holgate, P.: Canonical multiplication in the genetic algebra for linked loci. Linear Algebra Appl. 26,
281-287 (1979)
Holgate, P.: Linearisation of quadratic operators in genetic algebras. Cah. Math. 38, 23-33 (1989)
Holgate, P.: Bibliography of genetic algebras and related topics. Typescript (1991)
Karlin, S., Liberman, U.: Global convergence properties in multilocus viability selection models: the
additive model and the Hardy-Weinberg law. J. Math. Biol. 29, 161-176 (1990)
Peresi, L. A.: The derivation algebra of gametic algebra for linked loci. Math. Biosci. 91, 151-156
(1988)
Reiersøl, O.: Genetic algebras studied recursively and by means of differential operators. Math. Scand.
10, 25-44 (1962)
Wörz-Busekros, A.: Algebras in genetics. (Lect. Notes Biomath., vol. 36) Berlin Heidelberg New York:
Springer 1980