
Euphytica 51: 235-240, 1991.

© 1991 Kluwer Academic Publishers. Printed in the Netherlands.

The interpretation of Nei and Shannon-Weaver within population variation indices

S. Hennink¹ & A.C. Zeven

Department of Plant Breeding (I.v.P.), Agricultural University, P.O. Box 386, 6700 AJ Wageningen,
The Netherlands; ¹present address: Centre for Plant Breeding Research CPO, P.O. Box 16,
6700 AA Wageningen, The Netherlands

Received 21 June 1990; accepted 16 August 1990

Key words: variation index, within population variation

Summary

The description of variation in populations is of great importance for plant breeders and gene bank
researchers. The Nei and Shannon-Weaver within population variation indices are reviewed in this paper.
The authors reject the use of the Shannon-Weaver index as an index for measuring variation.
General conclusions about the amount of variation in a population should not be drawn from a few
characters, because there is no association between the amount of variation for different characters. Such
conclusions are only valid for the considered characters.

Introduction

A population of plants can often be divided into several classes for a number of characters. The variation for each character is under the control of genetic and environmental factors. For plant breeders and gene bank researchers it is desirable to have standardized measures to compare the variation occurring in different populations.
Plant breeders need to know which population contains much variation for certain characters and which population contains the desired variation. Gene bank researchers are interested in the complete range of variation of a crop.
Jain et al. (1975) wrote: 'Ultimate success of any large-scale program for the conservation of genes depends on survey and proper documentation of variation within and among populations.'
In this paper Nei's and Shannon-Weaver's variation indices are compared. Nei (1973) and Chakraborty (1974) derived formulas for Nei's index to describe variation within and between populations. The Shannon-Weaver index (1949) is only a within population index. Therefore only within population variation is discussed here.

Nei's within population variation index

Simpson (1949) defined an infinite population such that each individual belongs to one of Z groups. Let $X_i$ ($i = 1, 2, \ldots, Z$) be the fraction of individuals in group i, then $\sum_{i=1}^{Z} X_i = 1$. He next defined:

$h = \sum_{i=1}^{Z} X_i^2$   (I)

Then h is a measure of the concentration of the classification. The quantity h can simply be interpreted as the probability that two individuals, chosen at random from the population, belong to the same group i for the considered character.

$H = 1 - h$   (II)

Nei (1973) defined H, the probability of non-identity. H is a measure of genetic variation in a population when a qualitative, genetically controlled character is considered. In a random mating population, when $X_i$ is the fraction of the i-th allele, H is usually called heterozygosity. However, this is not appropriate for non-random mating populations. Therefore Nei (1973) uses the term gene diversity. This term does not fit when we consider characters of which the genetic control is unknown. In that case $X_i$ is the fraction of the i-th group. So we coin the term Nei's variation index (H).
Nei's variation index reaches a maximum, called $H_{max}$, when there is evenness. This means that the individuals are uniformly distributed over the Z groups, implying $X_i = 1/Z$.

$H_{max} = 1 - 1/Z$   (III)

A relative index H' can be defined:

$H' = H / H_{max}$   (IV)

This relative index indicates which part of the theoretically possible variation actually exists in the population. From the preceding it follows that H' is one when there is evenness.

The Shannon-Weaver variation index

Shannon & Weaver (1949) described the index for information I.

$I = -\sum_{i=1}^{Z} X_i \log_2 X_i$   (V)

Although I is often used (Jain et al., 1975; Tolbert et al., 1979) and mentioned (Brown & Weir, 1983; Peet, 1974) it is difficult to interpret. Nei (1975) said: 'It's not clear what the absolute value of this quantity means in terms of genetic materials.'
Several modifications of this index have been published. In several cases the natural logarithm is used instead of base 2 (Asins & Carbonell, 1987; Hengeveld et al., 1982). It is clear that Shannon & Weaver (1949) intended $\log_2$ since they said: 'I has its largest value, namely one, when two messages are equally probable; that is to say when $p_1 = p_2 = 1/2$.' This is only possible for $\log_2$.
Just like H (II), I (V) has a maximum value, which depends on the number of groups (Z) and occurs in the case of evenness:

$I_{max} = \log_2 Z$   (VI)

The relative index I' is:

$I' = \frac{-\sum_{i=1}^{Z} X_i \log_2 X_i}{\log_2 Z} = -\sum_{i=1}^{Z} X_i \log_Z X_i$   (VII)

I' indicates, just like H', which part of the theoretically possible variation exists in the population. Table 1 shows the maximum values for the indices I and H.
Due to their definitions the relative indices H' and I' always become one under the conditions given in Table 1.

Table 1. The maximum values which may be attained by Nei's and Shannon-Weaver's variation indices for some values of Z. In these cases $X_i = 1/Z$

Number of groups (Z)    Hmax     Imax
  1                     0        0
  3                     0.667    1.585
  9                     0.889    3.170
 25                     0.960    4.644
100                     0.990    6.644

Some examples for both indices

For each of two hypothetical plant populations two different classifications for plant height are considered (Table 2). In this example the situations for 3 and 9 classes are considered. From the frequency distributions shown in Table 2 it is clear that the

Table 2. Absolute frequencies of artificial plant height data for two populations, named A and B. Each individual of both populations could belong to one of nine classes. In the second case it is supposed that one could distinguish only three classes

Nine classes     Counts          Three classes    Counts
Class            A9      B9      Class            A3      B3

1 (1-20)          1       0
2 (21-40)         9       0      1 (1-60)         25       1
3 (41-60)        15       1
4 (61-80)        10       7
5 (81-100)        8       8      2 (61-120)       25      30
6 (101-120)       7      15
7 (121-140)       0      10
8 (141-160)       0       9      3 (121-180)       0      19
9 (161-180)       0       0

n                50      50                       50      50
mean             64.9   111.7                     60.5   112.1
std. dev.        26.80   26.61                    24.87   26.33
CV               41.30   23.82                    41.10   23.49

Std. dev. corrected with Sheppard's correction (Snedecor & Cochran, 1967). CV = 100 · std. dev. / mean
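The index values discussed for this example can be recomputed from the counts in Table 2 with equations I to VII. The following is a minimal sketch rather than the authors' procedure: the function names are hypothetical, empty classes are assumed to contribute zero to I (the usual convention that x log x tends to 0), and Z is taken as the number of classes that could be distinguished (9 or 3), including empty ones.

```python
import math

def nei_index(counts):
    """Nei's variation index H = 1 - sum(X_i^2), equations I and II."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def shannon_index(counts):
    """Shannon-Weaver index I = -sum(X_i * log2 X_i), equation V.
    Assumption: empty classes contribute zero."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)

def relative_indices(counts):
    """Relative indices H' = H / Hmax and I' = I / Imax (equations III-VII).
    Assumption: Z is the number of listed classes, empty ones included."""
    z = len(counts)
    h_max = 1.0 - 1.0 / z
    i_max = math.log2(z)
    return nei_index(counts) / h_max, shannon_index(counts) / i_max

# Counts per class, taken from Table 2.
TABLE2_COUNTS = {
    "A9": [1, 9, 15, 10, 8, 7, 0, 0, 0],
    "B9": [0, 0, 1, 7, 8, 15, 10, 9, 0],
    "A3": [25, 25, 0],
    "B3": [1, 30, 19],
}

for name, counts in TABLE2_COUNTS.items():
    h_rel, i_rel = relative_indices(counts)
    print(f"{name}: H={nei_index(counts):.3f} H'={h_rel:.3f} "
          f"I={shannon_index(counts):.3f} I'={i_rel:.3f}")
```

Under these assumptions the sketch gives H = 0.792 and I = 2.364 for both A9 and B9, while for the three-class case H is larger for A3 and I is larger for B3, in line with Table 3.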

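The summary rows of Table 2 (mean, standard deviation with Sheppard's correction, and CV) can be illustrated in the same way. This sketch rests on assumptions not stated in the text: each class is represented by its midpoint (10.5 for class 1-20, 30.5 for class 1-60, and so on), the variance uses the divisor n - 1, and Sheppard's correction subtracts the squared class width divided by 12 from the variance. The function name is hypothetical.

```python
import math

def grouped_stats(midpoints, counts, class_width):
    """Mean, Sheppard-corrected standard deviation and CV for grouped data."""
    n = sum(counts)
    mean = sum(m * c for m, c in zip(midpoints, counts)) / n
    # Sample variance computed from class midpoints (divisor n - 1 is assumed).
    ss = sum(c * (m - mean) ** 2 for m, c in zip(midpoints, counts))
    variance = ss / (n - 1)
    # Sheppard's correction: subtract (class width)^2 / 12 from the variance.
    sd = math.sqrt(variance - class_width ** 2 / 12)
    cv = 100 * sd / mean
    return mean, sd, cv

# Assumed class midpoints for the nine-class and three-class groupings.
MID9 = [10.5 + 20 * k for k in range(9)]
MID3 = [30.5 + 60 * k for k in range(3)]

# (midpoints, counts from Table 2, class width)
TABLE2 = {
    "A9": (MID9, [1, 9, 15, 10, 8, 7, 0, 0, 0], 20),
    "B9": (MID9, [0, 0, 1, 7, 8, 15, 10, 9, 0], 20),
    "A3": (MID3, [25, 25, 0], 60),
    "B3": (MID3, [1, 30, 19], 60),
}

for name, (mid, counts, width) in TABLE2.items():
    mean, sd, cv = grouped_stats(mid, counts, width)
    print(f"{name}: mean={mean:.1f} sd={sd:.2f} CV={cv:.2f}")
```

With these assumptions the computed values agree with the mean, std. dev. and CV rows of Table 2 to the printed precision, which suggests, but does not prove, that the table was derived this way.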
two populations differ, since their means differ significantly. This conclusion does not depend on the method of classifying the data. There is no significant difference between the classification A9 and A3 or B9 and B3 for mean plant height and its standard deviation in Table 2.
Table 3 shows no difference for the variation indices when the classification in 9 groups (A9 and B9) is used, but when a 3 group classification (A3 and B3) is used H indicates more variation in population A while I indicates more variation in population B.

Table 3. Values for Nei's and Shannon-Weaver's variation and relative indices obtained with data from Table 2

        A9       B9       A3       B3

H       0.792    0.792    0.5      0.495
H'      0.891    0.891    0.75     0.743
I       2.364    2.364    1        1.086
I'      0.746    0.746    0.631    0.685

When one is unaware of Table 2 and knows only Table 3, the variation of the two populations seems equal. Nei's variation index measures a probability, but the interpretation of the Shannon-Weaver index is not clear. Both indices are within population indices. These types of indices need not discriminate among populations. Distinction between populations can be obtained with the aid of between population indices, which are only available for Nei's index (Nei, 1973; Chakraborty, 1974). One cannot infer from these population indices which population contains the desired variation. A frequency distribution or a mean with standard deviation (for quantitative characters only) is needed to solve this problem.
Table 4 shows the frequency distribution for 49 observed characters of a group of 78 morphotypes of bread wheat landraces and improved types (Zeven & Schachl, 1989). The range of the two variation indices covers almost the whole range of possible indices, so it is impossible to conclude whether this collection as a whole contains much or little variation. The two indices do not give identical rankings for the different traits. This can be explained by the fact that the Shannon-Weaver index is sensitive to small fractions. A relatively large contribution to the value of I is made by small fractions.
For the individual characters it is not possible to

Table 4. Frequency distributions (Z = 9) for 49 characters of a group of 78 morphotypes of bread wheat landraces and improved types in the Austrian Alps and the calculated Nei and Shannon-Weaver within population variation indices (Schachl, 1975; Zeven & Schachl, 1989)

No. Character Character no.* Observed number per class+ (classes 1 2 3 4 5 6 7 8 9) Total H (rank) I (rank)

0 Maximum variation possible 0 9 9 9 9 9 9 8 8 8 78 0.889 (1) 3.168 (1)


1 Lodging resistance 48* 21 8 11 1 4 3 8 12 10 78 0.842 (2) 2.859 (2)
2 Flag leaf sheath hairiness 19 14 8 15 0 11 4 22 1 3 78 0.817 (3) 2.634 (3)
3 Lowest leaf glaucosity 14 0 1 16 16 17 12 13 3 0 78 0.815 (4) 2.524 (4)
4 Lowest leaf hairiness 13 18 7 14 0 21 0 15 3 0 78 0.796 (5) 2.393 (7)
5 Lower glume shoulder width 34 1 4 14 0 7 1 26 8 17 78 0.788 (6) 2.482 (5)
6 Lower glume beak length 37 1 2 21 12 10 0 8 0 24 78 0.781 (7) 2.381 (8)
7 Flag leaf auricle colour 18 13 15 29 0 5 0 6 2 8 78 0.776 (8) 2.430 (6)
8 Flag leaf auricle hairiness 17 7 13 16 1 27 1 13 0 0 78 0.774 (9) 2.334 (9)
9 Ear length 8 0 0 0 0 19 14 20 5 20 78 0.773 (10) 2.202 (15)
10 Ear density 2 19 16 25 1 0 1 10 6 0 78 0.773 (11) 2.317 (10)
11 Culm waxiness 22 1 0 4 0 2 11 32 16 12 78 0.743 (12) 2.246 (13)
12 Flag leaf width 21 0 2 11 6 10 17 32 0 0 78 0.741 (13) 2.205 (14)
13 Lowest leaf hardness 38 2 0 15 0 10 0 17 3 31 78 0.739 (14) 2.162 (16)
14 Lower glume coarseness 35 15 4 5 0 2 0 34 13 5 78 0.734 (15) 2.274 (12)
15 Grain length 43 0 0 0 7 29 22 16 3 1 78 0.730 (16) 2.088 (17)
16 Lower glume curvature 31 0 0 27 6 23 3 19 0 0 78 0.726 (17) 2.011 (19)
17 Ear colour intensity 4 0 10 31 3 25 3 6 0 0 78 0.714 (18) 2.081 (18)
18 Ear shape 5 17 3 14 0 35 0 9 0 0 78 0.704 (19) 1.983 (22)
19 Rachis hairiness 45 10 8 7 0 3 1 7 3 39 78 0.704 (20) 2.283 (11)
20 Lower glume external hairs 26 34 14 20 5 1 0 3 0 0 77 0.699 (21) 1.993 (20)
21 Ear awning 10 13 0 25 0 9 0 0 0 31 78 0.698 (22) 1.846 (29)
22 Culm colour 25 0 0 14 0 13 0 16 0 35 78 0.697 (23) 1.863 (25)
23 Lowest leaf length 16 0 0 0 0 1 5 26 30 16 78 0.695 (24) 1.862 (27)
24 Grain width 39 0 1 2 11 37 17 10 0 0 78 0.690 (25) 1.984 (21)
25 Culm length 23 0 0 0 0 0 14 14 14 36 78 0.690 (26) 1.849 (28)
26 Grain thickness 40 1 2 16 23 34 2 0 0 0 78 0.679 (27) 1.862 (26)
27 Grain shape 41 0 0 29 0 22 0 25 0 2 78 0.679 (28) 1.707 (36)
28 Rachis length of lowest internode 44 0 0 4 2 17 1 38 0 16 78 0.670 (29) 1.889 (23)
29 Ear bending 7 22 0 37 0 13 0 6 0 0 78 0.662 (30) 1.741 (33)
30 Lower glume internal hairs 27 36 24 13 2 2 0 0 0 0 77 0.654 (31) 1.744 (31)
31 Ear attitude 6 14 4 15 0 42 1 1 1 0 78 0.638 (32) 1.845 (30)
32 Lower glume beak shape 36 24 0 8 0 4 0 40 0 2 78 0.629 (33) 1.710 (35)
33 Lower glume size 30 0 0 0 11 34 1 32 0 0 78 0.622 (34) 1.529 (38)
34 Ear width 1 0 0 14 6 45 6 4 3 0 78 0.619 (35) 1.872 (24)
35 Lower glume shape 29 0 5 44 0 7 0 18 0 4 78 0.614 (36) 1.740 (34)
36 Grain size 42 0 0 2 12 47 7 9 1 0 78 0.591 (37) 1.744 (32)
37 Lower glume width 28 0 1 21 6 46 4 0 0 0 78 0.571 (38) 1.544 (37)
38 Lowest leaf width 12 0 0 15 10 50 2 1 0 0 78 0.535 (39) 1.465 (39)
39 Flag leaf waxiness 20 0 1 0 0 0 5 50 20 2 78 0.518 (40) 1.385 (40)
40 Ear colour 3 33 1 0 0 44 0 0 0 0 78 0.503 (41) 1.072 (41)
41 Flag leaf attitude 49 0 1 53 0 0 0 24 0 0 78 0.443 (42) 0.983 (43)
42 Anther colour 9 53 0 0 0 0 0 0 0 25 78 0.436 (43) 0.905 (46)
43 Lower glume of the top spikelet 32 1 0 55 0 22 0 0 0 0 78 0.423 (44) 0.951 (45)
44 Culm length of top internode 24 0 0 0 0 0 1 18 1 58 78 0.393 (45) 0.967 (44)
45 Lower glume shoulder shape 33 0 0 1 1 63 2 3 0 8 78 0.335 (46) 1.063 (42)
46 Lowest leaf colour 15 0 0 0 5 65 6 2 0 0 78 0.295 (47) 0.893 (47)
47 Plant leaf attitude 46 0 0 0 0 0 0 69 2 7 78 0.209 (48) 0.604 (48)
48 Plant leaf curvature 47 5 0 1 0 0 0 1 0 71 78 0.167 (49) 0.539 (49)
49 Lowest leaf attitude 11 0 0 0 0 0 0 0 0 78 78 0.000 (50) 0.000 (50)

+ description value, * character number: both according to Zeven & Schachl, 1989.
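The disagreement between the two rankings in Table 4 can be illustrated with two of its rows. This is a minimal sketch under stated assumptions: the function names are hypothetical, empty classes contribute zero to I, and the 10% threshold used to call a class fraction 'small' is chosen here only for illustration and does not come from the paper.

```python
import math

def nei_index(counts):
    """Nei's variation index H = 1 - sum(X_i^2)."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def shannon_index(counts):
    """Shannon-Weaver index I = -sum(X_i * log2 X_i); empty classes add zero."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)

def small_fraction_share(counts, threshold=0.10):
    """Share of I contributed by classes whose fraction is below `threshold`.
    The threshold is an illustrative assumption."""
    n = sum(counts)
    small = -sum((c / n) * math.log2(c / n)
                 for c in counts if 0 < c / n < threshold)
    return small / shannon_index(counts)

# Observed numbers per class, copied from Table 4.
EAR_LENGTH = [0, 0, 0, 0, 19, 14, 20, 5, 20]       # row 9
RACHIS_HAIRINESS = [10, 8, 7, 0, 3, 1, 7, 3, 39]   # row 19

for name, counts in (("Ear length", EAR_LENGTH),
                     ("Rachis hairiness", RACHIS_HAIRINESS)):
    print(f"{name}: H={nei_index(counts):.3f} I={shannon_index(counts):.3f} "
          f"share of I from small classes={small_fraction_share(counts):.2f}")
```

Ear length has the higher H (0.773 against 0.704) but the lower I (2.202 against 2.283). Rachis hairiness has many sparsely filled classes, and under the assumed threshold these contribute a larger share of I than the small classes of ear length do, which is consistent with the sensitivity of I to small fractions.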

classify the characters as having much, average or little variation. For example, the frequency distributions of lowest leaf glaucosity (3) and grain width (24) are not very different, although grain width has a considerably lower value for both variation indices. This means that the indices are not useful for placing the characters in variation groups such as having 'much' or 'little' variation.
For the characters with 'little' variation there is a concentration of the entries in one or a few groups. From the indices it is not possible to locate the position of the concentration, e.g. flag leaf waxiness (39) is concentrated in the higher classes, i.e. strongly waxed is common for this collection (Zeven & Schachl, 1989).

Conclusion and discussion

The interpretation of both indices is difficult. We reject the Shannon-Weaver (I) index as an index for measuring variation because it has no direct meaning and it is very sensitive to small fractions. The Nei within population index is more comprehensible and offers more possibilities, such as a between population variation index. But it is still not optimal. It is possible that several populations have 'low' variation indices for a certain character, while these populations together still contain the complete range of variation (see e.g. the columns A9 and B9 of Tables 2 and 3). For the interpretation of the indices a frequency distribution or a mean with standard deviation (quantitative characters only) is still necessary.
The number of classes used for the calculation of the indices can play an important role, as shown in Tables 2 and 3. When changing the number of classes, the variation indices change. The indices are only comparable if the same number of classes is used. This also applies to the relative indices. It is obvious that the indices of different characters between and within populations should not be compared.

Damania et al. (1983) concluded that there is a positive association between the level of variation of one character and another character. Zeven (1987), using research data of wheat and barley landraces from Austria, Cyprus and Pakistan, did not find such an association. In a Pakistani wheat landrace Monster (1988) found no variation for leaf rust resistance and gliadin and glutenin composition, but there was a high level of variation for plant morphotype.
Jain et al. (1975) concluded in their paper: 'the variation seems to be rather high in the Ethiopian and Mediterranean region, and high also in India, but low, on average, in Near Eastern and Middle-East countries.' This conclusion was drawn from a mean diversity index I based on six characters for each region. This kind of conclusion should not be generalised. It is only valid for the considered characters. When large numbers of characters are observed the indices may cover the whole range of values possible (Table 4). A new character can have any variation index and so measuring variation on a new set of characters can give a completely different outcome.

Acknowledgements

The authors wish to thank Dr. I. Bos and J.A. van der Heiden for their helpful comments and suggestions.

References

Asins, M.J. & E.A. Carbonell, 1987. Concepts involved in measuring genetic variability and its importance in conservation of plant genetic resources. Evolutionary Trends in Plants 1 (1): 51-62.
Brown, A.H.D. & B.S. Weir, 1983. Measuring genetic variability in plant populations. In: S.D. Tanksley & T.J. Orton (Eds). Isozymes in plant genetics and breeding, Part A. Elsevier Science Publishers B.V., Amsterdam, pp. 219-239.
Chakraborty, R., 1974. A note on Nei's Measure of Gene Diversity in a substructured Population. Humangenetik 21: 85-88.
Damania, A.B., E. Porceddu & M.T. Jackson, 1983. A rapid method for the evaluation of variation in germplasm collections of cereals using polyacrylamide gel electrophoresis. Euphytica 32: 877-883.
Hengeveld, R., H.B. Becker & J.B. Biezen, 1982. Aspecten van statistisch gedrag van diversiteitsmaten. Vakbl. Biol. 62 (12): 230-234.
Jain, S.K., C.O. Qualset, G.M. Bhatt & K.K. Wu, 1975. Geographical Patterns of Phenotypic Diversity in a World Collection of Durum Wheats. Crop Sci. 15: 700-704.
Monster, E.M., 1988. Unpublished research report. Dept. of Plant Breeding, Agric. Univ. Typescript 80 p.
Nei, M., 1973. Analysis of Gene Diversity in Subdivided Populations. Proc. Nat. Acad. Sci. USA 70 (12): 3321-3323.
Nei, M., 1975. Molecular population genetics and evolution. In: A. Neuberger & E.L. Tatum (Eds). Frontiers of Biology Vol. 40. North Holland Publishing Company, Amsterdam, New York. pp. 127-154.
Peet, R.K., 1974. The measurement of species diversity. Annual Review of Ecology and Systematics 15: 285-307.
Schachl, R., 1975. Die Landweizen des westliches Alpenvorlandes. Doctor's Dissertation. Linz. Mimeographed. 147 p.
Shannon, C.E. & W. Weaver, 1949. The mathematical theory of communication. The University of Illinois. Urbana, Chicago, London. pp. 3-24.
Simpson, E.H., 1949. Measurement of Diversity. Nature 163: 688.
Snedecor, G.W. & W.G. Cochran, 1967. Statistical methods. Sixth edition. The Iowa State University Press. Ames, Iowa, USA. pp. 82-83.
Tolbert, D.M., C.O. Qualset, S.K. Jain & J.C. Craddock, 1979. A diversity analysis of a world collection of barley. Crop Sci. 19: 789-794.
Zeven, A.C., 1987. Onderzoek op het IvP naar de mate van fenotypische and genotypische variatie bij landrassen. Prophyta 9: 199-200.
Zeven, A.C. & R. Schachl, 1989. Groups of bread wheat landraces in Austrian Alps. Euphytica 41: 235-246.
