ABSTRACT
Measurement of diversity is a central theme in the study of biodiversity. Several measures of diversity have been proposed
by different authors, but Shannon’s, Simpson’s and Brillouin’s indices are the most widely used ones. At the same
time, these are the least understood indices. In the present review, some of the commonly used diversity indices
are discussed with specific examples. Special emphasis has been laid on explaining the derivation of diversity
indices from basic concepts.
Key words: Biodiversity, Brillouin’s index, Chao’s index, Shannon’s index, Simpson’s index
within species, between species and the ecosystems”. The totality of genes, species and ecosystems of an area constitutes
the biodiversity. Biodiversity can be studied at three different
composition, influx of new species with time, outflux of species, immigration and emigration of species, dormant or
hidden species, propagules of species etc. Model assumptions,
Members Copy, Not for Commercial Sale
www.IndianJournals.com
the curve decreases with the sampling effort and tends to a limiting value.
$$R_{Mar} = \frac{S - 1}{\ln N}$$
Odum’s index: Odum’s index (Odum et al., 1960) is similar to Margalef’s index.
$$R_{Odum} = \frac{S}{\ln N}$$
Berger–Parker dominance index: The Berger–Parker index is the ratio of the number of individuals of the most abundant
species (Nmax) to the total number of individuals of all the species (Ntot) in the sample.
$$DI_{BP} = \frac{N_{max}}{N_{tot}}$$
Fig. 1. Species accumulation curve
The curve follows Michaelis–Menten’s equation and describes a hyperbola as per the following equation:
$$S(n) = \frac{S_{max} \cdot n}{B + n}$$
where S(n) = Number of species for a sample size n,
Smax = Maximum no. of species in the community,
n = number of samples or maximum sample size studied,
B = Constant.
The equation may be written in the form:
$$\frac{1}{S(n)} = \frac{B}{S_{max} \cdot n} + \frac{1}{S_{max}}$$
The y-intercept of the double reciprocal plot between 1/S(n) and 1/n will provide the value of 1/Smax.
Fisher’s α: This index is based upon the logarithmic distribution of number of individuals of different species.
$$S = \alpha \ln\left(1 + \frac{N}{\alpha}\right)$$
where, S is the total number of species and N is the total number of individuals in the sample. The value of Fisher’s α
is computed by iteration.
Average number of species per log cycle of importance
Whittaker (1972) defined the average number of species per log cycle of importance, EWI, as:
$$E_{WI} = \frac{S}{\ln S_1 - \ln S_n}$$
where, S1 is the number of individuals of the most common species, Sn is the number of individuals of the least common
species, and S is the number of species.
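The richness and dominance indices above reduce to a few lines of arithmetic; only Fisher’s α needs an iterative solver. The following Python sketch is illustrative and not the authors’ procedure: the function names and the abundance list are hypothetical, natural logarithms are assumed throughout, and bisection is only one possible choice of iteration (the text does not name one).

```python
import math

def margalef(S, N):
    """Margalef's index: (S - 1) / ln N."""
    return (S - 1) / math.log(N)

def odum(S, N):
    """Odum's index: S / ln N."""
    return S / math.log(N)

def berger_parker(counts):
    """Berger-Parker dominance: N_max / N_tot."""
    return max(counts) / sum(counts)

def whittaker_log_cycle(counts):
    """S / (ln S1 - ln Sn), S1 and Sn being the largest and smallest abundances."""
    return len(counts) / (math.log(max(counts)) - math.log(min(counts)))

def fisher_alpha(S, N, tol=1e-12):
    """Solve S = a * ln(1 + N / a) for a by bisection (assumes 0 < S < N)."""
    if not 0 < S < N:
        raise ValueError("bisection assumes 0 < S < N")

    def residual(a):
        return a * math.log(1 + N / a) - S

    lo, hi = 1e-9, 1.0
    while residual(hi) < 0:        # residual grows with a: expand until the root is bracketed
        hi *= 2
    while hi - lo > tol * hi:      # halve the bracket until it is relatively tiny
        mid = (lo + hi) / 2
        if residual(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

counts = [50, 30, 10, 5, 3, 1, 1]  # hypothetical abundances of 7 species
S, N = len(counts), sum(counts)
print("Margalef:", margalef(S, N))
print("Odum:", odum(S, N))
print("Berger-Parker:", berger_parker(counts))
print("Whittaker E_WI:", whittaker_log_cycle(counts))
print("Fisher's alpha:", fisher_alpha(S, N))
```

Bisection is used here because the left-hand side of Fisher’s equation increases steadily with α, so a bracketed root is unique; any other root finder would serve equally well.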
Chao’s index
Chao’s index (Chao 1984, 1987) for estimation of species richness is given by the equation:
$$S_{max(Chao)} = S_{obs} + \frac{a^2}{2b}$$
where, Smax = Maximum no. of species,
Sobs = Number of species observed in different samples,
a = Singletons (Number of species represented by one individual each),
b = Doubletons (Number of species represented by two individuals each).
An example for computation of Smax by Chao’s method is given in Table 1. Let there be 3 samples (I through III) and
a total of 5 species observed (Sp1 through Sp5).
Table 1. An example to calculate Smax by Chao’s method
Sample   Sp1   Sp2   Sp3   Sp4   Sp5
I        1*    0     4     5     0
II       0     2**   3     1     1*
III      0     0     3     4     0
Sobs = 5, a = 2 singletons (marked by *), b = 1 doubleton (marked by **), Smax = 5 + (4/2) = 7
Table 2. Decimal and binary systems
Decimal number   Binary number
0                0
1                1
2                1 0
3                1 1
4                1 0 0
5                1 0 1
6                1 1 0
7                1 1 1
8                1 0 0 0
9                1 0 0 1
10               1 0 1 0
prime sign on it, viz. H. It is seen from the first two columns of Table 2 that the message length (no. of alphabets) is
related to the information content (H) as a power function, i.e., m = 2^H. Vice-versa, H can be defined as a logarithmic
function, viz., H = log2 m, as given in column 3. Probability (pi) of each letter in a message consisting of m alphabets,
each alphabet occurring only once, will be 1/m. Information content, H, therefore can be written as a function of
probability (columns 5 and 6).
Downloaded From IP - 106.66.56.187 on dated 1-Apr-2024
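The Table 1 example for Chao’s estimator can be checked with a short Python sketch. It assumes the estimator in the form S_obs + a²/(2b) given by Chao (1984), with singletons and doubletons counted on abundances pooled over the three samples; the function name `chao_estimate` and the dictionary layout are hypothetical.

```python
def chao_estimate(abundances):
    """Chao's richness estimate S_obs + a^2 / (2b) from pooled species abundances."""
    observed = [n for n in abundances if n > 0]
    s_obs = len(observed)
    a = sum(1 for n in observed if n == 1)   # singletons
    b = sum(1 for n in observed if n == 2)   # doubletons
    if b == 0:
        raise ValueError("this form of the estimator is undefined without doubletons")
    return s_obs + a * a / (2 * b)

# Counts of Sp1..Sp5 in samples I..III, as in Table 1
samples = {
    "I":   [1, 0, 4, 5, 0],
    "II":  [0, 2, 3, 1, 1],
    "III": [0, 0, 3, 4, 0],
}
pooled = [sum(column) for column in zip(*samples.values())]
print(pooled)                 # [1, 2, 10, 10, 1]
print(chao_estimate(pooled))  # 7.0
```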
Columns: (1) Message length (m), no. of alphabets; (2) Letters; (3) Binary code (I bit to IV bit); (4) Information
content (H), no. of bits required for message; (5) Information content per letter, no. of bits required for each letter.
(1)   (2)   (3)        (4)   (5)
2     A     0          1     1/2
      B     1                1/2
4     A     0 0        2     2/4
      B     0 1              2/4
      C     1 0              2/4
      D     1 1              2/4
8     A     0 0 0      3     3/8
      B     0 0 1            3/8
      C     0 1 0            3/8
      D     0 1 1            3/8
      E     1 0 0            3/8
      F     1 0 1            3/8
      G     1 1 0            3/8
      H     1 1 1            3/8
16    A     0 0 0 0    4     4/16
      B     0 0 0 1          4/16
      C     0 0 1 0          4/16
      D     0 0 1 1          4/16
      E     0 1 0 0          4/16
      F     0 1 0 1          4/16
      G     0 1 1 0          4/16
      H     0 1 1 1          4/16
      I     1 0 0 0          4/16
      J     1 0 0 1          4/16
      K     1 0 1 0          4/16
      L     1 0 1 1          4/16
      M     1 1 0 0          4/16
      N     1 1 0 1          4/16
      O     1 1 1 0          4/16
      P     1 1 1 1          4/16
Because log2 tables or software are generally not available, Shannon’s index is calculated either in nats or in
decits. Different units of diversity may be inter-converted as follows:
Bits = 1.4427 x Nats
Bits = 3.3219 x Decits
Nats = 2.3026 x Decits
The units of the diversity index must be mentioned in the calculations. Shannon’s information as explained above can
be used to define the biological diversity of the communities. In the text given above, message length can be equated
with the number of species in a community, an alphabet with a species, and the number of letters of an alphabet with the
frequency of the species. The Shannon’s index thus obtained will be the diversity of the community as represented in the
sample. This index presumes that all the species in the sample (or the quadrat) are represented in the same proportions
as in the larger community (Poole, 1974). The larger the information content (H’), the more heterogeneous the sample
will be. For the analysis of biological communities, Shannon’s index may be written as
$$H'\ (\text{nats}) = -\sum \frac{n_i}{N} \log_e \frac{n_i}{N}$$
ni = Number of individuals of the i th species,
Table 4. Information content (H) of a message if each alphabet occurs only once
Columns: (1) Message length, no. of alphabets (m); (2) Information content (H, bits); (3) Information as log of message
length, H = log2 m; (4) Probability (pi) of each letter in message, pi = 1/m; (5) Information as log of reciprocal of
probability, H = log2 (1/pi); (6) Information as negative log of probability, H = – log2 pi; (7) Shannon’s information
of each letter in message (bits), Hi = – pi log2 pi.
(1)   (2)   (3)             (4)    (5)                     (6)                  (7)
2     1     H = log2 2      1/2    H = log2 (1/(1/2))      H = – log2 (1/2)     Hi = – (1/2) log2 (1/2)
4     2     H = log2 4      1/4    H = log2 (1/(1/4))      H = – log2 (1/4)     Hi = – (1/4) log2 (1/4)
8     3     H = log2 8      1/8    H = log2 (1/(1/8))      H = – log2 (1/8)     Hi = – (1/8) log2 (1/8)
16    4     H = log2 16     1/16   H = log2 (1/(1/16))     H = – log2 (1/16)    Hi = – (1/16) log2 (1/16)
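The equivalence of columns 3, 5 and 6 of Table 4, and the way the per-letter terms of column 7 add back up to H, can be confirmed numerically. This is a minimal Python sketch; the function name `information_columns` and the dictionary keys are hypothetical labels, not notation from the paper.

```python
import math

def information_columns(m):
    """Table 4 quantities for a message of m equiprobable letters."""
    p = 1 / m
    return {
        "log2_m": math.log2(m),             # column 3: H = log2 m
        "log2_recip_p": math.log2(1 / p),   # column 5: H = log2 (1/p)
        "neg_log2_p": -math.log2(p),        # column 6: H = -log2 p
        "per_letter": -p * math.log2(p),    # column 7: Hi = -p log2 p
    }

for m in (2, 4, 8, 16):
    cols = information_columns(m)
    total = m * cols["per_letter"]          # sum of Hi over all m letters
    print(m, cols["log2_m"], cols["per_letter"], total)
```

Summing the per-letter term over all m letters returns log2 m, which is why the Shannon sum over letters (or species) recovers the information content of the whole message.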
Table 5. Example of Shannon’s information content of a message if different letters occur with different frequencies
1 2 3 4 5 6 7 8
D   1   D   1/16   1/16   Hi = – (1/16) log2 (1/16)   1/4
E   1   E   1/16   1/16                               1/4
N = Total number of individuals of all the species in the sample.
Table 6 gives the computation of Shannon’s information measure for plotting a graph. For a probability equal to 0,
since the limit of (0 log 0) is zero, the Shannon’s index at this point will be zero. A typical graph of Shannon’s
diversity index is a downward-concave curve. The graph of Shannon’s information measure is plotted by taking
the probability values on the X-axis, and the H’ values on the Y-axis (Fig. 2).
Table 6. Shannon-Wiener’s index for 2 species communities having different probabilities
pi     1-pi   H’
0      1      0 (Limit)
0.05   0.95   0.198515
0.1    0.9    0.325083
0.2    0.8    0.500402
0.3    0.7    0.610864
0.4    0.6    0.673012
0.5    0.5    0.693147
The minimum value of H’ is 0 for a one-species community, that is, when all the individuals in the sample belong to the
same species. The information content will be maximum (H’ = H’max = ln S) if all the species in the sample are represented
by an equal number of individuals. The more heterogeneous a sample is, the larger the information content will be.
It would be pertinent to mention here that Shannon’s index is variously referred to as the Shannon-Weaver index
or the Shannon-Wiener index. Spellerberg and Fedor (2003) studied the historical perspective of the development of
Shannon’s index. Shannon’s expression for entropy (H) was first published independently by Shannon in 1948. Shannon
and Weaver jointly authored a book in 1949, “The Mathematical Theory of Communication”, published by the University of
Illinois. In the second part of the book Weaver developed the concept further. Shannon had built the concept on the work
already done by Wiener (1939, 1948, 1949). The name Shannon-Weaver came from the jointly written book, and
the name Shannon-Wiener came from the several references cited by Shannon in his work and from the idea having been
developed after Wiener.
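The Table 6 values, the bounds on H’ and the unit conversions quoted earlier can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function name `shannon` and its `base` parameter are hypothetical, and zero counts are skipped in line with the convention that 0 log 0 is taken as 0.

```python
import math

def shannon(counts, base=math.e):
    """Shannon's index H' = -sum(p_i * log(p_i)); nats by default, bits with base=2."""
    total = sum(counts)
    h = 0.0
    for n in counts:
        if n > 0:                      # 0 log 0 is taken as 0
            p = n / total
            h -= p * math.log(p, base)
    return h

# Two-species communities as in Table 6 (proportions may be passed directly)
for p in (0.0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5):
    print(p, round(shannon([p, 1 - p]), 6))

community = [5, 3, 2]                  # hypothetical abundances
h_nats = shannon(community)
h_bits = shannon(community, base=2)
print(h_nats, h_bits, h_bits / h_nats) # the ratio is about 1.4427
```

Passing equal abundances for S species returns ln S, the stated maximum, and a single-species list returns 0, the stated minimum.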
Simpson’s index
Fig. 2. Graph for Shannon’s information measure
Fig. 3. Simpson’s index for concentration (abundance)
Fig. 4. Simpson’s index for diversity
Fig. 5. Relation between Simpson’s concentration and diversity indices.
Fig. 6. Inverse Simpson’s index for diversity
Probability (p1,2) that the second individual drawn from the sample without replacement also belongs to the same species
will be
$$p_{1,2} = \frac{n_i}{n} \cdot \frac{n_i - 1}{n - 1}$$
The sum of such probabilities for all the species is a measure of the concentration (or abundance) (C) of the species.
$$C_{Simpson} = \frac{\sum n_i (n_i - 1)}{n(n - 1)}$$
If the sample size is quite large, then the probability (p1,2) that the second individual drawn from the sample with
replacement also belongs to the same species (C’) will be
$$C'_{Simpson} = \sum p_i^2 = \frac{\sum n_i^2}{n^2}$$
Simpson’s index for abundance (concentration) is a typical convex curve as shown in Fig. 3. The maximum value of
Simpson’s concentration is one when all the individuals in the sample belong to the same species. Gini-Simpson’s index
of diversity (D) is defined as
$$D = 1 - \sum p_i^2$$
of 10 individuals is 1/10. The probability of drawing the second individual of species A, out of the remaining 9
individuals, will be (1/10)(2/9). Given in the table are the steps to understand the derivation of Brillouin’s index.
The negative logarithm of the equation thus derived (or the logarithm of the reciprocal of probability) may be used as
Brillouin’s information measure.
Good’s Series of Indices
The following generalized equation can be used to derive some important indices:
$$H_{Good} = \sum p_i^m (-\ln p_i)^n$$
where different values of m and n (0, 1, 2, 3) give different indices. For example,
For species richness index (S), (m, n) = (0, 0)
For Simpson’s index (Σ pi²), (m, n) = (2, 0)
For Shannon’s index (H’), (m, n) = (1, 1)
Variance as a measure of diversity
Table 7. Calculation of probability of drawing an individual from a community consisting of 3 species and 10 individuals
Columns: (1) Sp.; (2) No. of ind. (ni); (3) Names of ind.; (4) No. of ind. left; (5) No. of ind. of sp. drawn;
(6) Probability of drawing individuals up to the i th individual.
(1)     (2)   (3)   (4)   (5)                                 (6)
A       5     A1    10    1 (1st)                             (1/10)
              A2    9     2 (1st, 2nd)                        (1/10)(2/9)
              A3    8     3 (1st, 2nd, 3rd)                   (1/10)(2/9)(3/8)
              A4    7     4 (1st, 2nd, 3rd, 4th)              (1/10)(2/9)(3/8)(4/7)
              A5    6     5 (1st, 2nd, 3rd, 4th, 5th)         (1/10)(2/9)(3/8)(4/7)(5/6)
B       3     B1    5     1                                   (1/10)(2/9)(3/8)(4/7)(5/6)(1/5)
              B2    4     2                                   (1/10)(2/9)(3/8)(4/7)(5/6)(1/5)(2/4)
              B3    3     3 (5 ind of A and 3 of B)           (1/10)(2/9)(3/8)(4/7)(5/6)(1/5)(2/4)(3/3)
C       2     C1    2     1                                   (1/10)(2/9)(3/8)(4/7)(5/6)(1/5)(2/4)(3/3)(1/2)
              C2    1     2 (5 ind of A, 3 of B and 2 of C)   (1/10)(2/9)(3/8)(4/7)(5/6)(1/5)(2/4)(3/3)(1/2)(2/1)
Total   10    All         10                                  (5! 3! 2!)/10! = (n1! n2! n3!)/N!
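The stepwise product in Table 7 and its negative logarithm can be checked with the following Python sketch. The function names are hypothetical; exact fractions are used for the product, and `math.lgamma` is used for the log-factorials so that larger samples do not overflow. Dividing the resulting measure by N gives the per-individual form in which Brillouin’s index is usually quoted; that normalisation is an addition here and is not stated in the text above.

```python
import math
from fractions import Fraction

def stepwise_probability(counts):
    """Multiply the Table 7 factors k / (individuals left), species by species."""
    remaining = sum(counts)
    prob = Fraction(1)
    for n in counts:
        for k in range(1, n + 1):
            prob *= Fraction(k, remaining)
            remaining -= 1
    return prob

def brillouin_information(counts):
    """Negative log of (n1! n2! ... / N!), i.e. ln N! - sum(ln ni!), in nats."""
    N = sum(counts)
    return math.lgamma(N + 1) - sum(math.lgamma(n + 1) for n in counts)

counts = [5, 3, 2]                      # species A, B and C of Table 7
print(stepwise_probability(counts))     # 1/2520
print(brillouin_information(counts))    # ln 2520, about 7.832 nats
print(brillouin_information(counts) / sum(counts))  # per-individual form
```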
The authors proved that for a population or for a large sample, the sum of squares of probabilities, i.e., Gini-
Simpson’s concentration (C’), will be given as
$$\sum_{i=1}^{n} p_i^2 = \frac{\sigma^2 + M^2}{nM^2}$$
where M is the mean of the sample. Thus variance can be used as a measure of heterogeneity of data for a continuous
variable such as length, height, weight, concentration etc. The relation between the variance of a small sample (S²) and
that of a population or large sample (σ²) is,
$$\sigma^2 = \frac{S^2 (n - 1)}{n}$$
Therefore, for a small sample, the Simpson’s index will be
$$\sum_{i=1}^{n} p_i^2 = \frac{\frac{S^2 (n - 1)}{n} + M^2}{nM^2}$$
In the same way, Simpson’s index can be found from the standard error (SE) and the coefficient of variation (CV).
$$\sum_{i=1}^{n} p_i^2 = \frac{SE^2 (n - 1) + M^2}{nM^2}$$
Information Measures via Matrix Methods
Sarangal et al. (2012) proved that in several cases a probability matrix can be produced by multiplication of
a column vector ƒ1 (p1) with a row vector ƒ2 (p2). Either diagonal or non-diagonal elements, or both of these elements
of the probability matrix, can be used to derive the information measures which can be used as measures of diversity.
$$I = \sum_{i=1}^{n} f_1(p_{1i}) f_2(p_{2i})$$
The probability matrix generated will be
$$\begin{pmatrix} f_1(p_{11}) f_2(p_{21}) & \cdots & f_1(p_{11}) f_2(p_{2n}) \\ \vdots & \ddots & \vdots \\ f_1(p_{1n}) f_2(p_{21}) & \cdots & f_1(p_{1n}) f_2(p_{2n}) \end{pmatrix}$$
In the above matrix, if both the row and column vectors comprise pi elements, the sum of the diagonal elements will
produce Simpson’s concentration (Simpson’s index).
$$C'_{Simpson} = \sum_{i=1}^{n} p_i p_i = \sum_{i=1}^{n} p_i^2$$
Similarly, non-diagonal elements of this probability matrix can also be used to derive new information measures
(I),
which gives Gini-Simpson’s index
Similarly, in the matrix given above, if either the row or the column vector comprises pi elements and the other vector
comprises ln pi elements, the negative sum of the diagonal elements will produce Shannon’s index of diversity.
$$H'\ (\text{nats}) = -\sum p_i \log_e p_i$$
Using matrix methods we can derive several other measures of information which can be used as measures of
concentration or diversity.
Evenness indices
The evenness (equitability) of a sample implies equality in the number of individuals of species (Pielou, 1975). Some
of these indices are given below. These are:
where S is the number of species, H’ is the Shannon-Wiener’s index, Dmax and Dmin represent the maximum and minimum
values of Simpson’s reciprocal diversity index.
Effective number of species: The number of equally common species in a sample is called the effective number
of species. This index characterizes the true diversity of a community (Jost 2006b).
$$\text{Effective no. of species} = e^{H'}$$
For example, if the Shannon’s index is 2 nats, this implies that the true diversity is 7.39 species. For information in
bits, the effective number of species can be calculated with base 2, that is (2^H’).
Conclusion
Alpha diversity indices are extensively used in characterisation of communities in biology (Chawla et al.
2008, 2012) and in other research fields, such as to describe the vertical structure of forest ecosystems, succession
of communities (Petrere Jr. et al., 2004), rainfall data (Bronikowski, 1996), language studies, and water and energy
studies (Singh 2013). Since diversity indices are not well explained in the texts, their use in literature has been
reduced to just a convention. This paper may help researchers to understand the fundamentals of measurement of
diversity and to apply their research results for characterization of plant communities more justifiably.
LITERATURE CITED
Baumgartner S 2005. Measuring biodiversity of what and for what purpose? University of Heidelberg, Germany.
http://www.bio-nica.info/biblioteca/Baumgartner2005Biodiversity.pdf
Bronikowski A and Webb C 1996. Appendix: A critical examination of rainfall variability measures used in behavioral
ecology. Beh Ecol Sociobiol 39: 27-30.
Chao A 1984. Non-parametric estimation of the number of classes in a population. Scandinavian Jour Stat 11: 265-70.
Chao A 1987. Estimating the population size for capture-recapture data with unequal catchability. Biometrics 43: 783-91.
Chawla A, Kumar A, Lal B and Singh RD 2012. Ecological characterization of high altitude Himalayan landscapes in the
upper Satluj river watershed in Kinnaur, Himachal Pradesh, India. Jour Indian Soc Rem Sens 40: 519-39.
Chawla A, Rajkumar S, Singh KN, Lal B, Singh RD and Thukral AK 2008. Plant species diversity along an altitudinal gradient
of Bhabha valley in western Himalaya. J Mountain Science 5: 157-77.
Greig-Smith P 1978. Quantitative Plant Ecology. Butterworths, London.
Publishing, Oxford.
Jost L 2006a. The new synthesis of diversity indices, alpha, beta and gamma diversity and similarity measures. Oikos
preprint.
Jost L 2006b. Entropy and diversity. Oikos 113: 363-75.
Jost L 2010. The relation between evenness and diversity. Diversity 2: 217-32.
Keylock CJ 2005. Simpson Diversity and the Shannon–Wiener index as a special case of a generalized entropy. Oikos 109:
203-207.
McIntosh RP 1967. An Index of diversity and the relation of certain concepts to diversity. Ecology 48: 115-26.
Meffe GK, Nielson LA, Knight RL and Schenborn DA 2002. Ecosystem management: Adaptive community based
conservation. Island Press, Washington D.C.
Magurran AE 1988. Ecological diversity and its measurement. Chapman and Hall, London.
Odum HT, Cantlon JE and Kornicker LS 1960. An organizational hierarchy postulate for the interpretation of species –
individual distribution, species entropy, ecosystem evolution and the measuring of species variety index. Ecology 41:
395-99.
Parker KR 1979. Density estimation by variable area transect. J Wildlife Management 43: 484-92.
Parkash O and Thukral AK 2010. Statistical measures as measures of diversity. International J Biomathematics 32: 173-85.
Peet RK 1974. The measurement of species diversity. Ann Rev Ecol Syst 5: 255-307.
Petrere Jr M, Giordano LC and De Marco Jr P 2004. Empirical diversity indices applied to forest communities in different
successional stages. Brazilian J Biol 64: 1-12.
Pielou EC 1975. Ecological Diversity. Wiley, New York.
Ponce-Hernandez R 2004. Assessing carbon stock and modeling win-win scenarios of carbon registration through land use
changes. FAO, Rome. Chapter IV.
Poole RW 1974. An Introduction to Quantitative Ecology. McGraw Hill, New York.
Routledge RD 1977. On Whittaker’s components of diversity. Ecology 58: 120-27.
Sarangal M, Buttar GS and Thukral AK 2012. Generating information measures via matrix over probability spaces.
Intern Jour Pure Appl Math 81: 723-35.
Shannon CE 1948. A mathematical theory of communication. Bell Sys Tech J 27: 379-423, 623-59.
Shannon CE and Weaver W 1949. A Mathematical Theory of Communication. University of Illinois Press, New York.
Simpson EH 1949. Measurement of diversity. Nature 163: 688.
Singh VP 2013. Entropy Theory and its Applications in Environmental and Water Engineering. John Wiley & Sons
Smith B 1986. Evaluation of Different Similarity Indices Applied to
Spellerberg IF and Fedor PJ 2003. A tribute to Claude Shannon 1916-2001 and a plea for more rigorous use of species
richness, species diversity and the ‘Shannon-Wiener’ index. Global Ecol Biogeog 12: 177-79.
Thukral AK 2010. Measurement of diversity in characterisation of plant communities. In: Information Theory and
Optimisation Techniques in Scientific Research Ed. Om Parkash. VDM Verlag, Saarbrücken. pp. 89-98.
Thukral AK, Chawla A and Samson MO 2006. Measurement of biodiversity. In: Proc. 6th National Workshop on Environment
Statistics. Central Statistical Organization, Ministry of Statistics and Programme Implementation, Govt. of India, New
Delhi, 244-56.
Whittaker RH 1972. Evolution and measurement of species diversity. Taxon 21: 213-51.
Wiener N 1939. The ergodic theorem. Duke Mathematical Journal (cf Spellerberg and Fedor, 2003).
Wiener N 1948. Cybernetics. Wiley, New York.
Wiener N 1949. The extrapolation, interpolation and smoothing of stationary time series with engineering applications.
John Wiley and Sons, New York.
Wikipedia 2017a. Biodiversity. http://en.wikipedia.org/wiki/
Wikipedia 2017b. Diversity index. http://en.wikipedia.org/wiki/