
Diversity indices

Index of diversity

The index of diversity (also referred to as the Index of Variability) is a commonly used measure,
in demographic research, to determine the variation in categorical data.

The most common index of diversity measure was created by Gibbs and Martin (Gibbs, Jack P.,
and William T. Martin, 1962. “Urbanization, technology and the division of labor.” American Sociological
Review 27: 667–77); it is also referred to by Judith Blau (Group Enmity and Accord: The Commercial Press in
Three American Cities, Social Science History 24.2, 2000: 395-413):

$$D = 1 - \sum_{i=1}^{N} p_i^2$$

where p_i = the proportion of individuals or objects in category i, and

N = the number of categories.

A perfectly homogeneous population would have a diversity index score of 0. A perfectly
heterogeneous population would have a score approaching 1 (assuming infinitely many categories with equal
representation in each category). For N equally represented categories the maximum score is 1 - 1/N, so the
maximum value of the diversity index increases with the number of categories (e.g., 4 categories at 25%
each give .75, 5 categories at 20% each give .8, etc.).

An example of the use of the index of diversity would be a measure of racial diversity in a city.
Thus, if Sunflower City was 85% white and 15% black, the index of diversity would be: .255.

The interpretation of the diversity index score would be that the population of Sunflower City is
not very heterogeneous but is also not homogeneous.
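
The Sunflower City figure can be checked with a minimal sketch, assuming the index has the form 1 minus the sum of squared category proportions. The function name `index_of_diversity` is a hypothetical choice for illustration and is not taken from the source.

```python
def index_of_diversity(proportions):
    """Return 1 - sum(p_i^2) for category proportions that sum to 1."""
    total = sum(proportions)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return 1.0 - sum(p * p for p in proportions)


# Sunflower City: 85% in one category, 15% in the other.
sunflower_city = index_of_diversity([0.85, 0.15])  # approximately 0.255

# Maximum scores for equally represented categories.
four_equal = index_of_diversity([0.25] * 4)  # approximately 0.75
five_equal = index_of_diversity([0.20] * 5)  # approximately 0.8
```

The same function reproduces the equal-category maxima quoted earlier, which is a quick way to confirm that the reconstructed formula matches the text.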

Diversity index

In ecology, a diversity index is a statistic which is intended to measure the biodiversity of an
ecosystem. More generally, diversity indices can be used to assess the diversity of any population in which
each member belongs to a unique species. Estimators for diversity indices are likely to be biased, so caution
is advisable when comparing similar values.

Measures of diversity

Alternative ways to measure biodiversity include:

• Alpha diversity (α-diversity) – the biodiversity within a particular area, community or ecosystem.
• Beta diversity – species diversity between ecosystems; this involves comparing the number of taxa
that are unique to each of the ecosystems.
• Gamma diversity – taxonomic diversity of a region with several ecosystems.
• Phylogenetic diversity (or 'Omega diversity') – the differences or diversity between taxa.
• Global diversity – overall biodiversity of Earth.
Diversity may not be congruent at all taxonomic levels and diversity patterns may vary depending
on the type of diversity measured, as seen in the example to the left.

Alpha diversity

Alpha diversity (α-diversity) is the biodiversity within a particular area, community or
ecosystem, and is usually expressed as the species richness of the area. This can be measured by counting
the number of taxa (distinct groups of organisms) within the ecosystem (e.g., families, genera, species).
However, such estimates of species richness are strongly influenced by sample size, so a number of
statistical techniques can be used to correct for sample size to get comparable values.

It includes the following indices:

• Simpson's Diversity Index
• Shannon index
• Fisher's Alpha
• Rarefaction

Simpson's Diversity Index

$$D = \frac{\sum_{i=1}^{S} n_i (n_i - 1)}{N (N - 1)}$$

where S is the number of species, N is the total percentage cover or total number of organisms and n_i is the
percentage cover of species i or the number of organisms of species i.

Shannon index

$$H' = -\sum_{i=1}^{S} p_i \ln p_i$$

* n_i The number of individuals in each species; the abundance of each species.
* S The number of species. Also called species richness.
* N The total number of all individuals.
* p_i The relative abundance of each species, calculated as the proportion of individuals of a given
species to the total number of individuals in the community: p_i = n_i / N

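Fisher's Alpha

$$S = \alpha \ln\left(1 + \frac{N}{\alpha}\right)$$

where S is the number of species, N is the total number of individuals and α is Fisher's alpha.

Rarefaction

Rarefaction takes hypothetical subsamples of n organisms from the more-sampled region, and
calculates the average number of species in such subsamples. This average can be compared to the number
of species actually found in the less-sampled region.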
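
The averaging over hypothetical subsamples can be done in closed form instead of by simulation. The sketch below assumes subsamples are drawn without replacement and uses the standard hypergeometric rarefaction formula (commonly attributed to Hurlbert), which the text does not name. The function name and the example counts are illustrative.

```python
from math import comb


def rarefied_richness(abundances, n):
    """Expected number of species in a random subsample of n individuals
    drawn without replacement from a sample with the given species counts."""
    total = sum(abundances)
    if not 0 <= n <= total:
        raise ValueError("subsample size must be between 0 and the total count")
    all_draws = comb(total, n)
    # A species is missing from the subsample when all n individuals
    # are drawn from the other species; subtract that probability from 1.
    return sum(1 - comb(total - a, n) / all_draws for a in abundances)


# Hypothetical counts per species in the more-sampled region.
more_sampled = [50, 30, 15, 5]
# Expected richness if only 20 individuals had been collected there.
expected_at_20 = rarefied_richness(more_sampled, 20)
```

The value of `expected_at_20` is the figure that would be compared with the number of species observed in a less-sampled region containing 20 individuals.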

Beta diversity

Beta diversity (β-diversity) is a measure of biodiversity which works by comparing the species
diversity between ecosystems or along environmental gradients. This involves comparing the number of
taxa that are unique to each of the ecosystems.

It is the rate of change in species composition across habitats or among communities. It gives a
quantitative measure of diversity of communities that experience changing environments. See alpha
diversity, gamma diversity, global diversity.

It includes the following indices:

1. Sørensen's similarity index
2. Whittaker's measure

• Sørensen's similarity index[1]

$$\beta = \frac{2c}{S_1 + S_2}$$

where S_1 = the total number of species recorded in the first community, S_2 = the total number of species
recorded in the second community, and c = the number of species common to both communities. The
Sørensen index is a very simple measure of beta diversity, ranging from a value of 0 where there is no
species overlap between the communities, to a value of 1 when exactly the same species are found in both
communities.

• Whittaker's measure[2]

$$\beta = \frac{S}{\bar{\alpha}}$$

where S = the total number of species recorded in both communities, and \bar{\alpha} = the average
number of species found within the communities.
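
Both measures can be computed directly from species lists. The sketch below assumes the Sørensen index has the form 2c / (S1 + S2) and Whittaker's measure the form S divided by the mean number of species per community. The function names and example species are hypothetical.

```python
def sorensen_index(community_1, community_2):
    """2c / (S1 + S2), where c is the number of shared species."""
    s1, s2 = set(community_1), set(community_2)
    if not s1 and not s2:
        raise ValueError("at least one community must contain a species")
    shared = len(s1 & s2)
    return 2 * shared / (len(s1) + len(s2))


def whittaker_beta(community_1, community_2):
    """Total species across both communities divided by the mean richness."""
    s1, s2 = set(community_1), set(community_2)
    if not s1 and not s2:
        raise ValueError("at least one community must contain a species")
    total_species = len(s1 | s2)
    mean_alpha = (len(s1) + len(s2)) / 2
    return total_species / mean_alpha


# Hypothetical species lists for two communities.
woodland = {"oak", "ash", "elm"}
hillside = {"oak", "elm", "birch", "pine"}
similarity = sorensen_index(woodland, hillside)  # 2 * 2 / (3 + 4)
turnover = whittaker_beta(woodland, hillside)    # 5 / 3.5
```

Under these assumed forms, identical communities give a Sørensen value of 1 and a Whittaker value of 1, while communities with no shared species give 0 and 2 respectively.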

Gamma diversity

Gamma diversity (γ-diversity) is a measure of biodiversity. It refers to the total biodiversity over
a large area or region, and combines α and β diversity.
According to Whittaker (1972), gamma diversity is the richness in species of a range of habitats in
a geographic area (e.g., a landscape, an island) and it is consequent on the alpha diversity of the individual
communities and the range of differentiation or beta diversity among them. Like alpha diversity, it is a
quality which simply has magnitude, not direction, and can be represented by a single number (a scalar).

The internal relationship between alpha, beta and gamma diversity can be represented as

$$\beta = \frac{\gamma}{\alpha}$$


Phylogenetic diversity

The two species of Tuatara are separated from all other species by over 200 million years

Phylogenetic diversity or omega diversity is a measure of biodiversity which incorporates
taxonomic difference between species. It is defined and calculated as "the sum of the lengths of all the
branches that are members of the corresponding minimum spanning path" [1], in which 'branch' is a segment
of a cladogram, and the minimum spanning path is the minimum distance between the two nodes.

This definition is distinct from earlier measures which attempted to incorporate phylogenetic
diversity into conservation planning, such as the measure of 'taxic diversity' introduced by Vane-Wright,
Humphries, and Williams [2].

The concept of phylogenetic diversity has been rapidly adopted in conservation planning, with
programs such as the Zoological Society of London's EDGE of Existence programme focused on
evolutionarily distinct species. Similarly, the WWF's Global 200 also includes unusual evolutionary
phenomena in its criteria for selecting target ecoregions.

Some studies have indicated that alpha diversity is a good proxy for phylogenetic diversity,
suggesting that the term has little use [3], but a study in the Cape floristic region showed that using
phylogenetic diversity led to the selection of different conservation priorities than using species richness.
The authors also demonstrated that PD led to greater preservation of 'feature diversity' than species
richness alone.
Global biodiversity

The biodiversity of planet Earth is the total variability of life forms. Currently about 1.6
million species are known, but this is thought to be a serious underestimate of the total number of species.
Threats to global biodiversity include natural extinction, an event that occurs to species yearly, as well as
human actions such as pollution. Invasion of non-native species can also have a negative effect on global
biodiversity.
The numbers of identified modern species as of 2004 can be broken down as follows:[1]

• 287,655 plants, including:
    o 15,000 mosses,
    o 13,025 ferns,
    o 980 gymnosperms,
    o 199,350 dicotyledons,
    o 59,300 monocotyledons;
• 74,000-120,000 fungi;[2]
• 10,000 lichens;
• 1,250,000 animals, including:
    o 1,190,200 invertebrates:
        ▪ 950,000 insects,
        ▪ 70,000 mollusks,
        ▪ 40,000 crustaceans,
        ▪ 130,200 others;
    o 58,808 vertebrates:
        ▪ 29,300 fish,
        ▪ 5,743 amphibians,
        ▪ 8,240 reptiles,
        ▪ 10,234 birds (9,799 extant as of 2006),
        ▪ 5,416 mammals.

Estimates of total number of species

However, the total number of species for some phyla may be much higher:

• 10-30 million insects;[3]
• 5-10 million bacteria;[4]
• 1.5 million fungi;[2]
• ~1 million mites.[5]

One early estimate by Terry Erwin put global species richness at 30 million, following extrapolations
from the numbers of beetles found in a species of tropical tree. In one species of tree, Erwin identified 1200
species of beetle, of which he estimated 163 were found only in that tree. Based on the 50,000 species of
tropical tree, this would suggest (163 × 50,000 ≈ 8.15 million host-specific species) that there are almost
10 million species of beetle in the tropics.

Species richness

The species richness S is simply the number of species present in an ecosystem. This index makes
no use of relative abundances.

Species richness is the number of species in a given area. It is represented in equation form as S.
Typically, species richness is used in conservation studies to determine the sensitivity of
ecosystems and their resident species. The actual number of species calculated alone is largely an arbitrary
number. These studies, therefore, often develop a rubric or measure for valuing the species richness
number(s) or adopt one from previous studies on similar ecosystems.

Factors affecting species richness

There is a strong inverse correlation in many groups between species richness and latitude: the
farther from the equator, the fewer species can be found, even when compensating for the reduced surface
area in higher latitudes due to the spherical geometry of the earth. Equally, as altitude increases, species
richness decreases, indicating an effect of area, available energy, isolation and/or zonation (intermediate
elevations can receive species from higher and lower).

Latitudinal gradient

• Species richness increases from high latitudes to low latitudes.
• The peak of species richness is not at the Equator, however. It is inferred that the peak is between
20-30°N.[citation needed]

The gradient of species richness is asymmetrical about the equator. The level of species richness
increases rapidly from the northern region but decreases slowly from the equator to the southern region.

Area effect

The latitudinal gradients of the species richness may result from the effect of area. The area at
lower latitudes is larger than that at higher latitudes, leading to higher species richness at lower latitudes.


The latitudinal gradients of species richness may result from the energy available to the
ecosystems. At lower latitudes, there are higher amounts of energy available because of more solar
radiation and more resources (for example, minerals and water); as a result, higher levels of species richness
can be supported at lower latitudes. However, there have been relevant studies showing that species richness
and primary productivity are actually negatively correlated [1].

The Millennium Ecosystem Assessment, an international ecological effort initiated by the United
Nations, states:

"In most ecosystems, changes in the number of species are the consequences of changes in major
abiotic and disturbance factors, so that the ecosystem effects of species richness (number of species) per se
is expected to be both comparatively small and very difficult to isolate. For example, variation in primary
productivity depends strongly on temperature and precipitation at the global scale and on soil resources and
disturbance regime at the region-to-landscape-to-local scales. Factors that increase productivity, such as
nutrient addition, often lead to lower species richness because more productive species outcompete less
productive ones. In nature, therefore, high species diversity and high productivity are often not positively

The species-area relationship is commonly approximated by the following equation: $S = cA^z$, or
$\log(S) = \log(c) + z \log(A)$, where S is the number of species, reflecting the species richness (sometimes
also called species diversity), A is the area given in hectares, and c and z are constants. c is the species
richness factor, usually between 20 and 2000; z is the species accumulation factor, usually between 0.2 and
0.5. This equation was first described by Arrhenius in 1921 [2] and explains the variation of species richness
among different areas [3].
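
The log-linear form is what makes the two constants easy to estimate: a straight-line fit of log(S) against log(A) gives z as the slope and log(c) as the intercept. The sketch below is one way to do this with ordinary least squares. The function names and the synthetic data (generated with c = 20 and z = 0.25) are assumptions for illustration, not measurements.

```python
import math


def species_from_area(area_ha, c, z):
    """Power-law species-area relationship S = c * A**z."""
    return c * area_ha ** z


def fit_species_area(areas_ha, species_counts):
    """Least-squares fit of log(S) = log(c) + z*log(A); returns (c, z)."""
    xs = [math.log(a) for a in areas_ha]
    ys = [math.log(s) for s in species_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    z = sxy / sxx                      # slope of the log-log line
    c = math.exp(mean_y - z * mean_x)  # intercept, transformed back
    return c, z


# Synthetic data that follow the power law exactly.
areas = [1, 10, 100, 1000]
counts = [species_from_area(a, c=20, z=0.25) for a in areas]
c_hat, z_hat = fit_species_area(areas, counts)
```

With real survey data the points will scatter around the line, so the fitted values should be read as estimates rather than exact constants.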


Species richness may not really be related to area but may rather be a statistical artifact: more
species may be recorded simply because more samples are collected in a larger area.[citation needed]

Habitat diversity

It is possible that larger areas contain more habitats, since larger areas are said to be more
topographically and environmentally diverse. Therefore, there are more opportunities for more species to
establish populations due to higher habitat diversity.

Relationship between endemism and species richness

The levels of endemism and that of species richness are frequently positively correlated. However,
on some oceanic islands, there are high levels of endemism but the levels of species richness are quite low.

Other methods for measuring biodiversity

Adjusting the species richness

The most common formula for working out species diversity is Simpson's diversity index,
which uses the following formula:

$$D = \frac{N(N-1)}{\sum n(n-1)}$$


• D = diversity index
• N = Total number of organisms of all species found
• n = number of individuals of a particular species

A high D value suggests a stable and ancient site, while a low D value could suggest a polluted site,
recent colonisation or agricultural management.

Usually used in studies of vegetation but can also be applied to animals.

In order to account for the probability of missing some of the actual total number of species
present in any count based on a sample population, the Jackknife estimate may be employed:

• S = species richness
• n = total number of species present in sample population
• k = number of "unique" species (of which only one organism was found in sample population)

Similarly the equation may also be noted as:

$$S = E + k \, \frac{n-1}{n}$$

• E = the summation of number of species in each sample
• k = number of rare/unique species
• n = number of samples
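
A minimal sketch of a jackknife richness estimate follows. It uses the standard first-order jackknife, observed richness plus k(n - 1)/n, and makes two assumptions about the symbols above: E is read as the number of distinct species observed across all samples, and k as the number of species found in exactly one sample. The function name and data are hypothetical.

```python
from collections import Counter


def jackknife_richness(samples):
    """First-order jackknife estimate of species richness.

    samples: a list of collections, each holding the species seen in one sample.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    # Count in how many samples each species occurs.
    occurrences = Counter(sp for sample in samples for sp in set(sample))
    observed = len(occurrences)
    uniques = sum(1 for hits in occurrences.values() if hits == 1)
    return observed + uniques * (n - 1) / n


# Three hypothetical samples; "c" and "d" each occur in only one of them.
plots = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}]
estimate = jackknife_richness(plots)  # 4 observed + 2 * (2 / 3)
```

The estimate always lies at or above the observed richness, reflecting the idea that species seen only once hint at others that were missed entirely.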

As well, when looking at local diversity the appropriate formula to use is:

$$S = cA^z$$

• S = species richness
• c = a specific number for each taxon
• A = the area of study
• z = the slope parameter

Other measures of biodiversity may also take into account the rarity of the taxa, and the amount of
evolutionary novelty they embody.


As a measure of biodiversity, species richness suffers from the lack of a good definition of
"species." There are at least 7 definitions, each with its own strengths and weaknesses. Still, it is easy to
measure, and is well studied.

Species richness fails to take into consideration species evenness. Other measures of biodiversity,
such as the Simpson index, the Shannon index, and the fundamental biodiversity parameter θ of the unified
neutral theory of biodiversity take species evenness into consideration.

Species evenness

Species evenness is a diversity index, a measure of biodiversity which quantifies how equal the
species in a community are numerically. So if there are 40 foxes and 1000 dogs, the community is not very
even. But if there are 40 foxes and 42 dogs, the community is quite even. The evenness of a community can be
represented by Pielou's evenness index:

$$E = \frac{H'}{H'_{max}}$$

where H' is the number derived from the Shannon diversity index and H'_max is the maximum value of H',
equal to:

$$H'_{max} = -\sum_{i=1}^{S} \frac{1}{S} \ln \frac{1}{S} = \ln S$$

E is constrained between 0 and 1. The less variation in communities between the species, the higher E is.
Other indices have been proposed by authors where H'min > 0, e.g. Hurlbert's evenness index.
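
The fox and dog example can be worked through numerically. The sketch below assumes Pielou's index is the Shannon index (natural logarithm) divided by ln S, where S is the number of species present. The function names are illustrative.

```python
import math


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i) computed from raw counts."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def pielou_evenness(counts):
    """Pielou's evenness: H' divided by its maximum, ln(S)."""
    species = sum(1 for n in counts if n > 0)
    if species < 2:
        raise ValueError("evenness is undefined for fewer than two species")
    return shannon_index(counts) / math.log(species)


uneven = pielou_evenness([40, 1000])  # 40 foxes, 1000 dogs: well below 1
even = pielou_evenness([40, 42])      # 40 foxes, 42 dogs: very close to 1
```

The first community scores far lower than the second, matching the qualitative description of "not very even" versus "quite even".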

Simpson's diversity index

If p_i is the fraction of all organisms which belong to the i-th species, then Simpson's diversity
index is most commonly defined as the statistic

$$D = \sum_{i=1}^{S} p_i^2$$

This quantity was introduced by Edward Hugh Simpson.

If n_i is the number of individuals of species i which are counted, and N is the total number of all
individuals counted, then

$$\frac{\sum_{i=1}^{S} n_i (n_i - 1)}{N (N - 1)}$$

is an estimator for Simpson's index for sampling without replacement.

Note that $0 \le D \le 1$, with values near zero corresponding to highly diverse or
heterogeneous ecosystems and values near one corresponding to more homogeneous ecosystems.
Biologists who find this confusing sometimes use 1 / D instead; confusingly, this reciprocal quantity is also
called Simpson's index. Another response is to redefine Simpson's index as

$$1 - D = 1 - \sum_{i=1}^{S} p_i^2$$

This quantity is called by statisticians the index of diversity.
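
Because several different quantities share the name "Simpson's index", it can help to compute them side by side. The sketch below assumes D is the sum of squared proportions and that the estimator for sampling without replacement is the sum of n_i(n_i - 1) divided by N(N - 1). The function names are hypothetical.

```python
def simpson_index(counts):
    """D = sum(p_i^2): probability two draws with replacement match."""
    total = sum(counts)
    return sum((n / total) ** 2 for n in counts)


def simpson_estimator(counts):
    """sum(n_i * (n_i - 1)) / (N * (N - 1)): sampling without replacement."""
    total = sum(counts)
    if total < 2:
        raise ValueError("at least two individuals are required")
    return sum(n * (n - 1) for n in counts) / (total * (total - 1))


counts = [40, 42]                # individuals counted per species
d = simpson_index(counts)        # near 0.5 for two nearly equal species
d_hat = simpson_estimator(counts)
reciprocal = 1 / d               # the "1 / D" variant
complement = 1 - d               # the "index of diversity" variant
```

For large samples the two versions of D are nearly identical; the difference matters mainly when counts are small.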

In sociology, psychology and management studies the index is often known as Blau's Index, as it
was introduced into the literature by the sociologist Peter Blau.

Shannon's diversity index

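Shannon's diversity index is simply the ecologist's name for the communication entropy
introduced by Claude Shannon:

$$H' = -\sum_{i=1}^{S} p_i \log p_i$$

where p_i is the fraction of individuals belonging to the i-th species (the logarithm is taken to base 2 for
the codeword interpretation that follows; ecologists often use the natural logarithm instead). This is by far
the most widely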
used diversity index. The intuitive significance of this index can be described as follows. Suppose we
devise binary codewords for each species in our ecosystem, with short codewords used for the most
abundant species, and longer codewords for rare species. As we walk around and observe individual
organisms, we call out the corresponding codeword. This gives a binary sequence. If we have used an
efficient code, we will be able to save some breath by calling out a shorter sequence than would otherwise
be the case. If so, the average codeword length we call out as we wander around will be close to the
Shannon diversity index.
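
The codeword picture can be made concrete with Huffman coding, one well-known way of building an efficient binary code. The text does not name a specific code, so this choice is an assumption. For any distribution, the average Huffman codeword length is at least the entropy in bits and less than the entropy plus one. The function names and example abundances are illustrative.

```python
import heapq
import math
from itertools import count


def shannon_bits(probabilities):
    """Shannon entropy in bits (logarithm to base 2)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def huffman_code_lengths(probabilities):
    """Codeword length in bits that Huffman coding assigns to each symbol."""
    if len(probabilities) < 2:
        raise ValueError("at least two symbols are required")
    tie = count()  # tie-breaker so the heap never compares the symbol lists
    heap = [(p, next(tie), [i]) for i, p in enumerate(probabilities)]
    heapq.heapify(heap)
    lengths = [0] * len(probabilities)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        merged = group1 + group2
        for symbol in merged:
            lengths[symbol] += 1  # each merge adds one bit to these codewords
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return lengths


# Hypothetical relative abundances of four species.
abundances = [0.5, 0.25, 0.125, 0.125]
lengths = huffman_code_lengths(abundances)   # [1, 2, 3, 3]
average = sum(p * n for p, n in zip(abundances, lengths))
entropy = shannon_bits(abundances)           # 1.75 bits, equal to the average here
```

In this example the abundances are powers of one half, so the average codeword length matches the entropy exactly; for other distributions it is slightly larger.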

It is possible to write down estimators which attempt to correct for bias in finite sample sizes, but
this would be misleading since communication entropy does not really fit expectations based upon
parametric statistics. Differences arising from using two different estimators are likely to be overwhelmed
by errors arising from other sources. Current best practice tends to use bootstrapping procedures to estimate
communication entropy.
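
A bootstrapping procedure of the kind mentioned above might look like the following sketch. It resamples the observed individuals with replacement and reports a simple percentile interval for the Shannon index. The percentile method, the number of replicates, the seed, and the function names are all assumptions made for illustration.

```python
import math
import random


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i) computed from raw counts."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def bootstrap_shannon(counts, replicates=1000, seed=0):
    """Approximate 95% percentile interval for H' by resampling individuals."""
    rng = random.Random(seed)
    # One entry per observed individual, labelled by its species index.
    individuals = [sp for sp, n in enumerate(counts) for _ in range(n)]
    estimates = []
    for _ in range(replicates):
        resample = rng.choices(individuals, k=len(individuals))
        resampled_counts = [0] * len(counts)
        for sp in resample:
            resampled_counts[sp] += 1
        estimates.append(shannon_index(resampled_counts))
    estimates.sort()
    lower = estimates[int(0.025 * replicates)]
    upper = estimates[int(0.975 * replicates) - 1]
    return lower, upper


low, high = bootstrap_shannon([40, 42, 10], replicates=500, seed=1)
```

The width of the interval gives a rough sense of how much the index could change under resampling, which is useful when comparing two communities with similar values.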

Shannon himself showed that his communication entropy enjoys some powerful formal properties,
and furthermore, it is the unique quantity which does so. These observations are the foundation of its
interpretation as a measure of statistical diversity (or "surprise", in the arena of communications). The
applications of this quantity go far beyond the one discussed here; see the textbook cited below for an
elementary survey of the extraordinary richness of modern information theory.

Shannon index

The Shannon index (also known as the Shannon-Weaver Index and sometimes referred to as the
Shannon-Wiener Index [1]), H', is one of several diversity indices used to measure diversity in categorical
data. It is simply the information entropy of the distribution, treating species as symbols and their relative
population sizes as the probability.

This article treats its use in the measurement of biodiversity. The advantage of this index is that it
takes into account the number of species and the evenness of the species. The index is increased either by
having additional unique species, or by having a greater species evenness.

The "Shannon-Weaver" name is a misnomer; apparently some biologists jumped to the conclusion
that Warren Weaver, author of an influential preface to the book form [2] of Claude Shannon's 1948
paper [3] founding information theory, was a cofounder of this theory. Weaver did play a crucial role in the
rapid postwar development of information theory in a different way, however; as an influential early
administrator of the Rockefeller Foundation, he ensured that the first information theorists received
generous research grants. Norbert Wiener had no hand in the index either, although his influential
popularisation of cybernetics was often conflated with information theory in the 1950s.

Berger-Parker index

The Berger-Parker diversity index is simply

$$\max_{i} p_i$$

the relative abundance of the most abundant species. This is an example of an index which uses only
partial information about the relative abundances of the various species in its definition.

Rényi entropy

The species richness, the Shannon index, Simpson's index, and the Berger-Parker index can all be
identified as particular examples of quantities bearing a simple relation to the Rényi entropy,

$$H_\alpha = \frac{1}{1 - \alpha} \log \sum_{i=1}^{S} p_i^\alpha$$

for α approaching 0, 1, 2, ∞ respectively.

Unfortunately, the powerful formal properties of communication entropy do not generalize to
Rényi's entropy, which largely explains the much greater power and popularity of Shannon's index with
respect to its competitors.
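
The relation to the four indices can be checked numerically. The sketch below assumes the Rényi entropy of order α has the form log(sum of p_i^α) / (1 - α), with the cases α = 1 and α = ∞ handled as limits. Under that assumption, order 0 gives the logarithm of species richness, order 1 the Shannon index, order 2 the negative logarithm of Simpson's D, and order ∞ the negative logarithm of the Berger-Parker index. The function name and example abundances are illustrative.

```python
import math


def renyi_entropy(probabilities, alpha):
    """Rényi entropy of order alpha, using natural logarithms."""
    probs = [p for p in probabilities if p > 0]
    if alpha == 1:
        # Limit alpha -> 1 is the Shannon entropy.
        return -sum(p * math.log(p) for p in probs)
    if math.isinf(alpha):
        # Limit alpha -> infinity depends only on the most abundant species.
        return -math.log(max(probs))
    return math.log(sum(p ** alpha for p in probs)) / (1 - alpha)


abundances = [0.5, 0.3, 0.2]  # hypothetical relative abundances
log_richness = renyi_entropy(abundances, 0)           # ln(3)
shannon = renyi_entropy(abundances, 1)
neg_log_simpson = renyi_entropy(abundances, 2)        # -ln(sum of p_i^2)
neg_log_berger_parker = renyi_entropy(abundances, math.inf)  # -ln(0.5)
```

The values decrease as α grows, since higher orders give progressively more weight to the most abundant species.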


The Shannon index is computed as

$$H' = -\sum_{i=1}^{S} p_i \ln p_i$$

where

• n_i The number of individuals in species i; the abundance of species i.
• S The number of species. Also called species richness.
• N The total number of all individuals.
• p_i The relative abundance of each species, calculated as the proportion of individuals of a given
species to the total number of individuals in the community: p_i = n_i / N

Computing the index

It can be shown that for any given number of species, there is a maximum possible H', H_max = ln S,
which occurs when all species are present in equal numbers.

Proof that maximum evenness maximizes the index

The following will prove that any given population will have a maximum Shannon Index if and
only if each species represented is composed of the same number of individuals.

Expanding the index:

$$H' = -\sum_{i=1}^{S} \frac{n_i}{N} \ln \frac{n_i}{N} = \frac{1}{N} \left( N \ln N - \sum_{i=1}^{S} n_i \ln n_i \right)$$

Now, let's define $H_s = -\sum_{i=1}^{S} n_i \ln n_i$. Clearly, since N is a positive constant for a given
population size, and N ln N is also a constant, then maximizing H_s is equivalent to maximizing H'.


Let's split an arbitrarily sized population into two groups, with each group receiving an arbitrary
number of individuals and an arbitrary number of species. Now, within each group, each species has the
same number of individuals as any other species in that group, but the number of individuals per species in
the first group may be different from the number of individuals per species in the second group.

Now, if it can be proven that Hs is maximized when the number of individuals per species in the
first group matches the number of individuals per species in the second group, then it has been proved that
the population has a maximum index only when each species in the population is evenly represented. Hs
doesn't depend on the total population. So Hs may be built by simply adding the indices of two sub-
populations. Since the population size is arbitrary, this proves that if you have two species (the smallest
number that can be considered two groups), their index is maximized if they are present in equal numbers.
So the rules of mathematical induction have been satisfied.


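Now, divide the species into two groups. Within each group, the population is evenly distributed among the
species present.

• k The number of individuals in the second group.
• p The number of species in the second group.
• n_i2 = k / p The number of individuals in each species in the second group.
• N − k The number of individuals in the first group.
• S − p The number of species in the first group.
• n_i1 = (N − k) / (S − p) The number of individuals in each species in the first group.

To find out which value of k will maximize $H_s = -\sum_i n_i \ln n_i$, we must find the value of k which
satisfies the equation:

$$\frac{dH_s}{dk} = \frac{d}{dk} \left[ -(N - k) \ln \frac{N - k}{S - p} - k \ln \frac{k}{p} \right] = \ln \frac{N - k}{S - p} - \ln \frac{k}{p} = 0$$

Now by applying the definitions of n_i1 and n_i2, we get

$$\ln n_{i1} = \ln n_{i2}, \quad \text{that is,} \quad n_{i1} = n_{i2}$$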


Now we have accomplished the proof that the Shannon index is maximized when each species is
present in equal numbers (see the strategy above). But what is the index in that case? Well,
n_i = N / S, so p_i = 1 / S. Therefore:

$$H_{max} = -\sum_{i=1}^{S} \frac{1}{S} \ln \frac{1}{S} = \ln S$$