
Population genetics

4 February 2020
Reading: Rowe, Sweet, Beebee
chapter 7

Photo: Truls Moum


Population sizes
Recap from Lecture 4

An important theoretical measure in population genetics is the (long-term) effective population size Ne: the (average) number of individuals that reproduce successfully in each generation
Census and effective population sizes

Nc = Ne (“ideal” population)
Nc > Ne (most real populations)

Ne/Nc ratio
highly variable, often less than 0.2

Photo: Truls Moum


What affects effective population size?

o Sex ratios
Ne = 4(Nef × Nem) / (Nef + Nem)
Example: Ne = 4(100 × 20) / (100 + 20) = 67

o Variations in reproductive success (VRS)
Ne = (4Nc − 2) / (VRS + 2)
Example: Ne = (4(100) − 2) / (6 + 2) = 50

o Fluctuating population size
Ne = t / (1/Ne1 + 1/Ne2 + 1/Ne3 + 1/Ne4), with t = 4 generations
Example: Ne = 4 / (1/100 + 1/80 + 1/20 + 1/100) = 48
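The three formulas on this slide can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the inputs are the worked numbers from the slide.

```python
def ne_sex_ratio(n_females, n_males):
    """Ne when the numbers of breeding females and males differ."""
    return 4 * n_females * n_males / (n_females + n_males)


def ne_reproductive_variance(census_size, vrs):
    """Ne from census size Nc and variance in reproductive success (VRS)."""
    return (4 * census_size - 2) / (vrs + 2)


def ne_fluctuating(sizes):
    """Long-term Ne as the harmonic mean of the per-generation Ne values."""
    return len(sizes) / sum(1 / n for n in sizes)


print(round(ne_sex_ratio(100, 20)))              # 8000 / 120 = 66.7, rounds to 67
print(round(ne_reproductive_variance(100, 6)))   # 398 / 8 = 49.75, rounds to 50
print(round(ne_fluctuating([100, 80, 20, 100]))) # 4 / 0.0825 = 48.5, rounds to 48
```

The harmonic mean is dominated by the smallest values, which is why a single generation at Ne = 20 pulls the long-term Ne far below the arithmetic mean of 75.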

There is a predictable relationship between effective population size, genetic drift and genetic diversity
What affects effective population size?

Genetic variation is lost at a rate of 1/(2Ne) per generation for nuclear genes in diploids

Genetic variation is lost at a rate of 1/(female Ne) per generation for mitochondrial genes (i.e. the loss is 4 times faster, if the sex ratio is 1)
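A small sketch of what these loss rates imply over several generations. It assumes the per-generation loss compounds, so that the fraction of diversity remaining after t generations is (1 - rate)^t; that compounding step is the standard drift expectation rather than something stated on the slide. The function name and the `female_fraction` parameter are hypothetical.

```python
def diversity_retained(ne, generations, mitochondrial=False, female_fraction=0.5):
    """Expected fraction of the initial genetic variation left after drift.

    Assumes a constant Ne and a loss that compounds every generation.
    For mitochondrial genes the rate is 1/(female Ne), where
    female Ne = ne * female_fraction (0.5 corresponds to a sex ratio of 1).
    """
    if mitochondrial:
        rate = 1 / (ne * female_fraction)
    else:
        rate = 1 / (2 * ne)
    return (1 - rate) ** generations


# Per-generation loss for Ne = 50: nuclear 1/100, mitochondrial 1/25 (4 times faster)
nuclear_loss = 1 - diversity_retained(50, 1)
mito_loss = 1 - diversity_retained(50, 1, mitochondrial=True)
print(nuclear_loss, mito_loss)

# Diversity remaining after 10 generations for each marker type
print(diversity_retained(50, 10), diversity_retained(50, 10, mitochondrial=True))
```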

Quantifying effective population sizes

Effective population size (Ne) is an essential measure; there are in principle two ways to estimate it based on molecular markers

1. Single sample estimators

2. Temporal changes (two or more samplings)

Both are based on the fact that genetic drift increases as Ne decreases

A potential problem: immigration (Box 7.4)
Quantifying effective population sizes
Box 7.4

Single sample estimators

- Heterozygote excess method

- Linkage disequilibrium (LD): increased occurrence of LD in small populations

For overview and software programs, see Luikart et al. (2010)


Quantifying effective population sizes
Box 7.4
Temporal method

The most powerful approach, but it requires the population to be sampled at least two times separated by at least one generation

Basis: Allele frequencies shift more rapidly in populations with small Ne

Use highly polymorphic codominant markers: microsatellites (with the Stepwise Mutation Model (SMM))
Population fluctuations
Population bottlenecks

o Sex ratios

o Variations in reproductive success (VRS)

o Fluctuating population size

https://en.wikipedia.org/wiki/Population_bottleneck
Population fluctuations
Population bottlenecks

Demographic and genetic impact on the population!

Population fluctuations
Extreme population bottlenecks

Black robin (Petroica traversi)

The endangered black robin population in 1980: 5 individuals

Current population of approx. 250 derived from a single breeding pair!

A most extreme bottleneck

Figure 10.3 R, S & B


Population fluctuations
Extreme population bottlenecks

Black robin (Petroica traversi)


Conservationists rescued the species by increasing the reproductive output from the last remaining fertile female and subsequent females

A maladaptive reproductive trait, laying eggs on the rim of nests, increased in the population as a result of conservation efforts


Population fluctuations
Extreme population bottlenecks

Black robin (Petroica traversi)

Lessons for conservation and population viability:

Conservation measures could render small populations completely dependent on humans

A high level of genetic variation is not always necessary for survival of an endangered species


Population fluctuations
Lesson:

Bottlenecks that last for a short period result in relatively little loss of diversity

In contrast, bottlenecks that are severe and last for a long time could be very harmful to the population/species

Do we have historical records to document that a bottleneck occurred?

Discrepancies between population size
and genetic diversity
Population size: 120 000

Genetic variation across 50 allozyme loci was zero

Bottleneck in the late nineteenth century

Present population is based on ~ 10 individuals

Northern elephant seal (Mirounga angustirostris)


Discrepancies between population size
and genetic diversity

Indian rhinoceros (Rhinoceros unicornis)


https://en.wikipedia.org/wiki/Indian_rhinoceros
Discrepancies between population size
and genetic diversity
Nepal population of the Indian rhinoceros

Current population size: ~ 650

Population size in the 1960s: ~ 80 individuals

Heterozygosity across 29 allozyme loci: ~ 10%

> 100 000 rhinos in recent times

Allelic diversity low (not more than 2 alleles per locus): loss of allelic diversity is more likely than loss of heterozygosity

Indian rhinoceros (Rhinoceros unicornis)
https://en.wikipedia.org/wiki/Indian_rhinoceros

Recent bottleneck was less severe than in elephant seals


Discrepancies between population size
and genetic diversity
Historical data:

“Historical” population samples might be available (museums)

Population genetic analysis should always be interpreted in the context of other information, if available
Population fluctuations
Population bottlenecks

Genetic variation is lost at a rate of 1/(2Ne) for nuclear genes in diploids

Genetic variation is lost 4 times faster, at a rate of 1/(female Ne), for mitochondrial genes

There are several genetic tests to identify bottlenecks:

- Heterozygosity vs observed number of alleles (alleles are lost quickly during a bottleneck, but without affecting heterozygosity that much (cf. rhinos))

- Ratio of allele numbers to range of allele sizes (microsatellites)

- Between-generations test (pre-/post-bottleneck) (cf. estimation of Ne)
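The second test in this list, the ratio of allele number to allele size range, can be illustrated for one microsatellite locus. The sketch below assumes the common formulation M = k / r, where k is the number of alleles observed and r is the number of repeat-unit steps spanned by the smallest and largest allele (range in repeats + 1). That exact definition is an assumption not given on the slide, and the function name is hypothetical.

```python
def m_ratio(allele_sizes_bp, repeat_length):
    """Ratio of the number of alleles (k) to the allele size range (r).

    allele_sizes_bp: allele sizes observed at one locus, in base pairs.
    repeat_length:   length of the repeat motif in base pairs (e.g. 2 for a dinucleotide).
    """
    sizes = sorted(set(allele_sizes_bp))
    k = len(sizes)
    r = (sizes[-1] - sizes[0]) // repeat_length + 1
    return k / r


# No gaps in the size distribution: every possible allele in the range is present
print(m_ratio([100, 102, 104, 106], repeat_length=2))  # 4 alleles over 4 steps = 1.0

# After a bottleneck, intermediate alleles are missing and the ratio drops
print(m_ratio([100, 106, 110], repeat_length=2))       # 3 alleles over 6 steps = 0.5
```

The idea is that drift during a bottleneck removes alleles faster than it shrinks the overall size range, so a low ratio hints at a past reduction in population size.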


Founder effects and population expansions

Founder effects
Similar to bottlenecks, with loss of diversity inversely
correlated to the number of founders

Population expansions
Population expansions in the past can be detected because they leave characteristic signatures in molecular data:
- Star-shaped phylogeny of mitochondrial DNA
- Heterozygote deficiency in microsatellites (Ho < He) during the period when new alleles appear as a result of mutations

Figure 9.3 Rowe, Sweet & Beebee: star-shaped phylogeny based on mitochondrial DNA
Natural selection

Three common modes of selection that will influence phenotypic variation

Selection will influence the level of phenotypic variation and the variability of the underlying genotypes within a population
What affects genetic diversity?

The main factors that influence genetic variation within populations




Several (interacting) populations

Wood Cranesbill (Geranium sylvaticum)

Photo: Truls Moum

Photo: Truls Moum


Population genetics

Studying allele frequencies in populations

• mutation

• genetic drift

• selection
Several (interacting) populations

Studying allele frequencies in populations

• mutation

• genetic drift

• selection

• gene flow
Hardy-Weinberg equilibrium
The assumptions

o Random mating

o Mendelian inheritance

o No selection

o No mutations

o Infinite size of population (no genetic drift)

o No effect of migration

Alleles: p + q = 1

Genotypes: p² + 2pq + q² = 1
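As a worked illustration of the two equations, the sketch below takes observed genotype counts at one biallelic locus, estimates p and q, and compares the observed counts with the Hardy-Weinberg expectation using a chi-square statistic. The function name and the example counts are made up for illustration.

```python
def hardy_weinberg_check(n_AA, n_Aa, n_aa):
    """Return allele frequency p, expected genotype counts and a chi-square statistic."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)   # frequency of allele A
    q = 1 - p                         # frequency of allele a
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_AA, n_Aa, n_aa)
    # Skip classes with an expected count of zero to avoid division by zero
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, expected, chi2


# Counts that match the expectation for p = 0.6 (36 : 48 : 16)
p, expected, chi2 = hardy_weinberg_check(36, 48, 16)
print(p, expected, chi2)

# A sample with too few heterozygotes gives a large chi-square value
print(hardy_weinberg_check(50, 20, 30)[2])
```

With two alleles the statistic has one degree of freedom, so values above about 3.84 indicate a deviation from Hardy-Weinberg proportions at the 5% level.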
Population structure

Population structure: the way populations are organized

Panmictic

Populations/subpopulations – individuals more likely to interbreed

Gene flow

Genetic drift

Selection

(mutation)
Population genetics

Genetic differentiation between populations

• mutation

• genetic drift

• selection

• gene flow

Differentiation, convergence, or equilibrium


Genetic distance between populations

Nei’s (1972) standard genetic distance: D

Based on Nei’s measure of genetic identity: I

0 < I < 1.0

D = -ln I

0 < D < ∞
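A minimal sketch of how I and D can be computed from allele frequencies. It assumes the usual definition of Nei's identity, I = Jxy / sqrt(Jx · Jy), where Jx and Jy are the average (over loci) sums of squared allele frequencies in each population and Jxy is the average sum of products; the slide itself only gives D = -ln I. Function names and example frequencies are hypothetical.

```python
import math


def nei_identity(freqs_x, freqs_y):
    """Nei's genetic identity I for two populations.

    freqs_x, freqs_y: one list per locus, each holding allele frequencies
    in the same allele order for both populations.
    """
    n_loci = len(freqs_x)
    j_x = sum(sum(p * p for p in locus) for locus in freqs_x) / n_loci
    j_y = sum(sum(p * p for p in locus) for locus in freqs_y) / n_loci
    j_xy = sum(
        sum(p * q for p, q in zip(locus_x, locus_y))
        for locus_x, locus_y in zip(freqs_x, freqs_y)
    ) / n_loci
    return j_xy / math.sqrt(j_x * j_y)


def nei_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance D = -ln(I); infinite if no alleles are shared."""
    identity = nei_identity(freqs_x, freqs_y)
    return math.inf if identity == 0 else -math.log(identity)


pop_a = [[0.5, 0.5]]
pop_b = [[1.0, 0.0]]
print(nei_identity(pop_a, pop_a), nei_distance(pop_a, pop_a))  # identical populations: I = 1, D = 0
print(nei_identity(pop_a, pop_b), nei_distance(pop_a, pop_b))  # I is about 0.71, D about 0.35
```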
Population structure
The most common method for calculating genetic
differentiation between populations is based on:

Wright’s F-statistics (1951)

Note the terminology: by convention we use the term subpopulation instead of population
Population structure
F-statistics
Subpopulations. Partitioning of genetic variation within and
among subpopulations, based on inbreeding coefficients

F-statistics: the Infinite alleles model (IAM)


Population structure
Two main mutation models:

• Infinite Alleles Model (IAM)

• Stepwise Mutation Model (SMM) for microsatellites

Photo: Truls Moum

http://nitro.biosci.arizona.edu/ftdna/models.html
Population structure
F-statistics

Partitioning of genetic variation within and among subpopulations, using three statistics: FIS, FIT, and FST
Population structure
F-statistics

Partitioning of genetic variation within and among subpopulations

FIS – genetic diversity of individuals compared to subpopulations

FIT – genetic diversity of individuals compared to total population

FST – genetic diversity among subpopulations

FIT = FIS + FST – (FIS)(FST)


Population structure
F-statistics

Partitioning of genetic variation within and among subpopulations

FIS – genetic diversity of individuals compared to subpopulations

FIS equals inbreeding F = (He – Ho)/He

FIS = (HS – HI)/HS

HI = observed heterozygosity in a subpopulation (“individual heterozygosity”)


HS = expected heterozygosity of subpopulation (HWE)
Population structure
F-statistics

Partitioning of genetic variation within and among subpopulations

FIT – genetic diversity of individuals compared to total population

FIT = (HT – HI)/HT

HI = observed heterozygosity in a subpopulation


HT = expected heterozygosity of total population
Population structure

FST = (HT – HS)/HT

HT – heterozygosity of the total population

HS – average heterozygosity over all subpopulations

0 < FST < 1.0
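The three statistics can be computed together from genotype counts at one biallelic locus. This is a simplified sketch: it weights all subpopulations equally and applies no sample-size correction, both of which are assumptions made here for brevity. The function name and example data are hypothetical.

```python
def f_statistics(subpops):
    """FIS, FIT and FST from genotype counts at one biallelic locus.

    subpops: list of (n_AA, n_Aa, n_aa) tuples, one per subpopulation.
    """
    freqs, h_obs = [], []
    for n_AA, n_Aa, n_aa in subpops:
        n = n_AA + n_Aa + n_aa
        freqs.append((2 * n_AA + n_Aa) / (2 * n))  # frequency of allele A
        h_obs.append(n_Aa / n)                     # observed heterozygosity
    k = len(subpops)
    h_i = sum(h_obs) / k                           # HI: mean observed heterozygosity
    h_s = sum(2 * p * (1 - p) for p in freqs) / k  # HS: mean expected heterozygosity
    p_bar = sum(freqs) / k
    h_t = 2 * p_bar * (1 - p_bar)                  # HT: expected heterozygosity, total
    return {
        "FIS": (h_s - h_i) / h_s,
        "FIT": (h_t - h_i) / h_t,
        "FST": (h_t - h_s) / h_t,
    }


# Two subpopulations in Hardy-Weinberg proportions with p = 0.8 and p = 0.2
stats = f_statistics([(64, 32, 4), (4, 32, 64)])
print(stats)

# The relationship FIT = FIS + FST - (FIS)(FST) holds for these values
print(stats["FIS"] + stats["FST"] - stats["FIS"] * stats["FST"])
```

In this example each subpopulation is in Hardy-Weinberg proportions, so FIS is zero and all of the heterozygote deficit in the total population (FIT) is due to differentiation among subpopulations (FST).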


Population structure

FST = (HT – HS)/HT


Undifferentiated populations at equilibrium: FST = 0

“Significant” population differentiation: FST > 0.2

Little differentiation: FST = 0 – 0.05


Moderate differentiation: FST = 0.05 – 0.25
Pronounced differentiation: FST > 0.25
Population structure

Formula used in the International HapMap Project for humans based on SNP analysis

FST = 0.12
Population structure

Statistical significance of FST

Permutations: shuffle genotypes among populations

P-value: the proportion of permutations in which the FST value is equal to or larger than the FST from the actual data set
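The permutation procedure can be sketched as follows. Genotypes are coded as the number of copies of one allele (0, 1 or 2), FST is computed with the simple (HT - HS)/HT formula and equal weighting of subpopulations, and the P-value uses the (hits + 1)/(permutations + 1) convention. The coding, the weighting and that convention are assumptions made for this illustration, and the function names are hypothetical.

```python
import random


def fst(pops):
    """FST = (HT - HS)/HT; pops is a list of lists of genotypes coded 0, 1 or 2."""
    freqs = [sum(genotypes) / (2 * len(genotypes)) for genotypes in pops]
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def fst_permutation_test(pops, n_perm=999, seed=1):
    """Shuffle genotypes among populations and count FST values >= the observed one."""
    rng = random.Random(seed)
    observed = fst(pops)
    pooled = [g for pop in pops for g in pop]
    sizes = [len(pop) for pop in pops]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:  # cut the shuffled pool back into the original sample sizes
            shuffled.append(pooled[start:start + size])
            start += size
        if fst(shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Two samples fixed for different alleles: FST = 1 and a very small P-value
print(fst_permutation_test([[2] * 20, [0] * 20]))
```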
Population structure

Multiple alleles: GST (Nei 1973; Hamrick & Godt 1990)

Stepwise mutation model (microsats): RST (Slatkin 1995)

Highly polymorphic markers: D (Jost 2008)

Comparison of measures of genetic differentiation among studies:

Comparisons will be valid only if estimates are based on the same methods!
Population structure

AMOVA:
hierarchical Analysis of MOlecular VAriance
(cf. ANOVA, Analysis of variance, used to analyse differences among group means)

Partitioning of variance among:


• Individuals within populations
• Populations within groups
• Groups of populations
Population structure

Fst, Gst, Rst, D, theta

How to interpret estimates of genetic distance and differentiation?

http://ieg.ebd.csic.es/arndthampe/
Population structure

Isolation by distance (IBD)


Gene flow between populations is inversely proportional to geographic distances between them

Mantel test – testing for correlation between genetic and geographic distances
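A bare-bones Mantel test can be sketched in plain Python: correlate the upper triangles of the two distance matrices, then permute the population labels of one matrix (rows and columns together) to build a null distribution. The one-sided P-value and the (hits + 1)/(permutations + 1) convention are assumptions of this sketch, and all names and data are hypothetical.

```python
import math
import random


def _upper_triangle(matrix, order=None):
    """Values above the diagonal, optionally after relabelling populations by `order`."""
    n = len(matrix)
    idx = order if order is not None else list(range(n))
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def mantel_test(genetic, geographic, n_perm=999, seed=1):
    """Return the matrix correlation r and a one-sided permutation P-value."""
    rng = random.Random(seed)
    geo = _upper_triangle(geographic)
    r_obs = _pearson(_upper_triangle(genetic), geo)
    order = list(range(len(genetic)))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)  # permute rows and columns of the genetic matrix together
        if _pearson(_upper_triangle(genetic, order), geo) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Four populations along a line; genetic distance grows linearly with distance (perfect IBD)
positions = [0, 1, 2, 3]
geographic = [[abs(a - b) for b in positions] for a in positions]
genetic = [[0.1 * abs(a - b) for b in positions] for a in positions]
print(mantel_test(genetic, geographic))
```

Whole rows and columns are permuted, rather than individual cells, because the pairwise distances in a matrix are not independent of one another.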
Isolation by distance (IBD)

A Mantel's test for correlation between C.S. Chord (51) genetic distance and geographic
distance (km) showed high correlation, r = 0.599, P < 0.0058 from 10,0000 randomizations.
Pusadee et al. PNAS 2009
Isolation by distance (IBD)
Correlation between genetic and geographic distances
Population differentiation

Empirical observations:

It is extremely difficult to predict the levels of genetic differentiation between populations

There is a wide range of population divergences both within and between species
Population structure
A priori and non-a priori delimitation of populations

(Sub)populations?

Based on assumptions -?

http://ieg.ebd.csic.es/arndthampe/
Population structure
A priori and non-a priori delimitation of populations

(Sub)populations?

Assignment test methods:

STRUCTURE (most likely number of populations)
BAPS (geographical populations)
GENELAND (geographical)

These programs use Bayesian methods: a model and prior parameter distributions are updated with the data to produce posterior probabilities

http://ieg.ebd.csic.es/arndthampe/
Population structure
A priori and non-a priori delimitation of populations

Non-Bayesian assignment methods:


Eigensoft; adegenet (principal component analysis)
AW-clust (multidimensional scaling for SNP data)
Genetix (frequency correspondence analysis)

http://ieg.ebd.csic.es/arndthampe/

Taking geographic information into account:

Geneland
sPCA
TESS

Experimental design is critical!


Population structure

Nuclear markers, e.g. microsatellites or sequence data

How many loci; genomic coverage?

Cytoplasmic markers, e.g. mtDNA or cpDNA

Cytoplasmic markers: effective population size ¼ that of nuclear markers

Sensitivity

Differences in male and female behaviour


Population structure

Philopatry

Australian coast

4 microsatellite markers

mtDNA control region sequences


Green turtle (Chelonia mydas)
FitzSimmons et al. (1997) Genetics, 147: 1843-1854

Differentiation more pronounced in mitochondrial markers

Male-mediated gene flow


Population structure

Green turtle (Chelonia mydas)

FitzSimmons et al. (1997) Genetics, 147: 1843-1854
Gene flow
Nm, the average number of successfully reproducing migrants
per generation between subpopulations

Gene flow

Dispersal

Migration

Fig. 7.9 Rowe, Sweet & Beebee


Gene flow

Methods for estimating dispersal and gene flow

1. Direct methods
2. Indirect methods
3. Assignment tests

Fig. 7.9 Rowe, Sweet & Beebee


Quantifying gene flow

Direct methods:

Mark-recapture
Dispersal/gene flow?

Radio tracking/GPS

Genotyping parents and offspring

Fig. 7.9 Rowe, Sweet & Beebee


Quantifying gene flow

GPS based tracking of seabirds

Bird tracking loggers

http://www.abdn.ac.uk/lighthouse/research/techniques/tracking/
Gene flow – indirect methods
Nm, the average number of successfully reproducing migrants per generation between subpopulations

Nm relates to F-statistics:

FST = 1/(4Nm + 1)

Nm = ¼(1/FST – 1)

Thus, the number of migrants can be estimated from FST (Box 7.7)
But, assumptions are often unrealistic

Assumptions:
Island model of population structure
No selection
No mutation
Infinite number of populations
Migration-drift equilibrium

Fig. 7.9 Rowe, Sweet & Beebee
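The two formulas are inverses of each other, which a short sketch makes easy to verify. The function names are hypothetical, and the result is only meaningful under the island-model assumptions listed above.

```python
def fst_from_nm(nm):
    """Expected FST at migration-drift equilibrium: FST = 1/(4Nm + 1)."""
    return 1 / (4 * nm + 1)


def nm_from_fst(fst):
    """Number of migrants per generation implied by FST: Nm = (1/FST - 1)/4."""
    if not 0 < fst <= 1:
        raise ValueError("FST must be greater than 0 and at most 1")
    return 0.25 * (1 / fst - 1)


print(nm_from_fst(0.2))   # one migrant per generation
print(fst_from_nm(1))     # back to FST = 0.2
print(nm_from_fst(0.05))  # weaker differentiation implies more migrants (4.75)
```

Because FST appears in the denominator, small errors in a low FST estimate translate into large changes in Nm, which is one more reason to treat such estimates with caution.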
Gene flow

Non-equilibrium situation Equilibrium situation

Fig. 7.10 Rowe, Sweet & Beebee


Quantifying gene flow

Assignment tests:

Maximum likelihood to identify individuals that dispersed – individuals are assigned to the population from which they have the highest probability of originating

- also Bayesian approach – several scenarios can be compared simultaneously
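The maximum-likelihood idea can be sketched for a single individual: compute the probability of its multilocus genotype under each candidate population's allele frequencies and pick the population with the highest value. The sketch assumes Hardy-Weinberg proportions within populations, independent loci, and a small floor frequency (`min_freq`) for alleles not seen in a population; these are simplifying assumptions, and the names and data are hypothetical.

```python
import math


def genotype_log_likelihood(genotype, allele_freqs, min_freq=0.01):
    """Log-probability of a multilocus genotype in one population.

    genotype:     list of (allele1, allele2) tuples, one per locus.
    allele_freqs: list of dicts (one per locus) mapping allele -> frequency.
    """
    total = 0.0
    for (a1, a2), freqs in zip(genotype, allele_freqs):
        p1 = max(freqs.get(a1, 0.0), min_freq)
        p2 = max(freqs.get(a2, 0.0), min_freq)
        prob = p1 * p1 if a1 == a2 else 2 * p1 * p2  # Hardy-Weinberg genotype frequency
        total += math.log(prob)
    return total


def assign_individual(genotype, populations):
    """Return the most likely population of origin and all log-likelihoods."""
    scores = {
        name: genotype_log_likelihood(genotype, freqs)
        for name, freqs in populations.items()
    }
    return max(scores, key=scores.get), scores


populations = {
    "north": [{"A": 0.9, "a": 0.1}, {"B": 0.8, "b": 0.2}],
    "south": [{"A": 0.2, "a": 0.8}, {"B": 0.1, "b": 0.9}],
}
individual = [("A", "A"), ("B", "b")]
print(assign_individual(individual, populations))
```

An individual sampled in "south" but assigned to "north" with a much higher likelihood would be a candidate disperser.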
Gene flow

Alternative models

Stepping stone model

Island model

Fig. 7.9 Rowe, Sweet & Beebee


Metapopulations
A metapopulation is a population of subpopulations within a single species with many
possible patterns of gene flow between the subpopulations

Characterized by recurrent extinctions and recolonizations

Common toad (Bufo bufo)

Figure 7.7 Beebee & Sweet


Metapopulations
Most notably in butterflies, toads, fish

Fragmented habitats, but absence of barriers to dispersal:

Local extinctions and recolonizations

Source and sink populations


Monarch butterfly (Danaus plexippus)

Island model Island-mainland model

Common toad (Bufo bufo)

Poecilia reticulata
Figure 7.7 Beebee & Sweet
Metapopulations
Most notably in butterflies, toads, fish

Predictions:
• Population bottleneck – reduced genetic variation

• Genetic differentiation between demes should vary over time


Monarch butterfly (Danaus plexippus)

Island model Island-mainland model

Common toad (Bufo bufo)

Poecilia reticulata
Figure 7.7 Beebee & Sweet
Population differentiation

Interaction of the factors affecting population differentiation

Is population differentiation due to genetic drift or selection?

…and how does gene flow moderate the action of drift and selection?
Gene flow and genetic drift

FST = 0.2 is equivalent to Nem = 1 (one migrant per generation)

As little as one migrant per generation could be sufficient to prevent differentiation by genetic drift!
Population differentiation

The interaction of:

Effective population size

Genetic drift

Selection

Gene flow
How to identify selection

In two weeks:

Discordant patterns of differentiation

Clinal variations in allele frequencies

Calculating the ratio of non-synonymous to synonymous substitutions (dN/dS): the proportion of non-synonymous substitutions should be higher in genes under positive selection
Molecular markers for population genetics

Considerations:
Cheap or expensive?
Technical complexity (easy to use?)
Likely to be neutral or non-neutral?

Allozymes
Microsatellites
mtDNA
RAPD
AFLP
cpDNA
rDNA spacers (ITS)
Introns

SNPs
Molecular markers for population genetics

Tests:
Hardy-Weinberg (HW)
Linkage disequilibrium (LD)
Neutrality tests (Tajima’s D; Fu & Li’s tests)

Use complementary information from different markers!
