Lecture 6 Pop Gen Feb 4 2020 Canvas PDF

Population genetics
4 February 2020
Reading: Rowe, Sweet, Beebee, chapter 7
Photo: Truls Moum

Population sizes
Recap from Lecture 4
An important theoretical measure in population genetics is the (long-term) effective population size Ne: the (average) number of individuals that reproduce successfully in each generation
Census and effective population sizes
Nc = Ne (“ideal” population)
Nc > Ne (most real populations)
Ne/Nc ratio: highly variable, often less than 0.2
Photo: Truls Moum

What affects effective population size?
o Sex ratios: Ne = 4(Nef × Nem)/(Nef + Nem) = 4(100 × 20)/(100 + 20) ≈ 67
o Variations in reproductive success (VRS): Ne = (4Nc − 2)/(VRS + 2) = (4(100) − 2)/(6 + 2) ≈ 50
o Fluctuating population size: Ne = t/(1/Ne1 + 1/Ne2 + 1/Ne3 + 1/Ne4) = 4/(1/100 + 1/80 + 1/20 + 1/100) ≈ 48
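The three worked examples on this slide can be checked with a few lines of Python. This is a minimal sketch; the function names are illustrative and not taken from the textbook.

```python
def ne_sex_ratio(n_females, n_males):
    """Ne when the numbers of breeding females and males differ."""
    return 4 * n_females * n_males / (n_females + n_males)


def ne_reproductive_variance(n_census, vrs):
    """Ne from census size and variance in reproductive success (VRS)."""
    return (4 * n_census - 2) / (vrs + 2)


def ne_fluctuating(sizes):
    """Long-term Ne as the harmonic mean of per-generation sizes."""
    return len(sizes) / sum(1 / n for n in sizes)


print(round(ne_sex_ratio(100, 20)))               # 67
print(round(ne_reproductive_variance(100, 6)))    # 50
print(round(ne_fluctuating([100, 80, 20, 100])))  # 48
```

Note how the single generation with Ne = 20 pulls the harmonic mean far below the arithmetic mean of 75.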
There is a predictable relationship between effective population size, genetic drift and genetic diversity
What affects effective population size?
Genetic variation is lost at a rate of 1/(2Ne) per generation for nuclear genes in diploids
Genetic variation is lost at a rate of 1/(female Ne) per generation for mitochondrial genes (i.e. the loss is 4 times faster, if the sex ratio is 1)
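A short sketch of what these rates imply over several generations. It assumes the standard result that a constant per-generation loss rate r leaves a fraction (1 − r)^t of the variation after t generations; the value Ne = 50 and the equal sex ratio (female Ne = Ne/2) are illustrative assumptions.

```python
def fraction_retained(loss_rate, generations):
    """Expected fraction of genetic variation left after some generations."""
    return (1 - loss_rate) ** generations


ne = 50                    # hypothetical effective population size
female_ne = ne / 2         # assumes a 1:1 sex ratio

nuclear_rate = 1 / (2 * ne)   # diploid nuclear genes
mito_rate = 1 / female_ne     # mitochondrial genes

print(mito_rate / nuclear_rate)  # ratio of the two rates
for t in (1, 10, 50):
    print(t,
          round(fraction_retained(nuclear_rate, t), 3),
          round(fraction_retained(mito_rate, t), 3))
```

With an equal sex ratio the mitochondrial rate is four times the nuclear rate, which matches the statement on the slide.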
Quantifying effective population sizes
Effective population size (Ne) is an essential measure; there are in principle two ways to estimate it based on molecular markers
1. Single sample estimators
2. Temporal changes (two or more samplings)
Both are based on the fact that genetic drift increases as Ne decreases
A potential problem: immigration (Box 7.4)
Single sample estimators
- Heterozygote excess method
- Linkage Disequilibrium (LD), increased occurrence of LD in small populations
For overview and software programs, see Luikart et al. (2010)
Box 7.4
Temporal method
The most powerful approach, but it requires the population to be sampled at least two times separated by at least one generation
Basis: Allele frequencies shift more rapidly in populations with small Ne
Use highly polymorphic codominant markers: microsatellites (with the Stepwise Mutation Model (SMM))
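To make the temporal idea concrete, here is a hedged sketch of one commonly used moment-based estimator. It is an assumption on my part, not necessarily the method presented in Box 7.4. It uses Nei and Tajima's standardized variance in allele frequency change (Fc), and subtracts the sampling terms 1/(2·S0) and 1/(2·St) before converting to Ne. The allele frequencies and sample sizes are made up.

```python
def fc_locus(freqs_t0, freqs_t1):
    """Standardized variance in allele frequency change at one locus."""
    terms = []
    for x, y in zip(freqs_t0, freqs_t1):
        denom = (x + y) / 2 - x * y
        if denom > 0:  # skip alleles absent (or fixed) in both samples
            terms.append((x - y) ** 2 / denom)
    return sum(terms) / len(terms)


def temporal_ne(loci_t0, loci_t1, n_sampled_t0, n_sampled_t1, generations):
    """Ne from two samples taken `generations` apart (moment-based sketch)."""
    per_locus = [fc_locus(a, b) for a, b in zip(loci_t0, loci_t1)]
    f_mean = sum(per_locus) / len(per_locus)
    # Remove the part of the change explained by sampling error alone.
    drift = f_mean - 1 / (2 * n_sampled_t0) - 1 / (2 * n_sampled_t1)
    if drift <= 0:
        return float("inf")  # no drift signal beyond sampling noise
    return generations / (2 * drift)


# Hypothetical data: one biallelic locus, 50 individuals per sample,
# samples taken two generations apart.
estimate = temporal_ne([[0.5, 0.5]], [[0.6, 0.4]], 50, 50, 2)
print(round(estimate, 1))
```

Larger shifts in allele frequency between the two samples give a larger Fc and therefore a smaller Ne estimate, which is the basis stated on the slide.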
Population fluctuations
Population bottlenecks
o Sex ratios
o Variations in reproductive success (VRS)
o Fluctuating population size
https://en.wikipedia.org/wiki/Population_bottleneck
Population bottlenecks
Demographic and genetic impact on the population!
Extreme population bottlenecks
Black robin (Petroica traversi)
The endangered Black robin population 1980: 5 individuals
Current population of approx. 250 derived from a single breeding pair!
A most extreme bottleneck
Figure 10.3 R, S & B

Conservationists rescued the species by increasing the reproductive output from the last remaining fertile female and subsequent females
A maladaptive reproductive trait, laying eggs on the rim of nests, increased in the population as a result of conservation efforts

Lessons for conservation and population viability:
Conservation measures could render small populations completely dependent on humans
High level of genetic variation is not always necessary for survival of an endangered species

Lesson:
Bottlenecks that last for a short period result in relatively little loss of diversity
In contrast, bottlenecks that are severe and last for a long time could be very harmful to the population/species
Do we have historical records to document that a bottleneck occurred?
Discrepancies between population size and genetic diversity
Northern elephant seal (Mirounga angustirostris)
Population size: 120 000
Genetic variation across 50 allozyme loci was zero
Bottleneck in the late nineteenth century
Present population is based on ~ 10 individuals

Indian rhinoceros (Rhinoceros unicornis)

https://en.wikipedia.org/wiki/Indian_rhinoceros
Nepal population of the Indian rhinoceros
Current population size: ~ 650
Population size in the 1960s: ~ 80 individuals
Heterozygosity across 29 allozyme loci: ~ 10%
> 100 000 rhinos in recent times
Allelic diversity low (not more than 2 alleles per locus):
Loss of allelic diversity is more likely than loss of heterozygosity
Recent bottleneck was less severe than in elephant seals

Historical data:
“Historical” population samples might be available (museums)
Population genetic analysis should always be interpreted in the context of other information, if available
Population fluctuations
Population bottlenecks
Genetic variation is lost at a rate of 1/(2Ne) for nuclear genes in diploids
Genetic variation is lost 4 times faster, at a rate of 1/(female Ne), for mitochondrial genes
There are several genetic tests to identify bottlenecks:
- Heterozygosity vs observed number of alleles (alleles are lost quickly during a bottleneck, but without affecting heterozygosity that much (cf. rhinos))
- Ratio of allele numbers to range of allele sizes (microsatellites)
- Between-generations test (pre-/post-bottleneck) (cf. estimation of Ne)

Founder effects and population expansions
Founder effects
Similar to bottlenecks, with loss of diversity inversely correlated to the number of founders
Population expansions
Population expansions in the past can be detected because they leave characteristic signatures in molecular data
- Star shaped phylogeny of mitochondrial DNA
- Heterozygote deficiency in microsatellites (Ho < He) during the period when new alleles appear as a result of mutations
Figure 9.3 Rowe, Sweet & Beebee: star shaped phylogeny based on mitochondrial DNA
Natural selection
Three common modes of selection that will influence phenotypic variation
Selection will influence the level of phenotypic variation and the variability of the underlying genotypes within a population
What affects genetic diversity?
The main factors that influence genetic variation within populations

Several (interacting) populations
Wood Cranesbill (Geranium sylvaticum)
Photo: Truls Moum

Population genetics
Studying allele frequencies in populations
• mutation
• genetic drift
• selection
Several (interacting) populations
Studying allele frequencies in populations
• mutation
• genetic drift
• selection
• gene flow
Hardy-Weinberg equilibrium
The assumptions
o Random mating
o Mendelian inheritance
o No selection
o No mutations
o Infinite size of population (no genetic drift)
o No effect of migration
Alleles: p + q = 1
Genotypes: p² + 2pq + q² = 1
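As a small illustration of the two Hardy-Weinberg equations, the sketch below derives the expected genotype frequencies from a single allele frequency. The value p = 0.7 and the genotype labels are arbitrary examples.

```python
def hw_genotype_frequencies(p):
    """Expected genotype frequencies at a biallelic locus under HWE."""
    q = 1 - p  # alleles: p + q = 1
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


freqs = hw_genotype_frequencies(0.7)
for genotype, freq in freqs.items():
    print(genotype, round(freq, 2))   # approximately 0.49, 0.42, 0.09

# Genotypes: p^2 + 2pq + q^2 = 1 (up to floating-point rounding)
print(round(sum(freqs.values()), 10))
```

Comparing such expected frequencies with observed genotype counts is the basis of the Hardy-Weinberg test mentioned later in the lecture.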
Population structure
Population structure: the way populations are organized
Panmictic
Populations/subpopulations – individuals more likely to interbreed
Gene flow
Genetic drift
Selection
(mutation)
Population genetics
Genetic differentiation between populations
• mutation
• genetic drift
• selection
• gene flow
Differentiation, convergence, or equilibrium

Genetic distance between populations
Nei’s (1972) standard genetic distance: D
Based on Nei’s measure of genetic identity: I
0 < I < 1.0
D = -ln I
0<D<∞
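A minimal sketch of how I and D are computed from allele frequencies. It assumes Nei's usual definition of identity, I = Jxy / sqrt(Jx · Jy), where Jx, Jy and Jxy are sums over loci and alleles of x², y² and xy. The allele frequencies are made up for illustration.

```python
import math


def nei_identity_and_distance(pop_x, pop_y):
    """pop_x, pop_y: one list of allele frequencies per locus."""
    jxy = sum(x * y for lx, ly in zip(pop_x, pop_y) for x, y in zip(lx, ly))
    jx = sum(x * x for locus in pop_x for x in locus)
    jy = sum(y * y for locus in pop_y for y in locus)
    identity = jxy / math.sqrt(jx * jy)
    # No shared alleles gives I = 0, i.e. an infinite distance.
    distance = float("inf") if identity == 0 else -math.log(identity)
    return identity, distance


# Two loci, hypothetical allele frequencies in two populations.
pop_a = [[0.8, 0.2], [0.5, 0.5]]
pop_b = [[0.4, 0.6], [0.5, 0.5]]
identity, distance = nei_identity_and_distance(pop_a, pop_b)
print(round(identity, 3), round(distance, 3))
```

Identical allele frequencies give I = 1 and D = 0, and D grows without bound as the populations share fewer alleles.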
The most common method for calculating genetic
differentiation between populations is based on:
Wright’s F-statistics (1951)
Note the terminology: by convention we use the term subpopulation instead of population
F-statistics
Subpopulations. Partitioning of genetic variation within and
among subpopulations, based on inbreeding coefficients
F-statistics: the Infinite alleles model (IAM)

Two main mutation models:
• Infinite Alleles Model (IAM)
• Stepwise Mutation Model (SMM) for microsatellites
Photo: Truls Moum
http://nitro.biosci.arizona.edu/ftdna/models.html
F-statistics
Partitioning of genetic variation within and among subpopulations, using three statistics: FIS, FIT, and FST
F-statistics
Partitioning of genetic variation within and among subpopulations
FIS – genetic diversity of individuals compared to subpopulations
FIT – genetic diversity of individuals compared to total population
FST – genetic diversity among subpopulations
FIT = FIS + FST – (FIS)(FST)

F-statistics
FIS – genetic diversity of individuals compared to subpopulations
FIS equals inbreeding F = (He – Ho)/He
FIS = (HS – HI)/HS
HI = observed heterozygosity in a subpopulation (“individual heterozygosity”)

HS = expected heterozygosity of subpopulation (HWE)
F-statistics
FIT – genetic diversity of individuals compared to total population
FIT = (HT – HI)/HT
HI = observed heterozygosity in a subpopulation

HT = expected heterozygosity of total population
FST = (HT – HS)/HT
HT – heterozygosity of the total population
HS – average heterozygosity over all subpopulations
0 < FST < 1.0
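The three statistics can be computed directly from the heterozygosities defined on these slides. The sketch below uses made-up values (HI = 0.30, HS = 0.40, HT = 0.50) and also checks the identity FIT = FIS + FST – (FIS)(FST).

```python
def f_statistics(h_i, h_s, h_t):
    """Return (FIS, FIT, FST) from observed and expected heterozygosities."""
    f_is = (h_s - h_i) / h_s
    f_it = (h_t - h_i) / h_t
    f_st = (h_t - h_s) / h_t
    return f_is, f_it, f_st


f_is, f_it, f_st = f_statistics(h_i=0.30, h_s=0.40, h_t=0.50)
print(round(f_is, 3), round(f_it, 3), round(f_st, 3))  # about 0.25, 0.4, 0.2

# The three statistics are linked: FIT = FIS + FST - FIS * FST
print(abs(f_it - (f_is + f_st - f_is * f_st)) < 1e-12)
```

The identity holds for any input because (1 – FIT) = (1 – FIS)(1 – FST), which follows from the three definitions.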


Undifferentiated populations at equilibrium: FST = 0
“Significant” population differentiation: FST > 0.2
Little differentiation: FST = 0 – 0.05

Moderate differentiation: FST = 0.05 – 0.25
Pronounced differentiation: FST > 0.25
Formula used in the International HapMap Project for humans based on SNP analysis:
FST = 0.12
Statistical significance of FST
Permutations: shuffle genotypes among populations
P-value according to the number of times the permuted FST values are equal to or larger than the FST from the actual data set
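The permutation procedure can be sketched as follows. Several simplifying assumptions are mine: a single biallelic locus, genotypes coded as the number of copies (0, 1 or 2) of one allele, FST computed as (HT – HS)/HT with HS weighted by sample size, and hypothetical genotype data. Dedicated software uses multilocus estimators, so treat this only as an outline of the logic.

```python
import random


def expected_het(genotypes):
    """2pq from genotypes coded as copies (0, 1, 2) of allele A."""
    p = sum(genotypes) / (2 * len(genotypes))
    return 2 * p * (1 - p)


def fst(subpops):
    """FST = (HT - HS) / HT for one biallelic locus."""
    pooled = [g for pop in subpops for g in pop]
    h_t = expected_het(pooled)
    if h_t == 0:
        return 0.0
    h_s = sum(len(pop) * expected_het(pop) for pop in subpops) / len(pooled)
    return (h_t - h_s) / h_t


def permutation_test(subpops, n_permutations=999, seed=1):
    """Shuffle individuals among subpopulations, keeping sample sizes."""
    rng = random.Random(seed)
    observed = fst(subpops)
    pooled = [g for pop in subpops for g in pop]
    sizes = [len(pop) for pop in subpops]
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if fst(shuffled) >= observed:
            hits += 1
    return observed, hits / n_permutations


# Hypothetical samples of 20 individuals from each of two subpopulations.
pop1 = [2] * 15 + [1] * 4 + [0] * 1
pop2 = [2] * 2 + [1] * 6 + [0] * 12
observed_fst, p_value = permutation_test([pop1, pop2])
print(round(observed_fst, 3), p_value)
```

A small P-value means that random reassignment of individuals rarely produces an FST as large as the observed one.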
Multiple alleles: GST (Nei 1973; Hamrick & Godt 1990)
Stepwise mutation model (microsats): RST (Slatkin 1995)
Highly polymorphic markers: D (Jost 2008)
Comparison of measures of genetic differentiation among studies:
Comparisons will be valid only if estimates are based on the same methods!
AMOVA:
hierarchical Analysis of MOlecular VAriance
(cf. ANOVA, Analysis of variance, used to analyse differences among group means)
Partitioning of variance among:

• Individuals within populations
• Populations within groups
• Groups of populations
Fst, Gst, Rst, D, theta
How to interpret estimates of genetic distance and differentiation?
http://ieg.ebd.csic.es/arndthampe/
Isolation by distance (IBD)

Gene flow between populations is inversely proportional to geographic distances between them
Mantel test – testing for correlation between genetic and geographic distances
A Mantel's test for correlation between C.S. Chord (51) genetic distance and geographic
distance (km) showed high correlation, r = 0.599, P < 0.0058 from 10,0000 randomizations.
Pusadee et al. PNAS 2009
Correlation between genetic and geographic distances
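A simple Mantel test can be sketched in plain Python: correlate the pairwise genetic and geographic distances, then permute the population labels of one matrix to build a null distribution. The distance values below are invented, and the P-value counts the observed arrangement as one of the permutations, which is a common convention and an assumption here.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def upper_triangle(matrix, order):
    """Pairwise values above the diagonal, with populations reordered."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def mantel_test(genetic, geographic, n_permutations=999, seed=1):
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    geo = upper_triangle(geographic, identity)
    r_obs = pearson(upper_triangle(genetic, identity), geo)
    hits = 0
    for _ in range(n_permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel rows and columns together
        if pearson(upper_triangle(genetic, order), geo) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_permutations + 1)


# Hypothetical data: five populations along a transect (positions in km).
positions_km = [0, 10, 25, 45, 70]
geographic = [[abs(a - b) for b in positions_km] for a in positions_km]
genetic = [
    [0.00, 0.03, 0.05, 0.11, 0.15],
    [0.03, 0.00, 0.02, 0.08, 0.14],
    [0.05, 0.02, 0.00, 0.06, 0.09],
    [0.11, 0.08, 0.06, 0.00, 0.07],
    [0.15, 0.14, 0.09, 0.07, 0.00],
]
r, p = mantel_test(genetic, geographic)
print(round(r, 3), p)
```

Permuting whole populations rather than individual matrix cells is needed because the pairwise distances are not independent of one another.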
Population differentiation
Empirical observations:
It is extremely difficult to predict the levels of genetic differentiation between populations
There is a wide range of population divergences both within and between species
A priori and non-a priori delimitation of populations
(Sub)populations?
Based on assumptions -?
http://ieg.ebd.csic.es/arndthampe/
(Sub)populations?
Assignment test methods:
STRUCTURE (most likely number of populations)
BAPS (geographical populations)
GENELAND (geographical)
These programs use Bayesian methods: a model and prior distributions for its parameters are updated with the data to produce posterior probabilities
Non-Bayesian assignment methods:
Eigensoft; adegenet (principal component analysis)
AW-clust (multidimensional scaling for SNP data)
Genetix (frequency correspondence analysis)
Taking geographic information into account:
Geneland
sPCA
TESS
Experimental design is critical!

Nuclear markers, e.g. microsatellites or sequence data
How many loci; genomic coverage?
Cytoplasmic markers, e.g. mtDNA or cpDNA
Cytoplasmic markers: effective population size ¼ that of nuclear markers
Sensitivity
Differences in male and female behaviour
Philopatry
Green turtle (Chelonia mydas)
Australian coast
4 microsatellite markers
mtDNA control region sequences
Differentiation more pronounced in mitochondrial markers
Male mediated gene flow
FitzSimmons et al. (1997) Genetics, 147: 1843-1854
Gene flow
Nm, the average number of successfully reproducing migrants per generation between subpopulations
Gene flow
Dispersal
Migration
Fig. 7.9 Rowe, Sweet & Beebee

Gene flow
Methods for estimating dispersal and gene flow
1. Direct methods
2. Indirect methods
3. Assignment tests

Quantifying gene flow
Direct methods:
Mark-recapture
Dispersal/gene flow?
Radio tracking/GPS
Genotyping parents and offspring

GPS based tracking of seabirds
Bird tracking loggers
http://www.abdn.ac.uk/lighthouse/research/techniques/tracking/
Gene flow – indirect methods
Nm relates to F-statistics:
FST = 1/(4Nm + 1)
Nm = ¼(1/FST – 1)
Thus, the number of migrants can be estimated from FST (Box 7.7)
But, assumptions are often unrealistic
Assumptions:
Island model of population structure
No selection
No mutation
Infinite number of populations
Migration-drift equilibrium
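The conversion between FST and Nm is a one-line calculation in each direction. The sketch below simply restates the two formulas from the slide; the results are only meaningful to the extent that the island-model assumptions listed above hold.

```python
def nm_from_fst(fst):
    """Migrants per generation implied by FST under the island model."""
    if not 0 < fst <= 1:
        raise ValueError("FST must be in the interval (0, 1]")
    return 0.25 * (1 / fst - 1)


def fst_from_nm(nm):
    """Equilibrium FST expected for a given number of migrants."""
    return 1 / (4 * nm + 1)


for fst in (0.05, 0.2, 0.5):
    print(fst, round(nm_from_fst(fst), 2))

print(round(fst_from_nm(1), 2))  # one migrant per generation
```

FST = 0.2 corresponds to one migrant per generation, and smaller FST values imply progressively more gene flow.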
Gene flow
Non-equilibrium situation Equilibrium situation

Assignment tests:
Maximum likelihood to identify individuals that dispersed – individuals assigned to the population from which they have the highest probability of originating
- also Bayesian approach – several scenarios can be compared simultaneously
Gene flow
Alternative models
Stepping stone model
Island model
Fig. 7.9 Rowe & Sweet

Metapopulations
A metapopulation is a population of subpopulations within a single species with many possible patterns of gene flow between the subpopulations
Characterized by recurrent extinctions and recolonizations
Common toad (Bufo bufo)
Figure 7.7 Beebee & Sweet

Metapopulations
Most notably in butterflies, toads, fish
Fragmented habitats, but absence of barriers to dispersal:
Local extinctions and recolonizations
Source and sink populations
Monarch butterfly (Danaus plexippus)
Poecilia reticulata
Island model; Island-mainland model
Predictions:
• Population bottleneck – reduced genetic variation
• Genetic differentiation between demes should vary over time
Interaction of the factors affecting population differentiation
Is population differentiation due to
genetic drift or selection?
…and how does gene flow moderate the action of drift and selection?
Gene flow and genetic drift
Fst = 0.2 is equivalent to Nem = 1 (one migrant per generation)
As little as one migrant per generation could be sufficient to prevent differentiation by genetic drift!
The interaction of:
Effective population size
Genetic drift
Selection
Gene flow
How to identify selection
In two weeks:
Discordant patterns of differentiation
Clinal variations in allele frequencies
Calculating the ratio of non-synonymous to synonymous substitutions (dN/dS): the ratio should be higher in genes under positive selection
Molecular markers for population genetics
Considerations:
Cheap or expensive?
Technical complexity (easy to use?)
Likely to be neutral or non-neutral?
Allozymes
Microsatellites
mtDNA
RAPD
AFLP
cpDNA
rDNA spacers (ITS)
Introns
SNPs
Tests:
Hardy-Weinberg (HW)
Linkage disequilibrium (LD)
Neutrality tests (Tajima’s D; Fu & Li’s tests)
Use complementary information from different markers!
