R-Lequin Graphic Description

R-lequin - Description of the graphics
Table of contents:
1. Mismatch distribution
2. FST Matrix
3. Population average pairwise difference
4. Haplotype distance matrix
5. Haplotype distance matrix between/within two populations
6. Haplotype distance matrix between/within populations and groups
7. Expected/observed haplotype
8. Haplotype frequencies in populations
9. Heterozygosity
10. Allelic size range
11. Molecular diversity indexes
12. Divergence times allowing for unequal population sizes (tau)
13. Population assignment test

1. Mismatch distribution

[Figure: Mismatch distribution. x axis: number of differences; y axis: number;
legend: Observed, CI 0.05.]

The mismatch distribution is the distribution of the observed number of
differences between pairs of haplotypes. Populations at demographic equilibrium
usually show a multimodal distribution. Unimodal mismatch distributions have
been interpreted as the result of past demographic expansions (Slatkin and
Hudson 1991, Rogers and Harpending 1992). However, a spatial expansion can lead
to the same unimodal mismatch distribution if neighboring subpopulations
exchange many migrants (Ray et al. 2003, Excoffier 2004).

In this graphic the x axis shows the number of differences between pairs of
haplotypes and the y axis their frequencies. The solid line indicates the
observed frequency and the dashed lines the 95% confidence interval
(alpha = 0.05).
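
As a rough illustration of the definition above, the following Python sketch
counts the differences between every pair of haplotypes and tabulates how often
each count occurs. The function names and the toy sequences are assumptions made
for this example; they are not part of R-lequin or Arlequin, and no confidence
interval is computed.

```python
from collections import Counter
from itertools import combinations


def n_differences(hap_a, hap_b):
    """Count the positions at which two aligned haplotypes differ."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must have the same length")
    return sum(a != b for a, b in zip(hap_a, hap_b))


def mismatch_distribution(haplotypes):
    """Return a list whose entry d is the number of pairs differing at d sites."""
    counts = Counter(n_differences(a, b) for a, b in combinations(haplotypes, 2))
    if not counts:
        return []
    return [counts.get(d, 0) for d in range(max(counts) + 1)]


# Hypothetical toy sample of four aligned haplotypes.
sample = ["AAAA", "AAAT", "AATT", "TTTT"]
observed = mismatch_distribution(sample)
print(observed)  # [0, 2, 2, 1, 1]
```

The list index plays the role of the x axis (number of differences) and the
value the role of the y axis (number of haplotype pairs).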

2. FST Matrix

[Figure: Distance matrix: No. of different alleles (FST). Both axes: Population;
colour legend: FST.]

The fixation index FST is a measure of population differentiation based on
genetic polymorphism data. It compares the genetic variability within and
between populations. A common definition given by Hudson et al. (1992) is:

    FST = 1 - Hw/Hb

where Hw is the mean number of differences between different sequences sampled
from the same subpopulation and Hb is the mean number of differences between
sequences sampled from two different subpopulations. The average pairwise
difference within a population can be calculated as the sum of the pairwise
differences divided by the number of pairs. The pairwise FST can be used as a
short-term genetic distance between populations, with a slight transformation
to linearize the distance with population divergence time (Excoffier et al.
2005).

In this graphic you can see the pairwise FST values between each pair of
populations. The FST values are colour-coded; the legend is on the right side.
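
The definition FST = 1 - Hw/Hb can be tried out on a toy data set. The sketch
below is only an illustration under stated assumptions: Hw is taken as the
unweighted average of the two within-population means, the function names and
sequences are hypothetical, and each population needs at least two sequences.
Arlequin's own computation may differ in its details.

```python
from itertools import combinations, product


def n_differences(seq_a, seq_b):
    """Count the positions at which two aligned sequences differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b))


def mean_within(pop):
    """Sum of pairwise differences within a population / number of pairs."""
    pairs = list(combinations(pop, 2))
    return sum(n_differences(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop_x, pop_y):
    """Mean number of differences between sequences of two populations."""
    pairs = list(product(pop_x, pop_y))
    return sum(n_differences(a, b) for a, b in pairs) / len(pairs)


def hudson_fst(pop_x, pop_y):
    """FST = 1 - Hw/Hb for one pair of populations."""
    # Assumption: both populations contribute equally to Hw.
    h_w = (mean_within(pop_x) + mean_within(pop_y)) / 2
    h_b = mean_between(pop_x, pop_y)
    if h_b == 0:
        raise ValueError("no differences between populations; FST is undefined")
    return 1 - h_w / h_b


# Hypothetical toy populations.
pop_x = ["AAAA", "AAAT"]
pop_y = ["TTTT", "TTTA"]
fst = hudson_fst(pop_x, pop_y)
print(round(fst, 3))  # 0.714
```

Here Hw = 1 and Hb = 3.5, so FST = 1 - 1/3.5.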

3. Population average pairwise difference

[Figure: Population average pairwise difference. Both axes: Population; three
colour legends: average number of pairwise differences between populations
(PiXY), average number of pairwise differences within population (PiX),
corrected average pairwise difference (PiXY-(PiX+PiY)/2).]

The population average pairwise difference is the mean number of pairwise
differences. In this graphic the upper half of the matrix (green) shows the
average number of pairwise differences between each pair of populations. The
average number of pairwise differences within each population is shown on the
diagonal (orange). The lower half of the matrix (blue) shows the corrected
average pairwise difference between the populations
(between xy - (within x + within y) / 2). The corresponding legends for the
colour codes are on the right side.

4. Haplotype distance matrix

[Figure: Haplotype distance matrix; colour legend on the right.]

The inter-haplotypic distance matrix is simply the number of different alleles
between each pair of haplotypes. The legend of the colour code is on the right
side.
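
Both the population average pairwise difference matrix and the haplotype
distance matrix reduce to counting differences between pairs of haplotypes. The
following Python sketch builds both for a toy data set. All function names and
data are assumptions made for illustration, and the layout (PiXY above the
diagonal, PiX on it, the corrected value below) simply mirrors the description
of the population graphic.

```python
from itertools import combinations, product


def n_differences(hap_a, hap_b):
    """Count the positions at which two aligned haplotypes differ."""
    return sum(a != b for a, b in zip(hap_a, hap_b))


def haplotype_distance_matrix(haplotypes):
    """Symmetric matrix of the number of differences between haplotypes."""
    return [[n_differences(a, b) for b in haplotypes] for a in haplotypes]


def mean_differences(pairs):
    """Mean number of differences over an iterable of haplotype pairs."""
    pairs = list(pairs)
    return sum(n_differences(a, b) for a, b in pairs) / len(pairs)


def population_matrix(populations):
    """Upper half: PiXY; diagonal: PiX; lower half: PiXY - (PiX + PiY) / 2."""
    k = len(populations)
    within = [mean_differences(combinations(pop, 2)) for pop in populations]
    matrix = [[0.0] * k for _ in range(k)]
    for i in range(k):
        matrix[i][i] = within[i]
        for j in range(i + 1, k):
            pi_xy = mean_differences(product(populations[i], populations[j]))
            matrix[i][j] = pi_xy
            matrix[j][i] = pi_xy - (within[i] + within[j]) / 2
    return matrix


# Hypothetical toy data.
populations = [["AAAA", "AAAT"], ["TTTT", "TTTA"]]
print(haplotype_distance_matrix(["AAAA", "AAAT", "TTTT"]))
# [[0, 1, 4], [1, 0, 3], [4, 3, 0]]
print(population_matrix(populations))
# [[1.0, 3.5], [2.5, 1.0]]
```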

5. Haplotype distance matrix between/within two populations

[Figure: Haplotype distance matrix between/within populations. Both axes:
Haplotype (labels AM_1 to AM_12), grouped into Population 1 and Population 2.]

The haplotype distance matrix is simply the number of different alleles between
each pair of haplotypes within or between two populations.

Black solid lines separate the haplotypes of the two populations. The lower left
corner of the graphic shows the haplotype distance matrix between the
populations. The upper left and lower right corners show the haplotype distances
within the populations. The legend of the colour code is on the right side.
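
The block layout described for this graphic can be reproduced with a small
Python sketch. It assumes that the haplotypes of population 1 are listed before
those of population 2 and that row 0 is drawn at the top; the function names and
sequences are hypothetical.

```python
def n_differences(hap_a, hap_b):
    """Count the positions at which two aligned haplotypes differ."""
    return sum(a != b for a, b in zip(hap_a, hap_b))


def block_distance_matrices(pop_1, pop_2):
    """Return the full matrix and its within/between population blocks."""
    combined = pop_1 + pop_2
    full = [[n_differences(a, b) for b in combined] for a in combined]
    n1 = len(pop_1)
    within_1 = [row[:n1] for row in full[:n1]]  # upper left block
    within_2 = [row[n1:] for row in full[n1:]]  # lower right block
    between = [row[:n1] for row in full[n1:]]   # lower left block
    return full, within_1, within_2, between


# Hypothetical toy populations.
pop_1 = ["AAAA", "AAAT"]
pop_2 = ["TTTT"]
full, within_1, within_2, between = block_distance_matrices(pop_1, pop_2)
print(within_1)  # [[0, 1], [1, 0]]
print(within_2)  # [[0]]
print(between)   # [[4, 3]]
```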

6. Haplotype distance matrix between/within populations and groups

[Figure: Haplotype distance matrix between/within populations and groups.
Haplotypes (labels AM_1 to AM_12) grouped into Population 1 and Population 2
(Group 1) and Population 3 and Population 4 (Group 2).]

The haplotype distance matrix is simply the number of different alleles between
each pair of haplotypes within or between two populations or groups.

This graphic consists of three panels. The panel in the upper left corner shows
the haplotype distance matrix within group 1. Black solid lines separate the
haplotypes of the two populations. The lower left part of this panel shows the
haplotype distance matrix between the populations, and the upper left and lower
right parts show the haplotype distances within each population.

The panel in the lower left corner shows the haplotype distance matrix between
group 1 and group 2. The populations are again separated by black lines. The
panel in the lower right corner shows the haplotype distance matrix within
group 2; as before, the populations are separated by solid black lines.

The legend of the colour code is in the upper right corner.

7. Expected/observed haplotype

[Figure: Haplotype Frequency. x axis: Allele (1 to 60); y axis: relative
frequency; legend: Observed, Expected, Expected +/- sd.]

This graphic shows the observed haplotype frequencies and the expected haplotype
frequencies, with their standard deviation, for the different alleles. The
expected haplotype frequencies are calculated under the infinite-allele model,
as predicted by the Ewens (1972) sampling distribution. The null distribution of
the haplotype frequencies is generated by simulating random neutral samples
having the same number of genes and the same number of haplotypes, using the
algorithm of Stewart (1977). It can be used to test the hypothesis of selective
neutrality and population equilibrium against either balancing selection or the
presence of advantageous alleles (Excoffier et al. 2005). Watterson (1978) has
shown that the homozygosity is a good statistic for testing departures from
selective neutrality in the direction of heterozygote advantage or disadvantage.
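
The homozygosity statistic mentioned above is the sum of the squared haplotype
frequencies. The Python sketch below computes only this observed value from
haplotype counts; it does not simulate the null distribution or carry out the
test itself. The function names and the counts are assumptions made for
illustration.

```python
def haplotype_frequencies(counts):
    """Convert haplotype counts into relative frequencies."""
    total = sum(counts)
    return [c / total for c in counts]


def observed_homozygosity(counts):
    """Sum of squared haplotype frequencies."""
    return sum(p * p for p in haplotype_frequencies(counts))


# Hypothetical sample: three haplotypes observed 5, 3 and 2 times.
counts = [5, 3, 2]
homozygosity = observed_homozygosity(counts)
print(round(homozygosity, 2))  # 0.38
```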

8. Haplotype frequencies in populations

[Figure: Haplotype frequencies in populations. x axis: Haplotype; y axis:
Haplotype frequency (0.0 to 1.0); legend (Population): Tharu, Oriental, Wolof,
Peul, Pima, Maya, Finnish, Sicilian, Israeli_Jew, Israeli_Arab.]

This graphic shows the frequency of each haplotype in different populations.
Each population has a line of a different colour or style, which is explained in
the legend in the upper right corner.

With this graphic you can see the dominant haplotypes in different populations
and can therefore compare the populations with each other.

9. Heterozygosity

[Figure: Heterozygosity. x axis: Loci; y axis: heterozygosity.]

In this graphic you can see the heterozygosity of each observed locus.

Heterozygosity is the fraction of individuals in a population that are
heterozygous for a particular locus.
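
Under the definition given above, the heterozygosity at one locus can be
computed directly from diploid genotypes. This is a minimal sketch; the function
name and the genotype encoding (a pair of allele labels per individual) are
assumptions made for illustration.

```python
def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at one locus."""
    if not genotypes:
        raise ValueError("at least one genotype is required")
    return sum(a != b for a, b in genotypes) / len(genotypes)


# Hypothetical genotypes of four individuals at a single locus.
locus_genotypes = [("A", "A"), ("A", "T"), ("T", "A"), ("T", "T")]
print(observed_heterozygosity(locus_genotypes))  # 0.5
```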

10. Allelic size range

[Figure: Allelic size range at different loci. x axis: Population; y axis:
Allelic size; legend (Locus): 1, 2, 3, 4.]

This graphic shows the allelic size range for each population. Each colour
shows the allelic size at one locus across the populations (see the Locus
legend).

The allelic size range is the difference between the maximum and the minimum
number of repeats for microsatellite data (Excoffier et al. 2005).
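
The definition of the allelic size range translates into a one-line
computation. In this sketch the function name, the locus names, and the repeat
counts are hypothetical.

```python
def allelic_size_range(repeat_counts):
    """Maximum minus minimum number of repeats observed at one locus."""
    return max(repeat_counts) - min(repeat_counts)


# Hypothetical repeat counts observed in one population at two loci.
repeats_by_locus = {"locus_1": [10, 12, 15], "locus_2": [7, 7, 9]}
ranges = {locus: allelic_size_range(r) for locus, r in repeats_by_locus.items()}
print(ranges)  # {'locus_1': 5, 'locus_2': 2}
```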

11. Molecular diversity indexes

[Figure: Molecular diversity indexes. x axis: Population (Tharu, Oriental,
Wolof, Peul, Pima, Maya, Finnish, Sicilian, Israeli_Jew, Israeli_Arab); legend:
Theta k (CI 0.05), Theta H (+/- sd), Theta S (+/- sd), Theta pi (+/- sd).]

Theta (θ) is a population parameter of genetic diversity. θ = 2Mu, where M is
equal to 2N for diploid populations of size N, or to N for haploid populations,
and u is the overall mutation rate at the haplotype level (Excoffier et al.
2005).

This graphic shows the values and the standard deviations of four different
diversity indexes in different populations. Solid lines of different colours
show the values of the diversity indexes, and dashed lines in the same colour
show the corresponding standard deviations.

Theta H:
is calculated from the expected homozygosity (H) in a population at equilibrium
between drift and mutation (see Arlequin 3.1 manual p. 92).

Theta S:
is estimated from the infinite-site equilibrium relationship (Watterson, 1975)
between the number of segregating sites (S), the sample size (n) and theta (θ)
for a sample of non-recombining DNA (see Arlequin 3.1 manual p. 93).

Theta k:
is estimated from the infinite-allele equilibrium relationship (Ewens, 1972)
between the expected number of alleles (k), the sample size (n) and theta (θ)
(see Arlequin 3.1 manual p. 94).

Theta pi:
is estimated from the infinite-site equilibrium relationship between the mean
number of pairwise differences (π) and theta (θ) (Tajima, 1983) (see Arlequin
3.1 manual p. 94).
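
Two of these estimators have simple textbook forms that can be sketched in
Python: Watterson's Theta S = S / a_n with a_n = 1 + 1/2 + ... + 1/(n - 1), and
Theta pi as the mean number of pairwise differences. The code below is an
illustration under these assumptions, with hypothetical function names and toy
sequences; it gives per-sequence values without standard deviations and is not
Arlequin's implementation.

```python
from itertools import combinations


def theta_s(sequences):
    """Watterson's estimator: segregating sites divided by a_n."""
    n = len(sequences)
    # A site is segregating if more than one state is observed in its column.
    segregating = sum(len(set(column)) > 1 for column in zip(*sequences))
    a_n = sum(1 / i for i in range(1, n))
    return segregating / a_n


def theta_pi(sequences):
    """Mean number of pairwise differences between sequences."""
    pairs = list(combinations(sequences, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


# Hypothetical sample of four aligned sequences (n = 4, S = 4).
sample = ["AAAA", "AAAT", "AATT", "TTTT"]
print(round(theta_s(sample), 3))   # 2.182
print(round(theta_pi(sample), 3))  # 2.167
```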

12. Divergence times allowing for unequal population sizes (tau)

[Figure: Divergence times allowing for unequal population sizes (tau). Both
axes: Population (1 to 10); colour legend: tau.]

The divergence time (tau) between populations of unequal size is estimated
following Gaggiotti and Excoffier (2000). The model assumes that two populations
have diverged from an ancestral population of size N0 some T generations in the
past and have remained isolated from each other ever since. The sizes of the
daughter populations can differ, but they sum to the size of the ancestral
population. The divergence time, scaled by the mutation rate, is estimated from
the average number of pairwise differences between and within populations. The
estimated values should be interpreted with caution.

The procedure implemented is based on the comparison of intra- and
inter-population diversities (π's), which have a large variance. For short
divergence times, the average diversity found within populations can therefore
be larger than that observed between populations, which can lead to negative
divergence times (Excoffier et al., 2005).

In this graphic you can see the divergence time (tau) between each pair of
populations. The legend of the colour code is on the right side.

13. Population assignment test

[Figure: Population assignment test. Axes: Log(L(Population 1)) and
Log(L(Population 2)); legend: Population 1, Population 2.]

We may be interested in detecting the number of migrants that are presently
exchanged between populations, in detecting recent immigrants in a given
population, or in determining the origin of particular individuals. The
population assignment test determines the log-likelihood of each individual
multi-locus genotype in each population sample, assuming that the individual
comes from that population. The allele frequencies estimated in each sample from
the original constitution of the sample are used to calculate the likelihood.
All loci are assumed to be independent, so that the global individual likelihood
is obtained as the product of the likelihoods at each locus (Excoffier et al.
2005).

In the graphic the log-likelihoods for the two populations are shown on the two
axes. The line indicates equal probability of belonging to population 1 or
population 2. The further the genotypes of the individuals lie above the line,
the better they are explained by belonging to population 2 than to population 1;
the further they lie below the line, the better they are explained by belonging
to population 1.

With this graphic we can try to detect outsider individuals in a given
population, allocate migrant individuals to different source populations, infer
movements of animals over different years, detect admixed populations, detect
hybridisation events, and so on. However, interpreting these results in terms of
gene flow is difficult. In addition, there are some problems, for example with
rare and private alleles, linkage disequilibrium, or individuals that may be
chimeric for the source of their genes.
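
The likelihood computation described for the assignment test can be sketched as
follows. The product over independent loci comes from the description above; the
per-locus genotype probability (p*p for a homozygote, 2*p*q for a heterozygote,
that is, Hardy-Weinberg proportions) is an assumption of this example, as are
the function name, the data layout, and the allele frequencies.

```python
import math


def genotype_log_likelihood(genotype, allele_freqs):
    """Log-likelihood of a multi-locus genotype in one population sample.

    genotype: one (allele_1, allele_2) pair per locus.
    allele_freqs: one {allele: frequency} dict per locus.
    """
    total = 0.0
    for (a1, a2), freqs in zip(genotype, allele_freqs):
        p, q = freqs.get(a1, 0.0), freqs.get(a2, 0.0)
        # Assumption: Hardy-Weinberg genotype proportions at each locus.
        prob = p * p if a1 == a2 else 2 * p * q
        if prob == 0:
            return float("-inf")  # allele not observed in this sample
        total += math.log(prob)  # a product of likelihoods is a sum of logs
    return total


# Hypothetical allele frequencies at two loci in two population samples.
pop_1 = [{"A": 0.9, "B": 0.1}, {"C": 0.5, "D": 0.5}]
pop_2 = [{"A": 0.2, "B": 0.8}, {"C": 0.5, "D": 0.5}]
individual = [("A", "A"), ("C", "D")]

ll_1 = genotype_log_likelihood(individual, pop_1)
ll_2 = genotype_log_likelihood(individual, pop_2)
print(ll_1 > ll_2)  # True
```

Plotting the two log-likelihoods of every individual against each other gives
the kind of scatter described for this graphic. An allele that is absent from a
sample gives a likelihood of zero, which is one face of the problem with rare
and private alleles mentioned above.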
