
R-lequin - Description of the graphics

Table of contents:
1. Mismatch distribution
2. FST Matrix
3. Population average pairwise difference
4. Haplotype distance matrix
5. Haplotype distance matrix between/within two populations
6. Haplotype distance matrix between/within populations and groups
7. Expected/observed haplotype
8. Haplotype frequencies in populations
9. Heterozygosity
10. Allelic size range
11. Molecular diversity indexes
12. Divergence times allowing for unequal population sizes (tau)
13. Population assignment test

Mismatch distribution

[Figure: Mismatch distribution. x axis: differences; y axis: number; legend:
Observed, CI0.05]

The mismatch distribution is the distribution of the observed number of
differences between pairs of haplotypes. Populations at demographic equilibrium
usually show a multimodal distribution. Unimodal mismatch distributions have
been interpreted as the result of past demographic expansions (Slatkin and
Hudson 1991, Rogers and Harpending 1992). A spatial expansion can produce the
same unimodal mismatch distribution if neighboring subpopulations exchange many
migrants (Ray et al. 2003, Excoffier 2004).

In this graphic the x axis shows the number of differences between pairs of
haplotypes and the y axis shows their frequencies. The solid line indicates the
observed frequencies and the dashed lines the 95% confidence interval
(α = 0.05).
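The counting behind a mismatch distribution can be illustrated with a minimal sketch. It assumes aligned haplotypes given as equal-length strings; the function names and the sample data are hypothetical and are not taken from R-lequin or Arlequin. The confidence interval shown in the graphic is not computed here.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(a, b):
    """Number of positions at which two equal-length haplotypes differ."""
    if len(a) != len(b):
        raise ValueError("haplotypes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def mismatch_distribution(haplotypes):
    """Map each number of differences to the number of haplotype pairs showing it."""
    counts = Counter(
        pairwise_differences(a, b) for a, b in combinations(haplotypes, 2)
    )
    return dict(sorted(counts.items()))


# Hypothetical sample of four aligned haplotypes (one string per gene copy).
sample = ["ACGTAC", "ACGTTC", "ACCTTC", "TCGTAC"]
print(mismatch_distribution(sample))  # {1: 3, 2: 2, 3: 1}
```

The keys correspond to the x axis of the graphic (differences) and the values to the y axis (number of pairs).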

1. FST Matrix

[Figure: Distance matrix: No. of different alleles (FST). Both axes:
Population; colour legend: Number of different alleles (FST)]

The fixation index FST is a measure of population differentiation based on
genetic polymorphism data. It compares the genetic variability within and
between populations. A common definition given by Hudson et al. (1992) is:

FST = 1 - Hw/Hb

where Hw is the mean number of differences between different sequences sampled
from the same subpopulation and Hb is the mean number of differences between
sequences sampled from two different subpopulations. The average pairwise
difference within a population can be calculated as the sum of the pairwise
differences divided by the number of pairs. The pairwise FST can be used as a
short-term genetic distance between populations, with a slight transformation
to linearize the distance with population divergence time (Excoffier et al.
2005).

In this graphic you can see the pairwise FST values between each pair of
populations. The FST values are colour-coded; the legend is on the right side.
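As an illustration of the definition FST = 1 - Hw/Hb, here is a minimal sketch for two samples of aligned sequences. It assumes that Hw is the unweighted mean of the two within-sample averages, which is a simplification chosen for this example; the function names and data are hypothetical and this is not the R-lequin or Arlequin implementation.

```python
from itertools import combinations, product


def n_diff(a, b):
    """Number of differing positions between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(pop):
    """Sum of pairwise differences within a sample divided by the number of pairs."""
    pairs = list(combinations(pop, 2))
    return sum(n_diff(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop_x, pop_y):
    """Mean number of differences between sequences from two different samples."""
    pairs = list(product(pop_x, pop_y))
    return sum(n_diff(a, b) for a, b in pairs) / len(pairs)


def pairwise_fst(pop_x, pop_y):
    """FST = 1 - Hw/Hb; Hw is the unweighted mean of both within-sample values (assumption)."""
    hw = (mean_within(pop_x) + mean_within(pop_y)) / 2
    hb = mean_between(pop_x, pop_y)
    if hb == 0:
        raise ValueError("Hb is zero: the samples contain identical sequences only")
    return 1 - hw / hb


# Hypothetical samples of three sequences each.
pop_1 = ["AAAA", "AAAT", "AAAA"]
pop_2 = ["TTAA", "TTAT", "TTAA"]
print(round(pairwise_fst(pop_1, pop_2), 3))  # 0.727
```

Applying `pairwise_fst` to every pair of samples yields the matrix that the graphic displays as colours.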
2. Population average pairwise difference

[Figure: Population average pairwise difference. Both axes: Population; colour
legends: Average number of pairwise differences between populations (PiXY),
Average number of pairwise differences within population (PiX), Corrected
average pairwise difference (PiXY-(PiX+PiY)/2)]

The population average pairwise difference is the mean number of pairwise
differences. In this graphic you can see the average number of pairwise
differences between each pair of populations in the upper half of the matrix
(green). The average number of pairwise differences within each population is
shown on the diagonal (orange). The lower half of the matrix (blue) shows the
corrected average pairwise difference between the populations
(between xy - (within x + within y) / 2). The corresponding legends to the
colour codes are on the right side.

3. Haplotype distance matrix

[Figure: Haplotype distance matrix]

The inter-haplotypic distance matrix is simply the number of different alleles
between each pair of haplotypes. The legend of the colour code is on the right
side.
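The distance used in the haplotype distance matrices, the number of different alleles between two haplotypes, can be sketched as follows. The sketch assumes that each haplotype is a tuple with one allele per locus; the function name and the data are hypothetical.

```python
def haplotype_distance_matrix(haplotypes):
    """Symmetric matrix; entry [i][j] counts the loci at which haplotypes i and j differ."""
    n = len(haplotypes)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = sum(a != b for a, b in zip(haplotypes[i], haplotypes[j]))
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical haplotypes, each with alleles at four loci.
haplotypes = [(1, 2, 2, 1), (1, 2, 3, 1), (2, 2, 3, 4)]
for row in haplotype_distance_matrix(haplotypes):
    print(row)
# [0, 1, 3]
# [1, 0, 2]
# [3, 2, 0]
```

Sorting the haplotypes by population before building the matrix gives the block layout of the between/within graphics, where solid lines separate the populations.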

4. Haplotype distance matrix between/within two populations

[Figure: haplotype distance matrix between/within populations. Both axes:
Haplotype (labels AM_1 to AM_12), grouped into Population 1 and Population 2]

The haplotype distance matrix is simply the number of different alleles between
each pair of haplotypes within or between two populations. Black solid lines
separate the haplotypes of the two populations. In the lower left corner of the
graphic you can see the haplotype distance matrix between the populations. In
the upper left and lower right corners the haplotype distances within the
populations are shown. The legend of the colour code is on the right side.

5. Haplotype distance matrix between/within populations and groups

[Figure: haplotype distance matrix between/within populations and groups. Both
axes: haplotypes (labels AM_1 to AM_12), grouped into Population 1 to
Population 4 and into Group 1 and Group 2]

The haplotype distance matrix is simply the number of different alleles between
each pair of haplotypes within or between two populations or groups.

This graphic consists of three panels. The panel in the upper left corner shows
the haplotype distance matrix within group 1. Black solid lines separate the
haplotypes of the two populations. In the lower left corner of this panel you
can see the haplotype distance matrix between the populations, and in its upper
left and lower right corners the haplotype distances within each population are
shown.
The panel in the lower left corner shows the haplotype distance matrix between
group 1 and group 2. The populations are also separated by black lines. The
panel in the lower right corner shows the haplotype distance matrix within
group 2; as before, the populations are separated by solid black lines.
The legend of the colour code is in the upper right corner.

6. Expected/observed haplotype

[Figure: Haplotype Frequency. x axis: Allele; y axis: relative frequency;
legend: Observed, Expected, Expected +/- sd]

This graphic shows the observed haplotype frequencies and the expected
haplotype frequencies with their standard deviations for the different alleles.
The expected haplotype frequencies are calculated under the infinite-allele
model as predicted by the Ewens (1972) sampling distribution. The null
distribution of the haplotype frequencies is generated by simulating random
neutral samples having the same number of genes and the same number of
haplotypes, using the algorithm of Stewart (1977). It can be used to test the
hypothesis of selective neutrality and population equilibrium against either
balancing selection or the presence of advantageous alleles (Excoffier et al.
2005). Watterson (1978) has shown that the homozygosity is a good statistic for
testing departures from selective neutrality in the direction of heterozygote
advantage or disadvantage.
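The observed side of this comparison can be sketched in a few lines. The example assumes that the homozygosity statistic is the sum of squared observed haplotype frequencies, which is its usual definition; the simulation of the null distribution with the algorithm of Stewart (1977) is not reproduced here. Function names and counts are hypothetical.

```python
def relative_frequencies(haplotype_counts):
    """Observed relative haplotype frequencies, sorted from most to least frequent."""
    n = sum(haplotype_counts)
    if n == 0:
        raise ValueError("empty sample")
    return sorted((c / n for c in haplotype_counts), reverse=True)


def observed_homozygosity(haplotype_counts):
    """Sum of squared observed haplotype frequencies (assumed definition of the statistic)."""
    return sum(p * p for p in relative_frequencies(haplotype_counts))


# Hypothetical counts of three haplotypes in a sample of ten genes.
counts = [2, 5, 3]
print(relative_frequencies(counts))
print(round(observed_homozygosity(counts), 2))  # 0.38
```

In a neutrality test, the observed value would be compared with the values obtained from the simulated neutral samples.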

7. Haplotype frequencies in populations

[Figure: Haplotype frequencies in populations. x axis: Haplotype; y axis:
Haplotype frequency; legend: Population: Tharu, Oriental, Wolof, Peul, Pima,
Maya, Finnish, Sicilian, Israeli_Jew, Israeli_Arab]

This graphic shows the frequency of each haplotype in different populations.
Each population has a line of a different colour or style, which you can see in
the legend in the upper right corner.
With this graphic you can see the dominant haplotypes in the different
populations and can therefore compare the populations with each other.

8. Heterozygosity

[Figure: Heterozygosity. x axis: Loci; y axis: heterozygosity]

Heterozygosity is the fraction of individuals in a population that are
heterozygous for a particular locus.

In this graphic you can see the heterozygosity of each observed locus.
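Following the definition given in the text (the fraction of heterozygous individuals at a locus), a minimal sketch could look like this. It assumes diploid genotypes stored as pairs of alleles; the function name and the data are hypothetical, and other heterozygosity estimators exist that this example does not cover.

```python
def observed_heterozygosity(genotypes):
    """Fraction of individuals whose two alleles differ at one locus."""
    if not genotypes:
        raise ValueError("no genotypes given")
    heterozygous = sum(1 for a, b in genotypes if a != b)
    return heterozygous / len(genotypes)


# Hypothetical data: locus name -> genotypes (allele pairs) of four individuals.
loci = {
    "locus_1": [(120, 120), (120, 124), (118, 124), (124, 124)],
    "locus_2": [(7, 7), (7, 7), (7, 9), (7, 7)],
}
per_locus = {name: observed_heterozygosity(g) for name, g in loci.items()}
print(per_locus)  # {'locus_1': 0.5, 'locus_2': 0.25}
```

Plotting the values of `per_locus` against the locus index gives the kind of curve shown in the graphic.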

9. Allelic size range

[Figure: Allelic size range at different loci. x axis: Population; y axis:
Allelic size; legend: Locus: 1, 2, 3, 4]

The allelic size range is the difference between the maximum and the minimum
number of repeats for microsatellite data (Excoffier et al. 2005).

This graphic shows the allelic size range for each population. The different
colours indicate the allelic size range at the different loci.
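The definition of the allelic size range translates directly into code. The sketch below assumes microsatellite alleles coded as numbers of repeats; the population names, locus names and values are hypothetical.

```python
def allelic_size_range(repeat_numbers):
    """Difference between the maximum and the minimum number of repeats."""
    return max(repeat_numbers) - min(repeat_numbers)


# Hypothetical data: population -> locus -> repeat numbers observed in the sample.
data = {
    "Pop1": {"locus_1": [10, 12, 15], "locus_2": [7, 7, 9]},
    "Pop2": {"locus_1": [11, 11, 11], "locus_2": [6, 10, 8]},
}
ranges = {
    pop: {locus: allelic_size_range(alleles) for locus, alleles in loci.items()}
    for pop, loci in data.items()
}
print(ranges)
# {'Pop1': {'locus_1': 5, 'locus_2': 2}, 'Pop2': {'locus_1': 0, 'locus_2': 4}}
```

The nested dictionary mirrors the graphic: one entry per population on the x axis and one value per locus.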

10. Molecular diversity indexes

[Figure: molecular diversity indexes. x axis: Population (Tharu, Oriental,
Wolof, Peul, Pima, Maya, Finnish, Israeli_Jew, Sicilian, Israeli_Arab); legend:
Theta k (CI 0.05), Theta H (+/- sd), Theta S (+/- sd), Theta pi (+/- sd)]

Theta (θ) is a population parameter of genetic diversity: θ = 2Mu, where M is
equal to 2N for diploid populations of size N or equal to N for haploid
populations, and u is the overall mutation rate at the haplotype level
(Excoffier et al. 2005).

This graphic shows the values and the standard deviations of four different
diversity indexes in different populations. The solid lines of different
colours show the values of the diversity indexes, and the dashed lines in the
same colour show the corresponding standard deviations.

Theta H: is calculated from the expected homozygosity (H) in a population at
equilibrium between drift and mutation (see Arlequin 3.1 manual p. 92).

Theta S: is estimated from the infinite-site equilibrium relationship
(Watterson, 1975) between the number of segregating sites (S), the sample size
(n) and theta (θ) for a sample of non-recombining DNA (see Arlequin 3.1 manual
p. 93).

Theta k: is estimated from the infinite-allele equilibrium relationship (Ewens,
1972) between the expected number of alleles (k), the sample size (n) and theta
(θ) (see Arlequin 3.1 manual p. 94).

Theta pi: is estimated from the infinite-site equilibrium relationship between
the mean number of pairwise differences (π) and theta (θ) (Tajima, 1983) (see
Arlequin 3.1 manual p. 94).
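Two of these estimators are simple enough for a minimal sketch. It assumes the standard textbook forms: Theta S as S divided by the sum of 1/i for i from 1 to n-1, and Theta pi as the mean number of pairwise differences. Standard deviations, Theta H and Theta k are not computed, and the exact formulas implemented in Arlequin may differ in detail. Function names and sequences are hypothetical.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Number of alignment columns that contain more than one state (S)."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def theta_s(seqs):
    """Theta S = S / a_n with a_n = 1 + 1/2 + ... + 1/(n-1) (assumed standard form)."""
    n = len(seqs)
    a_n = sum(1 / i for i in range(1, n))
    return segregating_sites(seqs) / a_n


def theta_pi(seqs):
    """Theta pi = mean number of pairwise differences between sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs)


# Hypothetical sample of four aligned, non-recombining sequences.
sample = ["ACGTAC", "ACGTTC", "ACCTTC", "TCGTAC"]
print(segregating_sites(sample))    # 3
print(round(theta_s(sample), 3))    # 1.636
print(round(theta_pi(sample), 3))   # 1.667
```

Both functions estimate the same parameter from different summaries of the data, which is why the graphic can show them on a common axis.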

11. Divergence times allowing for unequal population sizes (tau)

[Figure: Divergence times allowing for unequal population sizes (tau). Both
axes: Population; colour legend: tau]

The divergence time (tau) between populations of unequal size is estimated
following Gaggiotti and Excoffier (2000). The model assumes that two
populations have diverged from an ancestral population of size N0 some T
generations in the past and have remained isolated from each other ever since.
The sizes of the daughter populations can be different, but their sum adds up
to the size of the ancestral population. From the average numbers of pairwise
differences between and within populations, the divergence time scaled by the
mutation rate is estimated. The estimated values should be interpreted with
caution. The procedure implemented is based on the comparison of intra- and
inter-population diversities (π's), which have a large variance. This means
that, for short divergence times, the average diversity found within
populations could be larger than that observed between populations, which could
lead to negative divergence times (Excoffier et al., 2005).

In this graphic you can see the divergence time (tau) between each pair of
populations. The legend of the colour code is on the right side.

12. Population assignment test

[Figure: Population assignment test. Axes: Log(L(Population 1)) and
Log(L(Population 2)); legend: Population 1, Population 2]

We may be interested in detecting the number of migrants that are presently
exchanged between populations, in detecting recent immigrants in a given
population, or in determining the origin of particular individuals.
The population assignment test determines the log-likelihood of each individual
multi-locus genotype in each population sample, assuming that the individual
comes from that population. The allele frequencies estimated in each sample
from the original constitution of the sample are used to calculate the
likelihood. It is assumed that all loci are independent, such that the global
individual likelihood is obtained as the product of the likelihoods at each
locus (Excoffier et al. 2005).

In the graphic the log-likelihoods for the two populations are shown on the two
axes. The line in the graphic indicates an equal probability of belonging to
population 1 or population 2. The further the genotypes of the individuals lie
above the line, the better they are explained by belonging to population 2 than
to population 1; the further they lie below the line, the better they are
explained by belonging to population 1.
With this graphic we try to detect outsider individuals in a given population,
allocate migrant individuals to different source populations, infer movements
of animals over different years, detect admixed populations, detect
hybridisation events and so on. However, interpreting these results in terms of
gene flow is difficult. In addition, there are some problems, for example with
rare and private alleles, with linkage disequilibrium, and with individuals
that may be chimeric for the source of their genes.
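The likelihood computation described above can be sketched as follows. The sketch assumes diploid genotypes and Hardy-Weinberg proportions within each population (p squared for a homozygote, 2pq for a heterozygote); this genotype model is an assumption of the example, since the text only states that loci are independent. Function names and data are hypothetical.

```python
import math
from collections import Counter


def allele_frequencies(sample):
    """Per-locus allele frequencies; each individual is a list of (a, b) genotypes."""
    freqs = []
    for locus in range(len(sample[0])):
        counts = Counter(allele for ind in sample for allele in ind[locus])
        total = sum(counts.values())
        freqs.append({allele: c / total for allele, c in counts.items()})
    return freqs


def log_likelihood(individual, freqs):
    """Sum of per-locus log-likelihoods, i.e. the log of the product over loci."""
    total = 0.0
    for (a, b), locus_freqs in zip(individual, freqs):
        p_a = locus_freqs.get(a, 0.0)
        p_b = locus_freqs.get(b, 0.0)
        # Hardy-Weinberg genotype probability (assumption of this sketch).
        lik = p_a * p_a if a == b else 2 * p_a * p_b
        if lik == 0.0:
            # An allele absent from the sample gives likelihood zero.
            return float("-inf")
        total += math.log(lik)
    return total


# Hypothetical samples: two individuals per population, two loci each.
pop_1 = [[(1, 1), (3, 3)], [(1, 2), (3, 3)]]
pop_2 = [[(2, 2), (4, 4)], [(1, 2), (3, 4)]]
individual = pop_1[0]
ll_1 = log_likelihood(individual, allele_frequencies(pop_1))
ll_2 = log_likelihood(individual, allele_frequencies(pop_2))
print(ll_1 > ll_2)  # True
```

Each individual then becomes one point with coordinates (`ll_1`, `ll_2`) in the graphic. The `-inf` case illustrates the problem with rare and private alleles mentioned in the text.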
