Measures of Genetic Distance

M.B. McEachern, W. Savage, S. Hooper,
S. Kanthaswamy
Genetic Distance (D)
• Quantitative measure of genetic divergence
between two sequences, individuals, or taxa

• Relative estimate of the time that has passed since
two populations existed as a single, panmictic
population

• Units of D depend on the kind of molecular data
collected (allozymes, nucleotide sequences, etc.)
3 Most Commonly Used Distance
Measures
• Nei’s genetic distance (Nei, 1972)
• Cavalli-Sforza chord measure (Cavalli-Sforza and
Edwards, 1967)
• Reynolds, Weir, and Cockerham’s genetic
distance (1983)

• Nei’s distance assumes that differences arise from
mutation and genetic drift; the C-S and RWC distances
assume genetic drift only
Nei’s Genetic Distance
• D = -ln I
where $I = \sum_i x_i y_i \,/\, \left(\sum_i x_i^2 \sum_i y_i^2\right)^{0.5}$

• For multiple loci, use the arithmetic means of
$\sum_i x_i y_i$, $\sum_i x_i^2$, and $\sum_i y_i^2$ across all loci
• Interpreted as mean number of codon
substitutions per locus
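
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by any particular program: the function name `nei_distance`, the input layout (one list of allele frequencies per locus, alleles in the same order in both populations), and the example frequencies are all assumptions made for this sketch.

```python
import math

def nei_distance(pop_x, pop_y):
    """Nei's (1972) distance D = -ln(I) from per-locus allele frequencies.

    pop_x, pop_y: lists of loci; each locus is a list of allele
    frequencies, with alleles in the same order in both populations.
    """
    n_loci = len(pop_x)
    # Arithmetic means over loci of sum(x_i*y_i), sum(x_i^2), sum(y_i^2)
    j_xy = sum(sum(x * y for x, y in zip(lx, ly))
               for lx, ly in zip(pop_x, pop_y)) / n_loci
    j_x = sum(sum(x * x for x in lx) for lx in pop_x) / n_loci
    j_y = sum(sum(y * y for y in ly) for ly in pop_y) / n_loci
    identity = j_xy / math.sqrt(j_x * j_y)  # genetic identity I
    return -math.log(identity)

# Made-up example: two loci, two alleles each
pop_a = [[0.5, 0.5], [1.0, 0.0]]
pop_b = [[0.5, 0.5], [0.0, 1.0]]
d_ab = nei_distance(pop_a, pop_b)
```

In this made-up example the populations are identical at the first locus and fixed for different alleles at the second, so I = 1/3 and D = ln 3. The distance is undefined when the populations share no alleles at any locus (I = 0).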
Assumptions for Nei’s Distance
• Infinite alleles model (IAM)
• All loci have same rate of neutral mutation
• Mutation-genetic drift equilibrium
• Stable effective population size
Cavalli-Sforza Chord Distance
• Populations are conceptualised as points in an m-dimensional
Euclidean space, specified by m allele frequencies (i.e. m
equals the total number of alleles in both populations). The
distance is derived from the angle φ between these points, where

$\cos\varphi = \sum_i \sqrt{x_i y_i}$

• x_i and y_i are the frequencies of the ith allele in populations x and y
• Assumes genetic drift only (no mutation)
• Geometric distance between points in multi-dimensional space
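
As a rough single-locus sketch of the geometry described above: if each population is placed on a unit hypersphere using the square roots of its allele frequencies, the cosine of the angle between two populations is Σ √(x_i y_i), and the chord between them has length √(2(1 − cos φ)). The 2/π scaling below follows one common statement of the Cavalli-Sforza and Edwards measure and is an assumption here; programs may scale the distance differently. The function name `chord_distance` and the example frequencies are hypothetical.

```python
import math

def chord_distance(x, y):
    """Single-locus chord distance between two allele-frequency vectors.

    x, y: allele frequencies for the same alleles, each summing to 1.
    Assumes the (2/pi) * sqrt(2 * (1 - cos(angle))) scaling.
    """
    cos_angle = sum(math.sqrt(xi * yi) for xi, yi in zip(x, y))
    cos_angle = min(1.0, cos_angle)  # guard against rounding above 1
    return (2.0 / math.pi) * math.sqrt(2.0 * (1.0 - cos_angle))

# Made-up frequencies for one locus with three alleles
d_close = chord_distance([0.6, 0.3, 0.1], [0.5, 0.3, 0.2])
d_far = chord_distance([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

With this scaling, populations that share no alleles sit at the maximum distance 2√2/π (about 0.90), and identical populations sit at 0.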
Reynolds’ Distance
• Assumes IAM
• Developed for allozyme data on small
populations and assumes genetic drift is
the only force operating on allelic frequencies
(i.e. no mutation)
• Based on the coancestry coefficient, θ
D = -ln(1-θ)
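
One way to see why the log transform is used: under pure drift in an idealised population of N diploid individuals, coancestry after t generations is θ = 1 − (1 − 1/(2N))^t, so D = −ln(1 − θ) grows linearly with time, at roughly t/(2N). The sketch below only illustrates that relationship; it does not estimate θ from data, and the function names and the values of N and t are hypothetical.

```python
import math

def coancestry_after_drift(generations, pop_size):
    """Expected theta after t generations of pure drift (N diploids)."""
    return 1.0 - (1.0 - 1.0 / (2.0 * pop_size)) ** generations

def reynolds_distance(theta):
    """D = -ln(1 - theta)."""
    return -math.log(1.0 - theta)

# Hypothetical values: N = 100, t = 50 and t = 100 generations
d_50 = reynolds_distance(coancestry_after_drift(50, 100))
d_100 = reynolds_distance(coancestry_after_drift(100, 100))
```

Here d_50 is close to 50/200 = 0.25, and doubling the number of generations doubles D, which is the property that makes D usable as a relative measure of divergence time.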
What is Coancestry?
• Degree of relationship by descent between
two individuals
• Probability that a randomly picked allele
from one individual is identical by descent (IBD)
to a randomly picked allele in another individual
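
The definition above can be made concrete with a toy calculation. In this sketch each diploid individual is written as a pair of founder-allele labels, and two alleles count as IBD when their labels match. The labels, the function name `coancestry`, and the example individuals are hypothetical; real analyses take the expectation over a pedigree or estimate it from marker data.

```python
def coancestry(ind_a, ind_b):
    """Fraction of allele pairs (one allele from each individual) that are IBD.

    ind_a, ind_b: tuples of founder-allele labels; equal labels mean IBD.
    """
    matches = sum(1 for a in ind_a for b in ind_b if a == b)
    return matches / (len(ind_a) * len(ind_b))

parent = ("f1", "f2")      # carries founder alleles f1 and f2
offspring = ("f1", "f3")   # inherited f1 from this parent
unrelated = ("f4", "f5")

theta_po = coancestry(parent, offspring)
theta_un = coancestry(parent, unrelated)
```

Only one of the four possible allele pairs between this parent and offspring matches, giving 0.25; the unrelated pair gives 0.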
Testing Significance of Distance
Measures
• Bootstrap: generation of many new data
sets by resampling original data with
replacement
• For each bootstrap data set, obtain estimates
of parameters of interest and their variances
• Generates confidence intervals for
parameter estimates
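
A minimal sketch of the bootstrap steps above, applied to Nei's distance. It assumes that loci are the unit being resampled and that a simple percentile interval is wanted; neither choice comes from the slides. The function names, the default of 1000 replicates, and the allele frequencies are made up for illustration.

```python
import math
import random

def nei_distance(pop_x, pop_y):
    """Nei's D = -ln(I), using arithmetic means of the sums across loci."""
    n = len(pop_x)
    j_xy = sum(sum(x * y for x, y in zip(lx, ly))
               for lx, ly in zip(pop_x, pop_y)) / n
    j_x = sum(sum(x * x for x in lx) for lx in pop_x) / n
    j_y = sum(sum(y * y for y in ly) for ly in pop_y) / n
    return -math.log(j_xy / math.sqrt(j_x * j_y))

def bootstrap_ci(pop_x, pop_y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile confidence interval from resampling loci with replacement."""
    rng = random.Random(seed)
    n_loci = len(pop_x)
    estimates = []
    for _ in range(n_boot):
        # Draw n_loci locus indices with replacement
        idx = [rng.randrange(n_loci) for _ in range(n_loci)]
        estimates.append(nei_distance([pop_x[i] for i in idx],
                                      [pop_y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Made-up frequencies: four loci, two alleles each
pop_a = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.5, 0.5]]
pop_b = [[0.7, 0.3], [0.2, 0.8], [0.4, 0.6], [0.5, 0.5]]
ci_low, ci_high = bootstrap_ci(pop_a, pop_b)
```

With only a handful of loci the interval will be coarse, since few distinct resamples are possible. Every locus in the example shares alleles between the populations, which keeps the identity I above zero in every replicate.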

Phylip
• Computes Nei’s, C-S, and Reynolds’ genetic
distances using GENDIST (we will do this in lab
today)
• Uses Bootstrap to generate confidence intervals
(but we don’t know how to view that output)

• Other programs that estimate distance: TFPGA,
GDA, Popgene, DISPAN

Lots of other Distance Measures!
• Euclidean distance
• Shared allele distance
• Rogers’ distance
• Goldstein distance (for microsatellites)
In Lab Today:
• Use Phylip to estimate genetic distance for
Bear data
• AMOVA using Arlequin