
Genotyping and Linkage Mapping

Genotyping

Overview
What is genotyping?

 The analysis of DNA-sequence variation

 Genotype = the genetic constitution of an individual


How much biodiversity?

 1.7 to 2.0 million species

 Estimates range up to 10 million
Important Terms
Variation : Any nucleotide change in the genome
Rare Polymorphism: Variation found in < 1% of population
Polymorphism : Variation found in ≥1% of population
Locus: Chromosomal location of a gene

Allele : alternative form of a gene or DNA sequence at a specific chromosomal location (locus)

Heterozygous: The two alleles at the locus of interest are different

Homozygous: The two alleles at the locus of interest are identical
Hemizygous: Only one allele exists (e.g. X-linked loci in males)
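
The terms above can be expressed as two small checks. This is only an illustrative sketch: the function names `classify_variant` and `zygosity` are hypothetical, and a genotype is assumed to be given as a tuple of the alleles observed at one locus.

```python
def classify_variant(frequency):
    """Label a variant by its population frequency, using the 1% cutoff."""
    return "polymorphism" if frequency >= 0.01 else "rare polymorphism"


def zygosity(alleles):
    """Classify a genotype given as a tuple of alleles at one locus."""
    if len(alleles) == 1:
        return "hemizygous"      # only one copy, e.g. an X-linked locus in a male
    if alleles[0] == alleles[1]:
        return "homozygous"      # both alleles identical
    return "heterozygous"        # the two alleles differ


print(classify_variant(0.004))   # rare polymorphism
print(zygosity(("A", "G")))      # heterozygous
```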
What are the Types of Mutations /
Polymorphisms to be Genotyped?
There are six major classes of genetic variation:
1. Single base changes

2. Simple di-, tri-, tetranucleotide repeats

3. Small insertions or deletions

4. Larger, tandem repeats

5. Multi-gene (Megabase) duplication (CNV)

6. Complex rearrangements
Classes of Mutation
An example of one simple question:
How much variation is there?
What are the most Informative Classes for
Genotyping Studies ?
Polymorphism Type                             Nickname                                    Heterozygosity

1. Single base changes                        SNP                                         1-50%
2. Simple di-, tri-, tetranucleotide repeats  STR (short tandem repeats)                  50-90%
3. Small insertions or deletions              INDEL (insertion or deletion)               1-50%
4. Larger, tandem repeats                     VNTR (variable number of tandem repeats)    50-90%
5. Multi-gene (Megabase) duplication          CNV (copy number variation)                 1-50%
6. Complex rearrangements                     -----------                                 1-50%
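
One way to see why multi-allelic repeat markers are more informative than two-allele markers is to compute expected heterozygosity. The sketch below assumes that the heterozygosity column means expected heterozygosity under Hardy-Weinberg proportions, H = 1 - Σ p_i², which the slides do not state; the function name and the allele frequencies are made up for illustration.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2) for one locus."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# A two-allele marker (typical SNP) can never exceed 0.5.
snp_h = expected_heterozygosity([0.5, 0.5])

# A hypothetical repeat marker with ten equally common alleles reaches about 0.9.
str_h = expected_heterozygosity([0.1] * 10)

print(round(snp_h, 3), round(str_h, 3))
```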


How many loci should be assayed?

Two strategies for selecting markers are possible:

• Select a few highly informative markers

• Select numerous, poorly informative markers randomly distributed within the genome
To scan whole genomes…
[Figure: reaction vessels. Not like this: microcentrifuge tube. But like this: 96-well plates, 384-well plates, Affymetrix GeneChip]

Setting up the reactions
[Figure: not like this … but like this]
Applications enabled by HTP genotyping
Diagnostics, MAS, disease-related genes, domestication traits, bar coding, industrial protection of genotypes

[Figure: applications plotted by genotyped individuals (10 to 100,000) against genotyped loci (10 to 1,000,000): plant and animal breeding for selected traits; Genome-Wide Association Studies (GWAS); validation and candidate gene association; candidate region fine mapping; diagnostics; fingerprinting; whole genome scans]
High Throughput genotyping techniques
Two main suppliers for GWA: Illumina and Affymetrix

[Figure: genotyping platforms plotted by genotyped individuals (10 to 100,000) against genotyped loci (10 to 1,000,000). Labels include: iPLEX Gold (Sequenom), TaqMan, SNPlex (AB), Invader, GenPlex, SNaPshot, Pyroseq, VeraCode GoldenGate, TaqMan assay OpenArrays, GoldenGate (Illumina), iSelect Infinium BeadChips, Affymetrix Targeted GeneChips, Illumina High-Density 1M-Duo chip, Affymetrix Genome-Wide Human SNP Array 6.0; Genome-Wide Association Studies sit at the high-density end]
5 Basic Methodologies
1) Hybridization
– Microarrays
– TaqMan, Molecular Beacons
2) Allele-specific PCR
– FRET
– Intercalating Dyes
3) Primer Extension
– MALDI-TOF (Matrix Assisted Laser Desorption/Ionization Time-of-flight mass spectrometry)
– SNaPshot (Single nucleotide primer extension)
4) Ligation
– Padlock Probes
– Rolling Circle Amplification
5) Endonuclease Cleavage
– RFLP
– PIRA/RFL
RFLPs (Based on Endonuclease Cleavage)
 Differences in DNA sequence create or destroy the recognition sequences, and therefore the cleavage sites, of specific restriction enzymes
 Two different alleles will produce different fragment patterns when cut with the same restriction enzyme due to differences in DNA sequence
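
A minimal sketch of the RFLP idea: digest two alleles in silico and compare fragment lengths. The two sequences are invented for illustration and `digest` is a hypothetical helper; the enzyme shown is EcoRI, which recognises GAATTC and cuts after the first G.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the position inside the site where the enzyme cuts.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Two made-up alleles; a single base change (A -> G) destroys the second site.
allele_1 = "ATATGAATTCGGCCGGAATTCTTAA"
allele_2 = "ATATGAATTCGGCCGGAGTTCTTAA"

fragments_1 = digest(allele_1, "GAATTC", 1)
fragments_2 = digest(allele_2, "GAATTC", 1)
print(fragments_1)  # [5, 11, 9]
print(fragments_2)  # [5, 20]
```

The different fragment patterns are what would be seen as different band sizes on a gel.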
Microarray (Based on Hybridization)

 Purpose: multiple simultaneous measurements by hybridization of a labeled probe

 DNA elements may be:

 Oligonucleotides
 cDNAs
 Large insert genomic clones
Microarray technologies
Microarray chip

 Affymetrix 100k chip set


 Entire genome with 100 000 SNPs (low density).
 Affymetrix 500k chip (SNP array 5.0)
 Entire genome with 500 000 SNPs (high density)
 Affymetrix 1M chip (SNP array 6.0)
 Entire genome with 1 000 000 SNPs (very high density)
Organization of a DNA microarray
[Figure: array layout; chip dimension labeled 1.28 cm]
Hybridization of a labeled probe to the microarray
Detection of hybridization on microarray
[Figure: light from laser]
Hybridization intensities on DNA microarray following laser scanning
[Figure: genotype clusters on axes A and B: BB (0), AB (0.5), AA (1)]
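
The genotype labels BB (0), AB (0.5) and AA (1) are consistent with calling genotypes from the share of signal coming from allele A. The sketch below assumes that ratio is A / (A + B); the function name and the cutoffs 0.25 and 0.75 are hypothetical, and real platforms use their own clustering algorithms.

```python
def call_genotype(intensity_a, intensity_b, low=0.25, high=0.75):
    """Call a SNP genotype from two allele-specific hybridization intensities."""
    total = intensity_a + intensity_b
    if total <= 0:
        return "no call"          # no usable signal on either channel
    ratio = intensity_a / total   # 1 = only A signal, 0 = only B signal
    if ratio >= high:
        return "AA"
    if ratio <= low:
        return "BB"
    return "AB"


print(call_genotype(980, 20))    # AA
print(call_genotype(510, 490))   # AB
print(call_genotype(15, 900))    # BB
```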
SNPs

 Single Nucleotide Polymorphisms


 Change one nucleotide
 Insert
 Delete
 Replace it with a different nucleotide
 Many have no phenotypic effect
 Some can disrupt or affect gene function
SNP genotyping methods
 over 100 different approaches
 Ideal SNP genotyping platform:
 high-throughput capacity
 simple assay design
 robust
 affordable price
 automated genotype calling
 accurate and reliable results
Overview of SNP array technology
A little more on SNPs
 Most SNPs have only two alleles
 Easy to automate their scoring
 Becoming extremely popular
 Typing Methods
 Sequencing
 Restriction Site
 Hybridization
Linkage Mapping

Overview
Types of Maps
 Physical Maps
 Complete or partially sequenced organisms
 Cytogenetic Maps
 Breakpoints in disease
 Direct binding of probes to chromosome
 Genetic Linkage Maps
 Markers
What happens in meiosis…
 Leads to formation of haploid gametes from diploid cells

 Assortment of genetic loci

 Recombination or crossover
What is Linkage?
 Linkage is defined genetically: the failure of two genes to assort independently.

 Linkage occurs when two genes are close to each other on the same chromosome.

 However, two genes on the same chromosome are called syntenic.

 Linked genes are syntenic, but syntenic genes are not always linked. Genes far
apart on the same chromosome assort independently: they are not linked.

 Linkage is based on the frequency of crossing over between the two genes.

 Crossing over occurs in prophase I of meiosis, where homologous chromosomes break at identical locations and rejoin with each other.
Applications/Uses of Linkage Maps
 Studying genome structure, organization and evolution.
 Estimation of gene effects of important agronomic traits.
 Tagging genes of interest to facilitate marker-assisted selection (MAS) programs.
 Map-based cloning
 Identify genes responsible for traits.
 Plants or animals
 Disease resistance
 Meat or milk production, etc.
Genetic Linkage Mapping Steps

 Development of the mapping population
 Genotyping of the mapping population (selection of suitable molecular markers)
 Linkage analysis
 Map construction
 QTL identification (in the case of QTL mapping)
 Marker-assisted selection
Development of The Mapping Population
Linkage analysis
Linkage: alleles at two loci segregate together in a family.

Recombination fraction (θ): the probability that recombination occurs between a marker and a susceptibility locus in a meiosis.

θ = 0.5: no linkage; θ < 0.5: linked


Reasons why alleles at different loci may not assort independently:

1. Chance

2. Preferential segregation (nonrandom segregation of non-homologous chromosomes): hinted at but not shown in humans

3. Linkage: the presence of loci measurably close together on the same chromosome.
Types of Linkage Analysis

• Parametric Lod-Score
• Haseman-Elston Sib-Pair
• Affected Sib-Pair and Affected Relative Pair
• Affected Pedigree Member Method
• Variance Components Method
Recombination frequency

$$\theta = \frac{\text{total number of recombinants}}{\text{total number of recombinants} + \text{total number of non-recombinants}}$$

Parent: double heterozygote A a / B b

Gametes                        Theta
50% non-rec and 50% rec        0.5
90% non-rec and 10% rec        0.1
99% non-rec and 1% rec         0.01
100% non-rec                   0
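
The recombination frequency is a simple proportion of gamete counts. A minimal sketch, with a hypothetical function name and counts chosen to match the percentages above:

```python
def recombination_fraction(recombinants, non_recombinants):
    """theta = recombinants / (recombinants + non-recombinants)."""
    total = recombinants + non_recombinants
    if total == 0:
        raise ValueError("no gametes were scored")
    return recombinants / total


# Counts out of 100 gametes for each row of the table.
for rec, non_rec in [(50, 50), (10, 90), (1, 99), (0, 100)]:
    print(non_rec, rec, recombination_fraction(rec, non_rec))
```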
In a double heterozygote:

 Cis configuration = mutant alleles of both genes are on the same chromosome = ab/AB

 Trans configuration = mutant alleles are on different homologues of the same chromosome = Ab/aB
 Genes with recombination frequencies less than 50 percent are on the same chromosome (= linked)
 Linkage group = all known genes on a chromosome
 Two genes that undergo independent assortment have a recombination frequency of 50 percent and are located on nonhomologous chromosomes or far apart on the same chromosome (= unlinked)
Recombination
 Recombination between linked genes occurs at the same frequency
whether alleles are in cis or trans configuration

 Recombination frequency is specific for a particular pair of genes

 Recombination frequency increases with increasing distance between genes

 No matter how far apart two genes may be, the maximum frequency of
recombination between any two genes is 50 percent.
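
To illustrate the first point, the sketch below lists the gamete frequencies expected from a double heterozygote in either phase: each parental gamete type has frequency (1 - θ)/2 and each recombinant type θ/2. The function name is hypothetical; only which haplotypes count as parental changes between cis and trans.

```python
def gamete_frequencies(theta, phase):
    """Expected gamete frequencies from a double heterozygote.

    phase "cis" means AB/ab; phase "trans" means Ab/aB.
    """
    parental = (1 - theta) / 2
    recombinant = theta / 2
    if phase == "cis":
        return {"AB": parental, "ab": parental, "Ab": recombinant, "aB": recombinant}
    if phase == "trans":
        return {"Ab": parental, "aB": parental, "AB": recombinant, "ab": recombinant}
    raise ValueError("phase must be 'cis' or 'trans'")


cis = gamete_frequencies(0.1, "cis")
trans = gamete_frequencies(0.1, "trans")
# Recombinant gametes total theta in both configurations.
print(cis["Ab"] + cis["aB"], trans["AB"] + trans["ab"])
```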
• Cross-over frequencies can be converted into map units.
• Ex: A 5% cross-over frequency equals 5 map units.
– gene A and gene B cross over 6.0 percent of the time
– gene B and gene C cross over 12.5 percent of the time
– gene A and gene C cross over 18.5 percent of the time
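
With three genes, the pair with the largest distance must be the two outer genes, and the remaining gene lies between them. A small sketch using the A, B, C values from the example; `infer_order` is a hypothetical helper that assumes exactly three genes.

```python
def infer_order(distances):
    """Infer the order of three genes from their pairwise map distances."""
    outer = max(distances, key=distances.get)    # most distant pair
    genes = {gene for pair in distances for gene in pair}
    (middle,) = genes - set(outer)               # the one gene left over
    left, right = sorted(outer)
    return [left, middle, right]


distances = {("A", "B"): 6.0, ("B", "C"): 12.5, ("A", "C"): 18.5}
order = infer_order(distances)
print(order)  # ['A', 'B', 'C']
```

Here the distances are additive (6.0 + 12.5 = 18.5 map units), which is what is expected over short intervals.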
Lod scores

1 cM ≈ 1 Mb (on average)
1 Mb = 1,000 kb
1 kb = 1,000 bp
1 cM ≈ 1,000,000 bp
Genetic Mapping
 The map distance between two genes equals one half the average number of crossovers in that region per meiotic cell; expressed as a percentage, this gives the distance in cM
 The recombination frequency between two genes indicates how much
recombination is actually observed in a particular experiment; it is a
measure of recombination
 Over an interval so short that multiple crossovers are precluded (~ 10
percent recombination or less), the map distance equals the recombination
frequency because all crossovers result in recombinant gametes.
 Genetic map = linkage map = chromosome map

Gene Mapping: Crossing Over

 Crossovers which occur outside the region between two genes will not alter their arrangement

 The result of double crossovers between two genes is indistinguishable from independent assortment of the genes

 Crossovers involving three pairs of alleles specify gene order = linear sequence of genes

Genetic vs. Physical Distance

 Map distances based on recombination frequencies are not a direct measurement of physical distance along a chromosome

 Recombination “hot spots” overestimate physical length

 Low rates in heterochromatin and centromeres underestimate actual physical length

Gene Mapping

 Mapping function: the relation between genetic map distance and the frequency of recombination

 Chromosome interference: crossovers in one region decrease the probability of a second crossover close by

 Coefficient of coincidence = observed number of double recombinants divided by the expected number

Interference = 1 - Coefficient of coincidence
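
A worked sketch of these two quantities for a three-point cross. It assumes that the expected number of double recombinants is the product of the two single-interval recombination frequencies times the number of progeny; the function name and the counts are invented for illustration.

```python
def coincidence_and_interference(observed_doubles, total_progeny, rf_1, rf_2):
    """Return (coefficient of coincidence, interference)."""
    # Expected doubles if crossovers in the two intervals were independent.
    expected_doubles = rf_1 * rf_2 * total_progeny
    coincidence = observed_doubles / expected_doubles
    return coincidence, 1 - coincidence


# Hypothetical cross: 1000 progeny, intervals of 10% and 20% recombination,
# 8 double recombinants observed (about 20 expected).
c, i = coincidence_and_interference(8, 1000, 0.10, 0.20)
print(round(c, 2), round(i, 2))
```

An interference value above zero means fewer double crossovers were seen than independence would predict.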


Genetic distance

Genetic distance: 1 centimorgan (cM) is the genetic length over which one crossover occurs in 1% of meioses.

1 cM = recombination frequency of 0.01 ≈ 1 Mb of physical distance on average

(Assuming that the recombination frequency is uniform along the chromosomes)

Because double recombinants become more frequent the further apart two loci are, the frequency of recombination does not increase proportionately with distance.
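
The non-proportional relation can be illustrated with a mapping function. The sketch below uses Haldane's function, r = 0.5 (1 - e^(-2d)) with d in Morgans, which assumes no interference; the slides do not say which mapping function is intended, so this choice is an assumption.

```python
import math


def haldane_rf(map_distance_cm):
    """Recombination frequency expected at a given map distance (cM)."""
    d = map_distance_cm / 100.0              # centimorgans to Morgans
    return 0.5 * (1 - math.exp(-2 * d))


def haldane_cm(rf):
    """Map distance (cM) implied by an observed recombination frequency."""
    if not 0 <= rf < 0.5:
        raise ValueError("recombination frequency must be in [0, 0.5)")
    return -50.0 * math.log(1 - 2 * rf)


for cm in (1, 10, 50, 200):
    print(cm, round(haldane_rf(cm), 4))
```

At short distances the recombination frequency is close to the map distance; at long distances it levels off below 0.5.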
Linkage related Concepts

 Interference - A crossover in one region usually decreases the probability of a crossover in an adjacent region.

 CentiMorgan (cM) - 1 cM is the distance between genes for which the recombination frequency is 1%.

 Lod Score - the log10 of the odds that two loci are linked at a given recombination fraction rather than unlinked; used to test for linkage and to estimate the distance between genes.
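
A minimal sketch of a two-point LOD score, assuming phase-known, fully informative meioses so that the likelihood is θ^k (1 - θ)^(n - k) for k recombinants in n meioses, compared with θ = 0.5. The function name and the counts are illustrative only.

```python
import math


def lod_score(recombinants, meioses, theta):
    """log10 of the likelihood at theta divided by the likelihood at 0.5."""
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    k, n = recombinants, meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1 - theta)
    log_l_null = n * math.log10(0.5)         # no linkage
    return log_l_theta - log_l_null


# Hypothetical family data: 1 recombinant in 10 meioses.
for theta in (0.05, 0.1, 0.2, 0.3, 0.5):
    print(theta, round(lod_score(1, 10, theta), 2))
```

The θ that gives the highest LOD is the estimate of the recombination fraction; with these counts it is 0.1, where the LOD is about 1.6. A LOD of 3 or more is the conventional threshold for declaring linkage.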
Linkage vs. Association

 Linkage analyses look for a relationship between a marker and disease within a family (could be a different marker in each family)

 Association analyses look for a relationship between a marker and disease across families (must be the same marker in all families)
Thank You

Any Questions?
