• ESTs act as specific tags for human genes, since they are derived
by sequencing cDNA clones, which come from mRNA and therefore
represent actual transcribed sequences (as opposed to STSs, which
can be derived from anywhere in the genome and are mostly
non-coding). They allow rapid access to the actual genes, ignoring
introns and “junk” DNA.
ESTs can be 3' or 5', depending on which end of the cDNA was sequenced.
1. Because of the methods used to make cDNA libraries, parts of the 5' end
of the gene are often lost during cloning, whereas the 3' end is more reliable.
2. This is shown on the diagram by the white boxes, which represent cDNA
clones of different lengths.
3. Another complication is alternative splicing.
4. On the left of the diagram is the genomic structure of a gene, with the
exons shown as boxes; the red one is subject to alternative splicing.
• Genome: the total set of different DNA molecules. The human
genome has 25 different DNA molecules: a single mitochondrial
DNA and 24 nuclear DNA molecules.
• It loosely comprises the nuclear and mitochondrial genomes.
• The approximately 16.5 kb mitochondrial genome was
published in 1981; the primary goal of the HGP was to sequence
the 3000 Mb nuclear genome.
Human gene and DNA segment nomenclature
• The nomenclature used is decided by the HUGO
nomenclature committee. Genes are allocated
symbols of usually 2-6 characters.
• For an anonymous DNA sequence the convention is to
use D (DNA), followed by 1-22, X or Y to denote the
chromosomal location, then S for a unique
segment, Z for a chromosome-specific repetitive
DNA family or F for a multilocus DNA family, and
finally a serial number. The letter E following the
number for an anonymous DNA sequence indicates
that the sequence is known to be expressed.
Symbol    Interpretation
CRYB1     Gene for crystallin beta peptide 1
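The D-number convention described above can be sketched as a small parser. This is only an illustration of the rule as stated in these notes: the function name `parse_d_segment` and the example symbols are hypothetical, and real symbols should be checked against the official nomenclature database.

```python
import re

# Pattern for anonymous DNA segment symbols as described in the notes:
# D + chromosome (1-22, X or Y) + S/Z/F + serial number + optional E.
D_SEGMENT = re.compile(
    r"^D(?P<chrom>1[0-9]|2[0-2]|[1-9]|X|Y)"
    r"(?P<kind>[SZF])(?P<serial>\d+)(?P<expressed>E?)$"
)

KINDS = {
    "S": "unique segment",
    "Z": "chromosome-specific repetitive DNA family",
    "F": "multilocus DNA family",
}


def parse_d_segment(symbol):
    """Split a D-number symbol into its parts, or return None if it does not fit."""
    match = D_SEGMENT.match(symbol)
    if match is None:
        return None
    return {
        "chromosome": match.group("chrom"),
        "kind": KINDS[match.group("kind")],
        "serial": int(match.group("serial")),
        "expressed": match.group("expressed") == "E",
    }


# Hypothetical symbols, used only to exercise the parser.
print(parse_d_segment("D11S1234E"))
print(parse_d_segment("DXZ1"))
print(parse_d_segment("CRYB1"))  # a gene symbol, not a D-number
```

Gene symbols such as CRYB1 do not follow the D-number pattern, so the parser returns None for them.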
[Figure: mitosis in hybrid cells, with random loss of human chromosomes]
Such hybrid cells are unstable: they lose some of the human
chromosomes and retain others.
• Although panels of hybrid cells can be used to map
a human gene or DNA sequence to a specific
human chromosome, it is most efficient to use panels of
monochromosomal hybrids (cells containing just
a single type of human chromosome) that collectively
represent all 24 types of human chromosome.
• To make these, donor human cells are exposed to
colcemid, causing the chromosome set to be
partitioned into discrete subnuclear packets
(micronuclei).
• This is followed by centrifugation, resulting in
microcells, each consisting of a single micronucleus with a
thin rim of cytoplasm surrounded by an intact plasma
membrane.
• The microcells are fused with recipient
rodent cells (microcell fusion) to generate
hybrids, some of which contain a single human
chromosome.
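The way a hybrid panel assigns a marker to a chromosome can be sketched as a concordance test: the marker should be detected in exactly those hybrids that retain the chromosome carrying it. The panel contents, hybrid names and marker results below are invented for illustration, and `map_to_chromosome` is a hypothetical helper, not a published tool.

```python
def map_to_chromosome(panel, marker_results):
    """Return the chromosomes whose presence matches the marker in every hybrid.

    panel: dict mapping hybrid name -> set of human chromosomes it retains.
    marker_results: dict mapping hybrid name -> True if the marker was detected.
    """
    chromosomes = set().union(*panel.values())
    concordant = []
    for chrom in sorted(chromosomes):
        # Perfect concordance: marker present if and only if chromosome present.
        if all((chrom in panel[h]) == marker_results[h] for h in panel):
            concordant.append(chrom)
    return concordant


# Hypothetical panel of hybrids, each retaining a few human chromosomes.
panel = {
    "hybA": {"1", "7", "X"},
    "hybB": {"7", "12"},
    "hybC": {"1", "12"},
    "hybD": {"X", "12"},
}
# Hypothetical assay results for one marker across the panel.
marker_results = {"hybA": True, "hybB": True, "hybC": False, "hybD": False}

print(map_to_chromosome(panel, marker_results))
```

With a monochromosomal panel each set holds a single chromosome, so a positive result in one hybrid assigns the marker directly and no pattern comparison across hybrids is needed.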
To aid human genome mapping
[Figure labels: match pattern; database on central server; location]
A YAC-based physical map of human genome
• At the official beginning of the HGP in 1990, the available
genomic DNA libraries contained inserts of up to 40 kb in length
(cosmids). Because of the large size of the human genome, a
library with an average insert of 40 kb would need several hundred
thousand different clones to ensure a high probability of
representing the whole genome. Screening these
individual clones and organizing them would be a daunting
task.
• To circumvent this problem, novel methods for making
artificial eukaryotic chromosomes were developed. It was
known that only small regions of yeast chromosomal
sequence are needed for a DNA molecule to function as an
independent chromosome. By purifying these sequences and combining
them with large human DNA fragments, it was possible to make
hybrid molecules containing megabase-sized inserts.
• YAC libraries with an average insert size of 1 Mb would
need about 12,000-15,000 clones to reasonably represent the
human genome, and would have the advantage of enabling
large genes to be retained in individual clones. The first
reasonably detailed map using YACs was published by Cohen et al., 1993.
• An updated YAC map covering 75% of the human genome,
consisting of 225 contigs with an average size of 10 Mb, was
subsequently published (Chumakov et al., 1995).
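The clone numbers quoted for cosmid and YAC libraries can be checked with the standard library-size estimate N = ln(1 - P) / ln(1 - f), where f is the insert size as a fraction of the genome and P is the desired probability that any given sequence is represented. The formula is not stated in these notes, and P = 0.99 is an assumption chosen here to stand for "high probability"; the sketch also assumes random, unbiased cloning.

```python
import math


def clones_needed(genome_bp, insert_bp, probability):
    """Estimate the number of clones N = ln(1 - P) / ln(1 - f), rounded up."""
    fraction = insert_bp / genome_bp  # f: share of the genome in one insert
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


GENOME_BP = 3_000_000_000  # 3000 Mb nuclear genome, as given in the notes
P = 0.99                   # assumed target probability (not from the notes)

cosmid_clones = clones_needed(GENOME_BP, 40_000, P)    # 40 kb inserts
yac_clones = clones_needed(GENOME_BP, 1_000_000, P)    # 1 Mb inserts

print("cosmid library:", cosmid_clones)
print("YAC library:", yac_clones)
```

Under these assumptions the cosmid estimate comes out at about 345,000 clones and the YAC estimate at about 13,800, in line with the "several hundred thousand" and "12,000-15,000" figures above.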
• The underlying principle in YAC maps (and all other clone-
based physical maps) is to order the clones in the library on
the basis of the subchromosomal region of origin of the
insert DNA.
• This means that the relevant subchromosomal region is
represented by a linear array of partially overlapping clones
without any gaps. Such a contiguous set of cloned
DNA sequences is called a clone contig.
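The idea of a clone contig can be illustrated by grouping overlapping clones into gap-free runs. This sketch assumes each clone's start and end position are already known, which is a simplification: in real mapping, overlaps are inferred from shared markers or fingerprints. The clone names and coordinates are hypothetical.

```python
def build_contigs(clones):
    """Group clones into contigs of overlapping inserts.

    clones: list of (name, start, end) tuples, positions in kb.
    Returns a list of dicts with the contig span and its clones in order.
    """
    contigs = []
    for name, start, end in sorted(clones, key=lambda c: c[1]):
        if contigs and start < contigs[-1]["end"]:
            # Overlaps the current contig: extend it.
            contigs[-1]["clones"].append(name)
            contigs[-1]["end"] = max(contigs[-1]["end"], end)
        else:
            # No overlap with the previous clones: a gap, so start a new contig.
            contigs.append({"start": start, "end": end, "clones": [name]})
    return contigs


# Hypothetical YAC inserts (positions in kb along one chromosome region).
clones = [
    ("yC", 1500, 2400),
    ("yA", 0, 900),
    ("yD", 3000, 3800),
    ("yB", 700, 1600),
]

for contig in build_contigs(clones):
    print(contig["start"], contig["end"], contig["clones"])
```

Here yA, yB and yC overlap pairwise and form one contig, while yD is separated by a gap and forms a second contig on its own.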
A high resolution STS sequence map of the human
genome
Francis Collins
Director, NHGRI