Comparative DNA Sequencing Analyses for Feasibilities: Presented by Jajati Keshari Nayak, PhD 1st Year

DOCTORAL SEMINAR-I
COMPARATIVE DNA SEQUENCING ANALYSES FOR FEASIBILITIES
Presented by Jajati Keshari Nayak
PhD 1st year
Central Dogma of Life
Gene → (Transcription) → mRNA → (Translation) → Protein
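The gene → mRNA → protein flow can be illustrated with a minimal Python sketch. It assumes the input is the coding (sense) strand, that translation starts at the first base, and that the standard genetic code applies; the toy gene and the function names are hypothetical and not taken from the slides.

```python
from itertools import product

# Standard genetic code, built in TCAG order and keyed by DNA codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Transcription: the mRNA matches the coding strand with T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translation: read codons from the first base and stop at a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


gene = "ATGGCCAAGTGA"  # hypothetical toy gene
mrna = transcribe(gene)
protein = translate(mrna)
print(mrna, protein)
```

A real gene model would also need promoter, intron, and reading-frame handling, which this sketch leaves out.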
Why?
 Study genetic variation
 Source & effect
 Comparative genomics
 Similarity, syntenic genes/allele resource, clinical studies
 Evolutionary insight
 Variation among species and course of evolution …
 Structure and function of genomes
 Understanding the ‘rules of nature’
SEQUENCING GENOME
Strategy → Libraries → Sequencing → Assembly → Annotation → Release
Constraints: TIME, MONEY
Strategies for sequencing
Organism
• Genome complexity
• Dispersed repetitive sequences
• Telomeres & centromeres
• Size & GC content
• Resources & understanding
• Volume of data
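Of the factors listed, GC content is the easiest to quantify. The sketch below is one simple way to compute it, overall and in sliding windows; the toy sequence, the window size, and the function names are illustrative assumptions.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / total


def windowed_gc(seq, window, step):
    """GC fraction in each window, useful for spotting GC-rich or GC-poor regions."""
    return [
        gc_content(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


toy = "ACGTGACTGAGGACCG"  # hypothetical toy sequence
print(gc_content(toy))
print(windowed_gc(toy, window=8, step=4))
```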

Genome Sequencing
[Figure: Genome → short fragments of DNA → short DNA sequences → sequenced genome]
Short DNA sequences: ACGTGGTAA, GTAATGGCG, TGGCGTATA, CGTATACAC, CACCCTTAG, TAGGCCATA, CATA…
Sequenced genome: ACGTGGTAATGGCGTATACACCCTTAGGCCATA...
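The idea of rebuilding a genome from overlapping short sequences can be sketched with a greedy overlap merge. This is a toy illustration only: it assumes error-free reads from one strand and no repeats, and it uses the six complete fragments from the Genome Sequencing slide. Real assemblers use more robust graph-based methods; `overlap` and `greedy_assemble` are hypothetical helper names.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that is a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no overlaps left: the remaining reads are separate contigs
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


fragments = [
    "ACGTGGTAA", "CGTATACAC", "TAGGCCATA",
    "GTAATGGCG", "CACCCTTAG", "TGGCGTATA",
]
contigs = greedy_assemble(fragments)
print(contigs)
```

If some fragments do not overlap, the function returns more than one contig, which corresponds to a gap in the assembly.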
Basics of DNA Synthesis
DNA polymerase adds each new nucleotide to the free 3’-OH of the growing chain, releasing pyrophosphate (PPi). This absolute requirement for a 3’-OH is the basis of Sanger’s method.
How we obtain the sequence of nucleotides of a species
 1st Generation DNA Sequencing (1977)
• Maxam & Gilbert – chain degradation, chemical method
• Sanger – chain termination, enzymatic method
• Both methods were time-consuming and manual.
• The Maxam & Gilbert method relied on chemical modification and was not suitable for automation.
• Sanger’s method used enzymes and was suitable for automation.
Dideoxy (Sanger) Method
• ddNTP: a 2’,3’-dideoxynucleotide has no 3’ hydroxyl group available for a phosphodiester bond
• Terminates the chain when incorporated
• Add enough ddNTP (alongside the normal dNTPs) that chains terminate randomly at every position of the corresponding base
SANGER SEQUENCING
Template (3’ → 5’):
A C G C G C C G G G T C A G A A C C C G A T C G C G
Chain-terminated products (5’ → 3’), sorted by length:
T G C G C G G C C C A G 12 bp
T G C G C G G C C C A G T C T T 16 bp
T G C G C G G C C C A G T C T T G G G C 20 bp
T G C G C G G C C C A G T C T T G G G C T 21 bp
T G C G C G G C C C A G T C T T G G G C T A 22 bp
T G C G C G G C C C A G T C T T G G G C T A G C G C 26 bp
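The chain-termination logic can be sketched in a few lines of Python. The sketch takes the template from the slides, lists the product lengths each ddNTP reaction could yield, and reads the sequence back by ordering products from shortest to longest. It simplifies by treating the primer as part of the counted product (as the slide lengths appear to do) and assuming every possible termination product is observed; the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def terminated_lengths(template_3to5):
    """Map each ddNTP base to the lengths of the products it can terminate."""
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)  # 5'->3'
    lengths = {base: [] for base in "ACGT"}
    for position, base in enumerate(new_strand, start=1):
        lengths[base].append(position)
    return lengths


def read_sequence(lengths):
    """Order products from shortest to longest and read the terminating base."""
    base_at_length = {n: base for base, ns in lengths.items() for n in ns}
    return "".join(base_at_length[n] for n in sorted(base_at_length))


template = "ACGCGCCGGGTCAGAACCCGATCGCG"  # template from the slides, written 3'->5'
lengths = terminated_lengths(template)
sequence = read_sequence(lengths)
print(lengths["T"])
print(sequence)
```

The 12, 16, 20, 21, 22 and 26 bp products shown on the slides end in G, T, C, T, A and C respectively, which is what the lookup reproduces.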
Automated Sanger’s Sequencing & Data
Documentation/ Interpretation
Sanger’s Sequencing
 Advantages
 Long reads (~900 bp)
 Suitable for small projects
 No slab gel needed once automated (capillary electrophoresis)
 Disadvantages
 Low throughput
 Expensive
2nd Generation: Pyrosequencing
 Basic idea:
 Visible light is generated in proportion to the number of incorporated nucleotides
 DNA polymerase I (from E. coli) incorporates a nucleotide and releases pyrophosphate (PPi)
 ATP sulfurylase converts the PPi to ATP
 Luciferase (from fireflies) uses the ATP to oxidize luciferin and generate light
2nd Generation: Pyrosequencing
 Sequencing by synthesis
 Advantages:
 Accurate
 Parallel processing
 Easily automated
 Eliminates the need for labeled primers and nucleotides
 No need for gel electrophoresis
Pyrosequencing Results:
AGGGGTCAGGTCAGTTTCAGGGGTTCAGTCAGTTCAG
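Because light output is proportional to the number of nucleotides incorporated, a pyrosequencing read can be thought of as a list of (dispensed base, signal) pairs. The sketch below simulates idealised, noise-free signals for the result sequence above and decodes them back; the cyclic dispensation order "ACGT" and the function names are assumptions made for illustration.

```python
from itertools import cycle


def flowgram(read, order="ACGT"):
    """Idealised signal per dispensed nucleotide: the homopolymer run length."""
    unknown = set(read) - set(order)
    if unknown:
        raise ValueError(f"bases not in dispensation order: {sorted(unknown)}")
    flows = []
    i = 0
    for base in cycle(order):
        if i >= len(read):
            break
        run = 0
        while i < len(read) and read[i] == base:
            run += 1
            i += 1
        flows.append((base, run))  # run == 0 means no light for this flow
    return flows


def decode(flows):
    """Rebuild the sequence: each flow contributes its base 'signal' times."""
    return "".join(base * signal for base, signal in flows)


result = "AGGGGTCAGGTCAGTTTCAGGGGTTCAGTCAGTTCAG"
flows = flowgram(result)
print(flows[:4])
print(decode(flows) == result)
```

In practice the signal for long homopolymer runs is hard to quantify exactly, which this noise-free sketch does not model.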
Illumina sequencing
(reversible terminator sequencing)
Illumina uses modified dNTPs, each carrying a reversible terminator combined with a fluorescent dye.
SOLiD (sequencing by ligation)
Next Generation Sequencing
Polony array / DNA beads (454, SOLiD)
DNA beads are placed in wells; no tubes are needed; high throughput
Conventional vs 2nd generation sequencing
Available next-generation sequencing platforms
 Illumina/Solexa – modified terminators
 ABI SOLiD – ligation chemistry
 Roche 454 – pyrosequencing
 Nanopore – amplification-free, single molecule
Which technology to go with
Technology   Read length   Sequencing technology   Throughput (per run)   Cost (1 Mbp)*
Sanger       ~800 bp       Sanger                  400 kbp                $500
454          ~400 bp       Polony                  500 Mbp                $60
Solexa       75 bp         Polony                  20 Gbp                 $2
SOLiD        75 bp         Polony                  60 Gbp                 $2
Helicos      30-35 bp      Single molecule         25 Gbp                 $1
*Source: Shendure & Ji, Nat Biotech, 2008
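The table figures can be turned into a rough project estimate. The sketch below multiplies genome size by the desired coverage, then derives cost and number of runs from the per-Mbp cost and per-run throughput listed above. The 5 Mbp genome and 10x coverage are hypothetical inputs, and the estimate ignores library preparation, labour, and failed runs.

```python
import math

# Values copied from the table (Shendure & Ji, 2008).
COST_PER_MBP_USD = {"Sanger": 500, "454": 60, "Solexa": 2, "SOLiD": 2, "Helicos": 1}
THROUGHPUT_PER_RUN_BP = {
    "Sanger": 400_000,
    "454": 500_000_000,
    "Solexa": 20_000_000_000,
    "SOLiD": 60_000_000_000,
    "Helicos": 25_000_000_000,
}


def project_estimate(platform, genome_size_bp, coverage):
    """Return (sequencing cost in USD, number of runs) for a given coverage."""
    total_bp = genome_size_bp * coverage
    cost = total_bp / 1e6 * COST_PER_MBP_USD[platform]
    runs = math.ceil(total_bp / THROUGHPUT_PER_RUN_BP[platform])
    return cost, runs


# Hypothetical example: a 5 Mbp genome sequenced to 10x coverage.
for platform in COST_PER_MBP_USD:
    cost, runs = project_estimate(platform, 5_000_000, 10)
    print(f"{platform:8s} ${cost:>9,.0f}  {runs} run(s)")
```

Read length is not part of this calculation, although it strongly affects how well the reads can be assembled afterwards.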

What, When and Why
 Sanger: small projects (less than 1 Mbp)
 454: whole genome, de novo sequencing, metagenomics
 Solexa, SOLiD, HeliScope:
 Gene expression
 Resequencing
Genomic DNA
→ Shearing/sonication
→ Clone fragments; sequence each clone
→ Match overlapping sequences (assembly)
→ Contigs
→ Find and fill gaps; compile data, quality check & editing
→ Reassemble
→ Draft sequence
First Generation Sequencing
Second Generation Sequencing
2nd Gen Sequencing Tech
 Traditional sequencing: 384 reads of ~1 kb / 3 hours
 454 (Roche):
 1M reads of 450-1000 bp / 10-24 hours
 HiSeq (Illumina):
 http://www.youtube.com/watch?v=HtuUFUnYB9Y
 100-200M reads of 50-100 bp / 3-8 days * 16 samples
 SOLiD (Applied Biosystems):
 >100M reads of 50-60 bp / 2-8 days * 12 samples
 Ion Torrent (Life Technologies):
 http://www.youtube.com/watch?v=yVf2295JqUg
 5-10M reads of 200-400 bp / < 2 hours
Illumina HiSeq2000
 Throughput and cost:
 $1000-2500 / lane (depends on read length and single-end (SE) vs paired-end (PE))
 50-100 bp / read
 16 lanes (2 flow cells) / run
 150-200 million reads / lane
 Sequencing a human genome: $3000, 1 week
 Bioinfo challenges
 Very large files
 CPU and RAM hungry
 Sequence quality filtering
 Mapping and downstream analysis
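The HiSeq2000 figures above imply a per-run yield that can be worked out directly. This sketch multiplies reads per lane, read length, and lanes, then divides by an assumed human genome size of about 3.2 Gbp (a value not given on the slide) to get approximate coverage. It counts single-end reads only.

```python
HUMAN_GENOME_BP = 3_200_000_000  # assumption: approximate haploid human genome size


def run_yield_bp(reads_per_lane, read_length_bp, lanes=16):
    """Total bases per run = reads per lane x read length x number of lanes."""
    return reads_per_lane * read_length_bp * lanes


def coverage(yield_bp, genome_size_bp=HUMAN_GENOME_BP):
    """Average number of times each genome position is sequenced."""
    return yield_bp / genome_size_bp


low = run_yield_bp(150_000_000, 50)    # lower end of the slide's ranges
high = run_yield_bp(200_000_000, 100)  # upper end of the slide's ranges
print(f"yield per run: {low / 1e9:.0f}-{high / 1e9:.0f} Gbp")
print(f"human genome coverage: {coverage(low):.1f}x-{coverage(high):.1f}x")
```

Files of this size are the source of the storage, CPU, and RAM demands listed under the bioinformatics challenges.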
Third Generation Sequencing
 Single molecule sequencing (no amplification needed)
 Oxford Nanopore: Read fewer but longer sequences
http://www.youtube.com/watch?v=_rRrOT9gfpo
 In 1-2 years, the cost of sequencing a human genome
will drop below $1000, storage will cost more than
sequencing
 Personal genome sequencing might become a key
component of public health in every developed
country
 Bioinformatics will be key to convert data into
knowledge
Nanopore Sequencing
(Potential) Applications
 Metagenomics and infectious disease
 Ancient DNA, recreate extinct species
 Comparative genomics (between species) and
personal genomes (within species)
 Genetic tests and forensics
 Circulating nucleic acids
 Risk, diagnosis, and prognosis prediction
 Transcriptome and transcriptional regulation
 More later in the semester…
Genome (sequence) annotation
 Structural :
 Identify genes, Pseudogenes, clusters
 Identify Mutations/ Variations
 Identify repeats
 Identify ESTs, UREs, Homologous, Analogous regions
 Identify SNPs
 Functional
 Protein/polypeptide encoded by genes
 Putative stage/tissue of expression of gene
 Associations with ‘probable’ phenotypes
 Relation to diagnostics
 Variation among population, species, genus - phylogeny
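As one concrete example of structural annotation, SNPs can be identified by comparing a sample against a reference. The sketch below assumes the two sequences are already aligned and of equal length, with "-" marking gaps; the sequences and the `find_snps` name are hypothetical, and real pipelines work from read alignments with quality filtering.

```python
def find_snps(reference, sample):
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    for position, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1):
        if "-" in (ref_base, alt_base):
            continue  # gaps are insertions/deletions, not SNPs
        if ref_base != alt_base:
            snps.append((position, ref_base, alt_base))
    return snps


reference = "ACGTGACTGAGGACCG"  # hypothetical toy reference
sample = "ACGTAACTGAGGACTG"     # same sequence with two substitutions
snps = find_snps(reference, sample)
print(snps)
```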
 THANK YOU
