Figure: SMRTbell™ template libraries. Panels: size-selected mouse lemur 20 kb library; 20 kb AMPure® mouse lemur library. Legend: input gDNA; size-selected.
Most Uniform Coverage
• Ross et al. (2013) Characterizing and measuring bias in sequence data. Genome Biology, May 29;14(5):R51
Detection of DNA Base Modifications by SMRT Sequencing
– >99.999% (QV50)
– GC content
– Epigenome characterization
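The ">99.999% (QV50)" figure above pairs an accuracy with a quality value. As a minimal sketch, assuming the standard Phred convention QV = -10 · log10(error rate), the two can be converted as follows. The function names are illustrative and are not part of any PacBio software.

```python
import math


def accuracy_to_qv(accuracy: float) -> float:
    """Convert an accuracy fraction (e.g. 0.99999) to a Phred-scaled QV."""
    error = 1.0 - accuracy
    if not 0.0 < error <= 1.0:
        raise ValueError("accuracy must be in [0, 1) to give a finite QV")
    return -10.0 * math.log10(error)


def qv_to_accuracy(qv: float) -> float:
    """Convert a Phred-scaled QV back to an accuracy fraction."""
    return 1.0 - 10.0 ** (-qv / 10.0)


if __name__ == "__main__":
    # 99.999% accuracy is one error per 100,000 bases, i.e. about QV50.
    print(round(accuracy_to_qv(0.99999), 3))
    print(qv_to_accuracy(50))
```

Under this convention every additional 10 QV units corresponds to a tenfold lower error rate.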
De Novo Assembly
What can be achieved with infinite coverage given the read length?
PacBio
Koren S. et al. (2013) Reducing assembly complexity of microbial genomes with single molecule sequencing. Genome Biology, 14:R101
Easy Bioinformatics Solution to Finish Genomes Using Only PacBio® Reads
Timeline, 2013 to 2014:
Bacteria (1-10 Mb): finished genomes
Yeast (12 Mb): resolve most chromosomes
Arabidopsis (120 Mb): contig N50 7.1 Mb
Drosophila (170 Mb): contig N50 4.5 Mb
Spinach (1 Gb): contig N50 531 kb
Human, haploid (3.2 Gb): contig N50 4.4 Mb, max = 44 Mb
PacBio-Only Sequencing of Arabidopsis
Est. Genome Size (Mb)     110.4    124.6    11.5%
Polished Contigs          4,662    545      8.5X
N50 Contig Length (Mb)    0.067    6.36     95X
Max Contig Length (Mb)    0.46     13.21    29X
*http://1001genomes.org/data/MPI/MPISchneeberger2011/releases/current/
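The N50 contig length reported above is a length-weighted summary of an assembly. As a minimal sketch, assuming the common definition (the length of the contig at which the running total of contig lengths, sorted longest first, reaches half of the assembly size), it can be computed as below. The function name `n50` and the toy lengths are illustrative, not data from these slides.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the cumulative length first reaches at least half of
    the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # Toy assembly of eight contigs totalling 32 units.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Because N50 weights contigs by length, merging many short contigs into a few long ones raises it sharply, which is why it is used here alongside contig count and maximum contig length.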
Figure: Venn diagram comparing Cvi ILMN PE and PacBio Cvi Assembly. Categories: both, Illumina only, PacBio only. Values shown: 685,104; 55,947; 92%/72%; 271,335.
These SNPs are highly enriched in peri-centromere and associate with aberrantly high coverage number
• Watch Richard McCombie's 2014 AGBT presentation
PacBio-Only Sequencing of a Spinach Genome (980 Mb)
Download Dataset
Human Genome De Novo Assemblies Comparison
Assembly        Year  Contig N50 (kb)  Technology             Assembly method           # of library types
HuRef (Venter)  2007  107              ABI 3730               Celera Assembler          4
BGI YH          2009  7.4              Illumina GA            SOAPdenovo                5
KB1             2010  5.5              454 GS FLX Titanium    Newbler                   2
NA12878         2010  24               Illumina GA            ALLPATHS-LG               5
RP11_0.7        2013  127              454 GS, HiSeq, MiSeq   Newbler                   3
CHM1            2013  144              HiSeq, BAC clones      Reference Guided          NA
CHM1            2014  4378             PacBio RS II           FALCON, Celera Assembler  1
MHC region
The Next Challenge: Assembling Diploid Genomes
Developing bioinformatics and visualization tools to resolve diploid genomes
Early assembly result for the Ler-0 + Col-0 “synthetic” diploid
Watch Jason Chin’s 2014 AGBT presentation “String Graph Assembly for Diploid Genomes with Long Reads”
Benefits of PacBio® Sequencing for Large Genomes
Figure: Iso-Seq method workflow.
Sample preparation: polyA mRNA; cDNA synthesis with adapters; size partitioning & PCR amplification; SMRTbell ligation.
Read processing: PacBio raw sequence reads; reads of insert; remove adapters; remove artifacts; clean sequence reads; reads clustering; isoform clusters; consensus calling; quality filtering; nonredundant transcript isoforms.
Read structure labels: 5’ primer, coding sequence, polyA tail, 3’ primer, SMRT adapter.
SampleNet: Iso-Seq Method with Clontech cDNA Synthesis Kit
Tseng, PAG 2014, “Isoform Sequencing: Unveiling the Complex Landscape of the Eukaryotic Transcriptome on the PacBio® RS II” (poster)
“Gene Identification, Even in Well-Characterized Human
Cell Lines and Tissues, is Likely Far From Complete”
Nrxn1α domain structure
Splice isoform abundance (2,574 full-length Nrxn1α mRNAs sequence reads)
247 unique alternatively-spliced isoforms
6 SMRT® Cells
Exons
• green – present
• white – absent
Treutlein et al. (2014) Cartography of neurexin alternative splicing mapped by single-molecule long-read mRNA sequencing. PNAS. doi:10.1073/pnas.1403244111
PacBio® Sequences Used for Gene Model Validation in Lettuce
Without PacBio reads
Including PacBio reads
Additional ~5000 gene models validated
Confidence
PAG 2014, Marilena Christopoulou, “Targeted transcriptome analysis using PacBio sequencing to dissect multi-gene families encoding NBS-LRR resistance proteins in lettuce”
PacBio® Iso-Seq Data Used to Confirm Predicted Scaffolds in Norway Spruce Genome
14 SMRT® Cells of PacBio data using early chemistry & protocols