
Review


Next-generation
sequencing applied to
molecular diagnostics
Expert Rev. Mol. Diagn. 11(4), 425–444 (2011)

Rachael Natrajan†1 and Jorge S Reis-Filho1
1Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, 237 Fulham Road, London, SW3 6JB, UK
†Author for correspondence:
Tel.: +44 207 153 5529
Fax: +44 207 153 5340
rachael.natrajan@icr.ac.uk

Next-generation sequencing technologies have begun to revolutionize the field of cancer genetics through rapid and accurate assessment of a patient’s DNA makeup with minimal cost. These technologies have already led to the realization of the inter- and intra-tumor genetic heterogeneity and the identification of novel mutations and chimeric genes; however, several challenges lie ahead. Given the low number of recurrent somatic genetic aberrations in common types of cancer, the identification of ‘driver’ genetic aberrations has proven challenging. Furthermore, implementation of next-generation sequencing and/or some of its derivatives into routine practice as diagnostic tests will require in-depth understanding of the pitfalls of these technologies and a great degree of bioinformatic expertise. This article focuses on the contribution of next-generation sequencing technologies to diagnosis and cancer prognostication and prediction.

Keywords : cancer • exome • fusion genes • genetics • massively parallel sequencing • mutations
• third-generation sequencing

First-generation DNA sequencing (Sanger sequencing) was first described by Sanger, and Maxam and Gilbert over 30 years ago [1,2]. Since then, modifications to the Sanger sequencing methodology have provided incremental improvements in its accuracy and efficiency, enabling increased automation and incremental increases in its throughput. However, this technology is still limited to targeted sequencing of known genomic sequences, usually up to 1000 bp per sequencing reaction. In the last few years, second-generation massively parallel sequencing (also known as next-generation sequencing) has emerged. It is based on methodologies that overcome the limitations of Sanger sequencing. As a result of these technological advances, it is now possible to sequence complete genomes [3–12], expressed genes (transcriptomes) [13,14] and known exons (exomes) [15–17] of patient samples. As well as providing whole genomic sequencing coverage, second-generation sequencing is much more cost effective than the first generation. For example, an entire genome can now be sequenced for approximately US$4000 at 30–40-times coverage [18].

These cost reductions in sequencing in tandem with the promise of the US$1000 genome [19] now make it possible for clinical laboratories to adopt this technology. Clinical use of second-generation sequencing has provided an interesting alternative to identify the cause of hereditary diseases through genome-wide association studies, as well as aiding the identification of mutations that cause and promote the progression of cancer. Next-generation sequencing also holds the promise of increasing our understanding of cancer genetics, epigenetics and pharmacogenetics, and also detecting novel pathogens that cause human diseases. In fact, next-generation sequencing and/or one of its derivatives is likely to replace microarrays for the molecular characterization of cancer for diagnostic, prognostic and predictive purposes [20,21].

Implementing next-generation sequencing into routine clinical diagnostic laboratories will require clear guidelines, robust protocols and standard operating procedures not only for sample preparation and sequencing (wet laboratory), but also for the computational assessment (dry laboratory) of the results. The interpretation of billions of sequence reads is by no means trivial, and improvements in the availability of ‘user-friendly’ software and enhancements in computer and data storage infrastructure will be of paramount importance for the realization of the potential of these technologies. Massively parallel sequencing has already begun to revolutionize medical genetics as we know it, but the vast

www.expert-reviews.com 10.1586/ERM.11.18 © 2011 Expert Reviews Ltd ISSN 1473-7159 425


Review Natrajan & Reis-Filho

amounts of personal genomic data raise ethical issues that will also have to be addressed [22]. Indeed, the era of personalized medicine is now a step closer than before and routine screening to guide medical treatments of individuals throughout their lifetime is becoming a reality.

Overview of second-generation sequencing technologies

Massively parallel sequencing methods overcome the limitations of scalability of traditional Sanger sequencing by eliminating the requirement of electrophoresis for the separation of the sequences [23]. Most of the second-generation sequencing methods work by ‘sequencing by synthesis’, either creating microreactors and/or attaching the DNA molecules to be sequenced to solid surfaces or beads, allowing millions of sequencing reactions to be performed in parallel. A number of next-generation DNA sequencers are commercially available, including the Genome Analyzer and HiSeq™ (Illumina), 454-FLX (Roche), SOLiD™ (Applied Biosystems) and Heliscope™ (Helicos), and a number of third-generation single molecule sequencers are in development (Table 1, for reviews see [24–32]). Second-generation sequencing technologies involve the production of an adaptor-modified random collection of DNA fragments that have been clonally amplified (Figure 1). These are subsequently sequenced by repeated cycles of polymerase-based nucleotide extension or by cycles of oligonucleotide ligation [26]. Next-generation sequencing results in the generation of a huge amount of sequence data in the range of megabases to gigabases. Complete Genomics offers a ‘one-stop shop’ for whole-genome sequencing, using a proprietary sequencing technology based on DNA nanoballs involving sample preparation, sequencing and data analysis. The advantage of this technology lies in its custom alignment algorithm and data analysis pipeline designed for human whole-genome sequence comparisons, which have led to a substantial reduction in cost [18].

Second-generation sequencing technologies have, therefore, provided a new revolution for genetic analysis and have begun to allow resequencing at single-nucleotide resolution of entire normal human and cancer genomes at a relatively low cost. These revolutionary methods hold the promise of identifying activating and inactivating mutations in key disease genes, assessment of gene expression and novel transcript variants in both coding and noncoding genes, and in tandem are able to identify somatic structural rearrangements (i.e., translocations) and genomic copy-number alterations.

Whole-genome sequencing

The first human whole genome to be sequenced by next-generation sequencing technologies was James Watson’s genome in 2008 at an average depth of 7.4-fold in 2 months, costing one hundredth of the cost of conventional Sanger sequencing technologies [33]. Since then the number of whole genomes sequenced has begun to rise dramatically, now with up to 30-fold average coverage and including individuals from different ethnicities [3–5]. As well as ‘normal’ genomes, the first cancer genome to be sequenced was reported in 2008, in which the full DNA sequence of a patient with acute myeloid leukemia was compared with the germ-line DNA from the same individual [6]. In the past few years, more complete sequences of cancer genomes compared with their normal genomes have been published (Table 2) [8–12]. These whole-genome sequencing studies have enabled the assessment of cancer-specific mutations

Table 1. Comparisons of second-generation sequencing technologies.

Sequencing platform | Amplification | Sequence reaction | Read length (bp) | Run time (days) | Data yield (Gb) | Advantages | Disadvantages
454 Ti Roche™ | Emulsion PCR | Pyrosequencing | 400 | 0.4 | 0.5 | Short run times. Longer reads improve mapping in repetitive regions. Ability to detect large structural variations | High reagent cost. Higher error rates in repeat sequences
Illumina HiSeq™ 2000 | Bridge PCR | Reverse terminator | 100 | 4†, 8‡ | 150–200 | The most popular platform | Short reads may miss larger structural variations. Low multiplexing capacity of samples
ABI 5500 | Emulsion PCR | Ligation sequencing | 75 | 3.5†, 7‡ | 90 | Good base call accuracy. Good multiplexing capability | Short read length
Complete Genomics | PCR on DNA nanoballs | Combinatorial probe-anchor ligation | 35 | 12 | 200 | Low error rate. Analysis and management handled by company | Large amounts of input DNA needed. Whole-genome DNA sequencing only

†Single-end runs.
‡Paired-end runs.
Gb: Gigabyte of data.
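
The relationship between the run yields in Table 1 and the fold coverage quoted in the text can be illustrated with a rough calculation. The sketch below is not part of the original review. It assumes that the yield column can be read as gigabases of sequence and that the haploid human genome is approximately 3.1 Gb. Both are assumptions made for illustration, and the function names are hypothetical.

```python
# Rough fold-coverage estimate from the run yields listed in Table 1.
# Assumptions (not from the article): yields are read as gigabases of
# sequence, and the haploid human genome is taken as ~3.1 Gb.

HUMAN_GENOME_GB = 3.1  # assumed genome size in gigabases


def fold_coverage(yield_gb: float, genome_gb: float = HUMAN_GENOME_GB) -> float:
    """Mean depth = total sequenced bases / genome size."""
    return yield_gb / genome_gb


def reads_per_run(yield_gb: float, read_length_bp: int) -> float:
    """Approximate number of reads needed to produce the stated yield."""
    return yield_gb * 1e9 / read_length_bp


# (platform, read length in bp, data yield in Gb) taken from Table 1;
# the lower bound of the HiSeq yield range is used.
platforms = [
    ("454 Ti", 400, 0.5),
    ("Illumina HiSeq 2000", 100, 150.0),
    ("ABI 5500", 75, 90.0),
    ("Complete Genomics", 35, 200.0),
]

for name, read_len, yield_gb in platforms:
    print(
        f"{name}: ~{fold_coverage(yield_gb):.1f}x mean depth, "
        f"~{reads_per_run(yield_gb, read_len):.2e} reads"
    )
```

These figures are upper bounds on usable depth, because they ignore duplicate, low-quality and unmapped reads.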


by removing germ-line variants from the normal DNA sequence; however, this process required sequencing of the germ-line DNA of the same patient. Identification of somatic rather than germ-line mutations is crucial for the cataloguing of the genetic events that took place during the development and progression of disease. Shah et al. used this approach to look at the mutational evolution of a lobular breast carcinoma [11]. A comparative analysis of the repertoire of mutations found in the metastatic deposits compared with that from the primary tumor revealed that only five of 32 nonsynonymous coding mutations present in the metastasis were also present in the primary tumor. Importantly, 19 mutations present in the metastasis were not detected in the primary tumor. Given the interval between primary tumor and metastasis and the fact that the patient received radio- and chemo-therapy prior to the resection of the metastatic deposit, these results are not conclusive as to whether the metastasis-restricted mutations are related to tumor progression, a consequence of chemo-/radiotherapy or a combination of the two phenomena. In a subsequent study, Ding et al. used whole-genome paired-end sequencing to determine the genetic alterations associated with tumor progression in a patient with a metaplastic breast carcinoma of basal-like phenotype by sequencing the primary tumor, a metastatic deposit in the brain, a xenograft derived from the primary tumor and the germ-line DNA from the patient [12]. These analyses demonstrated that although the primary tumor, the metastatic deposit and the xenograft were remarkably similar in their repertoire of somatic mutations, rearrangements and copy number aberrations, important differences were found in terms of the percentage of cells harboring a given mutation in each of these samples and the presence of mutations restricted to one of these samples. In fact, the somatic gene aberrations found in the metastatic deposit were more similar to those of the xenograft than those of the primary tumor. The xenograft and metastasis displayed an expansion in the prevalence of 26 mutations and a decrease in prevalence of two mutations over the primary tumor; in addition, novel mutations of SNED1, FLNC and a 26.9 kb deletion of MECR were restricted to the cancer cells in the metastatic deposit. Taken together, these results suggest that metastatic deposits may arise from a number of small clonal populations within the primary cancer [12].

In addition to revealing the repertoire of point mutations in cancers, whole-genome sequencing can also provide important biological insights in relation to the types and patterns of DNA insults involved in the development of a tumor. Disease-specific mutational signatures in distinct types of cancer have recently been identified. The mutational signature of lung cancer patients associated with tobacco exposure has an enrichment of G>T; C>A transversions [7,10], whereas the mutational spectrum in a melanoma patient showed a UV exposure signature with an enrichment of C>T; G>A transitions [9]. These efforts may lead to a better understanding of the types of genetic damage caused by specific carcinogenic insults and the characterization of the genetic lesions caused by specific types of dysfunctional DNA repair pathways in cancers.

Whole-genome sequencing also has the advantage of detecting structural rearrangements – that is, fusion genes present in hematological malignancies and now known to be common in solid cancers [34,35]. By investigating the read pairs that do not align to the genome as would have been expected in two lung cancer cell lines, Campbell et al. identified 103 somatic structural variants. They observed that many of the somatic rearrangements mapped to regions of amplification with a proportion leading to expressed fusion transcripts [36]. A low-depth whole-genome sequencing approach in 24 breast cancers identified a large number of fusion genes and, most importantly, the existence of a subgroup of breast cancers characterized by the presence of multiple tandem repeats inserted in their genomes. The term mutator phenotype was coined to describe this genomic pattern, which is more prevalent in ER-/HER2-negative than in ER-positive breast cancers [37]. The mechanism leading to this mutator phenotype is yet to be determined; however, this phenomenon may be linked to the high proliferative potential of a subgroup of ER-/HER2-negative cancers and the slippage of replication forks. Further analyses to characterize this fascinating phenotype are warranted. Although 29 expressed in-frame fusions were detected in that study [37], all of them proved to be private (i.e., only found in the index case). Lee et al. also used this approach to identify novel fusion genes in a lung cancer patient that were present in the tumor but not the matched normal tissue [7]. The large majority of the fusions identified mapped to intergenic regions, suggesting that they are likely to be passenger events; however, the functional consequences of such rearrangements undoubtedly require further investigation. A similar analysis of pancreatic primary cancers and metastatic deposits led to the identification of 381 somatic rearrangements, of which a proportion were also found in the metastasis, suggesting the expansion of specific clonal populations in the metastatic process [38].

As well as detecting alterations in known regions of the genome, whole-genome next-generation sequencing can also be useful for the detection of mutations in gene promoter and enhancer regions, as well as in introns, noncoding RNA species and retrotransposons. It is becoming more apparent that hot long interspersed LINE-1 elements (L1’s) account for a proportion of our inter-individual genetic variation, and may contribute to disease [39,40]. L1 is a class of mobile elements that generate new retrotransposon insertions in human genomes. Using a combination of locus junction-specific PCR and 454 next-generation sequencing, Iskow et al. developed a method to detect new and otherwise young retrotransposon insertions in humans [41]. This approach led to the observation of somatic L1 insertions in 30% of lung cancer genomes. Methylation analysis revealed a hypomethylation signature that was able to distinguish L1-permissive from non-L1-permissive lung cancers [41].

Another useful application of next-generation sequencing lies in the detection of pathogens that are present in the human genome. In this approach, known as ‘metagenomics’, samples of human tissue are sequenced and the reads aligned to microbial or viral genomes to identify which pathogenic organism is present in the primary sample. As the sequence reads are present in proportion to the population frequency of each pathogen, information regarding the relative abundance can also be inferred [42]. The human


[Figure 1 panel labels. (A) DNA fragmentation and adaptor ligation; attachment to flow cell; cluster generation by bridge amplification of DNA fragments; single-base extension with incorporation of fluorescently labeled nucleotides; imaging. (B) DNA fragmentation and adaptor ligation; coupling of DNA fragment to an amplification bead and amplification of DNA fragments by emulsion PCR; single-base extension and generation of chemiluminescent signal; imaging. (C) DNA fragmentation and adaptor ligation; ligation of fragments to magnetic beads and amplification of DNA fragments by emulsion PCR; amplified DNA is attached to a glass slide; single-base extension and imaging. (D) Generation of DNBs (adaptor, genomic DNA); DNBs are adhered to a silicon chip; combinatorial probe anchor ligation with fluorescently labeled oligonucleotides and image detection (matching probe binds to anchor DNA).]

Figure 1. Summary of second-generation sequencing technology workflows. (A) Illumina. DNA molecules are fragmented and 5´
and 3´ universal adaptors are added to the DNA. DNA molecules are then immobilized to a glass slide (flow cell) and the DNA molecules
are copied in situ via bridge amplification. This generates clonal clusters that are then denatured and annealed with a sequencing primer
and subjected to sequencing by synthesis using 3´ chemically inactive labeled nucleotides, which ensure a single base is incorporated at
each step. Each base-incorporation cycle is imaged and then the blocked fluorescence group is removed and the next base is
incorporated and imaged. (B) Roche 454. Library construction involves the ligation of special 454 5´ and 3´ adaptors to fragmented DNA.
One DNA fragment is then coupled to a single amplification bead, and emulsion PCR proceeds, where fragments are amplified before
sequencing. The beads are then loaded into picotiter plates. The pyrosequencing reaction then proceeds, which involves the addition of
sequencing enzymes. The sequencing instrument then flows individual nucleotides in a fixed order across the hundreds of thousands of
wells containing one bead each. The addition of one nucleotide complementary to the template strand results in a chemiluminescent
signal recorded by the charge-coupled-device camera within the instrument. (C) ABI SOLiD. Oligonucleotide adaptor-linked DNA
fragments are ligated to magnetic beads that hold complementary oligonucleotides on them. Emulsion PCR is used to copy the DNA,
which is subsequently attached to the surface of a glass slide with the aid of a 3´ modification (purple) to allow covalent bonding to the
slide. The slide is placed into a fluidics cassette within the sequencer and sequencing occurs with the hybridization of primers to the
adaptor molecules on the beads, after which a set of four 8mer di-base probes compete for ligation to the sequencing primer. A matched probe
is then ligated to the DNA, and a fluorescent readout of the fifth base is achieved. Chemical cleavage removes the fluorescent group,
enabling subsequent rounds of ligation to occur. Two-base encoding aids the discrimination of base calling errors from true mutations.
(D) Complete Genomics. DNA is fragmented and configured in a head-to-tail manner connecting all copies together that form into a
DNB. These nanoballs contain hundreds of copies of a 70 base pair sequence. DNBs are adhered to spots on silicon chips, with each spot
containing a single DNB. Combinatorial probe anchor ligation accurately distinguishes between the different bases to attach fluorescent
molecules that emit signals when detected.
DNB: DNA nanoball.
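
The statement in the caption that two-base encoding helps to separate base-calling errors from true mutations can be made concrete with a small sketch. It is not taken from the review. It assumes the commonly described color code in which each color reports a pair of adjacent bases and equals the XOR of the two bases under the mapping A=0, C=1, G=2, T=3. The function names and the toy reference sequence are hypothetical.

```python
# Sketch of two-base (color-space) encoding under an assumed XOR color code.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"


def encode(seq):
    """Translate a base sequence into colors, one per adjacent base pair."""
    return [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]


def decode(first_base, colors):
    """Rebuild the bases from a known first base and the color calls."""
    bases = [first_base]
    for color in colors:
        bases.append(BASE[CODE[bases[-1]] ^ color])
    return "".join(bases)


def color_differences(reference, colors):
    """Indices at which the observed colors differ from the reference colors."""
    return [i for i, (r, c) in enumerate(zip(encode(reference), colors)) if r != c]


reference = "ATGGCATC"          # hypothetical reference sequence
variant = "ATGTCATC"            # same sequence with a true G>T change at index 3

snp_colors = encode(variant)    # a real substitution alters two adjacent colors
error_colors = encode(reference)
error_colors[3] = (error_colors[3] + 1) % 4  # a single miscalled color

print(color_differences(reference, snp_colors))    # two adjacent positions
print(color_differences(reference, error_colors))  # one isolated position
print(decode("A", error_colors))  # a lone color error corrupts every later base
```

Under this scheme an isolated color mismatch against the reference is more likely to be a measurement error, whereas a true single-base substitution produces two adjacent color changes.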


Table 2. Next-generation sequencing studies of cancer genomes.

Study (year) | Method | Cancer type | Samples sequenced (n) | Results | Ref.
Ley et al. (2008) | Whole-genome high coverage | Acute myeloid leukemia | 1 | Ten acquired mutations in the tumor | [6]
Campbell et al. (2008) | Whole-genome low coverage | Lung | 2 | Identification of somatically acquired structural rearrangements | [36]
Stephens et al. (2009) | Whole-genome low coverage | Breast | 24 | Identification of structural variations and fusion genes | [37]
Pleasance et al. (2010) | Whole-genome high coverage | Melanoma | 1 | Catalogue of somatic mutations related to UV-induced DNA damage | [9]
Pleasance et al. (2010) | Whole-genome high coverage | Small-cell lung | 1 | Catalogue of somatic mutations related to tobacco-induced DNA damage | [10]
Mardis et al. (2009) | Whole-genome high coverage | Acute myeloid leukemia | 1 | Recurrent mutations in IDH1 | [8]
Shah et al. (2009) | Whole-genome high coverage | Breast | 1 | Identification of somatic mutations present in primary tumors and metastases | [11]
Ding et al. (2010) | Whole-genome high coverage | Breast | 1 | Comparison of mutation spectrum in primary tumor, metastasis and xenograft | [12]
Lee et al. (2010) | Whole-genome high coverage | Lung | 1 | Identification of mutation spectrum and structural variations in a paired normal and tumor genome | [7]
Stephens et al. (2011) | Whole-genome low coverage | Chronic lymphocytic leukemia and bone | 34 | Identification of structural variations and fusion genes | [99]
Zhao et al. (2010) | Targeted exome sequencing | Breast | 1 | Identification of known and novel tumor-suppressor genes | [58]
Harbour et al. (2010) | Targeted exome sequencing | Uveal melanoma | 2 | Identification of recurrent inactivating mutations in BAP1 | [59]
Jiao et al. (2011) | Targeted exome sequencing | Pancreatic neuroendocrine | 10 | Identification of mutations in the mTOR pathway | [60]
Timmerman et al. (2010) | Targeted exome sequencing | Colorectal | 2 | Prediction of microsatellite instability | [69]
Jones et al. (2009) | Targeted exome sequencing | Pancreatic | 5 | Identification of PALB2 mutations | [63]
Kan et al. (2010) | Targeted exome sequencing | Breast, ovarian, lung and prostate | 441 | Identification of differential mutation spectra | [62]
Cheng et al. (2010) | Targeted exome sequencing | Colorectal | 27 | Identification of novel SNPs in APC | [66]
Parsons et al. (2011) | Targeted PCR on Illumina GAII and Sanger sequencing | Pediatric medulloblastoma | 22 | Identification of MLL2 and MLL3 mutations in 16% of patients | [61]
Villarroel et al. (2010) | Targeted exome sequencing | Pancreatic | 1 | Identification of a PALB2 mutation leading to resistance to therapy | [65]
Varela et al. (2011) | Targeted exome sequencing | Renal cell | 7 | Truncating mutations of PBRM1 | [199]
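
Two steps shared by many of the studies in Table 2 are the subtraction of germ-line variants from the tumor calls and the tallying of the resulting somatic substitutions into a mutation spectrum. The following is a minimal sketch of that logic and not the pipeline used by any of the cited studies. Variants are represented as hypothetical (chromosome, position, reference base, alternate base) tuples, and the toy data are invented for illustration.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def somatic_variants(tumor_calls, germline_calls):
    """Keep only variants seen in the tumor and absent from matched germ-line DNA."""
    return set(tumor_calls) - set(germline_calls)


def substitution_class(ref, alt):
    """Collapse strand-equivalent changes so that G>T counts as C>A, G>A as C>T, etc."""
    if ref in "AG":  # express every change relative to a pyrimidine reference base
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def mutation_spectrum(variants):
    """Count single-base substitutions per collapsed class."""
    return Counter(substitution_class(ref, alt) for _, _, ref, alt in variants)


# Invented toy calls: (chromosome, position, reference base, alternate base)
tumor = {
    ("chr1", 100, "G", "T"),
    ("chr1", 200, "C", "A"),
    ("chr2", 50, "C", "T"),
    ("chr3", 75, "A", "G"),   # also present in the germ line, so not somatic
}
germline = {("chr3", 75, "A", "G")}

somatic = somatic_variants(tumor, germline)
spectrum = mutation_spectrum(somatic)
print(len(somatic), dict(spectrum))
```

In this representation the tobacco-associated signature described in the text would appear as an excess of the C>A class and the UV-associated signature as an excess of the C>T class.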

microbiome (collection of organisms that live within the human body) is increasingly being studied to understand the microbial components of the human genetic and metabolic landscape and how they contribute to normal physiology and predisposition to disease [43]. These studies include the characterization of the microbial component of the human gut [44,45] and oral cavity [46], and are being used to elucidate mechanisms associated with obesity [47] and pneumococcal evolution [48]. Furthermore, analysis of genetic


fragments present in matched diseased and healthy tissues has the potential to identify viral agents in cancer and autoimmune diseases, among others.

Targeted ‘exomic’ sequencing

Whilst whole-genome sequencing provides a comprehensive catalogue of mutations, copy number variations and structural rearrangements in a single experiment, it requires the greatest amount of sequencing and poses significant logistic challenges in relation to data handling and storage [49]. Given that protein-coding genes only comprise approximately 1–2% of the genome, harbor the majority of known disease-related mutations and most, if not all, mutations that can currently be addressed therapeutically, approaches that selectively sequence just these regions of the genome (i.e., the ‘exome’) have the ability to expedite the characterization of genetic aberrations in different diseases [15–17]. Importantly, targeted exomic sequencing can currently be performed in relatively large collections of cancers. This approach involves the capture of the target sequences by molecular inversion probes or capture onto DNA or RNA ‘baits’ by arrayed probes on a solid surface or by bead capture in solution (Figure 2) [50]. Any region of the genome can be selectively targeted for sequencing by capture methods, and custom capture baits provided by the companies selling this technology are becoming increasingly popular. Microfluidic approaches such as RainDance and Fluidigm technology for sequence enrichment are also becoming more popular. RainDance uses a library of PCR primers that enable the amplification of thousands of genomic loci in a single tube by using picoliter volume PCR reactions at a rate of 10 million reactions/h. This has the advantage of high-speed sample processing, avoiding the bias introduced by capture-based technologies [51]. An additional PCR-based enrichment strategy is available through the Access Array platform (Fluidigm, CA, USA). This technology uses nanoscale non-emulsion-based

[Figure 2 panel labels. (A) gDNA with exons 1–5; fragmentation and nick translation of DNA; hybridization to baits; elute DNA; sequencing. (B) gDNA with genes 1–5; fragmentation of DNA; combine droplet primers and droplet DNA fragments in a microfluidic chamber; droplet PCR; gDNA purification; sequencing.]

Figure 2. Overview of capture-based technologies. (A) Schematic diagram of hybrid selection to capture specific regions of the
human genome. DNA is sheared and hybridized to tagged oligonucleotide ‘baits’ that are specific for the region of interest or exons of
genes. Captured DNA is subsequently eluted and libraries prepared for sequencing. (B) Schematic diagram of microfluidic technologies
for sequence capture. DNA is fragmented and purified. A microfluidic chip encapsulates primer pairs and primer pair droplets are mixed
together to create a primer library. Primers and DNA droplets with PCR components (buffer, deoxyribonucleotide triphosphates and DNA
polymerase) are mixed together where they are merged into a single droplet as they pass through the channel of the microfluidic chip.
Following PCR, the DNA is purified and prepared for sequencing.
gDNA: Genomic DNA.
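
The saving that motivates exome capture follows from simple arithmetic on the 1–2% figure given in the text. The sketch below estimates the raw sequence needed to reach a given mean depth over the whole genome and over an exome. The genome size of 3.1 Gb, the 30-fold depth and the `on_target_fraction` parameter are illustrative assumptions and do not come from the review.

```python
GENOME_BP = 3.1e9  # assumed haploid human genome size (not from the article)


def bases_required(target_bp, mean_depth, on_target_fraction=1.0):
    """Raw bases to sequence so that the target reaches the requested mean depth.

    on_target_fraction is a hypothetical capture efficiency: the share of
    sequenced bases that actually fall inside the targeted regions.
    """
    if not 0 < on_target_fraction <= 1:
        raise ValueError("on_target_fraction must be in (0, 1]")
    return target_bp * mean_depth / on_target_fraction


depth = 30  # illustrative mean depth
genome_bases = bases_required(GENOME_BP, depth)

for exome_fraction in (0.01, 0.02):
    exome_bases = bases_required(GENOME_BP * exome_fraction, depth)
    print(
        f"exome at {exome_fraction:.0%} of the genome: {exome_bases:.2e} bases, "
        f"about {genome_bases / exome_bases:.0f}-fold less than the whole genome"
    )
```

With imperfect capture the advantage shrinks in proportion to the on-target fraction; for example, `bases_required(GENOME_BP * 0.01, 30, on_target_fraction=0.5)` doubles the exome requirement.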


PCR in microfluidic chips in chambers that are separated by valves, allowing up to 48 samples per array. This has the advantage of performing long-range PCR, but is less scalable than RainDance [52].

The sequencing of selected sets of exons has already been proven to be an effective strategy for the identification of somatic mutations for which effective therapies have been developed in cancer, such as BRAF mutations in melanoma [53] and EGFR mutations in lung cancer [54–56]. The first whole exome sequencing study of human cancers was performed using conventional Sanger sequencing, surveying 22 breast and colorectal tumors for 14,661 different coding sequence transcripts, involving 135,483 primer pairs and encompassing 21 Mb of genomic sequence [57]. Since the advent of capture technologies and massively parallel sequencing, sequencing the exome can be performed within a matter of weeks. Exome capture and transcriptomic sequencing were employed to compare the sequences from a breast cancer cell line and its matched normal in regions of loss of heterozygosity [58]. Allele-specific expression analysis led to the identification of known and potential novel tumor-suppressor genes [58]. Exomic sequencing of two uveal melanomas harboring chromosome 3 monosomy and corresponding germ-line DNA from the patients revealed recurrent inactivating mutations of the BAP1 gene, which maps to chromosome 3 and was shown to be the likeliest target of chromosome 3 deletions in uveal melanomas. Importantly, there is evidence to suggest that BAP1 gene inactivation plays a pivotal role in the metastatic potential of this tumor type and is a potential therapeutic target [59]. The exomic sequencing of ten pancreatic neuroendocrine tumors has identified mutations in mTOR pathway genes, indicating a potential avenue for therapeutic intervention in these patients [60]. A recent publication, using a combination of high-throughput Sanger sequencing and targeted next-generation sequencing to survey the genetic landscape of childhood medulloblastoma, identified recurrent inactivating mutations of the histone-lysine N-methyltransferase genes MLL2 or MLL3 in 16% of patients, demonstrating distinct differences in the prevalence of somatic mutations in adult and pediatric cancers [61].

A distinct perspective emerged from a recent study of the exomes of 441 common epithelial malignancies, including breast, ovarian, lung and prostate cancers [62]. Unlike rare cancer types, these tumors were characterized by numerous mutations, but at low prevalence, suggesting that these common types of cancer are heterogeneous in terms of the constellations of genetic aberrations they harbor [62].

Recently, PALB2 has been identified as a pancreatic cancer susceptibility gene through exomic sequencing in which truncating mutations were identified [63]. Previous studies had already demonstrated that PALB2 germ-line mutations may also be associated with increased breast cancer risk [64]. Importantly, a combination of functional analysis of xenografts derived from primary pancreatic cancers that were subjected to exomic sequencing [65] provided an exciting proof of principle of how to individualize therapy according to the mutation repertoire of the tumor.

A combination of exon-capture and long-range PCR (to enhance capture of GC regions) has been used to sequence the adenomatous polyposis coli (APC) gene in 27 colorectal cancers and adjacent normal tissues, identifying 69 novel SNPs, including one exonic SNP not previously reported [66]. This brings to our attention one of the drawbacks of whole-genome and exomic sequencing of primary cancer samples: the need for matched normal DNA to determine whether a mutation is somatic or germ-line and to eliminate the huge number of private SNPs and copy-number polymorphisms. Many of the samples in tumor banks do not have matched germ-line DNA, hence making the evaluation of novel SNPs and more complex genetic aberrations (e.g., indels) challenging. However, the completion of the 1000 Genomes Project [67,68] will provide a more complete catalogue of germ-line variations in the general population and aid in the identification of disease-specific mutations.

Finally, exomic sequencing has also been shown to predict the microsatellite instability status of colorectal cancers [69]. One can envisage that similar approaches can also be employed to identify patients who have defects in homologous recombination DNA repair and may benefit from specific chemotherapy agents (e.g., bifunctional alkylating agents) and inhibitors of the poly(ADP) ribose polymerase (PARP).

New advances in exome sequencing will certainly make the technology more suitable to be applied to the clinical setting. The reduction in the amount of starting DNA material down to 50 ng will allow the sequencing of samples with limited amounts of material, including clinical biopsies [70]. Furthermore, technologies that decrease the target capture area and increase the multiplexing of patients make exomic sequencing significantly more affordable, and estimates suggest that the price of a single exome sequence should fall by a quarter of the current cost to approximately US$500 within the next 12 months [71].

RNA sequencing

Second-generation RNA sequencing (RNA-seq) is a powerful approach to analyze the expression of mRNA, total RNA or micro- and noncoding RNAs [72–74]. Typically, RNA is fragmented and converted into cDNA before the addition of adaptor molecules for sequencing. Although studies show that RNA-seq does recapitulate microarray-based measurements to some extent [75,76], RNA-seq has several advantages. Given that no prior knowledge of the transcript sequences or variants is required, RNA-seq has a major advantage over microarrays, and can be used to identify novel transcripts and splice variants [77]. Expression values are now digital rather than analogue and are calculated based on the number of reads mapping to the transcript of interest. This increase in dynamic range is well beyond that achieved by microarray-based expression analysis and resembles more accurately the range of transcript frequencies in the cell. RNA-seq also has the ability to refine transcript structures, including read-throughs and chimeric/fusion genes [13,14]. Given the abundance of some of the more ubiquitously expressed transcripts, such as housekeeping genes, many sequencing reads can be taken up by such genes, leading to lower coverage for lower


expressed transcripts that may have biological importance. This is in contrast to microarrays,
where the dynamic range for each gene is independent of that of the others. Strategies that
remove highly abundant transcripts before sequencing will enable a better characterization of
these less abundant transcripts [78].

RNA-seq can be used for differential expression analysis between samples [79] and to examine
allele-specific expression and novel imprinted genes [73,80,81]. RNA-seq data can also be used
to detect somatic mutations; however, finding a matched normal sample for disease tissues can be
problematic, and given that normal tissue is unlikely to express genes at the same level,
analysis of somatic variants is challenging. Despite these challenges, this technique has led to
the identification of recurrent mutations in the FOXL2 gene in granulosa cell tumors of the
ovary and ARID1A mutations in clear-cell and endometrioid ovarian cancers [82–84]. A survey of
ten melanomas by transcriptomic sequencing identified 721 novel nonsynonymous coding variants,
including variants in SRRM2 in 30% of the samples [85]. Of note, however, owing to variation in
the expression levels of genes, the authors were unable to identify the known BRAF mutations in
these samples [85], which highlights the limitations of this approach for the identification of
somatic mutations. Combined RNA and DNA sequencing can also identify RNA editing events, such as
transcript editing of the COG3 and SRP9 genes in a lobular breast cancer [11].

Next-generation cytogenetics
In addition to the use of DNA sequencing technologies to study complex rearrangements in cancer,
translocations can be inferred from paired-end RNA sequencing. Recurrent fusion genes have
recently been shown to be significantly more pervasive than anticipated in carcinomas. In fact,
recurrent fusion genes involving ETS (erythroblastosis virus E26 transformation-specific) family
members have been shown to be a common genetic aberration in prostate cancer [35], challenging
the concept that fusion events are only a feature of lymphomas, leukemias and sarcomas. Recent
studies have shown that some special types of breast cancer also harbor recurrent chromosomal
translocations, such as the ETV6–NTRK3 oncogenic fusion gene in secretory carcinomas [86] and
the MYB–NFIB fusion gene in adenoid cystic carcinomas [87]. Furthermore, recurrent fusion genes
have been described in several types of salivary gland and renal cancers [88–94]. Gene fusions
and other chimeric transcripts have been identified through RNA-seq in primary tumors and cell
lines from prostate cancer, breast cancer and melanoma [13,14,85,95,96], although the majority
of these fusions appear to be ‘private’ events (i.e., fusion genes only found in the index
case). However, recent studies using paired-end mRNA sequencing have identified recurrent BRAF
and CRAF fusions in melanoma and gastric cancers, which have been shown to drive the oncogenic
activity of these genes. Furthermore, cancer cells harboring these fusion events were shown to
be particularly sensitive to treatment with the small molecule inhibitor sorafenib [95].
Although these fusion events were found in a small subset of tumors, targeting these events in
patients with small molecule inhibitors that are currently available is a tantalizing prospect.
It is possible that once more recurrent fusion genes that bear clinical significance are
identified, commercial assays will become available for clinical use. One of the main clinical
applications of these private fusion genes is the assessment of disease burden and monitoring of
residual disease in patients with solid malignancies after primary treatment [97,98]. The
primary challenge for the translation of these observations into clinical practice lies in the
identification of recurrent ‘driver’ fusion events that can be targeted with available drugs.

It is now becoming evident that the majority of fusion events may in fact be byproducts (i.e.,
passenger events) of the evolution of amplifications in cancer genomes, as intra- and
inter-chromosomal rearrangements are more frequently found in amplified regions of the genome
[7,36,37,85]. Furthermore, next-generation sequencing has recently demonstrated that some
cancers, in particular bone tumors, may undergo a cellular catastrophe termed ‘chromothripsis’,
whereby tens to hundreds of chromosomal rearrangements involving a few localized genomic regions
can be acquired in a single genetic remodeling event, leading to multiple oncogenic lesions [99].

Other applications of next-generation sequencing
miRNAs have been shown to play a number of important roles in cancer and other diseases, and
recent efforts have begun to focus on miRNA applications of next-generation sequencing
[100–103]. Recent studies have identified not only differences in the expression levels of
miRNAs between cancer and normal samples, but also 370 novel miRNA species [104], the miRNA
expression profiles of androgen-independent and -dependent prostate cancers [105], and several
differentially expressed miRNAs that may contribute to progression of acute myeloid leukemia
[106]. Combined miRNA and mRNA next-generation sequencing has identified a somatic mutation in
the 3′-UTR of TNFAIP2, a known target of the PML–RARA oncogene in acute myeloid leukemia,
resulting in translational repression in a Dicer-dependent fashion [107]. These results suggest
that somatic mutations in miRNA binding sites are another mechanism that can affect gene
expression. Furthermore, paired analysis of small RNAs from samples of normal and
tumor-adjacent breast tissue has identified 361 novel miRNA precursors, 10% of which were
located in amplified regions of the genome. These included a novel miRNA, miR-4728-3p, predicted
to mediate GTPase cellular transduction, found to span the 5′ splice site of intron 24 within
the ERBB2 gene, suggesting that its processing may be dependent on splicing of ERBB2 itself [108].

Modifications of protocols for massively parallel sequencing also allow for a complete
genome-wide assessment of histone modifications and related chromatin structures (for an
excellent review on this topic see [109]). Using this approach, known as ChIP-seq, studies have
begun to examine gene regulation, global binding regions and functional elements in cancer
[110–113]. Studies to date have used ChIP-seq to identify Runx1 as a novel tethering factor in
estrogen-receptor-mediated transcriptional activation in breast cancer [114], to identify
β-catenin binding regions in HCT116 human colon cancer cells [115] and to identify novel CDX2
binding sites in intestinal epithelial cells [116]. In addition to revealing that enzymes that


regulate methylation may be mutated in some types of cancer (e.g., DNMT3A mutations have
recently been reported in acute myeloid leukemia [117]), next-generation sequencing can also be
used for an unbiased genome-wide assessment of cytosine methylation sites by bisulphite
treatment or MeDIP [118–123] and can be a useful tool for profiling DNA methylation in clinical
samples [124,125]. Combined methylation profiling and mutational analysis by means of
next-generation sequencing revealed commonly mutated and methylated genes that are associated
with poor prognosis in cancer patients [126]. Assessment of gene promoter methylation by
barcoded 454 sequencing of DNA from endometrial cancers has led to the identification of a new
class of microsatellite instability (MSI+) endometrial cancers with heterogeneous patterns of
MLH1 promoter methylation [127]. The aforementioned approaches may be used in the future to
identify molecular subtypes of cancer that are defined by distinct constellations of epigenetic
defects.

Importantly, methylation markers have been put forward as useful for both noninvasive early
disease detection and disease monitoring [128]. Li et al. have recently provided an elegant
proof of principle of the potential clinical utility of methylation markers for the diagnosis
and monitoring of colorectal cancer patients [129]. Based on the observation that exon 1 of the
vimentin (VIM) gene is methylated in colorectal carcinomas, a method for digital assessment of
VIM gene methylation based on ‘methyl-BEAMing’ (Beads, Emulsion, Amplification and Magnetics)
and massively parallel sequencing was devised. Plasma levels of VIM gene methylation, as
determined by methyl-BEAMing, were shown to have far greater sensitivity and specificity than
the assessment of carcinoembryonic antigen (i.e., the only biomarker for colorectal cancer that
is currently used in clinical practice) for the diagnosis of early-stage colorectal cancer, and
to be more strongly associated with residual disease than levels of carcinoembryonic antigen.
VIM gene methylation levels in feces, as determined by methyl-BEAMing, also showed promise as
potential biomarkers for the diagnosis of colorectal cancer [129]. These studies suggest that
the assessment of cancer-specific methylation patterns can be used clinically for disease
diagnosis and monitoring after therapy. Next-generation sequencing technology is quickly
becoming the method of choice for epigenomic profiling, and will lead to a more comprehensive
understanding of the epigenetic contributions to human disease.

Next-generation sequencing for the identification of causes of hereditary disease
One of the main applications of massively parallel sequencing is to study germ-line DNA for
genome-wide association studies to identify mutations that cause rare Mendelian disorders. This
is exemplified by the identification of mutations in SH3TC2 causing Charcot–Marie–Tooth
neuropathy [130], the identification of MYH3 germ-line mutations that cause Freeman–Sheldon
syndrome through exomic targeted sequencing of four diseased and eight unrelated individuals
[17], and the identification of germ-line mutations in DHODH that cause Miller syndrome, found
through exomic sequencing of four affected individuals in three independent kindreds [16]. Over
the past year exome sequencing has identified many of the ‘low-hanging fruits’ in a number of
rare hereditary disorders for which no causal gene had previously been identified (Table 3).
Although these diseases only affect a small number of patients, these results have clear
clinical implications, as these hereditary diseases can now be objectively diagnosed. Genes
responsible for more complex diseases are now being discovered through targeted sequencing, such
as the identification of nonsense mutations in ANGPTL3 causing hypolipidemia [131] and GPSM2
mutations in nonsyndromic hearing loss [132]. Larger studies of more common complex phenotypes
have already begun to emerge, such as the identification of a polymorphism in EPAS1 in
inhabitants of the high-altitude Tibetan plateau [133], and the identification of novel
nonsynonymous coding SNPs in a population of 200 individuals from Denmark [134].

Next-generation sequencing of fetal DNA from the maternal plasma [135] may also constitute a way
forward for prenatal noninvasive diagnosis of fetal genetic disorders. In addition, the use of
massively parallel genomic sequencing of maternal plasma for the noninvasive prenatal diagnosis
of fetal chromosomal aneuploidies has also been reported [136]. In particular, two independent
validation studies have looked at the use of multiplexed massively parallel sequencing for the
identification of trisomy 21. Using a four-plex and duplex approach, in 449 and 713 high-risk
pregnant women, respectively [137,138], trisomy 21 was detected with 100% sensitivity and 99.7
and 97.9% specificity, respectively. It is anticipated that the development of haplotyping
(distinguishing which mutation is inherited from which parent) using sequencing alone will be
essential in predicting disease risk [139,140].

Next-generation sequencing in the clinic
Second-generation sequencing and its derivatives have already begun to be used for the
management of cancer patients in proof-of-concept studies. For instance, a patient with
gemcitabine-resistant metastatic pancreatic cancer had their tumor grown as a xenograft, which
was shown to be sensitive to chemotherapy agents that cause DNA double strand breaks (i.e.,
mitomycin C and cisplatin) [65]. In parallel, the primary tumor was subjected to exomic
sequencing, which revealed the presence of biallelic somatic inactivation of PALB2. Inactivation
of PALB2 has been shown to lead to homologous recombination DNA repair defects [141]; the
presence of these mutations is likely to explain the sensitivity of the xenograft and patient to
mitomycin C and cisplatin [65]. A patient with a rare form of metastatic tongue carcinoma was
treated with sorafenib based on signaling pathways identified through combined whole genome and
transcriptomic sequencing of the primary tumor [142]. Furthermore, exomic sequencing was
successfully used in the diagnosis and management of a child with Crohn’s-like disease, as it
led to the identification of a hemizygous missense mutation in XIAP (a known cause of X-linked
lymphoproliferative syndrome and a risk factor for hemophagocytic lymphohistiocytosis). These
results enabled the patient to receive an allogeneic hematopoietic progenitor cell transplant to
treat not only the gastrointestinal symptoms, but also this life-threatening condition [143].
These studies demonstrate the enormous power of next-generation sequencing to diagnose and
tailor the treatment to individual patients.
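The sensitivity and specificity figures quoted above for trisomy 21 detection are simple ratios over a table of test outcomes. As a minimal sketch in Python, assuming purely hypothetical counts (they are not the data of the cited validation studies [137,138]) and illustrative function names, the two measures could be computed as follows:

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of affected pregnancies that the test flags as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of unaffected pregnancies that the test reports as negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts chosen only for illustration (not taken from any study).
affected_detected, affected_missed = 40, 0
unaffected_cleared, unaffected_flagged = 399, 1

sens = sensitivity(affected_detected, affected_missed)
spec = specificity(unaffected_cleared, unaffected_flagged)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.2%}")
```

With these illustrative numbers every affected case is detected, while one unaffected pregnancy in 400 receives a false-positive result, which is the kind of trade-off the specificity figure summarizes.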


Table 3. Next-generation sequencing studies of Mendelian disorders.

Study (year) | Method | Disease type | Gene associated with disease | Ref.
Bilguvar et al. (2010) | Exome | Severe brain malformations | WDR62 | [200]
Bolze et al. (2010) | Exome | Autoimmune lymphoproliferative syndrome | FADD | [201]
Bowden et al. (2010) | Exome | Insulin resistance atherosclerosis | ADIPOQ | [202]
Byun et al. (2010) | Exome | Fatal classic Kaposi sarcoma | STIM1 | [203]
Caliskan et al. (2011) | Exome | Nonsyndromic mental retardation | TECR | [204]
Gilissen et al. (2010) | Exome | Sensenbrenner syndrome | WDR35 | [205]
Haack et al. (2010) | Exome | Complex I deficiency | ACAD9 | [206]
Hoischen et al. (2010) | Exome | Schinzel–Giedion syndrome | SETBP1 | [207]
Johnson et al. (2010) | Exome | Brown–Vialetto–van Laere syndrome | C20orf54 | [208]
Johnson et al. (2010) | Exome | Amyotrophic lateral sclerosis | VCP | [208]
Krawitz et al. (2010) | Exome | Hyperphosphatasia mental retardation syndrome | PIGV | [209]
Lalonde et al. (2010) | Exome | Fowler syndrome | FLVCR2 | [210]
Lupski et al. (2010) | Whole genome | Charcot–Marie–Tooth neuropathy | SH3TC2 | [130]
Musunuru et al. (2010) | Exome | Familial combined hypolipidemia | ANGPTL3 | [131]
Ng et al. (2009) | Exome | Freeman–Sheldon syndrome | MYH3 | [17]
Ng et al. (2010) | Exome | Kabuki syndrome | MLL2 | [15]
Ng et al. (2010) | Exome | Miller syndrome | DHODH | [16]
Otto et al. (2010) | Exome | Retinal–renal ciliopathy | SDCCAG8 | [211]
Walsh et al. (2010) | Exome | Nonsyndromic hearing loss (DFNB82) | GPSM2 | [132]
Wang et al. (2010) | Exome | Spinocerebellar ataxias | TGM6 | [212]

There are now several lines of evidence to suggest that a given cancer is composed of a mosaic
of cells with distinct genetic aberrations in addition to the founder genetic events found in
all cancer cells (see above), and that this intra-tumor genetic heterogeneity may mediate
resistance to systemic therapies. Resistance to EGFR small molecule inhibitors is to some extent
mediated by selection of nonmodal resistant clones harboring the T790M EGFR ‘gatekeeper’
mutation [144–147] or MET gene amplification [148]. Likewise, resistance to imatinib mesylate in
patients with gastrointestinal stromal tumors appears to be mediated by the selection of
nonmodal clones harboring secondary KIT gene mutations [149,150]. Even in the case of synthetic
lethality-based therapeutic approaches, clonal selection appears to be the mechanism of
resistance. Tumors that harbor BRCA1 or BRCA2 mutations show exquisite sensitivity to PARP
inhibitors. In vitro models have demonstrated that resistance to these agents in BRCA2 mutant
pancreatic cell lines is mediated by the selection of cancer cells that harbor intragenic
deletions or secondary mutations that remove the initial truncating mutation and restore the
open reading frame of BRCA1 and BRCA2 [151–154]. These results were further corroborated by the
sequencing of recurrences of BRCA1 or BRCA2 mutant human ovarian cancers treated with platinum
salts [151–154]. Next-generation sequencing-based approaches may help expedite the
identification of mutations that confer resistance to specific therapeutic agents. If primary
tumors are sequenced at an increased depth, specific mutations in minor populations of the cells
can be detected, paving the way for more accurate therapeutic predictions.

Genetic testing using next-generation sequencing
Based on the results achieved in the last 5 years, one could argue that it is only a matter of
time before genetic tests based on next-generation sequencing or one of its derivatives are
incorporated into the routine of genetic clinics. An increasing number of sequence-based
diagnostic tests for many hereditary disorders, including X-linked mental retardation,
Parkinson’s disease, epilepsy and mitochondrial disorders, are being developed (Table 4). One
strategy to introduce routine genetic testing in clinical laboratories is the use of ‘tagged’
PCR products from a pool of patients [155]. This technique uses long-range PCR followed by
standard library preparation and has been developed for routine genetic testing of BRCA1 and
TP53 mutations in hereditary breast cancer and Li-Fraumeni families, where it has been shown to
be more sensitive than conventional Sanger methods [156]. This approach may be of particular
interest for screening a panel of breast cancer susceptibility genes in families with
non-BRCA1/BRCA2 hereditary breast cancer [157,158]. Automated and scalable methods to capture
specific ‘exomes’, such


as a 480 kb cancer-exome subset of 115 cancer-related genes, have been developed using
microfluidic DNA arrays [71]. The development and commercialization of these tests (Table 4) is
remarkable, especially if one considers the time it has taken for microarray-based tests to be
incorporated into clinical practice.

Table 4. Summary of diagnostic tests using next-generation sequencing.

Company | Genetic test | Sequencing | Platform | Ref.
Emory University | X-linked intellectual disability, congenital muscular dystrophy and congenital disorders of glycosylation | Targeted capture | ABI SOLiD | [314]
Ambry Genetics | Ambry Screen (covers several hundred well-understood childhood disease-causing mutations) | Barcoded PCR | Illumina Genome Analyzer | [315]
German sequencing service provider (CeGaT) | 28 different panels, including epilepsy, amyotrophic lateral sclerosis, Parkinson’s, metabolic disorders and hereditary eye diseases | Targeted capture | ABI SOLiD | [316]
National Centre for Genome Resources | 448 genetic diseases, a mix of autosomal recessive and X-linked recessive disorders | Targeted capture | Illumina HiSeq™ | [317]
Good Start Genetics | Prepregnancy genetic tests | Targeted capture | Illumina Genome Analyzer | [318]
GeneDx | 20 genetic diseases including mitochondrial disorders, cystic fibrosis, sickle cell anemia and Tay-Sachs disease | Targeted capture | ABI SOLiD | [319]
University of Leeds Institute of Molecular Medicine | BRCA1/2 and TP53 hereditary breast, ovarian cancer and Li-Fraumeni syndrome | Barcoded PCR | Illumina Genome Analyzer | [320]

Third-generation sequencing technologies
Second-generation sequencing, while identifying many novel mutations and translocations that are
of clinical benefit in cancer, can only detect clonal mutations in a tumor. There is now direct
evidence of the genetic heterogeneity within tumors (i.e., that tumors are composed of a mosaic
of nonmodal populations harboring genetic aberrations in addition to the initiating events)
[159,160] and recurrences may be caused by the outgrowth of nonmodal populations of cancer cells
that harbor specific genetic aberrations [12,148,161]. The development of third-generation
sequencing technologies can circumvent this limitation of second-generation platforms by the use
of single-molecule sequencing (which avoids PCR-based amplification bias) that promises to
improve data quality and increase the types of data produced. A number of third-generation
sequencing technologies are currently in development (Table 5; for a detailed description of
these technologies, the readers are referred to references [32,162]). The Helicos HeliScope
[301] was the first third-generation sequencer to be commercialized. The Helicos true
single-molecule sequencing technology uses labeled nucleotides admixed with nucleic acid
templates that are immobilized on a flow cell. Fluorescent signals are emitted in real time as
each additional base is incorporated and detected by the HeliScope Genetic Analysis System. This
system can sequence up to 180 Mb/h with an average read length of 35 bases [163–165]. The
Helicos true single-molecule sequencing technology allows whole-genome sequencing, targeted
resequencing, digital gene expression, DNA barcoding, mRNA-seq, ChIP-seq and protein
quantification, and has digital copy number and small RNA sequencing applications under
development [164–172].

Pacific Biosciences [302] have developed the single-molecule real-time (SMRT) technology (PacBio
RS) that incorporates thousands of zero-mode waveguides (ZMWs) on SMRT cells to which a single
DNA polymerase molecule is immobilized. This allows the instrument to detect the addition of
each individual fluorophore-labeled phospho-linked nucleotide in real time [173,174]. This
technology has the advantage of producing very long read lengths of >1000 bp up to 10,000 bp and
short run times (i.e., between 30 min and 1 h). Owing to the sample preparation procedure [174],
SMRT sequencing is suitable for a number of applications that are currently not available with
other technologies, such as capturing kinetic information, detecting the incorporation of
methylated residues [175] and following the real-time transition of tRNAs on a single ribosome
as it translates mRNA [176].

Oxford Nanopore Technologies [303] are currently in the process of developing an
electronic-based single-molecule sequencing technology, which measures the current change as a
DNA molecule passes through a nanopore, identifying the DNA base on the molecule in sequence
[177–179]. This enables single strands of DNA to be sequenced and the bases to be read
electrically rather than optically. This novel approach is highly scalable, with negligible
sequencing costs, and provides simpler data processing and storage. In addition to the
applications currently feasible with second-generation platforms, Nanopore sequencing technology
can be used for numerous other applications, including the identification of epigenetic
alterations [180] and protein analysis [181].

The benchtop Ion Torrent PGM [304] sequencer, which has already been commercialized, uses a
series of semiconductor sensors on a single chip that measure hydrogen ions that are produced by
the process of DNA replication (i.e., proton sequencing). The ion semiconductor sequencing chip
is the machine, which houses a high-density array of wells, providing millions


of individual reactors, allowing sequencing runs to be performed in approximately 2 h that
generate an ‘ionogram’, which is converted into DNA base calls by the software provided with the
PGM sequencer. At present, this technology is only suitable for targeted sequencing and still
requires PCR amplification of the DNA template and termination in order to monitor base
incorporation. Given that the Ion Torrent PGM is relatively cheap and provides a complete
workflow from sample preparation to sample analysis, it is envisaged that many clinical
laboratories will take advantage of the technology for routine resequencing.

Another approach to SMRT sequencing uses fluorescence resonance energy transfer (FRET). This
technology is being developed by Life Technologies [305], and uses a Qdot nanocrystal as the
FRET donor coupled to a DNA polymerase enzyme. Introduction of a nucleotide interrupts the Qdot
fluorescence and energy is transferred, leading to the release of the fluorophore label on the
nucleotide. This technology, while still in its infancy, has the potential to sequence millions
of bases per second, with a run taking less than 20 min and generating hundreds of megabases of
sequence.

Table 5. Summary of third-generation sequencing technologies.

Platform | Company | TGS technology | Read length | Run time | Applications | Ref.
HeliScope | Helicos | tSMS | 35 bp | 8 days | Whole-genome sequencing, targeted resequencing, digital gene expression, DNA barcoding, mRNA-seq, ChIP-seq, protein quantification | [321]
PacBio RS | Pacific Biosciences | SMRT zero-mode waveguides | 1000–10,000 bp | 1 h | Single-molecule RNA, DNA sequencing, identification of epigenetic alterations, RNA modifications, mRNA translation | [322]
Ion Torrent PGM | Life Technologies | Semiconductor | 100–200 bp | 2 h | Targeted DNA sequencing | [323]
Oxford Nanopore | Oxford Nanopore Technologies | Exonuclease DNA sequencing | 50,000 bp | 24 h | Single-molecule RNA, DNA sequencing | [324]
Life Technologies FRET | Life Technologies | Qdot® nanocrystals and FRET | >100 kb | 20 min | Single-molecule RNA, DNA sequencing | [325]

FRET: Förster resonance energy transfer; PGM: Personal genome machine; Qdot®: Quantum dot; SMRT: Single-molecule real time; TGS: Third-generation sequencing; tSMS: True single-molecule sequencing.

Challenges for next-generation sequencing
One of the main challenges facing the implementation of next-generation sequencing into clinical
practice is the bioinformatic analysis and data storage. The most computationally intensive part
of the basic bioinformatic pipeline is the conversion of the image data into sequence reads, a
step called ‘base calling’. This generates FASTQ files, which are usually obtained using
platform-specific pipelines. Once the sequence reads have been generated, they are aligned to a
known reference sequence or assembled de novo [182–184]. The presence of short reads in
massively parallel sequencing makes alignment more difficult than with conventional Sanger
sequencing, with the choice of method depending on the sequencing platform used, data type and
computational resources. Depending on the application of the sequencing, the next step involves
quantification of reads, mutation calling, copy number analysis and analysis of structural
variations. A plethora of algorithms have been developed for this purpose (for recent reviews on
this topic, see [49,185,186]), including a number of open-source packages in Bioconductor [306].
One of the difficulties for the translation of bioinformatic analysis into the clinical setting
is that the majority of these algorithms need some programming expertise and specialized servers
to handle and store all the data. Commercial software packages that allow for the analysis of
next-generation sequencing data are available (e.g., NextGene® [307], GeneSifter® [308],
GenomeQuest [309] and DNAnexus [310]); however, these can be relatively expensive and are
relatively limited in the scope of analyses they can perform. Freely available visualization
tools, such as the Broad’s Integrative Genomics Viewer (IGV) [311], have recently been released
and are relatively user friendly.

Most researchers align the next-generation sequencing reads to a known reference, such as the
most current build of the genome, employing known SNP databases to filter out alterations that
are germ-line polymorphisms. As the 1000 Genomes Project progresses, a catalogue of germ-line
variants will be available to help filter out such SNPs [67,68]. In the analysis of Mendelian
diseases, use of unaffected family members provides a useful background against which to
identify novel disease-causing mutations. However, in cancer research one must also sequence the
normal germ-line DNA to reliably identify only somatic variations. Somatic mutation calling in
cancer is much more complicated than germ-line calling due to the variance in ploidy and purity
of tumor DNA, and is also strongly dependent on the allelic fraction [49]. Elimination of
false-positives and -negatives is also a major challenge in next-generation sequencing analysis.
Elimination of false-positives can be achieved through validation by conventional PCR and Sanger
sequencing and now in a more high-throughput fashion through the use of Sequenom MassARRAY
technology [187]; however, elimination of false-negatives is much more difficult.
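The filtering logic described above (discard variants also seen in the matched normal sample or in a database of known germ-line polymorphisms, then require adequate evidence in the tumor) can be sketched in a few lines of Python. This is a simplified illustration rather than any published variant caller: the `Variant` record, the `candidate_somatic` function, the thresholds and the coordinates are all hypothetical, and real pipelines model tumor purity, ploidy and sequencing error explicitly.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

VariantKey = Tuple[str, int, str, str]  # (chromosome, position, ref, alt)


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one called variant."""
    chrom: str
    pos: int
    ref: str
    alt: str
    alt_reads: int    # reads supporting the alternate allele
    total_reads: int  # total reads covering the position

    @property
    def key(self) -> VariantKey:
        return (self.chrom, self.pos, self.ref, self.alt)

    @property
    def allelic_fraction(self) -> float:
        return self.alt_reads / self.total_reads if self.total_reads else 0.0


def candidate_somatic(tumor_calls: Iterable[Variant],
                      normal_calls: Iterable[Variant],
                      known_snps: Set[VariantKey],
                      min_af: float = 0.10,
                      min_depth: int = 20) -> List[Variant]:
    """Keep tumor variants absent from the matched normal and from known SNPs."""
    normal_keys = {v.key for v in normal_calls}
    kept = []
    for v in tumor_calls:
        if v.key in normal_keys or v.key in known_snps:
            continue  # germ-line variant or catalogued polymorphism
        if v.total_reads < min_depth or v.allelic_fraction < min_af:
            continue  # too little evidence to trust the call
        kept.append(v)
    return kept


# Illustrative data only; coordinates and read counts are made up.
tumor_calls = [
    Variant("chr7", 1000, "A", "T", 30, 100),  # passes all filters
    Variant("chr1", 2000, "G", "A", 48, 100),  # also present in normal
    Variant("chr2", 3000, "C", "T", 50, 100),  # catalogued SNP
    Variant("chr3", 4000, "G", "C", 3, 100),   # allelic fraction too low
    Variant("chr4", 5000, "T", "G", 4, 8),     # coverage too low
]
normal_calls = [Variant("chr1", 2000, "G", "A", 52, 100)]
known_snps = {("chr2", 3000, "C", "T")}

somatic = candidate_somatic(tumor_calls, normal_calls, known_snps)
print([v.key for v in somatic])
```

Lowering `min_af` in this sketch would admit mutations present in smaller subpopulations of tumor cells at the cost of more false-positives, which is why candidate calls are usually validated by an orthogonal method.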


Typically a single sequencing run can generate up to 15 terabytes of information and we are currently in a climate where we are generating more sequencing data than we have storage for. The next-generation sequencing data deluge is one of the challenges that are yet to be met. Given that image files require the most storage capacity, many researchers are opting to delete these files once base calling has been achieved and to store only the FASTQ or just the post-alignment files. Clinical laboratories will have to implement an efficient way of tracking data and keeping the necessary files to go back to if needed. One way of overcoming the need to purchase expensive servers is to use cloud computing. In fact both GenomeQuest and DNAnexus have cloud-based pipelines, eliminating the need for dedicated computer hardware and software.

Albeit challenging, assigning reliable mutation calls, insertions/deletions and structural variations is only a small part of the story. The main challenge lies in the distinction of what constitutes a ‘driver’ mutation event versus a ‘passenger’ event (i.e., one that has no biological significance for the cell harboring the mutation at a given point in time). This is particularly true for complex heterogeneous passenger events. A number of algorithms predict the functional effect of a mutation of interest on a protein (e.g., SIFT, PolyPhen, CanPredict and CHASM) [188–191] and can be used to prioritize mutations for follow-up studies; more recently, algorithms that predict the pathogenicity of somatic mutations based on the selection pressure and type of mutation have been developed [192,193]. However, novel predicted ‘drivers’ still need to be functionally investigated in appropriate model systems before they can be definitively defined as driver events. Moreover, these approaches do not take into account the issue of intra-tumor genetic heterogeneity. Given that sequence reads represent an average of all cells within a tumor, rare mutations that predict resistance to therapy (e.g., secondary KIT mutations and EGFR T790M mutations) but are at very low prevalence within the tumor (e.g., 1 or 0.1%) may be missed by current approaches. This is particularly problematic if the prevalence of the mutation within the tumor population is similar to the error rate of the sequencing technology employed. This in part may be overcome through the development of increasingly sensitive techniques that will allow sequencing of single cells with third-generation single-molecule sequencing technologies.

Ethical considerations
Massively parallel next-generation sequencing is already entering clinical practice, and with it come important ethical considerations that need to be fully resolved before routine implementation, in particular with whole genome/exome sequencing of patients' DNA, as these techniques produce huge amounts of personal medical information. Clarity, in terms of how to ensure the appropriate storage and anonymity of the data from patients, has been provided by the ethics committees of the International Cancer Genome Consortium (ICGC) [312] and The Cancer Genome Atlas (TCGA) [313]. In the case of a patient who requests to have their whole genome sequenced because of a family history of a given genetic disorder, there is a great likelihood of the discovery of other risk factors affecting their health. In addition, variants of unknown significance are inevitably going to cause the patient anxiety and stress. Therefore, several questions are crucial: how much of the sequence information should be offered to the patient? How should the results from minors be handled, in particular carriers of adult-onset diseases? How will we handle genetic discrimination? It may be the case that consent will have to be completely different for whole-genome sequencing than for gene locus-specific sequencing tests. This may involve patients having to receive counseling before consent is agreed and also throughout the testing and analysis process. For an in-depth discussion of the ethical considerations of next-generation sequencing, the readers are referred to recent reviews on this topic [22,25,27,194].

Expert commentary
It is arguably inevitable that improvements in sequencing technologies as we head towards third-generation sequencing, and a reduction in sequencing costs, will allow the eagerly anticipated US$1000 genome to become a reality. Large-scale projects aimed at sequencing more individuals, such as the 1000 Genomes Project [195] and the Personal Genome Project [196], will lead to a greater understanding of the landscape of the human genome. In tandem, the advent of initiatives to identify somatic alterations in cancer, such as TCGA [197] and ICGC [198], holds the exciting prospect of revealing a number of key oncogenic mutations that define clinically relevant subtypes for the development of new cancer therapies [198]. The contribution of next-generation sequencing to diagnosis and treatment decisions has already been illustrated in proof-of-principle studies. It is anticipated that further studies comprising a larger number of samples will provide increasingly more coherent data and potentially pave the way for the next generation of diagnostic genetic tests.

Five-year view
Next-generation sequencing technologies are advancing at an unprecedented speed. It is envisaged that 5 years from now, sequencing of a person’s DNA will be routine clinical practice for the diagnosis of rare and complex hereditary diseases. Targeted sequencing of therapeutically relevant mutations will become part of the diagnostic repertoire of molecular pathology laboratories for the management of cancer patients. Furthermore, genome-wide or targeted sequencing of cancers may be introduced at least in the context of clinical trials designed to determine the mechanisms of resistance to specific therapeutic agents and to develop predictive markers.
However, it should be noted that numerous challenges lie ahead. Despite the decreasing cost of sequencing, genetic association studies will require incredibly large sample sizes and innovative statistical approaches. Studies designed to define next-generation sequencing-based predictive markers for cancer therapeutics will not only require large sample sizes, but also a much greater understanding of the sources of biases stemming from these approaches. Studies published so far are little more than proofs of principle; thorough analyses to determine the generalizability of their findings are eagerly awaited. The bioinformatic challenges posed by the output of next-generation sequencers are orders of magnitude greater than those posed by the analysis of microarrays in the last decade. Although bioinformatics and statistical analysis methods for the analysis of next-generation sequencing data are being developed at an unprecedented pace, further standardization of bioinformatic algorithms/tools for data analysis will be required to translate the output from sequencers into benefit for patients. Data sharing to maximize the output of next-generation sequencing studies needs to be facilitated and embraced by the academic community. Finally, ethical issues will also require careful consideration.

Acknowledgements
The authors would like to thank Iwanka Kozarewa for her helpful discussions.

Financial & competing interests disclosure
This work was supported by The Breakthrough Breast Cancer charity. Rachael Natrajan and Jorge S Reis-Filho are funded in part by Breakthrough Breast Cancer. Jorge S Reis-Filho is the recipient of the 2010 CRUK Future Leaders Prize. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.
No writing assistance was utilized in the production of this manuscript.

Key issues
• Next-generation sequencing technologies have uncovered a wide range of genetic aberrations that contribute to cancer development
and progression.
• Challenges lie in the identification of ‘driver’ versus ‘passenger’ events.
• Exomic sequencing has begun to emerge as a tool of choice for the identification of the causes of rare hereditary disorders.
• Next-generation sequencing data require high-performance computing and bioinformatic support that is beyond the reach of most
research laboratories.
• Issues of quality control and standardization of protocols need to be addressed before routine clinical implementation.
• The ethical aspects need to be carefully considered by the community before next-generation sequencing can be applied to large
populations of patients.
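
The point made earlier, that a resistance mutation present at 1 or 0.1% prevalence may be indistinguishable from sequencing error, can be illustrated with a small calculation. The following is a minimal sketch, not a validated variant caller. It assumes that every read covering a position independently shows a non-reference base with a fixed probability, so that read counts follow a binomial distribution, and it ignores the fact that true variant reads and error reads add up. The function names, the depth of 1000 reads, the 1% per-base error rate and the 0.1% tolerance for false calls are all hypothetical values chosen for illustration.

```python
from math import comb


def binom_tail(n: int, p: float, k: int) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    below = sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k))
    return min(1.0, max(0.0, 1.0 - below))


def min_supporting_reads(depth: int, error_rate: float, alpha: float = 1e-3) -> int:
    """Smallest read count k that sequencing errors alone reach with probability < alpha."""
    k = 1
    while binom_tail(depth, error_rate, k) >= alpha:
        k += 1
    return k


def detection_power(depth: int, variant_fraction: float, error_rate: float,
                    alpha: float = 1e-3) -> float:
    """Probability that a variant seen in `variant_fraction` of reads clears the error threshold."""
    k = min_supporting_reads(depth, error_rate, alpha)
    return binom_tail(depth, variant_fraction, k)


if __name__ == "__main__":
    # Hypothetical scenario: 1000 reads at the position, 1% per-base error rate.
    for fraction in (0.10, 0.01, 0.001):
        power = detection_power(depth=1000, variant_fraction=fraction, error_rate=0.01)
        print(f"variant read fraction {fraction:.3f}: detection probability {power:.3f}")
```

Under these assumptions, a variant whose read fraction is well above the error rate is almost always called, whereas one at or below the error rate almost never clears the threshold, regardless of its clinical importance. Note that `variant_fraction` is the fraction of reads carrying the mutation, which for a heterozygous mutation is roughly half the fraction of cells that harbor it.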

References
Papers of special note have been highlighted as:
• of interest
•• of considerable interest
1 Maxam AM, Gilbert W. A new method for sequencing DNA. Proc. Natl Acad. Sci. USA 74(2), 560–564 (1977).
2 Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain-terminating inhibitors. Proc. Natl Acad. Sci. USA 74(12), 5463–5467 (1977).
3 Bentley DR, Balasubramanian S, Swerdlow HP et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456(7218), 53–59 (2008).
4 Wang J, Wang W, Li R et al. The diploid genome sequence of an Asian individual. Nature 456(7218), 60–65 (2008).
5 Ahn S-M, Kim T-H, Lee S et al. The first Korean genome sequence and analysis: full genome sequencing for a socio-ethnic group. Genome Res. 19(9), 1622–1629 (2009).
6 Ley TJ, Mardis ER, Ding L et al. DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature 456(7218), 66–72 (2008).
•• First report of the whole-genome sequence of a human cancer.
7 Lee W, Jiang Z, Liu J et al. The mutation spectrum revealed by paired genome sequences from a lung cancer patient. Nature 465(7297), 473–477 (2010).
8 Mardis ER, Ding L, Dooling DJ et al. Recurring mutations found by sequencing an acute myeloid leukemia genome. N. Engl. J. Med. 361(11), 1058–1066 (2009).
9 Pleasance ED, Cheetham RK, Stephens PJ et al. A comprehensive catalogue of somatic mutations from a human cancer genome. Nature 463(7278), 191–196 (2010).
• Describes the mutational signature of a melanoma patient.
10 Pleasance ED, Stephens PJ, O’Meara S et al. A small-cell lung cancer genome with complex signatures of tobacco exposure. Nature 463(7278), 184–190 (2010).
• Describes the mutational signature of a lung cancer patient.
11 Shah SP, Morin RD, Khattra J et al. Mutational evolution in a lobular breast tumour profiled at single nucleotide resolution. Nature 461(7265), 809–813 (2009).
12 Ding L, Ellis MJ, Li S et al. Genome remodelling in a basal-like breast cancer metastasis and xenograft. Nature 464(7291), 999–1005 (2010).
• First publication describing the evolution of mutations from primary tumor to metastasis.
13 Maher CA, Kumar-Sinha C, Cao X et al. Transcriptome sequencing to detect gene fusions in cancer. Nature 458(7234), 97–101 (2009).
•• Demonstrates the power of second-generation sequencing to identify novel fusion genes in cancer.
14 Maher CA, Palanisamy N, Brenner JC et al. Chimeric transcript discovery by paired-end transcriptome sequencing. Proc. Natl Acad. Sci. USA 106(30), 12353–12358 (2009).
• Describes novel fusions in prostate cancer by second-generation sequencing.
15 Ng SB, Bigham AW, Buckingham KJ et al. Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome. Nat. Genet. 42(9), 790–793 (2010).
16 Ng SB, Buckingham KJ, Lee C et al. Exome sequencing identifies the cause of a Mendelian disorder. Nat. Genet. 42(1), 30–35 (2010).
17 Ng SB, Turner EH, Robertson PD et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461(7261), 272–276 (2009).
•• Describes the first hereditary causal gene through exomic sequencing.


18 Drmanac R, Sparks AB, Callow MJ et al. Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays. Science 327(5961), 78–81 (2010).
19 Service RF. Gene sequencing. The race for the $1000 genome. Science 311(5767), 1544–1546 (2006).
20 Shendure J. The beginning of the end for microarrays? Nat. Methods 5(7), 585–587 (2008).
21 Ledford H. The death of microarrays? Nature 455(7215), 847 (2008).
22 Ransohoff DF, Khoury MJ. Personal genomics: information can be harmful. Eur. J. Clin. Invest. 40(1), 64–68 (2010).
23 Aparicio SA, Huntsman DG. Does massively parallel DNA resequencing signify the end of histopathology as we know it? J. Pathol. 220(2), 307–315 (2010).
24 Pettersson E, Lundeberg J, Ahmadian A. Generations of sequencing technologies. Genomics 93(2), 105–111 (2009).
25 Tucker T, Marra M, Friedman JM. Massively parallel sequencing: the next big thing in genetic medicine. Am. J. Hum. Genet. 85(2), 142–154 (2009).
26 Voelkerding KV, Dames SA, Durtschi JD. Next-generation sequencing: from basic research to diagnostics. Clin. Chem. 55(4), 641–658 (2009).
27 Ten Bosch JR, Grody WW. Keeping up with the next generation: massively parallel sequencing in clinical diagnostics. J. Mol. Diagn. 10(6), 484–492 (2008).
28 Morozova O, Marra MA. Applications of next-generation sequencing technologies in functional genomics. Genomics 92(5), 255–264 (2008).
29 Fullwood MJ, Wei CL, Liu ET, Ruan Y. Next-generation DNA sequencing of paired-end tags (PET) for transcriptome and genome analyses. Genome Res. 19(4), 521–532 (2009).
30 Bentley DR. Whole-genome re-sequencing. Curr. Opin. Genet. Dev. 16(6), 545–552 (2006).
31 Mardis ER. New strategies and emerging technologies for massively parallel sequencing: applications in medical research. Genome Med. 1(4), 40 (2009).
32 Schadt EE, Turner S, Kasarskis A. A window into third-generation sequencing. Hum. Mol. Genet. 19, R227–R240 (2010).
33 Wheeler DA, Srinivasan M, Egholm M et al. The complete genome of an individual by massively parallel DNA sequencing. Nature 452(7189), 872–876 (2008).
34 Soda M, Choi YL, Enomoto M et al. Identification of the transforming EML4–ALK fusion gene in non-small-cell lung cancer. Nature 448(7153), 561–566 (2007).
35 Tomlins SA, Rhodes DR, Perner S et al. Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science 310(5748), 644–648 (2005).
•• Describes the identification of recurrent fusion genes in a solid cancer.
36 Campbell PJ, Stephens PJ, Pleasance ED et al. Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing. Nat. Genet. 40(6), 722–729 (2008).
37 Stephens PJ, McBride DJ, Lin M-L et al. Complex landscapes of somatic rearrangement in human breast cancer genomes. Nature 462(7276), 1005–1010 (2009).
• Largest collection of breast cancer samples analyzed for structural rearrangements.
38 Campbell PJ, Yachida S, Mudie LJ et al. The patterns and dynamics of genomic instability in metastatic pancreatic cancer. Nature 467(7319), 1109–1113 (2010).
39 Beck CR, Collier P, Macfarlane C et al. LINE-1 retrotransposition activity in human genomes. Cell 141(7), 1159–1170 (2010).
40 Huang CRL, Schneider AM, Lu Y et al. Mobile interspersed repeats are major structural variants in the human genome. Cell 141(7), 1171–1182 (2010).
41 Iskow RC, McCabe MT, Mills RE et al. Natural mutagenesis of human genomes by endogenous retrotransposons. Cell 141(7), 1253–1261 (2010).
42 Mardis ER. The impact of next-generation sequencing technology on genetics. Trends Genet. 24(3), 133–141 (2008).
43 Turnbaugh PJ, Ley RE, Hamady M, Fraser-Liggett CM, Knight R, Gordon JI. The human microbiome project. Nature 449(7164), 804–810 (2007).
44 Gill SR, Pop M, Deboy RT et al. Metagenomic analysis of the human distal gut microbiome. Science 312(5778), 1355–1359 (2006).
45 Dusko Ehrlich S. Metagenomics of the intestinal microbiota: potential applications. Gastroenterol. Clin. Biol. 34(Suppl. 1), S23–S28 (2010).
46 Marcy Y, Ouverney C, Bik EM et al. Dissecting biological ‘dark matter’ with single-cell genetic analysis of rare and uncultivated TM7 microbes from the human mouth. Proc. Natl Acad. Sci. USA 104(29), 11889–11894 (2007).
47 Turnbaugh PJ, Ley RE, Mahowald MA, Magrini V, Mardis ER, Gordon JI. An obesity-associated gut microbiome with increased capacity for energy harvest. Nature 444(7122), 1027–1031 (2006).
48 Croucher NJ, Harris SR, Fraser C et al. Rapid pneumococcal evolution in response to clinical interventions. Science 331(6016), 430–434 (2011).
49 Meyerson M, Gabriel S, Getz G. Advances in understanding cancer genomes through second-generation sequencing. Nat. Rev. Genet. 11(10), 685–696 (2010).
50 Mamanova L, Coffey AJ, Scott CE et al. Target-enrichment strategies for next-generation sequencing. Nat. Methods 7(2), 111–118 (2010).
51 Tewhey R, Warner JB, Nakano M et al. Microdroplet-based PCR enrichment for large-scale targeted sequencing. Nat. Biotechnol. 27(11), 1025–1031 (2009).
52 Voelkerding KV, Dames S, Durtschi JD. Next generation sequencing for clinical diagnostics-principles and application to targeted resequencing for hypertrophic cardiomyopathy. J. Mol. Diagn. 12(5), 539–551 (2010).
53 Davies H, Bignell GR, Cox C et al. Mutations of the BRAF gene in human cancer. Nature 417(6892), 949–954 (2002).
54 Lynch TJ, Bell DW, Sordella R et al. Activating mutations in the epidermal growth factor receptor underlying responsiveness of non-small-cell lung cancer to gefitinib. N. Engl. J. Med. 350(21), 2129–2139 (2004).
55 Paez JG, Janne PA, Lee JC et al. EGFR mutations in lung cancer: correlation with clinical response to gefitinib therapy. Science 304(5676), 1497–1500 (2004).
56 Pao W, Miller V, Zakowski M et al. EGF receptor gene mutations are common in lung cancers from ‘never smokers’ and are associated with sensitivity of tumors to gefitinib and erlotinib. Proc. Natl Acad. Sci. USA 101(36), 13306–13311 (2004).
57 Sjoblom T, Jones S, Wood LD et al. The consensus coding sequences of human breast and colorectal cancers. Science 314(5797), 268–274 (2006).
•• Describes the first example of whole-exome sequencing of human cancers.


58 Zhao Q, Kirkness EF, Caballero OL et al. Systematic detection of putative tumor suppressor genes through the combined use of exome and transcriptome sequencing. Genome Biol. 11(11), R114 (2010).
59 Harbour JW, Onken MD, Roberson EDO et al. Frequent mutation of BAP1 in metastasizing uveal melanomas. Science 330(6009), 1410–1413 (2010).
• Describes the identification of a metastasizing-causing gene in a subtype of melanoma.
60 Jiao Y, Shi C, Edil BH et al. DAXX/ATRX, MEN1, and mTOR pathway genes are frequently altered in pancreatic neuroendocrine tumors. Science 331(6021), 1199–1203 (2011).
61 Parsons DW, Li M, Zhang X et al. The genetic landscape of the childhood cancer medulloblastoma. Science 331(6016), 435–439 (2011).
62 Kan Z, Jaiswal BS, Stinson J et al. Diverse somatic mutation patterns and pathway alterations in human cancers. Nature 466(7308), 869–873 (2010).
63 Jones S, Hruban RH, Kamiyama M et al. Exomic sequencing identifies PALB2 as a pancreatic cancer susceptibility gene. Science 324(5924), 217 (2009).
64 Slater EP, Langer P, Niemczyk E et al. PALB2 mutations in European familial pancreatic cancer families. Clin. Genet. 78(5), 490–494 (2010).
65 Villarroel MC, Rajesh Kumar NV, Garrido-Laguna I et al. Personalizing cancer treatment in the age of global genomic analyses: PALB2 gene mutations and the response to DNA damaging agents in pancreatic cancer. Mol. Cancer Ther. 10(1), 3–8 (2011).
•• One of the first examples of next-generation sequencing to be used in a clinical setting.
66 Cheng Y, Wang J, Shao J et al. Identification of novel SNPs by next-generation sequencing of the genomic region containing the APC gene in colorectal cancer patients in China. OMICS 14(3), 315–325 (2010).
67 Durbin RM, Abecasis GR, Altshuler DL et al. A map of human genome variation from population-scale sequencing. Nature 467(7319), 1061–1073 (2010).
68 Pennisi E. Genomics. 1000 Genomes Project gives new map of genetic diversity. Science 330(6004), 574–575 (2010).
69 Timmermann B, Kerick M, Roehr C et al. Somatic mutation profiles of MSI and MSS colorectal cancer identified by whole exome next generation sequencing and bioinformatics analysis. PLoS One 5(12), e15661 (2010).
70 Adey A, Morrison HG, Asan et al. Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition. Genome Biol. 11(12), R119 (2010).
71 Summerer D, Schracke N, Wu H et al. Targeted high throughput sequencing of a cancer-related exome subset by specific sequence capture with a fully automated microarray platform. Genomics 95(4), 241–246 (2010).
72 Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 10(1), 57–63 (2009).
73 Morin R, Bainbridge M, Fejes A et al. Profiling the HeLa S3 transcriptome using randomly primed cDNA and massively parallel short-read sequencing. Biotechniques 45(1), 81–94 (2008).
74 Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-seq. Nat. Methods 5(7), 621–628 (2008).
75 Morrissy AS, Morin RD, Delaney A et al. Next-generation tag sequencing for cancer gene expression profiling. Genome Res. 19(10), 1825–1835 (2009).
76 ’t Hoen PA, Ariyurek Y, Thygesen HH et al. Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms. Nucleic Acids Res. 36(21), e141 (2008).
77 Sultan M, Schulz MH, Richard H et al. A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science 321(5891), 956–960 (2008).
78 Jacobsen N, Eriksen J, Nielsen PS. Efficient poly(A)+ RNA selection using LNA oligo(T) capture. Methods Mol. Biol. 703, 43–51 (2011).
79 Oshlack A, Robinson MD, Young MD. From RNA-seq reads to differential expression results. Genome Biol. 11(12), 220 (2010).
80 Wang X, Sun Q, McGrath SD, Mardis ER, Soloway PD, Clark AG. Transcriptome-wide identification of novel imprinted genes in neonatal mouse brain. PLoS One 3(12), e3839 (2008).
81 Wagner JR, Ge B, Pokholok D, Gunderson KL, Pastinen T, Blanchette M. Computational analysis of whole-genome differential allelic expression data in human. PLoS Comput. Biol. 6(7), e1000849 (2010).
82 Shah SP, Kobel M, Senz J et al. Mutation of FOXL2 in granulosa-cell tumors of the ovary. N. Engl. J. Med. 360(26), 2719–2729 (2009).
83 Jones S, Wang TL, Shih Ie M et al. Frequent mutations of chromatin remodeling gene ARID1A in ovarian clear cell carcinoma. Science 330(6001), 228–231 (2010).
84 Wiegand KC, Shah SP, Al-Agha OM et al. ARID1A mutations in endometriosis-associated ovarian carcinomas. N. Engl. J. Med. 363(16), 1532–1543 (2010).
85 Berger MF, Levin JZ, Vijayendran K et al. Integrative analysis of the melanoma transcriptome. Genome Res. 20(4), 413–427 (2010).
86 Li Z, Tognon CE, Godinho FJ et al. ETV6–NTRK3 fusion oncogene initiates breast cancer from committed mammary progenitors via activation of AP1 complex. Cancer Cell 12(6), 542–558 (2007).
87 Persson M, Andren Y, Mark J, Horlings HM, Persson F, Stenman G. Recurrent fusion of MYB and NFIB transcription factor genes in carcinomas of the breast and head and neck. Proc. Natl Acad. Sci. USA 106(44), 18740–18744 (2009).
88 Mitani Y, Li J, Rao PH et al. Comprehensive analysis of the MYB–NFIB gene fusion in salivary adenoid cystic carcinoma: incidence, variability, and clinicopathologic significance. Clin. Cancer Res. 16(19), 4722–4731 (2010).
89 Antonescu CR, Zhang L, Chang NE et al. EWSR1–POU5F1 fusion in soft tissue myoepithelial tumors. A molecular analysis of sixty-six cases, including soft tissue, bone, and visceral lesions, showing common involvement of the EWSR1 gene. Genes Chromosomes Cancer 49(12), 1114–1124 (2010).
90 Antonescu CR, Dal Cin P, Nafa K et al. EWSR1–CREB1 is the predominant gene fusion in angiomatoid fibrous histiocytoma. Genes Chromosomes Cancer 46(12), 1051–1060 (2007).
91 Aman P. Fusion genes in solid tumors. Semin. Cancer Biol. 9(4), 303–318 (1999).
92 Stenman G. Fusion oncogenes and tumor type specificity – insights from salivary gland tumors. Semin. Cancer Biol. 15(3), 224–235 (2005).
93 Ross H, Argani P. Xp11 translocation renal cell carcinoma. Pathology 42(4), 369–373 (2010).


94 Hedgepeth RC, Zhou M, Ross J. Rapid development of metastatic Xp11 translocation renal cell carcinoma in a girl treated for neuroblastoma. J. Pediatr. Hematol. Oncol. 31(8), 602–604 (2009).
95 Palanisamy N, Ateeq B, Kalyana-Sundaram S et al. Rearrangements of the RAF kinase pathway in prostate cancer, gastric cancer and melanoma. Nat. Med. 16(7), 793–798 (2010).
• Describes the identification of targetable fusion genes in cancer.
96 Zhao Q, Caballero OL, Levy S et al. Transcriptome-guided characterization of genomic rearrangements in a breast cancer cell line. Proc. Natl Acad. Sci. USA 106(6), 1886–1891 (2009).
97 McBride DJ, Orpana AK, Sotiriou C et al. Use of cancer-specific genomic rearrangements to quantify disease burden in plasma from patients with solid tumors. Genes Chromosomes Cancer 49(11), 1062–1069 (2010).
98 Leary RJ, Kinde I, Diehl F et al. Development of personalized tumor biomarkers using massively parallel sequencing. Sci. Transl. Med. 2(20), 20ra14 (2010).
99 Stephens PJ, Greenman CD, Fu B et al. Massive genomic rearrangement acquired in a single catastrophic event during cancer development. Cell 144(1), 27–40 (2011).
100 Morin RD, Zhao Y, Prabhu AL et al. Preparation and analysis of microRNA libraries using the Illumina massively parallel sequencing technology. Methods Mol. Biol. 650, 173–199 (2010).
101 Thomas MF, Ansel KM. Construction of small RNA cDNA libraries for deep sequencing. Methods Mol. Biol. 667, 93–111 (2010).
102 Git A, Dvinge H, Salmon-Divon M et al. Systematic comparison of microarray profiling, real-time PCR, and next generation sequencing technologies for measuring differential microRNA expression. RNA 16(5), 991–1006 (2010).
103 Buermans HP, Ariyurek Y, Van Ommen GJ, Den Dunnen JT, ’t Hoen PA. New methods for next generation sequencing based microRNA expression profiling. BMC Genomics 11(1), 716 (2010).
104 Vaz C, Ahmad HM, Sharma P et al. Analysis of microRNA transcriptome by deep sequencing of small RNA libraries of peripheral blood. BMC Genomics 11, 288 (2010).
105 Xu G, Wu J, Zhou L et al. Characterization of the small RNA transcriptomes of androgen dependent and independent prostate cancer cell line by deep sequencing. PLoS One 5(11), e15519 (2010).
106 Kuchenbauer F, Morin RD, Argiropoulos B et al. In-depth characterization of the microRNA transcriptome in a leukemia progression model. Genome Res. 18(11), 1787–1797 (2008).
107 Ramsingh G, Koboldt DC, Trissal M et al. Complete characterization of the microRNAome in a patient with acute myeloid leukemia. Blood 116(24), 5316–5326 (2010).
108 Persson H, Kvist A, Rego N et al. Identification of new microRNAs in paired normal and tumor breast tissue suggests a dual role for the ERBB2/Her2 gene. Cancer Res. 71(1), 78–86 (2011).
109 Zhou VW, Goren A, Bernstein BE. Charting histone modifications and the functional organization of mammalian genomes. Nat. Rev. Genet. 12(1), 7–18 (2011).
110 Grober OM, Mutarelli M, Giurato G et al. Global analysis of estrogen receptor β binding to breast cancer cell genome reveals an extensive interplay with estrogen receptor α for target gene regulation. BMC Genomics 12(1), 36 (2011).
111 Strub T, Giuliano S, Ye T et al. Essential role of microphthalmia transcription factor for DNA replication, mitosis and genomic stability in melanoma. Oncogene DOI: 10.1038/onc.2010.612 (2011) (Epub ahead of print).
112 Fanelli M, Amatori S, Barozzi I et al. Pathology tissue-chromatin immunoprecipitation, coupled with high-throughput sequencing, allows the epigenetic profiling of patient samples. Proc. Natl Acad. Sci. USA 107(50), 21535–21540 (2010).
113 Fang X, Yu W, Li L et al. ChIP-seq and functional analysis of the SOX2 gene in colorectal cancers. OMICS 14(4), 369–384 (2010).
114 Stender JD, Kim K, Charn TH et al. Genome-wide analysis of estrogen receptor α DNA binding and tethering mechanisms identifies Runx1 as a novel tethering factor in receptor-mediated transcriptional activation. Mol. Cell Biol. 30(16), 3943–3955 (2010).
115 Bottomly D, Kyler SL, McWeeney SK, Yochum GS. Identification of β-catenin binding regions in colon cancer cells using ChIP-Seq. Nucleic Acids Res. 38(17), 5735–5745 (2010).
116 Boyd M, Hansen M, Jensen TG et al. Genome-wide analysis of CDX2 binding in intestinal epithelial cells (Caco-2). J. Biol. Chem. 285(33), 25115–25125 (2010).
117 Ley TJ, Ding L, Walter MJ et al. DNMT3A mutations in acute myeloid leukemia. N. Engl. J. Med. 363(25), 2424–2433 (2010).
118 Hebenstreit D, Gu M, Haider S, Turner DJ, Lio P, Teichmann SA. EpiChIP: gene-by-gene quantification of epigenetic modification levels. Nucleic Acids Res. 39(5), e27 (2010).
119 Lister R, Ecker JR. Finding the fifth base: genome-wide sequencing of cytosine methylation. Genome Res. 19(6), 959–966 (2009).
120 Hodges E, Smith AD, Kendall J et al. High definition profiling of mammalian DNA methylation by array capture and single molecule bisulfite sequencing. Genome Res. 19(9), 1593–1605 (2009).
121 Pomraning KR, Smith KM, Freitag M. Genome-wide high throughput analysis of DNA methylation in eukaryotes. Methods 47(3), 142–150 (2009).
122 Down TA, Rakyan VK, Turner DJ et al. A Bayesian deconvolution strategy for immunoprecipitation-based DNA methylome analysis. Nat. Biotechnol. 26(7), 779–785 (2008).
123 Meissner A, Mikkelsen TS, Gu H et al. Genome-scale DNA methylation maps of pluripotent and differentiated cells. Nature 454(7205), 766–770 (2008).
124 Choi JH, Li Y, Guo J et al. Genome-wide DNA methylation maps in follicular lymphoma cells determined by methylation-enriched bisulfite sequencing. PLoS One 5(9), e13020 (2010).
125 Varley KE, Mitra RD. Bisulfite Patch PCR enables multiplexed sequencing of promoter methylation across cancer samples. Genome Res. 20(9), 1279–1287 (2010).
126 Chan TA, Glockner S, Yi JM et al. Convergence of mutation and epigenetic alterations identifies common genes in cancer that predict for poor prognosis. PLoS Med. 5(5), e114 (2008).
127 Varley KE, Mutch DG, Edmonston TB, Goodfellow PJ, Mitra RD. Intra-tumor heterogeneity of MLH1 promoter methylation revealed by deep single molecule bisulfite sequencing. Nucleic Acids Res. 37(14), 4603–4612 (2009).
128 Korshunova Y, Maloney RK, Lakey N et al. Massively parallel bisulphite pyrosequencing reveals the molecular


complexity of breast cancer-associated single cells. Nat. Biotechnol. 29(1), 51–57 • Describes the presence of reversion
cytosine-methylation patterns obtained (2010). mutations in BRCA2 as a mechanism of
from tissue and serum DNA. Genome Res. 140 Kitzman JO, Mackenzie AP, Adey A et al. resistance to poly(ADP-ribose)
18(1), 19–29 (2008). Haplotype-resolved genome sequencing of a polymerase inhibitors.
129 Li M, Chen WD, Papadopoulos N et al. Gujarati Indian individual. Nat. Biotechnol. 152 Sakai W, Swisher EM, Jacquemont C et al.
Sensitive digital quantification of DNA 29(1), 59–63 (2010). Functional restoration of BRCA2 protein by
methylation in clinical samples. Nat. 141 Tischkowitz M, Xia B. PALB2/FANCN: secondary BRCA2 mutations in BRCA2-
Biotechnol. 27(9), 858–863 (2009). recombining cancer and Fanconi anemia. mutated ovarian carcinoma. Cancer Res.
130 Lupski JR, Reid JG, Gonzaga-Jauregui C Cancer Res. 70(19), 7353–7359 (2010). 69(16), 6381–6386 (2009).
et al. Whole-genome sequencing in a 142 Jones SJM, Laskin J, Li YY et al. Evolution 153 Swisher EM, Sakai W, Karlan BY, Wurz K,
patient with Charcot–Marie–Tooth of an adenocarcinoma in response to Urban N, Taniguchi T. Secondary BRCA1
neuropathy. N. Engl. J. Med. 362(13), selection by targeted kinase inhibitors. mutations in BRCA1-mutated ovarian
1181–1191 (2010). Genome Biol. 11(8), R82 (2010). carcinomas with platinum resistance. Cancer
131 Musunuru K, Pirruccello JP, Do R et al. 143 Worthey EA, Mayer AN, Syverson GD Res. 68(8), 2581–2586 (2008).
Exome sequencing, ANGPTL3 mutations, et al. Making a definitive diagnosis: 154 Sakai W, Swisher EM, Karlan BY et al.
and familial combined hypolipidemia. Successful clinical application of whole Secondary mutations as a mechanism of
N. Engl. J. Med. 363(23), 2220–2227 exome sequencing in a child with cisplatin resistance in BRCA2-mutated
(2010). intractable inflammatory bowel disease. cancers. Nature 451(7182), 1116–1120
132 Walsh T, Shahin H, Elkan-Miller T et al. Genet. Med. 13(3), 255–262 (2011). (2008).
Whole exome sequencing and homozygosity 144 Kobayashi S, Boggon TJ, Dayaram T et al. • Describes the presence of reversion
mapping identify mutation in the cell EGFR mutation and resistance of mutations in BRCA2 as a mechanism of
polarity protein GPSM2 as the cause of non-small-cell lung cancer to gefitinib. resistance to cisplatin chemotherapy.
nonsyndromic hearing loss DFNB82. Am. N. Engl. J. Med. 352(8), 786–792 (2005).
J. Hum. Genet. 87(1), 90–94 (2010). 155 Futschik A, Schlotterer C. The next
145 Pao W, Miller VA, Politi KA et al. generation of molecular markers from
133 Yi X, Liang Y, Huerta-Sanchez E et al. Acquired resistance of lung massively parallel sequencing of pooled DNA
Sequencing of 50 human exomes reveals adenocarcinomas to gefitinib or erlotinib is samples. Genetics 186(1), 207–218 (2010).
adaptation to high altitude. Science associated with a second mutation in the
329(5987), 75–78 (2010). 156 Morgan JE, Carr IM, Sheridan E et al.
EGFR kinase domain. PLoS Med. 2(3), Genetic diagnosis of familial breast cancer
134 Li Y, Vinckenbosch N, Tian G et al. e73 (2005). using clonal sequencing. Hum. Mutat.
Resequencing of 200 human exomes 146 Bell DW, Gore I, Okimoto RA et al. 31(4), 484–491 (2010).
identifies an excess of low-frequency Inherited susceptibility to lung cancer may
non-synonymous coding variants. 157 Walsh T, King MC. Ten genes for inherited
be associated with the T790M drug breast cancer. Cancer Cell 11(2), 103–105
Nat. Genet. 42(11), 969–972 (2010). resistance mutation in EGFR. Nat. Genet. (2007).
135 Lo YM, Chan KC, Sun H et al. Maternal 37(12), 1315–1316 (2005).
plasma DNA sequencing reveals the 158 Casadei S, Norquist BM, Walsh T et al.
147 Inukai M, Toyooka S, Ito S et al. Presence of Contribution to familial breast cancer of
genome-wide genetic and mutational profile epidermal growth factor receptor gene
of the fetus. Sci. Transl. Med. 2(61), 61ra91 inherited mutations in the BRCA2-
T790M mutation as a minor clone in interacting protein PALB2. Cancer Res.
(2010). non-small cell lung cancer. Cancer Res. 71(6), 2222–2229 (2011).
•• Describes the use of next-generation 66(16), 7854–7858 (2006).
159 Anderson K, Lutz C, Van Delft FW et al.
sequencing for noninvasive 148 Turke AB, Zejnullahu K, Wu YL et al. Genetic variegation of clonal architecture
prenatal diagnosis. Preexistence and clonal selection of MET and propagating cells in leukaemia. Nature
136 Chiu RW, Chan KC, Gao Y et al. amplification in EGFR mutant NSCLC. 469(7330), 356–361 (2011).
Noninvasive prenatal diagnosis of fetal Cancer Cell 17(1), 77–88 (2010).
160 Geyer FC, Weigelt B, Natrajan R et al.
chromosomal aneuploidy by massively 149 Wardelmann E, Merkelbach-Bruse S, Molecular analysis reveals a genetic basis
parallel genomic sequencing of DNA in Pauls K et al. Polyclonal evolution of for the phenotypic diversity of metaplastic
maternal plasma. Proc. Natl Acad. Sci. USA multiple secondary KIT mutations in breast carcinomas. J. Pathol. 220(5),
105(51), 20458–20463 (2008). gastrointestinal stromal tumors under 562–573 (2010).
137 Ehrich M, Deciu C, Zwiefelhofer T et al. treatment with imatinib mesylate.
Clin. Cancer Res. 12(6), 1743–1749 161 Yachida S, Jones S, Bozic I et al. Distant
Noninvasive detection of fetal trisomy 21
(2006). metastasis occurs late during the genetic
by sequencing of DNA in maternal blood:
evolution of pancreatic cancer. Nature
a study in a clinical setting. Am. J. Obstet. 150 Antonescu CR, Besmer P, Guo T et al. 467(7319), 1114–1117 (2010).
Gynecol. 204(3), 205.e1–11 (2011). Acquired resistance to imatinib in
gastrointestinal stromal tumor occurs 162 Munroe DJ, Harris TJR. Third-generation
138 Chiu RW, Akolekar R, Zheng YW et al.
through secondary gene mutation. sequencing fireworks at Marco Island.
Non-invasive prenatal assessment of
Clin. Cancer Res. 11(11), 4182–4190 (2005). Nat. Biotechnol. 28(5), 426–428 (2010).
trisomy 21 by multiplexed maternal plasma
DNA sequencing: large scale validity study. 151 Edwards SL, Brough R, Lord CJ et al. 163 Bowers J, Mitchell J, Beer E et al. Virtual
BMJ 342, c7401 (2011). Resistance to therapy caused by intragenic terminator nucleotides for next-generation
deletion in BRCA2. Nature 451(7182), DNA sequencing. Nat. Methods 6(8),
139 Fan HC, Wang J, Potanina A, Quake SR.
1111–1115 (2008). 593–595 (2009).
Whole-genome molecular haplotyping of


164 Harris TD, Buzby PR, Babcock H et al. Single-molecule DNA sequencing of a viral genome. Science 320(5872), 106–109 (2008).
165 Thompson JF, Steinmann KE. Single molecule sequencing with a HeliScope genetic analysis system. Curr. Protoc. Mol. Biol. Chapter 7, Unit 7 10 (2010).
166 Pushkarev D, Neff NF, Quake SR. Single-molecule sequencing of an individual human genome. Nat. Biotechnol. 27(9), 847–850 (2009).
167 Goren A, Ozsolak F, Shoresh N et al. Chromatin profiling by directly sequencing small quantities of immunoprecipitated DNA. Nat. Methods 7(1), 47–49 (2010).
168 Kapranov P, Ozsolak F, Kim SW et al. New class of gene-termini-associated human RNAs suggests a novel RNA copying mechanism. Nature 466(7306), 642–646 (2010).
169 Ozsolak F, Ting DT, Wittner BS et al. Amplification-free digital gene expression profiling from minute cell quantities. Nat. Methods 7(8), 619–621 (2010).
170 Ozsolak F, Platt AR, Jones DR et al. Direct RNA sequencing. Nature 461(7265), 814–818 (2009).
171 Ozsolak F, Goren A, Gymrek M et al. Digital transcriptome profiling from attomole-level RNA samples. Genome Res. 20(4), 519–525 (2010).
172 Tessler LA, Reifenberger JG, Mitra RD. Protein quantification in complex mixtures by solid phase single-molecule counting. Anal. Chem. 81(17), 7141–7148 (2009).
173 Eid J, Fehr A, Gray J et al. Real-time DNA sequencing from single polymerase molecules. Science 323(5910), 133–138 (2009).
174 Travers KJ, Chin CS, Rank DR, Eid JS, Turner SW. A flexible and efficient template format for circular consensus sequencing and SNP detection. Nucleic Acids Res. 38(15), e159 (2010).
175 Flusberg BA, Webster DR, Lee JH et al. Direct detection of DNA methylation during single-molecule, real-time sequencing. Nat. Methods 7(6), 461–465 (2010).
176 Uemura S, Aitken CE, Korlach J, Flusberg BA, Turner SW, Puglisi JD. Real-time tRNA transit on single translating ribosomes at codon resolution. Nature 464(7291), 1012–1017 (2010).
177 Stoddart D, Heron AJ, Mikhailova E, Maglia G, Bayley H. Single-nucleotide discrimination in immobilized DNA oligonucleotides with a biological nanopore. Proc. Natl Acad. Sci. USA 106(19), 7702–7707 (2009).
178 Clarke J, Wu HC, Jayasinghe L, Patel A, Reid S, Bayley H. Continuous base identification for single-molecule nanopore DNA sequencing. Nat. Nanotechnol. 4(4), 265–270 (2009).
179 Olasagasti F, Lieberman KR, Benner S et al. Replication of individual DNA molecules under electronic control using a protein nanopore. Nat. Nanotechnol. 5(11), 798–806 (2010).
180 Wallace EV, Stoddart D, Heron AJ et al. Identification of epigenetic DNA modifications with a protein nanopore. Chem. Commun. 46(43), 8195–8197 (2010).
181 Cheley S, Xie H, Bayley H. A genetically encoded pore for the stochastic detection of a protein kinase. Chembiochem. 7(12), 1923–1927 (2006).
182 Pop M, Salzberg SL. Bioinformatics challenges of new sequencing technology. Trends Genet. 24(3), 142–149 (2008).
183 Trapnell C, Salzberg SL. How to map billions of short reads onto genomes. Nat. Biotechnol. 27(5), 455–457 (2009).
184 Chaisson MJ, Brinza D, Pevzner PA. De novo fragment assembly with short mate-paired reads: does the read length matter? Genome Res. 19(2), 336–346 (2009).
185 Ding L, Wendl MC, Koboldt DC, Mardis ER. Analysis of next-generation genomic data in cancer: accomplishments and challenges. Hum. Mol. Genet. 19(R2), R188–R196 (2010).
186 Nagarajan N, Pop M. Sequencing and genome assembly using next-generation technologies. Methods Mol. Biol. 673, 1–17 (2010).
187 Thomas RK, Baker AC, Debiasi RM et al. High-throughput oncogene mutation profiling in human cancer. Nat. Genet. 39(3), 347–351 (2007).
188 Ng PC, Henikoff S. Predicting deleterious amino acid substitutions. Genome Res. 11(5), 863–874 (2001).
189 Kaminker JS, Zhang Y, Watanabe C, Zhang Z. CanPredict: a computational tool for predicting cancer-associated missense mutations. Nucleic Acids Res. 35(Web Server issue), W595–W598 (2007).
190 Adzhubei IA, Schmidt S, Peshkin L et al. A method and server for predicting damaging missense mutations. Nat. Methods 7(4), 248–249 (2010).
191 Carter H, Chen S, Isik L et al. Cancer-specific high-throughput annotation of somatic mutations: computational prediction of driver missense mutations. Cancer Res. 69(16), 6660–6667 (2009).
192 Greenman C, Stephens P, Smith R et al. Patterns of somatic mutation in human cancer genomes. Nature 446(7132), 153–158 (2007).
193 Bignell GR, Santarius T, Pole JCM et al. Architectures of somatic genomic rearrangement in human cancer amplicons at sequence-level resolution. Genome Res. 17(9), 1296–1303 (2007).
194 Kaye J, Boddington P, De Vries J, Hawkins N, Melham K. Ethical implications of the use of whole genome methods in medical research. Eur. J. Hum. Genet. 18(4), 398–403 (2010).
195 Via M, Gignoux C, Burchard EG. The 1000 Genomes Project: new opportunities for research and social challenges. Genome Med. 2(1), 3 (2010).
196 Church GM. Molecular systems biology. Mol. Syst. Biol. 1, 2005.0030 (2005).
197 Collins FS, Barker AD. Mapping the cancer genome. Pinpointing the genes involved in cancer will help chart a new course across the complex landscape of human malignancies. Sci. Am. 296(3), 50–57 (2007).
198 Hudson TJ, Anderson W, Artez A et al. International network of cancer genome projects. Nature 464(7291), 993–998 (2010).
199 Varela I, Tarpey P, Raine K et al. Exome sequencing identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma. Nature 469(7331), 539–542 (2011).
200 Bilguvar K, Ozturk AK, Louvi A et al. Whole-exome sequencing identifies recessive WDR62 mutations in severe brain malformations. Nature 467(7312), 207–210 (2010).
201 Bolze A, Byun M, Mcdonald D et al. Whole-exome-sequencing-based discovery of human FADD deficiency. Am. J. Hum. Genet. 87(6), 873–881 (2010).
202 Bowden DW, An SS, Palmer ND et al. Molecular basis of a linkage peak: exome sequencing and family-based analysis identify a rare genetic variant in the ADIPOQ gene in the IRAS Family Study. Hum. Mol. Genet. 19(20), 4112–4120 (2010).
203 Byun M, Abhyankar A, Lelarge V et al. Whole-exome sequencing-based discovery of STIM1 deficiency in a child with fatal classic Kaposi sarcoma. J. Exp. Med. 207(11), 2307–2312 (2010).
204 Caliskan M, Chong JX, Uricchio L et al. Exome sequencing reveals a novel mutation for autosomal recessive nonsyndromic mental retardation in the TECR gene on chromosome 19p13. Hum. Mol. Genet. 20(7), 1285–1289 (2011).
205 Gilissen C, Arts HH, Hoischen A et al. Exome sequencing identifies WDR35 variants involved in Sensenbrenner syndrome. Am. J. Hum. Genet. 87(3), 418–423 (2010).
206 Haack TB, Danhauser K, Haberberger B et al. Exome sequencing identifies ACAD9 mutations as a cause of complex I deficiency. Nat. Genet. 42(12), 1131–1134 (2010).
207 Hoischen A, Van Bon BW, Gilissen C et al. De novo mutations of SETBP1 cause Schinzel-Giedion syndrome. Nat. Genet. 42(6), 483–485 (2010).
208 Johnson JO, Mandrioli J, Benatar M et al. Exome sequencing reveals VCP mutations as a cause of familial ALS. Neuron 68(5), 857–864 (2010).
209 Krawitz PM, Schweiger MR, Rodelsperger C et al. Identity-by-descent filtering of exome sequence data identifies PIGV mutations in hyperphosphatasia mental retardation syndrome. Nat. Genet. 42(10), 827–829 (2010).
210 Lalonde E, Albrecht S, Ha KC et al. Unexpected allelic heterogeneity and spectrum of mutations in Fowler syndrome revealed by next-generation exome sequencing. Hum. Mutat. 31(8), 918–923 (2010).
211 Otto EA, Hurd TW, Airik R et al. Candidate exome capture identifies mutation of SDCCAG8 as the cause of a retinal-renal ciliopathy. Nat. Genet. 42(10), 840–850 (2010).
212 Wang JL, Yang X, Xia K et al. TGM6 identified as a novel causative gene of spinocerebellar ataxias using exome sequencing. Brain 133(Pt 12), 3510–3518 (2010).

Websites

301 Helicos
www.helicosbio.com
302 Pacific Biosciences
www.pacificbiosciences.com
303 Oxford Nanopore Technologies
www.nanoporetech.com
304 Ion Torrent
www.iontorrent.com
305 Life Technologies
www.lifetechnologies.com
306 Bioconductor
www.bioconductor.org
307 NextGENe
www.softgenetics.com/NextGENe.html
308 Genesifter
www.geospiza.com/index.shtml
309 GenomeQuest
www.genomequest.com
310 DNAnexus
https://dnanexus.com
311 Integrative Genomics Viewer
www.broadinstitute.org/software/igv
312 International Cancer Genome Consortium
www.icgc.org
313 The Cancer Genome Atlas
http://cancergenome.nih.gov
314 Emory University
http://genetics.emory.edu/egl/tests
315 Ambry Genetics
www.ambrygen.com
316 German sequencing service provider (CeGaT)
www.cegat.de
317 National Centre for Genome Resources
www.ncgr.org
318 Good Start Genetics
www.goodstartgenetics.com
319 GeneDx
www.genedx.com
320 University of Leeds Institute of Molecular Medicine
www.limm.leeds.ac.uk
321 HeliScope
www.helicosbio.com
322 PacBio RS
www.pacificbiosciences.com
323 Ion Torrent PGM
www.lifetechnologies.com
324 Oxford Nanopore
www.nanoporetech.com
325 Life Technologies FRET
www.lifetechnologies.com
