Professional Documents
Culture Documents
Abstract
Neurodegenerative diseases exhibit chronic progressive lesions in the central and peripheral
nervous systems with unclear causes. The search for pathogenic mutations in human neuro-
degenerative diseases has benefited from massively parallel short-read sequencers. However,
genomic regions, including repetitive elements, especially with high/low GC content, are far
beyond the capability of conventional approaches. Recently, long-read single-molecule DNA
sequencing technologies have emerged and enabled researchers to study genomes, tran-
scriptomes, and metagenomes at unprecedented resolutions. The identification of novel mu-
tations in unresolved neurodegenerative disorders, the characterization of causative repeat
expansions, and the direct detection of epigenetic modifications on naive DNA by virtue of
long-read sequencers will further expand our understanding of neurodegenerative diseases. In
this article, we review and compare 2 prevailing long-read sequencing technologies, Pacific
Biosciences and Oxford Nanopore Technologies, and discuss their applications in neurode-
generative diseases.
From the Department of Neurology (Y.S., L.F., C.S., T.W., H.Z., H.L., S.Z., Z.H., Y.F., Y.D., J.Y., C.M., Y.X.), The First Affiliated Hospital of Zhengzhou University, Zhengzhou University,
Henan, P. R. China; The Academy of Medical Sciences of Zhengzhou University (Y.S., L.F., C.S., T.E., H.Z., H.L., S.Z., Z.H., Y.F., Y.D., C.M.), Zhengzhou University,Henan, P. R. China;
Henan Key Laboratory of Cerebrovascular Diseases (Y.S., L.F., C.S., H.Z., H.L., S.Z., Z.H., Y.F., Y.D., J.Y., C.M., Y.X.), The First Affiliated Hospital of Zhengzhou University, Zhengzhou
University, Henan, P. R. China; Sino-British Research Centre for Molecular Oncology (C.S., C.M.), National Centre for International Research in Cell and Gene Therapy, School of Basic
Medical Sciences, Academy of Medical Sciences, Zhengzhou University, Henan, P. R. China; and Institute of Neuroscience (J.W., Y.X.), Zhengzhou University, Henan, P. R. China.
Go to Neurology.org/N for full disclosures. Funding information and disclosures deemed relevant by the authors, if any, are provided at the end of the article.
The Article Processing Charge was funded by the National Natural Science Foundation of China (Grants U190420029, 91849115, and 81530037) and the National Key Research and
Development Program of China (Grant 2017YFA0105003).
This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND), which permits downloading
and sharing the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal.
Copyright © 2021 The Author(s). Published by Wolters Kluwer Health, Inc. on behalf of the American Academy of Neurology. 423
Glossary
AD = Alzheimer disease; ALS = amyotrophic lateral sclerosis; BAC = bacterial artificial chromosome; BAFME = Benign adult
familial myoclonic epilepsy; C9ORF72 = chromosome 9’s open reading frame 72; CCS = circular consensus sequence;
dsDNA = double-stranded DNA; FAME = familial adult myoclonic epilepsy; FCMTE = familial cortical myoclonic tremor with
epilepsy; FMR1 = fragile X mental retardation 1; FTD = frontotemporal dementia; FXS = fragile X syndrome; FXTAS = fragile
X-associated tremor/ataxia syndrome; HD = Huntington disease; HTT = huntingtin; IPD = interpulse duration; LRS = long-
read sequencer; NGS = next-generation sequencing; NIID = neuronal intranuclear inclusion disease; ONT = Oxford Nanopore
Technologies; OPDM = oculopharyngodistal myopathy; PacBio = Pacific Biosciences; PM = premutation; PolyQ =
polyglutamine; RP-PCR = repeat-primed PCR; SCA = spinocerebellar ataxias; SMRT = single-molecule real-time; SNP =
single nucleotide polymorphism; SV = structural variant; UTR = untranslated region; WGS = whole-genome sequencing.
Neurodegenerative diseases are a class of neurologic disorders receives negative results. Here, we summarize the mecha-
characterized by a progressive course of cognitive decline, nisms of LRS and their applications in human neurodegen-
motor disturbance, psychiatric disorders, and eventual dis- erative disorders.
ability. The neuropathologic features are the selective loss or
impairment of specific neurons in the central and peripheral
nervous systems. As the second most common cause of death, Long-Read Technologies
neurodegenerative disorders impose a severe burden on in-
dividuals, communities, and countries. Therefore, it is crucial SMRT Sequencing: PacBio
to detect and intervene early to slow down the irreversible PacBio RS, RSⅡ, Sequel, and SequelⅡ sequencers with in-
progression of these diseases. creased average read lengths and throughputs have allowed
for genomics research at an unprecedented resolution, all of
There are a small number of patients with family histories of which share a common mechanism, known as synthesis se-
neurodegenerative diseases, thus indicating that genetic factors quencing. First, a template called SMRTbell is created by
may play an etiologically important role. Since the identifica- ligating hairpin adapters to both ends of a target double-
tion of expanded trinucleotide repeats in Huntington diseases stranded DNA (dsDNA) molecule, thereby forming a closed
(HDs),1 cumulative evidence has suggested that repeat ex- circular DNA (Figure 1A and B). The library is then loaded
pansions may be more involved in the pathogenesis of neu- onto a flowcell composed of nanoscale wells called zero mode
rodegenerative diseases than first thought, regardless of waveguides, where a polymerase is immobilized at the bottom
whether they reside in coding or noncoding regions including and can combine an SMRTbell adaptor with a complemen-
59 or 39 untranslated regions (59 or 39 UTR) and introns.2,3 tary primer to initiate the replication process (Figure 1C).
Nevertheless, next-generation sequencing (NGS) technologies The incorporation of fluorescently labeled nucleotides re-
are inadequate to resolve the expansions exceeding several ki- leases excited fluorescence signals, with the color and duration
of the emitted light being recorded in real time (Figure 1D). A
lobases, owing to the limit of short read lengths (150–300 bp)
circular consensus sequence (CCS) with higher accuracy
and an underrepresentation of GC-rich/poor regions,4 despite
(>99%) is eventually constructed from multiple reads cover-
innumerable discoveries of mutations facilitated by NGS.
ing the original DNA template.5 In addition, the time between
adjacent nucleotide incorporations is called the interpulse
In contrast to NGS, 2 prevailing long-read sequencers termed
duration (IPD) (Figure 1D), which reflects variable poly-
LRS, single-molecule real-time (SMRT) sequencing de-
merase kinetics influenced by modification events,8 and
veloped by Pacific Biosciences (PacBio)5 and nanopore se-
contributes to the epigenetic analysis.
quencing by Oxford Nanopore Technologies (ONTs),6
provide an alternative strategy for single DNA molecule se-
Despite the preponderance of SMRT, raw read errors domi-
quencing in real time. These platforms generate exceptionally nated by deletions or insertions remain problematic. Given
long reads of several tens of kilobases, which encompass full- the random occurrence of errors, base discrimination can be
length repetitive regions. Moreover, guanine-cytosine (GC) obtained by constructing a CCS having >99% accurate con-
bias can be reduced to a low degree, and more homogeneous sensus with sufficient coverage.5
genome coverage can be achieved without the requirement of
PCR amplification, as opposed to NGS. Recently, clustered Nanopore Sequencing: ONT
regularly interspaced short palindromic repeats/CRISPR as- To date, ONT has introduced 4 sequencers: MinION,
sociated 9 (CRISPR/Cas9)-mediated amplification-free tar- PromethION, GridION, and SmidgION, which are all com-
geted enrichment in conjunction with LRS has emerged as an petent in electronically detecting individual bases during the
instrumental tool to achieve targeted sequencing.7 The disassembling process. To begin with, library preparation
unique features of LRS have made it well suited for deci- occurs in dsDNA fragments, which are end repaired and
phering neurodegenerative diseases, especially when NGS adapter ligated (Figure 2A and B). The adapters contain 59
PacBio platform clearly showed respective repeat sizes, size research and broadly applicable to a complete genetic and
distributions, and small AGG interruptions,15 which over- epigenetic analysis of expanded-repeat elements underlying
came the technical limits. More broadly, not only the effect of many other neurodegenerative diseases. More importantly,
sequence context on polymerase kinetics but also the strand LRS allows the improvement of genetic counseling in terms of
and position-specific influences of G on the polymerase’s assessing an accurate expansion risk of each allele individually
activity were confirmed by the variation in IPD,15 thus laying for female PM carriers to achieve optimal fertility.
the foundation for future investigations of epigenetic modi-
fications and polymerase kinetics. To address the challenging Neuronal Intranuclear Inclusion Disease
detection and location of AGG interruptions in female PM Neuronal intranuclear inclusion disease (NIID) is a pro-
carriers, SMRT provided a direct demonstration of AGG in- gressive and fatal neurodegenerative disease that presents
terruptions within the CGG repeat regions and separated a with great clinical heterogeneity, including dementia, cere-
large number of reads containing AGG interruptions based on bellar ataxia, parkinsonism, peripheral neuropathy, and auto-
the X chromosomes from which they derived.16 With the nomic dysfunction. Patients with NIID pathologically
valuable information offered by SMRT, it is relatively precise manifest in the presence of eosinophilic hyaline intranuclear
to estimate the expansion risk of CGG repeats in female PM inclusions in the central, peripheral, and autonomic nervous
carriers, which can help them to select for preimplantation systems, as well as somatic cells; therefore, skin biopsy effi-
genetic diagnosis in case of high risk,17 or invasive prenatal ciently determines the premortem diagnosis.19
diagnosis within small risk,18 to check the fragile X status of
the fetus. Intensive efforts have been made to explore the genetic
mechanisms of NIID since the first reported NIID-affected
Since the successful sequencing of highly repetitive elements case. Recently, a novel GGC repeat expansion in the 59 UTR
exceeding 2.25 kb of 100% GC content by using SMRT se- of the human specific gene NOTCH2NLC (formerly known
quencing technology, LRS has been the focus of extensive as NBPF19) has been identified as the genetic cause in
NIID GGCexp NOTCH2NLC PacBio RS Ⅱ and ONT Whole genome 8–25× 100% 201920
PromethION
BAFME1 TTTCA/TTTTAexp SAMD12 PacBio RS Ⅱ BAC clone and whole genome NA NA 201841
ONT MinION
Abbreviations: ALS/FTD = amyotrophic lateral sclerosis/frontotemporal dementia; ATXN10 = ataxin 10; BAC = bacterial artificial chromosome; BAFME1 =
benign adult familial myoclonic epilepsy type 1; BAFME4 = benign adult familial myoclonic epilepsy type 4; C9ORF72 = chromosome 9’s open reading frame 72;
exp = expansion; FAME2 = familial adult myoclonic epilepsy type 2; FAME3 = familial adult myoclonic epilepsy type 3; FMR1 = fragile X mental retardation 1;
FXTAS = fragile X-associated tremor/ataxia syndrome; GIPC1 = GIPC PDZ domain containing family member 1; HD = Huntington disease; HTT = huntingtin;
MARCH6 = membrane-associated ring finger (C3HC4) 6; NA = not available; NBPF19 = neuroblastoma break point family member 19; NIID = neuronal
intranuclear inclusion disease; No-Amp targeted sequencing = no-amplification targeted sequencing; NOTCH2NLC = notch 2 amino-terminal-like protein C;
ONT = Oxford Nanopore Technologies; OPDM = oculopharyngodistal myopathy; PacBio = Pacific Biosciences; PD = Parkinson disease; SAMD12 = sterile alpha
motif domain containing 12; SCA10 = spinocerebellar ataxia type 10; STARD7 = StAR-related lipid transfer domain containing 7; YEATS2 = YEATS domain
containing 2.
a
NBPF19 is recently annotated as NOTCH2NLC.
b
The inserted sequence was suggested to be (TTTCT), rather than (TTTCA) as reported previously, possibly due to the high sequencing error rate of long-read
sequencing technologies.
Updated Information & including high resolution figures, can be found at:
Services http://n.neurology.org/content/97/9/423.full
References This article cites 60 articles, 12 of which you can access for free at:
http://n.neurology.org/content/97/9/423.full#ref-list-1
Subspecialty Collections This article, along with others on similar topics, appears in the following collection(s):
All Genetics
http://n.neurology.org/cgi/collection/all_genetics
Association studies in genetics
http://n.neurology.org/cgi/collection/association_studies_in_genetics
Trinucleotide repeat diseases
http://n.neurology.org/cgi/collection/trinucleotide_repeat_diseases
Permissions & Licensing Information about reproducing this article in parts (figures,tables) or in its entirety can be
found online at:
http://www.neurology.org/about/about_the_journal#permissions
Reprints Information about ordering reprints can be found online:
http://n.neurology.org/subscribers/advertise
Neurology ® is the official journal of the American Academy of Neurology. Published continuously since 1951, it is now a
weekly with 48 issues per year. Copyright Copyright © 2021 The Author(s). Published by Wolters Kluwer Health, Inc. on
behalf of the American Academy of Neurology.. All rights reserved. Print ISSN: 0028-3878. Online ISSN: 1526-632X.