
REVIEWS

The chemistry of Cas9 and its CRISPR colleagues
Janice S. Chen1 and Jennifer A. Doudna1–5
Abstract | RNA-guided binding and cleavage of nucleic acids by CRISPR–Cas systems is a
defining feature of bacterial and archaeal adaptive immunity against viruses and plasmids.
As a result of their programmable ability to cut specific DNA and RNA sequences, Cas9 and
related single-subunit effector proteins from CRISPR–Cas systems have been widely adopted
for research and therapeutic genome engineering applications. In this Review, we discuss the
chemistry of macromolecules involved in the multistep interference pathway used by CRISPR–
Cas systems that mediate accurate nucleic acid target recognition and cutting. Although this
Review mainly focuses on DNA interference by Cas9, we briefly explore nucleic acid targeting
by the single-effector proteins Cas12 and Cas13 to emphasize the conserved themes of precision
DNA and RNA cleavage within class 2 CRISPR–Cas systems. We further highlight the unique
mechanisms of surveillance complex formation, substrate recognition and target cleavage in
molecular detail across diverse single-subunit CRISPR–Cas interference proteins.

1 Department of Molecular and Cell Biology, University of California.
2 Howard Hughes Medical Institute, University of California.
3 Innovative Genomics Institute, University of California.
4 Department of Chemistry, University of California, Berkeley, California 94720, USA.
5 Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA.
Correspondence to J.A.D. doudna@berkeley.edu
doi:10.1038/s41570-017-0078
Published online 4 Oct 2017

Mobile genetic elements
DNA sequences that are capable of moving around a genome, including transposons, plasmids and bacteriophages.

Protospacers
Spacer precursors that are captured from foreign DNA and that are complementary to the CRISPR RNA (crRNA) spacer sequence.

CRISPR–Cas systems involve complex RNA-guided adaptive immune pathways in bacteria and archaea to combat infection by mobile genetic elements (MGEs)1–3. Immunological memory begins with the acquisition of foreign DNA sequences derived from invading bacteriophages or plasmids4–7, known as protospacers, the selection of which requires a protospacer adjacent motif sequence (PAM sequence). These protospacers are integrated into the leader proximal end of a host CRISPR array8,9, which yields new spacers that are flanked by ~20–50 base pair (bp) direct repeats10,11. Transcription of the CRISPR array generates a non-coding RNA transcript, the ‘precursor CRISPR RNA’ (pre-crRNA), which is further processed by specific Cas proteins or host factors to produce short mature crRNAs that contain a single spacer and at least one repeat12–16. Finally, assembly of the mature crRNA with the single-effector Cas protein or multiple Cas subunits forms a surveillance ribonucleoprotein (RNP) complex. This RNP complex recognizes foreign nucleic acids containing a PAM sequence and destroys sequences complementary to the spacer segment of the crRNA through endonucleolytic cleavage by Cas nucleases17–20 (FIG. 1).

The programmable, RNA-guided DNA-targeting and RNA-targeting functions of CRISPR–Cas systems have been repurposed for genome- and transcriptome-engineering applications21–28, which has led to extensive study and interest in CRISPR–Cas biology and enzyme mechanisms. In particular, the facile ability to direct the widely used Cas9 protein to any sequence of interest by manipulating its guide RNA has inspired a genome-editing revolution29–34. Recent efforts in mining bacterial and archaeal genomes for single-effector proteins have uncovered rare and divergent CRISPR–Cas systems, which offer new RNA-guided endonucleases with functional diversity35–37. To fully use the nucleic acid-targeting capabilities of Cas9 and additional class 2 effector Cas proteins, biochemical and structural studies have guided our understanding of the molecular mechanisms underpinning target search and destruction.

In this Review, we briefly introduce the evolutionary relationships between Cas9 and other single-effector proteins of class 2 CRISPR–Cas systems and we detail the multistep mechanisms for RNP complex formation, substrate recognition and endonucleolytic cleavage. Although the Cas9 interference pathway remains the focus of this Review, we highlight the mechanistic similarities and differences in target recognition and cleavage across the evolutionarily distinct class 2 single-effector proteins.

CRISPR–Cas systems
The ongoing competition between interference by CRISPR–Cas systems and infection by MGEs has triggered the rapid evolution and extensive diversification of CRISPR loci and Cas proteins38,39. Although CRISPR–Cas systems are functionally modular, the selective pressures inflicted by invading genetic elements have led to the emergence of different Cas proteins that carry out both specialized and combined roles in spacer acquisition, crRNA biogenesis, DNA and RNA interference, and

NATURE REVIEWS | CHEMISTRY VOLUME 1 | ARTICLE NUMBER 0078 | 1


© 2017 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.

ancillary functions3. These ancillary functions include the putative role for the 5ʹ to 3ʹ exonuclease activity of Cas4 during protospacer selection40, the possible involvement of Csn2 in chromosomal protection during spacer integration41,42 and a novel role for Cas10 in the synthesis of cyclic oligonucleotides to allosterically activate Csm6 ribonuclease activity43,44. In this section, we briefly discuss the two major classes of CRISPR–Cas systems to introduce the evolutionary relationships of genes encoding the interference module. For a detailed overview of CRISPR–Cas classification, we recommend several comprehensive reviews3,37,45,46.

Class 1 CRISPR–Cas systems. Class 1 CRISPR–Cas systems comprise the multisubunit crRNA–effector complexes that are required for target interference and are further classified into types I, III and IV on the basis of the presence of signature cas genes3. Type I systems are characterized by Cas3, which has dual nuclease and helicase activity and is recruited by the multiprotein complex Cascade (CRISPR-associated complex for antiviral defence) for DNA target degradation47. Type III systems include the large subunit Cas10, which is required for Csm (subtype III-A) and Cmr (subtype III-B) assembly and degradation of both RNA and transcriptionally active DNA48–52. Finally, the putative type IV systems contain the uncharacterized gene csf1, which encodes a large subunit that is related to the Cas8 protein in Cascade3. On the basis of the available genomic data, type I and III systems seem to be distantly related and account for ~50% and ~25% of all known CRISPR loci in bacteria and archaea, respectively3,53,54.

Class 2 CRISPR–Cas systems. Class 2 CRISPR–Cas systems are characterized by RNA-guided effector complexes that require only a single multidomain protein for interference3. The most comprehensively studied example in this class is type II, which is characterized by the signature gene cas9 (REF. 55). Cas9 is capable of programmable RNA-guided DNA interference without the need for additional proteins21, a property that has been used to repurpose Cas9 for genome editing. This finding has since attracted intensive investigation into its function and mechanism using biochemical reconstitution and in cellular contexts29–32. Although type II systems represent a small proportion (~5%) of all known CRISPR–Cas systems in bacteria, the Cas9 phylogeny is diverse and further classified into subtypes II-A, II-B and II-C (REFS 3,54). Subclassification of type II systems is based on the presence of auxiliary cas genes csn2 (subtype II-A) or cas4 (subtype II-B), which are involved in adaptation but not in the interference stage40,56. Although type II systems were previously thought to be restricted to bacteria, a divergent Cas9 protein has been discovered in nanoarchaea; however, no functional data have so far shown that this system is active35.

Protospacer adjacent motif sequence
(PAM sequence). A short sequence adjacent to the protospacer within foreign DNA. Recognition of the PAM sequence by effector Cas proteins triggers target interference.

Endonucleolytic cleavage
Achieved by Cas proteins that hydrolyse internal phosphodiester bonds within a nucleotide chain.

Guide RNA
An RNA molecule that includes the CRISPR RNA (crRNA) and directs the Cas interference protein to a target site that is complementary to the spacer sequence.

Interference
The final stage of CRISPR immunity that involves RNA-directed cleavage of target nucleic acids by Cas proteins.

Nuclease
An enzyme that catalyses the cleavage of phosphodiester bonds between nucleotides.

Helicase
An enzyme that unwinds double-stranded DNA using the energy from ATP hydrolysis.

[Figure 1 schematic. Left: class 2 interference genes, namely type II (cas9; RuvC I, RuvC II, RuvC III and HNH domains), type V (cas12a, cas12b, cas12c, cas12d and cas12e; RuvC I, RuvC II and RuvC III domains) and type VI (cas13a, cas13b and cas13c; two HEPN domains). Right: stages of immunity, namely adaptation (protospacer from a bacteriophage integrated into the CRISPR array of spacers S1–S3 and repeats R), transcription, crRNA processing of the pre-crRNA into mature crRNA, RNP formation with the effector protein, and interference.]

Figure 1 | Overview of CRISPR–Cas adaptive immunity and interference genes from class 2 systems. CRISPR–Cas immunity is encoded in the host genome at loci comprising the CRISPR array and neighbouring cas genes. During infection, adaptation machinery acquires ‘protospacers’ from the invading genetic element (for example, a bacteriophage) and integrates them into the leader end of the CRISPR array to encode immunological memory. Transcription of the CRISPR array generates a precursor CRISPR RNA (pre-crRNA) transcript, which is further processed by specific Cas proteins or host factors into mature crRNAs containing the unique protospacer sequence (S1–3; red, yellow and purple, respectively). Assembly of interference Cas proteins with crRNAs forms the ribonucleoprotein (RNP) surveillance complex, which scans invading nucleic acids and targets foreign complementary sequences for degradation. In class 2 CRISPR–Cas systems, the interference module comprises a type II (Cas9), type V (Cas12a, Cas12b, Cas12c, Cas12d and Cas12e) or type VI (Cas13a, Cas13b and Cas13c) single-effector protein encoded by the cas operon. HEPN, higher eukaryotes and prokaryotes nucleotide-binding RNase domain; HNH, histidine–asparagine–histidine nuclease domain; R, direct repeat; RuvC, RNase H-like fold nuclease domain.
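The classification described in this section and in Figure 1 can be summarized as a small lookup table. The following is a minimal sketch only: the names `CrisprType`, `CRISPR_TYPES`, `types_in_class` and `is_single_effector` are illustrative assumptions and not part of any established tool, and the table records only the signature genes and nuclease domains named in the text and figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrisprType:
    """One CRISPR-Cas type as summarized in the text (illustrative record)."""
    crispr_class: int        # 1 = multisubunit effector complex, 2 = single effector
    signature: str           # signature gene or protein named in the text
    nuclease_domains: tuple  # domains shown in Figure 1; empty if none is listed


# Hypothetical summary table; entries are limited to what the Review states.
CRISPR_TYPES = {
    "I": CrisprType(1, "Cas3", ()),
    "III": CrisprType(1, "Cas10", ()),
    "IV": CrisprType(1, "csf1 (putative)", ()),
    "II": CrisprType(2, "Cas9", ("RuvC", "HNH")),
    "V": CrisprType(2, "Cas12 (Cas12a-e)", ("RuvC",)),
    "VI": CrisprType(2, "Cas13 (Cas13a-c)", ("HEPN",)),
}


def types_in_class(crispr_class):
    """Return the type names that belong to the given class, sorted as strings."""
    return sorted(
        name for name, info in CRISPR_TYPES.items()
        if info.crispr_class == crispr_class
    )


def is_single_effector(type_name):
    """True if interference in this type needs only one multidomain protein."""
    return CRISPR_TYPES[type_name].crispr_class == 2


if __name__ == "__main__":
    for class_id in (1, 2):
        print(f"class {class_id}:", ", ".join(types_in_class(class_id)))
```

Keeping the class as a field of each type, rather than nesting types under classes, makes it straightforward to add further attributes (for example, subtypes) without restructuring the table.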

2 | ARTICLE NUMBER 0078 | VOLUME 1 www.nature.com/natrevchem



RuvC nuclease domain
Contains an RNase H-like fold and cleaves single-stranded DNA through a two metal mechanism.

HEPN domain
(Higher eukaryotes and prokaryotes nucleotide-binding RNase domain). Contains conserved motifs and functions as an RNase or a non-catalytic RNA-binding domain.

Ribonuclease
An enzyme that hydrolyses the phosphodiester bonds of an RNA backbone.

Exonuclease
An enzyme that hydrolyses phosphodiester bonds, one at a time, from the ends of a nucleic acid chain.

Single-guide RNA
A chimeric RNA molecule in which the CRISPR RNA (crRNA), which contains a sequence complementary to the target DNA, is covalently linked to a trans-activating crRNA (tracrRNA).

The interest in discovering new facets of CRISPR biology and harnessing the natural diversity of class 2 single-effector proteins for genome-engineering applications has led to the identification of type V and type VI CRISPR–Cas systems36. To date, the experimentally tested type V systems (the effector proteins of which have been redesignated as Cas12a–e (REF. 37)) include the single-effector proteins Cas12a (also known as Cpf1; subtype V-A), Cas12b (also known as C2c1; subtype V-B), Cas12c (also known as C2c3; subtype V-C), Cas12d (also known as CasY; subtype V-D) and Cas12e (also known as CasX; subtype V-E)46, all of which are evolutionarily distinct from Cas9 (REFS 35–37). Similarly to Cas9, Cas12 proteins contain a conserved RuvC nuclease domain that is known to hydrolyse single-stranded DNA (ssDNA)57–59. Notably, Cas12a contains a second catalytic domain that is independently responsible for processing its own pre-crRNA59,60.

Type VI CRISPR–Cas systems consist of an effector protein (reclassified as Cas13 (REF. 37)) containing the signature HEPN domain (higher eukaryotes and prokaryotes nucleotide-binding RNase domain)61, of which the Cas13a (also known as C2c2; subtype VI-A) and Cas13b (subtype VI-B) proteins have been experimentally shown to function as RNA-guided RNases36,62. Although Cas13a effectors have been shown to indiscriminately cleave single-stranded RNAs (ssRNAs) on binding to a complementary target RNA63–66, they also possess pre-crRNA processing activity62–64, similar to that of Cas12a proteins36,60. Compared with type II systems, types V and VI are rare and represent only ~2% of sequenced CRISPR–Cas systems in bacteria and archaea combined3,37; however, the streamlined and diverse abilities of type V and VI systems to target nucleic acids hold potential for various research and therapeutic applications.

crRNA processing and RNP formation
The molecular memory of infection encoded in the CRISPR array is transcribed to generate a long, non-coding pre-crRNA transcript that requires processing and loading into Cas nucleases to authorize RNA-guided CRISPR immunity12. In this section, we briefly review two major classes of crRNA processing in class 2 CRISPR systems: trans-activating crRNA (tracrRNA)-dependent pathways, which require a host factor RNase III (REF. 13), and self-processing pathways, which are accomplished by the interference protein alone60,64. We also discuss the structural determinants for assembling the guide RNAs into their effector proteins to generate the RNP surveillance complex.

tracrRNA-dependent pathways. Nucleic acid targeting by RNP surveillance complexes is dependent on mature crRNAs, which facilitate complex activation and molecular recognition of the cognate target12,16. However, for type II systems, crRNA processing and Cas9 activation require an additional small, non-coding tracrRNA13. Bioinformatic analyses followed by experimental validation using RNA sequencing (RNA-seq) and biochemical reconstitution have revealed that tracrRNAs, which are typically ~75–110 nucleotides in length, contain a sequence complementary to CRISPR repeats that is required for RNP assembly and crRNA processing and are abundantly expressed within the intergenic regions of the CRISPR–Cas loci13,67,68. Although tracrRNAs can be diverse in sequence and secondary structure, they contain a conserved ‘anti-repeat’ sequence that base pairs with each direct repeat of the pre-crRNA to form an ~30 bp double-stranded RNA (dsRNA) duplex68. In the presence of Cas9, this dsRNA substrate is cleaved by a non-Cas ribonuclease (RNase III), which generates individual Cas9-bound crRNA–tracrRNA complexes from the pre-crRNA transcript13. Further processing by an unknown exonuclease trims the 5ʹ end of the crRNA, generating the Cas9 surveillance complex that is primed to cleave a complementary DNA target13 (FIG. 2a).

To simplify dual-guided proteins such as Cas9, Cas12b and Cas12e for programmable targeting in research applications, the crRNA–tracrRNA molecules required for their activity have been fused by a tetraloop to generate a chimeric single-guide RNA (sgRNA)21,69,70. Since the first generation of Cas9 sgRNAs, efforts to optimize their on-target efficiency in cultured cells have involved maintaining the 3ʹ hairpins, extending the repeat–anti-repeat duplex and adding synthetic modifications to improve the stability of the sgRNA71–73. Although many sgRNA variants have been developed for diverse applications74–78, a highly efficient sgRNA design for genome editing and genomic loci imaging comprises a long tracrRNA tail containing the native 3ʹ hairpins and a 5 bp extension of the crRNA–tracrRNA duplex71,72,79. Activation of dual-guided interference proteins requires competent RNP complex formation, which depends on RNA folding, stability and sequence, as well as structural recognition of the crRNA–tracrRNA duplex. Therefore, in the case of type II-A Cas9 proteins, minor alterations to the sgRNA scaffold have been shown to reduce or to abolish endonuclease activity80–82.

Secondary structure predictions and sgRNA-bound and target-bound Cas9 crystal structures have revealed conserved RNA folds that are necessary to define their functional roles83–86. The sgRNA from the widely used type II-A Streptococcus pyogenes Cas9 (SpCas9) includes a 5ʹ spacer and repeat sequence from the crRNA, followed by the tracrRNA-derived anti-repeat sequence (the repeat–anti-repeat duplex of which consists of an upper stem, bulge and lower stem), nexus and hairpins 1 and 2 at the 3ʹ end84–87 (FIG. 2b). Functional studies have shown that, irrespective of the spacer sequence, specific nucleotides in the bulge and nexus sequences of the SpCas9 sgRNA are most crucial for cleavage87. With respect to binding energy, SpCas9 binds hairpins 1–2 alone with an equilibrium dissociation constant (Kd) in the picomolar range, which is almost indistinguishable from that of the full-length sgRNA88. In comparison, SpCas9 binds to the spacer–nexus region with nanomolar affinity, which suggests that the structured 3ʹ end of the sgRNA contributes most of the binding energy for the Cas9–sgRNA complex88. Förster resonance energy transfer (FRET)-based experiments have also shown that the 20-nucleotide spacer sequence at the 5ʹ end of the sgRNA is crucial


[Figure 2 schematic. Panels: a, tracrRNA-dependent pathway (transcription of the CRISPR array and tracrRNA, RNase III processing of the pre-crRNA–tracrRNA duplex, crRNA–tracrRNA RNP); b, SpCas9 sgRNA (spacer, repeat, tetraloop linker, upper stem, bulge, lower stem, nexus, hairpins 1 and 2); c, AaCas12b sgRNA (R–AR duplexes 1 and 2, stems 1–3, spacer); d, self-processing pathway (transcription, pre-crRNA processing by the effector protein, crRNA RNP); e, AsCas12a crRNA (pseudoknot repeat, spacer); f, LshCas13a crRNA (repeat stem–loop with bulge, spacer). Nucleotide-level sequences are not recoverable from the extracted text.]

Figure 2 | TracrRNA-dependent and self-processing pathways for crRNA biogenesis and RNP complex formation. The CRISPR array is transcribed to form the precursor CRISPR RNA (pre-crRNA) transcript containing multiple spacers (S1–3; red, yellow and purple, respectively) flanked by direct repeats (R; blue), which requires further processing to produce mature crRNAs containing a single spacer sequence (yellow) and at least one direct repeat (blue). In part a, trans-activating crRNA (tracrRNA)-dependent crRNA biogenesis is shown in which transcription of an intergenic region of the CRISPR array generates the small, non-coding tracrRNA molecule (red), which contains a region of complementarity to the direct repeat (anti-repeat) of the pre-crRNA. The anti-repeat sequence base pairs with each direct repeat, thus forming a pre-crRNA–tracrRNA double-stranded RNA (dsRNA) duplex. In the presence of Cas9 (Protein Data Bank identifier: 4UN3; part b) or Cas12b (PDB ID: 5U30; part c), RNase III recognizes and cleaves the repeat–anti-repeat (R–AR) RNA duplex. The dual crRNA–tracrRNA molecule (which can be fused to generate a chimeric single-guide RNA (sgRNA)) remains tightly associated with the interference protein to form the ribonucleoprotein (RNP) complex and stabilizes a conformation that is competent for target surveillance. In part b, hybridization of the crRNA repeat (blue) and tracrRNA anti-repeat (red) generates the crRNA–tracrRNA duplex containing an upper stem, bulge and lower stem, and the remaining tracrRNA sequence folds into the nexus and hairpins 1 and 2. In part c, the crRNA–tracrRNA duplex forms the R–AR duplex, and the remaining tracrRNA topology consists of a pseudoknot structure and four helical stems. In part d, self-processing crRNA biogenesis is shown in which an additional RNA molecule or nuclease is not required for pre-crRNA processing beyond the interference protein itself. The single-effector protein Cas12a (PDB ID: 5B43; part e) or Cas13a (PDB ID: 5WTK; part f) recognizes and binds to the structured repeat within the pre-crRNA transcript, which triggers endoribonuclease activity that is distinct from that of the active site involved in DNA target cleavage. The resulting RNP complex is competent for downstream target recognition and degradation. In part e, the crRNA repeat (blue) folds into a pseudoknot structure upstream of the spacer sequence (yellow). In part f, the crRNA repeat folds into a stem–loop containing a 2-nucleotide bulge at the base of the stem–loop. AaCas12b, Alicyclobacillus acidoterrestris Cas12b; AsCas12a, Acidaminococcus spp. Cas12a; LshCas13a, Leptotrichia shahii Cas13a; SpCas9, Streptococcus pyogenes Cas9.
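The outcome of pre-crRNA processing described in the caption (a repeat–spacer array cut into mature crRNAs that each carry a single spacer and one repeat) can be illustrated with a toy string model. This is a sketch under loud assumptions: the function name `process_pre_crrna`, the `repeat_first` flag and the example sequences are all hypothetical, real repeats are structured and are cleaved at defined positions by RNase III or by the effector protein, and exact string matching here merely stands in for that recognition. The flag mirrors the two layouts named in the text: repeat upstream of the spacer (as drawn for Cas12a) or spacer upstream of the repeat (as in the crRNA portion of the SpCas9 guide).

```python
def process_pre_crrna(pre_crrna, repeat, repeat_first=True):
    """Split a toy pre-crRNA into 'mature crRNAs' with one spacer and one repeat.

    Only spacers flanked by a repeat on both sides are returned; any leader
    before the first repeat or trailer after the last repeat is ignored.
    """
    if not repeat:
        raise ValueError("repeat must be a non-empty sequence")
    pieces = pre_crrna.split(repeat)
    # pieces[0] precedes the first repeat and pieces[-1] follows the last one,
    # so the repeat-flanked spacers are the non-empty pieces in between.
    spacers = [piece for piece in pieces[1:-1] if piece]
    if repeat_first:
        return [repeat + spacer for spacer in spacers]  # repeat 5' of spacer
    return [spacer + repeat for spacer in spacers]      # spacer 5' of repeat


if __name__ == "__main__":
    # Made-up sequences for illustration only; not taken from any organism.
    toy_repeat = "GGAUCC"
    toy_spacers = ["AAAACCCC", "UUUUCACA", "ACACGUGU"]
    toy_array = toy_repeat + toy_repeat.join(toy_spacers) + toy_repeat
    for crrna in process_pre_crrna(toy_array, toy_repeat):
        print(crrna)
```

A spacer that happened to contain the repeat sequence would be split incorrectly by this model, which is one reason it should be read as an illustration of the bookkeeping rather than of the enzymology.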


Scissile phosphate
The phosphate within the nucleic acid backbone that is cleaved by a nuclease.

for triggering the ‘lobe closure’ conformation that is required to generate the target search complex, whereas the 3ʹ end hairpins mainly contribute to the stability of the closed state89.

Crystal structures of the sgRNA-bound and target-bound Cas9 complexes have validated these experimental results and shown that the Cas9 protein makes extensive hydrogen bond contacts and aromatic stacking interactions with the repeat–anti-repeat duplex, the nexus sequence and the base of hairpins 1 and 2 (REFS 83–86). Furthermore, the first ten nucleotides from the 5ʹ end of the spacer are disordered in the crystal structure, whereas the remaining ten nucleotides are pre-organized in a pseudo-A form conformation, which is thermodynamically favourable for initiating R-loop formation83. Variations in sgRNA hairpin structure were observed in crystal structures of type II-A Staphylococcus aureus Cas9 (SaCas9)90 and type II-B Francisella novicida Cas9 (FnCas9)91 proteins, which provided the structural basis for orthogonality between cognate Cas9–sgRNA pairs92. Notably, the crystal structure of the sgRNA–target-bound type II-C orthologue Campylobacter jejuni Cas9 (CjCas9) features a triple-helix within a pseudoknot that involves extensive intramolecular interactions necessary for activating endonucleolytic cleavage93.

The general pathway for dual crRNA–tracrRNA-guided processing and assembly extends beyond type II systems and has also been observed and experimentally shown for the evolutionarily distant Cas12 interference proteins, Cas12b and Cas12e (REFS 35,36). In contrast to the type II sgRNA architecture, secondary structure predictions for the Cas12e and Cas12b sgRNAs have shown an opposite polarity to that of Cas9 sgRNAs, comprising a 5ʹ end hairpin and anti-repeat sequence from the tracrRNA, followed by the crRNA repeat and spacer sequence at the 3ʹ end35,36,94,95. For Alicyclobacillus acidoterrestris Cas12b (AaCas12b), sgRNA–target-bound crystal structures have shown an unexpected sgRNA topology consisting of a pseudoknot structure and four helical stems, compared with the simple stem-loop fold that is predicted to be its minimum free-energy conformation94,95 (FIG. 2c). A disordered RNA linker that was previously thought to form part of the repeat–anti-repeat duplex was further shown to be dispensable for AaCas12b activity94. On the basis of in silico co-folding, these structural features also seem to be possible with the crRNA–tracrRNA sequence from the orthologous Bacillus thermoamylovorans Cas12b (BtCas12b), which suggests that these RNA folds might be conserved in type V-B systems94.

Self-processing pathways. Unlike the dual crRNA–tracrRNA-guided examples, the finding that the Cas12a effector does not require additional RNA or protein components for crRNA processing and DNA target cleavage offered a new paradigm for single crRNA-guided class 2 interference proteins58–60 (FIG. 2d). Initial studies showed that a minimal type V-A expression plasmid consisting of only cas12a from F. novicida and a CRISPR array was capable of pre-crRNA maturation and plasmid DNA interference58; the dual RNase and DNase activity by Cas12a was validated by small RNA sequencing and biochemical reconstitution58–60. The mature crRNA comprises a structured repeat followed by the spacer sequence, and the 5ʹ stem–loop provides the majority of the crRNA binding energy for Cas12a endoribonuclease activity60,96,97 (FIG. 2e). Although Cas12a has ~50-fold higher affinity for crRNA in the presence of a magnesium ion (Mg2+) (REF. 96), one study established that pre-crRNA processing is metal independent and requires the 2ʹ-hydroxyl (2ʹ-OH) group of the ribonucleotide upstream of the scissile phosphate for acid–base catalysis59. On the basis of the crRNA-bound Cas12a crystal structure from Lachnospiraceae spp. bacterium ND2006 (LbCas12a) and crRNA–target–Cas12a structures from Acidaminococcus spp. Cas12a (AsCas12a) and F. novicida Cas12a (FnCas12a), the RNA processing and RuvC catalytic centres are spatially distant and functionally independent59,96–99.

Similarly to Cas12a, the Cas13a and Cas13b proteins also evolved the streamlined functions of crRNA processing and target interference in a single-effector protein62–65. Unexpectedly, the crRNA-bound structure of Leptotrichia shahii Cas13a (LshCas13a) revealed a two-nucleotide bulge at the base of the 5ʹ stem–loop in the crRNA, which seemed to be evolutionarily conserved and functionally required65 (FIG. 2f). Biochemical and structural studies have further validated the functional and spatial separation of the crRNA-processing and RNA-targeting active sites in Cas13a (REFS 63–66,176), offering another example of convergent evolution and underscoring the functional modularity across CRISPR–Cas systems.

PAM recognition
In most CRISPR–Cas systems, target DNA search and recognition requires both complementary base pairing between the guide RNA and the target DNA and the presence of a conserved PAM sequence on the target DNA adjacent to the target site100–102. PAM recognition by Cas proteins activates interference as a strategy to distinguish self from non-self sequences2, and single mutations in the PAM can prevent CRISPR–Cas cleavage activity in vitro and in vivo21,102–106.

The native PAM sequence for the commonly used SpCas9 is 5ʹ-NGG-3ʹ, in which N can be any of the four DNA nucleotides100. Single-molecule experiments have shown that the Cas9–sgRNA complex initiates its search for target DNA by binding to a PAM sequence before interrogating the flanking DNA for potential complementarity to the guide RNA103. Target recognition occurs through three-dimensional collisions; Cas9 rapidly dissociates from DNA that does not contain the appropriate PAM sequence, whereas it binds for a longer duration at sites containing a PAM sequence, with the dwell time depending on the degree of complementarity between the guide RNA and the adjacent DNA107,108. Once Cas9 has found a target site with the appropriate PAM, it triggers DNA unwinding from the PAM-proximal end to the PAM-distal end of the target site103.

The molecular mechanism of PAM recognition involves binding of the PAM DNA to a positively charged groove in Cas9, in which the first base in the


PAM sequence (which can be any nucleotide) remains base paired with its nucleobase counterpart but does not interact with Cas9 (REFS 85,86,109). The adjacent GG dinucleotide on the non-target strand of the PAM duplex forms base-specific hydrogen-bonding interactions via the major groove with two respective arginine residues (Arg1333 and Arg1335) located in a β-hairpin of the PAM-interacting domain of SpCas9 (REFS 85,86,109) (FIG. 3a). Although naturally occurring differences in sequence and location of PAM-interacting residues explain how diverse PAM sequences are recognized by Cas9 orthologues90,91,93, engineered residue modifications in the PAM-interacting domain have been shown to alter PAM specificities109,110. Structural studies of engineered SpCas9 PAM variants support an ‘induced fit’ mechanism, whereby recognition of non-canonical PAM sequences requires subtle DNA backbone distortion of the PAM duplex without changing the conformation of the PAM-interacting β-hairpin85,109. In particular, mutation of a threonine to an arginine residue (Thr1337Arg) enabled recognition of a guanine base in the fourth position of the varied PAM (5ʹ-NGNG-3ʹ)109.

DNA melting
The process of DNA strand separation that does not require an external energy source such as ATP.

For SpCas9, both the PAM duplex-bound structure and the double-stranded DNA (dsDNA)-bound structure suggest that PAM–Cas9 interactions trigger local structural changes that destabilize the adjacent DNA duplex base pairing and facilitate hybridization of the guide RNA to the target DNA85,86. In the PAM duplex-bound SpCas9 structure, a sharp kink is observed in the target DNA strand immediately upstream of the PAM, with the connecting phosphodiester group (referred to as +1 phosphate) stabilized by a phosphate lock-loop (facilitated by Lys1107 and Ser1109) located in the PAM-interacting domain85 (FIG. 3a). Superimposition of the guide RNA-bound SpCas9 structure with the PAM duplex-bound SpCas9 structure shows that this phosphate lock-loop moves inwards towards the central nucleic acid recognition channel on binding to target DNA83,85. Interactions between the phosphate lock-loop of SpCas9 and the +1 phosphate on PAM recognition have been proposed to aid DNA melting and stabilization of the guide RNA–target DNA hybrid85,86. However, mutation of the phosphate lock-loop residues in SpCas9 did not reduce endonucleolytic cleavage activities on a complementary target DNA85. Although the phosphate lock-loop is probably not involved in DNA duplex destabilization, a recent study from 2017 suggested that these residues instead promote initial pairing of the guide RNA with the target strand111. Furthermore, the lack of sequence similarity of the phosphate lock-loop among Cas9 orthologues suggests that initial DNA unwinding might require additional elements beyond the phosphate lock-loop residues85,90,91,93,111.

The dsDNA-bound SpCas9 structure also suggests how PAM recognition triggers local DNA unwinding. As observed in the PAM duplex-bound SpCas9 structure, the target DNA strand kinks at the +1 phosphodiester linkage and then pairs with the guide RNA segment to form a pseudo-A form RNA–DNA hybrid85,86. By contrast, the non-target DNA strand threads into a tight

[Figure 3 schematic. Each panel shows the guide RNA, the target strand, the non-target strand and the PAM, with numbered nucleotide positions and the interacting residues. Panels: a, SpCas9 (PAM NGG; R1333, R1335, K1107 and S1109; phosphate lock-loop); b, AsCas12a (PAM TTTN; K548, M604, T539, K607 and T167); c, AaCas12b (PAM TTN; N400, N144, R122 and G143).]

Figure 3 | PAM recognition and sequence-specific target identification. a | Streptococcus pyogenes Cas9 (SpCas9; Protein Data Bank identifier: 4UN3) recognizes a 5ʹ-NGG-3ʹ protospacer adjacent motif (PAM) sequence, where N signifies any nucleotide, located downstream of the DNA target sequence. Hydrogen bonding interactions between the crucial arginine (Arg1333 and Arg1335) residues and the guanine dinucleotide (GG) on the non-target strand occur through the major groove of the PAM duplex, which induces local structural changes that enable DNA unwinding. The upstream (+1) phosphate on the target strand is stabilized by the Ser1109 and Lys1107 residues of the PAM-interacting domain of SpCas9, which make up a phosphate lock-loop that has been suggested to facilitate guide RNA–target DNA hybridization. b | Acidaminococcus spp. Cas12a (AsCas12a; PDB ID: 5B43) recognizes a 5ʹ-TTTN-3ʹ PAM located upstream of the DNA target sequence. The crucial Lys607 residue of AsCas12a forms hydrogen bonds with the thymine (–2*) and adenine (–3) on the minor groove of the PAM duplex, whereas the conserved Lys548, Met604, Thr167 and Thr539 residues form hydrogen bonds or van der Waals interactions with nucleotides on both strands of the PAM duplex. c | Alicyclobacillus acidoterrestris Cas12b (AaCas12b; PDB ID: 5U30) recognizes a 5ʹ-TTN-3ʹ PAM located upstream of the DNA target sequence. The key
tunnel located within the nuclease lobe towards the
arginine (Arg122) and asparagine (Asn144 and Asn400) residues also reach into the minor
groove of the PAM to form hydrogen bonds with A–T base pairs within the PAM duplex. RuvC active site of SpCas9. The PAM-proximal non-
The target strand and the non-target strand (*) of the DNA duplex are labelled, target DNA strand is stabilized by extensive hydropho-
phosphate groups within the target sequence are labelled with (+) and those outside the bic and van der Waals interactions, in which the first
target sequence are labelled with (–). Note that the polarity of the target sequence for nucleotide upstream of the PAM (referred to as the +1*
Cas12a in part b and Cas12b in part c is opposite to that of Cas9. position) stacks onto the PAM duplex 86. This intra-strand

6 | ARTICLE NUMBER 0078 | VOLUME 1 www.nature.com/natrevchem


© 2017 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.

base stacking might help to stabilize the PAM duplex and thereby to facilitate PAM recognition by SpCas9 through base-specific interactions with the GG dinucleotide. The sharp kinks and flipped bases include nucleotides at positions +1 on the target strand and +2* and +3* on the non-target strand, which are mainly exposed to bulk solvent86 (FIG. 3a). The orientation of these nucleotides, which function as the nucleation site to initiate target binding, might explain how Cas9 samples the DNA adjacent to the PAM sequence for guide RNA complementarity and RNA strand invasion following PAM recognition.

PAM recognition in type V systems. In contrast to SpCas9, the PAM for Cas12 proteins contains a thymine (T)-rich sequence and is located at the 5ʹ end of the target sequence36,58 (FIG. 3b,c). The crRNA–target–AsCas12a crystal structure revealed a distorted PAM duplex with a narrow minor groove, which functions as the location for base-specific contacts with amino acid side chains via van der Waals and hydrogen bond interactions97,98. In particular, the critical Lys607 residue of AsCas12a forms a hydrogen bond with thymine at the −2* and adenine at the −3 position of the 5ʹ‑TTTN‑3ʹ PAM, whereas the neighbouring Met604 and Pro599 residues interact with the −2 adenine base pair97,98 (FIG. 3b). Hydrogen bonding between amino acid side chains and the PAM duplex also occurs through the minor groove in the crRNA–target–AaCas12b crystal structure, which recognizes the 5ʹ‑TTN‑3ʹ PAM sequence upstream of the complementary DNA target95 (FIG. 3c). For both Cas12a and Cas12b, mutation of residues involved in base-specific contacts with the PAM duplex markedly reduced or abolished endonucleolytic cleavage activity, consistent with the requirement for base specificity in PAM recognition95,98. However, a structure-guided mutagenesis screen successfully identified amino acid substitutions in the PAM-interacting domain of LbCas12a, which conferred cleavage activities at non-canonical PAM sequences, thus showing that Cas12 proteins can be engineered to alter PAM specificity112–114.

R‑loop formation and DNA unwinding
Following PAM verification, the RNP surveillance complex interrogates a complementary dsDNA target by RNA strand invasion. Base pairing of the crRNA spacer sequence with the DNA target strand displaces the non-target strand and forms an R‑loop structure, the stability of which is crucial for permitting target cleavage86,115–117 (FIG. 4). The observation that mismatches between the guide RNA and the target DNA directly upstream of the PAM diminished Cas9 binding and cleavage of the target DNA provided initial evidence for a ‘seed region’ that functioned as the nucleation site for R‑loop formation21,22,103,105. Substrate competition experiments with SpCas9 also showed that extending the guide RNA–target DNA sequence complementarity beyond the first 10–12 bp of the target site promoted stepwise RNA–DNA heteroduplex formation and disfavoured re‑annealing of the dsDNA duplex103. Additional studies have since confirmed the importance of the seed region, thus supporting a model for unidirectional R‑loop propagation towards the PAM-distal end of the target sequence103,118–120 (FIG. 4).

RNA strand invasion
A process in which the guide RNA segment interrogates a double-stranded DNA (dsDNA) target to initiate DNA unwinding and to form an RNA–DNA heteroduplex.

R‑loop
An RNA–DNA heteroduplex and a displaced DNA strand, which is the end result of RNA strand invasion by Cas ribonucleoprotein (RNP) complexes.
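The PAM placement rules discussed in this section (a 5ʹ‑NGG‑3ʹ PAM downstream of the target for SpCas9, and a 5ʹ‑TTTN‑3ʹ PAM upstream of the target for Cas12a) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, a 20-nucleotide target length is assumed for both enzymes, the input is read as the non-target strand written 5ʹ to 3ʹ, and the reverse-complement strand is not searched.

```python
import re
from typing import List, Tuple

# (target start index, target sequence, PAM) on the non-target strand, 5'->3'
Site = Tuple[int, str, str]


def find_spcas9_sites(non_target: str, target_len: int = 20) -> List[Site]:
    """Find targets followed (3' side) by a 5'-NGG-3' PAM.

    Assumption: target_len = 20 nt; only the given strand is scanned.
    """
    seq = non_target.upper()
    sites: List[Site] = []
    # Zero-width lookahead so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        start = pam_start - target_len
        if start >= 0:  # need a full-length target upstream of the PAM
            sites.append((start, seq[start:pam_start], match.group(1)))
    return sites


def find_cas12a_sites(non_target: str, target_len: int = 20) -> List[Site]:
    """Find targets preceded (5' side) by a 5'-TTTN-3' PAM.

    Assumption: target_len = 20 nt; only the given strand is scanned.
    """
    seq = non_target.upper()
    sites: List[Site] = []
    for match in re.finditer(r"(?=(TTT[ACGT]))", seq):
        start = match.start() + 4  # target begins right after the 4-nt PAM
        end = start + target_len
        if end <= len(seq):  # need a full-length target downstream of the PAM
            sites.append((start, seq[start:end], match.group(1)))
    return sites
```

For example, `find_spcas9_sites("A" * 20 + "TGG")` reports a single site whose target starts at index 0, whereas `find_cas12a_sites("TTTC" + "ACGT" * 5)` reports a single site whose target starts at index 4. The opposite placement of the PAM in the two functions mirrors the opposite target polarity noted for Cas9 and Cas12.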

[Figure 4 graphic. Panels: a, Cas9; b, Cas12. Stages shown: target scanning, PAM recognition, local DNA unwinding, R‑loop formation. Labels: effector protein, sgRNA, PAM, dsDNA target.]

Figure 4 | Model for unidirectional DNA unwinding and R‑loop formation in Cas9 and Cas12 interference proteins. Protospacer adjacent motif (PAM) sequence recognition induces guide RNA–double-stranded DNA (dsDNA) target hybridization and unidirectional DNA unwinding from the 3ʹ−5ʹ direction in Cas9 proteins (part a) or from the 5ʹ−3ʹ direction in Cas12 (part b) to form the R‑loop structure. In the presence of mismatches in the seed region (the region adjacent to the PAM), RNA strand invasion is non-productive and the ribonucleoprotein (RNP) surveillance complex rapidly dissociates from the dsDNA substrate. PAM-distal mutations still enable rapid dsDNA–RNP complex association rates and productive RNA strand invasion, but the lack of dsDNA–guide RNA complementarity at the PAM-distal end alters R‑loop stability, which favours re‑annealing of the DNA strands and thus decreases the cleavage rate. Target recognition and cleavage therefore depend on stable R‑loop formation, which probably involves key protein interactions with the entire length of the RNA–DNA heteroduplex to activate Cas9 or Cas12 nuclease activity. sgRNA, single-guide RNA.
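The position dependence summarized in this model can be made concrete with a toy calculation. The Python sketch below treats R‑loop growth as a one-dimensional random walk that starts with one base pair formed at the PAM-proximal end and either reaches full length or collapses. It is a deliberately simplified illustration, not a model taken from the studies cited here: the function names, the 20 bp length and the backward-to-forward rate ratios (0.5 at matched positions, 20 at a mismatch) are all assumed values. It also describes only whether a full R‑loop forms, not how stable that R‑loop is afterwards.

```python
from typing import Iterable, List


def backward_forward_ratios(
    length: int,
    mismatches: Iterable[int],
    match_ratio: float = 0.5,      # assumed: extension favoured at matched positions
    mismatch_ratio: float = 20.0,  # assumed: retreat favoured at a mismatch
) -> List[float]:
    """Backward/forward rate ratio for each intermediate R-loop length.

    State j (1 .. length-1) means j base pairs are formed; the next forward
    step would form pair j+1, counted from the PAM-proximal end. Position 1
    is assumed to be already paired, so a mismatch there is not modelled.
    """
    mismatch_set = set(mismatches)
    return [
        mismatch_ratio if (j + 1) in mismatch_set else match_ratio
        for j in range(1, length)
    ]


def completion_probability(ratios: Iterable[float]) -> float:
    """Probability of reaching full length before collapsing to zero.

    Standard result for a birth-death chain started in state 1:
    P = 1 / (1 + sum_k prod_{j<=k} ratio_j).
    """
    total = 1.0
    running = 1.0
    for ratio in ratios:
        running *= ratio
        total += running
    return 1.0 / total


perfect = completion_probability(backward_forward_ratios(20, []))
seed_mismatch = completion_probability(backward_forward_ratios(20, [3]))
distal_mismatch = completion_probability(backward_forward_ratios(20, [18]))
```

With these assumed parameters, `seed_mismatch` is roughly tenfold lower than `perfect`, whereas `distal_mismatch` is almost unchanged. In this toy picture a barrier near the PAM is met before any stabilizing base pairs have accumulated, so the walk usually collapses; a barrier near the PAM-distal end is met only after most of the heteroduplex is in place. This is consistent with the qualitative statement that seed mismatches make strand invasion non-productive whereas PAM-distal mismatches still permit it.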


Kinetic inhibition
A model in the context of RNA strand invasion that explains how impeding the rate of R‑loop formation reduces Cas9 cleavage activity.

Kinetic Monte Carlo analyses
Reveal the time evolution of a given process (that is, stability of the R‑loop over time with mismatched substrates or single-guide RNA (sgRNA) variants).

HNH nuclease domain
Contains a histidine–asparagine–histidine (HNH) motif and is the nuclease within Cas9 that hydrolyses the target DNA strand through a one metal ion mechanism.

As dsDNA unwinding by Cas9 is ATP independent21, RNA strand invasion also depends on duplex distortion on PAM binding to induce local DNA melting86,111,121, which is reminiscent of the DNA bending process observed in type I-E and I-C Cascade complexes106,122–124. A cryo-electron microscopy (cryo‑EM)-based reconstitution of SpCas9 containing a complete R‑loop showed a ~30° bend angle between the protruding dsDNA ends86, which is thought to stabilize the R‑loop structure by enabling strand separation and reducing torsional stress121,125,126. Although this static model provided a snapshot of the full R‑loop before target cleavage, questions regarding R‑loop initiation could be more directly addressed by capturing the PAM-bound (pre-strand invasion) Cas9–sgRNA–dsDNA complex, the structure of which has yet to be elucidated. Notably, the structurally compact Cas9 orthologues in type II-C systems have been shown to lack robust DNA-unwinding capabilities, which suggests that kinetic inhibition of the initial RNA strand invasion step occurs in these smaller variants82.

In silico analyses based on thermodynamic parameters for dsDNA duplex and DNA–RNA heteroduplex stabilities have offered insight into the propensity for Cas9 R‑loop formation. On the basis of kinetic Monte Carlo analyses (KMC analyses) using nearest neighbour free-energy parameters, simulations showed that the guide RNA completely and stably invades a complementary 20‑nucleotide DNA target, independently of Cas9 (REF. 127). By comparing KMC simulations with mismatches between the guide RNA and target DNA to reported SpCas9 activity at off-target sites, KMC analysis showed a correlation between guide RNA stability around the 14–17 nucleotide position and SpCas9 off-target cleavage, which emphasizes that the R‑loop is capable of traversing mismatches given transient stability at this crucial position71,127. As R‑loop stability was a strong predictor of cleavage rate compared with DNA–RNA binding energies or the position of mismatches alone, understanding the kinetics of RNA strand invasion proved valuable for anticipating off-target cleavage127. It is worth noting that all KMC experiments were carried out with an initial R‑loop length of 10 bp, which assumed strand invasion beyond the seed region, and were therefore unable to test de novo R‑loop formation127. A separate biophysical model predicted that SpCas9 was much more likely to spontaneously dissociate than to form a stable R‑loop when mutations in the target sequence were present, which suggests that target search involves many rounds of PAM recognition, local unwinding and initial strand displacement and dissociation before sufficient target DNA–guide RNA complementarity is found, which drives R‑loop formation108,128 (FIG. 4a).

To experimentally understand the energetics of DNA unwinding and the effects of mismatches between the guide RNA and target DNA on RNA strand invasion, a few single-molecule studies have also probed the real-time kinetics of SpCas9 R‑loop formation. Single-molecule DNA supercoiling experiments have shown that PAM mutations regulate R‑loop formation through kinetic inhibition, whereas PAM-distal mutations only affect R‑loop stability116. Consistent with the target search model that involves many rounds of R‑loop formation and dissociation before interrogation of a complementary target, R‑loops were not observed with targets containing <11 bp of PAM-proximal complementarity, which suggests that Cas9 is capable of sensing along the entire RNA–DNA heteroduplex116. A separate single-molecule FRET (smFRET) study measuring R‑loop conformations with a complementary DNA target observed two distinct structural states at the PAM-distal end, which shows more heterogeneity than was previously expected81. These studies have contributed greater complexity to the simple model of unidirectional R‑loop formation.

The observation that PAM-distal mutations have a marked effect on R‑loop stability and observed cleavage rates without altering the rate of R‑loop formation emphasizes the crucial role of target recognition by the effector protein116. Although Cas9 is more tolerant of PAM-distal mismatches than seed region mutations129, several lines of evidence support a model for DNA–RNA heteroduplex sensing at the PAM-distal end. Cas9 might use an R‑loop ‘locking’ mechanism, during which protein domains reorganize to stabilize the R‑loop structure116; this would require additional proofreading steps in addition to simple base pairing to ensure fidelity of target recognition and cleavage. Small fluctuations in R‑loop stability at the PAM-distal end induced by target mismatches have a considerable effect on cleavage activity129, and increasing the number of PAM-distal mismatches impedes conformational activation of the HNH nuclease domain (histidine–asparagine–histidine nuclease domain)89. Furthermore, comparison of the pre-target-bound and ssDNA-bound RNP complex structures showed a dramatic conformational change when SpCas9–sgRNA binds to a target DNA strand, even without the PAM-containing non-target strand83,84. On the basis of these crystal structures, R‑loop sensing probably involves key residues in the α-helical lobe of SpCas9 that interact with the DNA–RNA heteroduplex84–86,111 to enable further Cas9 conformational rearrangements to trigger target cleavage.

R-loop formation in type V systems. Although the determinants of R‑loop formation and target recognition for Cas12 effector proteins are not as well understood as for SpCas9, existing experimental and structural evidence suggests that Cas12a and Cas12b also initiate RNA strand invasion at a 5ʹ PAM-proximal seed region (FIG. 4b). Single mismatches between the FnCas12a crRNA and target DNA within positions +1 to +5 reduced in vitro target cleavage activity, whereas single mismatches between the AaCas12b sgRNA and target DNA within positions +1 to +18 abrogated cleavage58,94. Cell-based activity assays also showed that AsCas12a and LbCas12a were highly intolerant of single mismatches between the crRNA and target DNA within nucleotide positions +2 to +6 and +11 to +18 (REF. 130). Interestingly, PAM-distal mismatches at the 3ʹ end beyond position +20 had no effect on cleavage for AsCas12a and LbCas12a130, which was consistent with target-bound crystal structures in


which the last four PAM-distal nucleotides on the target strand were not duplexed with the crRNA97,98. However, single mismatches at positions before +20 markedly reduced cleavage activity, which suggests that Cas12a target recognition might also require PAM-distal sensing through α-helical rearrangements to achieve high fidelity cleavage58,130,131.

Nucleic acid target cleavage
Cleavage of the nucleic acid target marks the final step in the CRISPR–Cas interference pathway. Precise cutting requires tight coupling between the processes of target recognition and catalytic activation, which is typically achieved through conformational rearrangements in the effector proteins in order to properly orient one or more active sites with the desired substrate132.

Cas9 nuclease domains and activity. For Cas9, generating a blunt dsDNA break requires coordination of the two catalytic nuclease domains, the HNH domain and the RuvC domain, which cleave the target strand and the non-target strand, respectively21 (FIG. 5a,d). Notably, Cas9 is considered a single-turnover enzyme; after the DNA substrate has been cleaved, the target strand remains base paired to the guide RNA and the protein is unable to bind additional targets for subsequent reactions103. Cas9 hydrolyses the scissile phosphates of the target strand and the non-target strand via a one metal ion mechanism in the HNH nuclease domain and a two metal ion mechanism in the RuvC nuclease domain133–136.

The HNH domain is characterized by the conserved ββα-fold and its catalytic pocket consists of three active site residues (Asp839, His840 and Asn863 in SpCas9)83–85,137. As the HNH domain structure has yet to be elucidated in its active conformation, details of the cleavage process remain elusive and are inferred based on homology to other well-studied systems, such as the Holliday junction resolvase phage T4 endonuclease VII and the nonspecific periplasmic Vibrio vulnificus nuclease84,138,139. By homology modelling, the active site residues Asp839 and Asn863, together with the oxygen atoms of the target strand scissile phosphate, coordinate a magnesium ion (Mg2+) within the catalytic centre of the HNH domain. The His840 residue functions as a general base to activate a water molecule, which acts as a nucleophile to attack the electrophilic +3 phosphate on the target strand and generates the 5ʹ phosphate and 3ʹ hydroxyl (3ʹ‑OH) products21,84,134 (FIG. 5b). Although the HNH active site requires Mg2+ for catalysis21,133–135, divalent cations (including Mg2+, Ca2+ and Co2+) have also been shown to stabilize the HNH domain in its active conformation140,141.

The RuvC nuclease domain, which shares an RNase H‑like fold with retroviral integrases142–144, consists of four active site residues (Asp10, Glu762, His983 and Asp986 in SpCas9)83–85. A crystal structure containing the non-target strand revealed the presence of the −3 phosphate near the RuvC catalytic centre86. Although no metals were present in this structure, the non-target strand-containing RuvC domain aligned almost perfectly with a separate manganese(II) ion (Mn2+)-bound RuvC apoprotein structure86,137. Superimposition of the Mn2+ ions into the RuvC active site predicted a metal coordination distance of ~5.5 Å between the Mn2+ ion and the non-bridging oxygen atom on the −3 phosphate, which was substantially longer than the 2.1 Å Mg–O coordination distance typically observed for catalysis86,137. However, the putative cleavage mechanism involves two metal ion coordination by Asp10, Glu762 and Asp986 and the oxygen atoms of the scissile phosphate, which makes the phosphate susceptible to nucleophilic attack by the His983‑activated water molecule21,84,142–144 (FIG. 5c). To model how metal ions bridge the gap between the RuvC active site and the non-target strand scissile phosphate, a molecular dynamics simulation (MD simulation) suggested that introducing two Mg2+ ions at the −4 phosphate led to the largest decrease in distance between the scissile phosphate and the Asp10 and His983 active site residues145. The MD simulations also showed that Mg2+ binding was more stable at the −4 phosphate, providing in silico evidence that Cas9 might instead generate a 1 bp 5ʹ staggered end145. Although Cas9 has been biochemically shown to cleave the non-target strand at the −3 phosphate18,21, in vitro Cas9‑digested whole-genome sequencing (Digenome-seq) has shown cutting at both the −3 and the −4 phosphates146. Notably, previous biochemical studies have also shown that the non-target strand is trimmed by 3ʹ−5ʹ exonuclease activity of the Cas9 RuvC nuclease following the initial cleavage event21,86, which might contribute to the observed staggered cut. Although the MD simulations provide a putative mechanism for RuvC-dependent trimming of the non-target strand, the elucidation of the preferred −4 phosphate cleavage site requires further experimental evidence.

Apoprotein
The inactive, unbound state of the protein.

Molecular dynamics simulation
(MD simulation). Computer simulations that capture the time evolution of atomic and/or molecular systems by numerically solving Newton’s equations of motion.

Cas9‑digested whole-genome sequencing
An in vitro method for detecting Cas9 cleavage sites within genomic DNA using whole-genome sequencing.

Allosteric recognition and targeting specificity. Crystal structures of SpCas9 in various substrate-bound states have also shown a large, rigid-body translation and rotation by the HNH domain towards the target strand83–86,137, the direct interaction of which with the scissile phosphate has yet to be captured structurally (FIG. 2a). Biochemical and bulk FRET experiments first showed that the conformational state of the HNH domain controls DNA cleavage activity89, and a smFRET study later showed that the HNH domain occupies a stably ‘docked’ conformation when bound to a complementary DNA substrate140. However, when PAM-distal mismatches are present, smFRET experiments revealed that the HNH domain transitions towards a catalytically inactive ‘conformational checkpoint’ (REF. 140), thus explaining how Cas9 cleaves only a subset of sequences to which it binds104,119,147. Although the HNH domain is sensitive to mismatches along the RNA–DNA heteroduplex89,140, the domain itself does not directly interact with PAM-distal nucleic acids on the RNA–DNA heteroduplex83–86, alluding to a mechanism of allosteric recognition that controls HNH conformational activation148.

The role of allosteric effects on Cas9 conformational activation has been shown in several steps of target recognition and cleavage86,89,145,148–150. MD simulations showed that PAM binding induces an ‘open to closed’ conformational transition that reorients the domains


[Figure 5 graphic. a | SpCas9 domain organization: recognition lobe (REC1, REC2 (regulator), REC3 (sensor)), nuclease lobe (HNH in inactive and active conformations, RuvC, PAM-interacting domain), sgRNA, target strand and non-target strand. b | HNH active site (D839, H840, N854, N863; Mg2+). c | RuvC active site (D10, E762, H983, D986; two Mn2+). d | Cleavage by Cas9 (dsDNA, PAM), Cas12 (dsDNA, PAM) and Cas13 (RNA, PFS; cis and trans).]


◂ Figure 5 | Nuclease domain structure and metal-dependent nucleic acid target cleavage. a | Crystal structure of
Streptococcus pyogenes Cas9 (SpCas9; Protein Data Bank identifier: 5F9R) bound to its single-guide RNA (sgRNA) and
target double-stranded DNA (dsDNA), depicting the major domains of the recognition (REC) and nuclease lobes. SpCas9
undergoes a large conformational change on dsDNA target binding in order to position the histidine–asparagine–
histidine (HNH) nuclease domain in the cleavage-competent, active conformation and simultaneously to trigger RuvC
(RNase H‑like fold) nuclease domain activity. The HNH and the RuvC domains of the nuclease lobe cleave the target strand
and the non-target strand of the dsDNA target, respectively. Although existing crystal structures have failed to capture
the active site residues that are engaged with the DNA substrate, details of the Cas9 cleavage mechanism have been
inferred on the basis of homology to known nucleases that contain the conserved HNH or RuvC domains. b | A structural
alignment of the SpCas9 HNH domain (PDB ID: 5F9R) with the phage T4 endonuclease VII HNH domain (PDB ID: 2QNC)
bound to its target strand scissile phosphate supports a one metal ion mechanism involving magnesium ion (Mg2+)
coordination at the active site (top panel). In the chemical depiction, the polar residues (Asn863, Asp839 and Asn854) help
to coordinate the Mg2+ ion in the catalytic site. The His840 residue functions as a general base to activate a water molecule
for nucleophilic attack on the Mg2+-coordinated scissile phosphate, generating the 5ʹ phosphate and 3ʹ hydroxyl (3ʹ-OH)
ends on the cleaved DNA backbone (bottom panel). c | A structural alignment of the non-target strand-bound SpCas9
RuvC domain (PDB ID: 5F9R) with the manganese(II) ion (Mn2+)-bound apoprotein (apo) SpCas9 RuvC domain (PDB ID:
4CMQ) shows a putative two metal ion cleavage mechanism (top panel). In the chemical depiction, the acidic residues
(Glu762, Asp10 and Asp986) help to coordinate the two Mn2+ ions in the active site. The His983 residue activates a water
molecule to attack the electrophilic scissile phosphate, generating the 5ʹ phosphate and 3ʹ-OH ends on the cleaved DNA
products (bottom panel). d | Although the general mechanism of phosphodiester hydrolysis is consistent across class 2
interference proteins, cleavage patterns are distinct among Cas9 (blunt end), Cas12 (5ʹ staggered overhang) and Cas13
(promiscuous cis- and trans-cleavage) systems. For Cas9, dsDNA cleavage is accomplished by the concerted activation of
the HNH and RuvC nucleases, which respectively cut the target strand and the non-target strand upstream of the
protospacer adjacent motif (PAM) sequence. For Cas12, the RuvC nuclease is independently responsible for cleaving both
the target strand and the non-target strand of the dsDNA target downstream of the PAM, but the precise cleavage
mechanism remains unknown. For Cas13, binding to a complementary single-stranded RNA (ssRNA) sequence adjacent
to a protospacer flanking sequence (PFS; analogous to PAM) activates the higher eukaryotes and prokaryotes
nucleotide-binding RNase (HEPN) domain nuclease to indiscriminately cleave the bound RNA target (cis-cleavage) and
other ssRNAs in solution (trans-cleavage).
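The difference between a blunt and a staggered Cas9 cut can be sketched in a few lines of Python. This is a bookkeeping illustration only, under stated assumptions: the function name and its parameters are hypothetical, the input is the non-target strand written 5ʹ to 3ʹ, and cut sites are expressed as a number of base pairs upstream of the PAM. The default of 3 bp on both strands gives the blunt end described for Cas9; setting the non-target offset to 4 is an assumed way of representing the alternative non-target strand cut discussed in the text, and the mapping between phosphate labels and these offsets is itself an assumption.

```python
from typing import Dict


def cas9_cut(
    non_target: str,
    pam_start: int,
    target_offset: int = 3,      # bp upstream of the PAM (target strand cut)
    non_target_offset: int = 3,  # bp upstream of the PAM (non-target strand cut)
) -> Dict[str, object]:
    """Describe the cut made upstream of a PAM starting at index pam_start.

    Both cut indices are given in non-target strand coordinates: a cut at
    index i separates non_target[:i] from non_target[i:].
    """
    if min(target_offset, non_target_offset) < 0:
        raise ValueError("offsets must be non-negative")
    if pam_start > len(non_target):
        raise ValueError("PAM start lies outside the sequence")
    if pam_start - max(target_offset, non_target_offset) < 0:
        raise ValueError("cut site lies outside the sequence")

    target_cut = pam_start - target_offset
    non_target_cut = pam_start - non_target_offset
    return {
        "target_cut_index": target_cut,
        "non_target_cut_index": non_target_cut,
        "non_target_fragments": (
            non_target[:non_target_cut],
            non_target[non_target_cut:],
        ),
        "stagger": abs(target_cut - non_target_cut),
        "end_type": "blunt" if target_cut == non_target_cut else "staggered",
    }
```

For a 20-nucleotide target followed by a PAM at index 20, the default call places both cuts at index 17 and reports a blunt end, whereas `non_target_offset=4` moves the non-target strand cut to index 16 and reports a 1-nucleotide stagger.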

of SpCas9 for target recognition150. On formation of the R‑loop structure, recognition of the RNA–DNA heteroduplex by the REC3 domain (within the recognition lobe) was shown to allosterically regulate global Cas9 conformational changes by communicating with the REC2 domain, which sterically restricts or permits HNH domain access to the scissile phosphate148. The displacement of the non-target strand was also crucial for activating and stabilizing HNH docking in the active conformation140,145. In addition, conformational activation of the HNH domain exerts allosteric control over the RuvC nuclease via two α-helical linkers (L1 and L2) that connect the nuclease domains; these were proposed to function as allosteric transducers to mediate concerted DNA cleavage of the target strand and non-target strand89. A dsDNA-bound crystal structure of SpCas9 also showed major structural rearrangements of L1 and L2 to facilitate HNH conformational activation in the pre-cleavage state86. Furthermore, MD simulations indicated how the structural plasticity of Cas9 mediates allosteric crosstalk between the REC lobe, HNH and RuvC domains to achieve the catalytically competent structural conformation145,149,150. However, the precise mode of communication from REC3 to the HNH nuclease domain remains unknown and warrants further investigation.

Despite conformational regulation, SpCas9 is still capable of cleaving sequences that resemble the complementary target, as detected by several unbiased genome-wide off-target methods104,147,151–153. The mechanism of off-target cleavage correlates with transient R‑loop stability and HNH docking89,127, but off-target interrogation also depends on the number of PAM sequences within the genome and the epigenetic context in mammalian cells71,107,154,155. Avoiding off-target cleavage is an ongoing challenge for using Cas9 in target-specific applications and existing strategies for reducing off-target effects include direct RNP delivery (by titrating the Cas9:sgRNA concentration)156,157, sgRNA modifications (by lengthening and/or truncating the 5ʹ end)74,158 or protein engineering (by introducing paired nickases, dimerization or point mutations)148,159–164, some of which have been extensively reviewed165–167. With respect to truncating the sgRNA and introducing alanine substitutions at the protein–nucleic acid interface, the underlying rationale for improving specificity focused on weakening Cas9–target interactions74,163,164. Although these manipulations reduced the number of detectable off-target events compared with the wild-type SpCas9–sgRNA complex in cultured cells74,163,164, it was biochemically shown that the cleavage specificity of high-fidelity Cas9 variants does not depend on target binding affinity148. Rather, smFRET experiments have shown that these Cas9 variants have an altered threshold for HNH conformational activation when bound to DNA substrates and that mutation of residues within REC3 that are involved in target recognition can alter the specificity of target cleavage148. A kinetic model was also proposed to maximize RNP target discrimination by simultaneously increasing the RNP dissociation rate and decreasing the cleavage rate of off-target dsDNA sequences168; however, the missing emphasis on maintaining and improving on‑target efficiency remains an important caveat of these predictions. Understanding how these high-fidelity strategies affect R‑loop stability also remains an important unanswered question that could improve targeting specificity.


Target cleavage in type V and type VI systems. Crystal structures of Cas12a and Cas12b bound to both the crRNA and dsDNA target have offered greater insight into the target cleavage mechanism of Cas12 effector proteins59,94–98. These RNA-guided, dsDNA-targeting proteins contain the conserved RuvC nuclease domain36, but several features indicate a DNA-targeting mechanism that is distinct from that of Cas9. The cleavage reaction for both Cas12a and Cas12b generates a staggered cut with a 2–8 nucleotide 5ʹ overhang58,59,94,95 (FIG. 5d). A crRNA–target-bound structure of FnCas12a showed that the target strand scissile phosphate is located in a DNA duplex downstream of the 3ʹ end of the crRNA, which was speculated to undergo further unwinding through an unknown mechanism to facilitate target strand cleavage59. Notably, two genome-wide studies of Cas12a specificity reported substantially fewer off-target cleavage events and slightly lower on‑target efficiency compared with SpCas9 (REFS 130,146), which suggests different mechanisms for target recognition or conformational activation. Furthermore, Cas12a and Cas12b lack an HNH nuclease domain or additional protein domain with detectable structural or sequence homology to known nucleases63,94–98. Nonetheless, a putative nuclease (Nuc) domain that is responsible for cleaving the target strand has been assigned to both Cas12a and Cas12b on the basis of its proximity to the target strand cleavage site; this remains controversial owing to a paucity of evidence for a bona fide active site and a lack of structural similarity with other nucleases (and even between Cas12 effector proteins)94–98. Only a single point mutation (Arg1226Ala) selectively inhibited target strand cleavage by AsCas12a98; however, this residue is located within a linker between the RuvC and Nuc domains and therefore does not conclusively represent a catalytic residue within the Nuc domain. No catalytic residue specific for target strand cleavage has yet been identified for Cas12b. However, for both Cas12a and Cas12b, a single RuvC point mutation was shown to abrogate cleavage of both dsDNA strands58,59,94,95,98.

Following RuvC mutational analysis and the observation that an extended single-stranded segment of the target strand folds back to the RuvC active site in target-bound AaCas12b structures95, the RuvC catalytic pocket was suggested to be responsible for cleaving both the target strand and the non-target strand via separate cleavage events59,95. A subsequent FnCas12a study was unable to identify any mutations within the Nuc domain that selectively inhibited target strand cleavage, thereby eliminating the Nuc domain as a bona fide nuclease59. However, the model of target strand and non-target strand cleavage by a single active site in Cas12 effectors still warrants further investigation, as questions regarding the kinetics of strand scission and structures of the cleavage-competent conformational state remain unanswered.

The functionally validated Cas13a and Cas13b proteins recognize and cleave RNA by a mechanism that is distinct from the previously discussed class 2 effectors. Initial studies with Cas13a showed that crRNA-guided recognition of a complementary ssRNA activates two catalytic HEPN domains that promiscuously cleave the target at a region beyond the crRNA–ssRNA duplex, known as cis-cleavage63,64. Binding of Cas13a to the cognate ssRNA also triggers indiscriminate cleavage of ssRNAs in solution, known as trans-cleavage63,64. This finding has uncovered a mechanism for nonspecific RNase activity triggered by RNA target recognition, which has since been used for RNA detection applications64,169. It was later shown that the Cas13a family is functionally divided into two groups on the basis of crRNA and ssRNA substrate preferences170. In addition to targeted and collateral RNA cleavage, the related Cas13b protein is further regulated by the accessory proteins Csx27 and Csx28, which are separately encoded in the CRISPR–Cas locus and repress and enhance RNase activity, respectively62. A recent crystal structure of the crRNA-bound LshCas13a showed an external HEPN catalytic pocket comprising two active site residues from each HEPN domain65, consistent with the mechanism of cis- and trans-cleavage by a single catalytic site (FIG. 5d). A subsequent target-bound Cas13a structure has also shown that HEPN dimerization is a prerequisite for complex activation63–66. Finally, it has been speculated, but remains unknown, whether type VI CRISPR–Cas systems have an ‘off switch’ against collateral RNA damage by Cas13 effectors to prevent off-target activation of RNase activity.

cis-cleavage
Occurs when the Cas13–CRISPR RNA (crRNA) complex binds a complementary single-stranded RNA (ssRNA) target, which activates the external higher eukaryotes and prokaryotes nucleotide-binding RNase (HEPN) domain catalytic pocket to cleave the bound ssRNA target.

trans-cleavage
Occurs when the Cas13–CRISPR RNA (crRNA) complex binds a complementary single-stranded RNA (ssRNA) target, which activates the external higher eukaryotes and prokaryotes nucleotide-binding RNase (HEPN) domain catalytic pocket to cleave nonspecific ssRNAs in solution.

Conclusions
The unique features of RNA-guided adaptive immunity in CRISPR–Cas systems have motivated the next generation of genome-editing technologies, with Cas9 paving the way for sequence-specific DNA-targeting applications. Although numerous biochemical and structural studies on Cas9 have provided key insights into each step of the DNA interference pathway, efforts to enhance the targeting specificity of the widely used SpCas9 still require a comprehensive understanding of the thermodynamic and kinetic parameters that control target recognition and complex activation. Furthermore, there is growing momentum in discovering and adopting smaller Cas9 orthologues such as SaCas9 and CjCas9 for genome editing; these are compact molecules that could aid in vivo delivery and have distinct specificity profiles147,171,172.

Finally, the discovery of type V and type VI effector proteins is a reminder that the Cas9 family represents only a fraction of the diversity in class 2 CRISPR–Cas systems. Emerging mechanistic studies on these distinct CRISPR–Cas systems have shown both common themes and functional differences, from RNP complex formation to substrate preference and nucleic acid degradation. However, Cas12a is the only interference protein in the type V and type VI systems that has been successfully repurposed for genome editing in mammalian cells63,130,146,173–175. As more systems become optimized for in vivo applications, our understanding of DNA interference at a molecular level becomes increasingly important in order to maximize the potential of diverse programmable endonucleases for novel research tools and innovative therapies.


1. Sorek, R., Kunin, V. & Hugenholtz, P. CRISPR — a widespread system that provides acquired resistance against phages in bacteria and archaea. Nat. Rev. Microbiol. 6, 181–186 (2008).
2. van der Oost, J., Westra, E. R., Jackson, R. N. & Wiedenheft, B. Unravelling the structural and mechanistic basis of CRISPR-Cas systems. Nat. Rev. Microbiol. 12, 479–492 (2014).
3. Makarova, K. S. et al. An updated evolutionary classification of CRISPR−Cas systems. Nat. Rev. Microbiol. 13, 722–736 (2015).
4. Mojica, F. J., Diez-Villasenor, C., Garcia-Martinez, J. & Soria, E. Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements. J. Mol. Evol. 60, 174–182 (2005).
This study uncovers the origin of spacers on the basis of homology to foreign DNA sequences and proposes that CRISPR is involved in specific immunity against MGEs.
5. Bolotin, A., Quinquis, B., Sorokin, A. & Ehrlich, S. D. Clustered regularly interspaced short palindrome repeats (CRISPRs) have spacers of extrachromosomal origin. Microbiology 151, 2551–2561 (2005).
6. Pourcel, C., Salvignol, G. & Vergnaud, G. CRISPR elements in Yersinia pestis acquire new repeats by preferential uptake of bacteriophage DNA, and provide additional tools for evolutionary studies. Microbiology 151, 653–663 (2005).
7. Barrangou, R. et al. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709–1712 (2007).
This is the first study to experimentally show the role of CRISPR systems in bacterial adaptive immunity.
8. Arslan, Z., Hermanns, V., Wurm, R., Wagner, R. & Pul, U. Detection and characterization of spacer integration intermediates in type IE CRISPR−Cas system. Nucleic Acids Res. 42, 7884–7893 (2014).
9. Nunez, J. K., Lee, A. S., Engelman, A. & Doudna, J. A. Integrase-mediated spacer acquisition during CRISPR−Cas adaptive immunity. Nature 519, 193–198 (2015).
10. Jansen, R., Embden, J. D., Gaastra, W. & Schouls, L. M. Identification of genes that are associated with DNA repeats in prokaryotes. Mol. Microbiol. 43, 1565–1575 (2002).
11. Kunin, V., Sorek, R. & Hugenholtz, P. Evolutionary conservation of sequence and secondary structures in CRISPR repeats. Genome Biol. 8, R61 (2007).
12. Brouns, S. J. et al. Small CRISPR RNAs guide antiviral defense in prokaryotes. Science 321, 960–964 (2008).
This study shows that the formation of mature crRNAs containing the spacer and repeat sequences is essential for mediating an antiviral response.
13. Deltcheva, E. et al. CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III. Nature 471, 602–607 (2011).
This study identifies the tracrRNA molecule and shows its role in directing crRNA maturation and antiviral immunity.
14. Carte, J., Wang, R., Li, H., Terns, R. M. & Terns, M. P. Cas6 is an endoribonuclease that generates guide RNAs for invader defense in prokaryotes. Genes Dev. 22, 3489–3496 (2008).
15. Gesner, E. M., Schellenberg, M. J., Garside, E. L., George, M. M. & Macmillan, A. M. Recognition and maturation of effector RNAs in a CRISPR interference pathway. Nat. Struct. Mol. Biol. 18, 688–692 (2011).
16. Hochstrasser, M. L. & Doudna, J. A. Cutting it close: CRISPR-associated endoribonuclease structure and function. Trends Biochem. Sci. 40, 58–66 (2015).
17. Garneau, J. E. et al. The CRISPR−Cas bacterial immune system cleaves bacteriophage and plasmid DNA. Nature 468, 67–71 (2010).
18. Gasiunas, G., Barrangou, R., Horvath, P. & Siksnys, V. Cas9−crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proc. Natl Acad. Sci. USA 109, E2579–E2586 (2012).
19. Wiedenheft, B., Sternberg, S. H. & Doudna, J. A. RNA-guided genetic silencing systems in bacteria and archaea. Nature 482, 331–338 (2012).
20. Marraffini, L. A. & Sontheimer, E. J. CRISPR interference: RNA-directed adaptive immunity in bacteria and archaea. Nat. Rev. Genet. 11, 181–190 (2010).
21. Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821 (2012).
This study characterizes the determinants of DNA cleavage for SpCas9 and establishes programmable targeting using the crRNA–tracrRNA molecule or a chimeric sgRNA.
22. Cong, L. et al. Multiplex genome engineering using CRISPR−Cas systems. Science 339, 819–823 (2013).
23. Jinek, M. et al. RNA-programmed genome editing in human cells. eLife 2, e00471 (2013).
24. Mali, P. et al. RNA-guided human genome engineering via Cas9. Science 339, 823–826 (2013).
25. Rath, D., Amlinger, L., Hoekzema, M., Devulapally, P. R. & Lundgren, M. Efficient programmable gene silencing by Cascade. Nucleic Acids Res. 43, 237–246 (2015).
26. O’Connell, M. R. et al. Programmable RNA recognition and cleavage by CRISPR−Cas9. Nature 516, 263–266 (2014).
27. Gilbert, L. A. et al. Genome-scale CRISPR-mediated control of gene repression and activation. Cell 159, 647–661 (2014).
28. Larson, M. H. et al. CRISPR interference (CRISPRi) for sequence-specific control of gene expression. Nat. Protoc. 8, 2180–2196 (2013).
29. Doudna, J. A. & Charpentier, E. Genome editing. The new frontier of genome engineering with CRISPR‑Cas9. Science 346, 1258096 (2014).
30. Hsu, P. D., Lander, E. S. & Zhang, F. Development and applications of CRISPR‑Cas9 for genome engineering. Cell 157, 1262–1278 (2014).
31. Mali, P., Esvelt, K. M. & Church, G. M. Cas9 as a versatile tool for engineering biology. Nat. Methods 10, 957–963 (2013).
32. Sander, J. D. & Joung, J. K. CRISPR−Cas systems for editing, regulating and targeting genomes. Nat. Biotechnol. 32, 347–355 (2014).
33. Fellmann, C., Gowen, B. G., Lin, P. C., Doudna, J. A. & Corn, J. E. Cornerstones of CRISPR−Cas in drug discovery and therapy. Nat. Rev. Drug Discov. 16, 89–100 (2017).
34. Barrangou, R. & Horvath, P. A decade of discovery: CRISPR functions and applications. Nat. Microbiol. 2, 17092 (2017).
35. Burstein, D. et al. New CRISPR−Cas systems from uncultivated microbes. Nature 542, 237–241 (2017).
36. Shmakov, S. et al. Discovery and functional characterization of diverse class 2 CRISPR−Cas systems. Mol. Cell 60, 385–397 (2015).
This study reports the discovery of three distinct class 2 CRISPR–Cas systems (C2c1, C2c2 and C2c3) and shows that these CRISPR loci contain functional interference proteins.
37. Shmakov, S. et al. Diversity and evolution of class 2 CRISPR−Cas systems. Nat. Rev. Microbiol. 15, 169–182 (2017).
This review provides a comprehensive overview of all class 2 CRISPR–Cas systems discovered to date and discusses their evolutionary origins and relationships.
38. Stern, A. & Sorek, R. The phage-host arms race: shaping the evolution of microbes. Bioessays 33, 43–51 (2011).
39. Koonin, E. V. & Makarova, K. S. CRISPR−Cas: evolution of an RNA-based adaptive immunity system in prokaryotes. RNA Biol. 10, 679–686 (2013).
40. Zhang, J., Kasciukovic, T. & White, M. F. The CRISPR associated protein Cas4 is a 5ʹ to 3ʹ DNA exonuclease with an iron-sulfur cluster. PLoS ONE 7, e47232 (2012).
41. Nam, K. H., Kurinov, I. & Ke, A. Crystal structure of clustered regularly interspaced short palindromic repeats (CRISPR)-associated Csn2 protein revealed Ca2+-dependent double-stranded DNA binding activity. J. Biol. Chem. 286, 30759–30768 (2011).
42. Arslan, Z. et al. Double-strand DNA end-binding and sliding of the toroidal CRISPR-associated protein Csn2. Nucleic Acids Res. 41, 6347–6359 (2013).
43. Kazlauskiene, M., Kostiuk, G., Venclovas, C., Tamulaitis, G. & Siksnys, V. A cyclic oligonucleotide signaling pathway in type III CRISPR−Cas systems. Science 357, 605–609 (2017).
44. Niewoehner, O. et al. Type III CRISPR−Cas systems produce cyclic oligoadenylate second messengers. Nature 548, 543–548 (2017).
45. Mohanraju, P. et al. Diverse evolutionary roots and mechanistic variations of the CRISPR−Cas systems. Science 353, aad5147 (2016).
46. Koonin, E. V., Makarova, K. S. & Zhang, F. Diversity, classification and evolution of CRISPR−Cas systems. Curr. Opin. Microbiol. 37, 67–78 (2017).
47. Sinkunas, T. et al. Cas3 is a single-stranded DNA nuclease and ATP-dependent helicase in the CRISPR−Cas immune system. EMBO J. 30, 1335–1342 (2011).
48. Tamulaitis, G., Venclovas, C. & Siksnys, V. Type III CRISPR−Cas immunity: major differences brushed aside. Trends Microbiol. 25, 49–61 (2017).
49. Samai, P. et al. Co‑transcriptional DNA and RNA cleavage during type III CRISPR−Cas immunity. Cell 161, 1164–1174 (2015).
50. Liu, T. Y., Iavarone, A. T. & Doudna, J. A. RNA and DNA targeting by a reconstituted Thermus thermophilus type III‑A CRISPR−Cas system. PLoS ONE 12, e0170552 (2017).
51. Elmore, J. R. et al. Bipartite recognition of target RNAs activates DNA cleavage by the Type IIIB CRISPR−Cas system. Genes Dev. 30, 447–459 (2016).
52. Estrella, M. A., Kuo, F. T. & Bailey, S. RNA-activated DNA cleavage by the Type IIIB CRISPR−Cas effector complex. Genes Dev. 30, 460–470 (2016).
53. Makarova, K. S., Wolf, Y. I. & Koonin, E. V. Comparative genomics of defense systems in archaea and bacteria. Nucleic Acids Res. 41, 4360–4377 (2013).
54. Makarova, K. S. et al. Evolution and classification of the CRISPR−Cas systems. Nat. Rev. Microbiol. 9, 467–477 (2011).
55. Chylinski, K., Makarova, K. S., Charpentier, E. & Koonin, E. V. Classification and evolution of type II CRISPR−Cas systems. Nucleic Acids Res. 42, 6091–6105 (2014).
56. Heler, R. et al. Cas9 specifies functional viral targets during CRISPR−Cas adaptation. Nature 519, 199–202 (2015).
57. Aravind, L., Makarova, K. S. & Koonin, E. V. Survey and summary: Holliday junction resolvases and related nucleases: identification of new families, phyletic distribution and evolutionary trajectories. Nucleic Acids Res. 28, 3417–3432 (2000).
58. Zetsche, B. et al. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR−Cas system. Cell 163, 759–771 (2015).
59. Swarts, D. C., van der Oost, J. & Jinek, M. Structural basis for guide RNA processing and seed-dependent DNA targeting by CRISPR−Cas12a. Mol. Cell 66, 221–233.e4 (2017).
60. Fonfara, I., Richter, H., Bratovic, M., Le Rhun, A. & Charpentier, E. The CRISPR-associated DNA-cleaving enzyme Cpf1 also processes precursor CRISPR RNA. Nature 532, 517–521 (2016).
This study shows the most streamlined CRISPR–Cas system by demonstrating that Cpf1 is independently capable of both crRNA processing and DNA target interference.
61. Anantharaman, V., Makarova, K. S., Burroughs, A. M., Koonin, E. V. & Aravind, L. Comprehensive analysis of the HEPN superfamily: identification of novel roles in intra-genomic conflicts, defense, pathogenesis and RNA processing. Biol. Direct 8, 15 (2013).
62. Smargon, A. A. et al. Cas13b is a type VIB CRISPR-associated RNA-guided RNase differentially regulated by accessory proteins Csx27 and Csx28. Mol. Cell 65, 618–630.e7 (2017).
63. Abudayyeh, O. O. et al. C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector. Science 353, aaf5573 (2016).
This study validates a novel class of CRISPR interference proteins capable of programmable, RNA-guided RNA cleavage and uncovers its nonspecific RNase activity.
64. East-Seletsky, A. et al. Two distinct RNase activities of CRISPR−C2c2 enable guide-RNA processing and RNA detection. Nature 538, 270–273 (2016).
65. Liu, L. et al. Two distant catalytic sites are responsible for C2c2 RNase activities. Cell 168, 121–134.e12 (2017).
66. Liu, L. et al. The molecular architecture for RNA-guided RNA cleavage by Cas13a. Cell 170, 714–726.e10 (2017).
67. Karvelis, T. et al. crRNA and tracrRNA guide Cas9‑mediated DNA interference in Streptococcus thermophilus. RNA Biol. 10, 841–851 (2013).
68. Chylinski, K., Le Rhun, A. & Charpentier, E. The tracrRNA and Cas9 families of type II CRISPR−Cas immunity systems. RNA Biol. 10, 726–737 (2013).
69. Yang, H., Gao, P., Rajashankar, K. R. & Patel, D. J. PAM-dependent target DNA recognition and cleavage by C2c1 CRISPR–Cas endonuclease. Cell 167, 1814–1828.e12 (2016).
70. Burstein, D. et al. New CRISPR–Cas systems from uncultivated microbes. Nature 542, 237–241 (2017).
71. Hsu, P. D. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat. Biotechnol. 31, 827–832 (2013).


72. Dang, Y. et al. Optimizing sgRNA structure to improve CRISPR−Cas9 knockout efficiency. Genome Biol. 16, 280 (2015).
73. Hendel, A. et al. Chemically modified guide RNAs enhance CRISPR-Cas genome editing in human primary cells. Nat. Biotechnol. 33, 985–989 (2015).
74. Fu, Y., Sander, J. D., Reyon, D., Cascio, V. M. & Joung, J. K. Improving CRISPR−Cas nuclease specificity using truncated guide RNAs. Nat. Biotechnol. 32, 279–284 (2014).
75. Nissim, L., Perli, S. D., Fridkin, A., Perez-Pinera, P. & Lu, T. K. Multiplexed and programmable regulation of gene networks with an integrated RNA and CRISPR−Cas toolkit in human cells. Mol. Cell 54, 698–710 (2014).
76. Doench, J. G. et al. Rational design of highly active sgRNAs for CRISPR−Cas9‑mediated gene inactivation. Nat. Biotechnol. 32, 1262–1267 (2014).
77. Konermann, S. et al. Genome-scale transcriptional activation by an engineered CRISPR−Cas9 complex. Nature 517, 583–588 (2015).
78. Nowak, C. M., Lawson, S., Zerez, M. & Bleris, L. Guide RNA engineering for versatile Cas9 functionality. Nucleic Acids Res. 44, 9555–9564 (2016).
79. Chen, B. et al. Dynamic imaging of genomic loci in living human cells by an optimized CRISPR−Cas system. Cell 155, 1479–1491 (2013).
80. Thyme, S. B., Akhmetova, L., Montague, T. G., Valen, E. & Schier, A. F. Internal guide RNA interactions interfere with Cas9‑mediated cleavage. Nat. Commun. 7, 11750 (2016).
81. Lim, Y. et al. Structural roles of guide RNAs in the nuclease activity of Cas9 endonuclease. Nat. Commun. 7, 13350 (2016).
82. Ma, E., Harrington, L. B., O’Connell, M. R., Zhou, K. & Doudna, J. A. Single-stranded DNA cleavage by divergent CRISPR−Cas9 enzymes. Mol. Cell 60, 398–407 (2015).
83. Jiang, F., Zhou, K., Ma, L., Gressel, S. & Doudna, J. A. Structural Biology. A Cas9‑guide RNA complex preorganized for target DNA recognition. Science 348, 1477–1481 (2015).
84. Nishimasu, H. et al. Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell 156, 935–949 (2014).
85. Anders, C., Niewoehner, O., Duerst, A. & Jinek, M. Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature 513, 569–573 (2014).
86. Jiang, F. et al. Structures of a CRISPR−Cas9 R‑loop complex primed for DNA cleavage. Science 351, 867–871 (2016).
87. Briner, A. E. et al. Guide RNA functional modules direct Cas9 activity and orthogonality. Mol. Cell 56, 333–339 (2014).
88. Wright, A. V. et al. Rational design of a split‑Cas9 enzyme complex. Proc. Natl Acad. Sci. USA 112, 2984–2989 (2015).
89. Sternberg, S. H., LaFrance, B., Kaplan, M. & Doudna, J. A. Conformational control of DNA target cleavage by CRISPR−Cas9. Nature 527, 110–113 (2015).
This study shows that the conformation of the HNH nuclease domain within SpCas9 regulates RuvC nuclease activity to ensure accurate and concerted cleavage of both DNA strands.
90. Nishimasu, H. et al. Crystal structure of Staphylococcus aureus Cas9. Cell 162, 1113–1126 (2015).
91. Hirano, H. et al. Structure and engineering of Francisella novicida Cas9. Cell 164, 950–961 (2016).
92. Esvelt, K. M. et al. Orthogonal Cas9 proteins for RNA-guided gene regulation and editing. Nat. Methods 10, 1116–1121 (2013).
93. Yamada, M. et al. Crystal structure of the minimal Cas9 from Campylobacter jejuni reveals the molecular diversity in the CRISPR‑Cas9 systems. Mol. Cell 65, 1109–1121.e3 (2017).
94. Liu, L. et al. C2c1−sgRNA complex structure reveals RNA-guided DNA cleavage mechanism. Mol. Cell 65, 310–322 (2017).
95. Yang, H., Gao, P., Rajashankar, K. R. & Patel, D. J. PAM-dependent target DNA recognition and cleavage by C2c1 CRISPR−Cas endonuclease. Cell 167, 1814–1828.e12 (2016).
96. Dong, D. et al. The crystal structure of Cpf1 in complex with CRISPR RNA. Nature 532, 522–526 (2016).
97. Gao, P., Yang, H., Rajashankar, K. R., Huang, Z. & Patel, D. J. Type V CRISPR−Cas Cpf1 endonuclease employs a unique mechanism for crRNA-mediated target DNA recognition. Cell Res. 26, 901–913 (2016).
98. Yamano, T. et al. Crystal structure of Cpf1 in complex with guide RNA and target DNA. Cell 165, 949–962 (2016).
99. Stella, S., Alcon, P. & Montoya, G. Structure of the Cpf1 endonuclease R‑loop complex after target DNA cleavage. Nature 546, 559–563 (2017).
100. Mojica, F. J., Diez-Villasenor, C., Garcia-Martinez, J. & Almendros, C. Short motif sequences determine the targets of the prokaryotic CRISPR defence system. Microbiology 155, 733–740 (2009).
101. Shah, S. A., Erdmann, S., Mojica, F. J. & Garrett, R. A. Protospacer recognition motifs: mixed identities and functional diversity. RNA Biol. 10, 891–899 (2013).
102. Deveau, H. et al. Phage response to CRISPR-encoded resistance in Streptococcus thermophilus. J. Bacteriol. 190, 1390–1400 (2008).
103. Sternberg, S. H., Redding, S., Jinek, M., Greene, E. C. & Doudna, J. A. DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature 507, 62–67 (2014).
104. Tsai, S. Q. et al. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR−Cas nucleases. Nat. Biotechnol. 33, 187–197 (2015).
105. Jiang, W., Bikard, D., Cox, D., Zhang, F. & Marraffini, L. A. RNA-guided editing of bacterial genomes using CRISPR−Cas systems. Nat. Biotechnol. 31, 233–239 (2013).
106. Westra, E. R. et al. CRISPR immunity relies on the consecutive binding and degradation of negatively supercoiled invader DNA by Cascade and Cas3. Mol. Cell 46, 595–605 (2012).
107. Knight, S. C. et al. Dynamics of CRISPR−Cas9 genome interrogation in living cells. Science 350, 823–826 (2015).
108. Singh, D., Sternberg, S. H., Fei, J., Doudna, J. A. & Ha, T. Real-time observation of DNA recognition and rejection by the RNA-guided endonuclease Cas9. Nat. Commun. 7, 12778 (2016).
109. Hirano, S., Nishimasu, H., Ishitani, R. & Nureki, O. Structural basis for the altered PAM specificities of engineered CRISPR−Cas9. Mol. Cell 61, 886–894 (2016).
110. Kleinstiver, B. P. et al. Engineered CRISPR−Cas9 nucleases with altered PAM specificities. Nature 523, 481–485 (2015).
111. Mekler, V., Minakhin, L. & Severinov, K. Mechanism of duplex DNA destabilization by RNA-guided Cas9 nuclease during target interrogation. Proc. Natl Acad. Sci. USA 114, 5443–5448 (2017).
112. Gao, L. et al. Engineered Cpf1 variants with altered PAM specificities. Nat. Biotechnol. 35, 789–792 (2017).
113. Nishimasu, H. et al. Structural basis for the altered PAM recognition by engineered CRISPR−Cpf1. Mol. Cell 67, 139–147.e2 (2017).
114. Yamano, T. et al. Structural basis for the canonical and non-canonical PAM recognition by CRISPR−Cpf1. Mol. Cell 67, 633–645.e3 (2017).
115. Jore, M. M. et al. Structural basis for CRISPR RNA-guided DNA recognition by Cascade. Nat. Struct. Mol. Biol. 18, 529–536 (2011).
116. Szczelkun, M. D. et al. Direct observation of R‑loop formation by single RNA-guided Cas9 and Cascade effector complexes. Proc. Natl Acad. Sci. USA 111, 9798–9803 (2014).
117. Rutkauskas, M. et al. Directional R‑loop formation by the CRISPR−Cas surveillance complex cascade provides efficient off-target site rejection. Cell Rep. 10, 1534–1543 (2015).
118. Pattanayak, V. et al. High-throughput profiling of off-target DNA cleavage reveals RNA-programmed Cas9 nuclease specificity. Nat. Biotechnol. 31, 839–843 (2013).
119. Wu, X. et al. Genome-wide binding of the CRISPR endonuclease Cas9 in mammalian cells. Nat. Biotechnol. 32, 670–676 (2014).
120. Zheng, T. et al. Profiling single-guide RNA specificity reveals a mismatch sensitive core sequence. Sci. Rep. 7, 40638 (2017).
121. Coulombe, B. & Burton, Z. F. DNA bending and wrapping around RNA polymerase: a “revolutionary” model describing transcriptional mechanisms. Microbiol. Mol. Biol. Rev. 63, 457–478 (1999).
122. Westra, E. R. et al. Cascade-mediated binding and bending of negatively supercoiled DNA. RNA Biol. 9, 1134–1138 (2012).
123. Hochstrasser, M. L., Taylor, D. W., Kornfeld, J. E., Nogales, E. & Doudna, J. A. DNA targeting by a minimal CRISPR RNA-guided cascade. Mol. Cell 63, 840–851 (2016).
124. Xiao, Y. et al. Structure basis for directional R‑loop formation and substrate handover mechanisms in type I CRISPR−Cas system. Cell 170, 48–60.e11 (2017).
125. Nelson, P. Transport of torsional stress in DNA. Proc. Natl Acad. Sci. USA 96, 14342–14347 (1999).
126. Vassylyev, D. G. et al. Crystal structure of a bacterial RNA polymerase holoenzyme at 2.6 Å resolution. Nature 417, 712–719 (2002).
127. Josephs, E. A. et al. Structure and specificity of the RNA-guided endonuclease Cas9 during DNA interrogation, target binding and cleavage. Nucleic Acids Res. 43, 8924–8941 (2015).
128. Farasat, I. & Salis, H. M. A biophysical model of CRISPR−Cas9 activity for rational design of genome editing and gene regulation. PLoS Comput. Biol. 12, e1004724 (2016).
129. Cencic, R. et al. Protospacer adjacent motif (PAM)-distal sequences engage CRISPR−Cas9 DNA target cleavage. PLoS ONE 9, e109213 (2014).
130. Kleinstiver, B. P. et al. Genome-wide specificities of CRISPR−Cas Cpf1 nucleases in human cells. Nat. Biotechnol. 34, 869–874 (2016).
131. Kim, H. K. et al. In vivo high-throughput profiling of CRISPR−Cpf1 activity. Nat. Methods 14, 153–159 (2017).
132. Jiang, F. & Doudna, J. A. CRISPR−Cas9 structures and mechanisms. Annu. Rev. Biophys. 46, 505–529 (2017).
133. Yang, W., Lee, J. Y. & Nowotny, M. Making and breaking nucleic acids: two‑Mg2+-ion catalysis and substrate specificity. Mol. Cell 22, 5–13 (2006).
134. Yang, W. An equivalent metal ion in one- and two-metal-ion catalysis. Nat. Struct. Mol. Biol. 15, 1228–1231 (2008).
135. Yang, W. Nucleases: diversity of structure, function and mechanism. Q. Rev. Biophys. 44, 1–93 (2011).
136. Steitz, T. A. & Steitz, J. A. A general two-metal-ion mechanism for catalytic RNA. Proc. Natl Acad. Sci. USA 90, 6498–6502 (1993).
137. Jinek, M. et al. Structures of Cas9 endonucleases reveal RNA-mediated conformational activation. Science 343, 1247997 (2014).
138. Biertumpfel, C., Yang, W. & Suck, D. Crystal structure of T4 endonuclease VII resolving a Holliday junction. Nature 449, 616–620 (2007).
139. Li, C. L. et al. DNA binding and cleavage by the periplasmic nuclease Vvn: a novel structure with a known active site. EMBO J. 22, 4014–4025 (2003).
140. Dagdas, Y. S., Chen, J. S., Sternberg, S. H., Doudna, J. A. & Yildiz, A. A conformational checkpoint between DNA binding and cleavage by CRISPR‑Cas9. Sci. Adv. 3, eaao0027 (2017).
141. Osuka, S. et al. Real-time observation of flexible domain movements in Cas9. Preprint at http://www.biorxiv.org/content/early/2017/03/29/122069 (2017).
142. Ariyoshi, M. et al. Atomic structure of the RuvC resolvase: a Holliday junction-specific endonuclease from E. coli. Cell 78, 1063–1072 (1994).
143. Chen, L., Shi, K., Yin, Z. & Aihara, H. Structural asymmetry in the Thermus thermophilus RuvC dimer suggests a basis for sequential strand cleavages during Holliday junction resolution. Nucleic Acids Res. 41, 648–656 (2013).
144. Gorecka, K. M., Komorowska, W. & Nowotny, M. Crystal structure of RuvC resolvase in complex with Holliday junction substrate. Nucleic Acids Res. 41, 9945–9955 (2013).
145. Palermo, G., Miao, Y., Walker, R. C., Jinek, M. & McCammon, J. A. Striking plasticity of CRISPR−Cas9 and key role of non-target DNA, as revealed by molecular simulations. ACS Cent. Sci. 2, 756–763 (2016).
146. Kim, D. et al. Genome-wide analysis reveals specificities of Cpf1 endonucleases in human cells. Nat. Biotechnol. 34, 863–868 (2016).
147. Ran, F. A. et al. In vivo genome editing using Staphylococcus aureus Cas9. Nature 520, 186–191 (2015).
148. Chen, J. S. et al. Enhanced proofreading governs CRISPR‑Cas9 targeting accuracy. Nature http://dx.doi.org/10.1038/nature24268 (2017).
149. Palermo, G., Miao, Y., Walker, R. C., Jinek, M. & McCammon, J. A. CRISPR‑Cas9 conformational activation as elucidated from enhanced molecular simulations. Proc. Natl Acad. Sci. USA 114, 7260–7265 (2017).


150. Palermo, G. et al. Protospacer adjacent motif-induced allostery activates CRISPR‑Cas9. J. Am. Chem. Soc. http://dx.doi.org/10.1021/jacs.7b05313 (2017).
151. Frock, R. L. et al. Genome-wide detection of DNA double-stranded breaks induced by engineered nucleases. Nat. Biotechnol. 33, 179–186 (2015).
152. Wang, X. et al. Unbiased detection of off-target cleavage by CRISPR−Cas9 and TALENs using integrase-defective lentiviral vectors. Nat. Biotechnol. 33, 175–178 (2015).
153. Kim, D. et al. Digenome-seq: genome-wide profiling of CRISPR−Cas9 off-target effects in human cells. Nat. Methods 12, 237–243 (2015).
154. O’Geen, H., Henry, I. M., Bhakta, M. S., Meckler, J. F. & Segal, D. J. A genome-wide analysis of Cas9 binding specificity using ChIP-seq and targeted sequence capture. Nucleic Acids Res. 43, 3389–3404 (2015).
155. Horlbeck, M. A. et al. Nucleosomes impede Cas9 access to DNA in vivo and in vitro. eLife 5, e12677 (2016).
156. Kim, S., Kim, D., Cho, S. W., Kim, J. & Kim, J. S. Highly efficient RNA-guided genome editing in human cells via delivery of purified Cas9 ribonucleoproteins. Genome Res. 24, 1012–1019 (2014).
157. Lin, S., Staahl, B. T., Alla, R. K. & Doudna, J. A. Enhanced homology-directed human genome engineering by controlled timing of CRISPR−Cas9 delivery. eLife 3, e04766 (2014).
158. Cho, S. W. et al. Analysis of off-target effects of CRISPR−Cas-derived RNA-guided endonucleases and nickases. Genome Res. 24, 132–141 (2014).
159. Guilinger, J. P., Thompson, D. B. & Liu, D. R. Fusion of
162. Tsai, S. Q. et al. Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing. Nat. Biotechnol. 32, 569–576 (2014).
163. Slaymaker, I. M. et al. Rationally engineered Cas9 nucleases with improved specificity. Science 351, 84–88 (2016).
164. Kleinstiver, B. P. et al. High-fidelity CRISPR−Cas9 nucleases with no detectable genome-wide off-target effects. Nature 529, 490–495 (2016).
165. Tsai, S. Q. & Joung, J. K. Defining and improving the genome-wide specificities of CRISPR−Cas9 nucleases. Nat. Rev. Genet. 17, 300–312 (2016).
166. Tycko, J., Myer, V. E. & Hsu, P. D. Methods for optimizing CRISPR−Cas9 genome editing specificity. Mol. Cell 63, 355–370 (2016).
167. Yee, J. K. Off-target effects of engineered nucleases. FEBS J. 283, 3239–3248 (2016).
168. Bisaria, N., Jarmoskaite, I. & Herschlag, D. Lessons from enzyme kinetics reveal specificity principles for RNA-guided nucleases in RNA interference and CRISPR-based genome editing. Cell Syst. 4, 21–29 (2017).
169. Gootenberg, J. S. et al. Nucleic acid detection with CRISPR−Cas13a/C2c2. Science 356, 438–442 (2017).
170. East-Seletsky, A., O’Connell, M. R., Burstein, D., Knott, G. J. & Doudna, J. A. RNA targeting by functionally orthogonal type VIA CRISPR−Cas enzymes. Mol. Cell 66, 373–383.e3 (2017).
171. Friedland, A. E. et al. Characterization of Staphylococcus aureus Cas9: a smaller Cas9 for all‑in‑one adeno-associated virus delivery and paired nickase applications. Genome Biol. 16, 257
175. Tu, M. et al. A ‘new lease of life’: FnCpf1 possesses DNA cleavage activity for genome editing in human cells. Nucleic Acids Res. https://doi.org/10.1093/nar/gkx783 (2017).
176. Knott, G. J. et al. Guide-bound structures of an RNA-targeting A-cleaving CRISPR-Cas13a enzyme. Nat. Struct. Mol. Biol. https://doi.org/10.1038/nsmb.3466 (2017).

Acknowledgements
The authors thank M. L. Hochstrasser, L. B. Harrington and A. V. Wright for critical reading and valuable input on the manuscript. J.S.C. is a National Science Foundation Graduate Research Fellow and J.A.D. is a Howard Hughes Medical Institute Investigator.

Author contributions
The authors contributed equally to this manuscript.

Competing interests statement
J.A.D. is a co‑founder of Caribou Biosciences, Editas Medicine and Intellia Therapeutics; a scientific adviser to Caribou Biosciences, Intellia Therapeutics, eFFECTOR Therapeutics and Driver; and executive director of the Innovative Genomics Institute at the University of California Berkeley (UC Berkeley) and the University of California San Francisco (UCSF). J.S.C. and J.A.D. are inventors on UC Berkeley and Howard Hughes Medical Institute patents for clustered regularly interspaced short palindromic repeats (CRISPR) technologies.

Publisher’s note
Springer Nature remains neutral with regard to jurisdictional
catalytically inactive Cas9 to FokI nuclease improves the (2015). claims in published maps and institutional affiliations.
specificity of genome modification. Nat. Biotechnol. 32, 172. Kim, E. et al. In vivo genome editing with a small Cas9
577–582 (2014). orthologue derived from Campylobacter jejuni. Nat. How to cite this article
160. Ran, F. A. et al. Double nicking by RNA-guided CRISPR− Commun. 8, 14500 (2017). Chen, J. S. & Doudna, J. A. The chemistry of Cas9 and its
Cas9 for enhanced genome editing specificity. Cell 154, 173. Hur, J. K. et al. Targeted mutagenesis in mice by CRISPR colleagues. Nat. Rev. Chem. 1, 0078 (2017).
1380–1389 (2013). electroporation of Cpf1 ribonucleoproteins. Nat.
161. Mali, P. et al. CAS9 transcriptional activators for target Biotechnol. 34, 807–808 (2016).
specificity screening and paired nickases for cooperative 174. Kim, Y. et al. Generation of knockout mice by DATABASES
genome engineering. Nat. Biotechnol. 31, 833–838 Cpf1‑mediated gene targeting. Nat. Biotechnol. 34, Protein Data Bank: http://www.rcsb.org/pdb/home/home.do
(2013). 808–810 (2016).

NATURE REVIEWS | CHEMISTRY VOLUME 1 | ARTICLE NUMBER 0078 | 15


© 2017 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.
