
RNA-directed gene editing specifically eradicates latent and prevents new HIV-1 infection


Wenhui Hu (a,1,2), Rafal Kaminski (a,1), Fan Yang (a), Yonggang Zhang (a), Laura Cosentino (a), Fang Li (a), Biao Luo (b), David Alvarez-Carbonell (c), Yoelvis Garcia-Mesa (c), Jonathan Karn (c), Xianming Mo (d), and Kamel Khalili (a,2)

(a) Department of Neuroscience, Center for Neurovirology and The Comprehensive NeuroAIDS Center, Temple University School of Medicine, Philadelphia, PA 19140; (b) Cancer Genome Institute, Fox Chase Cancer Center, Temple University School of Medicine, Philadelphia, PA 19111; (c) Department of Molecular Biology and Microbiology, Case Western Reserve University, Cleveland, OH 44106; and (d) Laboratory of Stem Cell Biology, State Key Laboratory of Biotherapy, West China Hospital, West China Medical School, Sichuan University, Chengdu 610041, China

Edited by Anthony S. Fauci, National Institute of Allergy and Infectious Diseases, Bethesda, MD, and approved June 19, 2014 (received for review
March 19, 2014)

MEDICAL SCIENCES

AIDS remains incurable due to the permanent integration of HIV-1 into the host genome, imparting risk of viral reactivation even after antiretroviral therapy. New strategies are needed to ablate the viral genome from latently infected cells, because current methods are too inefficient and prone to adverse off-target effects. To eliminate the integrated HIV-1 genome, we used the Cas9/guide RNA (gRNA) system, in single and multiplex configurations. We identified highly specific targets within the HIV-1 LTR U3 region that were efficiently edited by Cas9/gRNA, inactivating viral gene expression and replication in latently infected microglial, promonocytic, and T cells. Cas9/gRNAs caused neither genotoxicity nor off-target editing to the host cells, and completely excised a 9,709-bp fragment of integrated proviral DNA that spanned from its 5′ to 3′ LTRs. Furthermore, the presence of multiplex gRNAs within Cas9-expressing cells prevented HIV-1 infection. Our results suggest that Cas9/gRNA can be engineered to provide a specific, efficacious prophylactic and therapeutic approach against AIDS.

CRISPR/Cas9 | genome editing | latency | retrovirus | reservoir

Significance

For more than three decades since the discovery of HIV-1, AIDS remains a major public health problem affecting greater than 35.3 million people worldwide. Current antiretroviral therapy has failed to eradicate HIV-1, partly due to the persistence of viral reservoirs. RNA-guided HIV-1 genome cleavage by the Cas9 technology has shown promising efficacy in disrupting the HIV-1 genome in latently infected cells, suppressing viral gene expression and replication, and immunizing uninfected cells against HIV-1 infection. These properties may provide a viable path toward a permanent cure for AIDS, and provide a means to vaccinate against other pathogenic viruses. Given the ease and rapidity of Cas9/guide RNA development, personalized therapies for individual patients with HIV-1 variants can be developed instantly.

Author contributions: W.H., R.K., and K.K. designed research; W.H., R.K., F.Y., Y.Z., L.C., F.L., and B.L. performed research; D.A.-C., Y.G.-M., J.K., and X.M. contributed new reagents/analytic tools; W.H., B.L., and K.K. analyzed data; and W.H. and K.K. wrote the paper.
Conflict of interest statement: A patent application has been filed relating to this work.
This article is a PNAS Direct Submission.
(1) W.H. and R.K. contributed equally to this work.
(2) To whom correspondence may be addressed. Email: kamel.khalili@temple.edu or wenhui.hu@temple.edu.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1405186111/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1405186111 PNAS Early Edition | 1 of 6

Infection with HIV-1 is a major public health problem affecting more than 35 million people worldwide (1). Current therapy for controlling HIV-1 infection and impeding AIDS development (highly active antiretroviral therapy; HAART) includes a mixture of compounds that suppress various steps of the viral life cycle (2). HAART profoundly reduces viral replication in cells that support HIV-1 infection and reduces plasma viremia to a minimal level but neither suppresses low-level viral genome expression and replication in tissues nor targets the latently infected cells that serve as a reservoir for HIV-1, including brain macrophages, microglia, and astrocytes, gut-associated lymphoid cells, and others (3, 4). HIV-1 persists in ∼10^6 cells per patient during HAART, and is linked to comorbidities including heart and renal diseases, osteopenia, and neurological disorders (5). Because current therapies are unable to suppress viral gene transcription from integrated proviral DNA or eliminate the transcriptionally silent proviral genomes, low-level viral protein production by latently infected cells may contribute to multiple illnesses in the aging HIV-1–infected patient population. Supporting this notion, pathogenic viral proteins including transactivator of transcription (Tat) are present in the cerebrospinal fluid of HIV-1–positive patients receiving HAART (6). To prevent viral protein expression and viral reactivation in latently infected host cells, new strategies are thus needed to permanently disable the HIV-1 genome by eradicating large segments of integrated proviral DNA.

Advances in the engineered nucleases including zinc finger nuclease (ZFN), transcription activator-like effector nuclease (TALEN), and clustered regularly interspaced short palindromic repeats (CRISPR) associated 9 (Cas9) that can disrupt target genes have raised prospects of selectively deleting HIV-1 proviral DNA integrated into the host genome (7–10). These approaches have been used to disrupt HIV-1 entry coreceptors C-C chemokine receptor 5 (CCR5) or C-X-C chemokine receptor 4 (CXCR4) and proviral DNA-encoding viral proteins (8, 9). CCR5 gene-targeting ZFNs are in phase II clinical trials for HIV-1/AIDS treatment (11). Also, various gene editing technologies have recently been shown to remove the proviral HIV-1 DNA from the host cell genome by targeting its highly conserved 5′ and 3′ long terminal repeats (LTRs) (12, 13). However, introduction of nucleases into cells via these nuclease-based genomic editing approaches remains inefficient and partially selective to remove the entire HIV-1 genome. Thus, the key barrier to their clinical translation is insufficient gene specificity to prevent potential off-target effects (toxicities). To achieve highly specific HIV-1 genome editing, we combined approaches to identify HIV-1 targets while circumventing host off-target effects. The resulting highly specific Cas9-based method proved capable of eradicating integrated HIV-1 DNA with high efficiency from latently infected human “reservoir” cell types, and prevented their infection by HIV-1.

Results

We assessed the ability of HIV-1–directed guide RNAs (gRNAs) to abrogate LTR transcriptional activity and eradicate proviral DNA from the genomes of latently infected myeloid cells that serve as HIV-1 reservoirs in the brain, a particularly intractable target population. Our strategy was focused on targeting the
HIV-1 LTR promoter U3 region. By bioinformatic screening and efficiency/off-target prediction (14, 15), we identified four gRNA targets (protospacers; LTRs A−D) that avoid conserved transcription factor binding sites, minimizing the likelihood of altering host gene expression (Table S1 and Fig. S1). We inserted DNA oligonucleotides (Table S2) complementary to gRNAs A−D into a humanized Cas9 expression vector (A/B in pX260; C/D in pX330) (16) and tested their individual and combined abilities to alter the integrated HIV-1 genome activity. We first used the microglial cell line CHME5, which harbors integrated copies of a single-round HIV-1 vector that includes the 5′ and 3′ LTRs, and a gene encoding an enhanced green fluorescent protein (EGFP) reporter replacing Gag (pNL4-3-ΔGag-d2EGFP) (17). Treating CHME5 cells with trichostatin A (TSA), a histone deacetylase inhibitor, reactivates transcription from the majority of the integrated proviruses and leads to expression of EGFP and the remaining HIV-1 proteome (17). Expression of gRNAs plus Cas9 markedly decreased the fraction of TSA-induced EGFP-positive CHME5 cells (Fig. 1A and Fig. S2). We detected insertion/deletion gene mutations (indels) for LTRs A−D (Fig. 1B and Fig. S2B) using a Cel I nuclease-based heteroduplex-specific SURVEYOR assay. Similarly, expressing gRNAs targeting LTRs C and D in HeLa-derived TZM-bl cells, which contain stably incorporated HIV-1 LTR copies driving a firefly luciferase reporter gene (18), suppressed viral promoter activity (Fig. S3A), and elicited indels within the LTR U3 region (Fig. S3 B−D) demonstrated by SURVEYOR and Sanger sequencing. Moreover, the combined expression of LTR-C/D-targeting gRNAs in these cells caused excision of the predicted 302-bp viral DNA sequence, and emergence of the residual 194-bp fragment (Fig. S3 E and F).

Multiplex expression of LTR-A/B gRNAs in mixed clonal CHME5 cells caused deletion of a 190-bp fragment between A and B target sites and led to indels to various extents (Fig. 1 C and D). Among >20 puromycin-selected stable subclones, we found cell populations with complete blockade of TSA-induced HIV-1 proviral reactivation determined by flow cytometry for EGFP (Fig. 1E). PCR-based analysis for EGFP and HIV-1 Rev response element (RRE) in the proviral genome validated the eradication of HIV-1 genome (Fig. 1 F and G). Furthermore, sequencing of the PCR products revealed that the entire 5′−3′ LTR-spanning viral genome was deleted, yielding a 351-bp

Fig. 1. Cas9/LTR-gRNA suppresses HIV-1 reporter virus production in CHME5 microglial cells latently infected with HIV-1. (A) Representative gating diagram
of EGFP flow cytometry shows a dramatic reduction in TSA-induced reactivation of latent pNL4-3-ΔGag-d2EGFP reporter virus by stably expressed Cas9 plus
LTR-A or -B, vs. empty U6-driven gRNA expression vector (U6-CAG). (B) SURVEYOR Cel-I nuclease assay of PCR product (−453 to +43 within LTR) from selected
LTR-A- or -B-expressing stable clones shows dramatic indel mutation patterns (arrows). (C and D) PCR fragment analysis shows a precise deletion of 190-bp
region between LTR-A and -B cutting sites (red arrowhead and arrow), leaving 306-bp fragment (black arrow) validated by TA-cloning and sequencing results.
(E−G) Subcloning of LTR-A/B stable clones reveals complete loss of reporter reactivation determined by EGFP flow cytometry (E) and elimination of pNL4-3-
ΔGag-d2EGFP proviral genome detected by standard (F) and real-time (G) PCR amplification of genomic DNA for EGFP and HIV-1 Rev response element (RRE);
β-actin is a DNA purification and loading control. (H) PCR genotyping of LTR-A/B subclones (#8, #13) using primers to amplify DNA fragment covering HIV-1
LTR U3/R/U5 regions (−411 to +129) shows indels (a, deletion; c, insertion) and “intact” or combined LTR (b).
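The amplicon sizes quoted for panels B−D of the caption above can be checked with a little coordinate arithmetic. The following is a minimal sketch, assuming (this is not stated in the source) that the LTR coordinates follow the common promoter convention in which numbering runs from −1 directly to +1 with no position 0; the helper name `span_length` is hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Length in bp of a region given in promoter-style coordinates.

    Assumption: the numbering has no position 0, so -1 is followed by +1.
    """
    if start == 0 or end == 0:
        raise ValueError("this numbering scheme has no position 0")
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # correct for the skipped position 0
    return length


# PCR product analysed in Fig. 1B: -453 to +43 within the LTR.
amplicon_bp = span_length(-453, 43)

# Fragment expected after the 190-bp excision between the LTR-A and LTR-B sites.
residual_bp = amplicon_bp - 190

print(amplicon_bp, residual_bp)
```

Under that assumption the product is 496 bp and the post-excision fragment is 306 bp, which agrees with the 306-bp band reported in panels C and D.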

fragment via a 190-bp excision between cleavage sites A and B (Fig. 1H and Fig. S4), and a 682-bp fragment with a 175-bp insertion and a 27-bp deletion at the LTR-A and -B sites, respectively (Fig. S4C). The residual HIV-1 genome (Fig. 1 F−H) may reflect the presence of trace Cas9/gRNA-negative cells. These results indicate that LTR-targeting Cas9/gRNAs A/B eradicate the HIV-1 genome and block its reactivation in latently infected microglial cells.

The promonocytic U-937 cell subclone U1, an HIV-1 latency model for infected perivascular macrophages and monocytes, is chronically HIV-1 infected and exhibits low-level constitutive viral gene expression and replication (19). GenomeWalker mapping detected two integrated proviral DNA copies at chromosomes Xp11.4 (Fig. 2A) and 2p21 (Fig. S5A) in U1 cells. A 9,935-bp DNA fragment representing the entire 9,709-bp proviral HIV-1 DNA plus a flanking 226-bp X-chromosome-derived sequence (Fig. 2A), and a 10,176-bp fragment containing 9,709-bp HIV-1 genome plus its flanking 2-chromosome-derived 467 bp (Fig. S5 A and B) were identified by the long-range PCR analysis of the parental control or empty-vector (U6-CAG) U1 cells. The 226-bp and 467-bp fragments represent the predicted segment from the other copy of chromosome X and 2, respectively, which lacked the integrated proviral DNA. In U1 cells expressing LTR-A/B gRNAs and Cas9, we found two additional DNA fragments of 833 and 670 bp in chromosome X and one additional 1,102-bp fragment in chromosome 2. Thus, gRNAs A/B enabled Cas9 to excise the HIV-1 5′−3′ LTR-spanning viral genome segment in both chromosomes. The 833-bp fragment includes the expected 226-bp from the host genome and a 607-bp viral LTR sequence with a 27-bp deletion around the LTR-A site (Fig. 2 A and B). The 670-bp fragment encompassed a 226-bp host sequence and residual 444-bp viral LTR sequence after 190-bp fragment excision (Fig. 1D), caused by gRNAs-A/B-guided cleavage at both LTRs (Fig. 2A). The additional fragments did not emerge via circular LTR integration, because they were absent in the parental U1 cells, and such circular LTR viral genome configuration occurs immediately after HIV-1 infection but is short lived and intolerant to repeated passaging (20). These cells exhibited substantially decreased HIV-1 viral load, shown by the functional p24 ELISA replication assay (Fig. 2C) and real-time PCR analysis (Fig. S5 C and D). The detectable but low residual viral load and reactivation may result from cell population heterogeneity and/or incomplete genome editing. We also validated the ablation of HIV-1 genome by Cas9/LTR-A/B gRNAs in latently infected J-Lat T cells harboring integrated HIV-R7/E-/EGFP (21) using flow cytometry analysis, SURVEYOR assay, and PCR genotyping (Fig. S6), supporting the results of previous reports on HIV-1 proviral deletion in Jurkat T cells by Cas9/gRNA (12) and ZFN (13). Taken together, our results suggest that the multiplex LTR-gRNAs/Cas9 system efficiently suppresses HIV-1 replication and reactivation in latently HIV-1–infected “reservoir” (microglial, monocytic, and T) cells typical of human latent HIV-1 infection, and in TZM-bl cells highly sensitive for detecting HIV-1 transcription and reactivation. Single or multiplex gRNAs targeting 5′ and 3′ LTRs effectively eradicated the entire HIV-1 genome.

We next tested whether combined Cas9/LTR gRNAs can immunize cells against HIV-1 infection using stable Cas9/gRNAs-A and -B-expressing TZM-bl-based clones (Fig. 3A). Two of seven puromycin-selected subclones exhibited efficient excision of the 190-bp LTR-A/B site-spanning DNA fragment (Fig. 3B). However, the remaining five subclones exhibited no excision (Fig. 3B) and no indel mutations as verified by Sanger sequencing. PCR genotyping using primers targeting Cas9 and U6-LTR showed that none of these ineffective subclones retained the integrated copies of Cas9/LTR-A/B gRNA expression cassettes (Fig. S7 A and B). As a result, no expression of full-length Cas9 was detected (Fig. S7 C and D). The long-term expression of Cas9/LTR-A/B gRNAs did not adversely affect cell growth or viability, suggesting a low occurrence of off-target interference with the host genome or Cas9-induced toxicity in this model. We assessed de novo HIV-1 replication by infecting cells with the VSV-G-pseudotyped pNL4-3-ΔE-EGFP reporter virus (22), with EGFP positivity by flow cytometry indicating HIV-1 replication. Unlike the control U6-CAG cells, the cells stably expressing Cas9/LTR-A/B gRNAs failed to support HIV-1 replication at 2 d postinfection, indicating that they were immunized effectively against new HIV-1 infection (Fig. 3 C and D). A similar immunity against HIV-1 was observed in Cas9/LTR-A/B gRNA-expressing cells infected with native T-tropic X4 strain pNL4-3-ΔE-EGFP reporter virus (Fig. S8A) or native M-tropic R5 strains such as SF162 and JRFL (Fig. S8 B−D).

The appeal of Cas9/gRNA as an interventional approach rests on its highly specific on-target indel-producing cleavage (15, 16),

Fig. 2. Cas9/LTR-gRNA efficiently eradicates latent HIV-1 virus from U1 monocytic cells. (A) (Right) Diagram showing excision of HIV-1 entire genome in chromosome Xp11.4. HIV-1 integration sites were identified using a Genome-Walker link PCR kit. (Left) Analysis of PCR amplicon lengths using a primer pair (P1/P2) targeting chromosome X integration site-flanking sequence reveals elimination of the entire HIV-1 genome (9,709 bp), leaving two fragments (833 and 670 bp). (B) (Upper) TA cloning and sequencing of the LTR fragment (833 bp) showing the host genomic sequence (small letters, 226 bp) and the partial sequences (634 − 27 = 607 bp) of 5′ LTR (green) and 3′ LTR (red) with a 27-bp deletion around the LTR A targeting site (underlined). (Lower) Two indel alleles identified from 15 sequenced clonal amplicons. The 670-bp fragment consists of a host sequence (226 bp) and the remaining LTR sequence (634 − 190 = 444 bp) after 190-bp excision by simultaneous cutting at LTR-A and -B target sites. The underlined and green-highlighted sequences indicate the gRNA LTR-A target site and PAM. (C) Functional analysis of LTR-A/B-induced eradication of HIV-1 genome, showing substantial blockade of p24 virion release induced by TSA/phorbol myristate acetate (PMA) treatment. U1 cells were transfected with pX260-LTRs A, B, or A/B. After 2-wk puromycin selection, cells were treated with TSA (250 nM)/PMA for 2 d before p24 Gag ELISA was performed.
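The fragment sizes in this caption follow from simple bookkeeping, which can be sketched as below. The numeric values are taken from the text; the function and constant names are hypothetical, and the model (host flank plus either the full provirus or a single residual LTR) is an illustrative reading of the caption rather than the authors' own analysis code.

```python
PROVIRUS_BP = 9709   # full-length integrated HIV-1 genome (from the text)
LTR_BP = 634         # LTR length implied by the caption's "634 - 27 = 607"
FLANK_BP = {"chrX": 226, "chr2": 467}  # host sequence amplified with the primers


def intact_amplicon(flank_bp: int) -> int:
    """Long-range PCR product when the provirus is still integrated."""
    return flank_bp + PROVIRUS_BP


def excised_amplicon(flank_bp: int, ltr_deletion_bp: int) -> int:
    """PCR product after the 5'-3' LTR-spanning segment is removed.

    Assumes a single LTR copy remains, shortened by `ltr_deletion_bp`.
    """
    return flank_bp + (LTR_BP - ltr_deletion_bp)


print(intact_amplicon(FLANK_BP["chrX"]))        # parental chromosome X product
print(intact_amplicon(FLANK_BP["chr2"]))        # parental chromosome 2 product
print(excised_amplicon(FLANK_BP["chrX"], 27))   # 27-bp deletion at LTR-A
print(excised_amplicon(FLANK_BP["chrX"], 190))  # 190-bp excision between A and B
```

The four values reproduce the 9,935-, 10,176-, 833-, and 670-bp fragments reported for U1 cells. The chromosome 2 excision product is deliberately left out, because the source does not give its breakdown.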



but multiplex gRNAs could potentially cause host genome mutagenesis and chromosomal disorders, cytotoxicity, genotoxicity, or oncogenesis. Fairly low viral-human genome homology reduces this risk, but the human genome contains numerous endogenous retroviral genomes that are potentially susceptible to HIV-1–directed gRNAs. Therefore, we assessed off-target effects of selected HIV-1 LTR gRNAs on the human genome. Because the 12- to 14-bp seed sequence nearest the protospacer-adjacent motif (PAM) region (NGG) is critical for cleavage specificity (14, 23), we searched >14-bp seed+NGG, and found no off-target candidate sites by LTR gRNAs A−D (Table S1). It is not surprising that progressively shorter gRNA segments yielded increasing off-target cleavage sites 100% matched to corresponding on-target sequences (i.e., NGG+13 bp yielded 6, 0, 2, and 9 off-target sites, respectively, whereas NGG+12 bp yielded 16, 5, 16, and 29) (Table S1). From human genomic DNA, we obtained a 500- to 800-bp sequence covering one of the predicted off-target sites using high-fidelity PCR, and analyzed the potential mutations by SURVEYOR and Sanger sequencing. We found no mutations (see representative off-target sites #1, 5, and 6 in TZM-bl and U1 cells; Fig. 4A).

To assess risk of off-target effects comprehensively, we performed whole-genome sequencing (WGS) using the stable Cas9/gRNA A/B-expressing and control U6-CAG TZM-bl cells (Fig. 4 B−D). We identified 676,105 indels, using a genome analysis toolkit (GATK, v.2.8.1) with human (hg19) and HIV-1 genomes as reference sequences. Among the indels, 24% occurred in the U6-CAG control, 26% in LTR-A/B subclone, and 50% in both (Fig. 4B). Such substantial intersample indel-calling discrepancy suggests the probable off-target effects but most likely results from its limited confidence, limited WGS coverage (15−30×), and cellular heterogeneity. GATK reported only confidently identified indels: some found in the U6-CAG control but not in the LTR-A/B subclone, and others in the LTR-A/B but not in the U6-CAG. We expected abundant missing indel calls for both samples due to the limited WGS coverage. Such limited indel-calling confidence also implies the possibility of false negatives: missed indels occurring in LTR-A/B but not U6-CAG controls. Cellular heterogeneity may reflect variability of Cas9/gRNA editing efficiency and effects of passaging. Therefore, we tested whether each indel was LTR-A/B gRNA-induced, by analyzing ±300 bp flanking each indel against LTRs-A/-B-targeted sites of the HIV-1 genome and predicted/potential gRNA off-target sites of the host genome (Table S3). For sequences 100% matched to one containing the seed (12 bp) plus NRG, we identified only 8 overlapped regions of 92 potential off-target sites against 676,105 indels: 6 indels occurring in both samples, and 2 only in the U6-CAG control (Fig. 4 C and D). We also identified two indels on HIV-1 LTR that occurred only in the LTR-A/B subclone but, as expected, not in the U6-CAG control (Fig. 4C). The results suggest that LTR-A/B gRNAs induce the indicated on-target indels but no off-target indels, consistent with prior findings using deep sequencing of PCR products covering predicted/potential off-target sites (14, 24–27).

Discussion

The Cas9/gRNA technology platform is facile, versatile and improving rapidly (23), and clinical application is anticipated, particularly in the fields of virus infection, genetic diseases, and cancer (9, 28, 29). Here, we found that LTR-directed gRNA/Cas9 eradicates the HIV-1 genome and effectively immunizes target cells against HIV-1 reactivation and infection with high specificity and efficiency. These properties may provide a viable path toward a permanent or “sterile” HIV-1 cure, and perhaps provide a means to eradicate and vaccinate against other pathogenic viruses. In the current study, we have mainly focused our efforts on myeloid lineage cells (microglia/macrophage), which are the primary cell types that harbor HIV-1 in the brain. However, this proof of concept is certainly applicable to any other cell type, including T-lymphoid cells (Fig. S6) (12, 13), astrocytes, and neural stem cells.

Our combined approaches minimized off-target effects while achieving high efficiency and complete ablation of the genomically integrated HIV-1 provirus. In addition to an extremely low homology between the foreign viral genome and host cellular genome including endogenous retroviral DNA, the key design attributes in our study included: bioinformatic screening using the strictest 12-bp+NGG target selection criteria to exclude off-target human transcriptome or (even rarely) untranslated genomic sites; avoiding transcription factor binding sites within the HIV-1 LTR promoter (potentially conserved in the host genome); selection of LTR-A- and -B-directed, 30-bp protospacer and also pre-crRNA system reflecting the original bacterial immune mechanism to enhance specificity/efficiency vs. 20-bp protospacer-based, chimeric crRNA-tracrRNA system (16, 30); and WGS, Sanger sequencing, and SURVEYOR assay, to identify and exclude potential off-target effects. Indeed, the

Fig. 3. Stable expression of Cas9 plus LTR-A/B vaccinates TZM-bl cells against new HIV-1 virus infection. (A) Immunocytochemistry (ICC) and Western blot (WB) analyses with anti-Flag antibody confirm the expression of Flag-Cas9 in TZM-bl stable clones selected with puromycin (1 μg/mL) for 2 wk. (B) PCR genotyping of Cas9/LTR-A/B stable clones (c1−c7) reveals a close correlation of LTR excision with repression of LTR luciferase reporter activation. Fold changes represent TSA/PMA-induced levels over corresponding noninduction levels. (C) Stable Cas9/LTR-A/B-expressing cells (c4) were infected with pseudotyped-pNL4-3-Nef-EGFP lentivirus at indicated multiplicity of infection (MOI), and infection efficiency was measured by EGFP flow cytometry, 2 d postinfection. (D) Representative phase-contrast/fluorescence micrographs show that LTR-A/B stable but not control (U6-CAG) cells are resistant to new infection by pNL4-3-ΔE-EGFP HIV-1 reporter virus (green).

Fig. 4. Off-target effects of Cas9/LTR-A/B on human genome. (A) SURVEYOR assay shows no indel mutations in predicted/potential off-target regions in human TZM-bl and U1 cells. LTR-A on-target region (A) was used as a positive control and empty U6-CAG vector (U6) as a negative control. (B−D) Whole-genome sequencing of LTR-A/B stable TZM-bl subclone showing the numbers of called indels in the U6-CAG control and LTR-A/B samples (B), detailed information on 10 called indels near gRNA target sites in both samples (C), and distribution of off-target called indels (D).
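The predicted off-target regions mentioned in this caption come from searching the host genome for exact matches to the PAM-proximal seed of each gRNA followed by an NGG PAM. The following is a minimal sketch of that kind of search, not the tool used in the study: the function names, the toy sequences, and the choice of a regular-expression scan are all illustrative assumptions.

```python
import re

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_seed_pam_sites(genome: str, protospacer: str,
                        seed_len: int = 12, pam: str = "[ACGT]GG"):
    """Return (offset, strand) for every exact seed+PAM match.

    The seed is the last `seed_len` bases of the protospacer (the end next
    to the PAM). Offsets on the "-" strand are indices into the
    reverse-complemented sequence. A lookahead is used so that overlapping
    matches are also reported.
    """
    seed = protospacer[-seed_len:].upper()
    pattern = re.compile("(?=(" + re.escape(seed) + pam + "))")
    genome = genome.upper()
    hits = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for match in pattern.finditer(seq):
            hits.append((match.start(), strand))
    return hits


# Hypothetical 20-nt protospacer and toy "genome", for illustration only.
toy_protospacer = "ACGTACGTTTGACCTAGCAT"
toy_genome = "CCC" + "TTGACCTAGCAT" + "TGG" + "AAAA"

print(find_seed_pam_sites(toy_genome, toy_protospacer, seed_len=12))
print(find_seed_pam_sites(toy_genome, toy_protospacer, seed_len=14))
```

In this toy case the 12-bp seed finds one site on the plus strand while the 14-bp seed finds none, mirroring the trend in Table S1 that shorter seeds yield more candidate sites. Passing `pam="[ACGT][AG]G"` would relax the search to the NRG motif mentioned in the text.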

use of newly developed Cas9 double-nicking (23) and RNA-guided FokI nuclease (31, 32) may further assist identification of new targets within the various conserved regions of HIV-1 with reduced off-target effects.

More recently, a clinical trial using the ZFN gene editing strategy was launched to disrupt the gene encoding the HIV-1 coreceptor, CCR5 (8, 9, 11). Functional knockout of CCR5 in autologous CD4 T cells of a small cohort of patients revealed that in one out of four enrolled subjects, the viral load remained undetectable at the time of treatment (33). Similarly, TALEN and Cas9 have been tested experimentally for efficient disruption of CCR5 and CXCR4 (9, 28, 34–37); therefore, taking them into consideration for clinical trials is anticipated. Whether or not the strategies targeting HIV-1 entry can reach the “sterile” cure of AIDS remains to be seen. Our results show that the HIV-1 Cas9/gRNA system has the ability to target more than one copy of the LTR, which are positioned on different chromosomes, suggesting that this genome-editing system can alter the DNA sequence of HIV-1 in latently infected patient’s cells harboring multiple proviral DNAs. To further ensure high editing efficacy and consistency of our technology, one may consider the most stable region of HIV-1 genome as a target to eradicate HIV-1 in patient samples, which may not harbor only one strain of HIV-1. Alternatively, one may develop personalized treatment modalities based on the data from deep sequencing of the patient-derived viral genome before engineering therapeutic Cas9/gRNA molecules.

Our results also demonstrate, for the first time to our knowledge, that Cas9/gRNA genome editing can be used to immunize cells against HIV-1 infection. The preventative vaccination is independent of HIV-1 strain’s diversity because the system targets genomic sequences regardless of how the viruses enter the infected cells. Interestingly, the preexistence of the Cas9/gRNA system in cells leads to a rapid elimination of the new HIV-1 before it integrates into the host genome, just like the way by which the bacteria defense system evolved to combat phage infection (38). Similarly, a gene-editing-based vaccine strategy may be effective in eradicating postintegrated HIV-1 genome and newly packaged proviruses in cells. Therefore, investigation of such HIV-1 vaccination in various latent reservoir cells and animal models with stable expression of Cas9/LTR-gRNAs presents an important next step to assess the ability of Cas9 to eradicate viral reservoirs in vivo. Moreover, in light of recent data illustrating efficient in vitro genome editing using a mixture of Cas9/gRNA and DNA (39–42), one may explore various systems for delivery of Cas9/LTR-gRNA via various routes for immunizing high-risk subjects. Once advanced, one may use gene therapies (viral vector and nanoparticle) and transplantation of autologous Cas9/gRNA-modified bone marrow stem/progenitor cells (43, 44) or inducible pluripotent stem cells for eradicating HIV-1 infection.

Here, we demonstrated the high specificity of Cas9/gRNAs in editing HIV-1 target genome. Results from subclone data revealed the strict dependence of genome editing on the presence of both Cas9 and gRNA. Moreover, only one nucleotide mismatch in the designed gRNA target will disable the editing potency. In addition, all four of our designed LTR gRNAs worked well with different cell lines, indicating that the editing is more efficient in the HIV-1 genome than the host cellular genome, wherein not all designed gRNAs are functional, which may be due to different epigenetic regulation, variable genome accessibility, or other reasons. Given the ease and rapidity of Cas9/gRNA development, even if HIV-1 mutations confer resistance to one Cas9/gRNA-based therapy, as described above, HIV-1 variants can be genotyped to enable another personalized therapy for individual patients (10).

Materials and Methods

Plasmid Preparation. Vectors containing human Cas9 and gRNA expression cassette, pX260, and pX330 (Addgene) were used to create various constructs, LTR-A, -B, -C, and -D (for details, see SI Materials and Methods).

Cell Culture and Stable Cell Lines. TZM-bl reporter and U1 cell lines were obtained from the National Institutes of Health (NIH) AIDS Reagent Program, and CHME5 microglial cells were described previously (17). The detailed procedure for cell growth or preparation of stable cell lines is described in SI Materials and Methods.

Immunocytochemistry and Western Blot. Standard methods for immunocytochemical observation of the cells and evaluation of protein expression by Western blot were used as described in detail in SI Materials and Methods.



Firefly Luciferase Assay. Cells were lysed 24 h posttreatment using Passive Lysis Buffer (Promega) and assayed with a Luciferase Reporter Gene Assay kit (Promega) according to the manufacturer’s protocol. Luciferase activity was normalized to the number of cells determined by a parallel MTT assay (Vybrant; Invitrogen).

p24 ELISA. After infection or reactivation, the levels of HIV-1 viral load in supernatant were quantified by p24 Gag ELISA (Advanced BioScience Laboratories, Inc.) following the manufacturer’s protocol. To assess cell viability upon treatments, MTT assay was performed in parallel according to the manufacturer’s manual (Vybrant; Invitrogen).

EGFP Flow Cytometry. Cells were trypsinized, washed with PBS, and fixed in 2% (wt/vol) paraformaldehyde for 10 min at room temperature, then washed twice with PBS and analyzed using a Guava EasyCyte Mini flow cytometer (Guava Technologies).

HIV-1 Reporter Virus Preparation and Infections. HEK293T cells were transfected using Lipofectamine 2000 reagent (Invitrogen) with pNL4-3-ΔE-EGFP (NIH AIDS Research and Reference Reagent Program). After 48 h, the supernatant was collected, 0.45-μm filtered and titered in HeLa cells using EGFP as an infection marker. For viral infection, stable Cas9/gRNA TZM-bl cells were incubated 2 h with diluted viral stock, and then washed twice with PBS. At 2 and 4 d postinfection, cells were collected, fixed, and analyzed by flow cytometry for EGFP expression, or genomic DNA purification was performed for PCR and WGS.

Genomic DNA Amplification, PCR, TA Cloning, Sanger Sequencing, and GenomeWalker Link PCR. Standard methods for DNA manipulation for cloning and sequencing were used (see SI Materials and Methods). For identification of the integration sites of HIV-1, we used a Lenti-X integration site analysis kit as detailed in SI Materials and Methods.

Some PCR products were used for restriction fragment length polymorphism analysis. Equal amounts of the PCR products were digested with BsaJI. Digested DNA was separated on an ethidium bromide-containing agarose gel [2% (wt/vol)]. For sequencing, PCR products were cloned using a TA Cloning Kit Dual Promoter with pCRII vector (Invitrogen). The insert was confirmed by digestion with EcoRI, and positive clones were sent to Genewiz for Sanger sequencing.

SURVEYOR Assay. The presence of mutations in PCR products was examined using a SURVEYOR Mutation Detection Kit (Transgenomic) according to the protocol from the manufacturer. Briefly, heterogeneous PCR product was denatured for 10 min at 95 °C and hybridized by gradual cooling using a thermocycler. Next, 300 ng of hybridized DNA (9 μL) was subjected to digestion with 0.25 μL of SURVEYOR Nuclease in the presence of 0.25 μL SURVEYOR Enhancer S and 15 mM MgCl2 for 4 h at 42 °C. Then, Stop Solution was added and samples were resolved in 2% (wt/vol) agarose gel together with equal amounts of undigested PCR product controls.

Selection of LTR Target Sites, WGS, Bioinformatics, and Statistical Analysis. We used Jack Lin’s CRISPR/Cas9 gRNA finder tool for initial identification of potential target sites within the LTR. Detailed WGS, bioinformatic, and statistical analyses are described in SI Materials and Methods.

ACKNOWLEDGMENTS. We thank Jessica Otte for technical support; Jennifer Gordon, Shohreh Amini, and Xuebin Qin for helpful comments; and Jeffrey B. Tatro and Cynthia Papaleo for editorial assistance. This work was supported by National Institutes of Health Grants R01MH093271 (to K.K.), R01NS087971 (to W.H. and K.K.), and P30MH092177 (to K.K.).


6 of 6 | www.pnas.org/cgi/doi/10.1073/pnas.1405186111 Hu et al.
Supporting Information
Hu et al. 10.1073/pnas.1405186111
SI Materials and Methods

Plasmid Preparation. A DNA segment expressing long terminal repeat (LTR)-A or LTR-B for precrRNA was cloned into the pX260 vector, which contains the puromycin selection gene (Addgene, plasmid #42229). A DNA segment expressing LTR-C or LTR-D for the chimeric crRNA-tracrRNA was cloned into the pX330 vector (Addgene, plasmid #42230). Both vectors contain a humanized Cas9 coding sequence driven by a CAG promoter and a gRNA expression cassette driven by a human U6 promoter (1). The vectors were digested with BbsI and treated with Antarctic Phosphatase, and the linearized vector was purified with a Quick nucleotide removal kit (Qiagen). A pair of oligonucleotides for each targeting site (Table S2, AlphaDNA) was annealed, phosphorylated, and ligated to the linearized vector. The gRNA expression cassette was sequenced with U6 sequencing primer (Table S2) at GENEWIZ. For pX330 vectors, we designed a pair of universal PCR primers with overhang digestion sites (Table S2) that can tease out the gRNA expression cassette (U6-gRNA-crRNA-stem-tracrRNA) for direct transfection or subcloning to other vectors.

Cell Culture. TZM-bl reporter cell line from John C. Kappes, Xiaoyun Wu, and Tranzyme Inc., U1/HIV-1 cell line from Thomas Folks, and J-Lat full length clone from Eric Verdin were obtained through the National Institutes of Health (NIH) AIDS Reagent Program, Division of AIDS, National Institute of Allergy and Infectious Diseases, NIH. CHME5/HIV fetal microglia cell lines were generated as previously described (2). TZM-bl and CHME5 cells were cultured in DMEM high glucose supplemented with 10% (vol/vol) heat-inactivated fetal bovine serum (FBS) and 1% penicillin/streptomycin. U1 and J-Lat cells were cultured in RPMI 1640 containing 2.0 mM L-glutamine, 10% FBS, and 1% penicillin/streptomycin.

Stable Cell Lines and Subcloning. TZM-bl or CHME5/HIV cells were seeded in six-well plates at 1.5 × 10⁵ cells/well and transfected using Lipofectamine 2000 reagent (Invitrogen) with 1 μg of pX260 (for LTR-A and -B) or 1 μg/0.1 μg of pX330/pX260 (for LTR-C and -D) plasmids. The next day, cells were transferred into 100-mm dishes and incubated with growth medium containing 1 μg/mL of puromycin (Sigma). Two weeks later, surviving cell colonies were isolated using cloning cylinders (Corning). U1 and J-Lat cells (1.5 × 10⁵) were electroporated with 1 μg of DNA using a 10-μL tip and 3 × 10-ms 1,400-V pulses with the Neon Transfection System (Invitrogen). Cells were selected with 0.5 μg/mL of puromycin for two weeks. The stable clones were subcultured using a limiting dilution method in 96-well plates, and single cell-derived subclones were maintained for further studies.

Immunocytochemistry and Western Blot. TZM-bl cells stably expressing Cas9/gRNA were cultured in eight-well chamber slides for 2 d and fixed for 10 min in 4% (wt/vol) paraformaldehyde/PBS. After three rinses, the cells were treated with 0.5% Triton X-100/PBS for 20 min and blocked in 10% (vol/vol) donkey serum for 1 h. Cells were incubated overnight at 4 °C with mouse anti-Flag M2 primary antibody (1:500, Sigma). After rinsing three times, cells were incubated for 1 h with donkey antimouse Alexa-Fluor-594 secondary antibodies, and incubated with Hoechst 33258 for 5 min. After three rinses with PBS, the cells were coverslipped with antifading aqueous mounting media (Biomeda) and analyzed under a Leica DMI6000B fluorescence microscope.

TZM-bl cells cultured in six-well plates were solubilized in 200 μL of Triton X-100-based lysis buffer containing 20 mM Tris·HCl (pH 7.4), 1% Triton X-100, 5 mM EDTA, 5 mM DTT, 150 mM NaCl, 1 mM phenylmethylsulfonyl fluoride, 1× nuclear extraction proteinase inhibitor mixture (Cayman Chemical), 1 mM sodium orthovanadate, and 30 mM NaF. Cell lysates were rotated at 4 °C for 30 min. Nuclear and cellular debris was cleared by centrifugation at 20,000 × g for 20 min at 4 °C. Equal amounts of lysate proteins (20 μg) were denatured by boiling for 5 min in sodium dodecyl sulfate (SDS) sample buffer, fractionated by SDS-polyacrylamide gel electrophoresis in Tris-glycine buffer, and transferred to nitrocellulose membrane (BioRad). The SeeBlue prestained standards (Invitrogen) were used as a molecular weight reference. Blots were blocked in 5% (wt/vol) BSA/Tris-buffered saline (pH 7.6) plus 0.1% Tween-20 (TBS-T) for 1 h and then incubated overnight at 4 °C with mouse anti-Flag M2 monoclonal antibody (1:1,000, Sigma) or mouse anti-GAPDH monoclonal antibody (1:3,000, Santa Cruz Biotechnology). After washing with TBS-T, the blots were incubated with IRDye 680LT-conjugated antimouse antibody for 1 h at room temperature. Membranes were scanned and analyzed using an Odyssey Infrared Imaging System (LI-COR Biosciences).

Firefly Luciferase Assay. Cells were lysed 24 h posttreatment using Passive Lysis Buffer (Promega) and assayed with a Luciferase Reporter Gene Assay kit (Promega) according to the protocol of the manufacturer. Luciferase activity was normalized to the number of cells determined by parallel MTT assay (Vybrant; Invitrogen).

p24 ELISA. After infection or reactivation, the HIV-1 viral load levels in the supernatants were quantified by p24 Gag ELISA (Advanced BioScience Laboratories, Inc.) following the manufacturer’s protocol. To assess the cell viability upon treatments, MTT assay was performed in parallel according to the manufacturer’s protocol (Vybrant; Invitrogen).

EGFP Flow Cytometry. Cells were trypsinized, washed with PBS, and fixed in 2% paraformaldehyde for 10 min at room temperature, then washed twice with PBS and analyzed using a Guava EasyCyte Mini flow cytometer (Guava Technologies).

HIV-1 Reporter Virus Preparation and Infections. HEK293T cells were transfected using Lipofectamine 2000 reagent (Invitrogen) with pNL4-3-ΔE-EGFP (NIH AIDS Research and Reference Reagent Program). For pseudotyped pNL4-3-ΔE-EGFP, the VSV-G vector was cotransfected. After 48 h, the supernatant was collected, 0.45-μm filtered, and titered in HeLa cells using expressed EGFP as an infection marker. HIV-1 SF162 and JRFL were obtained through the NIH AIDS Research and Reference Reagent Program: HIV-1 SF162 from Dr. Jay Levy and HIV-1 JRFL from Dr. Irvin Chen. Infectious virus was prepared from phytohaemagglutinin-stimulated peripheral blood mononuclear cells (5 μg/mL, 48 h) and titrated by 50% Tissue Culture Infective Dose (TCID50) assay in TZM-bl cells. For viral infection, stable Cas9/gRNA TZM-bl cells were incubated for 2 h with a diluted viral stock and washed twice with PBS. At 2 and 4 d postinfection, cells were collected, fixed, and analyzed by flow cytometry for EGFP expression, or genomic DNA purification was performed for PCR and whole-genome sequencing.

Genomic DNA Purification, PCR, TA Cloning, and Sanger Sequencing. Genomic DNA was isolated from cells using an ArchivePure DNA

Hu et al. www.pnas.org/cgi/content/short/1405186111 1 of 13
cell/tissue purification kit (5PRIME) according to the protocol recommended by the manufacturer. One hundred nanograms of extracted DNA was subjected to PCR using a high-fidelity FailSafe PCR kit (Epicentre) with the primers listed in Table S2. Standard three-step PCR was carried out for 30 cycles with 55 °C annealing and 72 °C extension. The products were resolved in 2% (wt/vol) agarose gel. The bands of interest were gel-purified and cloned into pCRII T-A vector (Invitrogen), and the nucleotide sequence of individual clones was determined by sequencing at Genewiz using universal T7 and/or SP6 primers.

Conventional and Real-Time Reverse Transcription-PCR. For total RNA extraction, cells were processed with an RNeasy Mini kit (Qiagen) as per the manufacturer’s instructions. Potential residual genomic DNA was removed through on-column DNase digestion with an RNase-Free DNase Set (Qiagen). One microgram of RNA for each sample was reverse transcribed into cDNA using random hexanucleotide primers with a High-Capacity cDNA Reverse Transcription Kit (Invitrogen). Conventional PCR was performed using a standard protocol.

Quantitative PCR (qPCR) analyses were carried out in a LightCycler480 (Roche) using an SYBR Green PCR Master Mix Kit (Applied Biosystems) as described previously (3). The RT reactions were diluted to 5 ng of total RNA per microliter of reaction, and 2 μL was used in a 20-μL PCR. For qPCR analysis of HIV-1 proviruses, 50 ng of genomic DNA was used. The primers were synthesized by AlphaDNA and are shown in Table S2. The primers for human housekeeping genes GAPDH and RPL13A were obtained from RealTimePrimers. Each sample was tested in triplicate. Cycle threshold (Ct) values were obtained graphically for the target genes and housekeeping genes. The difference in Ct values between the housekeeping gene and target gene was represented as ΔCt values. The ΔΔCt values were obtained by subtracting the ΔCt values of control samples from those of experimental samples. Relative fold or percentage change was calculated as 2^(−ΔΔCt). In some cases, absolute quantification was performed using the pNL4-3-ΔE-EGFP plasmid spiked in human genomic DNA as a standard. The number of HIV-1 viral copies was calculated based on a standard curve after normalization with the housekeeping gene.

GenomeWalker Link PCR and Long-Range PCR. The integration sites of HIV-1 in host cells were identified using a Lenti-X Integration Site Analysis Kit (Clontech) following the manufacturer’s instructions. Briefly, high-quality genomic DNA was extracted from U1 cells using a NucleoSpin Tissue kit (Clontech). To construct the viral integration libraries, each genomic DNA sample was digested with the blunt-end-generating restriction enzymes DraI, SspI, or HpaI separately overnight at 37 °C. The digestion efficiency was verified by electrophoresis on 0.6% agarose. The digested DNA was purified using a NucleoSpin Gel and PCR Clean-Up kit followed by ligation of the digested genomic DNA fragments to GenomeWalker Adaptor at 16 °C overnight. The ligation reaction was stopped by incubation at 70 °C for 5 min and diluted fivefold with TE buffer. The primary PCR was performed on the DNA segments with adaptor primer 1 (AP1) and LTR-specific primer 1 (LSP1) using Advantage 2 Polymerase Mix followed by a secondary (nested) PCR using AP2 and LSP2 primers (Table S2). The secondary PCR products were separated on 1.5% ethidium bromide-containing agarose gel. The major bands were gel-purified and cloned into pCRII T-A vector (Invitrogen), and the nucleotide sequence of individual clones was determined by sequencing at Genewiz using universal T7 and SP6 primers.

The sequence reads were analyzed by NCBI BLAST searching. Two integration sites of HIV-1 in U1 cells were identified in chromosomes X and 2. A pair of primers covering each integration site (Table S2) was synthesized by AlphaDNA. Long-range PCR using the U1 genomic DNA was performed with a Phusion High-Fidelity PCR Kit (New England Biolabs) following the manufacturer’s protocol. The PCR products were visualized on 1% agarose gel and validated by Sanger sequencing.

SURVEYOR Assay. The presence of mutations in PCR products was tested using a SURVEYOR Mutation Detection Kit (Transgenomic) according to the protocol of the manufacturer. Briefly, heterogeneous PCR products were denatured for 10 min at 95 °C and hybridized by gradual cooling using a thermocycler. Next, 300 ng of hybridized DNA (9 μL) was subjected to digestion with 0.25 μL of SURVEYOR Nuclease in the presence of 0.25 μL SURVEYOR Enhancer S and 15 mM MgCl2 for 4 h at 42 °C. Then, Stop Solution was added and samples were resolved in 2% agarose gel together with equal amounts of undigested PCR products.

Some PCR products were used for restriction fragment length polymorphism analysis. Equal amounts of PCR products were digested with BsaJI. Digested DNA was separated on an ethidium bromide-containing agarose gel (2%). For sequencing, PCR products were cloned using a TA Cloning Kit Dual Promoter with pCRII vector (Invitrogen). The insert was confirmed by digestion with EcoRI, and positive clones were sent to Genewiz for Sanger sequencing.

Selection of LTR Target Sites and Prediction of Potential Off-Target Sites. For initial studies, we obtained the LTR promoter sequence (−411 to −10) of the integrated lentiviral LTR-luciferase reporter by TA cloning and sequencing of PCR products from the genome of human TZM-bl cells because of potential mutation of the LTR during passaging. This promoter sequence has a 100% match to the 5′-LTR of the pHR’-CMV-LacZ lentiviral vector (AF105229). Thus, sense and antisense sequences of the full-length pHR’ 5′-LTR (634 bp) were used to search for Cas9/gRNA target sites containing a 20-bp gRNA targeting sequence plus the PAM sequence (NRG) using Jack Lin’s CRISPR/Cas9 gRNA finder tool (http://spot.colorado.edu/∼slin/cas9.html). The number of potential off-targets with exact match was predicted by blasting each gRNA targeting sequence plus NRG (AGG, TGG, GGG, and CGG; AAG, TAG, GAG, and CAG) against all available human genomic and transcript sequences using the NCBI/blastn suite with an E-value cutoff of 1,000 and a word size of 7. The BLAST output was then searched (Control + F) for each target sequence (nucleotides 1–23 through 9–23), and the number of genomic targets with a 100% match to the target sequence was counted. The number of off-targets for each search was divided by 3 because of the repeated genome library.

Whole-Genome Sequencing and Bioinformatics Analysis. The control subclone C1 and experimental subclone AB7 of TZM-bl cells were validated for target cut efficiency and functional suppression of the LTR-luciferase reporter. The genomic DNA was isolated with a NucleoSpin Tissue kit (Clontech). The DNA samples were submitted to the NextGen sequencing facility at Temple University Fox Chase Cancer Center. Duplicate genomic DNA libraries were prepared from each subclone using a NEBNext Ultra DNA Library Prep Kit for Illumina (New England Biolabs) following the manufacturer’s instructions. All libraries were sequenced with paired-end 141-bp reads in two Illumina Rapid Run flowcells on a HiSeq 2500 instrument (Illumina). Demultiplexed read data from the sequenced libraries were sent to AccuraScience, LLC (http://www.accurascience.com) for professional bioinformatics analysis. Briefly, the raw reads were mapped against the human genome (hg19) and HIV-1 genome by using Bowtie2 (4). The Genome Analysis Toolkit (GATK, version 2.8.1) was used for duplicate read removal, local alignment, base quality recalibration, and indel calling. Confidence scores of 10 and 30 were used as the thresholds for low-quality (LowQual) and high-confidence (PASS) calls, respectively. The potential off-target sites of LTR-A and LTR-B with various mismatches were predicted by the NCBI/blastn suite as described above and by a CRISPR Design Tool (http://crispr.mit.edu/).

All of the potential gRNA target sites (Table S3) were used to map the ±300-bp regions around each indel identified by GATK. The locations of the overlapped regions in the human genome and HIV-1 genome were compared between the control C1 and experimental AB7.

Statistical Analysis. The quantitative data represent mean ± SD from three to five independent experiments and were evaluated by Student t test or ANOVA and Newman-Keuls multiple comparison test. A P value < 0.05 or < 0.01 was considered statistically significant.
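
The relative quantification described under Conventional and Real-Time Reverse Transcription-PCR can be sketched in a few lines of Python. This is an illustration only, not the authors' analysis script: the function name and the Ct values are hypothetical, and the sign convention ΔCt = Ct(target) − Ct(housekeeping) is an assumption, because the text does not state the direction of the difference.

```python
from statistics import mean


def relative_fold_change(ct_target_exp, ct_ref_exp, ct_target_ctrl, ct_ref_ctrl):
    """Return 2^(-ddCt) from replicate Ct values (lists of floats).

    Assumed convention: dCt = mean Ct(target) - mean Ct(housekeeping).
    ddCt = dCt(experimental) - dCt(control), as described in the text.
    """
    d_ct_exp = mean(ct_target_exp) - mean(ct_ref_exp)
    d_ct_ctrl = mean(ct_target_ctrl) - mean(ct_ref_ctrl)
    dd_ct = d_ct_exp - d_ct_ctrl
    return 2.0 ** (-dd_ct)


# Hypothetical triplicate Ct values, for illustration only.
fold = relative_fold_change(
    ct_target_exp=[27.0, 27.0, 27.0],
    ct_ref_exp=[18.0, 18.0, 18.0],
    ct_target_ctrl=[24.0, 24.0, 24.0],
    ct_ref_ctrl=[18.0, 18.0, 18.0],
)
print(f"relative expression: {fold:.3f}")  # relative expression: 0.125
```

With these made-up values, ΔΔCt is 3 cycles, so the target is present at one-eighth of the control level.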

1. Cong L, et al. (2013) Multiplex genome engineering using CRISPR/Cas systems. Science 339(6121):819–823.
2. Wires ES, et al. (2012) Methamphetamine activates nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB) and induces human immunodeficiency virus (HIV) transcription in human microglial cells. J Neurovirol 18(5):400–410.
3. Hu W, Mahavadi S, Li F, Murthy KS (2007) Upregulation of RGS4 and downregulation of CPI-17 mediate inhibition of colonic muscle contraction by interleukin-1beta. Am J Physiol Cell Physiol 293(6):C1991–C2000.
4. Langmead B, Trapnell C, Pop M, Salzberg SL (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10(3):R25.

Fig. S1. LTR U3 sequence of the integrated lentiviral LTR-firefly luciferase reporter identified by TA cloning and sequencing of PCR product (−411 to −10) from
the genomic DNA of human TZM-bl cells. The protospacer and PAM (NGG) sequences of four gRNAs (LTRs A−D) and the predicted binding sites of indicated
transcription factors are highlighted. The precise cleavage sites are marked with scissors; +1 indicates the transcriptional start site.
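
The kind of protospacer and PAM search behind Fig. S1 and Table S1 can be sketched as follows. This is a minimal illustration and not Jack Lin's gRNA finder tool: the function names are hypothetical, the NRG pattern follows the SI text (pass `"[ACGT]GG"` for NGG only), and the cut position 3 nt upstream of the PAM is an assumption based on the Fig. S5 description.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_target_sites(seq, pam_pattern="[ACGT][AG]G", protospacer_len=20):
    """List (strand, start, protospacer, pam, cut_index) for each candidate site.

    Both strands are scanned with an overlapping (lookahead) search.
    For the "-" strand, start and cut_index refer to the reverse complement.
    cut_index is the 0-based index of the first base after the assumed cut,
    3 nt upstream of the PAM.
    """
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})({pam_pattern}))")
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            cut_index = m.start() + protospacer_len - 3
            sites.append((strand, m.start(), m.group(1), m.group(2), cut_index))
    return sites


# LTR-A target plus PAM, taken from Table S1 (sense strand, no. 18).
sites = find_target_sites("ATCAGATATCCACTGACCTTTGG")
for site in sites:
    print(site)
```

Scanning the full 634-bp LTR on both strands in this way would give a candidate list to screen for off-target matches, as described in the SI text.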

Fig. S2. LTR-C and LTR-D remarkably suppress TSA-induced reactivation of latent pNL4-3-ΔGag-d2EGFP virus in CHME5 microglia cells. (A) Diagram sche-
matically showing pNL4-3-ΔGag-d2EGFP vector containing transactivator of transcription (Tat), Rev, Env, Vpu, and Nef with the reporter gene d2EGFP. (B) The
SURVEYOR assay showing indel mutations in the on-target LTR genome of Cas9/LTR-D but not Cas9/LTR-C transfected cells. (C) Representative gating diagram
of EGFP flow cytometry showing a dramatic reduction in TSA-induced reactivation of latent pNL4-3-ΔGag-d2EGFP reporter viruses by stable expression of Cas9/
LTR-C or LTR-D compared with empty U6-driven gRNA expression vector (U6-CAG).

Fig. S3. Both LTR-C and LTR-D induced indel mutations and significantly decreased constitutive and TSA/PMA-induced luciferase activity in TZM-bl cells stably
incorporated with HIV-1 LTR-firefly luciferase reporter gene. (A) Functional luciferase reporter assay revealing a significant reduction of LTR reactivation by
LTR-C, LTR-D, or both. (B) SURVEYOR assay showing indel mutation in LTR DNA (−453 to +43) induced by LTR-C and LTR-D (red arrow). A combination of LTR-C
and LTR-D generates a 194-bp fragment (black arrow) resulting from the deletion of 302-bp region between LTR-C and LTR-D. (C and D) Sanger sequencing of
30 clones validating the indel efficiency at 23% for LTR-C and 13% for LTR-D and example chromatograms showing insertion/deletion. (E) PCR-restriction
fragment length polymorphism analysis using BsaJ I to cut five sites (96, 102, 372, 386, 482) of the PCR product covering −453 to +43 of LTR showing two major
bands (96 bp and 270 bp) in the U6-CAG control sample, but an additional 372-bp band (red arrow) after LTR-C-induced indel mutation at the 96/102 sites,
a 290-bp band (red arrow) after LTR-D-induced mutations at the 372 site, and a 180-bp fragment (black arrow) after LTR-C/D-induced excision. (F) Example
chromatograms showing the deletion of a 302-bp fragment between LTR-C and LTR-D (Upper) and an additional 17-bp deletion (Lower). Red arrows indicate
the junction sites. *P < 0.05 indicates a significant decrease in LTR-C or LTR-D-mediated luciferase activation compared with U6-CAG control.
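
The band sizes in the PCR-RFLP analysis of Fig. S3E follow from simple arithmetic on the cut positions, sketched below. The helper name is hypothetical, the listed site coordinates are treated as exact cut positions, and the product length of 496 bp assumes that −453 to +43 spans 453 + 43 bases with no position 0 (consistent with 194 bp remaining after the 302-bp deletion in Fig. S3B).

```python
def digest_fragments(product_len, cut_positions):
    """Return fragment lengths (bp) after cutting a linear product at each position."""
    bounds = [0] + sorted(cut_positions) + [product_len]
    return [b - a for a, b in zip(bounds, bounds[1:])]


PRODUCT_LEN = 453 + 43  # assumed length of the -453 to +43 PCR product
BSAJI_SITES = [96, 102, 372, 386, 482]  # site positions from the Fig. S3E legend

# Control: all five BsaJI sites intact.
control = digest_fragments(PRODUCT_LEN, BSAJI_SITES)

# LTR-C indel assumed to destroy both the 96 and 102 sites.
ltr_c = digest_fragments(PRODUCT_LEN, [372, 386, 482])

print(control)  # [96, 6, 270, 14, 96, 14]
print(ltr_c)    # [372, 14, 96, 14]
```

The control digest reproduces the two major bands (96 bp and 270 bp), and loss of the 96/102 sites yields the additional 372-bp band described in the legend.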

Fig. S4. TA cloning and Sanger sequencing of PCR products from CHME5 subclones of LTR-A/B and empty U6-CAG control using primers covering HIV-1 LTR
U3/R/U5 regions (−411 to + 129). (A) Possible combination of LTR-A and LTR-B cuts on both 5′ and 3′ LTRs generating potential fragments a−c as indicated. (B)
Blasting of fragment a (351 bp) showing 190-bp deletion between LTR-A and LTR-B cut sites. (C) Blast of fragment c (682 bp) showing a 175-bp insertion at the
LTR-A cleavage site and a 27-bp deletion at the LTR-B cleavage site.

Fig. S5. Cas9/LTR-gRNA efficiently eradicates latent HIV-1 virus from U1 monocytic cells. (A) Sanger sequencing of a 1.1-kb fragment from long-range PCR
using a primer pair (T492/T493) targeting a chromosome 2 integration site-flanking sequence (small letters, 467 bp) reveals elimination of the entire HIV-1
genome (9709 bp), leaving combined 5′ LTR (green) and 3′ LTR with a 6-bp insertion (red) precisely at the third nucleotide from PAM (highlighted green on red
text) LTR-A targeting site (underlined) and a 4-bp deletion (highlighted pink). (B) The representative DNA gel picture shows specific eradication of the HIV-1
genome. NS, nonspecific band. (C and D) Quantitative PCR analysis using the primer pair targeting the Gag gene (T457/T458) shows 85% efficiency of entire
HIV-1 genome eradication in Cas9/LTR-A/B-expressing U1 cells. U1 cells were transfected with pX260 empty vector (U6-CAG) or LTRs-A/B-encoding vectors. After
2-wk puromycin selection, the cellular genomic DNAs were used for absolute quantitative qPCR analysis using spiked pNL4-3-ΔE-EGFP human genomic DNA as
a standard. **P < 0.01 indicates a significant decrease compared with the U6-CAG control.
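
Absolute quantification against a spiked plasmid standard, as used for Fig. S5 C and D, is usually done by fitting Ct against log10 copy number and reading samples off the fitted line. The sketch below shows that general approach under stated assumptions: all function names and numbers are hypothetical, the housekeeping-gene normalization mentioned in the SI text is omitted, and nothing here reproduces the authors' data.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copy number from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def percent_reduction(copies_treated, copies_control):
    """Percentage decrease in copy number relative to the control."""
    return 100.0 * (1.0 - copies_treated / copies_control)


# Hypothetical tenfold dilution series of the plasmid standard.
std_copies = [1e2, 1e3, 1e4, 1e5]
std_ct = [33.0, 29.7, 26.4, 23.1]
slope, intercept = fit_standard_curve(std_copies, std_ct)

# Hypothetical sample Ct values for control and treated cells.
control_copies = copies_from_ct(26.4, slope, intercept)
treated_copies = copies_from_ct(29.7, slope, intercept)
print(f"reduction: {percent_reduction(treated_copies, control_copies):.0f}%")
```

In practice the copy numbers would first be normalized to a housekeeping gene measured in the same samples before the reduction is calculated.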

Fig. S6. Cas9/LTR gRNAs effectively eradicate HIV-1 provirus in J-Lat latently infected T cells. (A) Functional analysis by EGFP flow cytometry reveals ∼50%
reduction of PMA and TNFα-induced reactivation of EGFP reporter viruses. (B) The SURVEYOR assay shows indel mutations (red arrow) in the on-target LTR
genome of Cas9/LTR-A/B transfected cells. J-Lat cells were transfected with pX260 empty vector or LTR-A and -B. After 2-wk puromycin selection, cells were
treated with PMA or TNFα for 24 h. The genomic DNAs were subject to PCR using primers covering HIV-1 LTR U3/R/U5 regions (−411 to +129), and the
SURVEYOR assay was performed. **P < 0.01 indicates a significant decrease compared with the U6-CAG control. (C) PCR fragment analysis using primers
covering HIV-1 LTR (−374 to +43) shows a precise deletion of 190-bp region between LTR-A and -B cutting sites, leaving 227-bp fragment (red arrow).
Housekeeping gene β-actin serves as a DNA purification and loading control.

Fig. S7. Genome-editing efficiency depends upon the presence of Cas9 and gRNAs. (A and B) PCR genotyping reveals the absence of a U6-driven LTR-A or LTR-B
expression cassette (A) and absence/reduction of CMV-driven Cas9 DNA (B) in puromycin-selected TZM-bl subclones without any indication of genomic editing.
Genomic DNAs from indicated subclones were subject to conventional (A) or real-time (B) PCR analyses using a primer pair covering U6 promoter (T351) and LTR-A
(T354) or LTR-B (T356), and targeting Cas9 (T477/T491). (C and D) Cas9 protein expression is absent in ineffective TZM-bl subclones. The Flag-tagged Cas9 fusion
protein was detected by Western blot (WB) and immunocytochemistry (ICC) with anti-Flag monoclonal antibody. HEK293T cell line stably expressing Flag-Cas9 was
used as a positive control for WB (C). GAPDH serves as a protein loading control. Clone c6 contains Cas9 DNA but no Cas9 protein expression, suggesting a potential
mechanism of epigenetic repression after puromycin selection. Clones c5 and c3 may represent a truncated Flag-Cas9 (tCas9). Nucleus was stained with Hoechst
33258 (D).

Fig. S8. Stable expression of Cas9/LTR-A/B gRNAs in TZM-bl cells vaccinates against pseudotyped or native HIV-1 viruses. (A) Flow cytometry shows a significant
reduction of native pNL4-3-ΔE-EGFP reporter virus infection efficiency in Cas9/LTR-A/B expressing TZM-bl subclones. (B and C) Real-time PCR analysis reveals
suppression or elimination of viral RNA (B) and DNA (C) by Cas9/LTR-A/B gRNAs. (D) The firefly luciferase luminescent assay demonstrates dramatic inhibition of
virus infection-stimulated LTR promoter activity by Cas9/LTR-A/B gRNAs. The stable Cas9/LTR-A/B gRNA-expressing TZM-bl cells were infected for 2 h with
indicated native HIV-1 viruses, and washed twice with PBS. At 2 d postinfection, cells were collected, fixed, and analyzed by flow cytometry for EGFP expression
(A), or lysed for total RNA extraction and RT-qPCR (B), genomic DNA purification for qPCR (C), and luminescence measurement (D). *P < 0.05 and **P < 0.01
indicate significant decreases compared with the U6-CAG control.

Table S1. Predicted LTR gRNAs and their off-target numbers (100% match)
Name No. gRNA sequence 20 19 18 17 16 15 14 13 12

Sense
1 TCAGACCCTTTTAGTCAGTGTGG 0 0 0 0 0 0 0 2 25
2 TTGCTTGTACTGGGTCTCTCTGG 0 0 0 0 0 0 0 3 10
3 CAGCTGCTTTTTGCTTGTACTGG 0 0 0 0 0 0 2 4 11
4 CTGACATCGAGCTTGCTACAAGG 0 0 0 0 0 0 0 0 17
5 CCGCCTAGCATTTCATCACATGG 0 0 0 0 0 1 2 9 55
6 CGGAGAGAGAAGTATTAGAGTGG 0 0 0 0 0 0 0 3 18
7 AGTACCAGTTGAGCAAGAGAAGG 0 0 0 0 0 0 0 1 18
8 GATATCCACTGACCTTTGGATGG 0 0 0 0 0 1 3 22 211
LTR-C 9 GATTGGCAGAACTACACACCAGG 0 0 0 0 0 0 0 2 12
10 CACAAGGCTACTTCCCTGATTGG 0 0 0 0 0 0 0 2 11
11 CTGTGGATCTACCACACACAAGG 0 0 0 0 0 2 5 12 37
12 TGGGAGCTCTCTGGCTAACTAGG 0 0 0 0 0 0 0 4 14
13 GGTTAGACCAGATCTGAGCCTGG 0 0 0 0 0 0 0 9 33
14 TGCTACAAGGGACTTTCCGCTGG 0 0 0 0 0 0 0 2 5
15 AGAGAGAAGTATTAGAGTGGAGG 0 0 0 0 0 0 5 7 16
16 TTACACCCTGTGAGCCTGCATGG 0 0 0 0 0 0 3 14 35
17 AAGGTAGAAGAAGCCAATGAAGG 0 0 0 0 0 0 4 36 75
LTR-A 18 ATCAGATATCCACTGACCTTTGG 0 0 0 0 0 0 1 6 16
19 GACAAGATATCCTTGATCTGTGG 0 0 0 0 0 0 0 0 10
20 GCCCGTCTGTTGTGTGACTCTGG 0 0 0 0 0 0 3 7 35
21 ATCTGAGCCTGGGAGCTCTCTGG 0 0 0 0 1 2 9 32 78
22 CTTTCCGCTGGGGACTTTCCAGG 0 0 0 0 0 0 0 0 4
23 CAGAACTACACACCAGGGCCAGG 0 0 0 0 0 0 0 3 20
24 CCTGCATGGGATGGATGACCCGG 0 0 0 0 2 2 2 5 21
25 CCCTGTGAGCCTGCATGGGATGG 0 0 0 0 1 1 2 9 30
26 CTTTCCAGGGAGGCGTGGCCTGG 0 0 0 0 0 8 15 32 75
27 GGGGACTTTCCAGGGAGGCGTGG 0 0 0 0 0 0 0 5 24
28 CCGCTGGGGACTTTCCAGGGAGG 0 0 0 0 0 1 3 9 25
29 CATGGCCCGAGAGCTGCATCCGG 0 0 0 0 0 0 0 0 16
30 GCCTGGGCGGGACTGGGGAGTGG 0 0 0 0 1 2 7 18 250
31 AGGCGTGGCCTGGGCGGGACTGG 0 0 0 0 0 0 0 2 6
LTR-D 32 GCGTGGCCTGGGCGGGACTGGGG 0 0 0 0 0 0 6 9 29
33 CCAGGGAGGCGTGGCCTGGGCGG 0 0 0 0 2 2 2 11 22
Antisense
1 TGTGGTAGATCCACAGATCAAGG 0 0 0 0 0 0 0 3 13
2 GGTGTGTAGTTCTGCCAATCAGG 0 0 0 0 0 0 0 8 11
3 GTCAGTGGATATCTGATCCCTGG 0 0 0 0 0 0 0 2 10
4 TAGCACCATCCAAAGGTCAGTGG 0 0 0 0 0 0 2 3 10
5 TAGCTTGTAGCACCATCCAAAGG 0 0 0 0 0 1 1 6 12
6 TCTACCTTCTCTTGCTCAACTGG 0 0 0 0 0 0 1 1 6
7 CACTCTAATACTTCTCTCTCCGG 0 0 0 0 0 0 0 1 26
8 CCATGTGATGAAATGCTAGGCGG 0 0 0 0 0 1 1 5 17
9 GGGCCATGTGATGAAATGCTAGG 0 0 0 0 0 0 0 3 48
LTR-B 10 CAGCAGTTCTTGAAGTACTCCGG 0 0 0 0 0 0 0 0 5
11 CTGCTTATATGCAGCATCTGAGG 0 0 0 0 0 0 0 4 39
12 CACACTACTTGAAGCACTCAAGG 0 0 0 0 0 0 0 3 19
13 TACCAGAGTCACACAACAGACGG 0 0 0 0 0 0 0 2 12
14 ACACTGACTAAAAGGGTCTGAGG 0 0 0 0 1 2 3 4 15
15 CAAGGATATCTTGTCTTCGTTGG 0 0 0 0 0 0 0 1 5
16 CAGGGAAGTAGCCTTGTGTGTGG 0 0 0 0 0 1 2 4 17
17 GCGGGTGTTCTCTCCTTCATTGG 0 0 0 0 0 2 2 15 49
18 TAGTTAGCCAGAGAGCTCCCAGG 0 0 0 0 0 2 4 24 93
19 CTTTATTGAGGCTTAAGCAGTGG 0 0 0 0 0 0 0 4 25
20 ACTCAAGGCAAGCTTTATTGAGG 0 0 0 0 0 0 0 1 28
21 GGATATCTGATCCCTGGCCCTGG 0 0 0 0 0 1 2 8 43
22 GGCTCACAGGGTGTAACAAGCGG 0 0 0 0 0 0 0 2 5
23 TCCATCCCATGCAGGCTCACAGG 0 0 0 0 0 3 3 8 20
24 AGTACTCCGGATGCAGCTCTCGG 0 0 0 0 0 0 1 5 48
25 AGAGCTCCCAGGCTCAGATCTGG 0 0 0 0 0 0 4 15 38
26 GATTTTCCACACTGACTAAAAGG 0 0 0 0 0 0 0 8 21
27 CCGGGTCATCCATCCCATGCAGG 0 0 0 0 0 0 0 4 36
28 CCTCCCTGGAAAGTCCCCAGCGG 0 0 0 0 0 0 3 14 37
29 GCCACTCCCCAGTCCCGCCCAGG 0 0 0 0 0 0 1 4 14
30 CCGCCCAGGCCACGCCTCCCTGG 0 0 0 0 0 1 1 6 19
The 5′ LTR sense and antisense sequences (634 bp) of the pHR’-CMV-LacZ lentiviral vector (AF105229) were
searched for Cas9/gRNA target sites containing a 20-bp guide sequence (protospacer) plus the protospacer
adjacent motif (PAM) sequence (NGG) using Jack Lin’s CRISPR/Cas9 gRNA finder tool
(http://spot.colorado.edu/~slin/cas9.html). Each gRNA plus NGG (AGG, TGG, GGG, CGG) was searched by BLAST
against available human genomic and transcript sequences, with 1,000 aligned sequences displayed. Within the
displayed alignments, each target sequence (nucleotides 1-23 through 9-23) was located with the browser find
function (Control + F), and the number of genomic targets with a 100% match was counted. The number of
off-targets for each search was divided by 3 because of the repeated genome library. The number shown is the
sum of the four searches (one per NGG variant). The red number indicates the gRNA target sequences farthest
from NGG. The sequence and off-target numbers for the selected LTR-A/B and LTR-C/D are highlighted red and
green, respectively.
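The target-site search described in this footnote can be illustrated with a minimal Python sketch. It is not the gRNA finder tool cited above; it only assumes that a candidate site is any 20-nt protospacer immediately followed by NGG on either strand. The function names `reverse_complement` and `find_grna_sites` are hypothetical and chosen for illustration.

```python
import re

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_grna_sites(seq, guide_len=20):
    """List every protospacer + NGG site on the sense and antisense strands.

    Each result is (strand, start, site), where start is the 1-based
    position of the site on the strand being read 5' to 3'.
    """
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d}[ACGT]GG))" % guide_len)
    sites = []
    strands = (("sense", seq.upper()), ("antisense", reverse_complement(seq)))
    for strand, strand_seq in strands:
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start() + 1, match.group(1)))
    return sites

if __name__ == "__main__":
    # Toy input: the LTR-A site from Table S1 (20-nt protospacer + TGG).
    for site in find_grna_sites("ATCAGATATCCACTGACCTTTGG"):
        print(site)
```

Applied to a full LTR sequence, such a scan would be expected to return overlapping candidates on both strands, which is how the sense and antisense lists in Table S1 are organized; the off-target counts themselves still require a genome-wide search.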

Table S2. Oligonucleotides for gRNA targeting sites and primers used for PCR and sequencing
Target name Direction Sequences (5′ to 3′)

LTR-A T353: Sense aaacAGGGCCAGGGATCAGATATCCACTGACCTTgt
T354: Antisense taaacAAGGTCAGTGGATATCTGATCCCTGGCCCT
LTR-B T355: Antisense aaacAGCTCGATGTCAGCAGTTCTTGAAGTACTCgt
T356: Sense taaacGAGTACTTCAAGAACTGCTGACATCGAGCT
LTR-C T357: Sense caccGATTGGCAGAACTACACACC
T358: Antisense aaacGGTGTGTAGTTCTGCCAATC
LTR-D T359: Sense caccGCGTGGCCTGGGCGGGACTG
T360: Antisense aaacCAGTCCCGCCCAGGCCACGC
PCR primer
LTR -453/F Sense TGGAAGGGCTAATTCACTCCCAAC
LTR +43/R Antisense CCGAGAGCTCCCAGGCTCAGATCT
LTR -411/F T361: Sense caccGATCTGTGGATCTACCACACACA
LTR +129/R T363: Antisense aaacGAGTCACACAACAGACGGGC
Cas-hU6/5′/XhoBm T351: Sense cgcctcgaggatccGAGGGCCTATTTCCCATGATTCC
Cas-CAG/3′/EcoR T352: Antisense tgtgaattcAGGCGGGCCATTTACCGTAAGTTATG
U1-Chromosome X T485: Sense ACGACTATCTTATCAATCCTTCCTG
T486: Antisense CTAGGTGATTAGGATATTCTACAATC
U1-Chromosome 2 T492: Sense GCTATTGTATCTGATCACAAGCTG
T493: Antisense TTGATTGTGTGTCCAGGTCCTAGG
d2EGFP T494: Sense GCAAGGGCGAGGAGCTGTTCACC
T495: Antisense TTGTAGTTGCCGTCGTCCTTGAAG
Gag T457: Sense AATGGTACATCAGGCCATATCAC
T458: Antisense CCCACTGTGTTTAGCATGGTATT
Cas9 T477: Sense CACAGCATCAAGAAGAACCTGAT
T491: Antisense TCTTCCGTCTGGTGTATCTTCTTC
RRE Sense CGCCAAGCTTGAATAGGAGCTTTGTTCC
Antisense CTAGGATCCAGGAGCTGTTGATCCTTTAGG
Off-Target (OT)
LTR-A-OT-1 T465: Sense GTGGACTTTGGATGGTGAGATAG
T466: Antisense GCCTGGCAAGAGTGAACTGAGTC
LTR-A-OT-2 T467: Sense AAGATAATGAGTTGTGGCAGAGC
T468: Antisense TCTACCTGGTAATCCAGCATCTGG
LTR-A-OT-3 T469: Sense ATAGGAGGAAGGCACCAAGAGGG
T470: Antisense AATGATGCTTTGGTCCTACTCCT
LTR-A-OT-4 T471: Sense TGCTCTTGCTACTCTGGCATGTAC
T472: Antisense AATCTACCTCTGAGAGCTGCAGG
LTR-A-OT-5 T473: Sense TCAGACACAGCTGAAGCAGAGGC
T474: Antisense ATGCCAGTGTCAGTAGATGTCAG
LTR-A-OT-6 T475: Sense TCAAGATCAGCCAGAGTGCACATG
T476: Antisense TGCTCTTCCGAGCCTCTCTGGAG
Others
hU6-sequence T428: Sense ATGGACTATCATATGCTTACCG
LSP1 Sense GCTTCAGCAAGCCGAGTCCTGCGTCGAG
LSP2 Antisense GCTCCTCTGGTTTCCCTTTCGCTTTCAA
AP1 Sense GTAATACGACTCACTATAGGGC
AP2 Antisense ACTATAGGGCACGCGTGGT
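
The LTR-C and LTR-D oligonucleotide pairs in Table S2 follow a simple pattern: the sense oligo is the 20-nt guide with a `cacc` prefix, and the antisense oligo is the reverse complement of the guide with an `aaac` prefix. The sketch below reproduces that pattern as read from the table. The function name `design_oligo_pair` is hypothetical, and the longer LTR-A and LTR-B oligos use a different layout that this sketch does not cover.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def design_oligo_pair(guide, sense_prefix="cacc", antisense_prefix="aaac"):
    """Build a sense/antisense oligo pair in the style of the LTR-C/LTR-D rows.

    Prefixes are kept lowercase and guide bases uppercase, matching the
    notation used in Table S2. The default prefixes are taken from the table.
    """
    guide = guide.upper()
    sense = sense_prefix + guide
    antisense = antisense_prefix + reverse_complement(guide)
    return sense, antisense

if __name__ == "__main__":
    # Guides for LTR-C and LTR-D as listed in Table S1 (without the PAM).
    for name, guide in (("LTR-C", "GATTGGCAGAACTACACACC"),
                        ("LTR-D", "GCGTGGCCTGGGCGGGACTG")):
        sense, antisense = design_oligo_pair(guide)
        print(name, sense, antisense)
```

Running the sketch on the LTR-C and LTR-D guides yields the T357/T358 and T359/T360 sequences listed above, which serves as a quick consistency check on the table entries.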

Table S3. Locations of predicted gRNA targeting sites of LTR-A and LTR-B
Name Query seq Subject Id Identity, % E-value Start End Strand Ref. seq Mismatch (12-bp seed)
LTR-A ATCAGATATCCACTGACCTTTGG HIV 100 7.00E–04 162 184 + ATCAGATATCCACTGACCTTTGG 0
LTR-A TCAGATATCCACTGACCTTTGG HIV 100 0.003 163 184 + TCAGATATCCACTGACCTTTGG 0
LTR-A TCAGATATCCACTGACCTTTGG HIV 100 0.003 9091 9112 + TCAGATATCCACTGACCTTTGG 0
LTR-A CAGATATCCACTGACCTTTGG HIV 100 0.009 164 184 + CAGATATCCACTGACCTTTGG 0
LTR-A CAGATATCCACTGACCTTTGG HIV 100 0.009 9092 9112 + CAGATATCCACTGACCTTTGG 0
LTR-A AGATATCCACTGACCTTTGG HIV 100 0.033 165 184 + AGATATCCACTGACCTTTGG 0
LTR-A AGATATCCACTGACCTTTGG HIV 100 0.033 9093 9112 + AGATATCCACTGACCTTTGG 0
LTR-A GATATCCACTGACCTTTGG HIV 100 0.12 166 184 + GATATCCACTGACCTTTGG 0
LTR-A GATATCCACTGACCTTTGG HIV 100 0.12 9094 9112 + GATATCCACTGACCTTTGG 0
LTR-A ATATCCACTGACCTTTGG HIV 100 0.42 167 184 + ATATCCACTGACCTTTGG 0
LTR-A ATATCCACTGACCTTTGG HIV 100 0.42 9095 9112 + ATATCCACTGACCTTTGG 0
LTR-A TATCCACTGACCTTGGG chr5 100 1.5 21926317 21926333 + TATCCACTGACCTTGGG 0
LTR-A TATCCACTGACCTTTGG HIV 100 1.5 168 184 + TATCCACTGACCTTTGG 0
LTR-A TATCCACTGACCTTTGG HIV 100 1.5 9096 9112 + TATCCACTGACCTTTGG 0
LTR-A TATCCACTGACCTTAAG chr3 100 1.5 116712577 116712593 + TATCCACTGACCTTAAG 0
LTR-A TATCCACTGACCTTGAG chr6 100 1.5 32460607 32460623 + TATCCACTGACCTTGAG 0
LTR-A ATCCACTGACCTTAGG chr3 100 5.4 2669092 2669107 + ATCCACTGACCTTAGG 0
LTR-A ATCCACTGACCTTAGG chr3 100 5.4 158293369 158293384 + ATCCACTGACCTTAGG 0
LTR-A ATCCACTGACCTTGGG chr20 100 5.4 46918344 46918359 + ATCCACTGACCTTGGG 0
LTR-A ATCCACTGACCTTGGG chr14 100 5.4 86310067 86310082 − ATCCACTGACCTTGGG 0
LTR-A ATCCACTGACCTTGGG chr5 100 5.4 21926318 21926333 + ATCCACTGACCTTGGG 0
LTR-A ATCCACTGACCTTGGG chr4 100 5.4 95491921 95491936 − ATCCACTGACCTTGGG 0
LTR-A ATCCACTGACCTTTGG HIV 100 5.4 169 184 + ATCCACTGACCTTTGG 0
LTR-A ATCCACTGACCTTTGG HIV 100 5.4 9097 9112 + ATCCACTGACCTTTGG 0
LTR-A ATCCACTGACCTTTGG chr6 100 5.4 98901053 98901068 + ATCCACTGACCTTTGG 0
LTR-A ATCCACTGACCTTAAG chr7 100 5.4 155511293 155511308 − ATCCACTGACCTTAAG 0
LTR-A ATCCACTGACCTTAAG chr3 100 5.4 116712578 116712593 + ATCCACTGACCTTAAG 0
LTR-A ATCCACTGACCTTCAG chr5 100 5.4 152371289 152371304 + ATCCACTGACCTTCAG 0
LTR-A ATCCACTGACCTTCAG chr4 100 5.4 110823169 110823184 − ATCCACTGACCTTCAG 0
LTR-A ATCCACTGACCTTGAG chrX 100 5.4 74260260 74260275 + ATCCACTGACCTTGAG 0
LTR-A ATCCACTGACCTTGAG chr6 100 5.4 32460608 32460623 + ATCCACTGACCTTGAG 0
LTR-A TCCACTGACCTTAGG chr12 100 20 14485012 14485026 − TCCACTGACCTTAGG 0
LTR-A TCCACTGACCTTAGG chr7 100 20 72210628 72210642 − TCCACTGACCTTAGG 0
LTR-A TCCACTGACCTTAGG chr6 100 20 160845640 160845654 + TCCACTGACCTTAGG 0
LTR-A TCCACTGACCTTAGG chr3 100 20 2669093 2669107 + TCCACTGACCTTAGG 0
LTR-A TCCACTGACCTTAGG chr3 100 20 158293370 158293384 + TCCACTGACCTTAGG 0
LTR-A TCCACTGACCTTAGG chr2 100 20 237551230 237551244 − TCCACTGACCTTAGG 0
LTR-A TCCACTGACCTTGGG chr20 100 20 46918345 46918359 + TCCACTGACCTTGGG 0
LTR-A TCCACTGACCTTGGG chr14 100 20 86310067 86310081 − TCCACTGACCTTGGG 0
LTR-A TCCACTGACCTTGGG chr12 100 20 116054688 116054702 + TCCACTGACCTTGGG 0
LTR-A TCCACTGACCTTGGG chr11 100 20 103532094 103532108 + TCCACTGACCTTGGG 0
LTR-A TCCACTGACCTTGGG chr10 100 20 132186431 132186445 − TCCACTGACCTTGGG 0
LTR-A TCCACTGACCTTGGG chr8 100 20 144600475 144600489 − TCCACTGACCTTGGG 0
LTR-A TCCACTGACCTTGGG chr5 100 20 21926319 21926333 + TCCACTGACCTTGGG 0
LTR-A TCCACTGACCTTGGG chr4 100 20 95491921 95491935 − TCCACTGACCTTGGG 0
LTR-A TCCACTGACCTTTGG HIV 100 20 170 184 + TCCACTGACCTTTGG 0
LTR-A TCCACTGACCTTTGG HIV 100 20 9098 9112 + TCCACTGACCTTTGG 0
LTR-A TCCACTGACCTTTGG chr16 100 20 86962569 86962583 + TCCACTGACCTTTGG 0
LTR-A TCCACTGACCTTTGG chr11 100 20 68156214 68156228 + TCCACTGACCTTTGG 0
LTR-A TCCACTGACCTTTGG chr6 100 20 98901054 98901068 + TCCACTGACCTTTGG 0
LTR-A TCCACTGACCTTTGG chr5 100 20 72600080 72600094 + TCCACTGACCTTTGG 0
LTR-A TCCACTGACCTTTGG chr5 100 20 136458169 136458183 + TCCACTGACCTTTGG 0
LTR-A TCCACTGACCTTTGG chr4 100 20 25353030 25353044 + TCCACTGACCTTTGG 0
LTR-A TCCACTGACCTTTGG chr2 100 20 207833373 207833387 + TCCACTGACCTTTGG 0
LTR-A TCCACTGACCTTAAG chr15 100 20 67850506 67850520 − TCCACTGACCTTAAG 0
LTR-A TCCACTGACCTTAAG chr7 100 20 155511293 155511307 − TCCACTGACCTTAAG 0
LTR-A TCCACTGACCTTAAG chr5 100 20 25142541 25142555 − TCCACTGACCTTAAG 0
LTR-A TCCACTGACCTTAAG chr3 100 20 116712579 116712593 + TCCACTGACCTTAAG 0
LTR-A TCCACTGACCTTAAG chr1 100 20 163298514 163298528 + TCCACTGACCTTAAG 0
LTR-A TCCACTGACCTTCAG chr20 100 20 22136764 22136778 − TCCACTGACCTTCAG 0
LTR-A TCCACTGACCTTCAG chr19 100 20 50519462 50519476 − TCCACTGACCTTCAG 0
LTR-A TCCACTGACCTTCAG chr18 100 20 74623621 74623635 − TCCACTGACCTTCAG 0
LTR-A TCCACTGACCTTCAG chr16 100 20 71402733 71402747 + TCCACTGACCTTCAG 0
LTR-A TCCACTGACCTTCAG chr14 100 20 24193180 24193194 − TCCACTGACCTTCAG 0
LTR-A TCCACTGACCTTCAG chr11 100 20 133664063 133664077 + TCCACTGACCTTCAG 0
LTR-A TCCACTGACCTTCAG chr9 100 20 140394271 140394285 + TCCACTGACCTTCAG 0
LTR-A TCCACTGACCTTCAG chr6 100 20 47685256 47685270 − TCCACTGACCTTCAG 0
LTR-A TCCACTGACCTTCAG chr5 100 20 152371290 152371304 + TCCACTGACCTTCAG 0
LTR-A TCCACTGACCTTCAG chr4 100 20 110823169 110823183 − TCCACTGACCTTCAG 0
LTR-A TCCACTGACCTTCAG chr3 100 20 46255327 46255341 + TCCACTGACCTTCAG 0
LTR-A TCCACTGACCTTCAG chr3 100 20 186757301 186757315 + TCCACTGACCTTCAG 0
LTR-A TCCACTGACCTTGAG chrX 100 20 74260261 74260275 + TCCACTGACCTTGAG 0
LTR-A TCCACTGACCTTGAG chr11 100 20 76052171 76052185 − TCCACTGACCTTGAG 0
LTR-A TCCACTGACCTTGAG chr9 100 20 33927660 33927674 − TCCACTGACCTTGAG 0
LTR-A TCCACTGACCTTGAG chr9 100 20 71035331 71035345 + TCCACTGACCTTGAG 0
LTR-A TCCACTGACCTTGAG chr9 100 20 95871690 95871704 + TCCACTGACCTTGAG 0
LTR-A TCCACTGACCTTGAG chr7 100 20 137681847 137681861 + TCCACTGACCTTGAG 0
LTR-A TCCACTGACCTTGAG chr6 100 20 32460609 32460623 + TCCACTGACCTTGAG 0
LTR-A TCCACTGACCTTGAG chr3 100 20 42344237 42344251 + TCCACTGACCTTGAG 0
LTR-A TCCACTGACCTTGAG chr2 100 20 64643586 64643600 + TCCACTGACCTTGAG 0
LTR-A TCCACTGACCTTTAG chr16 100 20 55133552 55133566 − TCCACTGACCTTTAG 0
LTR-A TCCACTGACCTTTAG chr15 100 20 90072212 90072226 − TCCACTGACCTTTAG 0
LTR-A TCCACTGACCTTTAG chr12 100 20 69006300 69006314 + TCCACTGACCTTTAG 0
LTR-A TCCACTGACCTTTAG chr3 100 20 170680338 170680352 − TCCACTGACCTTTAG 0
LTR-A TCCACTGACCTTTAG chr2 100 20 215414950 215414964 − TCCACTGACCTTTAG 0
LTR-B CAGCAGTTCTTGAAGTACTCCGG HIV 100 7.00E–04 9291 9313 − CAGCAGTTCTTGAAGTACTCCGG 0
LTR-B AGCAGTTCTTGAAGTACTCCGG HIV 100 0.003 9291 9312 − AGCAGTTCTTGAAGTACTCCGG 0
LTR-B GCAGTTCTTGAAGTACTCCGG HIV 100 0.009 9291 9311 − GCAGTTCTTGAAGTACTCCGG 0
LTR-B CAGTTCTTGAAGTACTCCGG HIV 100 0.033 9291 9310 − CAGTTCTTGAAGTACTCCGG 0
LTR-B AGTTCTTGAAGTACTCCGG HIV 100 0.12 9291 9309 − AGTTCTTGAAGTACTCCGG 0
LTR-B GTTCTTGAAGTACTCCGG HIV 100 0.42 9291 9308 − GTTCTTGAAGTACTCCGG 0
LTR-B TTCTTGAAGTACTCCGG HIV 100 1.5 9291 9307 − TTCTTGAAGTACTCCGG 0
LTR-B TCTTGAAGTACTCCGG HIV 100 5.4 9291 9306 − TCTTGAAGTACTCCGG 0
LTR-B TCTTGAAGTACTCTAG chr11 100 5.4 91845834 91845849 − TCTTGAAGTACTCTAG 0
LTR-B CTTGAAGTACTCAGG chr19 100 20 45672789 45672803 − CTTGAAGTACTCAGG 0
LTR-B CTTGAAGTACTCAGG chr15 100 20 82132445 82132459 + CTTGAAGTACTCAGG 0
LTR-B CTTGAAGTACTCAGG chr11 100 20 94282411 94282425 + CTTGAAGTACTCAGG 0
LTR-B CTTGAAGTACTCAGG chr2 100 20 193312744 193312758 − CTTGAAGTACTCAGG 0
LTR-B CTTGAAGTACTCCGG HIV 100 20 9291 9305 − CTTGAAGTACTCCGG 0
LTR-B CTTGAAGTACTCTGG chr15 100 20 61274973 61274987 − CTTGAAGTACTCTGG 0
LTR-B CTTGAAGTACTCAAG chrX 100 20 36051764 36051778 − CTTGAAGTACTCAAG 0
LTR-B CTTGAAGTACTCAAG chr15 100 20 31316465 31316479 − CTTGAAGTACTCAAG 0
LTR-B CTTGAAGTACTCAAG chr13 100 20 23054474 23054488 − CTTGAAGTACTCAAG 0
LTR-B CTTGAAGTACTCAAG chr9 100 20 83208046 83208060 + CTTGAAGTACTCAAG 0
LTR-B CTTGAAGTACTCAAG chr8 100 20 13956368 13956382 + CTTGAAGTACTCAAG 0
LTR-B CTTGAAGTACTCCAG chr16 100 20 57449025 57449039 − CTTGAAGTACTCCAG 0
LTR-B CTTGAAGTACTCCAG chr15 100 20 41397831 41397845 − CTTGAAGTACTCCAG 0
LTR-B CTTGAAGTACTCCAG chr11 100 20 70255488 70255502 − CTTGAAGTACTCCAG 0
LTR-B CTTGAAGTACTCCAG chr3 100 20 134149643 134149657 + CTTGAAGTACTCCAG 0
LTR-B CTTGAAGTACTCTAG chr11 100 20 91845834 91845848 − CTTGAAGTACTCTAG 0
LTR-B CTTGAAGTACTCTAG chr1 100 20 224520600 224520614 + CTTGAAGTACTCTAG 0
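
The queries in Table S3 are the target sequence progressively shortened from the 5′ end, down to a 12-nt seed next to the PAM, with each query located by exact match on either strand. A minimal Python sketch of that bookkeeping follows. It assumes a plain string search over a toy reference rather than the BLAST search used for the table, and the names `truncated_queries` and `exact_hits` are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def truncated_queries(protospacer, pams, min_seed=12):
    """Yield (seed_len, query) pairs, longest seed first.

    Each query is the last seed_len nucleotides of the protospacer
    (the PAM-proximal end) followed by one of the given PAM sequences.
    """
    for seed_len in range(len(protospacer), min_seed - 1, -1):
        seed = protospacer[-seed_len:]
        for pam in pams:
            yield seed_len, seed + pam

def exact_hits(reference, query):
    """Return (start, end, strand) for every 100% match of query.

    Coordinates are 1-based and inclusive on the + strand of reference.
    A palindromic query would be reported once per strand.
    """
    hits = []
    for strand, q in (("+", query), ("-", reverse_complement(query))):
        pos = reference.find(q)
        while pos != -1:
            hits.append((pos + 1, pos + len(q), strand))
            pos = reference.find(q, pos + 1)
    return hits

if __name__ == "__main__":
    # Toy reference containing the 12-nt LTR-A seed plus TGG (not real genome data).
    reference = "AAAA" + "TCCACTGACCTTTGG" + "TTTT"
    for seed_len, query in truncated_queries("ATCAGATATCCACTGACCTT", ["TGG"]):
        print(seed_len, query, exact_hits(reference, query))
```

Passing a longer PAM list (for example NGG and NAG variants) to `truncated_queries` would generate the kinds of query strings seen in the table; the chromosome coordinates listed there come from the original genome search and cannot be reproduced by this toy example.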
