Published Express 3 January 2013; corrected for print 15 February 2013

www.sciencemag.org/cgi/content/full/science.1232033/DC1

Supplementary Materials for
RNA-Guided Human Genome Engineering via Cas9

Prashant Mali, Luhan Yang, Kevin M. Esvelt, John Aach, Marc Guell, James E. DiCarlo, Julie E. Norville, George M. Church*

*To whom correspondence should be addressed. E-mail: gchurch@genetics.med.harvard.edu

Published 3 January 2013 on Science Express
DOI: 10.1126/science.1232033

This PDF file includes:
Materials and Methods
Supplementary Text
Figs. S1 to S11
Full References

Other Supplementary Material for this manuscript includes the following (available at http://arep.med.harvard.edu/human_crispr):
Table S1. Bioinformatically computed genome-wide resource of candidate unique gRNA targets in human exons.
Table S2. Incorporation of gRNA targets in Table 1 into a 200-bp format suitable for multiplex DNA array based synthesis.
Table S3. 12k gRNA targets from Table 1 in a 200-bp format synthesized by CustomArray Inc.

Corrections: The authors fixed some minor typographical errors, made bold the DNA sequences in Figs. S1 and S10 to improve readability, and included the link to their Addgene plasmid deposit. The findings have not changed.

Section
The Type II CRISPR-Cas System
Material and Methods
Supplementary Fig. S1. The engineered type II CRISPR system for human cells.
Supplementary Fig. S2. RNA-guided genome editing requires both Cas9 and guide RNA for successful targeting.
Supplementary Fig. S3. Analysis of gRNA and Cas9 mediated genome editing.
Supplementary Fig. S4. RNA-guided genome editing is target sequence specific.
Supplementary Fig. S5. Guide RNAs targeted to the GFP sequence enable robust genome editing.
Supplementary Fig. S6. RNA-guided genome editing is target sequence specific, and demonstrates similar targeting efficiencies as ZFNs or TALENs.
Supplementary Fig. S7. RNA-guided NHEJ in human iPS cells.
Supplementary Fig. S8. RNA-guided NHEJ in human K562 cells.
Supplementary Fig. S9. RNA-guided NHEJ in human 293T cells.
Supplementary Fig. S10. HR at the endogenous AAVS1 locus using either a dsDNA donor or a short oligonucleotide donor.
Supplementary Fig. S11. Methodology for multiplex synthesis, retrieval and U6 expression vector cloning of guide RNAs targeting genes in the human genome.
Supplementary Table S1. Bioinformatically computed genome-wide resource of candidate unique gRNA targets in human exons.*
Supplementary Table S2. Incorporation of gRNA targets in Table 1 into a 200bp format suitable for multiplex DNA array based synthesis.*
Supplementary Table S3. 12k gRNA targets from Table 1 in a 200bp format synthesized by CustomArray Inc.*

*Due to size constraints, the Supplementary Tables S1, S2 and S3 are available on: http://arep.med.harvard.edu/human_crispr.

 


The Type II CRISPR-Cas System

Bacteria and archaea have evolved adaptive immune defenses termed clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) systems that use short RNA to direct degradation of foreign nucleic acids (DNA/RNA). CRISPR defense involves acquisition and integration of new targeting “spacers” from invading virus or plasmid DNA into the CRISPR locus, expression and processing of short guiding CRISPR RNAs (crRNAs) consisting of spacer-repeat units, and cleavage of nucleic acids (most commonly DNA) complementary to the spacer.

Three classes of CRISPR systems have been described thus far (Type I, II and III). Here we focus on the Type II CRISPR system, which utilizes a single effector enzyme, Cas9, to cleave dsDNA, whereas Type I and Type III systems require multiple distinct effectors acting as a complex (for a detailed review of CRISPR classification, see reference (29)). As a consequence, Type II systems are more likely to function in alternative contexts such as eukaryotic cells. The Type II effector system consists of a long pre-crRNA transcribed from the spacer-containing CRISPR locus, the multifunctional Cas9 protein, and a tracrRNA important for gRNA processing. The tracrRNAs hybridize to the repeat regions separating the spacers of the pre-crRNA, initiating dsRNA cleavage by endogenous RNase III, which is followed by a second cleavage event within each spacer by Cas9, producing mature crRNAs that remain associated with the tracrRNA and Cas9. Jinek et al. demonstrated that a tracrRNA-crRNA fusion, termed a guide RNA (gRNA) in this work (Fig. 1), is functional in vitro, obviating the need for RNase III and the crRNA processing in general (4).

 


Type II CRISPR interference is a result of Cas9 unwinding the DNA duplex and searching for sequences matching the crRNA to cleave. Target recognition occurs upon detection of complementarity between a “protospacer” sequence in the target DNA and the remaining spacer sequence in the crRNA. Importantly, Cas9 cuts the DNA only if a correct protospacer-adjacent motif (PAM) is also present at the 3' end. Different Type II systems have differing PAM requirements. The S. pyogenes system utilized in this work requires an NGG sequence, where N can be any nucleotide. S. thermophilus Type II systems require NGGNG (30) and NNAGAAW (31), respectively, while different S. mutans systems tolerate NGG or NAAR (32). Bioinformatic analyses have generated extensive databases of CRISPR loci in a variety of bacteria that may serve to identify new PAMs and expand the set of CRISPR-targetable sequences (33, 34). In S. thermophilus, Cas9 generates a blunt-ended double-stranded break 3bp upstream of the protospacer (5), a process mediated by two catalytic domains in the Cas9 protein: an HNH domain that cleaves the complementary strand of the DNA and a RuvC-like domain that cleaves the non-complementary strand (refer Fig. 1A, fig. S1). While the S. pyogenes system has not been characterized to the same level of precision, DSB formation also occurs towards the 3' end of the protospacer. If one of the two nuclease domains is inactivated, Cas9 will function as a nickase in vitro (4) and in human cells (fig. S3).

As a genome engineering tool, the specificity of gRNA-directed Cas9 cleavage will be of the utmost importance. Significant off-target activity could cause unwanted double-strand breaks at other regions of the genome, resulting in toxicity and possibly oncogenesis in gene therapy applications. The S. pyogenes system tolerates mismatches in the first 6 bases out of the 20bp mature spacer sequence in vitro. However, it is entirely possible that greater stringency is required in vivo given the low toxicity we observed in human cell lines, as potential off-target sites matching (last 14 bp) NGG exist within the human reference genome for our gRNAs. Mismatches towards the 3' end of the spacer, known as the “seed sequence” (6), are less well tolerated. Jinek et al. found that single mismatches in the PAM at positions -3 through -7 abolished interference in vitro, though a mismatch at position -10 did not (4). In S. thermophilus, single mutations in the PAM or at positions -1, -3 through -5, and -7 through -8 abolished interference. When transplanted into E. coli, the S. thermophilus system did not tolerate single mutations in the PAM or in positions -3, -6, or -8. As a caveat, Garneau et al. found that spacers acquired from plasmid DNA tolerated greater degeneracy in both the PAM and seed sequence while sufficing to block plasmid acquisition in S. thermophilus (35); however, similar degeneracy was not sufficient to block phage infection (31), emphasizing the importance of the assay utilized. Taken together, these results point towards the urgent need to assay specificity in the context of interest.

There exist at least four possible ways to improve specificity. First, it is possible that interference is sensitive to the melting temperature of the gRNA-DNA hybrid, in which case AT-rich target sequences are likely to have fewer off-target sites. Second, target sites can be chosen carefully to avoid pseudo-sites with at least 14bp matching sequences elsewhere in the genome of interest. Third, the use of a Cas9 variant requiring a longer PAM sequence will reduce the set of potential targetable sequences, but should similarly reduce the frequency of off-target sites. Finally, directed evolution might be utilized to improve Cas9 specificity to a level sufficient to completely preclude off-target activity, ideally requiring a perfect 20bp gRNA match with a minimal PAM. Such a project is likely to require extensive modifications to the Cas9 protein, as the small genomes of bacteria are unlikely to select for high specificity in natural variants. As such, novel methods permitting many rounds of evolution in a short timeframe (36) may be warranted. For more detailed reviews of CRISPR systems, see references (1, 37).
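The targeting rule discussed in this section (an NGG PAM at the 3' end, with mismatches tolerated only in the 5'-most 6 bases of the 20-nt spacer) can be illustrated with a minimal Python sketch. The function names and the `seed_len` parameter are hypothetical, and treating a perfect match over the 3'-most 14 bases as sufficient for cleavage is a conservative simplification of the in vitro observations cited above, not a validated specificity model.

```python
def has_ngg_pam(pam: str) -> bool:
    """True if a 3-nt PAM matches NGG (N = any nucleotide)."""
    pam = pam.upper()
    return len(pam) == 3 and pam[0] in "ACGT" and pam[1:] == "GG"


def may_be_cleaved(spacer: str, protospacer: str, pam: str, seed_len: int = 14) -> bool:
    """Conservative off-target check (assumption): a site is flagged if it has
    an NGG PAM and matches the 3'-most `seed_len` bases of the 20-nt spacer."""
    spacer, protospacer = spacer.upper(), protospacer.upper()
    if len(spacer) != 20 or len(protospacer) != 20:
        raise ValueError("spacer and protospacer must both be 20 nt")
    return has_ngg_pam(pam) and spacer[-seed_len:] == protospacer[-seed_len:]


# AAVS1 gRNA Target 1 (fig. S1C): 20-nt protospacer followed by its 3-nt PAM.
target = "GTCCCCTCCACCCCACAGTGGGG"
spacer, pam = target[:20], target[20:]
```

Under this rule a site whose six 5'-most bases differ from the spacer is still flagged, whereas a single mismatch next to the PAM, or a PAM other than NGG, is not.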

Material and Methods

Plasmid construction
The Cas9 gene sequence was human codon optimized and assembled by hierarchical fusion PCR assembly of nine 500bp gBlocks ordered from IDT (sequence in fig. S1A). Cas9_D10A was similarly constructed. The resulting full-length products were cloned into the pcDNA3.3-TOPO vector (Invitrogen). The target gRNA expression constructs were directly ordered as individual 455bp gBlocks from IDT (sequence in fig. S1B) and either cloned into the pCR-BluntII-TOPO vector (Invitrogen) or PCR amplified. The vectors for the HR reporter assay involving a broken GFP were constructed by fusion PCR assembly of the GFP sequence bearing the stop codon and 68bp AAVS1 fragment (or mutants thereof, refer fig. S4), or 58bp fragments from the DNMT3a and DNMT3b genomic loci (refer fig. S6) assembled into the EGIP lentivector from Addgene (plasmid #26777). These lentivectors were then used to establish the GFP reporter stable lines. The dsDNA donor for HR at the native AAVS1 locus is described in (13). TALENs used in this study were constructed using the protocols described in (11). All DNA reagents developed in this study are available at Addgene (http://www.addgene.org/crispr/church/).

Cell culture
PGP1 iPS cells were maintained on Matrigel (BD Biosciences)-coated plates in mTeSR1 (Stemcell Technologies). Cultures were passaged every 5–7 d with TrypLE Express (Invitrogen). K562 cells were grown and maintained in RPMI (Invitrogen) containing 15% FBS. HEK 293T cells were cultured in Dulbecco's modified Eagle's medium (DMEM, Invitrogen) high glucose supplemented with 10% fetal bovine serum (FBS, Invitrogen), penicillin/streptomycin (pen/strep, Invitrogen), and non-essential amino acids (NEAA, Invitrogen). All cells were maintained at 37°C and 5% CO2 in a humidified incubator.

Gene targeting of PGP1 iPS, K562 and 293Ts
PGP1 iPS cells were cultured in Rho kinase (ROCK) inhibitor (Calbiochem) 2h before nucleofection. Cells were harvested using TrypLE Express (Invitrogen) and 2×10^6 cells were resuspended in P3 reagent (Lonza) with 1μg Cas9 plasmid, 1μg gRNA and/or 1μg DNA donor plasmid, and nucleofected according to manufacturer's instruction (Lonza). Cells were subsequently plated on an mTeSR1-coated plate in mTeSR1 medium supplemented with ROCK inhibitor for the first 24h. For K562s, 2×10^6 cells were resuspended in SF reagent (Lonza) with 1μg Cas9 plasmid, 1μg gRNA and/or 1μg DNA donor plasmid, and nucleofected according to manufacturer's instruction (Lonza). For 293Ts, 0.1×10^6 cells were transfected with 1μg Cas9 plasmid, 1μg gRNA and/or 1μg DNA donor plasmid using Lipofectamine 2000 as per the manufacturer's protocols. The DNA donors used for endogenous AAVS1 targeting were either a dsDNA donor (Fig. 2C) or a 90mer oligonucleotide. The former has flanking short homology arms and a SA-2A-puromycin-CaGGS-eGFP cassette to enrich for successfully targeted cells.

Assessing the targeting efficiency
Cells were harvested 3 days after nucleofection and the genomic DNA of ~1×10^6 cells was extracted using prepGEM (ZyGEM). PCR was conducted to amplify the targeting region with genomic DNA derived from the cells and amplicons were deep sequenced by MiSeq Personal Sequencer (Illumina) with coverage >200,000 reads. The sequencing data was analyzed to estimate NHEJ efficiencies. The reference AAVS1 sequence analyzed is:

CACTTCAGGACAGCATGTTTGCTGCCTCCAGGGATCCTGTGTCCCCGAGCTGGGACCACCTTATATTCCC
AGGGCCGGTTAATGTGGCTCTGGTTCTGGGTACTTTTATCTGTCCCCTCCACCCCACAGTGGGGCCACTA
GGGACAGGATTGGTGACAGAAAAGCCCCATCCTTAGGCCTCCTCCTTCCTAGTCTCCTGATATTGGGTCT
AACCCCCACCTCCTGTTAGGCAGATTCCTTATCTGGTGACACACCCCCATTTCCTGGA

The PCR primers for amplifying the targeting regions in the human genome are:

AAVS1-R     CTCGGCATTCCTGCTGAACCGCTCTTCCGATCTacaggaggtgggggttagac
AAVS1-F.1   ACACTCTTTCCCTACACGACGCTCTTCCGATCTCGTGATtatattcccagggccggtta
AAVS1-F.2   ACACTCTTTCCCTACACGACGCTCTTCCGATCTACATCGtatattcccagggccggtta
AAVS1-F.3   ACACTCTTTCCCTACACGACGCTCTTCCGATCTGCCTAAtatattcccagggccggtta
AAVS1-F.4   ACACTCTTTCCCTACACGACGCTCTTCCGATCTTGGTCAtatattcccagggccggtta
AAVS1-F.5   ACACTCTTTCCCTACACGACGCTCTTCCGATCTCACTGTtatattcccagggccggtta
AAVS1-F.6   ACACTCTTTCCCTACACGACGCTCTTCCGATCTATTGGCtatattcccagggccggtta
AAVS1-F.7   ACACTCTTTCCCTACACGACGCTCTTCCGATCTGATCTGtatattcccagggccggtta
AAVS1-F.8   ACACTCTTTCCCTACACGACGCTCTTCCGATCTTCAAGTtatattcccagggccggtta
AAVS1-F.9   ACACTCTTTCCCTACACGACGCTCTTCCGATCTCTGATCtatattcccagggccggtta
AAVS1-F.10  ACACTCTTTCCCTACACGACGCTCTTCCGATCTAAGCTAtatattcccagggccggtta
AAVS1-F.11  ACACTCTTTCCCTACACGACGCTCTTCCGATCTGTAGCCtatattcccagggccggtta
AAVS1-F.12  ACACTCTTTCCCTACACGACGCTCTTCCGATCTTACAAGtatattcccagggccggtta

To analyze the HR events using the DNA donor in Fig. 2C the primers used were:

HR_AAVS1-F  CTGCCGTCTCTCTCCTGAGT
HR_Puro-R   GTGGGCTTGTACTCGGTCAT
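The forward AAVS1 primers listed above share a common 5' portion and a common AAVS1-specific 3' portion, and differ only in a 6-nt stretch between them. A minimal Python sketch of how such a stretch could be used to assign pooled reads to samples follows. It assumes, without confirmation from the text, that the 6-nt stretch acts as a sample barcode, that each read begins with it, and that the primer names pair with the sequences in listing order; `barcode_of` and `demultiplex` are hypothetical helper names, and only three of the twelve primers are included for brevity.

```python
from collections import Counter

ADAPTER = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"  # shared 5' portion of the forward primers
LOCUS = "tatattcccagggccggtta"                   # shared AAVS1-specific 3' portion

# Three of the twelve forward primers (name-to-sequence pairing is assumed).
FORWARD_PRIMERS = {
    "AAVS1-F.1": "ACACTCTTTCCCTACACGACGCTCTTCCGATCTCGTGATtatattcccagggccggtta",
    "AAVS1-F.2": "ACACTCTTTCCCTACACGACGCTCTTCCGATCTACATCGtatattcccagggccggtta",
    "AAVS1-F.3": "ACACTCTTTCCCTACACGACGCTCTTCCGATCTGCCTAAtatattcccagggccggtta",
}


def barcode_of(primer: str) -> str:
    """Return the bases between the shared 5' and 3' portions of a primer."""
    if not (primer.startswith(ADAPTER) and primer.endswith(LOCUS)):
        raise ValueError("primer does not have the expected layout")
    return primer[len(ADAPTER):len(primer) - len(LOCUS)]


BARCODES = {barcode_of(seq): name for name, seq in FORWARD_PRIMERS.items()}


def demultiplex(reads):
    """Count reads per primer, keyed on the first 6 bases of each read."""
    counts = Counter()
    for read in reads:
        counts[BARCODES.get(read.upper()[:6], "unassigned")] += 1
    return counts
```

A real analysis would also need to handle sequencing errors in the barcode and then align each read to the reference AAVS1 sequence to score insertions and deletions; neither step is shown here.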

Bioinformatics approach for computing human exon CRISPR targets and methodology for their multiplexed synthesis

We sought to generate a set of gRNA gene sequences that maximally target specific locations in human exons but minimally target other locations in the genome. Maximally efficient targeting by a gRNA is achieved by 23nt sequences, the 5'-most 20nt of which exactly complement a desired location, while the three 3'-most bases must be of the form NGG. Additionally, the 5'-most nt must be a G to establish a pol-III transcription start site. However, according to (4), mispairing of the six 5'-most nt of a 20bp gRNA against its genomic target does not abrogate Cas9-mediated cleavage so long as the last 14nt pairs properly, but mispairing of the eight 5'-most nt along with pairing of the last 12 nt does, while the case of the seven 5'-most nt mispairs and 13 3' pairs was not tested. To be conservative regarding off-target effects, we therefore assumed that the case of the seven 5'-most mispairs is, like the case of six, permissive of cleavage, so that pairing of the 3'-most 13nt is sufficient for cleavage. To identify CRISPR target sites within human exons that should be cleavable without off-target cuts, we therefore examined all 23bp sequences of the form 5'-GBBBB BBBBB BBBBB BBBBB NGG-3' (form 1), where the B's represent the bases at the exon location, for which no sequence of the form 5'-NNNNN NNBBB BBBBB BBBBB NGG-3' (form 2) existed at any other location in the human genome.

Specifically, we (i) downloaded a BED file of locations of coding regions of all RefSeq genes in the GRCh37/hg19 human genome from the UCSC Genome Browser (38-40). Coding exon locations in this BED file comprised a set of 346089 mappings of RefSeq mRNA accessions to the hg19 genome. However, some RefSeq mRNA accessions mapped to multiple genomic locations (probable gene duplications), and many accessions mapped to subsets of the same set of exon locations (multiple isoforms of the same genes). To distinguish apparently duplicated gene instances and consolidate multiple references to the same genomic exon instance by multiple RefSeq isoform accessions, we therefore (ii) added unique numerical suffixes to 705 RefSeq accession numbers that had multiple genomic locations, and (iii) used the mergeBed function of BEDTools (41) (v2.16.2-zip-87e3926) to consolidate overlapping exon locations into merged exon regions. These steps reduced the initial set of 346089 RefSeq exon locations to 192783 distinct genomic regions. We then downloaded the hg19 sequence for all merged exon regions using the UCSC Table Browser, adding 20bp of padding on each end. (iv) Using custom perl code, we identified 1657793 instances of form 1 within this exonic sequence. (v) We then filtered these sequences for the existence of off-target occurrences of form 2: For each merged exon form 1 target, we extracted the 3'-most 13bp specific (B) “core” sequences and, for each core generated the four 16bp sequences 5'-BBB BBBBB BBBBB NGG-3' (N = A, C, G, and T), and searched the entire hg19 genome for exact matches to these 6631172 sequences using Bowtie version 0.12.8 (42) using the parameters -l 16 -v 0 -k 2. We rejected any exon target site for which there was more than a single match. Note that because any specific 13bp core sequence followed by the sequence NGG confers only 15bp of specificity, there should be on average ~5.6 matches to an extended core sequence in a random ~3Gb sequence (both strands). Therefore, most of the 1657793 initially identified targets were rejected; however 189864 sequences passed this filter. These comprise our set of CRISPR-targetable exonic locations in the human genome.

The 189864 sequences target locations in 78028 merged exonic regions (~40.5% of the total of 192783 merged human exon regions) at a multiplicity of ~2.4 sites per targeted exonic region. To assess targeting at a gene level, we clustered RefSeq mRNA mappings so that any two RefSeq accessions (including the gene duplicates we distinguished in (ii)) that overlap a merged exon region are counted as a single gene cluster. The 189864 exonic specific CRISPR sites target 17104 out of 18872 gene clusters (~90.6% of all gene clusters) at a multiplicity of ~11.1 per targeted gene cluster. (Note that while these gene clusters collapse RefSeq mRNA accessions that represent multiple isoforms of a single transcribed gene into a single entity, they will also collapse overlapping distinct genes as well as genes with antisense transcripts.) At the level of original RefSeq accessions, the 189864 sequences targeted exonic regions in 30563 out of a total of 43726 (~69.9%) mapped RefSeq accessions (including our distinguished gene duplicates) at a multiplicity of ~6.2 sites per targeted mapped RefSeq accession.

As we gather information on CRISPR performance at our computationally predicted human exon CRISPR target sites, we plan to refine our database by correlating performance with factors we expect to be important, such as base composition and secondary structure of both gRNAs and genomic targets (43, 44), and the epigenetic state of these targets in human cell lines for which this information is available (45).

Finally, we also incorporated these target sequences into a 200bp format that is compatible for multiplex synthesis on DNA arrays (14, 46). Our design allows for targeted retrieval of a specific or pools of gRNA sequences from the DNA array based oligonucleotide pool and its ready cloning into a common expression vector (fig. S11A, Supplementary Table 2). Specifically we tested this approach by synthesizing a 12k oligonucleotide pool from CustomArray Inc. (Supplementary Table 3). Furthermore, as per our approach we were able to successfully retrieve gRNAs of choice from this library (fig. S11B). We observed an error rate of ~4 mutations per 1000bp of array synthesized DNA.
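The form 1 search, the generation of the four 16bp off-target queries per target, and the ~5.6 expected random matches described in the bioinformatics approach above can be sketched in Python as follows. This is an illustration only and not the custom perl code used in the study: the function names are hypothetical, scanning the reverse complement of the input is an assumption, and the genome-wide exact-match search performed with Bowtie is not reproduced.

```python
import re

# Form 1: G + 19 exon bases + NGG (23 nt). The lookahead reports overlapping sites.
FORM1 = re.compile(r"(?=(G[ACGT]{19}[ACGT]GG))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def find_form1_sites(seq: str):
    """Return (start, strand, 23-mer) for every form 1 site on both strands.
    `start` is the leftmost forward-strand coordinate of the site."""
    seq = seq.upper()
    sites = [(m.start(), "+", m.group(1)) for m in FORM1.finditer(seq)]
    for m in FORM1.finditer(reverse_complement(seq)):
        sites.append((len(seq) - m.start() - 23, "-", m.group(1)))
    return sites


def offtarget_queries(site: str):
    """Four 16-nt queries: the 3'-most 13 specific bases followed by NGG."""
    core = site[7:20]
    return [core + n + "GG" for n in "ACGT"]


def expected_random_matches(genome_bp: float = 3e9, specific_bp: int = 15, strands: int = 2) -> float:
    """Expected exact matches of a query with `specific_bp` fixed bases
    in a random sequence of `genome_bp` bases, counting both strands."""
    return strands * genome_bp / 4 ** specific_bp
```

With the default arguments, `expected_random_matches()` evaluates 2 × 3×10^9 / 4^15, roughly 5.6, which agrees with the estimate quoted in the text.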

A
gccaccATGGACAAGAAGTACTCCATTGGGCTCGATATCGGCACAAACAGCGTCGGCTGGGCCGTCATTACGGACGAGTACAAGGTGCCGAGCAAAAAATTCAAAGTTCTGGGCAATACCGA
TCGCCACAGCATAAAGAAGAACCTCATTGGCGCCCTCCTGTTCGACTCCGGGGAGACGGCCGAAGCCACGCGGCTCAAAAGAACAGCACGGCGCAGATATACCCGCAGAAAGAATCGGATCT
GCTACCTGCAGGAGATCTTTAGTAATGAGATGGCTAAGGTGGATGACTCTTTCTTCCATAGGCTGGAGGAGTCCTTTTTGGTGGAGGAGGATAAAAAGCACGAGCGCCACCCAATCTTTGGC
AATATCGTGGACGAGGTGGCGTACCATGAAAAGTACCCAACCATATATCATCTGAGGAAGAAGCTTGTAGACAGTACTGATAAGGCTGACTTGCGGTTGATCTATCTCGCGCTGGCGCATAT
GATCAAATTTCGGGGACACTTCCTCATCGAGGGGGACCTGAACCCAGACAACAGCGATGTCGACAAACTCTTTATCCAACTGGTTCAGACTTACAATCAGCTTTTCGAAGAGAACCCGATCA
ACGCATCCGGAGTTGACGCCAAAGCAATCCTGAGCGCTAGGCTGTCCAAATCCCGGCGGCTCGAAAACCTCATCGCACAGCTCCCTGGGGAGAAGAAGAACGGCCTGTTTGGTAATCTTATC
GCCCTGTCACTCGGGCTGACCCCCAACTTTAAATCTAACTTCGACCTGGCCGAAGATGCCAAGCTTCAACTGAGCAAAGACACCTACGATGATGATCTCGACAATCTGCTGGCCCAGATCGG
CGACCAGTACGCAGACCTTTTTTTGGCGGCAAAGAACCTGTCAGACGCCATTCTGCTGAGTGATATTCTGCGAGTGAACACGGAGATCACCAAAGCTCCGCTGAGCGCTAGTATGATCAAGC
GCTATGATGAGCACCACCAAGACTTGACTTTGCTGAAGGCCCTTGTCAGACAGCAACTGCCTGAGAAGTACAAGGAAATTTTCTTCGATCAGTCTAAAAATGGCTACGCCGGATACATTGAC
GGCGGAGCAAGCCAGGAGGAATTTTACAAATTTATTAAGCCCATCTTGGAAAAAATGGACGGCACCGAGGAGCTGCTGGTAAAGCTTAACAGAGAAGATCTGTTGCGCAAACAGCGCACTTT
CGACAATGGAAGCATCCCCCACCAGATTCACCTGGGCGAACTGCACGCTATCCTCAGGCGGCAAGAGGATTTCTACCCCTTTTTGAAAGATAACAGGGAAAAGATTGAGAAAATCCTCACAT
TTCGGATACCCTACTATGTAGGCCCCCTCGCCCGGGGAAATTCCAGATTCGCGTGGATGACTCGCAAATCAGAAGAGACCATCACTCCCTGGAACTTCGAGGAAGTCGTGGATAAGGGGGCC
TCTGCCCAGTCCTTCATCGAAAGGATGACTAACTTTGATAAAAATCTGCCTAACGAAAAGGTGCTTCCTAAACACTCTCTGCTGTACGAGTACTTCACAGTTTATAACGAGCTCACCAAGGT
CAAATACGTCACAGAAGGGATGAGAAAGCCAGCATTCCTGTCTGGAGAGCAGAAGAAAGCTATCGTGGACCTCCTCTTCAAGACGAACCGGAAAGTTACCGTGAAACAGCTCAAAGAAGACT
ATTTCAAAAAGATTGAATGTTTCGACTCTGTTGAAATCAGCGGAGTGGAGGATCGCTTCAACGCATCCCTGGGAACGTATCACGATCTCCTGAAAATCATTAAAGACAAGGACTTCCTGGAC
AATGAGGAGAACGAGGACATTCTTGAGGACATTGTCCTCACCCTTACGTTGTTTGAAGATAGGGAGATGATTGAAGAACGCTTGAAAACTTACGCTCATCTCTTCGACGACAAAGTCATGAA
ACAGCTCAAGAGGCGCCGATATACAGGATGGGGGCGGCTGTCAAGAAAACTGATCAATGGGATCCGAGACAAGCAGAGTGGAAAGACAATCCTGGATTTTCTTAAGTCCGATGGATTTGCCA
ACCGGAACTTCATGCAGTTGATCCATGATGACTCTCTCACCTTTAAGGAGGACATCCAGAAAGCACAAGTTTCTGGCCAGGGGGACAGTCTTCACGAGCACATCGCTAATCTTGCAGGTAGC
CCAGCTATCAAAAAGGGAATACTGCAGACCGTTAAGGTCGTGGATGAACTCGTCAAAGTAATGGGAAGGCATAAGCCCGAGAATATCGTTATCGAGATGGCCCGAGAGAACCAAACTACCCA
GAAGGGACAGAAGAACAGTAGGGAAAGGATGAAGAGGATTGAAGAGGGTATAAAAGAACTGGGGTCCCAAATCCTTAAGGAACACCCAGTTGAAAACACCCAGCTTCAGAATGAGAAGCTCT
ACCTGTACTACCTGCAGAACGGCAGGGACATGTACGTGGATCAGGAACTGGACATCAATCGGCTCTCCGACTACGACGTGGATCATATCGTGCCCCAGTCTTTTCTCAAAGATGATTCTATT
GATAATAAAGTGTTGACAAGATCCGATAAAAATAGAGGGAAGAGTGATAACGTCCCCTCAGAAGAAGTTGTCAAGAAAATGAAAAATTATTGGCGGCAGCTGCTGAACGCCAAACTGATCAC
ACAACGGAAGTTCGATAATCTGACTAAGGCTGAACGAGGTGGCCTGTCTGAGTTGGATAAAGCCGGCTTCATCAAAAGGCAGCTTGTTGAGACACGCCAGATCACCAAGCACGTGGCCCAAA
TTCTCGATTCACGCATGAACACCAAGTACGATGAAAATGACAAACTGATTCGAGAGGTGAAAGTTATTACTCTGAAGTCTAAGCTGGTCTCAGATTTCAGAAAGGACTTTCAGTTTTATAAG
GTGAGAGAGATCAACAATTACCACCATGCGCATGATGCCTACCTGAATGCAGTGGTAGGCACTGCACTTATCAAAAAATATCCCAAGCTTGAATCTGAATTTGTTTACGGAGACTATAAAGT
GTACGATGTTAGGAAAATGATCGCAAAGTCTGAGCAGGAAATAGGCAAGGCCACCGCTAAGTACTTCTTTTACAGCAATATTATGAATTTTTTCAAGACCGAGATTACACTGGCCAATGGAG
AGATTCGGAAGCGACCACTTATCGAAACAAACGGAGAAACAGGAGAAATCGTGTGGGACAAGGGTAGGGATTTCGCGACAGTCCGGAAGGTCCTGTCCATGCCGCAGGTGAACATCGTTAAA
AAGACCGAAGTACAGACCGGAGGCTTCTCCAAGGAAAGTATCCTCCCGAAAAGGAACAGCGACAAGCTGATCGCACGCAAAAAAGATTGGGACCCCAAGAAATACGGCGGATTCGATTCTCC
TACAGTCGCTTACAGTGTACTGGTTGTGGCCAAAGTGGAGAAAGGGAAGTCTAAAAAACTCAAAAGCGTCAAGGAACTGCTGGGCATCACAATCATGGAGCGATCAAGCTTCGAAAAAAACC
CCATCGACTTTCTCGAGGCGAAAGGATATAAAGAGGTCAAAAAAGACCTCATCATTAAGCTTCCCAAGTACTCTCTCTTTGAGCTTGAAAACGGCCGGAAACGAATGCTCGCTAGTGCGGGC
GAGCTGCAGAAAGGTAACGAGCTGGCACTGCCCTCTAAATACGTTAATTTCTTGTATCTGGCCAGCCACTATGAAAAGCTCAAAGGGTCTCCCGAAGATAATGAGCAGAAGCAGCTGTTCGT
GGAACAACACAAACACTACCTTGATGAGATCATCGAGCAAATAAGCGAATTCTCCAAAAGAGTGATCCTCGCCGACGCTAACCTCGATAAGGTGCTTTCTGCTTACAATAAGCACAGGGATA
AGCCCATCAGGGAGCAGGCAGAAAACATTATCCACTTGTTTACTCTGACCAACTTGGGCGCGCCTGCAGCCTTCAAGTACTTCGACACCACCATAGACAGAAAGCGGTACACCTCTACAAAG
GAGGTCCTGGACGCCACACTGATTCATCAGTCAATTACGGGGCTCTATGAAACAAGAATCGACCTCTCTCAGCTCGGTGGAGACAGCAGGGCTGACCCCAAGAAGAAGAGGAAGGTGTGA

B
U6 promoter + target RNA + guide RNA scaffold:
TGTACAAAAAAGCAGGCTTTAAAGGAACCAATTCAGTCGACTGGATCCGGTACCAAGGTCGGGCAGGAAGAGGGCCTATTTCCCATGATTCCTTCATATTTG
CATATACGATACAAGGCTGTTAGAGAGATAATTAGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATAATTTCTT
GGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAA
GGACGAAACACCGNNNNNNNNNNNNNNNNNNNGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT
CGGTGCTTTTTTTCTAGACCCAGCTTTCTTGTACAAAGTTGGCATTA

C
Name                    Target Sequence
GFP gRNA Target 1       GTGAACCGCATCGAGCTGAAGGG
GFP gRNA Target 2       GGAGCGCACCATCTTCTTCAAGG
AAVS1 gRNA Target 1     GTCCCCTCCACCCCACAGTGGGG
AAVS1 gRNA Target 2     GGGGCCACTAGGGACAGGATTGG
DNMT3a gRNA Target 1    GCATGATGCGCGGCCCAAGGAGG
DNMT3a gRNA Target 2    GAGATGATCGCCCCTTCTTCTGG
DNMT3b gRNA Target      GAATTACTCACGCCCCAAGGAGG

Supplementary Fig. S1. The engineered type II CRISPR system for human cells. (A) Expression format and full sequence of the cas9 gene insert. The RuvC-like and HNH motifs (4), and the C-terminus SV40 NLS are respectively highlighted by blue, brown and orange colors. (B) U6 promoter based expression scheme for the guide RNAs and predicted RNA transcript secondary structure. The use of the U6 promoter constrains the 1st position in the RNA transcript to be a ‘G’ and thus all genomic sites of the form GN20GG can be targeted using this approach. (C) The 7 gRNAs used in this study are listed.
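The constraint in fig. S1B (a fixed 5' G followed by a 19-nt placeholder, giving sites of the form GN20GG) can be illustrated with a short Python sketch that drops a 23-nt target from fig. S1C into the placeholder. Only a truncated excerpt of the expression construct around the N run is used. The function name `insert_target` is hypothetical, and reading the G before the N run as the first transcribed base is an interpretation of the caption rather than something the sequence itself states.

```python
# Truncated excerpt of the fig. S1B construct around the 19-nt N placeholder.
UPSTREAM = "GGACGAAACACCG"               # bases just before the N run, ending in the fixed G
SCAFFOLD_START = "GTTTTAGAGCTAGAAATAGC"   # first 20 nt after the N run
PLACEHOLDER = "N" * 19
TEMPLATE = UPSTREAM + PLACEHOLDER + SCAFFOLD_START


def insert_target(target23: str, template: str = TEMPLATE) -> str:
    """Fill the N placeholder with bases 2-20 of a 23-nt GN20GG target.
    The leading G is supplied by the template; the NGG PAM is not cloned."""
    target23 = target23.upper()
    if len(target23) != 23 or set(target23) - set("ACGT"):
        raise ValueError("target must be 23 nt of A, C, G, T")
    if target23[0] != "G":
        raise ValueError("target must start with G (U6 transcription start)")
    if target23[21:] != "GG":
        raise ValueError("target must end with an NGG PAM")
    return template.replace(PLACEHOLDER, target23[1:20], 1)


# AAVS1 gRNA Target 1 from fig. S1C.
construct = insert_target("GTCCCCTCCACCCCACAGTGGGG")
```

The PAM is deliberately left out of the construct because it belongs to the genomic target site and not to the guide RNA.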

Supplementary Fig. S2. RNA-guided genome editing requires both Cas9 and guide RNA for successful targeting. Using the GFP reporter assay described in Fig. 1B, all possible combinations of the repair DNA donor, Cas9 protein, and gRNA were tested for their ability to effect successful HR (in 293Ts). GFP+ cells were observed only when all the 3 components were present, validating that these CRISPR components are essential for RNA-guided genome editing. Data are shown as mean ± SEM (N=3).

Supplementary Fig. S3. Analysis of gRNA and Cas9 mediated genome editing. We closely examined the CRISPR mediated genome editing process using either (A) a GFP reporter assay as described earlier, and (B) deep sequencing of the targeted loci (in 293Ts). As comparison we also tested a D10A mutant for Cas9 that has been shown in earlier reports to function as a nickase in in vitro assays. Our data shows that both Cas9 and Cas9D10A can effect successful HR at nearly similar rates. Deep sequencing however confirms that while Cas9 shows robust NHEJ at the targeted loci, the D10A mutant has significantly diminished NHEJ rates (as would be expected from its putative ability to only nick DNA). Also, consistent with the known biochemistry of the Cas9 protein, our NHEJ data confirms that most base-pair deletions or insertions occurred near the 3' end of the target sequence: the peak is ~3-4 bases upstream of the PAM site, with a median deletion frequency of ~9-10bp. Data are shown as mean ± SEM (N=3).

Supplementary Fig. S4. RNA-guided genome editing is target sequence specific. Similar to the GFP reporter assay described in Fig. 1B, we developed 3 293T stable lines each bearing a distinct GFP reporter construct. These are distinguished by the sequence of the AAVS1 fragment insert (as indicated in the figure). One line harbored the wild-type fragment while the two other lines were mutated at 6bp (highlighted in red). Each of the lines was then targeted by one of the following 4 reagents: a GFP-ZFN pair that can target all cell types since its targeted sequence was in the flanking GFP fragments and hence present in all cell lines; an AAVS1 TALEN that could potentially target only the wt-AAVS1 fragment since the mutations in the other two lines should render the left TALEN unable to bind their sites; the T1 gRNA which can also potentially target only the wt-AAVS1 fragment, since its target site is also disrupted in the two mutant lines; and finally the T2 gRNA which should be able to target all the 3 cell lines since unlike the T1 gRNA its target site is unaltered among the 3 lines. Consistent with these predictions, the ZFN modified all 3 cell types, the AAVS1 TALENs and the T1 gRNA only targeted the wt-AAVS1 cell type, and the T2 gRNA successfully targets all 3 cell types. These results together confirm that the guide RNA mediated editing is target sequence specific. Data are shown as mean ± SEM (N=3).

Supplementary Fig. S5. Guide RNAs targeted to the GFP sequence enable robust genome editing. In addition to the 2 gRNAs targeting the AAVS1 insert, we also tested two additional gRNAs targeting the flanking GFP sequences of the reporter described in Fig. 1B (in 293Ts). These gRNAs were also able to effect robust HR at this engineered locus. Data are shown as mean ± SEM (N=3).
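This legend reports data as mean ± SEM (N=3). As a minimal sketch of that summary statistic, assuming the SEM is the sample standard deviation divided by the square root of N (the usual convention; the paper does not spell out its formula), with placeholder replicate values that are not measurements from the paper:

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates.

    SEM = sample standard deviation (n - 1 denominator) / sqrt(n).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates to estimate SEM")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


if __name__ == "__main__":
    # Placeholder replicate values for illustration only.
    replicates = [1.0, 2.0, 3.0]
    mean, sem = mean_sem(replicates)
    print(f"{mean:.2f} ± {sem:.2f} (N={len(replicates)})")
```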

Supplementary Fig. S6. RNA-guided genome editing is target sequence specific, and demonstrates similar targeting efficiencies as ZFNs or TALENs. Similar to the GFP reporter assay described in Fig. 1B, we developed 2 293T stable lines each bearing a distinct GFP reporter construct. These are distinguished by the sequence of the fragment insert (as indicated in the figure). One line harbored a 58 bp fragment from the DNMT3a gene while the other line bore a homologous 58 bp fragment from the DNMT3b gene. The sequence differences are highlighted in red. Each of the lines was then targeted by one of the following 6 reagents: a GFP-ZFN pair that can target all cell types since its targeted sequence was in the flanking GFP fragments and hence present in all cell lines; a pair of TALENs that potentially target either DNMT3a or DNMT3b fragments; a pair of gRNAs that can potentially target only the DNMT3a fragment; and finally a gRNA that should potentially only target the DNMT3b fragment. Consistent with these predictions, the ZFN modified both cell types, and the TALENs and gRNAs only their respective targets. Furthermore, the efficiencies of targeting were comparable across the 6 targeting reagents. These results together confirm that RNA-guided editing is target sequence specific and demonstrates similar targeting efficiencies as ZFNs or TALENs. Data are shown as mean ± SEM (N=3).


Supplementary Fig. S7. RNA-guided NHEJ in human iPS cells. We nucleofected human iPS cells (PGP1) with constructs indicated in the left panel. 4 days after nucleofection, we measured NHEJ rate by assessing genomic deletion and insertion rate at double-strand breaks (DSBs) by deep sequencing. Panel 1: Deletion rate detected at targeting region. Red dash lines: boundary of T1 RNA targeting site; green dash lines: boundary of T2 RNA targeting site. We plot the deletion incidence at each nucleotide position in black lines and calculated the deletion rate as the percentage of reads carrying deletions. Panel 2: Insertion rate detected at targeting region. Red dash lines: boundary of T1 RNA targeting site; green dash lines: boundary of T2 RNA targeting site. We plot the incidence of insertion at the genomic location where the first insertion junction was detected in black lines and calculated the insertion rate as the percentage of reads carrying insertions. Panel 3: Deletion size distribution. We plot the frequencies of different size deletions among the whole NHEJ population. Panel 4: Insertion size distribution. We plot the frequencies of different size insertions among the whole NHEJ population. iPS targeting by both gRNAs is efficient (2-4%) and sequence specific (as shown by the shift in position of the NHEJ deletion distributions). Reaffirming the results of Fig. S2, the NGS-based analysis also shows that both the Cas9 protein and the gRNA are essential for NHEJ events at the target locus.
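The quantities plotted in Panels 1 and 3 can be sketched in a few lines. This is an illustrative sketch only, assuming each aligned read has already been reduced to a list of `(start, length)` deletions relative to the reference (an empty list means no deletion); the data layout, function names, and toy reads are assumptions and not the authors' pipeline:

```python
from collections import Counter


def deletion_rate(reads):
    """Percentage of reads carrying at least one deletion."""
    if not reads:
        return 0.0
    return 100.0 * sum(1 for dels in reads if dels) / len(reads)


def deletion_incidence(reads):
    """Count, for each reference position, how many deletions cover it."""
    counts = Counter()
    for dels in reads:
        for start, length in dels:
            counts.update(range(start, start + length))
    return counts


def deletion_size_frequencies(reads):
    """Fraction of deletion events at each deletion size."""
    sizes = Counter(length for dels in reads for _, length in dels)
    total = sum(sizes.values())
    return {size: n / total for size, n in sizes.items()} if total else {}


if __name__ == "__main__":
    # Toy reads for illustration only: two of four carry a deletion.
    toy_reads = [[], [(10, 3)], [], [(11, 2)]]
    print(deletion_rate(toy_reads))
    print(sorted(deletion_incidence(toy_reads).items()))
    print(deletion_size_frequencies(toy_reads))
```

The insertion rate and insertion size distribution of Panels 2 and 4 would follow the same pattern, keyed on the first insertion junction instead of the deleted interval.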

Supplementary Fig. S8. RNA-guided NHEJ in K562 cells. We nucleofected K562 cells with constructs indicated in the left panel. 4 days after nucleofection, we measured NHEJ rate by assessing genomic deletion and insertion rate at DSBs by deep sequencing. Panel 1: Deletion rate detected at targeting region. Red dash lines: boundary of T1 RNA targeting site; green dash lines: boundary of T2 RNA targeting site. We plot the deletion incidence at each nucleotide position in black lines and calculated the deletion rate as the percentage of reads carrying deletions. Panel 2: Insertion rate detected at targeting region. Red dash lines: boundary of T1 RNA targeting site; green dash lines: boundary of T2 RNA targeting site. We plot the incidence of insertion at the genomic location where the first insertion junction was detected in black lines and calculated the insertion rate as the percentage of reads carrying insertions. Panel 3: Deletion size distribution. We plot the frequencies of different size deletions among the whole NHEJ population. Panel 4: Insertion size distribution. We plot the frequencies of different size insertions among the whole NHEJ population. K562 targeting by both gRNAs is efficient (13-38%) and sequence specific (as shown by the shift in position of the NHEJ deletion distributions). Importantly, as evidenced by the peaks in the histogram of observed frequencies of deletion sizes, simultaneous introduction of both T1 and T2 guide RNAs resulted in high efficiency deletion of the intervening 19 bp fragment, demonstrating that multiplexed editing of genomic loci is also feasible using this approach.
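The multiplexing observation in Fig. S8 rests on a peak in the deletion size histogram at the length of the fragment between the two cut sites. A minimal sketch of that check, assuming the two cut positions and a list of observed deletion sizes are already available (the coordinates, counts, and function names below are placeholders, not data or methods from the paper):

```python
from collections import Counter


def intervening_length(cut_a, cut_b):
    """Length of the fragment excised when both cuts are rejoined."""
    return abs(cut_b - cut_a)


def fraction_at_size(deletion_sizes, size):
    """Fraction of observed deletions that have exactly `size` bases."""
    if not deletion_sizes:
        return 0.0
    return Counter(deletion_sizes)[size] / len(deletion_sizes)


def dominant_size(deletion_sizes):
    """Most frequent deletion size, or None if there are no deletions."""
    if not deletion_sizes:
        return None
    return Counter(deletion_sizes).most_common(1)[0][0]


if __name__ == "__main__":
    # Placeholder coordinates and sizes for illustration only.
    expected = intervening_length(100, 119)
    observed = [19, 19, 19, 2, 7]
    print(expected, dominant_size(observed), fraction_at_size(observed, expected))
```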

Supplementary Fig. S9. RNA-guided NHEJ in 293T cells. We transfected 293T cells with constructs indicated in the left panel. 4 days after transfection, we measured NHEJ rate by assessing genomic deletion and insertion rate at DSBs by deep sequencing. Panel 1: Deletion rate detected at targeting region. Red dash lines: boundary of T1 RNA targeting site; green dash lines: boundary of T2 RNA targeting site. We plot the deletion incidence at each nucleotide position in black lines and calculated the deletion rate as the percentage of reads carrying deletions. Panel 2: Insertion rate detected at targeting region. Red dash lines: boundary of T1 RNA targeting site; green dash lines: boundary of T2 RNA targeting site. We plot the incidence of insertion at the genomic location where the first insertion junction was detected in black lines and calculated the insertion rate as the percentage of reads carrying insertions. Panel 3: Deletion size distribution. We plot the frequencies of different size deletions among the whole NHEJ population. Panel 4: Insertion size distribution. We plot the frequencies of different size insertions among the whole NHEJ population. 293T targeting by both gRNAs is efficient (10-24%) and sequence specific (as shown by the shift in position of the NHEJ deletion distributions).

Supplementary Fig. S10. HR at the endogenous AAVS1 locus using either a dsDNA donor or a short oligonucleotide donor. (A) PCR screen (see Fig. 2C) confirmed that 21/24 randomly picked 293T clones were successfully targeted. (B) A similar PCR screen confirmed that 3/7 randomly picked PGP1-iPS clones were also successfully targeted. (C) Finally, short 90mer oligos could also effect robust targeting at the endogenous AAVS1 locus (shown here for K562 cells). The pink bar in the histogram highlights the frequency of events where an 'AA' base modification by oligonucleotide mediated homology directed repair (HDR) was successfully effected.
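The clone counts in panels (A) and (B) convert directly into percentages. A small worked sketch (the helper name `targeting_percent` is hypothetical; the counts are the ones given in the legend):

```python
from fractions import Fraction


def targeting_percent(targeted, screened):
    """Percentage of screened clones that were successfully targeted."""
    if screened <= 0:
        raise ValueError("screened must be positive")
    return float(Fraction(targeted, screened) * 100)


if __name__ == "__main__":
    print(f"293T:     {targeting_percent(21, 24):.1f}%")  # 21 of 24 clones
    print(f"PGP1-iPS: {targeting_percent(3, 7):.1f}%")    # 3 of 7 clones
```

With so few clones screened, in particular the 7 iPS clones, these percentages are rough estimates rather than precise efficiencies.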

Supplementary Fig. S11. Methodology for multiplex synthesis, retrieval and U6 expression vector cloning of guide RNAs targeting genes in the human genome. We established a resource of ~190k bioinformatically computed unique gRNA sites targeting ~40.5% of all exons of genes in the human genome (list in Supplementary Table 1). (A) We incorporated these into a 200 bp format (list in Supplementary Table 2) that is compatible with multiplex synthesis on DNA arrays. Specifically, our design allows for (i) targeted retrieval of a specific gRNA target or pools of gRNA targets from the DNA array oligonucleotide pool (through 3 sequential rounds of nested PCR as indicated in the figure schematic), and (ii) rapid cloning into a common expression vector which, upon linearization using an AflII site, serves as a recipient for Gibson assembly mediated incorporation of the gRNA insert fragment. (B) We confirmed this methodology by targeted retrieval of 10 unique gRNAs from a 12k oligonucleotide pool synthesized by CustomArray Inc. (see Methods; list in Supplementary Table 3).
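The nested retrieval idea in Fig. S11 can be illustrated with a simplified string model. This sketch assumes each oligo carries three nested layers of flanking primer sites around its payload and that each round keeps only the oligos matching one primer pair. The layout, primer sequences, and function names are hypothetical placeholders, and unlike a real amplicon, which retains the last primer pair, the model strips the matched flanks at every round:

```python
def strip_flanks(oligo, fwd, rev):
    """Return the interior of `oligo` if it is flanked by fwd ... rev, else None.

    `rev` is given as it reads on the same strand as `oligo`.
    """
    if len(oligo) < len(fwd) + len(rev):
        return None
    if oligo.startswith(fwd) and oligo.endswith(rev):
        return oligo[len(fwd):len(oligo) - len(rev)]
    return None


def nested_retrieve(pool, primer_pairs):
    """Apply each (fwd, rev) pair in turn, keeping only matching sequences."""
    selected = list(pool)
    for fwd, rev in primer_pairs:
        interiors = (strip_flanks(oligo, fwd, rev) for oligo in selected)
        selected = [seq for seq in interiors if seq is not None]
    return selected


if __name__ == "__main__":
    # Placeholder primers and payloads for illustration only.
    outer, middle, inner = ("AAAA", "TTTT"), ("CCCC", "GGGG"), ("ACAC", "TGTG")
    pool = [
        "AAAA" + "CCCC" + "ACAC" + "GATTACA" + "TGTG" + "GGGG" + "TTTT",
        "AAAA" + "CCCC" + "AGAG" + "CATCATC" + "TCTC" + "GGGG" + "TTTT",
        "GGGG" + "CCCC" + "ACAC" + "TTGACCA" + "TGTG" + "GGGG" + "CCCC",
    ]
    print(nested_retrieve(pool, [outer, middle, inner]))
```

Passing only the outer pairs retrieves a sub-pool, which corresponds to the "pools of gRNA targets" case in the legend.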

