
protocol

Transcriptome-wide mapping of
N6-methyladenosine by m6A-seq based on
immunocapturing and massively parallel
sequencing
Dan Dominissini1–3, Sharon Moshitch-Moshkovitz1,3, Mali Salmon-Divon1, Ninette Amariglio1 &
Gideon Rechavi1,2
1Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer, Israel. 2Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel. 3These authors contributed
equally to this work. Correspondence should be addressed to G.R. (gidi.rechavi@sheba.health.gov.il).

Published online 3 January 2013; doi:10.1038/nprot.2012.148

N6-methyladenosine–sequencing (m6A-seq) is an immunocapturing approach for the unbiased transcriptome-wide localization of m6A in high resolution. To our knowledge, this is the first protocol to allow a global view of this ubiquitous RNA modification, and it is based on antibody-mediated enrichment of methylated RNA fragments followed by massively parallel sequencing. Building on principles of chromatin immunoprecipitation–sequencing (ChIP-seq) and methylated DNA immunoprecipitation (MeDIP), read densities of immunoprecipitated RNA relative to untreated input control are used to identify methylated sites. A consensus motif is deduced, and its distance to the point of maximal enrichment is assessed; these measures further corroborate the success of the protocol. Identified locations are intersected in turn with gene architecture to draw conclusions regarding the distribution of m6A between and within gene transcripts. When applied to human and mouse transcriptomes, m6A-seq generated comprehensive methylation profiles revealing, for the first time, tenets governing the nonrandom distribution of m6A. The protocol can be completed within ~9 d for four different sample pairs (each consists of an immunoprecipitation and corresponding input).

© 2013 Nature America, Inc. All rights reserved.

Introduction
Over 100 modifications are known to decorate all four canonical nucleotides of RNA to create a complexity suiting the versatile nature of this molecule, now known to exceed its classic role as a carrier of genetic information1,2. Some of these modifications are of regulatory importance, similar to dynamically regulated DNA and protein modifications3,4.

Methylation of the N6 position of adenosine (m6A) is a widespread and enigmatic post-transcriptional RNA modification5; the devastating phenotypic consequences of its obliteration have been documented in a growing number of organisms6. Especially illuminating in that they provide a physiological context are its proven role in gametogenesis of Saccharomyces cerevisiae7, as well as the recent discovery that the fat mass and obesity-associated (FTO) gene, a central regulator of metabolism and an obesity risk gene, is an m6A-demethylase8.

Although being the most prevalent internal modification in mRNA of eukaryotes5, until recently its study still lagged far behind that of other common RNA and DNA modifications such as adenosine-to-inosine RNA editing9 and 5-methylcytosine10. As m6A has no effect on Watson-Crick base pairing or known chemical derivatization reactions, it could not be identified by reverse transcription–based methods11. These limitations prevented the development of robust and efficient procedures for its global mapping. Existing methods for localizing m6A within a sequence context require that first a specific transcript be isolated (by either pull-down or nuclease protection, using a complementary probe), and then subsequently fragmented to its constituent nucleotides and analyzed by any number of physicochemical techniques (thin-layer chromatography (TLC), high-performance liquid chromatography (HPLC), mass spectrometry, scintillation and so on)12. These methods are obviously laborious, of low throughput and require several iterations in order to pinpoint a single site, but most importantly they are hypothesis driven, narrowing the search to a specific transcript or nucleotide.

In the field of genetics, the localization-function relationship has an equivalent importance to the more general structure-function relationship in fueling of discovery. The major breakthroughs, such as those accomplished for 5-methylcytosine and recently for 5-hydroxymethylcytosine, are attributed to the ability to map the global landscapes of these modifications and then to superimpose it on top of other regulatory layers.

By harnessing the advantages of two established and powerful technologies, immunocapturing and massively parallel sequencing, we were able to develop a new, relatively simple method for the transcriptome-wide localization of m6A in high resolution13. In summary, we used a highly m6A-specific antibody to immunoprecipitate methylated RNA fragments out of a randomly fragmented transcriptome. We then subjected these fragments to massively parallel sequencing and identified positions of signal enrichment relative to input control.

We used m6A-seq to profile the human and mouse transcriptomes13. This study revealed thousands of methylated sites characterized by a typical consensus in over 7,000 gene transcripts. Strikingly, m6A sites tend to cluster around stop codons and within long internal exons. This nonrandom distribution is highly conserved between humans and mice, suggesting a fundamental role for this modification. The global overview offered by m6A-seq helped reveal these unifying principles and will hopefully provide the framework for detailed functional analyses to come. Our results have since been confirmed by another group14, which used an almost identical approach. Their experimental protocol

176 | VOL.8 NO.1 | 2013 | nature protocols


used proteinase K to release antibody-bound RNA fragments from beads, rather than competition with free m6A as in our protocol (see Experimental design).

The approach presented here could well serve the unbiased, transcriptome-wide localization of other RNA modifications, provided that they or their chemical derivatives could be enriched by antibody or any other means (metal ion, purified binding protein and so on). Notably, given the dynamic nature of m6A, profiling the methylome of different systems and under varying physiological states may be a powerful approach to shed light on the functions of this modification.

Limitations and advantages of the method
The major limitations of m6A-seq arise from its reliance on RNA fragmentation. This principle, while enabling one to narrow down the location of methylated sites, inherently entails a loss of information regarding isoform identity and the presence of multiple methylation sites along the same parent transcript (combinatorics). A tradeoff exists between fragment size and the ability to accurately align it back to the genome, in effect limiting the resolution of the method. Whereas resolution is admittedly not at the single-nucleotide level, when combined with consensus information a relatively high resolution is achieved. As the m6A-seq approach also relies on enrichment of methylated RNA fragments and their physical separation from the rest of the fragments, stoichiometry information is largely lost, making it insensitive to the proportion of methylated transcripts or sites.

m6A-seq allows a transcriptome-wide, hypothesis-independent identification of thousands of methylation sites, representing its greatest advantage over previous methods. The latter involved laborious biochemical procedures applied to individual purified transcripts, and therefore did not allow a global overview of this widespread modification. Our approach overcomes most of these limitations in a relatively short execution time while using accessible materials, equipment and software packages.

Experimental design
The m6A-seq protocol outlined here consists of two parts: immunocapturing of m6A-modified RNA fragments and their massively parallel sequencing (Fig. 1), and bioinformatic analysis of the generated tags (Fig. 2).

RNA isolation. Several points should be considered when isolating RNA for m6A-seq.
(i) A large number of reagents and protocols are available for RNA extraction. We typically use column-based purification kits, mainly for their high yield, good quality and ease of use. However, most RNA purification methods will be suitable, as long as purified RNA does not contain EDTA or salts that interfere with downstream fragmentation.
(ii) RNA integrity is influential. As RNA is chemically fragmented later in the protocol, degraded RNA will be further reduced in size, increasing the proportion of fragments that are lost during processing.
(iii) DNA contamination should be avoided. Its presence can interfere with downstream analysis. This is all the more important in organisms known to contain m6A in their DNA, as the anti-m6A antibody will recognize it. Therefore, make sure to include DNase treatment. To the best of our knowledge, m6A has not been detected in mammalian DNA.

The starting amounts and the nature of the RNA sample affect the informative depth of the library and consequently the number of identified peaks. When profiling total RNA, remember that a considerable fraction of the sequenced tags will originate from methylated rRNA (18S) (ref. 15). Thus, enough starting material should be used in this case: in our hands, a minimum of 300 µg of total RNA was required for profiling. Alternatively, rRNA burden can be minimized by using either Ribominus (Life Technologies) or mRNA enrichment. The use of as little as 5 µg of one-round polyA-selected RNA still allowed the detection of thousands of m6A peaks that together recapitulated the global features of the methylome as determined from larger libraries, namely the consensus motif and the characteristic nonrandom distribution.

Figure 1 | Schematic diagram of the m6A-seq protocol (Steps 1–28). Purified RNA is chemically fragmented into ~100-nt-long oligonucleotides and subjected to immunoprecipitation using an m6A-specific antibody. Eluted m6A-containing fragments (IP) and untreated input control fragments are converted to cDNA using random hexamer primers followed by adapter ligation and Illumina sequencing. Circled ‘m’, N6-methyladenosine.
[Diagram stages: purified RNA sample (Steps 1–5); chemical fragmentation to ~100-nt fragmented RNA (Steps 6–12); input control (Step 13); immunoprecipitation with anti-m6A antibodies (Steps 13–18); elution with free m6A (Steps 19–25); random primed cDNA library generation, adaptor ligation and Illumina sequencing of input and IP (Steps 26–28).]

RNA fragmentation. Chemical fragmentation (metal-ion induced) is an integral part of the protocol. We typically strive for a size distribution centered on ~100 nt. In our experience, even small changes in incubation time and temperature, or the presence of residual EDTA and/or salts, will affect fragmentation efficiency. Importantly, RNA concentration is a major determinant of efficiency. We thus strongly recommend calibrating this step before fragmenting your entire sample (Fig. 3). Small volumes (20 µl), batches of no more than five tubes, thin-walled tubes and a thermocycler (for accurate temperature setting) will help ensure reproducible results. Any change in fragment size or

fragmentation method should be compatible with ensuing steps, particularly library preparation.

Figure 2 | Bioinformatic pipeline. A flowchart summarizing the basic bioinformatic analysis of reads obtained by m6A-seq (IP + input control), including the software components used (Steps 29–49). Reads first undergo a set of QC checks with FastQC and are then mapped to the reference genome with Bowtie. Mapped reads are provided as input for MACS, which identifies m6A peaks that can be adapted for visualization on the UCSC genome browser. MEME is used for de novo motif finding followed by localization of the motif with respect to peak summit by CentriMo. Called peaks are annotated by intersection with gene architecture using PeakAnalyzer. GO, gene ontology.
[Flowchart stages: raw data (.fastq; input and IP), Step 29; pre-processing: QC checks (FastQC), Step 30; read mapping (Bowtie), Steps 31 and 32; peak calling and visualization (MACS), Steps 33–38; motif search (MEME), Steps 39–42; central enrichment analysis (CentriMo), Steps 43–47; peak annotation (PeakAnalyzer), Steps 48 and 49; other downstream analyses (GO, comparison to gene expression and so on).]

Figure 3 | Calibration of RNA fragmentation and validation of size distribution. Different RNA samples were chemically fragmented (according to the recipe in Step 6) for the specified time points, ethanol-precipitated and separated on a 1.5% (wt/vol) agarose gel. Average fragment size and intensity decreased with time (as very small fragments escaped the gel). After 5 min (indicated in red and by a white arrow), fragments centered on ~100 nt. Intact total RNA, including the small RNA fraction (black arrow), is shown on the left for comparison.
[Gel lanes: intact total RNA; fragmented RNA after 3, 4, 5, 6 and 7 min. Labeled bands: small RNAs, 5S RNA (120 nt), tRNA (70–110 nt).]

Immunoprecipitation. Before proceeding to IP, save a few micrograms of fragmented RNA to serve as input control in RNA-seq. The input library is required for determining signal enrichment in the immunoprecipitated sample.

IP of m6A-containing RNA fragments is an important step that may be susceptible to RNA degradation owing to possible RNase contamination carried over with the antibody or protein A beads. It is extremely important to add RNase inhibitors: we typically use both RNasin Plus (Promega), which is capable of inhibiting eukaryotic RNase A and RNase B, and ribonucleoside vanadyl complexes (RVC, Sigma-Aldrich) that inhibit most nucleases. Nonspecific binding of RNA to protein A beads contributes to background noise. We therefore advise preblocking the beads with BSA and including a bead-only control.

The anti-m6A antibody introduces no sequence bias, and its high specificity and IP compatibility are well established8,16–22. Nevertheless, several measures were taken to ensure the validity and stringency of our experimental approach: (i) antibody-bound RNA fragments were eluted only by competition with free m6A nucleotides (see Elution section below). (ii) m6A-seq of m6A-deficient mRNA (obtained from an ime4∆ yeast mutant) yielded undetectable RNA amounts and libraries, in contrast to mRNA obtained from sporulating wild-type yeast. (iii) Identification of a known methylation site within 18S rRNA was confirmed, serving as a positive internal control. (iv) High signal reproducibility was demonstrated across many independent biological replicates13.

Elution. The high signal-to-noise ratio obtained with m6A-seq is attributed to a large extent to highly specific retrieval of antibody-bound RNA. Whereas solvent extraction of IP and bead-only samples yields comparable RNA amounts, elution retrieves detectable amounts solely from the immunoprecipitated sample. Therefore, it is important to elute bound RNA by competition with free m6A rather than by extraction of the entire bead-antibody complex. Although eluted RNA levels are small, it is important to not include glycogen when ethanol-precipitating the eluates, as this interferes with measuring RNA levels and may also affect library preparation because of co-precipitation of free m6A. Jaffrey and colleagues’14 use of proteinase K in this step seems to serve the same purpose. However, currently there is not enough data to assess the relative efficiencies of the two approaches.

Quality control. You may want to validate the success of the protocol up to this point before proceeding to preparation and sequencing of the libraries. Because of the extremely high success rates of the protocol, we do not routinely implement this step; therefore, procedural details for quality control (QC) are not included in the protocol. A straightforward way would be to use reverse transcription–quantitative PCR (RT-qPCR) to assess depletion of methylated transcript fragments from the supernatant that remains after IP (see PROCEDURE Step 18) relative to input control. An unmethylated transcript, for which no or only minimal depletion is observed, is an adequate negative control. The works by us13 and by the Jaffrey laboratory14 provide a host of human and mouse methylated transcripts to choose from. Be sure to choose a couple of transcripts, as a specific one may not be methylated in your tissue or under the conditions of your experiment.

Library preparation and massively parallel sequencing. We successfully used mRNA-seq or TruSeq sample preparation kits (Illumina) for library preparation. However, other library generation kits and reagents can be used, as long as the resulting library is compatible with sequencing platform requirements. As RNA purification and fragmentation are already performed before IP, these steps are skipped during library preparation. Our library preparation starts from first-strand cDNA synthesis. Libraries are prepared from input and IP samples, as well as from bead-only control. We recommend size-selecting the library by gel excision (adjusting the expected size range of the desired band according to the average size of fragmented RNA and considering the added length of adapter or sequencing primers). It is important to validate the successful preparation of each library before proceeding to massively parallel sequencing using an Agilent 2100 Bioanalyzer (or equivalent). The bead-only control sample should not produce a library (indicating low background levels). Successful libraries of sample pairs (input and IP) are subjected to sequencing using standard 36-nt read size with the Illumina sequencing kit for Illumina GAIIx.

Bioinformatic analysis. The streamlined analysis described from PROCEDURE, Step 29 onward uses free, easily accessed, software packages. The m6A-seq approach is analogous to ChIP-seq: both are based on global identification of regions of signal enrichment. Indeed, the distribution of sequence tags in m6A-seq data is quite similar to that of ChIP-seq: in both cases, enriched regions are discrete and form sharp peaks along the genome or transcriptome. With proper parameter setting, computational tools available for calling peaks in ChIP-seq data, such as MACS23,24, can be adapted for the analysis of m6A-seq. MACS shifts all reads by half the fragment length (modeled by MACS or defined by the user) toward the 3′ ends in order to better locate the precise binding site (or methylation region in our case). Next, to capture the influence of local biases, it models the read distribution along the genome by Poisson distribution using the dynamic parameter ‘lambda_local’ (instead of uniform ‘lambda_background’). Lambda_local is estimated from windows centered on the peak location in the input (control) sample, and is applied to calculate the P value for each enriched region in the IP. To correct for multiple testing, MACS swaps IP and input samples, and empirically calculates the false discovery rate (FDR) based on the number of peaks from input over IP sample that are called at the same P value cutoff. Background calculation relies on the effective genome size (‘gsize’ parameter). In our case, reads are derived from the much smaller transcriptome. Therefore, it is important to set this parameter to the estimated transcriptome size; this means that more tags should be found in a fixed region for it to be considered a peak. Once peaks are identified, their authenticity is tested by an unbiased motif search using MEME25. Detection of a dominant consensus in which adenosine solely occupies a non-degenerate position strongly supports the validity of the results. Additional confirmation is gained by assessing the distance of the identified motif from peak summits, compared with negative control peaks: a significant clustering of the identified motif around the peak summit is to be expected.
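To make the role of the Poisson background and of ‘gsize’ more concrete, the following toy calculation may help. It is a simplified sketch and not MACS's actual implementation: it takes the larger of a uniform background rate and a single input-derived local rate, whereas MACS estimates local rates over several window sizes. All tag counts are invented, the transcriptome size of 1e8 nt is an arbitrary illustrative value, and the function names are hypothetical.

```python
import math


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam); direct PMF sum, adequate for small toy counts."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i        # P(X = i) obtained from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def background_lambda(total_ip_tags, window_nt, gsize):
    """Expected IP tags per window if tags were spread uniformly over gsize."""
    return total_ip_tags * window_nt / gsize


def local_lambda(input_tags_in_window, total_ip_tags, total_input_tags):
    """Expected IP tags in the window, scaled from the input (control) library."""
    return input_tags_in_window * total_ip_tags / total_input_tags


def window_p_value(ip_tags, input_tags, total_ip, total_input, window_nt, gsize):
    """Poisson P value using the larger of the background and local rates."""
    lam = max(background_lambda(total_ip, window_nt, gsize),
              local_lambda(input_tags, total_ip, total_input))
    return poisson_upper_tail(ip_tags, lam)


# Invented counts: 15 IP tags and 2 input tags in a 100-nt window,
# 10 million mapped tags in each library.
p_genome = window_p_value(15, 2, 10_000_000, 10_000_000, 100, gsize=2.7e9)
p_transcriptome = window_p_value(15, 2, 10_000_000, 10_000_000, 100, gsize=1e8)
print(p_genome, p_transcriptome)
```

With the smaller effective size the uniform background rate rises, so the same window receives a larger (less significant) P value, which is the behavior the text describes: more tags are needed in a fixed region before it is called a peak.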

MATERIALS
REAGENTS
• Cultured cells or tissues (any cell line or tissue can be used as a source for RNA) ! CAUTION Adhere to all relevant institutional ethics guidelines.
• TBE buffer (10×; BioLab, cat. no. 201423)
• PerfectPure RNA cultured cell kit (5 PRIME, cat. no. 2302340)
• ZnCl2 (Sigma-Aldrich, cat. no. 96468)
• Ultrapure water (Biological Industries, cat. no. 01-866-1B)
• Tris-HCl (pH 7.0, 1 M; Sigma-Aldrich, cat. no. T2413)
• Tris-HCl (pH 7.4, 1 M; Sigma-Aldrich, cat. no. T2663)
• GenElute mRNA miniprep kit (Sigma-Aldrich, cat. no. MRN70)
• Sodium acetate (pH 5.2, 3 M; Sigma-Aldrich, cat. no. S7899)
• EDTA (pH 8.0, 0.5 M; Sigma-Aldrich, cat. no. 03690)
• Glycogen (5 mg ml−1; Life Technologies, cat. no. AM9510)
• Ribonucleoside vanadyl complexes (RVC; 200 mM; Sigma-Aldrich, cat. no. R3380)
• Agarose (Sigma-Aldrich, cat. no. A9539)
• RNasin Plus RNase inhibitor (Promega, cat. no. N2611)
• NaCl (5 M; Sigma-Aldrich, cat. no. S6546)
• Igepal CA-630 (Sigma-Aldrich, cat. no. I8896)
• Affinity purified anti-m6A rabbit polyclonal antibody (Synaptic Systems, cat. no. 202 003)
• N6-Methyladenosine, 5′-monophosphate sodium salt (Sigma-Aldrich, cat. no. M2780)
• Ethanol (Sigma-Aldrich, cat. no. E7023) ! CAUTION Ethanol is highly flammable; keep flammable liquids away from all sources of ignition.
• Ethidium bromide (Sigma-Aldrich, cat. no. E1510) ! CAUTION Ethidium bromide is a highly toxic carcinogen; wear gloves and lab coats when handling it.
• Loading dye (Fermentas, cat. no. R0631)
• Immobilized recombinant protein A (Repligen, cat. no. IPA300)
• BSA (20 mg ml−1; Sigma-Aldrich, cat. no. B8667)
• SuperScript II reverse transcriptase (Life Technologies, cat. no. 18064-014)
• β-Mercaptoethanol (β-ME; Sigma-Aldrich, cat. no. M7522) ! CAUTION β-ME is highly toxic. Wear protective clothing, including gloves, eye and face masks when handling it.
• RNaseKiller solution (5 PRIME, cat. no. 2900630)
• mRNA-seq sample preparation kit (Illumina, cat. no. 1004814)
• TruSeq RNA sample preparation kit (Illumina, cat. no. 15013136)
• TruSeq SBS kit v5-GA, 36-cycles (Illumina, cat. no. 15013676)
• TruSeq SR cluster kit v2-Bot-GA (Illumina, cat. no. 15019749)
• GeneRuler low-range DNA ladder (Fermentas, cat. no. SM1193)
• Agilent DNA 100 kit (Agilent, cat. no. 5067-1504)
• QIAquick gel extraction kit (Qiagen, cat. no. 28704)
• QIAquick PCR purification kit (Qiagen, cat. no. 28104)
• MinElute PCR purification kit (Qiagen, cat. no. 28004)
• Quant-iT RNA assay kit (100 assays; Life Technologies, cat. no. Q32852)
• Agilent RNA 6000 Pico kit (Agilent, cat. no. 5067-1513)
• Test data set: m6A-seq of human hepatocarcinoma cell line (HepG2) can be obtained from Gene Expression Omnibus (GEO) at accession code GSE37003
• Gene annotation from Ensembl in GTF format (ftp://ftp.ensembl.org/pub/release-67/gtf/homo_sapiens/Homo_sapiens.GRCh37.67.gtf.gz)
• Human reference genome sequence, build 37 (hg19), downloaded from the University of California at Santa Cruz (UCSC) (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/) or from Ensembl (ftp://ftp.ensembl.org/pub/release-67/fasta/homo_sapiens/dna/). Concatenate the chromosomal fasta files into a single multifasta file called ‘hg19.fa’
• SRA toolkit (version 2.1.10) for SRA to .fastq conversion (http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?cmd=show&f=software&m=software&s=software)
• FastQC tool (version 0.10.1) for quality checks of sequenced reads (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/)
• Bowtie26 (version 0.12.7) for read mapping (http://bowtie-bio.sourceforge.net/index.shtml)

• BEDTools27 (version 2.15.0) for file manipulation (http://code.google.com/p/bedtools/)
• MACS23 (version 1.4.1) for peak calling (http://liulab.dfci.harvard.edu/MACS/)
• MEME25 (version 4.6.1) for de novo motif search (http://meme.nbcr.net/meme/cgi-bin/meme.cgi)
• CentriMo28 (version 4.8.1) for central motif enrichment analysis (http://meme.nbcr.net/meme/cgi-bin/centrimo.cgi)
• PeakAnalyzer29 (version 1.4) for peak annotation (http://www.ebi.ac.uk/bertone/software.html)
• fetchChromSizes (UCSC) for retrieval of genome size (http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/)
• wigToBigWig (UCSC) for file format conversion (http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/)
EQUIPMENT
• Microcentrifuge tubes (1.75 ml; Axygen, cat. no. MCT-175-C)
• PCR tubes with flat cap (0.2 ml; Axygen, cat. no. PCR-02-A)
• PCR tubes (8 strip; Axygen, cat. no. PCR-0208-C)
• PCR tube caps (8 strip; Axygen, cat. no. PCR-02-FCP-C)
• Microspin minicentrifuge/vortex (BioSan, model no. FV-2400 or equivalent)
• Pipettes
• Pipette filter tips
• Transilluminator
• Gel imager
• Magnetic separation rack
• Thermocycler machine (96-well plate; Applied Biosystems or equivalent)
• Gel electrophoresis system
• Weighing scale
• Weighing boats
• NanoDrop spectrophotometer (NanoDrop Technologies, ND-1000, or equivalent)
• Agilent 2100 Bioanalyzer or equivalent
• Head-over-tail rotator
• 64-bit computer running Linux or Mac OS X; 4 GB of RAM
• Cell scraper
• Homogenizer
REAGENT SETUP
ZnCl2, 1 M  Dissolve 1.363 g of ZnCl2 in 10 ml of molecular biology–grade, RNase-free water. Store the solution at room temperature (22 °C) and use it within 18 months.
m6A, 20 mM  Dissolve 10 mg of m6A in 1.3 ml of molecular biology–grade, RNase-free water. Store aliquots of 150 µl at −20 °C and use them within 12 months.
Cell lysis buffer with β-ME  Add 10 µl (stock) of β-ME per 1 ml of 5 PRIME lysis solution (supplied in the PerfectPure RNA cultured cell kit). Freshly prepare the buffer. ! CAUTION β-ME is toxic; dispense it in a fume cupboard and wear protective clothing.
Fragmentation buffer, 10×  Mix 800 µl of molecular biology–grade, RNase-free water with 100 µl (1 M stock) of Tris-HCl (pH 7.0) and 100 µl (1 M stock) of ZnCl2. Freshly prepare the buffer. The final concentrations in the buffer concentrate are 100 mM Tris-HCl and 100 mM ZnCl2.
IP buffer, 5×  Mix 0.5 ml (1 M stock) of Tris-HCl (pH 7.4), 1.5 ml (5 M stock) of NaCl and 0.5 ml (10% vol/vol stock) of Igepal CA-630 in a total volume of 10 ml (use molecular biology–grade, RNase-free water). Freshly prepare the buffer. The final concentrations in the buffer concentrate are 50 mM Tris-HCl, 750 mM NaCl and 0.5% (vol/vol) Igepal CA-630.
Elution buffer, 1×  Mix 90 µl (5× stock) of IP buffer, 150 µl (20 mM stock) of m6A, 7 µl of RNasin Plus and 203 µl of water (use molecular biology–grade, RNase-free water). Freshly prepare the buffer. Final concentrations are 1× IP buffer and 6.7 mM m6A.
m6A-specific antibody stock solution, 0.5 mg ml−1  Reconstitute 50 µg of lyophilized affinity-purified m6A-specific antibody in 100 µl of molecular biology–grade, RNase-free water. Store the stock solution in aliquots of 25 µl at −20 °C and use them within 12 months.
EQUIPMENT SETUP
Bioinformatics  Most of the commands given in the protocol can be run on the UNIX shell prompt, and are meant to be run from the example working directory. We assume that the location of all software tools is defined in your PATH. Commands meant to be executed from the UNIX shell are prefixed with a ‘$’ character.
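The final concentrations quoted under REAGENT SETUP follow from simple dilution arithmetic (C1·V1 = C2·V2), which can be checked as sketched below. The molar mass of ZnCl2 (136.3 g/mol) is not stated in the protocol and is supplied here; the helper name `dilute` is hypothetical, and the volume contributed by dissolved solids is assumed to be negligible.

```python
def dilute(stock_conc, stock_vol, total_vol):
    """Final concentration from C1*V1 = C2*V2 (both volumes in the same unit)."""
    return stock_conc * stock_vol / total_vol


ZNCL2_G_PER_MOL = 136.3  # molar mass of ZnCl2; not given in the protocol

# ZnCl2 stock: 1.363 g dissolved in 10 ml (0.010 l)
zncl2_stock_molar = (1.363 / ZNCL2_G_PER_MOL) / 0.010

# Fragmentation buffer, 10x: 800 µl water + 100 µl Tris-HCl + 100 µl ZnCl2
frag_total_ul = 800 + 100 + 100
frag_tris_mm = dilute(1000, 100, frag_total_ul)
frag_zncl2_mm = dilute(1000, 100, frag_total_ul)

# IP buffer, 5x: stocks brought to a total volume of 10 ml
ip_tris_mm = dilute(1000, 0.5, 10)
ip_nacl_mm = dilute(5000, 1.5, 10)
ip_igepal_pct = dilute(10, 0.5, 10)

# Elution buffer, 1x: 90 + 150 + 7 + 203 µl
elution_total_ul = 90 + 150 + 7 + 203
elution_ip_x = dilute(5, 90, elution_total_ul)
elution_m6a_mm = dilute(20, 150, elution_total_ul)

# Antibody stock: 50 µg in 100 µl (µg per µl is numerically mg per ml)
antibody_mg_per_ml = 50 / 100

print(f"ZnCl2 stock: {zncl2_stock_molar:.2f} M")
print(f"Fragmentation buffer: {frag_tris_mm:.0f} mM Tris-HCl, {frag_zncl2_mm:.0f} mM ZnCl2")
print(f"IP buffer: {ip_tris_mm:.0f} mM Tris-HCl, {ip_nacl_mm:.0f} mM NaCl, {ip_igepal_pct:.1f}% Igepal")
print(f"Elution buffer: {elution_ip_x:.1f}x IP buffer, {elution_m6a_mm:.1f} mM m6A")
print(f"Antibody stock: {antibody_mg_per_ml:.1f} mg/ml")
```

The same helper can be reused when scaling a recipe, as long as the stock and total volumes are expressed in the same unit.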

PROCEDURE
RNA isolation ● TIMING 3 h
1| Lyse the cells by adding PerfectPure RNA cultured cell kit lysis solution supplemented with 143 mM β-ME directly onto the cells. Collect the lysed cells with a cell scraper and vortex until the sample is homogeneous. Tissues should be thoroughly homogenized in lysis solution with the aid of a homogenizer. Next, pass the lysate through an 18–21-gauge syringe needle several times.
CRITICAL STEP Ensure that you have enough starting material, depending on your choice of RNA type for analysis (see Experimental design).
CRITICAL STEP Passing the lysate through a needle will improve RNA yields and ensure shearing of genomic DNA. Note that the antibody will also recognize m6A in the context of DNA if it is present in your organism.
PAUSE POINT Lysates can be stored at −80 °C until further processing for up to 6 months.

2| Isolate total RNA according to the manufacturer’s instructions, including on-column DNase treatment.
CRITICAL STEP Elute RNA in molecular biology–grade water, as elution buffers may interfere with subsequent RNA fragmentation. Elution volume should be as low as possible. Do not omit DNase treatment, as contaminating DNA will interfere with downstream analysis. DNase treatment is all the more essential in organisms with m6A in their DNA.
CRITICAL STEP Note that this RNA isolation method is not suitable for small RNAs (<150 nt), as they are lost during the process.
PAUSE POINT Isolated RNA can be stored at −80 °C for up to 1 year until further use.

3| Measure RNA concentration with a NanoDrop spectrophotometer.


? TROUBLESHOOTING


4| Validate RNA integrity by agarose gel electrophoresis or analysis on an Agilent 2100 Bioanalyzer.
CRITICAL STEP Metal-ion induced fragmentation later in the protocol can cause RNA degradation products, which are smaller in size to begin with, to escape processing and analysis.
CRITICAL STEP We advise using an RNA-dedicated electrophoresis apparatus.
? TROUBLESHOOTING

5| If desired, enrich for polyadenylated RNA by at least one round of oligo-dT selection using the GenElute mRNA miniprep kit. Depletion of rRNA using the RiboMinus transcriptome isolation kit is an alternative. Note that this step is optional; m6A-seq works well when performed on total RNA (see Experimental design).
CRITICAL STEP Elution volumes should be large here (polyadenylated RNA from 500 µg of total RNA is eluted in 100 µl) in order to ensure maximum yields, and they will typically result in low RNA concentration. Either concentrate RNA by ethanol precipitation or recalibrate fragmentation (Step 6).

Fragmentation ● TIMING 3 h
6| Adjust the RNA concentration to ~1 µg µl−1 with RNase-free water. Set up the following fragmentation reaction in a thin-walled 200-µl PCR tube. Vortex and spin down the tube.

Component                     Volume (µl)   Final
RNA (from Step 3 or Step 5)   18            18 µg
Fragmentation buffer, 10×     2             1×
Total volume                  20

CRITICAL STEP Adherence to the specified amounts and volumes is highly recommended, as scaling may affect fragmentation efficiency and the resulting size distribution. Work quickly at this stage and immediately proceed to Step 7. We advise working in batches of five tubes (300 µg of RNA require ~17 tubes). Substituting metal ion–induced fragmentation with physical fragmentation by sonication is not advised, as it yields fragments >200 nt and might not be entirely random.
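The tube count quoted above follows from the 18 µg of RNA loaded per 20-µl reaction. As a small planning aid, the number of tubes and five-tube batches for any starting amount can be worked out as sketched here (the function name is hypothetical; the per-tube amount and batch size are taken from Step 6):

```python
import math

RNA_PER_TUBE_UG = 18.0   # RNA per 20-µl reaction, from the Step 6 table
TUBES_PER_BATCH = 5      # batch size advised in the CRITICAL STEP note


def fragmentation_plan(total_rna_ug):
    """Return (tubes, batches) needed to fragment the given amount of RNA."""
    tubes = math.ceil(total_rna_ug / RNA_PER_TUBE_UG)
    batches = math.ceil(tubes / TUBES_PER_BATCH)
    return tubes, batches


tubes, batches = fragmentation_plan(300)
print(f"300 µg of RNA: {tubes} tubes in {batches} batches")
```

For 300 µg this gives 17 tubes, in line with the figure quoted in Step 6; the last tube and the last batch are only partly filled.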

7| Incubate the tubes at 94 °C for 5 min in a preheated thermal cycler block with the heated lid closed. Remove the tubes from the block and immediately add 2 µl of 0.5 M EDTA. Vortex and spin down the tubes and place them on ice.
CRITICAL STEP Time and temperature settings should be closely followed. Work quickly at this stage.

8| Repeat Steps 6 and 7 for each batch of five tubes until all of the RNA is fragmented.

9| Collect contents of all tubes, add one-tenth volume of 3 M sodium acetate (pH 5.2), glycogen (100 µg ml−1 final) and 2.5 volumes of 100% ethanol. Mix the contents and incubate at −80 °C overnight.
CRITICAL STEP Do not use nucleic acids as carriers for precipitation, as they will interfere with downstream IP and sequencing.
PAUSE POINT RNA is stable in the precipitation mixture when stored at −80 °C for a long time period.

10| Centrifuge the tubes at 15,000g for 25 min at 4 °C. Discard the supernatant, taking care not to disrupt the pellet, which
is easily visible because of the presence of glycogen. Wash the pellet with 1 ml of 75% (vol/vol) ethanol and centrifuge
again at 15,000g for 15 min at 4 °C.

Validation of postfragmentation size distribution ● TIMING 2 h


11| Carefully aspirate the supernatant and let the pellet air-dry. Resuspend the pellet in 300 µl of RNase-free water.
 PAUSE POINT RNA can be stored at  −80 °C at this stage until further use for up to 1 year.

12| Validate RNA postfragmentation size distribution by measuring RNA concentration with a NanoDrop spectrophotometer
and running 0.5 µg of RNA on 1.5% (wt/vol) agarose gel for ~30 min. The outlined fragmentation procedure should produce
a distribution of RNA fragment sizes centered on ~100 nt (Fig. 3). Alternatively, fragmented RNA can be run according to
the manufacturer’s instructions on an Agilent 2100 Bioanalyzer with an Agilent RNA 6000 Pico kit.
 CRITICAL STEP Validate RNA size distribution only after it has been ethanol-precipitated, as the presence of salts may
affect gel migration of the fragments. We advise using an RNA-dedicated electrophoresis apparatus.
? TROUBLESHOOTING
 PAUSE POINT RNA can be stored at  −80 °C at this stage until further use for up to 1 year.

Immunoprecipitation ● TIMING 5 h
13| Save a portion of untreated fragmented RNA to serve as input control in RNA-seq. To be on the safe side, we recommend
saving a few micrograms, although much less will also suffice.
 PAUSE POINT RNA can be stored at  −80 °C for up to 1 year before use in Step 26.

14| Adjust the volume of the remaining RNA to 755 µl with RNase-free water. Prepare the reaction mixture tabulated below
in a 1.7-ml low-binding microcentrifuge tube. Vortex and spin down the tube. We recommend setting up a parallel reaction
that includes the same amount of fragmented RNA but without the antibody. It will be treated in the same manner as the
IP sample throughout Steps 15–27 and will serve as a bead-only control to assess background levels and efficiency of RNA
elution at Steps 25 and 27.

Component                               Volume (µl)   Final
Fragmented RNA (from Step 12)           755           Varies (>5 µg of mRNA, >300 µg of total RNA)
RNasin (40 U µl−1)                      10            200–400 U
RVC (200 mM)                            10            2 mM
IP buffer, 5×                           200           1×
m6A-specific antibody (0.5 mg ml−1)     25            12.5 µg
Total volume                            1,000

 CRITICAL STEP We recommend using low-binding RNase/DNase-free microcentrifuge tubes from this step onward.
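
As a quick cross-check of the reaction table in Step 14, the final amounts can be recomputed from the stock concentrations and volumes. The sketch below is illustrative only; the helper names `final_conc` and `amount_added` are hypothetical, and the numbers are those tabulated above (note that 0.5 mg ml−1 equals 0.5 µg µl−1).

```shell
# Final concentration after dilution: stock concentration x volume added / total volume.
final_conc() {
    awk -v stock="$1" -v vol="$2" -v total="$3" 'BEGIN { printf "%g\n", stock * vol / total }'
}

# Absolute amount added: stock concentration (per µl) x volume (µl).
amount_added() {
    awk -v stock="$1" -v vol="$2" 'BEGIN { printf "%g\n", stock * vol }'
}

echo "RVC:       $(final_conc 200 10 1000) mM"   # 200 mM stock, 10 µl in 1,000 µl
echo "IP buffer: $(final_conc 5 200 1000)x"      # 5x stock, 200 µl in 1,000 µl
echo "RNasin:    $(amount_added 40 10) U"        # 40 U per µl, 10 µl
echo "Antibody:  $(amount_added 0.5 25) µg"      # 0.5 µg per µl, 25 µl
```

The 400 U computed for RNasin corresponds to the upper end of the tabulated 200–400 U range.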

15| Incubate with head-over-tail rotation for 2 h at 4 °C.

16| While the samples are incubating, wash 200 µl of recombinant protein A bead slurry twice in 1 ml of 1× IP buffer.
Resuspend the beads in 1 ml of 1× IP buffer supplemented with BSA (0.5 mg ml−1) and incubate on a rotating wheel
for 2 h. Spin down, remove and discard the supernatant and wash twice in 1 ml of 1× IP buffer. Equally divide the beads
between two 1.7-ml microcentrifuge tubes (one for the IP sample and one for the bead-only control).
 CRITICAL STEP Remember to supplement the IP buffer with RNasin and RVC. Do not exceed the specified quantity of
beads (already in excess), as it can influence background levels.

17| Transfer the reactions from Step 15 into the bead-containing tubes prepared in Step 16. Incubate the reaction mixtures
for 2 h on a rotating wheel at 4 °C.

18| Spin down the beads and carefully remove and retain the supernatant. Wash the beads with 1 ml of 1× IP buffer
three times.
 CRITICAL STEP Remember to supplement the IP buffer with RNasin and RVC. Try to minimize bead loss, as the amount of
precipitated RNA is scarce.
 CRITICAL STEP Bear in mind that the desired population of m6A-enriched RNA fragments is not in the supernatant, but
rather is still on the beads. The supernatant is saved for the sake of IP QC (see Experimental design and Step 25 below).

Elution ● TIMING 2.5 h


19| Add 100 µl of elution buffer to the sedimented beads. Incubate the mixture for 1 h with continuous shaking at 4 °C.
Remember that Steps 19–25 should be carried out on the beads used to capture the IP reaction, as well as on the
bead-only sample.
 CRITICAL STEP In our experience, elution by competition, rather than by solvent extraction of the entire bead-antibody
complex, is imperative in order to minimize background levels due to nonspecific binding.
 CRITICAL STEP Remember to supplement elution buffer with RNasin and RVC.

20| Spin down the beads and carefully remove and retain the supernatant (now containing eluted RNA fragments).
 CRITICAL STEP Take special care not to aspirate the beads, as it will increase background noise.

21| Add 100 µl of 1× IP buffer to the sedimented beads and gently tap the tube to mix. Spin down the beads and carefully
remove and retain the supernatant.
 CRITICAL STEP Remember to supplement IP buffer with RNasin and RVC.

22| Repeat Steps 19–21 once more. Discard the beads.

23| Combine all eluates from the same sample (IP or bead-only control) and add one-tenth volume of 3 M sodium acetate
(pH 5.2) and 2.5 volumes of 100% ethanol. Mix and incubate the sample at −80 °C overnight.
 CRITICAL STEP Do not add glycogen (or another carrier) to the precipitation mixture at this stage, as it can precipitate
free m6A, possibly interfering with downstream reactions and measurements.
 PAUSE POINT RNA is stable in the precipitation mixture when stored at −80 °C for a long period of time.

Recovery of precipitated RNA ● TIMING 1 h


24| Centrifuge the tube at 15,000g for 25 min at 4 °C. Discard the supernatant, taking care not to disrupt the pellet, which is
not visible at the bottom of the tube. Wash the pellet with 1 ml of 75% (vol/vol) ethanol and centrifuge it again at 15,000g
for 15 min at 4 °C. Aspirate the supernatant and let the pellet air-dry. Resuspend the pellet in 15 µl of RNase-free water.

25| Measure RNA concentration (using 1 µl from Step 24) with the Quant-iT RNA assay kit.

 CRITICAL STEP Assuming the minimal starting RNA amounts specified in the Experimental design section, expect yields
on the order of tens of nanograms, depending on the cell line or tissue of origin. The product of the bead-only control
reaction should be below the detection threshold of the Quant-iT RNA assay kit. Absorbance-based measurements of RNA
concentration are not sensitive enough for this application and should not be used.
 CRITICAL STEP Optionally, at this point you might choose to QC your IP before proceeding to library generation and
sequencing (see Experimental design).
? TROUBLESHOOTING
 PAUSE POINT RNA can be stored at  −80 °C at this stage until further use for up to 1 year.

Library preparation and massively parallel sequencing ● TIMING ~6 d


26| Subject comparable amounts of input control (from Step 13) and eluate generated in Step 24 from the IP sample—as
well as the eluate of the bead-only control sample—to first-strand cDNA synthesis using random primers and SuperScript II
reverse transcriptase according to instructions included in the mRNA-seq sample preparation kit.
 CRITICAL STEP In the m6A-seq protocol, RNA purification and fragmentation are performed before IP; these steps should
therefore be skipped when using the mRNA-seq sample preparation kit.
 PAUSE POINT Reaction products can be stored at −20 °C for up to 6 months.

27| Proceed to second-strand cDNA synthesis, DNA end repair, adapter ligation and PCR amplification using mRNA-seq or
TruSeq sample preparation kits according to the manufacturer’s instructions.
 CRITICAL STEP Note that library validation by Agilent Technologies 2100 Bioanalyzer should demonstrate that the
bead-only control sample did not produce a library.
 CRITICAL STEP Size-selecting the library by gel excision is advised. The expected size range of the desired band depends
on fragmentation output (Step 12), as well as on adapter and sequencing primer lengths.
? TROUBLESHOOTING
 PAUSE POINT Libraries can be stored at −20 °C. The manufacturer’s instructions for the TruSeq sample
preparation kit advise against storing libraries for more than a week.

28| Subject libraries to cluster generation and next-generation sequencing on the Illumina GAIIx platform (or a similar NGS
machine) via the 36-cycle sequencing module, according to the manufacturer’s instructions. Longer reads can also be used.
 CRITICAL STEP Assuming that m6A-seq is applied to total RNA and also that ~30 million reads per lane are obtained,
we recommend allocating separate lanes to IP and input samples. Multiplex sequencing by the use of indexed adapters can
be used as long as enough reads are obtained by the sequencing platform in use.

QC of raw sequence data ● TIMING ~30 min


29| Convert SRA files to .fastq format using the SRA Toolkit.
$ fastq-dump "SRR456555.sra"
$ fastq-dump "SRR456556.sra"
$ fastq-dump "SRR456557.sra"
$ fastq-dump "SRR456551.sra"
$ fastq-dump "SRR456552.sra"
$ fastq-dump "SRR456553.sra"
$ fastq-dump "SRR456554.sra"
 CRITICAL STEP The outlined analysis scheme that follows is applicable to any data set generated by m6A-seq. Here,
a stepwise analysis is demonstrated on a test data set that can be obtained from GEO accession no. GSE37003.

30| Perform simple QC checks to ensure that the raw data look good, with no biases that could affect results. Use the FastQC
tool, which provides summary graphs and tables.
$ fastqc *.fastq

Read mapping ● TIMING 4.5 h


31| Concatenate all .fastq files to single input and IP files.
$ cat SRR456555.fastq SRR456556.fastq SRR456557.fastq > Input.fastq
$ cat SRR456551.fastq SRR456552.fastq SRR456553.fastq SRR456554.fastq > IP.fastq
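
A simple sanity check after concatenation is to confirm that the merged file holds as many reads as its parts combined. The sketch below is an illustration under the assumption that each FASTQ record occupies exactly four lines; the function name `count_reads` and the toy files are hypothetical.

```shell
# Count reads in a FASTQ file, assuming four lines per record.
count_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Toy demonstration with two tiny FASTQ files (hypothetical content).
tmpdir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n' > "$tmpdir/a.fastq"
printf '@r2\nTTGA\n+\nIIII\n@r3\nGGCC\n+\nIIII\n' > "$tmpdir/b.fastq"
cat "$tmpdir/a.fastq" "$tmpdir/b.fastq" > "$tmpdir/merged.fastq"

expected=$(( $(count_reads "$tmpdir/a.fastq") + $(count_reads "$tmpdir/b.fastq") ))
observed=$(count_reads "$tmpdir/merged.fastq")
echo "expected=$expected observed=$observed"
rm -rf "$tmpdir"
```

On real data, the same function could be applied to the individual SRR files and to Input.fastq or IP.fastq.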

32| Map reads to a reference genome using any short-read aligner, such as BWA (ref. 30) or Bowtie (ref. 26). Here we use Bowtie,
allowing up to five multi-hits per read, meaning that all reads matching more than five locations are excluded.
$ bowtie -m 5 -a --sam --best --strata bowtie_index/hg19 IP.fastq > IP.sam
$ bowtie -m 5 -a --sam --best --strata bowtie_index/hg19 Input.fastq > Input.sam
If you wish to focus on methylation sites overlapping with splice junctions, use a splicing-aware aligner such as TopHat (ref. 31).
In our experience, a very small fraction of reads (<1%) map to exon-exon junctions, and hence we ignore these in downstream
analysis. For reads longer than 50 nt, consider using Bowtie 2 (ref. 32), which will search for multiple alignments and
report the best one.

Identification of m6A sites (‘peak calling’) ● TIMING 1.5 h


33| Set the effective genome size (gsize) in MACS based on an estimate of the transcriptome size obtained from the UCSC table browser.
Use the ‘block total’ value, which gives the total nucleotide count for exons (with overlaps). Go to UCSC table browser
at http://genome.ucsc.edu/cgi-bin/hgTables?command=start. Under ‘genome’ choose ‘Human’, under ‘assembly’ choose
‘Feb.2009 (GRCh37/hg19)’, under ‘group’ choose ‘Genes and Gene Prediction Tracks’ and under ‘track’ choose ‘Ensembl Genes’.
Click on the ‘summary/statistics’ link in order to get transcriptome statistics (based on Ensembl genes in our case).
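
If an annotation file is at hand, a rough cross-check of the transcriptome size is to sum exon lengths directly from the GTF. This is a sketch under assumptions, not the UCSC calculation itself: it assumes standard GTF columns (feature type in column 3, 1-based inclusive start and end in columns 4 and 5), counts overlapping exons more than once, and uses a hypothetical function name `exon_total` and toy data. The result need not match the 'block total' value exactly.

```shell
# Sum exon lengths (end - start + 1) over all 'exon' records of a GTF file.
# Overlapping exons are counted more than once.
exon_total() {
    awk -F'\t' '$3 == "exon" { total += $5 - $4 + 1 } END { print total + 0 }' "$1"
}

# Toy GTF with two exons and one CDS record (hypothetical content).
toy_gtf=$(mktemp)
printf 'chr1\ttoy\texon\t101\t200\t.\t+\t.\tgene_id "g1";\n'  > "$toy_gtf"
printf 'chr1\ttoy\tCDS\t121\t180\t.\t+\t0\tgene_id "g1";\n'  >> "$toy_gtf"
printf 'chr1\ttoy\texon\t151\t300\t.\t+\t.\tgene_id "g1";\n' >> "$toy_gtf"

exon_total "$toy_gtf"
rm -f "$toy_gtf"
```

In the toy file the two exons contribute 100 nt and 150 nt; the CDS record is ignored.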

34| Run MACS.


$ macs14 -t IP.sam -c Input.sam --name=m6A --format="SAM" --gsize=282000000 --tsize=36 \
  --nomodel --shiftsize=50 --to-small -w -S 2>macs.out &
 CRITICAL STEP ‘--tsize=36’ is set by the length of sequenced reads (Step 28).
 CRITICAL STEP ‘--shiftsize=50’ is set by the size of input RNA fragments (100 nt, Step 12).
 CRITICAL STEP ‘2>macs.out’ causes the standard error output stream (‘stderr’) of MACS to be written to a file named
‘macs.out’.
 CRITICAL STEP The ‘-w -S’ parameters store the fragment pileup in wiggle format, which can be loaded
into a genome browser for visualization.

35| From the output file ‘m6A_peaks.xls’ generated by MACS, which contains information about the called peaks, select only
those peaks having an FDR ≤5% and save them in a separate file.
$ awk '{if($9 <= 5) print}' m6A_peaks.xls > m6A_sig_peaks.xls
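
The filter in Step 35 tests only column 9, so whether comment, blank or header lines are passed through depends on how awk compares non-numeric fields. Where a file containing data rows only is preferred, a stricter variant can be sketched as below. This is an illustration, not the authors' command: the function name `filter_by_fdr` and the toy table are hypothetical, and the column layout (FDR in percent in column 9 of a tab-separated table) is assumed from the description of the MACS output in this protocol.

```shell
# Keep only data rows (nine or more tab-separated fields, plain decimal FDR in
# column 9) whose FDR (%) is at most the given threshold.
filter_by_fdr() {
    awk -F'\t' -v max="$2" \
        'NF >= 9 && $1 !~ /^#/ && $9 ~ /^[0-9.]+$/ && $9 + 0 <= max + 0' "$1"
}

# Toy peak table (hypothetical values): one comment line, one blank line,
# one header line and three peaks.
toy=$(mktemp)
{
    printf '# This file is generated by MACS (toy example)\n\n'
    printf 'chr\tstart\tend\tlength\tsummit\ttags\t-10*log10(pvalue)\tfold_enrichment\tFDR(%%)\n'
    printf 'chr1\t1001\t1200\t200\t95\t40\t120.5\t12.3\t0.50\n'
    printf 'chr1\t5001\t5150\t150\t60\t12\t55.1\t4.2\t7.80\n'
    printf 'chr2\t301\t520\t220\t110\t33\t98.7\t9.8\t5.00\n'
} > "$toy"

filter_by_fdr "$toy" 5
rm -f "$toy"
```

Only the first and third toy peaks (FDR 0.50% and 5.00%) pass the 5% threshold.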

Peak visualization ● TIMING 30 min


36| Normalize the coverage data in the wiggle files for the read density to be comparable between different samples
(IP and input in our case). The normalization is done such that we get the pileup per ‘fixed’ million reads in each sample
file. For example, if the data have 15 million reads, and you want to get the pileup per 10 million, you have to divide each
value in the (unzipped) wiggle file by 1.5.

$ awk '{if($1~/^[0-9]+$/) {print $1 "\t" $2/2.67} else print}' \
  m6A_MACS_wiggle/treat/m6A_treat_afterfiting_all.wig > IP_norm.wig
$ awk '{if($1~/^[0-9]+$/) {print $1 "\t" $2/3.76} else print}' \
  m6A_MACS_wiggle/control/m6A_control_afterfiting_all.wig > Input_norm.wig
 CRITICAL STEP Division is by 2.67 (top command) and 3.76 (bottom command), as in our test data set only 26.7 million
and 37.6 million alignments are left after filtering redundant tags in the IP and input control samples, respectively. This
information can be found in the ‘macs.out’ file that contains processing information output by MACS.
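
The divisors used in Step 36 follow directly from the alignment counts. As a minimal sketch, assuming a target depth of 10 million reads and a two-column wiggle data layout (position, value), the calculation and its application can be written as below; the function names `scale_factor` and `normalize_wig` and the toy wiggle content are hypothetical.

```shell
# Scale factor = number of alignments / target depth (both given in reads).
scale_factor() {
    awk -v n="$1" -v target="$2" 'BEGIN { printf "%g\n", n / target }'
}

# Divide the value column of wiggle data lines by the factor; track and
# declaration lines are passed through unchanged.
normalize_wig() {
    awk -v f="$2" '{ if ($1 ~ /^[0-9]+$/) print $1 "\t" $2 / f; else print }' "$1"
}

# Toy wiggle file (hypothetical values).
toy_wig=$(mktemp)
printf 'track type=wiggle_0 name="toy"\n'     > "$toy_wig"
printf 'variableStep chrom=chr1 span=10\n'   >> "$toy_wig"
printf '1\t5.34\n11\t26.7\n'                 >> "$toy_wig"

factor=$(scale_factor 26700000 10000000)
echo "scale factor: $factor"
normalize_wig "$toy_wig" "$factor"
rm -f "$toy_wig"
```

With 26.7 million and 37.6 million alignments, the factors are 2.67 and 3.76, the values used in the commands of Step 36.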

37| Convert the normalized WIG files to BigWig format.
$ grep -v ^track IP_norm.wig | wigToBigWig -clip stdin hg19.chrom.sizes IP_norm.bw
$ grep -v ^track Input_norm.wig | wigToBigWig -clip stdin hg19.chrom.sizes Input_norm.bw
 CRITICAL STEP A file containing hg19 chromosome sizes (‘hg19.chrom.sizes’) can be fetched from the UCSC database
using the UCSC fetchChromSizes script.

38| Place the BigWig file in an http, https or ftp location, and then upload it as a custom track to the UCSC genome browser
(see http://genome.ucsc.edu/goldenPath/help/bigWig.html for detailed instructions).

Motif search ● TIMING 25 min


39| Sort m6A peaks with FDR ≤5% by fold change and retrieve the coordinates of peak-summit regions (50 nt flanking the
summit). For example, for the best 1,000 m6A peak-summit regions:
$ sort -k8,8 -n -r m6A_sig_peaks.xls | head -1000 | \
  awk '{summit=$2-1+$5; print $1 "\t" summit-51 "\t" summit+50}' > bestPeaks.location
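
The coordinate arithmetic in the awk expression above can be followed on a single toy peak. The sketch below is illustrative: the function name `summit_window` and the toy row are hypothetical, and the column layout (peak start in column 2, summit offset from the start in column 5) is taken from the command in Step 39.

```shell
# From peak rows (column 2 = peak start, column 5 = summit offset from the
# start), print the chromosome and a window around the summit.
# Arguments: file, distance subtracted on the left, distance added on the right.
summit_window() {
    awk -v left="$2" -v right="$3" \
        '$1 !~ /^#/ && $2 ~ /^[0-9]+$/ {
             summit = $2 - 1 + $5
             print $1 "\t" (summit - left) "\t" (summit + right)
         }' "$1"
}

# Toy peak (hypothetical values): start 1001, summit offset 95.
toy=$(mktemp)
printf 'chr1\t1001\t1200\t200\t95\t40\t120.5\t12.3\t0.50\n' > "$toy"

summit_window "$toy" 51 50     # window used for motif search (Step 39)
summit_window "$toy" 151 150   # window used for CentriMo (Step 43)
rm -f "$toy"
```

For this peak the summit coordinate is 1001 − 1 + 95 = 1095, so the two windows span 1044–1145 (101 nt) and 944–1245 (301 nt).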

40| Map the peak-summit regions to annotated genes in order to fetch sequences from the sense strand.
$ awk '{print "chr" $0}' Homo_sapiens.GRCh37.67.gtf > genes.gtf
$ intersectBed -wo -a bestPeaks.location -b genes.gtf | \
  awk -v OFS="\t" '{print $1,$2,$3,"*","*",$10}' | uniq > bestPeaks.bed
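 CRITICAL STEP The first command adds the ‘chr’ prefix to the chromosome names in the annotation file so that they match those in the peak file. If the chromosome names in the two files already match, skip the first command.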

41| Use the fastaFromBed utility (BedTools) to fetch sequences, taking the strand information into consideration.
$ fastaFromBed -s -fi hg19.fa -bed bestPeaks.bed -fo bestPeaks.fa

42| Run MEME for de novo motif finding. The command below retrieves the top three motifs.
$ meme bestPeaks.fa -dna -nmotifs 3 -maxsize 1000000 -o bestPeaks_meme

Determining peak summit-to-motif distance ● TIMING 10 min
43| Generate a location file containing the coordinates of the regions flanking the peak summits (±150 nt).
$ awk '{if($1~/[^#]/) {summit=$2-1+$5; print $1 "\t" summit-151 "\t" summit+150}}' \
  m6A_sig_peaks.xls > m6A_sig_peaks_summit.location

44| Intersect with annotated genes in order to fetch the sequences from the sense strand.
$ intersectBed -wo -a m6A_sig_peaks_summit.location -b genes.gtf | \
  awk -v OFS="\t" '{print $1,$2,$3,"*","*",$10}' | uniq > m6A_sig_peaks_summit.bed

45| Fetch sequences from the .fasta file containing the sequence of the human genome.
$ fastaFromBed -s -fi hg19.fa -bed m6A_sig_peaks_summit.bed -fo m6A_sig_peaks_summit.fa

46| Run CentriMo.
$ centrimo --motif 1 --o peaks_motif_centrimo --norc m6A_sig_peaks_summit.fa bestPeaks_meme/meme.txt

Figure 4 | Deduction of methylation consensus motifs and evaluation of their location relative to summits of m6A peaks.
(a) Sequence logo representing the top MEME-deduced consensus motif for the 1,000 best-scoring m6A peaks (Steps 39–42).
The height of a nucleotide at each position reflects its frequency (y axis, bits; x axis, positions −3 to +3 relative to m6A).
(b) CentriMo-generated (Steps 43–47) density curves of the motif in a at positions flanking the peak summit (red) relative
to negative control peaks (blue); y axis, probability (× 10−3); x axis, position of best site in sequence (nt) relative to the
peak summit. P values are indicated for each curve (m6A peaks, P = 3.3 × 10−704; negative peaks, P = 8.9 × 10−1).

47| Repeat Steps 43–46 for negative peaks (called by swapping the IP and Input control samples) generated by MACS
(after removing the header line from the ‘m6A_negative_peaks.xls’ file).

Annotation of methylated regions ● TIMING 10 min
48| Generate an input file for PeakAnnotator containing information on chromosome, start and end coordinates
(without any header lines).
$ awk '{if($1~/[^#]/) print $1 "\t" $2 "\t" $3}' m6A_sig_peaks.xls > m6A_sig_peaks_PAinput.txt

49| Run the command line utility ‘PeakAnnotator’ of PeakAnalyzer (ref. 29).
$ java -jar PeakAnnotator.jar -u ndg -p m6A_sig_peaks_PAinput.txt -a genes.gtf -g all -o ./
 CRITICAL STEP There are two points worth taking into consideration when performing annotation analysis: first, multiple
transcripts overlapping a given location are all reported. Hence, if you are interested in generating statistics regarding peak
locations within genes, this analysis can be performed on ‘canonical transcripts’ instead of on all isoforms in order to avoid
bias toward multi-isoform genes. Second, PeakAnnotator reports the overlap of the first, central and last nucleotide of a peak
inside a gene. As peak locations reported by MACS are not centered on the summit, you can use the summit location file
output by MACS (‘m6A_summits.bed’) as the input for PeakAnnotator in order to retrieve overlaps with summit regions
(assumed to be in close proximity to the actual methylated nucleotide, Fig. 4). Note that the input for PeakAnalyzer
is 1-based, whereas the summit locations reported by MACS are 0-based.
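
The coordinate shift mentioned in the note above can be sketched as follows. This is an illustration under assumptions: it treats the summit file as standard BED (0-based start, end-exclusive) and converts it to 1-based inclusive coordinates by adding 1 to the start; the function name `bed_to_one_based` and the toy record are hypothetical, and the exact input expected by PeakAnnotator should be checked against its documentation.

```shell
# Convert BED intervals (0-based start, end-exclusive) to 1-based inclusive
# coordinates by adding 1 to the start; the end is left unchanged.
bed_to_one_based() {
    awk -v OFS='\t' '{ print $1, $2 + 1, $3 }' "$1"
}

# Toy summit record (hypothetical values): a single-nucleotide BED interval.
toy=$(mktemp)
printf 'chr1\t1094\t1095\tMACS_peak_1\t40.00\n' > "$toy"

bed_to_one_based "$toy"
rm -f "$toy"
```

The single-nucleotide interval 1094–1095 in BED notation becomes position 1095–1095 in 1-based notation.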

? TROUBLESHOOTING
Troubleshooting advice can be found in Table 1.

Table 1 | Troubleshooting table.

Step | Problem | Possible reason | Solution
3 | Low RNA yield | RNA was not fully eluted from the column | Re-elute RNA from column by adding RNase-free water preheated to 70 °C
3 | Low RNA yield | Too little starting material | Repeat RNA isolation with larger amount of starting material
3 | Low RNA yield | RNA is degraded | Proceed to verify RNA integrity in Step 4
3 | Low RNA yield | None of the above | Repeat RNA purification. Make sure the kit has not expired. Closely follow the protocol, especially after DNase treatment
4 | RNA appears to be degraded on the gel | RNA is contaminated with RNase | Repeat RNA purification. Before starting, clean your work area, pipettes and gloves with RNaseKiller (5 PRIME) or a similar product
4 | Genomic DNA is apparent in the agarose gel wells | Unsuccessful DNase treatment | Repeat DNase treatment
12 | RNA fragments appear to be shorter than expected | Stop solution was not added to the sample immediately after fragmentation | Repeat fragmentation. Make sure you quickly add the stop solution, mix and place the sample on ice
12 | RNA fragments appear to be shorter/longer than expected | Incorrect concentration of fragmentation buffer | Prepare a fresh fragmentation buffer
12 | RNA fragments appear to be shorter/longer than expected | RNA concentration is too low/high | Repeat fragmentation with concentrated RNA or recalibrate
12 | RNA fragments appear to be shorter/longer than expected | Fragmentation time and temperature are different than those stated in the protocol | Repeat fragmentation under the specified conditions. Check the block temperature before repeating fragmentation
12 | RNA fragments seem to be longer than expected | RNA sample was eluted from the column with an elution buffer containing EDTA | Ethanol precipitate your sample to remove any EDTA and repeat RNA fragmentation
25 | RNA is detected in bead-only control (in comparable amounts to IP) | Beads were not washed properly | Repeat immunoprecipitation and elution
25 | RNA is detected in bead-only control (in comparable amounts to IP) | RNA was solvent extracted and not eluted | Repeat immunoprecipitation and elution
25 | RNA is detected in bead-only control (in comparable amounts to IP) | Inaccurate measurement | Repeat measurement
25 | RNA is not detected in the IP sample | RNA starting amounts were too small | Repeat with larger starting amounts
25 | RNA is not detected in the IP sample | Cells or tissue of origin have very low levels of methylation | Repeat with larger starting amounts for validation
25 | RNA is not detected in the IP sample | RNA was degraded during elution | Repeat and be sure to supplement IP and elution buffers with RNase inhibitors
25 | RNA is not detected in the IP sample | Insufficient m6A in elution buffer | Repeat with new elution buffer
25 | RNA is not detected in the IP sample | Inaccurate measurement | Repeat measurement
25 | RNA is not detected in the IP sample | Inefficient elution | Repeat with increased elution volume and more vigorous shaking
27 | No library or inefficient library was generated (per Bioanalyzer analysis) | RNA amounts in IP were too small | Repeat IP
27 | High levels of adapter-adapter sequences | RNA amounts in IP were too small | Repeat IP

● TIMING
Day 1
Steps 1–5, RNA isolation: 3 h
Steps 6–10, fragmentation: 3 h
Day 2
Steps 11 and 12, validation of postfragmentation size distribution: 2 h
Steps 13–18, immunoprecipitation: 5 h
Steps 19–23, elution: 2.5 h
Days 3–7
Steps 24 and 25, recovery of precipitated RNA: 1 h
Steps 26–28, library preparation and next-generation sequencing: ~6 d
Days 8 and 9
Steps 29–49, bioinformatic analysis: 1–2 d

ANTICIPATED RESULTS
Pre-processing QC
The majority of FastQC metrics (Step 30) are expected to be comparable between IP and input control samples. Notably,
however, ‘Per sequence GC content’ is consistently higher in IP than in input control; this is apparently a property of
methylated regions. Additional small deviations can be attributed to the presence of adapter sequences, PCR duplicates,
repetitive sequences and rRNA. FastQC reports for two samples, SRR456557.fastq (input) and SRR456551.fastq (IP),
are available at http://sheba-cancer.org.il/Nat_protocols/Input_fastqc/fastqc_report.html and at
http://sheba-cancer.org.il/Nat_protocols/IP_fastqc/fastqc_report.html, respectively.

Read mapping
It is noteworthy that PCR duplicates can give rise to artifactual enrichments, and are therefore best eliminated. Sample-to-
sample variation in the extent of redundancy is influenced to a large degree by starting RNA amounts: redundancy increases
as starting amounts decrease.
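
To get a rough feel for the extent of redundancy in a sample, tags that share the same chromosome, start position and strand can be counted in the SAM file. The sketch below is an approximation and not the filter that MACS applies internally: it assumes standard SAM columns (flag in column 2, reference name in column 3, position in column 4), and the function name `redundancy` and the toy alignments are hypothetical.

```shell
# Report total aligned tags, unique (chromosome, position, strand) tags and
# the redundant fraction for a SAM file. Header lines (@) and unmapped
# reads (reference name "*") are skipped.
redundancy() {
    awk -F'\t' '
        /^@/ || $3 == "*" { next }
        {
            strand = (int($2 / 16) % 2) ? "-" : "+"   # SAM flag bit 0x10 = reverse strand
            total++
            if (!seen[$3 ":" $4 ":" strand]++) uniq++
        }
        END { printf "%d\t%d\t%.2f\n", total, uniq, (total ? 1 - uniq / total : 0) }
    ' "$1"
}

# Toy SAM file (hypothetical reads): two identical forward tags, one reverse
# tag at the same position and one unmapped read.
toy=$(mktemp)
printf '@HD\tVN:1.0\n'                                       > "$toy"
printf 'r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n'   >> "$toy"
printf 'r2\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n'   >> "$toy"
printf 'r3\t16\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n'  >> "$toy"
printf 'r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n'           >> "$toy"

redundancy "$toy"
rm -f "$toy"
```

The three output columns are the number of aligned tags, the number of unique tags and the redundant fraction; comparing this fraction between IP and input samples gives an indication of how strongly low starting amounts have inflated duplication.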

Peak calling
In our test data set, the specified configuration of MACS (Steps 33–35) identified 35,098 peaks, of which 34,352 had an FDR
of ≤5%. Further filtering by fold changes of ≥4 results in 32,105 peaks. For each peak, MACS reports the chromosome name,
start and end positions, length of the peak region, summit location relative to the peak start position, number of reads in
the peak region, P value for the peak region, fold enrichment for this region (compared with the expectation from Poisson
distribution with local lambda) and an FDR.

Peak visualization
The UCSC genome browser session of the test data set (Steps 36–38) is available at
http://genome.ucsc.edu/cgi-bin/hgTracks?hgS_doOtherUser=submit&hgS_otherUserName=mali&hgS_otherUserSessionName=m6A_protocol.
Representative plots are given in Figure 5.
Analysis of the test data set shows that m6A-enriched regions are generally discrete, and that they typically form sharp
peaks along the transcriptome. There are ~2.1 peaks per gene. Of the genes that harbor more than one peak, some contain
two or more contiguous peaks, suggesting that peak clustering is a feature of the methylome. Acknowledging this feature,
you may choose to further increase resolution by using the PeakSplitter utility of PeakAnalyzer to subdivide peak regions
containing more than one site of signal enrichment (not described in this protocol).

Motif search and summit-to-motif distance
We suggest that the follow-up analyses should involve consensus motif finding and peak summit-to-motif distance assessment.
Recapitulation of the previously established methylation consensus and its proximity to m6A peak summits are taken as
further corroboration of the success of the protocol. This assumption holds true for the human and mouse transcriptomes.
However, as the anti-m6A antibody is independent of sequence context, enrichment of motifless regions is possible and was
indeed shown for a subset of methylated sites in our data set. The first motif that MEME detects (out of three motifs defined
in the search, Steps 39–42) is the typical methylation consensus (Fig. 4a). Figure 4b shows the density of the best strong
site for this motif at each position within m6A peak regions (300 nt) relative to negative peak regions (Steps 43–47).
A very strong central enrichment is evident with a P value of 3.3 × 10−704.

Annotation
The final part of the proposed analysis deals with peak distribution among different genomic features. PeakAnnotator
generates several files: ‘m6A_sig_peaks_PAinput.summary.txt’ reports the overlapping and the nearest downstream
gene for each peak. For peaks that localize within genes, the position of the peak relative to gene features (exons,
introns, 3′ and 5′ untranslated regions) is reported in the ‘m6A_sig_peaks_PAinput.overlap.txt’ file.

Figure 5 | Representative human gene plots harboring m6A peaks. Normalized coverage of IP and input control is indicated
in red and blue, respectively, above gene architecture in a UCSC format. Thick black boxes represent exons; thin black boxes
represent 5′ and 3′ untranslated regions (UTRs); thin lines represent introns.
Panels: ADAR (chr1:152,821,158-152,847,306) and UBE2Q1 (chr1:152,787,675-152,797,744); y axis, no. of reads (0–300).

Acknowledgments We thank the Kahn Family Foundation for their support. This work was supported in part by grants from the
Flight Attendant Medical Research Institute (FAMRI), Bio-Med Morasha Israel Science Foundation (ISF) (grant no. 1942/08),
ISF (grant no. 1667/12), the molecular basis of human disease I-CORE (Israeli Centers of Research Excellence) and the
Israel Ministry of Science and Technology (Scientific Infrastructure Program). G.R. holds the Djerassi Chair in Oncology at
the Sackler Faculty of Medicine, Tel Aviv University. This work was performed in partial fulfillment of the requirements
for a PhD degree to D.D., Sackler Faculty of Medicine, Tel Aviv University.

AUTHOR CONTRIBUTIONS D.D. and S.M.-M. conceived the approach, developed the protocol and performed the experiments.
M.S.-D. designed the bioinformatic pipeline and analyzed the data. N.A. and G.R. supervised the project. D.D., S.M.-M.,
M.S.-D., N.A. and G.R. wrote the manuscript.

COMPETING FINANCIAL INTERESTS The authors declare no competing financial interests.

Published online at http://www.nature.com/doifinder/10.1038/nprot.2012.148.
Reprints and permissions information is available online at http://www.nature.com/reprints/index.html.

1. Cantara, W.A. et al. The RNA Modification Database, RNAMDB: 2011 update. Nucleic Acids Res. 39, D195–D201 (2011).
2. He, C. Grand challenge commentary: RNA epigenetics? Nat. Chem. Biol. 6, 863–865 (2010).
3. Chan, C.T. et al. A quantitative systems approach reveals dynamic control of tRNA modifications during cellular stress. PLoS Genet. 6, e1001247 (2010).
4. Schaefer, M. et al. RNA methylation by Dnmt2 protects transfer RNAs against stress-induced cleavage. Genes Dev. 24, 1590–1595 (2010).
5. Bokar, J. Fine-tuning of RNA functions by modification and editing. in Topics in Current Genetics 12 (ed. Grosjean, H.) 141–177 (Springer, 2005).
6. Zhong, S. et al. MTA is an Arabidopsis messenger RNA adenosine methylase and interacts with a homolog of a sex-specific splicing factor. Plant Cell 20, 1278–1288 (2008).
7. Clancy, M.J., Shambaugh, M.E., Timpte, C.S. & Bokar, J.A. Induction of sporulation in Saccharomyces cerevisiae leads to the formation of N6-methyladenosine in mRNA: a potential mechanism for the activity of the IME4 gene. Nucleic Acids Res. 30, 4509–4518 (2002).
8. Jia, G. et al. N6-methyladenosine in nuclear RNA is a major substrate of the obesity-associated FTO. Nat. Chem. Biol. 7, 885–887 (2011).
9. Levanon, E.Y. et al. Systematic identification of abundant A-to-I editing sites in the human transcriptome. Nat. Biotechnol. 22, 1001–1005 (2004).
10. Klose, R.J. & Bird, A.P. Genomic DNA methylation: the mark and its mediators. Trends Biochem. Sci. 31, 89–97 (2006).
11. Dai, Q. et al. Identification of recognition residues for ligation-based detection and quantitation of pseudouridine and N6-methyladenosine. Nucleic Acids Res. 35, 6322–6329 (2007).
12. Kellner, S., Burhenne, J. & Helm, M. Detection of RNA modifications. RNA Biol. 7, 237–247 (2010).
13. Dominissini, D. et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature 485, 201–206 (2012).
14. Meyer, K.D. et al. Comprehensive analysis of mRNA methylation reveals enrichment in 3′ UTRs and near stop codons. Cell 149, 1635–1646 (2012).
15. Czerwoniec, A. et al. MODOMICS: a database of RNA modification pathways. 2008 update. Nucleic Acids Res. 37, D118–121 (2009).
16. Horowitz, S., Horowitz, A., Nilsen, T.W., Munns, T.W. & Rottman, F.M. Mapping of N6-methyladenosine residues in bovine prolactin mRNA. Proc. Natl. Acad. Sci. USA 81, 5667–5671 (1984).
17. Bringmann, P. & Luhrmann, R. Antibodies specific for N6-methyladenosine react with intact snRNPs U2 and U4/U6. FEBS Lett. 213, 309–315 (1987).
18. Dante, R. & Niveleau, A. Inhibition of in vitro translation by antibodies directed against N6-methyladenosine. FEBS Lett. 130, 153–157 (1981).
19. Munns, T.W., Liszewski, M.K., Oberst, R.J. & Sims, H.F. Antibody nucleic acid complexes. Immunospecific retention of N6-methyladenosine-containing transfer ribonucleic acid. Biochemistry 17, 2573–2578 (1978).
20. Munns, T.W., Liszewski, M.K. & Sims, H.F. Characterization of antibodies specific for N6-methyladenosine and for 7-methylguanosine. Biochemistry 16, 2163–2168 (1977).
21. Munns, T.W., Oberst, R.J., Sims, H.F. & Liszewski, M.K. Antibody-nucleic acid complexes. Immunospecific recognition of 7-methylguanine- and N6-methyladenine-containing 5′-terminal oligonucleotides of mRNA. J. Biol. Chem. 254, 4327–4330 (1979).
22. Munns, T.W., Sims, H.F. & Liszewski, M.K. Immunospecific retention of oligonucleotides possessing N6-methyladenosine and 7-methylguanosine. J. Biol. Chem. 252, 3102–3104 (1977).
23. Zhang, Y. et al. Model-based analysis of ChIP-seq (MACS). Genome Biol. 9, R137 (2008).
24. Feng, J., Liu, T. & Zhang, Y. Using MACS to identify peaks from ChIP-seq data. Curr. Protoc. Bioinformatics 34, 2.14.1–2.14.14 (2011).
25. Machanick, P. & Bailey, T.L. MEME-ChIP: motif analysis of large DNA datasets. Bioinformatics 27, 1696–1697 (2011).
26. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
27. Quinlan, A.R. & Hall, I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
28. Bailey, T.L. & Machanick, P. Inferring direct DNA binding from ChIP-seq. Nucleic Acids Res. 18, 18 (2012).
29. Salmon-Divon, M., Dvinge, H., Tammoja, K. & Bertone, P. PeakAnalyzer: genome-wide annotation of chromatin binding and modification loci. BMC Bioinformatics 11, 415 (2010).
30. Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010).
31. Trapnell, C., Pachter, L. & Salzberg, S.L. TopHat: discovering splice junctions with RNA-seq. Bioinformatics 25, 1105–1111 (2009).
32. Langmead, B. & Salzberg, S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).