Review

Emerging Roles of Disordered Sequences in RNA-Binding Proteins

Sara Calabretta1 and Stéphane Richard1,*

RNA-binding proteins (RBPs) maintain RNA metabolism homeostasis in the cell by regulating
temporal, spatial, and functional dynamics of RNAs. RBPs achieve RNA binding not only through
classical structured RNA-binding domains but also with sequences that are intrinsically
disordered and often of low amino acid complexity. RBP–RNA interactions form ribonucleoprotein
(RNP) complexes and emerging evidence indicates that RNPs form higher structures or lattices,
promoting territories of phase transitions. Herein, we discuss the role of disordered sequences
in RBPs, their function in RNPs and protein networks, as well as their regulation by
post-translational modifications and how RBP deregulation leads to disease.

Trends
Intrinsically disordered regions (IDRs) are protein sequences that lack a defined and ordered 3D
structure and play key roles in RNA-binding proteins (RBPs).

The physiological role of IDRs in RBPs is likely related to the formation of hubs of
ribonucleoproteins (RNPs), required for proper RNA metabolism. IDRs undergo disordered-to-ordered
transition after binding with interactors.

Mutations of IDRs contained in RBPs have been linked to the onset of diseases. Mutations likely
affect the flexibility of intrinsically disordered proteins (IDPs), thereby disrupting the correct
balance between disordered and ordered structures.

RNA-Binding Proteins

Many cellular RNAs exist in association with RNA-binding proteins (RBPs) (see Glossary) to
form ribonucleoprotein (RNP) complexes; defects in the formation or the composition of
RNPs lead to diseases [1–3]. Interactions between RBPs and RNA are crucial for maintaining
RNA metabolism homeostasis at all stages from biogenesis to degradation. Therefore, RBPs
are key post-transcriptional gene regulators. It is not surprising that RBPs fulfill versatile roles in
the regulation of basic cellular processes [4], such as the regulation of pre-messenger RNA
(pre-mRNA) splicing [5], polyadenylation [6], export to the cytoplasm, and translation into
protein (Figure 1). Many other roles have been attributed to RBPs including the processing of
noncoding RNA, such as microRNAs (miRNAs), circular RNAs (circRNAs), and long noncoding
RNAs (lncRNAs), and will not be extensively covered here as these were reviewed recently
elsewhere [7–10].

RBPs were historically identified by their ability to associate with RNA using biochemical
methods [11–14]. RNA-binding domains were then defined by structure–function studies
and structure determination [15–18]. Sequencing and computational analysis of DNA sequences
across species led to the identification of >500 human RBPs, each containing at least one
RNA-binding domain [19]. Classically, RBPs were categorized based on their RNA-binding
domains including the RNA recognition motif (RRM), the K homology (KH) domain, the DEAD
motif, the double-stranded RNA-binding motif (DSRM), and the zinc-finger domain [19]. However,
new RBP–RNA complexes have emerged through the use of genome-wide RNA target
identification via crosslinking and immunoprecipitation (CLIP) technology combined with
high-throughput sequencing [20–23]. Bioinformatic analyses focusing on the specific RNA sequences
bound by RBPs have revealed many new RBP–RNA interactions [24]. In addition, interactome
capture coupled with mass spectrometry identified >1300 experimentally confirmed human
RBPs [25–28]. A compilation of these datasets has led to the experimentally validated census of
1500 RBPs [29,30]. Many of these newly identified RBPs do not harbor a canonical RNA-binding
domain, but rather contain disordered and low amino acid complexity sequences, such
as the RGG/RG motif [3,29,31]. This raises the question as to how these disordered sequences
function in RBPs and RNA metabolism.

1 Terry Fox Molecular Oncology Group and Segal Cancer Center, Bloomfield Center for Research on
Aging, Lady Davis Institute for Medical Research and Departments of Oncology and Medicine,
McGill University, Montréal, Québec, H3T 1E2, Canada
*Correspondence: stephane.richard@mcgill.ca (S. Richard).

Trends in Biochemical Sciences, November 2015, Vol. 40, No. 11 http://dx.doi.org/10.1016/j.tibs.2015.08.012
© 2015 Elsevier Ltd. All rights reserved.

Glossary
Hub: top 20% of the interacting proteins of an interactome.
Interactome: a network of interacting proteins.
Intrinsically disordered proteins (IDPs): proteins that contain one or more IDRs.
Intrinsically disordered regions (IDRs): a sequence in a protein that is not structured by itself,
but requires a ligand or binding partner to assume a secondary structure.
Low complexity (LC) sequences: a protein sequence that contains limited diversity in amino acid
composition and is devoid of hydrophobic residues.
Ribonucleoprotein (RNP) complexes: macromolecules containing proteins and RNAs. Classic
examples include the spliceosome, ribosome, and exon junction complex.
RNA-binding proteins (RBPs): proteins that directly interact with RNA. RBPs interact with RNA
through a structured domain and/or an IDR. The RBP–RNA interaction can occur in a sequence-
and/or structure-specific manner; however, some RBPs bind RNA in a sequence- and
structure-independent manner.
RNA metabolism: the process that implicates RNAs from their synthesis to their degradation. RNA
metabolism defines the processes including pre-mRNA splicing, noncoding RNA regulation,
nonsense-mediated decay, as well as RNA export, localization, stability, packaging in RNPs, and
mRNA translation.

[Figure 1 panels: RNA transcription; Nuclear processing (noncoding RNAs, splicing and
polyadenylation, microRNA processing); Cytoplasmic processing (mRNA translation, inhibition of
mRNA translation, gene silencing, aggregation in stress granules, aggregation in P-bodies).]

Figure 1. RNA-Binding Proteins (RBPs) in RNA metabolism. Nuclear processing: RNA polymerase II (Pol II)
generates RNAs such as the pre-messenger RNA (pre-mRNA) shown. Noncoding RNAs are also transcribed and RBPs are
emerging as key players in noncoding RNA metabolism and function. pre-mRNAs are stabilized by the Cap-binding
complex (CBC), which binds the 5′ cap [5′–5′ triphosphate-linked guanine modified with a 7-methyl group (m7G)], and by
the addition of a poly(A) tail by the poly(A)-polymerase. The poly(A) tail is then recognized by the polyA-binding protein
(PABP). Splicing by the spliceosome and RBP splicing factors leads to mature mRNAs. The exon junction (EJ) complex
binds the splice junctions and remains associated with the mRNAs until translation or degradation occurs. Additional RBPs
bind the mRNA, forming messenger ribonucleoproteins (mRNPs) and contribute to mRNA export into the cytoplasm via the
nuclear pore complex (NPC). Primary microRNA (miRNA) transcripts are processed to form precursor miRNAs that are then
exported to the cytoplasm through Exportin 5 (Exp5) and further processed to obtain the mature miRNAs before their loading
into the RNA-induced silencing complex (RISC). miRNA processing is also modulated by specific RBPs.
Cytoplasmic processing: the nonfunctional mRNPs are those that contain a premature termination codon (PTC) or those
stored in response to stress. Translation elongation will not proceed and the mRNAs will be either degraded by
endonucleases or stored in stress granules. The functional mRNPs recruit productive components of the translation
machinery such as the eukaryotic translation initiation factor complex (eIF4F) and ribosomes that initiate protein synthesis.
The mRNAs targeted by miRNA are not translated and will be degraded or stored in processing bodies (P-bodies). RBPs are
involved in the modulation of all the aforementioned processes.

In this review, we discuss the emerging functions of disordered sequences in RBPs, their
regulation by post-translational modifications, their roles in forming ultrastructures and their
contribution to disease. In particular, we describe the biochemical and cellular properties of
disordered sequences and how they give unique properties and versatility to RBPs. We also
discuss RBP-regulated processes that disordered sequences affect under physiological and
pathological conditions.

Disordered Sequences in RBPs


RBPs have major roles in RNA metabolism through their participation in RNP complexes that
represent the functional units of the RNA processing machineries such as the spliceosome, the
exon junction (EJ) complex, and the ribosome. The spliceosome is a large macromolecular
complex of RNPs that catalyzes the removal of introns or alternative exons from pre-mRNA to
generate mature transcripts [32]. The EJ complex is similarly a large heterogeneous complex
that binds to neighboring exon–exon junctions after RNA splicing, providing nuclear history
[33,34]. The ribosome is composed of rRNA molecules associating with >50 proteins to form an
asymmetric complex necessary for decoding successive codons on mRNAs to generate
polypeptide chains [35].

The activity of RNP complexes can be modulated at several steps; for instance, the specific
composition of the RBPs within RNP complexes can influence the fate of bound RNAs, since
certain RBPs can cooperate with or antagonize the overall function of RNPs [36]. An additional
level of regulation is the modulation of the RNA-binding activity of RBPs by
post-translational modifications, such as protein methylation and phosphorylation [37–40].

It has long been recognized that protein sequences termed ‘domains’ adopt secondary
structure with α-helices and β-sheets. Protein domains confer functionality to proteins; therefore,
proteins have been classically grouped into families that share similar functional domains, such
as in the Pfam database. In the past decade or so, it has begun to be appreciated that certain
proteins or portions of their sequence are simply disordered in nature and it is their lack of
folding that is required for their function [41–43]. Disordered protein sequences are termed
intrinsically disordered regions (IDRs) and proteins that harbor these regions are commonly
named intrinsically disordered proteins (IDPs, Box 1). IDPs typically share some
structural characteristics, such as a low overall hydrophobicity, a large net charge, and low
ordered secondary structure resulting in flexibility, which may become rigid in the presence of a
ligand [44,45]. Subclasses of IDRs include the short linear motifs (SLiMs), 1–10 amino acid
disordered motifs, which are present in structured proteins and act as ligands or partners for
protein domains or serve as consensus sites for enzymes, such as kinases; molecular
recognition features (MoRFs), which are 10–70 amino acid motifs that undergo a
disorder-to-structured change after protein binding; and low complexity (LC) sequences, composed of
up to hundreds of repetitions of one or several amino acids [46,47]. LC sequences are often
found in disordered states, but they can also assume a structured conformation (Box 1).
Recently, IDPs were classified according to function, sequence, protein interactions, and
biophysical properties [48]. Moreover, the phosphorylation of certain disordered proteins such
as 4E-BP2 induces folding [49].
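
The notion of low amino acid complexity can be made concrete with a simple compositional
measure. The following is a minimal sketch, assuming Shannon entropy over a sliding window as
the measure; the function names, the window size, and the entropy threshold are illustrative
assumptions and are not taken from the studies cited here.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (in bits) of the amino acid composition of a segment."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(seq: str, window: int = 20, max_entropy: float = 2.2):
    """Return (start, entropy) for each window with entropy <= max_entropy.

    `window` and `max_entropy` are illustrative defaults (assumptions),
    not calibrated thresholds.
    """
    hits = []
    for start in range(len(seq) - window + 1):
        h = shannon_entropy(seq[start:start + window])
        if h <= max_entropy:
            hits.append((start, h))
    return hits
```

A window built from only one or two residue types (for example, a glycine/serine repeat) scores
close to 0–1 bit, whereas a window containing all 20 amino acids once each scores about 4.3 bits,
so lower scores flag candidate LC stretches.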

It is now known that some disordered proteins may adopt a defined structure in the presence
of a ‘ligand’, an interaction partner or a certain cellular stress, or they may simply function
while disordered [41–43]. An example of ligand-mediated induced folding is the DNA-binding
domain of the yeast protein GCN4. This IDR is natively unstructured, but after binding with
DNA through its leucine zipper motif, a transition to an ordered α-helix conformation is
observed [44,45].

Box 1. Defining Disorder in Proteins

Intrinsically Disordered Regions and Proteins: IDRs and IDPs
IDRs are defined as protein sequences that lack a defined 3D structure under native and physiological conditions [48]. The
inability of IDRs to form defined structures is due to the lack of a core of hydrophobic amino acids that promotes folding.
IDRs are classified according to the different types of interaction that they can promote, although overlap exists between
categories. The proteins containing IDRs are defined as IDPs. IDRs are fully functional in IDPs, even if they are not folded
or folded only partially, and they often assume a fold with a binding partner or ligand. Disordered regions represent sites
for enzymes that deposit post-translational modifications. The post-translational modifications of these disordered
regions also allow recruitment of binding proteins that modulate the activity of the IDPs or their complexes.

Molecular Recognition Features: MoRFs
MoRFs are 10–70 amino acid protein segments that transition from disordered to structured after protein–protein
interaction [100]. The newly formed structure is classified as α-MoRFs, β-MoRFs, and ι-MoRFs (α-helices, β-strands,
and irregular structure). MoRFs composed of mixed structures are termed complex MoRFs and are a subset of IDRs.

Short Linear Motifs: SLiMs
SLiMs have a maximum of 10 amino acids, are located outside protein domains, and are disordered [101]. Classical
motifs are kinase phosphorylation sites, RGG/RG motifs, and so forth. SLiMs are involved in protein–protein interactions
and regulation of protein localization. They are often sites of post-translational modifications and recruitment of binding
factors to form protein complexes. SLiMs are a subset of IDRs, and are classified into two subgroups: linear motifs that
function as ligands for structured protein domains and motifs that function as sites for post-translational modifications.

Low Complexity Sequences: LC Sequences
LC sequences are defined as amino acid sequences of >100 residues [83] that are composed of repeats of one to a few
residues. LC sequences are thus a subset of IDRs.

It is well established that certain disordered sequences have intrinsic RNA-binding activity, such
as the RGG/RG motif, a SLiM. RGG/RG and the related RGG/YGG motifs are highly abundant in
RBPs and represent two of the most frequent RNA-binding sequences [25]. The RGG/RG motifs
of nucleolin and fragile X mental retardation protein (FMRP), for example, have been shown to
associate with G-rich RNA sequences [50,51]. Given that FMRP associates with the
G-quadruplex (G4) secondary structure, this suggests that its RGG/RG motifs might specifically
bind RNA [31]. Indeed, structural analysis of the RGG/RG box in FMRP revealed that this
sequence is essential for base recognition and binding to the G4 RNA structure [52]. RNA
molecules induce a disordered-to-ordered transition in the RGG/RG motif, allowing strong
interactions between the arginine residues and the G4 sequences [52]. In addition, the RGG/RG
motif of Aven has recently been shown to bind the G4 sequences of the mRNAs encoding the mixed
lineage leukemia 1 and 4 (MLL1 and MLL4) to regulate their translation [53]. These findings
suggest that the disorder of the RGG/RG motif may be essential for the recognition of complex
RNA structures such as G4 sequences, because its flexibility allows the binding site to be shaped
to properly fit the RNA.
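
Because RGG/RG motifs are short linear sequences, candidate occurrences can be located with a
plain pattern search. The following is a minimal sketch, assuming a simplified rule in which two
RGG (or RG) repeats count as a motif when separated by at most four residues; this spacing rule,
the pattern names, and the helper function are assumptions for illustration and are not a
definition given in this review.

```python
import re

# Two RGG (or RG) repeats separated by 0-4 arbitrary residues.
# The 0-4 spacing is an illustrative assumption, not a rule from this review.
DI_RGG = re.compile(r"RGG.{0,4}RGG")
DI_RG = re.compile(r"RG.{0,4}RG")


def find_motifs(seq: str, pattern: re.Pattern) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]
```

For example, `find_motifs(sequence, DI_RGG)` lists the start position and text of each candidate
di-RGG stretch in a protein sequence given in one-letter code. A match only indicates that the
sequence pattern is present; whether the stretch binds RNA still has to be tested experimentally.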

Another function of disordered sequences in RBPs is to regulate interaction with the carboxyl
terminal domain (CTD) of RNA polymerase II (RNA Pol II). It has been shown that the LC
sequences of the FET (protein products generated by the fusion of FUS/EWS/TAF15 due to
pathological translocation) RBPs contribute to regulation of RNA transcription. In particular,
FET family members also bind DNA and then, through their LC sequences, aggregate locally
into polymers that are able to recruit RNA Pol II via the
CTD [54]. The RNA Pol II CTD is a disordered region per se, containing several repeats
of the MoRF region, YSPTSPS, that allows binding with multiple interactors such as the
cleavage and polyadenylation factor 11 (PCF11) [55] and the mRNA capping enzyme
CGT1 [56]. Interestingly, interaction of RNA Pol II with both PCF11 and CGT1 requires
the phosphorylation of the CTD. This suggests that an additional level of regulation of RBP
function through IDRs may be achieved through their post-translational modification (Table 1).
Repeats of serine, threonine, and tyrosine found in FUS, for instance, represent sites for
phosphorylation by kinases, whereas RGG/RG motifs correspond to sites for methylation. Both
events were shown to modulate RNA-binding activities. Some examples are the post-translational
regulation of SR proteins by phosphorylation, which modulates alternative splicing [57],
and the arginine methylation of the Sm proteins, which promotes their interaction with the Tudor
domain of SMN [58].

RNA is known to be one of the three macromolecules, along with DNA and proteins, essential for
life and it provides a genetic copy of DNA that is used to make protein. Many noncoding RNAs
have been identified such as tRNA, rRNA, ribozymes, miRNAs, and so forth. Therefore, it is not
surprising that RNA is actively transported into intracellular organelles and granules to fulfill its
functions. This process is extremely dynamic and utilizes protein–protein networks organized
into functional nodes or hubs [59]. Analysis of these hubs reveals that 30% are RBPs
containing disordered motifs [30]. The formation of hubs strongly correlates with the presence
of IDRs in proteins [60], consistent with sequence disorder being a key feature for flexible
regulation of the networking activity of RBPs. One example is the contribution of IDPs to the
activity of the mRNA decapping complex. Both DCP1 and DCP2 hub proteins possess
predicted disordered regions that, through conformational changes after binding, are predicted
to be required for the interaction with the core components of the decapping complex and for
the recruitment of partners [61]. It has also been observed that IDRs are extremely widespread in
ribosomal proteins. A bioinformatics study indeed demonstrated that several ribosomal proteins
exist in a disordered conformation and undergo a disordered-to-structured transition after
binding, allowing the correct assembly of the ribosomal structure through protein–protein
interactions and protein–rRNA interactions [62]. An additional hub of RBPs enriched in IDRs
is the spliceosome. A computational study showed that approximately half of the proteins
abundant in the spliceosome are predicted to possess disordered regions, including the small
nuclear RNPs (snRNPs) [63]. Taken together, the contributions of IDRs to RNA recognition and to
hub formation point to a key role for disordered regions in the regulation of
RNA metabolism.
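
The Glossary defines a hub as belonging to the top 20% of the interacting proteins of an
interactome. The following is a minimal sketch of how such hubs could be picked from a list of
pairwise interactions, assuming that proteins are ranked by their number of distinct partners
and that ties are broken alphabetically; the function name and both of these choices are
assumptions for illustration, not the procedure used in the cited studies.

```python
import math
from collections import defaultdict


def find_hubs(interactions, top_fraction=0.2):
    """Return the set of proteins in the top `top_fraction` by number of distinct partners.

    `interactions` is an iterable of (protein_a, protein_b) pairs.
    Self-interactions are ignored; ties are broken alphabetically (an assumption).
    """
    partners = defaultdict(set)
    for a, b in interactions:
        if a != b:
            partners[a].add(b)
            partners[b].add(a)
    # Rank by descending partner count, then by name for a deterministic order.
    ranked = sorted(partners, key=lambda p: (-len(partners[p]), p))
    n_hubs = max(1, math.ceil(top_fraction * len(ranked))) if ranked else 0
    return set(ranked[:n_hubs])
```

With a toy network of five placeholder proteins in which one protein contacts all four others,
that protein alone is returned as the hub at the 20% cutoff.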

Table 1. Post-Translational Regulation of Disordered Sequences

SLiMs or LC Sequences   Post-Translational Modifications          Potential Impacts                         RBPs
RGG/RG                  Arginine methylation or citrullination    Modulation of RNA-binding activity        FUS, FMRP, EWS, TAF15
YGG                     Tyrosine phosphorylation                  Modulation of RNA-binding activity        FUS, RBM3
                        Glycine myristoylation                    Regulation of subcellular localization
RS                      Serine phosphorylation                    Modulation of RNA-binding activity        SRSF1, SRSF2, SRSF3
PPP                     Proline hydroxylation                     Changes in protein folding                PSF
                        cis-trans isomerization
GY/GSYGS/GYS/SYG/SYS    Tyrosine phosphorylation                  Modulation of RNA-binding activity        FUS, TAF15
                        Serine phosphorylation                    Regulation of subcellular localization
                        Glycine myristoylation
QQQ                     Pyroglutamate formation                   Aggregate formation                       FUS, EWS, TAF15

Roles of IDRs
Emerging evidence in structural biology reveals that the lack of canonical folding represents a
feature that contributes to protein function. IDRs have been linked to several properties, such as
protein–protein, protein–DNA, and protein–RNA interactions [48,64]. The absence of structure
confers a high degree of flexibility, which allows IDRs to interact with multiple proteins and form
networks. IDRs have been shown to be scaffolds for proteins that mediate ubiquitination [47].
Moreover, IDRs have been identified as sites of proteasomal degradation of IDPs and thus
disordered sequences tend to correlate with a short protein half-life [65–67]. However, this is not
always the case, as IDPs also have a high intrinsic propensity to form transient interactions with
multiple partners, forming networks that protect them against degradation [47,68]. This is
observed with the p53 tumor suppressor, which possesses IDRs at both its N and C termini.
p53 is protected from degradation through an interaction between its IDRs and binding partners
such as NQO1, Hdmx, or Mdm2 [69]. The correlation between IDRs and proteasomal degradation
is finely tuned both physiologically and pathologically. Indeed, it has been shown that
when mutations occur in disordered regions, a common event in several diseases, the presence
of IDRs does not lead to increased protein degradation. Furthermore, growing evidence
demonstrates that the expression of IDPs needs to be tightly regulated to avoid the accumulation
of potentially dangerous disordered proteins [70,71].

Impairment of proteasomal activity has been shown for the mutant protein huntingtin [72]. The
formation of intracellular aggregates due to pathological expansion of the LC sequences in
neurodegenerative disorders prevents the activity of the proteasome, leading to an accumulation
of ubiquitin-modified conjugates, which arrests the cell cycle by the sequestration of
proteasomal components [73]. These effects were observed for LC sequences rich in glutamine
[74]. RBPs are known, in general, to be stable proteins; therefore, IDRs are not thought to play a
major role in their protein stability [4]. Thus, one of the key functions of IDRs in RBPs may be the
formation of functional hubs required for their roles in RNA metabolism.

RBPs, IDRs, and Ultrastructures


The presence of IDRs in RBPs has been linked to the formation of ultrastructures or granules,
also termed ‘assemblages’ [75]. RNP granules are dynamic structures involved in RNA metab-
olism through regulating RNA processing, bioavailability, degradation, and transport. Stress
granules represent a well-known example of a structure formed by RNPs (Box 2) [76]. The
formation of RNA granules relies mainly on the presence of RNA molecules; for instance,
untranslated mRNAs promote the assembly of stress granules [77]. The presence of untranslated
mRNAs also triggers the formation of the processing bodies (P-bodies) [78,79], which
represent another classical example of highly dynamic RNP ultrastructure. An additional class
of RNP granules are transport granules, which are composed of large structures of RNAs bound
to RBPs involved in the control of RNA transport and translation. The function of transport
granules relies on the storage and maintenance of RNA molecules to orchestrate the promotion
of local protein synthesis [80]. The assembly of intracellular structures has been linked with a
dynamic phase transition into liquid droplets [81].

Substantial evidence has demonstrated that the presence of disordered sequences promotes
the formation of RNPs with IDRs acting as assembly domains. In vitro precipitation of
RBPs in RNP granules showed that the protein sequences required for RNP granule
cohesion were the LC sequences present in RBPs [82]. Mass spectrometry analysis of
RNP granules revealed an enrichment of RBPs involved in key steps of RNA metabolism,
including alternative splicing and mRNA translation [83]. LC sequences have been shown to
promote molecular aggregation, resulting in the formation of intracellular ultrastructures
termed hydrogels (Box 2). One example is represented by the LC sequences G/S and
YG/S contained in FUS, which are required for hydrogel formation and for FUS localization to
stress granules. FUS also contains other types of disordered sequences, including RGG/RG
motifs, which may contribute to ultrastructure formation. RNA binding is required to trigger
the formation of amyloid-like fibers to recruit the CTD of RNA Pol II [84]. FUS-composed
hydrogels appear to be dynamic structures. The dynamic phase transition is an essential
feature for the formation of the membrane-less cellular RNP structures [85,86], and IDRs within
the components of these granules mediate this structural flexibility and contribute to
membrane curvature [87].

Box 2. RBP Ultrastructures

RNA Granules
Cytoplasmic structures that are dynamic, transient, and contain defined RNPs. For example, RNA granules are
transported along the length of neuronal axons or oligodendrocyte processes.

Stress Granules
RNA granules containing extra proteins and RNAs that are added in conditions of stress. Overexpression of these
stress-dependent proteins often results in stress granule formation. The mRNAs in these structures are bound to a subset of
48S preinitiation factors and specific RBPs.

Processing Bodies: P-bodies
P-bodies are cytoplasmic granules composed of messenger RNPs (mRNPs) destined to be temporarily translationally
repressed or degraded. P-bodies are composed of untranslated mRNAs and multiple proteins involved in storage or
degradation of RNAs.

Transport Granules
Large mRNP granules actively transported from the nucleus to the cytoplasm. Transport granules are abundant in axons
and dendrites to ensure correct local protein synthesis of specific mRNAs.

Hydrogels
Hydrophilic gels are a web of polymeric chains that form colloidal structures. Hydrogels are produced by aggregation of
one or more monomers that trigger the formation of a crosslinked peptide network that retains a portion of the water from
the ultrastructure [102]. According to the properties of the polymers, hydrogels exist in different densities and can retain
differential amounts of water.

Amyloid-like Fibers
Amyloid fibers are elongated intracellular structures formed by abnormal aggregation of soluble proteins. These fibers are
insoluble and resistant to intracellular degradation [103]. The formation of amyloid fibrils has been associated with several
diseases including Alzheimer's disease. Electron microscopy and X-ray diffraction analyses revealed that these fibers are
essentially composed of repeats of continuous β-sheets. In particular, X-ray analysis indicates that amyloid fibrils possess
a structural core formed by several β-sheets connected by hydrogen bonding, thereby forming planes perpendicular to
the length of the fiber. The intracellular ultrastructures that possess similar properties are defined as amyloid-like fibers.

Prion-like Structures
Prions are proteins that can assume alternative structures and are thereby capable of self-replication by promoting the
structural conversion of other molecules of the same protein [104]. This cascade effect prevents the protein from
assuming its normal structure and function. If these self-assembling proteins can be transmitted between different
individuals, they are called prions; if not, they are termed prion-like. Importantly, in silico analysis revealed that hundreds of
human proteins possess a prion-like domain (PrLD), including RBPs such as hnRNPA1/A2, FUS, and TDP-43. Prion-like
structures are often linked with amyloid-like fibers, since the expansion of incorrectly folded self-assembling proteins
eventually generates aggregates that will form fibers.

In Caenorhabditis elegans, it has been shown that the RNA helicase LAF-1, a component of
P-granules, forms droplets in vitro together with RNA molecules. Importantly, the N-terminal
RGG-disordered motif of LAF-1 is required for both droplet localization and RNA binding [88].
Similar observations were made for P-granule proteins MEG-1 and MEG-3, which are predicted
to be highly disordered. Despite MEG proteins lacking any distinguishable RNA-binding
domains, they are required for proper assembly of P-granules in the C. elegans embryo, thereby
affecting RNA storage and metabolism [89]. These observations highlight the fundamental
contribution of IDRs to RNP ultrastructure assembly by ensuring the plasticity of these granules
required for proper cellular RNA metabolism.

RBPs, Protein Disorder, and Disease


The aberrant expression and/or altered function of RBPs in disease is now well established [1–3].
The highly disordered state of the RBPs associated with disease raises the question of whether
IDRs contribute to the pathological behavior of RBPs (Figure 2). The aberrant
aggregation and the consequent loss- or gain-of-function effects of IDRs are likely a common
feature in multiple pathogenic conditions, underlining the importance of maintaining a certain
amount of disorder and of regulating the disorder-to-order transition.

[Figure 2: network diagram. Disease nodes: Cancer, Sarcomas, Neurological disorders, Muscular
atrophies, Schizophrenia, Myelin disorders, Ataxias, Autoimmunity, SMA, POMA, PD, OPMD, ALS, FXS,
FXTAS, DM. RBP nodes: Tra2β, YB1, WT1, QKI, U2AF1, eIF4E, SRSF1-6, ATXN1, PTBP1, TIA-1, Sam68,
hnRNPK, ATXN2, SmD3, NOVA-1,2, HuR, SmD1, Elav/HuB,C,D, hnRNPA1, hnRNPA2, SMN, eIF4G, TDP43, EWS,
TLS/FUS, PABPN1, C9orf72, TAF15, Staufen1, CNBP, MBNL, FMRP1, MBNL1, CUGBP1. Edge label:
Translocation. Key: star = IDRs.]

Figure 2. A Network of RNA-Binding Proteins (RBPs) and Intrinsically Disordered Regions (IDRs) in Human
Diseases. The aberrant expression or functions of RBPs (green) have been identified in major human diseases (cancer,
neurological disorders, muscular atrophies; orange) or specific disease types (blue). Unbroken lines represent RBPs with a
known role in the disease and the broken lines represent indirect or predicted roles. The star denotes RBPs with IDRs as
predicted using the D2P2 prediction tool [105]. Abbreviations: ALS, amyotrophic lateral sclerosis; DM, dystrophia
myotonica (myotonic dystrophy); FXS, fragile X syndrome; FXTAS, fragile X-associated tremor/ataxia syndrome; OPMD,
oculopharyngeal muscular dystrophy; PD, Parkinson's disease; POMA, paraneoplastic opsoclonus myoclonus ataxia;
SMA, spinal muscular atrophy.

Indeed, disease-linked mutations in proteins frequently occur in IDRs [90,91]. Arginine is
one of the most frequently mutated amino acids in disordered regions and, importantly, it
appears to be one of the key drivers of disorder-to-order transitions [90]. One example is
the R244C mutation in the RGG/RG motif of FUS, which is linked with
amyotrophic lateral sclerosis (ALS) [92]. The R244C mutation affects FUS intracellular localization,
promoting translocation to the cytoplasm, where FUS aggregates are formed and where
modification by arginine methylation likely contributes to the pathology [93]. Additional arginine
mutations occurring within the RGG/RG motif of FUS are linked with ALS [3], even
though the effects of these mutations are still unknown. Another consequence of mutations in
IDRs could be the loss of the flexibility required for the dynamic phase transition and thereby the
accumulation of stable, ordered cellular ultrastructures leading to disease [94,95].

Several examples link mutations in RBPs with the formation of pathogenic intracellular
aggregates. TDP-43 mutations have been observed [96] that result in its cellular aggregation
and misfolding [97]. Characterizations of ALS-linked mutations of hnRNPA1 and hnRNPA2/B1
proteins have also emerged. Prion-like domains of hnRNPs promote their assembly into
ultrastructures, a feature that is aggravated after mutations, leading to a higher presence in
stress granules [98]. Mutation in the IDRs of PABPN1 causes nuclear accumulation in fibrils that
increases cell death in oculopharyngeal muscular dystrophy [99]. Additional mutations identified
in the disordered regions of RBPs include SMN1 and Wilms’ tumor 1 (WT1), linked to spinal
muscular atrophy and renal cancer, respectively [3] (Figure 2). Thus, a common theme is the
intracellular accumulation of RBPs containing IDRs in several diseases.

Trends in Biochemical Sciences, November 2015, Vol. 40, No. 11 669

Concluding Remarks
RBPs have received a wealth of attention in the past few years in the context of their
interactions with miRNAs, siRNAs, lncRNAs, circRNAs, and now guide RNAs for use with CRISPR
(clustered regularly interspaced short palindromic repeats)/Cas9 technology. It is not
surprising that they are linked extensively with diseases (Figure 2) [1–3,23]. The interesting
observation that RBPs are among the most enriched for IDRs [30] suggests that IDRs bestow new
properties on RBPs other than recognizing RNA (see Outstanding Questions). RBPs likely require
IDRs to form RNPs, producing major macromolecules in the cell such as the spliceosome and
ribosome. RBPs have a propensity to favor network interactions to engage the formation of
‘higher’ ultrastructures. The function of RNA-induced ultrastructures remains undefined, but
they likely store RNA components awaiting stress or growth signals. An additional role of
RNA-induced ultrastructures may be to shut down or pause certain processes such as protein
synthesis by ‘trapping’ the macromolecules, thus reducing their mobility through inclusion in
these ultrastructures. Future research requires the identification of their components and
better imaging to define the boundary of phase transition of these ultrastructures.

Outstanding Questions
Why is some amino acid disorder necessary for RBPs? Is it so that RBPs can associate more
widely with proteins and RNAs?

Is the disorder essential for the formation of RBP networks?

What is the role of the SLiMs, especially RGG/RG motifs, in amyloid-like fiber assembly of
RBPs? Are these SLiMs regulated by post-translational modification?

Why are disordered sequences predisposed to form ultrastructures? What is the function of
these ultrastructures?

Is there a way to impede the pathogenic ultrastructure composed of RBPs and their interactors?
How can we restore the flexibility of RNA disease-related aggregates?

References
1. Lukong, K.E. et al. (2008) RNA-binding proteins in human genetic disease. Trends Genet. 24, 416–425
2. Cooper, T.A. et al. (2009) RNA and disease. Cell 136, 777–793
3. Castello, A. et al. (2013) RNA-binding proteins in Mendelian disease. Trends Genet. 29, 318–327
4. Mittal, N. et al. (2009) Dissecting the expression dynamics of RNA-binding proteins in posttranscriptional regulatory networks. Proc. Natl. Acad. Sci. U.S.A. 106, 20300–20305
5. Braunschweig, U. et al. (2013) Dynamic integration of splicing within gene regulatory pathways. Cell 152, 1252–1269
6. Shi, Y. and Manley, J.L. (2015) The end of the message: multiple protein–RNA interactions define the mRNA polyadenylation site. Genes Dev. 29, 889–897
7. Ha, M. and Kim, V.N. (2014) Regulation of microRNA biogenesis. Nat. Rev. Mol. Cell Biol. 15, 509–524
8. Rinn, J.L. (2014) lncRNAs: linking RNA to chromatin. Cold Spring Harb. Perspect. Biol. 6, a018614
9. Lasda, E. and Parker, R. (2014) Circular RNAs: diversity of form and function. RNA 20, 1829–1842
10. St Laurent, G. et al. (2015) The landscape of long noncoding RNA classification. Trends Genet. 31, 239–251
11. Pelletier, J. and Sonenberg, N. (1985) Photochemical cross-linking of cap binding proteins to eucaryotic mRNAs: effect of mRNA 5′ secondary structure. Mol. Cell. Biol. 5, 3222–3230
12. Krainer, A.R. and Maniatis, T. (1985) Multiple factors including the small nuclear ribonucleoproteins U1 and U2 are necessary for pre-mRNA splicing in vitro. Cell 42, 725–736
13. Grabowski, P.J. and Sharp, P.A. (1986) Affinity chromatography of splicing complexes: U2, U5, and U4 + U6 small nuclear ribonucleoprotein particles in the spliceosome. Science 233, 1294–1299
14. Choi, Y.D. et al. (1986) Heterogeneous nuclear ribonucleoproteins: role in RNA splicing. Science 231, 1534–1539
15. Query, C.C. et al. (1989) A common RNA recognition motif identified within a defined U1 RNA binding domain of the 70K U1 snRNP protein. Cell 57, 89–101
16. Ramakrishnan, V. and White, S.W. (1992) The structure of ribosomal protein S5 reveals sites of interaction with 16S rRNA. Nature 358, 768–771
17. Musco, G. et al. (1996) Three-dimensional structure and stability of the KH domain: molecular insights into the fragile X syndrome. Cell 85, 237–245
18. Shamoo, Y. et al. (1997) Crystal structure of the two RNA binding domains of human hnRNP A1 at 1.75 Å resolution. Nat. Struct. Biol. 4, 215–222
19. Anantharaman, V. et al. (2002) Comparative genomics and evolution of proteins involved in RNA metabolism. Nucleic Acids Res. 30, 1427–1464
20. Hafner, M. et al. (2010) PAR-CliP – a method to identify transcriptome-wide the binding sites of RNA binding proteins. J. Vis. Exp. 41, 2034
21. Huppertz, I. et al. (2014) iCLIP: protein–RNA interactions at nucleotide resolution. Methods 65, 274–287
22. Moore, M.J. et al. (2014) Mapping Argonaute and conventional RNA-binding protein interactions with RNA at single-nucleotide resolution using HITS-CLIP and CIMS analysis. Nat. Protoc. 9, 263–293
23. Nussbacher, J.K. et al. (2015) RNA-binding proteins in neurodegeneration: Seq and you shall receive. Trends Neurosci. 38, 226–236
24. Ray, D. et al. (2013) A compendium of RNA-binding motifs for decoding gene regulation. Nature 499, 172–177
25. Castello, A. et al. (2012) Insights into RNA biology from an atlas of mammalian mRNA-binding proteins. Cell 149, 1393–1406
26. Baltz, A.G. et al. (2012) The mRNA-bound proteome and its global occupancy profile on protein-coding transcripts. Mol. Cell 46, 674–690
27. Kwon, S.C. et al. (2013) The RNA-binding protein repertoire of embryonic stem cells. Nat. Struct. Mol. Biol. 20, 1122–1130
28. Mitchell, S.F. et al. (2013) Global analysis of yeast mRNPs. Nat. Struct. Mol. Biol. 20, 127–133
29. Gerstberger, S. et al. (2014) A census of human RNA-binding proteins. Nat. Rev. Genet. 15, 829–845
30. Neelamraju, Y. et al. (2015) The human RBPome: from genes and proteins to human disease. J. Proteomics Published online May 14, 2015. http://dx.doi.org/10.1016/j.jprot.2015.04.031
31. Thandapani, P. et al. (2013) Defining the RGG/RG motif. Mol. Cell 50, 613–623



32. Wahl, M.C. and Lührmann, R. (2015) SnapShot: spliceosome dynamics I. Cell 161, 1474
33. Matsumoto, K. et al. (1998) Nuclear history of a pre-mRNA determines the translational activity of cytoplasmic mRNA. EMBO J. 17, 2107–2121
34. Bono, F. and Gehring, N.H. (2011) Assembly, disassembly and recycling: the dynamics of exon junction complexes. RNA Biol. 8, 24–29
35. Rodnina, M.V. and Wintermeyer, W. (2010) The ribosome goes Nobel. Trends Biochem. Sci. 35, 1–5
36. Lunde, B.M. et al. (2007) RNA-binding proteins: modular design for efficient function. Nat. Rev. Mol. Cell Biol. 8, 479–490
37. Cáceres, J.F. et al. (1998) A specific subset of SR proteins shuttles continuously between the nucleus and the cytoplasm. Genes Dev. 12, 55–66
38. Xiao, S.H. and Manley, J.L. (1997) Phosphorylation of the ASF/SF2 RS domain affects both protein–protein and protein–RNA interactions and is necessary for splicing. Genes Dev. 11, 334–344
39. Bedford, M.T. et al. (2000) Arginine methylation inhibits the binding of proline-rich ligands to Src homology 3, but not WW, domains. J. Biol. Chem. 275, 16030–16036
40. Cote, J. and Richard, S. (2005) Tudor domains bind symmetrical dimethylated arginines. J. Biol. Chem. 280, 28476–28483
41. Tompa, P. (2012) Intrinsically disordered proteins: a 10-year recap. Trends Biochem. Sci. 37, 509–516
42. Berlow, R.B. et al. (2015) Functional advantages of dynamic protein disorder. FEBS Lett. Published online June 11, 2015. http://dx.doi.org/10.1016/j.febslet.2015.06.003
43. Uversky, V.N. (2015) The multifaceted roles of intrinsic disorder in protein complexes. FEBS Lett. Published online June 11, 2015. http://dx.doi.org/10.1016/j.febslet.2015.06.004
44. Wright, P.E. and Dyson, H.J. (1999) Intrinsically unstructured proteins: re-assessing the protein structure–function paradigm. J. Mol. Biol. 293, 321–331
45. Uversky, V.N. (2002) What does it mean to be natively unfolded? Eur. J. Biochem. 269, 2–12
46. Romero, P. et al. (2001) Sequence complexity of disordered protein. Proteins 42, 38–48
47. Cumberworth, A. et al. (2013) Promiscuity as a functional trait: intrinsically disordered regions as central players of interactomes. Biochem. J. 454, 361–369
48. van der Lee, R. et al. (2014) Classification of intrinsically disordered regions and proteins. Chem. Rev. 114, 6589–6631
49. Bah, A. et al. (2015) Folding of an intrinsically disordered protein by phosphorylation as a regulatory switch. Nature 519, 106–109
50. Darnell, J.C. et al. (2001) Fragile X mental retardation protein targets G quartet mRNAs important for neuronal function. Cell 107, 489–499
51. Hanakahi, L.A. et al. (1999) High affinity interactions of nucleolin with G-G-paired rDNA. J. Biol. Chem. 274, 15908–15912
52. Phan, A.T. et al. (2011) Structure–function studies of FMRP RGG peptide recognition of an RNA duplex-quadruplex junction. Nat. Struct. Mol. Biol. 18, 796–804
53. Thandapani, P. et al. (2015) Aven recognition of RNA G-quadruplexes regulates translation of the mixed lineage leukemia protooncogenes. Elife 4, 06234
54. Kwon, I. et al. (2013) Phosphorylation-regulated binding of RNA polymerase II to fibrous polymers of low-complexity domains. Cell 155, 1049–1060
55. Meinhart, A. and Cramer, P. (2004) Recognition of RNA polymerase II carboxy-terminal domain by 3′-RNA-processing factors. Nature 430, 223–226
56. Fabrega, C. et al. (2003) Structure of an mRNA capping enzyme bound to the phosphorylated carboxy-terminal domain of RNA polymerase II. Mol. Cell 11, 1549–1561
57. Zhou, Z. and Fu, X.D. (2013) Regulation of splicing by SR proteins and SR protein-specific kinases. Chromosoma 122, 191–207
58. Friesen, W.J. et al. (2001) SMN, the product of the spinal muscular atrophy gene, binds preferentially to dimethylarginine-containing protein targets. Mol. Cell 7, 1111–1117
59. Petrey, D. and Honig, B. (2014) Structural bioinformatics of the interactome. Annu. Rev. Biophys. 43, 193–210
60. Haynes, C. et al. (2006) Intrinsic disorder is a common feature of hub proteins from four eukaryotic interactomes. PLoS Comput. Biol. 2, e100
61. Jonas, S. and Izaurralde, E. (2013) The role of disordered protein regions in the assembly of decapping complexes and RNP granules. Genes Dev. 27, 2628–2641
62. Peng, Z. et al. (2014) A creature with a hundred waggly tails: intrinsically disordered proteins in the ribosome. Cell. Mol. Life Sci. 71, 1477–1504
63. Korneta, I. and Bujnicki, J.M. (2012) Intrinsic disorder in the human spliceosomal proteome. PLoS Comput. Biol. 8, e1002641
64. Wright, P.E. and Dyson, H.J. (2015) Intrinsically disordered proteins in cellular signalling and regulation. Nat. Rev. Mol. Cell Biol. 16, 18–29
65. Ngoc, L.V. et al. (2014) Rapid proteasomal degradation of post-transcriptional regulators of the TIS11/tristetraprolin family is induced by an intrinsically unstructured region independently of ubiquitination. Mol. Cell. Biol. 34, 4315–4328
66. van der Lee, R. et al. (2014) Intrinsically disordered segments affect protein half-life in the cell and during evolution. Cell Rep. 8, 1832–1844
67. Fishbain, S. et al. (2015) Sequence composition of disordered regions fine-tunes protein half-life. Nat. Struct. Mol. Biol. 22, 214–221
68. Suskiewicz, M.J. et al. (2011) Context-dependent resistance to proteolysis of intrinsically disordered proteins. Protein Sci. 20, 1285–1297
69. Tsvetkov, P. et al. (2009) The nanny model for IDPs. Nat. Chem. Biol. 5, 778–781
70. Gsponer, J. et al. (2008) Tight regulation of unstructured proteins: from transcript synthesis to protein degradation. Science 322, 1365–1368
71. Gsponer, J. and Babu, M.M. (2012) Cellular strategies for regulating functional and nonfunctional protein aggregation. Cell Rep. 2, 1425–1437
72. Hipp, M.S. et al. (2012) Indirect inhibition of 26S proteasome activity in a cellular model of Huntington's disease. J. Cell Biol. 196, 573–587
73. Bence, N.F. et al. (2001) Impairment of the ubiquitin–proteasome system by protein aggregation. Science 292, 1552–1555
74. Holmberg, C.I. et al. (2004) Inefficient degradation of truncated polyglutamine proteins by the proteasome. EMBO J. 23, 4307–4318
75. Toretsky, J.A. and Wright, P.E. (2014) Assemblages: functional units formed by cellular phase separation. J. Cell Biol. 206, 579–588
76. Kedersha, N. et al. (2013) Stress granules and cell signaling: more than just a passing phase? Trends Biochem. Sci. 38, 494–506
77. Mazroui, R. et al. (2006) Inhibition of ribosome recruitment induces stress granule formation independently of eukaryotic initiation factor 2α phosphorylation. Mol. Biol. Cell 17, 4212–4219
78. Eulalio, A. et al. (2007) P-body formation is a consequence, not the cause, of RNA-mediated gene silencing. Mol. Cell. Biol. 27, 3970–3981
79. Jain, S. and Parker, R. (2013) The discovery and analysis of P bodies. Adv. Exp. Med. Biol. 768, 23–43
80. Jung, H. et al. (2012) Axonal mRNA localization and local protein synthesis in nervous system assembly, maintenance and repair. Nat. Rev. Neurosci. 13, 308–324
81. Li, P. et al. (2012) Phase transitions in the assembly of multivalent signalling proteins. Nature 483, 336–340
82. Han, T.W. et al. (2012) Cell-free formation of RNA granules: bound RNAs identify features and components of cellular assemblies. Cell 149, 768–779
83. Kato, M. et al. (2012) Cell-free formation of RNA granules: low complexity sequence domains form dynamic fibers within hydrogels. Cell 149, 753–767



84. Schwartz, J.C. et al. (2013) RNA seeds higher-order assembly of FUS protein. Cell Rep. 5, 918–925
85. Brangwynne, C.P. et al. (2009) Germline P granules are liquid droplets that localize by controlled dissolution/condensation. Science 324, 1729–1732
86. Weber, S.C. and Brangwynne, C.P. (2012) Getting RNA and protein in phase. Cell 149, 1188–1191
87. Busch, D.J. et al. (2015) Intrinsically disordered proteins drive membrane curvature. Nat. Commun. 24, 7875
88. Elbaum-Garfinkle, S. et al. (2015) The disordered P granule protein LAF-1 drives phase separation into droplets with tunable viscosity and dynamics. Proc. Natl. Acad. Sci. U.S.A. 112, 7189–7194
89. Wang, J.T. et al. (2014) Regulation of RNA granule dynamics by phosphorylation of serine-rich, intrinsically disordered proteins in C. elegans. Elife 3, e04591
90. Vacic, V. et al. (2012) Disease-associated mutations disrupt functionally important regions of intrinsic protein disorder. PLoS Comput. Biol. 8, e1002709
91. Uversky, V.N. et al. (2008) Intrinsically disordered proteins in human diseases: introducing the D2 concept. Annu. Rev. Biophys. 37, 215–246
92. Vance, C. et al. (2009) Mutations in FUS, an RNA processing protein, cause familial amyotrophic lateral sclerosis type 6. Science 323, 1208–1211
93. Tradewell, M.L. et al. (2012) Arginine methylation by PRMT1 regulates nuclear-cytoplasmic localization and toxicity of FUS/TLS harbouring ALS-linked mutations. Hum. Mol. Genet. 21, 136–149
94. Knowles, T.P. et al. (2014) The amyloid state and its association with protein misfolding diseases. Nat. Rev. Mol. Cell Biol. 15, 384–396
95. Droppelmann, C.A. et al. (2014) RNA metabolism in ALS: when normal processes become pathological. Amyotroph. Lateral Scler. Frontotemporal Degener. 15, 321–336
96. Sreedharan, J. et al. (2008) TDP-43 mutations in familial and sporadic amyotrophic lateral sclerosis. Science 319, 1668–1672
97. Neumann, M. et al. (2006) Ubiquitinated TDP-43 in frontotemporal lobar degeneration and amyotrophic lateral sclerosis. Science 314, 130–133
98. Kim, H.J. et al. (2013) Mutations in prion-like domains in hnRNPA2B1 and hnRNPA1 cause multisystem proteinopathy and ALS. Nature 495, 467–473
99. Fan, X. et al. (2001) Oligomerization of polyalanine expanded PABPN1 facilitates nuclear protein aggregation that is associated with cell death. Hum. Mol. Genet. 10, 2341–2351
100. Vacic, V. et al. (2007) Characterization of molecular recognition features, MoRFs, and their binding partners. J. Proteome Res. 6, 2351–2366
101. Weatheritt, R.J. et al. (2012) The identification of short linear motif-mediated interfaces within the human interactome. Bioinformatics 28, 976–982
102. Ahmed, E.M. (2015) Hydrogel: preparation, characterization, and applications: a review. J. Adv. Res. 6, 105–121
103. Sipe, J.D. et al. (2010) Amyloid fibril protein nomenclature: 2010 recommendations from the nomenclature committee of the International Society of Amyloidosis. Amyloid 17, 101–104
104. King, O.D. et al. (2012) The tip of the iceberg: RNA-binding proteins with prion-like domains in neurodegenerative disease. Brain Res. 1462, 61–80
105. Oates, M.E. et al. (2013) D2P2: database of disordered protein predictions. Nucleic Acids Res. 41, D508–D516
