
Eur. J. Biochem. 241, 425-431 (1996)
© FEBS 1996

Characterisation of the nucleic-acid-binding activity of KH domains


Different properties of different domains
Kurt DEJGAARD and Henrik LEFFERS
Institute of Medical Biochemistry and Danish Centre for Human Genome Research, Aarhus University, Denmark
Department of Growth and Reproduction, Rigshospitalet, Copenhagen, Denmark

(Received 3 June 1996) - EJB 96 0800/2

The KH module is a sequence motif recently identified in a number of diversified RNA-binding proteins and suggested to be the functional element responsible for RNA binding. So far, however, this hypothesis has not received direct experimental support. We have expressed the three KH domains from heterogeneous nuclear ribonucleoprotein K (hnRNP-K) and from the poly(C)-binding proteins PCBP-1 and PCBP-2, the first three to four domains from the high-density binding protein HBP, the one and a half domains from the archaeon Halobacterium halobium ORF139 and one and a half domains of the fragile-X protein FMR1 in Escherichia coli and analysed their nucleic-acid-binding properties in vitro. The results showed that the in vitro poly(rC)-binding activity of hnRNP-K can be assigned to KH-domain 3, whereas both domains 1 and 3 in the PCBPs bind poly(rC). In addition, all these domains exhibit binding activity towards other nucleic acids, albeit at a significantly lower level. The first KH domain from the FMR1 protein binds poly(rG) and single-stranded and double-stranded DNA. The N-terminal three or four domains from HBP bind poly(rG) and, at a much lower level, single-stranded and double-stranded DNA. Thus, single KH domains are discrete and independent nucleic-acid-binding units. Moreover, different KH domains bind different nucleic acids, suggesting that KH domains are composed of a conserved, weakly nucleic-acid-binding structure that is fine-tuned by sequence variation, resulting in sequence-specific nucleic-acid-binding entities.

Keywords: KH domain; nucleic-acid-binding protein.

Correspondence to H. Leffers, Department of Growth and Reproduction, Section GR-5064, Rigshospitalet, Blegdamsvej 9, DK-2100 Copenhagen, Denmark
Fax: +45 3545 6054.
Abbreviations. hnRNP, heterogeneous nuclear ribonucleoprotein; KH domain, a peptide motif identified originally in hnRNP-K; PCBP, poly(C)-binding protein; CS-RBD, consensus-sequence RNA-binding domain; RRM, RNA recognition motif; 2D, two-dimensional.

A variety of RNA binding proteins has been identified and many of these are unique, containing no recognisable sequence motifs that are shared with other RNA binding proteins. This group includes many ribosomal proteins and tRNA synthetases. Others, however, belong to protein families and this led to the identification of specific amino acid sequence motifs that mediate protein-RNA interactions (reviewed by Mattaj, 1993; Lamm and Lamond, 1993; Dreyfuss et al., 1993). The first conserved RNA binding sequence identified was a 93-amino-acid internal repeated motif in a proteolytic fragment of hnRNP A1, called UP1 (Merrill et al., 1986, 1988). Further studies (Dreyfuss et al., 1988; Query et al., 1989; Scherly et al., 1989) led to the identification of similar domains in other RNA-binding proteins and the motif was given the name consensus-sequence RNA-binding domain (CS-RBD) (Dreyfuss et al., 1988, 1993) or RNA recognition motif (RRM) (Query et al., 1989). Three-dimensional structures of members of this family are now available (Nagai et al., 1990; Hoffman et al., 1991; Gorlach et al., 1992) and they provide some suggestions as to how RRM motifs recognise RNA (Kenan et al., 1991; Gorlach et al., 1992). Another RNA-binding motif, called an RGG box, was identified in several hnRNP proteins (Kiledjian and Dreyfuss, 1992; Dreyfuss et al., 1993) and this motif was also found in RRM-motif-containing proteins, including nucleolin (Srivastava et al., 1989). This motif consists of one or more RGG boxes, which are repetitions of an Arg-Gly-Gly sequence interspersed with other, often aromatic, amino acids, and it has been shown to interact directly with RNA (Kiledjian and Dreyfuss, 1992; Ghisolfi et al., 1992). A few other RNA-binding motifs have been identified; these include zinc fingers and related motifs (Theunissen et al., 1992) and Arg-rich sequences that form short basic α-helices as found in retroviral RNA binding proteins such as Rev, Tat and Tar (Tan et al., 1993, and references therein).

Recently, a novel motif was detected in heterogeneous nuclear ribonucleoprotein K (hnRNP-K) (Matunis et al., 1992; Dejgaard et al., 1994) and one or more were soon discovered in a number of other putative nucleic acid binding proteins from eukaryotes, eubacteria and archaea (Gibson et al., 1993; Siomi et al., 1993a; Dejgaard et al., 1994). These include the fragile-X protein (FMR1) (Verkerk et al., 1991; Siomi et al., 1993b), the Ri auto antigen (nova) (Buckanovich et al., 1993), a putative transcription activator FBP (Duncan et al., 1994), the poly(rC)-binding proteins (PCBP) 1 and 2 (Leffers et al., 1995) and HBP (McKnight et al., 1992) from human, the yeast mer1 (Engebrecht and Roeder, 1990) and HX proteins (Delahodde et al., 1986), ribosomal proteins from different species (Kao et al., 1990; Klenk et al., 1991) and the bacterial nusA (see Musco et al., 1996, for latest review). A 45-55-amino-acid motif was named KH for K-homologous, with reference to the initial identification of the motif in hnRNP-K (Siomi et al., 1993a; Gibson et al., 1993). Based on our observations of hnRNP-K and the repeated sequences found in HBP (McKnight et al., 1992), we
later extended this to cover 65-70 amino acids (Dejgaard et al., 1994). NMR studies of KH-domain 5 from HBP recently revealed that this extension is required for structural stability of the domain (Castiglione-Morelli et al., 1995).

The organisation of the KH domains varies. The three domains in the known nucleic-acid-binding proteins hnRNP-K and PCBP-1 and -2 are arranged with two repeats closely spaced in the N-terminal and one in the C-terminal region. Repeats 1 and 2 are separated from repeat 3 by a linking segment that, in hnRNP-K, contains several RGG boxes (Siomi et al., 1993a; Matunis et al., 1992; Dejgaard et al., 1994). The same arrangement was found in a mouse nucleic-acid-binding protein termed hnRNP-X (Hahm et al., 1993) and in nova (Buckanovich et al., 1993). Other proteins harbouring KH domains exhibit a diversified architecture, e.g. in the known nucleic-acid-binding protein FMR1 (Verkerk et al., 1991; Siomi et al., 1993b) there are two closely spaced domains where the second has a 45-amino-acid insert in the middle of the domain (Musco et al., 1996). In the c-myc promoter binding protein FBP, four domains have been identified (Duncan et al., 1994) whereas the yeast mer1 protein only contains a single KH domain (Engebrecht and Roeder, 1990). The protein with the highest number of repeats is the HBP protein (McKnight et al., 1992), also known as Vigilin (Schmidt et al., 1992), which consists of 14 or 15 KH domains with little or no space in between. Among the prokaryotic proteins there is one domain in the ribosomal S3 proteins (Zurawski and Zurawski, 1985; Kao et al., 1990) and one and a half in the hypothetical archaeal 130-139 proteins (Puehler et al., 1989; Leffers et al., 1989).

Indications that KH domains are nucleic acid binding came from analysis of mouse hnRNP-X (Hahm et al., 1993) and the human poly(rC) binding PCBP-1 and 2 (Leffers et al., 1995) that bound nucleic acids but contained no other known RNA-binding motifs; also from a study of the rat hnRNP-K homolog where a C-terminal fragment that only included the last KH domain was shown to bind poly(rC) (Ito et al., 1994). Independent of this, mutagenesis studies of the KH domains of hnRNP-K and FMR1 revealed that the highly conserved residues in the early part of the domains are crucial for the maintenance of the wild-type RNA binding profile of the proteins (Siomi et al., 1994).

Here we present a systematic analysis of the nucleic acid binding activity of each KH domain from human hnRNP-K, PCBP-1 and 2, one and a half domains from FMR1, the first three to four domains from HBP and the one and a half domains from the archaeon Halobacterium halobium ORF139 protein.

MATERIALS AND METHODS

Preparation of recombinant proteins. The complete hnRNP-K protein was prepared using oligonucleotides, whereas all other hnRNP-K PCR constructs that included KH-domain 1 were digested with BglII (position 312; Dejgaard et al., 1994) to remove the acidic N-terminal 35 amino acids. Most other constructs were prepared by PCR using oligonucleotides that were specific for the different domains (Fig. 1) and the Pfu DNA polymerase (Stratagene). hnRNP-K, PCBP-1 and PCBP-2 constructs were made by PCR from plasmids containing the cloned cDNAs, except for KH-domain 1 from hnRNP-K that was prepared using restriction enzymes BglII (position 312) and StyI (position 529). FMR1 and HBP constructs were made by PCR from total cDNA, prepared as described by Gubler (1988). The cDNA was prepared from either primary term trophoblasts or transformed human amnion cells. H. halobium ORF139 was made by PCR from a λ clone containing the genomic RNA polymerase subunit operon (Leffers et al., 1989). The full-length FMR1 cDNA was cloned from an amnion cDNA library. The identity and the sequence of each construct were verified by sequencing.

Protein        Oligonucleotide                  Position
hnRNP-K D      GMTCTCGAGACTGAACAGCCAGAAG        231
               CTCAGATCTGATGCTGTGGAATGC         605
               TCTAGATCTCCCATCAAAGGACGTG        873
               TTGGMTTCAACCATAATCATAGGTTTC      903R
               MTGMTTCTTAACCAAGATCACCATATG      134%
               TGGAGATCTGGTGGACCTATTATTAC       1375
               ATTAGATCTCTACACAAGTAAC           1387
               CTTGMTTCTCACTGCAGCAAATAC         1553R
               CTTGMTTCTTCACTAGTCTTAG           1658R
PCBP-1 & 2     GCAGMTTCACCTCAGGGTGACCGG         319R/311R
               AGCAGATCTCCGGTCACCCTGAGG         333/325
               TTTAGATCTGGTTTGGATGCATCT         861/874
               GCAWTTCTAGCTGCTCCCCATGC          1086R/1104R
               CMGMTTCCCTCAGTTTCCATGGT          1086R/1104R
PCBP-1         GCCAGATCTGCCGGTGTGACTGAAAG       53
               TGCGGATCCATGCTGGAGACGCTCTC       542
               TGGGMTTCACGGAATGGTCATGACT        560R
               GGTGMTTCTTATGCATCCAAACTTGCCC     842R
PCBP-2         CTCGGATCCGACACCGGTGTGATTG        41
               CCTGMTTCAQGACTGGGAGAGAGT         527R
HBP            GTGGATCCACCATGAGTTCCGTTGC        168
               GGAGMTTCTTAGCTATTGGCCTTGGCA      1228R
FMR1           GMGGATCCAGTAAGCAGCTGGAGAG        827
               ACAGMTTCACCATACCCCTCTGGAC        1306R
H. hal. 139    GTCGGATCCATGACCATCCGGCTCTC       9167
               AGCGAATTCGACTACGTCAGCTGGAT       9556R

Fig. 1. Oligonucleotides used for preparing the constructs. Positions correspond to the positions in the cDNAs (hnRNP-K D, Dejgaard et al., 1994; PCBP-1 and 2, Leffers et al., 1995; HBP, McKnight et al., 1992; FMR1, Verkerk et al., 1991) or gene (H. halobium ORF 139, Leffers et al., 1989). For primers that matched both PCBP-1 and PCBP-2, the positions are shown as PCBP-1/PCBP-2. An R denotes that the oligonucleotide is reversed as compared to the sense direction. Nucleotides shown in bold were added to facilitate the cloning into pGEX expression vectors.

All the DNA fragments were cloned into expression vectors pGEX-2T, -2TK or -3X (Pharmacia) and expressed and purified as described by Pharmacia. The amount of intact protein in each preparation was estimated from Coomassie-blue-stained SDS gels (Laemmli, 1970) before approximately similar amounts of the GST-fusion proteins were dotted onto nitrocellulose filters that had been wetted in binding buffer (50 mM NaCl; 20 mM Tris/HCl pH 7.5, 1 mM EDTA).

Binding assays were performed on filters with native proteins and those denatured in situ by SDS or guanidine hydrochloride. For SDS denaturation, the nitrocellulose filters were incubated in water containing 0.1 % SDS at room temperature; the SDS was removed by washing the filters for five 5-min periods in binding buffer. For guanidine hydrochloride denaturation, the nitrocellulose filters were incubated in renaturation buffer (25 mM Hepes pH 7.0, 3 mM MgCl2, 40 mM KCl, 1 mM dithiothreitol) supplemented with 6 M guanidine hydrochloride for 30 min at 4°C, after which the buffer was diluted with an equal volume of renaturation buffer and incubated for 5 min; this was repeated four times at 4°C, followed by two washes in binding buffer at room temperature.

Nucleic acid binding. The blots were probed with RNA homopolymers [poly(rA), poly(rG), poly(rC) and poly(rU); Pharmacia] and with single-stranded (ss) or double-stranded (ds) M13V8 DNA, which corresponds to M13mp18 containing an insert consisting of the 800-bp G+C-rich V8 domain from human 28S ribosomal RNA (Gonzalez et al., 1985; Leffers and Andersen, 1993). RNA homopolymers were 5'-end-labelled with [γ-32P]ATP (ICN) using T4 polynucleotide kinase (Amersham)
(Sambrook et al., 1988). M13V8 ssDNA was sonicated for 1, 3 or 5 min, the three samples were mixed and aliquots were 5'-end-labelled as for RNA homopolymers. dsDNA was digested with Sau3AI and HinfI and labelled with [α-32P]dATP (Amersham) by filling in the 5'-overhangs with the Klenow enzyme (Amersham).

Nucleic acid binding assays were performed essentially as described for northwestern blots by Matunis et al. (1992) and Dejgaard and Celis (1994). In brief, dot-blots with approximately 1 µg of each fusion protein were incubated in binding buffer (10 mM Tris/HCl pH 7.5, 1 mM EDTA, 50 mM NaCl) supplemented with 10× Denhardt's for 1 h at room temperature. The blots were then incubated in binding buffer with Denhardt's supplemented with 32P-labelled nucleic acids with or without competitors (10 µg/ml of E. coli tRNA and 10 µg/ml of herring sperm DNA) and incubated for 1 h at room temperature. Excess 32P-labelled nucleic acids were removed by washing the blots for three 15-min periods in binding buffer.

RESULTS

Localisation of the poly(rC) binding domain(s) of hnRNP-K and PCBPs. Initially, we analysed constructs containing different regions of hnRNP-K in order to localise the part(s) of the molecule that are responsible for the observed in vitro poly(rC) binding (results not shown; see Fig. 2 for schematic overview). Constructs containing only KH-domain 1 (hnRNP-K D1) and/or KH-domain 2 (hnRNP-K D2) exhibit very low binding activity towards poly(rC) (see below), whereas constructs including the C-terminal KH domain (hnRNP-K D3) showed poly(rC) binding activities approaching that observed for the full-length protein. The highest binding activity was detected for a polypeptide that only included the C-terminal domain. The central segment of hnRNP-K containing the RGG boxes did not bind any of the nucleic acids under any of the in vitro conditions that were tested (results not shown). This indicates that D3 alone is responsible for the in vitro poly(rC) binding and suggests that single KH domains are capable of nucleic acid binding.

To elaborate on these results, we prepared similar constructs from another poly(rC)-binding protein PCBP-1 (Leffers et al., 1995) (Fig. 2). As for hnRNP-K, the strongest poly(rC)-binding activity was found for constructs that included the C-terminal KH domain (PCBP-1 D3) but, in addition, we found that KH-domain 1 (PCBP-1 D1) also interacts strongly with poly(rC) (Fig. 4). No poly(rC)-binding activity was found in KH-domain 2 (PCBP-1 D2) or in the central part of the molecule (results not shown). This shows that single isolated KH domains are capable of nucleic acid binding and that KH domains most likely are responsible for the observed in vitro poly(rC)-binding properties of both hnRNP-K and PCBP-1.

[Fig. 2 schematic: domain structure of hnRNP-K (acidic region, KH1, KH2, RGG, proline-rich segment, RGG, KH3) and PCBP-1 (KH1, KH2, KH3), with the expressed fragments and their poly(rC)-binding scores.]

Fig. 2. Localisation of the poly(rC)-binding activity in hnRNP-K and PCBP-1. KH domains are shown with dark boxes. The positions of RGG boxes and the proline-rich segment in hnRNP-K are indicated. Thin lines denote proteins or protein fragments that were expressed in E. coli using pGEX vectors (Pharmacia) and (#) indicates proteins that were expressed both in E. coli and in human amnion cells using the vaccinia virus expression system. Numbers refer to the positions in the amino acid sequences (hnRNP-K D, Dejgaard et al., 1994; PCBP-1, Leffers et al., 1995). The nucleic acid binding activity of each construct is indicated by: -, no binding; (+), very weak; +, weak; ++, moderate; +++, strong; ++++, very strong; nd, not determined. The results are summarised from at least three independent experiments.

Nucleic acid binding activity of the hnRNP-K and PCBP KH domains. To analyse the nucleic acid binding properties of single KH domains, we prepared dot-blots with aliquots of each KH domain from hnRNP-K, PCBP-1 and PCBP-2 (Fig. 4). The dot-blots were incubated with 32P-labelled poly(rC), poly(rU), poly(rG), poly(rA), ssDNA (M13V8) and dsDNA (M13V8 RF). The results showed that D3 from all three proteins exhibits both the highest affinity for nucleic acids and the broadest substrate specificity of all the KH domains (Figs 3 and 4A). The three D3 domains bind RNA homopolymers, ssDNA and dsDNA, with the exception of D3 from PCBP-2 that binds ssDNA and dsDNA and poly(rU) at a reduced level (Figs 3 and 4). Domain 1 from the two PCBPs exhibits an affinity for poly(rC) that is almost as high as the D3 domains and they also show a broad substrate specificity. This was in contrast to D1 from hnRNP-K that only binds weakly to poly(rG). The lowest binding activities were observed for the D2 domains from all three proteins, where only D2 from PCBP-2 shows a weak binding to poly(rG).

In an attempt to determine the minimal sequence length necessary for RNA binding, we prepared different constructs of KH-domain 3 from hnRNP-K (Fig. 3). Fragments that started at position 390 and included the rest of the molecule (both splice variants; Dejgaard et al., 1994) exhibit binding activities similar to larger constructs, whereas removal of the C-terminal 11 or 12 amino acids significantly reduces the binding (Fig. 4A, construct F). Combining these results with the results presented by Ito et al. (1994) and comparisons with the closely spaced domains in HBP (McKnight et al., 1992) and FMR1 allows a reasonably precise definition of the nucleic acid binding domain. Thus, a KH domain includes about 70 amino acids, which is consistent with what was found from structure determination of HBP KH-domain 5 (Castiglione-Morelli et al., 1995; Musco et al., 1996) but considerably larger than previous estimates (Siomi et al., 1993a, 1994, 1995; Dreyfuss et al., 1993; Gibson et al., 1993; Buckanovich et al., 1993; Duncan et al., 1994) (see below).

Isolation and analysis of additional KH domains. Having shown that the closely related KH domains from hnRNP-K and the PCBPs were capable of nucleic acid binding, we were interested in determining whether more distantly related KH domains also bound nucleic acids. To do this, we isolated and expressed the first three or four of the 14 or 15 KH domains from HBP (McKnight et al., 1992), the H. halobium ORF 139 protein (Leffers et al., 1989) which contains one and a half KH domains (Fig. 3) and a construct from FMR1 (Verkerk et al., 1991; Siomi et al., 1993b) that included at least the first and part of the second KH domain. Although we originally intended to cover both the KH domains of FMR1, it is very likely, considering a more recent sequence alignment based on the structure of a representative member of the family (Musco et al., 1996), that
[Fig. 3 schematic: domain structures of hnRNP-K (acidic region, KH1, KH2, RGG, proline-rich segment, RGG, KH3), PCBP-1, PCBP-2, FMR1, HBP and H. halobium ORF139, with the expressed constructs labelled by letter and by first and last residue, each scored for binding to poly(rA), poly(rG), poly(rC), poly(rU), ssDNA and dsDNA.]

Rows of the binding table that remain fully legible:

Construct                           Poly(rA)  Poly(rG)  Poly(rC)  Poly(rU)  ssDNA  dsDNA
PCBP-2 O (residues 280-365)         (+)       (+)       ++++      (+)       (+)    -
FMR1 (protein ending at 592)        nd        nd        -         nd        +      nd

Fig. 3. Origin and nucleic acid binding activity of the expressed KH domains. The domain structure of the different proteins is shown schematically as described in Fig. 2. The putative KH domain in the N-terminus of HBP is so divergent (Fig. 6) that it is questionable if it is a KH domain and is indicated with an open box. The complete FMR1 protein was only expressed in the vaccinia virus system. Numbers refer to the positions of the first and last amino acid in the expressed peptides as compared to the complete proteins (hnRNP-K D, Dejgaard et al., 1994; PCBP-1 and 2, Leffers et al., 1995; HBP, McKnight et al., 1992; FMR1, Verkerk et al., 1991; H. halobium (Hh) ORF139, Leffers et al., 1989). The numbering for FMR1 starts from the first Met residue (Met66 in Verkerk et al., 1991). The nucleic acid binding activities of the different constructs are indicated by: -, no binding; (+), very weak; +, weak; ++, moderate; +++, strong; ++++, very strong; nd, not determined. It should be stressed that recent results (Musco et al., 1996) suggest that the second KH domain of FMR1 is considerably larger than previous estimates and thus that the nucleic acid binding activity reported here most likely only includes the contribution from KH-domain 1.
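
The qualitative scale in the legend of Fig. 3 can be treated as an ordinal scale when comparing constructs. The following is a minimal sketch under stated assumptions, not part of the original analysis: the numeric ranks, the names `SCALE`, `PROBES`, `rank` and `preferred_probes` are hypothetical, and the example row is the one listed for PCBP-2 construct O (residues 280-365).

```python
# Ordinal ranks for the score symbols defined in the Fig. 3 legend.
# The numeric values are an assumption made for this sketch only.
SCALE = {"-": 0, "(+)": 1, "+": 2, "++": 3, "+++": 4, "++++": 5}
PROBES = ("poly(rA)", "poly(rG)", "poly(rC)", "poly(rU)", "ssDNA", "dsDNA")


def rank(symbol):
    """Return the ordinal rank of a score symbol, or None for 'nd'."""
    if symbol == "nd":
        return None
    return SCALE[symbol]


def preferred_probes(scores):
    """Return the probe(s) with the highest non-zero rank."""
    ranked = [(probe, rank(s)) for probe, s in zip(PROBES, scores)]
    determined = [(p, r) for p, r in ranked if r is not None]
    best = max((r for _, r in determined), default=0)
    if best == 0:
        return []  # no binding detected, or nothing determined
    return [p for p, r in determined if r == best]


# Scores for PCBP-2 construct O (residues 280-365), in PROBES order,
# as they appear in the Fig. 3 listing.
construct_o = ["(+)", "(+)", "++++", "(+)", "(+)", "-"]
print(preferred_probes(construct_o))  # ['poly(rC)']
```

An ordinal encoding only supports comparisons of order; the differences between ranks carry no quantitative meaning, since the symbols summarise visual estimates from dot-blots.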

the FMR1 construct includes only the first KH domain whereas the C-terminal half of the second is not present (see also comments to Fig. 3).

The constructs containing one and a half KH domains from FMR1 or the first three or four KH domains from HBP bind poly(rG) and, in addition, they bind weakly to M13V8 ssDNA and dsDNA (Figs 3 and 4). The H. halobium ORF139 protein exhibits a very low affinity for poly(rG) and none for any of the other nucleic acids that were tested (Fig. 4A). Thus, KH domains whose sequences are only distantly related to the KH domains from hnRNP-K and the PCBPs are also capable of binding nucleic acids, although they exhibited significantly lower affinities towards the nucleic acids that were used in this study.

Stability of the protein-nucleic-acid complexes in high salt buffer and binding activity of denatured/renatured KH domains. It had previously been shown that the nucleic acid binding of hnRNP-K is highly salt resistant (Matunis et al., 1992; Siomi et al., 1994). To determine if this is shared by single KH domains, we incubated dot-blots with labelled nucleic acids as described above, but increased the NaCl concentration in the washing buffer first from 50 mM to 1 M and subsequently to 5 M (Fig. 4B). No major difference in signal intensity was observed after washing at 1 M NaCl, whereas a significant decrease in the amount of bound nucleic acids was observed after incubation in 5 M NaCl, especially for poly(rG), poly(rU), poly(rA), ssDNA and dsDNA [exemplified for poly(rG) in Fig. 4B]. However, we were not able to remove all the bound nucleic acids and for the poly(rC) binding domains we only observed a 50% decrease in signal intensity as determined by scintillation counting (Fig. 4B; results not shown). The most pronounced difference was observed for the truncated domain 3 from hnRNP-K (Figs 3 and 4B, construct F) where washing at 1 M NaCl reduced the binding significantly and essentially all the bound poly(rC) was lost in the 5 M wash, suggesting that its reduced poly(rC) binding may be caused by a decrease in the stability of the truncated domain rather than by an inability to recognise poly(rC), consistent with observations by others (Castiglione-Morelli et al., 1995). Surprisingly, the isolated domains showed a higher salt resistance than did the complete proteins (Fig. 4B).

The original analysis of the poly(rC) binding activity of hnRNP-K and the PCBPs was made on northwestern blots of 2D gels (Matunis et al., 1992; Dejgaard et al., 1994; Leffers et al., 1995) and, thus, on proteins that had been denatured by SDS gel electrophoresis. To test if the KH domains retained their nucleic acid binding activity after denaturation, we incubated dot-blots in either 0.1 % SDS or 6 M guanidine hydrochloride to denature the proteins. The guanidine-hydrochloride-denatured blots were renatured essentially as described by Sambrook et al. (1989) (see Materials and Methods), whereas the SDS-denatured blots were transferred directly to binding buffer as for 1D and 2D western blots of whole cell lysates (Matunis et al., 1992; Dejgaard et al., 1994; Leffers et al., 1995). The results for SDS-denatured polypeptides were, for most of the domains, essentially as before denaturation, whereas guanidine hydrochloride denaturation reduced or abolished the binding activity of several KH domains (Fig. 5). The results suggest that SDS probably does not denature the peptides completely, with the result that most or all the constructs retain some of their structure which later facilitates the refolding. Guanidine hydrochloride is a stronger denaturant and will probably completely denature the proteins, resulting in less efficient refolding. It should be noted that both SDS and guanidine hydrochloride treatment of filters
Fig. 5. Nucleic-acid-binding activity after SDS and guanidine hydrochloride denaturation. Blots were prepared as described in Fig. 4, but were denatured in either 0.1 % SDS or 6 M guanidine hydrochloride before incubation with [32P]poly(rG). The SDS-treated filter was transferred directly to binding buffer whereas the guanidine-hydrochloride-treated filter was renatured as described in Materials and Methods. The organisation of the blots is as described for Fig. 4, except that constructs F and G are not present on the filters.
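
The renaturation of the guanidine-hydrochloride-treated filter proceeds by repeated dilution with an equal volume of renaturation buffer (Materials and Methods). The stepwise fall in denaturant concentration can be sketched as below. This is an illustration only: it assumes that each step exactly halves the concentration (complete mixing, no buffer removed between steps) and that four halvings are performed in total, which is one possible reading of "repeated four times". The function name `guanidine_after_dilutions` is hypothetical.

```python
def guanidine_after_dilutions(start_molar, n_dilutions):
    """Concentration (M) after each 1:1 dilution with guanidine-free buffer."""
    conc = start_molar
    series = []
    for _ in range(n_dilutions):
        conc /= 2.0  # equal volume of buffer added: concentration halves
        series.append(conc)
    return series


# Starting from 6 M guanidine hydrochloride, four halvings (assumed count)
print(guanidine_after_dilutions(6.0, 4))  # [3.0, 1.5, 0.75, 0.375]
```

Under these assumptions the filters are still exposed to a few hundred millimolar guanidine hydrochloride before the final washes in binding buffer; if the dilution was carried out five times in total, the final value would be half of that shown.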

with bound nucleic acids completely removes all the nucleic acids, as judged by autoradiography.

Fig. 4. Nucleic acid binding activity of immobilised KH-domain peptides. Approximately 1 µg of each GST-fusion protein was dotted onto nitrocellulose membranes and the membranes were incubated with 32P-labelled nucleic acids as described in Materials and Methods. (A) Dot-blots incubated with different 32P-labelled nucleic acids; the blots were washed in binding buffer (10 mM Tris/HCl pH 7.5, 1 mM EDTA, 50 mM NaCl). (B) Dot-blots incubated with 32P-labelled poly(rG) and poly(rC) as in (A) but washed in binding buffer supplemented with 5 M NaCl. (C) Schematic representation of the dot-blots, showing the position of the different constructs; letters refer to the proteins and protein fragments that are shown in Fig. 3. GST represents ≈4 µg GST protein, expressed from the pGEX-2TK vector without insert.

DISCUSSION

Definition of a KH domain as a nucleic-acid-binding entity. Our results, combined with the results from Ito et al. (1994) and comparisons between the many identified KH-domain sequences, suggest that the minimum size of a KH domain is 65-70 amino acids (Fig. 6). If it is truncated by five amino acids at either end, the binding is reduced (Ito et al., 1994; this paper). Most previous descriptions of KH domains (Dreyfuss et al., 1993; Siomi et al., 1993a; Gibson et al., 1993; Buckanovich et al., 1993; Siomi et al., 1994, 1995; Duncan et al., 1994) have only included the N-terminal part, ending shortly after the middle variable region, and this is clearly insufficient for nucleic acid binding. Our results (Dejgaard et al., 1994; Leffers et al., 1995; this paper) and those of others (Ito et al., 1994; Castiglione-Morelli et al., 1995; Musco et al., 1996) suggest that KH domains are composed of two parts, an N-terminal part that is reasonably well conserved and a somewhat more divergent C-terminal part, and that both parts are required for nucleic acid binding. The conserved part could constitute the backbone of a weakly nucleic-acid-binding structure and the variable part could then confer the additional stability and the sequence specificity in the nucleic acid binding.

Reduced sequence specificity and high salt resistance of single domains. The relatively broad sequence specificity exhibited by KH domains when analysed as E. coli-expressed fusion proteins on dot blots, as compared to previous observations on 1D and 2D northwestern blots of cellular proteins (Dejgaard et al., 1994; Leffers et al., 1995; unpublished observations) remains a puzzle. We speculated that the low affinity of the cellular proteins towards some homopolymer nucleic acids could be a consequence of imperfect refolding after SDS electrophoresis; however, our results show that SDS denaturation and subsequent renaturation of the dot-blots do not significantly alter the binding preferences of most KH domains. Thus, the results suggest that single KH domains may possess a more relaxed sequence specificity than that of the entire protein from which they originate (Fig. 4A). Surprisingly, several single KH domains also showed a higher salt resistance in their binding to certain nucleic acids than did their respective full-length proteins. Combined with the observed lack of sequence specificity, this suggests that, although single KH domains are fully capable of binding nucleic acids as isolated entities, they may collaborate within the protein and thereby modulate the overall affinity and sequence specificity of the individual domains. This suggestion offers an explanation for the previous findings of Siomi et al. (1994) who showed that mutations in any of the three KH domains of hnRNP-K change the affinity for poly(rC) of the whole protein. This set of data is otherwise in contrast to our observations and those of Ito et al. (1994) where KH-domain 3 from hnRNP-K is the only domain that shows any significant affinity for poly(rC). A hypothesis of KH-domain collaboration is further supported by studies of Duncan et al. (1994), showing that KH domains 3 and 4 are both necessary and sufficient for the binding of FBP to the promoter region upstream of c-myc;

hnRNP-K D1 44 ELRILLQSKNAGAVIGKGGKNIKALRTD~ASVSVPDS .....SGPERILSISADIETIGEILKKIIPTLEEG 111


hnRNP-K D2 146 ELRLLIHQSLAGGIIGVKGAIKELRENTQTTIKLFQE..CCPHSTDRWI,IGGKPDRWECIKIILDLISES 216
hnRNP-K D3 389 TTQVTIPKDLAGSIIGKGGQRIKQIRHESGASIKIDEP ...LEGSEDRIITITGTQDQIQNAQYLLQNSVKQY 458
PCBP-1 D1 15 TIRLLMHGKEVGSIIGKKGESVKRIREESGARINISEG ..... NCPERIITLTGPTNAIFKAFAMIIDKLEED 82
PCBP-1 D2 99 TLRLWPATQCGSLIGKGGCKIKEIRESTGAQVQVAGD..MLPNSTERAITIAGVPQSVECVKQICLVMLET 169
PCBP-1 D3 281 THELTIPNNLIGCIIGRQG~INEIRQMSGAQIKIANP...VEGSSGRQVTITGSAASISLAQYLINARLSSE 350
PCBP-2 D1 15 TIRLLMHGKNGSIIGKKGESVKKMREESGARINISEG. . . . .NCPERIITLAGPTNAIFKAFAMIIDKLEED 82
PCBP-2 D2 99 TLRLWPASQCGSLIGKGGCKIKEIRESTGAQVQLAGD..MLPNSTERAITIAGIPQSIIECVKQIC~LET 169
PCBP-2 D3 289 SHELTIPNDLIGCIIGRQGAKINEIRQMSGAQIKIANP ...VEGSTDRQVTITGSAASISLAQYLINVRLSSE 358
hsfmr1 D1 220 HEQFIVREDLMGLAIGTHGANIQQ~VPG~AI.D.L........DEDTCTFHIYGEDQDAVKKARSFLEFA 282
hsfmr1 D2 283 EDVIQVPRNLVGKVIGKNGKLIQEIVDKSGVVRVRIEAENEKNVPQEEEIMPPNSLPSNNSRVGPNAPEEKKH 355
hshbp D0 47 LPEKAACLESAQEPAGAWGNKIRPIKASVITQVFH V . . . . . . . PLEERKYKDMNQFGEGEQAICLEIMQRTG 112
hshbp D1 152 SATVAIPKEHHRFVIGKNGEKLQDLELKTATKIQIPRP . . . . . DDPSNQIKITGTKEGIEKARE~LISAEQ 219
hshbp D2 224 VERLEVEKAFHPFIAGPYNRLVGEIMQETGTRINIP . . . . .PPSVNRTEIVFTGEKEQLAQAVARIKKIYEEK 291
hshbp D3 297 TIAVEVKKSQHKYVIGPKGNSLQEILERTGVSVEIPP.....SDSISETVILRGEPEKLGQALTEWAKANSF 363
H. hal. 139 33 RVVYWTAGEMGAAIGDGGSRVDALEATLGRSWLVEDAPTAEGFVANALSPAAVYNVTVSENDTTVAYAEVA 105
H. hal. 139 106 . . . . . . .REDKGVAIGADGTNIETAKELAARHFDIDDIQLT 139

Fig. 6. Alignment of the sequences of the expressed KH domains. Numbers refer to their positions in the different proteins. The KH
domains are numbered from the start of the proteins. HBP D0 corresponds to a putative KH domain in the N-terminal region of the HBP protein, preceding
D1.
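
The position numbers in the alignment allow a simple consistency check on a transcribed row: the number of residues (all characters other than the gap symbol) should equal end - start + 1. The sketch below is illustrative only and is not part of the original work; the row layout "label start sequence end", the helper names `parse_row`, `residue_count` and `span_length`, and the treatment of `.` as the only gap symbol are assumptions made here. Rows containing unreadable characters cannot be verified reliably this way.

```python
import re

# Assumed row layout (hypothetical): "<label> <start> <sequence> <end>",
# where '.' marks an alignment gap and '~' an unreadable character.
ROW = re.compile(
    r"^(?P<label>.+?)\s+(?P<start>\d+)\s+(?P<seq>[A-Z.~]+)\s+(?P<end>\d+)$"
)


def parse_row(line):
    """Split one alignment row into (label, start, sequence, end)."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError("row does not match the assumed layout: %r" % line)
    return m["label"], int(m["start"]), m["seq"], int(m["end"])


def residue_count(seq):
    """Count every character that is not a gap symbol."""
    return sum(1 for ch in seq if ch != ".")


def span_length(start, end):
    """Number of positions covered by an inclusive start..end range."""
    return end - start + 1


# The HBP D3 row as given in the alignment.
row = ("hshbp D3 297 TIAVEVKKSQHKYVIGPKGNSLQEILERTGVSVEIPP....."
       "SDSISETVILRGEPEKLGQALTEWAKANSF 363")
label, start, seq, end = parse_row(row)
print(label, residue_count(seq), span_length(start, end))
```

For this row the two numbers agree, which suggests that the transcription of that row is complete; a mismatch in another row would point to lost or extra characters rather than to a property of the protein.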

it is also supported by mutagenesis studies on FMR1 (Siomi et al., 1993b, 1994, and references in the latter) showing that a single amino acid change in the second KH domain abolishes RNA binding, with a dramatic effect on the function of the protein, as suggested by the severe mental retardation observed in a patient with this mutation in the FMR1 gene (De Boulle et al., 1993).

In conclusion, our results suggest that most, possibly all, KH domains have the ability to bind nucleic acids and that some may do so as single isolated entities, whereas others may work cooperatively with other domains of the KH family. Those that, in our study, do not bind any of the probes may have more stringent sequence specificities, and we assume that several of those that do bind may have even greater affinity for more specific sequences.

K. Dejgaard and H. Leffers were supported by fellowships from the Danish Cancer Society. The work was supported by grants from the Danish Cancer Society and the Danish Biotechnology Programme.

REFERENCES

Buckanovich, R. J., Posner, J. B. & Darnell, R. B. (1993) Nova, the paraneoplastic Ri antigen, is homologous to an RNA-binding protein and is specifically expressed in the developing motor system, Neuron 11, 657-672.
Castiglione-Morelli, M. A., Stier, G., Gibson, T., Joseph, C., Musco, G., Pastore, A. & Trave, G. (1995) The KH module has an αβ fold, FEBS Lett. 358, 193-198.
De Boulle, K., Verkerk, A. J., Reyniers, E., Vits, L., Hendrickx, J., Van Roy, B., Van den Bos, F., de Graaff, E., Oostra, B. A. & Willems, P. J. (1993) A point mutation in the FMR-1 gene associated with fragile X mental retardation, Nat. Genet. 3, 31-35.
Dejgaard, K., Leffers, H., Rasmussen, H. H., Madsen, P., Kruse, T. A., Gesser, B., Nielsen, H. & Celis, J. E. (1994) Identification, molecular cloning, expression and chromosome mapping of a family of transformation upregulated hnRNP-K proteins derived by alternative splicing, J. Mol. Biol. 235, 33-48.
Dejgaard, K. & Celis, J. E. (1994) Two-dimensional northwestern blotting, in Cell biology: a laboratory handbook, vol. 3 (Celis, J. E., ed.) pp. 339-344, Academic Press, San Diego CA.
Delahodde, A., Becam, A. M., Perea, J. & Jacq, C. (1986) A yeast protein HX has homologies with the histone H2AF expressed in chicken embryo, Nucleic Acids Res. 14, 9213-9214.
Dreyfuss, G., Swanson, M. S. & Piñol-Roma, S. (1988) Heterogeneous nuclear ribonucleoprotein particles and the pathway of mRNA formation, Trends Biochem. Sci. 13, 86-91.
Dreyfuss, G., Matunis, M. J., Piñol-Roma, S. & Burd, C. G. (1993) hnRNP proteins and the biogenesis of mRNA, Annu. Rev. Biochem. 62, 289-321.
Duncan, R., Bazar, L., Michelotti, G., Tomonaga, T., Krutzsch, H., Avigan, M. & Levens, D. (1994) A sequence-specific, single-strand binding protein activates the far upstream element of c-myc and defines a new DNA-binding motif, Genes Dev. 8, 465-480.
Engebrecht, J. & Roeder, G. S. (1990) MER1, a yeast gene required for chromosome pairing and genetic recombination, is induced in meiosis, Mol. Cell. Biol. 10, 2379-2389.
Ghisolfi, L., Joseph, G., Amalric, F. & Erard, M. (1992) The glycine-rich domain of nucleolin has an unusual supersecondary structure responsible for its RNA-helix destabilizing properties, J. Biol. Chem. 267, 2955-2959.
Gibson, T. J., Thompson, J. D. & Heringa, J. (1993) The KH domain occurs in a diverse set of RNA-binding proteins that include the antiterminator NusA and is probably involved in binding to nucleic acids, FEBS Lett. 324, 361-366.
Gonzalez, I. L., Gorski, J. L., Campen, T. J., Dorney, D. J., Erickson, J. M., Sylvester, J. E. & Schmickel, R. D. (1985) Variation among human 28S ribosomal RNA genes, Proc. Natl Acad. Sci. USA 82, 7666-7670.
Görlach, M., Wittekind, M., Beckman, R. A., Mueller, L. & Dreyfuss, G. (1992) Interaction of the RNA binding domain of the hnRNP C proteins with RNA, EMBO J. 11, 3289-3295.
Gubler, U. (1988) One tube reaction for the synthesis of blunt-ended double-stranded cDNA, Nucleic Acids Res. 16, 2726.
Hahm, K. B., Kim, G., Turck, C. & Smale, S. T. (1993) Isolation of a murine gene encoding a nucleic acid-binding protein with homology to hnRNP K, Nucleic Acids Res. 21, 3894.
Hoffman, D. W., Query, C. C., Golden, B. L., White, S. W. & Keene, J. D. (1991) RNA binding domain of the A protein component of the U1 small nuclear ribonucleoprotein analysed by NMR spectroscopy is similar to ribosomal proteins, Proc. Natl Acad. Sci. USA 88, 2495-2499.
Ito, K., Sato, K. & Endo, H. (1994) Cloning and characterization of a single-stranded binding protein that specifically recognises deoxycytidine stretch, Nucleic Acids Res. 22, 53-58.
Kao, J., Wu, M. & Chiang, Y. M. (1990) Cloning and characterization of chloroplast ribosomal protein-encoding genes, rpl16 and rps3 of the marine macro-algae Gracilaria tenuistipitata, Gene 90, 221-226.
Kenan, D. J., Query, C. C. & Keene, J. D. (1991) RNA recognition: Towards identifying determinants of specificity, Trends Biochem. Sci. 16, 214-220.
Kiledjian, M. & Dreyfuss, G. (1992) Primary structure and binding activity of the hnRNP U protein: binding RNA through RGG box, EMBO J. 11, 2655-2664.
Klenk, H. P., Schwas, V. & Zillig, W. (1991) Nucleotide sequence of the genes encoding L30, S12 and S7 equivalent ribosomal proteins from the archaeum Thermococcus celer, Nucleic Acids Res. 19, 6047.
Laemmli, U. K. (1970) Cleavage of structural proteins during the assembly of the head of bacteriophage T4, Nature 227, 680-685.
Lamm, G. M. & Lamond, A. (1993) Non-snRNP protein splicing factors, Biochim. Biophys. Acta 1173, 247-265.

Leffers, H., Gropp, F., Lottspeich, F., Zillig, W. & Garrett, R. A. (1989) Sequence, organization, transcription and evolution of RNA polymerase subunit genes from the archaebacterial extreme halophiles Halobacterium halobium and Halococcus morrhuae, J. Mol. Biol. 206, 1-17.
Leffers, H. & Andersen, A. H. (1993) The sequence of 28S ribosomal RNA varies within and between human cell lines, Nucleic Acids Res. 21, 1449-1455.
Leffers, H., Dejgaard, K. & Celis, J. E. (1995) Characterisation of two major cellular poly(rC)-binding human proteins, each containing three K-homologous (KH) domains, Eur. J. Biochem. 230, 447-453.
Mattaj, I. W. (1993) RNA recognition: a family matter? Cell 73, 837-840.
Matunis, M. J., Michael, W. M. & Dreyfuss, G. (1992) Characterization and primary structure of the poly(C)-binding heterogeneous nuclear ribonucleoprotein complex K protein, Mol. Cell. Biol. 12, 164-171.
McKnight, G. L., Reasoner, J., Gilbert, T., Sundquist, K. O., Hokland, B., McKernan, P. A., Champagne, J., Johnson, C. J., Bailey, M. C., Holly, R., O'Hara, P. J. & Oram, J. F. (1992) Cloning and expression of a cellular high density lipoprotein-binding protein that is up-regulated by cholesterol loading of cells, J. Biol. Chem. 267, 12131-12141.
Merrill, B. M., LoPresti, M. B., Stone, K. L. & Williams, K. R. (1986) High pressure liquid chromatography purification of UP1 and UP2, two related single stranded nucleic acid-binding proteins from calf thymus, J. Biol. Chem. 261, 878-883.
Merrill, B. M., Stone, K. L., Cobianchi, F., Wilson, S. H. & Williams, K. R. (1988) Phenylalanines that are conserved among several RNA-binding proteins form part of a nucleic acid-binding pocket in the A1 heterogeneous nuclear ribonucleoprotein, J. Biol. Chem. 263, 3307-3313.
Musco, G., Stier, G., Joseph, C., Castiglione-Morelli, M. A., Nilges, M., Gibson, T. J. & Pastore, A. (1996) Three dimensional structure and stability of the KH domain: Molecular insight into the fragile X syndrome, Cell 85, 237-245.
Nagai, K., Oubridge, C., Jessen, T. H., Li, J. & Evans, P. R. (1990) Crystal structure of the RNA-binding domain of the U1 small nuclear ribonucleoprotein A, Nature 346, 515-520.
Puehler, G., Lottspeich, F. & Zillig, W. (1989) Organization and nucleotide sequence of the genes encoding the large subunits A, B and C of the DNA-dependent RNA polymerase of the archaebacterium Sulfolobus acidocaldarius, Nucleic Acids Res. 17, 4517-4534.
Query, C. C., Bentley, R. C. & Keene, J. D. (1989) A common RNA recognition motif identified within a defined U1 RNA binding domain of the 70 K U1 snRNP protein, Cell 57, 89-101.
Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular cloning, a laboratory manual, 2nd edn, Cold Spring Harbor Laboratory, Cold Spring Harbor NY.
Scherly, D., Boelens, W., van Venrooij, W. J., Dathan, N. A., Hamm, J. & Mattaj, I. W. (1989) Identification of the RNA binding segment of human U1 A protein and definition of its binding site on U1 snRNA, EMBO J. 8, 4163-4170.
Schmidt, C., Henkel, B., Poeschl, E., Zorbas, H., Purschke, W. G., Gloe, T. R. & Müller, P. K. (1992) Complete cDNA sequence of chicken vigilin, a novel protein with amplified and evolutionary conserved domains, Eur. J. Biochem. 206, 625-634.
Siomi, H., Matunis, M. J., Michael, W. M. & Dreyfuss, G. (1993a) The pre-mRNA binding K protein contains a novel evolutionary conserved motif, Nucleic Acids Res. 21, 1193-1198.
Siomi, H., Siomi, M. C., Nussbaum, R. L. & Dreyfuss, G. (1993b) The protein product of the fragile X gene, FMR1, has characteristics of an RNA-binding protein, Cell 74, 291-298.
Siomi, H., Choi, M., Siomi, M. C., Nussbaum, R. L. & Dreyfuss, G. (1994) Essential role for KH domains in RNA binding: impaired RNA binding by a mutation in the KH domain of FMR1 that causes fragile X syndrome, Cell 77, 33-39.
Siomi, M. C., Siomi, H., Sauer, W. H., Srinivasan, S., Nussbaum, R. L. & Dreyfuss, G. (1995) FXR1, an autosomal homolog of the fragile X mental retardation gene, EMBO J. 14, 2401-2408.
Srivastava, M., Fleming, P. J., Pollard, H. B. & Lee Burns, A. (1989) Cloning and sequencing of the human nucleolin cDNA, FEBS Lett. 250, 99-105.
Tan, R., Chen, L., Buettner, J. A., Hudson, D. & Frankel, A. D. (1993) RNA recognition by an isolated α helix, Cell 73, 1031-1040.
Theunissen, O., Rudt, F., Guddat, U., Mentzel, H. & Pieler, T. (1992) RNA and DNA binding zinc fingers in Xenopus TFIIIA, Cell 71, 679-690.
Verkerk, A. J., Pieretti, M., Sutcliffe, J. S., Fu, Y. H., Kuhl, D. P., Pizzuti, A., Reiner, O., Richards, S., Victoria, M. F., Zhang, F., Eussen, B. E., van Ommen, G. J. B., Blonden, L. A., Riggins, G. J., Chastain, J. L., Kunst, C. B., Galjaard, H., Caskey, C. T., Nelson, D. L., Oostra, B. A. & Warren, S. T. (1991) Identification of a gene (FMR-1) containing a CGG repeat coincident with a breakpoint cluster region exhibiting length variation in fragile X syndrome, Cell 66, 905-914.
Zurawski, G. & Zurawski, S. M. (1985) Structure of the Escherichia coli S10 ribosomal protein operon, Nucleic Acids Res. 13, 4521-4526.
