Transcription factors lie at the center of gene regulation, and their identification is crucial to the understanding of transcription and gene expression. Traditionally, the isolation and identification of transcription factors has been a long and laborious task. We present here a novel method for the identification of DNA-binding proteins seen in electrophoretic mobility shift assay (EMSA) using the power of two-dimensional electrophoresis coupled with mass spectrometry. By coupling SDS-PAGE and isoelectric focusing to EMSA, the molecular mass and pI of a protein complex seen in EMSA were estimated. Candidate proteins were then identified on a two-dimensional array at the predetermined pI and molecular mass coordinates and identified by mass spectrometry. We show here the successful isolation of a functionally relevant transcription factor and validate the identity through EMSA supershift analysis. Molecular & Cellular Proteomics 1:472–478, 2002.

The isolation and identification of sequence-specific DNA-binding proteins is not trivial. The first step involves the mapping of promoter regions that interact with potential transcription factors. DNase I footprinting and electrophoretic mobility shift assays (EMSA)1 are generally employed for the rapid characterization of such regions including the definition of cis-acting elements (1). If the defined DNA elements conform to consensus transcription factor binding sites, and if specific antibodies are available, then EMSA supershifts can be employed to possibly identify the interacting proteins. However, if appropriate supershift antibodies are not available, then EMSA is of limited value in the identification of novel DNA-binding proteins.

The traditional route to the identification of cognate trans-acting factors, the biochemical isolation and identification of DNA-binding proteins, is usually a long and labor-intensive process. Purification of transcription factors often involves four or five different chromatographic steps, including ion exchange, gel filtration, and nonspecific and sequence-specific DNA affinity columns (2). The major impediment to the rapid identification of a transcription factor of interest is the fact that they are generally present in low concentrations, usually less than 0.1% of the total nuclear protein. Additionally they often bind with moderate affinity (3). Recent advents in proteomics and mass spectrometry have created unprecedented power in protein identification. For example, proteins have recently been analyzed directly by matrix-assisted laser desorption ionization time-of-flight mass spectrometry utilizing DNA probes harboring specific sequence motifs (4).

In this paper, we have developed a powerful method for the identification of DNA-binding proteins seen in EMSA. Utilizing the power of two-dimensional electrophoresis (2DE) and mass spectrometry (MS), we have established a novel technique to isolate transcription factors. More importantly, our method obviates the need for laborious and extensive purification of the protein of interest. In this paper, the methodology required and the successful isolation of a functionally relevant transcription factor have been described using our novel proteomics approach.

We were interested in the identity of an EMSA complex that bound to a CCAT repeat sequence. This repeat forms part of a functionally important microsatellite repressor sequence within the CD30 promoter (5). Traditional methods such as sequence-specific DNA affinity chromatography, coupled with chromatographic purification of nuclear proteins, proved unsuccessful because of the high abundance and affinity of nonspecific nuclear proteins. Instead, by estimating the pI and molecular mass (MM) of the protein by coupling SDS-PAGE or isoelectric focusing (IEF) with EMSA, it was possible to identify candidate protein spots on a two-dimensional array of nuclear proteins. These candidates were characterized further by excision from a two-dimensional gel at the predetermined pI and MM. Proteins were then eluted, renatured, and tested for original activity in EMSA, and candidate spots were subsequently analyzed by mass spectrometry, and their identity

From Biochemistry and Molecular Biology, School of Biomedical and Chemical Sciences and Western Australian Institute for Medical Research, The University of Western Australia, Crawley 6009, Australia
Received, April 17, 2002, and in revised form, May 29, 2002
Published, MCP Papers in Press, June 20, 2002, DOI 10.1074/mcp.T200003-MCP200
1 The abbreviations used are: EMSA, electrophoretic mobility shift assay; 2DE, two-dimensional electrophoresis; IEF, isoelectric focusing; IPG, immobilized pH gradient; MALDI-TOF, matrix-assisted laser desorption ionization time-of-flight; MS, mass spectrometry; MM, molecular mass; CHAPS, 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonic acid.
472 Molecular & Cellular Proteomics 1.6 © 2002 by The American Society for Biochemistry and Molecular Biology, Inc.
This paper is available on line at http://www.mcponline.org
Proteomic Identification of DNA Binding Activities
(v/v), 2% SDS (w/v), 2 mM tributyl phosphine) for 30 min. Strips were rinsed briefly in SDS running buffer (25 mM Tris, 192 mM glycine, 0.1% SDS (w/v)) prior to being sealed into the top of a 10% SDS-polyacrylamide gel with 0.5% agarose in SDS running buffer. Electrophoresis was performed at 20 mA until the dye front reached the anodic end of the gels. Gels were subsequently stained using the silver staining kit (Amersham Biosciences) as per the manufacturer’s instructions. Alternatively, colloidal Coomassie Blue G250 was used if MS was being performed.
Matrix-assisted Laser Desorption Ionization Time-of-Flight (MALDI-TOF) MS—Spots of interest were excised from 2DE gels, placed in microtiter plates and subjected to MALDI-TOF MS (Australian Proteome Analysis Facility). Samples were subjected to a 16-h tryptic digest at 37 °C. Peptides were extracted from the gel using a 50% (v/v) acetonitrile, 1% (v/v) trifluoroacetic acid solution. A 1-μl aliquot was spotted onto a sample plate with 1 μl of matrix (α-cyano-4-hydroxycinnamic acid, 8 mg/ml in 40% acetonitrile (v/v), 1% trifluoroacetic acid), and MALDI-TOF analysis was performed on a Micromass Tofspec time-of-flight mass spectrometer.
Protein Identification—The peptide masses obtained from MALDI-
TOF spectra were analyzed using NCBI databases and utilizing the
MS-FIT database tool located at the ProteinProspector website (10).
Monoisotopic peaks were searched against human proteins, 1–100
kDa, with a maximum of one missed cleavage, unmodified cysteines,
and with a mass tolerance of 100 ppm.
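The search constraints above (tryptic peptides, at most one missed cleavage, unmodified cysteines, monoisotopic masses, 100 ppm tolerance) can be illustrated with a minimal Python sketch. This is not the MS-FIT implementation and it computes no MOWSE score; the function names are hypothetical, the cleavage rule (after K or R, but not before P) is the commonly used trypsin rule and is an assumption here, and the sequence in the usage note is a toy string rather than a real protein.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids,
# with unmodified cysteine as in the search described above.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free termini
PROTON = 1.00728   # singly protonated ion, [M+H]+


def tryptic_peptides(sequence, max_missed=1):
    """In-silico digest: cleave after K or R unless the next residue is P
    (assumed rule), allowing up to max_missed missed cleavages."""
    sites = [0]
    for i in range(len(sequence) - 1):
        if sequence[i] in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))
    peptides = []
    for i in range(len(sites) - 1):
        for j in range(i + 1, min(i + 2 + max_missed, len(sites))):
            peptides.append(sequence[sites[i]:sites[j]])
    return peptides


def mh_plus(peptide):
    """Monoisotopic [M+H]+ mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def within_ppm(observed, theoretical, tol_ppm=100.0):
    """True if the observed mass lies within tol_ppm of the theoretical mass."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm


def match_peaks(peaks, sequence, tol_ppm=100.0, max_missed=1):
    """Return (peak, peptide) pairs where a peak matches a theoretical peptide."""
    matches = []
    for peptide in tryptic_peptides(sequence, max_missed):
        theoretical = mh_plus(peptide)
        for peak in peaks:
            if within_ppm(peak, theoretical, tol_ppm):
                matches.append((peak, peptide))
    return matches
```

For the toy sequence "AKRPGKC", `tryptic_peptides` yields the fragments AK, AKRPGK, RPGK, RPGKC, and C; the K-R bond is cleaved, whereas the R-P bond is not. A real search would repeat this comparison for every database entry in the chosen mass range and then score the matches.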
RESULTS
FIG. 4. Complex E has a denatured pI of ~5.8. Fraction 36 was analyzed by IEF, and pI fractions were isolated in gel pieces. Proteins were eluted, renatured, and tested for complex E activity in EMSA. Peak complex E activity was detected in pI fractions 5.6–5.8 and 5.8–6.15. Blank denotes radiolabeled probe alone, NE shows binding of oligonucleotide to 2 μg of crude nuclear extract, and F36 denotes 1 μg of S300 fraction 36. Protein-DNA complexes are marked with arrows, and pI fractions are as illustrated.
TABLE I
Protein spot E4 matches the transcriptional repressor YY1
Protein spot E4 was analyzed by MALDI-TOF MS. Monoisotopic peaks were searched against the NCBI protein database for human proteins 1–100 kDa in size with a maximum of one missed cleavage, unmodified cysteines, and with a mass tolerance of 100 ppm. The database was searched using the MS-FIT database tool located at the ProteinProspector website (10). The top five matches for protein spot E4 are shown. The top three matches identify the transcription factor YY1, as well as alternatively named species of YY1 (NF-E1).
Identified protein                      MOWSE score   % Peptides matched   pI/MM (Da)   NCBI accession number
YY1 transcription factor                3040          23                   5.8/44713    4507955
E3′ enhancer-binding protein NF-E1      3349          23                   5.8/44732    1082557
Transcription repressor protein YY1     1477          21                   5.8/44812    88893
KIAA1356 protein                        692           13                   6.2/58984    7243093
Keratin 9, cytoskeletal                 533           18                   5.2/62130    1082558
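As an illustration of how the experimentally estimated pI narrows the list of database matches, the following Python sketch filters the Table I entries by the pI window in which peak complex E activity was recovered (pI 5.6–6.15, Fig. 4). The function name and data layout are hypothetical. Only pI is used as a filter because the calculated masses listed in Table I (about 44.7 kDa for YY1) are lower than the apparent molecular mass of 60–68 kDa cited for YY1 in the text, so a filter on calculated mass would be misleading.

```python
# Top five MS-FIT matches for spot E4, transcribed from Table I:
# (name, MOWSE score, % peptides matched, pI, calculated MM in Da)
HITS = [
    ("YY1 transcription factor", 3040, 23, 5.8, 44713),
    ("E3' enhancer-binding protein NF-E1", 3349, 23, 5.8, 44732),
    ("Transcription repressor protein YY1", 1477, 21, 5.8, 44812),
    ("KIAA1356 protein", 692, 13, 6.2, 58984),
    ("Keratin 9, cytoskeletal", 533, 18, 5.2, 62130),
]

# pI fractions showing peak complex E activity in the IEF-EMSA experiment (Fig. 4).
PI_WINDOW = (5.6, 6.15)


def hits_in_pi_window(hits, window):
    """Return the names of database matches whose pI lies inside the window."""
    low, high = window
    return [name for name, _score, _pct, pi, _mm in hits if low <= pi <= high]


consistent = hits_in_pi_window(HITS, PI_WINDOW)
```

With these values, only the three YY1 entries remain; KIAA1356 (pI 6.2) and keratin 9 (pI 5.2) fall outside the window.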
Complex E Is a Monomeric Protein of 55–66 kDa—Concentrated crude nuclear extract was analyzed by SDS-PAGE, and fractions of discrete molecular mass intervals were excised and eluted from the gel. Proteins of each fraction were renatured and tested for complex E activity in EMSA. Complex E activity was detected in the 55–66-kDa fraction (Fig. 3, lane 55–66). This lane also contains two other complexes that are not seen in the crude nuclear extract binding profile. These complexes may represent multimers of complex E constituents and suggested that complex E is most likely a monomer or homomeric protein complex, although it was possible that it consisted of heteromeric subunits that were within the same molecular mass range.
Denatured Complex E Has a pI of ~5.8—Sephacryl 300 fractionation demonstrated that fraction 36 (Fig. 1, lane F36) contained significant activity of complexes D and E, and to preserve the complex E peak fractions this fraction was used to determine the pI of complex E. Peak complex E activity was detected in two intervals, pI 5.6–5.8 and 5.8–6.15, with lower activity detected in neighboring intervals (Fig. 4). This result suggested that the pI of complex E is ~5.8. Complex D activity was also reconstituted, and peak activity was seen in the pI 5.1–5.35 interval with lower activity in neighboring intervals. Another complex, which was not seen in the crude nuclear extract binding profile, was also seen with peak activity in the pI 4.8–5.1 interval. This complex may represent a nonspecific DNA-binding protein that is normally outcompeted in the S300 fractions but is able to bind in the IEF-purified fraction.
Two-dimensional Analysis of Complex E—The results indicated that complex E is a 60-kDa protein with a denatured pI of ~5.8. To confirm these characteristics, nuclear proteins of fraction 39, the peak E fraction, were analyzed by two-dimensional electrophoresis over an IPG of pH 4–7 and transferred to an SDS-polyacrylamide gel. The region of interest was excised (MM 52–65 kDa, pI 5.5–6.0) and was dissected into 20 quadrants, each corresponding to discrete MM and pI intervals (Fig. 5A). The proteins from each gel slice were eluted with renaturation buffer and assayed for complex E activity using EMSA. Peak complex E activity was detected in gel slices 10 and 11, corresponding to pI intervals of 5.65–5.75 and 5.75–5.85, respectively, and a MM of 57–59 kDa. Lower complex E activity was detected in neighboring fractions 6, 7, 14, and 15, which represent the same pI intervals but neighboring MM intervals, 54.5–57 kDa and 59–62 kDa (Fig. 5B). These results confirmed our earlier estimates of MM (Fig. 3) and pI (Fig. 4) of complex E. Examination of the excised region on a silver-stained gel indicated four candidate protein spots that were located within the quadrants that contained complex E activity. Candidates were denoted E1, E2, E3, and E4 (Fig. 5A), and the identity of these protein spots was elucidated using MALDI-TOF MS.
Mass Spectrometry Identified YY1 as a Candidate for Complex E—Two-dimensional electrophoresis of fraction 39 was repeated, and the resulting 2DE gel was stained with colloidal Coomassie Blue because of its compatibility with mass spectrometry. Two candidate protein spots were excised (Fig. 5A, E2 and E4), one from each quadrant, digested with trypsin, and analyzed by MALDI-TOF MS, and monoisotopic peaks were searched against the NCBI database.
Protein spot E4 matched the transcriptional repressor protein YY1 (Table I). YY1 is a zinc finger transcription factor with a pI of 5.8 and an apparent molecular mass of 60–68 kDa (11, 12), properties similar to our estimated MM and pI of the protein within complex E. Protein spot E2 also matched some YY1 peptides, although because of keratin contamination it had a significantly lower MOWSE score (data not shown).
FIG. 5. Two-dimensional analysis of complex E. Fraction 39 was analyzed by two-dimensional electrophoresis and resolved on an SDS-polyacrylamide gel and silver-stained. A, the resulting two-dimensional gel had the region of interest excised between pI 5.5–6.0 and MM 52–65 kDa, which was dissected into 20 quadrants as shown. B, EMSA of eluted proteins. Quadrants had proteins eluted, renatured, and tested for complex E binding activity in EMSA. Peak complex E activity was detected in quadrants 10 and 11, corresponding to four candidate protein spots labeled E1, E2, E3, and E4 in A. NE shows oligonucleotide binding to 2 μg of crude nuclear extract, and F39 designates 1 μg of S300 fraction 39. Protein-DNA complexes are indicated with arrows. Blank, unbound oligonucleotide.
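The quadrant numbers reported for the excised region are consistent with a grid of four pI columns by five MM rows numbered row by row: quadrants 10 and 11 are then adjacent in one row, and quadrants 6, 7, 14, and 15 lie directly above and below them. This layout is inferred from the numbers alone and is an assumption, as is the helper name in the short Python sketch below; the text does not state how the quadrants were numbered.

```python
N_COLS = 4  # assumed number of pI intervals per row (inferred, not stated)
N_ROWS = 5  # assumed number of MM intervals (20 quadrants / 4 columns)


def quadrant_to_cell(number, n_cols=N_COLS):
    """Map a 1-based quadrant number to a zero-based (row, col) pair,
    assuming quadrants are numbered row by row across the pI axis."""
    if not 1 <= number <= N_ROWS * n_cols:
        raise ValueError("quadrant number outside the assumed grid")
    return divmod(number - 1, n_cols)


peak = [quadrant_to_cell(n) for n in (10, 11)]               # peak activity
neighbours = [quadrant_to_cell(n) for n in (6, 7, 14, 15)]   # lower activity
```

Under this assumed layout the peak quadrants occupy the two central pI columns of the middle MM row, and every neighboring quadrant with lower activity shares a pI column with a peak quadrant, which matches the description of identical pI intervals and adjacent MM intervals.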
mildly soluble proteins suffer from poor resolution or are lost during IEF (14). As such, the DNA-binding protein of interest should be readily soluble to avoid complications with 2DE and contain few post-translational modifications to aid mass spectrometric identification.
This technique has proven of general utility in identifying DNA-binding proteins seen in EMSA without the need for extensive purification of the protein of interest. We have used this technique to identify other EMSA binding activities such as Sp2 (data not shown). DNA-binding proteins, especially transcription factors, lie at the center of gene regulation, and thus the identification of unknown factors is crucial to the understanding of transcriptional regulation. Recent advances in mass spectrometry and proteomics have provided rapid and accurate techniques for protein identification and will allow the identification of many transcription factors without the need for tedious purification techniques.

* This work was supported by the Australian National Health and Medical Research Council and the Cancer Foundation of Western Australia. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ Contributed equally to this work.
§ To whom correspondence should be addressed: Biochemistry and Molecular Biology, School of Biomedical and Chemical Sciences, The University of Western Australia, 35 Stirling Hwy., 6009 Crawley, Western Australia. Tel.: 61-8-9380-3041; Fax: 61-8-9380-1148; E-mail: labraham@cyllene.uwa.edu.au.

REFERENCES
1. Dent, C. L., and Latchman, D. S. (1993) in Transcription Factors: A Practical Approach (Latchman, D. S., ed) pp. 1–26, Oxford University Press, Oxford
2. Gadgil, H., Jurando, L. A., and Jarrett, H. W. (2001) DNA affinity chromatography of transcription factors. Anal. Biochem. 290, 147–178
3. Ren, L., Chen, C., and Sternberg, A. S. (1994) Tethered bandshift assay and affinity purification of a new DNA-binding protein. Biotechniques 16, 852–855
4. Nordhoff, E., Krogsdam, A.-M., Jorgensen, H. F., Kallipolitis, B. H., Clark, B. F. C., Roepstorff, P., and Kristiansen, K. (1999) Rapid identification of DNA-binding proteins by mass spectrometry. Nat. Biotechnol. 17, 884–888
5. Croager, E., Gout, A. M., and Abraham, L. J. (2000) Involvement of Sp1 and microsatellite repressor sequences in the transcriptional control of the human CD30 gene. Am. J. Pathol. 156, 1723–1731
6. Li, Y., Ross, J., Scheppler, J. A., and Franza, B. R. (1991) An in vitro transcriptional analysis of early responses of the human immunodeficiency virus type I long terminal repeat to different transcriptional activators. Mol. Cell. Biol. 11, 1883–1893
7. Marshak, D. R., Kadonaga, J. T., Burgess, R. R., Knuth, M. W., Brennan, W. A., Jr., and Lin, S. (1996) Strategies for Protein Purification and Characterization: A Laboratory Course Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
8. Ossipow, V., Laemmli, U., and Schibler, U. (1993) A simple method to renature DNA-binding proteins separated by SDS-polyacrylamide electrophoresis. Nucleic Acids Res. 21, 6040–6041
9. Laemmli, U. K. (1970) Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227, 680–685
10. Clauser, K. R., Baker, P. R., and Burlingame, A. L. (1999) Role of accurate mass measurement (+/- 10 ppm) in protein identification strategies employing MS or MS/MS and database searching. Anal. Chem. 71, 2871–2882
11. Austen, M., Lüscher, B., and Lüscher-Firzlaff, J. M. (1997) Characterization of the transcriptional repressor YY1. J. Biol. Chem. 272, 1709–1717
12. Harihan, N., Kelley, D. E., and Perry, R. P. (1991) δ, a transcription factor that binds to downstream elements in several polymerase II promoters, is a functionally versatile zinc finger protein. Proc. Natl. Acad. Sci. U. S. A. 88, 9799–9803
13. Becker, K. G., Jedlicka, P., Templeton, N. S., Liotta, L., and Ozato, K. (1994) Characterization of hUCRBP (YY1, NF-E1, delta): a transcription factor that binds the regulatory regions of many viral and cellular genes. Gene 150, 259–266
14. Molloy, M. P. (2000) Two-dimensional electrophoresis of membrane proteins using immobilized pH gradients. Anal. Biochem. 280, 1–10