DNA family shuffling is a powerful method for enzyme engineering,
which utilizes recombination of naturally occurring functional
diversity to accelerate laboratory-directed evolution. However, the
use of this technique has been hindered by the scarcity of family
genes with the required level of sequence identity in the genome
database. We describe here a strategy for collecting metagenomic
homologous genes for DNA shuffling from environmental samples by
truncated metagenomic

logous families rather than random mutagenesis of a single gene (6).
This approach takes advantage of the fact that most of the
deleterious mutations have long ago been removed by natural
selection, while diverse function has been created by the positive
mutation interchange. The reassortment of proven mutations yields a
higher frequency of functional progeny sequences, and because
multi-gene shuffling starts with more than one parental sequence, it
accesses a broader
DECEMBER 31, 2010 • VOLUME 285 • NUMBER 53 JOURNAL OF BIOLOGICAL CHEMISTRY 41509
TMGS-PCR-based DNA Family Shuffling
analysis of environmental DNA (or so-called metagenomes)
have enabled us to mine microbial diversity, allowing us to
access their genomes (17–20). Two strategies are generally
used to screen and identify novel genes from metagenomic
libraries: function-based and sequence-based screening. In
function-based screening, clones expressing desired traits are
selected from libraries, and aspects of the molecular biology
and biochemistry of the active clones are analyzed. Con-
versely, sequence-based screening is not dependent on the
expression of cloned genes in heterologous hosts. Generally, it
is based on the conserved DNA sequences of target genes.
Hybridizations or PCRs are performed based on the deduced
DNA consensus (21).
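As an illustration of what a "deduced DNA consensus" means in sequence-based screening, the following minimal sketch derives a consensus string from a set of aligned sequences. The function name `consensus`, the threshold parameter `min_fraction`, and the toy sequences are all hypothetical and are not taken from this study; real primer design works on curated alignments and usually encodes ambiguity with degenerate base codes rather than a plain `N`.

```python
from collections import Counter

def consensus(aligned, min_fraction=0.8):
    """Return a consensus string for equal-length aligned sequences.

    A column contributes its most common base only if that base reaches
    min_fraction of the sequences; otherwise the position is marked 'N'.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    out = []
    for column in zip(*aligned):
        base, count = Counter(column).most_common(1)[0]
        conserved = base != "-" and count / len(column) >= min_fraction
        out.append(base if conserved else "N")
    return "".join(out)

# Hypothetical aligned fragments, invented for illustration only.
toy_alignment = ["GGTCATTCG", "GGTCACTCG", "GGACATTCG", "GGTCATTCG"]
print(consensus(toy_alignment))
```

Positions where no base reaches the threshold are the ones that make primer design difficult when a family lacks long conserved stretches.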
We report here the development of a homologous gene
collection method termed “truncated metagenomic gene-
specific PCR” (TMGS-PCR).2 This approach led to success in
tapping into homologous sequences of enzymes from envi-
ronmental DNA with PCR with a designated metagenomic
starting gene and experimentally validated truncated gene-
EXPERIMENTAL PROCEDURES
Bacterial Strains, Plasmids, and Growth Conditions—Escherichia coli
DH5α and pBluescript SK+ (Stratagene, La Jolla, CA) were used as host
and vector, respectively, for cloning of the lipase. E. coli strain
BL21-CodonPlus (DE3)-RIPL (Stratagene) and pET-24a(+) (Novagen, Inc.,
Madison, WI) were used as host and vector, respectively, for
heterologous expression of the lipase. Plasmid pMD18-T (TaKaRa Ltd.,
Otsu, Japan) was used for gene cloning and sequencing. E. coli cells
were routinely grown in Luria-Bertani (LB) medium at 37 °C,
supplemented with 100 µg/ml ampicillin, 50 µg/ml streptomycin
sulfate, 10 µg/ml tetracycline HCl, and 12.5 µg/ml chloramphenicol,
if needed.

FIGURE 1. Schematic overview of homologous genes collected by
truncated metagenomic gene-specific amplification. Initially, the
metagenomic starting gene was isolated from environmental samples by
functional screening. Based on the sequence of this identified gene,
truncated gene-specific primers were used to amplify homologous genes
from different environmental samples. The retrieved homologous family
genes were subjected to shuffling to generate the chimeric libraries
for the isolated clones with desired properties.
Library Construction—A total of 60 environmental samples were
collected from terrestrial microenvironments in China and were named
S01–60. DNA extraction from samples was carried out using the
PowerSoil® DNA Isolation kit (MOBIO, Carlsbad, CA) according to the
manufacturer’s instructions. Electrophoresis for DNA from
environmental samples was performed with a Bio-Rad CHEF DR II system
(Bio-Rad). A metagenomic library of the S05 sample was constructed
using the ~40-kb end-repaired DNA fragments and CopyControl™ Fosmid
Library Production Kit (Epicenter Technologies, Madison, WI) using
protocols provided by the manufacturer.
Screening and Subcloning—For detection of lipolytic activity,
transformed cells were spread on LB agar containing 1% (w/v)
tributyrin (Merck, Darmstadt, Germany) and incubated overnight at
37 °C. After 3–5 days at room temperature or 4 °C, colonies were
monitored for the presence of clear halos (22). Sau3AI partially
digested fosmid DNA from the positive clone was ligated to
BamHI-linearized and dephosphorylated pBluescript SK+ vector, and the
ligation mixture was used to transform E. coli DH5α. The new clones
were examined on the same type of indicator plates for lipase
activity. A tributyrin-positive clone was selected and sequenced. The
gene encoding LipS05 was amplified by PCR using Lip24aF and Lip24aR
primers, 5′-CGCCATATGATGGAATTTATTCCCCCTG-3′ and
5′-TAGCTCGAGTTTGGCACGCAGCG-3′, which resulted in NdeI and XhoI sites
at the 5′- and 3′-ends, respectively. The digested PCR fragment was
inserted between the NdeI and XhoI sites of pET-24a(+), creating
pET-24a-LipS05. The recombined plasmid was transformed into competent
cells of E. coli strain BL21-CodonPlus (DE3)-RIPL using the CaCl2
method following standard protocols.
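The statement that the Lip24aF and Lip24aR primers introduce NdeI and XhoI sites can be checked with a short script. This is only a sketch: the primer sequences are those given in the text, the recognition sequences (CATATG for NdeI, CTCGAG for XhoI) are the standard ones for these enzymes, and the names `SITES` and `find_sites` are illustrative rather than part of any published tool.

```python
# Primer sequences as given in the text (5' to 3').
LIP24A_F = "CGCCATATGATGGAATTTATTCCCCCTG"
LIP24A_R = "TAGCTCGAGTTTGGCACGCAGCG"

# Recognition sequences of the two enzymes named in the text.
SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG"}

def find_sites(primer):
    """Return {enzyme: 0-based position} for each site present in the primer."""
    return {name: primer.find(seq) for name, seq in SITES.items() if seq in primer}

forward_sites = find_sites(LIP24A_F)
reverse_sites = find_sites(LIP24A_R)
print(forward_sites, reverse_sites)
```

In both primers the site begins after a three-base 5′ overhang, and in the forward primer the ATG inside the NdeI site is followed directly by the ATG of the coding sequence.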
Homologous Gene Amplification—Sets of primers, which were assigned
designations comprising the first three letters Lip referring to
lipase and the sequence length (bp) of the product, were used to
amplify related lipase fragments from environmental DNA samples. The
product was amplified using a Bio-Rad DNA Engine Peltier Thermal
Cycler. The conditions were as follows. Thirty cycles of reactions
were performed under the conditions of denaturation at 94 °C for 1
min, annealing at 58 °C for 1 min, and an extension step at 72 °C for
1 min. This sequence was repeated 35 times followed by a 10-min final
extension step at 72 °C. All PCR products were cloned into the TA
vector and confirmed by DNA sequencing.
Sequence Analysis—The Lipase Engineering Database was searched using
the BLASTP program (23). Multiple sequence alignments were
constructed using the CLUSTAL-X program (24). To determine the
evolutionary relationship of environmental lipases with other
lipases, phylogenetic analysis was performed with the amino acid
sequences of environmental lipases and 50 lipases obtained from
Lipase Engineering Database using neighbor-joining methods with 1000
bootstrap replications performed by MEGA version 4.0 (25).
DNA Shuffling—The chimeric library was constructed us-

-D-thiogalactopyranoside (IPTG), and the incubation temperature was
lowered to 20 °C for 20 h.
Inclusion bodies were solubilized using the protein refolding kit
(Novagen), according to the manufacturer’s protocol, with slight
modifications. E. coli BL21 cells expressing lipases were harvested,
resuspended in 0.1 culture volume of wash buffer (20 mM Tris-HCl, pH
7.5/10 mM EDTA/1% Triton X-100), and broken by sonication. After
centrifugation, the pellet was washed twice with 0.1 culture volumes
of wash buffer and then solubilized at room temperature in
solubilization buffer (0.1 M glycine, pH 11/0.3%
N-lauroylsarcosine) at a concentration of 5–10 mg/ml wet weight of
the pelleted cell debris, including the inclusion bodies. For
refolding, the denatured protein was dialyzed twice (for 3 and 12 h)
against 50 sample volumes of dialysis buffer (20 mM Tris-HCl, pH
8.5) supplemented with 0.1 mM DTT, twice again (3 h each) against
dialysis buffer, and then once again (12 h) against 25 sample volumes
of dialysis buffer supplemented with 1.0 mM reduced glutathione/0.2
mM oxidized glutathione. All dialysis

2 The abbreviations used are: TMGS-PCR, truncated metagenomic
gene-specific polymerase chain reaction; pNPC4, p-nitrophenyl
butyrate; pNP8, p-nitrophenyl caprylate; pNPC12, p-nitrophenyl
laurate.
DNA library was constructed using a CopyControl fosmid library
production kit (Epicenter) according to the manufacturer’s
instructions. This process was followed by the screening of the
lipolytic activity on a tributyrin agar plate. A positive clone
showing the lipolytic activity among ~30,000 colonies was selected
(Fig. 2A). Sequence analysis of the short insert DNA obtained by
subsequent subcloning experiments revealed the presence of one open
reading frame consisting of 1,311 nucleotides, encoding a protein
(LipS05) with a molecular mass of 48 kDa. A BLAST search in the
Lipase Engineering Database revealed that the LipS05 can be aligned
to several lipases around the catalytic site (E value <0.01). Those
sequences showing the highest homology were from Rhodococcus sp. (gi
111021394; identity, 25/76 32%; similarity, 39/76 51%), Coprinopsis
cinerea (gi 169844727; identity, 27/82 32%; similarity, 45/89 50%),
Streptomyces griseus (gi 182439565; identity, 19/58, 32%; similarity,
41/82 50%) and Cryptococcus neoformans (gi 134109957; identity, 25/89
28%; similarity, 48/89 53%). LipS05 and the homologous
uncharacterized

with that of E. coli harboring pET-24a without the subcloned lipase
sequence. Therefore, high-throughput functional screening can be
applied to detect the phenotypic variability of mutants of LipS05
using this expression system (29).
Recovery of Homologous Sequences—Because there are no conserved
domains in the LipS05 sequence, only sequences that correspond to
tens of amino acids that allow alignment, it is not possible to
design degenerate primers to isolate related family genes from other
environmental samples as in traditional PCR-based metagenomic
screening. Sets of primers specific to arbitrary locations of the
LipS05 sequence were designed (Table 1). Initially, overlapping
primers targeted terminal to the LipS05 sequence were designed to
amplify full-length homologous genes from environmental DNA samples.
For full-length Lip1311 primer sets, no more than 3 specific
amplification products of the correct size (1311 bp) were generated
from reactions for 60 templates. For the other two overlapping
terminal primers, Lip1275 and Lip1247, none of the products were
observed by gel analysis. Despite repeated
multiple alignments (Fig. 4). Therefore, a combination of 141
variable sites with at least two alternative amino acids at each
site would comprise at least $2^{141}$, or $2.8 \times 10^{42}$, different
sequences. Although the vast majority of the sequence space
that can be explored will be contingent on the recombining
capacity of DNA family shuffling, it can be assumed that the
DNA family shuffling of those homologous genes will result in
extensive genetic diversity.
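The size of this lower bound can be reproduced directly. The sketch below assumes only the two figures stated in the text (141 variable sites and a minimum of two alternatives per site); the variable names are illustrative.

```python
VARIABLE_SITES = 141      # variable positions reported from the alignment
CHOICES_PER_SITE = 2      # minimum number of alternative residues per site

# Each site is chosen independently, so the counts multiply.
lower_bound = CHOICES_PER_SITE ** VARIABLE_SITES
print(f"{lower_bound:.1e}")
```

Because many sites carry more than two alternatives, the true combinatorial space is larger than this figure, while the fraction actually sampled depends on how the shuffling procedure recombines the parents.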
The phylogenetic tree constructed by the NJ method for
the lipase family proteins is shown in Fig. 5. All of the lipases
from environmental samples cluster closely together on the
left, while all of the known lipases cluster loosely together on
the right. The lipases map together on a branch of the tree
separate from those known lipases and form a clearly distinct
group. The phylogenetic tree clearly indicates that novel
lipases form a well-supported clade (NJ bootstrap, 100%) that
is loosely related to the other known lipases (bootstrap, <50%).
Analysis of Shuffled Clones—To explore whether those ho-
strate pNPC12, the mutant C3 showed the highest specific activity
with 17.4 units mg⁻¹, which was more than 2-fold higher than that of
the other two mutants. To investigate the correlation between
structure and observed activity, these three clones were sequenced.
The sequence of C3 is a chimera of LipS33, LipS11, LipS30, LipS11,
LipS01-2, LipS08-2, and LipS11. The E5 gene encodes a chimeric enzyme
with sequence elements originating from LipS01-2, LipS08-2, LipS30,
LipS33, LipS11, LipS30, LipS08-2, LipS33, LipS13, LipS12, LipS13,
LipS30, and LipS16. D5 contained sequence elements

FIGURE 5. Phylogenetic analysis of lipase sequences (partial
sequence, 307 amino acids). The data are illustrated as a
neighbor-joining tree based on 24 new lipases and 50 previously known
sequences from the Lipase Engineering Database. The name at the nodes
is assigned according to those in the Lipase Engineering Database and
is shown for the major clades only. Each geographic origin is
represented by a gray dot.

DISCUSSION
In the present study, a novel metagenomic gene-specific strategy was
chosen to retrieve related genes, in contrast to traditional
PCR-based metagenomic screening, which uses known gene information to
design degenerate primers corresponding to conserved regions of an
enzyme family (21). Total DNA extracted directly from environmental
samples does not typically contain an even representation of the
population genomes within the sample (30). Therefore, the targeted
homologous genes of known sequences might be overshadowed by
sequences from unknown dominant microbial populations in
environmental DNA, which would severely decrease the efficiency and
specificity of gene amplification and affect the product yield (18).
In fact, LeCleir et al. (31) failed to retrieve any chitinase
sequences from environmental samples using a known sequence-based PCR
approach, despite evidence suggesting high levels of chitinolytic
activity in the sample. A previous study retrieved only one lipase
sequence from the environmental DNA of olive oil percolation using a
known sequence-based PCR approach, even with extensive analysis of
conserved regions and careful primer design (32). In contrast,
metagenomic gene-specific PCR uses a gene identified by functional
screening of a soil DNA library, instead of a known gene in the
database, as the starting point for retrieving family genes, which
has an inherent property of biasing for genomes of dominant
microorganisms in soil DNA. This distinguishing feature makes the
metagenomic gene a good starting point for retrieving metagenomic
homologous family genes from soil DNA samples by PCR. It is worth
noting that 20 bands of correct size and without any nonspecific
bands were observed in the gel analysis of the products amplified
from DNA of 60 soil samples with a metagenomic gene-specific Lip923
primer set (Fig. 3). That was achieved by only one experiment without
any attempts at optimizing reaction conditions. Nevertheless, neither
our method nor traditional metagenomic sequence-based PCR guarantees
acquisition of full-length family genes because identical sequences
are conserved in the interior regions of subfamily genes rather than
at terminal regions (6).
Phylogenetic analysis indicated that the metagenomic gene-specific
amplification strategy offers a chance to access a novel subfamily
that was previously unknown and completely uncharacterized.
Traditional degenerate primer-based PCR for metagenomic screening is
based on the conserved DNA sequences of known family genes.
Generally, the amino acid identities of newly discovered sequences
from a metagenome to those in the protein databases have been above
50% and usually around 80–98%, so such sequences rarely shed light
on a novel gene family (33–37).
The functional diversity of chimeric phenotypes generated by DNA
family shuffling was evident in experiments assessing both activity
on indicative plates and substrate specificity of purified mutants.
The varied size of the clear halos around the mutant clones indicated
an apparently varied lipolytic activity. In addition,
characterization of the specific activity on substrates with various
chain lengths provides definite evidence for the functional diversity
of chimeric clones. Such a property of diversity is expected and is
indeed a hallmark of successful libraries created by DNA family
shuffling (6, 7). The crossover frequency of chimeric clones in this
study is higher than that reported for normal enzyme shuffling and
lower than that of a shuffled library created with methods for
generating highly recombined genes, such as RICHT (38). Given that we
used the same shuffling procedure as in normal enzyme shuffling
reports to create our chimeric library, the high crossover frequency
can be attributed to the diversity of the genetic material we used.
In fact, there are usually no more than 4 homologous genes of enzymes
from microorganisms used to shuffle in reports using DNA family
shuffling (38). Additionally, in the 13,310 bases sequenced, 3
spontaneous mutations were found. This rate of 1 mutation per 4,436
bases is well within the number of spontaneous mutations expected
from the rounds of PCR used to generate the fragment and template DNA
and to amplify the chimeric products. Further, no unexpected
insertions, deletions, or rearrangements were detected by DNA
sequencing. These data suggest that the DNA shuffling procedure using
genetic material collected by TMGS-PCR can create a chimeric library
of reasonable quality.

FIGURE 6. Characterization of chimerical clones. A, lipase activity
screens. The chimeric lipase library was plated to rich medium
containing 1% tributyrin, grown 1 day, and incubated for 5–7
additional days at 4 °C. Transformants exhibiting diverse activity
produce clear variably-sized halos. B, specific activity

CONCLUSION
In conclusion, we have presented a novel TMGS-PCR approach with
unique features that distinguish it from the traditional PCR-based
metagenomic screening method. Functional analysis of the shuffled
library, coupled with the sequence information from the selected
chimeric progeny, indicated that this method can provide high-quality
genetic materials for DNA family shuffling. We expect that TMGS-PCR
could provide a new alternative to collecting metagenomic homologous
genes of microbial enzymes for DNA family shuffling,
which is an excellent tool for the irrational design of biocatalysts
with tailored properties. Meanwhile, one can also use TMGS-PCR to
generate breeding stock and then use “synthetic shuffling” to breed
divergent sequences for efficient directed enzyme evolution.

REFERENCES
1. Arnold, F. H., and Volkov, A. A. (1999) Curr. Opin. Chem. Biol. 3, 54–59
2. Tao, H., and Cornish, V. W. (2002) Curr. Opin. Chem. Biol. 6, 858–864
3. Cherry, J. R., and Fidantsef, A. L. (2003) Curr. Opin. Biotechnol. 14, 438–443
4. Turner, N. J. (2009) Nat. Chem. Biol. 5, 567–573
5. Cohen, N., Abramov, S., Dror, Y., and Freeman, A. (2001) Trends Biotechnol. 19, 507–510
6. Ness, J. E., Welch, M., Giver, L., Bueno, M., Cherry, J. R., Borchert, T. V., Stemmer, W. P., and Minshull, J. (1999) Nat. Biotechnol. 17, 893–896
7. Crameri, A., Raillard, S. A., Bermudez, E., and Stemmer, W. P. (1998) Nature 391, 288–291
8. Chang, C. C., Chen, T. T., Cox, B. W., Dawes, G. N., Stemmer, W. P., Punnonen, J., and Patten, P. A. (1999) Nat. Biotechnol. 17, 793–797
9. Christians, F. C., Scapozza, L., Crameri, A., Folkers, G., and Stemmer,
R. M. (1998) Chem. Biol. 5, R245–249
17. Ferrer, M., Martínez-Abarca, F., and Golyshin, P. N. (2005) Curr. Opin. Biotechnol. 16, 588–593
18. Cowan, D., Meyer, Q., Stafford, W., Muyanga, S., Cameron, R., and Wittwer, P. (2005) Trends Biotechnol. 23, 321–329
19. Lorenz, P., and Eck, J. (2005) Nat. Rev. Microbiol. 3, 510–516
20. Blow, N. (2008) Nature 453, 687–690
21. Yun, J., and Ryu, S. (2005) Microbial Cell Factories 4, 8
22. Schmidt, M., and Bornscheuer, U. T. (2005) Biomol. Eng. 22, 51–56
23. Fischer, M., and Pleiss, J. (2003) Nucleic Acids Res. 31, 319–321
24. Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994) Nucleic Acids Res. 22, 4673–4680
25. Tamura, K., Dudley, J., Nei, M., and Kumar, S. (2007) Mol. Biol. Evolution 24, 1596–1599
26. Joern, J. M., Meinhold, P., and Arnold, F. H. (2002) J. Mol. Biol. 316, 643–656
27. Ferrer, M., Plou, F., Nuero, O., Reyes, F., and Ballesteros, A. (2000) J. Chem. Technol. Biotechnol. 75, 569–576
28. Wong, H., and Schotz, M. (2002) J. Lipid Res. 43, 993
29. Moore, J. C., and Arnold, F. H. (1996) Nat. Biotechnol. 14, 458–467
30. Tringe, S. G., and Rubin, E. M. (2005) Nat. Rev. Genet. 6, 805–814
31. LeCleir, G. R., Buchan, A., Maurer, J., Moran, M. A., and Hollibaugh, J.
Prospecting Metagenomic Enzyme Subfamily Genes for DNA Family Shuffling by
a Novel PCR-based Approach
Qiuyan Wang, Huili Wu, Anming Wang, Pengfei Du, Xiaolin Pei, Haifeng Li, Xiaopu
Yin, Lifeng Huang and Xiaolong Xiong
J. Biol. Chem. 2010, 285:41509-41516.
doi: 10.1074/jbc.M110.139659 originally published online October 20, 2010