This action might not be possible to undo. Are you sure you want to continue?
1128/MMBR.68.3.518–537.2004 Copyright © 2004, American Society for Microbiology. All Rights Reserved.
Vol. 68, No. 3
Determination of the Core of a Minimal Bacterial Gene Set†
Rosario Gil,1,2* Francisco J. Silva,1,2 Juli Pereto ´,1,3 and Andre ´s Moya1,2
Institut Cavanilles de Biodiversitat i Biologia Evolutiva, Universitat de Vale `ncia, Valencia,1 and 2 Departament de Gene `tica and Departament de Bioquı ´mica i Biologia Molecular,3 Universitat de Vale `ncia, Burjassot (Vale `ncia), Spain INTRODUCTION .......................................................................................................................................................518 MAIN FEATURES OF THE MINIMAL SET ........................................................................................................520 Information Storage and Processing....................................................................................................................520 DNA metabolism. (i) Basic replication machinery ........................................................................................520 (ii) DNA repair, restriction, and modiﬁcation ...........................................................................................524 RNA metabolism .................................................................................................................................................524 (i) Basic transcription machinery ................................................................................................................524 (ii) Translation................................................................................................................................................525 (iii) RNA degradation ....................................................................................................................................526 Protein Processing, Folding, and Secretion ........................................................................................................527 Cell Structure and Cellular Processes.................................................................................................................527 Cell wall................................................................................................................................................................527 Cell shape and division......................................................................................................................................527 Substrate transport ............................................................................................................................................528 Energetic and Intermediary Metabolism.............................................................................................................528 Glycolysis, gluconeogenesis, pyruvate metabolism, and the TCA cycle ......................................................529 Electron transport chain and proton motive force generation.....................................................................530 Pentose phosphate pathway...............................................................................................................................530 Biosynthesis of amino acids ..............................................................................................................................531 Biosynthesis of lipids..........................................................................................................................................531 Biosynthesis of nucleotides................................................................................................................................531 Biosynthesis of cofactors....................................................................................................................................533 Poorly Characterized Genes ..................................................................................................................................534 CONCLUSIONS .........................................................................................................................................................534 ACKNOWLEDGMENTS ...........................................................................................................................................535 REFERENCES ............................................................................................................................................................535 INTRODUCTION Complete genome sequences are becoming available for a large number of diverse bacterial species. Comparative genomics shows that most bacterial proteins are highly conserved in evolution, allowing predictions to be made about the functions of most products of uncharacterized genomes based on model organisms, such as Escherichia coli and Bacillus subtilis (grampositive and gram-negative bacteria, respectively), for which abundant high-quality genetic and biochemical information has been obtained in the past. One important question raised by the availability of complete genomic sequences is how many genes are essential for cellular life. Although bacterial genomes differ vastly in their sizes and gene repertoires, no matter how small, they must contain all the information to allow the cell to perform many essential (housekeeping) functions that give the cell the ability to maintain metabolic homeostasis, reproduce, and evolve, the three main properties of living cells (53). Cells usually can
* Corresponding author. Mailing address: Institut Cavanilles de Biodiversitat i Biologia Evolutiva, Universitat de Vale `ncia, Apartat Oﬁcial 2085, 46071 Vale `ncia, Spain. Phone: 34 96 354 36 29. Fax: 34 96 354 36 70. E-mail: firstname.lastname@example.org. † Supplemental material for this article may be found at http://mmbr .asm.org. 518
import metabolites but not functional proteins; therefore, they have to rely on their own gene products to perform such essential functions. The determination of the minimal set of protein-coding genes necessary to maintain a living cell is becoming an increasingly appealing issue, considering that such minimal gene set should include “the smallest possible group of genes that would be sufﬁcient to sustain a functioning cellular life form under the most favorable conditions imaginable, that is, in the presence of a full complement of essential nutrients and in the absence of environmental stress” (48). Reconstruction of the minimal gene set can take advantage of the increasing knowledge of completely characterized genomes. In recent years, several research groups have tried to deﬁne the essential set of survival protein-encoding genes in bacteria by different experimental and computational methods (reviewed in references 22, 24, 38, and 79). Three different experimental approaches have been used to identify genes that are essential under particular growth conditions: massive transposon mutagenesis strategies (the most widely used approach), the use of antisense RNA to inhibit gene expression (22, 37), and the systematic inactivation of each individual gene present in a genome (31, 45, 64; http://www.genome.wisc.edu/functional /tnmutagenesis.htm). However, all these approaches have limitations. Transposon mutagenesis might overestimate the set
VOL. 68, 2004
MINIMAL BACTERIAL GENE SET
by misclassiﬁcation of nonessential genes that slow growth without arresting it but can also miss essential genes that tolerate transposon insertions. The use of antisense RNA is limited to the genes for which an adequate expression of the inhibitory RNA can be obtained in the organism under study. Finally, inactivation of single genes does not detect essential functions encoded by redundant genes, and the essential gene set is not the same as the minimal genome, since it is clear that genes that are individually dispensable may not be simultaneously dispensable. Computational analysis has also been extensively used to try to get closer to the minimal gene set (28, 48, 65, 81). For this purpose, the smallest bacterial genomes fully sequenced, from bacterial parasites or endosymbionts, have been very useful, since they must retain all genes involved in housekeeping functions and a minimum amount of metabolic transactions for cellular survival and replication in their given niche. The smallest complete bacterial genome reported thus far corresponds to the human pathogen Mycoplasma genitalium, with only 480 protein-coding genes in a 540-kb genome (23), which can be considered an upper-limit estimate for the minimal bacterial gene set. However, smaller genomes have been described for other bacteria, such as some strains of Buchnera aphidicola, the endosymbiotic bacteria of aphids, whose genomes have an estimated size of around 450 kb, which can contain about 400 protein-coding genes (27). The two ﬁrst bacterial genomes completely sequenced were those from the parasitic bacteria Haemophilus inﬂuenzae (21) and M. genitalium (23), gram-negative and gram-positive bacteria, respectively, with reduced genomes (1.83 Mb and 0.58 Mb respectively). Soon after that, Mushegian and Koonin performed a computational comparison between them, assuming that genes conserved across the large phylogenetic distances that exist between these two species are likely to be essential (65). The authors suggested a list of 256 conserved genes as a close estimate of the minimal gene set. However, some of the genes considered essential by this approach can be disrupted by transposon insertion (34), and it is reasonable to consider that enlargening the number of genomes used in the comparison will signiﬁcatively reduce the number of genes considered to be essential. In the last few years, complete genomes of ﬁve bacterial endosymbionts of insects have been described (2, 28, 77, 80, 83). These bacteria have a permanent host-associated lifestyle, and therefore they have relaxed selection on the maintenance of genes that are not required for survival in their protected environments. Based on the above-mentioned assumption that the genes shared by multiple genomes are likely to be essential and to be good candidates for inclusion in the minimal gene set, we performed a comparative analysis of all ﬁve endosymbiont genomes, which revealed that they share only 277 orthologous protein-coding genes (28). The obtained minimal endosymbiont gene set was then compared with the genome of M. genitalium; this resulted in the identiﬁcation of 180 housekeeping genes that were shared by all six genomes. The number of shared genes among endosymbiotic and parasitic bacteria was further reduced to 156 when the parasites Rickettsia prowazekii and Chlamydia trachomatis were added to the comparison (44). However, computational analysis has also limitations, since it is likely to underestimate the minimal set
because it takes into account only the genes that have remained similar enough during the course of evolution to be recognized as true orthologues. Therefore, it will not include genes with a high rate of evolution, which may not show their relationship in comparisons of distant taxons. In the abovementioned study, the displacement of unrelated but functionally analogous protein-coding genes was not considered, thus presenting only a core of the possible minimum gene set necessary to sustain life. In an attempt to get closer to the minimal bacterial genome, in this paper we present an enhanced review of all previously used strategies for addressing this issue. Taking as a start our previous computational comparison of gene orthologues for the ﬁve completely sequenced endosymbionts, we added to the comparison functional equivalent genes that do not present sequence similarity. The obtained gene set was compared with the B. subtilis essential genes determined by systematic inactivation (45) and with the proposed essential genes for E. coli based on the genome-scale genetic footprinting performed by Gerdes et al. (25), the information found at the PEC (for “Proﬁling E. coli Chromosome”) website (http://www.shigen .nig.ac.jp/ecoli/pec), and the results obtained by F.R. Blattner and coworkers (http://www.genome.wisc.edu/functional /tnmutagenesis.htm). It should be noticed that the PEC and Blattner’s E. coli sequencing-project databases are incomplete, and in some cases there are discrepancies about the essentiality of E. coli genes depending on the source. Some differences might be due to the different growth conditions used in the experiments, while others might be reﬂecting the inability of a mutant with a severe decrease in ﬁtness to maintain an enough vigorous growth to be isolated in the footprinting experiments (25). We also took into account the computationally derived minimal gene set proposed for M. genitalium (65) and the results of the global transposon mutagenesis for mycoplasmas (34) in order to detect genes that appear to be dispensable. Genes that are present in all ﬁve endosymbionts, and for which no transposon insertion could be detected in mycoplasmas, were considered essential in reduced genomes, even if they appear to be nonessential in bacteria with larger genomes. We have also included in our minimal gene set some of the genes that were disrupted by transposon insertion in mycoplasmas but were proven to be essential in other bacteria and are present in all ﬁve endosymbionts under study. We also compared our set with the list of essential genes identiﬁed in Staphylococcus aureus by antisense RNA experiments (22, 37) and the recently sequenced genome of Phytoplasma asteris (70). Finally, the resulting gene list was reanalyzed to detect gaps in metabolic pathways that could be considered essential to maintain a reasonable metabolic homeostasis for any living cell. The following gene and protein servers and databases were used to identify the corresponding orthologous genes and protein functions, as well as to reconstruct the metabolic pathways: BRENDA (for “Braunschweig Enzyme Database”) (http://www .brenda.uni-koeln.de/index.php4), COG (for “Clusters of Orthologous Groups of Proteins”) (http://www.ncbi.nlm.nih.gov /COG/), ExPASy (for “Expert Protein Analysis System”) molecular biology server (http://us.expasy.org), KEGG (for “Kyoto Encyclopedia of Genes and Genomes”) (http://www.genome .ad.jp/kegg2.html), MBGD (for “Microbial Genome Database for Comparative Analysis”) (http://mbgd.genome.ad.jp), and
BSg and BBp. Pfam (for “Protein Families Database of Alignments and HMMs”) (http://www. Several topoisomerases that are present in M.ac. coli to allow independent segregation of the two daughter chromosomes (EC 5.7). ␦’. Another recruitment protein (DnaC in E. performs all the topoisomerase functions required for DNA replication and segregation. since different essential functions can be deﬁned depending on the environmental conditions. which is present in all these bacteria. MOL. in Table S2 presented in the supplemental material. the general replication reaction duplicates both strands of DNA. However.7. The number of these genes. encoded by lig). as yet unidentiﬁed proteins that perform such functions in these reduced genomes. In this work. At least one histone-like protein is also encoded by all ﬁve analyzed endosymbiont genomes. The notion of a minimal cell cannot be sharply deﬁned.fr /SubtiList/) (50) and can also be found in the KEGG database. while topA is only present in two strains of B.pasteur.99.6. the general DNA replication begins. coli. replication terminates and the two daughter chromosomes are separated. are also conserved. Second. Several mechanisms have been described in bacteria (46).-) to the replication fork.1.1.7. or that there are other. and PriA).99. B. which might be the case for small genomes. encoded by topA. Although lig was previously annotated as a pseudogene in B. Nevertheless. genitalium. a multistage reaction with four basic steps.2. The gene name column generally indicates the most common synonymous for the corresponding gene in gram-negative bacteria (except when no gene with such function has been identiﬁed in E. Candidatus Blochmannia ﬂoridanus. It is possible that the above-mentioned GyrA/B. revealed that it does not encode any of the proposed initiation proteins (DnaA. MAIN FEATURES OF THE MINIMAL SET Information Storage and Processing DNA metabolism.sanger. The speciﬁc mechanism for replication initiation depends both on the structure of the replication origin and on the nature of the initiation protein (47). (SSB). REV.ad. glossinidia (endosymbiont of tsetse ﬂies). we decided to include this histone-like protein-coding gene in our minimal set.1. The recruitment and loading of helicase at the replication origin generally requires the presence of a histonelike protein for the destabilization of the duplex DNA near the origin of replication. The number of genes included in each category is indicated parenthetically in the tables.-) attracts the primase DnaG (EC 2.5. gyrase (EC 5. In the cases of nonorthologous or nonhomologous genes encoding proteins with the same function. (i) Basic replication machinery. it is possible to try to deﬁne which functions should be performed in any living cell and to list the genes that would be necessary to maintain such functions.99. To begin DNA replication. subtilis and E. MICROBIOL.1. The B. we have included in Table S2 (in the supplemental material) the gene that was present in more analyzed organisms. Hence. ␦. One of the most important housekeeping functions in all living cells is DNA replication. we present the hypothetical core of one of the possible minimal protein-coding gene sets able to sustain a functional bacterial cell under ideal conditions. Genes encoding the ␣. On the other hand. ε. Third.1. both of them essential in B. Two more proteins required at this stage. ﬂoridanus. although in some cases it does not correspond with the given name for the gram-positive bacteria orthologous gene. and ␤ subunits of DNA polymerase III. transferring it only in the correct DNA location which is not coated by Single-Stranded DNA Binding protein. it can be assumed that DNA replication can take place without the need for these initiation and recruiting proteins under some conditions. ﬂoridanus and W. BIOL. MG353 appears to encode a nucleoid DNA binding protein similar to hupA. with the action of DNA polymerase III (EC 2. aphidicola.7. It should be noticed that in gram-positive bacteria there are two genes encoding the ␣ subunits of this holoenzyme. RecA. Once the DNA is melted and primed. subtilis. coli). could show some minor variation associated with putative gene fusions. genitalium. coli and B. encoded by gyrA and gyrB) and ligase (EC 6. subtilis) cannot be found in the reduced genomes under study. aphidicola BSg due to the absence of the C-terminal protein end (80). Fourth.uk/Software/Pfam). The results of this analysis are summarized in Table 1 and. aphidicola strains BAp.jp/ and http://genolist. DnaI in B. thus allowing the completion of the third stage of DNA replication. and numerous versions of minimal gene sets can be conceived to fulﬁll such functions even for the same set of conditions (49).-. a replication origin sequence within the genome is recognized by protein components.1.520 GIL ET AL. ␥ and . the gene encoding the protein necessary to stop replication by inhibiting helicase translocation (tus in E.-) to prevent the indiscriminate binding of the helicase to single-stranded DNA. the endosymbiotic bacteria of carpenter ants. subtilis). DnaC is absent in B. The functional classiﬁcation of the genes is based on the categories used in the sequencing work on Aquifex aeolicus (19). since parC and parE have been lost in all endosymbionts analyzed. but one of them (dnaE) has been proven to be dispensable in M. genitalium and essential in B. and EC 5. coli.7. since it has been suggested that type IA (TopA) and type II (GyrA/B and ParD/E) topoisomerases may use comparable mechanisms to . are also conserved. Table S2 in the supplemental material also includes the homologous genes present in the six completely sequenced reduced genomes (M.2. rtp in B. subtilis) is also required to form a complex with the helicase DnaB (EC 3. Since this gene is also present in all endosymbionts analyzed except B. and we examine the main features of the proposed minimal gene set. although none of these proteins is conserved in all of them. but the analysis of the genome of B.2. the helicase DnaB (EC 3. both of them present in all bacteria with reduced genomes analyzed in this study. while polC encodes both the ␣ and ε subunits of DNA polymerase III. genitalium. all subunits that are present both in gram-negative and gram-positive bacteria. and Wigglesworthia glossinidia) and the two model bacteria used in this analysis (E. with more detail. subtilis gene numbers are those given by the Japanese and European Consortium involved in its sequencing (http://bacillus. initiation of replication occurs through the recruitment of replisomal proteins at the origin. indicating that it is also dispensable in these bacteria. but not the number of functions. it has been reannotated as a functional gene encoding the catalytic domain of the protein but with the absence of the BRCT domain (28).6. encoded by parC and parE) must also be considered dispensable. ﬂoridanus. The rationale followed to decide which genes should be included in it is described in detail. In M. First.genome.
11 6.1.12 6.10 6.1. Protein-coding genes in the hypothetical minimal cell 521 Category Subcategory Gene E.184.108.40.206 6.6 2.1.17 220.127.116.11.18.104.22.168.22.214.171.124 6.2.1.VOL.1.14 6.1.3 5.1.6 6. b subunit Prolyl-tRNA synthase Seryl-tRNA synthase Threonyl-tRNA synthase Tryptophanyl-tRNA synthase Tyrosyl-tRNA synthase Valyl-tRNA synthase Cysteine desulfurase-NifS homolog tRNA (5-methylaminomethyl-2-thiouridylate) methyltransferase GTP binding protein involved in biosynthesis of 5-methilaminomethyl-2-thiouridine Glucose-inhibited division protein A.7.2. restriction.9 4. ␣ subunit DNA primase DNA polymerase III.1 6.7 2.1. A subunit DNA gyrase.126.96.36.199.6 2. involved in biosynthesis of 5-methylaminomethyl-2-thiouridine Peptidyl-tRNA hydrolase Protein component of RNa P 50S 50S 50S 50S 50S 50S 50S 50S 50S 50S 50S 50S 50S 50S ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal protein protein protein protein protein protein protein protein protein protein protein protein protein protein L1 L2 L3 L4 L5 L6 L9 L10 L11 L12 L13 L14 L15 L16 DNA repair.7.7 188.8.131.52.1.1.20 184.108.40.206.6 6.2.1. a subunit Phenylalanyl-tRNA synthase.7 220.127.116.11. ␤ subunit DNA polymerase III.7.2.2 6. B subunit DNA polymerase III. no.1.7. ␤Ј subunit RNA polymerase major factor Alanyl-tRNA synthase Arginyl-tRNA synthase Asparaginyl-tRNA synthase Aspartyl-tRNA synthase Cysteinyl-tRNA synthase Glutaminyl-tRNA synthase Glutamyl-tRNA synthase Glycyl-tRNA synthase.3 6.- Replicative DNA helicase DNA polymerase III. ε subunit DNA polymerase III. Protein function DNA metabolism (16 genes) Basic replication machinery (13 genes) dnaB dnaE dnaG dnaN dnaQ dnaX gyrA gyrB holA holB hupA lig ssb nth polA ung deaD greA nusA nusG rpoA rpoB rpoC rpoD alaS argS asnS aspS cysS glnS gltX glyS hisS ileS leuS lysS metS pheS pheT proS serS thrS trpS tyrS valS iscS mnmAa mnmEb mnmGc pth rnpA 18.104.22.168.1.1.61 Translation: aminoacyl-tRNA synthesis (21 genes) Translation: tRNA maturation and modiﬁcation (6 genes) 3.1. ␤ subunit RNA polymerase.22.214.171.124 126.96.36.199. 68.99.29 188.8.131.52 184.108.40.206.7. b subunit Histidyl-tRNA synthase Isoleucyl-tRNA synthase Leucyl-tRNA synthase Lysyl-tRNA synthase Methionyl-tRNA synthase Phenylalanyl-tRNA synthase.19 6.1.4 220.127.116.11 2. ␦ subunit DNA polymerase III.99. ␦Ј subunit DNA binding protein DNA ligase (NAD dependent) SSB Endonuclease III 5Ј-3Ј exonuclease domain of DNA polymerase I Uracil-DNA glycosylase ATP-dependent RNA helicase Transcription elongation factor Transcription-translation coupling Transcription antitermination protein RNA polymerase.C.7 6.7.2 4. 2004 MINIMAL BACTERIAL GENE SET TABLE 18.104.22.168 6.1.5 6.3 2.18 6.5 Translation: ribosomal proteins (50 genes) rplA rplB rplC rplD rplE rplF rplI rplJ rplK rplL rplM rplN rplO rplP Continued on following page . ␣ subunit RNA polymerase.26.1.7. and modiﬁcation (3 genes) RNA metabolism (106 genes) Basic transcription machinery (8 genes) 2. ␥ and subunits DNA gyrase.22.214.171.124.1.7 126.96.36.199.1.15 6.
MICROBIOL.- 50S 50S 50S 50S 50S 50S 50S 50S 50S 50S 50S 50S 50S 50S 50S 50S 50S 30S 30S 30S 30S 30S 30S 30S 30S 30S 30S 30S 30S 30S 30S 30S 30S 30S 30S 30S ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal ribosomal protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein protein L17 L18 L19 L20 L21 L22 L23 L24 L27 L28 L29 L31 L32 L33 L34 L35 L36 S2 S3 S4 S5 S6 S7 S8 S9 S10 S11 S12 S13 S14 S15 S16 S17 S18 S19 S20 Ribosomal methytransferase GTP binding protein GTP binding protein Dimethyladenosine transferase GTP binding protein Ribosome binding factor A GTP binding protein Elongation factor P Elongation factor G Ribosome-recycling factor N5-glutamine methyltransferase. and secretion (15 genes) Protein posttranslational modiﬁcation (2 genes) Protein folding (5 genes) Protein translocation and secretion (5 genes) Continued on following page .C. maturation and modiﬁcation (7 genes) cspR engA era ksgA obg rbfA ychF efp fusA frr hemK infA infB infC lepA prfA smpB tsf tufA RNA degradation (2 genes) pnp rnc map pepA dnaJ dnaK groEL groES grpE ffh ftsY 188.8.131.52.1.3 2.1 Protein processing.1.1.7. no.1.6. BIOL.11.48 2.- 3. Protein function rplQ rplR rplS rplT rplU rplV rplW rplX rpmA rpmB rpmC rpmE rpmF rpmG rpmH rpmI rpmJ rpsB rpsC rpsD rpsE rpsF rpsG rpsH rpsI rpsJ rpsK rpsL rpsM rpsN rpsO rpsP rpsQ rpsR rpsS rpsT Translation: ribosome function.1. REV. MOL.26. TABLE 1—Continued Category Subcategory Gene E. modulation of release factor activity Initiation factor IF-1 Initiation factor IF-2 Initiation factor IF-3 GTP binding elongation factor Peptide chain release factor 1 (RF1) tmRNA binding protein Elongation factor Ts Elongation factor Tu Polyribonucleotide nucleotidyltransferase Ribonuclease III Methionine aminopeptidase Aminopeptidase A/I Hsp70 cochaperone Chaperone Hsp70 Class I heat shock protein Class 1 heat shock protein Hsp70 cochaperone Protein component of signal recognition particle Signal recognition particle receptor Translation factors (12 genes) 3.3 184.108.40.206. folding.8 220.127.116.11.4.522 GIL ET AL.18 3.
11 18.104.22.168 4.1.8 2.3 Lipid metabolism (7 genes) Biosynthesis of nucleotides (15 genes) Continued on following page .22.214.171.124.12 5.1.1 2.15 2.65 2.40 5.3.8 2.7.16 126.96.36.199.1 6.6 1.41 6.53 Preprotein translocase subunit (ATPase) Membrane-embedded preprotein translocase subunit Membrane-embedded preprotein translocase subunit Probable O-sialoglycoprotein endopeptidase ATP-dependent protease ATP-dependent protease La Cytoskeletal cell division protein Low-afﬁnity inorganic phosphate transporter PTS glucose-speciﬁc enzyme II Histidine-containing phosphocarrier protein of PTS PTS enzyme I Enolase Fructose-188.8.131.52 184.108.40.206.2.17.9 2.4.51 220.127.116.11.3 18.104.22.168.17 22.214.171.124.126.96.36.199.7.1 1.9 1.1 188.8.131.52.8.9 2.8.19 184.108.40.206.4. MINIMAL BACTERIAL GENE SET 523 Protein function secA secE secY Protein turnover (3 genes) gcp hﬂB Ion ftsZ pitA ptsG ptsH ptsI Energetic and intermediary metabolism (56 genes) Glycolysis (10 genes) eno fbaA gapA gpmA ldh pfkA pgi pgk pykA tpiA atpA atpB atpC atpD atpE atpF atpG atpH yidC rpe rpiA tkt cdsA fadD gpsA plsB plsC psd pssA adk dcd gmk hpt ndk nrdE nrdF ppa prsA pyrG thyA tmk trxA trxB upp Biosynthesis of cofactors (12 genes) coaA coaD 4.33 220.127.116.11 3.45 2.9 2.1.69 3.3. no.1.2.18 18.104.22.168.22.214.171.124 2.1.1 126.96.36.199.94 2.3 3.1 5.3.1 2. 68.6-bisphosphate aldolase Glyceraldehyde-3-phosphate dehydrogenase Phosphoglycerate mutase L-Lactate dehydrogenase 6-Phosphofructokinase Glucose-6-phosphate isomerase Phosphoglycerate kinase Pyruvate kinase Triose-phosphate isomerase ATP synthase ␣ chain ATP synthase A chain ATP synthase ε chain ATP synthase ␤ chain ATP synthase C chain ATP synthase B chain ATP synthase ␥ chain ATP synthase ␦ chain Essential for proper integration of ATPase into the membrane Ribulose-phosphate 3-epimerase Ribose 5-phosphate isomerase Transketolase Phosphatidate cytidylyltransferase Acyl-CoA synthase sn-Glycerol-3-phosphate dehydrogenase sn-Glycerol-3-phosphate acyltransferase 1-Acyl-sn-glycerol-3-phosphate acyltransferase Phosphatidylserine decarboxylase Phosphatidylserine synthase Adenylate kinase dCTP deaminase Guanylate kinase Hypoxanthine phosphoribosyltransferase Nucleoside diphosphate kinase Ribonucleoside diphosphate reductase (major subunit) Ribonucleoside diphosphate reductase (minor subunit) Inorganic pyrophosphatase Phosphoribosylpyrophosphate synthase CTP synthase Thymidylate synthase Thymidylate kinase Thioredoxin Thioredoxin reductase Uracil phosphoribosyltransferase Pantothenate kinase 4Ј-Phosphopantetheine adenylyltransferase Cellular processes (5 genes) Cell division (1 gene) Transport (4 genes) Proton motive force generation (9 genes) Penthose phosphate pathway (3 genes) 188.8.131.52.1.3. 2004 TABLE 1—Continued Category Subcategory Gene E.4.VOL.57 184.108.40.206.13 220.127.116.11 18.104.22.168 2.3 22.214.171.124.126.96.36.199.C.14 188.8.131.52.6.1.2 2.15 184.108.40.206.220.127.116.11.3.4.20 3.
(ii) DNA repair. surprisingly. coli.1. rpoC. which has been extensively studied in endosymbiotic bacteria. but its gram-positive counterpart. subtilis (SMC. which encodes an endonuclease IV (EC 3. We have also included in the minimal gene set four more genes involved in transcription which are present in all ﬁve endosymbionts and M. processing.5. is present in all endosymbionts. The presence of these genes would allow the cell to maintain the rate of mutation at tolerable levels. tatD family Conserved hypothetical protein Conserved hypothetical protein Conserved hypothetical protein 2.7. which promotes the transcription of a wide variety of genes) can be disrupted in M. and modiﬁcation. rpoD (encoding the major factor of RNA polymerase. M. a rudimentary system of DNA repair is still maintained in every endosymbiont genome and in M.2 2. A common feature of reduced genomes is the loss of most genes involved in DNA recombination and repair.36 1. subtilis and are conserved in all genomes analyzed in this study. TABLE 1—Continued Category Subcategory Gene E.2. Instead. MICROBIOL. and modiﬁcation of transcripts.-). encoding two transcription factors. However. since none of them have been dis- . and RNA degradation and its regulation. restriction.7. the authors recognize that there might exist some nonidentiﬁed problems with the method of mutagenesis used. The exonuclease V (EC 3. genitalium has lost the DNA polymerase domain of polA (which encodes DNA polymerase I).1. while M. coli either.1 2. Furthermore. The ung gene. genitalium.7.6 2. encoded by the addAB system.1.26 2. The genes involved in these pathways represent more than 50% of the total number of genes included in our proposed minimal set (107 of 206 genes).21. bind and cleave DNA. REV.1. Old gene name: thdF. encoded by the recBCD system in gram-negative bacteria. M. coli. aphidicola.1. None of them seems to be essential for B. is not present in M. no.7.- a b Old gene name: trmU.21. (i) Basic transcription machinery.7.-) has been conserved. genitalium. subtilis and some experiments indicate that they might not be essential in E. MOL.3.1.2. RNA metabolism refers to all processes that involve RNA.C. Taking everything into account. indicating that the truncated protein must be involved only in DNA repair mechanisms.2. SMC proteins are conserved in most (but not all) prokaryotes. It is the central and most evolutionarily conserved part of cell physiology.1.5. RNA metabolism. pneumoniae.11. and ScpB) are not conserved either. and ung.24 6. although uvrA has proven to be dispensable (34). BIOL. genitalium.2) with similar activity. which result in leakage for some of the mutants (34). although.99.2. based on similarities in several structural elements (6). rpoB.7.7. Protein function coaE dfp folA glyA metK nadR nadV pdxY ribF yloS Poorly characterized (8 genes) mesJ mraW ybeY ycfF ycfH yoaE yqgF yraL 2.1.3. including transcription.1. subtilis and E. translation. Other proteins involved in chromosome condensation that are essential in B. c Old gene name: gidA.5). subtilis and E. As in B. we propose that a minimal gene set should include at least the genes that encode one endonuclease (nth or nfo). related to RNA polymerase function [EC 2. involved in the coupling between translation and termination of RNA synthesis) are essential in B. endonuclease III (EC 4.7. is present in all endosymbionts under study.524 GIL ET AL. and only the portion with the 5Ј-3Ј exonuclease activity of the protein (EC 3.1.3 2. although it is not considered essential in B.7. genitalium: deaD (ATP-dependent RNA helicase) and greA and nusG. and E.11. 80).1 2. is also present in all reduced genomes that have been analyzed. which repairs DNA at apurinic or apyrimidinic sites. It has been proposed that the loss of DNA repair mechanisms at the beginning of the symbiotic relationship started a process of continuous degeneration of these genomes (62) while the loss of genes involved in homologous recombination has probably contributed to the chromosome stability of reduced genomes (78.1.2 Dephospho-CoA kinase Phosphopantothenate cysteine ligase 4Ј-Phospho-pantothenyl-L-cysteine decarboxylase Dihydrofolate reductase Glycine hydroxymethyltransferase Methionine adenosyltransferase Adenylyltransferase Nicotinamide phosphoribosyltransferase Pyridoxal kinase Riboﬂavin kinase Flavin mononucleotide adenylyltransferase Thiamine pyrophosphokinase Conserved hypothetical protein Methyltrasferase Conserved hypothetical protein HIT family Putative deoxyribonuclease. encoding the DNA repair enzyme uracil-DNA glycosylase (EC 3. and they have not been included in the minimal set.1.6].7.5 4.6. and rpoD.2. coli mutants lacking the encoding gene are viable (85).1. and nusA. one exonuclease (encoded by a truncated version of polA).18. probably due to the existence of several overlapping DNA repair mechanisms in large genomes or because they are dispensable in the absence of selective pressure. genitalium retains the nfo gene. encoded by nth). ScpA. Five genes that encode components of the basic transcription machinery (rpoA.1. However.35 2. all of them seem to be essential in M. They have all been included in the minimal gene set. Although none of them are essential in B. genitalium retains the UvrABC system.
but trmD appears as a pseudogene in W. but it is also essential in S. and DNA partitioning and segregation.18). coli. Therefore. MnmE. 40). but published reports disagree about whether pth and trmU (renamed mnmA ) are essential in E. genitalium: gidA (glucose-inhibited division protein A). EC 2. are nonessential in E.1. Although the function of most of the universally conserved bacterial GTPases is poorly understood (with the exception of the factors necessary for protein synthesis and secretion. Since these four genes have been preserved in reduced genomes and since the 2-thiouridine modiﬁcation of tRNAs stabilizes anticodon structure.3. genitalium. glossinidia and is absent in M. Furthermore. coli (39). The loss of trmD (encoding a tRNA methylase. subtilis and in M. and E. renamed mnmE ). coli as a sulfur transferase (40). and Obg. EC 3. We have not included in the minimal gene set any transcription regulator. recent studies support the idea that GTPases of the Obg and Era groups regulate and coordinate ribosome function.1.61) seem to be essential in small genomes. Therefore. and S. cell cycle activity. but it appears to be also involved in the hypermodiﬁcation of tRNAs and has been renamed mnmG (7). Two Obg-like GTPases. genitalium (34). and formylation has proven not to be essential for all bacteria. required for the formylation of methionyl tRNA (EC 2. is also present in all analyzed reduced genomes. pth (which encodes a peptidyl-tRNA hydrolase. L30. However. 2004 MINIMAL BACTERIAL GENE SET 525 rupted in the massive transposon knockout experiment performed by Hutchison et al. we assume that the minimal number of ribosomal proteins required for proper functioning of the ribosome corresponds to the gene set present in M. members of a two-component system involved in regulation of RNA synthesis (yycF and yycG. Era.1.7. genitalium (ﬁve different mutants were found in the global transposon mutagenesis experiment ) and therefore has not been included in our minimal set.1. and gatC. 25.31) reduces the growth rate in E. Among the genes involved in tRNA modiﬁcation that have been described as essential in B. (ii) Translation.-). (34).2. although the authors consider that this ﬁnding does not prove that L28 is not essential (34). genitalium. obg appears to . Thus. and a small number of GTPases are universally conserved over the entire range of bacterial species. The fmt gene.9). EC 2. coli. since it is dispensable in Pseudomonas aeruginosa (66). gatA. since it does not seem to be an essential function in bacteria with reduced genomes. except in M. genitalium. confers ribosome binding ability to tRNA.-). EC 2.1. Three more genes in this category that are not considered essential in B. which proves that it can be nonessential depending on features of other elements involved in translation (9). the gene that encodes a cysteine desulfurase (EC 4. and MnmG are involved in the ﬁrst steps of the biosynthesis of the hypermodiﬁed nucleoside 5-methylaminomethyl-2-thiouridine. 29. while several E. Among the genes involved in ribosome maturation and modiﬁcation. coli strains lacking this gene have severely impaired growth rates (30). represented mainly by aminoacyl-tRNA synthases and ribosomal proteins. In all genomes analyzed in this work. Mutations in mnmE result in excessive frameshifting during protein synthesis.1. although the last of these can be disrupted in M. The largest number of conserved genes encode ribosomal proteins. and S21) are not encoded by the genome of M. EC 2.2. and organelles analyzed (18). it was not identiﬁed in the recently sequenced genome of Phytoplasma asteris (70). subtilis and E. 56. EC 4.7. is also preserved in all the genomes under study. thdF (GTPase protein involved in hypermodiﬁcation of tRNA. genitalium. mnmA can be disrupted in M. where it is composed only of two ␤ subunits. 68.26.7. S1. which is absent in B. coli and are absent in the reduced genomes under study. genitalium. only rnpA (encoding the protein component of RNase P.25). in E.-). The gidA gene was previously described as a gene involved in cell division.4.1. genitalium.3.5). subtilis and E. It has been proposed that only the A and B subunits of the heterotrimeric protein are necessary for its activity. essential for tRNA processing. which is found in the wobble position of some tRNAs. gram-positive bacteria. The GTPase superfamily of cellular regulators is well represented in bacteria. coli (16.VOL. subtilis. suggesting that they play important roles in bacterial cellular systems (11). which are described in the corresponding section). coli some of the mutations can be tolerated in some genetic backgrounds. and improves open reading frame maintenance. Two more genes that are essential in B. we decided to include them in the minimal set.-) will contain the genes that are present in all ﬁve endosymbionts. subtilis essential genes. is absent in M. it can be considered that the smallest set of genes necessary to encode all 20 aminoacyl-tRNA synthases (EC 6. is essential in B.1. MnmA. the glycyl-tRNA synthase is an heterotetramer composed of two ␣ and two ␤ subunits. 55. EC 6. are present in all analyzed bacteria with reduced genomes. have been described as being essential in B. However.1.1. These two grampositive bacteria contain instead the genes that encode the different subunits of the glutamyl-tRNA amidotransferase (EC 6. Obg and YchF. rnpA also appears to be essential in E.70). pneumoniae mutants simultaneously lacking fmt and def (encoding a peptide deformylase) have been obtained (13. and trmU (encoding a tRNA methyltransferase that participates in the hypermodiﬁcation of tRNA. and disruptions in the gene that encodes L28 are viable in this microorganism. EC 3. aureus (22). S. genitalium. Four of the ribosomal proteins present in all ﬁve endosymbionts (L25.1. which includes 31 proteins for the large ribosomal subunit and 19 proteins for the small one.5. whose function is not yet well understood. aureus. 59). genitalium. With respect to the other B. gatB. and truA (site-speciﬁc pseudouridine synthase. Aminoacyl-tRNA synthases for all amino acid present in proteins are represented in all genomes with the exception of only glutaminyl-tRNA synthase (glnS. subtilis. cca (which encodes a tRNA nucleotidyl transferase involved in tRNA repair. coli (72). deaD is the only gene encoding an RNA helicase that has been preserved in the reduced genomes analyzed.1. it has not been included in the minimal gene set. the last of which is missing in M. The largest category of preserved genes corresponds to those involved in protein synthesis (78 genes). subtilis.1. Furthermore. as in M. three genes encoding the GTP-binding proteins EngA.1. iscS. since only homologues of the gatA and gatB genes can be found in all the archaea. coli. coli have been conserved in all ﬁve analyzed endosymbionts and M.29). which is also essential in E. which has also been involved is in vitro 2-thiouridine biosynthesis in E. but considering that a functional glycyl-tRNA synthase can be encoded only by the glyS gene.
M. since it recognizes the UGA codon that. only ThdF (already described in this section) and EngA are present in all reduced genomes analyzed. 37). EngA is a unique GTPase that contains two GTPbinding domains arranged in tandem and that has been involved in ribosome assembly or stability. in this microorganism. and therefore only one prf gene has been included in the minimal set. homologous to vacB).2. The era gene is essential in E. M. tufA has been proven to be dispensable in E.3). efp and hemK are essential in E. since no direct M. an essential gene in B.-) and oligoribonuclease (EC 3. On the other hand.1.4). coli. Four more genes in this category that are nonessential in B. aureus (22. has no orthologous gene identiﬁable in the B. The RNR family is composed of the RNase II (EC 3. essential for the activity of tmRNA in releasing stalled ribosomes from damaged mRNAs and targeting incompletely synthesized protein fragments for degradation. genitalium and E. The smpB gene is also present in all resident genomes analyzed. a highly conserved protein that has been found in every sequenced bacterial genome except those of mycoplasmas. subtilis and E. The rRNA methyltransferase CspR (EC 2. we have included smpB in the minimal gene set. we have included both era and engA in our minimal gene set.-) is also essential in B. prfA and prfB encode the two codon-speciﬁc bacterial release factors RF1 and RF2. or alter translation rates. It has been proposed that methylation could modulate rRNA maturation.70]). ﬂoridanus and W. and therefore we decided to include cspR in our minimal set.5. coli and B. but some studies indicate that it is essential in E.-) types. appears to be dispensable in M. RNase E. subtilis. and a modulator of the release factors activity. The catabolism of RNA molecules is known to encompass a wide variety of reactions and to require a large number of distinct RNases. encoded by the pnp gene). be essential in E. ftsJ (renamed rrmJ ).7. rbfA. as in other enteric bacteria. The orthologous gene can be disrupted in M. is present in all analyzed reduced genomes. initially annotated as a pseudogene due to the absence of a translation start codon (80). All the essential translation factors in B. coli (15). However.1.8. tsf. All of them have orthologous genes in B. coli (69). is required for ribosome recycling.-]. smpB encodes the small protein B. genitalium. encoding a 23S rRNA methyltransferase. are also present in all analyzed reduced genomes and are essential in S. we decide to include this gene in our minimal set. Although both types can normally be found in ␥-proteobacteria.1. and lepA is essential in S. which are nonessential in B. (iii) RNA degradation. genitalium mutant was obtained in the transposon insertion experiment (34). only rnc. since this gene can be found in both gram-positive and gram-negative bacteria. prfB has been lost in M. because these microorganisms. Since both the prfA and prfB genes are paralogous. where another conserved GTPase might perform its function.1.48). At least one 23S rRNA methylase has been identiﬁed in all reduced genomes analyzed. This protein is also present in M. RNase T (EC 3. it possesses an unusual AUU initiator codon.26.6. subtilis and was included in the minimal gene set proposed by Mushegian and Koonin (65). like many other gram-negative bacteria. whose product is involved in 23S rRNA maturation. However. It has been found that its coding gene is cotranscribed with pth in E. subtilis are conserved in all ﬁve endosymbionts. affect rRNA stability. and fusA (EC 3.1.1. coli and S. The status of the infC gene in B. MOL. aureus (22).26. Instead. The PDX family includes polynucleotide phosphorylase (EC 2. Therefore. Since tmRNA has also been identiﬁed in all prokaryotic genomes sequenced (41). glossinidia. subtilis are present in all ﬁve endosymbionts analyzed. is present in all ﬁve endosymbionts but not in M. rluD. 37).3) is a multifunctional endoribonuclease involved in the processing of rRNA precursors and some mRNAs and in the maturation and decay of RNAs. although there is no single conserved exoribonuclease-coding gene. MICROBIOL. subtilis and E. aureus (22.1. The genes tufA (EC 3.-). the major endoribonuclease participating in mRNA turnover. members of the DEDD family. coli (17. subtilis and M. has been reassigned to the tryptophan codon during evolution. which encode elongation factors.1. pneumoniae. genitalium (34) and has not been included in the minimal gene set. genitalium also contains RNase HII (EC 3. The orthologous gene can be found in all three analyzed strains of B. aureus (22). The last conserved B. are exoribonucleases that are found in only some bacteria (including ␥-proteobacteria) and are represented in all ﬁve endosymbionts analyzed. genitalium genomes. was recently changed (28) because. KsgA and RbfA are required for efﬁcient processing of the 16S rRNA. which probably indicates that another GTPase can perform its function in this microorganism. hemK. subtilis essential gene.-. are present in all reduced genomes under study. Very little is known about YchF.1. RNA degradation must be one of the essential functions in any living cell. subtilis. pneumoniae and is nonessential in B. and since it has been proven that a single mutation in RF2 can allow the new protein encoded to recognize all three stop codons (36). coli but not in the ﬁve analyzed endosymbionts. frr. although many of them appear to overlap functionally. although it is also widely distributed.18.104.22.168. Some endoribonucleases are also present in small genomes. aphidicola BSg. only the endonuclease encoded by rnc . which probably reveals that they are essential in bacteria with small genomes. It is essential in E. coli (88) and Salmonella enterica serovar Typhimurium (33). and infC are required for translation initiation and are also essential in E. efp and lepA.526 GIL ET AL.6. and rluD [EC 4. 25) and S. BIOL. have a duplication (tufA and tufB).7. Furthermore. Two more elongation factors. glossinidia lack both of them. coli. infA. Exoribonucleases belonging to three different superfamilies (87) have been identiﬁed in the bacteria analyzed in this study. coli and S. and it is present in all ﬁve endosymbionts analyzed. subtilis that encodes RNase III. genitalium contains only one RNase R (encoded by MG104. it has been proven to be essential in S. and ychF is also essential in E. it can be assumed that a single ancestral RF existed for the direct reading of the three stop codons. and it also cleaves double-stranded RNA. although the real function is poorly understood. 37). aureus (22. aphidicola but not in B. subtilis. Among the GTPases of the Era group. Nevertheless. aureus (22) and H. genitalium. B. B. which acts on RNA-DNA hybrids but is absent in most endosymbionts analyzed. Although it can be disrupted in M. coli. aureus (22).1. ﬂoridanus and W. which links this bacterial GTPase to regulation of protein synthesis and ribosome function. infB. coli. since there is only one release factor in eukaryotes. REV. and three of these genes also have an orthologue in M. However. and S.1) and RNase R (EC 3. inﬂuenzae (1).-. genitalium (ksgA [EC 2. RNase III (EC 3.
and it is present in all analyzed resident genomes. it has not been included either. activating the translocase machinery. therefore. we are taking into account grampositive and gram-negative bacteria simultaneously. and many cocci (54). both of which are preserved in all analyzed reduced genomes: map. subtilis. although the survival conditions of mutants lacking this protein might be diminished (48). Several genes encoding molecular chaperones are also present in all genomes analyzed. Furthermore. Surprisingly. Preproteins depend on chaperones to maintain their translocation-competent state. only one gene in- volved in protein turnover. at least in the chlamidiae (8).21. The presence of a well-structured cell wall is probably not necessary for microorganisms living inside other cells. genitalium genomes because it was considered parasite speciﬁc. Furthermore. Protein Processing. we have also included in our minimal set the widely conserved exoribonuclease encoded by pnp. In addition. while MinE negatively regulates the action of MinCD.88). genitalium. coli and B.5.4. Nevertheless. GroEL and GroES. As mentioned above. The essential subunits of the translocase are the dissociable peripheral ATPase SecA and the integral membrane proteins SecY and SecE. none of the genes that are essential in B. aureus (22). it has been proven that mutant E. The DnaK-DnaJ-GrpE chaperone system. Five genes in this category that are nonessential in B. Cell Structure and Cellular Processes Cell wall. 59). FtsZ. In fact.23. molecular chaperones are quite abundant in reduced genomes. the cell wall might not be necessary for cellular structure. Therefore. In our comparison. coli. and thus we have not included any gene of this category in our proposed minimal set. in a protected environment. MinC and MinD act together to regulate the assembly of the FtsZ ring. it was possible to disrupt groEL in M.4. which encodes a polypeptide deformylase (EC 3. and therefore none of these genes have been included in our minimal set. cells simultaneously lacking def and fmt are still viable (13. Thus. 71) but is conserved in the ﬁve endosymbionts and M. MinE is present in only a few characterized species. H. It has been proposed that FTsZ itself can provide the constrictive force necessary to split the cells. suggesting that FtsZ is sufﬁcient for cell division and that the other proteins serve to coordinate cell wall and septum synthesis. subtilis.53). asteris and can be disrupted in M. inﬂuenzae. subtilis have been conserved in all ﬁve endosymbionts.11. Two signal peptidase are present in all ﬁve endosymbionts and essential in E. gcp (EC 3. genitalium due to its lack of cell wall. and it has been included in the minimal gene set. 2004 MINIMAL BACTERIAL GENE SET 527 is conserved in all bacteria with reduced genomes analyzed in this study. LspA encodes a lipoprotein signal peptidase. Nevertheless. The DivIVA protein acts as the MinE equivalent in B. we have chosen to include these genes in the minimal set.24. Since no lipoproteins are encoded in our proposed minimal set. was found to be essential in B.57). genitalium: hﬂB and lon. Furthermore. In fact.1). Therefore. genitalium. is by far the most highly conserved known cell division protein (54). coli strains lacking the Min system are still viable (86). This gene was eliminated from the minimal genome computationally derived from a comparison of the H. which encode two ATP-dependent proteases. and def. translocation of proteins through biological membranes. oligomeric assembly of proteins. The other genes involved in septum formation are absent in M. HﬂB (EC 3. subtilis and are present in all analyzed bacteria with reduced genomes. The gene coding for the aminopeptidase PepA (EC 3. is also present in M. 68. probably being essential in a genomic environment that lacks most repair mechanisms. However. genitalium. SRP (for “bacterial Signal Recognition Particle”) is a ribosome-bound chaperone that binds to emerging nascent preproteins and interacts with its receptor FtsY. including M. at least one exoribonuclease must be also included in the minimal gene set. genitalium.1. which form the preprotein-conducting channel (60). 37). The preprotein translocase system in bacteria is a large and complex system. and although it is not present in the obligate chlamidia Ureaplasma urealyticum or in P. and so it is not essential for this microorganism.36).4. subtilis. one of the main determinants of division site placement. subtilis.4. However. All three genes are present in the reduced genomes analyzed and are essential in E. pneumoniae. Cell shape and division. coli (4) and S. can be disrupted . where it is necessary for normal cell division and for the maintenance of normal septation. Among them. asteris (70). Nevertheless. have also been described as essential in E. Eleven essential genes involved in cell shape and division have been identiﬁed in B. the endosymbionts have obvious defects in the peptidoglycan structure. coli and B.4.11. some species seem to need only one exoribonuclease (87). coli.18). and MinC and MinD are absent in some species. subtilis. subtilis. coli. subtilis. essential in B. and their degradation. involved in many processes such as protein folding. genitalium has no cell wall.VOL. it also appears to be essential in E. 56. while the protease La (EC 3. Folding. The best-conserved chaperones. six of them are also essential in E. M. to avoid the effect of slightly deleterious mutations (20). coli and B. only ftsZ is present in all resident genomes analyzed. Ffh (the protein component of the signal recognition particle) and FtsY are essential in E. since many of the genes involved in its synthesis are absent or have become pseudogenes in some of them. it appears that another protein might be performing the same function. inﬂuenzae and M. degrades short-lived regulatory and abnormal proteins. The ﬁve endosymbionts also contain the minCDE system. although it is nonessential in E. 75) and B. which is present in all resident genomes analyzed. LspA (EC 3. Therefore. coli (25. Both of them have also been included in our minimal set. it can be assumed that.-) has been involved in degradation of some integral membrane proteins.24. encoded by lon. and it is essential in S. we have included it in the minimal gene set.4. subtilis and E. but only two have orthologues in M. subtilis for the synthesis of teichoic acids has an orthologue in the ﬁve gram-negative endosymbionts. and Secretion Two genes are essential for posttranslational modiﬁcation. def is not present in P. 55. one of these. Protein turnover is another essential function that needs to be considered in a minimal cell. Two alternative gene products can carry out deformylation in B. has been included it in the minimal genome. which encodes a methionine aminopeptidase (EC 3. The GTP binding protein EngB. is nonessential in the free-living bacteria used in this comparison (3. aureus (22. a cytoskeletal protein.
and an ATP binding protein. or cofactors. asteris (70). Some metabolic pathways (such as the biosynthesis of purine nucleotides or redox coenzymes) have been pre- . The B. it appears that at least one or several ABC transporters. B. are present in all bacteria with reduced genomes analyzed except W. Several porins and low-afﬁnity active transporters. fatty acids. Furthermore. and the PTS delivers phosphorylated glucose to the cell interior. It appears that another yet unidentiﬁed phosphate transport system exists in mycoplasmas and in B. This fact might be related to the presence of a less highly structured cell envelope. i. Bacteria with small genomes are unable to synthesize many of their essential metabolites. but none of them is shared in all cases. and different alternative transporters can be included in a minimal cell depending on the nutrients available in the environment or the cell membrane characteristics. and they probably compensate for the reduced transporter spectrum by encoding transporters with broadened speciﬁcity. with the available data. and we have included in our proposed minimal set only a PTS for glucose (since glucose can enter the cell by using different PTS in all analyzed bacteria with reduced genomes. REV. and such molecules must be synthesized by the cell from intermediary metabolites provided by the environment and its own metabolic machinery.528 GIL ET AL. making glucose available as a metabolic substrate) and pitA (to provide the phosphate that is necessary in metabolic reactions). the analyzed bacteria with small genomes differ in the transport systems that have been preserved from the reductive process suffered by their genomes. In fact. Therefore. genitalium and is absent in B.and divalent cations have also been identiﬁed in every microorganism analyzed in this study. It is not known if the presence of different permeases is due to different metabolic requirements and/or environmen- tal conditions or if such permeases are less speciﬁc in these microorganisms and can thus be used to transport other sugars.and divalent cation transporters must be part of a minimal cell genome. Without detailed information about such possible alternatives. Several more or less speciﬁc transporters for mono. Nevertheless. However. can be provided by the environment and are able to permeate the cell membrane. (53). a cell in which the low-molecularmass compounds. Based on the present analysis. and some more or less speciﬁc mono. a major carbohydrate active-transport system that catalyzes sugar phosphorylation and transport across the cytoplasmic membrane. in M. as well as a few ATP binding cassette (ABC) transporters have been identiﬁed in the ﬁve endosymbionts under study. B. a PTS for glucose transport. which can transport a wide variety of sugar monomers. genitalium. All the components of a phosphoenolpyruvate-dependent sugar phosphotransferase system (PTS). it could be expected that many transport systems must be present in order to obtain the necessary molecules. genitalium. a number that is slightly increased in the other endosymbiotic bacteria analyzed and in M. In that case. however. we ﬁrst considered those that have been preserved in the different bacteria with reduced genomes analyzed. aphidicola BCce. although two of its three subunits (MG410 and MG411) can be disrupted. which instead contains an ABC phosphate transporter. We are assuming by deﬁnition that a minimal cell lives in a nutrient-rich medium and that therefore the major metabolites (amino acids. However. However. which are apparently overrepresented compared to the other two subunits in all genomes sequenced thus far (32).or dinucleotides). BIOL. we propose that our hypothetical minimal cell must be able to internalize small molecules that are not highly charged. aphidicola appears to be close to the minimal “free diffusing cell” described by Luisi et al. M. and the only metabolic role for acetyl-CoA would be the synthesis of fatty acids. However. although the speciﬁc permease differs among them. as it has been proven for P. while M. and they must rely on the environment to obtain them. Thus. even the synthesis of a central metabolite such as acetyl coenzyme A (acetyl-CoA) turns out to be dispensable. glossinidia. a permease. subtilis genes necessary for determination of cell shape are not present in all endosymbionts either. As discussed below.e. ABC transporters are heterotrimeric transport systems made up of a speciﬁcity (ligand binding) subunit. while B. including the reduced genomes under study. aphidicola genome. since it is highly dependent on the repertoire of metabolites present in the environment. ﬂoridanus genome encodes the mannose permease. it is not possible to deﬁne which ones should be such transporters. our proposed minimal cell does not contain speciﬁc transporters for charged molecules (such as mono. MICROBIOL. aphidicola encodes the genes for intake of mannitol and glucose and M. nitrogenous bases. some of the metabolic pathways that we include as part of our hypothetical minimal cell might not be fully present in bacteria with small genomes that contain the corresponding speciﬁc transporter or a cell membrane that allows the internalization of such biomolecules. with a smaller genome. as discussed in the previous section. including nucleotides and amino acids. genitalium contains the corresponding glucose and fructose permeases. Energetic and Intermediary Metabolism The deﬁnition of the essential metabolic functions that must be present in a minimal cell is a difﬁcult part of the analysis of a minimal genome. which is being sequenced in our laboratory (unpublished results). only the low-afﬁnity inorganic phosphate transporter encoded by pitA is present in all endosymbionts under study. which might be more permeable to small metabolites. To deﬁne which metabolic pathways should be considered essential in a minimal cell. fatty acids. This situation makes unnecessary the function of entire anabolic and assimilatory pathways of the main biomolecules. and glucose) are available without limitation.. Many ATP binding subunits appear to be “orphan” proteins with unknown partners. although it is not present in the smallest described B. precursors of coenzymes. it is clear that no single transport system is shared by all analyzed bacteria. genitalium has a larger number of ABC transporters. the bioenergetics of our minimal cell is basically fermentative. the presence of many speciﬁc transporters in this bacterium can help to explain why many metabolic pathways for the synthesis of small molecules are missing in this microorganism but have been preserved in its close relative. none of them has been preserved in all analyzed reduced genomes simultaneously. MOL. No orthologous gene has been found in M. de novo biosyntheses of amino acids. aphidicola has a reduced number of genes devoted to such function. Furthermore. Phosphate transport is thought to be an essential function. we postulate that these essential biomolecules are present and available in the medium. The other essential B. Substrate transport. genitalium. ﬂoridanus.
Metabolic precursors of external origin are in gray boxes. but they use different alternative routes. EC 4. A general outline of the metabolic abilities of the proposed minimal cell is presented in Fig. the loss of some genes conditioned the essentiality of other genes that became necessary to conserve a speciﬁc metabolic function. The minimal cell can obtain its more basic components from the environment: glucose. genitalium can be achieved if we consider the action of the PTS transporters.1. pyruvate metabolism. Each box includes the metabolic transformations classiﬁed in major groups of pathways: glycolysis. genitalium. . In this case. nonoxidative pentose-phosphate pathway. phospholipid biosynthesis. S-adenosylmethionine.11) and phosphoenolpyruvate carboxykinase (pck. The phosphorylation of glucose in the other endosymbionts and M. Single continuous arrows represent single enzymatic steps.1. aminoacyl-tRNAs (aa-tRNA). we choose to include in our minimal set the costless pathway that will allow the cell to preserve such metabolic function. CDP-diacylglycerol. tetrahydrofolate. glossinidia. DHAP. Metabolic intermediates and ﬁnal pathway products are in green boxes.1. dihydroxyacetonephosphate. phosphoenolpyruvate. are present in all endosymbionts and in M. glossinidia.3. uracil.e. and synthesis of protein precursors. amino acids. guanine. we have included all genes involved in this pathway in the minimal set. pantothenate. Metabolites acting as a source of chemical energy are in red boxes. The exception is pfkA (EC 2.1. All genes involved in glycolysis. This is a strong indication that W. sn-glycerol-3-phosphate. Although some of the genes involved in these pathways may still be present in all genomes under study. To approach the minimal genome. folate. Abbreviations (besides the accepted symbols and those deﬁned in the text): PEP. fatty acids. Other metabolic pathways are incomplete. synthesis of enzymatic cofactors. gluconeogenesis. EC 3. Gd3P. Since no other pathway for the glucose degradation has been identiﬁed in the small genome organisms under study and since glycolysis also provides precursor metabolites for some anabolic pathways. and the TCA cycle. CDP-DAG. In those cases. served in all analyzed microorganisms. G3P. it is remarkable that the absence of phosphofructokinase is accompanied by the presence of two idiosyncratic activities of the gluconeogenic pathway: fructosebisphosphatase (fbp. which has not been identiﬁed in W. nucleotide biosynthesis. except the one responsible for the phosphorylation of glucose.7. Glycolysis. 1. whereas wide arrows represent several enzymatic steps (the number within the arrow indicates the number of steps). SAM. i. A minimal metabolism.VOL. Arrows with discontinuous lines represent incorporation from the environment.49). adenine.11). 2004 MINIMAL BACTERIAL GENE SET 529 FIG. THF. riboﬂavin. glucose-6-phosphate. synthesizes hexoses from C3 compounds derived mainly from amino acids. they have not been included in the minimal set.. glyceraldehyde-3-phosphate. Lines with a ﬁnal black point indicate the necessity of metabolites for some of the transformations inside the corresponding box. G6P. 1. 68. revealing degenerative processes that are randomly affecting different genes in each genome. Reducing-power cofactors are in light blue boxes. unlike the other described endosymbionts. and coenzyme precursors (nicotinamide. and pyridoxal).
the ﬁnal product of glycolysis. The ydiC gene.1. both NADH:quinone oxidoreductase and cytochrome o:quinol oxidoreductase conserve the redox energy as a proton motive force.5. MOL. we decided to include them in the proposed minimal gene set.1.10. the major route of ATP synthesis might be through substrate-level phosphorylation. The gene encoding this enzyme (ldh) is present in M. genitalium. encoded by rpe (EC 5. E3 subunit) in gram-negative bacteria. succinate:menaquinone oxidoreductase. subtilis are required for menaquinone biosynthesis. respectively.12. aureus (37). glossinidia.1. The third activity in the standard pathway. asteris (70). ribose 5-phosphate isomerase A).6.31). NADH:quinone oxidoreductase (EC 1. or cytochrome oxidase are present.1.4.2..2) and E2 (EC 2. glossinidia). dihydrolipoamide dehydrogenase. led to the conclusion that these two enzymes and the branched-chain 2-oxoacid dehydrogenases (EC 1. and lpdA (EC 1. the ﬁve endosymbionts analyzed present all the enzymes involved in the nonoxidative branch of the pentose phosphate pathway. ﬂoridanus).99. These observations. The pathway allows the synthesis of carbohydrates with different number of carbons. subtilis and S. encoded by zwf (glucose-6-phosphate dehydrogenase. B. Therefore. while W. ﬂoridanus and W.3) in B. aphidicola).3) is present in the three genomes of B. Only the last gene is missing in M.2.8. transaldolase A).4.e. aphidicola and B. genitalium and whose E1 (EC 1. since no NADH dehydrogenase. structurally related to the aa3-type family of cytochrome c oxidases (12). 6-phosphogluconolactonase (EC 3. It should be noticed that 2-oxoglutarate dehydrogenase is the only enzyme of the TCA cycle that is present in all analyzed endosymbionts. aphidicola and in B.3.3.4. genitalium. The pathway is completely missing in P. subtilis. all the components of the ATP synthase have also been preserved in M.1. in M. thus regenerating NADϩ. Acetyl-CoA. and talA (EC 2. E2 subunit).6.4. The NADH dehydrogenase function is performed by a complex of 13 proteins (EC 1. is present in all reduced genomes analyzed and has also been included in the minimal set. whereas the succinate:quinone oxidoreductase complex (EC 1. respectively.1.1. This absence in itself should not represent an interruption of the metabolic ﬂux. but most of them have been lost in the ﬁve endosymbionts analyzed or appear as pseudogenes in B. The cytochrome o:quinol oxidoreductase complex (EC 1. In all the cases. the product of this enzyme.61) subunits are encoded by sucA and sucB. On the other hand. which encodes an NADH dehydrogenase independent of ATP (EC 1. in all three instances via the reduced quinone pool. EC 1. Both the ␣ and ␤ subunits are present in M.6:5.27). Pentose phosphate pathway. Electron transport chain and proton motive force generation. and the complete set of genes for an F0F1-type ATP synthase (EC 3. in our hypothetical minimal cell. acetyl-CoA is not necessary. which indicates that all the endosymbionts are dependent on the host quinones. Pyruvate. aceF (EC 2. since the action of all the above enzymes except this one can also allow the interconversion of sugars of different lengths. although adding their respective metabolites to the culture medium can restore the growth of the mutant strains (45). Thus. Moreover.3). Instead. All three genes are present in the ﬁve endosymbionts analyzed. aphidicola BBp. genitalium. since a proton motive force could be necessary to generate and maintain a negative transmembrane potential. since these endosymbionts maintain different segments of the electron transport chain as a means of oxidizing NADH (as discussed in the next section).1. which indicates that this plant pathogen is using the host pentoses. is missing in the abovementioned endosymbionts.1. and no genes involved in oxidative phosphorylation have been included in the minimal set. Since our proposed minimal cell can obtain these biomolecules from the environment. ﬂoridanus. Nevertheless.1.3. asteris (70). EC 1. from NADH and succinate to molecular oxygen (B.1. REV.1 transketolase). tkt is also essential in B.3. ﬂoridanus. since it is well known that 6-phosphogluconolactone undergoes a rapid spontaneous hydrolysis (61). which is necessary for physiologically normal function of the cell membrane. where it most probably functions as a proton pump consuming ATP. genitalium. E1 subunit).1. which seems to be essential for proper integration of the ATP synthase into the membrane (82).1. NAD[P]H). we can consider that all ﬁve endosymbionts can perform electron transport to energize the membrane from NADH to molecular oxygen (B.2. is also required by 2-oxoglutarate dehydrogenase. is converted into acetyl-CoA by the action of pyruvate dehydrogenase.1) is present in both B.14) are present in all endosymbionts. Although these genes are absent in the recently sequenced genome of P.4) form a family of paralogous genes.2. Therefore.-).2. The third activity.1. aphidicola and B.6. ribulose-phosphate 3-epimerase). we propose that. an enzyme involved in the tricarboxylic acid (TCA) cycle that is maintained in all endosymbionts but not in M. i.44). the genes that encode the pyruvate dehydrogenase have not been included in the minimal set.1. genitalium.530 GIL ET AL. The absence of transaldolase in M. subtilis. rpiA (EC 5. the minimal cell we propose should be able to obtain its energy through substratelevel phosphorylation. genitalium is not a problem. BIOL.1. genitalium but not in the ﬁve endosymbionts analyzed.3. the pyruvate generated by the glycolysis would be reduced to lactate by the L-lactate dehydrogenase (EC 1. is an essential substrate for the synthesis of amino acids and fatty acids. as is performed by the nonreductive phase of the Calvin cycle (67). Thus. but only the ␣ subunit appears to be essential in B. together with the phylogenetic analysis of the bacterial genes encoding the E1 and E2 subunits. glossinidia has only the ndh gene. MICROBIOL. Electron transport chains and oxidative phosphorylation also appear to be essential processes as a source of energy in the ﬁve analyzed endosymbionts. Gram-positive pyruvate dehydrogenase subunit E1 is a heterodimer of ␣ and ␤ subunits encoded by pdhA and pdhB.1.2. Many of the genes in this category that are essential in B. but not in M. tktA or tktB (EC 2.3. as suggested by Mushegian and Koonin (65). We propose that a minimal cell can perform the synthesis of pentoses from trioses or hexoses only with the activities of the .49) and gnd (6-phosphogluconate dehydrogenase. only the components of the pyruvate dehydrogenase appear to be good candidates for inclusion in the minimal set. Only the orthologous genes of sucB (odhB) and aceE (pdhA) are essential in B. and pentoses from hexoses and/or trioses).6.3. a multienzymatic complex that is composed of three subunits encoded by aceE (EC 1. particularly hexoses (or trioses) and pentoses. or from succinate to molecular oxygen (W. ﬂoridanus contain two of three enzymes of the oxidative branch of the pentose phosphate cycle (a pathway that produces reducing power.99. On the other hand.
genitalium genome may explain why nrdE seems to be nonessential in mycoplasmas (34). that is present in all the reduced genomes analyzed. it seems reasonable to assume that B. Purine monophosphate nucleotides can then be synthesized by using two alternative pathways.6. EC 2. The observation that E. encoding a putative ribonucleoside diphosphate reductase that is essential in B. a set of 10 genes coding for a complete biosynthetic pathway of the major bacterial phospholipids (i.4.e. ﬂoridanus provide their host insect with essential amino acids.9). and tkt. subtilis. aphidicola is fragile. ﬂoridanus. The presence of this gene in the small genomes analyzed can be explained because all such genomes contain genes that encode at least one lipoprotein. Among the genes related to nucleic acids metabolism. nonorthologous substitutions.8. However. coli).1). This activity is encoded by the prsA gene. One of the most surprising ﬁndings when the ﬁrst B. Although many of the genes involved in this pathway are essential in E.. and prs (phosphoribosylpyrophosphate synthase. coli. glossinidia. and cardiolipin) from triosephosphate and acyl-CoA is present in W.99. and no genes of this category need to be included in the minimal set. subtilis and E.1). The gene product drives anabolic ﬂuxes by pyrophosphate hydrolysis in various biochemical reactions. EC 1. The nonessentiality of some metabolic steps or the absence of some genes can be explained by metabolic redundancy. tmk (thymidylate kinase.4. The nrdI gene. 68. aphidicola genomes corroborated this ﬁnding.. we propose that the activation should take place through a single step catalyzed by acyl-CoA synthase (EC 6. and some of them are essential in B. is catalyzed in most bacteria by the type II fatty acid synthase system. glossinidia) need a more structured and ﬂexible surface to protect themselves from the host cell. encoding the two subunits of the ribonucleoside diphosphate reductase. glossinidia has lost most of these genes. the fatty acid biosynthesis pathway is incomplete in most reduced genomes analyzed.-) involved in the ﬁrst step of the synthesis of lipoproteins. aphidicola and M. since the only metabolic destination of the fatty acids would be the biosynthesis of phospholipids.17. since none of them has been included in the minimal set. subtilis.4. Although it was annotated as a pseudogene in B. genitalium is glyA. all except one (plsB. In all ﬁve endosymbionts. Fatty acid biosynthesis. it can be assumed that amino acids can be provided by the environment in the small genome microorganisms analyzed. together with trxA and trxB (thioredoxin and thioredoxin reductase. Biosynthesis of amino acids. 1-(5Ј-phosphoribosyl)-5amino-4-imidazolecarboxamide (AICAR) can be synthesized .VOL. and the only gene in this category found in M. these include nrdE and nrdF (or the homologous genes nrdA and nrdB.6. aphidicola and B. We assume for our minimal cell a theoretical lipid bilayer made only of phosphatidylethanolamine.9). and endosymbionts that live free in the cytosol of the host cell (which is the case for B. subtilis.e. rendering a protein of 246 amino acids that has lost only the ﬁrst transmembrane domain. we consider phospholipid biosynthesis one of the essential functions to be performed by a minimal cell.7. lgt.4. coli and B. including all reactions that involve PRPP. composed of a series of small.1). Since phospholipids are an indispensable component of the formation of the membrane lipid bilayer. rpiA.3) encoded by fadD (76).2. the main phospholipid present in bacterial membranes (it represents 70 to 80% of all zwitterionic phospholipids in E. orthologous to the essential prs gene in B. Biosynthesis of nucleotides. However. EC 2. ﬂoridanus and W. phosphatidylglycerol. soluble proteins that are each encoded by a discrete gene (74). genitalium. the ﬁrst stage in membrane lipid biogenesis. 2004 MINIMAL BACTERIAL GENE SET 531 nonoxidative branch of the pentose phosphate pathway encoded by rpe. its presence in the M. Biosynthesis of lipids. all the genes involved in the synthesis of malonyl-CoA have been lost in B. A hypothetical phospholipid biosynthetic pathway from glycerol and exogenous fatty acids has been proposed by Mushegian and Koonin (65) for M. several genes involved in the synthesis of purines and pyrimidines are essential in B. encoded by acpP) is involved in fatty acids uptake and activation (i.7. EC 1.7. subtilis and are present in all analyzed genomes. EC 2. encoding an inorganic pyrophosphatase (EC 3.7. This gene is also involved in the biosynthesis of folate derivatives and has been included in the section an biosynthesis of cofactors (below). has no orthologue in the ﬁve endosymbionts analyzed. probably reﬂecting the lack of such amino acids in the culture media or the absence of the proper transporters to import them into the cell.7) are present in all of them. which encodes a prolipoprotein diacylglycerol transferase (EC 2. In contrast.7.8. albeit with an unreasonably small number of steps.3). EC 2. subtilis. subtilis. The sequencing of two more B.-).1. and that ACP (the acyl carrier protein. However.8) synthesizes 5-phosphoribosyl-alpha-1-diphosphate (PRPP) from ribose 5-phosphate and ATP. Thus. W. maybe due to its prolonged intracellular life inside vacuole-like host-derived organelles. aphidicola genome was sequenced was that this bacterium lacks all genes responsible for phospholipid biosynthesis (except cls. and therefore we have included in our minimal gene set all the genes necessary for the biosynthesis of phosphatidylethanolamine from dihydroxyacetone phosphate and activated fatty acids. the corresponding open reading frame contains a frameshift 29 nucleotides after the start codon of the orthologous gene.1. the starting point for nucleotide biosynthesis. or even the presence of other kinds of phospholipids acting as functional surrogates (14). is conserved in all analyzed reduced genomes but is absent in P.4. B. aphidicola imports them from the host cell.7. All bacteria with small genomes analyzed in this study can synthesize purine nucleotides. genitalium. These ﬁndings probably imply that fatty acids can be provided by the environment in bacteria with small genomes.1. genitalium.8). synthesis of acyl-CoA). These facts probably indicate that the cell surface in B. Several genes necessary for the synthesis of amino acids of the aspartate family are included among the essential genes of B.15) are present in B. However. gmk (guanylate kinase.3. aphidicola BSg (80). phosphatidylethanolamine. since only acpP and acpS (EC 2. lgt has not been included either. pathogenic bacteria. EC 2. and thus the genes involved in the biosynthetic pathways for those amino acids are present in their genomes.8. coli cells lacking the quantitatively major acidic phospholipids (phosphatidylglycerol and cardiolipin) are viable (42) also supports this choice. but there is an alternative start a few nucleotides downstream. but with different metabolic solutions. However. such as M. adk (adenylate kinase.1.6. The ppa gene.7. the gene that encodes cardiolipin synthase. EC 2. The enzyme ribose-phosphate diphosphokinase (EC 2. However. In fact. is also present in all analyzed genomes and is essential in B. asteris (70).
1.1.3. PRPP.3. MOL.1.17). genitalium. purC (EC 6. using the set of activities encoded by hisG (EC 2.2). The kinases encoded by adk and gmk are conserved in all reduced genomes analyzed. glossinidia) genes. and purB (EC 4.2.3). and NADH-dependent reduction.3.1.3. as may also be the case for B. On the other hand. Only a few genes involved in the synthesis of nucleoside monophosphates have been maintained in all of them simultaneously: tmk.22.214.171.124. purE and purK (EC 4. purM (EC 6. aphidicola BAp and BSg.2. REV. to obtain the corresponding deoxyribonucleoside diphosphate (Fig. Other colors are used as in Fig. BIOL. 2. ﬂoridanus and W. since it can use the enzyme hypoxanthine phosphoribosyltransferase (EC 2.16). allowing the synthesis of nucleoside diphosphates from the nucleoside monophosphates.4). Pyrimidine metabolism is not so highly conserved in the analyzed bacteria with small genomes. encoded by pykA. A minimal nucleotide metabolism based on salvage pathways. White boxes indicate the individual enzymatic activity (EC number and coding gene).4. glossinidia. which lack ndk.126.96.36.199. a homologue of purN (58).4.3.-) (68) encoded by purT.31 and 3. AICAR biosynthesis is performed by a standard pathway starting with PRPP and glutamine.1. FIG.7.6). genitalium has a different and costless metabolic solution.7.2.19).21). hisI (EC 3.40). Purine nucleotide diphosphate can be phosphorylated by the kinase encoded by ndk or by the glycolytic enzyme pyruvate kinase (EC 2.4. purD (EC 6. aphidicola and B. aphidicola and M.1).1. pyrG (CTP .2.532 GIL ET AL. The ribonucleoside diphosphate reductase function can be performed by the products of the nrdA and nrdB genes or the nrdE and nrdF genes (depending on the analyzed organism). and U).205) and guaB (EC 6. AMP and GMP are then synthesized from AICAR through a pathway involving the products of the purA (EC 6. 1.5. G. purB (EC 4. MICROBIOL.10) genes and the guaC (EC 1. with the help of the products of trxA and trxB.4.14).2) has been replaced by the enzyme 5Ј-phosphoribosylglycinamide transformylase 2 (EC 2. encoded by the gene hpt) for the synthesis of GMP and AMP from PRPP and guanine or adenine.2. and purH (EC 3.7) (in B. 2).1.2.3. de novo and used for the synthesis of the purine nucleotides. ATP-dependent phosphorylating reactions.3. aphidicola) or guaA (EC 188.8.131.52) (in B. purL (EC 6. ﬂoridanus synthesize AICAR through the four ﬁrst steps of the histidine biosynthetic pathway starting with PRPP and ATP. M.8. In W.4. both B. involving the products of the purF (EC 2. although the step usually catalyzed by the protein encoded by purN (EC 2.6.2) genes.13). Activated ribonucleotides and deoxyribonucleotides are obtained from free bases (A. and his A (EC 5.5. in B.3.1. respectively.
A metabolism of essential cofactors used by the minimal cell starting with precursors (i. glossinidia can obtain the pyrimidine nucleotides from the cytoplasm of the host cell. free uracil can be provided by the environment and used to synthesize all the necessary pyrimidine nucleotides.e.and ε-proteobacteria have only tmk. PP. and some ␦.e. coli are present in all of them. Pyrimidine nucleoside diphosphates cannot be phosphorylated by the kinase encoded by pykA. Biosynthesis of cofactors. is present only in B. 2004 MINIMAL BACTERIAL GENE SET 533 FIG.6.7. EC 6.3.e. . methionine. it can use short and simple salvage pathways rather than long and complicated de novo biosyntheses. ﬂoridanus and M. synthase.14) or tmk (EC 2.4. If a microorganism has free availability of coenzyme precursors (i. vitamins): nicotinamide. probably as nucleoside monophosphates. genitalium. riboﬂavin. Other colors and symbols are as in Fig. pyridoxal. However. aphidicola and W. 1 and 2. vitamins). However. such as vitamins. Yellow boxes indicate the enzymes that need each cofactor for their correct function. i. aphidicola. Biosynthesis of cofactors. Only the cofactors that are necessary for the correct catalytic function of the enzymes present in our proposed minimal cell have been included.2). and folic acid.. 3). At least one of these genes appears in all genomes sequenced to date. This broad speciﬁcity has been demonstrated in several experimental systems including prokaryotic cells (see the primary literature in the BRENDA database).2.4. since we have not included speciﬁc transporters for such molecules in our minimal cell. necessary for the incorporation of free uracil for the synthesis of activated ribonucleotides.4. and it contains only the minimum number of steps to incorporate them as enzymatic cofactors (Fig. using the costless pathway as described in Fig.23) (or its corresponding suggested homologue in M. It appears that B. chlamydiae and mycobacteria contain only cmk. while archaea..VOL. and therefore in our metabolic map the activity encoded by ndk plays an important role in catalyzing the transfer of the gamma-phosphate from the nucleoside triphosphate to a variety of nucleoside diphosphates. encoded by cmk (EC 2. As a general criterion. Most of the genes involved in the biosynthesis of cofactors have been lost in the reduced genomes analyzed.4. B.. subtilis or E. MG268) (65). thiamine. the presence of biosynthetic pathways in bacteria with reduced genomes or their essentiality in laboratorycultured bacteria reﬂects the chemical complexity of the environment.9). patothenic acid. pyridoxal-phosphate. genitalium. we propose that our minimal metabolic chart starts from elaborate molecules.7. 2. and only a few genes included in this category that are essential in B. we consider that in a hypothetical minimal cell. The conversion of pyrimidine monophosphate to pyrimidine diphosphate needs the action of one kinase. and dut (EC 3. EC 2.9). only tmk has been included in the minimal gene set.1. 68. Therefore. The upp gene (uracil phosphoribosyl transferase. 3.
In our proposed minimal metabolism. the synthesis of NADPϩ from NADϩ can be catalyzed by kinases with a wide bacterial distribution. might be performing this function in this microorganism. aphidicola (BAp and BSg strains). and they are present in all analyzed reduced genomes.10methylenetetrahydrofolate with serine as the monocarbon unit donor). We have decided not to include them in a minimal gene set.6). MICROBIOL.2.7. inﬂuenzae may be candidates to perform such function (65). W. First. and a uracyl-DNA glycosylase. and the DNA gyrase is the only topoisomerase included. (ii) A very rudimentary system for DNA repair. such as the ppnK product. subtilis. salvage pathway II. Salvage pathway III. requires extracellular enzymes and membrane transporters to recycle exogenous nicotinamide nucleotides.7. glossinidia.1). The biosynthesis of the essential redox coenzymes NAD(P)ϩ can occur by both de novo and salvage pathways (5).1..1. Both activities are encoded by the gene ribF. aphidicola (BAp and BSg) can perform salvage pathway I (synthesis of nicotinate mononucleotide [NMN] from nicotinate and PRPP).e. i. CONCLUSIONS Based on the conjoint analysis of several computational and experimental strategies designed to deﬁne the minimal set of protein-coding genes that are necessary to maintain a functional bacterial cell.3. glossinidia.2). DNA helicase. Pasteurellaceae. we propose a minimal gene set composed of 206 genes.5.1) genes encode enzymes involved in the monocarbon folate pool interconversion. since we assume that all the reductive steps in the minimal metabolism can use NADH. but considering that this coenzyme precursor can also be provided by the host cell.26) and ﬂavin mononucleotide adenylyltransferase (EC 2. Among them. involved in the phosphorylation of NADϩ to produce NADPϩ (EC 2.17 and 6. Another gene of this category that has been conserved is ppnK.e. polymerase III. The synthesis of S-adenosylmethionine. primase.7. However. Further studies need to be done in order to characterize their gene products. This is the case. For our minimal gene set.6. In B. including the three subunits of the RNA polymerase. which encodes the enzyme responsible for the transformation of tetrahydrofolate to 10-formyltetrahydrofolate. (34) may be explained by the presence of an alternative pathway through the use of ThyA (EC. EC 2. subtilis gene yloS.3). Then adenylyl transferase (EC 2. indicating either that this organism synthesizes or recycles NADϩ by an unidentiﬁed mechanism or that all the steps except one have suffered nonorthologous substitutions. genitalium. whose function has not been characterized.3.7. which is present in B. Such a gene set will be able to sustain the main vital functions of a hypothetical simplest bacterial cell with the following features. subtilis and E. No initiation and recruiting proteins seem to be essential. Musheguian and Koonin suggested that several proteins with unknown function that have an ortholog in H. We suppose that our minimal cell cannot recycle the S-adenosyl-L-homocysteine produced by methyltransferases. the reaction catalyzed by GlyA is essential for charging tetrahydrofolate with monocarbon units (i.3.12) encoded by nadV (57). REV. genitalium could be performing the same function. encoded by nadR (51).1. genitalium by Hutchison et al. and B. which should perform both replication and chromosome segregation functions. folC is also present in all ﬁve endosymbionts. the synthesis of 5. we have not included it in our minimal set. Although both genes are absent in U. (i) A virtually complete DNA replication machinery. dfp. The detection of one glyA mutant in M. B. from methionine is catalyzed by methionine adenosyltransferase (metK. although another gene of unknown function that appears to be essential in B.7. urealyticum (26). Finally. a factor.. The folA (EC 1. SSB. since they can be disrupted (34). an RNA helicase. coli. the gene is not essential in B. present in Enterobacteriaceae. except for ycfF and ycfH. Finally. Three different salvage pathways that are operative in different bacterial groups have been described so for (51). Although it is not present in M. catalyzes the synthesis of NADϩ from NMN.7. ﬂoridanus. the methyl donor in methyltransferases. subtilis.7. 2.23).1. and four transcriptional factors (with elonga- .1.2). However. the pathway may be closed in this species by the presence of the gene fthS (EC.1. subtilis and conserved in M. probably indicating that they perform an essential function in small genomes.35). a gene that encodes a hypothetical conserved protein of unknown function. BIOL. genitalium. at least in in vitro assays. glossinidia. Poorly Characterized Genes There are many poorly characterized genes that were identiﬁed as essential in B.45). and other gram-negative bacteria.2.1. Two strains of B. A few more genes that encode proteins of unknown function but are not essential in B.534 GIL ET AL. composed of one nucleoid DNA binding protein. and W. we propose the shortest pathway to synthesize NADϩ from PRPP and nicotinamide. subtilis are also present in all analyzed reduced genomes. for the reductases involved in the synthesis of ribonucleotides and the monocarbon metabolism (see the primary references in the BRENDA database). gyrase. The de novo biosynthesis from aspartate is present in E.12) is necessary only for the last step in the de novo synthesis of folate. and ligase.4.2. we have decided to include them in our minimal set. ﬂoridanus. NMN is synthesized from free nicotinamide and PRPP by the recently described nicotinamide phosphoribosyltransferase (EC 2. 6.7. we have not included this last gene in our ﬁnal set. which appear to be nonessential in M. (iii) A virtually complete transcriptional machinery. Therefore. MOL. Some other genes in this category that are present in all ﬁve endosymbionts analyzed do not have an orthologue in gram-positive bacteria.4. one exonuclease. only the above-mentioned ppnK gene can be detected. is also present in all analyzed genomes. The synthesis of pyridoxal-5-phosphate (cofactor of glycine hydroxymethyltransferase) from pyridoxal (vitamin B6) is catalyzed by pyridoxal kinase (EC 2.3) and glyA (EC 2.5. Thiamine diphosphate (cofactor of transketolase) can be synthesized from thiamine (vitamin B1) by the action of thiamine pyrophosphokinase (EC 2. coaD.2. coli (43). The product of the B. all of which are present in W. and coaE). including only one endonuclease.1. The synthesis of ﬂavin adenine dinucleotide (cofactor of thiredoxin reductase) from riboﬂavin (vitamin B2) requires the action of riboﬂavin kinase (EC 2. only mesJ. The dihydrofolate synthase (EC 6. the synthesis of CoA from pantothenic acid (vitamin B5) requires the action of ﬁve activities encoded by four genes (coaA.
. since it is impossible to identify any of the abovementioned diverse solutions with the one adopted by the more primitive cells (63). archaea. A genome-scale analysis for identiﬁcation of genes re- . more speciﬁcally the smallest ones (84). we have included in the minimal set only a PTS for glucose transport and a phosphate transporter. antitermination. although some of these biosynthetic pathways are not present in some bacteria with reduced genomes. (xiii) Most cofactor precursors (i. N. The lack of such pathways reﬂects that these bacteria are very dependent on their hosts due to an irreversible degenerative process of their metabolic abilities. Regulation of transcription does not appear to be essential in bacteria with reduced genomes. since we suppose that they can be provided by the environment. Spain (project grupos 03/204). based on our current knowledge. a reduced repair machinery. This work was supported by Ministerio de Ciencia y Tecnolog´ ıa. is a recipient of a contract in the Ramon y Cajal Program from the Ministerio de Ciencia y Tecnolog´ ıa. (iv) A nearly complete translational system. our conclusions must be regarded as provisional. This is especially true in the cases where a bacterium-centered approach is followed. ﬂavin aderine dinucleotide. REFERENCES 1. ﬁve enzymes involved in tRNA maturation and modiﬁcation. Another computational approach obtained by comparative analysis of 21 complete genomes of bacteria. thiamine diphosphate. (viii) The energetic metabolism is based on ATP synthesis by glycolytic substrate-level phosphorylation. (vii) A basic substrate transport machinery cannot be clearly deﬁned. Mekalanos. J. Novick. (xii) Nucleotide biosynthesis proceeds through the salvage pathways. should converge in several solutions (35. Amaya.e. guanine. The proposed virtual self-surviving cell (SSC) contained only 105 protein-coding genes that allowed the cell to maintain protein and membrane structure. Different approaches. J. L. one endopeptidase. and uracil. ribosephosphate isomerase. J.VOL. one must keep in mind that this kind of research has little relevance for the study of the origin of life. 49). Our proposed minimal cell has a similar number of genes devoted to such functions. allowing the synthesis of pentoses (PRPP) from trioses or hexoses. 68. (xi) Lipid biosynthesis is reduced to the biosynthesis of phosphatidylethanolamine from the glycolytic intermediate dihydroxyacetone phosphate and activated fatty acids provided by the environment. and two proteases. this virtual minimal cell is able to maintain metabolic homeostasis but not to reproduce and evolve. and nucleotide and coenzyme syntheses. B. and eukaryotes (48) suggested that a bare-bones set of about 150 genes would be sufﬁcient to maintain basal systems for transcription. (vi) Cell division can be driven by FtsZ only. 12 translation factors. We are sure that future studies will highlight a diversity of minimal ecologically dependent metabolic charts supporting a universal genetic machinery. its receptor. Finally. (v) Protein-processing. Rubin. from PRPP and the free bases adenine. the maintenance of the membrane proton motive force. (ix) The nonoxidative branch of the pentose pathway contains three enzymes (ribulose-phosphate epimerase. E. R. 2002. a small set of molecular chaperones. However. could help to achieve the exciting goal of experimentally constructing a minimal living cell (73). Among the 105 genes proposed for SSC. 97 are included in our proposed minimal set. Several attempts have been made to try to deﬁne the characteristics of a minimal cell. Akerley.. Judson. -folding. although it remains questionable whether such a minimal cell could survive under any realistic conditions. The only functions considered in this virtual cell were glycolysis (using exogenous glucose to obtain ATP). ours among others. Any attempt to universalize the conclusions would necessarily include the comparison with archaeal genomes. and 2 RNases involved in RNA degradation. and replication. ACKNOWLEDGMENTS We thank Antonio Lazcano and Amparo Latorre for their suggestions and critical reading of the manuscript. vitamins) are provided by the environment. and CoA. in a protected environment. at least from a metabolic point of view. and no cell wall. together with the continued efforts in deﬁning the minimal gene set. In this sense. and translation. and transketolase). as described in this paper. and J. and degradation functions are performed by at least three proteins for posttranslational modiﬁcation. A computer model of a such minimal cell was performed a few years ago and was called the E-CELL Project (81). pyridoxal phosphate. phospholipid biosynthesis (from exogenous fatty acids and glycerol). six proteins necessary for ribosome function and maturation (four of which are GTP binding proteins whose speciﬁc function is not well known). In a more technological vein. K. Therefore.G. we also included in our minimal set the genes involved in a nonoxidative pentose phosphate pathway. since we have included a different pathway for this purpose. two molecular chaperone systems (GroEL/S and DnaK/DnaJ/GrpE). we think they should be present in a hypothetical minimal cell able to perform all the necessary reactions to maintain a minimal and coherent metabolic functionality. and transcription-translation coupling functions). Spain (project BFM2003-00305). the development of more sophisticated techniques for genomic engineering. and therefore the minimal gene set does not contain any transcriptional regulators. and Generalitat Valenciana. secretion. 2004 MINIMAL BACTERIAL GENE SET 535 tion. 50 ribosomal proteins (31 proteins for the large ribosomal subunit and 19 proteins for the small one). an intermediate metabolism reduced to glycolysis. The only differences involve the fmt gene (methionyl-tRNA formyltransferase) and the genes proposed in SSC for phospholipid biosynthesis. considering that. Although it appears that several cation and ABC transporters are always present in all analyzed bacteria. six components of the translocase machinery (including the signal recognition particle. Our proposed minimal cell performs only the steps for the syntheses of the strictly necessary coenzymes tetrahydrofolate. the three essential components of the translocase channel. and a signal peptidase). Spain. (x) No biosynthetic pathways for amino acids. Further analysis should be performed to deﬁne a more complete set of transporters. translation. Therefore. transcription. a primitive transport system. a methionyl-tRNA formyltransferase. the cell wall might not be necessary for cellular structure. At any rate. It contains the 20 aminoacyl-tRNA synthases. NADϩ. V. we should accept that there is no conceptual or experimental support for the existence of one particular form of minimal cell. which are obtained from the environment.
USA 99:966–971. C. 18. J. C. J. Martins. Akman. S. A. Zamudio. Fonstein.. L. and W. The Escherichia coli trmE (mnmE) gene. and C. E. 15. Proc. A. 57:203–224. and E. quired for growth or survival of Haemophilus inﬂuenzae. Foulkes. C. F. L. L. Palan. F. Hunter. Zhu. Fuhrmann. M. A genome-based approach for the identiﬁcation of essential bacterial genes. Guarneros. 7. G. F. Hernandez-Sanchez. Knecht. M. Binet. R. V.. W. T. J. J. O. Bacteriol. Gerdes. M. 2001. 1998.. Edgerton. W. C. Tan. Trends Microbiol. Le Coq. and J. Robbins. Kobayashi. Smith. D. Wall. C.. O. Lee. 38. Brown. Dervyn. M. C. F. 26. Dreesen. N. and H. Overbeek. 6:135–139. 1989. 22. van Ham. J. 1998. Bacteriol. J. M. Nat. Sandusky. Gerdes. Small. Mseeh. D. Fayat. R. P. Single amino acid substitution in prokaryote polypeptide release factor 2 permits it to terminate translation at all three stop codons. C. Sutton. S. Green. Bre ´geon. Deuerling. Utterback. M. Merrick. M. S. J. 2002. 2000. 184:4555–4572. F. Biol. Antibiotic activity and characterization of BB-3497. C. G. Immun. I. M. O. White. O’Reilly. and J. D. J. A. Moir. K. Z. P. Baker.. Experimental determination and system level analysis of essential genes in Escherichia coli MG1655. Y. 2002. A. Sci. 31. Andersson. 176:198–205. J. Trawick. Science 286:2165–2169. Bott. 6. A. Meldrum. R. D’Souza. Acad. Brown. and A. Fares. Opin. Matsumoto. F. 4. Nanamiya. Kikuchi. F. Smith. K. Y. Huy. Balazsi. Natl. J. Pragai. Lemieux. MICROBIOL. F. Venter. 2002. N. S. 182:1523–1528. J. H. W. Both genes for EF-Tu in Salmonella typhimurium are individually dispensable for growth. Mseeh. Fraser. S. J. Desgres. O. S. Agents Chemother. D. C. Y. Y. G. H.-Y. Kelley. K. Baev. Gaasterland. M. Y. S. F. and A. L. D. Translation initiation factor IF1 is essential for cell viability in Escherichia coli. 2.. Farrell. Hutchison III. 41. W. Natl. H. Ohlsen. 1998. M. Gadau. Errington. Watanabe.536 GIL ET AL. 2004. Amati.. Weidman. R. A. R. 2002. Gocayne. Harwood. Devine. C. V. 17... Bognar. Keshavjee. B. 28. G. F. 16:851–856. R. K. and A. F. 45:563–570. K. The genome sequence of Blochmannia ﬂoridanus: comparative analysis of reduced genomes. Karamata. MnmA and IscS are required for in vitro 2-thiouridine biosynthesis in Escherichia coli. B. A. C. Rodgers. M. 171:2748–2755. Nature 407:757–762. M. 32:402–407. Transcriptional analysis of the gene encoding peptidyl-tRNA hydrolase in E. S. Cronan. Ho ¨lldobler. Bult. Adams. R. Lauhon. R. R. A. 1998. T. Utterback. Woodnutt. Saudek. Anderson. Fritchman. S. Spriggs. 1999. K. 42. G. M. M. Judson. Albertini. K. I. Cotton. Gardan. D. T. J. Catlin. Cabedo. Georgopoulos. L. J. J. Swanson. 32. Overbeek. Sutton. Baev. Bacterial membrane lipids: where do we stand? Annu. White. Escudero. Feldman. Cruz-Vera. R. R. Kapatral. G. Fraser. Beebakhee. Z. C. Lucier. Fujita. Structural similarities between topoisomerases that cleave one or both strands. Sci. Huynen. C. and G. J. J. J. Fukushima. J. P. Silva. and J. Henkin. A. Fine. and J.. C. D. Shibuya. Moya. M. 2003. Gene 331:153–163. Devine. H. 8. EMBO J. Arch. Mekalanos. Galindo. A genome-wide strategy for the identiﬁcation of essential genes in Staphylococcus aureus.. M. E. 33. R. 43:1387–1400. S. Mechulam. 2003. A. Oshima. Geoghagen. L. J. J. J. Trends Genet.. T. G. 15:2295–2306. Pooley. Dougherty. B. Talabot. 2000. Shapiro. Chepuri. K. Daugherty. Gil. Scanlan. Gross. Bhattacharya.. Barabasi. USA 100: 9388–9393. H. J. 2000. D. G. Hanna. J. M. Begley. R. Gennis. Keiler.. Hattori. BIOL. Microbiol. Gonza ´lez-Candelas. 25. Salama. M. A. Acad. Kamerbeek. Wang. M. M. Disruption of the gene for Met-tRNA(fMet) formyltransferase severely impairs growth of Escherichia coli. T. 36.. Glass. Sci. Warren. Warren. Peitsch. A. 284:9–16. Herring. 68:708–715. Allet. Proc. C. Curr. J. Stragier. 27. Delmotte. P. E. S. Nygaard. Viability of an Escherichia coli pgsA null mutant lacking detectable phosphatidylglycerol and cardiolipin. D.. M. L. R. Gill. 2003.. D. A. Acad. M.. Jamotte. Chem. Fritchman. Zyskind. H. and P. Scholle. 24. M. Rev. P. Acad. K. 16. J. Asai. J. The purB gene of Escherichia coli K-12 is located in an operon. 2000. Sarvas. Microbiology 142:3219–3230. M. E. J. Clements. E. Kirkness.. Clayton. R. D. 21. Genome sequence of the endocellular obligate symbiont of tsetse ﬂies. M. From genetic footprinting to antimicrobial drug targets: examples in cofactor biosynthetic pathways. Glu-tRNAGln amidotransferase: a novel heterotrimeric enzyme required for correct decoding of glutamine codons during translation. F. Shatalin. and Y. Hutchison. Phillips. Cassell. Nakamura. A.. Rockey. M.. O’Rourke. D. A. S. Au. L. G. Miniser. Rice. 3. R. E. Moya. C. A. Zientz. A. R. Weidman. S. J. and F. T. S. D. Imamura. Hedblom. A. 18:7063–7076. M. MOL. P. J. The minimal gene complement of Mycoplasma genitalium. and J. M. D. L. and K. 45. Rosemberg. R. 2002. 20. S. J. Glass. King. E. Tomb. I. E. 265:11185–11192. Y. J. Caldas. D. R. Ogura. McCarthy. N. J. O. Phillips. Sekowska. Mol. P. Lefkowitz. T. A. M. Haga. D. Guarneros. G. Rausell. and R. L. H. and G.. Galizzi. V. 152:205–210. E. Froelich. Kambampati. A. and M. McKenney. H. M. Y. 2001. N. A. J.. R. Bacteriol. D. D. V. J. D. J. Cline. P. Sci. M. C. J. M. Y. 2000. Mol. K. 2002. Global transposon mutagenesis and a minimal mycoplasma genome. Winkler. S. Biochemistry 42:1109– 1117. Mehl. Gnehm. M. Mellado. C.. Thomas. R. Fields. P. and A. A. Simon. Hosoya. D. Nakai. Glasner. Ohanan. USA 95:7876–7881. Bessieres. L. 43. codes for an evolutionarily conserved GTPase with unusual biochemical properties. 8:521–526. D. S. P. J. Molecular basis for the temperature sensitivity of Escherichia coli pth(Ts). Kawamura. a novel peptide deformylase inhibitor. Brignell. T. A. 13. Barrio. M. J. S. R. De ´barbouille ´. 1995. F. M. 2001. Function of the universally conserved bacterial GTPases. S. Hecker. V. Grechkin. J. Radman. and G. S. Kedar. R. 23. Park. J. F. D. J. F. Deckert. S. Gocayne. V. Sekiguchi. . A. M. M. Y. G. Acad. Aujay.. Wang. Chen.. Alteration of RNA I metabolism in a temperature-sensitive Escherichia coli rnpA mutant strain. Arnaud. Acad. Extreme genome reduction in Buchnera spp. L. Huber. Gene replacement without selection: regulated suppression of amber mutations in Escherichia coli. Beckett. Polanuyer. The complete sequence of the mucosal pathogen Ureaplasma urealyticum. 1995. Bunai. Mutagenesis of the folC gene encoding folylpolyglutamate synthetase-dihydrofolate synthetase in Escherichia coli. M. C. T. C. Barynin. Heiner. Biochem. M. C. F. H. J. L. L.. J. Evolution of minimal-gene-sets in host-dependent bacteria. Glodek. O. M. D. T. B. S. Burnham. Y. K. C. Small. L. Uno. M. L. Harrison. N. Schmitter. Aksoy. J. R. Chowdhury. M. Natl. Wigglesworthia glossinidia. Haselbeck. X. S. Daugherty. A. J. Macia ´n. M. Arigoni. V. Bacteriol. L. Kakeshita. Aymerich. Sci. K. Eschevins. W. D. E. structure and mechanism—an overview. R. A. Studer. and G. Fleischmann. S. D. Chapuis. Sadaie. Fillinger. Forsyth. 12. Fass. Infect. R. I. M. 2000. L. Schole. Saudek. Guillon.. C. K. A. Kasahara. Kinsland. F. J. 12:37–43. M. Higgins. Taddei. Genes Dev. C. T. Y.. V. Olsen. Identiﬁcation of critical staphylococcal genes using conditional phenotypes generated by antisense RNA. 1990. S. L. 30. F. Ravasz. K. Trends Microbiol. Natl. Lobell. Constructing a minimal genome. Biophys. Zhang. Chem. Sci. Gelfand. Microbiol. The biosynthesis of nicotinamide adenine dinucleotides in bacteria. Mart´ ınez-Vicente. S. L. T. Osterman. M. 2000. C. M. and Y. Proc. G. Snead. Se ´ror. F. C. Y. Bouloc. M. P. W. Nagakawa. Hu. and S. 1990. G. USA 94:11819–11826. Identiﬁcation of an antigen localized to an apparent septum within dividing chlamidiae. and F. V. 1999. H. Cruz-Vera. R. T. Science 293:2266–2269. FitzHugh. Malik. M. Latorre. S. and G. Clayton. F. Venter. J. Biophys. Res. Carr. E. Kuwana. P. 2001. 2001. Shiba. White. Giles.. Science 269:496– 512. Sabater-Mun ˜ oz. Proc. van Horn. C. M. Transposon-based approaches to identify essential bacterial genes. and S. M. Osterman. Smith. 1992. T. A. K. 40. C. Anantha. D. Fraser. Yuan. Peterson. Hullo. 34. F. V. Hong. The heat-shock-regulated grpE gene of Escherichia coli is required for bacterial growth at all temperatures but is dispensable in certain mutant backgrounds. Cummings. Lenox. I. The sequence of the cyo operon indicates substantial structural similarities between cytochrome o ubiquinol oxidase of Escherichia coli and the aa3-type family of cytochrome c oxidases. Rawlins. 2003. Blattner. Curchod. 14. M. Seegers. 16:116. Kelley. Campbell. K. C. Graham. 1996. Andersen. E. USA 97:7778–7783. 186:1463–1470. Nature 417: 398. Sci. and K. Proc. The FtsJ/RrmJ heat shock protein of Escherichia coli is a 23 S ribosomal RNA methyltransferase. D’Souza. 2000. Silva. E. C. Dougherty. Scott. J. Brandon. Liu. Danchin. R. C. P.: towards the minimal genome needed for symbiotic life. M. J. A. Rapoport. D. Boland. K. Jung. Y. Latorre. E. 1997. 174:4294–4301. and D. J. ABC transporters: physiology. R. W. D. H. Nat. Whittaker. V. Meima. A. So ¨ll. R. 215:41–51. Ruiz-Gonza ´lez. Elena. Whole-genome random sequencing and assembly of Haemophilus inﬂuenzae Rd. Proc. Blanquet. M. H. L. M. K. B. T. Kerlavage. 1992. I. R. P. The complete genome of the hyperthermophilic bacterium Aquifex aeolicus. Bernal. Hershey. R. Kimlova. J. Commun. Villarroya. Kerlavage. A. G. Ishio.. J. Biol. C. Osterman. E. O. Ishikawa. Hughes. Venter. Xu. Kim. Kyrpides. Colot. C. L. Yamashita. 5. 37. B. Koga. Nguyen. Nguyen. and M. W.. USA 95:8165–8169. D. R. Natl. D. 9. J. Berger. and D. Short.. 19. T. E. Saxild. C. Tomb. T. M. III. K. Bult. R. M. M. M. USA 99:4454–4458. Curnow. Rivolta. Translational misreading: a tRNA modiﬁcation counteracts a ϩ2 ribosomal frameshift. Biol. Masson. N. Bacteriol. O.. Ito. D. 44. J. R. 11. 61:103–109. Foster. M. W. 182:371–376. A.-I. Z. Sadaie. T. M. Peterson. Ashikaga. W. S. Nature 392: 353–358. C. K. M. 185:5673–5684. Gil. 1994. G. J. Yamamoto. L. J. D. 1991. S. Shirley. Mosca. coli. Biotechnol. Campbell. and R. Armengod. F. Fish. Res. and A. G. J. Science 270:397–403. M. Natl. Microbiol. C. Koski. Merrick. R. and S. March. A. involved in tRNA modiﬁcation. I. J. H. R. I. Pyne. Bacteriol. H. Biochem. Z. Proc. Drabble. 35. K.. Antimicrob. Kurnasov. and C. Ehrlich. B. tmRNAs that encode proteolysis-inducing tags are found in all known bacterial genomes: a two-piece tmRNA functions in Caulobacter. Bacteriol. Bron. Keller. A. T. J. Loferer. Ishimaru. J. 29. H.. M. C. H. J. W. L. Acad. Costa. O. Ji. E. REV. D. 2003. Christiansen.. Moya. Genet. 10. M. H. Fuhrmann. 22:16414–16419. A. A. Klasson. Horm. Ang. Maue ¨l. Fleischmann. W. D. L. C. Rivas. S. W. Richarme. S. Natl. C. E. J. Oltvai. R. Dorrestein. and M. Williams. D. Somera. J. Microbiology 148:3457–3466. G. M. K. R. E. F. J. Sato. S. Schumann. McDonald. A. M. B. Microbiol. Brown-Driver. Malone. A. Caldon. Adams. C. J. Vitam. A. 2003. 39. and P. GroEL as an enabler for bacterial endosymbiotic lifestyle. D. P. and J. Wood.. Toledo. Fonstein. Moriya. Young. V. L. A. S.
EMBO J. Themes and variations in prokaryotic cell divisio ´n. Wang. Koonin. Rev. J. V. 561–567. 82. P. homologous recombination. D. Zuber. B. P. M. Nat. Miller. In search of the minimal Escherichia coli genome. and M. 2002. H.. Takahashi. N. Wambutt. Stetter. Kunst.VOL. France. 276:34840–34846. Shea. P. S. Rev. Ghim. Sci. The notion of a DNA minimal cell: a general discourse and some guidelines for an experimental approach. Nygaard. Oberholzer. J. S. A. Oudega. R. 142:279–282. Ostedrman. Zumstein. Vandenbol. Bacteriol. L. D. Soldo. Yoshikawa. D. M. J.. U. B. C. H. G. H. 325:103–111. Structure and function of the ftsH gene in Escherichia coli. Horiuchi. Commun. Rocha. Noback. D. S. 1:127–136.. Weitzenegger. A. Miclet. F. F.. Richardson. Trigatti. Nakai. J. J. J. Deamer. V. 15:321–326. Short. Harwood. A. 2000. Yoshikawa. T. and J. C. K.. Sutton. J. Emmerson. K. K. The bacterial SecY/E translocation complex forms channellike structures similar to those of the eukaryotic Sec61p complex. R. Oshima. 1991. Liu. 2004. and J. Viari. 86. M. L. Hackbarth. Polanuyer. Genet. K. Koetter. Tanida. D. M. Namba. Z. Bertero. K. Jones. Gerdes. X. 78. Bieker. M. Konieczny. Graham. B. Goffeau. Bylund.. Moya.-S. M. Smith. Foulger. E. Guiseppi. 58.. Ogura.-J. R. Yamamoto. Reductive genome evolution in Buchnera aphidicola. Kogoma. B. Fujita. S.). E. C. Terpstra. Borriss. van der Laan. M. Kasahara. 85. Noordewier. C. M. L. 67. H. 2003. 1999. Grandi. 50 million years of genomic stasis in endosymbiotic bacteria. M. and T. Kumano. P. and S. 1994. S. Mol. Presecan. Sadaie. Smalley. E. Lin. van Dijl.. A.. Mol.. Deutscher. M. 1996. M. M. E.. Margolin. 1999. Sci. Acad. Harms. A. J. J. R. Luirink. Mora ´n. 65. Wei. H. Koningstein. Nature 390:249–256. Rapoport. and A. Bibbs. 1999. 2001. Mangroo. K. B. and B. G. A.. 2000. Resistance of Streptococcus pneumoniae to deformylase inhibitors is due to mutations in defB. J. Stable DNA replication: interplay between DNA replication. Microbiol. R. C. I. Yu. M. Canba ¨ck. 29:1017–1026. Odile Jacob. S. T. 56. Guy. 63. W. C. 36:27–29. K. A conserved function of YidC in the biogenesis of respiratory chain complexes. Biochemistry 33:2531–2537.-H. and A. Persson. Krogh. T. E-CELL: software environment for whole-cell simulation. Jung. Vagner. Seror. 87. Young. 177:5554–5560. Natl. Wedler. 68. 1995. J. Kurita. and S. S. and M. K. K. Caldwell. and S. NMR spectroscopic analysis of the ﬁrst two steps of the pentose-phosphate pathway elucidates the role of 6-phosphogluconolactonase.. A. E. and J. S. Yata. H. M. 55. Grosjean and R. W. C. Park. H. Wolska. a membrane component of the E. M. J. M. Rock. Rausell. T. 1998. M.. Rey. and K. V. V. 71. Sa ´nchez-Valdenebro. M. B. Denizot. J. Chem. D. Schroeter. Devine. Chen. Hiraga. Mutual suppression of mukB and seqA phenotypes might arise from their opposing inﬂuences on the Escherichia coli nucleoid structure. C. and H. Beeson. K.. Yamane. Moya. R. D. Hullo. F. Microbiol. A. 2004.. 45:2432–2435. M. S. Acad. Trends Microbiol. K. 46:7–17. Chem. P. Natl. Scoffone. 1991. E. Smith. 2003.-C. 2001. 2003. Suzuki. Vannier. Lardinois. E. Takemaru. Mele ´ndez-Hevia. A. S. D. 2000. APS. J. C. V. 1994. J. Andersson. J. S. Tacconi. Margolin. Newton. Rose. K. Abascal. 76. 1999. In H. Y. 68. 2001. 1:99–116. R. Rev. Agents Chemother. Kakizawa. A. Zuurmond. F. 72.-F. J. Rivolta. Escherichia coli defects caused by null mutations in dnaK and dnaJ genes. 49. Uchiyama. E. Wedler. Itaya. USA 93:10268–10273. H. P. E. S. 61. Miyoshi. Lazarevic. Yata. H. Kamerbeek. V. J. L.. C. Parro. Holsappel. W. Why are the genomes of endosymbiotic bacteria so stable? Trends Genet. Waters. E. 80. Yuki. Adams. O. A. Acad. M. Braun. Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. Galizzi. Jime ´nez. Dasgupta. La vie explique ´e? 50 ans apre `s la double helix. Ogasawara. Biol. A. 66. W. 20:123–128. A. Bacteriol.-M. Daniel. 11:6–8. Me ´ne ´tret. 1997. V. and D. Paciorek. 75. Fujita. Acad. Chem. EMBO Rep. Joris. 52. Sato. Simon. A. P. M. B. Bolanos. B. J. J. U. Biochemical demonstration of the involvement of fatty acyl-CoA synthetase in fatty acid translocation across the plasma membrane. H. 53. A. H.. T. Reynolds. Gerber. 151:121–128. and J. T. Acta 85:1759–1777.. O. D. White. C. Borchert. B. The genome of Nanoarchaeum equitans: insights into early archaeal evolution and derived parasitism. Klasson. K. Genet. 184:6906–6917. How many genes can make a cell: the minimal-gene-set concept. I. coli K-12 can be deleted without loss of cell viability. Mulks. 74. Lallemand. A. Mol. Antimicrob. 1999. P. K. 24:531–548. A. M. 57. Mushegian. M. H. Kobayashi. Essential Bacillus subtilis genes. Galleron. Biol. Glaser. Dufﬁeux. Lauber. van Ham. H. E. B. F. Latorre. K. Schatz. 2003. Driessen. Watabe. J. Microbiol. F. J. D. W. Danchin.. Comparative genomics. T. Sekiguchi. T. Keller. Oudega. X.. M. 2001. J. 2000. I.. M. ten Hagen-Jongman. and F. D. 83. minimal gene-sets and the last universal common ancestor. Conway. Bolotin. A. Mathur. A. 1997. G. Latorre. Cummings. 1997. T. E. Pochet. 274:22143–22146. K. Yoshida. Haga. J. Lapidus. D. Artiﬁcial cells: prospects for biotechnology. Ogiwara. B. B. 50. Ahel. Moestl. White. M. Res. Trias. Y. Network organization of cell metabolism: monosaccharide interconversion. Nature 407:81–86. He ´naut. M. G. A. 2003. T. 1997. Nun ˜ o. 1993. S. Tamames. Lobacz. Viguera. Portetelle. J. 285:1789–1800. and M. S. M. M. Brouillet. S. V.. S. J. J. M. Montero. K. 2000. C. R. Margolis. G. Porwollik. Bjo ¨rk. Bioinformatics 15:72–84. M. B. Du ¨sterho ¨ft. Microbiol. Y. Na ¨slund. Connerton. T. Alloni. Miki. M. S. Albertini. Ogawa. Schleich. and M. I. F.. 19:176–180. A. D. 81.C. Sci. Maue ¨l. Benkovic. Bacteriol. Benne (ed. R. Golightly. J. Acta Microbiol. FEMS Microbiol. S. Tam. Klaerr-Blanchard. N. T. Cloning and characterization of a new purine biosynthetic enzyme: a non-folate glycinamide ribonucleotide transformylase from Escherichia coli. Acad. E. Noone. Schmelter.-Y. D. R. and A. Silhavy. Moran. G. F.-K. A. T. Genomics Hum. Kraal. C. R. Choi. N. a distinctive enzyme of eubacterial translation. Saito. T. Mol. A. Science 296:2376–2379. K. 2004 Takamatsu. O’reilly. V. Proc. Hashimoto. Purnelle. G. Hilbert. C. H. C. Michels. Brans. and D. 2002. J. Y. Kardys. Res. Rundlof. Ferna ´ndez. and D. R. E. Forty years of bacterial fatty acid synthesis. A. 2002. Matsuzaki. T. Yamane. 175:3591–3597. 34:157–168. M. S. EMBO J. Tanaka. Yuan. T. Agents Chemother. 51. M. p. O. Hutchison III. Takagi. R. USA 100:5801–5806. Leung. Yoshida. Trends Biotechnol. Nat. S. S. M. USA 100:12984–12988. Nishigawa. Ananta. and J. . C. M. Lee. Azevedo. A. Yuan. G. Jackowski. Yamamoto. Nakata. Biol.. E. Mol. F. Niki. and A. F. Microbiol. L. R. D. A. J. Sandstro ¨m. Mori. 84. Rieger. K. Karamata. D. M. R. Proc. K. Bacteriol. Wernegreen. K. Koonin. C. 2002 Ribosylnicotinamide kinase domain of NadR protein: identiﬁcation and implications in NAD biosynthesis. Sloutsky. Entian. V. Tosato. Wipat. Errington. P. Genetic locations and database accession numbers of RNA-modifying and -editing enzymes. Morimura. M. H.. T. Shimizu.. T. D. Venter. M. Silva. Bron. V. S. Biochem. Natl. Nouwen. Yasumoto. Lopez. C. Tanaka. T. C. Opperdoes. Ottemann. Peptide deformylase in Staphylococcus aureus: resistance to inhibition is mediated by mutations in the formyltransferase gene. J. The complete genome sequence of the Gram-positive bacterium Bacillus subtilis. M. J. Y. Moszer. and T. Strategies for helicase recruitment and loading in bacteria. 64. Thomaides. Yamamoto. Zuo. W. Ogasawara. Luisi. Vassarotti. Lazcano. Whiteley. E. Pohl. Shin. Akey. O. Shigenobu. P.-F. Takeuchi. H. S. Fabret. V. N. Biol. Proc. Winters. L.. A. F. Natl. Morange. Masuda. A. Tognoni. S. L. Me ´digue. Exoribonuclease superfamilies: structural analysis and phylogenetic distribution. Donachie. MINIMAL BACTERIAL GENE SET 537 46. Capuano. Sakaki. J. Mazel. Bessie `res. 88. M. Serror. P. Margolis. D. Modiﬁcation and editing of RNA. and E. G. K. B. 2003. N. Valencia. Postigo. Functional analysis of the ffh-trmD region of the Escherichia coli chromosome by using reverse genetics. Bastolla. USA 100:4678–4683. Trias. Podar. 2003. K. Codani. Antimicrob. M. G. I.. M. K. Takeuchi. Helv. J. C. J. Wikstrom. C. Urbanus. Hagervall. Y. Either of the chromosomal tuf genes of E. A. Pujic. M. E. Mangroo. Ishikawa. K. Hackbarth. Breitling.. Winkler. S. Hosono. Klein. J. 44:1825–1831. R. M. A minimal gene set for cellular life derived by comparison of complete bacterial genomes. Barnstead. 62. J. and J. Mizuno. 73. Moran. Brignell. P. 77. G. Silva. Carter. Y. Yoshikawa. Yugi. Martin. Rapoport. S. H. Lifestyle evolution in symbiotic bacteria: insights from genomics. and F. P. K. 69. R. Pohorille. Arashida. Kretz. 292:1155–1166. Medina. Y. 59. I. 47. L. 79. 48. Sci. K. Formylation is not essential for initiation of protein synthesis in all eubacteria. Sorokin.. S. T. Palacios. and S. 183:1168–1174. Prescott. Masslewski. Wang. J. 2003. Proc. S. Microbiol. I. Nordstro ¨m. Boursier. Biochem. E. Gen. Tomita. M. 13:914–923. 61:212–238. Reductive evolution suggested from the complete genome sequence of a plant-pathogenic phytoplasma. Pe ´rez-Iratxeta. Mellado. L. Isono. T. J. and W. Koonin. K. Z. 1999. 279:24163–24170. Yamamoto. H. E.. P. Tomoyasu. 2002. S. Nucleic Acids Res. A. and transcription. and S. 260:603–607. Tamas. J. N. Sci. Genetic characterization of polypeptide deformylase. F. and A. Y. Scanlan.. Proc. Beckwith. and S. Pol. FtsZ ring clusters in min and partition mutants: role of both the Min system and the nucleoid in regulating FtsZ ring localization. Rev. Ugaki. A. P. Begg. Stoven. One of three transmembrane stretches is sufﬁcient for the functioning of the SecE protein. A. Biophys. Evol. K. D.. Y. Wernegreen. Meyer. and N. 2003. Ferrari. Y. S. 70. Bruschi. American Society for Microbiology. and P. 54. P. Mori. Weitao. M. Functional genomics of Escherichia coli in Japan. J. J. and T. A. E. Miyata. J. Tamakoshi. Natl. USA 100:581–586. J. A. C.. Eriksson. coli secretion machinery. S. A. H. 32:315–326. Berg. Biol. Takahashi. M. Levine. R. E.. S. N. S. A. Identiﬁcation of a plasmidencoded gene from Haemophilus ducreyi which confers NAD independence. K. G. H. Evidence for a novel glycinamide ribonucleotide transformylase in Escherichia coli. P. Res. Marliere. Trends Ecol. Creuzenet. Maniar. Kursanov. Ehrlich. Ni. Soll. Chim. A. J. Paris. Wipat. P. Hohn.-Y. 60. S. and C. Haiech. Roche. Annu. N. Genet. and P. R. L. B. Washington. Watanabe. Sekowska. K. Fritz. 4:37–41. Hattori. C.. C. 2000. G. M. Fuma. J. N. 10:1749–1757.
This action might not be possible to undo. Are you sure you want to continue?