
Available online at www.sciencedirect.com

ScienceDirect

Advances in gap-filling genome-scale metabolic models and model-driven experiments lead to novel metabolic discoveries
Shu Pan1,2 and Jennifer L Reed1,2

With rapid improvements in next-generation sequencing technologies, our knowledge about metabolism of many organisms is rapidly increasing. However, gaps in metabolic networks exist due to incomplete knowledge (e.g., missing reactions, unknown pathways, unannotated and misannotated genes, promiscuous enzymes, and underground metabolic pathways). In this review, we discuss recent advances in gap-filling algorithms based on genome-scale metabolic models and the importance of both high-throughput experiments and detailed biochemical characterization, which work in concert with in silico methods, to allow a more accurate and comprehensive understanding of metabolism.

Addresses
1 Department of Chemical and Biological Engineering, University of Wisconsin-Madison, 1415 Engineering Dr, Madison, WI 53706, United States
2 Great Lakes Bioenergy Research Center, Madison, WI 53706, United States

Corresponding author: Reed, Jennifer L (reed@engr.wisc.edu)

Current Opinion in Biotechnology 2018, 51:103–108
This review comes from a themed issue on Systems biology
Edited by Nathan Price and Eran Segal
For a complete overview see the Issue and the Editorial
Available online 23rd December 2017
https://doi.org/10.1016/j.copbio.2017.12.012
0958-1669/© 2017 Elsevier Ltd. All rights reserved.

Introduction
A genome-scale metabolic model is a mathematical representation of the metabolic capabilities of an organism, which is inferred primarily from genome annotations. Such models have shown great utility in predicting biological capabilities, metabolic engineering, and systems medicine [1–4]. A draft model is often generated automatically by software platforms, which use genome annotations of a specific organism and connect genes to metabolic reactions using reference databases [5,6]. A draft model has to be further refined and evaluated in multiple steps to ensure its quality [5,6]. This refinement and evaluation process includes gap-filling, which improves the network connectivity by modifying content of the metabolic model. Gap-filling analyses can lead to discoveries of missing reactions, unknown pathways, unannotated and misannotated genes, as well as promiscuous enzymes and underground metabolic pathways. Classic gap-filling algorithms have been reviewed previously by Orth and Palsson [7]. These algorithms generally include three steps: detecting gaps, suggesting model content changes (i.e., add/remove reactions, change biomass compositions, or change reaction reversibility), and identifying genes responsible for the gap-filled reactions (Figure 1). In the first step, gap-filling algorithms identify dead-end metabolites (metabolites which cannot be consumed or produced in the network), and/or inconsistencies between model predictions and experimental data (e.g., growth phenotypes). In the second step, they solve for a set of reactions from metabolic databases of potential reactions that, if added to the metabolic model, 'activate' dead-end metabolites or resolve the inconsistencies. In the third step, some gap-filling algorithms discover genes that could be responsible for these reactions, which can be further tested biochemically or genetically. A simple gap-filling example is illustrated in Figure 2. Here, we first review recent gap-filling methods, which are more efficient or operate under different assumptions. Then we discuss how advances in experimental techniques have significantly advanced gap-filling methods by identifying model-data inconsistencies. Finally, we describe recent studies that have used gap-filling analyses to discover the promiscuous functions of enzymes.

Advances in gap-detection and reaction-addition algorithms
Some recent algorithms aim to detect and fill gaps more efficiently than earlier gap-filling algorithms [7]. For example, FASTGAPFILL [8] is a scalable algorithm that computes a near minimal set of added reactions for a compartmentalized model. Another method, GLOBALFIT [9], reformulates the mixed integer linear programming problem of gap-filling into a simpler bi-level linear optimization problem. It efficiently identifies the minimal set of network changes needed to simultaneously correct multiple in silico predictions that are inconsistent with in vivo observations. Meneco [10] and a hybrid metabolic network completion algorithm [11] reformulate the reaction-addition problem using answer set programming, a declarative programming paradigm intended to solve difficult combinatorial search problems. Their usage of answer set programming allows for stoichiometry constraints to be


Figure 1

[Figure: Gap-filling and promiscuous enzyme discovery, with the input data and computational algorithms used at each of three steps. Detect gaps: dead-end metabolites, in vivo / in silico inconsistencies, metabolomics data, knockout phenotypes (e.g., Tn-seq). Suggest model content changes: databases of potential reactions, network topology, 13C fluxomics data, metabolomics data. Find genes and change GPR: sequence similarity, genomic-context data, genomic functional selections, knockout phenotypes (e.g., Tn-seq), biochemical characterization. The diagram links gene x, gene y, and gene z to protein X, protein Y, and protein Z and to reactions x, y1, y2, and z, illustrating gap-filled reactions, promiscuous enzymes, and alternative pathways. Legend: open gaps; closed gaps using gap-filled reactions; closed gaps using promiscuous enzymes; closed gaps using alternative pathways.]

Steps, input data, and computational algorithms of gap-filling. First, dead-end metabolites, in silico and in vivo inconsistencies, metabolomics
data, and knockout phenotypes allow detection of gaps in metabolic models. Then, the model content (i.e., reactions and biomass compositions)
is changed to resolve these inconsistencies. In this step, missing reactions can be added from databases, and network topology analysis can rank
these potential reactions. Metabolomics and 13C fluxomics data could also suggest reactions that should be included in the model. Finally, the
genes responsible for the filled gaps are identified using sequence similarity, genomic-context data, genomic functional selections, or knockout
phenotypes and are verified by biochemical characterization. Similarly, promiscuous enzymes and underground metabolic pathways can also be
identified when analyzing the gaps in the models.
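
The gap-detection step described in this caption can be illustrated with a minimal sketch. It assumes a toy network stored as a dictionary of reaction stoichiometries (negative coefficients for consumed metabolites, positive for produced ones); the function name `find_dead_ends`, the reaction names, and the toy network are hypothetical and are not taken from any of the reviewed tools. The check is purely local: it only asks whether any reaction could produce or consume each metabolite.

```python
def find_dead_ends(stoich, reversible=frozenset()):
    """Flag metabolites that no reaction can produce, or that no reaction can consume.

    stoich: {reaction_id: {metabolite_id: coefficient}}, where a negative
            coefficient means the metabolite is consumed in the forward direction.
    reversible: reaction ids that may also run in the reverse direction.
    """
    produced, consumed, metabolites = set(), set(), set()
    for rxn, coeffs in stoich.items():
        is_rev = rxn in reversible
        for met, coeff in coeffs.items():
            metabolites.add(met)
            # A product is produced; a substrate of a reversible reaction can be too.
            if coeff > 0 or (is_rev and coeff < 0):
                produced.add(met)
            # A substrate is consumed; a product of a reversible reaction can be too.
            if coeff < 0 or (is_rev and coeff > 0):
                consumed.add(met)
    return {
        "never_produced": sorted(metabolites - produced),
        "never_consumed": sorted(metabolites - consumed),
    }


# Hypothetical toy network: C is made but never used, D is used but never made.
toy_model = {
    "uptake_A": {"A": 1},
    "r1": {"A": -1, "B": 1},
    "r2": {"B": -1, "C": 1},
    "r3": {"D": -1, "E": 1},
    "biomass": {"E": -1},
}

dead_ends = find_dead_ends(toy_model)
print(dead_ends)  # {'never_produced': ['D'], 'never_consumed': ['C']}
```

A connectivity check of this kind does not capture metabolites that are connected on paper but still cannot carry steady-state flux. Detecting those would require a flux-based analysis, which this sketch does not attempt.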

violated, potentially resulting in solutions which are less biased by the inaccurate stoichiometry of a model. The hybrid metabolic network completion approach combines answer set programming with linear stoichiometry constraints, and offers a better solution for restoring highly degraded models [11].

Algorithms, such as GAUGE [12], have been developed to exploit alternative mechanisms for gap-identification. GAUGE exploits flux coupling analysis (FCA) [13], which detects how two reactions depend on each other. Using FCA, GAUGE finds gaps involving genes that are associated with fully dependent reactions but show uncorrelated expression patterns. However, GAUGE can only analyze a subset of a model where gene-protein-reaction (GPR) associations are defined and isozymes or multifunctional genes do not create possibilities for uncorrelated gene expression patterns.

Some algorithms exploit alternative mechanisms for adding reactions. The novel algorithm DEF [14] is based upon filling reactions to 'activate' dead-end metabolites in a manner similar to eukaryotes engulfing mitochondria to find the most efficient pathways for consuming oxygen. Following this quasi-endosymbiosis theory, DEF aims to add reactions that maximize production/consumption of dead-end metabolites in the original model. A DEF solution could contain more reactions that are biologically reasonable compared to a parsimonious gap-filling solution, which often is a minimal set of reactions.

Inherently different from all algorithms mentioned above, pattern-based gap-filling algorithms do not contain an explicit gap-identification or reaction-addition step. In a metabolite pattern and probabilistic method [15], hidden Markov models (HMMs) are used to rank potential gap-filled reactions by how closely they are related to the network. In MATBoost and BoostGAPFILL [16,17], a training incidence matrix, S, with artificial gaps is created by deleting reactions randomly from a network. Then a machine learning technique, matrix factorization, completes the missing entries creating


Figure 2

[Figure: four panels, each with a 0/1 matrix over metabolites A to D where shown. (1) Growth phenotyping: optical density versus time for growth on metabolites A, B, C, and D. (2) Draft model with missing reactions and inaccurate gene-protein-reaction associations: gene a, gene c, and gene d encode protein A, protein C, and protein D, which catalyze reaction a, reaction c, and reaction d. (3) Reaction addition from databases and biochemical characterization, shown for gene b and gene d. (4) Improved model with gap-filled reactions and promiscuous enzyme added: gene a, gene b, gene c, and gene d encode protein A, protein B, protein C, and protein D, which catalyze reaction a, reaction b, reaction c, and reaction d.]

An example of the iterative process of laboratory and computational experimentation in gap-filling. As revealed by high-throughput growth
phenotyping experiments, the wild type can grow on metabolites A, B, or D as a sole carbon source. Additionally, a knockout mutant of gene c
that is known to catalyze reaction c can grow on metabolite C. Contrarily, the model incorrectly predicts that the wild type cannot grow on B and
the gene c mutant cannot grow on C, indicating that reaction b is missing from the model and there exists another enzyme that can catalyze
reaction c. To fix these inconsistencies, gap-filling adds reaction b from databases and finds that gene d has sequence similarity to gene
c. Further biochemical characterization confirms that protein B is responsible for reaction b and protein D has weak activities towards reaction c.
The model is then updated and compared to additional experimental datasets to allow iterative metabolic discoveries and to improve prediction
accuracy.
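
The comparison of model predictions with growth data in this example can be sketched in a few lines. The sketch assumes a deliberately simplified growth rule: the model "grows" on a substrate only if it contains a reaction using that substrate and at least one gene for that reaction survives the knockout. Real gap-filling methods would use a flux-based simulation instead. The names `predicts_growth`, `find_inconsistencies`, `draft_model`, and `improved_model` are hypothetical; the gene and metabolite labels follow the caption.

```python
def predicts_growth(model, substrate, knocked_out=frozenset()):
    """Toy growth rule (an assumption, not flux balance analysis).

    model: {substrate: set of genes whose products catalyze the reaction
            that uses this substrate}.
    """
    if substrate not in model:
        return False  # the reaction is missing from the model
    return any(gene not in knocked_out for gene in model[substrate])


def find_inconsistencies(model, observations):
    """Compare predictions with (substrate, knocked_out_genes, observed_growth)."""
    found = []
    for substrate, knocked_out, grew in observations:
        predicted = predicts_growth(model, substrate, knocked_out)
        if grew and not predicted:
            # Growth in vivo, no growth in silico.
            found.append((substrate, sorted(knocked_out), "false negative"))
        elif predicted and not grew:
            # No growth in vivo, growth in silico.
            found.append((substrate, sorted(knocked_out), "false positive"))
    return found


# Draft model: reaction b is missing and only gene c is linked to reaction c.
draft_model = {"A": {"gene_a"}, "C": {"gene_c"}, "D": {"gene_d"}}

# Observations taken from the caption: wild type grows on A, B, and D;
# the gene c knockout still grows on C.
observations = [
    ("A", frozenset(), True),
    ("B", frozenset(), True),
    ("D", frozenset(), True),
    ("C", frozenset({"gene_c"}), True),
]

print(find_inconsistencies(draft_model, observations))
# [('B', [], 'false negative'), ('C', ['gene_c'], 'false negative')]

# Improved model: reaction b added (gene b), gene d also assigned to reaction c.
improved_model = dict(draft_model)
improved_model["B"] = {"gene_b"}
improved_model["C"] = {"gene_c", "gene_d"}

print(find_inconsistencies(improved_model, observations))  # []
```

Keeping the observations as plain tuples makes it easy to append further phenotyping results and rerun the comparison, which mirrors the iterative loop described in the caption.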

matrix A. Finally, an integer least squares optimization selects reactions from a database that best match A. These steps, which are based on the network pattern alone, cannot fill in all the gaps in the network. However, they can rank and set weights of reactions in databases and work in conjunction with other gap-filling algorithms such as FASTGAPFILL [8] to improve their gap-filling accuracy.

Unlike the above algorithms that add reactions to an incomplete metabolic model of any organism, SONEC [18] works for an organism in a metagenomics sample. It is initialized with a bin of contigs for each organism in the sample and a bin for unassigned sequence fragments. Then it computes metabolite connectivity scores between the unassigned fragments and each organism to determine which organism these fragments belong to.

The abovementioned gap-detection and reaction-addition algorithms are used in the first two steps of gap-filling (Figure 1). They can hypothesize new network components including gap-filled reactions. After the two steps, genes are assigned to gap-filled reactions (Figure 1). This last step of gap-filling is a very important yet under-explored step that should be supported with additional experiments.

Advances in gene-assignment algorithms and approaches
Knowing the gene sequence(s) responsible for a reaction is extremely useful in applications using genetic manipulation or drug discovery. Several algorithms that have been previously reviewed [7] provide bioinformatics solutions to match gene sequences to gap-filled reactions. These methods could incorporate data such as sequence similarity, co-expression, chromosomal proximity, and phylogenetic profiles. Recently, GLOBUS [19] has combined these data with a global probabilistic approach to provide probable annotations including possible alternate functions. Another novel algorithm, GO-MEP [20], identifies biologically relevant groups of proteins (e.g., proteins in the same pathways, protein complexes, proteins with the same localization, and physically interacting protein pairs) and has shown improved accuracy in identifying missing genes.

An alternative approach for gene assignment is to directly incorporate gene assignment in the gap-filling procedure. In MIRAGE [21], phylogenetic profiles and gene expression are first used to estimate the likelihood that a specific reaction in the database can be added. In the next step, starting from a model with all reactions in the database added, a new model is created to determine the set of gap-filled reactions by removing low likelihood reactions while keeping the desired model properties. Another likelihood-based gap-filling algorithm [22], together with its web service (ProbAnnoWeb) and standalone Python package (ProbAnnoPy) [23], has been developed to directly incorporate gene assignment into gap-filling. In ProbAnnoWeb and ProbAnnoPy, an organism-specific likelihood score is assigned to each reaction in a database of potential reactions. These scores are calculated based on the BLASTp results of genes in the organism's genome against high confidence functional annotations. Reactions with higher scores are favored in the objective function. Therefore, a ProbAnno solution might contain more reactions than a parsimonious gap-filling solution. Similar to ProbAnno, there exists another BLAST-weighted gap-filling algorithm [24] in which reactions are added from weighted biochemistry databases.

Ultimately, all gap-filled gene-reaction associations should be evaluated in subsequent laboratory experiments. For example, a change in gene expression might be measured by quantitative PCR (qPCR), or growth phenotypes of knockout mutants might be tested under different media conditions. Alternatively, the product of an enzyme's proposed reaction could be measured by analytical chemistry methods. Recently, an integrated computational and experimental approach, MEGS [25], has been developed that allows the direct and rapid identification of genes encoding gap-filled reactions. In MEGS, gap-filled reactions in a model are identified with a classic gap-filling procedure by resolving model and experiment inconsistencies. A genomic functional selection can then be designed using a forced-coupling algorithm [26] such that a recipient strain from a well-characterized organism can only grow in a selective medium if it acquires the gene(s) encoding the gap-filled reaction. Finally, a genomic library from a donor strain catalyzing the proposed gap-filled reaction is transformed into the recipient strain and a genomic functional selection is performed to identify the gene(s) of interest. Using MEGS, five genes were successfully identified for the gap-filled reactions in the draft model of the marine bacterium Vibrio fischeri.

Identifying inconsistencies between model predictions and high-throughput phenotyping experiments
Laboratory experiments are important not only for validating gap-filling solutions, but also for identifying the metabolic gaps themselves. Cheaper and more available high-throughput experimental datasets, which can be compared to in silico predictions, have also made gap-filling analyses increasingly powerful [27,28]. For example, gap-filling algorithms could improve the quality of a metabolic model by resolving incorrect growth phenotype predictions for knockout mutants under different medium conditions. Currently, phenotypic microarrays and robotic instruments [29] can test growth of cells in fifty 96-well plates for hundreds of different media conditions. RB-TnSeq has also allowed rapid quantification of genome-wide mutant fitness in 387 successful assays [30]. With improved genome editing tools such as MAGE [31,32], pORTMAGE [33], and CRMAGE [34], more knockout libraries will be added to the current collections [35–41] in the near future. Such libraries can be rapidly tested under a variety of conditions to identify inconsistencies between model predictions and experimental observations and to identify metabolic gaps.

Limitations in current gap-filling algorithms
Despite the increasing power of newer gap-filling methods to complete our knowledge of metabolic networks, major challenges still exist. First, a prevalent problem among these algorithms is that most cannot resolve false-positive predictions (negative growth in vivo and positive growth in silico). Simply removing reactions or limiting reaction directionality to resolve false-positive predictions may not be successful, since unknown regulatory rules or essential biomass components could instead be the source of the issue. Second, gap-filling algorithms might add reactions without genomic evidence to models and risk overfitting when trying to activate dead-end metabolites or resolve false-negative model predictions (positive growth in vivo and negative growth in silico). Third, solutions that reconcile experiment and model inconsistencies might be non-unique, with different solutions being hard to rank or prioritize. These solutions could also change depending on which experiment and model inconsistency is solved first. When one inconsistency is fixed, another new inconsistency might be created. Unintuitively, a global gap-filling procedure, where multiple inconsistencies are solved simultaneously, may result in a larger set of gap-filled reactions and may take significantly longer to run as the number of inconsistencies increases [42].

Discoveries of promiscuous enzymes and underground metabolic pathways via gap-filling analyses
In addition to discoveries of unannotated or misannotated genes, gap-filling procedures can identify promiscuous activities of pre-existing enzymes that are recruited by 'underground' metabolic pathways [43] to bypass the more commonly used pathways (Figures 1 and 2). Traditionally, enzyme promiscuity is evaluated using multicopy suppression experiments where overexpression of a single gene from a plasmid library rescues a conditionally lethal knockout [44,45]. However, these experiments have limited utility because it can be difficult to obtain optimal expression levels and only a fraction of genes are conditionally lethal in common growth media [46]. Recently, gap-filling analyses have been combined with experimental characterization for discoveries of promiscuous enzymes. Predicted gaps in central metabolic pathways and cofactor biosynthesis have led to refined annotations for Dehalobacter restrictus and discovery of a D. restrictus promiscuous serine hydroxymethyltransferase [47]. Experiments where knockout and synthetic lethal mutants grow under different media conditions are useful for discovering underground reactions that do not take place in wild-type strains under standard media conditions (e.g., rich media). An extensive study of Escherichia coli multiple-knockout mutants on 13 carbon sources confirmed underground reactions using metabolomics analysis and in vitro enzymatic assays [48]. A novel workflow has been developed to resolve incorrect model predictions of gene essentiality using sequence similarity analysis, knockout strain analysis, and adaptive evolution. Such a workflow has revealed promiscuous functions of several E. coli enzymes [49]. However, it cannot reveal long underground metabolic pathways containing multiple promiscuous enzymes and novel intermediate metabolites. A purely computational approach, PROPER [46], has been developed to predict promiscuous enzymes by searching distant gene similarities and by analyzing phylogenetic trees to assign secondary functions to genes. A related model-based approach (GEM-PROPER) [46] has been designed to identify underground metabolic pathways. Unlike the analyses that have combined experiments and gap-filling, the accuracies of PROPER and GEM-PROPER predictions still need to be evaluated. So far, 10 of the 63 PROPER-predicted E. coli promiscuous enzymes and 1 of the 98 GEM-PROPER-predicted pathways have been validated.

Conclusions
Recent developments in gap-filling algorithms for genome-scale metabolic models have shown great potential for revealing missing knowledge of metabolic reactions, gene-protein-reaction associations, and promiscuous enzymes. These algorithms are diverse in terms of experimental data input, formulation strategies, and computational methods. Novel techniques from machine learning, network topology analysis, likelihood modeling, and bioinformatics research have been incorporated into these methods. In the future, gap-filling algorithms could incorporate kinetic, thermodynamic, and regulatory constraints for more accurate predictions. With the advances in high-throughput phenotyping technologies and omics measurements, gap-filling algorithms will generate more and better hypotheses. Meanwhile, biochemical characterization that reveals the details and caveats of novel metabolic discoveries should be performed to inform and validate in silico predictions.

Conflict of interest
None.

Acknowledgements
The authors wish to thank Matthew R. Long, Paul A. Adamczyk, and Sanjan Gupta for their comments on the manuscript. This work was supported by the National Science Foundation (CBET-1053712); and the U.S. Department of Energy Great Lakes Bioenergy Research Center (DOE BER Office of Science DE-FC02-07ER64494).

References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest

1. Zhang C, Hua Q: Applications of genome-scale metabolic models in biotechnology and systems medicine. Front Physiol 2016:6.
2. King ZA, Lloyd CJ, Feist AM, Palsson BO: Next-generation genome-scale models for metabolic engineering. Curr Opin Biotechnol 2015, 35:23-29.
3. O'Brien EJ, Monk JM, Palsson BO: Using genome-scale models to predict biological capabilities. Cell 2015, 161:971-987.
4. McCloskey D, Palsson BO, Feist AM: Basic and applied uses of genome-scale metabolic network reconstructions of Escherichia coli. Mol Syst Biol 2014, 9:661.
5. Thiele I, Palsson BØ: A protocol for generating a high-quality genome-scale metabolic reconstruction. Nat Protoc 2010, 5:93-121.
6. Hamilton JJ, Reed JL: Software platforms to facilitate reconstructing genome-scale metabolic networks. Environ Microbiol 2014, 16:49-59.
7. Orth JD, Palsson B: Systematizing the generation of missing metabolic knowledge. Biotechnol Bioeng 2010, 107:403-412.
8. Thiele I, Vlassis N, Fleming RMT: FASTGAPFILL: efficient gap filling in metabolic networks. Bioinformatics 2014, 30:2529-2531.
9. Hartleb D, Jarre F, Lercher MJ: Improved metabolic models for E. coli and Mycoplasma genitalium from GlobalFit, an algorithm that simultaneously matches growth and non-growth data sets. PLoS Comput Biol 2016, 12:e1005036.
10. Prigent S, Frioux C, Dittami SM, Thiele S, Larhlimi A, Collet G, Gutknecht F, Got J, Eveillard D, Bourdon J et al.: Meneco, a topology-based gap-filling tool applicable to degraded genome-wide metabolic networks. PLoS Comput Biol 2017, 13:e1005276.
11. Frioux C, Schaub T, Schellhorn S, Siegel A, Wanko P: Hybrid metabolic network completion. LPNMR 2017: Logic Programming and Nonmonotonic Reasoning. 2017:308-321.
• 12. Hosseini Z, Marashi S-A: Discovering missing reactions of metabolic networks by using gene co-expression data. Sci Rep 2017, 7:41774.
  This algorithm, GAUGE, is the first algorithm that incorporates flux coupling analysis for gap-identification. It allows the incorporation of gene expression datasets.
13. Burgard AP, Nikolaev EV, Schilling CH, Maranas CD: Flux coupling analysis of genome-scale metabolic network reconstructions. Genome Res 2004, 14:301-312.
14. Liu L, Zhang Z, Sheng T, Chen M: DEF: an automated dead-end filling approach based on quasi-endosymbiosis. Bioinformatics 2016, 33:btw604.
15. Ganter M, Kaltenbach H-M, Stelling J: Predicting network functions with nested patterns. Nat Commun 2014, 5:3006.
16. Zhang M, Cui Z, Oyetunde T, Tang Y, Chen Y: Recovering metabolic networks using a novel hyperlink prediction method. 2016. arXiv:1610.06941.
• 17. Oyetunde T, Zhang M, Chen Y, Tang Y, Lo C: BoostGAPFILL: improving the fidelity of metabolic network reconstructions through integrated constraint and pattern-based methods. Bioinformatics 2016, 33:btw684.
  BoostGAPFILL is a novel pattern-based gap-filling method that uses matrix factorization, a machine learning technique. It can be used as a tool for ranking reactions in databases.
• 18. Biggs MB, Papin JA: Metabolic network-guided binning of metagenomic sequence fragments. Bioinformatics 2016, 32:867-874.
  This work introduces a unique algorithm, SONEC, which is specially designed for completing networks of organisms in metagenomics samples.
19. Plata G, Fuhrer T, Hsiao T-L, Sauer U, Vitkup D: Global probabilistic annotation of metabolic networks enables enzyme discovery. Nat Chem Biol 2012, 8:848-854.
20. Chitale M, Khan IK, Kihara D: Missing gene identification using functional coherence scores. Sci Rep 2016, 6:31725.
21. Vitkin E, Shlomi T: MIRAGE: a functional genomics-based approach for metabolic network model reconstruction and its application to Cyanobacteria networks. Genome Biol 2012, 13:R111.
22. Benedict MN, Mundy MB, Henry CS, Chia N, Price ND: Likelihood-based gene annotations for gap filling and quality assessment in genome-scale metabolic models. PLoS Comput Biol 2014, 10:e1003882.
23. King B, Farrah T, Richards M, Mundy M, Simeonidis E, Price ND: ProbAnnoWeb and ProbAnnoPy: probabilistic annotation and gap-filling of metabolic reconstructions. bioRxiv 2017, 151258.
24. Krumholz EW, Libourel IGL: Sequence-based network completion reveals the integrality of missing reactions in metabolic networks. J Biol Chem 2015, 290:19197-19207.
• 25. Pan S, Nikolakakis K, Adamczyk PA, Pan M, Ruby EG, Reed JL: Model-enabled gene search (MEGS) allows fast and direct discovery of enzymatic and transport gene functions in the marine bacterium Vibrio fischeri. J Biol Chem 2017, 292:10250-10261.
  MEGS is an integrated computational and experimental approach that allows the identification of gap-filled reactions and the gene-protein-reaction associations with a high level of confidence.
26. Tervo CJ, Reed JL: FOCAL: an experimental design tool for systematizing metabolic discoveries and model development. Genome Biol 2012, 13:R116.
27. Joyce AR, Reed JL, White A, Edwards R, Osterman A, Baba T, Mori H, Lesely SA, Palsson BØ, Agarwalla S: Experimental and computational assessment of conditionally essential genes in Escherichia coli. J Bacteriol 2006, 188:8259-8271.
28. Szappanos B, Kovács K, Szamecz B, Honti F, Costanzo M, Baryshnikova A, Gelius-Dietrich G, Lercher MJ, Jelasity M, Myers CL et al.: An integrated approach to characterize genetic interaction networks in yeast metabolism. Nat Genet 2011, 43:656-662.
29. Shea A, Wolcott M, Daefler S, Rozak DA: Biolog phenotype microarrays. In Microbial Systems Biology: Methods and Protocols. Edited by Navid A. Humana Press; 2012:331-373.
30. Wetmore KM, Price MN, Waters RJ, Lamson JS, He J, Hoover CA, Blow MJ, Bristow J, Butland G, Arkin AP et al.: Rapid quantification of mutant fitness in diverse bacteria by sequencing randomly bar-coded transposons. mBio 2015, 6:e00306-15.
31. Gallagher RR, Li Z, Lewis AO, Isaacs FJ: Rapid editing and evolution of bacterial genomes using libraries of synthetic DNA. Nat Protoc 2014, 9:2301-2316.
32. Wang HH, Isaacs FJ, Carr PA, Sun ZZ, Xu G, Forest CR, Church GM: Programming cells by multiplex genome engineering and accelerated evolution. Nature 2009, 460:894-898.
33. Nyerges Á, Csörgő B, Nagy I, Bálint B, Bihari P, Lázár V, Apjok G, Umenhoffer K, Bogos B, Pósfai G et al.: A highly precise and portable genome engineering method allows comparison of mutational effects across bacterial species. Proc Natl Acad Sci USA 2016, 113:2502-2507.
34. Ronda C, Pedersen LE, Sommer MOA, Nielsen AT: CRMAGE: CRISPR optimized MAGE recombineering. Sci Rep 2016, 6:19452.
35. Côté J, French S, Gehrke SS, MacNair CR, Mangat CS, Bharat A, Brown ED: The genome-wide interaction network of nutrient stress genes in Escherichia coli. mBio 2016, 7:e01714-16.
36. Giaever G, Nislow C: The yeast deletion collection: a decade of functional genomics. Genetics 2014, 197:451-465.
37. Yamamoto N, Nakahigashi K, Nakamichi T, Yoshino M, Takai Y, Touda Y, Furubayashi A, Kinjyo S, Dose H, Hasegawa M et al.: Update on the Keio collection of Escherichia coli single-gene deletion mutants. Mol Syst Biol 2009, 5:335.
38. Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H: Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol 2006, 2:2006.0008.
39. Kobayashi K, Ehrlich SD, Albertini A, Amati G, Andersen KK, Arnaud M, Asai K, Ashikaga S, Aymerich S, Bessieres P et al.: Essential Bacillus subtilis genes. Proc Natl Acad Sci USA 2003, 100:4678-4683.
40. Deutschbauer A, Price MN, Wetmore KM, Shao W, Baumohl JK, Xu Z, Nguyen M, Tamse R, Davis RW, Arkin AP: Evidence-based annotation of gene function in Shewanella oneidensis MR-1 using genome-wide fitness profiling across 121 conditions. PLoS Genet 2011, 7:e1002385.
41. Skerker JM, Leon D, Price MN, Mar JS, Tarjan DR, Wetmore KM, Deutschbauer AM, Baumohl JK, Bauer S, Ibanez AB et al.: Dissecting a complex chemical stress: chemogenomic profiling of plant hydrolysates. Mol Syst Biol 2014, 9:674.
42. Biggs MB, Papin JA: Managing uncertainty in metabolic network structure and improving predictions using EnsembleFBA. PLoS Comput Biol 2017, 13:1-25.
43. Notebaart RA, Kintses B, Feist AM, Papp B: Underground metabolism: network-level perspective and biotechnological potential. Curr Opin Biotechnol 2018, 49:108-114.
44. Patra B, Kon Y, Yadav G, Sevold AW, Frumkin JP, Vallabhajosyula RR, Hintze A, Østman B, Schossau J, Bhan A et al.: A genome wide dosage suppressor network reveals genomic robustness. Nucleic Acids Res 2017, 45:255-270.
45. Patrick WM, Quandt EM, Swartzlander DB, Matsumura I: Multicopy suppression underpins metabolic evolvability. Mol Biol Evol 2007, 24:2716-2722.
• 46. Oberhardt MA, Zarecki R, Reshef L, Xia F, Duran-Frigola M, Schreiber R, Henry CS, Ben-Tal N, Dwyer DJ, Gophna U et al.: Systems-wide prediction of enzyme promiscuity reveals a new underground alternative route for pyridoxal 5′-phosphate production in E. coli. PLoS Comput Biol 2016, 12:1-19.
  This work introduces in silico comparative genomics tools to identify both promiscuous enzymes and underground metabolic pathways. It generates unintuitive hypotheses that could not be obtained using traditional multicopy suppression experiments.
47. Wang P-H, Tang S, Nemr K, Flick R, Yan J, Mahadevan R, Yakunin AF, Löffler FE, Edwards EA: Refined experimental annotation reveals conserved corrinoid autotrophy in chloroform-respiring Dehalobacter isolates. ISME J 2017, 11:626-640.
48. Nakahigashi K, Toya Y, Ishii N, Soga T, Hasegawa M, Watanabe H, Takai Y, Honma M, Mori H, Tomita M: Systematic phenome analysis of Escherichia coli multiple-knockout mutants reveals hidden reactions in central carbon metabolism. Mol Syst Biol 2009, 5:306.
49. Guzmán GI, Utrilla J, Nurk S, Brunk E, Monk JM, Ebrahim A, Palsson BO, Feist AM: Model-driven discovery of underground metabolic functions in Escherichia coli. Proc Natl Acad Sci USA 2015, 112:929-934.
