BioSystems
journal homepage: www.elsevier.com/locate/biosystems
Article info

Article history:
Received 16 September 2011
Received in revised form 23 March 2012
Accepted 23 April 2012

Keywords:
Metabolic networks
Optimization
Mass-balanced reactions
Synthetic biology

Abstract

Background: Reconstruction of genome-scale metabolic networks has resulted in models capable of reproducing experimentally observed biomass yield/growth rates and predicting the effect of alterations in metabolism for biotechnological applications. The existing studies rely on modifying the metabolic network of an investigated organism by removing or inserting reactions taken either from evolutionarily similar organisms or from databases of biochemical reactions (e.g., KEGG). A potential disadvantage of these knowledge-driven approaches is that the result is biased towards known reactions, as such approaches do not account for the possibility of including novel enzymes, together with the reactions they catalyze.

Results: Here, we explore the alternative of increasing biomass yield in three model organisms, namely Bacillus subtilis, Escherichia coli, and Hordeum vulgare, by applying small, chemically feasible network modifications. We use the predicted and experimentally confirmed growth rates of the wild-type networks as reference values and determine the effect of inserting mass-balanced, thermodynamically feasible reactions on predictions of growth rate by using flux balance analysis.

Conclusions: While many replacements of existing reactions naturally lead to a decrease or complete loss of biomass production ability, in all three investigated organisms we find feasible modifications which facilitate a significant increase in this biological function. We focus on modifications with feasible chemical properties and a significant increase in biomass yield. The results demonstrate that small modifications are sufficient to substantially alter biomass yield in the three organisms. The method can be used to predict the effect of targeted modifications on the yield of any set of metabolites (e.g., ethanol), thus providing a computational framework for synthetic metabolic engineering.
0303-2647 © 2012 Elsevier Ireland Ltd. Open access under CC BY-NC-ND license.
http://dx.doi.org/10.1016/j.biosystems.2012.04.007
G. Basler et al. / BioSystems 109 (2012) 186–191 187
Yim et al., 2011); H. vulgare has been genetically engineered for enhanced breeding properties, protein synthesis, food and cellulose production (Horvath et al., 2000, 2001; von Wettstein et al., 2000; Patel et al., 2000). We point out that our approach is not restricted to optimizing biomass yield, and thus allows for the detection of reactions which, when introduced into the respective network, improve any metabolic objective of interest.

2. Methods

2.1. Generating Chemically Feasible Reactions

We apply minimal modifications and combinations thereof to the wild-type networks using two parallel strategies: (1) inserting one new reaction at a time into the network, termed minimal insertion, and (2) replacing one reaction at a time by a new reaction, referred to as minimal replacement. The first case mimics an evolutionary event, such as the duplication of an enzyme-coding gene, followed by its neofunctionalization. The second case models the replacement of an enzyme-coding gene either through gene duplication, followed by a loss-of-function of the original gene, or through transformation of a gene using biotechnological methods. To modify the networks by using realistic chemical reactions, we first extend the recent method for mass-balanced randomization of metabolic networks (Basler et al., 2011). In its original formulation, the method generates a new reaction by replacing the substrates and products of a present reaction by compounds included in the network, while preserving the mass-balance equation. The latter ensures that the number of substrate atoms equals the number of product atoms:

$$\sum_{e \in E_r} s_{e,r} \cdot m_e = \sum_{p \in P_r} s_{p,r} \cdot m_p, \qquad (1)$$

where $E_r$ is the set of substrates and $P_r$ the set of products of $r$, $m_e$, $m_p$ are the vectors of sum formulas of $e$ and $p$, respectively, and $s_{e,r}$, $s_{p,r}$ their stoichiometric coefficients. As an example for replacing an individual substrate, consider the aldose 1-epimerase reaction in E. coli: -d-galactose → d-galactose, with $m_{\text{-d-galactose}} = m_{\text{d-galactose}}$ = C6H12O6. Then, given that glyceraldehyde with $m_{\text{glyceraldehyde}}$ = C3H6O3 participates in the network, the method can generate the mass-balanced reaction 2 glyceraldehyde → d-galactose, which satisfies Eq. (1) since 2 C3H6O3 = C6H12O6. In addition to substituting individual substrates or products, the method also allows more complex substitutions involving pairs of substrates or products, yielding a large number of possible substitutions.

The motivation for generating new chemical reactions from those already included in metabolic models is twofold: first, the modification is minimal in the sense that only the substrates or products of one reaction in the network are modified. Thus, the generated reactions are similar to an existing one. Compared to generating entirely new reactions from scratch, the results of this method are more likely to be obtained by an evolutionary event, such as functional divergence of an enzyme, or to be synthesizable by chemical modifications. Second, it is computationally inefficient to generate all possible mass-balanced reactions, so that one would have to restrict the approach by additional constraints or sampling.

While the presence of the substituted compounds in the network and the requirement of mass-balance ensure that the reaction can in principle take place, it may still be thermodynamically infeasible. Unfortunately, the current genome-scale metabolic networks do not contain information about the physiological conditions under which individual reactions may occur. However, under the assumption of standard conditions (pH = 7, T = 298.15 K), the thermodynamic feasibility of a reaction can still be estimated from the chemical structure of the involved molecules using the group contribution method (Mavrovouniotis, 1991; Tanaka et al., 2003; Henry et al., 2006). The estimated standard Gibbs free energy change of a reaction,

Flux balance analysis (FBA) (Varma and Palsson, 1994) provides a widely-used approach to predict the effect of pathway modifications. Due to its linear programming formulation, FBA can be used to calculate a steady-state flux distribution which optimizes a given objective function. In order to apply FBA, one needs to specify the stoichiometric matrix, $S$, containing the stoichiometric coefficients $s_{i,j}$ of compound $i$ in reaction $j$ as well as the stoichiometry of import and export reactions. Under the assumption that the network operates at steady state, one can then calculate the optimal flux distribution by solving the linear program:

$$\max v_b \quad \text{s.t.} \quad S \cdot v = 0$$

where $v$ is the vector of reaction fluxes, and $v_b$ the entry in $v$ corresponding to the metabolic objective function. In order to predict the growth rate of an organism, $v_b$ is chosen such that the $b$-th column in $S$ describes the consumption of biomass precursors. The problem is further constrained by specifying upper and lower bounds for the flux values $v_i$ of all reactions:

$$v_{\min} \le v_i \le v_{\max}.$$

By choosing the bounds $v_{\min}$ and $v_{\max}$ of import reactions according to experimentally observed uptake rates of a cell, it is possible to simulate a specific environmental condition. Here, in order to predict the maximum possible growth rate of an analyzed organism, we simulate environmental conditions which result in the largest experimentally observed growth rates: for B. subtilis, we limit the import of glucose to 6.2 mmol g⁻¹ cell h⁻¹ and restrict the flux through the caa3 oxidase reaction to zero, reflecting experimental observations (Oh et al., 2007). For E. coli we set the glucose uptake rate to 8, the O2 uptake rate to 18.5, and limit the uptake of all other nutrients to 20 mmol g⁻¹ dry weight h⁻¹ (Feist et al., 2007). For H. vulgare seeds, we limit the oxygen uptake to 8.9769 and the sucrose uptake to 8 mol g⁻¹ dry weight h⁻¹, facilitating maximal biomass production (Grafahrend-Belau et al., 2009). By applying FBA to the metabolic networks with experimentally validated nutrient uptake and growth rates, we can now calculate the growth rates of the wild-type networks and their modified variants.

2.3. Screening for Feasible Synthetic Reactions

First, we calculate the optimal biomass yield for the metabolic network of each analyzed organism to reproduce the experimental results and obtain a reference value of the wild-type networks. For each wild-type network, we then generate all possible networks obtained by applying minimal perturbations. To this end, all substitutions of individual substrates or products and pairs thereof for each reaction are considered (see Basler et al., 2011). Perturbations leading to thermodynamically infeasible reactions are neglected, as described in Section 2.1, amounting to 57, 1469, and 0 such perturbations in B. subtilis, E. coli, and H. vulgare, respectively. The total number of remaining perturbations, and thus the number of minimally modified networks is 20,768 for B. subtilis, 23,898 for E. coli, and 722 for the smaller network of H. vulgare seeds. The relatively small number of perturbations involving thermodynamically infeasible reactions is a direct consequence of the ability of the extended randomization method to preserve a similar distribution of Gibbs free energy changes of reactions (Basler et al., 2012).

The presented approach can easily be extended by combining two or more minimal substitutions. This could, in principle, allow for an even higher increase in biomass production than a single minimal substitution alone. However, depending on the network size, evaluation of all possible combinations of two or more minimal substitutions becomes computationally demanding. For instance, we obtain more than $10^8$ modifications involving two substitutions for the genome-scale networks of B. subtilis and E. coli. As a result, we resort to random sampling of two, three, or four minimal substitutions, which are inserted into the original metabolic networks. These modifications are referred to as two-, three-, and four-fold insertion, respectively.
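The size of the combinatorial search space that motivates random sampling can be illustrated with a short Python sketch. It assumes, as a simplification not stated in the text, that any k distinct minimal substitutions may be combined, so the binomial coefficient serves as an upper bound on the number of k-fold combinations. The function names and the sampling scheme are hypothetical and are not the authors' implementation.

```python
import math
import random

# Numbers of minimally modified networks reported in the text.
N_MODIFICATIONS = {"B. subtilis": 20768, "E. coli": 23898, "H. vulgare": 722}


def count_combinations(n_modifications, k):
    """Upper bound on the number of k-fold combinations, assuming any
    k distinct minimal substitutions can be combined (an assumption)."""
    return math.comb(n_modifications, k)


def sample_k_fold(n_modifications, k, n_samples, seed=0):
    """Draw n_samples distinct random k-fold combinations as index tuples."""
    if n_samples > math.comb(n_modifications, k):
        raise ValueError("more samples requested than combinations exist")
    rng = random.Random(seed)
    samples = set()
    while len(samples) < n_samples:
        # Sorting makes the tuple independent of the order of drawing.
        samples.add(tuple(sorted(rng.sample(range(n_modifications), k))))
    return sorted(samples)


for organism, n in N_MODIFICATIONS.items():
    print(organism, count_combinations(n, 2))

# Five hypothetical three-fold insertions for the smallest network.
print(sample_k_fold(N_MODIFICATIONS["H. vulgare"], 3, 5))
```

Under this assumption the pairwise counts for B. subtilis and E. coli both exceed $10^8$, consistent with the figure quoted above, so each sampled combination would be evaluated individually rather than enumerating all of them.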
Table 1
Feasible replacements of reactions leading to the highest increase in biomass yield. (a) Malic enzyme reaction in B. subtilis (EC 1.1.1.40), (b) fumarate hydratase reaction in
E. coli (EC 4.2.1.2) and (c) glutamate dehydrogenase reaction in H. vulgare (EC 1.4.1.2). In B. subtilis, 89 other modifications give the same biomass yield, and in H. vulgare one
more. All minimal replacements are found in Supplementary File 1.
Table 2
Two feasible synthetic reactions, generated from the (a) aspartate transaminase (EC 2.6.1.1) and (b) carbonate dehydratase (EC 4.2.1.1) reactions in E. coli, each resulting in a 225.4% increase in biomass yield when inserted into the network. All minimal insertions are found in Supplementary File 2.

For each modified network, we calculate the biomass yield by using identical flux bounds and biomass reaction of the corresponding original network, and compare the obtained values to the original biomass yield.
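The screening step, in which each modified network is evaluated by FBA with the flux bounds and biomass reaction of the original network, can be sketched on a toy example. The three-metabolite network below is hypothetical and not taken from any of the analyzed models, and the use of `scipy.optimize.linprog` is an assumption, since the text does not name a solver. The sketch maximizes the biomass flux subject to the steady-state condition and the flux bounds, then repeats the calculation after a minimal insertion.

```python
import numpy as np
from scipy.optimize import linprog


def max_objective_flux(S, bounds, objective_index):
    """Maximize v_b subject to S v = 0 and v_min <= v_i <= v_max."""
    c = np.zeros(S.shape[1])
    c[objective_index] = -1.0  # linprog minimizes, so negate the objective
    result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                     bounds=bounds, method="highs")
    if not result.success:
        raise ValueError(result.message)
    return float(result.x[objective_index])


def insert_reaction(S, bounds, column, reaction_bounds):
    """Minimal insertion: append one reaction column and its flux bounds."""
    return np.column_stack([S, column]), bounds + [reaction_bounds]


# Hypothetical toy network. Rows: metabolites A, B, W.
# Columns: uptake (-> A), split (A -> B + W), export (W ->), biomass (B ->).
S_wild = np.array([
    [1.0, -1.0,  0.0,  0.0],   # A
    [0.0,  1.0,  0.0, -1.0],   # B
    [0.0,  1.0, -1.0,  0.0],   # W
])
bounds_wild = [(0.0, 10.0), (0.0, 1000.0), (0.0, 1000.0), (0.0, 1000.0)]
BIOMASS = 3  # index of the biomass reaction

wild_yield = max_objective_flux(S_wild, bounds_wild, BIOMASS)

# Candidate synthetic reaction W -> B, which recycles the by-product.
S_mod, bounds_mod = insert_reaction(
    S_wild, bounds_wild, np.array([0.0, 1.0, -1.0]), (0.0, 1000.0))
modified_yield = max_objective_flux(S_mod, bounds_mod, BIOMASS)

relative_yield = 100.0 * modified_yield / wild_yield
print(wild_yield, modified_yield, relative_yield)
```

In this toy case the inserted reaction lets the by-product W feed biomass instead of being exported, so the modified network reaches twice the wild-type biomass flux under the same uptake bound. A minimal replacement would overwrite an existing column instead of appending one.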
Fig. 1. Distribution of relative changes in biomass yield. Distributions of the relative change in biomass yield after modifying the wild-type networks of (a) B. subtilis, (b) E. coli,
and (c) H. vulgare seeds. The relative change is shown as percentage of the biomass yield compared to the original network. Values are binned in 1% intervals. In each organism,
most of the modifications do not affect biomass yield (100%), several lead to a complete loss (0%), and some to a significant decrease or increase. The main panels are zoomed
on the y-axis for clarity; the inlays show the full frequency ranges.
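The quantity plotted in Fig. 1, the biomass yield of a modified network as a percentage of the original yield binned in 1% intervals, can be computed as in the following sketch. The yield values are hypothetical placeholders, and the function names and the half-open bin convention are assumptions made for illustration.

```python
import math
from collections import Counter


def relative_yield_percent(modified_yield, original_yield):
    """Biomass yield of a modified network as a percentage of the original."""
    # Dividing first keeps an unchanged yield at exactly 100.0.
    return 100.0 * (modified_yield / original_yield)


def bin_relative_yields(modified_yields, original_yield):
    """Histogram with 1% bins; bin k covers [k, k + 1) percent."""
    histogram = Counter()
    for y in modified_yields:
        percent = relative_yield_percent(y, original_yield)
        # Rounding guards against floating-point noise at bin edges.
        histogram[math.floor(round(percent, 9))] += 1
    return histogram


# Hypothetical yields of five modified networks, original yield 1.0.
example_yields = [1.0, 1.0, 0.0, 0.5, 1.072]
histogram = bin_relative_yields(example_yields, original_yield=1.0)
for bin_start in sorted(histogram):
    print(f"{bin_start}%: {histogram[bin_start]}")
```

With this convention, unaffected networks fall into the 100% bin and networks with a complete loss of biomass production into the 0% bin, matching the two peaks described in the caption.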
Fig. 2. Modification of the TCA cycle. Illustration of a replacement in the TCA cycle leading to a 7.2% increase in biomass yield in B. subtilis. The original reaction, alanine transaminase (EC 2.6.1.1) with equation pyruvate + glutamate ↔ 2-oxoglutarate + l-alanine (dashed blue), is replaced by the synthetic reaction acetoacetate + aspartate ↔ 2-oxoglutarate + l-alanine (dashed red), which is mass-balanced and thermodynamically feasible with $\Delta_r G^0_{est} = -2.46$. Calculated fluxes in the original network are shown as blue arrows, fluxes in the modified network as red arrows. Black arrows indicate a flux in both networks.
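The mass-balance condition of Eq. (1) can be checked mechanically for a candidate reaction, as in the sketch below. It applies the check to the glyceraldehyde example from Section 2.1 and to the synthetic reaction of Fig. 2. The sum formulas for the Fig. 2 compounds are those of the neutral molecules and are supplied here as an assumption, since the text does not list them. The helper names are hypothetical.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a sum formula such as 'C6H12O6' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def side_atoms(side):
    """Sum element counts over (coefficient, formula) pairs of one side."""
    total = Counter()
    for coefficient, formula in side:
        for element, n in parse_formula(formula).items():
            total[element] += coefficient * n
    return total


def is_mass_balanced(substrates, products):
    """Eq. (1): substrate atoms must equal product atoms, per element."""
    return side_atoms(substrates) == side_atoms(products)


# Example from Section 2.1: 2 glyceraldehyde -> d-galactose.
print(is_mass_balanced([(2, "C3H6O3")], [(1, "C6H12O6")]))

# Fig. 2: acetoacetate + aspartate <-> 2-oxoglutarate + l-alanine,
# using neutral-form sum formulas (an assumption of this sketch).
fig2_substrates = [(1, "C4H6O3"), (1, "C4H7NO4")]
fig2_products = [(1, "C5H6O5"), (1, "C3H7NO2")]
print(is_mass_balanced(fig2_substrates, fig2_products))
```

Mass balance alone does not imply feasibility: a candidate that passes this check would still need the thermodynamic screening described in Section 2.1.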
Table 3
Highest biomass yields of all minimal replacements, minimal insertions, and two-, three-, and four-fold insertions. In H. vulgare, all 257,288 pairwise combinations of minimal
insertions were considered, while the other combinations were sampled randomly due to computational complexity. The number of perturbed networks is denoted by no.,
while max. yield gives the maximal value of biomass production for these networks compared to the original networks. Only in H. vulgare the twofold insertions further
increase the highest value for biomass yield over the minimal perturbations.
substitutions. In the case of B. subtilis, 31% of all reactions allow for an increase in biomass. This proportion increases to 71.5% if the reactions with variability of $v_{\max} - v_{\min}$ are excluded. Similar findings are obtained for H. vulgare, where the values range between 19% and 51%, and E. coli, from 28% to 70%. This suggests that FVA may be used to identify reactions whose substitutions are good candidates for increased biomass production. However, the mutual information between the variability and the increase of a reaction is very low, with values of 0.17, 0.13, and 0.002 for B. subtilis, H. vulgare, and E. coli, respectively, which clearly indicates that the predictive power of FVA is limited.

4. Conclusions

We have presented a new approach for systematic detection of novel feasible reactions which alter a metabolic objective of interest, as exemplified by biomass and ethanol production. We found that the three analyzed metabolic networks are strongly optimized, particularly the more specialized network of H. vulgare seeds. Nevertheless, we identified several feasible modifications to each of the networks which are predicted to further increase biomass yield. Using ethanol production as a further objective function, we illustrate that the same approach can be directly applied to generate reactions facilitating the improved production of valuable compounds. By using more complex objective functions the approach may also be applied to detect chemical reactions which, in addition to improving the production of desired compounds, suppress the production of toxic compounds.

As all considered synthetic reactions satisfy basic mass conservation and thermodynamic constraints, they may be potentially catalyzed by suitable enzymes. Strategies for their design and synthesis may be developed in three ways: (1) targeted search for a gene encoding the enzyme in a less characterized organism; (2) discovery of the enzyme in the environment; (3) targeted synthesis of the enzyme using methods of chemical engineering (Qi et al., 2001).
We further point out that, in order to obtain the predicted effect of introducing a reaction in the network, one does not necessarily need to specifically catalyze the predicted reaction. In fact, concatenated reactions having the same net consumption and production as the identified reaction will have the same effect, thus broadening the possibilities for applying the method.

Another possible application is the automated curation of metabolic networks. If an analyzed network is not capable of reproducing the experimental observations, it is likely to be incomplete. The generation of novel reactions which allow a model to better fit experimental data may then give hints to which important reactions are missing. Therefore, we believe that the approach provides a valuable tool for reconstruction of metabolic networks and their use in metabolic engineering or drug development.

Acknowledgements

This work was supported by the GoFORSYS project (GB and ZN) and the ColoNET project (SG), funded by the German Federal Ministry of Education and Research.

Authors’ contributions: GB developed the method, SG performed the analyses, ZN designed and coordinated the study. All authors read and approved the final manuscript.

Appendix A. Supplementary Data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.biosystems.2012.04.007

References

Alper, H., Miyaoku, K., Stephanopoulos, G., 2005. Construction of lycopene-overproducing E. coli strains by combining systematic and combinatorial gene knockout targets. Nat. Biotechnol. 23 (May (5)), 612–616, http://dx.doi.org/10.1038/nbt1083.
Atsumi, S., Hanai, T., Liao, J.C., 2008. Non-fermentative pathways for synthesis of branched-chain higher alcohols as biofuels. Nature 451 (January (7174)), 86–89, http://dx.doi.org/10.1038/nature06450.
Bar-Even, A., Noor, E., Lewis, N.E., Milo, R., 2010. Design and analysis of synthetic carbon fixation pathways. Proc. Natl. Acad. Sci. U.S.A. 107 (May (19)), 8889–8894, http://dx.doi.org/10.1073/pnas.0907176107.
Basler, G., Ebenhöh, O., Selbig, J., Nikoloski, Z., 2011. Mass-balanced randomization of metabolic networks. Bioinformatics 27 (May (10)), 1397–1403, http://dx.doi.org/10.1093/bioinformatics/btr145.
Basler, G., Grimbs, S., Ebenhöh, O., Selbig, J., Nikoloski, Z., 2012. Evolutionary significance of metabolic network properties. J. R. Soc. Interface 9 (June (71)), 1168–1176, http://dx.doi.org/10.1098/rsif.2011.0652.
Bond-Watts, B.B., Bellerose, R.J., Chang, M.C.Y., 2011. Enzyme mechanism as a kinetic control element for designing synthetic biofuel pathways. Nat. Chem. Biol. 7 (April (4)), 222–227, http://dx.doi.org/10.1038/nchembio.537.
Breitling, R., Vitkup, D., Barrett, M.P., 2008. New surveyor tools for charting microbial metabolic maps. Nat. Rev. Microbiol. 6 (February (2)), 156–161, http://dx.doi.org/10.1038/nrmicro1797.
Burgard, A.P., Pharkya, P., Maranas, C.D., 2003. Optknock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol. Bioeng. 84 (December (6)), 647–657.
Dobson, C.M., 2004. Chemical space and biology. Nature 432 (December (7019)), 824–828, http://dx.doi.org/10.1038/nature03192.
Edwards, J.S., Ibarra, R.U., Palsson, B.Ø., 2001. In silico predictions of Escherichia coli metabolic capabilities are consistent with experimental data. Nat. Biotechnol. 19 (February (2)), 125–130, http://dx.doi.org/10.1038/84379.
Feist, A.M., Henry, C.S., Reed, J.L., Krummenacker, M., Joyce, A.R., Karp, P.D., Broadbelt, L.J., Hatzimanikatis, V., Palsson, B.Ø., 2007. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol. Syst. Biol. 3 (121), http://dx.doi.org/10.1038/msb4100155.
Goeddel, D.V., Kleid, D.G., Bolivar, F., Heyneker, H.L., Yansura, D.G., Crea, R., Hirose, T., Kraszewski, A., Itakura, K., Riggs, A.D., 1979. Expression in Escherichia coli of chemically synthesized genes for human insulin. Proc. Natl. Acad. Sci. U.S.A. 76 (January (1)), 106–110.
Grafahrend-Belau, E., Schreiber, F., Koschützki, D., Junker, B.H., 2009. Flux balance analysis of barley seeds: a computational approach to study systemic properties of central metabolism. Plant Physiol. 149 (January (1)), 585–598.
Hanson, A.D., Pribat, A., Waller, J.C., de Crécy-Lagard, V., 2010. ‘Unknown’ proteins and ‘orphan’ enzymes: the missing half of the engineering parts list—and how to find it. Biochem. J. 425 (January (1)), 1–11, http://dx.doi.org/10.1042/BJ20091328.
Henry, C.S., Jankowski, M.D., Broadbelt, L.J., Hatzimanikatis, V., 2006. Genome-scale thermodynamic analysis of Escherichia coli metabolism. Biophys. J. 90 (February (4)), 1453–1461, http://dx.doi.org/10.1529/biophysj.105.071720.
Horvath, H., Huang, J., Wong, O., Kohl, E., Okita, T., Kannangara, C.G., von Wettstein, D., 2000. The production of recombinant proteins in transgenic barley grains. Proc. Natl. Acad. Sci. U.S.A. 97 (February (4)), 1914–1919, http://dx.doi.org/10.1073/pnas.030527497.
Horvath, H., Jensen, L.G., Wong, O.T., Kohl, E., Ullrich, S.E., Cochran, J., Kannangara, C.G., von Wettstein, D., 2001. Stability of transgene expression, field performance and recombination breeding of transformed barley lines. Theor. Appl. Genet. 102, 1–11, http://dx.doi.org/10.1007/s001220051612.
Lee, K.H., Park, J.H., Kim, T.Y., Kim, H.U., Lee, S.Y., 2007. Systems metabolic engineering of Escherichia coli for L-threonine production. Mol. Syst. Biol. 3, 149, http://dx.doi.org/10.1038/msb4100196.
Lee, S.J., Lee, D.-Y., Kim, T.Y., Kim, B.H., Lee, J., Lee, S.Y., 2005. Metabolic engineering of Escherichia coli for enhanced production of succinic acid, based on genome comparison and in silico gene knockout simulation. Appl. Environ. Microbiol. 71 (December (12)), 7880–7887, http://dx.doi.org/10.1128/AEM.71.12.7880-7887.2005.
Mahadevan, R., Schilling, C.H., 2003. The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metab. Eng. 5 (October (4)), 264–276.
Mavrovouniotis, M.L., 1991. Estimation of standard Gibbs energy changes of biotransformations. J. Biol. Chem. 266 (August (22)), 14440–14445.
Oh, Y.-K., Palsson, B.Ø., Park, S.M., Schilling, C.H., Mahadevan, R., 2007. Genome-scale reconstruction of metabolic network in Bacillus subtilis based on high-throughput phenotyping and gene essentiality data. J. Biol. Chem. 282 (September (39)), 28791–28799, http://dx.doi.org/10.1074/jbc.M703759200.
Oliveira, A.P., Nielsen, J., Förster, J., 2005. Modeling Lactococcus lactis using a genome-scale flux model. BMC Microbiol. 5, 39, http://dx.doi.org/10.1186/1471-2180-5-39.
Patel, M., Johnson, J.S., Brettell, R.I., Jacobsen, J., Xue, G.-P., 2000. Transgenic barley expressing a fungal xylanase gene in the endosperm of the developing grains. Mol. Breed. 6, 113–124, http://dx.doi.org/10.1023/A:1009640427515.
Perkins, J.B., Sloma, A., Hermann, T., Theriault, K., Zachgo, E., Erdenberger, T., Hannett, N., Chatterjee, N.P., Williams II, V., Rufo Jr., G., Hatch, R., Pero, J., 1999. Genetic engineering of Bacillus subtilis for the commercial production of riboflavin. J. Ind. Microbiol. Biotechnol. 22, 8–18, http://dx.doi.org/10.1038/sj.jim.2900587.
Qi, D., Tann, C.M., Haring, D., Distefano, M.D., 2001. Generation of new enzymes via covalent modification of existing proteins. Chem. Rev. 101 (October (10)), 3081–3111.
Quek, L.-E., Nielsen, L.K., 2008. On the reconstruction of the Mus musculus genome-scale metabolic network model. Genome Inform. 21, 89–100.
Schallmey, M., Singh, A., Ward, O.P., 2004. Developments in the use of Bacillus species for industrial production. Can. J. Microbiol. 50 (January (1)), 1–17, http://dx.doi.org/10.1139/w03-076.
Sohn, S.B., Kim, T.Y., Park, J.M., Lee, S.Y., 2010. In silico genome-scale metabolic analysis of Pseudomonas putida KT2440 for polyhydroxyalkanoate synthesis, degradation of aromatics and anaerobic survival. Biotechnol. J. 5 (July (7)), 739–750, http://dx.doi.org/10.1002/biot.201000124.
Tanaka, M., Okuno, Y., Yamada, T., Goto, S., Uemura, S., Kanehisa, M., 2003. Extraction of a thermodynamic property for biochemical reactions in the metabolic pathway. Genome Inform. 14, 370–371.
Varma, A., Palsson, B.Ø., 1994. Metabolic flux balancing: basic concepts, scientific and practical use. Biotechnology 12 (October (10)), 994–998, http://dx.doi.org/10.1038/nbt1094-994.
von Wettstein, D., Mikhaylenko, G., Froseth, J.A., Kannangara, C.G., 2000. Improved barley broiler feed with transgenic malt containing heat-stable (1,3-1,4)-beta-glucanase. Proc. Natl. Acad. Sci. U.S.A. 97 (December (25)), 13512–13517, http://dx.doi.org/10.1073/pnas.97.25.13512.
Wang, Y., Ruan, L., Lo, W.-H., Chua, H., Yu, H.-F., 2006. Construction of recombinant Bacillus subtilis for production of polyhydroxyalkanoates. Appl. Biochem. Biotechnol. 129–132, 1015–1022.
Wang, Z., Chen, T., Ma, X., Shen, Z., Zhao, X., 2011. Enhancement of riboflavin production with Bacillus subtilis by expression and site-directed mutagenesis of zwf and gnd gene from Corynebacterium glutamicum. Bioresour. Technol. 102 (February (4)), 3934–3940, http://dx.doi.org/10.1016/j.biortech.2010.11.120.
Yamada, T., Bork, P., 2009. Evolution of biomolecular networks: lessons from metabolic and protein interactions. Nat. Rev. Mol. Cell Biol. 10 (November (11)), 791–803.
Yim, H., Haselbeck, R., Niu, W., Pujol-Baxley, C., Burgard, A., Boldt, J., Khandurina, J., Trawick, J.D., Osterhout, R.E., Stephen, R., Estadilla, J., Teisan, S., Schreyer, H.B., Andrae, S., Yang, T.H., Lee, S.Y., Burk, M.J., Dien, S.V., 2011. Metabolic engineering of Escherichia coli for direct production of 1,4-butanediol. Nat. Chem. Biol. (May), http://dx.doi.org/10.1038/nchembio.580.