
pubs.acs.org/synthbio Research Article

MAPPS: A Web-Based Tool for Metabolic Pathway Prediction and Network Analysis in the Postgenomic Era
Muhammad Rizwan Riaz, Gail M. Preston, and Aziz Mithani*
Cite This: https://dx.doi.org/10.1021/acssynbio.9b00397

ABSTRACT: Comparative and evolutionary analyses of metabolic networks have a wide range of applications, ranging from research into metabolic evolution through to practical applications in drug development, synthetic biology, and biodegradation. We present MAPPS: Metabolic network Analysis and Pathway Prediction Server (https://mapps.lums.edu.pk), a web-based tool to study functions and evolution of metabolic networks using traditional and ‘omics data sets. MAPPS provides diverse
functionalities including an interactive interface, graphical visualization of
results, pathway prediction and network comparison, identification of potential drug targets, in silico metabolic engineering, host−
microbe interactions, and ancestral network building. Importantly, MAPPS also allows users to upload custom data, thus enabling
metabolic analyses on draft and custom genomes, and has an ‘omics pipeline to filter pathway results, making it relevant in today’s
postgenomic era.
KEYWORDS: metabolic network, pathway prediction, network comparison, metabolic evolution, in silico metabolic engineering,
host−microbe interaction, ‘omics pipeline

Metabolic networks correspond to one of the most intricate processes inside a cell and consist of biochemical reactions, catalyzed by enzymes, connecting one or more metabolites called substrates, which combine to give one or more metabolites called products. A sequence of these enzyme-catalyzed reactions transforming a source metabolite into the target metabolite is called a metabolic pathway.1 Over the years, many metabolic pathways have been deciphered using experimental protocols to explore the metabolic capabilities of different organisms.2−6 Studies have shown that most organisms have a core set of enzymes that are involved in energy metabolism and catalyze essential processes such as protein synthesis and DNA replication, but a significant proportion of the enzymes present in different organisms are specific to the needs of individual organisms or tissues.7−9 Moreover, it has been reported that despite the presence of many possible routes from one metabolite to another, different organisms have evolved to favor distinct pathways to produce or consume a metabolite.10 Availability of genomes of many species offers the possibility of in-depth analysis of metabolic pathways within an organism as well as between different organisms to comprehend the processes and attributes that affect the evolution of metabolic networks.11
Comparative and evolutionary analyses of metabolic networks have a broad range of applications, ranging from research into metabolic evolution through to practical applications in drug development, synthetic biology, and biodegradation.12−14 Researchers have developed comprehensive databases that catalog experimentally discovered genes, enzymes, reactions, and pathways of genome-sequenced organisms,15 such as the Kyoto Encyclopedia of Genes and Genomes (KEGG),16 Reactome,17 and BioCyc.18 These databases have been complemented by the development of tools for predicting and analyzing metabolic networks, comparing metabolic networks, performing in silico metabolic engineering, and designing novel pathways for biosynthesis and biodegradation to understand the adaptation and functional specialization in different species.19 Some of the commonly used tools for metabolic network analysis are listed in Table 1.
Although the available tools serve as useful resources for metabolic pathway prediction, their usability is limited by the fact that most of them are designed to allow users to examine the metabolism of a single organism or reference metabolic network rather than analyzing data for multiple organisms, which is essential to carry out metabolic comparisons. In addition, many tools including Pathway Hunter Tool (PHT),20 From Metabolite to Metabolite (FMM),21 Metabolic Route Search and Design (MRSD),22 PathComp,23 PathPred,24 and Metabolic Route Explorer (MRE)25 do not allow users to simultaneously predict pathways between multiple pairs of source and target metabolites making metabolic pathway analysis and comparison even within a single organism a

Received: September 30, 2019

© XXXX American Chemical Society https://dx.doi.org/10.1021/acssynbio.9b00397


ACS Synth. Biol. XXXX, XXX, XXX−XXX

Table 1. Tools Available for Metabolic Network Analysis

Columns: tool name; web-based tool; database active; last updated; supported analyses (pathway prediction; comparative analysis; support for custom networks; evolutionary analysis; drug target identification; host−microbe interaction)

ATLAS26 + + 2015a +
BPAT-S/BPAT-M27 + + 20/06/2011 +
FMM21 + + 01/10/2008 + +
FogLight28 + Unknownb +
Metabolic Tinker29 N/A +
MetaPath Online30 + + 13/2/2007 + +
MetaRoute31 + + 21/10/2007 + + +
MetQuest32 + Unknownb + +
MRE25 + + 01/10/2015 +
MRSD22 + N/A +
NetCooperate33 + Unknownb +
PathComp23 + + Real-time +
PathPred24 + + Real-time +
PHT20 + + 05/04/2011 + + +
Rahnuma34 + N/A + + +

a ATLAS database has been developed using KEGG 2015 reactions. b Last updated date not available.

tedious task for the end user. Another limitation of the currently available tools is the lack of options to filter predicted pathways and to refine pathway searches. Although some tools allow users to define a limited number of constraints during pathway prediction, their usability is limited by the flexibility provided by these tools. For example, PHT allows a provision of requiring a metabolite to be present in the predicted pathways but does not allow metabolites to be avoided during pathway prediction. Similarly, MRSD only provides a provision of intermediary metabolites to be required during pathway prediction. MRE, on the other hand, allows exclusion of multiple metabolites during pathway prediction but does not provide an option to require one or more metabolites. Only MetaRoute allows the provision of both required and excluded metabolites/reactions but it is provided as a pathway filtering option once the pathways have been predicted between source and target metabolites and not at the time of pathway prediction itself. Besides this, there is a limited support for specialized analyses such as effects of enzyme/reaction insertion and knockout (MetaPath Online and PHT) on metabolic pathways, support for custom networks (PHT, NetCooperate, MetaRoute, and MetQuest), and host−pathogen interactions (NetCooperate) in currently available tools.
Metabolic networks, like all other biological networks, are under a process of continuous evolution. However, the evolutionary mechanisms of these networks are not well understood. It is unclear how these networks evolve and if there is a correlation between the evolution of metabolic capabilities and various factors such as the network structure and/or the environment in which these organisms thrive. The availability of genomes for many closely related species offers the possibility of tracing metabolic evolution on a phylogeny relating the genomes to understand the evolutionary processes and constraints that affect the evolution of metabolic networks. To the best of our knowledge, none of the currently available tools allows phylogeny-based analyses focusing on evolution of metabolic networks. Furthermore, with an increased focus on the understanding of the system as a whole, the challenge of predicting and analyzing the properties of metabolic networks based on transcriptomic, proteomic and metabolomic data has gained much attention in recent years.35 Most of the currently available pathway prediction tools do not allow the user to map ‘omics data on the metabolic network and compare metabolic networks to see the effects of ‘omics data on the pathways between specific metabolites. Finally, currently available tools do not provide an interactive interface for visualizing pathway prediction results or exporting the results into standard Systems Biology Markup Language (SBML) for further analyses. A tool is, therefore, required that addresses these limitations and allows users to analyze a wide range of metabolic analyses, integrate ‘omic data sets in order to predict and compare metabolic networks, visualize pathway predictions in a biologically meaningful way, and obtain insight into the functioning and evolution of metabolic networks.
We present a tool, called MAPPS: Metabolic network Analysis and Pathway Prediction Server, that addresses the limitations outlined above and provides an interactive platform to analyze metabolic networks using traditional as well as ‘omics data. MAPPS is a web-based tool available at https://mapps.lums.edu.pk and builds upon the existing architecture of Rahnuma,34 a tool that we previously developed for metabolic pathway prediction, and comparative and evolutionary analysis of metabolic networks. Although Rahnuma had a variety of distinctive features that are not available in many of the available tools, including multinetwork pathway prediction, and network comparison between two or more organisms or at different levels of a phylogeny, its utility as a tool for studying network evolution, predicting metabolic capabilities and analyzing ‘omic data was limited by its basic user interface, text-based output, and reliance on KEGG-derived annotations of completely sequenced genomes. MAPPS has been designed to overcome these limitations by providing an interactive user interface for job submission and graphical result visualization, and allowing users to upload custom data to enable analyses on draft and custom genomes in addition to providing many novel

Figure 1. Overview of MAPPS. MAPPS uses data from the KEGG database in addition to allowing users to upload custom metabolic networks
containing KEGG or non-KEGG identifiers. KEGG-based networks can be refined based on ‘omics (transcriptomics, proteomics, and
metabolomics) data sets. MAPPS provides several analyses including pathway prediction and comparison, ancestral network building/comparison,
host−microbe interaction, identifying potential drug targets, metabolite reachability, network enumeration and comparison, metabolic similarity
analysis, metabolite-specific pathways, estimation of evolution parameters, and interactive network viewer. Users can also modify networks using in
silico enzyme/reaction insertion or knockout. MAPPS output can be generated in multiple output formats including Hyper Text Markup Language
(HTML) file containing hyperlinks to KEGG for compound and reaction entities, Systems Biology Markup Language (SBML) file, tab-delimited
text file, and interactive graphical format.

functionalities. Like its predecessor, MAPPS represents metabolic networks as hypergraphs, rather than the commonly used graph representation. A hypergraph is a generalization of an ordinary graph where an edge, called a hyperedge, can connect more than two vertices.36 Since a reaction is treated as a single entity in a hypergraph, it can be used to capture relationships between multiple metabolites involved in a reaction, unlike ordinary graphs where each edge is independent. Hypergraphs have been used to represent metabolic networks in different studies.37,38 Overall, MAPPS aims to provide a single powerful resource for the analysis and comparison of metabolic networks and for the study of metabolic evolution by allowing pathway based metabolic network comparisons at organism as well as phylogenetic levels.

■ RESULTS AND DISCUSSION
Overview of MAPPS. MAPPS: Metabolic network Analysis and Pathway Prediction Server is a web-based tool available at https://mapps.lums.edu.pk. The web service provides a guest login as well as an option to create a user profile, which is free. Registration allows users to save their unsubmitted jobs, keep track of their pending jobs, and revisit the results of their completed jobs.
Figure 1 provides an overview of the tool. MAPPS uses the data from Kyoto Encyclopedia of Genes and Genomes

Figure 2. MAPPS graphical user interface for job submission and interactive results visualizer. (A) The screenshot shows the first step of job
submission listing the different analyses available in MAPPS. Job submission in MAPPS consists of four steps: (1) define job, (2) build network, (3)
enter job parameters (if applicable), and (4) review and submit job. Users can provide a meaningful job name and also select the desired output
format. (B) An example of the interactive graphical output for pathway prediction. MAPPS allows users to filter predicted pathways by length,
delete one or more metabolites and/or reactions to analyze the effects on reported pathways, visualize metabolite and reaction details from KEGG
via its public API, and provides single-click access to related information in relevant databases (see text for details). The graphical result can also be
saved in PNG and PDF formats.

(KEGG), which is updated fortnightly, in addition to allowing users to upload custom data to build metabolic networks. Support for custom data removes the reliance of the tool on KEGG allowing pathway prediction and other analyses on draft genomes for which metabolic data is not available in KEGG. Users can additionally upload transcriptomic, proteomic, or metabolomic data to customize the networks and filter the results. The tool provides a wide range of functionalities including pathway prediction and network comparison, identification of potential drug targets, in silico metabolic engineering by adding/removing metabolic reactions or enzymes, metabolic reachability analysis, host−pathogen interactions, and ancestral network building (Figure 2A). The output can be generated in a number of different formats including interactive graphical format, tab-delimited text, HyperText Markup Language (HTML) and Systems Biology Markup Language (SBML).
MAPPS provides an interactive user-friendly interface for job submission and result visualization, and incorporates features such as click and drag, right-click pop-up menus and context-dependent help (Figure 2). It also provides single-click access to relevant information in publicly available databases such as UniProt,39 MetaCyc,18 BRENDA,40 and NCBI41 for enzymes and ChEBI,42 ChEMBL,43 and PubChem44 for metabolites (Figure 2B). In addition, the interactive interface for result visualization allows the user to further explore the output by manipulating the parameters, for example, by varying pathway length or removing reactions or enzymes from the underlying network (Figure 2B). Job submission in MAPPS consists of four main steps (Figure 2A, Supplementary Figure S1).

(1) Define Job: User provides a meaningful job name, selects the analysis to be performed and output format for the results.
(2) Build Network: User selects the KEGG pathways or uploads custom data to build metabolic network(s). Networks can be built at an organism level by specifying one or more organism sets or over a phylogeny (see Methods). In addition, reactions/enzymes can be added or removed from these networks using in silico metabolic engineering and/or filtered using ‘omics data sets (see below).
(3) Enter Job Parameters: User selects one or more source/target metabolites, pathway length, and other parameters for pathway prediction. This step is available for pathway-based analyses only.
(4) Review and Submit Job: User reviews the job parameters and submits the job.
After a user submits a job, it is added in a queue and the user is redirected to the results page displaying the list of previously completed jobs for registered users and the unique URL for results for guests. Once the job is completed, user can download the results file or visualize the results in MAPPS in the case of graphical output format. Registered users are also notified by an email once their job has finished running.
Analyses Provided by MAPPS. MAPPS offers a diverse range of analyses for metabolic pathway prediction and comparison, and network analysis. These are described in the subsequent sections with some of the important features illustrated using case studies later.
Pathway Prediction and Comparison. MAPPS computes metabolic pathways between source and target metabolites using depth-first traversal of hypergraphs taking into account the constraints specified by the user during job submission (see Methods). A pathway between two metabolites is defined as a connected sequence of reactions such that the product of one reaction acts as a substrate in the next reaction (Figure 2B). While predicting metabolic pathways, MAPPS computes direct routes between metabolites and does not allow reactions or metabolites to be repeated in a pathway thus avoiding cycles. All possible pathways between source and target metabolites are reported allowing users to detect previously unreported pathways. Pathways can, however, be constrained by requiring or avoiding one or more metabolites and/or reactions/enzymes thereby enabling users to focus or avoid certain metabolic routes during pathway prediction (Supplementary Figure S2). By default, ubiquitous metabolites such as ATP, AMP, O2, H2O and CO2 (Supplementary Table S1) are ignored during pathway prediction. In addition, intermediate metabolites can also be restricted by presence or absence of constituent elements including nitrogen, oxygen, phosphorus, sulfur, bromine, manganese and zinc to enable element tracing during pathway prediction (Supplementary Figure S2). MAPPS also allows pathways to be ranked by pathway length, number of reversible reactions, connectivity of participating metabolites and weighted scoring scheme introduced by Huang et al.45 that combines reaction thermodynamics information and structural similarity between participating compounds (see Methods).
MAPPS provides an interactive interface to build a network over one or more organisms (called an “Organism Set”; see Methods), select KEGG reference network or upload custom network, specify multiple organism sets, and select more than one source/target metabolite(s) to predict metabolic pathways in a single job (Supplementary Figure S2, Supplementary Figure S3). To the best of our knowledge, none of the currently available tools allow user flexibility at this level with MetQuest being the only tool which provides an option to select multiple source/target metabolites. In addition, MAPPS provides an option to perform pathway-based comparison between two or more organism sets to identify pathways or reactions involved in the predicted pathways that are present in one organism set but absent in other organism sets and vice versa (Supplementary Figure S3). Finally, MAPPS also allows metabolic pathways to be predicted on a user-defined phylogeny. In this mode, users can perform pathway prediction on leaf nodes as well as ancestral nodes predicted using one of many available modes (see the subsection on Ancestral Network Building/Comparison). Pathway prediction on a phylogeny can help in identifying the functional differences at various levels of the phylogeny and provide clues about the metabolic evolution of various species.12
Metabolite Reachability. The reachability of a metabolite is defined as a set of metabolites reachable from the start metabolite in given number of steps and is very useful in identifying the essential metabolites being produced from precursor metabolites in a given metabolic network.46 This is particularly useful in studies focusing on nutrient assimilation pathways.12,47,48 Traditionally, 13C tracer experiments are used to identify intermediary metabolites produced from a given start metabolite with metabolomic data being used in the recent years.49−51 However, this has primarily been done in prokaryotes or for subsets of metabolic networks due to high experimental cost for large networks.52 MAPPS provides a comprehensive functionality for users to identify metabolites reachable from one or more start metabolites in a given range of steps taking into account all the constraints available for pathway prediction (Supplementary Figure S4). Metabolic reachability can be ascertained on one or more networks built using KEGG or custom data, and over organism set(s) or KEGG reference network allowing users to analyze the connectedness and scope of the metabolites in the metabolic network(s) under study.
Metabolite-Specific Reactions. Metabolite-specific reactions are defined as reactions which are involved in pathways from only one of the given start metabolites.34 Exploring a metabolic network to identify reactions/enzymes that are exclusive to a particular metabolic route has many applications ranging from potential therapeutic targets to metabolic engineering.13,19 MAPPS provides an option to identify metabolite-specific reactions, which is built upon the pathway prediction module. However, unlike standard pathway prediction which predicts metabolic pathways from one or more source metabolite(s) to target metabolite(s), this analysis requires at least two start metabolites and uses comparative pathway prediction module to identify reactions which are involved in pathways from only one of the given start metabolites.
Ancestral Network Building/Comparison. A distinguishing feature of MAPPS which differentiates it from currently available tools is the provision of analyzing metabolic networks on a phylogeny. This is extremely important for studying metabolic evolution as studying networks on a phylogeny may provide clue about the gain or loss of functionalities at various levels.12,34 MAPPS provides an option to build and compare metabolic networks at the internal nodes of a phylogeny using

various methods (Supplementary Figure S5). This includes maximum parsimony and its variants (Sankoff, Dollo, and Polymorphism parsimony), algebraic methods (union, intersection and reaction neighborhood),34,53 distance-based methods (UPGMA and neighbor-joining),54 and stochastic models of metabolic evolution.11 For parsimony and its variants, algebraic methods and stochastic models, MAPPS takes a user-defined phylogeny as input and builds metabolic networks at the internal nodes of the phylogeny whereas for the distance-based methods, MAPPS takes three or more Organism Sets as input instead of a phylogeny and constructs a phylogeny first before building metabolic networks at the internal nodes of the phylogeny. As before, metabolic networks at the leaf nodes can be built using KEGG data using all or a subset of KEGG pathways or using the custom data provided by the user. It is important to note that comparing ancestral networks is different from pathway-based comparison on ancestral networks (see above) as the latter is pathway specific and is only able to identify pathway-specific differences between given source/target metabolites on the given phylogeny whereas the former provides an overall comparison of the metabolic networks in terms of reactions present or absent at various levels of the phylogeny under different phylogenetic modes.
Network Enumeration/Comparison. The network enumeration and comparison options in MAPPS allow users to enumerate and/or compare one or more metabolic networks. Network enumeration reports the reactions present in one or more organism set(s) along with the metabolites involved in these reactions in one of the allowed output formats (see above). This option provides users an opportunity to exploit functionalities offered by MAPPS for generating input data for other tools, for example by performing in silico metabolic engineering on a metabolic network created using KEGG data (see above) and exporting the resulting network in SBML format for visualization and topological analysis in Cytoscape.55 Network comparison, on the other hand, compares reactions present between metabolic networks built over two or more organism set(s) irrespective of their involvement in a particular metabolic pathway. If two organism sets are provided then a standard comparison is performed, which identifies reactions present or absent in their respective metabolic networks, whereas for more than two organism sets an all but one comparison is performed, which identifies reactions present (or absent) in only one organism set but absent (or present) in all the others.
Metabolic Similarity Analysis. Metabolic network comparison is a powerful method in comparative genomics providing insights into the characteristic metabolic features of organisms under study.12 By grouping the organisms based on their metabolic capabilities, specific hypotheses relating to specialization of metabolic networks can be generated which can then be experimentally tested in the lab. To this end, MAPPS allows users to perform agglomerative hierarchical clustering of metabolic networks of three or more organism sets (Supplementary Figure S6). Hierarchical clustering is an unsupervised machine learning technique which iteratively groups the data into clusters based on a similarity measure.56 In MAPPS, metabolic networks can be clustered based on their similarity between reactions, enzymes, or metabolic pathways between given source and target metabolites (Supplementary Figure S7). In addition, networks can be clustered using single, average, or maximum linkage method enabling users to explore the effects of selecting different methods on the resulting dendrogram.56 Single linkage selects the clusters containing closest pair of elements for joining together whereas maximum linkage selects the clusters with farthest pair of elements. Average linkage, on the other hand, combines the clusters with minimum average distance between all pairs of elements present in the clusters.
Host−Microbe Interaction. To allow a systematic approach to study interactions between different organisms, MAPPS provides an in silico platform to study the potential metabolic pathways resulting from interactions between host and microbe. The interaction between host and microbe leads to sharing of metabolites and can result in emergent pathways, for example as shown by a detailed analysis of cometabolism among host and its commensal microbe in a recent study.57 Taking host and microbe organism sets as input, metabolic pathways are predicted between source and target metabolites in the host, microbe, and combined metabolic networks taking into account all the constraints available for pathway prediction (see above). Metabolic pathways emerging as a result of interactions are reported separately, thus enabling identification of novel metabolic pathways that are not present in the host or microbe networks but which could be formed due to their interaction (Supplementary Figure S8).
Potential Drug Targets. Another type of analysis which might be very useful from the drug discovery point of view is the identification of potential drug targets. A number of studies have reported key metabolic enzymes as potential drug targets since they uniquely produce or consume a metabolite and their disruption leads to all related pathways being rendered as dysfunctional.13,58,59 MAPPS provides an option to search for potential drug targets by identifying reactions acting as bridges in a metabolic network (see the case study on drug target identification below). Bridge reactions are the reactions which if removed from the metabolic network will result in all pathways being eliminated between the specified source and target metabolites.34 These reactions can be used as potential drug targets to disrupt desired metabolic capabilities of an organism. Similar to many analyses described above, potential drug targets can be simultaneously identified in one or more organism sets using KEGG or custom data.
Estimate Evolution Parameters. Evolution of metabolic networks is characterized by loss and gain of reactions (or enzymes) connecting two or more metabolites.11,60 Using simple (independent loss/gain of reactions) or complex (incorporating dependencies among reactions) stochastic models of metabolic evolution, it is possible to study how metabolic networks evolve over time. We have incorporated stochastic models of metabolic network evolution which describe metabolic evolution as a continuous time Markov chain11,60 into MAPPS, which allows users to estimate evolution parameters (insertion rate, deletion rate, reaction dependencies) between two organism set(s) as well as over a phylogeny providing better insights into the evolution mechanisms of metabolic networks (Supplementary Figure S9). Using statistical models of network evolution to analyze metabolic networks will also enable users to test various biological hypotheses such as specialization of genomes and identification of regions of metabolic networks that are under high selection and to investigate how the evolution of metabolic networks relates to the evolution of underlying genomes and the environment.
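To make the "simple" (independent loss/gain) case concrete, the following is a minimal Python sketch, not the MAPPS implementation. It assumes each reaction evolves independently as a two-state continuous-time Markov chain (absent/present) with a single insertion rate and a single deletion rate; the function names, the toy reaction identifiers, and the rate values are all hypothetical, and the dependency-aware model cited in the text is not covered.

```python
import math

def transition_probs(ins_rate, del_rate, t):
    """Transition probabilities of a two-state CTMC (0 = absent, 1 = present) after time t."""
    total = ins_rate + del_rate
    decay = math.exp(-total * t)
    p_gain = ins_rate / total * (1.0 - decay)   # absent -> present
    p_loss = del_rate / total * (1.0 - decay)   # present -> absent
    return {(0, 0): 1.0 - p_gain, (0, 1): p_gain,
            (1, 0): p_loss, (1, 1): 1.0 - p_loss}

def log_likelihood(net_a, net_b, universe, ins_rate, del_rate, t=1.0):
    """Log-likelihood of observing net_b given net_a, assuming independent reactions."""
    probs = transition_probs(ins_rate, del_rate, t)
    return sum(math.log(probs[(int(r in net_a), int(r in net_b))])
               for r in universe)

# Toy data (hypothetical reaction identifiers): one loss (R3) and one gain (R4).
universe = {"R1", "R2", "R3", "R4", "R5"}
ancestor = {"R1", "R2", "R3"}
descendant = {"R1", "R2", "R4"}

# Crude grid search over candidate rates, purely for illustration.
grid = [0.05 * k for k in range(1, 41)]
best = max(((i, d) for i in grid for d in grid),
           key=lambda p: log_likelihood(ancestor, descendant, universe, *p))
print("illustrative rate estimates (insertion, deletion):", best)
```

A real estimator would use a proper optimizer or a sampling scheme and would sum over unobserved ancestral states on a phylogeny; the grid search here only shows how the two rates enter the likelihood.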
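The pathway definition given under Pathway Prediction and Comparison and the bridge-reaction definition given under Potential Drug Targets can be illustrated together with a short Python sketch. It is a simplified illustration under stated assumptions rather than the MAPPS algorithm: each reaction is stored as one hyperedge (a set of substrates and a set of products), the search follows one metabolite at a time by depth-first traversal, and the toy network, the function names, and the `avoid` parameter are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# reaction id -> (substrates, products); each reaction is a single hyperedge
Hypergraph = Dict[str, Tuple[FrozenSet[str], FrozenSet[str]]]

def find_pathways(net: Hypergraph, source: str, target: str, max_len: int,
                  avoid: FrozenSet[str] = frozenset()) -> List[List[str]]:
    """Enumerate reaction sequences from source to target by depth-first search.

    No reaction or metabolite is repeated within a pathway, so cycles are avoided.
    Metabolites in `avoid` are never used as intermediates.
    """
    pathways: List[List[str]] = []

    def dfs(current: str, used: List[str], seen: Set[str]) -> None:
        if current == target:
            pathways.append(list(used))
            return
        if len(used) == max_len:
            return
        for rid, (subs, prods) in sorted(net.items()):
            if rid in used or current not in subs:
                continue
            for nxt in sorted(prods - seen - avoid):
                used.append(rid)
                seen.add(nxt)
                dfs(nxt, used, seen)
                seen.discard(nxt)
                used.pop()

    dfs(source, [], {source})
    return pathways

def bridge_reactions(pathways: List[List[str]]) -> Set[str]:
    """Reactions present in every enumerated pathway; removing one eliminates them all."""
    if not pathways:
        return set()
    return set.intersection(*(set(p) for p in pathways))

# Toy network with two routes from A to E (all identifiers are made up).
TOY: Hypergraph = {
    "R1": (frozenset({"A"}), frozenset({"B"})),
    "R2": (frozenset({"B"}), frozenset({"C"})),
    "R3": (frozenset({"B"}), frozenset({"D"})),
    "R4": (frozenset({"D"}), frozenset({"C"})),
    "R5": (frozenset({"C"}), frozenset({"E"})),
}

routes = find_pathways(TOY, "A", "E", max_len=4)
print(routes)
print(bridge_reactions(routes))
```

In this toy network both routes pass through R1 and R5, so these two reactions are reported as bridges. Passing `avoid=frozenset({"D"})` mimics excluding a metabolite at prediction time and leaves only the shorter route.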

taken up by roots and reduced to sulfide, then incorporated into activated O-acetyl-L-serine to form cysteine62 (Figure 3A).

Visualize Metabolic Networks/Pathways. Finally, MAPPS provides an option for interactive visualization of metabolic
networks and pathways. This option is not limited to the
visualization of MAPPS output and can also be used to
visualize user-defined metabolic networks or pathways. It takes
as input an SBML file containing KEGG or custom data and
allows user to visualize the resulting network/pathways in an
interactive environment with an option to download the result
in PNG and PDF formats.
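As an illustration of the kind of input described above, the following Python sketch reads reactions from a small SBML document using only the standard library. The embedded document is a made-up two-reaction example with hypothetical identifiers, and the exact SBML level, version, and annotations that MAPPS reads or writes are not specified here, so the element layout is an assumption based on core SBML (`listOfReactions`, `listOfReactants`, `listOfProducts`, `speciesReference`).

```python
import xml.etree.ElementTree as ET

# Made-up SBML fragment; identifiers are hypothetical, not taken from KEGG.
SBML_EXAMPLE = """
<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_network">
    <listOfSpecies>
      <species id="M1" compartment="c"/>
      <species id="M2" compartment="c"/>
      <species id="M3" compartment="c"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R_a" reversible="false">
        <listOfReactants><speciesReference species="M1"/></listOfReactants>
        <listOfProducts><speciesReference species="M2"/></listOfProducts>
      </reaction>
      <reaction id="R_b" reversible="true">
        <listOfReactants><speciesReference species="M2"/></listOfReactants>
        <listOfProducts><speciesReference species="M3"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
""".strip()

def local_name(tag):
    """Strip the XML namespace so the code works across SBML levels."""
    return tag.rsplit("}", 1)[-1]

def read_reactions(sbml_text):
    """Return {reaction id: (substrates, products, reversible)} from SBML text."""
    root = ET.fromstring(sbml_text)
    reactions = {}
    for elem in root.iter():
        if local_name(elem.tag) != "reaction":
            continue
        substrates, products = [], []
        for child in elem:
            refs = [ref.get("species") for ref in child
                    if local_name(ref.tag) == "speciesReference"]
            if local_name(child.tag) == "listOfReactants":
                substrates = refs
            elif local_name(child.tag) == "listOfProducts":
                products = refs
        # When the attribute is absent, treat the reaction as reversible (SBML Level 2 default).
        reversible = elem.get("reversible", "true") == "true"
        reactions[elem.get("id")] = (substrates, products, reversible)
    return reactions

for rid, (subs, prods, rev) in read_reactions(SBML_EXAMPLE).items():
    arrow = "<=>" if rev else "->"
    print(rid, " + ".join(subs), arrow, " + ".join(prods))
```

For production use a dedicated SBML library would be a safer choice than hand parsing, since it validates the document and handles level-specific differences.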
In Silico Metabolic Engineering. With the recent
advances in high-throughput sequencing, metabolic engineer-
ing has opened new opportunities to design and analyze
heterologous biosynthetic systems.19,61 To facilitate the
process of metabolic engineering, MAPPS provides a provision
to perform in silico metabolic engineering by adding and/or
removing reactions or enzymes while building metabolic
networks (Supplementary Figure S10). This option is available
for all organism-based analyses described above involving
KEGG data and can be used to study the effects of in silico
knockout and knock-in metabolic mutations on organisms’
metabolic capabilities while performing different analyses. For
example, addition of enzymes/reactions in a metabolic network
enables users to determine the feasibility of engineering novel
metabolic pathways in the network while removal of existing
enzymes/reactions provides insight into the robustness of a
metabolic network by detecting alternative routes between
metabolites, and prediction of the potential effect of the
knockout on resulting metabolic pathways. To the best of our
knowledge no other tool provides the flexibility of simulta-
neously studying the effects of in silico knockout and knock-in
metabolic mutations (see the case study on in silico metabolic
engineering below). In addition, users can also compare the
results of metabolic engineering by running the analyses
simultaneously on the original and modified networks, a Figure 3. Sulfate assimilation pathway in Arabidopsis thaliana. (A)
Schematic diagram of the sulfate assimilation pathway (adapted from
feature not available in other tools. Kopriva et al.47). (B) Pathway predicted by MAPPS from sulfate
Network Filtering Using ‘Omics Data. Another (KEGG ID: C00087) to L -cysteine (KEGG ID: C00097)
distinguishing feature that sets MAPPS apart from the corresponding to the experimentally validated pathway shown in
currently available tools is the support for ‘omics data. (A). Enzymes and their corresponding EC numbers are shown in
MAPPS allows users to refine metabolic networks using blue, whereas KEGG reaction ids are shown in green. Sulfur tracing is
‘omics data at the network building step (see Methods). To shown by yellow circle.
this end, users can provide a list of genes/proteins/metabolites
along with their expression/concentration values and a cutoff In KEGG, this corresponds to a four-step pathway from sulfate
threshold, which is used to filter the metabolic networks (KEGG ID: C00059) to L-cysteine (KEGG ID: C00097) in
(Supplementary Figure S11). To provide flexibility to users, the sulfur metabolism map (KEGG map: 00920) (Supple-
MAPPS supports a number of public databases for ‘omics mentary Figure S12). We predicted the metabolic pathways
filtering. These include KEGG and NCBI identifiers of genes from sulfate to L-cysteine using MAPPS, restricting the search
for transcriptomics data, UniProt39 identifiers of enzymatic to the sulfur metabolism map. MAPPS successfully reported
proteins for proteomics data, and KEGG, ChEBI,42 and the above-mentioned pathway (Figure 3B). As noted above,
PubChem44 identifiers for metabolites. Allowing users to MAPPS allows users to define one or more elements as
compare metabolic annotations with other ‘omic data sets can required or to be excluded in the pathway search
provide greater insight into the metabolic capabilities of (Supplementary Figure S2), which helps in tracing or avoiding
organisms. For example, incorporation of expression data into a specific element in the reported metabolic pathways. To
pathway prediction is helpful in identifying enzymes that are demonstrate the usability of tracing constituent elements, we
coexpressed to give a functionally viable pathway, and in first computed the pathways from sulfate to L-cysteine across
mapping functionally related genes to gene clusters. Similarly, all KEGG pathway maps using metabolite pairing, which
mapping metabolomic data during pathway prediction can considers all possible pairs between source and target
help in identifying pathways that are active under different metabolites in a reaction without element tracing. MAPPS
experimental conditions and provide a detailed picture of what reported a total of six pathways from sulfate to L-cysteine
is going on inside a cell at a metabolic level. including the pathway shown in Figure 3B. Out of these, four
Case Studies. We demonstrate the functionalities of pathways, however, did not contain sulfur in one or more
MAPPS by analyzing the data from published studies below. intermediary metabolites. We next designated sulfur as a
Predicting Biologically Meaningful Metabolic Pathways required element and reran the pathway prediction. This time
by Tracing Specific Elements. In Arabidopsis thaliana, sulfur is MAPPS reported only two pathways with all sulfur containing

Figure 4. Examples demonstrating the use of in silico metabolic engineering in MAPPS to study effects of metabolic knockout and knock-in
mutations. (A) Schematic diagram showing Leloir pathway for metabolizing α-D-galactose in humans. (B) Metabolic pathways from α-D-galactose
to UDP-glucose predicted by MAPPS in human metabolic network. In addition to reporting the Leloir pathway shown in (A), MAPPS also reports
an alternate route to UDP-glucose via D-glucose-1P. In silico knockout of 2.7.1.6 and/or 2.7.7.12 from the human metabolic network results in both
pathways between α-D-galactose and UDP-glucose being eliminated whereas removal of 5.1.3.2 results in the Leloir pathway being eliminated (see
Supplementary Figure S15). (C) A heterologous biosynthetic pathway for producing flavonoid precursor naringenin from L-tyrosine in Escherichia
coli.14 MAPPS reports this heterologous pathway in the modified E. coli metabolic network with enzymes tyrosine ammonia lyase (TAL, EC
4.3.1.23), 4-coumarate:CoA ligase (4CL, EC 6.2.1.12), chalcone synthase (CHS, EC 2.3.1.74), and chalcone isomerase (CHI, EC 5.5.1.6) added
through in silico metabolic engineering. Enzymes and their corresponding EC numbers are shown in blue, whereas KEGG reaction ids are shown in
green.
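The knockout behaviour summarized in the Figure 4 caption can be illustrated with a small sketch. This is not the MAPPS implementation (which is written in C# and documented in Supplementary Note S1); it is a minimal Python approximation of a depth-first pathway search with metabolite pairing in which no metabolite or reaction is reused. The `Reaction` type, the helper names, the short reaction labels, and the reduced stoichiometry (cofactors and co-substrates omitted) are all assumptions made for illustration; only the EC numbers come from the text.

```python
from typing import NamedTuple


class Reaction(NamedTuple):
    label: str         # illustrative label, not a KEGG reaction id
    ec: str            # EC number of the catalyzing enzyme
    substrates: tuple  # main substrates only (cofactors omitted)
    products: tuple    # main products only


def find_pathways(reactions, source, target, min_len=1, max_len=10):
    """List every route from source to target as a list of reaction labels.

    Depth-first search with metabolite pairing: any substrate of a reaction
    may lead to any of its products. No metabolite or reaction is reused
    within a route, so cycles cannot occur.
    """
    found = []

    def dfs(metabolite, route, visited):
        if metabolite == target:
            if len(route) >= min_len:
                found.append([rxn.label for rxn in route])
            return
        if len(route) == max_len:
            return
        for rxn in reactions:
            if metabolite not in rxn.substrates or rxn in route:
                continue
            for product in rxn.products:
                if product not in visited:
                    dfs(product, route + [rxn], visited | {product})

    dfs(source, [], {source})
    return found


def knock_out(reactions, ec_numbers):
    """Return a copy of the network without the listed enzymes."""
    return [rxn for rxn in reactions if rxn.ec not in ec_numbers]


# Simplified galactose network (assumed topology for illustration);
# GALT is modelled as yielding both UDP-galactose and glucose-1P.
NETWORK = [
    Reaction("GALK", "2.7.1.6", ("galactose",), ("galactose-1P",)),
    Reaction("GALT", "2.7.7.12", ("galactose-1P",), ("UDP-galactose", "glucose-1P")),
    Reaction("GALE", "5.1.3.2", ("UDP-galactose",), ("UDP-glucose",)),
    Reaction("UGP2", "2.7.7.9", ("glucose-1P",), ("UDP-glucose",)),
]

wild_type = find_pathways(NETWORK, "galactose", "UDP-glucose")
no_galk = find_pathways(knock_out(NETWORK, {"2.7.1.6"}), "galactose", "UDP-glucose")
no_gale = find_pathways(knock_out(NETWORK, {"5.1.3.2"}), "galactose", "UDP-glucose")
print(wild_type)  # two routes: one via GALE, one via UGP2
print(no_galk)    # no route remains
print(no_gale)    # only the route via UGP2 remains
```

A knock-in can be mimicked in the same sketch by appending further `Reaction` entries to the list before searching, which is how the naringenin example in panel (C) could be approximated.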

metabolites, including the experimentally validated pathway described above. The other pathway differed by only one reaction/enzyme from the pathway shown in Figure 3A, suggesting an alternate route to L-cysteine (Supplementary Figure S13).

Studying the Effects of In Silico Metabolic Knockout and Knock-in Mutation. We demonstrate the use of the in silico metabolic engineering option available in MAPPS to study the effects of knockout and knock-in metabolic mutations on metabolic pathways. For knockout mutations we used a published study relating to galactose metabolism. α-D-galactose is metabolized in humans via a sequential pathway known as the Leloir pathway (Figure 4A), where α-D-galactose is first phosphorylated by the enzyme galactokinase (GALK, EC 2.7.1.6) to produce α-D-galactose-1P, which is then converted into UDP-galactose by the enzyme galactose-1-P uridylyltransferase (GALT, EC 2.7.7.12), followed by interconversion of UDP-galactose and UDP-glucose through UDP-galactose 4′-epimerase (GALE, EC 5.1.3.2). Individuals with defects in any one of these enzymes are unable to properly metabolize milk sugar, leading to galactosemia, which is an inherited metabolic disorder.63 Pathway prediction between α-D-galactose (KEGG ID: C00984) and UDP-glucose (KEGG ID: C00029) using MAPPS on the human galactose metabolic network (KEGG map: 00052) resulted in two pathways being reported, including the Leloir pathway (Figure 4B). The alternate pathway differs only at the last step, which uses α-D-glucose-1P to produce UDP-glucose via the enzyme UTP-glucose-1-phosphate uridylyltransferase (UGP2, EC 2.7.7.9) instead of using UDP-galactose, and forms a part of the UDP-α-D-glucose biosynthesis I pathway in MetaCyc.18 We next used the in silico metabolic engineering option to remove the enzyme galactokinase from the human metabolic network (Supplementary Figure S14) and reran the pathway prediction. No metabolic pathways were reported in this case (Supplementary Figure S15). A similar result was obtained when the enzyme galactose-1-P uridylyltransferase was removed, and only one pathway through α-D-glucose-1P was reported when UDP-galactose 4′-epimerase was removed from the human metabolic network (Supplementary Figure S15).

Another interesting use of metabolic engineering is to design heterologous biosynthetic pathways by incorporating foreign enzymes into a host. To demonstrate this, we used the in silico metabolic engineering option available in MAPPS to reproduce a heterologous pathway for flavonoid production from L-tyrosine in Escherichia coli14 (Figure 4C). We added four enzymes, namely, tyrosine ammonia lyase (TAL, EC 4.3.1.23), 4-coumarate:CoA ligase (4CL, EC 6.2.1.12), chalcone synthase (CHS, EC 2.3.1.74), and chalcone isomerase (CHI, EC 5.5.1.6), to the E. coli metabolic network and compared pathways from L-tyrosine (KEGG ID: C00082) to the main flavonoid precursor naringenin (KEGG ID: C00509) in the original and modified E. coli networks (Supplementary Figure S16). While no pathway was reported in the original network between L-tyrosine and naringenin, a pathway utilizing the newly added enzymes was reported in the modified E. coli network (Figure 4C).

Identification of Potential Drug Targets. MAPPS allows users to identify potential drug targets in a metabolic network by identifying reactions which, if removed from the metabolic network, will result in all pathways being eliminated between the specified source and target metabolites. For example, prostaglandin-endoperoxide synthase (EC 1.14.99.1) is a reported target of many anti-inflammatory drugs including aspirin and ibuprofen, and catalyzes the conversion of arachidonic acid (KEGG ID: C00219) to prostaglandin H2

Figure 5. An application of potential drug target identification and host−microbe interaction in MAPPS. (A) The enzyme prostaglandin-
endoperoxide synthase (EC 1.14.99.1) catalyzes two-step conversion of arachidonic acid to prostaglandin H2, the precursor for all prostanoids
including those shown here, and is a reported target of many anti-inflammatory drugs including aspirin and ibuprofen. MAPPS identified both
reactions (R00073 and R01590) catalyzed by this enzyme as potential drug targets for disrupting metabolic pathways from arachidonic acid to
various prostanoids. (B) Rickettsia parkeri lacks upstream enzymes to produce isopentenyl pyrophosphate (IPP) from mevalonate (MEV)
biosynthesis pathway. It uses human MEV pathway to produce Octaprenyl diphosphate (C8−PP) and Undecaprenyl diphosphate (C55−PP) which
are precursors for ubiquinone synthesis and peptidoglycan synthesis, respectively. MAPPS correctly predicted metabolic pathways from
Mevalonate, via IPP produced by human metabolic enzymes, to C8−PP and C55−PP. Enzymes and their corresponding EC numbers are shown in
blue, reaction present in both, host and microbe, is shown in green, while reactions present in only humans are shown in yellow and the microbe-
specific reactions are shown in orange color.
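Both panels of Figure 5 rest on simple graph questions, which the following sketch tries to make concrete. It is a hedged illustration rather than the MAPPS algorithm: it uses a breadth-first reachability test instead of full pathway enumeration, so it ignores pathway length limits and all filtering options. The function names, the `branch_*`, `host_MEV`, and `microbe_prenyl` labels, the placeholder metabolites `prostanoid` and `intermediate`, and the collapsed single-step host and microbe networks are hypothetical; the assignment of R01590 and R00073 to the first and second step is also an assumption.

```python
from collections import deque


def reachable(reactions, source, target):
    """Breadth-first check for any route from source to target."""
    queue, seen = deque([source]), {source}
    while queue:
        metabolite = queue.popleft()
        if metabolite == target:
            return True
        for _label, substrates, products in reactions:
            if metabolite in substrates:
                for product in products:
                    if product not in seen:
                        seen.add(product)
                        queue.append(product)
    return False


def candidate_drug_targets(reactions, source, target):
    """Reactions whose individual removal leaves no route to the target."""
    if not reachable(reactions, source, target):
        return []
    return [
        label
        for label, _subs, _prods in reactions
        if not reachable([r for r in reactions if r[0] != label], source, target)
    ]


def is_emergent(host, microbe, source, target):
    """True if a route exists only once both networks are combined."""
    alone = reachable(host, source, target) or reachable(microbe, source, target)
    return not alone and reachable(host + microbe, source, target)


# Panel A toy: two shared upstream steps, then two hypothetical parallel branches.
PROSTANOID = [
    ("R01590", ("arachidonic acid",), ("PGG2",)),  # id-to-step order assumed
    ("R00073", ("PGG2",), ("PGH2",)),
    ("branch_1", ("PGH2",), ("prostanoid",)),
    ("branch_2a", ("PGH2",), ("intermediate",)),
    ("branch_2b", ("intermediate",), ("prostanoid",)),
]

# Panel B toy: the host supplies IPP, the microbe consumes it (steps collapsed).
HOST = [("host_MEV", ("mevalonate",), ("IPP",))]
MICROBE = [("microbe_prenyl", ("IPP",), ("C55-PP",))]

targets = candidate_drug_targets(PROSTANOID, "arachidonic acid", "prostanoid")
emergent = is_emergent(HOST, MICROBE, "mevalonate", "C55-PP")
print(targets)   # the two upstream reactions; neither parallel branch is essential
print(emergent)  # a route appears only in the combined network
```

In this toy network the parallel branches downstream of PGH2 can each be removed without disconnecting the target, so only the two shared upstream reactions are flagged, which mirrors the reasoning described for panel (A).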

(PGH2, KEGG ID: C00427) via prostaglandin G2 (PGG2, KEGG ID: C05956) in two steps.64 PGH2 is the precursor for all prostanoids including prostaglandins, thromboxanes, and prostacyclins and is a key metabolite in arachidonic acid metabolism (Figure 5A). MAPPS reports the two reactions (R00073 and R01590) catalyzed by the enzyme prostaglandin-endoperoxide synthase as potential drug targets for disrupting metabolic pathways from arachidonic acid to various prostanoids (Figure 5A, Supplementary Figure S17).

Another example demonstrating the efficacy of MAPPS in identifying potential drug targets relates to the phosphotransferase system (PTS) for transporting sugar into bacteria. PTS, which is crucial for bacterial growth, is specific to prokaryotes and thus can serve as a potential drug target.65 It has been reported that replication and survival of Salmonella enterica serovar Typhimurium (S. typhimurium), which causes gastroenteritis and fatal typhoid, in mice depends on glucose and glycolysis.66 To identify potential enzymes from the PTS which can be used as drug targets, we ran the drug target identification module in MAPPS, predicting the pathways between D-glucose (KEGG ID: C00031) and pyruvate (KEGG ID: C00022) while considering the whole of carbohydrate metabolism. Reaction R02738, which is catalyzed by Enzyme IIGlc (EC 2.7.1.199), was reported by MAPPS as the potential drug target (Supplementary Figure S18). Enzyme IIGlc is a key component of the PTS and is involved in the transport of glucose across the membrane as well as its phosphorylation.65 To confirm the effect of enzyme removal, we compared the metabolic pathways from glucose to pyruvate in the original S. typhimurium metabolic network against the one with Enzyme IIGlc removed using in silico metabolic engineering (see above). While multiple pathways were reported in the unmodified network (Supplementary Figure S18), no pathway was reported in the modified network, suggesting that Enzyme IIGlc can indeed be used as a potential drug target to prevent S. typhimurium infection in mammals.

Studying Emergent Pathways Resulting from Host−Microbe Interaction. MAPPS allows users to identify novel

pathways emerging due to interaction between host and microbe, enabling them to study the metabolic basis of the host−microbe interface. We demonstrate this by exploring a hypothesis relating to isoprenoid biosynthesis, discussed in detail elsewhere,67 that Rickettsia parkeri, a gram-negative obligate intracellular parasite that causes typhus and spotted fever in humans, lacks the necessary enzymes to produce isopentenyl pyrophosphate (IPP; KEGG ID: C00129), the central precursor molecule for producing isoprenoids, and its isomer dimethylallyl diphosphate (DMAPP; KEGG ID: C00235). Instead, R. parkeri uses the human mevalonate (MEV) pathway as the upstream source of IPP for its own production of bactoprenols, which are essential building blocks for peptidoglycan and other cell wall polysaccharides, and ubiquinone, a coenzyme involved in the electron transport chain.68 To investigate the metabolic interaction between R. parkeri and human in the isoprenoid biosynthetic pathway, we used the host−microbe interaction option available in MAPPS, setting human and R. parkeri as host and microbe networks, respectively, and predicted metabolic pathways from mevalonate (KEGG ID: C00418) to Octaprenyl diphosphate (C8−PP; KEGG ID: C04146) and Undecaprenyl diphosphate (C55−PP; KEGG ID: C04574), which are precursors for ubiquinone and peptidoglycan synthesis, respectively, in gram-negative bacteria.67 While no pathways were reported in the host or microbe networks separately, MAPPS predicted multiple emergent pathways between mevalonate and the two precursor metabolites C8−PP and C55−PP in the combined network (Figure 5B), which use the human MEV pathway as the upstream source of isoprene units for the synthesis of bacterial bactoprenols and ubiquinone. The reported pathways not only match the route suggested to be taken by R. parkeri for isoprenoid biosynthesis67 but also rightly identify the enzyme isopentenyl diphosphate isomerase (EC 5.3.3.2), which catalyzes the reversible conversion of IPP to DMAPP,69 as the only enzyme present in both humans and R. parkeri in this pathway, thus demonstrating the potential of MAPPS in studying pathway-based host−microbe interactions. To the best of our knowledge, no other tool allows pathway-based analysis of host−microbe interactions.

■ CONCLUSION
In summary, MAPPS provides a powerful resource for metabolic pathway prediction and comparison, specialized analyses such as drug target identification, in silico metabolic engineering by adding/removing metabolic reactions or enzymes, detection of metabolite-specific reactions, analysis of the effects of host−microbe interactions, and the study of metabolic evolution using traditional as well as stochastic models. MAPPS also has an ‘omics pipeline to refine the pathway results using transcriptomic, proteomic, or metabolomic data to provide greater insight into the metabolic capabilities of organisms, making it relevant in today’s postgenomic era.

Currently, MAPPS uses KEGG as its primary data source with an option to upload custom data to build metabolic networks. Adding other data sources such as Reactome17 and BioCyc18 would enhance the capabilities of MAPPS and make it more useful to the scientific community. Besides this, MAPPS uses a depth-first search of the metabolic graphs to compute pathways between metabolites. Consequently, jobs which employ exhaustive search by using metabolite pairing to establish connections between metabolites, or which use all KEGG pathways as the underlying data set, require a long time to run due to an exponential increase in the search space. This can be sped up by using parallel programming and hybrid search algorithms to optimize pathway searches. Finally, it might be useful to incorporate an annotation pipeline using ‘omics data sets in MAPPS that facilitates analysis and provides users with the option of using the comparative functional annotation approach to generate metabolic annotations, particularly in the case of draft and incompletely annotated genomes.

■ METHODS
Data Resource. MAPPS uses KEGG as the primary data source for organisms, compounds, reactions, and metabolic pathway maps.16 External mapping for compounds and enzymes to other public databases is obtained through the KEGG API, and reaction directions are extracted by parsing KEGG Markup Language (KGML) files of reference reaction maps. Data downloaded from KEGG are stored in the MAPPS database and updated fortnightly. MAPPS provides an option to incorporate reaction energies and metabolite structural similarity into the pathway scoring scheme (see below). For this, standard Gibbs energies of KEGG reactions for pH values ranging from 5 to 9 were downloaded from eQuilibrator70 and structural similarity scores of metabolites were obtained through the REST API of SIMCOMP2.71

MAPPS Architecture. MAPPS is built on the .NET framework and is hosted on a Dell PowerEdge R740 Server with two Intel(R) Xeon Silver 4110 2.1 GHz CPUs with 8 cores each and 64 GB of memory, running Windows Server 2012 and attached to a Dell PowerVault MD1200 storage box with 10 TB of storage space. MAPPS is designed to minimize the dependency among its components and enhance scalability (Supplementary Figure S19). It is divided into the following parts.

Listener. The listener program is written in C# and contains the core algorithms of MAPPS. It runs as a multithreaded process and performs three primary functions: database polling, job status monitoring, and job execution. The database polling thread samples the database at regular intervals to fetch newly submitted jobs. If it finds an unprocessed job, it adds it to the job queue. A separate thread, which actively monitors the status of queued entries, starts an independent thread to execute the job and removes it from the queue. A total of 10 jobs can be executed in parallel.

Application Programming Interface (API). The API provides a public interface to communicate with the MAPPS database. It takes a job from the Web site, submits it to the database, retrieves job parameters and results from the database, and sends them to the Web site.

Database. A MySQL database stores data downloaded from KEGG and other public databases (see above). In addition, it also stores external links, job parameters, and results. The results are stored in the database for 3 days for guest users and 15 days for registered users.

Web Site. The MAPPS Web site provides an interactive platform to submit complex queries. It is developed in ASP.net, HTML, CSS, and various JavaScript libraries (JQuery, AJAX, JointJS, Bootstrap, Dagre, and Vectorizer). It communicates with the database through the API for retrieving data and submitting jobs, and provides quick access to relevant public databases.

Provision for Custom Data. Although MAPPS uses KEGG as the primary data resource, it also allows users to

analyze custom metabolic networks. Custom networks can be provided with or without KEGG identifiers. The user can upload custom metabolic networks based on KEGG identifiers in one of three supported formats, namely Systems Biology Markup Language (SBML), KEGG Markup Language (KGML), or a tab-delimited text file containing a list of KEGG reaction IDs, to build metabolic network(s), whereas metabolic networks containing non-KEGG identifiers can be provided in SBML format. Support for custom data removes the reliance of the tool on KEGG, allowing pathway prediction and other analyses on draft genomes for which metabolic data is not available in KEGG.

Metabolic Network Building. Metabolic network building is the key step for most of the analyses available in MAPPS. The network can be built using KEGG or custom data depending on user choice. As described above, custom data can be provided in SBML format. When using KEGG data, users can either choose to build a network using all pathway maps available in KEGG or select one or more pathway maps (Supplementary Figure S3). Users can then build a network over one or more organisms (called an “Organism Set”), use a KEGG reference network that contains all reactions present in the selected pathways, or upload a custom network based on KEGG annotations (see above). MAPPS provides biologically informative options for selecting organism(s), with a choice between using an autocomplete search box or selecting organisms using an expandable tree view in which organisms are grouped taxonomically. When using multiple organisms in an organism set, users can choose one of the following three modes to build the metabolic network.

(1) All Reactions: Includes all reactions from the selected organisms with duplicates removed.
(2) Common Reactions: Includes only those reactions which are present in all of the selected organisms.
(3) Reaction Neighborhood: Includes only those reactions for which the proportion of neighboring reactions present in the network formed by combining all reactions is greater than the specified cutoff.12

While building metabolic networks, reactions/enzymes can be added to or removed from these networks using in silico metabolic engineering and/or filtered using ‘omics data sets (see below).

For phylogeny-based analyses, users can provide a phylogeny using KEGG organism codes or custom data. The phylogeny must be specified in a format based on the Newick format72 in which each subtree must have exactly two children. For example, (A,B), ((A,B),C), and ((A,B),(C,D)) are considered to be valid phylogenies, whereas phylogenies such as (), (A), and (A,B,C) are regarded as invalid.

Network Filtering Using ‘Omics Data. MAPPS allows users to provide processed ‘omics data for filtering metabolic networks at the network building step (Supplementary Figure S11). The user can upload a tab-delimited file containing a list of unique identifiers from one of the supported public databases (see above) in the first column and values of samples in the second column, and specify a cutoff value for filtering the data when submitting the job. The nodes/edges that do not meet the required cutoff values are eliminated from the metabolic network. In the case of transcriptomic data, metabolic reactions (edges) are filtered using gene-enzyme-reaction mapping of the provided genes. For proteomics data, the user provides a list of enzymatic proteins to filter reactions catalyzed by these enzymes. In the case of metabolomics data, reactions are filtered based on the presence/absence of metabolites acting as substrates in those reactions.

Pathway Prediction. MAPPS computes pathways using a depth-first search of metabolic networks and reports all possible pathways between the source and target metabolites. A pathway between two metabolites is defined as a connected sequence of reactions, such that the product of one reaction acts as a substrate in the next reaction, that satisfies user-defined constraints. Users can select KEGG Reaction Class (RCLASS),16 which contains information relating to the main reactant pairs of KEGG reactions, to compute connections between metabolites, thus allowing pathway searches to be optimized,73 or use metabolite pairing, where all possible pairs between substrates and products of a given reaction are considered to establish the connection between reactions. While predicting metabolic pathways, MAPPS takes into account reaction directions and computes all possible pathways between the source and target metabolites. MAPPS does not allow metabolites or reactions to be repeated in a pathway, thus avoiding cycles, and reports only direct routes between metabolites. While submitting a pathway prediction job, users must specify minimum and maximum lengths of the pathways to be reported. All pathways outside of the specified range are ignored. Currently, pathways up to 10 reactions in length can be computed by MAPPS.

To refine the pathway search, users can define additional constraints on metabolites and/or reactions/enzymes during pathway prediction (Supplementary Figure S2). At the metabolite level, users can specify a list of metabolites to be avoided or required during pathway prediction. Pathways passing through metabolites which are designated to be ignored are not reported. By default, ubiquitous metabolites such as ATP, AMP, O2, H2O, and CO2 (Supplementary Table S1) are ignored during pathway prediction. If one or more metabolites are stipulated to be required, then only those pathways that contain these required metabolite(s) are reported. In addition, MAPPS also provides an option to filter metabolites based on their connectivity scores (see below), since it has been previously reported that metabolite filtering by assigning weights to metabolites based on their connectivity in the metabolic network narrows down the search space and helps in reporting biologically relevant pathways.74 To this end, users can specify a metabolite connectivity cutoff value to filter compounds based on their degree in the underlying metabolic network. Besides metabolites, users can also specify whether one or more reactions/enzymes must be present or are to be avoided during pathway prediction (Supplementary Figure S2). To enhance reporting of biologically meaningful pathways, MAPPS also provides an option of filtering intermediate metabolites by presence or absence of constituent elements including nitrogen, oxygen, phosphorus, sulfur, bromine, manganese, and zinc while computing pathways between source and target metabolites. The user has the flexibility to define one or more chemical elements as required (element(s) must be present in all intermediate metabolites of the predicted pathway) or ignored (element(s) must not be present in the intermediary metabolites). By default, all elements are regarded as optional (Supplementary Figure S2). Finally, users can also choose to incorporate structural similarity between consecutive metabolites while predicting metabolic pathways. Structural similarity scores are based on SIMCOMP2 calculations71 and a user-specified cutoff is used

to filter out intermediary metabolites during pathway prediction. The pseudocode for pathway prediction is given as Supplementary Note S1 (Supporting Information).

Pathway Ranking. MAPPS reports all possible pathways between source and target metabolites that satisfy user constraints, enabling discovery of previously unknown pathways. To help focus on biologically meaningful pathways, pathways can be ranked based on pathway length, number of reversible reactions, pathway connectivity score, and a pathway score based on the scoring scheme introduced by Huang et al.45 Pathway connectivity score is calculated as follows. For each metabolite present in the underlying network, first all reactions in which that metabolite is acting as a substrate are identified. Next, the total number of metabolites acting as products in these reactions is calculated by adding the number of product metabolites in individual reactions. This number is then normalized by dividing it by the maximum value for all the metabolites to get the metabolite connectivity score for that metabolite. The pathway connectivity score is then calculated as the sum of the log of metabolite connectivity scores of all intermediary metabolites involved in the pathway.

The scoring scheme introduced by Huang et al.45 combines reaction thermodynamics information and structural similarity to calculate similarity between any two metabolites. MAPPS computes a pathway score using the similarity scores between consecutive metabolites in a metabolic pathway. In this scheme, the score W_ij between any two consecutive metabolites v_i and v_j in a metabolic pathway is calculated as45

W_ij = α(1 − sim(v_i, v_j)) + (1 − α)(3200 + f_e(r_ij))/10000

where α is the proportional contribution of compound similarity and Gibbs free energy in the score (set to 0.5 in MAPPS), sim(v_i, v_j) is the structural similarity between metabolites v_i and v_j (obtained using SIMCOMP271 in MAPPS), and f_e(r_ij) is the Gibbs free energy of the reaction involving v_i and v_j (downloaded from eQuilibrator70). MAPPS then calculates the pathway score by averaging the similarity scores across the length n of the pathway as shown below.

Pathway Score = (∑ W_ij)/n

■ ASSOCIATED CONTENT
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acssynbio.9b00397.

Note S1: Pseudocode for pathway prediction in MAPPS
Figure S1: Flowchart describing MAPPS workflow
Figure S2: MAPPS interface for specifying pathway related parameters
Figure S3: Specifying one or more organism set(s) to build metabolic network(s) in MAPPS
Figure S4: Graphical output for metabolic reachability
Figure S5: Ancestral network building and comparison in MAPPS
Figure S6: An example dendrogram resulting from metabolic similarity analysis between three organisms
Figure S7: MAPPS user interface for performing metabolic similarity analysis
Figure S8: Graphical output showing pathways resulting due to host−microbe interactions
Figure S9: User interface for estimation of evolution parameters in MAPPS
Figure S10: Performing in silico metabolic engineering in MAPPS
Figure S11: Network filtering using ‘omics data
Figure S12: KEGG pathway map of sulfur metabolism
Figure S13: Alternate pathway from sulfate to L-cysteine predicted by MAPPS
Figure S14: Building human galactose metabolic network with in silico knockout
Figure S15: Effect of in silico knockouts on predicted pathways between α-D-galactose and UDP-glucose in human galactose metabolic networks
Figure S16: Comparing Escherichia coli metabolic networks with and without in silico knock-in mutations
Figure S17: MAPPS output showing identification of potential drug targets
Figure S18: Identification of Enzyme IIGlc (EC 2.7.1.199) as the potential drug target in Salmonella enterica serovar Typhimurium
Figure S19: MAPPS architecture
Table S1: List of ubiquitous metabolites ignored by default during pathway prediction in MAPPS
(PDF)

■ AUTHOR INFORMATION
Corresponding Author
Aziz Mithani − Department of Biology, Syed Babar Ali School of Science and Engineering, Lahore University of Management Sciences (LUMS), DHA, Lahore 54792, Pakistan; orcid.org/0000-0002-2214-3526; Email: aziz.mithani@lums.edu.pk

Authors
Muhammad Rizwan Riaz − Department of Biology, Syed Babar Ali School of Science and Engineering, Lahore University of Management Sciences (LUMS), DHA, Lahore 54792, Pakistan
Gail M. Preston − Department of Plant Sciences, University of Oxford, Oxford OX1 3RB, U.K.

Complete contact information is available at: https://pubs.acs.org/10.1021/acssynbio.9b00397

Author Contributions
AM and GP conceived MAPPS. MRR developed MAPPS and performed the analyses. MRR and AM wrote the manuscript. GP contributed to the review and editing of the manuscript. All authors approved the final version of the manuscript.

Notes
The authors declare no competing financial interest.
MAPPS is freely accessible at https://mapps.lums.edu.pk.

■ ACKNOWLEDGMENTS
We thank Safee Ullah Chaudhary, Muhammad Tariq, Suleman Shahid, Abdul Rehman Basharat, Muhammad Faizyab Ali Chaudhary for their valuable comments and suggestions. We would also like to thank Sameed Ali, Muhammad Haseeb, Faisal Zulfiqar, and Haseeb Shaukat for their contribution in improving the graphical output module. This work was supported by a grant (Grant No. 20-2516/NRPU/R&D/HEC/13) from Higher Education Commission of Pakistan to AM.

■ REFERENCES
(1) Jeong, H., Tombor, B., Albert, R., Oltvai, Z. N., and Barabasi, A.-L. (2000) The Large-Scale Organization of Metabolic Networks. Nature 407 (6804), 651−654.
(2) Feng, X., Zhuang, W. Q., Colletti, P., and Tang, Y. J. (2012) Metabolic Pathway Determination and Flux Analysis in Nonmodel Microorganisms through 13C-Isotope Labeling. Methods Mol. Biol. 881, 309−330.


(3) Schomburg, D. D., and Michal, G. (2012) Biochemical Pathways: An Atlas of Biochemistry and Molecular Biology, John Wiley & Sons.
(4) Fuchs, T. M., Eisenreich, W., Heesemann, J., and Goebel, W. (2012) Metabolic Adaptation of Human Pathogenic and Related Nonpathogenic Bacteria to Extra- and Intracellular Habitats. FEMS Microbiol. Rev. 36 (2), 435−462.
(5) Carbonell, P., Parutto, P., Herisson, J., Pandit, S. B., and Faulon, J. L. (2014) XTMS: Pathway Design in an EXTended Metabolic Space. Nucleic Acids Res. 42 (W1), W389.
(6) Fang, C., Fernie, A. R., and Luo, J. (2019) Trends Plant Sci. 24, 83.
(7) Mentzen, W. I., Peng, J., Ransom, N., Nikolau, B. J., and Wurtele, E. (2008) Articulation of Three Core Metabolic Processes in Arabidopsis: Fatty Acid Biosynthesis, Leucine Catabolism and Starch Metabolism. BMC Plant Biol. 8 (1), 76.
(8) Smith, E., and Morowitz, H. J. (2004) Universality in Intermediary Metabolism. Proc. Natl. Acad. Sci. U. S. A. 101 (36), 13168−13173.
(9) Dandekar, T., Schuster, S., Snel, B., Huynen, M., and Bork, P. (1999) Pathway Alignment: Application to the Comparative Analysis of Glycolytic Enzymes. Biochem. J. 343 (1), 115−124.
(10) Planes, F. J., and Beasley, J. E. (2009) Path Finding Approaches and Metabolic Pathways. Discret. Appl. Math. 157 (10), 2244−2256.
(11) Mithani, A., Preston, G. M., and Hein, J. (2010) A Bayesian Approach to the Evolution of Metabolic Networks on a Phylogeny. PLoS Comput. Biol. 6 (8), e1000868.
(12) Mithani, A., Hein, J., and Preston, G. M. (2011) Comparative Analysis of Metabolic Networks Provides Insight into the Evolution of Plant Pathogenic and Nonpathogenic Lifestyles in Pseudomonas. Mol. Biol. Evol. 28 (1), 483−499.
(13) Taylor, C. M., Wang, Q., Rosa, B. A., Huang, S. C. C., Powell, K., Schedl, T., Pearce, E. J., Abubucker, S., and Mitreva, M. (2013) Discovery of Anthelmintic Drug Targets and Drugs Using Chokepoints in Nematode Metabolic Pathways. PLoS Pathog. 9 (8), e1003505.
(14) Santos, C. N. S., Koffas, M., and Stephanopoulos, G. (2011) Optimization of a Heterologous Pathway for the Production of Flavonoids from Glucose. Metab. Eng. 13 (4), 392−400.
(15) Jing, L. S., Shah, F. F. M., Mohamad, M. S., Hamran, N. L., Salleh, A. H. M., Deris, S., and Alashwal, H. (2014) Database and Tools for Metabolic Network Analysis. Biotechnol. Bioprocess Eng. 19 (4), 568−585.
(16) Kanehisa, M., Furumichi, M., Tanabe, M., Sato, Y., and Morishima, K. (2017) KEGG: New Perspectives on Genomes, Pathways, Diseases and Drugs. Nucleic Acids Res. 45 (D1), D353−D361.
(17) Fabregat, A., Sidiropoulos, K., Garapati, P., Gillespie, M., Hausmann, K., Haw, R., Jassal, B., Jupe, S., Korninger, F., McKay, S., et al. (2016) The Reactome Pathway Knowledgebase. Nucleic Acids Res. 44 (D1), D481−D487.
(18) Caspi, R., Billington, R., Ferrer, L., Foerster, H., Fulcher, C. A., Keseler, I. M., Kothari, A., Krummenacker, M., Latendresse, M., Mueller, L. A., et al. (2016) The MetaCyc Database of Metabolic Pathways and Enzymes and the BioCyc Collection of Pathway/Genome Databases. Nucleic Acids Res. 44 (D1), D471−D480.
(19) Tomar, N., and De, R. K. (2013) Comparing Methods for Metabolic Network Analysis and an Application to Metabolic Engineering. Gene 521 (1), 1−14.
(20) Rahman, S. A., Advani, P., Schunk, R., Schrader, R., and Schomburg, D. (2005) Metabolic Pathway Analysis Web Service (Pathway Hunter Tool at CUBIC). Bioinformatics 21 (7), 1189−1193.
(21) Chou, C. H., Chang, W. C., Chiu, C. M., Huang, C. C., and Huang, H. Da (2009) FMM: A Web Server for Metabolic Pathway Reconstruction and Comparative Analysis. Nucleic Acids Res. 37 (SUPPL. 2), 129−134.
(22) Xia, D., Zheng, H., Liu, Z., Li, G., Li, J., Hong, J., and Zhao, K. (2011) MRSD: A Web Server for Metabolic Route Search and Design. Bioinformatics 27 (11), 1581−1582.
(23) Ogata, H., Goto, S., Fujibuchi, W., and Kanehisa, M. (1998) Computation with the KEGG Pathway Database. BioSystems 47 (1−2), 119−128.
(24) Moriya, Y., Shigemizu, D., Hattori, M., Tokimatsu, T., Kotera, M., Goto, S., and Kanehisa, M. (2010) PathPred: An Enzyme-Catalyzed Metabolic Pathway Prediction Server. Nucleic Acids Res. 38 (SUPPL. 2), 138−143.
(25) Kuwahara, H., Alazmi, M., Cui, X., and Gao, X. (2016) MRE: A Web Tool to Suggest Foreign Enzymes for the Biosynthesis Pathway Design with Competing Endogenous Reactions in Mind. Nucleic Acids Res. 44, W217.
(26) Hadadi, N., Hafner, J., Shajkofci, A., Zisaki, A., and Hatzimanikatis, V. (2016) ATLAS of Biochemistry: A Repository of All Possible Biochemical Reactions for Synthetic Biology and Metabolic Engineering Studies. ACS Synth. Biol. 5 (10), 1155−1166.
(27) Heath, A. P., Bennett, G. N., and Kavraki, L. E. (2011) Identifying Branched Metabolic Pathways by Merging Linear Metabolic Pathways. Lect. Notes Comput. Sci. 6577, 70−84.
(28) Khosraviani, M., Zamani, M. S., and Bidkhori, G. (2016) FogLight: An Efficient Matrix-Based Approach to Construct Metabolic Pathways by Search Space Reduction. Bioinformatics 32 (3), 398−408.
(29) McClymont, K., and Soyer, O. S. (2013) Metabolic Tinker: An Online Tool for Guiding the Design of Synthetic Metabolic Pathways. Nucleic Acids Res. 41 (11), e113.
(30) Handorf, T., and Ebenhöh, O. (2007) MetaPath Online: A Web Server Implementation of the Network Expansion Algorithm. Nucleic Acids Res. 35 (SUPPL. 2), 613−618.
(31) Blum, T., and Kohlbacher, O. (2008) MetaRoute: Fast Search for Relevant Metabolic Routes for Interactive Network Navigation and Visualization. Bioinformatics 24 (18), 2108−2109.
(32) Ravikrishnan, A., Nasre, M., and Raman, K. (2018) Enumerating All Possible Biosynthetic Pathways from Metabolic Networks. Sci. Rep. 8, 9932.
(33) Levy, R., Carr, R., Kreimer, A., Freilich, S., and Borenstein, E. (2015) NetCooperate: A Network-Based Tool for Inferring Host-Microbe and Microbe-Microbe Cooperation. BMC Bioinf. 16 (1), 164.
(34) Mithani, A., Preston, G. M., and Hein, J. (2009) Rahnuma: Hypergraph-Based Tool for Metabolic Pathway Prediction and Network Comparison. Bioinformatics 25 (14), 1831−1832.
(35) Faust, K., Croes, D., and van Helden, J. (2011) Prediction of Metabolic Pathways from Genome-Scale Metabolic Networks. BioSystems 105 (2), 109−121.
(36) Berge, C., and Minieka, E. (1973) Graphs and Hypergraphs, Vol. 7, North-Holland Publishing Company, Amsterdam.
(37) Yeung, M., Thiele, I., and Palsson, B. O. (2007) Estimation of the Number of Extreme Pathways for Metabolic Networks. BMC Bioinf. 8 (1), 363.
(38) Mithani, A., Preston, G. M., and Hein, J. (2009) A Stochastic Model for the Evolution of Metabolic Networks with Neighbor Dependence. Bioinformatics 25 (12), 1528−1535.
(39) The UniProt Consortium (2015) UniProt: A Hub for Protein Information. Nucleic Acids Res. 43 (D1), D204−D212.
(40) Placzek, S., Schomburg, I., Chang, A., Jeske, L., Ulbrich, M., Tillack, J., and Schomburg, D. (2017) BRENDA in 2017: New Perspectives and New Tools in BRENDA. Nucleic Acids Res. 45 (D1), D380−D388.
(41) Agarwala, R., Barrett, T., Beck, J., Benson, D. A., Bollin, C., Bolton, E., Bourexis, D., Brister, J. R., Bryant, S. H., Canese, K., et al. (2016) Database Resources of the National Center for Biotechnology Information. Nucleic Acids Res. 44 (D1), D7−D19.
(42) Hastings, J., De Matos, P., Dekker, A., Ennis, M., Harsha, B., Kale, N., Muthukrishnan, V., Owen, G., Turner, S., Williams, M., et al. (2012) The ChEBI Reference Database and Ontology for Biologically Relevant Chemistry: Enhancements for 2013. Nucleic Acids Res. 41 (D1), D456−D463.


(43) Gaulton, A., Hersey, A., Nowotka, M. L., Bento, A. P., Chambers, J., Mendez, D., Mutowo, P., Atkinson, F., Bellis, L. J., Cibrian-Uhalte, E., et al. (2017) The ChEMBL Database in 2017. Nucleic Acids Res. 45 (D1), D945−D954.
(44) Kim, S., Thiessen, P. A., Bolton, E. E., Chen, J., Fu, G., Gindulyte, A., Han, L., He, J., He, S., Shoemaker, B. A., et al. (2016) PubChem Substance and Compound Databases. Nucleic Acids Res. 44 (D1), D1202−D1213.
(45) Huang, Y., Zhong, C., Lin, H. X., and Wang, J. (2017) A Method for Finding Metabolic Pathways Using Atomic Group Tracking. PLoS One 12 (1), e0168725.
(46) Karp, P. D., Paley, S. M., Krummenacker, M., Latendresse, M., Dale, J. M., Lee, T. J., Kaipa, P., Gilham, F., Spaulding, A., Popescu, L., et al. (2010) Pathway Tools Version 13.0: Integrated Software for Pathway/Genome Informatics and Systems Biology. Briefings Bioinf. 11 (1), 40−79.
(47) Koprivova, A., and Kopriva, S. (2014) Molecular Mechanisms of Regulation of Sulfate Assimilation: First Steps on a Long Road. Front. Plant Sci. 5, 589.
(48) Ferguson, B. J., Indrasumunar, A., Hayashi, S., Lin, M. H., Lin, Y. H., Reid, D. E., and Gresshoff, P. M. (2010) Molecular Analysis of Legume Nodule Development and Autoregulation. J. Integr. Plant Biol. 52 (1), 61−76.
(49) Fuhrer, T., Fischer, E., and Sauer, U. (2005) Experimental Identification and Quantification of Glucose Metabolism in Seven Bacterial Species. J. Bacteriol. 187 (5), 1581−1590.
(50) Zhang, A., Sun, H., Yan, G., Wang, P., and Wang, X. (2016) Mass Spectrometry-Based Metabolomics: Applications to Biomarker and Metabolic Pathway Research. Biomed. Chromatogr. 30 (1), 7−12.
(51) Buescher, J. M., Antoniewicz, M. R., Boros, L. G., Burgess, S. C., Brunengraber, H., Clish, C. B., DeBerardinis, R. J., Feron, O., Frezza, C., Ghesquiere, B., et al. (2015) A Roadmap for Interpreting 13C Metabolite Labeling Patterns from Cells. Curr. Opin. Biotechnol. 34, 189−201.
(52) Sauer, U. (2006) Metabolic Networks in Motion: 13C-Based Flux Analysis. Mol. Syst. Biol. 2 (1), 62.
(53) Forst, C. V., Flamm, C., Hofacker, I. L., and Stadler, P. F. (2006) Algebraic Comparison of Metabolic Networks, Phylogenetic Inference, and Metabolic Innovation. BMC Bioinf. 7, 67.
(54) Felsenstein, J. (2004) Inferring Phylogenies, Sinauer Associates, Sunderland, MA.
(55) Shannon, P., Markiel, A., Ozier, O., Baliga, N. S., Wang, J. T., Ramage, D., Amin, N., Schwikowski, B., and Ideker, T. (2003) Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks. Genome Res. 13 (11), 2498−2504.
(56) Sarle, W. S., Jain, A. K., and Dubes, R. C. (1990) Algorithms for Clustering Data. Technometrics 32, 227.
(57) Heinken, A., Sahoo, S., Fleming, R. M. T., and Thiele, I. (2013) Systems-Level Characterization of a Host-Microbe Metabolic Symbiosis in the Mammalian Gut. Gut Microbes 4 (1), 28−40.
(58) Martz, E. O., Lakes, R. S., and Park, J. B. (1996) Hysteresis Behaviour and Specific Damping Capacity of Negative Poisson’s Ratio Foams. Cell. Polym. 15 (5), 349−364.
(59) Yeh, I., Hanekamp, T., Tsoka, S., Karp, P. D., and Altman, R. B. (2004) Computational Analysis of Plasmodium Falciparum Metabolism: Organizing Genomic Information to Facilitate Drug Discovery. Genome Res. 14 (5), 917−924.
(60) Mithani, A., Preston, G. M., and Hein, J. (2009) A Stochastic Model for the Evolution of Metabolic Networks with Neighbor Dependence. Bioinformatics 25 (12), 1528−1535.
(61) Prather, K. L. J., and Martin, C. H. (2008) De Novo Biosynthetic Pathways: Rational Design of Microbial Chemical Factories. Curr. Opin. Biotechnol. 19 (5), 468−474.
(62) Nikiforova, V. J., Gakière, B., Kempa, S., Adamik, M., Willmitzer, L., Hesse, H., and Hoefgen, R. (2004) Towards Dissecting Nutrient Metabolism in Plants: A Systems Biology Case Study on Sulphur Metabolism. J. Exp. Bot. 55 (404), 1861−1870.
(63) Fridovich-Keil, J. L. (2006) Galactosemia: The Good the Bad and the Unknown. J. Cell. Physiol. 209 (3), 701−705.
(64) Funk, C. D. (2001) Prostaglandins and Leukotrienes: Advances in Eicosanoid Biology. Science 294 (5548), 1871−1875.
(65) Ren, Z., Lee, J., Moosa, M. M., Nian, Y., Hu, L., Xu, Z., McCoy, J. G., Ferreon, A. C. M., Im, W., and Zhou, M. (2018) Structure of an EIIC Sugar Transporter Trapped in an Inward-Facing Conformation. Proc. Natl. Acad. Sci. U. S. A. 115, 5962.
(66) Bowden, S. D., Rowley, G., Hinton, J. C. D., and Thompson, A. (2009) Glucose and Glycolysis Are Required for the Successful Infection of Macrophages and Mice by Salmonella Enterica Serovar Typhimurium. Infect. Immun. 77 (7), 3117−3126.
(67) Ahyong, V., Berdan, C. A., Burke, T. P., Nomura, D. K., and Welch, M. D. (2019) A Metabolic Dependency for Host Isoprenoids in the Obligate Intracellular Pathogen Rickettsia Parkeri Underlies a Sensitivity to the Statin Class of Host-Targeted Therapeutics. mSphere 4 (6), No. e00536-19.
(68) Heuston, S., Begley, M., Gahan, C. G. M., and Hill, C. (2012) Isoprenoid Biosynthesis in Bacterial Pathogens. Microbiology (London, U. K.) 158, 1389.
(69) Berthelot, K., Estevez, Y., Deffieux, A., and Peruch, F. (2012) Isopentenyl Diphosphate Isomerase: A Checkpoint to Isoprenoid Biosynthesis. Biochimie 94, 1621.
(70) Flamholz, A., Noor, E., Bar-Even, A., and Milo, R. (2012) EQuilibrator - The Biochemical Thermodynamics Calculator. Nucleic Acids Res. 40 (D1), 770−775.
(71) Hattori, M., Tanaka, N., Kanehisa, M., and Goto, S. (2010) SIMCOMP/SUBCOMP: Chemical Structure Search Servers for Network Analyses. Nucleic Acids Res. 38, W652−W656.
(72) Felsenstein, J. (2004) Inferring Phylogenies, Sinauer Associates, Sunderland, MA.
(73) Kanehisa, M., Goto, S., Hattori, M., Aoki-Kinoshita, K. F., Itoh, M., Kawashima, S., Katayama, T., Araki, M., and Hirakawa, M. (2006) From Genomics to Chemical Genomics: New Developments in KEGG. Nucleic Acids Res. 34, D354.
(74) Croes, D., Couche, F., Wodak, S. J., and Van Helden, J. (2005) Metabolic PathFinding: Inferring Relevant Pathways in Biochemical Networks. Nucleic Acids Res. 33, W326.
