Introduction
Petroleum hydrocarbons are important energy resources used by industry and in our daily life,
whose production contributes highly to environmental pollution. To control such risk,
bioremediation constitutes an environmentally friendly alternative technology that has been
established and applied. It constitutes the primary mechanism for the elimination of hydrocarbons
from contaminated sites by natural existing populations of microorganisms. Petroleum or crude oil is
a complex mixture of hydrocarbons. Annually, millions of tons of crude petroleum oil enter the
marine environment from either natural or artificial sources. Hydrocarbon-degrading bacteria are
able to assimilate and metabolize hydrocarbons present in petroleum. The effects of environmental
conditions on the microbial degradation of hydrocarbons and the effects of hydrocarbon
contamination on microbial communities are areas of great interest (Rahman et al. 2004).
Bioremediation is a strategy to utilize biological activities to the greatest extent possible for the rapid
elimination of environmental pollutants. Stimulating the growth of indigenous microorganisms
(biostimulation) and inoculating with foreign oil-degrading bacteria (bioaugmentation) are promising means of
accelerating the detoxification of a polluted site with minimal impact on the ecological
systems (Cappello et al. 2006). The growth of microorganisms on hydrocarbons presents particular
problems because hydrocarbons are immiscible in water. Many bacteria are able to emulsify
hydrocarbons in solution by producing surface active agents such as biosurfactants that increase the
adhesion of cells to the substrate. Biosurfactants reduce the surface tension by accumulating at the
interface of immiscible fluids, increasing the surface area of insoluble compounds, which leads to
increased bioavailability and subsequent biodegradation of the hydrocarbons (Batista et al. 2006).
Alkanes are major components of crude oil. Alkane hydroxylase is a key enzyme involved in alkane
degradation. This enzyme, which introduces an oxygen atom derived from molecular oxygen into the
alkane substrate, plays an important role in crude oil bioremediation (Van Beilen et al. 2003). Alkane
hydroxylase genes are classified into three groups based on phylogenetic analysis. The group (I)
alkane hydroxylases, encoded by alk-B genes, catalyze the degradation of short-chain n-alkanes (C6–
C12). The group (II) alkane hydroxylases, encoded by alk-M genes, catalyze the degradation of
medium-chain n-alkanes (C8–C16), and the group (III) alkane hydroxylases, encoded by alk-B genes,
catalyze the degradation of long-chain n-alkanes (>C16) (Kohno et al. 2002). Examples of strains
capable of growing with n-alkanes as a sole carbon source include Alcanivorax borkumensis SK2,
which shows growth on C6 to C16 n-alkanes (Hara et al. 2004); Rhodococcus sp. Q15, which grows on
C8 to C32 (Whyte et al. 2002); and Acinetobacter sp. M-1, which can degrade C13 to C44
n-alkanes (Sakai et al. 1994). One well-studied system for aerobic n-alkane degradation is the alk system
of Pseudomonas putida GPo1 (Baptist et al. 1963; Eggink et al. 1987b), which is encoded by genes
found on the octane (OCT) plasmid (Chakrabarty et al. 1973; van Beilen et al. 2001). The first step of
n-alkane degradation by this system is catalyzed by AlkB, an integral membrane protein that carries
out a terminal hydroxylation of the n-alkane (Kok et al. 1989). The electrons needed to carry out this
step are delivered to AlkB via a rubredoxin reductase (AlkT) and two rubredoxins (AlkF and AlkG) (van
Beilen et al. 2002b). The resulting alcohol is further converted to a fatty acid via a pathway involving
an alcohol dehydrogenase (AlkJ), an aldehyde dehydrogenase (AlkH), and an acyl-CoA synthetase
(AlkK), after which it enters the β-oxidation pathway (van Beilen et al. 2001). Enzyme systems
homologous to the alk system of P. putida GPo1 have been found in several bacterial species (van
Beilen et al. 2002b), and it has been shown that alkB homologues, generally named alkB or alkM,
sometimes occur as two or more paralogues within the same strain (Whyte et al. 2002; van Beilen et
al. 2004). For example, A. borkumensis carries two alkB homologues, alkB1 and alkB2, which have
been shown to play a role in the degradation of C6 to C12 n-alkanes (Hara et al. 2004). Recently, van
Beilen et al. (2005) showed that the substrate range of AlkB from P. putida GPo1 and AlkB1 from A.
borkumensis AP1 is determined by a specific amino acid in the respective proteins, W55 in the P.
putida AlkB and W58 in the A. borkumensis AlkB1. They showed that if these tryptophans were
changed to a less bulky residue, e.g., serine or cysteine, the enzymes could catalyze the
hydroxylation of n-alkanes with chain lengths of C14 and C16, whereas the wild-type enzymes could
only degrade n-alkanes shorter than C13 (van Beilen et al. 2005).
Chemotaxis facilitates the movement of microorganisms toward or away from chemical gradients in
the environment, and this process plays a role in biodegradation by bringing cells into contact with
degradation substrates (Parales and Harwood, 2002; Parales et al., 2008). Alkanes are sources of
carbon and energy for many bacterial species and have been shown to function as chemoattractants
for certain microorganisms. A bacterial Flavimonas oryzihabitans isolate that was obtained from soil
contaminated with gas oil was shown to be chemotactic to gas oil and hexadecane (Lanfranconi et
al., 2003). Similarly, Pseudomonas aeruginosa PAO1 is chemotactic to hexadecane (Smits et al.,
2003). The tlpS gene, which is located downstream of the alkane hydroxylase gene alkB1 in the PAO1
genome, is predicted to encode a membrane-bound methyl-accepting chemotaxis protein (MCP) that
may play a role in alkane chemotaxis (Smits et al., 2003), although no experimental evidence exists.
Similarly, the gene alkN is predicted to encode an MCP that could be involved in alkane chemotaxis
in P. putida GPo1 (van Beilen et al., 2001). Our recent investigation of the genome sequence of
Alcanivorax dieselolei B-5 (Lai et al., 2012) identified the alkane chemotaxis machinery of
Alcanivorax, which consists of eight cytoplasmic chemotaxis proteins that transmit signals from the
MCP proteins to the flagellar motors (Figure 1). This chemotaxis machinery is similar to that of
Escherichia coli (Parales and Ditty, 2010). However, further investigation is necessary to confirm the
mechanism of alkane chemotaxis in A. dieselolei B-5.
Although the genes and proteins that enable the passage of aromatic hydrocarbons across the
bacterial outer membrane have been identified (van den Berg, 2005; Mooney et al., 2006; Hearn et
al., 2008, 2009), the active transport mechanisms involved in alkane uptake remain unclear. Previous
reviews (Rojo, 2009) discussed the observation that direct uptake of alkane molecules from the
water phase is only possible for low molecular weight alkanes, which are sufficiently soluble to
facilitate efficient transport into cells. For medium- and long-chain n-alkanes, microorganisms may
gain access to these compounds by adhering to hydrocarbon droplets (which is facilitated by the
hydrophobic cell surface) or by surfactant-facilitated access, as reviewed by Rojo (2009). Surfactants
have been reported to increase the uptake and assimilation of alkanes, such as hexadecane, in liquid
culture (Beal and Betts, 2000; Noordman and Janssen, 2002), but their exact role in alkane uptake is
not fully understood. Bacteria that are capable of oil degradation usually produce and secrete
surfactants of diverse chemical nature that allow alkane emulsification (Yakimov et al., 1998; Peng et
al., 2007, 2008; Qiao and Shao, 2010; Shao, 2010). Based on our understanding of biosurfactant
structure and the mechanism of outer membrane transport, we speculate that biosurfactants may
be excluded from entering the cell and remain in the extracellular milieu. In P. putida, alkL in the alk
operon is postulated to play an important role in alkane transport into the cell (van Beilen et al.,
2004; Hearn et al., 2009). Transcriptome analysis of A. borkumensis SK2 revealed that the alkane-
induced gene blc, encoding the outer membrane lipoprotein Blc, might be involved in alkane uptake
because it contains a so-called lipocalin domain (Sabirova et al., 2011). When this domain contacts
organic solvents, a small hydrophobic pocket forms and catalyzes the transport of small hydrophobic
molecules. More recently, our genome analysis (Lai et al., 2012) and closer examination of A.
dieselolei B-5 indicated that three outer membrane proteins that belong to the long-chain fatty acid
transporter protein (FadL) family are involved in alkane transport (unpublished). The FadL homologs
are present in many bacteria that are involved in the biodegradation of xenobiotics (van den
Berg, 2005), which are usually hydrophobic and probably enter cells by a mechanism similar to that
employed for long-chain (LC) fatty acids by FadL in E. coli.
DEGRADATION PATHWAYS OF n-ALKANES
The initial terminal hydroxylation of n-alkanes can be carried out by enzymes that belong to different
families. Microorganisms degrading short-chain length alkanes (C2–C4, where the subindex indicates
the number of carbon atoms of the alkane molecule) have enzymes related to methane
monooxygenases (van Beilen and Funhoff, 2007). Strains degrading medium-chain length alkanes
(C5–C17) frequently contain soluble cytochrome P450s and integral membrane non-heme iron
monooxygenases, such as AlkB (Rojo, 2009; Austin and Groves, 2011). Interestingly, the recently
characterized hydroxylases of long-chain (LC) alkanes (>C18) are unrelated to the alkane
hydroxylases described above. One such hydroxylase, AlmA, is an LC-alkane monooxygenase
from Acinetobacter. A second hydroxylase is LadA, which is a thermophilic soluble LC-alkane
monooxygenase from Geobacillus (Feng et al., 2007; Throne-Holst et al., 2007; Wentzel et al., 2007).
The almA gene, which encodes a putative monooxygenase belonging to the flavin-binding family,
was identified from Acinetobacter sp. DSM 17874 (Throne-Holst et al., 2007; Wentzel et al., 2007).
This gene encodes the first experimentally confirmed enzyme that is involved in the metabolism of
LC n-alkanes of C32 and longer. We provided the first evidence that the AlmA of the genus
Alcanivorax functions as an LC-alkane hydroxylase, and found that the almA gene in both A.
hongdengensis A-11-3 and A. dieselolei B-5 is expressed at high levels to facilitate the efficient
degradation of LC n-alkanes (Liu et al., 2011; Wang and Shao, 2012a). The almA gene sequences were
present in several bacterial genera capable of LC n-alkane degradation, including Alcanivorax,
Marinobacter, Acinetobacter, and Parvibaculum (Wang and Shao, 2012b). In addition, similar genes
are found in other genera in GenBank, such as Oceanobacter sp. RED65, Ralstonia spp.,
Mycobacterium spp., Photorhabdus sp., Psychrobacter spp., and Nocardia farcinica IFM10152.
However, few of these genes have been functionally characterized. A unique LC-alkane hydroxylase
from the thermophilic bacterium Geobacillus thermodenitrificans NG80-2 has been characterized.
This enzyme is called LadA and oxidizes C15–C36 alkanes, generating the corresponding primary
alcohols (Feng et al., 2007). The LadA crystal structure has been determined, revealing that LadA
belongs to the bacterial luciferase family of two-component, flavin-dependent oxygenases (Li
et al., 2008). LadA is believed to oxidize alkanes by a mechanism similar to that of other flavoprotein
monooxygenases, and its ability to recognize and hydroxylate LC-alkanes most likely results from the
way in which it captures the alkane (Li et al., 2008). Therefore, the hydroxylases involved in LC-alkane
degradation appear to have evolved specifically, which is in contrast with other alkane
monooxygenases such as AlkB and P450. Interestingly, branched-chain alkanes are thought to be
more difficult to degrade than linear alkanes (Pirnik et al., 1974). However, Alcanivorax bacteria
efficiently degrade branched alkanes (Hara et al., 2003). In A. borkumensis SK2, isoprenoid
hydrocarbon (phytane) strongly induces P450 (a) and alkB2 (Schneiker et al., 2006). In a previous
report, we found that both pristane and phytane activate the expression of alkB1 and almA in A.
dieselolei B-5 (Liu et al., 2011). In A. hongdengensis A-11-3, we recently found that pristane
selectively activates the expression of alkB1, P450-3 and almA (Wang and Shao, 2012a). However,
the metabolic pathways that mediate this activity are poorly understood, although they may involve
the ω- or β-oxidation of the hydrocarbon molecule (Watkinson and Morgan, 1990).
REGULATION OF ALKANE-DEGRADATION PATHWAYS
The expression of the bacterial genes involved in alkane
assimilation is tightly regulated. Alkane-responsive regulators ensure that alkane degradation genes
are induced only in the presence of the appropriate hydrocarbons. Many microorganisms (Rojo,
2009; Austin and Groves, 2011) contain several sets of alkane degradation systems, each one being
active on a particular kind of alkane or being expressed under specific physiological conditions. In
these cases, the regulatory mechanisms should assure an appropriate differential expression of each
set of enzymes. The regulators that have been characterized belong to different families, including
LuxR/MalT, AraC/XylS, and other non-related families (Table 1).
BASIC PROTOCOL: MODELING OF Cronobacter sp. Strain DJ34, Isolated from Crude Oil-Containing Sludge from the Duliajan Oil Fields, Assam, India
A computer running RedHat Linux (PC, Opteron or EM64T/Xeon64 systems) or other version of
Linux/Unix (x86/x86_64 Linux), Apple Mac OSX (10.6 or later), or Microsoft Windows (XP or later)
Software:
Files:
Sample files required to complete this protocol can be downloaded from
http://salilab.org/modeller/tutorial/basic-example.tar.gz (Unix/Linux) or
http://salilab.org/modeller/tutorial/basic-example.zip (Windows).
Background to Cronobacter sp. Strain DJ34 — Very few strains of Cronobacter spp. have been
reported from hydrocarbon or industrial waste-contaminated habitats (3, 4). Recently, a
Gram-negative, facultatively anaerobic, hydrocarbon-degrading strain, Cronobacter sp. DJ34, was isolated
from crude oil-containing sludge in the Duliajan oil fields, Assam, India. The 16S rRNA gene sequence
of the isolate has been deposited in NCBI GenBank under accession no. KM054665 and
shows 99% sequence similarity with Cronobacter pulveris strain E444 (accession no. EF059835).
Strain DJ34 showed multiple heavy metal resistances, growth under a wide range of pH,
temperature, and salinity conditions, biosurfactant production, and an ability to utilize various
electron acceptors during anaerobic growth.
Conversion of sequence to PIR file format: It is first necessary to convert the target sequence into a
format that is readable by MODELLER. MODELLER uses the PIR format to read and write sequences
and alignments. The first line of the PIR-formatted sequence consists of >P1; followed by the
identifier of the sequence. Here, the sequence is identified by the code dj34. The second line, consisting of
ten fields separated by colons, usually shows details about the structure. In the case of sequences
with no structural information, only two of these fields are used: the first field should be sequence
(indicating that the file contains a sequence without a known structure) and the second should
contain the model file name (dj34 in this case). The rest of the file contains the sequence of dj34,
with an asterisk (*) marking its end. The standard uppercase single-letter amino acid codes are used
to represent the sequence.
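The PIR layout described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the MODELLER distribution: the helper name to_pir is hypothetical, the residues passed to it are placeholders rather than the real DJ34 sequence, and leaving the eight unused header fields empty is an assumption based on the statement that only the first two fields are used for a sequence without a structure.

```python
import textwrap


def to_pir(code: str, sequence: str, width: int = 75) -> str:
    """Return a PIR entry for a sequence that has no known structure."""
    # Remove whitespace and use the standard uppercase one-letter codes.
    cleaned = "".join(sequence.split()).upper()
    # Second line: ten colon-separated fields, of which only the first two
    # ("sequence" and the model code) are filled in for a bare sequence.
    header = ":".join(["sequence", code] + [""] * 8)
    # The sequence ends with an asterisk and is wrapped to a fixed width.
    body = textwrap.fill(cleaned + "*", width=width)
    return f">P1;{code}\n{header}\n{body}\n"


# Placeholder residues for illustration only; this is NOT the DJ34 sequence.
example = to_pir("dj34", "mktaysllgv aqhe")
print(example)
```

Writing the returned text to dj34.ali would give a file in the shape the protocol expects, provided the real sequence is substituted for the placeholder.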
A search for potentially related sequences of known structure can be performed using the
profile.build() command of MODELLER (file build_profile.py). The command uses the local dynamic
programming algorithm to identify related sequences (Smith and Waterman, 1981). In the simplest
case, the command takes as input the target sequence and a database of sequences of known
structure (file pdb_95.pir) and returns a set of statistically significant alignments. The script,
build_profile.py, does the following:
1. Initializes the “environment” for this modeling run by creating a new environ object (called env
here).
2. Creates a new sequence_db object, calling it sdb, which is used to contain large databases of
protein sequences.
3. Reads a file, in text format, containing nonredundant PDB sequences, into the sdb database. The
sequences can be found in the file pdb_95.pir. This file is also in the PIR format. Each sequence in
this file is representative of a group of PDB sequences that share 95% or more sequence identity to
each other and have less than 30 residues or 30% sequence length difference.
4. Writes a binary machine-independent file containing all sequences read in the previous step.
5. Reads the binary file back in for faster execution.
6. Creates a new “alignment” object (aln), reads the target sequence dj34 from the file dj34.ali, and
converts it to a profile object (prf). Profiles contain similar information to alignments, but are more
compact and better for sequence database searching.
7. prf.build() searches the sequence database (sdb) with the target profile (prf). Matches from the
sequence database are added to the profile.
8. prf.write() writes a new profile containing the target sequence and its homologs into the specified
output file. The equivalent information is also written out in standard alignment format.
The script is run by typing python build_profile.py > build_profile.log (or, if Python is not installed on your machine,
mod9.13 build_profile.py). At the end of the execution, a log file is created (build_profile.log).
MODELLER always produces a log file. Errors and warnings in log files can be found by searching for
the _E> and _W> strings, respectively.
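Searching a log for the two markers mentioned above is easy to automate. The sketch below is one possible way to do it in plain Python; the function name scan_log is hypothetical and the sample log lines are invented for illustration, so they should not be read as real MODELLER output.

```python
def scan_log(lines):
    """Split log lines into errors (containing _E>) and warnings (containing _W>)."""
    errors = [ln.rstrip("\n") for ln in lines if "_E>" in ln]
    warnings = [ln.rstrip("\n") for ln in lines if "_W>" in ln]
    return errors, warnings


# Invented lines that only imitate the marker convention described in the text.
sample_log = [
    "an ordinary informational line\n",
    "example_W> this line stands in for a warning\n",
    "example_E> this line stands in for an error\n",
]
errors, warnings = scan_log(sample_log)
print(len(errors), "error(s),", len(warnings), "warning(s)")
```

For a real run, the same function can be given an open file handle, for example the one returned by open("build_profile.log").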
Selecting a template—
In the file build_profile.prf, the second column reports the code of the PDB sequence that was
aligned to the target sequence. The eleventh column reports the percentage sequence identity
between the target sequence and the PDB sequence, normalized by the length of the alignment (indicated in the
tenth column). In general, a sequence identity value above ~25% indicates a potential template,
unless the alignment is too short. The PDB sequences show very significant similarities to the query
sequence, with E-values equal to 0. As expected, the hits correspond to alkane monooxygenase
(1bdm:A, 5mdh:A, 1b8p:A, 1civ:A, 7mdh:A, and 1smk:A). To select the appropriate template for the
target sequence, the alignment.compare_structures() command is first used to assess the
sequence and structure similarity between the possible templates. In compare.py, the
alignment object aln is created and MODELLER is instructed to read into it the protein sequences
and information about their PDB files. The command malign() calculates their multiple sequence
alignment, which is subsequently used as a starting point for creating a multiple structure alignment
by malign3d(). Based on this structural alignment, the compare_structures() command calculates the
RMS and DRMS deviations between atomic positions and distances, differences between the main-chain
and side-chain dihedral angles, percentage sequence identities, and several other measures.
Finally, the id_table() command writes a file (family.mat) with pairwise sequence distances that can
be used as input to the dendrogram() command (or the clustering programs in the PHYLIP package;
Felsenstein, 1989). dendrogram() calculates a clustering tree from the input matrix of pairwise
distances, which helps visualize differences among the template candidates.
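The screening rule for build_profile.prf (identity above roughly 25%, unless the alignment is too short) can be applied with a short script. The sketch below assumes whitespace-separated rows with the PDB code in column 2, the alignment length in column 10, and the percentage identity in column 11, as described in the text. The 100-residue length cut-off, the function names, and the sample rows with codes tmplA to tmplC are all made up for illustration and are not results for dj34.

```python
def parse_hits(prf_text):
    """Extract (code, aligned_length, identity_percent) from profile rows."""
    hits = []
    for line in prf_text.splitlines():
        cols = line.split()
        # Skip comment lines, blank lines, and rows too short to be a hit.
        if line.startswith("#") or len(cols) < 11:
            continue
        hits.append((cols[1], int(cols[9]), float(cols[10])))
    return hits


def candidate_templates(hits, min_identity=25.0, min_length=100):
    """Keep hits above the identity threshold whose alignment is not too short."""
    return [h for h in hits if h[2] > min_identity and h[1] >= min_length]


# Made-up rows laid out like the columns described in the text.
sample = """\
# illustrative rows only
1 dj34  S 0 400  1 400 0   0   0  0. 0.0
2 tmplA X 1 410  5 390 8 395 380 45. 0.0
3 tmplB X 1 120 10  60 4  55  50 40. 0.5E-03
4 tmplC X 1 390  3 380 2 379 370 18. 0.1E-05
"""
print(candidate_templates(parse_hits(sample)))
```

The target's own row drops out because its identity is reported as 0, the short alignment is rejected by length, and the low-identity hit is rejected by the 25% rule.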
One way to align the target sequence with the structure of 1bdm:A is to use the align2d() command in MODELLER
(Madhusudhan et al., 2006). Although align2d() is based on a dynamic programming algorithm
(Needleman and Wunsch, 1970), it is different from standard sequence-sequence alignment
methods because it takes into account structural information from the template when constructing
an alignment.
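To make the dynamic programming idea concrete, the following sketch computes a plain Needleman-Wunsch global alignment score. It is deliberately the textbook version with a constant gap penalty: it does not reproduce the structure-dependent treatment that distinguishes align2d(), and the scoring values (match 1, mismatch -1, gap -2) are arbitrary assumptions chosen only to keep the example readable.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b."""
    # prev[j] holds the best score for aligning the processed prefix of a
    # with the first j characters of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


print(needleman_wunsch_score("GATTACA", "GATTACA"))
print(needleman_wunsch_score("AC", "AGC"))
```

A structure-aware method would replace the constant gap value with a penalty that depends on where in the template the gap falls; that refinement is left out here.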
Model building—
Once a target-template alignment is constructed, MODELLER calculates a 3-D model of the target
completely automatically, using its automodel class. The script in Figure 5.6.9 will generate five
different models of the target based on the 1bdm:A template structure and the alignment in file
TvLDH-1bdmA.ali (file modelsingle.py).
Evaluating a model—
If several models are calculated for the same target, the best model can be selected by picking the
model with the lowest value of the MODELLER objective function or the DOPE (Shen and Sali, 2006)
or SOAP (Dong et al., 2013) assessment scores, which are reported at the end of the log file.
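Picking the best model by its score amounts to taking a minimum. The sketch below assumes the scores have already been copied from the end of the log file into a dictionary; the file names follow the usual MODELLER output naming as an assumption, and every number is invented for illustration.

```python
def best_model(scores, key="dope"):
    """Return the model name with the lowest value of the chosen score."""
    return min(scores, key=lambda name: scores[name][key])


# Invented values; real ones are reported at the end of the MODELLER log.
scores = {
    "dj34.B99990001.pdb": {"molpdf": 1763.6, "dope": -38079.0},
    "dj34.B99990002.pdb": {"molpdf": 1560.9, "dope": -38515.0},
    "dj34.B99990003.pdb": {"molpdf": 1712.4, "dope": -37984.0},
}
print(best_model(scores))            # ranked by DOPE
print(best_model(scores, "molpdf"))  # ranked by the objective function
```

Lower is better for both the objective function and DOPE, which is why min() is used for either key.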
Certain compounds were selected from the EPA list. Their structures were downloaded from PubChem. Links
are given below.
1.
2.
The SDF structures are then converted to PDB format for the bioremediation docking study.
Docking protocol:
AutoDock Vina (version 1.1.2) [34] is used in this project to conduct molecular docking.
Target protein structures are converted to the required PDBQT format using MGL Tools
(version 1.5.4) [31]. Open Babel (version 2.3.1) [49] is used to add polar hydrogens and
partial charges to ligand atoms as well as to convert these molecules to the PDBQT format.
The default box size is calculated following the protocol outlined by the authors of Vina [34].
Briefly, an initial docking box is calculated from the coordinates of a bound ligand in the
crystal structure, and the box dimensions in x, y and z are increased by 10 Å. Additionally,
one of the two directions in each dimension is randomly chosen and further increased by 5 Å.
Finally, if the box size in any dimension is smaller than 22.5 Å, it is extended to this value. In
this study, an experimental binding site is defined as the geometric center of a ligand bound
to the target protein, whereas the computationally predicted binding pocket center is obtained
from eFindSite [9]. Docking simulations using predicted pockets start with a random ligand
conformer generated by obconformer from Open Babel [49]; moreover, the ligand is
randomly spun around all axes in order to avoid providing the docking program with any
structural information on the native binding pose. All ligands are also translated so that their
geometric centers overlap with predicted pocket centers.
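The box-size rule described above can be written down as a small function. This is a sketch under stated assumptions rather than the procedure used by the Vina authors: the 10 Å increase is assumed to be split evenly between the two sides of each dimension, the minimum-size extension is assumed to be symmetric about the box centre, and the function name and sample coordinates are hypothetical.

```python
import random


def docking_box(coords, rng=None, pad=10.0, extra=5.0, min_size=22.5):
    """Return (center, size) of a docking box built around ligand coordinates."""
    rng = rng or random.Random()
    center, size = [], []
    for axis in range(3):
        values = [c[axis] for c in coords]
        lo, hi = min(values), max(values)
        # Assumption: the 10 angstrom increase is shared by both sides.
        lo -= pad / 2.0
        hi += pad / 2.0
        # One randomly chosen direction is extended by a further 5 angstrom.
        if rng.random() < 0.5:
            lo -= extra
        else:
            hi += extra
        # Enforce the 22.5 angstrom minimum, growing symmetrically (assumption).
        if hi - lo < min_size:
            mid = (lo + hi) / 2.0
            lo, hi = mid - min_size / 2.0, mid + min_size / 2.0
        center.append((lo + hi) / 2.0)
        size.append(hi - lo)
    return tuple(center), tuple(size)


# Two invented atom positions standing in for a bound ligand.
ligand = [(0.0, 0.0, 0.0), (12.0, 1.0, 0.5)]
center, size = docking_box(ligand, rng=random.Random(0))
print(center, size)
```

Passing a seeded random.Random makes the randomly chosen directions reproducible, which is convenient when the same box has to be regenerated later.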
Virtual screening results are assessed by several commonly used evaluation metrics.
Enrichment factors EF1 % and EF10 % count the fraction of actives in the top 1 and 10 % of the
ranked library, respectively. In order to address the “early recognition problem”, we use the
Boltzmann-Enhanced Discrimination of Receiver Operating Characteristics (BEDROC20)
score that calculates 80 % of the enrichment from the top 8 % of the ranked library [61]. In
addition, we evaluate the area under the enrichment curve (AUC) that determines the
discriminative capability by measuring the distribution of actives over the entire library.
Finally, we calculate ACT-50 %, which corresponds to the top fraction of the ranked library
that contains half of the active compounds.
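Two of these metrics are simple enough to compute by hand, which helps when checking the numbers reported by a screening run. The sketch below uses the common definition of the enrichment factor (hit rate in the top fraction divided by the hit rate in the whole library); that formula is an assumption, since the text does not spell it out. BEDROC and AUC are not shown, and the ranked list of labels is invented.

```python
def enrichment_factor(ranked_labels, fraction):
    """EF at a top fraction; labels are 1 (active) or 0 (decoy), best score first."""
    n = len(ranked_labels)
    total_actives = sum(ranked_labels)
    if n == 0 or total_actives == 0:
        raise ValueError("need a non-empty library with at least one active")
    n_top = max(1, int(round(fraction * n)))
    hits = sum(ranked_labels[:n_top])
    return (hits / n_top) / (total_actives / n)


def act50(ranked_labels):
    """Smallest top fraction of the ranked library that holds half of the actives."""
    total_actives = sum(ranked_labels)
    if total_actives == 0:
        raise ValueError("need at least one active")
    needed = (total_actives + 1) // 2  # half of the actives, rounded up
    found = 0
    for rank, label in enumerate(ranked_labels, start=1):
        found += label
        if found >= needed:
            return rank / len(ranked_labels)


# Invented ranking of 20 compounds containing 4 actives.
ranked = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(enrichment_factor(ranked, 0.10), act50(ranked))
```

With this toy ranking, both compounds in the top 10% are active, so the enrichment is five times the library-wide hit rate, and half of the actives are already found within the top 10% of the list.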
Docking protocol:
The PDB files of both the ligand (dodecane) and the enzyme (dj34) were converted to PDBQT format using
MGL Tools and Open Babel. PDBQT files store the atomic coordinates,
partial charges, and AutoDock atom types for both the receptor and the ligand.
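To show where those three pieces of information sit in a PDBQT record, the sketch below reads one ATOM line by fixed columns. It assumes the usual PDBQT layout (coordinates in columns 31 to 54, partial charge in columns 71 to 76, atom type in columns 78 to 79). The function name is hypothetical and the sample atom is invented; it is not taken from the dodecane or dj34 files.

```python
def parse_pdbqt_atom(line):
    """Read name, coordinates, partial charge, and atom type from one record."""
    return {
        "name": line[12:16].strip(),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        "charge": float(line[70:76]),
        "atom_type": line[77:79].strip(),
    }


# Build a fixed-width sample record so the column positions are explicit.
# The atom and all of its values are invented for illustration.
sample = (
    "ATOM  {:>5d} {:<4s} {:>3s} {:1s}{:>4d}    "
    "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}    {:>+6.3f} {:<2s}"
).format(1, " C1", "UNL", " ", 1, 1.0, 2.0, 3.0, 0.0, 0.0, 0.012, "C")
atom = parse_pdbqt_atom(sample)
print(atom)
```

Fixed columns are used instead of splitting on whitespace because neighbouring numeric fields in PDB-style files can touch when values are large or negative.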
Modifying receptor
Modifying ligand
A set of six compounds (Table 1) was selected: three aliphatic chains and three from the U.S. Environmental
Protection Agency (EPA) Chemical Releases and Transfers List, which is available for various
industries.