
Author’s Accepted Manuscript

Analysis and comparison of lignin peroxidases between fungi and bacteria using three
different modes of Chou’s general pseudo amino acid composition

Mandana Behbahani, Hassan Mohabatkar, Mokhtar Nosrati
www.elsevier.com/locate/yjtbi

PII: S0022-5193(16)30284-3
DOI: http://dx.doi.org/10.1016/j.jtbi.2016.09.001
Reference: YJTBI8809
To appear in: Journal of Theoretical Biology
Received date: 25 May 2016
Revised date: 27 July 2016
Accepted date: 1 September 2016
Cite this article as: Mandana Behbahani, Hassan Mohabatkar and Mokhtar
Nosrati, Analysis and comparison of lignin peroxidases between fungi and
bacteria using three different modes of Chou’s general pseudo amino acid
composition, Journal of Theoretical Biology,
http://dx.doi.org/10.1016/j.jtbi.2016.09.001
This is a PDF file of an unedited manuscript that has been accepted for
publication. As a service to our customers we are providing this early version of
the manuscript. The manuscript will undergo copyediting, typesetting, and
review of the resulting galley proof before it is published in its final citable form.
Please note that during the production process errors may be discovered which
could affect the content, and all legal disclaimers that apply to the journal pertain.
Analysis and comparison of lignin peroxidases between fungi and bacteria using three
different modes of Chou's general pseudo amino acid composition

Mandana Behbahani1, Hassan Mohabatkar1*, Mokhtar Nosrati1

1 Department of Biotechnology, Faculty of Advanced Sciences and Technologies, University of Isfahan, Isfahan, Iran

*Corresponding author: Email: h.mohabatkar@ast.ui.ac.ir

Tel: +980313794391

Fax: +98 3137932342

Abstract: Lignin peroxidases (LiPs) are important enzymes in the degradation of lignin and are
present in different species of fungi and bacteria. In the present study, sequence- and
structure-based properties of LiPs in fungi and bacteria are compared. These properties include
pseudo amino acid composition (PseAAC), physicochemical properties and secondary
structure. AutoDock 4 was used for docking between LiPs and lignan. The motifs of LiPs
were predicted with the MEME tool. Statistical analysis and the Multinomial Naïve Bayes (MNB)
algorithm were used to classify the two LiP protein groups. The results demonstrated
that molecular weight, isoelectric point, aliphatic index, extinction coefficient and random coil
percentage differed significantly between the LiPs of fungi and bacteria.
The classification of these two groups based on the concept of PseAAC showed over 80%
accuracy. The estimated binding free energy between bacterial LiPs and lignan was significantly
more favorable (more negative) than that between fungal LiPs and lignan. The aliphatic index and
instability index of the most important motifs of bacteria and fungi
were significantly different. In conclusion, the results indicated that computational techniques
can provide useful information for comparing fungal and bacterial LiPs. These results also
suggest a relationship between the efficacy and the physicochemical properties of LiPs.

Keywords: Lignin Peroxidase, Binary classification, Bioremediation.

1. Introduction
Lignin is known as one of the most recalcitrant aromatic polymers. It is mostly derived from
wood and is abundant on earth. Lignin is degraded by both fungi and bacteria. Recently,
some lignin-degrading organisms and their enzymes have been used for environmental bioremediation
and in various industries (Arora and Gill, 2001; Bugg et al, 2011). Lignin peroxidases (LiPs) and
versatile peroxidases have been shown to be important enzymes in the degradation of lignin (Eom
and Kim, 2014). So far, much research has been conducted on lignin-degrading fungi
such as white-rot and brown-rot fungi. The use of fungi to degrade lignin is widely studied due to
their environmental importance and potential biotechnological applications (Janusz et al, 2013).
However, these fungi have some industrial limitations, including particular culture conditions and
substrates (Crawford et al, 1976). A number of bacteria, such as Streptomycetes, can also
break down lignin (Zimmermann, 1990; Ramachandra et al, 1988). The degradation of lignin by
bacteria may be superior to that by fungi with regard to specificity, thermostability and mediator
dependency. Some studies have focused on the evolutionary patterns of fungal peroxidases
(Janusz et al, 2013; Johansson and Nyman, 1993). However, the diversity of LiPs in fungi and
bacteria has not been investigated yet. In the present study, the pseudo amino acid composition
(PseAAC), physicochemical properties and secondary structure of LiPs in fungi and bacteria are
compared.

2. Materials and methods


2.1. Data collection
Amino acid sequences of LiPs of fungi and bacteria were retrieved from NCBI
(http://www.ncbi.nlm.nih.gov). One thousand six hundred sixty LiP amino acid sequences from
bacteria and 122 sequences from fungi were used as our datasets. The data were analyzed with
CD-HIT (Fu et al, 2012) and redundant sequences (those with more than 95% similarity) were
removed from the datasets. After running CD-HIT, the number of LiP sequences in bacteria and
fungi was reduced to 105 and 25 respectively.
2.2. General Pseudo Amino Acid Composition
With the explosive growth of biological sequences in the post-genomic era, one of the most
important but also most difficult problems in computational biology is how to express a
biological sequence with a discrete model or a vector, yet still keep considerable sequence-order
information or key pattern characteristics. This is because all the existing machine-learning
algorithms can only handle vectors but not sequence samples, as elucidated in a recent review
[Chou, 2015]. However, a vector defined in a discrete model may completely lose all the
sequence-pattern information. To avoid completely losing the sequence-pattern information for
proteins, the pseudo amino acid composition or PseAAC [Chou et al, 2005] was proposed. Ever
since the concept of pseudo amino acid composition or Chou's PseAAC [Du et al, 2012; Cao et
al, 2013; Lin and Lapointe 2013] was proposed in 2001, it has penetrated into many biomedicine
and drug development areas [Zhong and Zhou, 2014; Zhou and Zhong, 2016] and nearly all the
areas of computational proteomics [Kabir and Hayat, 2016; Dehzangi et al, 2015; Kumar et al,
2015; Mondal and Pai, 2014; Wang et al, 2015] (see also the long list of references cited in [Du
et al, 2014]). Because it has been widely and increasingly used, three powerful open-access
software packages, called 'PseAAC-Builder' [Du et al, 2012], 'propy' [Cao et al, 2013], and
'PseAAC-General' [Du et al, 2014], were recently established: the former two generate various
modes of Chou's special PseAAC, while the third generates those of Chou's general PseAAC [Chou
et al, 2011; Chou, 2009], including not only all the special modes of feature vectors for proteins
but also the higher level feature vectors such as "Functional Domain" mode (see Eqs.9-10 of
[Chou, 2015]), "Gene Ontology" mode (see Eqs.11-12 of [Chou, 2015]), and "Sequential
Evolution" or "PSSM" mode (see Eqs.13-14 of [Chou, 2015]). Encouraged by the successes of
using PseAAC to deal with protein/peptide sequences, three web-servers [Chen et al, 2014; Chen
et al, 2015; Liu et al, 2015; Chen and Lin, 2015] were developed for generating various feature
vectors for DNA/RNA sequences. In particular, a very powerful web-server called Pse-in-One
[Liu et al, 2015] has recently been established that can be used to generate any desired feature
vectors for protein/peptide and DNA/RNA sequences according to the needs of users' studies. In
the current study, we use three different modes of the general PseAAC to analyze and
compare the lignin peroxidases between fungi and bacteria.

2.2.1. Physicochemical property analysis


ProtParam is a primary-structure analysis tool that computes different physicochemical
properties of a protein. This tool is available at http://web.expasy.org/protparam (Gasteiger, 2005).
In this study, five characteristics (molecular weight, theoretical pI, extinction coefficient,
aliphatic index and grand average of hydropathicity, GRAVY) of the LiP proteins of fungi and
bacteria were evaluated using ProtParam. Molecular weight, theoretical pI and aliphatic index are
the most studied and useful parameters describing the physicochemical properties of a protein,
whereas the extinction coefficient and GRAVY are less studied, especially in comparative
studies. The extinction coefficient indicates how
much light a protein absorbs at a certain wavelength. An estimate of this
coefficient is useful for following a protein with a spectrophotometer during purification. The
GRAVY value for a protein or a peptide is defined as the sum of the hydropathy values of all amino
acids divided by the protein length. An increasingly positive score indicates greater hydrophobicity.
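The GRAVY definition above, together with the aliphatic index, can be sketched in a few lines of Python. This is an illustrative sketch, not the ProtParam implementation: it assumes the Kyte-Doolittle hydropathy scale and the aliphatic index formula given in the ProtParam documentation (mole percent of Ala + 2.9 x Val + 3.9 x (Ile + Leu)), neither of which is spelled out in this manuscript. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not listed in the manuscript).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Sum of hydropathy values of all residues divided by the sequence length."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence):
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    seq = sequence.upper()

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))
```

Both functions expect a sequence made of the 20 standard one-letter codes; ambiguous residues such as X would raise a KeyError in gravy and are simply ignored by aliphatic_index.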
2.2.2. Generating Pseudo-amino acid composition (PseAAC)
The collected sequences of LiP proteins of fungi and bacteria were used to compute their
PseAAC values. PseAAC has been widely used in many earlier statistical methods for predicting
various attributes of different proteins and peptides. PseAAC describes
protein sequences with quantitative representations while taking into account considerable
sequence-order information (Chou, 2001). Therefore, PseAAC can be combined
with other properties to perform a reliable classification.
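To make the idea of PseAAC concrete, the following Python sketch builds a vector of 20 amino acid frequencies plus a small number of sequence-order correlation factors, in the spirit of Chou (2001). It is a simplified illustration under stated assumptions, not the procedure used in this study: it uses a single property (Kyte-Doolittle hydropathy) instead of the three properties of the original formulation (hydrophobicity, hydrophilicity and side-chain mass), and the values lam=3 and weight=0.05 are placeholders, since the manuscript does not report these settings.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Single illustrative property (Kyte-Doolittle hydropathy); an assumption.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardise(scale):
    """Rescale a property to zero mean and unit standard deviation over the 20 residues."""
    values = [scale[aa] for aa in AMINO_ACIDS]
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {aa: (scale[aa] - mean) / sd for aa in AMINO_ACIDS}


def pseaac(sequence, lam=3, weight=0.05):
    """Return a (20 + lam)-dimensional PseAAC-style vector for one sequence."""
    seq = sequence.upper()
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")
    prop = standardise(HYDROPATHY)
    # First 20 components: conventional amino acid composition.
    freqs = [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]
    # Next lam components: k-th tier sequence-order correlation factors.
    thetas = []
    for k in range(1, lam + 1):
        pairs = len(seq) - k
        theta = sum((prop[seq[i + k]] - prop[seq[i]]) ** 2 for i in range(pairs)) / pairs
        thetas.append(theta)
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]
```

The components sum to one by construction, so vectors from sequences of different lengths remain comparable when passed to a classifier.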

2.2.3. Secondary structure analysis


GOR IV is an efficient analysis tool for predicting the secondary structure of a protein from its
amino acid sequence. This tool is available at
http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_gor4.html (Kloczkowski et al, 2002).
The frequency of each secondary structure state (alpha helix, extended strand and random coil)
in the LiPs of fungi and bacteria was computed and further analyzed.
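The frequency calculation described above can be sketched as follows. The sketch assumes that the prediction is available as a string with one state letter per residue (H for alpha helix, E for extended strand, C for random coil); this encoding and the function name are assumptions for illustration and are not taken from the manuscript.

```python
from collections import Counter


def secondary_structure_percentages(prediction):
    """Percentage of helix (H), extended strand (E) and coil (C) states in a prediction string."""
    states = prediction.upper()
    counts = Counter(states)
    total = len(states)
    return {
        "alpha_helix": 100.0 * counts["H"] / total,
        "extended_strand": 100.0 * counts["E"] / total,
        "random_coil": 100.0 * counts["C"] / total,
    }
```

Each sequence then contributes three numbers, which can be compared between the fungal and bacterial groups.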
2.5. Statistical analysis
The in silico comparison of physicochemical and secondary structure properties of LiPs in fungi
and bacteria was evaluated using Receiver Operating Characteristic (ROC) curve analysis.
The Multinomial Naïve Bayes (MNB) classification algorithm was also used for evaluating the
dissimilarity of the datasets based on their PseAAC-generated values. The ROC curve is a tool for
organizing classifiers and visualizing their performance. ROC graph analysis is commonly used
in machine learning and data mining research (Fawcett, 2006). The ROC server computes
accuracy (ACC) and area under the curve (AUC), and can also evaluate the differences
between the positive and negative classes of data.
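As an illustration of how a single property can be scored by ROC analysis, the sketch below computes AUC as the probability that a randomly chosen value from one group exceeds a randomly chosen value from the other (ties counted as one half), and the accuracy at the best single threshold. This is one plausible reading of what such a server reports; the manuscript does not describe the exact procedure, and the function names are hypothetical.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def best_threshold_accuracy(positive_scores, negative_scores):
    """Highest accuracy over all thresholds t, predicting positive when score >= t."""
    total = len(positive_scores) + len(negative_scores)
    best = 0.0
    for t in sorted(set(positive_scores) | set(negative_scores)):
        correct = (sum(1 for s in positive_scores if s >= t)
                   + sum(1 for s in negative_scores if s < t))
        best = max(best, correct / total)
    return best
```

For example, the aliphatic index values of one group could be passed as positive_scores and those of the other group as negative_scores.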
2.6. Classification based on PseAAC
The dissimilarity between bacterial and fungal LiP protein sequences was characterized by the
Multinomial Naïve Bayes (MNB) algorithm in Weka version 3.7 (Hall et al, 2009). The
performance of a binary classifier can be described by different parameters, which are
used here to analyze the classification. These are: ACC, precision or positive predictive value
(PPV), and negative predictive value (NPV). PPV is the proportion of positive predictions
that are true positives, and NPV is the proportion of negative predictions that are true
negatives (Parikh et al, 2008). ACC is the most important parameter in describing a binary
classification. When ACC exceeds 0.80 (80%), the classification performance is considered
acceptable.
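The three performance parameters follow directly from the confusion matrix of a binary classifier. The short sketch below shows the arithmetic; the function name and the example counts are hypothetical and are not the counts obtained in this study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, positive predictive value and negative predictive value from confusion counts."""
    total = tp + fp + tn + fn
    return {
        "ACC": (tp + tn) / total,   # fraction of all predictions that are correct
        "PPV": tp / (tp + fp),      # fraction of positive predictions that are correct
        "NPV": tn / (tn + fn),      # fraction of negative predictions that are correct
    }


# Hypothetical counts for illustration only.
example = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

With these example counts, ACC is 0.85, PPV is 0.80 and NPV is 0.90, so all three would pass the 80% acceptance threshold used here.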

2.7. Docking
The LiPs of fungi and bacteria and lignan were subjected to a molecular docking study as targets
and ligand, respectively. Lignan is a polyphenol and a powerful competitive inhibitor of LiP,
so it can provide appropriate information about the active site of LiP [Frias et al, 1995]. The
three-dimensional structure of lignan was obtained from the PubChem
(http://pubchem.ncbi.nlm.nih.gov) database in SDF format. The three-dimensional structures of
the fungal and bacterial LiPs were determined with the SWISS-MODEL program. Molecular
docking was performed using AutoDock4 (version 4.2) with the Lamarckian genetic
algorithm (Morris et al, 1998). The docking parameters selected for the AutoDock4 runs
were as follows: 100 docking runs, population size of 200, random starting position and
conformation, translation step range of 2 Å, mutation rate of 0.02, crossover rate of 0.8, local
search rate of 0.06 and 2.5 million energy evaluations. As pointed out in a pioneering paper [Chen,
1977] about 40 years ago, low-frequency collective motions exist in proteins and DNA.
Actually, many remarkable biological functions in proteins and DNA and their profound
dynamic mechanisms, such as the switch between active and inactive states [Wang et al, 2009],
cooperative effects [Chou, 1989], allosteric transition [Kiang, 1985; Wang, 2010], intercalation
of drugs into DNA [Mao, 1988] and microtubule growth [Zhang, 1994], can be revealed by
studying their internal motions, as summarized in a comprehensive review [Chou, 1988].
Likewise, to really understand the interaction of a protein receptor with its ligand and to reveal
their binding mechanism, we should consider not only the static structures concerned but also the
dynamical information obtained by simulating their internal motions or dynamic processes, and we
shall make efforts in this regard in our future work.
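The two quantities reported from docking, the estimated free energy of binding and the estimated inhibition constant, are linked by the standard thermodynamic relation Ki = exp(∆G / RT). The sketch below shows this relation under stated assumptions: the energy is taken in kcal/mol, the temperature is 298.15 K, and the function name is hypothetical. The manuscript does not state which units or temperature were used for its Ki values.

```python
import math

GAS_CONSTANT_KCAL = 1.98719e-3   # kcal / (mol K)
TEMPERATURE_K = 298.15           # assumed temperature


def inhibition_constant(delta_g_kcal_per_mol, temperature=TEMPERATURE_K):
    """Estimated inhibition constant (mol/L) from a binding free energy in kcal/mol."""
    return math.exp(delta_g_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature))
```

Under this relation a more negative binding free energy always corresponds to a smaller inhibition constant, that is, to tighter binding.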

2.8. MEME motif discovery


The MEME Suite web server (http://meme-suite.org/tools/meme) provides a unified portal for the
study of sequence motifs representing features such as protein interaction domains. The MEME motif
discovery algorithm is now complemented by the GLAM2 algorithm, which allows discovery of
motifs containing gaps. The motifs of fungal and bacterial LiPs were obtained from the MEME
motif discovery server. The MEME parameters were as follows: minimum width of each
motif, six; maximum width of each motif, fifty; maximum number of motifs to discover, three;
and occurrences of each motif, zero or one per sequence (Bailey et al, 2009).
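For readers who prefer the command-line version of MEME over the web server, the settings listed above map onto standard MEME options. The sketch below only assembles the argument list and does not run anything; the use of the command-line tool is an assumption (the study used the web server), and the file name, output directory and function name are hypothetical.

```python
def build_meme_command(fasta_path, output_dir):
    """Argument list for a MEME run mirroring the settings described in the text."""
    return [
        "meme", fasta_path,
        "-protein",          # protein alphabet
        "-oc", output_dir,   # output directory (overwritten if it exists)
        "-mod", "zoops",     # zero or one occurrence of each motif per sequence
        "-nmotifs", "3",     # maximum number of motifs to discover
        "-minw", "6",        # minimum motif width
        "-maxw", "50",       # maximum motif width
    ]


# Hypothetical input file and output directory.
command = build_meme_command("bacterial_lips.fasta", "meme_bacterial")
```

The list can be handed to subprocess.run on a machine where the MEME Suite is installed.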

3. Results
3.1. Physicochemical property analysis
The results of ProtParam were analyzed using ROC curve analysis (Table 1). The ACC values of
four parameters (molecular weight, isoelectric point, aliphatic index and extinction coefficient)
of LiPs in fungi and bacteria were more than 80%. It was concluded that all the tested parameters
except GRAVY were significantly different between LiPs of fungi and bacteria.
3.2. Secondary structure analysis
ROC curve analysis of the secondary structure results is presented in Table 2. The results
showed that the two sets of proteins differed in secondary structure. The ACC
values indicate that the LiPs of fungi and bacteria differ mostly in the percentages of
random coil and extended strand. However, the alpha helix content is not significantly different
between the two groups of LiPs (Table 2).
3.3. PseAAC analysis
The MNB algorithm showed high performance in classifying the proteins, meaning that the proteins
are significantly discriminated in terms of their PseAAC values. The results of classification
based on PseAAC are provided in Table 3. According to the results, all performance parameters
used here to analyze the classification reached over 80%, which demonstrated that the two
datasets were reasonably dissimilar.
3.4. Analysis of docking results

The results of the docking study of LiPs with lignan are shown in Table 5. The results
demonstrated that both bacterial and fungal LiPs interacted appropriately with lignan, with
RMSD less than 2. Analysis of the docking results showed that bacterial LiPs had higher efficacy
for lignan, with mean ∆Gb and Ki values of -10.47±0.97 kJ/mol and 48.6±1.7 respectively (Table 5),
where Ki is the estimated inhibition constant and ∆Gb is the estimated free energy of binding.

3.5. MEME motif discovery


The results revealed three bacterial LiP motifs (375-451, 276-325 and 215-264) and three fungal
LiP motifs (222-298, 166-242 and 42-115), which were identified by the MEME Suite web server
(Figure 1). The structural motifs of bacteria and fungi were observed in regions
215-264 and 42-115, respectively. Because the MEME server ranks the reported motifs by
probability score, we selected the two most probable motifs for further analysis. The
physicochemical properties of these motifs (molecular weight, theoretical pI, extinction
coefficient, aliphatic index and grand average of hydropathicity) were investigated with ProtParam
(Table 4).

4. Discussion
The knowledge of 3D (three-dimensional) structures of target proteins and their binding sites
with ligands is vitally important for rational drug design. Although X-ray crystallography is a
powerful tool in this regard, it is time-consuming and expensive, and not all proteins can be
successfully crystallized. Membrane proteins are difficult to crystallize and most of them will not
dissolve in normal solvents. Therefore, so far very few membrane protein structures have been
determined. Although recent breakthroughs in high-resolution NMR have indicated that it is indeed
a very powerful tool for determining the 3D structures of membrane proteins and their complexes
[Liu et al, 2015; Chou, 2015], it is also time-consuming and costly. To acquire the structural
information in a timely manner, a series of 3D protein structures and their binding sites with
ligands were derived by using various structural bioinformatics tools [Chou, 2005; Du et al,
2012; Cao et al, 2013] (see also a comprehensive review [Lin and Lapointe, 2013]), and were found
very useful for drug design. In this study, we use some powerful bioinformatics tools, such
as PseAAC and docking, to reveal the differences in lignin peroxidases between fungi and
bacteria. Information about the binding pocket of a receptor for its ligand is very important for
drug design, particularly for conducting mutagenesis studies [Chou, 2004]. In the literature, the
binding pocket of a protein receptor to a ligand is usually defined by those residues that have at
least one heavy atom (i.e., an atom other than hydrogen) within a distance of 5Å from a heavy
atom of the ligand. Such a criterion was originally used to define the binding pocket of ATP in
the Cdk5-Nck5a* complex [Watenpaugh and Heinrikson, 1999], which later proved quite
useful in identifying functional domains and stimulating the relevant truncation experiments
[Zhang et al, 2002]. A similar approach has also been used to define the binding pockets of
many other receptor-ligand interactions important for drug design [Wei and Zhong, 2003; Huang
et al, 2008; Wang, 2012]. The present results demonstrated that the physical and chemical
properties, secondary structure and PseAAC of LiPs encoded by fungi and bacteria were significantly
different. Previous studies reported that some fungi are able to degrade lignin and
decolorize dyes, but have several disadvantages, including longer treatment times. The bacterial
ligninolytic enzymes may be more suitable for industrial application because of their high
catalytic efficiencies (Shi et al, 2014; Arora and Gill, 2001). In the present study, the diversity of
fungal and bacterial LiPs is investigated for the first time. In this research, extinction
coefficient, instability index, aliphatic index, molecular weight and isoelectric point were
significantly different between LiPs in fungi and bacteria. Among those factors, aliphatic index
and extinction coefficient could be more important for enzyme activity (Huang et al, 2011).
Aliphatic index is defined as the relative volume of a protein occupied by aliphatic side chains
(alanine, valine, isoleucine, and leucine) (Viader-Salvadó et al, 2010). According to the results,
the high value of the aliphatic index may be regarded as a positive factor for high catalytic
efficiency in bacteria. Using the classification technique applied here in combination with PseAAC,
we are also able to predict whether a given LiP sequence is of fungal or bacterial origin. This is
possible because the classifier reached over 80% accuracy, the level taken here as acceptable.
These results demonstrated that computational predictors were useful tools to compare LiPs in
fungi and bacteria. As shown in a series of recent publications [Chen et al, 2016; Jia et al, 2016; Jia et al,
2016; Jia et al, 2016; Jia et al, 2016; Liu et al, 2016; Liu et al, 2016; Liu et al, 2016;Chen et al,
2016; Jia et al, 2016; Liu et al, 2016; Qiu et al, 2016; Qiu et al, 2016; Qiu et al, 2016; Qiu et al,
2016; Xiao et al, 2016], user-friendly and publicly accessible web-servers significantly enhance
the impact of new findings or approaches [Chou, 2015]. We shall therefore make
efforts in our future work to provide a web-server that displays the findings and can be
manipulated by users according to their needs.
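The binding-pocket criterion quoted in this Discussion (residues with at least one heavy atom within 5 Å of a heavy atom of the ligand) can be sketched as follows. The data layout is an assumption made for illustration: receptor atoms are given as (residue identifier, element, coordinates) tuples and ligand atoms as (element, coordinates) tuples, as they might be after parsing a docked complex. The function name is hypothetical, and math.dist requires Python 3.8 or later.

```python
import math


def binding_pocket(receptor_atoms, ligand_atoms, cutoff=5.0):
    """Residue identifiers with a heavy atom within `cutoff` angstroms of a ligand heavy atom."""
    # Keep only non-hydrogen ligand atoms.
    ligand_heavy = [xyz for element, xyz in ligand_atoms if element.upper() != "H"]
    pocket = set()
    for residue_id, element, xyz in receptor_atoms:
        # Skip hydrogens and residues that already qualified.
        if element.upper() == "H" or residue_id in pocket:
            continue
        if any(math.dist(xyz, lig_xyz) <= cutoff for lig_xyz in ligand_heavy):
            pocket.add(residue_id)
    return pocket
```

Applied to the best-ranked docking pose of each LiP, such a function would list the residues lining the lignan binding site for comparison between the fungal and bacterial enzymes.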

5. Conclusion
Based on the results, it was demonstrated that bacterial LiPs have higher efficacy than fungal
LiPs. Computational predictors were also useful tools to compare LiPs in fungi and bacteria, and
the high value of the aliphatic index may be regarded as a positive factor for high catalytic
efficiency in bacteria.

6. Acknowledgment

This research did not receive any specific grant from funding agencies in the public, commercial, or not-
for-profit sectors.

References

Arora, D.S. Gill, P.K., 2001. Effects of various media and supplements on laccase production by some white rot
fungi. Bioresource Technology 77(1):89-91.
Bugg, T.D., Ahmad, M., Hardiman, E.M. and Singh, R.,2011. The emerging role for bacteria in lignin degradation
and bio-product formation. Current opinion in biotechnology, 22(3):394-400.
Bailey, T.L., Boden, M., Buske, F.A., Frith, M., Grant, C.E., Clementi, L., Ren, J., Li, W.W.Noble, W.S., 2009.
MEME SUITE: tools for motif discovery and searching. Nucleic acids research,335.
Chou, K.C., 2001. Prediction of protein cellular attributes using pseudo‐amino acid composition. Proteins: Structure,
Function, and Bioinformatics,43(3), pp.246-255.
Crawford, D.L.Crawford, R.L., 1976. Microbial degradation of lignocellulose: the lignin component. Applied and
Environmental Microbiology, 31(5):714-717.
Chou, K.C. Review: Structural bioinformatics and its impact to biomedical science. Current Medicinal
Chemistry, 2004, 11, 2105-2134.
Chou, K.C. Impacts of bioinformatics to medicinal chemistry. Medicinal Chemistry, 2015, 11, 218-234.
Chou, K.C. Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes.
Bioinformatics, 2005, 21, 10-19.
Cao, D.S.; Xu, Q.S.; Liang, Y.Z. propy: a tool to generate various modes of Chou's PseAAC.
Bioinformatics, 2013, 29, 960-962.
Chou, K.C. Some remarks on protein attribute prediction and pseudo amino acid composition (50th
Anniversary Year Review). Journal of Theoretical Biology, 2011, 273, 236-247.
Chou, K.C. Pseudo amino acid composition and its applications in bioinformatics, proteomics and
system biology. Current Proteomics, 2009, 6, 262-274.
Chen, W.; Lei, T.Y.; Jin, D.C. PseKNC: a flexible web-server for generating pseudo K-tuple nucleotide
composition. Analytical Biochemistry, 2014, 456, 53-60.
Chen, W.; Zhang, X.; Brooker, J. PseKNC-General: a cross-platform package for generating various
modes of pseudo nucleotide compositions. Bioinformatics, 2015, 31, 119-120.
Chen, W.; Lin, H. Pseudo nucleotide composition or PseKNC: an effective formulation for analyzing
genomic sequences. Mol Biosyst, 2015, 11, 2620-2634.
Chen, N.Y. The biological functions of low-frequency phonons. Scientia Sinica, 1977, 20, 447- 457.
Chou, K.C. Low-frequency resonance and cooperativity of hemoglobin. Trends in Biochemical Sciences,
1989, 14, 212-213.
Chou, K.C. Review: Low-frequency collective motion in biomacromolecules and its biological functions.
Biophysical Chemistry, 1988, 30, 3-48.
Du, P.; Wang, X.; Xu, C.; Gao, Y. PseAAC-Builder: A cross-platform stand-alone program for
generating various special Chou's pseudo-amino acid compositions. Analytical Biochemistry, 2012, 425,
117-119.
Dehzangi, A.; Heffernan, R.; Sharma, A.; Lyons, J.; Paliwal, K.; Sattar, A. Gram-positive and Gram-
negative protein subcellular localization by incorporating evolutionary-based descriptors into Chou's
general PseAAC. Journal of Theoretical Biology, 2015, 364, 284-294.

Du, P.; Gu, S.; Jiao, Y. PseAAC-General: Fast building various modes of general form of Chou's pseudo-
amino acid composition for large-scale protein datasets. International Journal of Molecular Sciences,
2014, 15, 3495-3506.

Eom, M.H. Kim, Y.H., 2014.Inactivating effect of phenolic unit structures on the biodegradation of lignin by lignin
peroxidase from Phanerochaete chrysosporium. Enzyme and microbial technology,61:48-54.
Fu, L., Niu, B., Zhu, Z., Wu, S. Li, W., 2012. CD-HIT: accelerated for clustering the next-generation sequencing
data. Bioinformatics, 28(23):3150-3152.
Fawcett, T., 2006. An introduction to ROC analysis. Pattern recognition letters, 27(8):861-874.
Gasteiger, E., 2005. In The Proteomics Protocols Handbook (ed Walker J.) 571–607.
Frías I, Trujillo JM, Romero J, Hernandez J, Perez JA. Lignan models as inhibitors of Phanerochaete chrysosporium
lignin peroxidase. Biochimie. 1995 Dec 31;77(9):707-12.
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P. Witten, I.H., 2009. The WEKA data mining
software: an update. ACM SIGKDD explorations newsletter, 11(1):10-18.
Huang, Y.B., Wang, X.F., Wang, H.Y., Liu, Y.Chen, Y., 2011. Studies on mechanism of action of anticancer
peptides by modulation of hydrophobicity within a defined structural framework. Molecular cancer
therapeutics, 10(3):416-426.
Janusz, G., Kucharzyk, K.H., Pawlik, A., Staszczak, M. Paszczynski, A.J., 2013. Fungal laccase, manganese
peroxidase and lignin peroxidase: gene expression and regulation. Enzyme and Microbial technology, 52(1):1-12.
Johansson, T. Nyman, P.O., 1993. Isozymes of lignin peroxidase and manganese (II) peroxidase from the white-rot
basidiomycete Trametes versicolor: I. Isolation of enzyme forms and characterization of physical and catalytic
properties. Archives of biochemistry and biophysics, 300(1):49-56.
Kloczkowski, A., Ting, K.L., Jernigan, R.L.Garnier, J., 2002. Protein secondary structure prediction based on the
GOR algorithm incorporating multiple sequence alignment information. Polymer, 43(2):441-449.
Kabir, M.; Hayat, M. iRSpot-GAEnsC: identifing recombination spots via ensemble classifier and
extending the concept of Chou's PseAAC to formulate DNA samples. Mol Genet Genomics, 2016, 291,
285-296.
Kiang, Y.S. The biological functions of low-frequency phonons: 5. A phenomenological theory.
Biophysical Chemistry, 1985, 22, 219-235.
Kumar, R.; Srivastava, A.; Kumari, B.; Kumar, M. Prediction of beta-lactamase and its class by Chou's
pseudo-amino acid composition and support vector machine. Journal of Theoretical Biology, 2015, 365,
96-103.
Lin, S.X.; Lapointe, J. Theoretical and experimental biology in one —A symposium in honour of
Professor Kuo-Chen Chou's 50th anniversary and Professor Richard Giegé's 40th anniversary of their
scientific careers. J. Biomedical Science and Engineering (JBiSE), 2013, 6, 435-442.
Liu, B.; Liu, F.; Fang, L. repDNA: a Python package to generate various modes of feature vectors for
DNA sequences by incorporating user-defined physicochemical properties and sequence-order effects.
Bioinformatics, 2015, 31, 1307-1309.
Liu, B.; Liu, F.; Wang, X.; Chen, J. Pse-in-One: a web server for generating various modes of pseudo
components of DNA, RNA, and protein sequences Nucleic Acids Research, 2015, 43, W65-W71.
Morris, G.M., Goodsell, D.S., Halliday, R.S., Huey, R., Hart, W.E., Belew, R.K. Olson, A.J., 1998. Automated
docking using a Lamarckian genetic algorithm and an empirical binding free energy function. Journal of
computational chemistry, 19(14):1639-1662.

Mondal, S.; Pai, P.P. Chou's pseudo amino acid composition improves sequence-based antifreeze protein
prediction. Journal of Theoretical Biology, 2014, 356, 30-35.
Mao, B. Collective motion in DNA and its role in drug intercalation. Biopolymers, 1988, 27, 1795-1815.
Parikh, R., Mathai, A., Parikh, S., Sekhar, G.C.Thomas, R., 2008. Understanding and using sensitivity, specificity
and predictive values. Indian journal of ophthalmology, 56(1):45.
Ruiz-Duenas, F.J., Lundell, T., Floudas, D., Nagy, L.G., Barrasa, J.M., Hibbett, D.S. Martínez, A.T., 2013. Lignin-
degrading peroxidases in Polyporales: an evolutionary survey based on 10 sequenced
genomes.Mycologia, 105(6):1428-1444.
Ramachandra, M., Crawford, D.L.Hertel, G.,1988. Characterization of an extracellular lignin peroxidase of the
lignocellulolytic actinomycete Streptomyces viridosporus. Applied and Environmental Microbiology, 54(12):3057-
3063.
Shi, L., Yu, H., Dong, T., Kong, W., Ke, M., Ma, F.Zhang, X., 2014. Biochemical and molecular characterization of
a novel laccase from selective lignin-degrading white-rot fungus Echinodontium taxodii 2538.Process
Biochemistry, 49(7),1097-1106.
Viader-Salvadó, J.M., Gallegos-López, J.A., Carreón-Trevino, J.G., Castillo-Galván, M., Rojo-Domínguez,
A.Guerrero-Olazarán, M., 2010. Design of thermostable beta-propeller phytases with activity over a broad range of
pHs and their overproduction by Pichia pastoris. Applied and environmental microbiology, 76(19):6423-6430.

Watenpaugh, K.D.; Heinrikson, R.L. A Model of the complex between cyclin-dependent kinase 5 (Cdk5)
and the activation domain of neuronal Cdk5 activator. Biochemical & Biophysical Research
Communications (BBRC), 1999, 259, 420-428.

Wang, X.; Zhang, W.; Zhang, Q.; Li, G.Z. MultiP-SChlo: multi-label protein subchloroplast localization
prediction with Chou's pseudo amino acid composition and a novel multi-label classifier. Bioinformatics,
2015, 31, 2639-2645.

Wang, J.F.; Gong, K.; Wei, D.Q. Molecular dynamics studies on the interactions of PTP1B with
inhibitors: from the first phosphate-binding site to the second one. Protein Eng Des Sel (PEDS), 2009, 22,
349-355.

Wang, J.F. Insights from studying the mutation-induced allostery in the M2 proton channel by molecular
dynamics. Protein Eng Des Sel (PEDS), 2010, 23, 663-666.

Zimmermann, W.,1990. Degradation of lignin by bacteria. Journal of biotechnology, 13(2-3):119-1.

Zhang, J.; Luan, C.H.; Johnson, G.V.W. Identification of the N-terminal functional domains of Cdk5 by
molecular truncation and computer modeling. Proteins: Structure, Function, and Genetics, 2002, 48, 447-
453.

Zhong, W.Z.; Zhou, S.F. Molecular science for drug development and biomedicine. International Journal
of Molecular Sciences, 2014, 15, 20072-20078.

Zhou, G.P.; Zhong, W.Z. Perspectives in Medicinal Chemistry. Current Topics in Medicinal Chemistry,
2016, 16, 381-382.

Zhang, C.T.; Maggiora, G.M. Solitary wave dynamics as a mechanism for explaining the internal motion
during microtubule growth. Biopolymers, 1994, 34, 143-153.

Table 1. ROC curve analysis of ProtParam results
Physicochemical   Molecular   Extinction    Isoelectric   Aliphatic   GRAVY
properties        weight      coefficient   point         index
ACC               0.843       0.88          0.860         0.892       0.719
AUC               0.741       0.801         0.771         0.866       0.69

Table 2. ROC curve analysis of GOR IV results
2D structure   Alpha helix   Extended strand   Random coil
ACC            0.760         0.933             0.925
AUC            0.632         0.934             0.8702

Table 3. Performance of PseAAC-based MNB classification of datasets
Performance parameters          Number of incorrectly classified data
ACC (%)   PPV (%)   NPV (%)     Fungal sequences   Bacterial enzyme
82.16     81.25     82.4        6                  22

Table 4. Physicochemical properties of most probable motifs
Motif                            Theoretical   Extinction coefficient   Instability   Aliphatic   Grand average of
                                 pI            (M-1 cm-1)               index         index       hydropathicity
Motif 1 in bacterial sequences   4.59          5500                     27.64         75.43       -0.161
Motif 3 in fungal sequences      5.05          5500                     50.90         73.86       -0.431

Table 5. Statistical analysis of docking results
Studied sequences   Number of sequences   Mean of ∆Gb    Mean of Ki
Fungal              23                    -8.42±1.2      55.2±1.3
Bacterial           107                   -10.47±0.97    48.6±1.7

Figure 1. Logos of the most probable motifs in fungal (a) and bacterial (b) sequences
Highlights:

1. The classification of these two groups based on the concept of PseAAC showed
over 80% accuracy.

2. The estimated binding free energy between bacterial LiPs and lignan was significantly more
favorable (more negative) than that between fungal LiPs and lignan.

3. The computational predictors were useful tools to compare LiPs in fungi and
bacteria, and the high value of the aliphatic index may be regarded as a positive factor
for high catalytic efficiency in bacteria.
