Abstract—Evolution of drug-resistant microbial species is one of the major challenges to global health. Development of new
antimicrobial treatments such as antimicrobial peptides needs to be accelerated to combat this threat. However, the discovery of novel
antimicrobial peptides is hampered by low-throughput biochemical assays. Computational techniques can be used for rapid screening of
promising antimicrobial peptide candidates prior to testing in the wet lab. The vast majority of existing antimicrobial peptide predictors are
non-targeted in nature, i.e., they can predict whether a given peptide sequence is antimicrobial, but they are unable to predict whether the
sequence can target a particular microbial species. In this work, we have used zero and few shot machine learning to develop a targeted
antimicrobial peptide activity predictor called AMP0. The proposed predictor takes the sequence of a peptide and any N/C-termini
modifications together with the genomic sequence of a microbial species to generate targeted predictions. Cross-validation results show
that the proposed scheme is particularly effective for targeted antimicrobial prediction in comparison to existing approaches and can be
used for screening potential antimicrobial peptides in a targeted manner with only a small number of training examples for novel species.
The AMP0 webserver is available at http://ampzero.pythonanywhere.com.
Index Terms—Antibiotic resistance, antimicrobial peptides, zero/few shot learning, target microbial species
1545-5963 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: University of Rijeka Croatia. Downloaded on October 11,2023 at 12:49:22 UTC from IEEE Xplore. Restrictions apply.
276 IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, VOL. 19, NO. 1, JANUARY/FEBRUARY 2022
TABLE 1
Filtering Criteria Applied to DBAASP Database to
Obtain Required Dataset
species (classes) available during training are denoted by $m$ and $z$, respectively and the rescaled MIC scores for each of the peptides against each microbe is represented by the $m \times z$ matrix $Y \in [-1, 1]^{m \times z}$, the learning problem for ZSL can be formulated as the following optimization problem:

$$Q^* = \arg\min_{Q \in \mathbb{R}^{d \times a}} \; \lVert X^T Q S - Y \rVert^2_{Fro} + \left( \gamma \lVert Q S \rVert^2_F + \lambda \lVert X^T Q \rVert^2_F + \gamma\lambda \lVert Q \rVert^2_F \right).$$

Here, $X \in \mathbb{R}^{d \times m}$ and $S \in \mathbb{R}^{a \times z}$ represent matrices of all peptide features ($m$ examples each with a $d$-dimensional feature vector) and attributes of microbial species ($z$ classes each with $a$ attributes), respectively. The first term represents the loss function with the aim of minimizing the error between predicted and target MICs. The second term ($\gamma \lVert Q S \rVert^2_F + \lambda \lVert X^T Q \rVert^2_F + \gamma\lambda \lVert Q \rVert^2_F$) is the regularization factor that ensures smoothness of the prediction function $f(x, s; Q)$ and sparsity of the weight matrix $Q$ through penalization of the Frobenius norm $\lVert \cdot \rVert^2_F$ of respective matrices. $\gamma$ and $\lambda$ are regularization hyper-parameters. In addition to better performance over benchmark datasets, another reason for choosing this ZSL implementation is the existence of a computationally efficient closed-form solution of its underlying optimization problem which can be written as follows:

$$Q^* = \left( X X^T + \gamma I \right)^{-1} X Y S^T \left( S S^T + \lambda I \right)^{-1}.$$

Once the optimal weight matrix $Q^*$ has been obtained, the predictions for a peptide (represented by the feature vector $x$) for species (represented by the attribute vector $s$) can be generated by the decision function $f(x, s; Q^*) = x^T Q^* s$. Note that this decision function can be used for generating predictions both for novel peptides and novel species provided their attribute representation $s$ is available. The most likely target species for a given peptide can be identified by simply ranking the resulting decision function scores across a given list of potential target species.

This formulation can be kernelized for non-linear kernels by applying the Representer theorem to the underlying optimization problem [35]. For this purpose, an $m \times m$ sized kernel matrix $K$ with $K_{ij} = k(x_i, x_j)$ is computed over the training data using a kernel function such as the radial basis function (RBF) $k(a, b) = \exp(-\kappa \lVert a - b \rVert^2)$ with the hyper-parameter $\kappa > 0$. The closed form solution of the kernelized ZSL optimization problem requires calculation of an $(m \times a)$ dimensional instance-attribute association matrix $A$ from training data as follows (see [35] for details):

$$A = \left( K^T K + \gamma I \right)^{-1} K Y S^T \left( S S^T + \lambda I \right)^{-1}.$$

For inference or prediction of effectiveness of a peptide represented by a feature vector $x$ against a microbial species represented by its attribute vector $s$, an $m$-dimensional vector of kernel scores $\mathbf{k}(x) = [\, k(x, x_1) \;\; k(x, x_2) \;\; \cdots \;\; k(x, x_m) \,]^T$ of the test example with each training example is computed and used in the kernelized prediction function $f(x, s; A) = \mathbf{k}(x)^T A s$.

It is important to note that this framework extends seamlessly to FSL by simply adding further training instances for a target class. The hyperparameters of the model $(\gamma, \lambda, \kappa)$ are tuned through cross-validation. The best performance of the model was found using $\gamma = 2.0$, $\lambda = 0.0001$, and the hyperparameter $\kappa$ of RBF kernel set to 2.0.

2.4 Performance Evaluation
We consider two practical use-cases of our system: 1) Target Species Ranking (TSR): given a set of microbial species for which labeled peptide sequences are available for training, predict the microbe that is most likely to be targeted by a novel peptide sequence and, 2) Peptide Activity Prediction for Novel Species (PAP): predict whether a peptide is effective against a given species or not such that no or very few peptide examples for that species are available during training (i.e., Zero Shot or Few Shot Learning scenario) (see Fig. 4). It is important to note that both these scenarios reflect practical use cases for biologists who are interested in machine-learning guided discovery for targeted antimicrobial peptides.
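The closed-form ZSL training step, the decision function and the species ranking used for TSR can be illustrated with a short NumPy sketch. This is a minimal sketch under stated assumptions, not the AMP0 implementation: the function names (`rbf_kernel`, `fit_linear_zsl`, `fit_kernel_zsl`, `rank_species`, `rfpp`) and the random toy data are hypothetical, RFPP is assumed here to mean the rank of the first positive prediction, and the kernelized solve uses $S^T$ so that the association matrix has the stated $m \times a$ shape. The default hyperparameter values are the ones reported above as giving the best performance.

```python
import numpy as np


def rbf_kernel(U, V, kappa=2.0):
    """RBF kernel k(a, b) = exp(-kappa * ||a - b||^2) between rows of U and rows of V."""
    sq_dist = (U ** 2).sum(axis=1)[:, None] + (V ** 2).sum(axis=1)[None, :] - 2.0 * U @ V.T
    return np.exp(-kappa * np.maximum(sq_dist, 0.0))  # clip tiny negatives from round-off


def fit_linear_zsl(X, S, Y, gamma=2.0, lam=1e-4):
    """Q* = (X X^T + gamma I)^-1 X Y S^T (S S^T + lam I)^-1.

    X: d x m peptide features, S: a x z species attributes, Y: m x z targets.
    """
    d, a = X.shape[0], S.shape[0]
    left = np.linalg.solve(X @ X.T + gamma * np.eye(d), X @ Y @ S.T)  # d x a
    # Right-multiply by the inverse of the symmetric matrix (S S^T + lam I).
    return np.linalg.solve(S @ S.T + lam * np.eye(a), left.T).T


def fit_kernel_zsl(K, S, Y, gamma=2.0, lam=1e-4):
    """A = (K^T K + gamma I)^-1 K Y S^T (S S^T + lam I)^-1 for an m x m kernel matrix K."""
    m, a = K.shape[0], S.shape[0]
    left = np.linalg.solve(K.T @ K + gamma * np.eye(m), K @ Y @ S.T)  # m x a
    return np.linalg.solve(S @ S.T + lam * np.eye(a), left.T).T


def rank_species(scores):
    """Indices of candidate species sorted from highest to lowest decision score."""
    return np.argsort(-np.asarray(scores))


def rfpp(scores, is_target):
    """Rank (1 = best) of the highest-scoring true target species (assumed RFPP definition)."""
    order = rank_species(scores)
    return int(np.flatnonzero(np.asarray(is_target)[order])[0]) + 1


# Toy data (random, for illustration only; not peptide or genome features).
rng = np.random.default_rng(0)
d, m, a, z = 4, 30, 3, 8
X = rng.normal(size=(d, m))
S = rng.normal(size=(a, z))
Y = np.clip(X.T @ rng.normal(size=(d, a)) @ S, -1.0, 1.0)  # stand-in for rescaled MIC scores

# Linear model: f(x, s; Q*) = x^T Q* s evaluated for every candidate species.
Q = fit_linear_zsl(X, S, Y)
x_new = rng.normal(size=d)
linear_scores = x_new @ Q @ S

# Kernelized model: f(x, s; A) = k(x)^T A s.
K = rbf_kernel(X.T, X.T)
A = fit_kernel_zsl(K, S, Y)
k_x = rbf_kernel(x_new[None, :], X.T)[0]  # kernel scores against all training peptides
kernel_scores = k_x @ A @ S

print(rank_species(kernel_scores)[:3])  # three highest-ranked candidate species
```

Both fits call `np.linalg.solve` rather than forming explicit inverses, which is the usual numerically safer choice. Adding few-shot examples for a target species only means appending columns to `X` and rows to `Y` before refitting.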
GULL AND MINHAS: AMP0: SPECIES-SPECIFIC PREDICTION OF ANTI-MICROBIAL PEPTIDES USING ZERO AND FEW SHOT LEARNING 279
TABLE 2
RFPP Prediction Scores Generated by Various Baseline and Proposed Model
3.1 Target Species Ranking (TSR)
Fig. 5 shows the percentile-wise RFPP scores for all classifiers. As discussed in Section 2.4, the ideal RFPP score for all peptides is 1.0. For the random classifier that generates a random score for a given example, the median RFPP is 75, i.e., for 50 percent of test peptides in cross-validation, a true target species is within the top 75 (out of 336) predictions. In contrast, for the XGBoost and SVM baseline models, the median RFPPs are 50 and 9, respectively. However, the proposed model performs much better than these baseline models: the RFPP for the proposed model at the 75th percentile is 1.0, i.e., for up to 75 percent of peptides, the top prediction by the model is correct. This clearly shows the effectiveness of the proposed prediction scheme for identifying the correct target species of a peptide.

A numeric comparison of different classification methods with 1-mer and 2-mer peptide features through 5-fold cross-validation is given in Table 2. The 2-mer representation works well for the SVM and XGBoost classifiers whereas, for nearest neighbor and neural network models, the 1-mer feature representation gives better results. For the proposed ZSL model, the 2-mer representation gives better predictive accuracy as shown in Table 2. The supplementary material, available online, contains more detailed comparative results over 10-fold cross-validation, non-redundant cross-validation and additional random peptide negative examples. A change in the number of cross-validation folds has a marginal impact on the predictive performance of the proposed model. Our non-redundant cross-validation analysis shows that the proposed model is significantly better than other machine learning methods in predicting the target species of peptides when the test peptides share less than 40 percent sequence identity with training peptides. Furthermore, the proposed scheme is also robust to additional random negative test examples. These analyses clearly show the effectiveness of the proposed model in comparison to a variety of classical machine learning methods.

target a novel species for which no or very few training examples are available. For this purpose, we compare the performance of conventional machine learning models (SVM, XGBoost), the proposed Zero Shot Learning (ZSL) and Few Shot Learning (FSL) models in addition to existing state of the art non-targeted antimicrobial activity predictors (CAMP [22], [74] and AMAP [19]). For this use case, XGBoost with amino acid composition features performed significantly better than SVM (results not shown for brevity). However, the prediction performance of XGBoost was typically no better than a random classifier, especially when the number of training examples from a given test species was very small (see Supplementary Information, available online, for complete results). Existing state of the art methods such as CAMP [22] and AMAP [19] do not give satisfactory predictive performance for the chosen species. In contrast, the proposed few shot learning model performs significantly better with an expected increase in prediction accuracy when the number of training examples of a species is increased. The supplementary material, available online, contains more detailed results in which we analyze the impact of genomic distance between train and test species on predictive performance of the proposed model. This analysis shows that there is a negative correlation (Pearson correlation coefficient of −0.3) between predictive accuracy and genomic distance, i.e., as expected, if the test species is similar to a training species, the predictions can be expected to be more accurate. However, this analysis also shows that the proposed model does not undergo an abrupt degradation in predictive performance when generating predictions for test species that are different from species used in training.

TABLE 3
Results for Peptide Activity Prediction for Novel Species
For each species, the number of positive (P) and negative (N) examples is given together with its average test AUC-ROC (with standard deviation in parentheses).

3.3 Feature Analysis
In order to gain an insight into the importance of various features used by the linear ZSL model, we have analyzed the weight values of the parameter matrix $Q^*$. Fig. 6 shows the sum of the weight values for each L- (small) and D-type (capitalized) amino acid in the feature representation across all species. The large magnitudes of the weights of amino acids G, g, F, f, P, p, and w correlate with literature findings about the importance of these amino acids in AMPs. Specifically, Proline-rich peptides (P) have the capability of bacterial cell penetration. Glycine (G) improves antimicrobial activity of peptides and potentially targets fungi, Gram-negative bacteria, and cancer cells. Tryptophan (W) can penetrate a microbial cell membrane and is effective against numerous antibiotic resistant bacteria. Phenylalanine-rich (F) AMPs have higher antimicrobial activity against Gram-positive bacteria, Gram-negative bacteria and yeast without hemolytic activity [76]. Cysteine (C) is also an important amino acid in natural antimicrobial peptides of vertebrates, invertebrates and plants [77], which have a pronounced ability to form pores in a membrane, leading to high antimicrobial activity [76]. The supplementary material, available online, shows plots of the parameter matrix for genomic feature vector components and their association with peptide sequence features.

3.4 Webserver
The webserver developed for the proposed model is available at the URL: http://ampzero.pythonanywhere.com. The webserver takes a peptide sequence along with any N/C-terminus modifications as input together with the genome of a species in order to predict the effectiveness of the
peptide against the given species. The user can upload a list of known positive and negative example peptide sequences for the given species for FSL predictions.

4 CONCLUSIONS
We have developed a targeted antimicrobial activity predictor called AMP0 which can predict the effectiveness of a given peptide sequence against a target species. The use of zero and few shot learning in the proposed model helps in overcoming the shortcomings of conventional machine learning techniques for this purpose. Our cross-validation analysis shows that the proposed model can perform better than existing approaches and it can be easily integrated in experimental discovery of antimicrobial peptide sequences for novel species.

ACKNOWLEDGMENTS
Sadaf Gull is supported by a grant under indigenous 5000 PhD fellowship scheme by the Higher Education Commission (HEC) of Pakistan.

REFERENCES
[1] B. Aslam et al., “Antibiotic resistance: A rundown of a global crisis,” Infection Drug Resistance, vol. 11, 2018, Art. no. 1645.
[2] C. L. Ventola, “The antibiotic resistance crisis: Part 1: Causes and threats,” Pharmacy Ther., vol. 40, no. 4, 2015, Art. no. 277.
[3] J. M. Blair, “A climate for antibiotic resistance,” Nature Climate Change, vol. 8, no. 6, 2018, Art. no. 460.
[4] M. Lakemeyer, W. Zhao, F. A. Mandl, P. Hammann, and S. A. Sieber, “Thinking outside the box—Novel antibacterials to tackle the resistance crisis,” Angewandte Chemie Int. Edition, vol. 57, no. 44, pp. 14440–14475, 2018.
[5] C. N. Spaulding, R. D. Klein, H. L. Schreiber, J. W. Janetka, and S. J. Hultgren, “Precision antimicrobial therapeutics: The path of least resistance?,” NPJ Biofilms Microbiomes, vol. 4, no. 1, 2018, Art. no. 4.
[6] F. Kampshoff, M. D. Willcox, and D. Dutta, “A pilot study of the synergy between two antimicrobial peptides and two common antibiotics,” Antibiotics, vol. 8, no. 2, 2019, Art. no. 60.
[7] F. Costa, C. Teixeira, P. Gomes, and M. C. L. Martins, “Clinical application of AMPs,” in Antimicrobial Peptides, Springer, 2019, pp. 281–298.
[8] G. Yu, D. Y. Baeder, R. R. Regoes, and J. Rolff, “Predicting drug resistance evolution: Insights from antimicrobial peptides and antibiotics,” Proc. Roy. Soc. B: Biol. Sci., vol. 285, no. 1874, 2018, Art. no. 20172687.
[9] A. Sokolov, C. Funk, K. Graim, K. Verspoor, and A. Ben-Hur, “Combining heterogeneous data sources for accurate functional annotation of proteins,” BMC Bioinf., vol. 14, 2013, Art. no. S10.
[10] T. L. Campos, P. K. Korhonen, R. B. Gasser, and N. D. Young, “An evaluation of machine learning approaches for the prediction of essential genes in eukaryotes using protein sequence-derived features,” Comput. Struct. Biotechnol. J., vol. 17, pp. 785–796, 2019.
[11] M. Kulmanov, M. A. Khan, and R. Hoehndorf, “DeepGO: Predicting protein functions from sequence and interactions using a deep ontology-aware classifier,” Bioinformatics, vol. 34, no. 4, pp. 660–668, 2017.
[12] A. S. Rifaioglu, T. Doğan, M. J. Martin, R. Cetin-Atalay, and V. Atalay, “DEEPred: Automated protein function prediction with multi-task feed-forward deep neural networks,” Sci. Rep., vol. 9, no. 1, Art. no. 7344, 2019.
[13] R. Fa, D. Cozzetto, C. Wan, and D. T. Jones, “Predicting human protein function with multi-task deep neural networks,” PloS One, vol. 13, no. 6, 2018, Art. no. e0198216.
[14] S. Hua and Z. Sun, “Support vector machine approach for protein subcellular localization prediction,” Bioinformatics, vol. 17, no. 8, pp. 721–728, 2001.
[15] Z. Teng, M. Guo, Q. Dai, C. Wang, J. Li, and X. Liu, “Computational prediction of protein function based on weighted mapping of domains and GO terms,” BioMed. Res. Int., vol. 2014, pp. 1–9, 2014.
[16] P. Radivojac et al., “A large-scale evaluation of computational protein function prediction,” Nature Methods, vol. 10, no. 3, 2013, Art. no. 221.
[17] A. Valencia, “Automatic annotation of protein function,” Curr. Opinion Struct. Biol., vol. 15, no. 3, pp. 267–274, 2005.
[18] B. Rost, J. Liu, R. Nair, K. O. Wrzeszczynski, and Y. Ofran, “Automatic prediction of protein function,” Cellular Mol. Life Sci. CMLS, vol. 60, no. 12, pp. 2637–2650, 2003.
[19] S. Gull, N. Shamim, and F. Minhas, “AMAP: Hierarchical multi-label prediction of biologically active and antimicrobial peptides,” Comput. Biol. Med., vol. 107, pp. 172–181, 2019.
[20] P. Bhadra, J. Yan, J. Li, S. Fong, and S. W. Siu, “AmPEP: Sequence-based prediction of antimicrobial peptides using distribution patterns of amino acid properties and random forest,” Sci. Rep., vol. 8, no. 1, 2018, Art. no. 1697.
[21] M. Torrent, V. M. Nogues, and E. Boix, “A theoretical approach to spot active regions in antimicrobial proteins,” BMC Bioinf., vol. 10, no. 1, 2009, Art. no. 373.
[22] F. H. Waghu, R. S. Barai, P. Gurung, and S. Idicula-Thomas, “CAMPR3: A database on sequences, structures and signatures of antimicrobial peptides,” Nucleic Acids Res., vol. 44, no. D1, pp. D1094–D1097, 2015.
[23] W. Lin and D. Xu, “Imbalanced multi-label learning for identifying antimicrobial peptides and their functional types,” Bioinformatics, vol. 32, no. 24, pp. 3745–3752, 2016.
[24] P. Agrawal and G. P. Raghava, “Prediction of antimicrobial potential of a chemically modified peptide from its tertiary structure,” Front. Microbiol., vol. 9, 2018, Art. no. 2551.
[25] V. V. Kleandrova, J. M. Ruso, A. Speck-Planche, and M. N. Dias Soeiro Cordeiro, “Enabling the discovery and virtual screening of potent and safe antimicrobial peptides. simultaneous prediction of antibacterial activity and cytotoxicity,” ACS Combinatorial Sci., vol. 18, no. 8, pp. 490–498, 2016.
[26] B. Vishnepolsky et al., “Predictive model of linear antimicrobial peptides active against gram-negative bacteria,” J. Chem. Inform. Model., vol. 58, no. 5, pp. 1141–1151, 2018.
[27] A. Speck-Planche, V. V. Kleandrova, J. M. Ruso, and M. DS Cordeiro, “First multitarget chemo-bioinformatic model to enable the discovery of antibacterial peptides against multiple gram-positive pathogens,” J. Chem. Inform. Model., vol. 56, no. 3, pp. 588–598, 2016.
[28] H. Larochelle, D. Erhan, and Y. Bengio, “Zero-data learning of new tasks,” in Proc. 23rd Nat. Conf. Artif. Intell. - Vol. 2, 2008, pp. 646–651, Accessed: Apr. 25, 2020. [Online].
[29] M. Palatucci, D. Pomerleau, G. E. Hinton, and T. M. Mitchell, “Zero-shot learning with semantic output codes,” in Proc. Int. Conf. Neural Inf. Process. Syst., 2009, pp. 1410–1418.
[30] Z. Zhang and V. Saligrama, “Zero-shot learning via semantic similarity embedding,” in Proc. IEEE Int. Conf. Comput. Vis., 2015, pp. 4166–4174.
[31] R. Socher, M. Ganjoo, C. D. Manning, and A. Ng, “Zero-shot learning through cross-modal transfer,” in Proc. IEEE Int. Conf. Neural Information Processing Systems, 2013, pp. 935–943.
[32] M. Norouzi et al., “Zero-shot learning by convex combination of semantic embeddings,” 2013, arXiv:1312.5650.
[33] Y. Fu, T. M. Hospedales, T. Xiang, and S. Gong, “Transductive multi-view zero-shot learning,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 37, no. 11, pp. 2332–2345, Nov. 2015.
[34] E. Kodirov, T. Xiang, and S. Gong, “Semantic autoencoder for zero-shot learning,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 3174–3183.
[35] B. Romera-Paredes and P. Torr, “An embarrassingly simple approach to zero-shot learning,” in Proc. Int. Conf. Mach. Learn., 2015, pp. 2152–2161.
[36] L. Fei-Fei, R. Fergus, and P. Perona, “One-shot learning of object categories,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 4, pp. 594–611, Apr. 2006.
[37] J. Snell, K. Swersky, and R. Zemel, “Prototypical networks for few-shot learning,” in Proc. Int. Conf. Neural Inf. Process. Syst., 2017, pp. 4077–4087.
[38] F. Sung et al., “Learning to compare: Relation network for few-shot learning,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 1199–1208.
[39] S. Gidaris and N. Komodakis, “Dynamic few-shot visual learning without forgetting,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 4367–4375.
[40] V. Garcia and J. Bruna, “Few-shot learning with graph neural networks,” 2017, arXiv:1711.04043.
[41] S. Ravi and H. Larochelle, “Optimization as a model for few-shot learning,” in Proc. Int. Conf. Learn. Representations, vol. 1, p. 6, 2017.
[42] I. Deznabi, B. Arabaci, M. Koyutürk, and O. Tastan, “DeepKinZero: zero-shot learning for predicting kinase-phosphosite associations involving understudied kinases,” BioRxiv, pp. 670638, 2019. doi: 10.1101/670638.
[43] M. Mendieta and D. Romero, “A cross-modal transfer approach for histological images: A case study in aquaculture for disease identification using zero-shot learning,” in Proc. IEEE 2nd Ecuador Techn. Chapters Meeting, 2017, pp. 1–6.
[44] X. Sun, H. Xv, J. Dong, H. Zhou, C. Chen, and Q. Li, “Few-shot Learning for Domain-specific Fine-grained Image Classification,” IEEE Trans. Ind. Electronics, to be published, doi: 10.1109/TIE.2020.2977553.
[45] K. Cao, J. Ji, Z. Cao, C.-Y. Chang, and J. C. Niebles, “Few-shot video classification via temporal alignment,” 2019, arXiv:1906.11415.
[46] M. Matsuki and S. Inoue, “Toward projection learning between sensor data and semantic word vector for zero-shot learning,” in Proc. Joint 8th Int. Conf. Inform. Electronics Vis., 3rd Int. Conf. Imag. Vis. Pattern Recognit., 2019, pp. 108–111.
[47] B. Liu, X. Wang, M. Dixit, R. Kwitt, and N. Vasconcelos, “Feature space transfer for data augmentation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 9090–9098.
[48] Z. Luo, Y. Zou, J. Hoffman, and L. F. Fei-Fei, “Label efficient learning of transferable representations across domains and tasks,” in Proc. Int. Conf. Neural Inf. Process. Syst., 2017, pp. 165–177.
[49] M. Pirtskhalava et al., “DBAASP v. 2: An enhanced database of structure and antimicrobial/cytotoxic activity of natural and synthetic peptides,” Nucleic Acids Res., vol. 44, no. D1, pp. D1104–D1112, 2015.
[50] M. Youmans, C. Spainhour, and P. Qiu, “Long short-term memory recurrent neural networks for antibacterial peptide identification,” in Proc. IEEE Int. Conf. Bioinf. Biomed., 2017, pp. 498–502.
[51] T. S. Win et al., “HemoPred: A web server for predicting the hemolytic activity of peptides,” Future Medicinal Chemistry, vol. 9, no. 3, pp. 275–291, 2017.
[52] N. R. Coordinators, “Database resources of the national center for biotechnology information,” Nucleic Acids Res., vol. 44, no. Database issue, 2016, Art. no. D7.
[53] F. Cava, H. Lam, M. A. De Pedro, and M. K. Waldor, “Emerging knowledge of regulatory roles of D-amino acids in bacteria,” Cellular Mol. Life Sci., vol. 68, no. 5, pp. 817–831, 2011.
[54] M. L. Mangoni et al., “Effect of natural L-to D-amino acid conversion on the organization, membrane binding, and biological function of the antimicrobial peptides bombinins H,” Biochemistry, vol. 45, no. 13, pp. 4266–4276, 2006.
[55] R. H. Baltz, “Daptomycin: Mechanisms of action and resistance, and biosynthetic engineering,” Curr. Opinion Chemical Biol., vol. 13, no. 2, pp. 144–151, 2009.
[56] Y. Kawai et al., “Structural and functional differences in two cyclic bacteriocins with the same sequences produced by lactobacilli,” Appl. Environ. Microbiol., vol. 70, no. 5, pp. 2906–2911, 2004.
[57] C. Leslie, E. Eskin, and W. S. Noble, “The spectrum kernel: A string kernel for SVM protein classification,” in Biocomputing 2002, World Scientific, Singapore, 2001, pp. 564–575.
[58] E. Crusca Jr et al., “Influence of N-terminus modifications on the biological activity, membrane interaction, and secondary structure of the antimicrobial peptide hylin-a1,” Peptide Sci., vol. 96, no. 1, pp. 41–48, 2011.
[59] S. Karlin and I. Ladunga, “Comparisons of eukaryotic genomic sequences,” Proc. Nat. Acad. Sci. USA, vol. 91, no. 26, pp. 12832–12836, 1994.
[60] S. Karlin, A. M. Campbell, and J. Mrazek, “Comparative DNA analysis across diverse genomes,” Annu. Rev. Genetics, vol. 32, no. 1, pp. 185–225, 1998.
[61] S. Karlin and C. Burge, “Dinucleotide relative abundance extremes: A genomic signature,” Trends Genetics, vol. 11, no. 7, pp. 283–290, 1995.
[62] S. Karlin, “Global dinucleotide signatures and analysis of genomic heterogeneity,” Curr. Opinion Microbiol., vol. 1, no. 5, pp. 598–610, 1998.
[63] H. Nakashima, K. Nishikawa, and T. Ooi, “Differences in dinucleotide frequencies of human, yeast, and Escherichia coli genes,” DNA Res., vol. 4, no. 3, pp. 185–192, 1997.
[64] H. Nakashima, M. Ota, K. Nishikawa, and T. Ooi, “Genes from nine genomes are separated into their organisms in the dinucleotide composition space,” DNA Res., vol. 5, no. 5, pp. 251–259, 1998.
[65] D. T. Pride, R. J. Meinersmann, T. M. Wassenaar, and M. J. Blaser, “Evolutionary implications of microbial genome tetranucleotide frequency biases,” Genome Res., vol. 13, no. 2, pp. 145–158, 2003.
[66] M. Takahashi, K. Kryukov, and N. Saitou, “Estimation of bacterial species phylogeny through oligonucleotide frequency distances,” Genomics, vol. 93, no. 6, pp. 525–533, 2009.
[67] C. Cortes and V. Vapnik, “Support-vector networks,” Mach. Learn., vol. 20, no. 3, pp. 273–297, 1995.
[68] T. Chen and C. Guestrin, “Xgboost: A scalable tree boosting system,” in Proc. 22nd ACM Sigkdd Int. Conf. Knowl. Discov. Data Mining, 2016, pp. 785–794.
[69] K. Gurney, in An Introduction to Neural Networks. Boca Raton, FL, USA: CRC press, 1997.
[70] T. Cover and P. Hart, “Nearest neighbor pattern classification,” IEEE Trans. Inf. Theory, vol. 13, no. 1, pp. 21–27, Jan. 1967.
[71] Z. John Lu, “The elements of statistical learning: Data mining, inference, and prediction,” J. Roy. Statist. Soc.: Series A (Statist. Soc.), vol. 173, no. 3, pp. 693–694, 2010.
[72] F. ul A. Afsar Minhas, B. J. Geiss, and A. Ben-Hur, “PAIRpred: Partner-specific prediction of interacting residues from sequence and structure,” Proteins: Struct. Function Bioinf., vol. 82, no. 7, pp. 1142–1155, 2014.
[73] J. Davis and M. Goadrich, “The relationship between Precision-Recall and ROC curves,” in Proc. 23rd Int. Conf. Mach. Learn., 2006, pp. 233–240.
[74] M. N. Gabere and W. S. Noble, “Empirical comparison of web-based antimicrobial peptide prediction tools,” Bioinformatics, vol. 33, no. 13, pp. 1921–1929, 2017.
[75] Y. Huang, B. Niu, Y. Gao, L. Fu, and W. Li, “CD-HIT Suite: A web server for clustering and comparing biological sequences,” Bioinformatics, vol. 26, no. 5, pp. 680–682, 2010.
[76] J. Wang et al., “Antimicrobial peptides: Promising alternatives in the post feeding antibiotic era,” Med. Res. Rev., vol. 39, no. 3, pp. 831–859, 2019.
[77] J.-L. Dimarcq, P. Bulet, C. Hetru, and J. Hoffmann, “Cysteine-rich antimicrobial peptides in invertebrates,” Peptide Sci., vol. 47, no. 6, pp. 465–477, 1998.

Sadaf Gull is currently working toward the PhD degree in the Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Islamabad, Pakistan. She is funded by the indigenous PhD fellowships scheme by the Higher Education Commission (HEC). Her area of research is machine learning in biomedical informatics.

Fayyaz Minhas received the PhD degree in bioinformatics from Colorado State University, USA, on a Fulbright Scholarship. He is currently with the Department of Computer Science, University of Warwick, Coventry, UK and is partially supported by the PathLAKE digital pathology consortium which is funded from the Data to Early Diagnosis and Precision Medicine strand of the government’s Industrial Strategy Challenge Fund, managed and delivered by UK Research and Innovation (UKRI). For more information, please visit (https://warwick.ac.uk/fac/cross_fac/pathlake/). He has also been awarded the National Youth Award by the Government of Pakistan for his contributions to science and technology. His research focuses on applications of machine learning in Bioinformatics and histopathology.