TY - JOUR
T1 - Co-AMPpred for in silico-aided predictions of antimicrobial peptides by integrating composition-based features
AU - Singh, Onkar
AU - Hsu, Wen Lian
AU - Su, Emily Chia Yu
N1 - Funding Information:
This study was funded in part by the Ministry of Science and Technology (MOST) in Taiwan (grant numbers MOST109-2221-E-038–018 and MOST110-2628-E-038–001) and the Higher Education Sprout Project of the Ministry of Education (MOE) in Taiwan (grant number DP2-108–21121-01-A-01–04) to Emily Chia-Yu Su. The funders had no role in the study design, data collection and analysis, decision to publish, or manuscript preparation.
Publisher Copyright:
© 2021, The Author(s).
PY - 2021/12
Y1 - 2021/12
N2 - Background: Antimicrobial peptides (AMPs) are oligopeptides that act as crucial components of innate immunity, occur naturally in all multicellular organisms, and are involved in the first line of defense. Recent studies showed that AMPs have great potential that is not limited to antimicrobial activity. They are also crucial regulators of host immune responses that can modulate a wide range of activities, such as immune regulation, wound healing, and apoptosis. However, the ability of microorganisms to adapt to and resist existing antibiotics has prompted the scientific community to develop alternatives to conventional antibiotics. Therefore, to address this issue, we proposed Co-AMPpred, an in silico-aided AMP prediction method based on compositional features of amino acid residues to classify AMPs and non-AMPs. Results: In our study, we developed a prediction method that incorporates composition-based sequence and physicochemical features into various machine-learning algorithms. Then, the Boruta feature-selection algorithm was used to identify discriminative biological features, and only those features were used to develop our model. Additionally, we performed stratified tenfold cross-validation to validate the predictive performance of our AMP prediction model and evaluated it on an independent holdout test dataset. A benchmark dataset was collected from previous studies to evaluate the predictive performance of our model. Conclusions: Experimental results show that combining composition-based and physicochemical features outperformed existing methods on both the benchmark training dataset and a reduced training dataset. Finally, our proposed method achieved an accuracy of 80.8% and an area under the receiver operating characteristic curve of 0.871 on the independent test set. Our code and datasets are available at https://github.com/onkarS23/CoAMPpred.
AB - Background: Antimicrobial peptides (AMPs) are oligopeptides that act as crucial components of innate immunity, occur naturally in all multicellular organisms, and are involved in the first line of defense. Recent studies showed that AMPs have great potential that is not limited to antimicrobial activity. They are also crucial regulators of host immune responses that can modulate a wide range of activities, such as immune regulation, wound healing, and apoptosis. However, the ability of microorganisms to adapt to and resist existing antibiotics has prompted the scientific community to develop alternatives to conventional antibiotics. Therefore, to address this issue, we proposed Co-AMPpred, an in silico-aided AMP prediction method based on compositional features of amino acid residues to classify AMPs and non-AMPs. Results: In our study, we developed a prediction method that incorporates composition-based sequence and physicochemical features into various machine-learning algorithms. Then, the Boruta feature-selection algorithm was used to identify discriminative biological features, and only those features were used to develop our model. Additionally, we performed stratified tenfold cross-validation to validate the predictive performance of our AMP prediction model and evaluated it on an independent holdout test dataset. A benchmark dataset was collected from previous studies to evaluate the predictive performance of our model. Conclusions: Experimental results show that combining composition-based and physicochemical features outperformed existing methods on both the benchmark training dataset and a reduced training dataset. Finally, our proposed method achieved an accuracy of 80.8% and an area under the receiver operating characteristic curve of 0.871 on the independent test set. Our code and datasets are available at https://github.com/onkarS23/CoAMPpred.
KW - Amino acid composition
KW - Antimicrobial peptide
KW - Composition-based feature
KW - Machine learning
UR - http://www.scopus.com/inward/record.url?scp=85111549818&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85111549818&partnerID=8YFLogxK
U2 - 10.1186/s12859-021-04305-2
DO - 10.1186/s12859-021-04305-2
M3 - Article
C2 - 34330209
AN - SCOPUS:85111549818
SN - 1471-2105
VL - 22
JO - BMC Bioinformatics
JF - BMC Bioinformatics
IS - 1
M1 - 389
ER -