A Sequence-Based Prediction Model of Vesicular Transport Proteins Using Ensemble Deep Learning

Nguyen Quoc Khanh Le, Quang Hien Kha

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This study applies computational methods to the accurate identification of vesicular transport proteins. Identifying these proteins improves our understanding of their protein family structure, which in turn supports the design of more effective drug targets for patients with endocrine disorders. Researchers in biology have increasingly turned to deep learning to address this problem. To further improve classification performance, we investigated four approaches, each using different features: (1) we devised a novel protein feature, AAC_PSSM, by combining amino acid composition (AAC) and position-specific scoring matrix (PSSM) features, and trained a gated recurrent unit (GRU) model on it; (2) we built an ensemble model by combining the existing GRU model with a neural network trained on the AAC feature; (3) we applied a random forest to the pseudo-amino acid composition (PseAAC) feature; and (4) we explored a natural language processing (NLP) approach that treats the protein sequence as natural language and applies various neural network architectures. Among these models, the ensemble incorporating PSSM and AAC features achieved the highest sensitivity (81.03%) and accuracy (82.43%). The proposed model also outperformed state-of-the-art models on the same problem and datasets.
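As a rough illustration of two ideas named in the abstract, the sketch below computes an amino acid composition (AAC) vector and combines the outputs of two classifiers. It is not the authors' implementation: the function names, the averaging rule used for the ensemble, and the 0.5 threshold are all assumptions made for this example, since the abstract does not say how the two models were combined.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(sequence):
    """Return the 20-dimensional amino acid composition of a sequence.

    Each entry is the fraction of standard residues equal to that amino acid.
    Non-standard characters are ignored (an assumption of this sketch).
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def soft_vote(prob_a, prob_b, threshold=0.5):
    """Hypothetical ensemble rule: average two models' positive-class
    probabilities and threshold the mean to get a 0/1 label."""
    return [int((a + b) / 2 >= threshold) for a, b in zip(prob_a, prob_b)]


def sensitivity_and_accuracy(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); accuracy = correct / total."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    acc = correct / len(y_true) if y_true else 0.0
    return sens, acc
```

In this sketch, `prob_a` and `prob_b` stand in for the per-protein scores of the GRU and the AAC-based neural network; any real pipeline would obtain them from trained models and PSSM profiles, which are outside the scope of this example.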

Original language: English
Title of host publication: ACM-BCB 2023 - 14th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9798400701269
Publication status: Published - Sept 3 2023
Event: 14th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM-BCB 2023 - Houston, United States
Duration: Sept 3 2023 to Sept 6 2023

Publication series

Name: ACM-BCB 2023 - 14th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics

Conference

Conference: 14th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM-BCB 2023
Country/Territory: United States
City: Houston
Period: 9/3/23 to 9/6/23

Keywords

  • deep learning
  • gated recurrent unit
  • ensemble learning
  • position-specific scoring matrix
  • protein sequence
  • vesicular transport

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Biomedical Engineering
  • Health Informatics
