An In Silico Vaccine Design Method for Allergen Prediction and IgE Epitope Identification Based on Deep Learning Algorithms

Project: A - Government Institution, b - National Science and Technology Council

Project Details


Epitope-based vaccine design has emerged as a promising approach to treating many diseases, including allergies and cancers. However, effective activation of anti-inflammatory responses depends heavily on the successful identification of T-cell epitopes. When allergens enter the body, the allergenic proteins are recognized by antigen-presenting cells (APCs) and delivered into the endoplasmic reticulum, where they bind to major histocompatibility complex (MHC) class II molecules. The peptide-MHC complexes are then presented to T-cell receptors, and B cells are subsequently activated. Despite extensive work on allergen prediction and epitope identification, current approaches still suffer from low positive predictive values (i.e., many false positives) and a lack of interpretable biological features. Sequence-based methods for allergen prediction and IgE epitope identification have therefore become highly important for in silico epitope-based vaccine design.

In this project, we propose a systematic approach to predicting allergenic proteins and identifying IgE-specific epitopes using deep learning algorithms. This study can help discover new prophylactic and therapeutic vaccines for dengue fever, influenza, and human immunodeficiency virus (HIV). Moreover, we analyze immunological features that can provide valuable insights into immunotherapies for cancer, allergy, and autoimmune diseases in translational bioinformatics.

In the first year, we aim to develop refined methods to improve allergenic protein prediction, especially for proteins with low sequence identity to known allergens. First, we collect allergenic protein data from the literature and databases and construct an updated allergen benchmark data set. Then, a novel encoding scheme is used to represent sequence propensity, evolutionary information, and a set of optimal physicochemical properties (e.g., structural stability). Next, the encoded features are used as input to deep learning models that combine convolutional and recurrent neural networks to improve predictive performance. Finally, the interpretable biological features proposed in our method are validated by immunologists with reference to vaccine design.

In the second year, we focus on integrating the allergenic proteins predicted in the first year into a systematic pipeline for IgE epitope identification and allergen cross-reactivity analysis. First, we extend our previously published tool BeePro to customize epitope identification for specific IgE-binding peptides based on the allergens predicted in the first year. The identified IgE epitopes are then used as features to analyze cross-reactivity between allergens. A recent immunotherapy study has also demonstrated that a hypoallergenic vaccine based on grass pollen allergy is effective in inducing antibody responses against hepatitis B infection. Finally, our proposed method is applied to analyze hypoallergenic proteins and facilitate epitope-based vaccine discovery in precision medicine.

In this project, we endeavor to develop improved immunoinformatics tools and propose interpretable biological features that can be used together to assist immunologists in epitope-based vaccine design for translational bioinformatics.
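As a purely illustrative aid, the kind of sequence encoding described for the first year might be sketched as follows. This is not the project's actual encoding scheme, which is not specified here. It assumes a simple one-hot representation of the 20 standard amino acids plus a single physicochemical column taken from the Kyte-Doolittle hydropathy scale. The function name `encode_sequence`, the `max_len` parameter, and the choice of hydropathy as the property are all assumptions made for this example.

```python
# Illustrative sketch only: a simple per-residue encoding of a protein
# sequence. The project's real scheme (sequence propensity, evolutionary
# information, selected physicochemical properties) is not reproduced here.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes

# Kyte-Doolittle hydropathy index, used here as a stand-in physicochemical
# property (an assumption for this example).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_sequence(seq, max_len):
    """Return a max_len x 21 matrix (list of lists).

    Columns 0-19 are a one-hot encoding of the residue; column 20 is the
    hydropathy value scaled to [-1, 1]. Sequences longer than max_len are
    truncated, shorter ones are zero-padded. Unknown residues (e.g. 'X')
    get an all-zero row.
    """
    width = len(AMINO_ACIDS) + 1
    matrix = []
    for residue in seq.upper()[:max_len]:
        one_hot = [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]
        hydropathy = KYTE_DOOLITTLE.get(residue, 0.0) / 4.5
        matrix.append(one_hot + [hydropathy])
    while len(matrix) < max_len:
        matrix.append([0.0] * width)  # zero padding
    return matrix


# Hypothetical usage: a three-residue peptide padded to length 5.
features = encode_sequence("MKV", max_len=5)
print(len(features), len(features[0]))
```

A fixed-size matrix of this shape is the usual input form for convolutional or recurrent layers. Evolutionary information such as a position-specific scoring matrix could be appended as further columns in the same way.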
Effective start/end date: 8/1/18 → 10/1/19


  • allergen prediction
  • IgE epitope identification
  • cross-reactivity analysis
  • deep learning algorithms
  • in silico vaccine design

