Development and Applications of an Integrated Epitope Prediction System for Cancer Peptide-Based Vaccine Design in Translational Bioinformatics

Project: A - Government Institution, b - Ministry of Science and Technology

Project Details


Epitope-based vaccine design has emerged as a promising approach to treating many diseases, including cancers. However, effective activation of anticancer responses depends heavily on the successful identification of T-cell epitopes. In the antigen presentation pathway, antigenic proteins are first processed by proteasomal cleavage, and the cleaved peptides are recognized by the transporter associated with antigen processing (TAP) and delivered into the endoplasmic reticulum, where they bind to major histocompatibility complex (MHC) molecules. The peptide-MHC complexes are then presented to T-cell receptors on the surface of antigen-presenting cells. Despite extensive studies in epitope prediction, current approaches still suffer from low positive predictive values (i.e., a high proportion of false positives) and a lack of interpretable biological features. Refined methods for identifying epitopes from sequences have therefore become highly important for in silico epitope-based vaccine design. In this project, we propose a systematic approach that refines MHC binding affinity prediction and integrates antigen processing events into epitope identification. This study can help discover new prophylactic and therapeutic vaccines for dengue fever, influenza, and human immunodeficiency virus (HIV). Moreover, we collaborate with cancer immunologists to analyze immunological features that can provide valuable insights into immunotherapies for cancer, allergy, and autoimmune diseases in translational bioinformatics.

In the first year, we aim to develop refined methods that improve MHC-peptide binding prediction, especially for class II molecules. First, we collect epitope data from the literature and databases and construct an updated MHC binding benchmark data set. Then, a novel encoding scheme is used to represent sequence propensity, evolutionary information, and a set of optimal physicochemical properties (e.g., structural stability). Next, the encoded features are passed to a consensus method that incorporates five machine learning algorithms to improve predictive performance. Finally, the interpretable biological features proposed in our method are validated by our cancer immunologist collaborator with reference to vaccine design.

In the second year, we focus on integrating antigen processing pathway events into a systematic analysis pipeline for epitope prediction. First, we combine our ProCleSSP (recently published) for proteasomal cleavage, TAPcon (unpublished) for TAP recognition, and the MHC binding method developed in the first year into an integrated T-cell epitope prediction method. Instead of the traditional simple additive or weighted-sum approaches, machine learning algorithms are used to combine these results and improve predictive performance. Finally, the proposed method is applied to analyze cancer immunology and to facilitate epitope-based vaccine discovery in precision medicine (e.g., liver cancer and pancreatic cancer, studied by our collaborator). In this project, we endeavor to develop improved immunoinformatics tools and propose interpretable biological features that can be used collectively to assist cancer immunologists in epitope-based vaccine design for translational bioinformatics.
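
The second-year plan contrasts a fixed additive or weighted sum of the three stage scores with a combination learned by a machine learning algorithm. The sketch below illustrates that contrast only. It assumes each stage (proteasomal cleavage, TAP recognition, MHC binding) emits a score in [0, 1]. The function names, the made-up toy scores and labels, and the choice of a small logistic regression are all illustrative assumptions, not the project's actual tools, data, or method.

```python
import math

# Hypothetical per-peptide scores in [0, 1] from three upstream predictors,
# ordered as (proteasomal cleavage, TAP recognition, MHC binding).
# These are made-up toy values for illustration, not project results.
TOY_SCORES = [
    (0.9, 0.8, 0.9),
    (0.7, 0.6, 0.8),
    (0.8, 0.3, 0.9),
    (0.6, 0.7, 0.7),
    (0.9, 0.9, 0.3),
    (0.4, 0.8, 0.3),
    (0.2, 0.3, 0.4),
    (0.8, 0.7, 0.1),
]
TOY_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = epitope, 0 = non-epitope (toy labels)


def weighted_sum(scores, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Baseline: combine stage scores with fixed, hand-chosen weights."""
    return sum(w * s for w, s in zip(weights, scores))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_combiner(X, y, lr=0.5, epochs=2000):
    """Learn per-stage weights and a bias by batch gradient descent
    on the logistic loss (a stand-in for any learned combiner)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, scores):
    """Probability-like combined score from the learned combiner."""
    w, b = model
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, scores)))


model = fit_logistic_combiner(TOY_SCORES, TOY_LABELS)

# Compare both combiners on one toy epitope and one toy non-epitope.
epitope, non_epitope = (0.8, 0.3, 0.9), (0.9, 0.9, 0.3)
print("weighted sum:", weighted_sum(epitope), weighted_sum(non_epitope))
print("learned     :", predict_proba(model, epitope), predict_proba(model, non_epitope))
```

In this toy set the labels track the MHC binding score, so equal weights rank the non-epitope `(0.9, 0.9, 0.3)` above the epitope `(0.8, 0.3, 0.9)`, while the learned combiner can weight the stages unequally. A real pipeline would evaluate any such combiner on held-out benchmark data rather than on its own training examples.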
Effective start/end date: 8/1/17 → 7/31/18


  • T-cell epitope prediction
  • cancer immunology
  • MHC binding
  • proteasomal cleavage
  • TAP binding
  • machine learning algorithms

