Prediction of Carbohydrate binding sites on protein surfaces with 3-dimensional probability density distributions of interacting atoms

Keng Chang Tsai, Jhih Wei Jian, Ei Wen Yang, Po Chiang Hsu, Hung Pin Peng, Ching Tai Chen, Jun Bo Chen, Jeng Yih Chang, Wen Lian Hsu, An Suei Yang

Research output: Contribution to journalArticlepeer-review

29 Citations (Scopus)

Abstract

Non-covalent protein-carbohydrate interactions mediate molecular targeting in many biological processes. Prediction of non-covalent carbohydrate binding sites on protein surfaces not only provides insights into the functions of the query proteins; information on key carbohydrate-binding residues could suggest site-directed mutagenesis experiments, design therapeutics targeting carbohydrate-binding proteins, and provide guidance in engineering protein-carbohydrate interactions. In this work, we show that non-covalent carbohydrate binding sites on protein surfaces can be predicted with relatively high accuracy when the query protein structures are known. The prediction capabilities were based on a novel encoding scheme of the three-dimensional probability density maps describing the distributions of 36 non-covalent interacting atom types around protein surfaces. One machine learning model was trained for each of the 30 protein atom types. The machine learning algorithms predicted tentative carbohydrate binding sites on query proteins by recognizing the characteristic interacting atom distribution patterns specific for carbohydrate binding sites from known protein structures. The prediction results for all protein atom types were integrated into surface patches as tentative carbohydrate binding sites based on normalized prediction confidence level. The prediction capabilities of the predictors were benchmarked by a 10-fold cross validation on 497 non-redundant proteins with known carbohydrate binding sites. The predictors were further tested on an independent test set with 108 proteins. The residue-based Matthews correlation coefficient (MCC) for the independent test was 0.45, with prediction precision and sensitivity (or recall) of 0.45 and 0.49 respectively. In addition, 111 unbound carbohydrate-binding protein structures for which the structures were determined in the absence of the carbohydrate ligands were predicted with the trained predictors. The overall prediction MCC was 0.49. Independent tests on anti-carbohydrate antibodies showed that the carbohydrate antigen binding sites were predicted with comparable accuracy. These results demonstrate that the predictors are among the best in carbohydrate binding site predictions to date.

Original languageEnglish
Article numbere40846
JournalPLoS ONE
Volume7
Issue number7
DOIs
Publication statusPublished - Jul 25 2012
Externally publishedYes

ASJC Scopus subject areas

  • General

Fingerprint

Dive into the research topics of 'Prediction of Carbohydrate binding sites on protein surfaces with 3-dimensional probability density distributions of interacting atoms'. Together they form a unique fingerprint.

Cite this