ProLoc-rGO: Using rule-based knowledge with gene ontology terms for prediction of protein subnuclear localization

Wen Lin Huang, Chun Wei Tung, Shih Wen Ho, Shinn Ying Ho

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Gene Ontology (GO) annotation is a controlled vocabulary of terms and phrases describing the functions of genes and gene products, and it has been used successfully to predict subcellular and subnuclear localization. Generally, each gene product is annotated with very few GO terms out of the more than 25,000 annotations currently available. How to represent a protein sequence using GO terms as features plays an important role in designing prediction systems for protein subnuclear localization. Our previous work, ProLoc-GO, can select a small number m out of a large number n of GO terms, where m ≪ n. However, its off-line training time is long, up to several days even when running on high-speed PC clusters. Therefore, this study proposes an efficient system, ProLoc-rGO, which uses the decision tree method to quickly mine m informative GO terms and acquire interpretable rule-based knowledge for predicting subnuclear localization. Applied to SNL9-80 (714 proteins in nine compartments with ≤80% identity), ProLoc-rGO mines m = 17 informative GO terms and 17 interpretable rules, and yields training and test accuracies of 84.9% and 78.2%, respectively. For comparison, ProLoc-rGO obtains an accuracy of 82.6% (Matthews correlation coefficient (MCC) = 0.711) on SNL9-80, which is better than the 67.4% (MCC = 0.50) obtained by Nuc-PLoc, a method that fuses the pseudo-amino acid composition of a protein with its position-specific scoring matrix.
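As an illustration of how a decision tree can single out informative GO terms, the sketch below represents each protein as a set of GO terms and ranks the terms by information gain, the splitting criterion of ID3/C4.5-style trees. The abstract does not state which decision tree algorithm or criterion ProLoc-rGO uses, so the criterion, the function names, and the toy annotations here are all assumptions made for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(proteins, labels, go_term):
    """Entropy reduction from splitting on presence/absence of go_term."""
    with_term = [y for terms, y in zip(proteins, labels) if go_term in terms]
    without_term = [y for terms, y in zip(proteins, labels) if go_term not in terms]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (with_term, without_term) if part)
    return entropy(labels) - remainder

def rank_go_terms(proteins, labels):
    """Return all observed GO terms, most informative first."""
    vocabulary = sorted(set().union(*proteins))
    return sorted(vocabulary,
                  key=lambda t: information_gain(proteins, labels, t),
                  reverse=True)

# Toy data (hypothetical annotations, not taken from SNL9-80).
proteins = [
    {"GO:0005730", "GO:0003723"},
    {"GO:0005730"},
    {"GO:0016607", "GO:0003723"},
    {"GO:0016607"},
]
labels = ["nucleolus", "nucleolus", "nuclear speckle", "nuclear speckle"]

ranked = rank_go_terms(proteins, labels)
best = ranked[0]
# A one-level tree on the best term reads as a rule of the form
# "IF protein has <best> THEN <majority class among proteins with it>".
majority = Counter(y for terms, y in zip(proteins, labels)
                   if best in terms).most_common(1)[0][0]
print(f"IF {best} present THEN {majority}")
```

In this toy set the term shared by both classes carries no information and is ranked last, while a term that occurs in only one compartment separates the classes perfectly. A full tree would repeat the split recursively on each branch, which is how a small set of m terms and an equal number of readable rules could emerge.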

Original language: English
Title of host publication: 2008 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB '08
Pages: 201-206
Number of pages: 6
DOIs
Publication status: Published - Dec 1 2008
Externally published
Event: 2008 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB '08 - Sun Valley, ID, United States
Duration: Sep 15 2008 to Sep 17 2008

Publication series

Name: 2008 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB '08

Conference

Conference: 2008 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB '08
Country/Territory: United States
City: Sun Valley, ID
Period: 9/15/08 to 9/17/08

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Biomedical Engineering
  • Health Informatics
