Semantic frame-based statistical approach for topic detection

Yung Chun Chang, Yu Lun Hsieh, Cen Chieh Chen, Chad Liu, Chun Hung Lu, Wen Lian Hsu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

We propose a statistical frame-based approach (FBA) for natural language processing and demonstrate its advantage over traditional machine learning methods, using topic detection as a case study. FBA identifies semantic knowledge in a more general manner by collecting important linguistic patterns within documents through a flexible matching scheme that allows word insertion, deletion, and substitution (IDS) to capture linguistic structures within the text. FBA also overcomes major issues of the rule-based approach by reducing human effort through highly automated pattern generation and summarization. Using a Yahoo! Chinese news corpus of about 140,000 news articles, we provide a comprehensive performance evaluation that demonstrates the effectiveness of FBA in detecting the topic of a document by exploiting the semantic association and context within the text. Moreover, it outperforms common topic models such as Naïve Bayes, Vector Space Model, and LDA-SVM.
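The abstract does not spell out how the matching scheme works, so the following is only a minimal sketch of the general idea of tolerating word insertion, deletion, and substitution when comparing a pattern with a text. It is not the authors' FBA implementation. It assumes a plain token-level edit distance with unit costs, and the names `ids_edit_cost`, `flexible_match`, and `max_edits` are hypothetical.

```python
from typing import Sequence


def ids_edit_cost(pattern: Sequence[str], tokens: Sequence[str]) -> int:
    """Minimum number of word insertions, deletions, and substitutions
    needed to turn `pattern` into `tokens` (token-level edit distance).

    Assumption: every operation costs 1. A real system could weight them.
    """
    # prev[j] = cost of aligning an empty pattern with the first j tokens
    prev = list(range(len(tokens) + 1))
    for i, slot in enumerate(pattern, start=1):
        # curr[0] = cost of deleting the first i pattern slots
        curr = [i]
        for j, token in enumerate(tokens, start=1):
            substitute = prev[j - 1] + (0 if slot == token else 1)
            delete = prev[j] + 1       # pattern slot has no counterpart in the text
            insert = curr[j - 1] + 1   # text contains an extra word
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]


def flexible_match(pattern: Sequence[str], tokens: Sequence[str],
                   max_edits: int = 1) -> bool:
    """True if `tokens` fits `pattern` within `max_edits` IDS operations."""
    return ids_edit_cost(pattern, tokens) <= max_edits


if __name__ == "__main__":
    pattern = ["team", "wins", "game"]
    print(flexible_match(pattern, ["team", "easily", "wins", "game"]))  # one insertion
    print(flexible_match(pattern, ["team", "wins"]))                    # one deletion
    print(flexible_match(pattern, ["team", "loses", "game"]))           # one substitution
    print(flexible_match(pattern, ["stock", "market", "falls"]))        # three substitutions
```

With the default `max_edits=1`, the first three calls accept the text and the last one rejects it. Raising `max_edits` makes matching more permissive, at the cost of accepting less related text.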
Original language: English
Title of host publication: Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014
Editors: Prachya Boonkwan, Wirote Aroonmanakun, Thepchai Supnithi
Publisher: Faculty of Pharmaceutical Sciences, Chulalongkorn University
Pages: 75-84
Number of pages: 10
ISBN (Electronic): 9786165518871
Publication status: Published - 2014
Externally published
Event: 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014 - Phuket, Thailand
Duration: Dec 12 2014 → Dec 14 2014

Publication series

Name: Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014

Conference

Conference: 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014
Country/Territory: Thailand
City: Phuket
Period: 12/12/14 → 12/14/14

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science (miscellaneous)
