Distributed keyword vector representation for document categorization

Yu Lun Hsieh, Shih Hung Liu, Yung Chun Chang, Wen Lian Hsu

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

2 Citations (Scopus)

Abstract

In the age of information explosion, efficiently categorizing the topic of a document can assist our organization and comprehension of the vast amount of text. In this paper, we propose a novel approach, named DKV, for document categorization using distributed real-valued vector representation of keywords learned from neural networks. Such a representation can project rich context information (or embedding) into the vector space, and subsequently be used to infer similarity measures among words, sentences, and even documents. Using a Chinese news corpus containing over 100,000 articles and five topics, we provide a comprehensive performance evaluation to demonstrate that by exploiting the keyword embeddings, DKV paired with support vector machines can effectively categorize a document into the predefined topics. Results demonstrate that our method can achieve the best performances compared to several other approaches.
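As a rough illustration of the idea in the abstract, the sketch below builds a document vector from keyword vectors and assigns the document to the closest topic. It is not the authors' DKV implementation. The embedding values, the averaging step, and the names `KEYWORD_VECTORS`, `document_vector`, `cosine`, `categorize`, and `TOPIC_CENTROIDS` are all assumptions made for this example. The paper pairs its representation with support vector machines; a nearest-centroid rule with cosine similarity stands in here only to keep the example free of dependencies.

```python
import math

# Toy 3-dimensional keyword embeddings. These values are hypothetical;
# in the paper the vectors are learned by a neural network from a corpus.
KEYWORD_VECTORS = {
    "election": [0.9, 0.1, 0.0],
    "parliament": [0.8, 0.2, 0.1],
    "match": [0.1, 0.9, 0.0],
    "goal": [0.0, 0.8, 0.2],
}


def document_vector(keywords):
    """Average the vectors of the known keywords (an assumed pooling choice)."""
    vectors = [KEYWORD_VECTORS[k] for k in keywords if k in KEYWORD_VECTORS]
    if not vectors:
        raise ValueError("document contains no known keywords")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def categorize(keywords, topic_centroids):
    """Return the topic whose centroid is most similar to the document vector.

    This nearest-centroid rule is a stand-in for the SVM used in the paper.
    """
    doc = document_vector(keywords)
    return max(topic_centroids, key=lambda t: cosine(doc, topic_centroids[t]))


# Hypothetical topic centroids built from a few seed keywords per topic.
TOPIC_CENTROIDS = {
    "politics": document_vector(["election", "parliament"]),
    "sports": document_vector(["match", "goal"]),
}

if __name__ == "__main__":
    print(categorize(["goal", "match", "stadium"], TOPIC_CENTROIDS))
```

Keywords missing from the vocabulary (such as "stadium" above) are skipped, so a document is categorized only by the keywords that have an embedding.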
Original language: English
Title of host publication: TAAI 2015 - 2015 Conference on Technologies and Applications of Artificial Intelligence
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 245-251
Number of pages: 7
ISBN (Electronic): 9781467396066
Publication status: Published - Feb 12 2016
Externally published
Event: Conference on Technologies and Applications of Artificial Intelligence, TAAI 2015 - Tainan, Taiwan
Duration: Nov 20 2015 to Nov 22 2015

Publication series

Name: TAAI 2015 - 2015 Conference on Technologies and Applications of Artificial Intelligence

Conference

Conference: Conference on Technologies and Applications of Artificial Intelligence, TAAI 2015
Country/Territory: Taiwan
City: Tainan
Period: 11/20/15 to 11/22/15

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
