Abstract
We propose a semantic template-based distributed representation for the convolutional neural network called Semantic Template-based Convolutional Neural Network (STCNN) for text categorization that imitates the perceptual behavior of human comprehension. STCNN is a highly automatic approach that learns semantic templates that characterize a domain from raw text and recognizes categories of documents using a semantic-infused convolutional neural network that allows a template to be partially matched through a statistical scoring system. Our experiment results show that STCNN effectively classifies documents in about 140,000 Chinese news articles into predefined categories by capturing the most prominent and expressive patterns and achieves the best performance among all compared methods for Chinese topic classification. Finally, the same knowledge can be directly used to perform a semantic analysis task.
Original language | English |
---|---|
Article number | 249 |
Journal | ACM Transactions on Asian and Low-Resource Language Information Processing |
Volume | 22 |
Issue number | 11 |
DOIs | |
Publication status | Published - Nov 20 2023 |
Keywords
- Chinese text categorization
- Natural language processing
- semantic template
- text representation
- topic classification
ASJC Scopus subject areas
- General Computer Science