Abstract
A text or web document is a knowledge representation of a human idea (a structured set of thoughts). This paper refines TFIDF and Extended TFIDF (ETFIDF) [16]; these values measure the co-occurrences of tokens. ETFIDF captures the semantics more accurately. Tokens with high TFIDF values are called keywords. Sets of (n+1) co-occurring keywords with high ETFIDF values are called n-granules. The collection of keywords and n-granules can be interpreted geometrically: they form a non-closed simplicial complex. The corresponding non-closed polyhedron is called the Latent Semantic Space (LSS). The LSS is a geometric knowledge base that provides semantics to search engines.
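The keyword and n-granule construction described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's method: the abstract does not give the ETFIDF formula, so the sketch uses a textbook TF-IDF score for single tokens and, as a loudly stated assumption, a plain document count (support) as a stand-in for ETFIDF when scoring co-occurring keyword sets. The function names `tfidf` and `granules` and the `min_support` parameter are hypothetical.

```python
import math
from itertools import combinations


def tfidf(docs):
    """Return {token: highest TF-IDF score over all documents}.

    docs is a list of tokenized documents (lists of strings).
    Uses the textbook form tf * log(N / df); the paper's refined
    variant is not specified in the abstract.
    """
    n_docs = len(docs)
    df = {}
    for doc in docs:
        for tok in set(doc):
            df[tok] = df.get(tok, 0) + 1
    scores = {}
    for doc in docs:
        for tok in set(doc):
            tf = doc.count(tok) / len(doc)
            idf = math.log(n_docs / df[tok])
            scores[tok] = max(scores.get(tok, 0.0), tf * idf)
    return scores


def granules(docs, keywords, size, min_support):
    """Return keyword sets of the given size that co-occur in at
    least min_support documents.

    ASSUMPTION: document-count support replaces ETFIDF here.
    A set of size n+1 corresponds to an n-granule (an n-simplex).
    """
    counts = {}
    for doc in docs:
        present = sorted(set(doc) & keywords)
        for combo in combinations(present, size):
            counts[combo] = counts.get(combo, 0) + 1
    return {combo for combo, k in counts.items() if k >= min_support}
```

Under these assumptions, calling `granules(docs, keywords, 2, min_support)` yields the 1-granules (edges between keywords) and `size=3` yields the 2-granules (triangles); together with the keywords as vertices they give the simplices of the complex.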
Original language | English |
---|---|
Title of host publication | Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics |
Pages | 4763-4767 |
Number of pages | 5 |
Volume | 6 |
DOIs | |
Publication status | Published - 2007 |
Event | 2006 IEEE International Conference on Systems, Man and Cybernetics - Taipei, Taiwan. Duration: Oct 8 2006 → Oct 11 2006 |
Other
Other | 2006 IEEE International Conference on Systems, Man and Cybernetics |
---|---|
Country/Territory | Taiwan |
City | Taipei |
Period | 10/8/06 → 10/11/06 |
ASJC Scopus subject areas
- Engineering (all)