Formal concept analysis and document clustering via granular computing

Tsau Young Lin, I-Jen Chiang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

A text/web document is a knowledge representation of a human idea (a structured set of thoughts). This paper refines TFIDF and Extended TFIDF (ETFIDF) [16]; these values actually measure the co-occurrences of tokens, and ETFIDF captures the semantics more accurately. Tokens with high TFIDF values are called keywords. Sets of (n+1) co-occurring keywords with high ETFIDF values are called n-granules. The collection of keywords and n-granules can be interpreted geometrically: they form a non-closed simplicial complex. The corresponding non-closed polyhedron is called the Latent Semantic Space (LSS). LSS is a geometric knowledge base that provides the semantics to a search engine:
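The keyword and granule construction described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the ETFIDF formula, so this sketch makes two loudly stated assumptions: keywords are selected with textbook TF-IDF, and a plain document co-occurrence count (`min_support`) stands in for the ETFIDF threshold. The function names, the toy corpus, and the thresholds are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_scores(docs):
    """Return {token: highest TF-IDF value over all documents}.

    Uses textbook TF-IDF (relative term frequency times log inverse
    document frequency), not the refined measure from the paper.
    """
    n_docs = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    best = {}
    for doc in docs:
        for tok, count in Counter(doc).items():
            score = (count / len(doc)) * math.log(n_docs / df[tok])
            best[tok] = max(best.get(tok, 0.0), score)
    return best


def granules(docs, keywords, n, min_support):
    """Return sets of n+1 keywords co-occurring in >= min_support documents.

    ASSUMPTION: the co-occurrence count replaces the ETFIDF value,
    which the abstract does not define.
    """
    support = Counter()
    for doc in docs:
        present = sorted(set(doc) & keywords)
        for combo in combinations(present, n + 1):
            support[combo] += 1
    return {combo for combo, count in support.items() if count >= min_support}


# Hypothetical toy corpus of tokenized documents.
docs = [
    "wall street stock market".split(),
    "wall street bank".split(),
    "stock market crash".split(),
    "stone wall garden".split(),
]

scores = tfidf_scores(docs)
keywords = {tok for tok, s in scores.items() if s >= 0.05}  # 0-simplices
one_granules = granules(docs, keywords, n=1, min_support=2)  # 1-simplices
two_granules = granules(docs, keywords, n=2, min_support=2)  # 2-simplices
```

In this toy corpus the pairs `("street", "wall")` and `("market", "stock")` each co-occur in two documents and so become 1-granules, that is, edges between keyword vertices. No triple reaches the support threshold, so there are no 2-granules.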

Original language: English
Title of host publication: Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
Pages: 4763-4767
Number of pages: 5
Volume: 6
Publication status: Published - 2007
Event: 2006 IEEE International Conference on Systems, Man and Cybernetics - Taipei, Taiwan
Duration: Oct 8 2006 → Oct 11 2006

Other

Other: 2006 IEEE International Conference on Systems, Man and Cybernetics
Country/Territory: Taiwan
City: Taipei
Period: 10/8/06 → 10/11/06

Keywords

  • Granules
  • Latent semantic space
  • Simplex

ASJC Scopus subject areas

  • General Engineering
