Author_Institution :
Inst. of Math., Univ. of Warsaw, Warsaw, Poland
Abstract :
The Tolerance Rough Set Model (TRSM) was introduced as a tool for approximating hidden concepts in text databases. In recent years, numerous successful applications of TRSM in web intelligence have been proposed, including text classification, clustering, thesaurus generation, semantic indexing, and semantic search. This paper reviews the fundamental concepts of TRSM, some of its possible extensions, and some typical applications of TRSM in text mining. Moreover, the architecture of a semantic information retrieval system, called SONCA, is presented to demonstrate the main idea and to stimulate further research on TRSM.
Keywords :
data mining; information retrieval systems; ontologies (artificial intelligence); rough set theory; text analysis; SONCA system; TRSM; Web intelligence; clustering; search based on ontologies and compound analytics; semantic indexing; semantic information retrieval system; semantic search; text classification; text databases; text mining; thesaurus generation; tolerance rough set model; Approximation methods; Indexes; Information retrieval; Ontologies; Semantics; Standards; Vectors; classification