Title :
Literature Clustering using Citation Semantics
Author :
Tuanjie Tong; D. Dinakarpandian; Yugyung Lee
Abstract :
Clustering is a common and powerful technique for statistical data analysis, document categorization, and topic discovery. Most traditional clustering methods, especially those for document clustering, measure distance with the vector space model, in which each document is represented by a vector of its word profile in the context of the entire corpus. However, algorithms that rely on this measure alone achieve limited accuracy. In this paper, we propose a semantic measure that incorporates citation semantics (Citonomy) into literature (document) clustering. Our experimental results show that clustering performance can be substantially improved by combining the Citonomy and vector space measures.
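The idea of combining a word-profile distance with a citation-based distance can be illustrated with a minimal sketch. The abstract does not define the Citonomy measure, so the code below substitutes a plain Jaccard distance over sets of cited references as a stand-in. The function names, the TF-IDF weighting, and the mixing weight `alpha` are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build a TF-IDF word profile for each tokenized document,
    weighted relative to the whole corpus (vector space model)."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # The +1.0 keeps terms that occur in every document non-zero.
            term: (count / len(doc)) * (math.log(n / df[term]) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine_distance(u, v):
    """1 - cosine similarity between two sparse vectors (dicts)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return max(0.0, 1.0 - dot / (norm_u * norm_v))


def citation_distance(refs_a, refs_b):
    """Jaccard distance between two sets of cited references.
    ASSUMPTION: a stand-in for a citation-based measure, not Citonomy."""
    union = refs_a | refs_b
    if not union:
        return 1.0
    return 1.0 - len(refs_a & refs_b) / len(union)


def combined_distance(vec_a, vec_b, refs_a, refs_b, alpha=0.5):
    """Weighted mix of the two distances; alpha is a hypothetical parameter."""
    return (alpha * cosine_distance(vec_a, vec_b)
            + (1.0 - alpha) * citation_distance(refs_a, refs_b))


# Toy corpus: token lists plus hypothetical reference identifiers.
docs = [
    ["document", "clustering", "vector", "space"],
    ["document", "clustering", "citation", "semantics"],
    ["protein", "folding", "simulation"],
]
refs = [{"r1", "r2"}, {"r2", "r3"}, {"r9"}]

vecs = tfidf_vectors(docs)
d_related = combined_distance(vecs[0], vecs[1], refs[0], refs[1])
d_unrelated = combined_distance(vecs[0], vecs[2], refs[0], refs[2])
print(d_related < d_unrelated)
```

The resulting pairwise distances could then be passed to any distance-based clustering algorithm. A linear mix is only one possible way to combine the two signals, and the weight would need to be tuned on real data.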
Keywords :
citation analysis; data analysis; document handling; pattern clustering; statistical analysis; citation semantic; distance measurement; document categorization; document clustering; literature clustering; statistical data analysis; topic discovery; vector space model; Clustering algorithms; Clustering methods; Context modeling; Data analysis; Extraterrestrial measurements; Frequency; Iterative algorithms; Ontologies; Partitioning algorithms; Text analysis
Conference_Title :
42nd Hawaii International Conference on System Sciences (HICSS '09), 2009
Conference_Location :
Big Island, HI
Print_ISBN :
978-0-7695-3450-3
DOI :
10.1109/HICSS.2009.294