DocumentCode :
2131295
Title :
A New Graph-Based Algorithm for Clustering Documents
Author :
Suarez, A.P. ; Trinidad, José Fco Martínez ; Ochoa, Jesús Ariel Carrasco ; Pagola, José E Medina
Author_Institution :
Adv. Technol. Applic. Center, La Habana
fYear :
2008
fDate :
15-19 Dec. 2008
Firstpage :
710
Lastpage :
719
Abstract :
In this paper, a new algorithm for document clustering, called CStar, is presented. It improves on recently developed algorithms such as generalized star (GStar) and ACONS, which were originally proposed to reduce some drawbacks of earlier Star-like algorithms. The CStar algorithm uses the condensed star-shaped sub-graph concept defined by ACONS, but defines a new heuristic that makes it possible to construct a new cover of the thresholded similarity graph and to reduce the drawbacks of the GStar and ACONS algorithms. Experiments on standard document collections show that our proposal outperforms previously defined algorithms and other related algorithms used for document clustering.
Keywords :
document handling; graph theory; pattern clustering; ACONS algorithm; CStar algorithm; GStar algorithm; clustering documents; condensed star-shaped subgraph concept; generalized star algorithm; graph-based algorithm; Astrophysics; Clustering algorithms; Conferences; Data mining; Filtering; Gas insulated transmission lines; Information retrieval; Optical filters; Parallel algorithms; Proposals; Clustering; Data Mining; Text Mining;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining Workshops, 2008. ICDMW '08. IEEE International Conference on
Conference_Location :
Pisa
Print_ISBN :
978-0-7695-3503-6
Electronic_ISBN :
978-0-7695-3503-6
Type :
conf
DOI :
10.1109/ICDMW.2008.69
Filename :
4733997