DocumentCode
2131295
Title
A New Graph-Based Algorithm for Clustering Documents
Author
Suarez, A.P. ; Trinidad, José Fco Martínez ; Ochoa, Jesús Ariel Carrasco ; Pagola, José E Medina
Author_Institution
Adv. Technol. Applic. Center, La Habana
fYear
2008
fDate
15-19 Dec. 2008
Firstpage
710
Lastpage
719
Abstract
In this paper a new algorithm for document clustering, called CStar, is presented. This algorithm improves on recently developed algorithms such as the generalized star (GStar) and ACONS algorithms, which were originally proposed to reduce some drawbacks of earlier Star-like algorithms. The CStar algorithm uses the condensed star-shaped sub-graph concept defined by ACONS, but defines a new heuristic that makes it possible to construct a new cover of the thresholded similarity graph and to reduce the drawbacks of the GStar and ACONS algorithms. Experiments on standard document collections show that our proposal outperforms previously defined algorithms and other related algorithms used for document clustering.
Keywords
document handling; graph theory; pattern clustering; ACONS algorithm; CStar algorithm; GStar algorithm; clustering documents; condensed star-shaped subgraph concept; generalized star algorithm; graph-based algorithm; Astrophysics; Clustering algorithms; Conferences; Data mining; Filtering; Gas insulated transmission lines; Information retrieval; Optical filters; Parallel algorithms; Proposals; Clustering; Data Mining; Text Mining
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining Workshops, 2008. ICDMW '08. IEEE International Conference on
Conference_Location
Pisa
Print_ISBN
978-0-7695-3503-6
Electronic_ISBN
978-0-7695-3503-6
Type
conf
DOI
10.1109/ICDMW.2008.69
Filename
4733997