DocumentCode
2118978
Title
A document clustering algorithm based on improved landmark semidefinite embedding
Author
Wang, Hui ; Qin, Hua ; Ding, Li-duo ; Hui, Wang
Author_Institution
School of Computer and Electronic Information, Guangxi University, Nanning, China
fYear
2010
fDate
4-6 Dec. 2010
Firstpage
4827
Lastpage
4830
Abstract
The document space is generally high-dimensional, and clustering in such a space is often infeasible due to the curse of dimensionality. This paper proposes a novel document clustering method based on improved landmark semidefinite embedding (lSDE). Starting from the general lSDE, the point selection rules are modified by the Max-min distance algorithm to ensure the stability of the algorithm. With the improved lSDE, documents are projected into a lower-dimensional kernel space in which redundant information is filtered out and documents related to the same semantics lie close to each other. The processed document data in this low-dimensional representation are then clustered by kernel K-means. Experimental results show that the new clustering algorithm performs better than several advanced clustering methods.
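The abstract does not spell out the exact point selection rule, so the following is only a minimal sketch of the generic max-min (farthest-point) distance heuristic that the name suggests, not the authors' implementation. The function name `maxmin_landmarks`, the `first` seed parameter, and the use of Euclidean distance are all assumptions made for illustration.

```python
import math


def maxmin_landmarks(points, m, first=0):
    """Pick m landmark indices by max-min (farthest-point) selection.

    Hypothetical helper: each new landmark is the point whose distance
    to its nearest already-chosen landmark is largest.
    """
    n = len(points)
    if not 1 <= m <= n:
        raise ValueError("m must be between 1 and the number of points")
    if not 0 <= first < n:
        raise ValueError("first must be a valid point index")

    chosen = [first]
    # Distance from every point to its nearest chosen landmark so far.
    min_dist = [math.dist(p, points[first]) for p in points]
    min_dist[first] = -1.0  # mark as used so it is never picked again

    while len(chosen) < m:
        # The point farthest from all current landmarks becomes the next one.
        nxt = max(range(n), key=min_dist.__getitem__)
        chosen.append(nxt)
        min_dist = [
            min(d, math.dist(p, points[nxt])) for d, p in zip(min_dist, points)
        ]
        min_dist[nxt] = -1.0

    return chosen


if __name__ == "__main__":
    docs = [(0, 0), (1, 0), (10, 0), (10, 1), (5, 5)]
    print(maxmin_landmarks(docs, 3))
```

Because the choice depends only on the data and the seed index, repeated runs return the same landmarks, which is one plausible reading of the stability goal mentioned in the abstract; random landmark sampling would not give that guarantee.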
Keywords
Clustering algorithms; Computers; Educational institutions; Kernel; Principal component analysis; Programming; Semantics; Max-min distance algorithm; kernel K-means; nonlinear dimensionality reduction; text clustering;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Science and Engineering (ICISE), 2010 2nd International Conference on
Conference_Location
Hangzhou, China
Print_ISBN
978-1-4244-7616-9
Type
conf
DOI
10.1109/ICISE.2010.5690075
Filename
5690075