Title :
A document clustering algorithm based on improved landmark semidefinite embedding
Author :
Wang, Hui ; Qin, Hua ; Ding, Li-duo ; Hui, Wang
Author_Institution :
School of Computer and Electronic Information, Guangxi University, Nanning, China
Abstract :
Document spaces are generally high-dimensional, and clustering in such a space is often infeasible due to the curse of dimensionality. This paper proposes a novel document clustering method based on improved landmark semidefinite embedding (lSDE). Starting from the general lSDE, the point selection rule is modified to use the max-min distance algorithm, with a view to ensuring the stability of the algorithm. With the improved lSDE, documents are projected into a lower-dimensional kernel space in which redundant information is filtered out and documents related to the same semantics lie close to each other. The documents are then clustered in this low-dimensional representation by kernel K-means. Experimental results show that the new clustering algorithm performs better than several advanced clustering methods.
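The abstract does not spell out the selection rule, so the following is only a minimal sketch of the common greedy max-min distance (farthest-point) formulation, assuming it is used to choose the points that lSDE embeds first. The function name `max_min_landmarks`, the `first` parameter, the Euclidean metric, and the toy 2-D vectors are all assumptions for illustration and are not taken from the paper.

```python
import math

def max_min_landmarks(points, k, first=0):
    """Pick k point indices by greedy max-min (farthest-point) selection.

    Hypothetical helper: the paper's exact rule may differ.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    chosen = [first]
    # min_dist[i] is the distance from point i to its nearest chosen point.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(chosen) < k:
        # The next pick is the point farthest from everything chosen so far.
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen

# Toy "document vectors" (assumed data): two tight pairs and one outlier.
docs = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 9.0)]
landmarks = max_min_landmarks(docs, 3)
print(landmarks)
```

Once the starting point is fixed, the selection is deterministic and spreads the chosen points across the data, which is one plausible reading of the stability goal stated in the abstract.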
Keywords :
Clustering algorithms; Computers; Educational institutions; Kernel; Principal component analysis; Programming; Semantics; Max-min distance algorithm; kernel K-means; nonlinear dimensionality reduction; text clustering
Conference_Title :
Information Science and Engineering (ICISE), 2010 2nd International Conference on
Conference_Location :
Hangzhou, China
Print_ISBN :
978-1-4244-7616-9
DOI :
10.1109/ICISE.2010.5690075