DocumentCode :
2118978
Title :
A document clustering algorithm based on improved landmark semidefinite embedding
Author :
Wang, Hui ; Qin, Hua ; Ding, Li-duo ; Hui, Wang
Author_Institution :
School of Computer and Electronic Information, Guangxi University, Nanning, China
fYear :
2010
fDate :
4-6 Dec. 2010
Firstpage :
4827
Lastpage :
4830
Abstract :
The document space is generally of high dimensionality, and clustering in such a high-dimensional space is often infeasible due to the curse of dimensionality. In this paper, a novel document clustering method based on improved landmark semidefinite embedding (lSDE) is proposed. Starting from the general lSDE, the point selection rule is modified to use the Max-min distance algorithm, with a view to ensuring the stability of the algorithm. Using the improved lSDE, the documents are projected into a lower-dimensional kernel space in which redundant information is filtered out and documents with the same semantics lie close to each other. On this low-dimensional representation, the processed document data are clustered by kernel K-means. Experimental results show that the new clustering algorithm performs better than several advanced clustering methods.
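The abstract does not spell out the Max-min distance rule. One common reading is greedy farthest-point selection: each new point is the one whose distance to its nearest already-chosen point is largest. The sketch below illustrates that reading only. The function name `max_min_landmarks`, the `first` seed index, and the use of Euclidean distance are assumptions made for illustration and are not taken from the paper.

```python
import math


def max_min_landmarks(points, k, first=0):
    """Pick k point indices by greedy max-min (farthest-point) selection.

    Hypothetical helper: the name, the seed index and the Euclidean
    metric are illustrative assumptions, not the paper's definition.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    chosen = [first]
    # Distance from every point to its nearest already-chosen point.
    nearest = [math.dist(p, points[first]) for p in points]
    while len(chosen) < k:
        # Next pick: the point farthest from everything chosen so far.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        chosen.append(nxt)
        # Refresh nearest-chosen distances with the new pick.
        nearest = [min(d, math.dist(p, points[nxt]))
                   for d, p in zip(nearest, points)]
    return chosen


if __name__ == "__main__":
    docs = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
    print(max_min_landmarks(docs, 3))
```

For a fixed seed the selection is deterministic, which fits the stated goal of a stable algorithm. Whether the paper seeds or breaks ties this way is not stated in the abstract.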
Keywords :
Clustering algorithms; Computers; Educational institutions; Kernel; Principal component analysis; Programming; Semantics; Max-min distance algorithm; kernel K-means; nonlinear dimensionality reduction; text clustering;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Science and Engineering (ICISE), 2010 2nd International Conference on
Conference_Location :
Hangzhou, China
Print_ISBN :
978-1-4244-7616-9
Type :
conf
DOI :
10.1109/ICISE.2010.5690075
Filename :
5690075