DocumentCode :
441862
Title :
RPCL text clustering based on concept indexing
Author :
Gao, Mao-Ting ; Wang, Zheng-Ou
Author_Institution :
Inst. of Syst. Eng., Tianjin Univ., China
Volume :
4
fYear :
2005
fDate :
18-21 Aug. 2005
Firstpage :
2331
Abstract :
Text feature spaces usually have very high dimensionality, and the number of clusters cannot be determined before clustering. The concept indexing (CI) method can rapidly reduce the dimensionality of the text feature space. CI first clusters the text set into L subsets, then treats the centroid vectors of the L clusters as the axes of a reduced L-dimensional space, and finally projects the text vector space onto these axes to obtain an L-dimensional text vector space. Rival penalized competitive learning (RPCL) text clustering works effectively and can determine an appropriate number of clusters. This paper presents a new method that uses CI to reduce dimensionality for RPCL text clustering. Experimental results show that the algorithm not only greatly improves clustering efficiency but also achieves a high average accuracy.
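The three CI steps named in the abstract (cluster into L subsets, take the L centroids as axes, project onto them) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the clustering step here is a simple spherical k-means seeded with the first L documents, the projection is a plain dot product with each centroid, and all function names are hypothetical choices made for illustration only.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    # Scale a vector to unit length; leave an all-zero vector unchanged.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)

def mean_vector(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cluster_centroids(docs, num_clusters, iterations=10):
    # Step 1 (assumed method): spherical k-means seeded with the first
    # num_clusters documents. The abstract does not say which clustering
    # algorithm CI uses for this step.
    unit_docs = [normalize(d) for d in docs]
    centroids = [list(d) for d in unit_docs[:num_clusters]]
    for _ in range(iterations):
        groups = [[] for _ in range(num_clusters)]
        for d in unit_docs:
            best = max(range(num_clusters), key=lambda k: dot(d, centroids[k]))
            groups[best].append(d)
        # Step 2: each cluster's centroid becomes one axis of the reduced space.
        # An empty cluster keeps its previous centroid.
        centroids = [normalize(mean_vector(g)) if g else c
                     for g, c in zip(groups, centroids)]
    return centroids

def concept_index(docs, num_clusters):
    # Step 3 (assumed projection): the coordinate on each axis is the
    # dot product of the unit-length document with that centroid.
    centroids = cluster_centroids(docs, num_clusters)
    return [[dot(normalize(d), c) for c in centroids] for d in docs]

# Toy term-frequency vectors: four documents over four terms, reduced to L = 2.
docs = [[2, 1, 0, 0], [0, 0, 1, 2], [1, 2, 0, 0], [0, 0, 2, 1]]
reduced = concept_index(docs, 2)
```

Each row of `reduced` is an L-dimensional vector that a clustering algorithm such as RPCL could then take as input in place of the original high-dimensional term vector.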
Keywords :
data mining; indexing; pattern clustering; text analysis; unsupervised learning; RPCL text clustering; concept indexing; rival penalized competitive learning; text data mining; text feature space; Clustering algorithms; Computer science; Data mining; Frequency; Independent component analysis; Indexing; Matrix decomposition; Principal component analysis; Systems engineering and theory; Text mining; Clustering Analysis; Concept Indexing; RPCL; Text Clustering;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
Conference_Location :
Guangzhou, China
Print_ISBN :
0-7803-9091-1
Type :
conf
DOI :
10.1109/ICMLC.2005.1527333
Filename :
1527333