Title :
Text clustering based on term weights automatic partition
Author :
Yonghong, Yu ; Wenyang, Bai
Author_Institution :
Dept. of Comput. Sci., Anhui Univ. of Finance & Econ., Bengbu, China
Abstract :
Text clustering is becoming increasingly popular owing to the growing volume of text on the Web and the requirements of real applications. This paper introduces a novel automatic text clustering method. First, a genetic algorithm is applied to term selection, combining global optimization with high search efficiency to achieve dimensionality reduction. Next, an appropriate number of partitions of the document set is created according to the different combinations of term weights, and each document partition is clustered into initial clusters using a dynamic programming technique. Finally, all initial clusters are clustered by the same method into the final text clusters. The paper also provides analysis and a theorem proof that the algorithm offers better performance in computational complexity, clustering effect, and high-dimensional data clustering.
Keywords :
computational complexity; dynamic programming; genetic algorithms; pattern clustering; text analysis; theorem proving; automatic text clustering method; computational complexity; data clustering; dynamic programming technique; genetic algorithm; global optimal searching; theorem proof; weights automatic partition; Clustering algorithms; Clustering methods; Computer science; Data mining; Finance; Genetic algorithms; Information analysis; Information retrieval; Machine learning; Partitioning algorithms; genetic algorithm; term selection; term weight partition; text clustering;
Conference_Title :
Computer and Automation Engineering (ICCAE), 2010 The 2nd International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4244-5585-0
Electronic_ISBN :
978-1-4244-5586-7
DOI :
10.1109/ICCAE.2010.5451390