Title :
Document Clustering Using Differential Evolution
Author :
Abraham, Ajith ; Das, Swagatam ; Konar, Amit
Author_Institution :
Chung Ang Univ., Seoul
Abstract :
This paper investigates a novel approach to partitional clustering of a large collection of text documents using an improved version of the classical differential evolution (DE) algorithm. Fast and accurate clustering of documents plays an important role in text mining and automatic information retrieval systems. The k-means algorithm has served as the most widely used partitional clustering algorithm for text documents; however, in most cases it provides only locally optimal solutions. In this work, the clustering problem is formulated as an optimization task and solved using a modified DE algorithm. To reduce the computational time, a hybrid of k-means and DE is also proposed. The new algorithms were tested on a number of document datasets. Comparison with k-means, a state-of-the-art particle swarm optimization (PSO) method, and a recently proposed real-coded genetic algorithm (GA) based text clustering method reflects the superiority of the proposed techniques in terms of speed and quality of clustering.
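The idea of treating clustering as an optimization task can be illustrated with a minimal sketch. It assumes the classical DE/rand/1/bin scheme, not the modified DE of the paper, whose details the abstract does not give. Each candidate solution is taken to be a flat vector of k centroids, and fitness is assumed to be the sum of squared Euclidean distances from each point to its nearest centroid; the paper's actual encoding and distance measure may differ. The names `sse`, `decode`, and `de_cluster`, and all parameter defaults, are hypothetical.

```python
import random


def sse(centroids, points):
    """Sum of squared distances from each point to its nearest centroid."""
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
    return total


def decode(vector, k, dim):
    """Split a flat candidate vector into k centroids of length dim."""
    return [vector[i * dim:(i + 1) * dim] for i in range(k)]


def de_cluster(points, k, pop_size=20, generations=100, f=0.8, cr=0.9, seed=0):
    """Classical DE/rand/1/bin over centroid vectors (illustrative only)."""
    rng = random.Random(seed)
    dim = len(points[0])
    n = k * dim

    # Assumption: each candidate starts from k randomly chosen data points.
    pop = []
    for _ in range(pop_size):
        vec = []
        for p in rng.sample(points, k):
            vec.extend(p)
        pop.append(vec)
    cost = [sse(decode(v, k, dim), points) for v in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct members other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n)  # guarantees at least one mutated gene
            trial = []
            for j in range(n):
                if rng.random() < cr or j == j_rand:
                    trial.append(pop[a][j] + f * (pop[b][j] - pop[c][j]))
                else:
                    trial.append(pop[i][j])
            # Greedy selection: keep the trial only if it is no worse.
            trial_cost = sse(decode(trial, k, dim), points)
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost

    best = min(range(pop_size), key=cost.__getitem__)
    return decode(pop[best], k, dim), cost[best]


if __name__ == "__main__":
    # Toy 2-D data standing in for document feature vectors.
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, cost = de_cluster(data, k=2)
    print(centroids, cost)
```

A hybrid in the spirit of the one the abstract mentions could, for example, refine the best DE candidate with a few k-means iterations; how the paper actually combines the two methods is not stated here.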
Keywords :
data mining; document handling; evolutionary computation; information retrieval; pattern clustering; automatic information retrieval systems; differential evolution; document clustering; document datasets; hybrid k-means; partitional clustering; text clustering methods; text documents; text mining; Clustering algorithms; Clustering methods; Computer science; Genetic algorithms; Information retrieval; Particle swarm optimization; Partitioning algorithms; Testing; Text mining; Tree graphs;
Conference_Title :
Evolutionary Computation, 2006. CEC 2006. IEEE Congress on
Conference_Location :
Vancouver, BC
Print_ISBN :
0-7803-9487-9
DOI :
10.1109/CEC.2006.1688523