Title :
An Improved Partitioning-Based Web Documents Clustering Method Combining GA with ISODATA
Author :
Zhu, Zhengyu ; Tian, Yunyan ; Xu, Jingqiu ; Deng, Xin ; Ren, Xiang
Author_Institution :
Chongqing Univ., Chongqing
Abstract :
Existing partitioning-based clustering algorithms, such as k-means, k-medoids, and their variations, are simple in theory and converge quickly, but they tend to reach only a local optimum when the iterations terminate, and they are not well suited to discovering clusters whose sizes differ greatly. This paper proposes an improved Web document clustering method that uses a genetic algorithm (GA) whose mutation operation incorporates ideas from ISODATA [6]. Experiments show that the GA's global search characteristic can avoid local optima and that the ISODATA-based mutation operation gives the improved clustering algorithm a self-adjusting ability to discover clusters of different sizes.
Keywords :
Internet; data analysis; document handling; genetic algorithms; iterative methods; pattern clustering; ISODATA; improved partitioning-based Web documents clustering method; iterative self-organizing data analysis technique A algorithm; Clustering algorithms; Clustering methods; Data mining; Educational institutions; Genetic algorithms; Genetic mutations; Iterative algorithms; Merging; Partitioning algorithms; Streaming media
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on
Conference_Location :
Haikou
Print_ISBN :
978-0-7695-2874-8
DOI :
10.1109/FSKD.2007.165