DocumentCode :
685803
Title :
An improved K-means algorithm combined with Particle Swarm Optimization approach for efficient web document clustering
Author :
Jaganathan, P. ; Jaiganesh, S.
Author_Institution :
Dept. of Comput. Applic., PSNA Coll. of Eng. & Technol., Dindigul, India
fYear :
2013
fDate :
12-14 Dec. 2013
Firstpage :
772
Lastpage :
776
Abstract :
Searching for and discovering relevant information on the web has always been a challenging task. It is very hard to wade through the large number of documents returned in response to a user query. This creates a need to organize large sets of documents into categories through clustering, and therefore a need for efficient clustering algorithms. Large datasets can be clustered effectively using partitional clustering algorithms. The K-means algorithm is an appropriate partitional clustering approach for large datasets because of its efficiency with respect to execution time. However, it is highly sensitive to the choice of initial cluster center positions. This paper introduces a new hybrid method for document clustering that combines Particle Swarm Optimization (PSO) with an improved K-means algorithm. We tested K-means, PSO, our proposed PSOK, KPSO and KPSOK algorithms on various text document collections. The collections contain between 204 and 878 documents and between 5804 and 7454 terms. Our results indicate that the proposed method achieves better clustering than the other methods studied.
Keywords :
Internet; document handling; particle swarm optimisation; pattern clustering; KPSOK algorithms; PSO; Web document clustering; improved K-means algorithm; particle swarm optimization approach; partitional clustering approach; text document collections; Algorithm design and analysis; Clustering algorithms; Equations; Mathematical model; Particle swarm optimization; Partitioning algorithms; Vectors; Cluster Centroid; Euclidean distance; PSO; Vector Space Model; cosine correlation
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Green Computing, Communication and Conservation of Energy (ICGCE), 2013 International Conference on
Conference_Location :
Chennai
Type :
conf
DOI :
10.1109/ICGCE.2013.6823538
Filename :
6823538