Title :
An improved method in clustering Web retrieval result based on relevance feedback
Author_Institution :
Dept. of Electron. & Commun. Eng., North China Electr. Power Univ., Baoding, China
Abstract :
Because Web retrieval returns a very large number of results, both the efficiency and the quality of clustering those results are important. Existing methods spend considerable time clustering the entire result set, and their clusters contain many unrelated documents. To address these drawbacks, this paper proposes an improved k-means algorithm that uses a small amount of related and unrelated feedback to guide the clustering of Web retrieval results. The improved algorithm first selects the initial cluster metroids based on the feedback messages; then, during clustering, it removes a large number of unrelated documents, which speeds up clustering and improves the result. To avoid the influence of noise, the metroids of clusters that contain unrelated documents are not modified during the clustering process. Experimental results show that the proposed algorithm is superior to the traditional k-means algorithm.
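The abstract gives only an outline of the algorithm, so the following is a minimal sketch of one possible reading, not the authors' implementation. It assumes documents are already dense numeric vectors, treats the abstract's "metroid" as an ordinary k-means centroid, seeds one cluster per related feedback document, and represents all unrelated feedback by a single frozen "noise" centroid. The function name `feedback_kmeans`, its parameters, and the label `-1` for removed documents are hypothetical choices made for illustration.

```python
import math


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def feedback_kmeans(docs, related, unrelated, max_iter=20):
    """k-means guided by feedback (illustrative assumptions, see lead-in).

    docs      -- list of document vectors
    related   -- indices of documents marked related; each seeds a cluster
    unrelated -- indices of documents marked unrelated; their mean is a
                 frozen noise centroid that is never updated
    Returns (labels, centroids); label -1 means "removed as unrelated".
    """
    centroids = [list(docs[i]) for i in related]
    noise = mean([docs[i] for i in unrelated])  # frozen, never modified
    labels = [None] * len(docs)
    active = set(range(len(docs)))  # documents still being clustered

    for _ in range(max_iter):
        changed = False
        for i in sorted(active):
            best = min(range(len(centroids)),
                       key=lambda c: dist(docs[i], centroids[c]))
            if dist(docs[i], noise) < dist(docs[i], centroids[best]):
                # Closer to the unrelated feedback: drop the document so
                # later iterations no longer spend time on it.
                labels[i] = -1
                active.discard(i)
                changed = True
            elif labels[i] != best:
                labels[i] = best
                changed = True
        # Update only the related clusters; the noise centroid stays fixed.
        for c in range(len(centroids)):
            members = [docs[i] for i in active if labels[i] == c]
            if members:
                centroids[c] = mean(members)
        if not changed:
            break
    return labels, centroids


# Toy data: two related groups and one unrelated group.
docs = [[0, 0], [0, 1], [1, 0],
        [10, 10], [10, 11], [11, 10],
        [50, 50], [51, 50]]
labels, centroids = feedback_kmeans(docs, related=[0, 3], unrelated=[6])
print(labels)
```

In this sketch the speed-up comes from shrinking the active set: a document removed as unrelated is never compared against the centroids again. Keeping the noise centroid frozen is one way to realize the abstract's statement that clusters containing unrelated documents are not updated.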
Keywords :
Internet; pattern clustering; relevance feedback; Web retrieval clustering; cluster metroid; feedback messages; k-means algorithm; unrelated document; unrelated feedback; Clustering algorithms; Computers; Information retrieval; Medical services; Ontologies; Proposals; Research and development; Web retrieval result; clustering; improved k-means algorithm;
Conference_Title :
Computer Science and Service System (CSSS), 2011 International Conference on
Conference_Location :
Nanjing
Print_ISBN :
978-1-4244-9762-1
DOI :
10.1109/CSSS.2011.5974767