DocumentCode :
2578370
Title :
Link based K-Means clustering algorithm for information retrieval
Author :
Sathya, M. ; Jayanthi, J. ; Basker, N.
Author_Institution :
Dept. of Comput. Sci. & Eng., Sona Coll. of Technol., Salem, India
fYear :
2011
fDate :
3-5 June 2011
Firstpage :
1111
Lastpage :
1115
Abstract :
With the rapid development of Internet technologies, search engines play a vital role in information retrieval. To provide users with an efficient search engine, a Link Based Search Engine for information retrieval using the K-Means clustering algorithm has been developed. Traditional search engines respond to a request with a set of unclassified web pages ordered by their ranking mechanism. To better satisfy the needs of the user, an improvement to the search engine called the Intelligent Cluster Search Engine (ICSE) is proposed. The improved information retrieval process has two parts: comparison of co-occurrence terms and clustering of documents. The relevancy of a document is obtained from the number of occurrences of each co-occurring term (in-links and out-links) in a particular web page. The relevant documents are then clustered using threshold values assigned to each cluster, and the web pages are grouped into the matching cluster. When the web pages are clustered, a boost factor is given to a web page based on the relevancy of the content of its title and summary. The documents are classified into most relevant, relevant, and irrelevant clusters. The K-Means clustering algorithm is used to cluster the relevant web pages in order to increase the relevance rate of search results and reduce the computation time for the user.
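The pipeline the abstract outlines (score pages by co-occurring term counts, boost pages whose title or summary matches, then cluster the scores into three relevance groups) might be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names `relevance_score` and `kmeans_1d`, the page fields `title`, `summary`, and `body`, the boost value of 2.0, and the evenly spaced centroid initialisation are all assumptions, and link analysis (in-links and out-links) is omitted for brevity.

```python
from typing import Dict, List


def relevance_score(page: Dict[str, str], terms: List[str], boost: float = 2.0) -> float:
    """Count occurrences of each term in the page body and add a boost
    (hypothetical value) for every term that also appears in the title or summary."""
    body = page["body"].lower().split()
    header = (page["title"] + " " + page["summary"]).lower().split()
    score = 0.0
    for term in terms:
        t = term.lower()
        score += body.count(t)   # raw occurrence count in the body
        if t in header:
            score += boost       # boost for a title/summary match
    return score


def kmeans_1d(values: List[float], k: int = 3, iters: int = 20) -> List[int]:
    """Plain Lloyd-style k-means on scalar scores (requires k >= 2).
    Centroids start evenly spaced between min and max, so cluster 0 holds
    the lowest scores and cluster k-1 the highest."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each score goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels


# Cluster indices mapped to the three groups named in the abstract.
CLUSTER_NAMES = ["irrelevant", "relevant", "most relevant"]
```

Given a list of pages, one would compute `scores = [relevance_score(p, terms) for p in pages]` and then look up `CLUSTER_NAMES[label]` for each label returned by `kmeans_1d(scores)`.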
Keywords :
Internet; document handling; information retrieval; pattern clustering; search engines; Internet technologies; Web pages; document clustering; intelligent cluster search engine; link based k-means clustering algorithm; link based search engine; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Data mining; Clustering; Information extraction; K-Means algorithm
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Recent Trends in Information Technology (ICRTIT), 2011 International Conference on
Conference_Location :
Chennai, Tamil Nadu
Print_ISBN :
978-1-4577-0588-5
Type :
conf
DOI :
10.1109/ICRTIT.2011.5972402
Filename :
5972402