DocumentCode
3039629
Title
Clustering of web search results using Suffix tree algorithm and avoidance of repetition of same images in search results using L-Point Comparison algorithm
Author
Suneetha, Manne ; Fatima, S. Sameen ; Pervez, Shaik Mohd Zaheer
Author_Institution
Dept. of Inf. Technol., Velagapudi Ramakrishna Siddhartha Eng. Coll., Vijayawada, India
fYear
2011
fDate
23-24 March 2011
Firstpage
1041
Lastpage
1046
Abstract
Users of existing search engines such as Google, Yahoo, MSN, and Ask commonly find that an entered query returns a long ranked list of results (snippets). It is cumbersome for the user to go through each title, snippet, and sometimes even the link of each search result until results relevant to the query are found. Clustering of search results is a data mining technique in which the retrieved results are organized into meaningful groups, lightening the user's work. This paper deals with the generalized suffix tree based clustering approach. The most repeated phrase in the document tags is taken as the cluster name. In short, web search results fetched from the prevailing web search engines are grouped under phrases that contain one or more search keywords. The paper aims at organizing web search results into clusters, giving the user quick browsing options and an interface that presents the results precisely. Suffix tree clustering produces comparatively more accurate and informative grouped results. A basic problem in image searching with any search engine is image repetition. This can be avoided by using the L-Point Comparison algorithm, a technique developed for information retrieval systems, which is also discussed with a practical example.
Keywords
Internet; content-based retrieval; data mining; image retrieval; pattern clustering; search engines; tree data structures; trees (mathematics); Ask; Google; L-point comparison algorithm; MSN; Web search result clustering; Yahoo; cluster name; data mining; document tags; generalized suffix tree based clustering approach; image repetition avoidance; image searching; information retrieval system; query return; quick browsing option; search engines; suffix tree algorithm; Clustering algorithms; Data mining; Engines; Pixel; Search engines; Shape; Web search; Cleaning of Document; Coherent clustering; L-point image Comparison (LPC); Shared phrase; Suffix Tree Based Clustering (STBC);
fLanguage
English
Publisher
IEEE
Conference_Titel
Emerging Trends in Electrical and Computer Technology (ICETECT), 2011 International Conference on
Conference_Location
Tamil Nadu
Print_ISBN
978-1-4244-7923-8
Type
conf
DOI
10.1109/ICETECT.2011.5760272
Filename
5760272