• DocumentCode
    3039629
  • Title
    Clustering of web search results using Suffix tree algorithm and avoidance of repetition of same images in search results using L-Point Comparison algorithm
  • Author
    Suneetha, Manne ; Fatima, S. Sameen ; Pervez, Shaik Mohd Zaheer
  • Author_Institution
    Dept. of Inf. Technol., Velagapudi Ramakrishna Siddhartha Eng. Coll., Vijayawada, India
  • fYear
    2011
  • fDate
    23-24 March 2011
  • Firstpage
    1041
  • Lastpage
    1046
  • Abstract
    Users of existing search engines such as Google, Yahoo, MSN, and Ask commonly find that a query returns a long ranked list of results (snippets). It becomes cumbersome for the user to go through each title, snippet, and sometimes even the link of each search result until results relevant to the query are found. Clustering of search results is a data mining technique that organizes the retrieved results into meaningful groups, lightening the user's work. This paper deals with the generalized suffix tree based clustering approach. The most repeated phrase in the document tags is taken as the cluster name. In short, web search results fetched from the prevailing web search engines are grouped under phrases that contain one or more search keywords. The paper aims to organize web search results into clusters that let the user browse the results quickly and precisely through a convenient interface. Suffix tree clustering produces comparatively more accurate and informative grouped results. A basic problem in image searching with any search engine is image repetition. It can be avoided by using the L-Point Comparison algorithm, a technique developed for information retrieval systems, which is also discussed with a practical example.
  • Keywords
    Internet; content-based retrieval; data mining; image retrieval; pattern clustering; search engines; tree data structures; trees (mathematics); Ask; Google; L-point comparison algorithm; MSN; Web search result clustering; Yahoo; cluster name; data mining; document tags; generalized suffix tree based clustering approach; image repetition avoidance; image searching; information retrieval system; query return; quick browsing option; search engines; suffix tree algorithm; Clustering algorithms; Data mining; Engines; Pixel; Search engines; Shape; Web search; Cleaning of Document; Coherent clustering; L-point image Comparison (LPC); Shared phrase; Suffix Tree Based Clustering (STBC);
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Emerging Trends in Electrical and Computer Technology (ICETECT), 2011 International Conference on
  • Conference_Location
    Tamil Nadu
  • Print_ISBN
    978-1-4244-7923-8
  • Type
    conf
  • DOI
    10.1109/ICETECT.2011.5760272
  • Filename
    5760272
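
The abstract describes grouping search results under shared phrases that contain one or more search keywords, with the phrase serving as the cluster name. The record does not include the paper's algorithm, so the following is only a minimal sketch of that idea. It swaps the generalized suffix tree for a plain word n-gram index, which finds the same shared phrases on small inputs but without the suffix tree's efficiency. The function names, the `max_len` parameter, and the sample snippets are all hypothetical, and tokenization is a bare whitespace split with no document cleaning.

```python
from collections import defaultdict


def phrases(text, max_len=3):
    """Yield every word n-gram (1..max_len words) of a snippet, lowercased."""
    words = text.lower().split()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def cluster_by_shared_phrase(snippets, query, max_len=3):
    """Group snippet indices under phrases shared by at least two snippets.

    Only phrases containing at least one query keyword are kept, and each
    surviving phrase acts as the name of its cluster. This is an n-gram
    stand-in for a generalized suffix tree, not the paper's implementation.
    """
    keywords = set(query.lower().split())
    index = defaultdict(set)
    for doc_id, snippet in enumerate(snippets):
        for phrase in phrases(snippet, max_len):
            if keywords & set(phrase.split()):
                index[phrase].add(doc_id)
    # A phrase forms a cluster only if more than one snippet shares it.
    return {p: sorted(ids) for p, ids in index.items() if len(ids) >= 2}


if __name__ == "__main__":
    # Hypothetical snippets for the query "jaguar".
    sample = [
        "jaguar car review",
        "new jaguar car prices",
        "jaguar cat habitat",
        "big cat habitat facts",
    ]
    for name, members in cluster_by_shared_phrase(sample, "jaguar").items():
        print(name, members)
```

Clusters here may overlap, since a snippet can share several phrases with others. A fuller treatment would also need to merge or rank overlapping clusters, which this sketch leaves out.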