• DocumentCode
    2866235
  • Title
    Automatically mining result records from search engine response pages
  • Author
    Mundluru, Dheerendranath ; Katukuri, Jayasimha Reddy ; Celebi, Saygin
  • Author_Institution
    Center for Adv. Comput. Studies, Univ. of Louisiana, Lafayette, LA, USA
  • fYear
    2005
  • fDate
    27-30 Nov. 2005
  • Abstract
    Usually, Web applications such as deep Web crawlers, metasearch engines, and other Web mining systems need to extract information displayed in the form of result records on response pages returned by search engines in response to submitted queries. Extracting such records is challenging as search engines are heterogeneous in displaying their records. In addition, response pages returned by many search engines include other noisy content such as advertisements, suggestion links, etc., which make the extraction task even more complicated. In this paper, we propose a highly effective and efficient algorithm for automatically mining result records from search engine response pages.
  • Keywords
    Internet; data mining; search engines; Web applications; Web mining systems; automatically mining result records; deep Web crawlers; information extraction; metasearch engines; query submission; search engine response pages; search engines; Computer displays; Crawlers; Data mining; HTML; Humans; Metasearch; Search engines; Web mining; Web pages; XML;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Data Mining, Fifth IEEE International Conference on
  • ISSN
    1550-4786
  • Print_ISBN
    0-7695-2278-5
  • Type
    conf
  • DOI
    10.1109/ICDM.2005.30
  • Filename
    1565773