• DocumentCode
    478602
  • Title
    Profile-Based Focused Crawler for Social Media-Sharing Websites
  • Author
    Zhang, Zhiyong; Nasraoui, Olfa
  • Volume
    1
  • fYear
    2008
  • fDate
    3-5 Nov. 2008
  • Firstpage
    317
  • Lastpage
    324
  • Abstract
    In this paper, we present a novel profile-based focused crawling system for dealing with increasingly popular social media-sharing websites. In this system, we treat users' profiles as ranking criteria for guiding the crawling process. Furthermore, we divide a user's profile into two parts: an internal part, which comes from the user's own contributions, and an external part, which comes from the user's social contacts. To extract data from a social media-sharing website efficiently and effectively for focused crawling, a path-string-based page-classification method was first developed for identifying list pages, detail pages, and profile pages.
  • Keywords
    Web sites; pattern classification; social sciences computing; path string based page-classification method; profile based focused crawling system; social media-sharing Websites; Artificial intelligence; Computer science; Crawlers; Data mining; Learning systems; Search engines; Support vector machines; Taxonomy; YouTube; focused crawl; profile; social
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Tools with Artificial Intelligence, 2008. ICTAI '08. 20th IEEE International Conference on
  • Conference_Location
    Dayton, OH
  • ISSN
    1082-3409
  • Print_ISBN
    978-0-7695-3440-4
  • Type
    conf
  • DOI
    10.1109/ICTAI.2008.119
  • Filename
    4669706