• DocumentCode
    3773807
  • Title

    A framework for focused linked data crawler using context graphs

  • Author

    Samita Bai;Sharaf Hussain;Shakeel Khoja

  • Author_Institution
    Faculty of Computer Science, Institute of Business Administration, Karachi, Pakistan
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    In this paper, we propose a framework for a focused Linked Data (LD) crawler based on context graphs. A focused crawler searches a specific subset of the web; in our case, it targets interlinked RDF data stores. The proposed crawler constructs a set of context graphs for the given seed URIs by back-crawling the web, and classifiers are trained to detect documents and assign them to different categories based on content type. These classifiers help the crawler search and update the context graphs automatically. The crawler is trained using supervised learning. Additionally, an extensive overview of existing LD crawlers is provided, along with their basic requirements, architecture, issues, and challenges.
  • Keywords
    "Crawlers","Resource description framework","HTML","Search engines","Bandwidth","Context","Indexing"
  • Publisher
    IEEE
  • Conference_Titel
    Information and Communication Technologies (ICICT), 2015 International Conference on
  • Type

    conf

  • DOI
    10.1109/ICICT.2015.7469580
  • Filename
    7469580