Title :
WebMiner--Anatomy of Super Peer Based Incremental Topic-Specific Web Crawler
Author :
Vikas, Om ; Chiluka, Nitin J. ; Ray, Purushottam K. ; Meena, Girraj ; Meshram, Akhil K. ; Gupta, Amit ; Sisodia, Abhishek
Author_Institution :
Indian Inst. of Inf. Technol. & Manage., Gwalior
Abstract :
This paper introduces "WebMiner", a super-peer based P2P system for building an incremental topic-specific Web crawler. The system builds a topic-based repository of Web pages that would later be used in the construction of ontologies. Current crawlers suffer from a centralized architecture, which creates a single point of failure and a heavy load. Super-peer systems strike a balance between the inherent efficiency of centralized search and the autonomy, load balancing and robustness to attacks provided by distributed search, while taking advantage of the heterogeneity of capabilities across peers. In this paper, we discuss the architecture of WebMiner in detail, including the construction of the super-peer overlay network and the working of the system, which includes a feature for crawling the hidden Web.
Keywords :
Web sites; data mining; ontologies (artificial intelligence); semantic Web; P2P system; Web crawler; WebMiner; hidden Web; ontologies; super-peer; Anatomy; Crawlers; Information management; Information technology; Load management; Ontologies; Search engines; Service oriented architecture; Uniform resource locators; Web pages
Conference_Title :
Networking, 2007. ICN '07. Sixth International Conference on
Conference_Location :
Martinique
Print_ISBN :
0-7695-2805-8
Electronic_ISBN :
0-7695-2805-8
DOI :
10.1109/ICN.2007.104