Title :
Analyzing the Web Crawler as a Feed Forward Engine for an Efficient Solution to the Search Problem in the Minimum Amount of Time through a Distributed Framework
Author :
Qureshi, M. Atif ; Younus, Arjumand ; Rojas, Francisco
Author_Institution :
Dept. of Comput. Sci., Korea Adv. Inst. of Sci. & Technol., Daejeon, South Korea
Abstract :
A web crawler forms the backbone of a search engine, and this backbone needs a careful reassessment that could enhance the efficiency of search engines. This paper conducts such a reassessment from a systems perspective by implementing and analyzing a web crawler, "VisionerBOT", as a feed forward engine for search engines using the MapReduce distributed programming model. Our crawler implementations revisit the classical OS debate of threads vs. events, and a significant contribution of our work is the conclusion that events are the ideal way forward for web crawlers. Furthermore, in implementing the feed forward mechanisms within the web crawler, we arrived at some important design considerations for the operating system research community, which can lead to a whole new class of operating systems.
Keywords :
Internet; distributed programming; operating systems (computers); search engines; MapReduce distributed programming model; VisionerBOT; Web crawler analysis; distributed framework; feed forward engine mechanism; operating system; search engine; search problem; Crawlers; Feeds; Operating systems; Search engines; Search problems; Service oriented architecture; Spine; Web server; Yarn
Conference_Title :
Information Science and Applications (ICISA), 2010 International Conference on
Conference_Location :
Seoul
Print_ISBN :
978-1-4244-5941-4
Electronic_ISBN :
978-1-4244-5943-8
DOI :
10.1109/ICISA.2010.5480411