Title :
OGSA-DWC: A Middleware for Deep Web Crawling Using the Grid
Author :
Song, Jihwan ; Choi, Dong-Hoon ; Lee, Yoon-Joon
Author_Institution :
Div. of Comput. Sci., KAIST, Daejeon, South Korea
Abstract :
Conventional search engines generally cannot find information in the Deep Web because they rely on hyperlink-based crawling techniques to visit Web pages. Recently, considerable research effort has gone into crawling the Deep Web. One obstacle to crawling the Deep Web is that it requires huge computing resources, which most search engine companies can hardly provide. We therefore propose the design of OGSA-DWC, a Grid-based middleware for crawling the Deep Web. With our middleware, developers can easily implement a Grid-based Deep Web crawling system even if they know little about how to use idle, distributed computing resources.
Keywords :
Web sites; grid computing; middleware; open systems; search engines; software architecture; Deep Web crawling system; Web page; distributed computing resources; grid-based middleware; open grid services architecture; search engine; Computer science; Crawlers; Databases; Distributed computing; Grid computing; Information retrieval; Middleware; Production facilities; Search engines; Web pages; Deep Web; Grid; OGSA; crawling; middleware
Conference_Title :
eScience, 2008. eScience '08. IEEE Fourth International Conference on
Conference_Location :
Indianapolis, IN
Print_ISBN :
978-1-4244-3380-3
Electronic_ISBN :
978-0-7695-3535-7
DOI :
10.1109/eScience.2008.118