DocumentCode :
3489051
Title :
Design and implementation of web crawler based on dynamic web collection cycle
Author :
Kim, K.S. ; Kim, K.Y. ; Lee, K.H. ; Kim, T.K. ; Cho, W.S.
Author_Institution :
CoreEngineering, Ohchang, South Korea
fYear :
2012
fDate :
1-3 Feb. 2012
Firstpage :
562
Lastpage :
566
Abstract :
The amount of web information is increasing rapidly with advanced wireless networks and the emergence of diverse smart devices such as the i-Phone and i-Pad. Information is continuously produced and updated anywhere and at any time by means of easy-to-use web platforms and social networks. How often frequently updated web data should be refreshed has become a hot issue in the data integration and retrieval domain. In this paper, we propose dynamic web-data crawling methods, which include sensitive checking of web site changes and dynamic retrieval of web pages from target web sites. Furthermore, we implemented a Java-based web crawling application and compared the performance of conventional static approaches with that of our proposed dynamic ones. Our experimental results showed a 59% performance benefit compared to the static crawling method.
Keywords :
Internet; Java; data integration; information retrieval; search engines; Java based Web crawling application; Web data; Web information; Web sites; advanced wireless networks; data integration; data retrieval; dynamic Web collection cycle; dynamic Web data crawling methods; i-Pad; i-Phone; smart devices; social networks; static crawling method; Crawlers; Databases; Dynamic scheduling; Java; Libraries; Web pages;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Networking (ICOIN), 2012 International Conference on
Conference_Location :
Bali
ISSN :
1976-7684
Print_ISBN :
978-1-4673-0251-7
Type :
conf
DOI :
10.1109/ICOIN.2012.6164440
Filename :
6164440