DocumentCode :
2618426
Title :
Resolving domains in large scale Web crawling
Author :
Liu, Xiaofeng
Author_Institution :
Sch. of Software Eng., Huazhong Univ. of Sci. & Technol., Wuhan, China
fYear :
2011
fDate :
27-29 June 2011
Firstpage :
705
Lastpage :
708
Abstract :
Efficient domain resolving is essential for large-scale Web crawling. Based on batch processing, a data structure and algorithms are presented for maintaining domains and addresses during crawling, and their performance is analyzed mathematically. A large-scale domain resolving system is designed using the proposed data structure. Theoretical analysis and experiments show that a speed of several thousand links per second, for billions of links or hundreds of millions of hosts, can be achieved on a single ordinary personal computer.
Keywords :
Internet; search engines; large scale Web crawl; batch processing; data structure; large scale domain resolving system; personal computer; Algorithm design and analysis; Crawlers; Data structures; Maintenance engineering; Merging; Random access memory; Web sites; batch-based information maintenance; domain resolving; web crawling;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Science and Service System (CSSS), 2011 International Conference on
Conference_Location :
Nanjing
Print_ISBN :
978-1-4244-9762-1
Type :
conf
DOI :
10.1109/CSSS.2011.5974572
Filename :
5974572