Regional Information Center for Science and Technology - Resolving domains in large scale Web crawling

DocumentCode :

2618426

Title :

Resolving domains in large scale Web crawling

Author :

Liu, Xiaofeng

Author_Institution :

Sch. of Software Eng., Huazhong Univ. of Sci. & Technol., Wuhan, China

fYear :

2011

fDate :

27-29 June 2011

Firstpage :

705

Lastpage :

708

Abstract :

Efficient domain resolving is essential for large-scale Web crawling. Based on batch processing, a data structure and algorithms are presented for maintaining domains and addresses during crawling, and their performance is analyzed mathematically. A large-scale domain resolving system is designed with the proposed data structure. Theoretical analysis and experiments show that a speed of several thousand links per second, for billions of links or hundreds of millions of hosts, can be achieved on a single ordinary personal computer.

Keywords :

Internet; search engines; large scale Web crawl; batch processing; data structure; large scale domain resolving system; personal computer; Algorithm design and analysis; Crawlers; Data structures; Maintenance engineering; Merging; Random access memory; Web sites; batch-based information maintenance; domain resolving; web crawling

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computer Science and Service System (CSSS), 2011 International Conference on

Conference_Location :

Nanjing

Print_ISBN :

978-1-4244-9762-1

Type :

conf

DOI :

10.1109/CSSS.2011.5974572

Filename :

5974572

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2618426