DocumentCode :
2344494
Title :
Using high performance systems to build collections for a digital library
Author :
Bergmark, Donna
Author_Institution :
Cornell Digital Libr. Res. Group, Ithaca, NY, USA
fYear :
2002
fDate :
2002
Firstpage :
431
Lastpage :
438
Abstract :
Nothing is more distributed than the Web, with its content spread across thousands of servers. High performance hardware and software are essential for effective download, analysis, and organization of this content. We describe our experience using a highly parallel Web crawling system (Mercator) to automatically construct collections of scientific resources for the National Science Digital Library.
Keywords :
digital libraries; information resources; online front-ends; Web crawler; Web crawling system; automatic collection generation; digital library; massively parallel Web crawling; online resources; scientific resources; topic-related Web documents; Crawlers; Fingerprint recognition; Hardware; Knowledge based systems; Parallel processing; Performance analysis; Software libraries; Software performance; Uniform resource locators; World Wide Web;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel Processing Workshops, 2002. Proceedings. International Conference on
ISSN :
1530-2016
Print_ISBN :
0-7695-1680-7
Type :
conf
DOI :
10.1109/ICPPW.2002.1039762
Filename :
1039762