Title :
Managing distributed collections: evaluating Web page changes, movement, and replacement
Author :
Dalal, Z. ; Dash, Suvendu ; Dave, Pratik ; Francisco-Revilla, Luis ; Furuta, Richard ; Karadkar, Unmil ; Shipman, F.
Author_Institution :
Dept. of Comput. Sci., Texas A&M Univ., College Station, TX, USA
Abstract :
Distributed collections of Web materials are common. Bookmark lists, paths, and catalogs such as Yahoo! Directories require human maintenance to keep up to date with changes to the underlying documents. The Walden's Paths Path Manager is a tool to support the maintenance of distributed collections. Earlier efforts focused on recognizing the type and degree of change within Web pages and identifying pages no longer accessible. We now extend this work with algorithms for evaluating drastic changes to page content based on context. Additionally, we expand on previous work to locate moved pages and apply the modified approach to suggesting page replacements when the original page cannot be found. Based on these results, we are redesigning the path manager to better support the range of assessments necessary to manage distributed collections.
Keywords :
Internet; digital libraries; document handling; human factors; Walden paths path manager tool; Web material; Web page; bookmark list; catalog; distributed collection; human maintenance; Algorithm design and analysis; Art; Catalogs; Change detection algorithms; Computer science; Human factors; Information retrieval; Organizing; Software libraries; Web pages;
Conference_Title :
Digital Libraries, 2004. Proceedings of the 2004 Joint ACM/IEEE Conference on
Print_ISBN :
1-58113-832-6
DOI :
10.1109/JCDL.2004.1336113