Title :
Data-Intensive Services for Large-Scale Archive Access
Author :
Tanaka, Masahiro ; Murakami, Yohei ; Zettsu, Koji
Author_Institution :
Inf. Service Platform Lab., Nat. Inst. of Inf. & Commun. Technol. (NICT), Kyoto, Japan
Abstract :
Recently, many organizations have accumulated data from various sources, such as the web and network sensors, and constructed large-scale archives. Some would like to publish their archives to the public to facilitate the activities of other organizations, but the scale of the archives causes problems. We therefore propose the concept of data-intensive services, which publish large-scale archives. We present an architecture for data-intensive services and focus on the following fundamental functional properties: 1) enhancing search, 2) preprocessing, and 3) asynchronous transfer. We also developed a reference implementation of a framework for data-intensive services and applied it to a web archive that contains about 2 billion documents, greatly improving access performance to the web archive at small development cost.
Keywords :
Internet; information retrieval systems; Web archive; asynchronous transfer; data-intensive service; large-scale archive access; Indexes; Organizations; Prototypes; Servers; Simple object access protocol; large-scale archive
Conference_Title :
Services Computing (SCC), 2012 IEEE Ninth International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
978-1-4673-3049-7
DOI :
10.1109/SCC.2012.75