DocumentCode :
2054973
Title :
Optimizing retrieval and processing of multi-dimensional scientific datasets
Author :
Chang, Chialin ; Kurc, Tahsin ; Sussman, Alan ; Saltz, Joel
Author_Institution :
Dept. of Comput. Sci., Maryland Univ., College Park, MD, USA
fYear :
2000
fDate :
2000
Firstpage :
405
Lastpage :
410
Abstract :
We have developed the Active Data Repository (ADR), an infrastructure that integrates storage, retrieval, and processing of large multi-dimensional scientific datasets on distributed memory parallel machines with multiple disks attached to each node. In earlier work, we proposed three strategies for processing range queries within the ADR framework. Our experimental results show that the relative performance of the strategies changes under varying application characteristics and machine configurations. In this work, we investigate approaches to guide and automate the selection of the best strategy for a given application and machine configuration. We describe analytical models to predict the relative performance of the strategies where input data elements are uniformly distributed in the attribute space of the output dataset, restricting the output dataset to be a regular d-dimensional array.
Keywords :
information retrieval; parallel processing; active data repository; distributed memory parallel machines; information storage and retrieval; infrastructure; multi-dimensional scientific datasets retrieval; range queries; regular d-dimensional array; Area measurement; Computer science; Data analysis; Educational institutions; Information retrieval; Microscopy; Microwave integrated circuits; Pathology; Satellites; Tomography;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium, 2000. IPDPS 2000. Proceedings. 14th International
Conference_Location :
Cancun
Print_ISBN :
0-7695-0574-0
Type :
conf
DOI :
10.1109/IPDPS.2000.846013
Filename :
846013