DocumentCode
451122
Title
Querying Very Large Multi-dimensional Datasets in ADR
Author
Kurc, T. ; Chialin Chang ; Ferreira, R. ; Sussman, A. ; Saltz, J.
Author_Institution
University of Maryland
fYear
1999
fDate
13-18 Nov. 1999
Firstpage
12
Lastpage
12
Abstract
Applications that make use of very large scientific datasets have become an increasingly important subset of scientific applications. In these applications, datasets are often multi-dimensional, i.e., data items are associated with points in a multi-dimensional attribute space, and access to data items is described by range queries. The basic processing involves mapping input data items to output data items, and some form of aggregation of all the input data items that project to each output data item. We have developed an infrastructure, called the Active Data Repository (ADR), that integrates storage, retrieval and processing of multi-dimensional datasets on distributed-memory parallel architectures with multiple disks attached to each node. In this paper we address efficient execution of range queries on distributed-memory parallel machines within the ADR framework. We present three potential strategies, and evaluate them under different application scenarios and machine configurations. We present experimental results on the scalability and performance of the strategies on a 128-node IBM SP.
Keywords
Application software; Biomedical imaging; Chemical sensors; Computer science; Educational institutions; Information retrieval; Parallel architectures; Pathology; Satellites; Sensor phenomena and characterization;
fLanguage
English
Publisher
ieee
Conference_Titel
Supercomputing, ACM/IEEE 1999 Conference
Print_ISBN
1-58113-091-0
Type
conf
DOI
10.1109/SC.1999.10046
Filename
1592655