DocumentCode :
2798500
Title :
Optimized Data Loading for a Multi-Terabyte Sky Survey Repository
Author :
Cai, Y. Dora ; Aydt, Ruth ; Brunner, Robert J.
Author_Institution :
National Center for Supercomputing Applications (NCSA)
fYear :
2005
fDate :
12-18 Nov. 2005
Firstpage :
42
Lastpage :
42
Abstract :
Advanced instruments in a variety of scientific domains are collecting massive amounts of data that must be postprocessed and organized to support research activities. Astronomers have been pioneers in the use of databases to host sky survey data. Increasing data volumes from more powerful telescopes pose enormous challenges to state-of-the-art database systems and data-loading techniques. In this paper we present SkyLoader, our novel framework for data loading that is being used to populate a multi-table, multi-terabyte database repository for the Palomar-Quest sky survey. SkyLoader consists of an efficient algorithm for bulk loading, an effective data structure to support data integrity, optimized parallelism, and guidelines for system tuning. Performance studies show the positive effects of these techniques, with load time for a 40-gigabyte data set reduced from over 20 hours to less than 3 hours. Our framework offers a promising approach for loading other large and complex scientific databases.
Keywords :
Astronomy; Buildings; Data structures; Database systems; Guidelines; Instruments; Parallel processing; Permission; Space technology; Telescopes;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Supercomputing, 2005. Proceedings of the ACM/IEEE SC 2005 Conference
Print_ISBN :
1-59593-061-2
Type :
conf
DOI :
10.1109/SC.2005.50
Filename :
1559994