DocumentCode :
2192884
Title :
Chimera: a virtual data system for representing, querying, and automating data derivation
Author :
Foster, Ian ; Vöckler, Jens ; Wilde, Michael ; Zhao, Yong
Author_Institution :
Math. & Comput. Sci. Div., Argonne Nat. Lab., IL, USA
fYear :
2002
fDate :
2002
Firstpage :
37
Lastpage :
46
Abstract :
Much scientific data is not obtained from measurements but rather derived from other data by the application of computational procedures. We hypothesize that explicit representation of these procedures can enable documentation of data provenance, discovery of available methods, and on-demand data generation (so-called "virtual data"). To explore this idea, we have developed the Chimera virtual data system, which combines a virtual data catalog, for representing data derivation procedures and derived data, with a virtual data language interpreter that translates user requests into data definition and query operations on the database. We couple the Chimera system with distributed "data grid" services to enable on-demand execution of computation schedules constructed from database queries. We have applied this system to two challenge problems, with promising results: the reconstruction of simulated collision event data from a high-energy physics experiment, and the search of digital sky survey data for galactic clusters.
Keywords :
astronomy computing; data analysis; data structures; physics computing; query processing; relational databases; scientific information systems; Chimera virtual data system; data definition operations; data derivation automation; data derivation querying; data derivation representation; database queries; digital sky survey data searching; distributed data grid services; documentation; galactic clusters; high-energy physics experiment; on-demand computation schedule execution; on-demand data generation; query operations; scientific data; simulated collision event data reconstruction; user request translation; virtual data catalog; virtual data language interpreter; Computational modeling; Computer applications; Data systems; Discrete event simulation; Distributed computing; Distributed databases; Documentation; Grid computing; Physics; Processor scheduling;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Scientific and Statistical Database Management, 2002. Proceedings. 14th International Conference on
ISSN :
1099-3371
Print_ISBN :
0-7695-1632-7
Type :
conf
DOI :
10.1109/SSDM.2002.1029704
Filename :
1029704