Title :
A conceptual framework for composing and managing scientific data lineage
Author_Institution :
Donald Bren Sch. of Environ. Sci. & Manage., California Univ., Santa Barbara, CA, USA
Abstract :
Scientific research relies as much on the dissemination and exchange of data sets as on the publication of conclusions. Accurately tracking the lineage (origin and subsequent processing history) of scientific data sets is thus imperative for the complete documentation of scientific work. However, the lack of a definitive data model for lineage, and the poor fit between current data management tools and scientific software, effectively prevent researchers from determining, preserving, or providing the lineage of the data products they use and create. Based on a comprehensive review of lineage-related research and previous prototype systems, a conceptual framework is presented to help identify and assess basic lineage system components. Within this framework, a direction is outlined for future work on general methods for composing and managing lineage for scientific data.
Keywords :
electronic data interchange; natural sciences computing; conceptual framework; data management; data set dissemination; data set exchange; documentation; processing history; scientific data lineage tracking; scientific research; scientific software; Assembly; Data models; Documentation; Environmental management; History; Pipelines; Prototypes; Software prototyping; Software tools; Yarn;
Conference_Title :
Scientific and Statistical Database Management, 2002. Proceedings. 14th International Conference on
Print_ISBN :
0-7695-1632-7
DOI :
10.1109/SSDM.2002.1029701