DocumentCode
3426045
Title
Supporting fine-grained data lineage in a database visualization environment
Author
Woodruff, Allison; Stonebraker, Michael
Author_Institution
Dept. of Electr. Eng. & Comput. Sci., California Univ., Berkeley, CA, USA
fYear
1997
fDate
7-11 Apr 1997
Firstpage
91
Lastpage
102
Abstract
The lineage of a datum records its processing history. Because such information can be used to trace the source of anomalies and errors in processed data sets, it is valuable to users for a variety of applications, including the investigation of anomalies and debugging. Traditional data lineage approaches rely on metadata. However, metadata does not scale well to fine-grained lineage, especially in large data sets. For example, it is not feasible to store all of the information that is necessary to trace from a specific floating-point value in a processed data set to a particular satellite image pixel in a source data set. In this paper, we propose a novel method to support fine-grained data lineage. Rather than relying on metadata, our approach lazily computes the lineage using a limited amount of information about the processing operators and the base data. We introduce the notions of weak inversion and verification. While our system does not perfectly invert the data, it uses weak inversion and verification to provide a number of guarantees about the lineage it generates. We propose a design for the implementation of weak inversion and verification in an object-relational database management system.
Keywords
data integrity; data visualisation; database theory; object-oriented databases; program debugging; relational databases; system monitoring; anomalies; base data; data processing history; database visualization environment; debugging; error sources; fine-grained data lineage; large data sets; lazy algorithm; limited information; lineage guarantees; metadata; object-relational database management system; processed data sets; processing operators; tracing; verification; weak inversion; Cyclones; Data mining; Data visualization; Debugging; Earth; History; Image databases; Research and development; Satellites; Visual databases;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering, 1997. Proceedings. 13th International Conference on
Conference_Location
Birmingham
ISSN
1063-6382
Print_ISBN
0-8186-7807-0
Type
conf
DOI
10.1109/ICDE.1997.581742
Filename
581742