Title :
On the Efficiency of Provenance Queries
Author :
Kementsietsidis, Anastasios ; Wang, Min
Author_Institution :
IBM T.J. Watson Res. Center, Hawthorne, NY
fDate :
March 29 2009-April 2 2009
Abstract :
While models for data provenance have been extensively studied in the literature, the efficient evaluation of the resulting provenance queries remains an open problem. Traditional query optimization techniques, like the use of general-purpose indexes, or the materialization of provenance data, fail on different fronts to address the problem. Provenance-specific optimization techniques, like the use of customized indexes, similarly prove inadequate since the techniques are bound to specific provenance models. Therefore, the need to develop generic provenance-aware techniques quickly becomes apparent. In this paper, we argue for such a generic technique in the form of a provenance index structure that can be used to efficiently evaluate provenance queries in a variety of contexts. By highlighting the limitations of existing techniques, we identify the set of key properties of the generic index, including a novel property called duality which guarantees that the single index can evaluate both backward provenance queries (which data items from a set I are associated with an item from set O) and forward provenance queries (which items from O are associated with an item from I).
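To make the two query directions in the abstract concrete, here is a minimal sketch in Python. It is not the index structure proposed in the paper, whose design the abstract does not describe. The class name `ToyProvenanceIndex`, its methods, and the item labels are all hypothetical. The sketch assumes provenance can be reduced to (input item, output item) pairs, and it keeps two hash maps inside one object so that both backward and forward lookups are answered from the same structure.

```python
from collections import defaultdict


class ToyProvenanceIndex:
    """Hypothetical bidirectional map between input items (I) and output items (O).

    Illustration only: this is an assumption-laden toy, not the paper's index.
    """

    def __init__(self):
        # output item -> set of input items associated with it
        self._inputs_of = defaultdict(set)
        # input item -> set of output items associated with it
        self._outputs_of = defaultdict(set)

    def add(self, input_item, output_item):
        """Record that input_item is associated with output_item."""
        self._inputs_of[output_item].add(input_item)
        self._outputs_of[input_item].add(output_item)

    def backward(self, output_item):
        """Backward query: which items from I are associated with this item from O?"""
        return set(self._inputs_of.get(output_item, ()))

    def forward(self, input_item):
        """Forward query: which items from O are associated with this item from I?"""
        return set(self._outputs_of.get(input_item, ()))


# Hypothetical data: o1 is associated with i1 and i2; o2 with i2 only.
index = ToyProvenanceIndex()
index.add("i1", "o1")
index.add("i2", "o1")
index.add("i2", "o2")

backward_result = index.backward("o1")
forward_result = index.forward("i2")
```

Both lookups are single dictionary accesses, at the cost of storing every association twice. Whether the paper's index makes the same trade-off is not stated in the abstract.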
Keywords :
optimisation; query processing; data provenance; duality; provenance queries; provenance-specific optimization techniques; query optimization techniques; Banking; Blood pressure; Data analysis; Data engineering; Electrocardiography; Forward contracts; History; Medical services; Query processing; USA Councils; Data Provenance; Index; Query Efficiency;
Conference_Title :
Data Engineering, 2009. ICDE '09. IEEE 25th International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-3422-0
Electronic_ISBN :
1084-4627
DOI :
10.1109/ICDE.2009.206