DocumentCode :
3143562
Title :
On data dependencies in dataspaces
Author :
Song, Shaoxu ; Chen, Lei ; Yu, Philip S.
Author_Institution :
Hong Kong Univ. of Sci. & Technol., Hong Kong, China
fYear :
2011
fDate :
11-16 April 2011
Firstpage :
470
Lastpage :
481
Abstract :
To study data dependencies over heterogeneous data in dataspaces, we define a general dependency form, namely comparable dependencies (CDs), which specifies constraints on comparable attributes. It covers the semantics of a broad class of dependencies in databases, including functional dependencies (FDs), metric functional dependencies (MFDs), and matching dependencies (MDs). As we illustrated, comparable dependencies are useful in real practice of dataspaces, e.g., semantic query optimization. Due to the heterogeneous data in dataspaces, the first question, known as the validation problem, is to determine whether a dependency (almost) holds in a data instance. Unfortunately, as we proved, the validation problem with certain error or confidence guarantee is generally hard. In fact, the confidence validation problem is also NP-hard to approximate to within any constant factor. Nevertheless, we develop several approaches for efficient approximation computation, including greedy and randomized approaches with an approximation bound on the maximum number of violations that an object may introduce. Finally, through an extensive experimental evaluation on real data, we verify the superiority of our methods.
Keywords :
computational complexity; data handling; NP-hard problem; comparable dependencies; confidence validation problem; data dependencies; dataspaces; greedy approach; matching dependencies; metric functional dependencies; randomized approach; Approximation methods; Image color analysis; Measurement uncertainty; Query processing; Semantics;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Engineering (ICDE), 2011 IEEE 27th International Conference on
Conference_Location :
Hannover
ISSN :
1063-6382
Print_ISBN :
978-1-4244-8959-6
Electronic_ISBN :
1063-6382
Type :
conf
DOI :
10.1109/ICDE.2011.5767857
Filename :
5767857
Link To Document :
بازگشت