Title :
A Decision-Theoretic Framework for Numerical Attribute Value Reconciliation
Author_Institution :
Dept. of Supply Chain & Inf. Syst., Iowa State Univ., Ames, IA, USA
fDate :
7/1/2012
Abstract :
One of the major challenges of data integration is resolving conflicting numerical attribute values caused by data heterogeneity. Existing approaches often ignore such inconsistencies or resolve them in an ad hoc manner. In this study, we propose a decision-theoretic framework that resolves numerical value conflicts systematically. The framework takes into account the consequences of incorrect numerical values and selects the value that minimizes the expected cost of errors across all data application problems under consideration. Experimental results show that significant savings can be achieved by adopting the proposed framework instead of ad hoc approaches.
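The selection rule described in the abstract can be illustrated with a minimal sketch. This is not the paper's actual model: the function names, the restriction of candidates to the reported values, the probabilities, and the asymmetric cost function are all assumptions made for illustration only.

```python
from typing import Callable, Sequence

# A cost function maps (chosen value, true value) to the cost of that error.
CostFn = Callable[[float, float], float]


def expected_cost(candidate: float,
                  values: Sequence[float],
                  probs: Sequence[float],
                  cost_fns: Sequence[CostFn]) -> float:
    """Expected total cost of storing `candidate`, summed over every
    application's cost function and weighted by the probability that
    each reported value is the true one."""
    return sum(p * sum(cost(candidate, true) for cost in cost_fns)
               for true, p in zip(values, probs))


def reconcile(values: Sequence[float],
              probs: Sequence[float],
              cost_fns: Sequence[CostFn]) -> float:
    """Pick the reported value with the lowest expected error cost.
    Assumption: candidates are limited to the conflicting values."""
    return min(values, key=lambda c: expected_cost(c, values, probs, cost_fns))


def asymmetric_cost(chosen: float, true: float) -> float:
    """Hypothetical application: underestimating costs 5 per unit,
    overestimating costs 1 per unit."""
    diff = chosen - true
    return diff if diff >= 0 else -5.0 * diff


# Two sources disagree: 100 (believed correct with p=0.6) vs 120 (p=0.4).
conflicting_values = [100.0, 120.0]
probabilities = [0.6, 0.4]
chosen = reconcile(conflicting_values, probabilities, [asymmetric_cost])
```

Under these assumed numbers the less probable value, 120, is selected: storing 100 risks a costly underestimate (expected cost 40), while storing 120 risks only a cheap overestimate (expected cost 12). This is how a cost-aware rule can differ from simply taking the most likely value.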
Keywords :
data integration; decision theory; distributed databases; error analysis; data application problems; data heterogeneity; data inconsistencies; decision-theoretic framework; expected error cost minimization; incorrect numerical values; numerical attribute value reconciliation; numerical value conflict resolution; Accuracy; Data structures; Databases; Estimation; History; Probability density function; Probability distribution; Database integration; Type I, Type II, and misrepresentation errors; heterogeneous databases; numerical value conflicts; probabilistic databases
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2011.75