• DocumentCode
    28127
  • Title
    Propagation of Data Fusion
  • Author
    Bronselaer, Antoon; Van Britsom, Daan; De Tre, Guy
  • Author_Institution
    Dept. of Telecommun. & Inf. Process., Ghent Univ., Ghent, Belgium
  • Volume
    27
  • Issue
    5
  • fYear
    2015
  • fDate
    May 1 2015
  • Firstpage
    1330
  • Lastpage
    1342
  • Abstract
    In a relational database, tuples are called “duplicate” if they describe the same real-world entity. If such duplicate tuples are observed, it is recommended to remove them and to replace them with one tuple that represents the joint information of the duplicate tuples to a maximal extent. This remove-and-replace operation is called a fusion operation. Within the setting of a relational database management system, the removal of the original duplicate tuples can breach referential integrity. In this paper, a strategy is proposed to maintain referential integrity in a semantically correct manner, thereby optimizing the quality of relationships in the database. An algorithm is proposed that is able to propagate a fusion operation through the entire database. The algorithm is based on a framework of first and second order fusion functions on the one hand, and conflict resolution strategies on the other hand. It is shown how classical strategies for maintaining referential integrity, such as DELETE cascading, are highly specialized cases of the proposed framework. Experimental results are reported that (i) show the efficiency of the proposed algorithm and (ii) show the differences in quality between several second order fusion functions. It is shown that some strategies easily outperform DELETE cascading.
  • Keywords
    data integrity; relational databases; sensor fusion; set theory; DELETE cascading; conflict resolution strategies; data fusion; duplicate tuple removal; fusion functions; fusion operation; joint information; referential integrity; relational database management system; remove-and-replace operation; Backpropagation; Context; Data integration; Logic gates; Standardization
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2014.2365807
  • Filename
    6948252