DocumentCode :
657982
Title :
Similar data elimination: MFB algorithm
Author :
Boufares, Faouzi ; Ben Salem, Aicha ; Rehab, Moufida ; Correia, Sebastiao
Author_Institution :
Lab. LIPN, Univ. Paris 13, Villetaneuse, France
fYear :
2013
fDate :
6-8 May 2013
Firstpage :
289
Lastpage :
293
Abstract :
Complex applications such as knowledge extraction, data mining, e-learning, and web applications now rely on heterogeneous and distributed data. In this context, the need to integrate data and improve its quality is increasingly felt. Eliminating duplicate and similar data remains an open problem, both in terms of performance and in terms of how similarity rules are defined. In this paper we present a new deduplication algorithm based on two functions, Match and Merge. The algorithm is evaluated experimentally on a set of randomly generated data.
Keywords :
data analysis; data integration; data mining; merging; MFB algorithm; Match function; Merge function; Web applications; data quality improvement; deduplication algorithm; distributed data; duplicate elimination; e-learning; heterogeneous data; knowledge extraction; similar data elimination; similarity rules; Cleaning; Companies; Couplings; Knowledge discovery; Semantics; Switches; Data Quality; Deduplication; Duplicates; Match; Merge; Similar Data
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Control, Decision and Information Technologies (CoDIT), 2013 International Conference on
Conference_Location :
Hammamet
Print_ISBN :
978-1-4673-5547-6
Type :
conf
DOI :
10.1109/CoDIT.2013.6689559
Filename :
6689559