DocumentCode :
3436653
Title :
On the Use of Semantic Blocking Techniques for Data Cleansing and Integration
Author :
Nin, Jordi ; Muntés-Mulero, Victor ; Martínez-Bazan, Norbert ; Larriba-Pey, Josep-L.
Author_Institution :
CSIC, Bellaterra
fYear :
2007
fDate :
6-8 Sept. 2007
Firstpage :
190
Lastpage :
198
Abstract :
Record linkage (RL) is an important component of data cleansing and integration. For years, many efforts have focused on improving the performance of the RL process, either by reducing the number of record comparisons or by reducing the number of attribute comparisons. Both approaches reduce computation time but very often decrease the quality of the results. However, the real bottleneck of RL is the post-processing step, where the results have to be reviewed by experts who decide which pairs or groups of records are real links and which are false hits. In this paper, we show that exploiting the relationships (e.g., foreign keys) established across one or more data sources makes it possible to define a new kind of semantic blocking method that increases the number of hits and reduces the amount of review effort.
Keywords :
data handling; records management; data cleansing; data integration; record linkage; semantic blocking; Artificial intelligence; Collaboration; Councils; Couplings; Distributed computing; Pervasive computing; Semantic information; blocking algorithms
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Database Engineering and Applications Symposium, 2007. IDEAS 2007. 11th International
Conference_Location :
Banff, Alta.
ISSN :
1098-8068
Print_ISBN :
978-0-7695-2947-9
Type :
conf
DOI :
10.1109/IDEAS.2007.4318104
Filename :
4318104