DocumentCode :
3321593
Title :
Transformation-based Framework for Record Matching
Author :
Arasu, Arvind ; Chaudhuri, Surajit ; Kaushik, Raghav
Author_Institution :
Microsoft Res., Redmond, WA
fYear :
2008
fDate :
7-12 April 2008
Firstpage :
40
Lastpage :
49
Abstract :
Today's record matching infrastructure does not allow a flexible way to account for synonyms such as "Robert" and "Bob" which refer to the same name, and more general forms of string transformations such as abbreviations. We propose a programmatic framework of record matching that takes such user-defined string transformations as input. To the best of our knowledge, this is the first proposal for such a framework. This transformational framework, while expressive, poses significant computational challenges which we address. We empirically evaluate our techniques over real data.
Keywords :
data analysis; data warehouses; record matching; transformation-based framework; marketing and sales; proposals; standardization
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on
Conference_Location :
Cancun
Print_ISBN :
978-1-4244-1836-7
Electronic_ISBN :
978-1-4244-1837-4
Type :
conf
DOI :
10.1109/ICDE.2008.4497412
Filename :
4497412