DocumentCode :
2719512
Title :
Entity Resolution with Crowd Errors
Author :
Verroios, Vasilis ; Garcia-Molina, Hector
Author_Institution :
Stanford Univ., Stanford, CA, USA
fYear :
2015
fDate :
13-17 April 2015
Firstpage :
219
Lastpage :
230
Abstract :
Given a set of records, an Entity Resolution (ER) algorithm finds records that refer to the same real-world entity. Humans can often determine if two records refer to the same entity, and hence we study the problem of selecting questions to ask error-prone humans. We give a Maximum Likelihood formulation for the problem of finding the “most beneficial” questions to ask next. Our theoretical results lead to a lightweight and practical algorithm, bDENSE, for selecting questions to ask humans. Our experimental results show that bDENSE can more quickly reach an accurate outcome, compared to two approaches proposed recently. Moreover, through our experimental evaluation, we identify the strengths and weaknesses of all three approaches.
Keywords :
data handling; maximum likelihood estimation; question answering (information retrieval); records management; bDENSE; computer algorithms; crowd errors; entity resolution algorithm; error-prone humans; maximum likelihood formulation; question selection; record detection; record finding; strength identification; weakness identification; Erbium
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering (ICDE), 2015 IEEE 31st International Conference on
Conference_Location :
Seoul
Type :
conf
DOI :
10.1109/ICDE.2015.7113286
Filename :
7113286