DocumentCode :
2771631
Title :
Resolving Identity Uncertainty with Learned Random Walks
Author :
Sandler, Ted ; Ungar, L.H. ; Crammer, Koby
Author_Institution :
Dept. of Electr. Eng., Technion - Israel Inst. of Technol., Haifa, Israel
fYear :
2009
fDate :
6-9 Dec. 2009
Firstpage :
457
Lastpage :
465
Abstract :
A pervasive problem in large relational databases is identity uncertainty, which occurs when multiple entries in a database refer to the same underlying entity in the world. Relational databases exhibit rich graphical structure and are naturally modeled as graphs whose nodes represent entities and whose typed edges represent relations between them. We propose using random walk models for resolving identity uncertainty, since they have proven effective for finding points that are proximately located in a network. Because not all types of relations are equally helpful in alleviating identity uncertainty, we develop a supervised approach to learning the usefulness of different database relations from a training set of database entries whose true identities are known. When tested on the task of resolving uncertainty of ambiguously named authors in bibliographical data, the learned random walk models yield performance superior to support vector machines and to a related spectral clustering method.
Keywords :
learning (artificial intelligence); relational databases; identity uncertainty resolution; random walk models; relational databases; spectral clustering method; supervised learning approach; support vector machines; Clustering methods; Data mining; Information science; Joining processes; Pervasive computing; Relational databases; Spatial databases; Support vector machines; Testing; Uncertainty; identity uncertainty; random walks; semi-supervised learning;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
Conference_Location :
Miami, FL
ISSN :
1550-4786
Print_ISBN :
978-1-4244-5242-2
Electronic_ISBN :
1550-4786
Type :
conf
DOI :
10.1109/ICDM.2009.69
Filename :
5360271