DocumentCode
3576305
Title
SIER: An Efficient Entity Resolution Mechanism Combining SNM and Iteration
Author
Taiming Wang ; Yue Kou ; Derong Shen ; Heng Liu ; Ge Yu
Author_Institution
Coll. of Inf. Sci. & Eng., Northeastern Univ., Shenyang, China
fYear
2014
Firstpage
238
Lastpage
241
Abstract
With the rapid growth of data, entity resolution (ER) faces two challenges: achieving high result quality and achieving high performance. Correspondingly, current work focuses on either iteration-based entity resolution or sorted neighborhood method (SNM)-based entity resolution. The former iteratively merges similar records to achieve higher precision and recall. The latter compares only the records within the same sliding window to achieve higher performance. However, each comes at a cost: the former sacrifices efficiency, and the latter sacrifices result quality. In this paper, we present an entity resolution mechanism combining SNM and iteration (called SIER). Unlike traditional approaches, SIER exploits the advantages of both SNM and iteration. We also propose a two-stage entity matching algorithm. In the first stage, records are initially matched using a sliding window. In the second stage, the matching result is rectified iteratively to improve its quality. Experiments demonstrate the feasibility and effectiveness of our method.
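The two-stage matching described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`similarity`, `snm_match`, `rectify`), the use of `difflib.SequenceMatcher` as the similarity measure, the 0.8 threshold, the window size, and the single-link merge rule are all hypothetical choices, since the abstract specifies none of them.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity in [0, 1] (assumed measure, not from the paper)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def snm_match(records, window=2, threshold=0.8):
    """Stage 1: sort the records, then compare each one only with the
    next (window - 1) records in sorted order."""
    order = sorted(range(len(records)), key=lambda i: records[i].lower())
    cluster_of = {i: {i} for i in range(len(records))}  # index -> shared set
    for pos, i in enumerate(order):
        for j in order[pos + 1 : pos + window]:
            if cluster_of[i] is cluster_of[j]:
                continue
            if similarity(records[i], records[j]) >= threshold:
                merged = cluster_of[i] | cluster_of[j]
                for k in merged:
                    cluster_of[k] = merged
    unique = {id(c): c for c in cluster_of.values()}
    return sorted(sorted(c) for c in unique.values())


def rectify(records, clusters, threshold=0.8):
    """Stage 2: repeatedly merge any two clusters that contain a similar
    pair of records (single-link, assumed) until nothing changes."""
    clusters = [set(c) for c in clusters]
    merged_any = True
    while merged_any:
        merged_any = False
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                if any(
                    similarity(records[i], records[j]) >= threshold
                    for i in clusters[a]
                    for j in clusters[b]
                ):
                    clusters[a] |= clusters.pop(b)
                    merged_any = True
                    break
            if merged_any:
                break  # restart the scan over the updated cluster list
    return sorted(sorted(c) for c in clusters)


records = [
    "catherine jones",
    "david miller",
    "david miler",
    "katherine jones",
    "edward brown",
]
stage1 = snm_match(records, window=2)
stage2 = rectify(records, stage1)
print(stage1)
print(stage2)
```

In this toy input, "catherine jones" and "katherine jones" differ in their first character, so they land far apart in sorted order and the window in stage 1 never compares them; the iterative pass in stage 2 is what brings them together. The stage 2 loop here compares all cluster pairs, which is quadratic and meant only to show the idea of iterative rectification.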
Keywords
data handling; iterative methods; ER; SIER; SNM; entity resolution mechanism; iteration-based entity resolution; result quality; sliding window; sorted neighborhood-based entity resolution; two-stage entity matching algorithm; Clustering algorithms; Couplings; Educational institutions; Erbium; Merging; Sorting; iterative entity resolution; sorted neighborhood
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Information System and Application Conference (WISA), 2014 11th
Print_ISBN
978-1-4799-5726-2
Type
conf
DOI
10.1109/WISA.2014.50
Filename
7058019