DocumentCode :
3681174
Title :
Parallel NoSQL Entity Resolution Approach with MapReduce
Author :
Kun Ma;Bo Yang
Author_Institution :
Shandong Provincial Key Lab. of Network Based Intell. Comput., Univ. of Jinan, Jinan, China
fYear :
2015
Firstpage :
384
Lastpage :
389
Abstract :
To address the limitations of entity resolution on NoSQL documents, we propose a new parallel NoSQL entity resolution approach based on MapReduce. Although the current MapReduce framework enables efficient parallel execution of entity resolution, it cannot easily find duplicates that fall in adjacent blocks. We therefore investigate a solution called Partition-Sort-Map-Reduce, which finds such duplicates by overlapping the boundary objects of adjacent blocks. Finally, an experimental evaluation on NoSQL breeding data and an analysis of time complexity show the high effectiveness and efficiency of the proposed entity resolution approach.
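The idea of overlapping boundary objects in adjacent blocks can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`partition_with_overlap`, `map_block`, `resolve`), the sliding-window comparison, the `difflib` similarity measure, and the parameters `block_size`, `window`, and `threshold` are all assumptions chosen for illustration. Records are sorted, split into blocks, and each block also carries the first `window - 1` records of the next block, so that pairs straddling a block boundary are still compared.

```python
from difflib import SequenceMatcher


def partition_with_overlap(sorted_records, block_size, window):
    """Split a sorted list into blocks. Each block is extended with the first
    (window - 1) records of the following block (the overlapping boundary)."""
    blocks = []
    for start in range(0, len(sorted_records), block_size):
        end = min(start + block_size + window - 1, len(sorted_records))
        blocks.append(sorted_records[start:end])
    return blocks


def map_block(block, block_size, window, threshold):
    """'Map' step: sliding-window comparison inside one block.
    Only the block's own records (the first block_size) act as anchors,
    so every candidate pair is compared by exactly one block."""
    pairs = set()
    for i in range(min(block_size, len(block))):
        for j in range(i + 1, min(i + window, len(block))):
            score = SequenceMatcher(None, block[i], block[j]).ratio()
            if score >= threshold:
                pairs.add((block[i], block[j]))
    return pairs


def resolve(records, block_size=3, window=2, threshold=0.85):
    """Sort, partition with overlap, map each block, then 'reduce' by union."""
    blocks = partition_with_overlap(sorted(records), block_size, window)
    # The built-in map keeps the sketch sequential; the blocks are independent,
    # so a process pool could be substituted here.
    results = map(lambda b: map_block(b, block_size, window, threshold), blocks)
    duplicates = set()
    for pairs in results:
        duplicates |= pairs
    return duplicates


if __name__ == "__main__":
    names = ["bob jonse", "anna smith", "zed", "carl young",
             "anna smyth", "bob jones"]
    for pair in sorted(resolve(names)):
        print(pair)
```

With `block_size=3`, the sorted records "bob jones" and "bob jonse" land in different blocks; the pair is still found because the first block carries one overlapping record from the second.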
Keywords :
"Sorting","Time complexity","Batch production systems","Parallel processing","Artificial intelligence","Tin"
Publisher :
IEEE
Conference_Title :
Intelligent Networking and Collaborative Systems (INCOS), 2015 International Conference on
Type :
conf
DOI :
10.1109/INCoS.2015.16
Filename :
7312102