DocumentCode :
2106849
Title :
A new distributed name disambiguation system based on MapReduce
Author :
Liu Pengfei ; Ge Sheng
Author_Institution :
Sch. of Comput. Sci. & Eng., Beihang Univ., Beijing, China
fYear :
2012
fDate :
9-11 Nov. 2012
Firstpage :
550
Lastpage :
554
Abstract :
Social network search is a form of vertical search built on the aggregation of social network data, and name disambiguation is one of its vital issues. With the explosive growth of information, handling name disambiguation effectively at massive data scale has become an important problem. To tackle it, we combine the MapReduce model with a document clustering algorithm to propose a distributed method for name disambiguation, and then present a distributed name disambiguation system built on that method. The system runs on the Hadoop platform and parallelizes name disambiguation by dividing the document clustering task into several map and reduce tasks. In addition, we evaluate the system in terms of scalability and accuracy.
Keywords :
data mining; document handling; information retrieval; parallel processing; pattern clustering; social networking (online); Hadoop platform; MapReduce model; distributed name disambiguation system; document clustering algorithm; information polymerization; social network data; MapReduce; document cluster; name disambiguation; vector space model;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Communication Technology (ICCT), 2012 IEEE 14th International Conference on
Conference_Location :
Chengdu
Print_ISBN :
978-1-4673-2100-6
Type :
conf
DOI :
10.1109/ICCT.2012.6511416
Filename :
6511416