DocumentCode
2106849
Title
A new distributed name disambiguation system based on MapReduce
Author
Liu Pengfei ; Ge Sheng
Author_Institution
Sch. of Comput. Sci. & Eng., Beihang Univ., Beijing, China
fYear
2012
fDate
9-11 Nov. 2012
Firstpage
550
Lastpage
554
Abstract
Social network search is a kind of vertical search based on the aggregation of social network data, and name disambiguation is one of its vital issues. With the explosive growth of information, dealing effectively with name disambiguation in massive-data scenarios has become an important problem. To tackle it, we combine the MapReduce model with a document clustering algorithm to propose a distributed method for name disambiguation, and then present a distributed name disambiguation system. The system runs on the Hadoop platform and parallelizes name disambiguation by dividing the document clustering task into several map and reduce tasks. In addition, we evaluate the system in terms of scalability and accuracy.
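The abstract does not say which clustering algorithm the system uses or how the work is split between mappers and reducers. As a rough illustration only, the sketch below assumes a k-means-style round over term-frequency vectors (the vector space model listed in the keywords): the map step assigns each document to its most similar centroid, and the reduce step averages each group into a new centroid. It runs in a single local process rather than on Hadoop, and every function name, the sample documents, and the personal name in them are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def tf_vector(text):
    """Bag-of-words term-frequency vector (a simple vector space model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = sqrt(sum(w * w for w in a.values())) * sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def map_assign(doc_id, vector, centroids):
    """Map step (assumed): emit (index of most similar centroid, (doc_id, vector))."""
    best = max(range(len(centroids)), key=lambda i: cosine(vector, centroids[i]))
    return best, (doc_id, vector)


def reduce_centroid(members):
    """Reduce step (assumed): average the member vectors into a new centroid."""
    total = Counter()
    for _, vector in members:
        total.update(vector)
    return {t: w / len(members) for t, w in total.items()}


def cluster_iteration(docs, centroids):
    """One map/shuffle/reduce round, simulated locally."""
    groups = defaultdict(list)  # the "shuffle": group map output by key
    for doc_id, vector in docs.items():
        key, value = map_assign(doc_id, vector, centroids)
        groups[key].append(value)
    # Keep the old centroid when no document was assigned to it.
    new_centroids = [
        reduce_centroid(groups[i]) if i in groups else centroids[i]
        for i in range(len(centroids))
    ]
    assignment = {doc_id: key for key, members in groups.items() for doc_id, _ in members}
    return new_centroids, assignment


# Hypothetical pages that all mention the same ambiguous name.
docs = {
    "d1": tf_vector("wei wang database systems query"),
    "d2": tf_vector("wei wang query optimization database"),
    "d3": tf_vector("wei wang protein folding biology"),
    "d4": tf_vector("wei wang biology protein structure"),
}
centroids = [dict(docs["d1"]), dict(docs["d3"])]  # naive seeding, for illustration
new_centroids, assignment = cluster_iteration(docs, centroids)
print(assignment)
```

On a real cluster the map and reduce functions would be submitted as Hadoop jobs and the round repeated until the assignments stop changing; each resulting cluster would then stand for one person sharing the ambiguous name.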
Keywords
data mining; document handling; information retrieval; parallel processing; pattern clustering; social networking (online); Hadoop platform; MapReduce model; distributed name disambiguation system; document clustering algorithm; information polymerization; social network data; MapReduce; document cluster; name disambiguation; vector space model;
fLanguage
English
Publisher
IEEE
Conference_Titel
Communication Technology (ICCT), 2012 IEEE 14th International Conference on
Conference_Location
Chengdu
Print_ISBN
978-1-4673-2100-6
Type
conf
DOI
10.1109/ICCT.2012.6511416
Filename
6511416