• DocumentCode
    2106849
  • Title
    A new distributed name disambiguation system based on MapReduce
  • Author
    Liu Pengfei ; Ge Sheng
  • Author_Institution
    Sch. of Comput. Sci. & Eng., Beihang Univ., Beijing, China
  • fYear
    2012
  • fDate
    9-11 Nov. 2012
  • Firstpage
    550
  • Lastpage
    554
  • Abstract
    Social network search is a kind of vertical search based on the information polymerization of social network data, and name disambiguation is one of its vital issues. With the explosive growth of information, dealing effectively with name disambiguation in massive-data scenarios has become an important problem. To tackle it, we combine the MapReduce model with a document clustering algorithm to propose a distributed method for name disambiguation, and then present a distributed name disambiguation system. The system runs on the Hadoop platform and parallelizes name disambiguation by dividing the document clustering task into a set of map and reduce tasks. In addition, we evaluate the system in terms of scalability and accuracy.
  • Keywords
    data mining; document handling; information retrieval; parallel processing; pattern clustering; social networking (online); Hadoop platform; MapReduce model; distributed name disambiguation system; document clustering algorithm; information polymerization; social network data; MapReduce; document cluster; name disambiguation; vector space model;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communication Technology (ICCT), 2012 IEEE 14th International Conference on
  • Conference_Location
    Chengdu
  • Print_ISBN
    978-1-4673-2100-6
  • Type
    conf
  • DOI
    10.1109/ICCT.2012.6511416
  • Filename
    6511416