• DocumentCode
    234694
  • Title
    Author name disambiguation using vector space model and hybrid similarity measures
  • Author
    Arif, Tasleem ; Ali, Raian ; Asger, M.

  • Author_Institution
    Deptt. of Inf. Technol., BGSB Univ., Rajouri, India
  • fYear
    2014
  • fDate
    7-9 Aug. 2014
  • Firstpage
    135
  • Lastpage
    140
  • Abstract
    Differentiating people by name has always been a complex problem, and the demand for grouping people in a particular domain by their attributes keeps growing. Despite years of research and many proposed techniques, the name ambiguity problem remains largely unsolved, and each technique proposed so far has its own limitations. For author name disambiguation in digital citations, additional attributes that are normally available in publications, such as the e-mail ID and affiliation of the author and co-authors, can help considerably. The vector space model has traditionally been used with great success in information retrieval, and we explore its use for author name disambiguation. In this paper we propose an enhanced vector space model for disambiguating authors and their publications. Experimental results show that the additional attributes present in publications help in disambiguation and resolve name ambiguity with a high degree of confidence. From our study and experimental results we conclude that both the mixed citation and split citation problems can be handled very efficiently. We obtained a substantial improvement in evaluation metrics, achieving an F1 score of 0.97.
  • Keywords
    citation analysis; text analysis; vectors; F1 score; author name disambiguation; coauthor affiliation; digital citation; e-mail ID; enhanced vector space model; evaluation metrics; hybrid similarity measures; information retrieval field; mixed citation problem; name ambiguity problem; people differentiation; people grouping; publication; split citations problem; Atomic layer deposition; Clustering algorithms; Educational institutions; Electronic mail; Libraries; Measurement; Vectors; Hierarchical Clustering; Hybrid-Similarity; Name Disambiguation; Vector Space Model;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Contemporary Computing (IC3), 2014 Seventh International Conference on
  • Conference_Location
    Noida
  • Print_ISBN
    978-1-4799-5172-7
  • Type
    conf
  • DOI
    10.1109/IC3.2014.6897162
  • Filename
    6897162
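
The abstract describes comparing publications in a vector space and combining several attribute-level similarities (co-authors, e-mail ID, affiliation). This record does not give the paper's actual formula, so the following is only a minimal sketch of the general idea. The field names (`title`, `coauthors`, `email`), the choice of cosine and Jaccard measures, and the weights are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    # Guard against empty vectors, which have zero length.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(x, y):
    """Jaccard overlap between two collections, treated as sets."""
    x, y = set(x), set(y)
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def hybrid_similarity(p, q, weights=(0.4, 0.3, 0.3)):
    """Weighted mix of title, co-author, and e-mail similarity.

    The weights are hypothetical and would need tuning on labelled data.
    """
    w_title, w_coauth, w_email = weights
    title_sim = cosine(
        Counter(p["title"].lower().split()),
        Counter(q["title"].lower().split()),
    )
    coauth_sim = jaccard(p["coauthors"], q["coauthors"])
    # An exact e-mail match is treated as strong evidence of the same author.
    email_sim = 1.0 if p["email"] and p["email"] == q["email"] else 0.0
    return w_title * title_sim + w_coauth * coauth_sim + w_email * email_sim
```

A pairwise score of this kind could then feed a clustering step (the keywords mention hierarchical clustering), with publications whose similarity exceeds a chosen threshold grouped under one author. The threshold, like the weights, is left open here.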