• DocumentCode
    1662548
  • Title
    Exploring Word Similarity to Improve Chinese Personal Name Disambiguation
  • Author
    Yang, Xia; Jin, Peng; Xiang, Wei
  • Author_Institution
    Lab. of Intell. Inf. Process. & Applic., Leshan Normal Univ., Leshan, China
  • Volume
    3
  • fYear
    2011
  • Firstpage
    197
  • Lastpage
    200
  • Abstract
    This paper presents an approach to Chinese Personal Name Disambiguation (PND). The key to clustering is the context similarity measure, which depends on feature selection, feature representation, and the calculation method. First, HIT Tongyici Cilin (Extended) is introduced to Chinese PND to enhance the clustering effect. Additional word similarity information is also explored to alleviate data sparseness. In this system, a Hierarchical Agglomerative Clustering (HAC) algorithm is adopted to cluster the mentions that refer to the same person, using features extracted from documents. The results show that word similarity information is very helpful in improving the system's performance.
  • Keywords
    pattern clustering; search engines; word processing; Chinese personal name disambiguation; First HIT Tongyici Cilin; data sparseness; feature calculation method; feature representation method; feature selection method; hierarchical agglomerative clustering algorithm; personal name search; word similarity; Buildings; Clustering algorithms; Conferences; Educational institutions; Equations; Feature extraction; Mathematical model; Chinese PND; HAC algorithm; Tongyici Cilin; Word Similarity
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011 IEEE/WIC/ACM International Conference on
  • Conference_Location
    Lyon
  • Print_ISBN
    978-1-4577-1373-6
  • Electronic_ISBN
    978-0-7695-4513-4
  • Type
    conf
  • DOI
    10.1109/WI-IAT.2011.90
  • Filename
    6040839