• DocumentCode
    2352733
  • Title

    A Hybrid Model Based on CRFs for Chinese Named Entity Recognition

  • Author

    Li, Lishuang ; Ding, Zhuoye ; Huang, Degen ; Zhou, Huiwei

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Dalian Univ. of Technol., Dalian
  • fYear
    2008
  • fDate
    23-25 July 2008
  • Firstpage
    127
  • Lastpage
    132
  • Abstract
    This paper presents a hybrid model, and the corresponding algorithm, that combines conditional random fields (CRFs) with statistical methods to improve the performance of CRFs on Chinese named entity recognition (NER). CRFs perform well on sequence labeling tasks. In experiments on recognizing Chinese named entities with CRFs, the tags that CRFs label incorrectly are mostly those with lower marginal probabilities. A statistical model is introduced to complement the CRFs. In the hybrid model, the marginal probability of each label assigned by the CRFs is used to choose between the CRF method and the statistical method: if the probability is greater than a given threshold, the test sample is recognized by the CRFs; otherwise, the statistical model is used. By integrating the advantages of the two methods, the hybrid model achieves an F-measure of 93.61% for Chinese person names and 91.75% for Chinese location names on the MSRA dataset.
  • Keywords
    natural language processing; pattern recognition; probability; random processes; statistical analysis; Chinese location names; Chinese named entity recognition; Chinese person names; F-measure; MSRA dataset; conditional random fields; marginal probabilities; sequence labeling; statistical methods; Computer science; Entropy; Hidden Markov models; Information technology; Labeling; Natural languages; Probability; Statistical analysis; Testing; Text recognition; CRF; Chinese NER; Hybrid Model;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Language Processing and Web Information Technology, 2008. ALPIT '08. International Conference on
  • Conference_Location
    Dalian, Liaoning
  • Print_ISBN
    978-0-7695-3273-8
  • Type

    conf

  • DOI
    10.1109/ALPIT.2008.39
  • Filename
    4584354
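
The selection rule described in the abstract (keep the CRF label when its marginal probability exceeds a threshold, otherwise defer to a statistical model) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `hybrid_decode`, the `fallback` callable standing in for the statistical model, and the default threshold of 0.9 are all hypothetical, since the abstract specifies neither the statistical model nor the threshold value.

```python
from typing import Callable, List, Sequence


def hybrid_decode(
    tokens: Sequence[str],
    crf_labels: Sequence[str],
    crf_marginals: Sequence[float],
    fallback: Callable[[Sequence[str], int], str],
    threshold: float = 0.9,  # hypothetical value; the abstract gives none
) -> List[str]:
    """Pick, per position, the CRF label or a fallback label.

    crf_labels[i] is the label the CRF assigned to tokens[i], and
    crf_marginals[i] is the marginal probability of that label.
    fallback(tokens, i) is a stand-in for the statistical model
    (an assumption; its form is not given in the abstract).
    """
    if not (len(tokens) == len(crf_labels) == len(crf_marginals)):
        raise ValueError("tokens, crf_labels and crf_marginals must align")

    result: List[str] = []
    for i, (label, prob) in enumerate(zip(crf_labels, crf_marginals)):
        if prob > threshold:
            # Confident CRF prediction: keep it.
            result.append(label)
        else:
            # Low marginal probability: defer to the statistical model.
            result.append(fallback(tokens, i))
    return result
```

A caller would supply the CRF output (labels plus per-position marginals) from whatever CRF toolkit is in use and pass the second model as `fallback`. The comparison is strict, matching the abstract's wording "greater than the given threshold".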