• DocumentCode
    1909838
  • Title

    Automatic Recognition of Chinese Organization Name Based on Conditional Random Fields

  • Author

    Zhang, Suxiang ; Zhang, Suxian ; Wang, Xiaojie

  • Author_Institution
    North China Electr. Power Univ., Baoding
  • fYear
    2007
  • fDate
    Aug. 30 2007-Sept. 1 2007
  • Firstpage
    229
  • Lastpage
    233
  • Abstract
    Recognition of person, location, and organization names has long been regarded as a bottleneck in named entity recognition (NER) systems, and automatic recognition of Chinese organization names is the most difficult problem in NER tasks. This paper presents a new approach to Chinese organization name recognition based on cascaded conditional random fields. In the proposed approach, person names and location names are recognized first, before organization names. The model is structured as a cascade: the output of the lower model is passed to the higher model, where it supports the higher model's decisions in recognizing complicated organization names. We also propose a new feature for this task. We evaluate the approach on a large-scale corpus, the People's Daily (January 1998), using an open test. For Chinese organization names, recall reaches 88.78% and precision is 82.35%. The evaluation results show that the approach based on cascaded conditional random fields significantly outperforms previous approaches.
  • Keywords
    information retrieval; natural languages; random processes; text analysis; Chinese organization recognition; cascaded conditional random field; information extraction; large-scale corpus; named entity recognition system; question answering system; text document; Character recognition; Data mining; Educational institutions; Hidden Markov models; Large-scale systems; Machine learning; Power engineering and energy; Sun; Testing; Text recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-1611-0
  • Electronic_ISBN
    978-1-4244-1611-0
  • Type

    conf

  • DOI
    10.1109/NLPKE.2007.4368038
  • Filename
    4368038