• DocumentCode
    2244106
  • Title

    Automatic identifying of maximal length noun phrase

  • Author

    Yegang Li ; Heyan Huang

  • Author_Institution
    Sch. of Comput. Sci. & Technol., Beijing Institute of Technol., Beijing, China
  • fYear
    2012
  • fDate
    Oct. 30 2012-Nov. 1 2012
  • Firstpage
    1445
  • Lastpage
    1448
  • Abstract
    Automatic recognition of maximal-length noun phrases (MNPs) aids shallow parsing. In this paper, automatic labeling of Chinese MNPs is treated as a sequential labeling task, and a Support Vector Machine (SVM) model is employed. We propose a 2-phase hybrid approach that first identifies base chunks and then identifies MNPs. The base chunk features can thus be exploited to improve the performance of MNP recognition. In addition, both left-right and right-left sequential labeling are applied, and their outputs are combined through bidirectional sequence labeling merging to identify Chinese MNPs. The data set used in the experiments is selected from the Penn Chinese Treebank 5.0 corpus and split into training, development, and test sets in the proportion 4:4:1. Experimental results show an F1-measure of 90.13%.
  • Keywords
    grammars; natural language processing; support vector machines; 2-phase hybrid approach; F1-measure; MNP recognition; Penn Chinese treebank 5.0 corpus; SVM; automatic maximal length noun phrase identification; base chunk; bidirectional sequence labeling merging; left-right sequential labeling; right-left sequential labeling; sequential labeling task; shallow parsing; support vector machine model; Cloud computing; Labeling; Magnetic heads; Merging; Support vector machines; Syntactics; Tagging; 2-phase; MNP; base chunk feature
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cloud Computing and Intelligent Systems (CCIS), 2012 IEEE 2nd International Conference on
  • Conference_Location
    Hangzhou
  • Print_ISBN
    978-1-4673-1855-6
  • Type

    conf

  • DOI
    10.1109/CCIS.2012.6664624
  • Filename
    6664624